Probability, Distributions & Correlation | GCSE Statistics Notes

Q: What is the difference between theoretical and experimental probability?

Theoretical probability is based on logic and symmetry (e.g., 1/6 for a fair die). Experimental probability (relative frequency) is based on actual trials. As the number of trials increases, experimental probability converges toward theoretical probability.

Q: When should I use a tree diagram versus a Venn diagram?

Tree diagrams are best for sequences of events (e.g., drawing two marbles) and conditional probability. Venn diagrams are best for visualising relationships between sets and overlapping events (e.g., people who play football and/or tennis).

Q: What is the difference between Spearman's and Pearson's correlation?

Spearman's rank correlation measures any monotonic relationship using the ranks of data. Pearson's PMCC specifically measures the strength and direction of a linear relationship using raw data values.

Q: What are the quality assurance 'action lines'?

Action lines are set at three standard deviations from the mean (μ ± 3σ). If a measurement falls beyond these lines, it indicates a serious problem, and the production process should be stopped and investigated.

Q: How does the Petersen Capture-Recapture method work?

It estimates population size by capturing and marking a sample, releasing them, and then seeing what proportion of a second sample is marked. Formula: N = (Sample 1 × Sample 2) ÷ Recaptured.

Updated: 9 May 2026

Probability and distributions are the tools statisticians use to model the world. This chapter covers everything from basic chance to complex correlation coefficients and population estimation.

Topic 5.1 — Probability: Scale, Expected Frequency & Relative/Absolute Risk

Probability is the mathematical language of uncertainty, providing a framework for quantifying how likely events are to occur. At GCSE level, understanding probability begins with the probability scale, which runs from 0 to 1, or equivalently from 0% to 100%. A probability of 0 denotes an impossible event—such as rolling a seven on a standard six-sided die—while a probability of 1 denotes a certain event. All other probabilities fall between these two extremes. When calculating probability for equally likely outcomes, the fundamental formula is P(event) = number of favourable outcomes ÷ total number of possible outcomes.

A crucial extension of basic probability is the concept of expected frequency, which connects theoretical probability to practical prediction over multiple trials. Expected frequency is calculated by multiplying the probability of an event by the number of trials: Expected Frequency = P(event) × number of trials. It is vital to understand that the expected frequency is a long-term prediction; the actual result in any fixed number of trials may differ due to random variation.
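As a small illustration of the expected frequency formula, the sketch below multiplies a probability by a number of trials. The function name and the figure of 120 rolls are made up for this example.

```python
from fractions import Fraction

def expected_frequency(probability, trials):
    """Expected Frequency = P(event) x number of trials."""
    return probability * trials

# Hypothetical example: how many sixes to expect in 120 rolls of a fair die.
p_six = Fraction(1, 6)
print(expected_frequency(p_six, 120))  # 20
```

The answer of 20 is a long-run prediction; an actual run of 120 rolls could easily give 17 or 23 sixes.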

Beyond simple events, statistics uses probability to discuss risk. Absolute risk refers to the actual probability of an event occurring in a specific group. Relative risk compares the probability of an event in one group to that in another. While relative risk highlights the magnitude of a difference, absolute risk provides the context of how common the event is overall. In exam answers, always present probabilities as fractions, decimals, or percentages—never as ratios such as 3:7.
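The difference between the two kinds of risk is easier to see with numbers. The sketch below uses invented figures (30 cases in a group of 1000, and 10 cases in another group of 1000); the function names are illustrative only.

```python
from fractions import Fraction

def absolute_risk(cases, group_size):
    """Probability of the event within one group."""
    return Fraction(cases, group_size)

def relative_risk(risk_group_a, risk_group_b):
    """How many times more likely the event is in group A than in group B."""
    return risk_group_a / risk_group_b

# Hypothetical data, not taken from any real study.
risk_a = absolute_risk(30, 1000)   # 3/100, i.e. 3%
risk_b = absolute_risk(10, 1000)   # 1/100, i.e. 1%
print(relative_risk(risk_a, risk_b))  # 3
```

Here the relative risk of 3 sounds dramatic, but the absolute risks (3% and 1%) show that the event is uncommon in both groups.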

Topic 5.2 — Theoretical vs Experimental Probability

Theoretical probability is derived from logical reasoning about a situation where all outcomes are known and equally likely. For example, the theoretical probability of rolling a six on a fair six-sided die is 1/6. This type of probability relies on the symmetry of the underlying mechanism and is calculated before any physical experiment takes place.

Experimental probability, also known as relative frequency, is determined by actually conducting an experiment and recording the outcomes: Experimental Prob = number of times occurred ÷ total trials. The discrepancy between theoretical and experimental results does not necessarily mean bias; it may simply be random variation. The Law of Large Numbers states that as the number of trials increases, the experimental probability will tend to converge towards the theoretical probability.
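One way to see the Law of Large Numbers in action is to simulate die rolls. This is a rough sketch using Python's pseudo-random generator; the function name, seed and trial counts are arbitrary choices for illustration.

```python
import random

def relative_frequency_of_six(trials, seed=0):
    """Roll a simulated fair die `trials` times and return the proportion of sixes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sixes = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
    return sixes / trials

# Compare the relative frequency with the theoretical value 1/6 (about 0.1667).
for n in (60, 600, 60000):
    print(n, round(relative_frequency_of_six(n), 4))
```

With only 60 rolls the relative frequency can sit noticeably away from 1/6; with 60,000 rolls it is typically much closer.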

Topic 5.3 — Tree Diagrams, Sample Space & Two-Way Tables

When dealing with two or more events, visual tools are indispensable. A sample space diagram is a systematic list or table of all possible outcomes (e.g., a 6×6 grid for two dice). Two-way tables organise data according to two categorical variables, making it easy to read off joint and conditional probabilities.
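The 6×6 sample space for two dice can also be listed by a short program. The sketch below is one possible way to do it; the helper name `probability` is an assumption for this example.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
sample_space = list(product(range(1, 7), repeat=2))

def probability(event):
    """P(event) = favourable outcomes / total outcomes, for equally likely outcomes."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

print(len(sample_space))                       # 36
print(probability(lambda o: sum(o) == 7))      # 1/6
```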

Tree diagrams are particularly powerful for sequences of events. Each branch represents an outcome with its probability. To find the probability of a path, multiply the probabilities along the branches. To find the total probability of an event that can happen in several ways, add the probabilities at the ends of the relevant paths. Venn diagrams are also useful for visualising the overlap (intersection) and the combined region (union) of events.
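The "multiply along branches, add between paths" rule can be sketched for a two-draw problem without replacement. The bag contents (3 red, 2 blue) and the function name are invented for this example.

```python
from fractions import Fraction

def two_draw_tree(red, blue):
    """Return the probability of each path (e.g. 'RB') for two draws without replacement."""
    total = red + blue
    counts = {"R": red, "B": blue}
    paths = {}
    for first in "RB":
        p_first = Fraction(counts[first], total)
        remaining = dict(counts)
        remaining[first] -= 1          # one marble of that colour has been removed
        for second in "RB":
            p_second = Fraction(remaining[second], total - 1)
            paths[first + second] = p_first * p_second   # multiply along the branches
    return paths

paths = two_draw_tree(3, 2)
# "One of each colour" can happen two ways, so add the two path probabilities.
print(paths["RB"] + paths["BR"])   # 3/5
```

The four path probabilities add up to 1, which is a useful check on any tree diagram.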

Topic 5.4 — Independent Events & Conditional Probability

Two events are independent if the occurrence of one does not affect the probability of the other (e.g., two coin flips). For independent events A and B: P(A and B) = P(A) × P(B). **Conditional probability** deals with situations where events do affect each other. The probability of B given A is denoted as P(B|A) and is defined as P(A and B) / P(A).

This is particularly useful in "without replacement" problems. On a tree diagram, the second set of branches changes based on the outcome of the first event. For Higher tier students, recognising independence is a key skill; you can check it by seeing if P(A) × P(B) equals P(A and B).
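The independence check and the conditional probability formula can be tried out on the two-dice sample space. This is only a sketch; the events chosen and the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice

def p(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_even(o):
    return o[0] % 2 == 0

def total_is_7(o):
    return sum(o) == 7

def total_at_least_10(o):
    return sum(o) >= 10

def conditional(b, a):
    """P(B|A) = P(A and B) / P(A)."""
    return p(lambda o: a(o) and b(o)) / p(a)

def independent(a, b):
    """Check whether P(A) x P(B) equals P(A and B)."""
    return p(lambda o: a(o) and b(o)) == p(a) * p(b)

print(independent(first_even, total_is_7))          # True
print(independent(first_even, total_at_least_10))   # False
print(conditional(total_at_least_10, first_even))   # 2/9
```

Knowing the first die is even raises the chance of a total of at least 10 from 1/6 to 2/9, so those two events are not independent.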

Topic 5.5 — Correlation: Types, Strength, Causation vs Association

Correlation describes the relationship between two quantitative variables. Positive correlation means both increase together; negative correlation means one increases as the other decreases. Strength is measured by a correlation coefficient r, which lies between -1 and +1. As a rough guide, |r| ≥ 0.6 is strong and 0.2 ≤ |r| < 0.6 is weak; values closer to 0 suggest little or no correlation.

💡 Key Takeaway

Correlation does NOT imply causation. Just because ice cream sales and drownings are correlated doesn't mean one causes the other; both are caused by hot weather (a lurking variable).

Distinguish between Interpolation (predicting within data range—reliable) and Extrapolation (predicting outside range—unreliable).

Topic 5.6 — Spearman's Rank Correlation Coefficient

Spearman's Rank (rs) measures how well a relationship can be described using a monotonic function—it uses ranks rather than raw values. This is useful for ordinal data or non-linear relationships. The formula is: rs = 1 − (6Σd²) / (n(n²−1)).

To calculate: Rank each variable, find the difference in ranks (d), square them (d²), and sum them (Σd²). If values are tied, assign the average of the ranks they would have occupied. A value near +1 indicates a strong positive ranking relationship.
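The ranking steps above can be sketched in code. The function names and the data are made up for this example, and rank 1 is given to the smallest value (ranking from largest to smallest works equally well, as long as both variables are ranked the same way).

```python
def average_ranks(values):
    """Rank 1 = smallest value; tied values share the average of the ranks they occupy."""
    ordered = sorted(values)
    ranks = {}
    for v in set(values):
        first_position = ordered.index(v) + 1
        tie_count = ordered.count(v)
        ranks[v] = first_position + (tie_count - 1) / 2
    return [ranks[v] for v in values]

def spearman(xs, ys):
    """rs = 1 - (6 * sum of d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

print(average_ranks([10, 20, 20, 30]))                         # [1.0, 2.5, 2.5, 4.0]
print(round(spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]), 3))    # 0.8
```

In the second example the rank differences are -1, 1, -1, 1 and 0, so Σd² = 4 and rs = 1 − 24/120 = 0.8.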

Topic 5.7 — Pearson's Product Moment Correlation Coefficient (PMCC)

Pearson's r measures the strength and direction of a linear relationship. It ranges from -1 to +1. Unlike Spearman's, Pearson's uses raw data values. A high Spearman's rank with a low Pearson's r suggests a strong relationship that is not linear (e.g., curved).
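For comparison with Spearman's, here is a minimal sketch of Pearson's r computed from raw values using the standard sums Sxy, Sxx and Syy. The function name and data are illustrative; in an exam you would normally be given r or use a calculator.

```python
from math import sqrt

def pearson_r(xs, ys):
    """r = Sxy / sqrt(Sxx * Syy), using deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
curved = [x ** 3 for x in xs]        # always increasing, but not a straight line
print(round(pearson_r(xs, curved), 2))   # roughly 0.94
```

The cubed data always increases, so its ranks agree perfectly and Spearman's rs would be 1, yet Pearson's r is below 1 because the points do not lie on a straight line.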

Topic 5.8 — Skewness: Identifying & Calculating

Skewness describes asymmetry. **Positive skew** has a long tail to the right (Mean > Median > Mode). **Negative skew** has a tail to the left (Mean < Median < Mode). For Higher tier, calculate skewness using: Skew = 3(Mean − Median) / Standard Deviation.
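The skewness formula can be checked on a small data set. This sketch assumes the population standard deviation (dividing by n) is the one intended; the data values are invented.

```python
from statistics import mean, median, pstdev

def pearson_skew(data):
    """Skew = 3(mean - median) / standard deviation (population SD assumed)."""
    return 3 * (mean(data) - median(data)) / pstdev(data)

right_tailed = [1, 2, 2, 3, 12]    # one large value drags the mean above the median
symmetric = [1, 2, 3, 4, 5]

print(pearson_skew(right_tailed) > 0)   # True: positive skew
print(pearson_skew(symmetric))          # 0.0
```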

Topic 5.9 — Seasonal & Cyclic Trends

Seasonal trends repeat at fixed intervals (e.g., quarterly). **Cyclic trends** repeat but without a fixed period (e.g., economic cycles). **Mean seasonal variation** is the average deviation from the trend line for each season, used to make adjusted predictions.
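As a rough sketch of how mean seasonal variation might be calculated, the code below averages (actual − trend) for each quarter. All figures, including the trend line values, are hypothetical, and the function name is an assumption for this example.

```python
def mean_seasonal_variation(actuals, trends, seasons):
    """Average of (actual - trend) for each season."""
    totals, counts = {}, {}
    for actual, trend, season in zip(actuals, trends, seasons):
        totals[season] = totals.get(season, 0) + (actual - trend)
        counts[season] = counts.get(season, 0) + 1
    return {season: totals[season] / counts[season] for season in totals}

# Two years of made-up quarterly sales with made-up trend line readings.
seasons = ["Q1", "Q2", "Q3", "Q4"] * 2
actuals = [50, 62, 70, 58, 54, 66, 74, 62]
trends = [58, 60, 62, 64, 66, 68, 70, 72]

variation = mean_seasonal_variation(actuals, trends, seasons)
print(variation["Q1"])        # -10.0

# Adjusted prediction for the next Q1, assuming the trend line reads 74 there.
print(74 + variation["Q1"])   # 64.0
```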

Topic 5.10 — Rates of Change & Index Numbers (RPI, CPI, GDP)

Index numbers compare values over time relative to a base year (Index = 100). Index = (Current / Base) × 100. **RPI** and **CPI** track inflation, while **GDP** tracks economic output. **Weighted Index numbers** account for items with different importance.
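A short sketch of both calculations follows. The simple index uses the formula above; the weighted index is taken here to be the weighted mean Σ(index × weight) ÷ Σ(weight), which is the usual definition but is an assumption as far as this text goes. Prices and weights are invented.

```python
def index_number(current, base):
    """Index = (Current / Base) x 100."""
    return current / base * 100

def weighted_index(indices, weights):
    """Weighted mean of index numbers: sum(index x weight) / sum(weights)."""
    return sum(i * w for i, w in zip(indices, weights)) / sum(weights)

# Hypothetical price rising from 120 (base year) to 150.
print(index_number(150, 120))                 # 125.0

# Hypothetical basket: food index 110 with weight 3, fuel index 120 with weight 1.
print(weighted_index([110, 120], [3, 1]))     # 112.5
```

An index of 125 means the value is 25% higher than in the base year.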

Topic 5.11 — Binomial Distribution

The binomial distribution models scenarios with a fixed number of trials (n), two possible outcomes (success/failure), a constant probability of success (p), and independent trials. You must be able to identify these conditions and calculate probabilities for small n (n ≤ 5) by considering combinations.
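"Considering combinations" means counting how many orders the successes can occur in, then multiplying by the probability of any one such order. A minimal sketch, with an illustrative function name and a coin-flip example chosen for this note:

```python
from math import comb

def binomial_probability(n, r, p):
    """P(exactly r successes in n trials) = C(n, r) x p^r x (1 - p)^(n - r)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# Exactly 2 heads in 4 flips of a fair coin: 6 arrangements, each with probability 1/16.
print(binomial_probability(4, 2, 0.5))   # 0.375
```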

Topic 5.12 — Normal Distribution & Quality Assurance

The **Normal Distribution** is a symmetrical, bell-shaped curve in which approximately 68% of data lies within 1σ of the mean, 95% within 2σ, and 99.7% within 3σ. In manufacturing quality control, **Warning lines** are set at μ ± 2σ and **Action lines** at μ ± 3σ. **z-scores** (standardised scores), calculated as z = (value − mean) ÷ standard deviation, allow comparison between different datasets.
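A z-score also tells you where a measurement sits relative to the warning and action lines. The sketch below assumes a hypothetical process with mean 500 and standard deviation 2; the function names and the labels returned are illustrative only.

```python
def z_score(value, mean, sd):
    """Number of standard deviations a value lies from the mean."""
    return (value - mean) / sd

def control_status(value, mean, sd):
    """Classify a measurement against the warning (2 sd) and action (3 sd) lines."""
    distance = abs(z_score(value, mean, sd))
    if distance > 3:
        return "action"     # beyond an action line: stop and investigate
    if distance > 2:
        return "warning"    # between the warning and action lines
    return "ok"

# Hypothetical process: target mean 500, standard deviation 2.
print(z_score(503, 500, 2))          # 1.5
print(control_status(503, 500, 2))   # ok
print(control_status(505, 500, 2))   # warning
print(control_status(493, 500, 2))   # action
```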

Topic 5.13 — Estimating Population & Petersen Capture-Recapture

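Inferential statistics uses sample data to estimate population parameters. The **Petersen Capture-Recapture** method estimates total population (N) using: N = (Sample 1 × Sample 2) ÷ Recaptured. This assumes that the population stays constant between the two samples (no births, deaths or migration), that marks are not lost, and that marked individuals mix randomly back into the population.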
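The capture-recapture formula is simple enough to sketch directly. The function name and the sample figures below are invented for illustration.

```python
def petersen_estimate(first_sample, second_sample, recaptured):
    """N = (Sample 1 x Sample 2) / Recaptured."""
    if recaptured == 0:
        raise ValueError("No marked individuals recaptured: the estimate is undefined.")
    return first_sample * second_sample / recaptured

# Hypothetical survey: 40 fish marked, then 50 caught later, of which 10 are marked.
print(petersen_estimate(40, 50, 10))   # 200.0
```

The reasoning is that 10 out of 50 (one fifth) of the second sample was marked, so the 40 marked fish should be about one fifth of the whole population.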

Frequently Asked Questions

What is the difference between theoretical and experimental probability?