Relative frequency is an empirical estimate of probability, written as f = s / n, where s is the number of successful outcomes and n is the total number of trials. It is used when the true probability is unknown or hard to derive from first principles. The estimate is data-driven, so its quality depends on how representative and large the sample is.
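A minimal sketch of the calculation, using made-up counts (37 successes in 100 trials, purely illustrative):

```python
# Relative frequency: f = s / n, with illustrative numbers.
successes = 37   # s: number of successful outcomes
trials = 100     # n: total number of trials

relative_frequency = successes / trials
print(relative_frequency)  # 0.37
```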
Expected frequency is a predicted count, written as E = p × n, where p is the probability of the event and n is the number of planned trials. It answers "how many times should this happen on average" rather than "what is the chance of one trial." This idea is central when moving from probability to real-world planning and interpretation.
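The prediction can be sketched the same way; the probability and trial count below are hypothetical:

```python
# Expected frequency: E = p * n, with hypothetical inputs.
p = 0.25             # probability of the event on one trial
planned_trials = 80  # n: number of planned trials

expected = p * planned_trials
print(expected)  # 20.0
```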
Observed frequency is what actually happened in an experiment, and it can differ from expected frequency due to natural random variation. Probability models describe long-run behavior, not guaranteed short-run outcomes. This distinction prevents the common mistake of treating expected values as exact outcomes.
Long-run stabilization explains why relative frequency becomes more reliable with more trials. As n increases, random fluctuations tend to average out, so s / n typically moves closer to the underlying probability. This is why large samples are preferred for inference.
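A short simulation can illustrate the stabilization; the success probability 0.3 and the checkpoints are assumptions chosen for the demo:

```python
import random

# Simulate trials with an assumed true probability of 0.3 and print
# the running relative frequency s / n at a few checkpoints.
random.seed(0)
p_true = 0.3
successes = 0
for n in range(1, 10_001):
    if random.random() < p_true:
        successes += 1
    if n in (10, 100, 1_000, 10_000):
        print(n, successes / n)
```

Typically the later estimates sit closer to 0.3 than the early ones, though any single run can still fluctuate.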
Randomness and independence are core assumptions behind valid frequency-based estimates. Random selection reduces systematic bias, and independence means one trial does not change the probability structure of the next. If these assumptions fail, relative frequency may converge to the wrong value.
Model-to-data connection links theory and experiment in two directions: estimate p from data using p ≈ s / n, or predict counts using E = p × n. This dual use allows both diagnostic tasks (is a process fair?) and predictive tasks (how many outcomes should we expect?).
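Both directions in one sketch; the counts are made up, and the fair-die probability 1/6 is only an example:

```python
# Diagnostic direction: estimate probability from observed counts.
s, n = 45, 90           # observed successes and trials (made up)
p_hat = s / n           # s / n, here 0.5

# Predictive direction: predict a count from a known probability.
p = 1 / 6               # e.g. rolling a specific face of a fair die
future_trials = 120
expected = p * future_trials  # p * n, about 20
print(p_hat, expected)
```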
Start with variable mapping by writing what each number represents before calculating. This prevents role confusion between successes, total trials, and future trials. It also makes unit checks easier because probability and frequency have different scales.
Use a reasonableness check after computing any result. For relative frequency, verify 0 ≤ s / n ≤ 1; for expected frequency, verify the value is between 0 and n and consistent with event likelihood. Extreme answers are often signs of swapped numerator and denominator.
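The checks can be encoded directly; these helper names are my own, not from the text:

```python
def check_relative_frequency(s, n):
    # Relative frequency must lie in [0, 1]; values outside that range
    # usually mean the numerator and denominator were swapped.
    f = s / n
    if not 0 <= f <= 1:
        raise ValueError(f"s/n = {f} is outside [0, 1]")
    return f

def check_expected_frequency(p, n):
    # Expected frequency must lie between 0 and n.
    e = p * n
    if not 0 <= e <= n:
        raise ValueError(f"p*n = {e} is outside [0, {n}]")
    return e

print(check_relative_frequency(37, 100))  # 0.37
print(check_expected_frequency(0.25, 80))  # 20.0
```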
Memorize the direction rule: use s / n to go from counts to probability, and use p × n to go from probability to counts.
Treating expected frequency as guaranteed is a frequent conceptual error. Expected values represent long-run averages, so actual outcomes can be above or below expectation in a single run. The correct interpretation is probabilistic, not deterministic.
Ignoring dependence between trials can invalidate relative frequency conclusions. If outcomes affect later trials and conditions are not reset, the probability model changes during the experiment. Estimates from such data may be biased even when calculations are numerically correct.
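One classic illustration of this failure is a Pólya urn, sketched below (the urn setup is my example, not from the text): each drawn ball is returned together with an extra ball of the same color, so every draw changes the probabilities for the next one.

```python
import random

# Pólya urn: return each drawn ball plus one more of its color.
# Trials are dependent, and the relative frequency s / n settles on a
# random limit rather than on the initial proportion of red (0.5).
random.seed(2)

def polya_run(n_draws):
    red, blue = 1, 1        # start with one ball of each color
    reds = 0
    for _ in range(n_draws):
        if random.random() < red / (red + blue):
            red += 1
            reds += 1
        else:
            blue += 1
    return reds / n_draws

# Identical setups, yet the long-run frequencies disagree.
print([round(polya_run(5_000), 2) for _ in range(5)])
```

Each run still produces a value that looks like a stable relative frequency, which is exactly why the bias is easy to miss.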
Overconfidence from small samples leads to unstable conclusions about fairness or bias. A sample can look extreme by chance when n is small, especially for low-probability events. Confidence improves when evidence is accumulated across many well-designed trials.
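A quick comparison of sample sizes makes the point; the fair-coin probability 0.5 and the sample sizes are assumptions for the demo:

```python
import random

# Estimate a fair coin's probability from small vs. large samples.
random.seed(3)

def estimate(n):
    return sum(random.random() < 0.5 for _ in range(n)) / n

small = [estimate(10) for _ in range(5)]      # n = 10: estimates can swing widely
large = [estimate(10_000) for _ in range(5)]  # n = 10,000: estimates cluster near 0.5
print(small)
print(large)
```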