Range: The simplest measure of spread, calculated as $\text{maximum} - \text{minimum}$. While easy to compute, it only considers the two most extreme values and ignores the distribution of the rest of the data.
Interquartile Range (IQR): The difference between the upper quartile ($Q_3$) and lower quartile ($Q_1$), so $\text{IQR} = Q_3 - Q_1$. It measures the spread of the middle 50% of the data, which makes it resistant to outliers.
Variance ($\sigma^2$): The average of the squared deviations from the mean. It provides a mathematical basis for spread but is measured in squared units of the original data.
Standard Deviation ($\sigma$): The square root of the variance. It is the most widely used measure of spread because it returns the variability to the same units as the original data.
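As a rough illustration of the range, variance and standard deviation described above, here is a minimal Python sketch. The function name `spread_summary` and the sample data are made up for this example, and the variance divides by $n$ (the population form given in these notes), not $n - 1$.

```python
import math

def spread_summary(data):
    """Return (range, variance, standard deviation) for a list of numbers.

    Assumption: variance is the mean of the squared deviations (divide by n).
    """
    n = len(data)
    mean = sum(data) / n
    data_range = max(data) - min(data)                     # maximum - minimum
    variance = sum((x - mean) ** 2 for x in data) / n      # squared units
    std_dev = math.sqrt(variance)                          # original units
    return data_range, variance, std_dev

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data only
print(spread_summary(sample))        # (7, 4.0, 2.0)
```

Note how the variance (4.0) is in squared units, while the standard deviation (2.0) is back in the units of the data.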
Quartiles: These divide a data set into four equal parts. $Q_1$ (Lower) marks the 25th percentile, $Q_2$ (Median) the 50th, and $Q_3$ (Upper) the 75th.
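Calculation Logic: For an ordered data set of size $n$, the position of $Q_1$ is found by $\frac{n}{4}$ (and the position of $Q_3$ by $\frac{3n}{4}$). If this is an integer, the quartile is the midpoint of the values at that position and the next; if not, round up to the next integer position and take the value there.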
Percentiles: These divide data into 100 equal parts. For example, the 90th percentile is the value below which 90% of the observations fall.
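The quartile position rule above can be sketched in Python as follows. This is only one convention (the one described in these notes); the helper name `quartile` and the sample data are hypothetical, and library functions such as `statistics.quantiles` use different interpolation rules and may return different values.

```python
import math

def quartile(data, k):
    """k-th quartile (k = 1, 2 or 3) using the positional rule from these notes."""
    values = sorted(data)
    n = len(values)
    if (k * n) % 4 == 0:
        # Position k*n/4 is a whole number: take the midpoint of the value
        # at that position and the next (positions are 1-based).
        p = (k * n) // 4
        return (values[p - 1] + values[p]) / 2
    # Otherwise round the position up and take the value there.
    p = math.ceil(k * n / 4)
    return values[p - 1]

marks = [7, 9, 12, 13, 15, 18, 20, 22, 25, 30]   # illustrative data, n = 10
q1, q2, q3 = quartile(marks, 1), quartile(marks, 2), quartile(marks, 3)
print(q1, q2, q3, q3 - q1)   # 12 16.5 22 10
```

Here $\frac{n}{4} = 2.5$ rounds up to the 3rd value (12), $\frac{2n}{4} = 5$ is an integer so the median is the midpoint of the 5th and 6th values (16.5), and $\frac{3n}{4} = 7.5$ rounds up to the 8th value (22).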
Summation: The symbol $\Sigma$ (Sigma) denotes the sum of a sequence. $\sum x$ means the sum of all data values.
Mean Formula: $\bar{x} = \frac{\sum x}{n}$ for raw data, or $\bar{x} = \frac{\sum fx}{\sum f}$ for frequency tables.
Variance Formula: The computational version is often easier: $\sigma^2 = \frac{\sum x^2}{n} - \bar{x}^2$. This is described as 'the mean of the squares minus the square of the mean'.
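To see that 'the mean of the squares minus the square of the mean' gives the same answer as averaging the squared deviations, here is a small Python check, together with the frequency-table mean. The function names and the numbers are invented for illustration.

```python
import math

def variance_definition(data):
    """Mean of the squared deviations from the mean."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n

def variance_computational(data):
    """Mean of the squares minus the square of the mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(x * x for x in data) / n - mean ** 2

def frequency_mean(values, frequencies):
    """Mean from a frequency table: sum of f*x divided by sum of f."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    return total_fx / sum(frequencies)

scores = [3, 5, 6, 8, 8]   # illustrative data only
# The two versions agree up to floating-point rounding.
assert math.isclose(variance_definition(scores), variance_computational(scores))
print(round(variance_computational(scores), 4))   # 3.6

# Value 1 occurs twice, 2 occurs five times, 3 occurs three times.
print(frequency_mean([1, 2, 3], [2, 5, 3]))       # 2.1
```

The comparison uses `math.isclose` because the two formulas can differ in the last few decimal places on a computer, even though they are algebraically identical.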
Check the Units: Always ensure your final answer for standard deviation has the same units as the mean, while variance should have squared units.
Grouped Data Estimates: When calculating the mean from grouped data, you MUST use the midpoint of each class. Remember that this result is only an estimate because the exact values within the classes are unknown.
Rounding Errors: Carry at least 4 decimal places through intermediate calculations (like the mean $\bar{x}$) to avoid significant rounding errors in the final variance or standard deviation.
Sanity Check: Always verify that your calculated mean and median lie between the minimum and maximum values of the data. If your mean is higher than your maximum value (or lower than your minimum), a calculation error has occurred.
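The grouped-data estimate and the sanity check can be combined in a short Python sketch. The class boundaries, frequencies and the name `grouped_mean_estimate` are assumptions made up for this example.

```python
def grouped_mean_estimate(classes, frequencies):
    """Estimate the mean from (lower, upper) class intervals and frequencies.

    Each class is represented by its midpoint, so the result is an estimate:
    the exact values inside each class are unknown.
    """
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    total_fx = sum(f * m for f, m in zip(frequencies, midpoints))
    return total_fx / sum(frequencies)

classes = [(0, 10), (10, 20), (20, 30)]   # illustrative class intervals
frequencies = [4, 10, 6]

estimate = grouped_mean_estimate(classes, frequencies)

# Sanity check: the mean must lie between the lowest and highest boundary.
lowest, highest = classes[0][0], classes[-1][1]
assert lowest <= estimate <= highest, "mean outside the data: check the working"

print(estimate)   # 16.0
```

The midpoints are 5, 15 and 25, so the estimate is $\frac{4(5) + 10(15) + 6(25)}{20} = \frac{320}{20} = 16$.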