Quartiles: These are the three values that divide a sorted data set into four equal parts, each containing 25% of the observations. The first quartile ($Q_1$) marks the 25th percentile, the second quartile ($Q_2$) is the median or 50th percentile, and the third quartile ($Q_3$) marks the 75th percentile.
Percentiles: A percentile is the value at or below which a given percentage of the observations in a data set falls. For example, the 90th percentile ($P_{90}$) is the value such that 90% of the data points are less than or equal to it.
Interquartile Range (IQR): This is the difference between the third and first quartiles ($\mathrm{IQR} = Q_3 - Q_1$). It measures the spread of the middle 50% of the data. Because it ignores the lowest and highest quarters, it describes the variability of the core of the distribution without being distorted by extreme values.
Data Preparation: The first and most critical step is to arrange the data in ascending order. Calculations performed on unsorted data will result in incorrect positional values.
Finding the Index: To find the position $i$ of the $k$-th percentile in a sorted data set of size $n$, use a position formula such as $i = \frac{k}{100}(n + 1)$ (conventions vary between textbooks). If $i$ is a whole number, the percentile is the value at that position; if $i$ is a fraction, interpolation between the two surrounding values is often required.
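The position-and-interpolate procedure can be sketched in Python. This is a minimal illustration, not a definitive method: it assumes the position formula $i = \frac{k}{100}(n + 1)$, which is only one of several textbook conventions, and the function name `percentile` is made up for this example.

```python
def percentile(data, k):
    """Return the k-th percentile, assuming position i = k/100 * (n + 1)."""
    values = sorted(data)                # always sort first
    n = len(values)
    if n == 0:
        raise ValueError("data must not be empty")
    pos = k * (n + 1) / 100              # 1-based position in the sorted list
    pos = min(max(pos, 1), n)            # clamp positions that fall off either end
    lower = int(pos)                     # whole part: position of the lower neighbour
    frac = pos - lower                   # fractional part: weight toward the upper neighbour
    if frac == 0:
        return values[lower - 1]         # whole-number position: take that value directly
    # fractional position: weighted average of the two surrounding values
    return values[lower - 1] + frac * (values[lower] - values[lower - 1])


data = [40, 15, 50, 20, 35]              # deliberately unsorted
print(percentile(data, 25))              # 17.5 (position 1.5, halfway between 15 and 20)
print(percentile(data, 50))              # 35   (position 3, the median)
print(percentile(data, 75))              # 45.0 (position 4.5, halfway between 40 and 50)
```

Software packages may use a different position formula, so their results can differ slightly from this sketch on small data sets.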
The 1.5 x IQR Rule: To identify potential outliers, calculate the lower bound as $Q_1 - 1.5 \times \mathrm{IQR}$ and the upper bound as $Q_3 + 1.5 \times \mathrm{IQR}$. Any data point falling outside these bounds is flagged as a potential outlier.
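As a rough illustration of the 1.5 x IQR rule, the sketch below computes both bounds and lists the values outside them. It relies on Python's standard `statistics.quantiles` with its default settings, which is an assumption about the quartile convention; the helper name `iqr_outliers` and the sample data are hypothetical.

```python
import statistics


def iqr_outliers(data):
    """Return (lower_bound, upper_bound, outliers) using the 1.5 x IQR rule."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_bound or x > upper_bound]
    return lower_bound, upper_bound, outliers


lower, upper, flagged = iqr_outliers([2, 4, 4, 5, 6, 7, 8, 30])
print(lower, upper, flagged)             # -1.625 13.375 [30]
```

Here $Q_1 = 4$ and $Q_3 = 7.75$, so the IQR is 3.75 and only the value 30 falls outside the bounds. A different quartile convention would shift the bounds slightly.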
| Feature | Quartiles | Percentiles |
|---|---|---|
| Number of Divisions | 4 equal parts | 100 equal parts |
| Common Symbols | $Q_1$, $Q_2$, $Q_3$ | $P_1$, $P_2$, ..., $P_{99}$ |
| Primary Use | Box plots and basic spread | Standardized testing and growth charts |
| Relationship | $Q_1 = P_{25}$, $Q_2 = P_{50}$, $Q_3 = P_{75}$ | $P_{25} = Q_1$, $P_{50} = Q_2$, $P_{75} = Q_3$ |
Always Sort First: A common error in exams is calculating quartiles directly from a provided list without sorting it from smallest to largest. Always double-check that your sequence is in ascending order (repeated values are fine) before identifying positions.
Verify the Median: Before finding $Q_1$ and $Q_3$, always find the median ($Q_2$) first. $Q_1$ is effectively the median of the lower half of the data, and $Q_3$ is the median of the upper half; this hierarchical approach prevents calculation errors.
Check for Even/Odd $n$: Be mindful of whether the data set has an even or odd number of values. If $n$ is odd, the median is a specific data point; if $n$ is even, the median is the average of the two middle points, which affects how you define the 'lower' and 'upper' halves.
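Inclusive vs. Exclusive Median: There is often confusion about whether to include the median in the lower and upper halves when $n$ is odd. Many textbook methods exclude the median when calculating $Q_1$ and $Q_3$ so that the two halves are equal in size and contain no shared value; other methods, such as Tukey's hinges, include it in both halves. Check which convention your course or exam board uses.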
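The median-of-halves approach, including the inclusive versus exclusive choice for odd $n$, can be sketched as follows. The function name `quartiles_by_halves` and its `include_median` flag are hypothetical names chosen for this illustration, and the default (exclusive) is an assumption rather than a universal standard.

```python
from statistics import median


def quartiles_by_halves(data, include_median=False):
    """Return (Q1, Q2, Q3) by taking medians of the lower and upper halves."""
    values = sorted(data)                # always sort first
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    q2 = median(values)
    if n % 2 == 0:
        # even n: the data splits cleanly into two halves
        lower, upper = values[:half], values[half:]
    elif include_median:
        # odd n, inclusive: the middle value belongs to both halves
        lower, upper = values[:half + 1], values[half:]
    else:
        # odd n, exclusive: the middle value belongs to neither half
        lower, upper = values[:half], values[half + 1:]
    return median(lower), q2, median(upper)


data = [1, 3, 5, 7, 9, 11, 13]
print(quartiles_by_halves(data))                       # (3, 7, 11)
print(quartiles_by_halves(data, include_median=True))  # (4.0, 7, 10.0)
```

The same seven values give different quartiles under the two conventions, which is why it matters to state which one is in use.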
Confusing Percentile with Percentage: A student scoring in the 90th percentile does not necessarily have a score of 90% on the exam. It simply means they performed as well as or better than 90% of the test-takers, regardless of the actual raw score.
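A small sketch can make the distinction concrete. The scores below are invented for illustration, and `percentile_rank` is a hypothetical helper that uses the "less than or equal to" definition of a percentile.

```python
def percentile_rank(scores, score):
    """Percentage of scores that are less than or equal to `score`."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


# Hypothetical raw marks (out of 100) on a difficult exam
raw_scores = [31, 38, 42, 45, 47, 50, 52, 55, 58, 62]
print(percentile_rank(raw_scores, 58))   # 90.0
```

In this made-up class, a raw score of only 58% sits at the 90th percentile because nine of the ten scores are at or below it.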
Interpolation Errors: When the calculated position is not an integer, students often round to the nearest whole number. In rigorous statistics, you should take a weighted average of the values at the two surrounding positions.
Box-and-Whisker Plots: These visual tools are built entirely on the five-number summary: Minimum, $Q_1$, Median, $Q_3$, and Maximum. They allow for immediate visual comparison of the center and spread of multiple data sets.
Standardized Scores: Percentiles are the foundation of reporting for exams like the SAT or GRE, where raw scores are less meaningful than a candidate's standing relative to the global pool of applicants.
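Skewness Detection: By comparing the distance between $Q_1$ and $Q_2$ to the distance between $Q_2$ and $Q_3$, one can determine if a distribution is positively skewed (right-skewed) or negatively skewed (left-skewed). If $Q_3 - Q_2$ is noticeably larger than $Q_2 - Q_1$, the distribution is positively skewed; if it is noticeably smaller, the distribution is negatively skewed.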
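One way to turn this quartile comparison into a single number is the quartile (Bowley) skewness coefficient, $\frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1}$. It is not named in the notes above and is offered here only as an illustrative sketch; the function name `quartile_skew` is hypothetical, and the quartiles come from Python's `statistics.quantiles` with its default convention.

```python
import statistics


def quartile_skew(data):
    """Quartile (Bowley) skewness: positive = right-skewed, negative = left-skewed."""
    q1, q2, q3 = statistics.quantiles(data, n=4)
    if q3 == q1:
        return 0.0                       # no spread in the middle half, so no skew to measure
    upper_gap = q3 - q2                  # distance from the median up to Q3
    lower_gap = q2 - q1                  # distance from Q1 up to the median
    return (upper_gap - lower_gap) / (q3 - q1)


print(quartile_skew([1, 2, 3, 4, 5, 6, 7]))     # 0.0 (symmetric)
print(quartile_skew([1, 2, 3, 4, 5, 10, 20]))   # 0.5 (right-skewed)
```

The result always lies between -1 and 1, with values near 0 suggesting a roughly symmetric middle half.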