Percentiles are values that divide a data set into 100 equal parts. The $p$-th percentile ($P_p$) is a value such that at least $p$% of the observations are less than or equal to it, and at least $(100 - p)$% are greater than or equal to it.
Quartiles are specific percentiles that divide the data into four equal quarters. There are three quartiles: the first quartile ($Q_1$ or 25th percentile), the second quartile ($Q_2$, the median or 50th percentile), and the third quartile ($Q_3$ or 75th percentile).
Rank and Position: Unlike the mean, which is a measure of center based on value magnitude, quartiles and percentiles are based on the rank or position of values within an ordered sequence.
Step 1: Sort the Data: Arrange all observations in ascending order from smallest to largest.
Step 2: Calculate the Index ($i$): To find the $p$-th percentile of $n$ values, calculate $i = \frac{p}{100} \times n$.
Step 3: Determine the Value: If $i$ is not a whole number, round up to the next integer and find the value at that position. If $i$ is a whole number, the percentile is the average of the value at position $i$ and the value at position $i + 1$.
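The three steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `percentile` and the sample data are invented for the example, and the sketch assumes `p` is an integer strictly between 0 and 100 so that the whole-number check on the index can use exact integer arithmetic.

```python
def percentile(data, p):
    """p-th percentile via the index rule i = (p / 100) * n (hypothetical helper)."""
    if not data:
        raise ValueError("data must contain at least one value")
    if not isinstance(p, int) or not 0 < p < 100:
        raise ValueError("this sketch assumes an integer p with 0 < p < 100")

    ordered = sorted(data)                  # Step 1: ascending order
    n = len(ordered)

    # Step 2: i = p * n / 100, kept as quotient and remainder so the
    # whole-number check is exact (no floating-point rounding).
    whole, remainder = divmod(p * n, 100)

    # Step 3: positions in the notes are 1-based, Python lists are 0-based.
    if remainder:
        # i is not whole: round up to position whole + 1, i.e. list index `whole`
        return ordered[whole]
    # i is whole: average the values at positions i and i + 1
    return (ordered[whole - 1] + ordered[whole]) / 2


scores = [55, 15, 70, 40, 20, 60, 35, 50]   # deliberately unsorted sample data
print(percentile(scores, 25))   # 27.5
print(percentile(scores, 90))   # 70
```

For the 25th percentile the index is exactly 2, so the result averages the 2nd and 3rd sorted values (20 and 35). For the 90th percentile the index is 7.2, which rounds up to the 8th position.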
Interquartile Range (IQR): Calculate the spread of the middle 50% of the data using the formula $\text{IQR} = Q_3 - Q_1$.
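As a hedged worked example (the data set is invented for illustration), take the eight sorted values 15, 20, 35, 40, 50, 55, 60, 70 and use the index rule $i = \frac{p}{100} \times n$. Both indices come out as whole numbers, so each quartile is the average of two neighboring values:

$$i_{Q_1} = \frac{25}{100} \times 8 = 2 \;\Rightarrow\; Q_1 = \frac{20 + 35}{2} = 27.5$$

$$i_{Q_3} = \frac{75}{100} \times 8 = 6 \;\Rightarrow\; Q_3 = \frac{55 + 60}{2} = 57.5$$

$$\text{IQR} = Q_3 - Q_1 = 57.5 - 27.5 = 30$$

Under this convention the middle half of the data spans 30 units. A different index convention may give slightly different quartiles and therefore a different IQR.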
| Feature | Quartiles | Percentiles |
|---|---|---|
| Divisions | 4 equal parts | 100 equal parts |
| Notation | $Q_1$, $Q_2$, $Q_3$ | $P_1$, $P_2$, ..., $P_{99}$ |
| Granularity | Broad (25% chunks) | Fine (1% chunks) |
| Common Use | Box plots and basic spread | Standardized testing and growth charts |
Always Sort First: The most common error in exams is calculating position based on the raw, unsorted list. Always double-check that your list is in ascending order.
The 'n+1' Variation: Be aware that some textbooks and software use the formula $i = \frac{p}{100} \times (n + 1)$ for the index. Always check which convention your specific curriculum requires.
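To see how much the convention can matter, here is a small sketch of one common reading of the $(n + 1)$ rule, in which a fractional index is resolved by linear interpolation between the two neighboring values. The function name `percentile_n_plus_1` and the data are assumptions made for this example, and your textbook may round instead of interpolating. Python's standard `statistics.quantiles` with `method="exclusive"` is used as a cross-check for the quartiles.

```python
import math
import statistics


def percentile_n_plus_1(data, p):
    """p-th percentile using position (p / 100) * (n + 1) with linear interpolation."""
    ordered = sorted(data)
    n = len(ordered)
    pos = p * (n + 1) / 100          # 1-based position, possibly fractional
    pos = min(max(pos, 1), n)        # clamp so the position stays inside the list
    lower = math.floor(pos)          # 1-based position of the lower neighbor
    frac = pos - lower               # how far to move toward the upper neighbor
    if lower == n:                   # already at the largest value
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


data = [15, 20, 35, 40, 50, 55, 60, 70]
quartiles = [percentile_n_plus_1(data, p) for p in (25, 50, 75)]
library_quartiles = statistics.quantiles(data, n=4, method="exclusive")
print(quartiles)           # [23.75, 45.0, 58.75]
print(library_quartiles)
```

With the same eight values, the index rule $i = \frac{p}{100} \times n$ gives $Q_1 = 27.5$ and $Q_3 = 57.5$, while this variation gives 23.75 and 58.75. The median is 45 either way.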
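Sanity Check: In a box plot, the 'whiskers' usually extend to the minimum and maximum values, but only if those values are not outliers. If a value lies more than $1.5 \times \text{IQR}$ below $Q_1$ or above $Q_3$, it should be marked as an outlier, and the whisker stops at the most extreme value that is not an outlier.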
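The outlier check can be sketched as follows. The helper names `outlier_fences` and `split_outliers` are hypothetical, the data is invented, and the quartiles are passed in as arguments so that the sketch works with whichever quartile convention your curriculum uses.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the lower and upper fences Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(data, q1, q3):
    """Separate values inside the fences from values flagged as outliers."""
    low, high = outlier_fences(q1, q3)
    inside = [x for x in data if low <= x <= high]
    outliers = [x for x in data if x < low or x > high]
    return inside, outliers


data = [15, 20, 35, 40, 50, 55, 60, 120]
# Quartiles from the index rule i = (p / 100) * n: Q1 = 27.5, Q3 = 57.5
inside, outliers = split_outliers(data, q1=27.5, q3=57.5)
print(outlier_fences(27.5, 57.5))   # (-17.5, 102.5)
print(outliers)                     # [120]
print(min(inside), max(inside))     # 15 60  (where the whiskers end)
```

Here the maximum value 120 lies above the upper fence of 102.5, so the upper whisker ends at 60 and 120 is plotted as a separate point.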
Interpret the Context: If a student is in the 90th percentile, it means they performed as well as or better than 90% of their peers, not that they scored 90% on the test.
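The difference between a raw score and a percentile position can be illustrated with a short sketch. The function name `percentile_rank` and the class scores are made up for this example, and the sketch uses one common definition (the share of scores at or below the given score); other definitions count only scores strictly below, or split ties.

```python
def percentile_rank(scores, score):
    """Percentage of scores that are less than or equal to `score`."""
    if not scores:
        raise ValueError("scores must contain at least one value")
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


class_scores = [52, 61, 67, 70, 74, 78, 81, 85, 88, 95]
print(percentile_rank(class_scores, 88))   # 90.0
```

A student with a raw score of 88 did as well as or better than 9 of the 10 scores in the list, which places them at the 90th percentile even though they did not score 90 on the test.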
Confusing Index with Value: Students often find the index (e.g., the 5th position) and provide '5' as the answer instead of the actual data value located at that 5th position.
Misinterpreting 0th and 100th: Technically, the 0th percentile is the minimum value and the 100th percentile is the maximum value, though these are rarely used in formal reporting.
Equal Values: When multiple data points have the same value (ties), the calculation of percentiles can become complex; standard methods usually treat each occurrence as a unique position in the sorted list.