How is finding the median from a cumulative frequency diagram different from finding the median from a raw ordered list?

On a cumulative frequency diagram, you first use the position $\frac{n}{2}$ on the cumulative frequency axis and then read across to the curve and down to the data axis. With a raw ordered list, you work directly with the middle value or values in the list, so the procedure is based on exact data rather than a graph-based estimate.

What is the difference between reading a value from a cumulative frequency diagram and reading a number of observations from it?

Reading a value means starting from a cumulative frequency position such as the median or a percentile and using the curve to estimate the corresponding horizontal-axis value. Reading a number of observations means starting from a horizontal-axis value and using the curve to estimate how many data values are at or below it.

How do quartiles differ from percentiles on a cumulative frequency diagram?

Quartiles divide the data into four equal parts, while percentiles divide it into 100 equal parts. The lower quartile, median, and upper quartile are just the 25th, 50th, and 75th percentiles, so percentiles are the more general framework.

What mistake happens if you use the x-axis instead of the y-axis to locate the median position?

You would be treating a data value as though it were a cumulative frequency position, which mixes up two different axes. The median position belongs on the cumulative frequency axis because it represents how many values have been counted, not the size of the values themselves.

Why is it wrong to say that the cumulative frequency read at $x = a$ is the number of values exactly equal to $a$?

A cumulative frequency counts all observations up to and including that value, not just those equal to it. The graph represents running totals, so the reading at $x = a$ includes everything smaller as well.

What common error occurs when finding how many values are above a threshold?

Students often read the cumulative frequency at the threshold and stop, even though that only gives the number at or below the threshold. To find how many are above it, you must subtract that reading from the total number of observations.

What does a cumulative frequency diagram represent?

It represents the running total of observations against the variable being measured. Each point shows how many data values are at or below a given horizontal-axis value, which allows positions and thresholds to be interpreted visually.

What is the formula for the position of the $p$th percentile?

The position of the $p$th percentile is $\frac{np}{100}$, where $n$ is the total number of observations and $p$ is the percentile number. This gives the location in the ordered data, which you then convert into an estimated value using the graph.

What positions are used for the lower quartile, median, and upper quartile?

The lower quartile is at $\frac{n}{4}$, the median is at $\frac{n}{2}$, and the upper quartile is at $\frac{3n}{4}$. These positions split the ordered data into four equal parts, so the graph can be used to estimate the associated data values.

Why are answers from a cumulative frequency diagram usually estimates?

The graph is built from grouped data rather than exact individual observations, so some detail is lost before the graph is drawn. The smooth curve also assumes the data are spread continuously within intervals, which means readings can only approximate the original values.

Interpreting Cumulative Frequency Diagrams

Summary

Interpreting a cumulative frequency diagram means using a graph of running totals to estimate positions and values within grouped continuous data. The key idea is that the vertical axis shows how many data values are at or below a given horizontal-axis value, so quartiles, medians, percentiles, and counts above or below a threshold can be read by moving horizontally and vertically between the curve and the axes. Because the graph is built from grouped data and a smooth curve, all answers are estimates rather than exact raw-data values.

1. Definition & Core Concepts

Cumulative frequency is a running total, so each point on the graph shows how many data values are less than or equal to a particular value on the horizontal axis. This makes the curve useful for answering questions about position in a data set, not just total frequency. It is especially appropriate for grouped continuous data, where exact raw values are not individually listed.
A cumulative frequency diagram plots data values on the horizontal axis and cumulative frequency on the vertical axis. The graph rises as you move right because the total counted so far can stay the same or increase, but it cannot decrease. This is why the curve should never turn downward.
The total number of data values, usually written as $n$, is essential when interpreting the graph. It is found from the highest cumulative frequency reached by the curve, usually near the top-right end of the graph. Once $n$ is known, positions such as the median, quartiles, and percentiles can be calculated.
The graph gives estimates, not exact answers, because it represents grouped data and uses a smooth curve between class boundaries. This assumes the data are spread reasonably smoothly within each interval. As a result, readings depend on graph accuracy and careful scale reading.
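The running-total idea above can be sketched in a few lines of Python. The class boundaries and frequencies here are hypothetical, chosen only to illustrate how cumulative frequencies arise from a grouped frequency table and why they never decrease.

```python
from itertools import accumulate

# Hypothetical grouped data: upper class boundaries and class frequencies.
upper_bounds = [10, 20, 30, 40, 50]
frequencies = [4, 11, 25, 14, 6]

# Running total: each entry counts the values at or below that boundary.
cumulative = list(accumulate(frequencies))

# The total n is the highest cumulative frequency reached.
n = cumulative[-1]

# A running total can stay level or rise, but it can never fall.
is_non_decreasing = all(a <= b for a, b in zip(cumulative, cumulative[1:]))

print(list(zip(upper_bounds, cumulative)))
print(n)                  # 60
print(is_non_decreasing)  # True
```

The pairs of upper boundary and cumulative frequency are the points that would be plotted and joined to form the diagram.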

A cumulative frequency curve with a horizontal line from a frequency position to the curve and a vertical drop to the data axis, showing how a median or percentile is read.

2. Underlying Principles

The diagram works because the vertical coordinate at a given horizontal value represents the number of observations up to that value. If the graph shows a cumulative frequency of 40 at $x = 12$, that means about 40 data values are at or below 12. This turns questions about position in the ordered data into graph-reading tasks.
The median is the halfway point in the data set, so on a cumulative frequency graph it is found at position $\frac{n}{2}$. This works because cumulative frequency counts how many values have been accumulated in order from smallest upward. Reading across from $\frac{n}{2}$ and then down gives the approximate middle data value.
The lower quartile and upper quartile split the data into four equal parts, so their positions are $\frac{n}{4}$ and $\frac{3n}{4}$. These positions identify the 25th percentile and 75th percentile of the ordered data. The graph then converts those positions into approximate data values on the horizontal axis.
A percentile generalizes this idea by dividing the data into 100 equal parts. The position of the $p$th percentile is $\frac{np}{100}$, where $n$ is the total number of data values and $p$ is the percentile number. This is useful when a question asks for values such as the 10th, 60th, or 90th percentile.
Counts below, above, or between values can be estimated directly from the graph because cumulative frequency is cumulative by design. Reading the curve at a threshold gives the estimated number at or below that threshold, and subtracting from $n$ gives the number above it. Similarly, subtracting two cumulative frequencies estimates how many values lie between two horizontal-axis values.
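As a worked illustration of these positions, assume a hypothetical diagram whose curve tops out at $n = 80$ (this total is an assumption for the example, not a value taken from any particular graph):

$$\frac{n}{4} = 20, \qquad \frac{n}{2} = 40, \qquad \frac{3n}{4} = 60, \qquad \frac{90n}{100} = 72$$

Under that assumption, a cumulative frequency of 40 at $x = 12$ would place the median at about 12, and the number of values above 12 would be about $80 - 40 = 40$.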

A cumulative frequency curve showing horizontal lines from n over 4, n over 2, and 3n over 4 to the curve, then vertical drops to the lower quartile, median, and upper quartile values.

3. Methods & Techniques

Reading the median, quartiles, and percentiles

Step 1: find the total number of values, $n$, from the highest cumulative frequency reached by the curve, which is not necessarily the top of the printed scale. This matters because all positional measures depend on how many observations are represented. If you use the wrong total, every later reading will be wrong.
Step 2: calculate the position you need. Use $\frac{n}{2}$ for the median, $\frac{n}{4}$ for the lower quartile, $\frac{3n}{4}$ for the upper quartile, and $\frac{np}{100}$ for the $p$th percentile. These formulas give the location within the ordered data, not the final data value itself.
Step 3: draw across and down. Start at the required position on the cumulative frequency axis, move horizontally until you meet the curve, then move vertically down to the horizontal axis. The reading on the horizontal axis is the estimated median, quartile, or percentile value.

Key positional formulas: median position $= \frac{n}{2}$, lower quartile position $= \frac{n}{4}$, upper quartile position $= \frac{3n}{4}$, $p$th percentile position $= \frac{np}{100}$

Step 4: interpret the answer in context. If the horizontal axis measures time, length, mass, or another quantity, your answer must use those units. This helps distinguish between a position in the list and an actual measured value.
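The four steps can be mimicked numerically. The sketch below is only an approximation of reading a drawn graph: it assumes straight lines between plotted points in place of a smooth curve, and the points themselves and the function name `value_at_position` are hypothetical.

```python
def value_at_position(points, position):
    """Estimate the data value at which the cumulative frequency equals `position`.

    `points` holds (data value, cumulative frequency) pairs in increasing order.
    Straight lines between neighbouring points stand in for the smooth curve.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= position <= cf1 and cf1 > cf0:
            fraction = (position - cf0) / (cf1 - cf0)
            return x0 + fraction * (x1 - x0)  # "across" to the curve, then "down"
    raise ValueError("position lies outside the plotted range")


# Hypothetical plotted points, starting from the lowest class boundary.
points = [(0, 0), (10, 4), (20, 15), (30, 40), (40, 54), (50, 60)]

n = points[-1][1]  # Step 1: total from the highest point reached by the curve

# Steps 2 and 3: compute each position, then convert it into a data value.
median = value_at_position(points, n / 2)                 # position 30, about 26
lower_quartile = value_at_position(points, n / 4)         # position 15, about 20
upper_quartile = value_at_position(points, 3 * n / 4)     # position 45, about 33.6
percentile_90 = value_at_position(points, n * 90 / 100)   # position 54, about 40

print(median, lower_quartile, upper_quartile, percentile_90)
```

Step 4 still applies to the printed numbers: each is an estimate in the units of the horizontal axis, and a reading from a smooth hand-drawn curve may differ slightly from this straight-line version.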

Reading numbers below, above, or between values

To estimate the number at or below a given value, start on the horizontal axis at that value, move vertically up to the curve, then horizontally to the cumulative frequency axis. The number you read is the estimated cumulative frequency up to that point. This method answers questions about how many observations do not exceed a threshold.
To estimate the number above a given value, first find the cumulative frequency up to that value, then subtract from the total: $\text{number above} = n - \text{cumulative frequency at the value}$. This works because cumulative frequency already counts everything up to the threshold. The remaining observations must therefore lie above it.
To estimate the number between two values, read the cumulative frequency at each value and subtract the smaller from the larger. This is because cumulative totals include all earlier values, so subtraction isolates the part between the two cutoffs. It is often the quickest way to answer interval-count questions.
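The three count readings above can be sketched the same way. This is a minimal illustration under stated assumptions: the plotted points are hypothetical, the helper name `cumulative_at` is invented for the example, and straight lines between points replace the smooth curve.

```python
def cumulative_at(points, value):
    """Estimate the cumulative frequency at or below `value`.

    `points` holds (data value, cumulative frequency) pairs in increasing order.
    Straight lines between neighbouring points stand in for the smooth curve.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            fraction = (value - x0) / (x1 - x0)
            return cf0 + fraction * (cf1 - cf0)  # "up" to the curve, then "across"
    raise ValueError("value lies outside the plotted range")


# Hypothetical plotted points, starting from the lowest class boundary.
points = [(0, 0), (10, 4), (20, 15), (30, 40), (40, 54), (50, 60)]
n = points[-1][1]

at_or_below_25 = cumulative_at(points, 25)   # reading at the threshold
above_25 = n - at_or_below_25                # subtract from the total
between_15_and_35 = cumulative_at(points, 35) - cumulative_at(points, 15)

print(at_or_below_25)     # 27.5
print(above_25)           # 32.5
print(between_15_and_35)  # 37.5
```

In an exam answer these would be rounded sensibly and described as estimates, for example "about 28 values are at or below 25".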

4. Key Distinctions

Value versus position

A position tells you where a data point sits within the ordered list, while a value is the actual measurement read from the horizontal axis. For example, $\frac{n}{2}$ identifies the median position, but you still need the graph to convert that position into an estimated data value. Confusing these two ideas is one of the most common sources of error.

At or below versus above

A cumulative frequency reading gives the number at or below a value, not the number above it. To find how many are above a cutoff, you must subtract the cumulative reading from the total $n$. This distinction matters because many exam questions are worded using phrases like "more than" or "beyond" rather than "up to".

Exact data versus estimated data

When using a cumulative frequency diagram, answers are estimates because the original data have been grouped and then smoothed. By contrast, if you had the full raw data listed in order, quartiles and medians could be found exactly from positions in the list. The graph is therefore a practical approximation tool rather than an exact record.

Quartiles versus percentiles

Quartiles divide the data into four equal parts, while percentiles divide it into 100 equal parts. The lower quartile is the 25th percentile, the median is the 50th percentile, and the upper quartile is the 75th percentile. Seeing quartiles as special cases of percentiles makes interpretation more flexible.

Measure	Position formula	Meaning
Lower quartile	$\frac{n}{4}$	About 25% of values are at or below this
Median	$\frac{n}{2}$	About 50% of values are at or below this
Upper quartile	$\frac{3n}{4}$	About 75% of values are at or below this
$p$th percentile	$\frac{np}{100}$	About $p\%$ of values are at or below this
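The relationship in the table can be checked with a short sketch. The function name `percentile_position` and the total $n = 60$ are illustrative assumptions.

```python
def percentile_position(n, p):
    """Position of the p-th percentile on the cumulative frequency axis."""
    return n * p / 100


n = 60  # hypothetical total read from the top of a curve

# Quartile positions are the 25th, 50th and 75th percentile positions.
lower_quartile_position = percentile_position(n, 25)
median_position = percentile_position(n, 50)
upper_quartile_position = percentile_position(n, 75)

print(lower_quartile_position, n / 4)      # 15.0 15.0
print(median_position, n / 2)              # 30.0 30.0
print(upper_quartile_position, 3 * n / 4)  # 45.0 45.0
```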

A cumulative frequency diagram labeling the 25th, 50th, and 75th percentiles to show that quartiles are special cases of percentiles.

5. Exam Strategy & Tips

Always identify the total number of values first before reading any quartile or percentile. Many mistakes happen because students assume a convenient total from the axis rather than the value actually reached by the curve. Checking the endpoint of the curve prevents this.
Use the correct reading direction for the question type. If the question asks for a data value such as the median or 80th percentile, begin on the cumulative frequency axis and read across to the curve, then down. If it asks how many values are below a threshold, begin on the horizontal axis and read up to the curve, then across.
Read scales carefully and include units because the final answer comes from the horizontal axis. Even if the graph is interpreted correctly, poor scale reading can still lose marks. Sensible rounding should match the scale and precision shown on the diagram.
Expect approximation and communicate it appropriately. Words such as "estimate", "about", or "approximately" reflect the fact that the graph is smooth and the original data are grouped. An apparently exact answer may suggest a misunderstanding of the method.
Check whether the question wants values below or above a cutoff. For "above", "greater than", or "beyond", remember to subtract from the total after reading the cumulative frequency. This extra subtraction step is often the key difference between a correct and an incorrect answer.

Quick exam rule: Read from the y-axis when finding a value such as a median or percentile, and read from the x-axis when finding how many observations lie up to a given value.

A strategy diagram comparing the two reading directions on a cumulative frequency graph: y-axis to curve to x-axis for values, and x-axis to curve to y-axis for counts.

6. Common Pitfalls & Misconceptions

Using the horizontal-axis value as if it were the cumulative frequency position is incorrect. The horizontal axis shows data values, while positions like $\frac{n}{2}$ and $\frac{n}{4}$ belong on the vertical cumulative frequency axis. Mixing the axes leads to meaningless readings.
Forgetting that the graph gives cumulative totals causes errors in threshold questions. If a graph reading at $x = a$ is 68, that means about 68 values are at or below $a$, not exactly equal to $a$. The word cumulative means the count has already been added up from the start.
Assuming answers are exact is a misconception. Since the graph represents grouped data with a smooth curve, different careful readers may obtain slightly different estimates. Small variation is normal, provided the reading follows the correct method.
Subtracting in the wrong order can produce impossible negative frequencies when finding how many values lie between two data values. The cumulative frequency at the larger value should be greater than or equal to the cumulative frequency at the smaller value. Subtracting larger minus smaller preserves the meaning of counting observations in an interval.

7. Connections & Extensions

Cumulative frequency diagrams connect directly to quartiles, interquartile range, and percentiles. Once the lower and upper quartiles are estimated, the interquartile range can be found using $\text{IQR} = Q_3 - Q_1$, where $Q_1$ is the lower quartile and $Q_3$ is the upper quartile. This shows how graph interpretation supports broader statistical analysis.
They also connect to the idea of an ordered distribution of data. The curve is effectively a visual summary of how observations accumulate from low values to high values, so steeper sections indicate data values are more concentrated in that region. Gentler sections suggest values are more spread out there.
In applied settings, cumulative frequency diagrams are useful for threshold decisions and service targets. For example, they can help estimate what proportion of items fall below a quality limit or how many events exceed a specified time. Their value lies in turning grouped data into visually interpretable estimates.
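To make the interquartile range and the threshold idea concrete, here is a minimal sketch. All of the readings below are hypothetical numbers standing in for values taken from a diagram.

```python
# Hypothetical quartile readings, in the units of the horizontal axis.
lower_quartile = 20.0   # value read at position n/4
upper_quartile = 33.5   # value read at position 3n/4

# IQR = Q3 - Q1: the spread of the middle half of the data.
iqr = upper_quartile - lower_quartile

# Hypothetical threshold reading: about 47 of n = 60 values
# are at or below a quality limit.
n = 60
at_or_below_limit = 47
proportion_below = at_or_below_limit / n
proportion_above = (n - at_or_below_limit) / n

print(iqr)                         # 13.5
print(round(proportion_below, 2))  # 0.78
print(round(proportion_above, 2))  # 0.22
```

Because every input is a graph reading, the interquartile range and both proportions are estimates as well.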
