The fundamental principle of a histogram is the relationship between frequency, class width, and frequency density. This is expressed by the formula:
\[ \text{Frequency density} = \frac{\text{Frequency}}{\text{Class width}} \]
By using frequency density on the vertical axis, the histogram ensures that the visual 'weight' (the area) of the bar correctly corresponds to the number of observations. If frequency were used as the height for a very wide class, it would visually over-represent that group compared to narrower classes.
Because the area of each bar (frequency density × class width) equals the frequency of its class, the total area of all the bars in a histogram is proportional to the total number of observations in the data set. This property makes histograms useful for estimating the number of items within specific sub-ranges of the data.
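The sub-range estimate can be sketched in Python. This is a minimal illustration, assuming observations are spread evenly within each class; the function name `estimate_count` and the sample data are hypothetical, not taken from the text.

```python
def estimate_count(classes, low, high):
    """Estimate how many observations fall between low and high.

    classes: list of (lower boundary, upper boundary, frequency).
    Assumes data is spread evenly within each class.
    """
    total = 0.0
    for lower, upper, freq in classes:
        density = freq / (upper - lower)               # frequency density of the bar
        overlap = min(high, upper) - max(low, lower)   # part of the bar inside the range
        if overlap > 0:
            total += density * overlap                 # area of that part = estimated count
    return total


# Hypothetical data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 20), (10, 30, 30), (30, 40, 10)]
estimate = estimate_count(classes, 5, 20)
print(estimate)  # 25.0
```

Here the range 5 to 20 covers half of the first bar (10 observations) and half of the second (15 observations), giving an estimate of 25.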
Step 1: Boundary Adjustment: Ensure there are no gaps between classes. If data is given in discrete-looking groups (e.g., 10-19, 20-29), move each boundary to the midpoint between adjacent class limits (e.g., 9.5-19.5, 19.5-29.5) to create a continuous scale.
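Step 2: Calculate Class Width: For each group, subtract the lower boundary from the upper boundary ($\text{Class width} = \text{Upper boundary} - \text{Lower boundary}$).
Step 3: Calculate Frequency Density: Divide the frequency of each class by its calculated class width ($\text{Frequency density} = \text{Frequency} \div \text{Class width}$).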
Step 4: Plotting: Draw the bars on a grid where the x-axis is a continuous scale of the data and the y-axis is the frequency density. Ensure the scale on the y-axis is linear and clearly labeled.
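Steps 1 to 3 can be sketched in Python as follows. This is an illustrative sketch, not a prescribed method: the function name `frequency_densities`, the `gap` parameter, and the sample table are all hypothetical, and it assumes every pair of adjacent classes is separated by the same gap.

```python
def frequency_densities(groups, gap=1):
    """Return (lower boundary, upper boundary, width, density) for each class.

    groups: list of (lower limit, upper limit, frequency) as printed in a table.
    gap: distance between one class's upper limit and the next lower limit
         (assumed to be the same for every pair of adjacent classes).
    """
    half = gap / 2
    rows = []
    for lower, upper, freq in groups:
        lower_b = lower - half       # Step 1: move boundaries to the midpoints
        upper_b = upper + half
        width = upper_b - lower_b    # Step 2: class width
        density = freq / width       # Step 3: frequency density
        rows.append((lower_b, upper_b, width, density))
    return rows


# Hypothetical table: classes 10-19, 20-29, 30-49 with frequencies 5, 12, 8
groups = [(10, 19, 5), (20, 29, 12), (30, 49, 8)]
rows = frequency_densities(groups)
for row in rows:
    print(row)
```

For Step 4, one option is to pass the lower boundaries, densities, and widths to a plotting call such as matplotlib's `bar` with `align="edge"`, so that each bar spans exactly its class boundaries.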
The most common mistake is plotting frequency directly on the y-axis. This distorts the graph: a bar's area grows with its width, so wider classes appear to contain more data than they actually do.
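A short Python sketch shows the size of this distortion. The two classes below are hypothetical and hold the same number of observations, but one is four times as wide as the other.

```python
# Hypothetical data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 20), (10, 50, 20)]

wrong_areas = []  # bar areas when frequency is used as the height
right_areas = []  # bar areas when frequency density is used as the height

for lower, upper, freq in classes:
    width = upper - lower
    wrong_areas.append(freq * width)            # height = frequency
    right_areas.append((freq / width) * width)  # height = frequency density

print(wrong_areas)  # [200, 800]
print(right_areas)  # [20.0, 20.0]
```

With frequency as the height, the wide class covers four times the area of the narrow one even though both contain 20 observations. With frequency density as the height, the two areas match the frequencies.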
Another frequent error is failing to adjust class boundaries. If a table shows ages as '10-14' and '15-19', there is a gap of 1 unit between 14 and 15. These classes must be adjusted to '9.5-14.5' and '14.5-19.5' so the bars touch.
Students often confuse class width with the class interval labels. The width is the actual distance between the boundaries, which is crucial for the frequency density calculation.