To classify data as discrete or continuous, consider the nature of the values it can take. If the data points are distinct, separate, and typically whole numbers resulting from counting, it is discrete.
If the data can take any value within a given interval, including fractions and decimals, and is typically obtained through measurement, then it is continuous. Think about whether you can 'zoom in' on the measurement: if a more precise instrument could always give you a more precise value (1.7 m, 1.73 m, 1.732 m, and so on), the data is continuous.
A practical heuristic is to ask: 'Do I need some sort of scale or instrument to measure it?' If the answer is yes, it's likely continuous data. If it's a count of distinct items, it's discrete.
Rounded Continuous Data: A common mistake is to classify continuous data as discrete just because it has been rounded or truncated. For example, 'time taken to the nearest hour' is recorded only in whole hours, but the variable being measured (time) is still continuous. Rounding changes how the value is recorded, not the underlying nature of the variable.
Non-Integer Discrete Data: Not all discrete data consists solely of integers. Examples like shoe sizes (e.g., 7, 7.5, 8) are discrete because there are fixed, distinct steps between values, even if those steps are not whole numbers.
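The 'fixed, distinct steps' idea can be sketched in Python. This is only an illustration, not a general classifier: the helper name on_fixed_grid, the step of 0.5, and the sample values are all assumptions made up for the example. Note that measured data recorded to a finite number of decimals would also pass this check for a small enough step, so the result still has to be read alongside how the data was collected.

```python
from fractions import Fraction


def on_fixed_grid(values, step):
    """Return True if every value is a whole multiple of `step`.

    Values are converted through their string form to exact fractions,
    which avoids floating-point surprises such as 0.1 + 0.2 != 0.3.
    """
    step = Fraction(str(step))
    return all((Fraction(str(v)) / step).denominator == 1 for v in values)


# Hypothetical sample data
shoe_sizes = [7, 7.5, 8, 9.5]          # sold only in half-size steps
heights_cm = [170.2, 168.35, 181.07]   # measured, no fixed step

shoe_sizes_on_grid = on_fixed_grid(shoe_sizes, 0.5)   # True
heights_on_grid = on_fixed_grid(heights_cm, 0.5)      # False
```

The shoe sizes all sit on a 0.5 grid, which matches the description of discrete data with non-integer steps, while the heights do not fit any such grid.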
Confusing Categories: Students sometimes confuse numerical data with categorical data. Remember that discrete and continuous are sub-types of numerical data, which always involves quantities, unlike categories (e.g., colors, types of cars).
When presented with data in an exam, always ask yourself if the values are counted or measured. This is the quickest way to determine if it's discrete or continuous.
Pay close attention to wording like 'to the nearest...' or 'number of...' as these often indicate how the data was collected and thus its type. 'To the nearest...' usually signals a measured (continuous) quantity that has been rounded, while 'number of' almost always implies discrete data.
For 'time' or 'length' questions, check if there are any constraints on the measurement. If it's 'time taken' without qualification, assume continuous. If it's 'time taken to the nearest minute', the recorded values fall on discrete steps because of rounding, but the underlying variable is still continuous, so it is normally classified as continuous unless the question states otherwise.
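The effect of rounding can be seen in a short Python sketch. The data here is simulated and purely hypothetical (1000 task times drawn uniformly between 10 and 20 minutes, with a fixed seed so the run is repeatable); it is meant only to show that rounding reduces the recorded values to a handful of steps without changing what was measured.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical 'time taken' measurements in minutes (continuous)
times_min = [random.uniform(10, 20) for _ in range(1000)]

# The same measurements recorded 'to the nearest minute'
rounded_min = [round(t) for t in times_min]

distinct_raw = len(set(times_min))        # close to 1000 different values
distinct_rounded = len(set(rounded_min))  # at most 11 values (10 through 20)

# Each recorded value is within half a minute of the true measurement
max_rounding_error = max(abs(r - t) for r, t in zip(rounded_min, times_min))
```

The rounded list contains only whole minutes, yet every entry stands for a true time somewhere in a one-minute-wide interval, which is why the variable is still treated as continuous.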
The classification of data as discrete or continuous is foundational to all subsequent statistical analysis. It dictates the choice of appropriate graphs, measures of central tendency, and statistical tests.
For instance, the mean can be calculated for both types, but the median is often preferred when the data is skewed, and the mode is most useful for discrete or categorical data, where exact values tend to repeat.
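A brief Python sketch using the standard library's statistics module shows these choices in practice. Both data sets are invented for illustration: a count of siblings (discrete) and a set of commute times containing one extreme value (continuous and skewed).

```python
import statistics

# Hypothetical discrete data: number of siblings reported by 9 students
siblings = [0, 1, 1, 2, 1, 3, 0, 1, 2]

# Hypothetical continuous data: commute times in minutes, one extreme value
commute_min = [12.5, 14.0, 15.2, 13.8, 16.1, 14.7, 95.0]

# The mode is meaningful for counts because exact values repeat
sibling_mode = statistics.mode(siblings)        # 1

# The single long commute pulls the mean up; the median stays typical
commute_mean = statistics.mean(commute_min)
commute_median = statistics.median(commute_min)  # 14.7
```

Here the mean commute is well above the median because of the single 95-minute value, which is the situation in which the median is usually the better summary.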
In probability, discrete random variables have probability mass functions, while continuous random variables have probability density functions, further highlighting the importance of this distinction.
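The difference can be made concrete with a small Python sketch. The two distributions are chosen here as assumptions for illustration (a fair six-sided die and a uniform distribution on the interval from 0 to 2), and the integrate helper is a simple midpoint-rule approximation written for this example, not a library function.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die (discrete)
pmf = {face: Fraction(1, 6) for face in range(1, 7)}
total_mass = sum(pmf.values())   # the masses add up to exactly 1
p_roll_three = pmf[3]            # a single value has positive probability


def pdf(x):
    """Probability density of a uniform distribution on [0, 2] (continuous)."""
    return 0.5 if 0 <= x <= 2 else 0.0


def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


total_area = integrate(pdf, 0, 2)      # area under the density is about 1
p_interval = integrate(pdf, 0.5, 1.0)  # probability of an interval, about 0.25
p_single_point = integrate(pdf, 1, 1)  # a single point has probability 0
```

For the discrete variable, probabilities are read directly from the mass function and summed. For the continuous variable, the density is not itself a probability: probabilities come from the area under it over an interval.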