Visual Inspection: The first step in analysis is observing the 'shape' of the data. A linear pattern suggests a constant rate of change, while a curved pattern indicates a non-linear relationship that might require different statistical models.
Line of Best Fit: This is a straight line drawn through the center of the data points to represent the general trend. It should have roughly equal numbers of points above and below it and follow the overall 'flow' of the data.
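In exams the line is usually drawn by eye, but the same idea can be sketched numerically. The example below assumes the least-squares method, which the notes above do not name, and uses made-up data for hours studied and test scores. It then counts the points above and below the fitted line to check the "roughly equal" guideline.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the mean point
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied and test scores
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 64, 70, 73]

slope, intercept = line_of_best_fit(hours, scores)

# Count how many points sit above and below the fitted line
above = sum(1 for x, y in zip(hours, scores) if y > slope * x + intercept)
below = sum(1 for x, y in zip(hours, scores) if y < slope * x + intercept)

print(f"score = {slope:.2f} * hours + {intercept:.2f}")
print(f"points above: {above}, points below: {below}")
```

For this data set the points split evenly on either side of the line, which is the balance a hand-drawn line should aim for.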
Spearman's Rank Correlation Coefficient ($r_s$): This numerical measure quantifies the strength and direction of the relationship. It ranges from $-1$ (perfect negative) to $+1$ (perfect positive), with $0$ indicating no correlation.
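As a rough illustration, the coefficient can be computed with the standard formula for untied data, $r_s = 1 - \frac{6\sum d^2}{n(n^2-1)}$, where $d$ is the difference between the two ranks of each item. The sketch below assumes no tied values, and the marks are invented for the example.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes there are no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman's rank correlation for untied data."""
    n = len(xs)
    # d is the difference between the two ranks of each item
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical marks for six students
maths = [56, 75, 45, 71, 62, 64]
physics = [66, 70, 40, 60, 65, 56]

r_s = spearman(maths, physics)
print(f"r_s = {r_s:.3f}")
```

Here the squared rank differences sum to 18, which gives a coefficient of about 0.486, a moderate positive correlation.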
| Feature | Correlation | Causation |
|---|---|---|
| Definition | A relationship or pattern between variables | One variable directly influences the change in another |
| Evidence | Provided by scatter diagrams and coefficients | Requires controlled experiments and logical proof |
| Example | Ice cream sales and sunburn cases correlate, because both rise in sunny weather | High UV exposure causes sunburn |
Describing Trends: When asked to describe a scatter diagram, always mention both the direction (positive/negative) and the strength (strong/moderate/weak). Use specific terminology like 'strong negative linear correlation' for full marks.
Identifying Outliers: Always look for points that fall far away from the general cluster. These are outliers and can significantly affect the line of best fit; in exams, you may be asked to identify them or explain their impact.
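To see how much one outlier can matter, the sketch below fits a least-squares line (an assumed method, since the notes do not specify one) to five invented points that lie exactly on a line, then refits after adding a single point far from the cluster.

```python
def slope_of_best_fit(xs, ys):
    """Slope of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data lying exactly on the line y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
slope_clean = slope_of_best_fit(xs, ys)

# Add one point far below the general trend
slope_with_outlier = slope_of_best_fit(xs + [6], ys + [0])

print(f"slope without outlier: {slope_clean:.2f}")
print(f"slope with outlier:    {slope_with_outlier:.2f}")
```

The single extra point drags the slope from 2 down to roughly 0.29, so the fitted line no longer describes the other five points well.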
Contextual Interpretation: If a question asks you to 'interpret' the correlation, do not just say it is positive. Explain what it means in the real world, e.g., 'As the temperature increases, the amount of energy used for cooling also increases.'
Assuming Linearity: Students often try to draw a straight line through data that is clearly curved. If the points follow a U-shape or an exponential curve, a linear correlation coefficient or line of best fit is inappropriate.
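A small sketch of why this matters: the invented points below lie exactly on the U-shaped curve y = x², so the relationship is perfect, yet a linear coefficient reports no correlation at all. Pearson's product-moment coefficient is used here as an assumed example of a linear correlation coefficient.

```python
def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical U-shaped data: every point lies exactly on y = x^2
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = pearson(xs, ys)
print(f"r = {r:.3f}")
```

The falling left half and the rising right half cancel out, so the coefficient is zero even though y is completely determined by x.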
Scale Misinterpretation: Changing the scale of the axes can make a correlation look stronger or weaker than it actually is. Judge strength by how tightly the points cluster around the overall trend, not by the visual 'steepness' of the slope.
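One way to check this numerically is to rescale an axis and compare results. The sketch below uses invented height and mass data, with Pearson's coefficient and a least-squares slope as assumed measures. It records heights first in metres and then in centimetres, which stretches the horizontal axis by a factor of 100.

```python
def pearson_and_slope(xs, ys):
    """Return (correlation coefficient, least-squares slope) for the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5, sxy / sxx

# Hypothetical data: the same heights in two different units
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
height_cm = [150, 160, 170, 180, 190]
mass_kg = [52, 58, 69, 71, 80]

r_m, slope_m = pearson_and_slope(height_m, mass_kg)
r_cm, slope_cm = pearson_and_slope(height_cm, mass_kg)

print(f"metres:      r = {r_m:.4f}, slope = {slope_m:.2f} kg per m")
print(f"centimetres: r = {r_cm:.4f}, slope = {slope_cm:.2f} kg per cm")
```

The slope changes by a factor of 100 while the coefficient stays the same, which is why steepness alone says nothing about the strength of a correlation.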
Ignoring Sample Size: A very small number of points might show a strong correlation by pure chance. Conclusions drawn from scatter diagrams are more reliable when based on a larger, more representative sample of data.
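A quick simulation can illustrate the risk. The sketch below repeatedly draws two completely unrelated sets of random numbers and counts how often they still produce a coefficient beyond 0.8 in either direction. The sample sizes, threshold, number of trials, and seed are arbitrary choices for illustration, and Pearson's coefficient is an assumed measure.

```python
import random

def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def share_of_strong_results(sample_size, trials=2000, threshold=0.8, seed=1):
    """Fraction of trials where unrelated data gives |r| above the threshold."""
    rng = random.Random(seed)
    strong = 0
    for _ in range(trials):
        # x and y are generated independently, so any correlation is chance
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        if abs(pearson(xs, ys)) > threshold:
            strong += 1
    return strong / trials

small = share_of_strong_results(sample_size=4)
large = share_of_strong_results(sample_size=30)

print(f"n = 4:  {small:.3f} of trials looked strongly correlated")
print(f"n = 30: {large:.3f} of trials looked strongly correlated")
```

With only four points, a noticeable share of unrelated data sets look strongly correlated, while with thirty points this should almost never happen.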