Standardization is the process of converting any normal distribution $X \sim N(\mu, \sigma^2)$ into the standard normal distribution $Z \sim N(0, 1)$. This allows for the comparison of data from different populations that may have different units or scales.
The transformation is performed using the standardization formula $z = \frac{x - \mu}{\sigma}$, where $x$ is the value from the original distribution, $\mu$ is the population mean, and $\sigma$ is the population standard deviation.
This formula shifts the distribution horizontally so the mean becomes $0$ and scales it horizontally so the standard deviation becomes $1$. It is a linear transformation that preserves the relative positioning of data points.
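A minimal sketch of the formula above, using a hypothetical population with mean 50 and standard deviation 10 (values assumed purely for illustration):

```python
# Hypothetical example: a population with mean 50 and standard
# deviation 10 (assumed values, not from any real dataset).
def standardize(x, mu, sigma):
    """Map a value x from N(mu, sigma^2) onto the standard normal scale."""
    return (x - mu) / sigma

z = standardize(65, mu=50, sigma=10)
print(z)  # 1.5 — the value 65 lies 1.5 standard deviations above the mean
```

Because the transformation is linear, a value above the mean always maps to a positive $z$ and a value below it to a negative $z$.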
The cumulative distribution function for $Z$ is denoted by the Greek letter $\Phi$, which represents $\Phi(z) = P(Z \le z)$. This is the area under the curve from negative infinity up to the point $z$.
Due to the symmetry of the curve about $z = 0$, the probability for negative values can be found using the identity $\Phi(-z) = 1 - \Phi(z)$. This is essential when using tables that only provide values for positive $z$.
To find the probability between two values $a$ and $b$, the formula $P(a < Z < b) = \Phi(b) - \Phi(a)$ is used. This calculates the total area up to $b$ and subtracts the area up to $a$, leaving the area of the interval between them.
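The CDF, the symmetry identity, and the interval formula can all be checked numerically. The sketch below defines $\Phi$ via the standard relation to the error function, $\Phi(z) = \tfrac{1}{2}\bigl(1 + \operatorname{erf}(z/\sqrt{2})\bigr)$ (the function name `phi` is my own choice):

```python
import math

def phi(z):
    """Standard normal CDF: Phi(z) = P(Z <= z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Symmetry about z = 0: Phi(-z) = 1 - Phi(z)
print(phi(-1.5), 1 - phi(1.5))   # both ~0.0668

# Interval probability: P(a < Z < b) = Phi(b) - Phi(a)
print(phi(1) - phi(-1))          # ~0.6827, the familiar "68% within 1 sd"
```

The symmetry check is exactly the manipulation needed when a table only lists positive $z$.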
Standard normal tables typically provide the area to the left of a positive $z$-score, usually ranging from $z = 0.00$ up to around $z = 3$. The first column indicates the $z$-value to one decimal place, while the top row provides the second decimal place.
Many tables include an 'ADD' or 'difference' section to find probabilities for $z$-scores with three decimal places. This involves finding the base probability for the first two decimals and adding a small correction factor found in the third-decimal column.
When the required probability is not directly in the table, linear interpolation or the closest available value is used. For example, if a probability of $0.9500$ is needed, the $z$-score is often taken as $1.645$, which lies halfway between the table entries for $1.64$ and $1.65$.
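The interpolation step above can be written out explicitly. The two bracketing entries below ($\Phi(1.64) \approx 0.9495$, $\Phi(1.65) \approx 0.9505$) are the values as printed in a typical two-decimal-place table:

```python
# Linear interpolation between two adjacent table entries.
# Table values assumed: Phi(1.64) ~ 0.9495, Phi(1.65) ~ 0.9505.
z_lo, p_lo = 1.64, 0.9495
z_hi, p_hi = 1.65, 0.9505
p_target = 0.9500

z = z_lo + (p_target - p_lo) / (p_hi - p_lo) * (z_hi - z_lo)
print(z)  # ~1.645, halfway between the two entries
```

Since $0.9500$ sits exactly halfway between the two tabulated probabilities, the interpolated $z$ sits exactly halfway between the two tabulated $z$-values.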
The inverse normal process involves finding a $z$-score when the probability (area) is already known. This is often used to find percentiles or threshold values in a dataset.
If the given probability $p$ is less than $0.5$, the resulting $z$-score must be negative. In this case, one should look up the value for $1 - p$ in the table to find a positive $z$, then apply a negative sign to the result.
Critical values are specific $z$-scores that correspond to common tail probabilities. For instance, the $z$-score that leaves $2.5\%$ in the upper tail is approximately $1.96$, which is frequently used in hypothesis testing and confidence intervals.
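The inverse lookup, including the negate-and-reflect rule for $p < 0.5$, can be sketched with a simple bisection search on $\Phi$ (the function names here are my own; a library routine such as `scipy.stats.norm.ppf` would normally be used instead):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def inv_phi(p):
    """Invert Phi by bisection, using the symmetry trick for p < 0.5."""
    if p < 0.5:
        return -inv_phi(1 - p)      # look up 1 - p, then negate
    lo, hi = 0.0, 10.0              # Phi is increasing, so bisect the bracket
    for _ in range(100):
        mid = (lo + hi) / 2
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(inv_phi(0.975), 4))     # ~1.96: leaves 2.5% in the upper tail
print(round(inv_phi(0.2), 4))       # negative, since p < 0.5
```

Bisection works here because $\Phi$ is strictly increasing, so each step halves the interval that must contain the answer.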
Always sketch the curve: Drawing a quick bell curve and shading the required area is the most effective way to prevent sign errors and determine if your final probability should be greater or less than $0.5$.
Check the variance vs. standard deviation: A common mistake is using the variance $\sigma^2$ in the denominator of the $z$-formula instead of the standard deviation $\sigma$. Always verify whether the problem provides $\sigma$ or $\sigma^2$.
Inequality signs: In continuous distributions like the Normal distribution, $P(Z < a)$ is identical to $P(Z \le a)$. You do not need to adjust for strict versus non-strict inequalities as you would in discrete distributions like the Binomial.