What is the primary difference between the standard error used in a two-sample z-test vs. a two-sample z-interval for proportions?

The hypothesis test uses a pooled proportion ($\hat{p}_c$) because it assumes $p_1 = p_2$, while the confidence interval uses individual sample proportions ($\hat{p}_1$ and $\hat{p}_2$) because it makes no assumption of equality.
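The contrast can be sketched numerically. The snippet below is a minimal illustration, not a reference implementation; the function names and the counts (60 of 100 vs. 45 of 100) are hypothetical and chosen only to show that the two standard errors differ.

```python
import math

def pooled_se(x1, n1, x2, n2):
    """Standard error for the z-test: assumes p1 = p2 under H0."""
    p_c = (x1 + x2) / (n1 + n2)  # pooled proportion
    return math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))

def unpooled_se(x1, n1, x2, n2):
    """Standard error for the z-interval: no equality assumption."""
    p1, p2 = x1 / n1, x2 / n2
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical counts: 60 of 100 successes vs. 45 of 100 successes.
se_test = pooled_se(60, 100, 45, 100)
se_interval = unpooled_se(60, 100, 45, 100)
print(f"pooled SE (test):       {se_test:.4f}")
print(f"unpooled SE (interval): {se_interval:.4f}")
```

The two values are usually close but not equal, which is why swapping one for the other changes the z-score.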

When should you use a two-sample z-test for proportions instead of a chi-square test for independence?

A two-sample z-test applies when exactly two groups (e.g., Group A vs. Group B) are compared on a binary outcome, and it allows one-sided alternatives such as $p_1 > p_2$. A chi-square test is needed when either variable has more than two categories, or when testing the relationship between two categorical variables generally. For a 2×2 table, the two-sided z-test and the chi-square test give the same p-value, since $z^2 = \chi^2$.

How does the calculation of the test statistic change if the alternative hypothesis is $p_1 > p_2$ versus $p_1 \neq p_2$?

The z-score calculation remains identical; however, the p-value for $p_1 > p_2$ is the area in the upper tail only, while for $p_1 \neq p_2$, the p-value is the sum of the areas in both the upper and lower tails.
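A short sketch of this tail logic, using Python's standard-library `statistics.NormalDist`. The function name `p_value` and the example z-score of 1.96 are assumptions made for illustration.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """Return the p-value for a z-score under the stated alternative."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":      # H_a: p1 > p2, upper tail only
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # H_a: p1 < p2, lower tail only
        return std_normal.cdf(z)
    if alternative == "two-sided":    # H_a: p1 != p2, both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

z = 1.96  # hypothetical z-score; the same value is used for both alternatives
upper = p_value(z, "greater")
both = p_value(z, "two-sided")
print(f"one-sided: {upper:.4f}, two-sided: {both:.4f}")
```

For a positive z-score, the two-sided p-value is exactly twice the upper-tail p-value.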

What error occurs if a student uses $\hat{p}_1$ and $\hat{p}_2$ in the denominator of the z-test formula instead of $\hat{p}_c$?

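This is a formula error that contradicts the null hypothesis assumption ($p_1 = p_2$). If the proportions are equal, both samples estimate one common value, so the standard error should be built from $\hat{p}_c$. Using the unpooled standard error produces a different z-score and p-value than the test calls for.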

Why is it a mistake to perform a two-sample z-test on 'before and after' data from the same group of people?

The two-sample z-test requires independent samples. 'Before and after' data are dependent (paired), so they require a paired procedure, such as a matched-pairs test on each person's change or McNemar's test for paired proportions.

What happens to the p-value if you forget to check the 'Large Counts' condition and it is actually violated?

If the condition is violated, the sampling distribution is not approximately normal (it may be skewed). The p-value calculated from the z-table will be unreliable and may lead to an incorrect conclusion.

Define the 'Pooled Proportion' ($\hat{p}_c$) and provide its formula.

The pooled proportion is the weighted average of the two sample proportions, with each sample weighted by its size. It is calculated as $\hat{p}_c = \frac{X_1 + X_2}{n_1 + n_2}$, where $X_i$ is the number of successes and $n_i$ is the sample size in sample $i$.
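To see why "weighted" matters, the formula can be rewritten in terms of the sample proportions. The counts in the example are hypothetical, chosen only for illustration:

$$\hat{p}_c = \frac{X_1 + X_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$

With $X_1 = 30$, $n_1 = 50$, $X_2 = 45$, $n_2 = 150$: $\hat{p}_1 = 0.60$, $\hat{p}_2 = 0.30$, and $\hat{p}_c = \frac{75}{200} = 0.375$. This sits closer to $\hat{p}_2$ than the simple average $0.45$ does, because the second sample is larger.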

What is the 'Large Counts' condition for a two-sample z-test for the difference in proportions?

It requires that $n_1\hat{p}_c$, $n_1(1-\hat{p}_c)$, $n_2\hat{p}_c$, and $n_2(1-\hat{p}_c)$ are all at least 10, ensuring the sampling distribution of the difference is approximately normal.
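The check can be sketched as a small helper. The function name `large_counts_ok`, the `threshold` parameter, and the example counts are hypothetical; the threshold of 10 follows the card above.

```python
def large_counts_ok(x1, n1, x2, n2, threshold=10):
    """Check the Large Counts condition using the pooled proportion."""
    p_c = (x1 + x2) / (n1 + n2)
    # Expected successes and failures in each sample under H0: p1 = p2.
    expected = [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]
    return all(count >= threshold for count in expected)

# Hypothetical data sets: one comfortably large, one with too few successes.
ok_large = large_counts_ok(60, 100, 45, 100)
ok_small = large_counts_ok(2, 30, 3, 40)
print(ok_large, ok_small)
```

In the second data set the expected number of successes in each sample is well below 10, so a z-based p-value would not be trustworthy.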

State the null hypothesis for a test comparing the proportion of residents who support a policy in City A ($p_1$) and City B ($p_2$).

$H_0: p_1 = p_2$ (or $H_0: p_1 - p_2 = 0$). This assumes there is no difference in support levels between the two cities.

Write the formula for the standard error of the difference in proportions used in a hypothesis test.

$$SE_{pooled} = \sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$ where $\hat{p}_c$ is the pooled proportion.
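Putting the pieces together, a minimal sketch of the full test might look like the following. It assumes independent samples and that the Large Counts condition has already been checked; the function name `two_prop_z_test` and the counts are hypothetical.

```python
import math
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided two-sample z-test for H0: p1 = p2 using the pooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # pooled proportion
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_val = 2 * (1 - NormalDist().cdf(abs(z)))  # area in both tails
    return z, p_val

# Hypothetical counts: 60 of 100 successes vs. 45 of 100 successes.
z, p_val = two_prop_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, two-sided p-value = {p_val:.4f}")
```

For a one-sided alternative, only the p-value line changes; the z-score is computed the same way.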

Hypothesis Tests for Differences in Population Proportions

Summary

A two-sample z-test for the difference between population proportions is a statistical procedure used to determine if the proportions of a specific characteristic in two independent populations are significantly different. It relies on the sampling distribution of the difference in sample proportions, which, under certain conditions, approximates a normal distribution centered at zero when the null hypothesis of equality is assumed.

1. Definition & Core Concepts

Figure: Normal curve for the sampling distribution of the difference in proportions under the null hypothesis, with rejection regions in both tails for a two-tailed test.

2. Underlying Principles

3. Methods & Techniques

4. Key Distinctions

5. Exam Strategy & Tips

6. Common Pitfalls & Misconceptions