Comparing Two Independent Population Proportions

If two estimated proportions are different, it may be due to a difference in the populations or it may be due to chance.

Learning Objective

Demonstrate how a hypothesis test can help determine if a difference in estimated proportions reflects a difference in population proportions.

Key Points

Comparing two proportions, like comparing two means, is common.
A hypothesis test can help determine if a difference in the estimated proportions reflects a difference in the population proportions.
The difference of two proportions follows an approximate normal distribution.
Generally, the null hypothesis states that the two proportions are the same.

Terms

independent sample
Two samples are independent if they are drawn from two different populations and the samples have no effect on each other.
random sample
a sample randomly taken from an investigated population

Full Text

When comparing two population proportions, we start with two assumptions:

The two samples are simple random samples that are independent of each other.
The number of successes is at least five and the number of failures is at least five for each of the samples.
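The second assumption is easy to check mechanically. The following is a minimal sketch, not a standard library routine: the function name `meets_sample_size_condition` and the counts in the usage lines are illustrative assumptions.

```python
def meets_sample_size_condition(successes, n, minimum=5):
    """Return True if a sample of size n has at least `minimum`
    successes and at least `minimum` failures."""
    failures = n - successes
    return successes >= minimum and failures >= minimum


# Illustrative counts only: (number of successes, sample size) per sample.
samples = {"A": (20, 200), "B": (12, 200)}
for label, (x, n) in samples.items():
    print(label, meets_sample_size_condition(x, n))
```

Both samples must pass the check before the normal approximation is used.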

Comparing two proportions, like comparing two means, is common. If two estimated proportions are different, it may be due to a difference in the populations or it may be due to chance. A hypothesis test can help determine if a difference in the estimated proportions:

${ P }_{ A }^{ ' }-{ P }_{ B }^{ ' }$

reflects a difference in the population proportions.

The difference of two proportions follows an approximate normal distribution. Generally, the null hypothesis states that the two proportions are the same. That is, $H_0: p_A = p_B$. To conduct the test, we use a pooled proportion, $p_c$.

The pooled proportion is calculated as follows:

${ p }_{ c }=\dfrac { { x }_{ A }+{ x }_{ B } }{ { n }_{ A }+{ n }_{ B } }$

The distribution for the differences is:

$\displaystyle { P }_{ A }^{ ' }-{ P }_{ B }^{ ' }\sim N\left[ 0,\sqrt { { p }_{ c }\cdot (1-{ p }_{ c })\cdot \left( \frac { 1 }{ { n }_{ A } } +\frac { 1 }{ { n }_{ B } } \right) } \right]$.

The test statistic ($z$-score) is:

$\displaystyle z=\frac { ({ p }_{ A }^{ ' }-{ p }_{ B }^{ ' })-({ p }_{ A }-{ p }_{ B }) }{ \sqrt { { p }_{ c }\cdot (1-{ p }_{ c })\cdot \left( \frac { 1 }{ { n }_{ A } } +\frac { 1 }{ { n }_{ B } } \right) } }$.
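The pooled proportion, the standard deviation of the difference, and the $z$-score can be computed together. The sketch below assumes the null hypothesis $p_A = p_B$, so the term $p_A - p_B$ in the numerator is zero. The function name `two_proportion_z` and the counts in the usage lines are hypothetical.

```python
from math import sqrt


def two_proportion_z(x_a, n_a, x_b, n_b):
    """Return (p_c, standard_error, z) for a pooled two-proportion z-test
    under H0: p_A = p_B."""
    p_a_hat = x_a / n_a                      # estimated proportion, sample A
    p_b_hat = x_b / n_b                      # estimated proportion, sample B
    p_c = (x_a + x_b) / (n_a + n_b)          # pooled proportion
    # Standard deviation of P'_A - P'_B under H0.
    # Note: this is zero (and the division below fails) if p_c is 0 or 1.
    se = sqrt(p_c * (1 - p_c) * (1 / n_a + 1 / n_b))
    z = (p_a_hat - p_b_hat) / se             # hypothesized difference is 0
    return p_c, se, z


# Hypothetical counts: 45 successes out of 150 versus 30 out of 150.
p_c, se, z = two_proportion_z(45, 150, 30, 150)
print(f"p_c = {p_c:.4f}, se = {se:.4f}, z = {z:.4f}")
```

Returning all three values keeps the intermediate quantities visible, which mirrors the order of the formulas above.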

Example

Two types of medication for hives are being tested to determine if there is a difference in the proportions of adult patient reactions. 20 out of a random sample of 200 adults given medication $A$ still had hives 30 minutes after taking the medication. 12 out of another random sample of 200 adults given medication $B$ still had hives 30 minutes after taking the medication. Test at a 1% level of significance.

Let $A$ and $B$ be the subscripts for medication $A$ and medication $B$. Then $p_A$ and $p_B$ are the desired population proportions.

Random Variable:

${ P }_{ A }^{ ' }-{ P }_{ B }^{ ' }$

is the difference in the proportions of adult patients who did not react after 30 minutes to medication $A$ and medication $B$.

$H_0: p_A = p_B$, or equivalently $p_A - p_B = 0$

$H_a: p_A \neq p_B$, or equivalently $p_A - p_B \neq 0$

The words "is a difference" tell you the test is two-tailed.

Distribution for the test: Since this is a test of two binomial population proportions, the distribution is normal:

$\displaystyle { p }_{ c }=\frac { { x }_{ A }+{ x }_{ B } }{ { n }_{ A }+{ n }_{ B } } =\frac { 20+12 }{ 200+200 } =0.08$ and $1-{ p }_{ c }=0.92$.

Therefore:

$\displaystyle { P }_{ A }^{ ' }-{ P }_{ B }^{ ' }\sim N\left[ 0,\sqrt { (0.08)\cdot (0.92)\cdot \left( \frac { 1 }{ 200 } +\frac { 1 }{ 200 } \right) } \right]$

${ P }_{ A }^{ ' }-{ P }_{ B }^{ ' }$ follows an approximate normal distribution.

Calculate the $p$-value using the normal distribution: $p\text{-value} = 0.1404$.

Estimated proportion for group $A$: $\displaystyle { p }_{ A }^{ ' }=\frac { { x }_{ A } }{ n_{ A } } =\frac { 20 }{ 200 } =0.1$

Estimated proportion for group $B$: $\displaystyle { p }_{ B }^{ ' }=\frac { { x }_{ B } }{ n_{ B } } =\frac { 12 }{ 200 } =0.06$
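The numbers in this example can be checked with a few lines of code. This is a sketch that uses only the Python standard library (`statistics.NormalDist` requires Python 3.8 or later). The variable names are illustrative, and the two-tailed $p$-value is computed as twice the upper-tail area beyond $|z|$.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the example: patients who still had hives after 30 minutes.
x_a, n_a = 20, 200   # medication A
x_b, n_b = 12, 200   # medication B

p_a_hat = x_a / n_a                       # 0.1
p_b_hat = x_b / n_b                       # 0.06
p_c = (x_a + x_b) / (n_a + n_b)           # pooled proportion, 0.08

# Standard deviation of P'_A - P'_B under H0: p_A = p_B.
se = sqrt(p_c * (1 - p_c) * (1 / n_a + 1 / n_b))
z = (p_a_hat - p_b_hat) / se

# Two-tailed p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

The result should agree, up to rounding, with the $p$-value of 0.1404 reported in this example.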

Graph:

Figure: $p$-value graph for this example.

$P'_A - P'_B = 0.1 -0.06 = 0.04$.

Half the $p$-value is below $-0.04$ and half is above 0.04.

Compare $\alpha$ and the $p$-value: $\alpha = 0.01$ and the $p\text{-value}=0.1404$, so $\alpha < p\text{-value}$.

Make a decision: Since $\alpha < p\text{-value}$, do not reject $H_0$.
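The comparison of $\alpha$ and the $p$-value follows a fixed rule, which can be sketched as below. The function name `decide` is a hypothetical helper, and the rule assumed here is the usual one: reject $H_0$ only when the $p$-value is smaller than $\alpha$.

```python
def decide(p_value, alpha):
    """Apply the usual rule: reject H0 only if the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Values from the example: alpha = 0.01, p-value = 0.1404.
decision = decide(0.1404, 0.01)
print(decision)  # prints "do not reject H0"
```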

Conclusion: At a 1% level of significance, from the sample data, there is not sufficient evidence to conclude that there is a difference in the proportions of adult patients who did not react after 30 minutes to medication $A$ and medication $B$.
