Confidence Interval for a Population Proportion

The procedure to find the confidence interval and the confidence level for a proportion is similar to that for the population mean.

Learning Objective

Calculate the confidence interval given the estimated proportion of successes

Key Points

Confidence intervals can be calculated for the true proportion of stocks that go up or down each week and for the true proportion of households in the United States that own personal computers.
To form a proportion, take $X$ (the random variable for the number of successes) and divide it by $n$ (the number of trials, or the sample size).
If we divide the random variable by $n$, the mean by $n$, and the standard deviation by $n$, we get a normal distribution of proportions with $P'$, called the estimated proportion, as the random variable.
The error bound formula for a proportion is similar to the error bound formula for a mean, except that the "appropriate standard deviation" is different.

Term

error bound
The margin of error that depends on the confidence level, sample size, and the estimated (from the sample) proportion of successes.

Example

Suppose that a market research firm is hired to estimate the percent of adults living in a large city who have cell phones. A total of 500 randomly selected adult residents of this city are surveyed to determine whether they have cell phones. Of the 500 people surveyed, 421 responded that they own cell phones. Using a 95% confidence level, compute a confidence interval estimate for the true proportion of adult residents of this city who have cell phones.

Full Text

During an election year, we often read news articles that state confidence intervals in terms of proportions or percentages. For example, a poll for a particular presidential candidate might show that the candidate has 40% of the vote, within 3 percentage points. Often, election polls are calculated with 95% confidence. This means that pollsters are 95% confident that the true proportion of voters who favor the candidate lies between 0.37 and 0.43:

$(0.40-0.03, 0.40+0.03)$

Investors in the stock market are interested in the true proportion of stock values that go up and down each week. Businesses that sell personal computers are interested in the proportion of households (say, in the United States) that own personal computers. Confidence intervals can be calculated for both scenarios.

Although the procedure to find the confidence interval, sample size, error bound, and confidence level for a proportion is similar to that for the population mean, the formulas are different.

Proportion Problems

How do you know if you are dealing with a proportion problem? First, the underlying distribution is binomial (i.e., there is no mention of a mean or average). If $X$ is a binomial random variable, then $X\sim B(n,p)$ where $n$ is the number of trials and $p$ is the probability of a success. To form a proportion, take $X$ (the random variable for the number of successes) and divide it by $n$ (the number of trials or the sample size). The random variable $P'$ (read "$P$ prime") is that proportion:

$\displaystyle { P }^{ ' }=\frac { X }{ n }$

Sometimes the random variable is denoted as $\hat{P}$ (read "$P$ hat").

When $n$ is large and $p$ is not close to 0 or 1, we can use the normal distribution to approximate the binomial.

$X\sim N\left( n\cdot p,\sqrt { n\cdot p\cdot q } \right)$

If we divide the random variable by $n$, the mean by $n$, and the standard deviation by $n$, we get a normal distribution of proportions with $P'$, called the estimated proportion, as the random variable. (Recall that a proportion is the number of successes divided by $n$.)

$\displaystyle \frac { X }{ n } ={ P }^{ ' }\sim N\left( \frac { n\cdot p }{ n } ,\frac { \sqrt { n\cdot p\cdot q } }{ n } \right)$

Using algebra to simplify:

$\displaystyle \frac { \sqrt { n\cdot p\cdot q } }{ n } =\sqrt { \frac { p\cdot q }{ n } }$

$P'$ follows a normal distribution for proportions:

${ P }^{ ' }\sim N\left( p,\sqrt { \frac { p\cdot q }{ n } } \right)$
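
As a rough numerical check of this result, the sketch below simulates many binomial samples with Python's standard library and compares the mean and standard deviation of the simulated proportions with $p$ and $\sqrt{pq/n}$. The function name `simulate_proportions`, the values $n=500$ and $p=0.4$, the number of repetitions, and the fixed seed are all illustrative assumptions, not part of the original text.

```python
import math
import random
import statistics


def simulate_proportions(n, p, repetitions, seed=0):
    """Draw `repetitions` binomial samples of size n; return each proportion X/n."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    proportions = []
    for _ in range(repetitions):
        # X counts successes in n independent trials with success probability p
        x = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(x / n)
    return proportions


n, p = 500, 0.4  # hypothetical values chosen for illustration
q = 1 - p
sample = simulate_proportions(n, p, repetitions=2000)

print("mean of P':", statistics.mean(sample), "vs p =", p)
print("sd of P':", statistics.stdev(sample), "vs sqrt(pq/n) =", math.sqrt(p * q / n))
```

With a large $n$ and a $p$ that is not close to 0 or 1, the two printed pairs should agree closely, which is what the normal approximation predicts.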

The confidence interval has the form $(p'-\text{EBP}, p'+\text{EBP})$.

$\displaystyle{{ p }^{ ' }=\frac { x }{ n }}$
$p'$ is the estimated proportion of successes ($p'$ is a point estimate for $p$, the true proportion)
$x$ is the number of successes
$n$ is the size of the sample

The error bound for a proportion (EBP) is given by the formula:

$\displaystyle \text{EBP} = z_{\frac{\alpha}{2}}\sqrt{\frac{p'q'}{n}}$

where $q'=1-p'$.
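
The formula translates fairly directly into code. The following is a minimal sketch using only Python's standard library; the function names `error_bound_proportion` and `confidence_interval` are hypothetical, and the critical value $z_{\frac{\alpha}{2}}$ is obtained from `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def error_bound_proportion(x, n, confidence_level=0.95):
    """Return (p', EBP) for x successes observed in a sample of size n."""
    p_prime = x / n          # estimated proportion of successes
    q_prime = 1 - p_prime    # estimated proportion of failures
    alpha = 1 - confidence_level
    # z_{alpha/2}: standard normal value with area alpha/2 to its right
    z = NormalDist().inv_cdf(1 - alpha / 2)
    ebp = z * math.sqrt(p_prime * q_prime / n)
    return p_prime, ebp


def confidence_interval(x, n, confidence_level=0.95):
    """Return the interval (p' - EBP, p' + EBP)."""
    p_prime, ebp = error_bound_proportion(x, n, confidence_level)
    return p_prime - ebp, p_prime + ebp
```

For instance, with made-up data of 60 successes in 150 trials, `confidence_interval(60, 150)` gives roughly $(0.322, 0.478)$ at the 95% level.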

This formula is similar to the error bound formula for a mean, except that the "appropriate standard deviation" is different. For a mean, when the population standard deviation is known, the appropriate standard deviation that we use is $\frac { \sigma }{ \sqrt { n } }$. For a proportion, the appropriate standard deviation is $\sqrt { \frac { p\cdot q }{ n } }$.

However, in the error bound formula, we use $\sqrt { \frac { { p }^{ ' }\cdot { q }^{ ' } }{ n } }$ as the standard deviation, instead of $\sqrt { \frac { p\cdot q }{ n } }$.

In the error bound formula, the sample proportions $p'$ and $q'$ are estimates of the unknown population proportions $p$ and $q$. The estimated proportions $p'$ and $q'$ are used because $p$ and $q$ are not known. $p'$ and $q'$ are calculated from the data. $p'$ is the estimated proportion of successes. $q'$ is the estimated proportion of failures.

The confidence interval can only be used if the number of successes $np'$ and the number of failures $nq'$ are both larger than 5.
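
This condition is easy to check before computing an interval. Because $np' = n\cdot\frac{x}{n} = x$, the number of successes is simply $x$ and the number of failures is $n-x$. The helper below is a small sketch; the name `normal_approximation_ok` and the `threshold` parameter are assumptions introduced for illustration.

```python
def normal_approximation_ok(x, n, threshold=5):
    """Return True if both successes (n*p' = x) and failures (n*q' = n - x) exceed threshold."""
    successes = x
    failures = n - x
    return successes > threshold and failures > threshold


# Made-up sample counts, for illustration only
print(normal_approximation_ok(60, 150))  # 60 successes, 90 failures
print(normal_approximation_ok(3, 150))   # only 3 successes
```

The first call passes the check and the second does not, since 3 successes is not larger than 5.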

Solution

This image shows the solution to our example.
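
The arithmetic behind the solution can be worked out from the formulas above. The following is a sketch, assuming the standard normal critical value $z_{0.025}\approx 1.96$ for a 95% confidence level and rounding to three decimal places:

$\displaystyle p' = \frac{x}{n} = \frac{421}{500} = 0.842, \qquad q' = 1 - p' = 0.158$

$\displaystyle \text{EBP} = z_{\frac{\alpha}{2}}\sqrt{\frac{p'q'}{n}} \approx 1.96\sqrt{\frac{(0.842)(0.158)}{500}} \approx 0.032$

$\displaystyle (p'-\text{EBP},\ p'+\text{EBP}) \approx (0.842-0.032,\ 0.842+0.032) = (0.810,\ 0.874)$

The number of successes ($np' = 421$) and the number of failures ($nq' = 79$) are both larger than 5, so the interval can be used. We estimate with 95% confidence that between 81% and 87.4% of all adult residents of this city have cell phones.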
