Confidence Interval: Proportion

This lesson describes how to construct a confidence interval for a population proportion, P, using the sample proportion, p.

Estimation Requirements

The approach described in this lesson is valid whenever the following conditions are met:

  • The sampling method is simple random sampling.
  • The sample includes at least 10 successes and 10 failures. (Some texts say that 5 successes and 5 failures are enough.)
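The second requirement can be checked mechanically. The sketch below is only an illustration: the function name and the min_count parameter are hypothetical, and the first requirement (simple random sampling) depends on the study design and cannot be verified in code.

```python
def has_enough_successes_and_failures(successes, failures, min_count=10):
    """Return True when both counts reach the chosen minimum.

    min_count defaults to 10, the threshold used in this lesson;
    pass min_count=5 to apply the looser rule some texts use.
    """
    return successes >= min_count and failures >= min_count


# Hypothetical sample: 12 successes and 88 failures
print(has_enough_successes_and_failures(12, 88))   # True
# Hypothetical sample: 5 successes and 45 failures
print(has_enough_successes_and_failures(5, 45))    # False
```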

The Variability of the Sample Proportion

To construct a confidence interval for a population proportion, we need to know the variability of the sample proportion. This means we need to know how to compute the standard deviation and/or the standard error of the sampling distribution.

  • Suppose k possible samples of size n can be selected from the population. The standard deviation of the sampling distribution is the "average" deviation between the k sample proportions and the true population proportion, P. The standard deviation of the sample proportion σp is:

    σp = sqrt[ P * ( 1 - P ) / n ] * sqrt[ ( N - n ) / ( N - 1 ) ]

    where P is the population proportion, n is the sample size, and N is the population size. When the population size is much larger (at least 10 times larger) than the sample size, the standard deviation can be approximated by:

    σp = sqrt[ P * ( 1 - P ) / n ]

  • When the true population proportion P is not known, the standard deviation of the sampling distribution cannot be calculated. Under these circumstances, use the standard error. The standard error (SE) estimates the standard deviation by substituting the sample proportion p for P. It can be calculated from the equation below.

    SEp = sqrt[ p * ( 1 - p ) / n ] * sqrt[ ( N - n ) / ( N - 1 ) ]

    where p is the sample proportion, n is the sample size, and N is the population size. When the population size is at least 10 times larger than the sample size, the standard error can be approximated by:

    SEp = sqrt[ p * ( 1 - p ) / n ]
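The formulas above can be sketched in a few lines of Python. The function name and the optional population_size argument are assumptions made for illustration. Passing the population proportion P gives the standard deviation; passing the sample proportion p gives the standard error.

```python
import math


def proportion_spread(proportion, n, population_size=None):
    """Compute sqrt[ p * (1 - p) / n ], optionally with the finite
    population correction sqrt[ (N - n) / (N - 1) ].

    If population_size is None, the "approximate" formula is used.
    """
    spread = math.sqrt(proportion * (1 - proportion) / n)
    if population_size is not None:
        correction = math.sqrt((population_size - n) / (population_size - 1))
        spread *= correction
    return spread


# Hypothetical values: p = 0.5, n = 100, N = 100,000
approx = proportion_spread(0.5, 100)
exact = proportion_spread(0.5, 100, population_size=100_000)
print(f"{approx:.5f}")  # 0.05000
print(f"{exact:.5f}")   # 0.04998
```

With a population 1,000 times larger than the sample, the two results differ only in the fifth decimal place, which is why the approximate formula is usually adequate for large populations.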

Alert

The Advanced Placement Statistics Examination only covers the "approximate" formulas for the standard deviation and standard error. However, students are expected to be aware of the limitations of these formulas; namely, the approximate formulas should only be used when the population size is at least 10 times larger than the sample size.


How to Find the Confidence Interval for a Proportion

Previously, we described how to construct confidence intervals. For convenience, we repeat the key steps below.

  • Identify a sample statistic. Use the sample proportion to estimate the population proportion.

  • Select a confidence level. The confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.

  • Find the margin of error. Previously, we showed how to compute the margin of error.

  • Specify the confidence interval. The range of the confidence interval is defined by the sample statistic ± margin of error. And the uncertainty is denoted by the confidence level.
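The four steps can be sketched as a single Python function. This is a minimal illustration, not a definitive implementation: the function name is hypothetical, it assumes the sampling distribution is approximately normal, and it uses the "approximate" standard error formula (population at least 10 times larger than the sample). The critical value comes from the standard library's statistics.NormalDist.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Return (lower, upper) bounds for a population proportion.

    p_hat      -- sample proportion (the sample statistic)
    n          -- sample size
    confidence -- confidence level as a fraction, e.g. 0.95
    """
    # Critical value: z score with cumulative probability 1 - alpha/2
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Approximate standard error (no finite population correction)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    # Margin of error = critical value * standard error
    margin = z * se
    return p_hat - margin, p_hat + margin


# Hypothetical sample: p = 0.5, n = 400, 95% confidence
low, high = proportion_confidence_interval(0.5, 400, confidence=0.95)
print(f"{low:.3f} to {high:.3f}")  # 0.451 to 0.549
```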

In the next section, we work through a problem that shows how to use this approach to construct a confidence interval for a proportion.

Test Your Understanding of This Lesson

Problem 1

A major metropolitan newspaper selected a simple random sample of 1,600 readers from their list of 100,000 subscribers. They asked whether the paper should increase its coverage of local news. Forty percent of the sample wanted more local news. What is the 99% confidence interval for the proportion of readers who would like more coverage of local news?

(A) 0.30 to 0.50
(B) 0.32 to 0.48
(C) 0.35 to 0.45
(D) 0.37 to 0.43
(E) 0.39 to 0.41

Solution

The answer is (D). The approach that we used to solve this problem is valid when the following conditions are met.

  • The sampling method must be simple random sampling. This condition is satisfied; the problem statement says that we used simple random sampling.
  • The sample should include at least 10 successes and 10 failures. Suppose we classify a "more local news" response as a success, and any other response as a failure. Then, we have 0.40 * 1600 = 640 successes, and 0.60 * 1600 = 960 failures - plenty of successes and failures.
  • If the population size is much larger than the sample size, we can use an "approximate" formula for the standard deviation or the standard error. This condition is satisfied: the population (100,000 subscribers) is more than 60 times larger than the sample (1,600 readers), so we will use one of the simpler "approximate" formulas.

Since the above requirements are satisfied, we can use the following four-step approach to construct a confidence interval.

  • Identify a sample statistic. Since we are trying to estimate a population proportion, we choose the sample proportion (0.40) as the sample statistic.

  • Select a confidence level. In this analysis, the confidence level is defined for us in the problem. We are working with a 99% confidence level.

  • Find the margin of error. Elsewhere on this site, we show how to compute the margin of error when the sampling distribution is approximately normal. The key steps are shown below.

    • Find standard deviation or standard error. Since we do not know the population proportion, we cannot compute the standard deviation; instead, we compute the standard error. And since the population is more than 10 times larger than the sample, we can use the following formula to compute the standard error (SE) of the proportion:

      SE = sqrt [ p(1 - p) / n ] = sqrt [ (0.4)*(0.6) / 1600 ] = sqrt [ 0.24/1600 ] = 0.012

    • Find critical value. The critical value is a factor used to compute the margin of error. Because the sampling distribution is approximately normal and the sample size is large, we can express the critical value as a z score by following these steps.

      • Compute alpha (α): α = 1 - (confidence level / 100) = 1 - (99/100) = 0.01
      • Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.01/2 = 0.995
      • The critical value is the z score having a cumulative probability equal to 0.995. From the Normal Distribution Calculator, we find that the critical value is 2.58.

    • Compute margin of error (ME): ME = critical value * standard error = 2.58 * 0.012 = 0.031, or about 0.03

  • Specify the confidence interval. The range of the confidence interval is defined by the sample statistic ± margin of error. And the uncertainty is denoted by the confidence level.

Therefore, the 99% confidence interval is 0.37 to 0.43. That is, we are 99% confident that the true population proportion is in the range defined by 0.4 ± 0.03.
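As a cross-check, the calculation for Problem 1 can be reproduced in Python. The variable names are illustrative assumptions. The script carries full precision through each step rather than rounding the standard error and critical value as the solution does, so intermediate values differ slightly, but the rounded interval should agree.

```python
import math
from statistics import NormalDist

p_hat = 0.40        # sample proportion wanting more local news
n = 1600            # sample size
confidence = 0.99   # confidence level from the problem statement

# Standard error, using the "approximate" formula
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Critical value: z score with cumulative probability 1 - alpha/2 = 0.995
alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)

# Margin of error and interval bounds
margin = z * se
lower, upper = p_hat - margin, p_hat + margin

print(f"SE = {se:.3f}")                   # SE = 0.012
print(f"z  = {z:.2f}")                    # z  = 2.58
print(f"{lower:.2f} to {upper:.2f}")      # 0.37 to 0.43
```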