# Confidence Interval: Proportion (Small Sample)

In this lesson, we explain how to estimate a confidence interval for a proportion when the sample size is small.

## How to Estimate Confidence Intervals With Small Samples

In a previous lesson, we showed how to estimate a confidence interval for a proportion when a simple random sample includes at least 10 successes and 10 failures.

When the sample does not include at least 10 successes and 10 failures, the sample size will often be too small to justify the estimation approach presented in the previous lesson. This lesson describes how to construct a confidence interval for a proportion when the sample has fewer than 10 successes and/or fewer than 10 failures. The key steps are:

- Use the sample proportion as a point estimate of the population proportion.
- Determine whether the sample proportion is the outcome of a binomial experiment or a hypergeometric experiment.
- Based on the previous two steps, define the sampling distribution of the proportion.
- Use the sampling distribution to develop a confidence interval.

## Estimation Requirements

The approach described in this lesson is valid whenever the following conditions are met:

- The sampling method is simple random sampling.
- The sample includes at least 1 success and 1 failure.

The following examples illustrate how this works. The first example involves a binomial experiment; the second, a hypergeometric experiment.

## Example 1: Find Confidence Interval When Sampling With Replacement

Suppose an urn contains 30 marbles. Some marbles are red, and the rest are green. Five marbles are randomly selected, with replacement, from the urn. Two of the selected marbles are red, and three are green. Construct an 80% confidence interval for the proportion of red marbles in the urn.

*Solution:* To solve this problem, we need to define the sampling
distribution of the proportion.

- First, we assume that the population proportion is equal to the sample proportion. Thus, since 2 of the 5 marbles were red, we assume the proportion of red marbles is equal to 0.4.
- Second, since we sampled with replacement, the sample proportion can be considered an outcome of a binomial experiment.
- Assuming that the population proportion is 0.4 and the sample proportion is the outcome of a binomial experiment, the sampling distribution of the proportion can be determined. It appears in the table below. (Previously, we showed how to compute binomial probabilities that form the body of the table.)

Number of red marbles in sample | Sample proportion | Probability | Cumulative probability |
---|---|---|---|
0 | 0.0 | 0.07776 | 0.07776 |
1 | 0.2 | 0.25920 | 0.33696 |
2 | 0.4 | 0.34560 | 0.68256 |
3 | 0.6 | 0.23040 | 0.91296 |
4 | 0.8 | 0.07680 | 0.98976 |
5 | 1.0 | 0.01024 | 1.00000 |
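The table entries can be reproduced with a few lines of Python. The following is a minimal sketch using only the standard library; the function name `binomial_pmf` and the variable `binomial_rows` are illustrative choices, not part of the lesson.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Values from Example 1: 5 draws with replacement, assumed proportion 0.4.
n, p = 5, 0.4

cumulative = 0.0
binomial_rows = []  # (count, sample proportion, probability, cumulative)
for k in range(n + 1):
    prob = binomial_pmf(k, n, p)
    cumulative += prob
    binomial_rows.append((k, k / n, prob, cumulative))
    print(f"{k} | {k / n:.1f} | {prob:.5f} | {cumulative:.5f}")
```

Each printed row corresponds to one row of the table: the count of red marbles, the sample proportion, its probability, and the running total.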

We see that the probability of getting 0 red marbles in the sample is 0.07776, the probability of getting 1 red marble is 0.2592, and so on. Given the entries in the above table, it is not possible to create an 80% confidence interval *exactly*. However, we can come close. When the true population proportion is 0.4, the probability that a sample proportion falls between 0.2 and 0.6 (inclusive) is equal to 0.2592 + 0.3456 + 0.2304, or 0.8352. Thus, based on this sample, we can say that an 83.52% confidence interval is described by the range from 0.2 to 0.6.
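To see why an exact 80% interval is not available, it may help to list the confidence levels attached to the ranges centered on the observed proportion. The sketch below is one way to do that; the helper `range_probability` and the choice of candidate ranges are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def range_probability(k_low: int, k_high: int, n: int, p: float) -> float:
    """Probability that the success count lands in [k_low, k_high]."""
    return sum(binomial_pmf(k, n, p) for k in range(k_low, k_high + 1))


# Values from Example 1: 5 draws with replacement, assumed proportion 0.4.
n, p = 5, 0.4

# Ranges of counts centered on the observed count of 2 red marbles.
levels = {}
for k_low, k_high in [(2, 2), (1, 3), (0, 4)]:
    levels[(k_low, k_high)] = range_probability(k_low, k_high, n, p)
    print(f"{k_low / n:.1f} to {k_high / n:.1f}: {levels[(k_low, k_high)]:.5f}")
```

Because the sample proportion can take only six values, the attainable levels jump from 0.3456 to 0.8352 to 0.98976. The range from 0.2 to 0.6 is the narrowest of these that reaches at least 80%.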

## Example 2: Find Confidence Interval When Sampling Without Replacement

Let's take another look at the problem from Example 1. This time, however, we will assume that the marbles are sampled without replacement. Suppose an urn contains 30 marbles. Some marbles are red, and the rest are green. Five marbles are randomly selected, without replacement, from the urn. Two of the selected marbles are red, and three are green. Construct an 80% confidence interval for the proportion of red marbles in the urn.

*Solution:* To solve this problem, we need to define the sampling
distribution of the proportion.

- First, we assume that the population proportion is equal to the sample proportion. Thus, since 2 of the 5 marbles were red, we assume the proportion of red marbles is equal to 0.4.
- Second, since we sampled without replacement, the sample proportion can be considered an outcome of a hypergeometric experiment.
- Assuming that the population proportion is 0.4 and the sample proportion is the outcome of a hypergeometric experiment, the sampling distribution of the proportion can be determined. It appears in the table below. (Previously, we showed how to compute hypergeometric probabilities that form the body of the table.)

Number of red marbles in sample | Sample proportion | Probability | Cumulative probability |
---|---|---|---|
0 | 0.0 | 0.0601 | 0.0601 |
1 | 0.2 | 0.2577 | 0.3178 |
2 | 0.4 | 0.3779 | 0.6957 |
3 | 0.6 | 0.2362 | 0.9319 |
4 | 0.8 | 0.0625 | 0.9944 |
5 | 1.0 | 0.0056 | 1.0000 |
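The hypergeometric probabilities in the table can be checked in the same spirit. This is a minimal sketch using only the standard library; it assumes that a population proportion of 0.4 in an urn of 30 marbles means 12 red marbles, and the names `hypergeometric_pmf` and `hypergeometric_rows` are illustrative.

```python
from math import comb


def hypergeometric_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """Probability of exactly k successes in `draws` draws without replacement."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


# Values from Example 2: 30 marbles, 5 drawn without replacement.
population, draws = 30, 5
red_in_urn = 12  # assumption: 0.4 * 30 red marbles in the urn

cumulative = 0.0
hypergeometric_rows = []  # (count, sample proportion, probability, cumulative)
for k in range(draws + 1):
    prob = hypergeometric_pmf(k, population, red_in_urn, draws)
    cumulative += prob
    hypergeometric_rows.append((k, k / draws, prob, cumulative))
    print(f"{k} | {k / draws:.1f} | {prob:.4f} | {cumulative:.4f}")
```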

We see that the probability of getting 0 red marbles in the sample is 0.0601, the probability of getting 1 red marble is 0.2577, and so on. Given the entries in the above table, it is not possible to create an 80% confidence interval *exactly*. However, we can come close. When the true population proportion is 0.4, the probability that a sample proportion falls between 0.2 and 0.6 (inclusive) is equal to 0.2577 + 0.3779 + 0.2362, or 0.8718. Thus, based on this sample, we can say that an 87.18% confidence interval is described by the range from 0.2 to 0.6.

It is informative to compare the findings from Examples 1 and 2. In both problems, the interval estimate ranged from 0.2 to 0.6. However, the confidence level was greater for Example 2 (which sampled without replacement) than for Example 1 (which sampled with replacement): 87.18% versus 83.52%. This illustrates that, for the same sample size drawn from a finite population, precision is greater when sampling without replacement than when sampling with replacement.
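The comparison can also be made numerically. The sketch below computes, for each sampling scheme, the probability that the sample proportion falls between 0.2 and 0.6 and the standard deviation of the sample proportion. The helper `summarize` is a hypothetical name, and the standard deviation is an extra measure of spread that this lesson does not itself report.

```python
from math import comb, sqrt


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n draws with replacement."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeometric_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """Probability of exactly k successes in `draws` draws without replacement."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


def summarize(pmf, n: int) -> tuple[float, float]:
    """Return (P(0.2 <= sample proportion <= 0.6), std. dev. of the proportion)."""
    probs = [pmf(k) for k in range(n + 1)]
    mean = sum(k / n * pr for k, pr in enumerate(probs))
    variance = sum((k / n - mean) ** 2 * pr for k, pr in enumerate(probs))
    level = sum(probs[1:4])  # counts 1, 2, 3 -> proportions 0.2, 0.4, 0.6
    return level, sqrt(variance)


# Values from Examples 1 and 2: 5 draws, assumed proportion 0.4 (12 of 30 red).
with_replacement = summarize(lambda k: binomial_pmf(k, 5, 0.4), 5)
without_replacement = summarize(lambda k: hypergeometric_pmf(k, 30, 12, 5), 5)

print(f"with replacement:    level={with_replacement[0]:.4f}, sd={with_replacement[1]:.4f}")
print(f"without replacement: level={without_replacement[0]:.4f}, sd={without_replacement[1]:.4f}")
```

Under these assumptions, the sample proportion has a smaller standard deviation when sampling without replacement, which is consistent with the higher confidence level for the same interval.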