Hypothesis Test of a Proportion (Small Sample)
This lesson explains how to test a hypothesis about a proportion when a simple random sample has fewer
than 10 successes or fewer than 10 failures, a situation that often occurs with small samples.
(A previous lesson showed how to conduct a hypothesis test for a proportion when a simple random sample
includes at least 10 successes and 10 failures.)
Key Steps
The approach described in this lesson is appropriate as long as the sample includes
at least one success and one failure. The key steps are:
-
Formulate the hypotheses to be tested. This means stating the
null hypothesis and the
alternative hypothesis.
-
Determine the sampling
distribution of the proportion. If the sample proportion is the outcome
of a binomial
experiment, the sampling distribution will be binomial. If it is the
outcome of a hypergeometric
experiment, the sampling distribution will be hypergeometric.
-
Specify the significance
level. (Researchers often set the significance level equal to 0.05 or
0.01, although other values may be used.)
-
Based on the hypotheses, the sampling distribution, and the significance level,
define the region of
acceptance.
-
Test the null hypothesis. If the sample proportion falls within the region
of acceptance, do not reject the null hypothesis; otherwise, reject the null
hypothesis.
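The last two steps can be sketched in Python for the lower-tailed case used in this lesson's examples. This is a minimal sketch, not a definitive implementation: the function names `lower_tail_rejection_cutoff` and `decide` are hypothetical, and the sampling distribution is assumed to be supplied as a list of probabilities indexed by the number of successes. The usage lines apply it to two tosses of a fair coin, which is an illustrative distribution and not one of the lesson's examples.

```python
from itertools import accumulate


def lower_tail_rejection_cutoff(null_pmf, alpha):
    """Find the rejection region for a lower-tailed test on a count.

    null_pmf[k] is the probability of k successes under the null hypothesis.
    Returns (cutoff, attained_level): the rejection region is every count
    from 0 through cutoff, and attained_level is P(X <= cutoff).
    cutoff is None when even a count of 0 has probability above alpha.
    """
    cutoff, attained_level = None, 0.0
    for k, cumulative in enumerate(accumulate(null_pmf)):
        if cumulative > alpha:
            break  # adding this count would push the level above alpha
        cutoff, attained_level = k, cumulative
    return cutoff, attained_level


def decide(successes, cutoff):
    """Reject the null hypothesis only if the count is in the rejection region."""
    if cutoff is not None and successes <= cutoff:
        return "reject"
    return "do not reject"


# Illustrative input (an assumption): number of heads in two fair coin tosses.
coin_pmf = [0.25, 0.50, 0.25]
cutoff, level = lower_tail_rejection_cutoff(coin_pmf, alpha=0.30)
print(cutoff, level, decide(0, cutoff), decide(1, cutoff))
```

Because the sampling distribution is discrete, the attained level is usually smaller than the requested significance level, which is why the function returns both values.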
The following examples illustrate how to test hypotheses with small samples. The first example involves a
binomial experiment, and the second involves a hypergeometric experiment.
Example 1: Sampling With Replacement
Suppose an urn contains 30 marbles. Some marbles are red, and the rest are
green. A researcher hypothesizes that the urn contains 15 or more red marbles.
The researcher randomly samples five marbles,
with replacement, from the urn. Two of the selected marbles are red,
and three are green. Based on the sample results, should the researcher reject the hypothesis that the urn contains 15 or more red marbles? Use a significance level of 0.20.
Solution: There are five steps in conducting a hypothesis test, as
described in the previous section. We work through each of the five steps below:
-
Formulate hypotheses. The first step is to state the null
hypothesis and an alternative hypothesis.
Null hypothesis: P >= 0.50
Alternative hypothesis: P < 0.50
Note that these hypotheses constitute a
one-tailed test. The null hypothesis will be rejected only if the
sample proportion is too small.
-
Determine sampling distribution. Since we sampled with
replacement, the sample proportion can be considered an outcome of a binomial
experiment. The null hypothesis states that at least 15 of the 30 marbles are red.
We evaluate the test at the boundary of the null hypothesis, the value for which
small sample proportions are most likely, so the true population proportion is
assumed to be 15/30 or 0.50.
Given those inputs (a binomial distribution where the true population
proportion is equal to 0.50), the sampling distribution of the proportion can
be determined. It appears in the table below. (A previous lesson showed
how to compute the binomial probabilities that form the
body of the table.)
Number of red marbles in sample | Sample proportion | Binomial probability | Cumulative probability
0 | 0.0 | 0.03125 | 0.03125
1 | 0.2 | 0.15625 | 0.1875
2 | 0.4 | 0.3125 | 0.5
3 | 0.6 | 0.3125 | 0.8125
4 | 0.8 | 0.15625 | 0.96875
5 | 1.0 | 0.03125 | 1.00
-
Specify significance level. The significance level was set at
0.20. (This means that the probability of making a
Type I error should be no more than 0.20, assuming that the null hypothesis is true.)
-
Define the region of acceptance. From the sampling
distribution (see above table), we see that it is not possible to define a
region of acceptance for which the significance level is exactly 0.20.
However, we can define a region of acceptance for which the significance level
is no more than 0.20. From the table, we see that if the true
population proportion is equal to 0.50, we would be fairly unlikely to pick 0 or
1 red marble in our sample of 5 marbles. The probability of selecting 0 or 1
red marble is 0.1875. Therefore, if we let the significance level equal
0.1875, we can define the
region of rejection as any sampled outcome that includes only 0 or 1
red marble (i.e., a sample proportion equal to 0 or 0.20). We can define the
region of acceptance as any sampled outcome that includes at least 2 red
marbles. This is equivalent to a sample proportion that is greater than or
equal to 0.40.
-
Test the null hypothesis. Since the sample proportion (0.40) is
within the region of acceptance, we cannot reject the null hypothesis.
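The table and the decision in Example 1 can be reproduced with a few lines of Python. The sketch below uses only the standard library; the variable names are illustrative choices rather than part of the lesson, and the null proportion is fixed at the boundary value 0.50 as in the solution above.

```python
from math import comb

n, p0, alpha = 5, 0.50, 0.20  # sample size, null proportion, significance level
observed_red = 2              # red marbles in the researcher's sample

# Binomial probability of k red marbles in n draws when P = p0
pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]

# Cumulative probabilities P(X <= k)
cdf = []
running = 0.0
for prob in pmf:
    running += prob
    cdf.append(running)

# Rejection region: the low counts whose cumulative probability stays within alpha
rejection_region = [k for k in range(n + 1) if cdf[k] <= alpha]
attained_level = cdf[rejection_region[-1]] if rejection_region else 0.0
decision = "reject" if observed_red in rejection_region else "do not reject"

for k in range(n + 1):
    print(k, k / n, pmf[k], cdf[k])
print(rejection_region, attained_level, decision)
```

With these inputs the rejection region contains the counts 0 and 1, the attained significance level is 0.1875, and the observed count of 2 leads to not rejecting the null hypothesis, in agreement with the worked solution.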
Example 2: Sampling Without Replacement
The Acme Advertising company has 25 clients. Account executives at Acme claim
that 80 percent of these clients are very satisfied with the service they
receive. To test that claim, Acme's CEO commissions a survey of 10 clients.
Survey participants are randomly sampled,
without replacement, from the client population. Six of the ten sampled
clients (i.e., 60 percent) say that they are very satisfied. Based on the
sample results, should the CEO accept or reject the hypothesis that 80 percent
of Acme's clients are very satisfied? Use a significance level of 0.10.
Solution: There are five steps in conducting a hypothesis test, as
described in the previous section. We work through each of the five steps below:
-
Formulate hypotheses. The first step is to state the null
hypothesis and an alternative hypothesis.
Null hypothesis: P >= 0.80
Alternative hypothesis: P < 0.80
Note that these hypotheses constitute a
one-tailed test. The null hypothesis will be rejected only if the
sample proportion is too small.
-
Determine sampling distribution. Since we sampled without
replacement, the sample proportion can be considered an outcome of a
hypergeometric experiment. The null hypothesis states that at least 80 percent
of the 25 clients are very satisfied. We evaluate the test at the boundary of
the null hypothesis, so we assume that exactly 80 percent of the 25 clients
(i.e., 20 clients) are very satisfied.
Given those inputs (a hypergeometric distribution where 20 of 25 clients are
very satisfied), the sampling distribution of the proportion can be determined.
It appears in the table below. (A previous lesson showed
how to compute the hypergeometric probabilities
that form the body of the table.)
Number of satisfied clients in sample | Sample proportion | Probability | Cumulative probability
4 or fewer | 0.4 or less | 0.00 | 0.00
5 | 0.5 | 0.00474 | 0.00474
6 | 0.6 | 0.05929 | 0.06403
7 | 0.7 | 0.23715 | 0.30119
8 | 0.8 | 0.38538 | 0.68656
9 | 0.9 | 0.25692 | 0.94348
10 | 1.0 | 0.05652 | 1.00
-
Specify significance level. The significance level was set at
0.10. (This means that the probability of making a
Type I error should be no more than 0.10, assuming that the null hypothesis is true.)
-
Define the region of acceptance. From the sampling
distribution (see above table), we see that it is not possible to define a
region of acceptance for which the significance level is exactly 0.10.
However, we can define a region of acceptance for which the significance level
is no more than 0.10. From the table, we see that if the true
proportion of very satisfied clients is equal to 0.80, we would be
unlikely to have fewer than 7 very satisfied clients in our sample. The
probability of having 6 or fewer very satisfied clients in the sample is
0.064. Therefore, if we let the significance level equal 0.064, we can define
the region of
rejection as any sampled outcome that includes 6 or fewer very
satisfied clients. We can define the region of acceptance as any sampled
outcome that includes 7 or more very satisfied clients. This is equivalent to
a sample proportion that is greater than or equal to 0.70.
-
Test the null hypothesis. Since the sample proportion (0.60) is
outside the region of acceptance, we reject the null hypothesis
at the 0.064 level of significance, which is within the specified 0.10 level.
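The hypergeometric table and the decision in Example 2 can be checked the same way. The sketch below is one possible approach using only the standard library; the helper name `hypergeom_pmf` and the variable names are hypothetical, and the population is assumed to contain exactly 20 very satisfied clients, the boundary of the null hypothesis.

```python
from math import comb

N, K, n = 25, 20, 10    # clients, very satisfied clients under H0, sample size
alpha = 0.10            # specified significance level
observed_satisfied = 6  # very satisfied clients in the survey sample


def hypergeom_pmf(k):
    """P(exactly k very satisfied clients in the sample) when K of N are satisfied.

    math.comb returns 0 when asked to choose more items than are available,
    so impossible outcomes (4 or fewer satisfied clients) get probability 0.
    """
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


pmf = [hypergeom_pmf(k) for k in range(n + 1)]

# Cumulative probabilities P(X <= k)
cdf = []
running = 0.0
for prob in pmf:
    running += prob
    cdf.append(running)

# Rejection region: the low counts whose cumulative probability stays within alpha
rejection_region = [k for k in range(n + 1) if cdf[k] <= alpha]
attained_level = cdf[rejection_region[-1]] if rejection_region else 0.0
decision = "reject" if observed_satisfied in rejection_region else "do not reject"

for k in range(n + 1):
    print(k, k / n, round(pmf[k], 5), round(cdf[k], 5))
print(rejection_region, round(attained_level, 5), decision)
```

Here the rejection region is 6 or fewer very satisfied clients, the attained significance level is about 0.064, and the observed count of 6 falls in the rejection region.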