Stat Trek

Teach yourself statistics


Region of Acceptance

For a hypothesis test, a researcher collects sample data. From the sample data, the researcher computes a test statistic. If the statistic falls within a specified range of values, the researcher cannot reject the null hypothesis. That range of values is called the region of acceptance.

In this lesson, we describe how to find the region of acceptance for a hypothesis test.

One-Tailed and Two-Tailed Hypothesis Tests

The steps taken to define the region of acceptance will vary, depending on whether the null hypothesis calls for a one- or a two-tailed hypothesis test.

The table below shows three sets of hypotheses. Each makes a statement about how the population mean μ is related to a specified value M. (In the table, the symbol ≠ means "not equal to".)

Set   Null hypothesis   Alternative hypothesis   Number of tails
1     μ = M             μ ≠ M                    2
2     μ > M             μ < M                    1
3     μ < M             μ > M                    1

The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. That is, if the sample mean were much bigger or much smaller than M, we would reject the null hypothesis.

The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis. For example, for Set 2, we would reject the null hypothesis only if the sample mean were much smaller than M. And for Set 3, we would reject the null hypothesis only if the sample mean were much bigger than M.

How to Find the Region of Acceptance

We define the region of acceptance in such a way that the chance of making a Type I error is equal to the significance level. Here is how that is done.

  • Estimate population variance. The formula(s) to estimate variance will vary, depending on the sampling method and the parameter in the null hypothesis.
    • Proportions. If you are testing a hypothesis about a population proportion, use this formula to estimate population variance ($s^2$):

      $s^2 = P (1 - P)$

      where $s^2$ is an estimate of population variance, and $P$ is the value of the proportion in the null hypothesis.

    • Simple random sampling with means or totals. If you use a simple random sample to test a hypothesis about a mean or a total score, use this formula to estimate variance:

      $s^2 = \sum (x_i - \bar{x})^2 / (n - 1)$

      where $s^2$ is a sample estimate of population variance, $\bar{x}$ is the sample mean, $x_i$ is the $i$th element from the sample, and $n$ is the number of elements in the sample.

    • Stratified sampling. If you use a stratified sample to test a hypothesis about a mean or a total score, you will need to estimate variance within each stratum. Use this formula:

      $s_h^2 = \sum (x_{ih} - \bar{x}_h)^2 / (n_h - 1)$

      where $s_h^2$ is a sample estimate of population variance in stratum $h$, $x_{ih}$ is the value of the $i$th element from stratum $h$, $\bar{x}_h$ is the sample mean from stratum $h$, and $n_h$ is the number of sample observations from stratum $h$.

    • Variance within clusters. If you use two-stage cluster sampling to test a hypothesis about a mean or total score, you need to estimate the variance within clusters. Use this formula:

      $s_h^2 = \sum (x_{ih} - \bar{x}_h)^2 / (m_h - 1)$

      where $s_h^2$ is a sample estimate of population variance in cluster $h$, $x_{ih}$ is the value of the $i$th element from cluster $h$, $\bar{x}_h$ is the sample mean from cluster $h$, and $m_h$ is the number of observations sampled from cluster $h$.

    • Variance between clusters. If you use cluster sampling to estimate a total score, you need to estimate the variance between clusters. Use this formula:

      $s_b^2 = \sum (t_h - t/N)^2 / (n - 1)$

      where $s_b^2$ is a sample estimate of the variance between sampled clusters, $t_h$ is the total from cluster $h$, $t$ is the sample estimate of the population total, $N$ is the number of clusters in the population, and $n$ is the number of clusters in the sample.

      You can estimate the population total ($t$) from the following formula:

      $t = (N/n) \sum M_h \bar{x}_h$

      where $M_h$ is the number of observations in the population from cluster $h$, and $\bar{x}_h$ is the sample mean from cluster $h$.

  • Compute standard error. The right formula to compute standard error will vary, depending on the sampling method and the parameter under study.
    • Simple random sampling (mean or proportion). When we estimate a mean or a proportion from a simple random sample, the standard error (SE) of the estimate is:

      $SE = \sqrt{ (1 - n/N) \, s^2 / n }$

      where $n$ is the sample size, $N$ is the population size, and $s$ is a sample estimate of the population standard deviation.

    • Simple random sampling (total score). When we use a mean or a proportion to estimate a population total from a simple random sample, the standard error (SE) of the estimate is:

      $SE = \sqrt{ N^2 (1 - n/N) \, s^2 / n }$

      where $N$ is the population size, $n$ is the sample size, and $s^2$ is a sample estimate of the population variance.

    • Stratified sampling (mean or proportion). When we estimate a mean or a proportion from a stratified random sample, the standard error (SE) of the estimate is:

      $SE = \frac{1}{N} \sqrt{ \sum \left[ N_h^2 (1 - n_h/N_h) \, s_h^2 / n_h \right] }$

      where $n_h$ is the number of sample observations from stratum $h$, $N_h$ is the number of elements from stratum $h$ in the population, $N$ is the number of elements in the population, and $s_h^2$ is a sample estimate of the population variance in stratum $h$.

    • Stratified sampling (total score). When we estimate a total from a stratified random sample, the standard error (SE) of the estimate is:

      $SE = \sqrt{ \sum \left[ N_h^2 (1 - n_h/N_h) \, s_h^2 / n_h \right] }$

      where $N_h$ is the number of elements from stratum $h$ in the population, $n_h$ is the number of sample observations from stratum $h$, and $s_h^2$ is a sample estimate of the population variance in stratum $h$.

    • Cluster sampling (mean). When we estimate a population mean from a cluster sample, the standard error (SE) of the estimate is:

      $SE = \frac{1}{M} \sqrt{ \frac{N^2 (1 - n/N)}{n} \cdot \frac{\sum (M_h \bar{x}_h - t/N)^2}{n - 1} + \frac{N}{n} \sum \left[ (1 - m_h/M_h) \, M_h^2 \, s_h^2 / m_h \right] }$

      where $M$ is the number of observations in the population, $N$ is the number of clusters in the population, $n$ is the number of clusters in the sample, $M_h$ is the number of elements from cluster $h$ in the population, $m_h$ is the number of elements from cluster $h$ in the sample, $\bar{x}_h$ is the sample mean from cluster $h$, $s_h^2$ is a sample estimate of the population variance in cluster $h$, and $t$ is a sample estimate of the population total. For the equation above, use the following formula to estimate the population total.

      $t = (N/n) \sum M_h \bar{x}_h$

      With one-stage cluster sampling, the formula for the standard error reduces to:

      $SE = \frac{1}{M} \sqrt{ \frac{N^2 (1 - n/N)}{n} \cdot \frac{\sum (M_h \bar{x}_h - t/N)^2}{n - 1} }$

    • Cluster sampling (proportion). When we estimate a population proportion from a cluster sample, the standard error (SE) of the estimate is:

      $SE = \frac{1}{M} \sqrt{ \frac{N^2 (1 - n/N)}{n} \cdot \frac{\sum (M_h p_h - t/N)^2}{n - 1} + \frac{N}{n} \sum \left[ (1 - m_h/M_h) \, M_h^2 \, p_h (1 - p_h) / (m_h - 1) \right] }$

      where $M$ is the number of observations in the population, $N$ is the number of clusters in the population, $n$ is the number of clusters in the sample, $M_h$ is the number of elements from cluster $h$ in the population, $m_h$ is the number of elements from cluster $h$ in the sample, $p_h$ is the value of the proportion from cluster $h$, and $t$ is a sample estimate of the population total. For the equation above, use the following formula to estimate the population total.

      $t = (N/n) \sum M_h p_h$

      With one-stage cluster sampling, the formula for the standard error reduces to:

      $SE = \frac{1}{M} \sqrt{ \frac{N^2 (1 - n/N)}{n} \cdot \frac{\sum (M_h p_h - t/N)^2}{n - 1} }$

    • Cluster sampling (total score). When we estimate a population total from a cluster sample, the standard error (SE) of the estimate is:

      $SE = \sqrt{ \frac{N^2 (1 - n/N)}{n} \, s_b^2 + \frac{N}{n} \sum \left[ (1 - m_h/M_h) \, M_h^2 \, s_h^2 / m_h \right] }$

      where $N$ is the number of clusters in the population, $n$ is the number of clusters in the sample, $s_b^2$ is a sample estimate of the variance between clusters, $m_h$ is the number of elements from cluster $h$ in the sample, $M_h$ is the number of elements from cluster $h$ in the population, and $s_h^2$ is a sample estimate of the population variance in cluster $h$. This is $M$ times the standard error of the cluster-sample mean, since the total equals $M$ times the mean.

      With one-stage cluster sampling, the formula for the standard error reduces to:

      $SE = N \sqrt{ (1 - n/N) \, s_b^2 / n }$

  • Choose a significance level. The significance level (denoted by α) is the probability of committing a Type I error. Researchers often set the significance level equal to 0.05 or 0.01.
  • Find the critical value. Often expressed as a t-score or a z-score, the critical value is a factor used to determine upper and lower limits of the region of acceptance.

    When the null hypothesis is two-tailed, the critical value is the z-score or t-score that has a cumulative probability equal to 1 - α/2. When the null hypothesis is one-tailed, the critical value has a cumulative probability equal to 1 - α.

    Researchers use a t-score when sample size is small; a z-score when it is large (at least 30). You can use the Normal Distribution Calculator to find the critical z-score, and the t Distribution Calculator to find the critical t-score.

    If you use a t-score, you will have to find the degrees of freedom (df). With simple random samples, df is often equal to the sample size minus one.

    Note: The critical value for a one-tailed hypothesis does not equal the critical value for a two-tailed hypothesis. The critical value for a one-tailed hypothesis is smaller.

  • Find the upper limit (UL) of the region of acceptance. There are two possibilities, depending on the form of the null hypothesis.
    • If the null hypothesis is μ < M or if the null hypothesis is μ = M: The upper limit of the region of acceptance will be:

      UL = M + SE * CV

      where M is the parameter value in the null hypothesis, SE is the standard error, and CV is the critical value.
    • If the null hypothesis is μ > M: The theoretical upper limit of the region of acceptance is plus infinity, unless the parameter in the null hypothesis is a proportion or a percentage. The upper limit is 1 for a proportion, and 100 for a percentage.
  • In a similar way, we find the lower limit (LL) of the region of acceptance. There are two possibilities, depending on the form of the null hypothesis.
    • If the null hypothesis is μ > M or if the null hypothesis is μ = M: The lower limit of the region of acceptance will be:

      LL = M - SE * CV

      where M is the parameter value in the null hypothesis, SE is the standard error, and CV is the critical value.
    • If the null hypothesis is μ < M: The theoretical lower limit of the region of acceptance is minus infinity, unless the test statistic is a proportion or a percentage. The lower limit for a proportion or a percentage is zero.
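
The variance and standard-error formulas from the first two steps can be sketched in Python. This is a minimal illustration covering only the proportion, simple random sampling, and stratified sampling cases; the function names and the tuple layout for strata are assumptions made for this example, not part of the lesson.

```python
import math

def proportion_variance(p0):
    """Variance estimate for a proportion test: P * (1 - P)."""
    return p0 * (1.0 - p0)

def sample_variance(xs):
    """Sample variance: sum of squared deviations divided by (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def se_srs_mean(s2, n, N):
    """SE of a mean or proportion under simple random sampling."""
    return math.sqrt((1.0 - n / N) * s2 / n)

def se_srs_total(s2, n, N):
    """SE of a total under simple random sampling: N times the SE of the mean."""
    return N * se_srs_mean(s2, n, N)

def se_stratified_total(strata):
    """SE of a total under stratified sampling.

    strata is a list of (N_h, n_h, s2_h) tuples (hypothetical layout):
    stratum population size, stratum sample size, stratum variance estimate.
    """
    return math.sqrt(sum(N_h ** 2 * (1.0 - n_h / N_h) * s2_h / n_h
                         for N_h, n_h, s2_h in strata))

def se_stratified_mean(strata):
    """SE of a mean or proportion under stratified sampling."""
    N = sum(N_h for N_h, _, _ in strata)
    return se_stratified_total(strata) / N

# Example: standard deviation 20, sample of 50, population of 100,000
print(round(se_srs_mean(20.0 ** 2, 50, 100_000), 3))  # 2.828
```

The cluster-sampling formulas follow the same pattern but need per-cluster sizes, means, and variances, so they are left out of this sketch.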

The region of acceptance is defined by the range between LL and UL.
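
As a rough sketch, the last three steps (critical value, upper limit, lower limit) can be combined into one Python function. It assumes a z-score critical value taken from the standard library's `NormalDist`; a t-score version would need a t quantile function from another package. The function names, the `null` codes, and the `lower_bound`/`upper_bound` parameters are hypothetical choices for this illustration.

```python
import math
from statistics import NormalDist

def critical_z(alpha, two_tailed):
    """z-score with cumulative probability 1 - alpha/2 (two-tailed) or 1 - alpha (one-tailed)."""
    p = 1.0 - alpha / 2.0 if two_tailed else 1.0 - alpha
    return NormalDist().inv_cdf(p)

def acceptance_region(M, se, alpha, null,
                      lower_bound=-math.inf, upper_bound=math.inf):
    """Return (LL, UL) for the region of acceptance.

    null is "=" for the null hypothesis mu = M (two-tailed),
    ">" for the null hypothesis mu > M, or "<" for mu < M.
    For a proportion, pass lower_bound=0 and upper_bound=1.
    """
    if null == "=":
        cv = critical_z(alpha, two_tailed=True)
        return (M - se * cv, M + se * cv)
    cv = critical_z(alpha, two_tailed=False)
    if null == ">":
        return (M - se * cv, upper_bound)   # open on the upper side
    if null == "<":
        return (lower_bound, M + se * cv)   # open on the lower side
    raise ValueError("null must be '=', '>' or '<'")

# Example: null hypothesis mu > 300, SE = 2.828, alpha = 0.05
ll, ul = acceptance_region(300.0, 2.828, 0.05, ">")
print(round(ll, 2), ul)  # 295.35 inf
```

A sample statistic that falls between the two returned limits is consistent with the null hypothesis; a value outside them leads to rejection.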

Test Your Understanding

In this section, three sample problems illustrate step-by-step how to define the region of acceptance. The first problem shows how to find the standard error; the second problem, how to find the critical value; and the third problem, how to find upper and lower limits for the region of acceptance.

Sample Size Calculator

As you probably noticed, defining the region of acceptance can be complex and time-consuming. Stat Trek's Sample Size Calculator can do the same job quickly, easily, and error-free. The calculator is easy to use, and it is free. You can find the Sample Size Calculator in Stat Trek's main menu under the Stat Tools tab.

Problem 1

An inventor has developed a new, energy-efficient lawn mower engine. Suppose a simple random sample of 50 engines is tested. The engines run for an average of 295 minutes, with a standard deviation of 20 minutes.

What is the standard error of the estimate?

Solution: The right formula to compute standard error will vary, depending on the sampling method and the parameter under study. Here, we are using simple random sampling to estimate a mean score; so the right formula for the standard error (SE) is:

$SE = \sqrt{ (1 - n/N) \, s^2 / n }$

where n is the sample size, N is the population size, and s is a sample estimate of the population standard deviation.

For this problem, we know that the sample size is 50, and the standard deviation is 20. The population size is not stated explicitly; but, in theory, the manufacturer could produce an infinite number of motors. Therefore, the population size is a very large number. For the purpose of the analysis, we'll assume that the population size is 100,000. Plugging those values into the formula, we find that the standard error is:

$SE = \sqrt{ (1 - n/N) \, s^2 / n }$

$SE = \sqrt{ (1 - 50/100{,}000) \times 20^2 / 50 }$

$SE = \sqrt{ 0.9995 \times 8 } = 2.828$
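
The arithmetic can be checked with a few lines of Python. The second calculation uses a much larger, equally arbitrary population size to show that the assumed value of N barely affects the result; the variable names are chosen only for this example.

```python
import math

n = 50        # sample size
s = 20.0      # sample standard deviation (minutes)
N = 100_000   # assumed population size, as in the solution

se = math.sqrt((1 - n / N) * s ** 2 / n)
print(round(se, 3))  # 2.828

# With a far larger assumed population, the result is the same to three decimals
se_large = math.sqrt((1 - n / 1_000_000_000) * s ** 2 / n)
print(round(se_large, 3))  # 2.828
```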

Problem 2

An inventor has developed a new, energy-efficient lawn mower engine. He hypothesizes that the engine will run continuously for at least 300 minutes on a single ounce of regular gasoline.

What are the null and alternative hypotheses for this test? Given these hypotheses, find the critical value. Assume the significance level (α) is 0.05.

Solution: In this problem, the inventor states that his engine will run at least 300 minutes. That is the null hypothesis. The alternative hypothesis is that the engine will run less than 300 minutes. These hypotheses can be expressed as:

Null hypothesis   Alternative hypothesis   Number of tails
μ > 300           μ < 300                  1

Notice that this is a one-tailed test, since an extreme value on only one side of the sampling distribution would cause the inventor to reject the null hypothesis. That is, he would reject the null hypothesis only if the mean running time were much less than 300 minutes.

When the sample size is large (at least 30), researchers can express the critical value as a t-score or as a z-score. Here, the sample size is much larger than 30 (n=50), so we will express the critical value as a z-score.

Since the null hypothesis is one-tailed, the critical value is the z-score that has a cumulative probability equal to 1 - α. For this problem, the significance level (α) is 0.05, so the critical value will be the z-score that has a cumulative probability equal to 0.95.

We use the Normal Distribution Calculator to find that the z-score with a cumulative probability of 0.95 is 1.645. Thus, the critical value is 1.645.
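
If a calculator is not at hand, the same critical value can be obtained from Python's standard library. This sketch uses `statistics.NormalDist`, whose `inv_cdf` method returns the z-score for a given cumulative probability.

```python
from statistics import NormalDist

alpha = 0.05

# One-tailed test: cumulative probability is 1 - alpha
cv = NormalDist().inv_cdf(1 - alpha)
print(round(cv, 3))  # 1.645
```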

Problem 3

Use the findings from Problems 1 and 2 to define the region of acceptance for the hypothesis test described in Problem 2. In the tests, the engines run for an average of 295 minutes. Based on these findings and your analysis of the region of acceptance, should the inventor reject his hypothesis that the engine will run continuously for at least 300 minutes on a single ounce of gasoline?

Solution: From the previous problems, we know three important facts:

  • The null hypothesis is: μ > 300.
  • The standard error is 2.828.
  • The critical value is 1.645.

For this type of one-tailed hypothesis, the theoretical upper limit of the region of acceptance is plus infinity, since any run time greater than 300 is consistent with the null hypothesis. The lower limit (LL) of the region of acceptance will be:

LL = M - SE * CV

where M is the parameter value in the null hypothesis, SE is the standard error, and CV is the critical value. So, for this problem, we compute the lower limit of the region of acceptance as:

LL = 300 - 2.828 * 1.645

LL = 300 - 4.652

LL = 295.35

Therefore, the region of acceptance for this hypothesis test is 295.35 to plus infinity. Any run time within that range is consistent with the null hypothesis. In the tests, the engines ran for an average of 295 minutes. That value is outside the region of acceptance, so the inventor should reject the null hypothesis that the engines run for at least 300 minutes.
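
The decision can be reproduced in Python using the rounded values from the solution. As a cross-check, the sketch also standardizes the sample mean: comparing that z-score with the negative critical value gives the same answer as comparing the sample mean with the lower limit. Variable names are illustrative only.

```python
M = 300.0            # value in the null hypothesis
se = 2.828           # standard error from Problem 1
cv = 1.645           # critical value from Problem 2
sample_mean = 295.0  # observed average run time

ll = M - se * cv
print(round(ll, 2))  # 295.35

# Reject when the sample mean falls below the lower limit
reject = sample_mean < ll
print(reject)  # True

# Equivalent check on the standardized scale
z = (sample_mean - M) / se
print(round(z, 3), z < -cv)  # -1.768 True
```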