Power of a Hypothesis Test

The probability of failing to reject a null hypothesis when it is false (i.e., committing a Type II error) is called β. The probability of not committing a Type II error is called the power of a hypothesis test. The power of a test is:

Power = 1 - β
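One way to make this definition concrete is to read power as a long-run frequency: if the null hypothesis is false and the same study is repeated many times, power is the fraction of those studies that reject the null hypothesis. The sketch below estimates that fraction by simulation for a two-tailed z-test. Every number in it (a hypothesized mean of 100, a true mean of 90, a population standard deviation of 20, a sample size of 25, and the function name simulated_power) is a hypothetical choice made for illustration, not part of the lesson.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulated_power(mu0, mu_true, sigma, n, alpha=0.05, trials=5000, seed=1):
    """Estimate power as the fraction of simulated samples that reject H0.

    Assumes a two-tailed one-sample z-test with a known population
    standard deviation (sigma) and normally distributed data.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    se = sigma / sqrt(n)                          # standard error of the mean
    rejections = 0
    for _ in range(trials):
        # Draw a sample from the TRUE population, not the hypothesized one.
        sample = [rng.gauss(mu_true, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# Hypothetical scenario: H0 says the mean is 100, but it is really 90.
power = simulated_power(mu0=100, mu_true=90, sigma=20, n=25)
beta = 1 - power  # estimated probability of a Type II error
print(f"estimated power = {power:.3f}, estimated beta = {beta:.3f}")
```

The estimate varies slightly with the random seed and the number of trials, but the estimated power and the estimated β always sum to 1, as the formula requires.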

Effect Size

The effect size is the difference between the true value of a population parameter and the value specified in the null hypothesis.

Effect size = True value - Hypothesized value

For example, suppose the null hypothesis states that a population mean is equal to 100. A researcher might ask: What is the probability of rejecting the null hypothesis if the true population mean were really equal to 90? In this example, the effect size would be 90 - 100, which equals -10.
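The researcher's question can be answered directly once a few more details are fixed. The sketch below computes the power of a two-tailed z-test for this example. The lesson does not give a population standard deviation, a sample size, or a significance level, so the values used here (σ = 20, n = 25, α = 0.05) and the function name z_test_power are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test (sigma assumed known)."""
    z = NormalDist()                   # standard normal distribution
    se = sigma / sqrt(n)               # standard error of the sample mean
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for each tail
    shift = (mu_true - mu0) / se       # effect size in standard-error units
    # Under the true mean, the test statistic is normal with mean `shift`
    # and standard deviation 1. Power is the chance it lands in either
    # rejection region.
    lower_tail = z.cdf(-z_crit - shift)
    upper_tail = 1 - z.cdf(z_crit - shift)
    return lower_tail + upper_tail


effect_size = 90 - 100  # true value minus hypothesized value = -10
power = z_test_power(mu0=100, mu_true=90, sigma=20, n=25)
print(f"effect size = {effect_size}, power = {power:.3f}")
```

Under these assumed values the power comes out to roughly 0.7. When the true mean equals the hypothesized mean (an effect size of 0), the same function returns α, the probability of a Type I error.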

Factors That Affect Power

The power of a hypothesis test is affected by the following factors:

Sample size (n). Other things being equal, the greater the sample size, the greater the power of the test.
Significance level (α). The lower the significance level (α), the lower the power of the test. If you reduce the significance level (e.g., from 0.05 to 0.01), the region of acceptance gets bigger. As a result, you are less likely to reject the null hypothesis when it is false, so you are more likely to make a Type II error.
The "true" value of the parameter being tested. The greater the difference between the "true" value of a parameter and the value specified in the null hypothesis, the greater the power of the test. That is, the greater the magnitude of the effect size (whether positive or negative), the greater the power of the test.
Variability in the population. More variability makes it harder to detect true differences, which decreases power; and less variability makes it easier to detect true differences, which increases power.
Tails in hypothesis test. Using a two-tailed hypothesis test, rather than a one-tailed hypothesis test, decreases power, provided the true effect lies in the direction that the one-tailed test specifies. A two-tailed test splits the significance level (α) between two tails, so the sample must provide stronger evidence to reject the null hypothesis.
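The five factors above can be checked numerically by changing one factor at a time and recomputing power. The sketch below does this for a one-sample z-test. The baseline values (an effect size of -10, σ = 20, n = 25, α = 0.05, two-tailed) and the function name power are assumptions chosen for illustration, and the one-tailed case assumes the alternative hypothesis points in the direction of the true effect.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def power(effect, sigma, n, alpha=0.05, tails=2):
    """Power of a one-sample z-test (sigma assumed known).

    effect: true mean minus hypothesized mean
    tails:  2 for a two-tailed test; 1 for a one-tailed test whose
            alternative points in the direction of the true effect
    """
    shift = abs(effect) / (sigma / sqrt(n))  # effect in standard-error units
    z_crit = Z.inv_cdf(1 - alpha / tails)    # alpha is split when tails == 2
    p = Z.cdf(shift - z_crit)                # rejection in the tail of the effect
    if tails == 2:
        p += Z.cdf(-shift - z_crit)          # rejection in the opposite tail
    return p


baseline = power(effect=-10, sigma=20, n=25)
larger_sample = power(effect=-10, sigma=20, n=50)
lower_alpha = power(effect=-10, sigma=20, n=25, alpha=0.01)
larger_effect = power(effect=-15, sigma=20, n=25)
more_variability = power(effect=-10, sigma=30, n=25)
one_tailed = power(effect=-10, sigma=20, n=25, tails=1)

print(f"baseline:          {baseline:.3f}")
print(f"larger sample:     {larger_sample:.3f}")
print(f"lower alpha:       {lower_alpha:.3f}")
print(f"larger effect:     {larger_effect:.3f}")
print(f"more variability:  {more_variability:.3f}")
print(f"one-tailed test:   {one_tailed:.3f}")
```

Compared with the baseline, power rises with a larger sample, a larger effect, or a one-tailed test, and falls with a lower significance level or more variability. The exact values depend entirely on the assumed baseline; only the direction of each change generalizes.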

Note: Students are not expected to compute power on the AP Statistics test. But they are expected to know how the factors listed above affect power.

Test Your Understanding

Problem 1

In the context of hypothesis testing, what does the power of a hypothesis test refer to?

(A) The probability of not rejecting the null hypothesis when it is false.
(B) The probability of rejecting the null hypothesis when it is true.
(C) The probability of rejecting the null hypothesis when it is false.
(D) The probability of not rejecting the null hypothesis when it is true.
(E) The smallest significance level at which the null hypothesis can be rejected.

Solution

The answer is C. Power is the probability of correctly rejecting a false null hypothesis. That is, power is the probability of not committing a Type II error.

Option A is the probability of committing a Type II error. Option B is the probability of committing a Type I error, which is the significance level (α). Option D is the probability of correctly failing to reject a true null hypothesis, which equals 1 - α. Option E describes the P-value.

Problem 2

Other things being equal, which of the following actions will increase the power of a hypothesis test?

I. Increasing sample size.
II. Changing the significance level from 0.01 to 0.05.
III. Increasing beta, the probability of a Type II error.

(A) I only
(B) II only
(C) III only
(D) All of the above
(E) None of the above

Solution

The correct answer is (E). Increasing sample size makes the hypothesis test more sensitive - more likely to reject the null hypothesis when it is, in fact, false. Changing the significance level from 0.01 to 0.05 makes the region of acceptance smaller, which makes the hypothesis test more likely to reject the null hypothesis, thus increasing the power of the test. Since, by definition, power is equal to one minus beta, the power of a test will get smaller as beta gets bigger. So actions I and II increase power, and action III decreases it. Because none of the options (A) through (D) matches "I and II only," the answer is none of the above.

Problem 3

Suppose a researcher conducts an experiment to test a hypothesis. Other things being equal (sample size, significance level, etc.), which of the following will increase if she uses a one-tailed test instead of a two-tailed test?

I. The power of the hypothesis test.
II. The effect size of the hypothesis test.
III. The probability of making a Type II error.

(A) I only
(B) I and II
(C) I and III
(D) All of the above
(E) None of the above

Solution

The correct answer is (A). Using a one-tailed hypothesis test instead of a two-tailed hypothesis test increases power. A two-tailed test splits the significance level (α) between two tails, requiring stronger evidence to reject the null.

The effect size is the same, whether the researcher uses a one-tailed test or a two-tailed test. Since the power of the hypothesis test gets bigger with a one-tailed test, the probability of making a Type II error gets smaller, not bigger.
