Sphericity and Repeated Measures ANOVA

A standard analysis of variance with repeated measures assumes that a condition called sphericity is satisfied. This lesson answers four questions about sphericity.

  • What is sphericity?
  • Why is sphericity important?
  • How can you test for sphericity?
  • How can you correct for sphericity violations?

After we answer those questions, we'll work through an example to illustrate the various steps an analyst should take to deal with sphericity.

What is Sphericity?

Sphericity is a key assumption that must be satisfied to justify a standard analysis of variance with data from a repeated measures experiment. The sphericity assumption is satisfied when the variance of the difference scores is the same for every pair of levels of a repeated measures factor. It is violated when that variance differs from one pair of levels to another.

Consider the data from the repeated measures experiment shown below. Each of the first six rows shows raw scores or difference scores for a single subject. The first three columns show raw scores for three treatment levels (T1, T2, and T3). The last three columns show difference scores for each pair of treatment levels (T1-T2, T1-T3, and T2-T3). And finally, the last row shows sample variances for each column of difference scores.

                Raw scores          Difference scores
              T1    T2    T3      T1-T2   T1-T3   T2-T3
               9     8     7        1       2       1
               8     6     4        2       4       2
               7     7     6        0       1       1
               7     8     6       -1       1       2
               6     5     3        1       3       2
               5     5     5        0       0       0
Variance                           1.10    2.17    0.67

The variances in the table are not equal. This should make the analyst suspicious that the sphericity assumption is not satisfied in this set of data.
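The variance of each column of difference scores can be computed directly from the raw scores. The following is a minimal Python sketch that uses only the standard library; the names `scores`, `labels`, and `difference_variances` are illustrative choices, not part of the lesson.

```python
from itertools import combinations
from statistics import variance

# Raw scores from the table above: one row per subject, columns T1, T2, T3.
scores = [
    (9, 8, 7),
    (8, 6, 4),
    (7, 7, 6),
    (7, 8, 6),
    (6, 5, 3),
    (5, 5, 5),
]
labels = ("T1", "T2", "T3")


def difference_variances(rows, names):
    """Return {pair label: sample variance of that pair's difference scores}."""
    result = {}
    for i, j in combinations(range(len(names)), 2):
        diffs = [row[i] - row[j] for row in rows]
        # statistics.variance uses the n - 1 denominator (sample variance).
        result[f"{names[i]}-{names[j]}"] = variance(diffs)
    return result


for pair, var in difference_variances(scores, labels).items():
    print(f"{pair}: {var:.2f}")
```

Roughly equal variances across the pairs are consistent with sphericity; clearly unequal variances are a warning sign, although a formal test is still needed to decide.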

Note: The sphericity assumption is always satisfied when a repeated measures treatment variable has exactly two levels. With only two levels, there is only one set of difference scores and only one variance; so a scenario with unequal variances cannot occur.

Why is Sphericity Important?

With certain repeated measures experiments, the sphericity assumption has implications for hypothesis testing.

  • When the sphericity assumption is satisfied, the F-test in a standard analysis of variance is appropriate.
  • When the sphericity assumption is violated, the F-test in a standard analysis of variance will be positively biased; that is, you will be more likely to make a Type I error (i.e., reject the null hypothesis when it is, in fact, true).

If the standard F-test indicates that a repeated measures effect is statistically significant, be wary. Unless the sphericity assumption is satisfied, the standard statistical test may be misleading.

If possible, you should test the validity of the sphericity assumption whenever a repeated measures statistical test is significant. And if the sphericity assumption is violated, you should adjust the standard F-test to reduce the likelihood of making a Type I error. Later in this lesson, we describe how to adjust the F-test in a standard analysis of variance.

How to Test for Sphericity

If you have a good statistical software package (e.g., SPSS, SAS), you can test the validity of the sphericity assumption. The most common test is Mauchly's sphericity test.

Mauchly's Sphericity Test

If Mauchly's sphericity test is statistically significant (p < .05), you reject the sphericity assumption and you make an adjustment to the standard F-test. If the sphericity test is not statistically significant (p ≥ .05), you can use the standard F-test in your analysis.

With most software packages, output from Mauchly's sphericity test looks something like this:

Within subjects effect   Mauchly's W   Chi square   DF   Sig
FACTOR 1                 0.01          13.5         2    0.001

The important information is in the last column - the significance level associated with Mauchly's sphericity test. In this hypothetical example, the significance level (0.001) is less than 0.05. This suggests that we are dealing with a violation of the sphericity assumption, and we will want to adjust the F-test.
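The Sig value in output like this is the upper-tail probability of the chi-square statistic at the listed degrees of freedom. As a rough cross-check, the sketch below relies on a closed form that holds only when the degrees of freedom are even; the function name `chi_square_upper_tail` is hypothetical, and odd degrees of freedom are deliberately not handled.

```python
import math


def chi_square_upper_tail(x, df):
    """P(chi-square with `df` degrees of freedom > x), for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    # For df = 2m: P(X > x) = exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
    total = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * total


# Values taken from the hypothetical output above.
p = chi_square_upper_tail(13.5, 2)
print(f"p = {p:.4f}")
print("reject sphericity" if p < 0.05 else "do not reject sphericity")
```

The printed value can be compared with the Sig column, and the last line applies the p < .05 decision rule described earlier.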

A Sphericity Fallback

Computations required to conduct Mauchly's sphericity test are complex, beyond the scope of what we cover on this website. So, if you don't have the right statistical software and you're not a world-class statistician, Mauchly's sphericity test is not an option.

Without a good way to test the validity of the sphericity assumption, you have two choices:

  • Assume the sphericity assumption is satisfied, and use the standard F-test without adjustment.
  • Assume the sphericity assumption is violated, and adjust the F-test accordingly.

In the real world, the sphericity assumption is seldom satisfied exactly, so adjusting the F-test to correct for potential sphericity violations is the safer choice.

How to Correct for Sphericity

If the sphericity assumption is violated, the P value in a standard ANOVA table will underestimate the true P value and increase the likelihood of a Type I error. To correct for a sphericity violation, it is necessary to compute a new P value that accounts for the departure from sphericity. The key to this computation is a correction factor called epsilon (ε).

Epsilon

Epsilon measures the severity of the sphericity violation. Its value ranges from a minimum of 1/(k-1) to a maximum of 1, where k is the number of levels of the repeated measures factor. When the sphericity assumption is valid, epsilon will equal 1 exactly. The more severe the sphericity violation, the smaller the value of epsilon.

There are different ways to estimate epsilon, each with advantages and disadvantages. Here are the three most common methods:

  • Huynh-Feldt estimate. As a rule of thumb, use this estimate when ε is greater than 0.75.
  • Greenhouse-Geisser estimate. As a rule of thumb, use this estimate when ε is less than 0.75.
  • Lower-bound estimate. Use this estimate when neither of the other two estimates is available. This estimate assumes a maximum violation of the sphericity assumption, so ε = 1/(k‑1).

Like Mauchly's sphericity test, the first two estimates (Huynh-Feldt and Greenhouse-Geisser) are challenging to calculate by hand; so you will probably need a software package (e.g., SPSS, SAS) to use those estimates. The lower-bound estimate is easy to calculate, but it increases the likelihood of a Type II error (failing to reject the null hypothesis when it is, in fact, false).
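For readers who want to see what the software computes, here is a hedged Python sketch of the two estimates. It uses the commonly published textbook formulas for a single-group design (a double-centered covariance matrix for Greenhouse-Geisser, and the Huynh-Feldt adjustment capped at 1). These formulas are not given in this lesson, so treat them as an assumption; statistical packages may apply slightly different variants. The function names are illustrative, and the data are the raw scores from the table earlier in this lesson.

```python
from statistics import mean


def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator) of the treatment columns."""
    n, k = len(rows), len(rows[0])
    col_means = [mean([row[j] for row in rows]) for j in range(k)]
    return [
        [
            sum((row[i] - col_means[i]) * (row[j] - col_means[j]) for row in rows)
            / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


def greenhouse_geisser(rows):
    """Greenhouse-Geisser estimate of epsilon (assumed textbook formula)."""
    s = covariance_matrix(rows)
    k = len(s)
    row_means = [mean(r) for r in s]
    grand = mean(row_means)
    # Double-center the covariance matrix: remove row, column, and grand means.
    c = [
        [s[i][j] - row_means[i] - row_means[j] + grand for j in range(k)]
        for i in range(k)
    ]
    trace = sum(c[i][i] for i in range(k))
    sum_sq = sum(v * v for r in c for v in r)
    return trace ** 2 / ((k - 1) * sum_sq)


def huynh_feldt(rows):
    """Huynh-Feldt estimate of epsilon for one group of subjects, capped at 1."""
    n, k = len(rows), len(rows[0])
    gg = greenhouse_geisser(rows)
    hf = (n * (k - 1) * gg - 2) / ((k - 1) * (n - 1 - (k - 1) * gg))
    return min(hf, 1.0)


# Raw scores from the table earlier in this lesson (columns T1, T2, T3).
scores = [(9, 8, 7), (8, 6, 4), (7, 7, 6), (7, 8, 6), (6, 5, 3), (5, 5, 5)]
print(f"Greenhouse-Geisser epsilon: {greenhouse_geisser(scores):.3f}")
print(f"Huynh-Feldt epsilon:        {huynh_feldt(scores):.3f}")
print(f"Lower-bound epsilon:        {1 / (len(scores[0]) - 1):.3f}")
```

Both estimates should fall between the lower bound 1/(k-1) and 1, with the Huynh-Feldt value at least as large as the Greenhouse-Geisser value.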

Use Epsilon to Adjust P Value

Whether you use the Huynh-Feldt, the Greenhouse-Geisser, or the lower-bound estimate of epsilon, you follow the same four steps to define a new P value.

  • Generate a standard ANOVA table.
  • Calculate the value of epsilon.
  • Multiply the degrees of freedom for the F ratio of the repeated measures effect by epsilon.
  • Using the existing F ratio and the newly-defined degrees of freedom, find a new P value.

Use the new P value to assess the significance of the repeated measures treatment effect.
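The four steps above can be sketched in Python. The standard library has no F distribution, so this sketch evaluates the upper tail through the regularized incomplete beta function with a continued fraction; that numerical method is my choice for illustration, not something this lesson prescribes, and in practice a statistics library would normally do this job. All function names and the example numbers at the bottom are hypothetical.

```python
import math


def _beta_cf(a, b, x, max_iter=300, tol=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < tol:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for a, b > 0 and 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_upper_tail(f, df1, df2):
    """P[F(df1, df2) > f]; the degrees of freedom may be fractional."""
    x = df2 / (df2 + df1 * f)
    return regularized_incomplete_beta(df2 / 2.0, df1 / 2.0, x)


def epsilon_adjusted_p(f, df_effect, df_error, epsilon):
    """Steps 3 and 4: scale both degrees of freedom by epsilon, keep F as is."""
    return f_upper_tail(f, epsilon * df_effect, epsilon * df_error)


# Hypothetical inputs: F = 5.0 with 3 and 27 degrees of freedom.
for eps in (1.0, 0.7, 1 / 3):
    print(f"epsilon = {eps:.2f}: p = {epsilon_adjusted_p(5.0, 3, 27, eps):.4f}")
```

Setting epsilon to 1 reproduces the unadjusted P value from the standard ANOVA table, which makes a convenient sanity check before applying a correction.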

A Software Solution

If you use statistical software, the computer will do all the heavy lifting. For example, software output may look something like this:

Source     Epsilon               df    F    P
Factor 1   Sphericity assumed    10    3    0.005
           Greenhouse-Geisser     6    3    0.02
           Huynh-Feldt            7    3    0.01
           Lower-bound            1    3    0.14

Error      Sphericity assumed    50
           Greenhouse-Geisser    30
           Huynh-Feldt           35
           Lower-bound            5

The computer generates output for four scenarios. The "Sphericity assumed" rows show standard ANOVA output, with no correction for sphericity violations. The remaining rows show adjustments attributable to the Greenhouse-Geisser epsilon correction, the Huynh-Feldt epsilon correction, and the lower-bound epsilon correction. Notice that the F ratio does not change under any adjustment scenario; only the degrees of freedom and the P value change.

In this hypothetical example, a researcher would reject the null hypothesis based on a standard analysis (p = 0.005), a Greenhouse-Geisser epsilon correction (p = 0.02), or a Huynh-Feldt epsilon correction (p = 0.01), but not based on the lower-bound epsilon correction (p = 0.14). This may be an instance that illustrates the vulnerability of the lower-bound correction to Type II errors (failing to reject the null hypothesis when it is, in fact, false).

A Manual Solution

Without the right statistical software, you may not have a choice of options for estimating epsilon. Your best option for dealing with sphericity violations may be to make a lower-bound epsilon correction to output from a standard ANOVA table. To illustrate the process, we'll work through an example.

In the last lesson, we analyzed data from a single-factor, repeated measures experiment. In this experiment, each subject provided dependent variable scores at three different treatment levels (k=3), and the analysis produced the following standard ANOVA table:

Analysis of Variance Table

Source       SS      df    MS      F      P
Treatment     5.44    2     2.72    7.0   0.01
Subjects     59.78    5    11.96   30.7   <0.01
Error         3.89   10     0.39
Total        69.11   17

The experimenter specified a significance level of 0.05 for this study. This analysis, with sphericity assumed, yielded a statistically significant P value of 0.01 for the treatment effect. However, if the sphericity assumption is not warranted, the significant P value shown in the ANOVA table may be misleading. It could cause us to reject the null hypothesis when it is true (i.e., make a Type I error).

To reduce the risk of making a Type I error, we can make a lower-bound correction to our analysis. Using data that is readily available from the standard ANOVA table, we can make a lower-bound correction in three steps:

  • Calculate the value of epsilon.
  • Multiply the degrees of freedom for the F ratio of the repeated measures effect by epsilon.
  • Using the existing F ratio and the newly-defined degrees of freedom, find a new P value that will be less vulnerable to a Type I error.

Calculate Epsilon

A lower bound estimate of epsilon (ε) is:

ε = 1/(k‑1)

ε = 1/(3‑1) = 0.5

In the formula above, k is the number of treatment levels for the repeated measures factor. Notice that the (k‑1) term in the formula above is the degrees of freedom for the repeated measures treatment factor.

Adjust Degrees of Freedom

Next, we define a new set of degrees of freedom for the F ratio of the repeated measures treatment effect. The degrees of freedom for the treatment effect F ratio are the treatment degrees of freedom (dfTR) and the error degrees of freedom (dfE). To define the new degrees of freedom, we multiply the old degrees of freedom (from the standard ANOVA table) by epsilon (ε).

New dfTR = ε * dfTR

New dfTR = 0.5 * 2 = 1

New dfE = ε * dfE

New dfE = 0.5 * 10 = 5
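The arithmetic for the lower-bound epsilon and the adjusted degrees of freedom is simple enough to script. This is a small Python sketch; the function and variable names are illustrative only.

```python
def lower_bound_epsilon(k):
    """Lower-bound epsilon for a repeated measures factor with k levels."""
    if k < 2:
        raise ValueError("a repeated measures factor needs at least 2 levels")
    return 1.0 / (k - 1)


# Values from the standard ANOVA table in this example.
k = 3
df_treatment, df_error = 2, 10

epsilon = lower_bound_epsilon(k)
new_df_treatment = epsilon * df_treatment
new_df_error = epsilon * df_error

print(f"epsilon = {epsilon}")
print(f"new dfTR = {new_df_treatment}, new dfE = {new_df_error}")
```

Because the treatment degrees of freedom equal k - 1, the lower-bound correction always reduces the adjusted treatment degrees of freedom to 1.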

Find New P Value

And finally, using the new degrees of freedom, we find a new P value for the F ratio from the ANOVA table. Specifically, the P-value we are looking for is the probability that an F with 1 and 5 degrees of freedom is greater than 7. We want to know:

P [ F(1, 5) > 7 ]

Now, we are ready to use the F Distribution Calculator. We enter the new degrees of freedom (v1 = 1) for the treatment effect, the new degrees of freedom (v2 = 5) for the error effect, and the F value (7) into the calculator; and hit the Calculate button.

F-Distribution calculator shows cumulative probability equals 0.95.

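The calculator reports that the probability that F is less than or equal to 7 is 0.95 (rounded to two decimal places). Therefore, the probability that F is greater than 7 equals 1 minus 0.95, or 0.05. Carried to three decimal places, this probability is about 0.046, which is just below the 0.05 significance level. Hence, the corrected P-value for the treatment variable is approximately 0.05.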
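If an F distribution calculator is not at hand, the same probability can be approximated numerically. An F variable with 1 and ν degrees of freedom is the square of a t variable with ν degrees of freedom, so P[F(1, ν) > f] equals the two-tailed probability P[|t| > √f]. The Python sketch below integrates the t density with Simpson's rule; the function names and the step count are arbitrary illustrative choices.

```python
import math


def t_density(u, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(u * u / nu))


def f_upper_tail_df1(f, nu, steps=2000):
    """P[F(1, nu) > f], using F(1, nu) = t(nu)^2 and Simpson's rule."""
    if steps % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of steps")
    upper = math.sqrt(f)
    h = upper / steps
    total = t_density(0.0, nu) + t_density(upper, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, nu)
    # The integral from 0 to sqrt(f), doubled, gives P(-sqrt(f) <= t <= sqrt(f)).
    central = 2.0 * (h / 3.0) * total
    return 1.0 - central


p = f_upper_tail_df1(7.0, 5)
print(f"P[F(1, 5) > 7] = {p:.3f}")
```

This shortcut applies only when the first degrees of freedom equal 1, which is always the case after a lower-bound correction. The printed value should agree with the calculator result to within rounding.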

Let's review. When we conducted a standard analysis of variance, which assumes the sphericity assumption is satisfied, we observed a significant treatment effect (p = 0.01). But we couldn't be confident in that result because of potential sphericity issues. When we used a lower-bound epsilon correction to account for a potential violation of the sphericity assumption, we still observed a significant treatment effect (p ≈ 0.05, or about 0.046 before rounding, which is just under the 0.05 significance level).

The lower-bound epsilon correction assumes a maximum violation of sphericity. Even so, the corrected P-value was still statistically significant. This provides confidence that the original decision to reject the null hypothesis was valid.

Conclusion: If you don't have access to good software and cannot use the Greenhouse-Geisser correction or the Huynh-Feldt correction, this example shows that the lower-bound epsilon correction can be helpful, despite its propensity for Type II errors.

Test Your Understanding

Problem 1

Which of the following statements about sphericity is true?

(A) The sphericity assumption should be satisfied for ANOVA with independent groups.
(B) Violation of the sphericity assumption increases the likelihood of a Type II error.
(C) The sphericity assumption is never satisfied for a treatment factor with only two levels.
(D) None of the above.
(E) All of the above.

Solution

The correct answer is (D).

None of the statements about sphericity are correct. The sphericity assumption is required for ANOVA with repeated measures, not for ANOVA with independent groups. Violation of the sphericity assumption increases the likelihood of a Type I error, not a Type II error. And finally, the sphericity assumption is always satisfied for treatment factors with two levels. For factors with three or more levels, the sphericity assumption is often (but not always) violated.


Problem 2

A standard ANOVA assumes the sphericity assumption is not violated. Suppose you conduct a standard ANOVA for a repeated measures experiment; and based on the hypothesis test, you cannot reject the null hypothesis. If you found out later that the sphericity assumption was actually violated, what would you do?

(A) I would reject the null hypothesis.
(B) I would not know whether or not to reject the null hypothesis.
(C) I would not reject the null hypothesis.
(D) I would re-test the null hypothesis with an epsilon correction.
(E) None of the above.

Solution

The correct answer is (C).

The effect of a sphericity violation on a standard ANOVA is to increase the likelihood of a Type I error (reject the null hypothesis when it is, in fact, true). In this case, however, the standard ANOVA did not reject the null hypothesis; so there is no chance that a Type I error occurred. As a result, this finding of the standard ANOVA is not called into question by violation of the sphericity assumption.

Key takeaway: If you conduct a standard ANOVA for a repeated measures experiment and fail to reject the null hypothesis, you can accept that result - even when the sphericity assumption is violated. There is no need to repeat the analysis with an epsilon correction. On the other hand, if you conduct a standard ANOVA for a repeated measures experiment and reject the null hypothesis, that finding could be questioned if the sphericity assumption is invalid. In that case, it would make sense to repeat the analysis with an epsilon correction.