Stat Trek

Teach yourself statistics


One-Way Analysis of Variance: Example

In this lesson, we apply one-way analysis of variance to some fictitious data, and we show how to interpret the results of our analysis.

Note: Computations for analysis of variance are usually handled by a software package. For this example, however, we will do the computations "manually", since the gory details have educational value.

Problem Statement

A pharmaceutical company conducts an experiment to test the effect of a new cholesterol medication. The company selects 15 subjects randomly from a larger population. Each subject is randomly assigned to one of three treatment groups. Within each treatment group, subjects receive a different dose of the new medication. In Group 1, subjects receive 0 mg/day; in Group 2, 50 mg/day; and in Group 3, 100 mg/day.

The treatment levels represent all the levels of interest to the experimenter, so this experiment used a fixed-effects model to select treatment levels for study.

After 30 days, doctors measure the cholesterol level of each subject. The results for all 15 subjects appear in the table below:

Dosage

| Group 1, 0 mg | Group 2, 50 mg | Group 3, 100 mg |
|---|---|---|
| 210 | 210 | 180 |
| 240 | 240 | 210 |
| 270 | 240 | 210 |
| 270 | 270 | 210 |
| 300 | 270 | 240 |

In conducting this experiment, the experimenter had two research questions:

  • Does dosage level have a significant effect on cholesterol level?
  • How strong is the effect of dosage level on cholesterol level?

To answer these questions, the experimenter intends to use one-way analysis of variance.

Is One-Way ANOVA the Right Technique?

Before you crunch the first number in one-way analysis of variance, you must be sure that one-way analysis of variance is the correct technique. That means you need to ask two questions:

  • Is the experimental design compatible with one-way analysis of variance?
  • Does the data set satisfy the critical assumptions required for one-way analysis of variance?

Let's address both of those questions.

Experimental Design

As we discussed in the previous lesson (see One-Way Analysis of Variance: Fixed Effects), one-way analysis of variance is only appropriate with one experimental design - a completely randomized design. That is exactly the design used in our cholesterol study, so we can check the experimental design box.

Critical Assumptions

We also learned in the previous lesson that one-way analysis of variance makes three critical assumptions:

  • Independence. The dependent variable score for each experimental unit is independent of the score for any other unit.
  • Normality. In the population, dependent variable scores are normally distributed within treatment groups.
  • Equality of variance. In the population, the variance of dependent variable scores in each treatment group is equal. (Equality of variance is also known as homogeneity of variance or homoscedasticity.)

Therefore, for the cholesterol study, we need to make sure our data set is consistent with the critical assumptions.

Independence of Scores

The assumption of independence is the most important assumption. When that assumption is violated, the resulting statistical tests can be misleading.

The independence assumption is satisfied by the design of the study, which features random selection of subjects and random assignment to treatment groups. Randomization tends to distribute effects of extraneous variables evenly across groups.

Normal Distributions in Groups

Violations of normality can be a problem when sample size is small, as it is in this cholesterol study. Therefore, it is important to be on the lookout for any indication of non-normality.

There are many different ways to check for normality. On this website, we describe three at: How to Test for Normality: Three Simple Tests. Given the small sample size, our best option for testing normality is to look at the following descriptive statistics:

  • Central tendency. The mean and the median are summary measures used to describe central tendency - the most "typical" value in a set of values. With a normal distribution, the mean is equal to the median.
  • Skewness. Skewness is a measure of the asymmetry of a probability distribution. If observations are equally distributed around the mean, the skewness value is zero; otherwise, the skewness value is positive or negative. As a rule of thumb, skewness between -2 and +2 is consistent with a normal distribution.
  • Kurtosis. Kurtosis is a measure of whether observations cluster around the mean of the distribution or in the tails of the distribution. The normal distribution has a kurtosis value of zero. As a rule of thumb, kurtosis between -2 and +2 is consistent with a normal distribution.

The table below shows the mean, median, skewness, and kurtosis for each group from our study.

| Statistic | Group 1, 0 mg | Group 2, 50 mg | Group 3, 100 mg |
|---|---|---|---|
| Mean | 258 | 246 | 210 |
| Median | 270 | 240 | 210 |
| Range | 90 | 60 | 60 |
| Skewness | -0.40 | -0.51 | 0.00 |
| Kurtosis | -0.18 | -0.61 | 2.00 |

In all three groups, the difference between the mean and median looks small (relative to the range). And skewness and kurtosis measures are consistent with a normal distribution (i.e., between -2 and +2). These are crude tests, but they provide some confidence for the assumption of normality in each group.
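The descriptive statistics in the table can be reproduced with a short script. The sketch below assumes the bias-adjusted sample formulas for skewness and excess kurtosis (the same formulas that Excel's SKEW and KURT functions use); the lesson does not state which formulas it used, but these reproduce the tabled values. The function names are ours, chosen for illustration.

```python
from statistics import mean, median, stdev

# Cholesterol scores from the data table, by dosage group.
groups = {
    "Group 1": [210, 240, 270, 270, 300],  # 0 mg
    "Group 2": [210, 240, 240, 270, 270],  # 50 mg
    "Group 3": [180, 210, 210, 210, 240],  # 100 mg
}

def sample_skewness(xs):
    """Bias-adjusted sample skewness (assumed formula, as in Excel's SKEW)."""
    n, m, s = len(xs), mean(xs), stdev(xs)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in xs)

def sample_excess_kurtosis(xs):
    """Bias-adjusted sample excess kurtosis (assumed formula, as in Excel's KURT)."""
    n, m, s = len(xs), mean(xs), stdev(xs)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * sum(((x - m) / s) ** 4 for x in xs) - correction

for name, xs in groups.items():
    print(
        name,
        "mean:", mean(xs),
        "median:", median(xs),
        "range:", max(xs) - min(xs),
        "skewness:", round(sample_skewness(xs), 2),
        "kurtosis:", round(sample_excess_kurtosis(xs), 2),
    )
```

Both formulas require at least four observations per group, which the cholesterol study (five per group) satisfies.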

Note: With Excel, you can easily compute the descriptive statistics in the table above. To see how, go to: How to Test for Normality: Example 1.

Homogeneity of Variance

When the normality assumption is satisfied, you can use Hartley's Fmax test to test for homogeneity of variance. Here's how to implement the test:

  • Step 1. Compute the sample variance ( $s_j^2$ ) for each group.

    $$ s_j^2 = \frac{ \sum_{i=1}^{n_j} ( X_{i,j} - \bar{X}_j )^2 }{ n_j - 1 } $$

    where X i, j is the score for observation i in Group j , X j is the mean of Group j, and n j is the number of observations in Group j.

    Here is the variance ( s2j ) for each group in the cholesterol study.

    | Group 1, 0 mg | Group 2, 50 mg | Group 3, 100 mg |
    |---|---|---|
    | 1170 | 630 | 450 |
  • Step 2. Compute an F ratio from the following formula:

    FRATIO = s2MAX / s2MIN

    FRATIO = 1170 / 450

    FRATIO = 2.6

    where s2MAX is the largest group variance, and s2MIN is the smallest group variance.

  • Step 3. Compute degrees of freedom ( df ).

    df = n - 1

    df = 5 - 1

    df = 4

    where n is the largest sample size in any group.

  • Step 4. Based on the degrees of freedom ( 4 ) and the number of groups ( 3 ), find the critical F value from the Table of Critical F Values for Hartley's Fmax Test. From the table, we see that the critical Fmax value is 15.5.

    Note: The critical F values in the table are based on a significance level of 0.05.

  • Step 5. Compare the observed F ratio computed in Step 2 to the critical F value recovered from the Fmax table in Step 4. If the F ratio is smaller than the Fmax table value, the variances are homogeneous. Otherwise, the variances are heterogeneous.

    Here, the F ratio (2.6) is smaller than the Fmax value (15.5), so we conclude that the variances are homogeneous.
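
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a general implementation: the critical value 15.5 is simply the table value quoted in Step 4 (3 groups, 4 degrees of freedom, significance level 0.05), hard-coded rather than looked up.

```python
from statistics import variance

groups = [
    [210, 240, 270, 270, 300],  # Group 1, 0 mg
    [210, 240, 240, 270, 270],  # Group 2, 50 mg
    [180, 210, 210, 210, 240],  # Group 3, 100 mg
]

# Step 1: sample variance (divisor n_j - 1) for each group.
variances = [variance(g) for g in groups]

# Step 2: ratio of the largest variance to the smallest.
f_ratio = max(variances) / min(variances)

# Step 3: degrees of freedom, based on the largest group size.
df = max(len(g) for g in groups) - 1

# Step 4: critical Fmax value quoted in the text (k = 3, df = 4, alpha = 0.05).
critical_fmax = 15.5

# Step 5: variances are treated as homogeneous if the ratio is below the critical value.
homogeneous = f_ratio < critical_fmax

print(variances, f_ratio, df, homogeneous)
```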

Note: Other tests, such as Bartlett's test, can also test for homogeneity of variance. For the record, Bartlett's test yields the same conclusion for the cholesterol study; namely, the variances are homogeneous.

Analysis of Variance

Having confirmed that the critical assumptions are tenable, we can proceed with a one-way analysis of variance. That means taking the following steps:

  • Specify a mathematical model to describe the causal factors that affect the dependent variable.
  • Write statistical hypotheses to be tested by experimental data.
  • Specify a significance level for a hypothesis test.
  • Compute the grand mean and the mean scores for each group.
  • Compute sums of squares for each effect in the model.
  • Find the degrees of freedom associated with each effect in the model.
  • Based on sums of squares and degrees of freedom, compute mean squares for each effect in the model.
  • Compute a test statistic, based on observed mean squares and their expected values.
  • Find the P value for the test statistic.
  • Accept or reject the null hypothesis, based on the P value and the significance level.
  • Assess the magnitude of the effect of the independent variable, based on sums of squares.

Now, let's execute each step, one-by-one, with our cholesterol medication experiment.

Mathematical Model

For every experimental design, there is a mathematical model that accounts for all of the independent and extraneous variables that affect the dependent variable. In our experiment, the dependent variable ( X ) is the cholesterol level of a subject, and the independent variable ( β ) is the dosage level administered to a subject.

For example, here is the fixed-effects model for a completely randomized design:

X i j = μ + β j + ε i ( j )

where X i j is the cholesterol level for subject i in treatment group j, μ is the population mean, β j is the effect of the dosage level administered to subjects in group j; and ε i ( j ) is the effect of all other extraneous variables on subject i in treatment j.

Statistical Hypotheses

For fixed-effects models, it is common practice to write statistical hypotheses in terms of the treatment effect β j. With that in mind, here are the null hypothesis and the alternative hypothesis for a one-way analysis of variance:

  • Null hypothesis: The null hypothesis states that the independent variable (dosage level) has no effect on the dependent variable (cholesterol level) in any treatment group. Thus,

    H0: β j = 0 for all j

  • Alternative hypothesis: The alternative hypothesis states that the independent variable has an effect on the dependent variable in at least one treatment group. Thus,

    H1: β j ≠ 0 for some j

If the null hypothesis is true, the mean score (i.e., mean cholesterol level) in each treatment group should equal the population mean. Thus, if the null hypothesis is true, mean scores in the k treatment groups should be equal. If the null hypothesis is false, at least one pair of mean scores should be unequal.

Significance Level

The significance level (also known as alpha or α) is the probability of rejecting the null hypothesis when it is actually true. The significance level for an experiment is specified by the experimenter, before data collection begins.

Experimenters often choose significance levels of 0.05 or 0.01. For this experiment, let's use a significance level of 0.05.

Mean Scores

Analysis of variance begins by computing a grand mean and group means:

  • Grand mean. The grand mean ( $\bar{X}$ ) is the mean of all observations, computed as follows:

    $$ n = \sum_{j=1}^{k} n_j = 5 + 5 + 5 = 15 $$

    $$ \bar{X} = \frac{1}{n} \sum_{j=1}^{k} \sum_{i=1}^{n_j} X_{ij} $$

    $$ \bar{X} = \frac{1}{15} ( 210 + 210 + \ldots + 270 + 240 ) $$

    $$ \bar{X} = 238 $$

  • Group means. The mean of group j ( $\bar{X}_j$ ) is the mean of all observations in group j, computed as follows:

    $$ \bar{X}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} X_{ij} $$

    $$ \bar{X}_1 = 258 $$

    $$ \bar{X}_2 = 246 $$

    $$ \bar{X}_3 = 210 $$

In the equations above, n is the total sample size across all groups; and n j is the sample size in Group j .
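
As a quick check on the arithmetic, here is a minimal Python sketch that computes the total sample size, the grand mean, and the group means from the raw scores. The variable names are ours.

```python
groups = [
    [210, 240, 270, 270, 300],  # Group 1, 0 mg
    [210, 240, 240, 270, 270],  # Group 2, 50 mg
    [180, 210, 210, 210, 240],  # Group 3, 100 mg
]

# Total sample size across all groups.
n = sum(len(g) for g in groups)

# Grand mean: sum of every observation divided by n.
grand_mean = sum(sum(g) for g in groups) / n

# Group means: sum of each group's observations divided by its size.
group_means = [sum(g) / len(g) for g in groups]

print(n, grand_mean, group_means)  # 15 238.0 [258.0, 246.0, 210.0]
```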

Sums of Squares

A sum of squares is the sum of squared deviations from a mean score. One-way analysis of variance makes use of three sums of squares:

  • Between-groups sum of squares. The between-groups sum of squares (SSB) measures variation of group means around the grand mean. It can be computed from the following formula:
    $$ SSB = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( \bar{X}_j - \bar{X} )^2 = \sum_{j=1}^{k} n_j ( \bar{X}_j - \bar{X} )^2 $$

    $$ SSB = 5 \times [ ( 238 - 258 )^2 + ( 238 - 246 )^2 + ( 238 - 210 )^2 ] $$

    SSB = 6240

  • Within-groups sum of squares. The within-groups sum of squares (SSW) measures variation of all scores around their respective group means. It can be computed from the following formula:
    $$ SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( X_{ij} - \bar{X}_j )^2 $$

    SSW = 2304 + ... + 900 = 9000

  • Total sum of squares. The total sum of squares (SST) measures variation of all scores around the grand mean. It can be computed from the following formula:
    $$ SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( X_{ij} - \bar{X} )^2 $$

    SST = 784 + 4 + 1024 + ... + 784 + 784 + 4

    SST = 15,240

It turns out that the total sum of squares is equal to the between-groups sum of squares plus the within-groups sum of squares, as shown below:

SST = SSB + SSW

15,240 = 6240 + 9000
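
The three sums of squares, and the identity SST = SSB + SSW, can be verified directly from the raw scores. The following is a minimal sketch in plain Python; the variable names are ours.

```python
groups = [
    [210, 240, 270, 270, 300],  # Group 1, 0 mg
    [210, 240, 240, 270, 270],  # Group 2, 50 mg
    [180, 210, 210, 210, 240],  # Group 3, 100 mg
]

n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n
group_means = [sum(g) / len(g) for g in groups]

# Between-groups: squared deviation of each group mean from the grand mean,
# weighted by the group size.
ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Within-groups: squared deviation of each score from its own group mean.
ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

# Total: squared deviation of each score from the grand mean.
sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

print(ssb, ssw, sst)  # 6240.0 9000.0 15240.0
```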

Degrees of Freedom

The term degrees of freedom (df) refers to the number of independent sample points used to compute a statistic minus the number of parameters estimated from the sample points.

To illustrate what is going on, let's find the degrees of freedom associated with the various sum of squares computations:

  • Between-groups degrees of freedom. The between-groups sum of squares formula appears below:
    $$ SSB = \sum_{j=1}^{k} n_j ( \bar{X}_j - \bar{X} )^2 $$

    Here, the formula uses k independent sample points, the sample means $\bar{X}_j$. And it uses one parameter estimate, the grand mean $\bar{X}$, which was estimated from the sample points. So, the between-groups sum of squares has k - 1 degrees of freedom ( dfBG ).

    dfBG = k - 1 = 3 - 1 = 2

  • Within-groups degrees of freedom. The within-groups sum of squares formula appears below:
    $$ SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( X_{ij} - \bar{X}_j )^2 $$

    Here, the formula uses n independent sample points, the individual subject scores $X_{ij}$. And it uses k parameter estimates, the group means $\bar{X}_j$, which were estimated from the sample points. So, the within-groups sum of squares has n - k degrees of freedom ( dfWG ).

    $$ n = \sum_{j=1}^{k} n_j = 5 + 5 + 5 = 15 $$

    dfWG = n - k = 15 - 3 = 12

  • Total degrees of freedom. The total sum of squares formula appears below:
    $$ SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} ( X_{ij} - \bar{X} )^2 $$

    Here, the formula uses n independent sample points, the individual subject scores X i j . And it uses one parameter estimate, the grand mean X, which was estimated from the sample points. So, the total sum of squares has n - 1 degrees of freedom ( dfTOT ).

    dfTOT = n - 1 = 15 - 1 = 14

The degrees of freedom for each sum of squares are summarized in the table below:

| Sum of squares | Degrees of freedom |
|---|---|
| Between-groups | k - 1 = 2 |
| Within-groups | n - k = 12 |
| Total | n - 1 = 14 |

Mean Squares

A mean square is an estimate of population variance. It is computed by dividing a sum of squares (SS) by its corresponding degrees of freedom (df), as shown below:

MS = SS / df

To conduct a one-way analysis of variance, we are interested in two mean squares:

  • Within-groups mean square. The within-groups mean square ( MSWG ) refers to variation due to differences among experimental units within the same group. It can be computed as follows:

    MSWG = SSW / df WG

    MSWG = 9000 / 12 = 750

  • Between-groups mean square. The between-groups mean square ( MSBG ) refers to variation due to differences among experimental units within the same group plus variation due to treatment effects. It can be computed as follows:

    MSBG = SSB / df BG

    MSBG = 6240 / 2 = 3120
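
To make the division explicit, here is a small sketch that derives the degrees of freedom from k and n and then applies MS = SS / df. The helper name mean_square is ours; the sums of squares are the values computed earlier in the lesson.

```python
def mean_square(sum_of_squares, degrees_of_freedom):
    """Mean square: a sum of squares divided by its degrees of freedom."""
    return sum_of_squares / degrees_of_freedom

k = 3    # number of treatment groups
n = 15   # total sample size

ssb = 6240  # between-groups sum of squares
ssw = 9000  # within-groups sum of squares

df_bg = k - 1  # between-groups degrees of freedom
df_wg = n - k  # within-groups degrees of freedom

ms_bg = mean_square(ssb, df_bg)
ms_wg = mean_square(ssw, df_wg)

print(df_bg, df_wg, ms_bg, ms_wg)  # 2 12 3120.0 750.0
```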

Expected Value

The expected value of a mean square is the average value of the mean square over a large number of experiments.

Statisticians have derived formulas for the expected value of the within-groups mean square ( MSWG ) and for the expected value of the between-groups mean square ( MSBG ). For one-way analysis of variance, the expected value formulas are:

Fixed- and Random-Effects:

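$$ E( MS_{WG} ) = \sigma_\varepsilon^2 $$

Fixed-Effects:

$$ E( MS_{BG} ) = \sigma_\varepsilon^2 + \frac{ \sum_{j=1}^{k} \beta_j^2 }{ k - 1 } $$

Random-Effects:

$$ E( MS_{BG} ) = \sigma_\varepsilon^2 + n \sigma_\beta^2 $$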

In the equations above, E( MSWG ) is the expected value of the within-groups mean square; E( MSBG ) is the expected value of the between-groups mean square; n is the number of observations in each treatment group (5 in this study, assuming equal group sizes); k is the number of treatment groups; β j is the treatment effect in Group j; $\sigma_\varepsilon^2$ is the variance attributable to everything except the treatment effect (i.e., all the extraneous variables); and $\sigma_\beta^2$ is the variance due to random selection of treatment levels.

Notice that E( MSBG ) equals E( MSWG ) when the variation due to treatment effects ( β j for fixed effects and $\sigma_\beta^2$ for random effects) is zero (i.e., when the independent variable does not affect the dependent variable). In that case, MSBG should be about the same size as MSWG. And MSBG should tend to be bigger than MSWG when the variation due to treatment effects is not zero (i.e., when the independent variable does affect the dependent variable).

Conclusion: By examining the relative size of the mean squares, we can make a judgment about whether an independent variable affects a dependent variable.
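
One way to see what "average value over a large number of experiments" means is to simulate many experiments and average the mean squares. The sketch below is only an illustration under stated assumptions: scores are drawn from normal distributions with a hypothetical error standard deviation of 30 and a hypothetical population mean of 238, and the non-zero treatment effects (20, 8, -28) are borrowed from the observed deviations of the group means from the grand mean. None of these settings come from the lesson itself.

```python
import random

def mean_squares(groups):
    """Return (MS_BG, MS_WG) for a list of groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return ssb / (k - 1), ssw / (n - k)

def average_mean_squares(effects, sigma=30.0, mu=238.0, n_j=5, reps=4000, seed=0):
    """Average MS_BG and MS_WG over many simulated experiments (hypothetical settings)."""
    rng = random.Random(seed)
    total_bg = total_wg = 0.0
    for _ in range(reps):
        groups = [
            [mu + effect + rng.gauss(0.0, sigma) for _ in range(n_j)]
            for effect in effects
        ]
        ms_bg, ms_wg = mean_squares(groups)
        total_bg += ms_bg
        total_wg += ms_wg
    return total_bg / reps, total_wg / reps

# No treatment effect: both averages should land near sigma squared (900).
null_bg, null_wg = average_mean_squares(effects=[0.0, 0.0, 0.0])

# Non-zero treatment effects: the between-groups average should be clearly larger.
alt_bg, alt_wg = average_mean_squares(effects=[20.0, 8.0, -28.0])

print(round(null_bg), round(null_wg), round(alt_bg), round(alt_wg))
```

With no treatment effect, the two averages should be roughly equal; with treatment effects, only the between-groups average grows, which is the pattern the F ratio is designed to detect.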

Test Statistic

Suppose we use the mean squares to define a test statistic F as follows:

F(v1, v2) = MSBG / MSWG

F(2, 12) = 3120 / 750 = 4.16

where MSBG is the between-groups mean square, MSWG is the within-groups mean square, v1 is the degrees of freedom for MSBG, and v2 is the degrees of freedom for MSWG.

Defined in this way, the F ratio measures the size of MSBG relative to MSWG. The F ratio is a convenient measure that we can use to test the null hypothesis. Here's how:

  • When the F ratio is close to one, MSBG is approximately equal to MSWG. This indicates that the independent variable did not affect the dependent variable, so we cannot reject the null hypothesis.
  • When the F ratio is significantly greater than one, MSBG is bigger than MSWG. This indicates that the independent variable did affect the dependent variable, so we must reject the null hypothesis.

What does it mean for the F ratio to be significantly greater than one? To answer that question, we need to talk about the P-value.

P-Value

In an experiment, a P-value is the probability of obtaining a result more extreme than the observed experimental outcome, assuming the null hypothesis is true.

With analysis of variance, the F ratio is the observed experimental outcome that we are interested in. So, the P-value would be the probability that an F statistic would be more extreme (i.e., bigger) than the actual F ratio computed from experimental data.

We can use Stat Trek's F Distribution Calculator to find the probability that an F statistic will be bigger than the actual F ratio observed in the experiment. Enter the between-groups degrees of freedom (2), the within-groups degrees of freedom (12), and the observed F ratio (4.16) into the calculator; then, click the Calculate button.


From the calculator, we see that P( F > 4.16 ) equals about 0.04. Therefore, the P-value is 0.04.
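
If a calculator is not at hand, this particular P-value can be checked with a shortcut. When the numerator degrees of freedom equal 2, the upper-tail probability of the F distribution has a closed form: P( F > f ) = ( 1 + 2f / v2 ) raised to the power -v2 / 2. The sketch below uses that shortcut; it is specific to v1 = 2 and is not a description of how the calculator works. The function name is ours.

```python
def f_upper_tail_v1_equals_2(f_value, v2):
    """P(F > f_value) for an F distribution with 2 and v2 degrees of freedom.

    Closed form that holds only when the numerator degrees of freedom are 2.
    """
    return (1 + 2 * f_value / v2) ** (-v2 / 2)

f_ratio = 3120 / 750  # MSBG / MSWG = 4.16
p_value = f_upper_tail_v1_equals_2(f_ratio, v2=12)

print(round(f_ratio, 2), round(p_value, 3))  # 4.16 0.042
```

For other numerator degrees of freedom, a statistics package or an F table is needed.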

Hypothesis Test

Recall that we specified a significance level of 0.05 for this experiment. Once you know the significance level and the P-value, the hypothesis test is routine. Here's the decision rule for accepting or rejecting the null hypothesis:

  • If the P-value is bigger than the significance level, accept the null hypothesis.
  • If the P-value is equal to or smaller than the significance level, reject the null hypothesis.

Since the P-value (0.04) in our experiment is smaller than the significance level (0.05), we reject the null hypothesis that drug dosage had no effect on cholesterol level. And we conclude that the mean cholesterol level in at least one treatment group differed significantly from the mean cholesterol level in another group.

Magnitude of Effect

The hypothesis test tells us whether the independent variable in our experiment has a statistically significant effect on the dependent variable, but it does not address the magnitude of the effect. Here's the issue:

  • When the sample size is large, you may find that even small differences in treatment means are statistically significant.
  • When the sample size is small, you may find that even big differences in treatment means are not statistically significant.

With this in mind, it is customary to supplement analysis of variance with an appropriate measure of effect size. Eta squared (η2) is one such measure. Eta squared is the proportion of variance in the dependent variable that is explained by a treatment effect. The eta squared formula for one-way analysis of variance is:

η2 = SSB / SST

where SSB is the between-groups sum of squares and SST is the total sum of squares.

Given this formula, we can compute eta squared for this drug dosage experiment, as shown below:

η2 = SSB / SST = 6240 / 15240 = 0.41

Thus, 41 percent of the variance in our dependent variable (cholesterol level) can be explained by variation in our independent variable (dosage level). It appears that the relationship between dosage level and cholesterol level is significant not only in a statistical sense; it is significant in a practical sense as well.
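
The eta squared computation is a single division, shown here as a tiny Python sketch using the sums of squares from this lesson. The function name is ours.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variation explained by the treatment effect."""
    return ss_between / ss_total

eta2 = eta_squared(ss_between=6240, ss_total=15240)

print(round(eta2, 2))  # 0.41
```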

ANOVA Summary Table

It is traditional to summarize ANOVA results in an analysis of variance table. The analysis that we just conducted provides all of the information that we need to produce the following ANOVA summary table:

Analysis of Variance Table

| Source | SS | df | MS | F | P |
|---|---|---|---|---|---|
| BG | 6,240 | 2 | 3,120 | 4.16 | 0.04 |
| WG | 9,000 | 12 | 750 | | |
| Total | 15,240 | 14 | | | |

This ANOVA table allows any researcher to interpret the results of the experiment, at a glance.

The P-value (shown in the last column of the ANOVA table) is the probability that an F statistic would be more extreme (bigger) than the F ratio shown in the table, assuming the null hypothesis is true. When the P-value is bigger than the significance level, we accept the null hypothesis; when it is smaller, we reject it. Here, the P-value (0.04) is smaller than the significance level (0.05), so we reject the null hypothesis.

To assess the strength of the treatment effect, an experimenter might compute eta squared (η2). The computation is easy, using sum of squares entries from the ANOVA table, as shown below:

η2 = SSB / SST = 6,240 / 15,240 = 0.41

where SSB is the between-groups sum of squares and SST is the total sum of squares.

For this experiment, an eta squared of 0.41 means that 41% of the variance in the dependent variable can be explained by the effect of the independent variable.
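
To tie the pieces together, the following sketch rebuilds the whole ANOVA summary table, plus eta squared, from the raw scores. It is a minimal illustration rather than a general-purpose routine: the function name one_way_anova is ours, and the P-value uses a closed-form shortcut that is valid only when the between-groups degrees of freedom equal 2 (as they do here), returning None otherwise.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA summary table as a dictionary."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sst = ssb + ssw

    df_bg, df_wg = k - 1, n - k
    ms_bg, ms_wg = ssb / df_bg, ssw / df_wg
    f_ratio = ms_bg / ms_wg

    # Closed-form upper tail of the F distribution; valid only when df_bg == 2.
    p_value = (1 + 2 * f_ratio / df_wg) ** (-df_wg / 2) if df_bg == 2 else None

    return {
        "SSB": ssb, "SSW": ssw, "SST": sst,
        "df_BG": df_bg, "df_WG": df_wg, "df_TOT": n - 1,
        "MS_BG": ms_bg, "MS_WG": ms_wg,
        "F": f_ratio, "P": p_value,
        "eta_squared": ssb / sst,
    }

cholesterol = [
    [210, 240, 270, 270, 300],  # Group 1, 0 mg
    [210, 240, 240, 270, 270],  # Group 2, 50 mg
    [180, 210, 210, 210, 240],  # Group 3, 100 mg
]

table = one_way_anova(cholesterol)
for name, value in table.items():
    print(name, round(value, 2) if value is not None else value)
```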

An Easier Option

In this lesson, we showed all of the hand calculations for a one-way analysis of variance. In the real world, researchers seldom conduct analysis of variance by hand. They use statistical software. In the next lesson, we'll analyze data from this problem with Excel. Hopefully, we'll get the same result.