Introduction to Analysis of Variance

This lesson provides a brief, non-technical description of analysis of variance (ANOVA). We'll answer four questions:

  • What is ANOVA?
  • Why use ANOVA?
  • When can ANOVA be used?
  • How does ANOVA work?

In future lessons, we'll cover technical details required to implement analysis of variance with different experimental designs.

What Is ANOVA?

Analysis of variance refers to a set of techniques for comparing sample means across two or more groups. If the comparison reveals a statistically significant difference, the researcher concludes that at least one population mean differs from the others.

Why Use ANOVA?

Researchers use analysis of variance to test causal relationships in controlled experiments. In a controlled experiment, an experimenter manipulates an independent variable (a potential cause) and measures the effect on a dependent variable.

The goal of the experiment is to determine whether the independent variable has a causal effect on the dependent variable. Analysis of variance provides objective decision rules for determining whether observed differences (in mean scores) between groups are attributable to random chance or to the independent variable(s) manipulated by the experimenter.

When Can ANOVA Be Used?

Because analysis of variance compares mean scores between groups, it can be used with a compatible experimental design when the dependent variable in an experiment is measured on an interval scale or a ratio scale.

Analysis of variance would not be the right technique when the dependent variable in an experiment is measured on an ordinal scale or a categorical scale, because you cannot compute a mean score from ordinal or categorical data.
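As a small illustration of this point, the sketch below uses Python's standard statistics module on two made-up variables. The variable names and values are hypothetical. A mean is defined for the ratio-scale measurements but not for the category labels, where a count-based summary such as the mode is the most you can compute.

```python
from statistics import mean, mode

# Hypothetical data, invented for illustration only.
reaction_time_ms = [310.0, 295.5, 402.0]               # ratio scale
party_choice = ["Democrat", "Republican", "Democrat"]  # categorical scale

# A mean is meaningful for interval or ratio data.
average_time = mean(reaction_time_ms)

# For categorical data, we can count labels (here, find the most common one)...
most_common_party = mode(party_choice)

# ...but there is no mean to compute: averaging labels raises an error.
try:
    mean(party_choice)
    category_mean_defined = True
except TypeError:
    category_mean_defined = False

print(average_time, most_common_party, category_mean_defined)
```

Note that ordinal data are often stored as numeric codes, so software will happily return a mean for them. The code cannot tell you whether that mean is meaningful; that judgment depends on how the variable was measured.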

How Does ANOVA Work?

Analysis of variance refers to a set of techniques for interpreting experimental data. These techniques have similarities and differences.

How ANOVA Techniques Are the Same

Certain aspects of analysis of variance are the same across designs. With every experimental design, analysis of variance includes the following steps:

  • Specify a mathematical model to describe the causal factors that affect the dependent variable.
  • Write statistical hypotheses to be tested by experimental data.
  • Specify a significance level for a hypothesis test.
  • Compute sums of squares for each effect in the model.
  • Find degrees of freedom for each effect in the model.
  • Based on sums of squares and degrees of freedom, compute mean squares for each effect in the model.
  • Find the expected value of the mean squares for each effect in the model.
  • Compute test statistics, based on observed mean squares and their expected values.
  • Find a P value for each observed test statistic.
  • Reject or fail to reject the null hypothesis, based on the P value and the significance level.
  • Assess the magnitude of the effect of the independent variable(s), based on sums of squares.
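To make the computational steps concrete, here is a minimal Python sketch for the simplest case: one independent variable, independent groups, and fixed effects. It is an illustration under those assumptions, not a general implementation. The function name one_way_anova and the scores are made up for this example. In this design the test statistic is the ratio of the mean square between groups to the mean square within groups.

```python
from statistics import mean

def one_way_anova(groups):
    """Sketch of a one-factor, independent groups, fixed-effects ANOVA.

    `groups` is a list of lists; each inner list holds the scores for one
    treatment group. Returns the main intermediate quantities in a dict.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    group_means = [mean(g) for g in groups]

    # Sums of squares: between groups (treatment) and within groups (error).
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    # Degrees of freedom for each source of variation.
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    # Mean squares are sums of squares divided by degrees of freedom.
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    # Test statistic for this design: F = MS between / MS within.
    f_ratio = ms_between / ms_within

    # One common effect-size measure based on sums of squares (eta squared).
    eta_squared = ss_between / (ss_between + ss_within)

    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f_ratio": f_ratio, "eta_squared": eta_squared,
    }

# Hypothetical scores for three treatment groups (invented for illustration).
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = one_way_anova(scores)
print(result)
```

The sketch stops at the test statistic because the Python standard library has no F distribution. If SciPy is available, the P value could be obtained with scipy.stats.f.sf(f_ratio, df_between, df_within) and then compared with the significance level.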

The steps required to implement analysis of variance are illustrated in the flowchart below.

Here's the good news: After you learn to implement analysis of variance for one design, it will be easier to learn to implement it for other designs, because so many steps in the analysis are common to all designs.

How ANOVA Techniques Are Different

Analysis of variance and experimental design are intertwined. The procedures used to implement analysis of variance depend on attributes of the experimental design.

For example, formulas used to compute test statistics can be different for fixed-effects designs than for random-effects designs. Some assumptions about data are different for independent groups designs than for repeated measures designs.
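As one illustration of how a formula can change, consider a hypothetical two-factor design in which Factor A is crossed with Factor B. The notation here is an assumption introduced for this example: MS_A is the mean square for Factor A, MS_AB is the mean square for the A × B interaction, and MS_WG is the within-groups (error) mean square. The test statistic for the main effect of Factor A uses a different denominator depending on whether both factors are fixed or both are random:

```latex
% Both factors fixed: test Factor A against the within-groups mean square.
F_A = \frac{MS_A}{MS_{WG}}

% Both factors random: test Factor A against the interaction mean square.
F_A = \frac{MS_A}{MS_{AB}}
```

The difference follows from the expected mean squares. In the random-effects case, the expected value of MS_A includes an interaction variance component, so MS_AB (rather than MS_WG) is the denominator that isolates the effect of Factor A.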

Bottom line: The logic of the analysis is basically the same for every experimental design. But details (formulas, assumptions, etc.) can be different. In the coming lessons, we'll explain how to apply the right procedures to different designs.

Test Your Understanding

Problem 1

You're running an experiment to test the hypothesis that political ads influence voter choice. You randomly assign a sample of voters to one of three groups:

  • Group 1 sees ads produced by Democrats.
  • Group 2 sees ads produced by Republicans.
  • Group 3 sees ads produced by Libertarians.

At the end of the experiment, you ask each voter to choose a party: Democrat, Republican, or Libertarian.

Why is analysis of variance the wrong technique for this experiment?

(A) This is not a true experiment.
(B) Voters were randomly assigned to groups.
(C) The dependent variable (party choice) is a categorical variable.
(D) The independent variable (type of ad) is a categorical variable.
(E) Each voter sees ads produced by only one political party.

Solution

The correct answer is (C). When you use analysis of variance to test a hypothesis in a true experiment, the dependent variable must be measured on an interval or ratio scale. It cannot be a categorical variable or an ordinal variable.