Introduction to Analysis of Variance
This lesson provides a brief, non-technical description of analysis of variance (ANOVA). We'll answer four questions:
- What is analysis of variance?
- Why use analysis of variance?
- When can analysis of variance be used?
- How does analysis of variance work?
In future lessons, we'll cover technical details required to implement analysis of variance with different experimental designs.
What Is ANOVA?
Analysis of variance refers to a set of techniques for comparing mean scores among two or more groups. If the comparison reveals a statistically significant difference, the researcher concludes that at least one population mean differs from the others.
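For readers who want to see this in practice, here is a minimal sketch (not part of the original lesson) using Python with the SciPy library; the group scores are invented for illustration.

```python
# Minimal sketch: one-way ANOVA comparing mean scores in three groups.
# The data below are made-up numbers for illustration only.
from scipy import stats

group1 = [82, 85, 88, 75, 90]   # e.g., scores under treatment 1
group2 = [70, 74, 68, 72, 71]   # e.g., scores under treatment 2
group3 = [78, 80, 83, 79, 81]   # e.g., scores under treatment 3

f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# A small p value (e.g., below 0.05) suggests that at least one
# population mean differs from the others.
```

Here, `f_oneway` performs a one-way ANOVA, the simplest of these techniques; later lessons cover designs that require more machinery.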
Why Use ANOVA?
Researchers use analysis of variance to test causal relationships between variables or to assess observed differences between groups.
In a true experiment, an experimenter manipulates an independent variable (a potential cause) and measures the effect on a dependent variable. The goal of the experiment is to determine whether the independent variable has a causal effect on the dependent variable. Analysis of variance provides objective decision rules for determining whether observed differences between group means are attributable to random chance or to the independent variable(s) manipulated by the experimenter.
Analysis of variance may also be useful with quasi-experiments. Some quasi-experiments have all the characteristics of a true experiment, except that the independent variable cannot be manipulated by the experimenter (e.g., age, gender, religion). Often, the goal of a quasi-experiment is to determine whether mean scores differ between treatment groups. Analysis of variance provides objective decision rules for making that determination.
When Can ANOVA Be Used?
Analysis of variance can be used when the dependent variable in an experiment or a quasi-experiment is measured on an interval scale or a ratio scale. Analysis of variance would not be the right technique when the dependent variable is measured on an ordinal scale or a categorical scale.
How Does ANOVA Work?
Analysis of variance refers to a set of techniques for interpreting differences between groups (in true experiments or in quasi-experiments). The techniques share a common logic, but they differ in their details.
How ANOVA Techniques Are the Same
Certain aspects of analysis of variance are the same across designs. With every experimental design, analysis of variance includes the following steps (a worked code sketch follows the list):
- Specify a mathematical model to describe how the independent variable(s) and the dependent variable are related.
- Write statistical hypotheses to be tested by experimental data.
- Specify a significance level for a hypothesis test.
- Compute sums of squares for each effect in the model.
- Find degrees of freedom for each effect in the model.
- Based on sums of squares and degrees of freedom, compute mean squares for each effect in the model.
- Find the expected value of the mean squares for each effect in the model.
- Compute test statistics, based on observed mean squares and their expected values.
- Find a P value for each observed test statistic.
- Reject or fail to reject the null hypothesis, based on the P value and the significance level.
- Assess the magnitude of the effect of the independent variable(s), based on sums of squares.
The steps required to implement analysis of variance are illustrated in the flowchart below.
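To make the steps concrete, here is a hedged sketch that carries out the computational steps (sums of squares through effect size) for a one-way, fixed-effects, independent groups design. The data and the significance level are invented for illustration; SciPy is used only to look up the P value from the F distribution.

```python
# Sketch of the computational steps for a one-way, fixed-effects ANOVA.
# The data below are invented for illustration.
from scipy.stats import f as f_dist

groups = {
    "A": [82, 85, 88, 75, 90],
    "B": [70, 74, 68, 72, 71],
    "C": [78, 80, 83, 79, 81],
}

scores = [x for g in groups.values() for x in g]
n = len(scores)     # total number of observations
k = len(groups)     # number of treatment groups
grand_mean = sum(scores) / n

# Step: compute sums of squares for each effect in the model.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                 for g in groups.values())
ss_within = sum((x - sum(g) / len(g)) ** 2
                for g in groups.values() for x in g)
ss_total = ss_between + ss_within

# Step: find degrees of freedom for each effect.
df_between = k - 1
df_within = n - k

# Step: compute mean squares from sums of squares and degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# Step: compute the test statistic and find its P value.
f_stat = ms_between / ms_within
p_value = f_dist.sf(f_stat, df_between, df_within)

# Step: assess the magnitude of the effect (eta squared).
eta_squared = ss_between / ss_total

alpha = 0.05  # significance level specified before the test
print(f"F({df_between}, {df_within}) = {f_stat:.2f}, p = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0",
      f"at alpha = {alpha}; eta^2 = {eta_squared:.2f}")
```

In practice, a single call to `scipy.stats.f_oneway` (or a library such as statsmodels) performs these computations for you; the long form above is only meant to mirror the list of steps.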
Here's the good news: After you learn to implement analysis of variance for one design, it will be easier to learn to implement analysis of variance for other designs, because so many steps in the analysis are common to all designs.
How ANOVA Techniques Are Different
Analysis of variance and experimental design are intertwined. The procedures used to implement analysis of variance depend on experimental design attributes, such as the following:
- The number of independent variables under investigation.
- The way that treatment levels are selected (e.g., a fixed-effects model versus a random-effects model).
- The way that subjects are assigned to treatment groups (e.g., an independent groups design versus a repeated measures design).
- The number of subjects assigned to treatment groups (equal versus unequal sample sizes between groups).
- The mathematical model that describes how independent variables and extraneous variables affect the dependent variable.
For example, the formulas used to compute test statistics differ between fixed-effects designs and random-effects designs, and some assumptions about the data differ between independent groups designs and repeated measures designs.
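As a hedged illustration of the first point (the mean-square values below are invented), in a balanced two-factor design the F ratio for a main effect is typically formed against the error mean square under a fixed-effects model, but against the interaction mean square under a random-effects model:

```python
# Invented mean squares for a balanced two-factor design.
ms_a = 48.0     # mean square for factor A
ms_ab = 12.0    # mean square for the A x B interaction
ms_error = 4.0  # mean square error (within cells)

f_a_fixed = ms_a / ms_error   # fixed-effects model: MSA / MSE
f_a_random = ms_a / ms_ab     # random-effects model: MSA / MSAB
print(f_a_fixed, f_a_random)  # 12.0 4.0 -- same data, different test statistics
```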
Bottom line: The logic of the analysis is basically the same for every experimental design. But details (formulas, assumptions, etc.) can be different. In upcoming lessons, we'll explain how to apply the right procedures to different designs.
Test Your Understanding
Problem 1
You're running an experiment to test the hypothesis that political ads influence voter choice. You randomly assign a sample of voters to one of three groups:
- Group 1 sees ads produced by Democrats.
- Group 2 sees ads produced by Republicans.
- Group 3 sees ads produced by Libertarians.
At the end of the experiment, you ask each voter to choose a party: Democrat, Republican, or Libertarian.
Why is analysis of variance the wrong technique for this experiment?
(A) This is not a true experiment.
(B) Voters were randomly assigned to groups.
(C) The dependent variable (party choice) is a categorical variable.
(D) The independent variable (type of ad) is a categorical variable.
(E) Each voter sees ads produced by only one political party.
Solution
The correct answer is (C). When you use analysis of variance to test a hypothesis in an experiment, the dependent variable must be measured on an interval or a ratio scale. The dependent variable cannot be a categorical variable or an ordinal variable.