### Analysis of Variance

#### Introduction

#### Completely randomized design

#### Full factorial design

#### Randomized block design

#### Calculators

#### Appendices

### Analysis of Variance:

Table of Contents

#### Introduction

#### Completely randomized design

#### Full factorial design

#### Randomized block design

#### Calculators

#### Appendices

# Two-Factor Analysis of Variance With Excel

In this lesson, we demonstrate how to use Excel to conduct analysis of variance on results from a balanced, two-factor, full-factorial experiment. We'll explain how to conduct the analysis and how to interpret results for fixed-effects models, random-effects models, and mixed models.

**Note:** If you're curious about what goes on "behind the scenes" with Excel, read the previous lesson:
Two-Factor Analysis of Variance: Examples. That lesson shows the hand computations
required to solve the same problem that we will solve in this lesson with Excel.

## The Analysis ToolPak

To access the analysis of variance functions in Excel, you need a free Microsoft add-in called the Analysis ToolPak, which may or may not be already installed on your copy of Excel.

To determine whether you have the Analysis ToolPak, click the **Data** tab in the main Excel menu.
If you see **Data Analysis** in the **Analysis** section, you're good. You have the ToolPak.

If you don't have the ToolPak, you need to get it. Go to: How to Install the Data Analysis ToolPak in Excel.

## Problem Statement

A researcher conducts a full factorial experiment to test the effect of Factor A and Factor B on a continuous variable. The design calls for two levels of Factor A and three levels of Factor B - six treatment groups in all. The researcher selects 30 subjects randomly from a larger population and randomly assigns five subjects to each treatment group.

The researcher collects one dependent variable score from each subject, as shown in the table below:

Table 1. Dependent Variable Scores

A_{1} |
A_{2} |
||||
---|---|---|---|---|---|

B_{1} |
B_{2} |
B_{3} |
B_{1} |
B_{2} |
B_{3} |

1 2 3 2 1 |
1 2 4 3 1 |
2 3 4 3 2 |
2 3 4 3 2 |
2 3 5 2 2 |
3 4 5 4 3 |

In conducting this experiment, the researcher has two research questions:

- Do the independent variables have a significant effect on the dependent variable?
- How strong is the effect of independent variables on the dependent variable?

We will use the analysis of variance module in Excel to answer both of these questions.

## Fixed-Effects Model

By default, Excel assumes a fixed-effects model; that is, Excel assumes that Factor A and Factor B are both fixed factors. In this section, we'll show how to produce the default output appropriate for a fixed-effects model. In later sections, we'll show how to massage the default output for a random-effects model and for a mixed model.

### How to Conduct the Analysis

When you conduct a two-factor analysis of variance with Excel, the main output is an ANOVA summary table. As we've seen in previous lessons, an ANOVA summary table holds all the information we need to test hypotheses and measure the magnitude of effects.

Here is a step-by-step guide for producing an ANOVA summary table with Excel:

**Step 1.**Enter data in rows and columns. Include labels for levels of Factor A in the first column (shaded in gray) and labels for levels of Factor B in the first row (also shaded in gray), as shown below:**Step 2.**From Excel's main navigation menu, click**Data / Data Analysis**to display the Data Analysis dialog box.**Step 3.**In the Data Analysis dialog box, select "Anova: Two-Factor With Replication" and click the**OK**button to display the Anova: Two-Factor With Replication dialog box.**Step 4.**In the Anova: Two-Factor With Replication dialog box, enter the input range. In the "Rows per sample" textbox, enter 5; because there are 5 observations per treatment group. And finally, enter a value for Alpha, the significance level. For this exercise, we'll use a significance level of 0.05, as shown below:**Step 5.**From the Anova: Two-Factor With Replication dialog box, click the**OK**button to display the ANOVA summary table.**Note:**In the first row of the body of the ANOVA table, "Sample" refers to Factor A; and in the second row, "Columns" refers to Factor B.

### How to Interpret Results

Recall that the researcher undertook this study to answer two questions:

- Do the independent variables have a significant effect on the dependent variable?
- How strong is the effect of independent variables on the dependent variable?

Answers to both questions can be found in the ANOVA summary table.

#### Statistical Significance

In a two-factor, full factorial experiment, analysis of variance tests a null hypothesis for each of three treatment effects:

**Factor A.**The null hypothesis for Factor A states that Factor A had no effect of the dependent variable.**Factor B.**The null hypothesis for Factor B states that Factor B had no effect of the dependent variable.**AB interaction.**The null hypothesis for the AB interaction states that Factor A and Factor B did not interact to affect the dependent variable.

The P-value (shown in the last column of the ANOVA table) is the probability that an F statistic would be more extreme (bigger) than the
*F* ratio shown in the table, assuming the null hypothesis is true. When the P-value is bigger
than the significance level, we accept the null hypothesis for a treatment effect; when it is smaller, we reject it.
Based on that decision rule, here are

- For Factor A, the P-value (0.03) is smaller than the significance level (0.05), so we reject the null hypothesis and conclude that Factor A had a statistically significant effect on the dependent variable.
- For Factor B, the P-value (0.09) is bigger than the significance level (0.05), so we cannot reject the null hypothesis that Factor B had a statistically significant effect on the dependent variable.
- For the AB interaction, the P-value (0.88) is bigger than the significance level (0.05), so we cannot reject the null hypothesis that the AB interaction had a statistically significant effect on the dependent variable.

#### Magnitude of Effect

The hypothesis test tells us whether a main effect or an interaction effect has a statistically significant effect on the dependent variable, but it does not address the magnitude (i.e., strength) of the effect. Here's the issue:

- When the sample size is large, you may find that even small effects are statistically significant.
- When the sample size is small, you may find that even big effects are not statistically significant.

With this in mind, it is customary to supplement analysis of variance with an appropriate measure
of the magnitude of each treatment effect.
Eta squared (η^{2}) is one such measure. Eta squared is the proportion of variance in the
dependent variable that is explained by a treatment effect. The eta squared formula
for one-way analysis of variance is:

η^{2} = SS_{EFFECT} / SST

where SS_{EFFECT} is the sum of squares for a treatment effect and SST is the total sum of squares.

Given this formula, we can compute eta squared for each treatment effect in this experiment, as shown below:

η^{2}_{A} = SSA / SST = 5.63 / 36.3 = 0.155

η^{2}_{B} = SSB / SST = 5.60 / 36.3 = 0.154

η^{2}_{AB} = SSAB / SST = 0.27 / 36.3 = 0.007

Thus, 15.5% of the variance in the dependent variable can be explained by Factor A; 15.4%, by Factor B; and 0.7% by the AB interaction.

## Fixed vs. Random Factors

By default, Excel assumes a fixed-effects model; that is, Excel assumes that Factor A and Factor B are both fixed factors. Given this assumption, when you use Excel to conduct analysis of variance on data from a two-factor, full factorial experiment, you produce output that is only appropriate for a fixed-effects model.

Luckily, though, the ANOVA table produced by Excel for fixed factors provides all the information needed to handle random factors. It just takes a few extra steps. To illustrate the process, let's repeat the analysis of variance for our experiment with a mixed model and with a random-effects model.

### Mixed Model

If you compare the ANOVA table for a two-factor analysis of variance with a fixed-effects model (two fixed factors) to the ANOVA table for a two-factor analysis of variance with a mixed model (one fixed factor and one random factor), there are only two differences.

- The
*F*ratio for the fixed factor will be different. - The P-value for both the fixed factor will also be different.

So, if you had an ANOVA table for a fixed-effects model, you could produce a comparable ANOVA table
for a mixed model by (1) finding the correct *F* ratio for the fixed factor and (2)
finding the correct P-value for the new *F* ratio. Here is the step-by-step guide for
making that happen, assuming Factor A is the fixed factor:

**Step 1.**Using Excel, generate the ANOVA table for a fixed-effects model. We completed this step in the first part of this lesson. Here is the table we generated:*F*ratio and P-value cells for main effects are shaded in blue. For a mixed model, we need to replace the values in shaded cells with new values.

**Note:**In the first row of the body of the ANOVA table, "Sample" refers to Factor A; and in the second row, "Columns" refers to Factor B.**Step 2.**Find the new*F*ratio, assuming a mixed model. The table below shows how to construct*F*ratios for a fixed effect when an experiment uses a mixed model.Fixed effect F ratio Degrees of freedom v _{1}v _{2}A MS_{A}MS_{AB}p-1 = 1 (p-1)(q-1) = 2 In the table, p is the number of levels of Factor A; q is the number of levels of Factor B; v

_{1}is the degrees of freedom for the numerator of the*F*ratio; and v_{2}is the degrees of freedom for the denominator of the ratio.Applying the

*F*ratio formula from the table, we can compute a new*F*ratio for Factor A (the fixed main effect), as shown below:F

_{A}= F(v_{1}, v_{2}) = F(1, 2) = MS_{A}/MS_{AB}= 5.63 / 0.133 = 42.3**Step 3.**Find a new P-value, given the new*F*for Factor A. The P-value is the probability that an F statistic would be more extreme (bigger) than the*F*ratio, assuming the null hypothesis is true.Since we know the value of the new

*F*ratio and we know the degrees of freedom associated with the new*F*ratio, we can use Stat Trek's F Distribution Calculator to find the probability that an F statistic will be bigger than an actual*F*ratio observed in the experiment. We enter three inputs into the F Distribution Calculator: the degrees of freedom v_{1}(1), the degrees of freedom v_{2}(2), and the observed*F*ratio (42.3).__<__42.3 ) equals 0.98; so the P ( F > 42.3 ) equals 1 minus 0.98 or 0.02. Therefore, the P-value for Factor A is 0.02.**Step 4.**Replace the shaded entries in the table from Step 1 with values for the*F*ratio and P-value that are appropriate for a mixed model. Here is the ANOVA table when Factor A is a fixed factor, and Factor B is a random factor.Analysis of Variance Table: Mixed Model

Source SS df MS F P A 5.63 1 5.63 42.3 0.02 B 5.6 2 2.8 21.1 0.05 AB 0.27 2 0.135 0.13 0.88 WG 24.8 24 1.033 Total 36.3 29

### Random-Effects Model

If you compare the ANOVA table for a two-factor analysis of variance with two *fixed* factors to the
ANOVA table for a two-factor analysis of variance with two *random* factors, there are only a few differences.

- The
*F*ratio for both main effects will be different. - The P-value for both main effects will also be different.

So, if you had an ANOVA table for a fixed-effects model, you could produce a comparable ANOVA table
for a random-effects model by (1) finding the correct *F* ratios for main effects and (2)
finding the correct P-values for the new *F* ratios. Here is the step-by-step guide for
making that happen:

**Step 1.**Using Excel, generate the ANOVA table for a fixed-effects model. We completed this step in the first part of this lesson. Here is the table we generated:*F*ratio and P-value cells for main effects are shaded in blue. For a random-effects model, we need to replace the values in shaded cells with new values.

**Note:**In the first row of the body of the ANOVA table, "Sample" refers to Factor A; and in the second row, "Columns" refers to Factor B.**Step 2.**Find new*F*ratios, assuming a random-effects model. The table below shows how to construct*F*ratios for each main effect when an experiment uses a random-effects model.Effect F ratio Degrees of freedom v _{1}v _{2}A MS_{A}MS_{AB}p-1 (p-1)(q-1) B MS_{B}MS_{AB}q-1 (p-1)(q-1) In the table, p is the number of levels of Factor A; and q is the number of levels of Factor B. For each effect, v

_{1}is the degrees of freedom for the numerator of the*F*ratio; and v_{2}is the degrees of freedom for the denominator of the ratio.Applying formulas from the table, we can compute new

*F*ratios for the main effects, as shown below:F

_{A}= F(v_{1}, v_{2}) = F(1, 2) = MS_{A}/MS_{AB}= 5.63 / 0.133 = 42.3F

_{B}= F(v_{1}, v_{2}) = F(2, 2) = MS_{B}/MS_{AB}= 2.8 / 0.133 = 21.1**Step 3.**Find new P-values for each new*F*ratio. The P-value is the probability that an F statistic would be more extreme (bigger) than the*F*ratio, assuming the null hypothesis is true. Since we know the value of each*F*ratio and we know the degrees of freedom associated with each*F*ratio, we can use Stat Trek's F Distribution Calculator to find the probability that an F statistic will be bigger than an actual*F*ratio observed in the experiment.To illustrate how this can be done, we'll find the P-value for the

*F*ratio associated Factor B. We enter three inputs into the F Distribution Calculator: the degrees of freedom v_{1}(2), the degrees of freedom v_{2}(2), and the observed*F*ratio (21.1).__<__21.1 ) equals 0.95; so the P ( F > 21.1 ) equals 1 minus 0.95 or 0.05. Therefore, the P-value for Factor A is 0.05. Following the same procedure, we can find that the P-value for Factor A is 0.02.**Step 4.**Based on Steps 1 through 3 above, we can generate a new ANOVA table with entries that reflect a random-effects model. Here is that table.Analysis of Variance Table: Random Effects

Source SS df MS F P A 5.63 1 5.63 42.3 0.02 B 5.6 2 2.8 21.1 0.05 AB 0.27 2 0.135 0.13 0.88 WG 24.8 24 1.033 Total 36.3 29