Randomized Block Experiments: Data Analysis
This lesson explains how to use analysis of variance (ANOVA) with a balanced, independent groups, randomized block experiment. The discussion covers analysis with fixed factors and with random factors.
For step-by-step examples that demonstrate the analysis, see the following lessons:
Prerequisite: The lesson assumes general familiarity with randomized block designs. If you are unfamiliar with terms like blocks, blocking, and blocking variables, review the previous lesson: Randomized Block Designs.
Note: The discussion in this lesson is confined to randomized block designs with independent groups. Randomized block designs with repeated measures involve some special issues, so we will discuss the repeated measures design in a future lesson.
What is a Randomized Block Experiment?
A randomized block experiment with independent groups is distinguished by the following attributes:
 The design has one or more factors (i.e., one or more independent variables), each with two or more levels.
 Treatment groups are defined by a unique combination of nonoverlapping factor levels.
 Experimental units are randomly selected from a known population.
 Each experimental unit is assigned to one block, such that variability within blocks is less than variability between blocks.
 The number of experimental units within each block is equal to the number of treatment groups.
 Within each block, each experimental unit is randomly assigned to a different treatment group.
 Each experimental unit provides one dependent variable score.
The table below shows the layout for a typical randomized block experiment.
       T_{1}    T_{2}    T_{3}    T_{4}
B_{1}  X_{1,1}  X_{1,2}  X_{1,3}  X_{1,4}
B_{2}  X_{2,1}  X_{2,2}  X_{2,3}  X_{2,4}
B_{3}  X_{3,1}  X_{3,2}  X_{3,3}  X_{3,4}
B_{4}  X_{4,1}  X_{4,2}  X_{4,3}  X_{4,4}
B_{5}  X_{5,1}  X_{5,2}  X_{5,3}  X_{5,4}
In this experiment, there are five blocks ( B_{i} ) and four treatment levels ( T_{j} ). Dependent variable scores are represented by X_{ i, j} , where X_{ i, j} is the score for the subject in block i who received treatment j.
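The within-block random assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the unit labels (u1, u2, ...), the function name assign_treatments, and the seed are hypothetical, and the blocks are assumed to have been formed already.

```python
import random

def assign_treatments(blocks, treatments, seed=None):
    """Within each block, randomly give each unit a different treatment."""
    rng = random.Random(seed)
    plan = {}
    for block_label, units in blocks.items():
        # A randomized block design needs one unit per treatment in each block.
        if len(units) != len(treatments):
            raise ValueError("each block needs exactly one unit per treatment")
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # random order of treatments for this block
        plan[block_label] = dict(zip(units, shuffled))
    return plan

# Hypothetical layout: five blocks, four units per block, four treatments.
treatments = ["T1", "T2", "T3", "T4"]
blocks = {
    f"B{i}": [f"u{4 * (i - 1) + j}" for j in range(1, 5)]
    for i in range(1, 6)
}

plan = assign_treatments(blocks, treatments, seed=42)
for block_label, mapping in plan.items():
    print(block_label, mapping)
```

Each block ends up with every treatment exactly once, which is what makes the design balanced.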
Data Requirements
The data requirements for analysis of variance with a randomized block design are very similar to the requirements for other designs that we've covered previously in this tutorial (e.g., see One-Way Analysis of Variance and ANOVA With Full Factorial Experiments). Like the other designs, a randomized block design requires that the dependent variable be measured on an interval scale or a ratio scale. In addition, it makes three assumptions about dependent variable scores:
 Independence. The dependent variable score for each experimental unit is independent of the score for any other unit.
 Normality. In the population, dependent variable scores are normally distributed within treatment groups.
 Equality of variance. In the population, the variance of dependent variable scores in each treatment group is equal. (Equality of variance is also known as homogeneity of variance or homoscedasticity.)
The assumption of independence is the most important assumption. When that assumption is violated, the resulting statistical tests can be misleading. This assumption is tenable when (a) experimental units are randomly selected from the population and (b) experimental units within blocks are randomly assigned to treatments.
Analytical Logic
To implement analysis of variance with a balanced, independent groups, randomized block experiment, a researcher takes the following steps:
 Specify a mathematical model to describe how main effects and the blocking variable influence the dependent variable.
 Write statistical hypotheses to be tested by experimental data.
 Specify a significance level for a hypothesis test.
 Compute the grand mean and marginal means for the independent variable and for the blocking variable.
 Compute sums of squares for each effect in the model.
 Find the degrees of freedom associated with each effect in the model.
 Based on sums of squares and degrees of freedom, compute mean squares for each effect in the model.
 Find the expected value of the mean squares for each effect in the model.
 Compute a test statistic for the independent variable and a test statistic for the blocking variable, based on observed mean squares and their expected values.
 Find the P value for each test statistic.
 Reject or fail to reject null hypotheses, based on P value and significance level.
 Assess the magnitude of effect, based on sums of squares.
Below, we'll explain how to implement each step in the analysis. Along the way, we'll point out different treatments for fixed-effects models, random-effects models, and mixed models.
Mathematical Model
For every experimental design, there is a mathematical model that accounts for all of the independent and extraneous variables that affect the dependent variable. Here is the mathematical model for an independent groups, randomized block experiment:
X_{ i j} = μ + β_{ i} + τ_{ j} + ε_{ ij}
where X_{ i j} is the dependent variable score for the subject in block i that receives treatment j, μ is the population mean, β_{ i} is the effect of Block i; τ_{ j} is the effect of Treatment j; and ε_{ ij} is the experimental error (i.e., the effect of all other extraneous variables).
For this model, it is assumed that ε_{ ij} is normally and independently distributed with a mean of zero and a variance of σ_{ε}^{2}. The mean ( μ ) is constant.
Note: Unlike the model for a full factorial experiment, the model for a randomized block experiment does not include an interaction term. That is, the model assumes there is no interaction between block and treatment effects.
Statistical Hypotheses
With a randomized block experiment, it is possible to test both block ( β_{ i} ) and treatment ( τ_{ j} ) effects. Here are the null hypotheses (H_{0}) and alternative hypotheses (H_{1}) for each effect.
H_{0}: β_{ i} = 0 for all i
H_{1}: β_{ i} ≠ 0 for some i
H_{0}: τ_{ j} = 0 for all j
H_{1}: τ_{ j} ≠ 0 for some j
With a randomized block experiment, the main hypothesis test of interest is the test of the treatment effect(s).
Block effects are of less intrinsic interest, because a blocking variable is thought to be a nuisance variable that is only included in the experiment to control for a potential source of undesired variation. Nevertheless, savvy experimenters pay attention to tests of block effects. A nonsignificant result indicates that the blocking variable may not affect the dependent variable. It is a signal that another design (e.g., a single-factor experiment or a full factorial experiment) might provide a more powerful test of the treatment effect.
Significance Level
The significance level (also known as alpha or α) is the probability of rejecting the null hypothesis when it is actually true. The significance level for an experiment is specified by the experimenter, before data collection begins. Experimenters often choose significance levels of 0.05 or 0.01.
A significance level of 0.05 means that there is a 5% chance of rejecting the null hypothesis when it is true. A significance level of 0.01 means that there is a 1% chance of rejecting the null hypothesis when it is true. The lower the significance level, the more persuasive the evidence needs to be before an experimenter can reject the null hypothesis.
Mean Scores
Analysis of variance for a randomized block experiment begins by computing a grand mean and marginal means for independent variables and for blocks. Here are formulas for computing the various means for a randomized block experiment with one independent variable and one blocking variable:
 Grand mean. The grand mean ( \bar{X} ) is the mean of all observations, computed as follows:
N = nk
\bar{X} = ( 1 / N ) \sum_{i=1}^{n} \sum_{j=1}^{k} X_{ i j}
 Marginal means for treatment levels. The mean for treatment level j is computed as follows:
\bar{X}_{ j} = ( 1 / n ) \sum_{i=1}^{n} X_{ i j}
 Marginal means for blocks. The mean for block i is computed as follows:
\bar{X}_{ i} = ( 1 / k ) \sum_{j=1}^{k} X_{ i j}
In the equations above, N is the total sample size; n is the number of blocks, and k is the number of treatment levels.
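The mean formulas above can be checked on a small data set. The sketch below uses made-up scores (four blocks by three treatment levels) chosen only because the arithmetic is easy to follow; the variable names are illustrative.

```python
# Hypothetical scores: rows are blocks (n = 4), columns are treatments (k = 3).
scores = [
    [4, 6, 8],
    [5, 9, 10],
    [9, 9, 12],
    [6, 8, 10],
]

n = len(scores)      # number of blocks
k = len(scores[0])   # number of treatment levels
N = n * k            # total sample size

# Grand mean: the mean of all N observations.
grand_mean = sum(sum(row) for row in scores) / N

# Marginal mean for treatment j: average down column j (over the n blocks).
treatment_means = [sum(row[j] for row in scores) / n for j in range(k)]

# Marginal mean for block i: average across row i (over the k treatments).
block_means = [sum(row) / k for row in scores]

print("Grand mean:", grand_mean)            # 8.0
print("Treatment means:", treatment_means)  # [6.0, 8.0, 10.0]
print("Block means:", block_means)          # [6.0, 8.0, 10.0, 8.0]
```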
Sums of Squares
A sum of squares is the sum of squared deviations from a mean score. A randomized block design makes use of four sums of squares:
 Sum of squares for treatments. The sum of squares for treatments (SSTR) measures variation of the marginal means of treatment levels ( \bar{X}_{ j} ) around the grand mean ( \bar{X} ). It can be computed from the following formula:
SSTR = n \sum_{j=1}^{k} ( \bar{X}_{ j} - \bar{X} )^{2}
 Sum of squares for blocks. The sum of squares for blocks (SSB) measures variation of the marginal means of blocks ( \bar{X}_{ i} ) around the grand mean ( \bar{X} ). It can be computed from the following formula:
SSB = k \sum_{i=1}^{n} ( \bar{X}_{ i} - \bar{X} )^{2}
 Error sum of squares. The error sum of squares (SSE) measures variation of all scores ( X_{ i j} ) attributable to extraneous variables. It can be computed from the following formula:
SSE = \sum_{i=1}^{n} \sum_{j=1}^{k} ( X_{ i j} - \bar{X}_{ i} - \bar{X}_{ j} + \bar{X} )^{2}
 Total sum of squares. The total sum of squares (SST) measures variation of all scores ( X_{ i j} ) around the grand mean ( \bar{X} ). It can be computed from the following formula:
SST = \sum_{i=1}^{n} \sum_{j=1}^{k} ( X_{ i j} - \bar{X} )^{2}
In the formulas above, n is the number of blocks, and k is the number of treatment levels. And the total sum of squares is equal to the sum of the component sums of squares, as shown below:
SST = SSTR + SSB + SSE
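To see the four sums of squares and their additive relationship in action, here is a minimal sketch. The scores are hypothetical (four blocks, three treatment levels), and the function name sums_of_squares is an illustrative choice rather than part of any library.

```python
def sums_of_squares(scores):
    """Return SSTR, SSB, SSE and SST for a blocks-by-treatments table."""
    n = len(scores)      # number of blocks
    k = len(scores[0])   # number of treatment levels
    grand = sum(sum(row) for row in scores) / (n * k)
    block_means = [sum(row) / k for row in scores]
    treat_means = [sum(row[j] for row in scores) / n for j in range(k)]

    sstr = n * sum((m - grand) ** 2 for m in treat_means)
    ssb = k * sum((m - grand) ** 2 for m in block_means)
    sse = sum(
        (scores[i][j] - block_means[i] - treat_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    sst = sum((x - grand) ** 2 for row in scores for x in row)
    return {"SSTR": sstr, "SSB": ssb, "SSE": sse, "SST": sst}

# Hypothetical scores: rows are blocks, columns are treatments.
scores = [
    [4, 6, 8],
    [5, 9, 10],
    [9, 9, 12],
    [6, 8, 10],
]

ss = sums_of_squares(scores)
print(ss)  # SSTR = 32.0, SSB = 24.0, SSE = 4.0, SST = 60.0

# The components add up to the total (within floating-point tolerance).
assert abs(ss["SSTR"] + ss["SSB"] + ss["SSE"] - ss["SST"]) < 1e-9
```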
Degrees of Freedom
The term degrees of freedom (df) refers to the number of independent sample points used to compute a statistic minus the number of parameters estimated from the sample points.
The degrees of freedom used to compute the various sums of squares for an independent groups, randomized block experiment are shown in the table below:
Sum of squares  Degrees of freedom
Treatment       k - 1
Block           n - 1
Error           ( k - 1 )( n - 1 )
Total           nk - 1
Notice that there is an additive relationship between the various degrees of freedom. The degrees of freedom for total sum of squares (df_{TOT}) is equal to the degrees of freedom for the treatment sum of squares (df_{TR}) plus the degrees of freedom for the blocks sum of squares (df_{B}) plus the degrees of freedom for the error sum of squares (df_{E}). That is,
df_{TOT} = df_{TR} + df_{B} + df_{E}
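As a worked illustration, take the layout shown earlier in this lesson, with n = 5 blocks and k = 4 treatment levels. With those values, the degrees of freedom work out as follows:

```latex
\begin{aligned}
df_{TR}  &= k - 1 = 4 - 1 = 3 \\
df_{B}   &= n - 1 = 5 - 1 = 4 \\
df_{E}   &= (k - 1)(n - 1) = 3 \times 4 = 12 \\
df_{TOT} &= nk - 1 = 20 - 1 = 19 = 3 + 4 + 12
\end{aligned}
```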
Mean Squares
A mean square is an estimate of population variance. It is computed by dividing a sum of squares (SS) by its corresponding degrees of freedom (df), as shown below:
MS = SS / df
To conduct analysis of variance with a randomized block experiment, we are interested in three mean squares:
 Treatment mean square. The treatment mean square ( MS_{T} ) measures variation due to treatment levels. It can be computed as follows:
MS_{T} = SSTR / df_{TR}
 Block mean square. The block mean square ( MS_{B} ) measures variation due to blocks. It can be computed as follows:
MS_{B} = SSB / df_{B}
 Error mean square. The error mean square ( MS_{E} ) measures variation due to extraneous variables (anything other than the treatment or the blocking variable). The error mean square can be computed as follows:
MS_{E} = SSE / df_{E}
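The three mean squares follow mechanically from the sums of squares and the design dimensions. The sketch below assumes hypothetical inputs (SSTR = 32, SSB = 24, SSE = 4, with n = 4 blocks and k = 3 treatment levels); the function name is illustrative.

```python
def mean_squares(sstr, ssb, sse, n, k):
    """Divide each sum of squares by its degrees of freedom."""
    df_tr = k - 1              # treatment degrees of freedom
    df_b = n - 1               # block degrees of freedom
    df_e = (k - 1) * (n - 1)   # error degrees of freedom
    return {
        "MS_T": sstr / df_tr,
        "MS_B": ssb / df_b,
        "MS_E": sse / df_e,
    }

# Hypothetical sums of squares for n = 4 blocks and k = 3 treatment levels.
ms = mean_squares(sstr=32.0, ssb=24.0, sse=4.0, n=4, k=3)
print(ms)  # MS_T = 16.0, MS_B = 8.0, MS_E = 4/6 (about 0.667)
```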
Expected Value
The expected value of a mean square is the average value of the mean square over a large number of experiments.
Statisticians have derived formulas for the expected value of mean squares, assuming the mathematical model described earlier is correct. The expected values differ, depending on whether the treatment variable and the blocking variable are both random factors, both fixed factors, or a mix of fixed and random factors.
Fixed-Effects Model
A fixed-effects model describes an experiment in which all factors are fixed factors. The table below shows the expected value of mean squares for a randomized block experiment when the independent variable (i.e., the treatment) and the blocking variable are both fixed:
Mean square  Expected value 

MS_{T}  σ^{2}_{E} + nσ^{2}_{T} 
MS_{B}  σ^{2}_{E} + kσ^{2}_{B} 
MS_{E}  σ^{2}_{E} + σ^{2}_{TB} 
In the table above, n is the number of blocks, k is the number of treatment levels (i.e., the number of levels of the independent variable), σ^{2}_{T} is the variance due to the treatment (i.e., the independent variable), σ^{2}_{B} is the variance due to the blocking variable, σ^{2}_{TB} is the variance due to the interaction between the blocking variable and the independent variable, and σ^{2}_{E} is the variance due to extraneous variables. (σ^{2}_{E} is also known as variance due to experimental error.)
Random-Effects Model
A random-effects model describes an experiment in which all factors are random factors. The table below shows the expected value of mean squares for a randomized block experiment when the independent variable (i.e., the treatment) and the blocking variable are both random:
Mean square  Expected value 

MS_{T}  σ^{2}_{E} + σ^{2}_{TB} + nσ^{2}_{T} 
MS_{B}  σ^{2}_{E} + σ^{2}_{TB} + kσ^{2}_{B} 
MS_{E}  σ^{2}_{E} + σ^{2}_{TB} 
Mixed Models
A mixed model describes an experiment in which at least one factor is a fixed factor, and at least one factor is a random factor. The table below shows the expected value of mean squares for a randomized block experiment when the independent variable (i.e., the treatment) is fixed and the blocking variable is random:
Mean square  Expected value 

MS_{T}  σ^{2}_{E} + nσ^{2}_{T} 
MS_{B}  σ^{2}_{E} + σ^{2}_{TB} + kσ^{2}_{B} 
MS_{E}  σ^{2}_{E} + σ^{2}_{TB} 
The table below shows the expected value of mean squares for a randomized block experiment when the independent variable (i.e., the treatment) is random and the blocking variable is fixed:
Mean square  Expected value 

MS_{T}  σ^{2}_{E} + σ^{2}_{TB} + nσ^{2}_{T} 
MS_{B}  σ^{2}_{E} + kσ^{2}_{B} 
MS_{E}  σ^{2}_{E} + σ^{2}_{TB} 
All Models Assuming Zero Interaction
In the tables above, some of the expected mean squares include a term (σ^{2}_{TB}) for the variance of the interaction between the blocking variable and the independent variable. When we specified the mathematical model for a randomized block design, we did not include an interaction term. That is, we assumed the interaction was zero.
If we assume the variance of the interaction is zero, the interaction term goes away; and the expected mean squares for all models can be expressed as shown below:
Mean square  Expected value 

MS_{T}  σ^{2}_{E} + nσ^{2}_{T} 
MS_{B}  σ^{2}_{E} + kσ^{2}_{B} 
MS_{E}  σ^{2}_{E} 
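One way to build intuition for expected mean squares is to simulate many experiments and average the observed mean squares. The sketch below is a rough Monte Carlo check, not a proof. It assumes the additive model from this lesson with no treatment effect and no block effect, so all three mean squares should average close to the error variance. The parameter values, seed, and function names are arbitrary choices for illustration.

```python
import random

def observed_mean_squares(scores):
    """Return (MS_T, MS_B, MS_E) for a blocks-by-treatments table."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    block_means = [sum(row) / k for row in scores]
    treat_means = [sum(row[j] for row in scores) / n for j in range(k)]
    sstr = n * sum((m - grand) ** 2 for m in treat_means)
    ssb = k * sum((m - grand) ** 2 for m in block_means)
    sse = sum(
        (scores[i][j] - block_means[i] - treat_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    return sstr / (k - 1), ssb / (n - 1), sse / ((k - 1) * (n - 1))

def simulate_experiment(n, k, mu, sigma_e, rng):
    """One experiment from X_ij = mu + e_ij (all block and treatment effects zero)."""
    return [[mu + rng.gauss(0.0, sigma_e) for _ in range(k)] for _ in range(n)]

rng = random.Random(0)            # fixed seed so the run is repeatable
n, k, mu, sigma_e = 5, 4, 50.0, 2.0
reps = 2000

totals = [0.0, 0.0, 0.0]
for _ in range(reps):
    for idx, ms in enumerate(observed_mean_squares(simulate_experiment(n, k, mu, sigma_e, rng))):
        totals[idx] += ms

avg_ms_t, avg_ms_b, avg_ms_e = (t / reps for t in totals)
print("Error variance:", sigma_e ** 2)
print("Average MS_T:", round(avg_ms_t, 2))
print("Average MS_B:", round(avg_ms_b, 2))
print("Average MS_E:", round(avg_ms_e, 2))
```

With the treatment and block variance components both zero, each average should land near 4.0 (the error variance), which matches the last table above.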
Test Statistics
Suppose we want to test the significance of an independent variable or a blocking variable in a randomized block experiment. We can use the mean squares to define a test statistic F for each source of variation, as shown in the table below:
Source         Expected mean square     F ratio
Treatment (T)  σ^{2}_{E} + nσ^{2}_{T}   MS_{T} / MS_{E}
Block (B)      σ^{2}_{E} + kσ^{2}_{B}   MS_{B} / MS_{E}
Error          σ^{2}_{E}
Consider the F ratio for the treatment effect in the table above. Notice that, on average, the numerator should equal the denominator when the variation due to the treatment ( σ^{2}_{ T} ) is zero (i.e., when the treatment does not affect the dependent variable). And the numerator should be bigger than the denominator when the variation due to the treatment is not zero (i.e., when the treatment does affect the dependent variable).
The F ratio for the blocking variable works the same way. When the blocking variable does not affect the dependent variable, the numerator of the F ratio should equal the denominator. Otherwise, the numerator should be bigger than the denominator.
Each F ratio is a convenient measure that we can use to test the null hypothesis about the effect of a source (the treatment or the blocking variable) on the dependent variable. Here's how to conduct the test:
 When the F ratio is close to one, the numerator of the F ratio is approximately equal to the denominator. This indicates that the source did not affect the dependent variable, so we cannot reject the null hypothesis.
 When the F ratio is significantly greater than one, the numerator is bigger than the denominator. This indicates that the source did affect the dependent variable, so we must reject the null hypothesis.
What does it mean for the F ratio to be significantly greater than one? To answer that question, we need to talk about the P-value.
Warning: Recall that this analysis assumes that the interaction between blocking variable and independent variable is zero. If that assumption is incorrect, the F ratio for a fixed-effects variable will be biased. It may indicate that an effect is not significant, when it truly is significant.
P-Value
In an experiment, a P-value is the probability of obtaining a result more extreme than the observed experimental outcome, assuming the null hypothesis is true.
With analysis of variance for a randomized block experiment, the F ratios are the observed experimental outcomes that we are interested in. So, the P-value would be the probability that an F ratio would be more extreme (i.e., bigger) than the actual F ratio computed from experimental data.
How does an experimenter attach a probability to an observed F ratio? Luckily, when the null hypothesis is true, the F ratio is a random variable that has an F distribution. The degrees of freedom (v_{1} and v_{2}) for the F ratio are the degrees of freedom associated with the mean squares used to compute the F ratio.
For example, consider the F ratio for a treatment effect. That F ratio ( F_{T} ) is computed from the following formula:
F_{T} = F(v_{1}, v_{2}) = MS_{T} / MS_{E}
MS_{T} (the numerator in the formula) has degrees of freedom equal to df_{TR }; so for F_{T }, v_{1} is equal to df_{TR }. Similarly, MS_{E} (the denominator in the formula) has degrees of freedom equal to df_{E }; so for F_{T }, v_{2} is equal to df_{E }. Knowing the F ratio and its degrees of freedom, we can use an F table or an online calculator to find the probability that an F ratio will be bigger than the actual F ratio observed in the experiment.
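If no F table or calculator is at hand, the upper-tail probability can also be computed directly. The sketch below uses only the Python standard library. It relies on the standard identity that the upper tail of an F distribution equals a regularized incomplete beta function, which is evaluated here with a continued fraction. The function names and the example inputs (F = 3.4 with 4 and 12 degrees of freedom) are illustrative assumptions, not part of the lesson's calculator.

```python
import math

def _beta_continued_fraction(a, b, x, max_iter=200, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b

def f_upper_tail(f_ratio, v1, v2):
    """P[ F(v1, v2) > f_ratio ], i.e., the P-value for an observed F ratio."""
    if f_ratio <= 0.0:
        return 1.0
    x = v2 / (v2 + v1 * f_ratio)
    return regularized_incomplete_beta(v2 / 2.0, v1 / 2.0, x)

# Illustrative inputs: an F ratio of 3.4 with v1 = 4 and v2 = 12.
p_value = f_upper_tail(3.4, 4, 12)
print(round(p_value, 3))  # about 0.044
```

In practice, a statistics package would normally be used for this step; the hand-rolled version is shown only to make the computation transparent.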
F Distribution Calculator
To find the P-value associated with an F ratio, use Stat Trek's free F distribution calculator.
For an example that shows how to find the P-value for an F ratio, see Problem 1 at the end of this lesson.
Hypothesis Test
Recall that the experimenter specified a significance level early on, before the first data point was collected. Once you know the significance level and the P-values, the hypothesis tests are routine. Here's the decision rule for rejecting or failing to reject a null hypothesis:
 If the P-value is bigger than the significance level, fail to reject the null hypothesis.
 If the P-value is equal to or smaller than the significance level, reject the null hypothesis.
A "big" P-value for a source of variation (an independent variable or a blocking variable) indicates that the source did not have a statistically significant effect on the dependent variable. A "small" P-value indicates that the source did have a statistically significant effect on the dependent variable.
Magnitude of Effect
The hypothesis tests tell us whether sources of variation in our experiment had a statistically significant effect on the dependent variable, but the tests do not address the magnitude of the effect. Here's the issue:
 When the sample size is large, you may find that even small effects (indicated by a small F ratio) are statistically significant.
 When the sample size is small, you may find that even big effects are not statistically significant.
With this in mind, it is customary to supplement analysis of variance with an appropriate measure of effect size. Eta squared (η^{2}) is one such measure. Eta squared is the proportion of variance in the dependent variable that is explained by a source of variation. The eta squared formula for an independent variable or a blocking variable is:
η^{2} = SS_{SOURCE} / SST
where SS_{SOURCE} is the sum of squares for a source of variation (i.e., an independent variable or a blocking variable) and SST is the total sum of squares.
ANOVA Summary Table
It is traditional to summarize ANOVA results in an analysis of variance table. Here, filled with hypothetical data, is an analysis of variance table for a randomized block experiment with one independent variable and one blocking variable.
Analysis of Variance Table
Source     SS   df                   MS  F    P
Treatment  68   k - 1 = 4            17  3.4  0.04
Block      60   n - 1 = 3            20  4.0  0.03
Error      60   (k - 1)(n - 1) = 12  5
Total      188  nk - 1 = 19
Many of the table entries are derived from the sum of squares (SS) and degrees of freedom (df), based on the following formulas:
MS_{T} = SSTR / df_{TR} = 68/4 = 17
MS_{B} = SSB / df_{B} = 60/3 = 20
MS_{E} = SSE / df_{E} = 60/12 = 5
F_{T} = MS_{T} / MS_{E} = 17/5 = 3.4
F_{B} = MS_{B} / MS_{E} = 20/5 = 4
where MS_{T} is mean square for treatments (i.e., for the independent variable), MS_{B} is mean square for blocks, MS_{E} is the error mean square, F_{T} is the F ratio for treatments, and F_{B} is the F ratio for blocks.
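The derived entries in the table can be reproduced with a short script. This sketch takes the hypothetical sums of squares from the table (68, 60, and 60) and the design dimensions implied by its degrees of freedom (k = 5 treatment levels, n = 4 blocks); the function name anova_rows is an illustrative choice.

```python
def anova_rows(ss_treatment, ss_block, ss_error, n, k):
    """Build ANOVA table rows: (source, SS, df, MS, F)."""
    ss = {"Treatment": ss_treatment, "Block": ss_block, "Error": ss_error}
    df = {"Treatment": k - 1, "Block": n - 1, "Error": (k - 1) * (n - 1)}
    ms = {source: ss[source] / df[source] for source in ss}
    rows = []
    for source in ("Treatment", "Block", "Error"):
        # Treatment and block effects are each tested against the error mean square.
        f_ratio = ms[source] / ms["Error"] if source != "Error" else None
        rows.append((source, ss[source], df[source], ms[source], f_ratio))
    rows.append(("Total", sum(ss.values()), n * k - 1, None, None))
    return rows

rows = anova_rows(ss_treatment=68, ss_block=60, ss_error=60, n=4, k=5)

print(f"{'Source':<10}{'SS':>6}{'df':>5}{'MS':>7}{'F':>6}")
for source, ss_value, df_value, ms_value, f_ratio in rows:
    ms_text = "" if ms_value is None else f"{ms_value:.1f}"
    f_text = "" if f_ratio is None else f"{f_ratio:.1f}"
    print(f"{source:<10}{ss_value:>6}{df_value:>5}{ms_text:>7}{f_text:>6}")
```

The printed mean squares (17.0, 20.0, 5.0) and F ratios (3.4, 4.0) match the table entries.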
An ANOVA table provides all the information an experimenter needs to (1) test hypotheses and (2) assess the magnitude of treatment effects.
Hypothesis Tests
The P-value (shown in the last column of the ANOVA table) is the probability that an F statistic would be more extreme (bigger) than the F ratio shown in the table, assuming the null hypothesis is true. When a P-value for an independent variable or a blocking variable is bigger than the significance level, we fail to reject the null hypothesis for the effect; when it is smaller, we reject the null hypothesis.
Source     SS   df                   MS  F    P
Treatment  68   k - 1 = 4            17  3.4  0.04
Block      60   n - 1 = 3            20  4.0  0.03
Error      60   (k - 1)(n - 1) = 12  5
Total      188  nk - 1 = 19
For example, based on the P-values in the table above and a significance level of 0.05, we can draw the following conclusions:
 The P-value for treatments (i.e., the independent variable) is 0.04. Since the P-value is smaller than the significance level (0.05), we reject the null hypothesis that the independent variable has no effect on the dependent variable.
 The P-value for the blocking variable is 0.03. Since that P-value is also smaller than the significance level (0.05), we reject the null hypothesis that the blocking variable has no effect on the dependent variable.
In addition, two other points are worthy of note:
 The fact that the blocking variable is statistically significant is good news in a randomized block experiment. It confirms the suspicion that the blocking variable was a nuisance variable that could have obscured effects of the independent variable. And it justifies the decision to use a randomized block experiment to control nuisance effects of the blocking variable.
 The independent variable was also statistically significant with a P-value of 0.04. However, had the experimenter used a different design that did not control the nuisance effect of the blocking variable, the experiment might not have produced a significant effect for the independent variable.
Magnitude of Effects
To assess the strength of a treatment effect, an experimenter can compute eta squared (η^{2}). The computation is easy, using sum of squares entries from an ANOVA table in the formula below:
η^{2} = SS_{SOURCE} / SST
where SS_{SOURCE} is the sum of squares for the independent variable or the blocking variable being tested and SST is the total sum of squares.
To illustrate how this works, let's compute η^{2} for the independent variable and the blocking variable in the ANOVA table below:
Source     SS   df                   MS  F    P
Treatment  68   k - 1 = 4            17  3.4  0.04
Block      60   n - 1 = 3            20  4.0  0.03
Error      60   (k - 1)(n - 1) = 12  5
Total      188  nk - 1 = 19
Based on the table entries, here are the computations for eta squared (η^{2}):
η^{2}_{T} = SSTR / SST = 68 / 188 = 0.36
η^{2}_{B} = SSB / SST = 60 / 188 = 0.32
Conclusion: In this experiment, the independent variable accounted for 36% of the variance in the dependent variable; and the blocking variable accounted for 32% of the variance.
Test Your Understanding
Problem 1
In the ANOVA table shown below, the P-value for the blocking variable is missing. What is the correct entry for the missing P-value?
Source  SS  df  MS  F  P 

Treatment  100  4  25  2.5  0.07 
Block  200  20  10  1.33  ??? 
Error  800  80  10  
Total  1300  104 
Hint: Stat Trek's F Distribution Calculator may be helpful.
(A) 0.11
(B) 0.19
(C) 0.43
(D) 0.81
(E) 0.89
Solution
The correct answer is (B).
A P-value is the probability of obtaining a result more extreme (bigger) than the observed F ratio, assuming the null hypothesis is true. From the ANOVA table, we know the following:
 The observed value of the F ratio for the blocking variable is 1.33.
 The F ratio (F_{B}) was computed from the following formula:
F_{B} = F(v_{1}, v_{2}) = MS_{B} / MS_{E}
 The degrees of freedom (v_{1}) for the blocking variable mean square (MS_{B}) is 20.
 The degrees of freedom (v_{2}) for the error mean square (MS_{E}) is 80.
Therefore, the P-value we are looking for is the probability that an F with 20 and 80 degrees of freedom is greater than 1.33. We want to know:
P [ F(20, 80) > 1.33 ]
Now, we are ready to use the F Distribution Calculator. We enter the degrees of freedom (v1 = 20) for the block mean square, the degrees of freedom (v2 = 80) for the error mean square, and the F value (1.33) into the calculator; and hit the Calculate button.
The calculator reports that the probability that F is greater than 1.33 equals about 0.19. Hence, the correct P-value is 0.19.
Note: The P-value for the blocking variable in this experiment is not statistically significant. This suggests that the blocking variable may not be as much of a nuisance variable as the experimenter probably thought it was. It also suggests that another experimental design (perhaps a one-way analysis of variance) might have been a better choice than a randomized block design.