Statistics Dictionary
To see a definition, select a term from the dropdown text box below. The statistics
dictionary will display the definition, plus links to related web pages.
Select term:
ChiSquare Test for Independence
A chisquare test for independence is applied when you have two
categorical variables
from a single population. It is used to determine whether
there is a significant association between the two variables.
The test consists of four steps: (1) state the hypotheses,
(2) formulate an analysis plan, (3) analyze sample data, and
(4) interpret results.
 State the hypotheses. A chisquare test for independence is
conducted on two categorical variables.
Suppose that Variable A has r levels, and
Variable B has c levels.
The
null hypothesis
states that knowing the level of
Variable A does not help you predict the level of
Variable B. That is, the variables are independent. The alternative
hypothesis states that the variables are not independent.
 Formulate analysis plan. The analysis plan describes
how to use sample data to accept or reject the null
hypothesis. The plan should specify a
significance level
and should identify the chisquare test for independence
as the test method.
 Analyze sample data.
Using sample data, find the
degrees of freedom, expected frequencies,
test statistic, and the Pvalue associated with the test statistic.

Degrees of freedom. The
degrees of freedom
(DF) is equal to:
DF = (r  1) * (c  1)
where r is the number of levels for one catagorical variable, and
c is the number of levels for the other categorical
variable.

Expected frequencies. The expected frequency counts
are computed separately for each level of one categorical variable
at each level of the other categorical variable. Compute r*c
expected frequencies, according to the following formula.
E_{r,c} = (n_{r} * n_{c}) / n
where
E_{r,c} is the expected frequency count for
level r of Variable A and
level c of Variable B,
n_{r} is the total number of sample observations at
level r of Variable A,
n_{c} is the total number of sample observations at
level c of Variable B, and
n is the total sample size.

Test statistic. The test statistic is a chisquare random variable
(Χ^{2}) defined by
the following equation.
Χ^{2} =
Σ [ (O_{r,c}  E_{r,c})^{2} / E_{r,c} ]
where
O_{r,c} is the observed frequency count at level r of
Variable A and level c of Variable B, and
E_{r,c} is the expected frequency count at level r of
Variable A and level c of Variable B.

Pvalue. The Pvalue is the probability of observing a
sample statistic as extreme as the test statistic. Since the
test statistic is a chisquare, use the
ChiSquare Distribution Calculator
to assess the probability associated with the test statistic. Use
the degrees of freedom computed above.

Interpret results. If the sample findings are unlikely, given
the null hypothesis, the researcher rejects the null hypothesis.
Typically, this involves comparing the Pvalue to the
significance level
,
and rejecting the null hypothesis when the Pvalue is less than
the significance level.