Chi-Square Distribution
The distribution of the chi-square statistic is called the chi-square distribution. In this lesson, we learn to compute the chi-square statistic and find the probability associated with the statistic. And we'll work through some chi-square examples to illustrate key points.
The Chi-Square Statistic
Suppose we conduct the following statistical experiment. We select a random sample of size n from a normal population, having a standard deviation equal to σ. We find that the standard deviation in our sample is equal to s. Given these data, we can define a statistic, called chi-square, using the following equation:
$$\chi^2 = \frac{(n - 1)\, s^2}{\sigma^2}$$
The distribution of the chi-square statistic is called the chi-square distribution. The chi-square distribution is defined by the following probability density function:
$$Y = Y_0 \cdot (\chi^2)^{(v/2 - 1)} \cdot e^{-\chi^2 / 2}$$
where $Y_0$ is a constant that depends on the number of degrees of freedom, $\chi^2$ is the chi-square statistic, $v = n - 1$ is the number of degrees of freedom, and $e$ is the base of the natural logarithm (approximately 2.71828). $Y_0$ is chosen so that the area under the chi-square curve is equal to one.
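The density above can be evaluated directly once the constant is written out. The sketch below assumes the standard normalizing constant for the chi-square distribution, 1 / (2^(v/2) * Γ(v/2)), which the lesson does not state explicitly. The function names `chi_square_pdf` and `area_under_curve` are illustrative, not part of any library.

```python
import math

def chi_square_pdf(x, v):
    """Chi-square density with v degrees of freedom, evaluated at x > 0."""
    # Assumed form of the constant Y0: 1 / (2^(v/2) * Gamma(v/2)).
    y0 = 1.0 / (2.0 ** (v / 2.0) * math.gamma(v / 2.0))
    return y0 * x ** (v / 2.0 - 1.0) * math.exp(-x / 2.0)

def area_under_curve(v, upper=200.0, steps=200_000):
    """Midpoint-rule estimate of the area under the density on (0, upper)."""
    width = upper / steps
    return sum(chi_square_pdf((i + 0.5) * width, v) for i in range(steps)) * width

# With this choice of Y0, the area should come out very close to 1.
print(area_under_curve(4))
```

The numerical integration is only a sanity check on the constant. It is meant for v of 2 or more, where the density stays finite near zero.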
In the figure below, the red curve shows the distribution of chi-square values computed from all possible samples of size 3, where degrees of freedom is n - 1 = 3 - 1 = 2. Similarly, the green curve shows the distribution for samples of size 5 (degrees of freedom equal to 4); and the blue curve, for samples of size 11 (degrees of freedom equal to 10).
The chi-square distribution has the following properties:
- The mean of the distribution is equal to the number of degrees of freedom: μ = v.
- The variance is equal to two times the number of degrees of freedom: $\sigma^2 = 2v$.
- When the degrees of freedom are greater than or equal to 2, the maximum value for $Y$ occurs when $\chi^2 = v - 2$.
- As the degrees of freedom increase, the chi-square curve approaches a normal distribution.
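The first two properties can be checked informally by simulation. The sketch below repeatedly draws samples of size n from a normal population, computes (n - 1) * s^2 / σ^2 for each, and compares the mean and variance of the results with v and 2v. The function name, the seed, and the number of trials are arbitrary choices for illustration.

```python
import random
import statistics

def simulate_chi_square(n, sigma, trials, seed=0):
    """Chi-square statistics from repeated normal samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        # Sample variance s^2 uses the divisor n - 1.
        s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
        values.append((n - 1) * s2 / sigma ** 2)
    return values

values = simulate_chi_square(n=5, sigma=4.0, trials=20_000)
print(statistics.fmean(values))     # should be close to v = 4
print(statistics.variance(values))  # should be close to 2 * v = 8
```

Because the values are random, the printed numbers only approximate 4 and 8; the agreement improves as the number of trials grows.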
Cumulative Probability and the Chi-Square Distribution
The chi-square distribution is constructed so that the total area under the curve is equal to 1. The area under the curve between 0 and a particular chi-square value is a cumulative probability associated with that chi-square value. For example, in the figure below, the shaded area represents a cumulative probability associated with a chi-square statistic equal to A; that is, it is the probability that the value of a chi-square statistic will fall between 0 and A.
To find the probability associated with a chi-square statistic, use a chi-square distribution table (found in the appendix of most introductory statistics texts), a graphing calculator, or an online chi-square distribution calculator, like Stat Trek's Chi-Square Calculator.
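For readers who prefer to compute the cumulative probability themselves, here is a minimal sketch using only the Python standard library. It is not how any particular calculator works; it assumes the standard relationship between the chi-square cumulative probability and the regularized lower incomplete gamma function, evaluated with a power series. The name `chi_square_cdf` and its tolerance parameters are illustrative.

```python
import math

def chi_square_cdf(x, v, tol=1e-12, max_terms=10_000):
    """P(chi-square <= x) for v degrees of freedom, via a power series."""
    if x <= 0:
        return 0.0
    a = v / 2.0  # shape parameter of the related gamma distribution
    z = x / 2.0
    # Regularized lower incomplete gamma function:
    # P(a, z) = z^a * e^(-z) / Gamma(a + 1) * sum_{k>=0} z^k / ((a+1)...(a+k))
    term = 1.0
    total = 1.0
    for k in range(1, max_terms):
        term *= z / (a + k)
        total += term
        if term < tol * total:
            break
    log_prefactor = a * math.log(z) - z - math.lgamma(a + 1.0)
    return min(1.0, total * math.exp(log_prefactor))

# Cumulative probability for a chi-square value of 13.5 with 6 degrees of freedom.
print(chi_square_cdf(13.5, 6))
```

The series converges for any positive x, but it is best suited to moderate values; for very large x a statistics library would be a safer choice.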
Chi-Square Calculator
The Chi-Square Calculator solves common statistics problems, based on the chi-square distribution. The calculator computes cumulative probabilities, based on simple inputs. Clear instructions guide you to an accurate solution, quickly and easily. If anything is unclear, frequently-asked questions and sample problems provide straightforward explanations. The calculator is free. It can be found in the Stat Trek main menu under the Stat Tools tab.
Test Your Understanding
Problem 1
The Acme Battery Company has developed a new cell phone battery. On average, the battery lasts 60 minutes on a single charge. The standard deviation is 4 minutes.
Suppose the manufacturing department runs a quality control test. They randomly select 7 batteries. The standard deviation of the selected batteries is 6 minutes. What would be the chi-square statistic represented by this test?
Solution
We know the following:
- The standard deviation of the population is 4 minutes.
- The standard deviation of the sample is 6 minutes.
- The number of sample observations is 7.
To compute the chi-square statistic, we plug these values into the chi-square equation, as shown below.
$$\chi^2 = \frac{(n - 1)\, s^2}{\sigma^2}$$
$$\chi^2 = \frac{(7 - 1) \cdot 6^2}{4^2} = 13.5$$
where $\chi^2$ is the chi-square statistic, $n$ is the sample size, $s$ is the standard deviation of the sample, and $\sigma$ is the standard deviation of the population.
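The same arithmetic can be written as a small Python function. This is only a sketch of the formula used in this problem; the function name `chi_square_statistic` and its input checks are illustrative choices.

```python
def chi_square_statistic(n, s, sigma):
    """Chi-square statistic (n - 1) * s^2 / sigma^2 for a sample of size n."""
    if n < 2:
        raise ValueError("the sample must contain at least two observations")
    if sigma <= 0:
        raise ValueError("the population standard deviation must be positive")
    return (n - 1) * s ** 2 / sigma ** 2

# Battery example: 7 batteries, sample SD of 6 minutes, population SD of 4 minutes.
print(chi_square_statistic(n=7, s=6, sigma=4))  # 13.5
```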
Problem 2
Let's revisit the problem presented above. The manufacturing department ran a quality control test, using 7 randomly selected batteries. In their test, the standard deviation was 6 minutes, which equated to a chi-square statistic of 13.5.
Suppose they repeated the test with a new random sample of 7 batteries. What is the probability that the standard deviation in the new test would be greater than 6 minutes?
Solution
We know the following:
- The sample size n is equal to 7.
- The degrees of freedom are equal to n - 1 = 7 - 1 = 6.
- The chi-square statistic is equal to 13.5 (see Problem 1 above).
Given the degrees of freedom, we can determine the cumulative probability that the chi-square statistic will fall (a) between 0 and any positive value or (b) between any positive value and plus infinity. To find those cumulative probabilities, we enter the degrees of freedom (6) and the chi-square statistic (13.5) into the Chi-Square Distribution Calculator.
The calculator displays two cumulative probabilities:
- $P(\chi^2 \le 13.5) = 0.96425$
- $P(\chi^2 \ge 13.5) = 0.03575$
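Because n and σ are fixed, a larger sample standard deviation always produces a larger chi-square statistic, so the sample standard deviation exceeds 6 minutes exactly when the chi-square statistic exceeds 13.5. This tells us that the probability that the standard deviation in a new test would be less than or equal to 6 minutes is about 0.96, and the probability that it would be greater than 6 minutes is about 0.04. (The chi-square distribution is continuous, so "greater than" and "greater than or equal to" have the same probability.)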
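These two probabilities can be cross-checked without a calculator. When the degrees of freedom are even, as they are here, the upper-tail probability has a closed form: e^(-x/2) times the sum of (x/2)^k / k! for k from 0 to v/2 - 1. The sketch below relies on that standard identity, which the lesson itself does not use; the function name is illustrative.

```python
import math

def chi_square_upper_tail_even(x, v):
    """P(chi-square >= x) for an even number v of degrees of freedom."""
    if v <= 0 or v % 2 != 0:
        raise ValueError("this closed form only applies to even, positive v")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # the k = 0 term of the sum
    total = 1.0
    for k in range(1, v // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total

upper = chi_square_upper_tail_even(13.5, 6)
print(round(upper, 5))      # 0.03575
print(round(1 - upper, 5))  # 0.96425
```

For odd degrees of freedom this shortcut does not apply, and a table, calculator, or statistics library is needed instead.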