Mean Difference Between Matched Pairs
This lesson describes how to construct a confidence interval to
estimate the mean difference between matched data pairs.
Estimation Requirements
The approach described in this lesson is valid whenever the
following conditions are met:
- The sampling distribution of the mean difference between data
pairs (d̄) is approximately normally distributed.
Generally, the sampling distribution will be approximately
normally distributed if the sample is described by at least
one of the following statements.
- The sample size is greater than 40, without outliers.
The Variability of the Mean Difference Between Matched Pairs
Suppose d̄ is the mean difference
between sample data pairs. To construct a
confidence interval for the population mean difference (μd),
we need to know how to compute the
standard deviation and/or the standard error
of the sampling distribution of d̄.
Note: In real-world analyses, the standard deviation of the
population is seldom known. Therefore, the standard error is used
more often than the standard deviation.
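The formulas themselves are not reproduced in the text above. As a hedged reminder, the standard textbook ("approximate") forms are sketched below. The symbols are assumptions of this sketch: σ_d is the population standard deviation of the differences, s_d is the sample standard deviation of the differences, d_i is the difference for pair i, and n is the number of pairs.

```latex
\sigma_{\bar{d}} = \frac{\sigma_d}{\sqrt{n}}
\qquad
SE_{\bar{d}} = \frac{s_d}{\sqrt{n}}
\qquad
s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}
```

The first expression applies when σ_d is known. The second replaces σ_d with its sample estimate s_d, which is the usual situation in practice.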
Alert
The Advanced Placement Statistics
Examination only covers the "approximate" formulas for the standard
deviation and standard error. However, students are expected to be
aware of the limitations of these formulas; namely, the
approximate formulas should only be used when the population
size is at least 10 times larger than the sample size.
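To make the limitation concrete, here is a minimal Python sketch that checks the 10-times rule and compares the approximate standard error with a version that applies a finite population correction. The function names are hypothetical, and the correction factor sqrt((N - n) / (N - 1)) is an assumption: it is one common textbook form, not necessarily the exact formula this lesson has in mind.

```python
import math


def approximate_se(s_d: float, n: int) -> float:
    """Approximate standard error of the mean difference: s_d / sqrt(n)."""
    return s_d / math.sqrt(n)


def corrected_se(s_d: float, n: int, population_size: int) -> float:
    """Standard error with a finite population correction applied.

    Assumption: the factor sqrt((N - n) / (N - 1)) is one common
    textbook form; other texts use slightly different factors.
    """
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return approximate_se(s_d, n) * fpc


def meets_ten_times_rule(n: int, population_size: int) -> bool:
    """True when the population is at least 10 times the sample size."""
    return population_size >= 10 * n


# Inputs taken from the practice problem in this lesson:
# 22 pairs, a population of 1000, and s_d = sqrt(270 / 21).
s_d = math.sqrt(270 / 21)
print(meets_ten_times_rule(22, 1000))
print(round(approximate_se(s_d, 22), 3))
print(round(corrected_se(s_d, 22, 1000), 3))
```

When the rule is met, the two standard errors differ only slightly, which is why the approximate formula is considered acceptable in that case.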
How to Find the Confidence Interval for Mean Difference With Paired Data
Previously, we described
how to construct confidence intervals. For convenience, we
repeat the key steps below.
- Identify a sample statistic. Use the mean difference between
sample data pairs (d̄)
to estimate the mean difference between population data
pairs (μd).
- Select a confidence level. The confidence level describes the
uncertainty of a sampling
method. Often, researchers choose 90%, 95%, or 99% confidence
levels; but any percentage can be used.
- Find the margin of error. Previously, we showed
how to compute the margin of error, based on the
critical value and the standard deviation or standard error.
When the sample size is large, you can use a t score or a
z score for the critical value. Since it does not require
computing degrees of freedom, the z score is a little easier.
When the sample size is small (less than 40), use a t score
for the critical value.
If you use a t score, you will need to compute
degrees of freedom (DF). In this case, the number of degrees
of freedom is equal to the sample size minus one:
DF = n - 1.
- Specify the confidence interval. The range of the confidence
interval is defined by the sample statistic ±
margin of error. And the uncertainty is denoted
by the confidence level.
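The four steps above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, and the z critical value comes from the standard library's `statistics.NormalDist`. A t critical value would have to be supplied from a t table or a statistics package, because the standard library does not provide one.

```python
import math
from statistics import NormalDist
from typing import Tuple


def z_critical(confidence: float) -> float:
    """Two-sided z critical value for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def degrees_of_freedom(n: int) -> int:
    """Degrees of freedom for a t score with n data pairs: DF = n - 1."""
    return n - 1


def confidence_interval(mean_diff: float, std_error: float,
                        critical_value: float) -> Tuple[float, float]:
    """Interval = sample statistic +/- (critical value * standard error)."""
    margin_of_error = critical_value * std_error
    return mean_diff - margin_of_error, mean_diff + margin_of_error


# z critical values for the confidence levels mentioned in the text.
for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

For the three common confidence levels, the z critical values are approximately 1.645, 1.960, and 2.576. With a small sample, pass a t critical value for DF = n - 1 to `confidence_interval` instead.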
Test Your Understanding of This Lesson
Problem
Twenty-two students were randomly selected from a population of
1000 students. The sampling method was simple random sampling.
All of the students were given a standardized English test and
a standardized math test. Test results are summarized below.
| Student | English | Math | Difference, d | (d - d̄)² |
|---|---|---|---|---|
| 1 | 95 | 90 | 5 | 16 |
| 2 | 89 | 85 | 4 | 9 |
| 3 | 76 | 73 | 3 | 4 |
| 4 | 92 | 90 | 2 | 1 |
| 5 | 91 | 90 | 1 | 0 |
| 6 | 53 | 53 | 0 | 1 |
| 7 | 67 | 68 | -1 | 4 |
| 8 | 88 | 90 | -2 | 9 |
| 9 | 75 | 78 | -3 | 16 |
| 10 | 85 | 89 | -4 | 25 |
| 11 | 90 | 95 | -5 | 36 |
| 12 | 85 | 83 | 2 | 1 |
| 13 | 87 | 83 | 4 | 9 |
| 14 | 85 | 83 | 2 | 1 |
| 15 | 85 | 82 | 3 | 4 |
| 16 | 68 | 65 | 3 | 4 |
| 17 | 81 | 79 | 2 | 1 |
| 18 | 84 | 83 | 1 | 0 |
| 19 | 71 | 60 | 11 | 100 |
| 20 | 46 | 47 | -1 | 4 |
| 21 | 75 | 77 | -2 | 9 |
| 22 | 80 | 83 | -3 | 16 |
Σ(d - d̄)² = 270
d̄ = 1
Find the 90% confidence interval for the mean difference between
student scores on the math and English tests. Assume that the
mean differences are approximately normally distributed.
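The summary values given with the table can be reproduced from the raw scores. The short Python sketch below uses the scores exactly as listed, with each difference taken as English minus math. The variable names are illustrative.

```python
# Scores for students 1 through 22, copied from the table.
english = [95, 89, 76, 92, 91, 53, 67, 88, 75, 85, 90,
           85, 87, 85, 85, 68, 81, 84, 71, 46, 75, 80]
math_scores = [90, 85, 73, 90, 90, 53, 68, 90, 78, 89, 95,
               83, 83, 83, 82, 65, 79, 83, 60, 47, 77, 83]

# Difference for each matched pair: English minus math.
differences = [e - m for e, m in zip(english, math_scores)]

n = len(differences)
mean_diff = sum(differences) / n
sum_sq_dev = sum((d - mean_diff) ** 2 for d in differences)

print(n, mean_diff, sum_sq_dev)  # prints: 22 1.0 270.0
```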
Solution
The approach that we used to solve this
problem is valid when the following conditions are met.
- The sampling distribution should be approximately normally
distributed. The problem statement says that the differences
were normally distributed; so this condition is satisfied.
Since the above requirements are satisfied, we can use the following
four-step approach to construct a confidence interval.
- Identify a sample statistic. Since we are trying to estimate
a population mean difference in math and English test scores,
we use the sample mean difference
(d̄ = 1) as the sample
statistic.
- Select a confidence level. In this analysis, the confidence level
is defined for us in the problem. We are working with a 90%
confidence level.
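- Find the margin of error. Elsewhere on this site, we show
how to compute the margin of error when the sampling
distribution is approximately normal. The key steps are
as follows.
Compute the standard deviation of the sample differences:
sd = sqrt[ Σ(d - d̄)² / (n - 1) ] = sqrt(270 / 21) = 3.586.
Because the population (1000) is more than 10 times the sample
size (22), use the approximate formula for the standard error:
SE = sd / sqrt(n) = 3.586 / sqrt(22) = 0.764.
Because the sample is small, use a t score for the critical value,
with DF = n - 1 = 21. For a 90% confidence level, the t score with
a cumulative probability of 0.95 is about 1.72.
The margin of error is the critical value times the standard error:
ME = 1.72 × 0.764 = 1.3 (rounded).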
- Specify the confidence interval. The range of the confidence
interval is defined by the sample statistic ±
margin of error. And the uncertainty is denoted
by the confidence level.
Therefore, the 90% confidence interval is -0.3 to 2.3. That is, we are 90%
confident that the mean difference between test scores in the population is in the range
defined by 1 ± 1.3.
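The arithmetic in this solution can be checked with a short Python script. This is a sketch under one stated assumption: the critical value 1.721 is read from a t table for 21 degrees of freedom and a cumulative probability of 0.95, and is hard-coded rather than computed.

```python
import math

n = 22               # number of student pairs
mean_diff = 1.0      # sample mean difference given in the problem
sum_sq_dev = 270.0   # sum of squared deviations given in the problem

# Assumption: t-table critical value for DF = 21 and a cumulative
# probability of 0.95 (two-sided 90% confidence level).
t_critical = 1.721

s_d = math.sqrt(sum_sq_dev / (n - 1))   # std. deviation of differences
std_error = s_d / math.sqrt(n)          # approximate standard error
margin_of_error = t_critical * std_error

lower = mean_diff - margin_of_error
upper = mean_diff + margin_of_error

print(round(s_d, 3), round(std_error, 3), round(margin_of_error, 1))
# prints: 3.586 0.764 1.3
print(round(lower, 1), round(upper, 1))
# prints: -0.3 2.3
```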