Difference Between Means
Statistics problems often involve comparisons between two
independent sample means. This lesson explains how to compute
probabilities associated with differences between means.
Difference Between Means: Theory
Suppose we have two
populations
with means equal to μ1 and μ2. Suppose
further that we take all possible
samples
of size n1 and n2. And finally, suppose that the
following assumptions are valid.
- The set of
differences between sample means is normally
distributed. This will be true if each population is normal
or if the sample sizes are large. (Based on the
central limit theorem, sample sizes of 40 or more are
typically large enough.)
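Given these assumptions, we know the following.
- The expected value of the difference between all possible sample means
is equal to the difference between population means: $\mu_d = \mu_1 - \mu_2$.
- The standard deviation of the difference between sample means is
approximately $\sigma_d = \sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}$.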
It is straightforward to derive the last bullet point, based on material
covered in previous lessons. The derivation starts with a recognition
that the variance of the difference between independent random variables
is equal to the sum of the individual variances. Thus,
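$$\sigma_d^2 = \sigma^2_{\bar{x}_1 - \bar{x}_2} = \sigma^2_{\bar{x}_1} + \sigma^2_{\bar{x}_2}$$
If the populations N1 and N2 are both large
relative to n1 and n2, respectively,
then
$$\sigma^2_{\bar{x}_1} = \frac{\sigma_1^2}{n_1}$$
and
$$\sigma^2_{\bar{x}_2} = \frac{\sigma_2^2}{n_2}$$
Therefore,
$$\sigma_d^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
and
$$\sigma_d = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$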
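As a rough check on the derivation, the formula for the standard deviation of the difference can be written as a small function and compared against a simulation. The sketch below is illustrative only: the function names, the choice of normal populations, and the numbers (standard deviations of 4 and 3, sample sizes of 64 and 36) are assumptions picked for convenience, not values from this lesson.

```python
import math
import random
import statistics


def sd_of_difference(sigma1, n1, sigma2, n2):
    """Standard deviation of (sample mean 1 - sample mean 2), independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)


def simulated_sd_of_difference(mu1, sigma1, n1, mu2, sigma2, n2,
                               trials=4000, seed=0):
    """Estimate the same quantity by repeatedly drawing two independent samples.

    Assumes both populations are normal (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(trials):
        mean1 = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        mean2 = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(mean1 - mean2)
    return statistics.stdev(diffs)


# Hypothetical inputs: 16/64 + 9/36 = 0.25 + 0.25 = 0.5
theoretical = sd_of_difference(4, 64, 3, 36)
print(round(theoretical, 4))  # 0.7071
```

Calling `simulated_sd_of_difference(20, 4, 64, 18, 3, 36)` should give a value close to the theoretical one, though not identical, since it is a random estimate.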
Difference Between Means: Sample Problem
In this section, we work through a sample problem to show how to apply
the theory presented above. The approach presented is valid
whenever we need to analyze
differences between independent sample means. In this example,
differences between means are modeled with a normal distribution;
so we use Stat Trek's
Normal Distribution Calculator
to compute probabilities. The Calculator is free.
Problem 1
For boys, the average number of absences in the first grade
is 15 with a standard deviation of 7; for girls, the average
number of absences is 10 with a standard deviation of 6.
In a nationwide survey, suppose 100 boys and 50 girls are
sampled. What is the probability that the male sample
will have at most three more days of absences than
the female sample?
(A) 0.025
(B) 0.035
(C) 0.045
(D) 0.055
(E) None of the above
Solution
The correct answer is B. The solution involves three or four steps, depending on
whether you work directly with raw scores or z-scores. The "raw score" solution
appears below:
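- Find the mean difference (male absences minus female absences)
in the population: $\mu_d = \mu_1 - \mu_2 = 15 - 10 = 5$.
- Find the standard deviation of the difference:
$\sigma_d = \sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2} = \sqrt{49/100 + 36/50} = \sqrt{1.21} = 1.1$.
- Find the probability. This problem requires us to find the
probability that the average number of absences in the boy sample
minus the average number of absences in the girl sample
is 3 or less.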
To find this probability, we use Stat Trek's
Normal Distribution Calculator.
Specifically, we enter the following inputs: 3, for the normal random variable;
5, for the mean; and 1.1, for the standard deviation.
We find that the probability of the mean difference
(male absences minus female absences) being 3 or less
is about 0.035.
Thus, the probability that the difference between samples will be
no more than 3 days is 0.035.
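The raw score calculation can also be reproduced without the online calculator. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are made up for illustration, and the numbers come from the problem statement.

```python
import math
from statistics import NormalDist

# Values from the problem statement
mean_boys, sd_boys, n_boys = 15, 7, 100
mean_girls, sd_girls, n_girls = 10, 6, 50

# Mean and standard deviation of the difference between sample means
mean_diff = mean_boys - mean_girls
sd_diff = math.sqrt(sd_boys**2 / n_boys + sd_girls**2 / n_girls)

# P(difference <= 3) under a normal model for the difference
p = NormalDist(mu=mean_diff, sigma=sd_diff).cdf(3)
print(mean_diff, sd_diff, p)
```

The printed probability should agree with the calculator result of about 0.035.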
Alternatively, we could have worked with z-scores (which have a mean of 0 and
a standard deviation of 1). Here's the z-score solution:
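- Find the mean difference and its standard deviation. As in the
raw score solution, $\mu_d = 5$ and $\sigma_d = 1.1$.
- Find the z-score for a difference of 3:
$z = (x - \mu_d) / \sigma_d = (3 - 5) / 1.1 = -1.818$.
- Find the probability. To find this probability, we use Stat Trek's
Normal Distribution Calculator.
Specifically, we enter the following inputs: -1.818, for the normal random variable;
0, for the mean; and 1, for the standard deviation.
We find that the probability of a z-score being -1.818 or less
is about 0.035.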
Note that the result is the same whether you work with raw scores or z-scores.
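The equivalence of the two approaches can be checked numerically. This sketch, again assuming Python's standard `statistics.NormalDist` and illustrative variable names, converts the raw score of 3 to a z-score and compares the two cumulative probabilities.

```python
from statistics import NormalDist

mean_diff = 5   # mean of the difference, from the solution above
sd_diff = 1.1   # standard deviation of the difference, from the solution above

# Convert the raw score to a z-score
z = (3 - mean_diff) / sd_diff
print(round(z, 3))  # -1.818

# Probability from the standard normal (z-score approach)
p_z = NormalDist().cdf(z)

# Probability from the normal with mean 5 and standard deviation 1.1 (raw score approach)
p_raw = NormalDist(mu=mean_diff, sigma=sd_diff).cdf(3)

print(abs(p_z - p_raw) < 1e-12)  # True
```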