# Difference Between Proportions

Statistics problems often involve comparisons between two independent sample proportions. This lesson explains how to compute probabilities associated with differences between proportions.

## Difference Between Proportions: Theory

Suppose we have two populations with proportions equal to P_{1} and P_{2}. Suppose further that we take all possible samples of size n_{1} from the first population and all possible samples of size n_{2} from the second. And finally, suppose that the following assumptions are valid.

- The size of each population is large relative to the sample drawn from the population. That is, N_{1} is large relative to n_{1}, and N_{2} is large relative to n_{2}. (In this context, populations are considered to be large if they are at least 10 times bigger than their sample.)
- The samples from each population are big enough to justify using a normal distribution to model differences between proportions. The sample sizes will be big enough when the following conditions are met: n_{1}P_{1} ≥ 10, n_{1}(1 - P_{1}) ≥ 10, n_{2}P_{2} ≥ 10, and n_{2}(1 - P_{2}) ≥ 10.
- The samples are independent; that is, observations in population 1 are not affected by observations in population 2, and vice versa.
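The first two assumptions are numeric, so they can be checked mechanically. The sketch below is one way to do that; the function name `assumptions_hold` and its parameters are hypothetical, chosen only for illustration. Independence of the samples depends on how the data were collected and cannot be verified this way.

```python
def assumptions_hold(n1, p1, n2, p2, pop1=None, pop2=None, threshold=10):
    """Return True if the numeric conditions listed above are satisfied.

    n1, n2     -- sample sizes
    p1, p2     -- population proportions (P1 and P2 in the text)
    pop1, pop2 -- optional population sizes (N1 and N2 in the text)
    """
    # Expected successes and failures in each sample must reach the threshold.
    counts = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    large_samples = all(c >= threshold for c in counts)

    # Each population should be at least 10 times its sample.
    # The check is skipped when a population size is not supplied.
    large_pop1 = pop1 is None or pop1 >= 10 * n1
    large_pop2 = pop2 is None or pop2 >= 10 * n2

    return large_samples and large_pop1 and large_pop2
```

For example, `assumptions_hold(20, 0.1, 100, 0.47)` fails because the first sample would be expected to contain only 2 successes.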

Given these assumptions, we know the following.

- The set of differences between sample proportions will be normally distributed. We know this from the central limit theorem.
- The expected value of the difference between all possible sample proportions is equal to the difference between population proportions. Thus, E(p_{1} - p_{2}) = P_{1} - P_{2}.
- The standard deviation of the difference between sample proportions (σ_{d}) is approximately equal to:

σ_{d} = sqrt{ [P_{1}(1 - P_{1}) / n_{1}] + [P_{2}(1 - P_{2}) / n_{2}] }

It is straightforward to derive the last bullet point, based on material covered in previous lessons. The derivation starts with a recognition that the variance of the difference between independent random variables is equal to the sum of the individual variances. Thus,

σ^{2}_{d} = σ^{2}_{p1 - p2} = σ^{2}_{1} + σ^{2}_{2}

If the populations N_{1} and N_{2} are both large
relative to n_{1} and n_{2}, respectively,
then

σ^{2}_{1} = P_{1}(1 - P_{1}) / n_{1}

And

σ^{2}_{2} = P_{2}(1 - P_{2}) / n_{2}

Therefore,

σ^{2}_{d} =
[ P_{1}(1 - P_{1}) / n_{1} ] +
[ P_{2}(1 - P_{2}) / n_{2} ]

And

σ_{d} =
sqrt{ [ P_{1}(1 - P_{1}) / n_{1} ] +
[ P_{2}(1 - P_{2}) / n_{2} ] }
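As a rough sanity check on this result, the formula can be compared with a simulation. The sketch below is illustrative only: the function names, the number of trials, and the seed are assumptions, and each sample is drawn as independent trials with a fixed success probability, which corresponds to the large-population case described above.

```python
import math
import random
import statistics


def sd_of_difference(p1, n1, p2, n2):
    """Standard deviation of (p1 - p2) from the formula derived above."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def sample_proportion(p, n, rng):
    """Proportion of successes in n independent draws with success probability p."""
    return sum(rng.random() < p for _ in range(n)) / n


def simulated_sd(p1, n1, p2, n2, trials=5000, seed=0):
    """Empirical standard deviation of the difference over repeated sample pairs."""
    rng = random.Random(seed)
    diffs = [
        sample_proportion(p1, n1, rng) - sample_proportion(p2, n2, rng)
        for _ in range(trials)
    ]
    return statistics.pstdev(diffs)


if __name__ == "__main__":
    # Illustrative values: proportions 0.52 and 0.47, 100 observations each.
    print(round(sd_of_difference(0.52, 100, 0.47, 100), 4))
    print(round(simulated_sd(0.52, 100, 0.47, 100), 4))
```

With enough trials, the simulated value should land close to the value given by the formula, though the two will not match exactly.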

## Difference Between Proportions: Sample Problem

In this section, we work through a sample problem to show how to apply the theory presented above. The approach presented is valid whenever we need to analyze differences between independent sample proportions. In this example, differences between proportions are modeled with a normal distribution; so we use Stat Trek's Normal Distribution Calculator to compute probabilities. The calculator is free.

**Problem 1**

In one state, 52% of the voters are Republicans, and 48% are Democrats. In a second state, 47% of the voters are Republicans, and 53% are Democrats. Suppose 100 voters are surveyed from each state. Assume the survey uses simple random sampling.

What is the probability that the survey will show a greater percentage of Republican voters in the second state than in the first state?

(A) 0.04

(B) 0.05

(C) 0.24

(D) 0.71

(E) 0.76

**Solution**

The correct answer is C. For this analysis, let P_{1} =
the proportion of Republican voters in the first state,
P_{2} = the proportion of Republican voters in the second state,
p_{1} = the proportion of Republican voters in the
sample from the first state, and
p_{2} = the proportion of Republican voters in the
sample from the second state. The number of voters sampled from
the first state (n_{1}) = 100, and the number of voters
sampled from the second state (n_{2}) = 100.

The solution involves four steps.

- Make sure the samples from each population are big enough to model differences with a normal distribution. Because n_{1}P_{1} = 100 * 0.52 = 52, n_{1}(1 - P_{1}) = 100 * 0.48 = 48, n_{2}P_{2} = 100 * 0.47 = 47, and n_{2}(1 - P_{2}) = 100 * 0.53 = 53 are each greater than 10, the sample size is large enough.
- Find the mean of the difference in sample proportions: E(p_{1} - p_{2}) = P_{1} - P_{2} = 0.52 - 0.47 = 0.05.
- Find the standard deviation of the difference.

  σ_{d} = sqrt{ [ P_{1}(1 - P_{1}) / n_{1} ] + [ P_{2}(1 - P_{2}) / n_{2} ] }

  σ_{d} = sqrt{ [ (0.52)(0.48) / 100 ] + [ (0.47)(0.53) / 100 ] }

  σ_{d} = sqrt(0.002496 + 0.002491) = sqrt(0.004987) = 0.0706
- Find the probability. This problem requires us to find the probability that p_{1} is less than p_{2}. This is equivalent to finding the probability that p_{1} - p_{2} is less than zero. To find this probability, we need to transform the random variable (p_{1} - p_{2}) into a z-score. That transformation appears below.

  z_{p1 - p2} = (x - μ_{p1 - p2}) / σ_{d} = (0 - 0.05)/0.0706 = -0.7082

  Using Stat Trek's Normal Distribution Calculator, we find that the probability of a z-score being -0.7082 or less is 0.24.

Therefore, the probability that the survey will show a greater percentage of Republican voters in the second state than in the first state is 0.24.
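The same calculation can be reproduced without an online calculator. The following is a minimal sketch that uses `NormalDist` from Python's standard `statistics` module in place of the Normal Distribution Calculator; the function name `prob_difference_below` is hypothetical.

```python
import math
from statistics import NormalDist


def prob_difference_below(x, p1, n1, p2, n2):
    """Return (z, P(p1 - p2 < x)) under the normal approximation."""
    mean = p1 - p2  # expected value of the difference
    sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (x - mean) / sd  # z-score for the cutoff x
    return z, NormalDist().cdf(z)  # cumulative probability of the standard normal


# Values from Problem 1: is p1 - p2 below zero?
z, prob = prob_difference_below(0, p1=0.52, n1=100, p2=0.47, n2=100)
print(f"z = {z:.4f}, probability = {prob:.2f}")
```

Because the code does not round the standard deviation to 0.0706 before dividing, its z-score differs from -0.7082 in the fourth decimal place. Both values are about -0.708, and the probability still rounds to 0.24.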