Hypergeometric Distribution
The probability distribution of a hypergeometric random variable is called a
hypergeometric distribution. This lesson describes how
hypergeometric random variables, hypergeometric experiments,
hypergeometric probability, and the hypergeometric distribution are
all related.
Notation
The following notation is helpful when we talk about hypergeometric
distributions and hypergeometric probability.
- h(x; N, n, k): hypergeometric probability, that is, the probability that an
n-trial hypergeometric experiment results in exactly x successes, when the
population consists of N items, k of which are classified as successes.
Hypergeometric Experiments
A hypergeometric experiment is a
statistical experiment that has the following properties:
- A sample of size n is randomly selected without replacement from a
population of N items.
- In the population, k items can be classified as successes, and N - k
items can be classified as failures.
Consider the following statistical experiment. You have an urn of 10 marbles - 5
red and 5 green. You randomly select 2 marbles without replacement and count
the number of red marbles you have selected. This would be a hypergeometric
experiment.
Note that it would not be a
binomial experiment. A binomial experiment requires that the
probability of success be constant on every trial. With the above experiment,
the probability of a success changes on every trial. In the beginning, the
probability of selecting a red marble is 5/10. If you select a red marble on
the first trial, the probability of selecting a red marble on the second trial
is 4/9. And if you select a green marble on the first trial, the probability of
selecting a red marble on the second trial is 5/9.
Note further that if you selected the marbles with replacement, the probability
of success would not change. It would be 5/10 on every trial. Then, this would
be a binomial experiment.
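The contrast between the two sampling schemes can be checked with exact fractions. The following is a minimal sketch; the function name `p_red_second` and its parameters are illustrative and not part of the lesson.

```python
from fractions import Fraction

def p_red_second(red, green, first_is_red, replace):
    """Probability that the second marble drawn is red, given the first draw."""
    total = red + green
    if replace:
        # With replacement the urn is unchanged, so the probability is constant.
        return Fraction(red, total)
    if first_is_red:
        # One red marble has been removed from the urn.
        return Fraction(red - 1, total - 1)
    # One green marble has been removed; all red marbles remain.
    return Fraction(red, total - 1)

print(p_red_second(5, 5, first_is_red=True, replace=False))   # 4/9
print(p_red_second(5, 5, first_is_red=False, replace=False))  # 5/9
print(p_red_second(5, 5, first_is_red=True, replace=True))    # 1/2
```

Without replacement the answer depends on the first draw (hypergeometric); with replacement it stays at 5/10 (binomial).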
Hypergeometric Distribution
A hypergeometric random variable is the number of
successes that result from a hypergeometric experiment. The
probability distribution of a hypergeometric random variable is called
a hypergeometric distribution.
Given x, N, n, and k, we can compute the
hypergeometric probability based on the following formula:
Hypergeometric Formula. Suppose a
population consists of N items, k of which are successes, and a
random sample drawn from that population consists of n items, x of
which are successes. Then the hypergeometric probability is:
h(x; N, n, k) = [ kCx ] [ (N-k)C(n-x) ] / [ NCn ]
where aCb denotes the number of combinations of a items taken b at a time.
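As an illustration, the formula can be written as a short Python function. This is a sketch that assumes Python 3.8 or later for `math.comb`; the name `hypergeom_pmf` is our own choice, not notation from the lesson.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): probability of exactly x successes in a sample of n
    items drawn without replacement from N items, k of which are successes."""
    if x < 0 or x > n:
        return 0.0  # impossible counts; also avoids negative arguments to comb
    # comb(a, b) returns 0 when b > a, so x > k or n - x > N - k gives 0.
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Urn example: 10 marbles, 5 red; draw 2 and count the red ones.
for x in range(3):
    print(x, hypergeom_pmf(x, N=10, n=2, k=5))
```

The three printed probabilities cover every possible outcome, so they sum to 1.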
The hypergeometric distribution has the following properties:
- The variance is n * k * ( N - k ) * ( N - n ) / [ N^2 * ( N - 1 ) ].
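One way to sanity-check the variance formula is to compare it with the variance computed directly from the probabilities. The sketch below assumes n is no larger than N; the helper names are hypothetical.

```python
from math import comb

def pmf(x, N, n, k):
    """Hypergeometric probability h(x; N, n, k) for 0 <= x <= n."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def variance_from_pmf(N, n, k):
    """Variance computed as the probability-weighted squared deviation."""
    xs = range(n + 1)
    mean = sum(x * pmf(x, N, n, k) for x in xs)
    return sum((x - mean) ** 2 * pmf(x, N, n, k) for x in xs)

def variance_closed_form(N, n, k):
    """Variance from the closed-form expression n*k*(N-k)*(N-n) / [N^2*(N-1)]."""
    return n * k * (N - k) * (N - n) / (N**2 * (N - 1))

# Urn example: N = 10 marbles, k = 5 red, n = 2 draws.
print(variance_from_pmf(10, 2, 5))
print(variance_closed_form(10, 2, 5))
```

Both calls should agree up to floating-point rounding (400/900 for the urn example).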
Example 1
Suppose we randomly select 5 cards without replacement from an ordinary deck of
playing cards. What is the probability of getting exactly 2 red cards (i.e.,
hearts or diamonds)?
Solution: This is a hypergeometric experiment in which we know the
following:
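- N = 52; since there are 52 cards in a deck.
- k = 26; since there are 26 red cards in a deck.
- n = 5; since we randomly select 5 cards from the deck.
- x = 2; since 2 of the cards we select are red.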
We plug these values into the hypergeometric formula as follows:
h(x; N, n, k) = [ kCx ] [ N-kCn-x ] / [ NCn ]
h(2; 52, 5, 26) = [ 26C2 ] [ 26C3 ] / [ 52C5 ]
h(2; 52, 5, 26) = [ 325 ] [ 2600 ] / [ 2,598,960 ]
h(2; 52, 5, 26) = 0.32513
Thus, the probability of randomly selecting exactly 2 red cards is 0.32513.
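The arithmetic in this example can be reproduced in a few lines of Python. This is a sketch using the standard-library `math.comb`; the variable names are ours.

```python
from math import comb

# Values from the example: 52 cards, 26 red, 5 drawn, exactly 2 red.
N, n, k, x = 52, 5, 26, 2

numerator = comb(k, x) * comb(N - k, n - x)  # 26C2 * 26C3 = 325 * 2600
denominator = comb(N, n)                     # 52C5 = 2,598,960
prob = numerator / denominator

print(numerator, denominator)  # 845000 2598960
print(round(prob, 5))          # 0.32513
```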
Hypergeometric Calculator
As you surely noticed, the hypergeometric formula requires many time-consuming
computations. The Stat Trek Hypergeometric Calculator can do this work for you -
quickly, easily, and error-free. Use the Hypergeometric Calculator to compute
hypergeometric probabilities and cumulative hypergeometric probabilities. The
calculator is free. It can be found in the Stat Trek
main menu under the Stat Tools tab.
Cumulative Hypergeometric Probability
A cumulative hypergeometric probability refers to the
probability that the hypergeometric random variable is greater than or equal to
some specified lower limit and less than or equal to some specified
upper limit.
For example, suppose we randomly select five cards from an ordinary deck of
playing cards. We might be interested in the cumulative hypergeometric
probability of obtaining 2 or fewer hearts. This would be the probability of
obtaining 0 hearts plus the probability of obtaining 1 heart plus the
probability of obtaining 2 hearts, as shown in the example below.
Example 2
Suppose we randomly select 5 cards without replacement from an ordinary deck of
playing cards. What is the probability of obtaining 2 or fewer hearts?
Solution: This is a hypergeometric experiment in which we know the
following:
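- N = 52; since there are 52 cards in a deck.
- k = 13; since there are 13 hearts in a deck.
- n = 5; since we randomly select 5 cards from the deck.
- x = 0 to 2; since our selection includes 0, 1, or 2 hearts.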
We plug these values into the hypergeometric formula as follows:
h(x ≤ 2; N, n, k) = h(x ≤ 2; 52, 5, 13)
h(x ≤ 2; 52, 5, 13) = h(x = 0; 52, 5, 13) + h(x = 1; 52, 5, 13) + h(x = 2; 52, 5, 13)
h(x ≤ 2; 52, 5, 13) = [ (13C0) (39C5) / (52C5) ] + [ (13C1) (39C4) / (52C5) ] + [ (13C2) (39C3) / (52C5) ]
h(x ≤ 2; 52, 5, 13) = [ (1)(575,757)/(2,598,960) ] + [ (13)(82,251)/(2,598,960) ] + [ (78)(9,139)/(2,598,960) ]
h(x ≤ 2; 52, 5, 13) = [ 0.2215 ] + [ 0.4114 ] + [ 0.2743 ]
h(x ≤ 2; 52, 5, 13) = 0.9072
Thus, the probability of randomly selecting at most 2 hearts is 0.9072.
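The same sum can be computed programmatically. The sketch below adds up the individual hypergeometric probabilities from 0 up to a chosen limit; the function name `hypergeom_cdf` is an assumption made for illustration.

```python
from math import comb

def hypergeom_cdf(x_max, N, n, k):
    """Probability of at most x_max successes: the sum of h(x; N, n, k)
    for x = 0, 1, ..., x_max."""
    total = comb(N, n)  # number of possible samples, NCn
    favorable = sum(comb(k, x) * comb(N - k, n - x) for x in range(x_max + 1))
    return favorable / total

# Example: 5 cards from a 52-card deck containing 13 hearts, at most 2 hearts.
print(round(hypergeom_cdf(2, N=52, n=5, k=13), 4))  # 0.9072
```

Summing the integer counts first and dividing once avoids accumulating rounding error from the three separate quotients.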