Analysis of Simple Random Samples
Simple random sampling refers to a sampling method in which all possible
samples of n objects are equally likely to occur.
An important benefit of simple random sampling is that it allows researchers to use
statistical methods to analyze sample results. For example, given a simple random
sample, researchers can use statistical methods to define a
confidence interval around a sample mean. Statistical
analysis is not appropriate when non-random sampling methods are used.
There are many ways to obtain a simple random sample. One way would be the
lottery method. Each of the N population members is assigned a unique
number. The numbers are placed in a bowl and thoroughly mixed. Then, a
blind-folded researcher selects n numbers. Population members having the
selected numbers are included in the sample.
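The lottery method can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `lottery_sample` and its `seed` parameter are hypothetical, and the standard library's `random.sample` stands in for the bowl and the blindfolded researcher.

```python
import random

def lottery_sample(population_size, sample_size, seed=None):
    """Draw a simple random sample of member numbers without replacement."""
    rng = random.Random(seed)
    # Assign each of the N population members a unique number 1..N
    # (the numbered slips placed in the bowl).
    numbers = range(1, population_size + 1)
    # Draw n numbers without replacement; every set of n numbers
    # is equally likely to be selected.
    return rng.sample(numbers, sample_size)

# Example: select 36 members from a population of 20,000.
selected = lottery_sample(population_size=20000, sample_size=36, seed=1)
```

Population members whose numbers appear in `selected` would be included in the sample. Fixing `seed` only makes the draw reproducible; it does not change the sampling method.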
Notation
The following notation is helpful when we talk about simple random sampling.
- Σ = Summation symbol, used to compute sums over the sample. (To illustrate
  its use, Σ x_i = x_1 + x_2 + x_3 + ... + x_(m-1) + x_m.)
The Variability of the Estimate
The precision of a sample design is directly related to the variability of
the estimate.
Two common measures of variability are the
standard deviation (SD) of the estimate and the
standard error (SE) of the estimate. The tables below show how to
compute both measures, assuming that the sample method is
simple random sampling.
The first table shows how to compute variability for a mean score. Note
that the table shows four sample designs. In two of the designs, the true
population variance is known; and in two, it is estimated from sample data.
Also, in two of the designs, the researcher sampled with replacement; and in
two, without replacement.
| Population variance | Replacement strategy | Variability |
| --- | --- | --- |
| Known | With replacement | SD = sqrt [ σ² / n ] |
| Known | Without replacement | SD = sqrt { ( 1 - n/N ) * [ N / ( N - 1 ) ] * σ² / n } |
| Estimated | With replacement | SE = sqrt [ s² / n ] |
| Estimated | Without replacement | SE = sqrt [ ( 1 - n/N ) * s² / n ] |
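As a rough sketch, the four formulas for a mean score could be coded as follows. The function names `sd_mean` and `se_mean`, and the convention that `N=None` means sampling with replacement, are assumptions made for this illustration.

```python
import math

def sd_mean(sigma2, n, N=None):
    """SD of the sample mean when the population variance sigma2 is known.

    N=None is taken to mean sampling with replacement (illustrative convention).
    """
    if N is None:
        return math.sqrt(sigma2 / n)
    # Without replacement: apply the finite population factors.
    return math.sqrt((1 - n / N) * (N / (N - 1)) * sigma2 / n)

def se_mean(s2, n, N=None):
    """SE of the sample mean when the variance s2 is estimated from the sample."""
    if N is None:
        return math.sqrt(s2 / n)
    # Without replacement: apply the finite population correction.
    return math.sqrt((1 - n / N) * s2 / n)
```

For example, `se_mean(100, 25)` gives 2.0. When n is small relative to N, the with-replacement and without-replacement results are nearly identical.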
The next table shows how to compute variability for a proportion. Like
the previous table, this table shows four sample designs. In two of the
designs, the true population proportion is known; and in two, it is estimated
from sample data. Also, in two of the designs, the researcher sampled with replacement;
and in two, without replacement.
| Population proportion | Replacement strategy | Variability |
| --- | --- | --- |
| Known | With replacement | SD = sqrt [ P * ( 1 - P ) / n ] |
| Known | Without replacement | SD = sqrt { [ ( N - n ) / ( N - 1 ) ] * P * ( 1 - P ) / n } |
| Estimated | With replacement | SE = sqrt [ p * ( 1 - p ) / ( n - 1 ) ] |
| Estimated | Without replacement | SE = sqrt [ ( 1 - n / N ) * p * ( 1 - p ) / ( n - 1 ) ] |
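The proportion formulas can be sketched the same way. Again, the names `sd_proportion` and `se_proportion` and the `N=None` convention for sampling with replacement are hypothetical choices for this example.

```python
import math

def sd_proportion(P, n, N=None):
    """SD of the sample proportion when the population proportion P is known."""
    if N is None:
        return math.sqrt(P * (1 - P) / n)
    # Without replacement: scale by (N - n) / (N - 1).
    return math.sqrt((N - n) / (N - 1) * P * (1 - P) / n)

def se_proportion(p, n, N=None):
    """SE of the sample proportion when p is estimated from the sample."""
    if N is None:
        return math.sqrt(p * (1 - p) / (n - 1))
    # Without replacement: apply the finite population correction.
    return math.sqrt((1 - n / N) * p * (1 - p) / (n - 1))
```

Note that the estimated versions divide by n - 1 rather than n, as in the table.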
Sample Problem
This section presents a sample problem that illustrates how to analyze survey
data when the sampling method is simple random sampling. (In a
subsequent lesson, we re-visit this problem and see how simple
random sampling compares to other sampling methods.)
Problem 1
At the end of every school year, the state administers a reading test to a
simple random sample drawn without replacement from a population of 20,000
third graders. This year, the test was administered to 36 students selected via
simple random sampling. The test score from each sampled student is shown
below:
50, 55, 60, 62, 62, 65, 67, 67, 70, 70, 70, 70, 72, 72, 73, 73, 75, 75,
75, 78, 78, 78, 78, 80, 80, 80, 82, 82, 85, 85, 85, 88, 88, 90, 90, 90
Using sample data, estimate the mean reading achievement level in the
population. Find the margin of error and the confidence interval. Assume a
95% confidence level.
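Solution: Previously we described how to compute the confidence interval for
a mean score. We follow that process here. The sample mean is 2700 / 36 = 75,
and the sample variance is s² = 3464 / 35, or about 98.97. Because the sample
was drawn without replacement and the population variance is estimated from
sample data, SE = sqrt [ ( 1 - n/N ) * s² / n ] =
sqrt [ ( 1 - 36/20,000 ) * 98.97 / 36 ], or about 1.66. Using a critical
value of 1.96 (the standard normal value for a 95% confidence level, which is
consistent with the margin of error reported here), the margin of error is
1.96 * 1.66, or about 3.25.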
Therefore, the 95% confidence interval is 71.75 to 78.25, and the margin of
error is 3.25. That is, we are 95% confident that the true population mean is
in the range defined by 75 ± 3.25.
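The arithmetic for this problem can be checked with a short Python sketch. It uses the without-replacement formula for an estimated variance from the first table. The critical value of 1.96 is an assumption (a normal approximation at the 95% confidence level); it is the value that reproduces the stated margin of error.

```python
import math
import statistics

scores = [
    50, 55, 60, 62, 62, 65, 67, 67, 70, 70, 70, 70, 72, 72, 73, 73, 75, 75,
    75, 78, 78, 78, 78, 80, 80, 80, 82, 82, 85, 85, 85, 88, 88, 90, 90, 90,
]
N = 20000          # population size (third graders)
n = len(scores)    # sample size

mean = statistics.mean(scores)     # sample mean
s2 = statistics.variance(scores)   # sample variance (divides by n - 1)

# SE for a mean: variance estimated, sampling without replacement.
se = math.sqrt((1 - n / N) * s2 / n)

z = 1.96           # assumed critical value for 95% confidence
margin_of_error = z * se
ci = (mean - margin_of_error, mean + margin_of_error)

print(f"mean = {mean}, SE = {se:.2f}, ME = {margin_of_error:.2f}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
```

Rounded to two decimal places, the results agree with the margin of error and confidence interval given above.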