Analysis of Simple Random Samples

Simple random sampling refers to a sampling method that has the following properties.

  • The population consists of N objects.
  • The sample consists of n objects.
  • All possible samples of n objects are equally likely to occur.

An important benefit of simple random sampling is that it allows researchers to use statistical methods to analyze sample results. For example, given a simple random sample, researchers can use statistical methods to construct a confidence interval around a sample mean. These methods rely on random selection, so they are not appropriate when non-random sampling methods are used.

There are many ways to obtain a simple random sample. One is the lottery method. Each of the N population members is assigned a unique number. The numbers are placed in a bowl and thoroughly mixed. Then, a blindfolded researcher selects n numbers. Population members having the selected numbers are included in the sample.
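The lottery method can be imitated in software. The sketch below is a minimal illustration in Python, not a prescribed procedure: the function name draw_simple_random_sample, the seed, and the population and sample sizes are assumptions chosen for the example. It relies on the standard library's random.sample, which draws without replacement.

```python
import random


def draw_simple_random_sample(population_size, sample_size, seed=None):
    """Return sample_size distinct member numbers drawn from 1..population_size.

    Hypothetical helper: each population member is represented by its
    assigned number, as in the lottery method.
    """
    rng = random.Random(seed)
    # random.sample draws without replacement, so no number is picked twice.
    return rng.sample(range(1, population_size + 1), sample_size)


# Illustrative values only: N = 20,000 members, n = 36 selected.
selected = draw_simple_random_sample(population_size=20000, sample_size=36, seed=1)
print(sorted(selected))
```

Fixing the seed makes the draw reproducible, which is useful for documenting how a sample was chosen. Omit the seed to get a fresh draw each time.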

Notation

The following notation is helpful when we talk about simple random sampling.

  • σ: The known standard deviation of the population.
  • σ²: The known variance of the population.
  • P: The true population proportion.
  • N: The number of observations in the population.
  • x̄: The sample estimate of the population mean.
  • s: The sample estimate of the standard deviation of the population.
  • s²: The sample estimate of the population variance.
  • p: The proportion of successes in the sample.
  • n: The number of observations in the sample.
  • SD: The standard deviation of the sampling distribution.
  • SE: The standard error. (This is an estimate of the standard deviation of the sampling distribution.)
  • Σ: The summation symbol, used to compute sums over the sample. (To illustrate its use, Σ xᵢ = x₁ + x₂ + x₃ + ... + xₙ₋₁ + xₙ)

The Variability of the Estimate

The precision of a sample design is directly related to the variability of the estimate. Two common measures of variability are the standard deviation (SD) of the estimate and the standard error (SE) of the estimate. The tables below show how to compute both measures, assuming that the sample method is simple random sampling.

The first table shows how to compute variability for a mean score. Note that the table shows four sample designs. In two of the designs, the true population variance is known; and in two, it is estimated from sample data. Also, in two of the designs, the researcher sampled with replacement; and in two, without replacement.

Population variance   Replacement strategy   Variability
Known                 With replacement       SD = sqrt [ σ² / n ]
Known                 Without replacement    SD = sqrt { ( 1 - n/N ) * [ N / ( N - 1 ) ] * σ² / n }
Estimated             With replacement       SE = sqrt [ s² / n ]
Estimated             Without replacement    SE = sqrt [ ( 1 - n/N ) * s² / n ]
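The four formulas for a mean score can be collected into one small function. This is a sketch under stated assumptions: the function name variability_of_mean and its parameters are hypothetical, and passing N=None is simply a convention used here to mean sampling with replacement.

```python
import math


def variability_of_mean(variance, n, N=None, variance_known=False):
    """Return SD (known variance) or SE (estimated variance) of a sample mean.

    variance        -- population variance if known, else the sample estimate
    n               -- number of observations in the sample
    N               -- population size; None means sampling with replacement
    variance_known  -- True selects the "Known" rows of the table
    """
    if N is None:
        # With replacement: sqrt(variance / n) for both known and estimated.
        return math.sqrt(variance / n)
    correction = 1 - n / N
    if variance_known:
        # The known-variance formula carries the extra N / (N - 1) factor.
        correction *= N / (N - 1)
    return math.sqrt(correction * variance / n)


# Illustrative values only: variance 100, n = 25, N = 1000.
print(variability_of_mean(100, 25))                               # with replacement
print(variability_of_mean(100, 25, N=1000, variance_known=True))  # SD, without replacement
print(variability_of_mean(100, 25, N=1000))                       # SE, without replacement
```

When n is small relative to N, the correction factor is close to 1 and the with-replacement and without-replacement results are nearly identical.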

The next table shows how to compute variability for a proportion. Like the previous table, this table shows four sample designs. In two of the designs, the true population proportion is known; and in two, it is estimated from sample data. Also, in two of the designs, the researcher sampled with replacement; and in two, without replacement.

Population proportion   Replacement strategy   Variability
Known                   With replacement       SD = sqrt [ P * ( 1 - P ) / n ]
Known                   Without replacement    SD = sqrt { [ ( N - n ) / ( N - 1 ) ] * P * ( 1 - P ) / n }
Estimated               With replacement       SE = sqrt [ p * ( 1 - p ) / ( n - 1 ) ]
Estimated               Without replacement    SE = sqrt [ ( 1 - n / N ) * p * ( 1 - p ) / ( n - 1 ) ]
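The proportion formulas can be sketched the same way. As before, the function name variability_of_proportion and its parameters are assumptions made for illustration, with N=None standing for sampling with replacement. Note that the estimated rows divide by n - 1 while the known rows divide by n.

```python
import math


def variability_of_proportion(proportion, n, N=None, proportion_known=False):
    """Return SD (known P) or SE (estimated p) of a sample proportion.

    proportion        -- true proportion P if known, else the sample proportion p
    n                 -- number of observations in the sample
    N                 -- population size; None means sampling with replacement
    proportion_known  -- True selects the "Known" rows of the table
    """
    if proportion_known:
        base = proportion * (1 - proportion) / n
        correction = 1.0 if N is None else (N - n) / (N - 1)
    else:
        base = proportion * (1 - proportion) / (n - 1)
        correction = 1.0 if N is None else 1 - n / N
    return math.sqrt(correction * base)


# Illustrative values only: proportion 0.4, n = 101, N = 5000.
print(variability_of_proportion(0.4, 101, proportion_known=True))          # SD, with replacement
print(variability_of_proportion(0.4, 101, N=5000, proportion_known=True))  # SD, without replacement
print(variability_of_proportion(0.4, 101))                                 # SE, with replacement
print(variability_of_proportion(0.4, 101, N=5000))                         # SE, without replacement
```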

Sample Problem

This section presents a sample problem that illustrates how to analyze survey data when the sampling method is simple random sampling. (In a subsequent lesson, we re-visit this problem and see how simple random sampling compares to other sampling methods.)

Problem 1

At the end of every school year, the state administers a reading test to a simple random sample drawn without replacement from a population of 20,000 third graders. This year, the test was administered to 36 students selected via simple random sampling. The test score from each sampled student is shown below:

50, 55, 60, 62, 62, 65, 67, 67, 70, 70, 70, 70, 72, 72, 73, 73, 75, 75,
75, 78, 78, 78, 78, 80, 80, 80, 82, 82, 85, 85, 85, 88, 88, 90, 90, 90 

Using sample data, estimate the mean reading achievement level in the population. Find the margin of error and the confidence interval. Assume a 95% confidence level.

Solution: Previously we described how to compute the confidence interval for a mean score. We follow that process below.

  • Identify a sample statistic. Since we are trying to estimate a population mean, we choose the sample mean as the sample statistic. The sample mean is:

    x̄ = Σ ( xᵢ ) / n
    x̄ = ( 50 + 55 + 60 + ... + 90 + 90 + 90 ) / 36 = 75

    Therefore, based on data from the simple random sample, we estimate that the mean reading achievement level in the population is equal to 75.

  • Select a confidence level. In this analysis, the confidence level is defined for us in the problem. We are working with a 95% confidence level.

  • Find the margin of error. Elsewhere on this site, we show how to compute the margin of error when the sampling distribution is approximately normal. The key steps are shown below.

    • Find the standard error of the sampling distribution. First, we estimate the variance of the test scores (s²). Then, we compute the standard error (SE). Because n = 36, the divisor for the variance is n - 1 = 35.

      s² = Σ ( xᵢ - x̄ )² / ( n - 1 )
      s² = [ (50 - 75)² + (55 - 75)² + (60 - 75)² + ... + (90 - 75)² + (90 - 75)² ] / 35 = 98.97

      SE = sqrt [ ( 1 - n/N ) * s² / n ] = sqrt [ ( 1 - 36/20,000 ) * 98.97 / 36 ] = 1.66

    • Find the critical value. The critical value is a factor used to compute the margin of error. Because the sample is fairly large (n = 36), the central limit theorem lets us treat the sampling distribution of the mean as approximately normal. Therefore, we express the critical value as a z score. To find the critical value, we take these steps.

      • Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 95/100 = 0.05
      • Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.05/2 = 0.975
      • The critical value is the z score having a cumulative probability equal to 0.975. From the Normal Distribution Calculator, we find that the critical value is 1.96.

    • Compute margin of error (ME): ME = critical value * standard error = 1.96 * 1.66 = 3.25

  • Specify the confidence interval. The range of the confidence interval is defined by the sample statistic ± margin of error. And the uncertainty is denoted by the confidence level.

Therefore, the 95% confidence interval is 71.75 to 78.25, and the margin of error is equal to 3.25. That is, we are 95% confident that the true population mean is in the range defined by 75 ± 3.25.
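The steps in Problem 1 can be checked with a short Python sketch. It uses only the standard library. The variable names are assumptions made for this example, and statistics.NormalDist().inv_cdf stands in for the Normal Distribution Calculator mentioned above.

```python
import math
from statistics import NormalDist, mean, variance

# Test scores from Problem 1.
scores = [
    50, 55, 60, 62, 62, 65, 67, 67, 70, 70, 70, 70, 72, 72, 73, 73, 75, 75,
    75, 78, 78, 78, 78, 80, 80, 80, 82, 82, 85, 85, 85, 88, 88, 90, 90, 90,
]
N = 20000            # population size
n = len(scores)      # sample size
confidence_level = 95

# Sample statistic: the sample mean.
x_bar = mean(scores)

# Sample variance; statistics.variance divides by n - 1.
s2 = variance(scores)

# Standard error for sampling without replacement, variance estimated.
se = math.sqrt((1 - n / N) * s2 / n)

# Critical value: z score with cumulative probability 1 - alpha/2.
alpha = 1 - confidence_level / 100
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

# Margin of error and confidence interval.
margin_of_error = critical_value * se
lower, upper = x_bar - margin_of_error, x_bar + margin_of_error

print(f"mean = {x_bar:.2f}, s2 = {s2:.2f}, SE = {se:.2f}")
print(f"critical value = {critical_value:.2f}, ME = {margin_of_error:.2f}")
print(f"{confidence_level}% confidence interval: {lower:.2f} to {upper:.2f}")
```

Rounded to two decimal places, the results agree with the hand calculation: a mean of 75, a standard error of about 1.66, a margin of error of about 3.25, and an interval of roughly 71.75 to 78.25.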