Stat Trek: Teach yourself statistics
 
   
   
 

Statistics Tutorial: Measures of Variability

Some parameters describe the amount of variation among the values in a set of data. For example, consider a population of four observations {5, 5, 5, 5}. Here, all of the observations are equal, so there is no variation. The set {3, 5, 5, 7}, on the other hand, has some variation, since some observations differ from others.

In this lesson, we discuss three parameters that are used to quantify the amount of variation in a set of observations: the range, the variance, and the standard deviation.

Notation

The following notation is helpful when we talk about variability.

  • $\sigma^2$: The variance of the population.
  • $\sigma$: The standard deviation of the population.
  • $s^2$: The variance of the sample.
  • $s$: The standard deviation of the sample.
  • $\mu$: The population mean.
  • $\bar{x}$: The sample mean.
  • $N$: Number of observations in the population.
  • $n$: Number of observations in the sample.
  • $P$: The proportion of elements in the population that have a particular attribute.
  • $p$: The proportion of elements in the sample that have a particular attribute.
  • $Q$: The proportion of elements in the population that do not have a specified attribute. Note that $Q = 1 - P$.
  • $q$: The proportion of elements in the sample that do not have a specified attribute. Note that $q = 1 - p$.

Note that capital letters and Greek letters refer to population parameters, and lower-case Roman letters refer to sample statistics.

The Range

The range is the simplest measure of variation. It is the difference between the largest and smallest values in the set.

Range = Maximum value - Minimum value

Therefore, the range of the four observations {3, 5, 5, 7} would be 7 - 3 = 4.
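
The range calculation above can be sketched in a few lines of Python. This is only an illustration; the function name value_range is a hypothetical choice and does not come from the lesson.

```python
def value_range(values):
    """Return the range: maximum value minus minimum value."""
    if not values:
        raise ValueError("the range is undefined for an empty set")
    return max(values) - min(values)


print(value_range([3, 5, 5, 7]))  # 7 - 3 = 4
print(value_range([5, 5, 5, 5]))  # all values are equal, so the range is 0
```

Because the range depends only on the two extreme values, it ignores how the remaining values are spread between them.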

Variance of a Random Variable

It is important to distinguish between the variance of a population and the variance of a sample. They have different notation, and they are computed differently. The variance of a population is denoted by $\sigma^2$; and the variance of a sample, by $s^2$.

The variance of a population is the average squared deviation from the population mean, as defined by the following formula:

$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$$

where $\sigma^2$ is the population variance, $\mu$ is the population mean, $X_i$ is the ith element from the population, and $N$ is the number of elements in the population.

The variance of a sample is defined by a slightly different formula:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

where $s^2$ is the sample variance, $\bar{x}$ is the sample mean, $x_i$ is the ith element from the sample, and $n$ is the number of elements in the sample. Dividing by $n - 1$ rather than $n$ makes the sample variance an unbiased estimate of the true population variance when the sample is drawn at random. Therefore, if you need to estimate the unknown population variance, based on known data from a sample, this is the formula to use.
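
As a minimal sketch, the two variance formulas can be written directly in Python. The function names (mean, population_variance, sample_variance) are hypothetical and chosen only for this illustration; the data set is the {3, 5, 5, 7} example from the start of the lesson.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the sample mean, dividing by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample variance needs at least two observations")
    x_bar = mean(values)
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [3, 5, 5, 7]
print(population_variance(data))  # 8 / 4 = 2.0
print(sample_variance(data))      # 8 / 3, about 2.667
```

The only difference between the two functions is the divisor, which is why the sample variance is always somewhat larger than the population variance computed from the same numbers.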

Example 1

A population consists of four observations: {1, 3, 5, 7}. What is the variance?

Solution: First, we need to compute the population mean.

$$\mu = \frac{1 + 3 + 5 + 7}{4} = 4$$

Then we plug all of the known values into the formula for the variance of a population, as shown below:

$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$$

$$\sigma^2 = \frac{(1 - 4)^2 + (3 - 4)^2 + (5 - 4)^2 + (7 - 4)^2}{4}$$

$$\sigma^2 = \frac{(-3)^2 + (-1)^2 + (1)^2 + (3)^2}{4}$$

$$\sigma^2 = \frac{9 + 1 + 1 + 9}{4} = \frac{20}{4} = 5$$

Example 2

A sample consists of four observations: {1, 3, 5, 7}. What is the variance?

Solution: This problem is handled exactly like the previous problem, except that we use the formula for calculating sample variance, rather than the formula for calculating population variance.

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

$$s^2 = \frac{(1 - 4)^2 + (3 - 4)^2 + (5 - 4)^2 + (7 - 4)^2}{4 - 1}$$

$$s^2 = \frac{(-3)^2 + (-1)^2 + (1)^2 + (3)^2}{3}$$

$$s^2 = \frac{9 + 1 + 1 + 9}{3} = \frac{20}{3} \approx 6.667$$
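
The results of Example 1 and Example 2 can be cross-checked with Python's standard statistics module, which provides pvariance (dividing by N) and variance (dividing by n - 1). This is offered only as a quick check; the variable names are arbitrary.

```python
import statistics

data = [1, 3, 5, 7]

# Population variance: the observations are treated as the whole population.
pop_var = statistics.pvariance(data)

# Sample variance: the observations are treated as a sample, so the divisor is n - 1.
samp_var = statistics.variance(data)

print(pop_var)             # equals 20 / 4
print(round(samp_var, 3))  # 20 / 3, rounded to 6.667
```

The same four numbers give two different answers, which shows why it matters whether the data are a full population or a sample.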


Variance of a Proportion

The variance formulas introduced in the previous section can be used for any random variable, including proportions. However, for proportions the formulas can be expressed in a form that is easier to compute.

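With an infinite population or when sampling with replacement, the variance of a proportion, computed from a known population proportion, is defined by the following formula:

$$\sigma^2 = \frac{PQ}{n}$$

where $P$ is the population proportion, $Q$ equals $1 - P$, and $n$ is the sample size. Note that this is the variance of the sample proportion across repeated samples of size $n$, which is why the formula depends on $n$.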

Given the same constraints (infinite population or sampling with replacement), when the population proportion is unknown, the variance is estimated from the sample by a slightly different formula:

$$s^2 = \frac{pq}{n - 1}$$

where $n$ is the number of elements in the sample, $p$ is the sample estimate of the true proportion, and $q$ is equal to $1 - p$. Using this formula, the sample variance can be considered an unbiased estimate of the true variance of the proportion. Therefore, if you need to estimate the unknown variance, based on known data from a sample, this is the formula to use.

Warning: Many introductory statistics texts present only the formula for the variance of the population proportion. Some use the population formula when it would be more correct to use the sample formula. If the sample size is very large, both formulas give similar results; but when the sample size is small, it is better to use the sample formula.
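
To illustrate the warning, the two proportion formulas can be compared side by side in Python. This is a sketch under the same assumptions stated above (infinite population or sampling with replacement); the function names and the example values 0.5, 10, and 1000 are hypothetical and chosen only for illustration.

```python
def proportion_variance_known(P, n):
    """Variance of a proportion from a known population proportion: PQ / n."""
    if not 0 <= P <= 1 or n < 1:
        raise ValueError("need 0 <= P <= 1 and n >= 1")
    return P * (1 - P) / n


def proportion_variance_estimated(p, n):
    """Variance of a proportion estimated from a sample: pq / (n - 1)."""
    if not 0 <= p <= 1 or n < 2:
        raise ValueError("need 0 <= p <= 1 and n >= 2")
    return p * (1 - p) / (n - 1)


for n in (10, 1000):
    known = proportion_variance_known(0.5, n)
    estimated = proportion_variance_estimated(0.5, n)
    # The ratio of the two results is n / (n - 1), which approaches 1 as n grows.
    print(n, known, estimated, estimated / known)
```

With n = 10 the two results differ by about 11 percent; with n = 1000 the difference is about 0.1 percent.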

Standard Deviation of a Random Variable

The standard deviation is the square root of the variance. It is important to distinguish between the standard deviation of a population and the standard deviation of a sample. They have different notation, and they are computed differently. The standard deviation of a population is denoted by σ; and the standard deviation of a sample, by s.

The standard deviation of a population is defined by the following formula:

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$$

where $\sigma$ is the population standard deviation, $\mu$ is the population mean, $X_i$ is the ith element from the population, and $N$ is the number of elements in the population.

The standard deviation of a sample is defined by a slightly different formula:

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

where $s$ is the sample standard deviation, $\bar{x}$ is the sample mean, $x_i$ is the ith element from the sample, and $n$ is the number of elements in the sample. Because $s$ is the square root of the unbiased sample variance, it is the usual estimate of the true population standard deviation. Strictly speaking, $s^2$ is unbiased but $s$ itself slightly underestimates $\sigma$ on average, especially in small samples. Nevertheless, if you need to estimate the unknown population standard deviation, based on known data from a sample, this is the formula to use.
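
As a brief sketch, both standard deviations can be computed in Python by taking the square root of the corresponding variance. The function names are hypothetical; the data are the four observations {1, 3, 5, 7} used in Example 1 and Example 2.

```python
import math


def population_std_dev(values):
    """Square root of the population variance (divisor N)."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))


def sample_std_dev(values):
    """Square root of the sample variance (divisor n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample standard deviation needs at least two observations")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


data = [1, 3, 5, 7]
print(round(population_std_dev(data), 3))  # sqrt(5), about 2.236
print(round(sample_std_dev(data), 3))      # sqrt(20 / 3), about 2.582
```

Python's standard statistics module offers pstdev and stdev for the same two calculations.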


    

The contents of this webpage are copyright © 2010 StatTrek.com. All Rights Reserved.