How to Measure Variability in a Data Set
In this lesson, we discuss three measures that are used to quantify the
amount of variation in a data set: the range, the variance, and
the standard deviation.
For example, consider a population of elements {5, 5, 5, 5}. Here, every value in the data set is equal, so there is no variation.
The set {3, 5, 5, 7}, on the other hand, has some variation, since some elements
in the data set have different values.
Notation
The following notation is helpful when we talk about variability.

q: The proportion of elements in the sample that do not have a specified
attribute. Note that q = 1 - p.
Note that capital letters refer to population
parameters, and lowercase letters refer to sample
statistics.
The Range
The range is the simplest measure of variation. It is the
difference between the largest and smallest values in the data set.
Range = Maximum value - Minimum value
Therefore, the range of the four values {3, 5, 5, 7} would be 7 minus 3, or
4.
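As a quick check, the range calculation can be sketched in Python. The helper name `value_range` is an illustrative assumption, not something defined in this lesson.

```python
def value_range(values):
    """Return the range: the maximum value minus the minimum value."""
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


print(value_range([3, 5, 5, 7]))  # 7 - 3 = 4
print(value_range([5, 5, 5, 5]))  # 0, because every value is the same
```

Because the range depends only on the two extreme values, it ignores how the remaining values are spread between them.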
Variance
It is important to distinguish between the variance of a population and the
variance of a sample. They have different notation, and they are computed
differently. The variance of a population is denoted by σ^{2};
and the variance of a sample, by s^{2}.
The variance of a population is the average squared
deviation from the population mean, as defined by the following formula:
σ^{2} = Σ ( X_{i} - μ )^{2} / N
where σ^{2} is the population variance, μ is the population mean, X_{i} is the ith element
from the population, and N is the number of elements in the population.
The variance of a sample is defined by a slightly different formula:
s^{2} = Σ ( x_{i} - x̄ )^{2} / ( n - 1 )
where s^{2} is the sample variance, x̄ is
the sample mean, x_{i} is the ith element from the sample, and n
is the number of elements in the sample. If you are working with a simple random
sample, the sample
variance can be considered an unbiased estimate of the true population
variance. Therefore, if you want to estimate the unknown population variance,
based on known data from a simple random sample, use this formula.
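The two formulas differ only in the divisor, which a short Python sketch makes concrete. The function names `population_variance` and `sample_variance` are illustrative assumptions, and the data set {3, 5, 5, 7} is the one used at the start of this lesson.

```python
def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the sample mean, dividing by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [3, 5, 5, 7]                         # mean is 5
print(population_variance(data))            # (4 + 0 + 0 + 4) / 4 = 2.0
print(round(sample_variance(data), 3))      # (4 + 0 + 0 + 4) / 3 = 2.667
```

Dividing by n - 1 rather than N always gives the larger result, and the gap shrinks as the number of observations grows.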
Example 1
A population consists of four observations: {1, 3, 5, 7}. What is the variance?
Solution: First, we need to compute the population mean.
μ = ( 1 + 3 + 5 + 7 ) / 4 = 4
Then we plug all of the known values into the formula for the variance of a
population, as shown below:
σ^{2} = Σ ( X_{i} - μ )^{2} / N
σ^{2} = [ ( 1 - 4 )^{2} + ( 3 - 4 )^{2} + ( 5 - 4 )^{2} + ( 7 - 4 )^{2} ] / 4
σ^{2} = [ ( -3 )^{2} + ( -1 )^{2} + ( 1 )^{2} + ( 3 )^{2} ] / 4
σ^{2} = [ 9 + 1 + 1 + 9 ] / 4 = 20 / 4 = 5
Example 2
A simple random sample consists of four observations: {1, 3, 5, 7}. What is
the best estimate of the population variance?
Solution: This problem is handled exactly like the previous problem,
except that we use the formula for calculating sample variance, rather than the
formula for calculating population variance.
s^{2} = Σ ( x_{i} - x̄ )^{2} / ( n - 1 )
s^{2} = [ ( 1 - 4 )^{2} + ( 3 - 4 )^{2} + ( 5 - 4 )^{2} + ( 7 - 4 )^{2} ] / ( 4 - 1 )
s^{2} = [ ( -3 )^{2} + ( -1 )^{2} + ( 1 )^{2} + ( 3 )^{2} ] / 3
s^{2} = [ 9 + 1 + 1 + 9 ] / 3 = 20 / 3 ≈ 6.667
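Both worked examples can be checked with Python's standard `statistics` module, where `pvariance` divides by N and `variance` divides by n - 1. This is only a verification sketch; the variable names are our own.

```python
import statistics

data = [1, 3, 5, 7]

# Example 1: treat the four observations as the whole population (divide by N).
pop_var = statistics.pvariance(data)

# Example 2: treat the same observations as a sample (divide by n - 1).
samp_var = statistics.variance(data)

print(float(pop_var))       # 5.0
print(round(samp_var, 3))   # 6.667
```

The same four numbers give two different answers, so the choice of function should follow from whether the data are a full population or a sample drawn from one.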
Standard Deviation
The standard deviation is the square root of the variance. It
is important to distinguish between the standard deviation of a population and
the standard deviation of a sample. They have different notation, and they are
computed differently. The standard deviation of a population is denoted by σ;
and the standard deviation of a sample, by s.
The standard deviation of a population is defined by the following formula:
$$\sigma = \sqrt{\sigma^{2}} = \sqrt{\frac{\sum (X_{i} - \mu)^{2}}{N}}$$
where σ is the population standard deviation, μ
is the population mean, X_{i} is the ith element
from the population, and N is the number of elements in the population.
The standard deviation of a sample is defined by a slightly different formula:
$$s = \sqrt{s^{2}} = \sqrt{\frac{\sum (x_{i} - \bar{x})^{2}}{n - 1}}$$
where s is the sample standard deviation, x̄ is
the sample mean, x_{i} is the ith element from the sample, and n
is the number of elements in the sample. Using this formula, the sample
standard deviation serves as the usual estimate of the true
population standard deviation. (Strictly speaking, s^{2} is an unbiased estimate of σ^{2},
but taking the square root makes s a slightly biased estimate of σ.) Therefore, if you need to estimate the standard deviation
of the population, based on known data from a simple random sample, this is the
formula to use.
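To illustrate, the sketch below takes the square root of each variance for the data set {1, 3, 5, 7} used in the examples above, then compares the results with `pstdev` and `stdev` from Python's standard `statistics` module. The variable names are our own.

```python
import math
import statistics

data = [1, 3, 5, 7]

# Standard deviation is the square root of the corresponding variance.
sigma = math.sqrt(statistics.pvariance(data))  # sqrt(5), population form
s = math.sqrt(statistics.variance(data))       # sqrt(20 / 3), sample form

print(round(sigma, 3))  # 2.236
print(round(s, 3))      # 2.582

# The library's dedicated functions agree with the square-root route.
assert math.isclose(sigma, statistics.pstdev(data))
assert math.isclose(s, statistics.stdev(data))
```

Unlike the variance, the standard deviation is expressed in the same units as the original data, which often makes it easier to interpret.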
Variance of a Proportion
The variance formulas introduced in the previous section can be used with
confidence for any random variable, even proportions. However, for proportions
the formulas can be expressed in a form that is easier to compute.
When all of the elements of the population are known,
the variance of a population proportion is defined by the following formula:
σ^{2} = PQ
where P is the population proportion and Q equals 1 - P.
When the population proportion is estimated from sample data,
the variance of the sample proportion is estimated by a slightly different
formula:
s^{2} = pq
where p is the sample estimate
of the true proportion, and q is equal to 1 - p. Given a simple random sample,
this sample
variance can be considered a nearly unbiased estimate of the true population
variance when the sample is large (on average, pq underestimates PQ by a factor of (n - 1)/n). Therefore, if you need to estimate the unknown population variance,
based on known data from a simple random sample, this is the formula to use.
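One way to see why the shortcut works is to code each element as 1 (has the attribute) or 0 (does not) and apply the general population variance formula. The sketch below does this for a made-up population of ten elements; the data and variable names are hypothetical.

```python
import math
import statistics

# Hypothetical population: 1 = has the attribute, 0 = does not (3 of 10).
population = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

P = sum(population) / len(population)  # population proportion, 0.3
Q = 1 - P                              # 0.7

shortcut = P * Q                             # the PQ formula
general = statistics.pvariance(population)   # general formula, divides by N

print(round(shortcut, 4))  # 0.21
print(round(general, 4))   # 0.21
assert math.isclose(shortcut, general)
```

The two results agree because, for 0/1 data, the average squared deviation from the mean simplifies algebraically to P(1 - P).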
Standard Deviation of a Proportion
The standard deviation of a proportion is the square root of the variance of the proportion.
Thus, the standard deviation of a population proportion is:
$$\sigma = \sqrt{\sigma^{2}} = \sqrt{PQ}$$
where P is the population proportion and Q equals 1 - P.
And, using sample data, the standard deviation of a population proportion can be estimated
from the following formula:
$$s = \sqrt{s^{2}} = \sqrt{pq}$$
where p is the sample proportion and q equals 1 - p.
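A minimal Python sketch of this calculation follows. The function name `proportion_std_dev` and the example proportions are illustrative assumptions.

```python
import math


def proportion_std_dev(p):
    """Return sqrt(p * q), where q = 1 - p, for a proportion p in [0, 1]."""
    if not 0 <= p <= 1:
        raise ValueError("a proportion must lie between 0 and 1")
    return math.sqrt(p * (1 - p))


print(round(proportion_std_dev(0.3), 4))  # sqrt(0.21) = 0.4583
print(proportion_std_dev(0.5))            # sqrt(0.25) = 0.5
print(proportion_std_dev(0.0))            # 0.0, no variation at all
```

The result is largest when p = 0.5 and falls to zero as p approaches 0 or 1, where nearly every element has the same value.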