Statistics Notation

This appendix describes how symbols are used on the Stat Trek web site to represent numbers, variables, parameters, statistics, etc.

Capitalization

In general, capital letters refer to population attributes (i.e., parameters); and lower-case letters refer to sample attributes (i.e., statistics). For example,

  • P refers to a population proportion; and p, to a sample proportion.
  • X refers to a set of population elements; and x, to a set of sample elements.
  • N refers to population size; and n, to sample size.

Greek vs. Roman Letters

Like capital letters, Greek letters refer to population attributes. Their sample counterparts, however, are usually Roman letters. For example,

  • μ refers to a population mean; and x̄, to a sample mean.
  • σ refers to the standard deviation of a population; and s, to the standard deviation of a sample.

Population Parameters

By convention, specific symbols represent certain population parameters. For example,

  • μ refers to a population mean.
  • σ refers to the standard deviation of a population.
  • σ² refers to the variance of a population.
  • P refers to the proportion of population elements that have a particular attribute.
  • Q refers to the proportion of population elements that do not have a particular attribute, so Q = 1 - P.
  • ρ is the population correlation coefficient, based on all of the elements from a population.
  • N is the number of elements in a population.
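
As a rough illustration of the symbols above, the following sketch computes each parameter for a small, made-up population using Python's standard `statistics` module. The data values and the "is even" attribute are assumptions chosen only for the example.

```python
import statistics

# Hypothetical population of five values (illustrative data only).
population = [2, 4, 4, 6, 9]

N = len(population)                         # N: population size
mu = statistics.fmean(population)           # μ: population mean
sigma2 = statistics.pvariance(population)   # σ²: population variance (divides by N)
sigma = statistics.pstdev(population)       # σ: population standard deviation

# P: proportion of elements with an attribute (assumed here: value is even).
P = sum(1 for v in population if v % 2 == 0) / N
Q = 1 - P                                   # Q: proportion without the attribute
```

Note that `pvariance` and `pstdev` treat the data as the entire population, which matches the meaning of σ² and σ in the list above.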

Sample Statistics

By convention, specific symbols represent certain sample statistics. For example,

  • x̄ refers to a sample mean.
  • s refers to the standard deviation of a sample.
  • s² refers to the variance of a sample.
  • p refers to the proportion of sample elements that have a particular attribute.
  • q refers to the proportion of sample elements that do not have a particular attribute, so q = 1 - p.
  • r is the sample correlation coefficient, based on all of the elements from a sample.
  • n is the number of elements in a sample.
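
The sample statistics listed above can be sketched in the same way. This example assumes the usual n - 1 denominator for the sample variance, which is what Python's `statistics.variance` and `statistics.stdev` use; the data and the "greater than 4" attribute are hypothetical.

```python
import statistics

# Hypothetical sample of four observations (illustrative data only).
sample = [3, 5, 7, 9]

n = len(sample)                     # n: sample size
x_bar = statistics.fmean(sample)    # sample mean
s2 = statistics.variance(sample)    # s²: sample variance (divides by n - 1)
s = statistics.stdev(sample)        # s: sample standard deviation

# p: proportion of sample elements with an attribute (assumed here: value > 4).
p = sum(1 for v in sample if v > 4) / n
q = 1 - p                           # q: proportion without the attribute
```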

Simple Linear Regression

  • Β₀ is the intercept constant in a population regression line.
  • Β₁ is the regression coefficient (i.e., slope) in a population regression line.
  • R² refers to the coefficient of determination.
  • b₀ is the intercept constant in a sample regression line.
  • b₁ refers to the regression coefficient in a sample regression line (i.e., the slope).
  • sb1 refers to the standard error of the slope of a regression line.
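
To make the sample-side symbols concrete, here is a minimal least-squares sketch in plain Python. It assumes ordinary least squares with one predictor and the common formula for the standard error of the slope (residual standard deviation with n - 2 degrees of freedom, divided by the square root of the sum of squared x deviations). The data and variable names are hypothetical.

```python
import math

# Hypothetical paired observations (illustrative data only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_total = sum((yi - y_bar) ** 2 for yi in y)

b1 = s_xy / s_xx            # slope of the sample regression line
b0 = y_bar - b1 * x_bar     # intercept of the sample regression line

# Residual sum of squares around the fitted line.
ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - ss_resid / ss_total                      # coefficient of determination
s_b1 = math.sqrt(ss_resid / (n - 2)) / math.sqrt(s_xx)   # standard error of the slope
```

The population values of the intercept and slope are not computed here; they are the unknown quantities that the sample intercept and slope estimate.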

Probability

Counting

  • n! refers to the factorial value of n.
  • ₙPᵣ refers to the number of permutations of n things taken r at a time.
  • ₙCᵣ refers to the number of combinations of n things taken r at a time.
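
The three counting quantities above map directly onto functions in Python's `math` module (`math.perm` and `math.comb` require Python 3.8 or later). The values n = 5 and r = 2 are arbitrary choices for this sketch.

```python
import math

n, r = 5, 2   # arbitrary example values

n_factorial = math.factorial(n)    # n! = 5 * 4 * 3 * 2 * 1
permutations = math.perm(n, r)     # ordered selections of r items from n
combinations = math.comb(n, r)     # unordered selections of r items from n

# The same counts written out with factorials.
perm_by_formula = math.factorial(n) // math.factorial(n - r)
comb_by_formula = perm_by_formula // math.factorial(r)
```

Each permutation count is r! times the corresponding combination count, because every unordered selection of r items can be arranged in r! orders.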

Set Theory

Hypothesis Testing

Random Variables

  • Z or z refers to a standardized score, also known as a z score.
  • zα refers to the standardized score that has a cumulative probability equal to 1 - α.
  • tα refers to the t score that has a cumulative probability equal to 1 - α.
  • fα refers to an f statistic that has a cumulative probability equal to 1 - α.
  • fα(v₁, v₂) is an f statistic with a cumulative probability of 1 - α, and v₁ and v₂ degrees of freedom.
  • Χ² refers to a chi-square statistic.
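
As a small sketch of the z notation above, the example below standardizes one observation and then looks up zα for α = 0.05 with `statistics.NormalDist` from the Python standard library. The mean, standard deviation, observation, and α are all assumed values. The standard library has no t or f distribution, so tα and fα are not shown.

```python
from statistics import NormalDist

# Standardized score for a hypothetical observation.
mu, sigma = 100, 15     # assumed population mean and standard deviation
x = 130                 # assumed observation
z = (x - mu) / sigma    # z score: distance from the mean in standard deviations

# z_alpha: the z score whose cumulative probability equals 1 - alpha.
alpha = 0.05
z_alpha = NormalDist().inv_cdf(1 - alpha)

# Check: the cumulative probability at z_alpha should be 1 - alpha.
cumulative = NormalDist().cdf(z_alpha)
```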

Special Symbols

Throughout the site, certain symbols have special meanings. For example,

  • Σ is the summation symbol, used to compute sums over a range of values.
  • Σx or Σxᵢ refers to the sum of a set of n observations. Thus, Σxᵢ = Σx = x₁ + x₂ + . . . + xₙ.
  • sqrt refers to the square root function. Thus, sqrt(4) = 2 and sqrt(25) = 5.
  • Var(X) refers to the variance of the random variable X.
  • SD(X) refers to the standard deviation of the random variable X.
  • SE refers to the standard error of a statistic.
  • ME refers to the margin of error.
  • DF refers to the degrees of freedom.
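
A few of these symbols can be illustrated together. The sketch below assumes the statistic of interest is a sample mean, for which a common estimate of the standard error is s / sqrt(n) with n - 1 degrees of freedom. That choice, the data, and the variable names are assumptions made for the example, since SE and DF depend on the statistic being used.

```python
import math
import statistics

# Hypothetical sample of n observations (illustrative data only).
x = [4, 8, 6, 2]
n = len(x)

total = sum(x)           # Σx: sum of the n observations
root = math.sqrt(25)     # sqrt: square root function

s = statistics.stdev(x)  # sample standard deviation (n - 1 denominator)
se = s / math.sqrt(n)    # SE of the sample mean (assumed statistic)
df = n - 1               # DF for a one-sample mean (assumed setting)
```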