# Statistics Notation

This appendix describes how symbols are used on the Stat Trek web site to represent numbers, variables, parameters, statistics, etc.

## Capitalization

In general, capital letters refer to population attributes (i.e., parameters); and lower-case letters refer to sample attributes (i.e., statistics). For example,

- *P* refers to a population proportion; and *p*, to a sample proportion.
- *X* refers to a set of population elements; and *x*, to a set of sample elements.
- *N* refers to population size; and *n*, to sample size.

## Greek vs. Roman Letters

Like capital letters, Greek letters refer to population attributes. Their sample counterparts, however, are usually Roman letters. For example,

- μ refers to a population mean; and *x̄*, to a sample mean.
- σ refers to the standard deviation of a population; and *s*, to the standard deviation of a sample.

## Population Parameters

By convention, specific symbols represent certain population parameters. For example,

- μ refers to a population mean.
- σ refers to the standard deviation of a population.
- σ^{2} refers to the variance of a population.
- *P* refers to the proportion of population elements that have a particular attribute.
- *Q* refers to the proportion of population elements that *do not* have a particular attribute, so *Q* = 1 - *P*.
- ρ is the population correlation coefficient, based on all of the elements from a population.
- *N* is the number of elements in a population.
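As a rough illustration of the population symbols above, the following Python sketch computes them with the standard `statistics` module. The data set and the attribute ("value is at least 5") are made up for the example; the list is assumed to hold every element of the population.

```python
import statistics

# Hypothetical population: assume these eight values are ALL of its elements.
population = [2, 4, 4, 4, 5, 5, 7, 9]

N = len(population)                        # N: number of elements in the population
mu = statistics.fmean(population)          # mu: population mean
sigma2 = statistics.pvariance(population)  # sigma^2: population variance (divides by N)
sigma = statistics.pstdev(population)      # sigma: population standard deviation

# P: proportion of elements that have a (hypothetical) attribute, here "value >= 5".
P = sum(1 for value in population if value >= 5) / N
Q = 1 - P                                  # Q: proportion that do not have the attribute

print(N, mu, sigma2, sigma, P, Q)
```

The variable names simply mirror the notation; they are not part of any library.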

## Sample Statistics

By convention, specific symbols represent certain sample statistics. For example,

- x̄ refers to a sample mean.
- *s* refers to the standard deviation of a sample.
- *s*^{2} refers to the variance of a sample.
- *p* refers to the proportion of sample elements that have a particular attribute.
- *q* refers to the proportion of sample elements that *do not* have a particular attribute, so *q* = 1 - *p*.
- *r* is the sample correlation coefficient, based on all of the elements from a sample.
- *n* is the number of elements in a sample.
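The sample symbols can be sketched the same way. This is a minimal example on made-up data; `statistics.variance` and `statistics.stdev` use the usual *n* - 1 divisor for sample estimates, and the attribute ("value is at least 5") is again hypothetical.

```python
import statistics

# Hypothetical sample drawn from some larger population.
sample = [2, 4, 4, 5, 7]

n = len(sample)                   # n: number of elements in the sample
x_bar = statistics.fmean(sample)  # sample mean
s2 = statistics.variance(sample)  # s^2: sample variance (divides by n - 1)
s = statistics.stdev(sample)      # s: sample standard deviation

# p: proportion of sample elements with the (hypothetical) attribute "value >= 5".
p = sum(1 for value in sample if value >= 5) / n
q = 1 - p                         # q: proportion without the attribute

print(n, x_bar, s2, s, p, q)
```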

## Simple Linear Regression

- Β_{0} is the intercept constant in a population regression line.
- Β_{1} is the regression coefficient (i.e., slope) in a population regression line.
- R^{2} refers to the coefficient of determination.
- b_{0} is the intercept constant in a sample regression line.
- b_{1} is the regression coefficient (i.e., slope) in a sample regression line.
- s_{b1} refers to the standard error of the slope of a regression line.
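To make the sample regression symbols concrete, here is a small least-squares sketch in plain Python. The paired data are invented, and the formulas are the standard textbook ones for simple linear regression (with *n* - 2 degrees of freedom for the standard error of the slope), assumed here rather than taken from this appendix.

```python
import math

# Hypothetical paired sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx           # b_1: slope of the sample regression line
b0 = y_bar - b1 * x_bar  # b_0: intercept of the sample regression line

predicted = [b0 + b1 * xi for xi in x]
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, predicted))  # residual sum of squares
ss_tot = sum((yi - y_bar) ** 2 for yi in y)                   # total sum of squares

r_squared = 1 - ss_res / ss_tot                      # R^2: coefficient of determination
s_b1 = math.sqrt(ss_res / (n - 2)) / math.sqrt(sxx)  # s_b1: standard error of the slope

print(b0, b1, r_squared, s_b1)
```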

## Probability

- P(A) refers to the probability that event A will occur.
- P(A|B) refers to the conditional probability that event A occurs, given that event B has occurred.
- P(A') refers to the probability of the complement of event A.
- P(A ∩ B) refers to the probability of the intersection of events A and B.
- P(A ∪ B) refers to the probability of the union of events A and B.
- E(X) refers to the expected value of random variable X.
- b(*x*; *n, P*) refers to binomial probability.
- b\*(*x*; *n, P*) refers to negative binomial probability.
- g(*x*; *P*) refers to geometric probability.
- h(*x*; *N, n, k*) refers to hypergeometric probability.
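Three of the distribution notations above can be sketched as short Python functions. The parameter meanings are assumptions, since this appendix does not define them: *x* is the number of successes (or, for the geometric case, the trial on which the first success occurs), *n* the number of trials or the sample size, *P* the probability of success on one trial, *N* the population size, and *k* the number of successes in the population.

```python
from math import comb


def binomial(x, n, P):
    """b(x; n, P): probability of exactly x successes in n independent trials."""
    return comb(n, x) * P**x * (1 - P) ** (n - x)


def geometric(x, P):
    """g(x; P): probability that the first success occurs on trial x (assumed convention)."""
    return P * (1 - P) ** (x - 1)


def hypergeometric(x, N, n, k):
    """h(x; N, n, k): probability of x successes when n items are drawn without
    replacement from N items, k of which are successes (assumed parameter order)."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


print(binomial(2, 5, 0.5))          # exactly 2 heads in 5 fair coin flips
print(geometric(3, 0.5))            # first head on the third flip
print(hypergeometric(1, 10, 3, 4))  # 1 success in 3 draws from 10 items, 4 of them successes
```

The negative binomial is left out because its parameterization varies between sources.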

## Counting

- n! refers to the factorial value of n.
- _{n}P_{r} refers to the number of permutations of *n* things taken *r* at a time.
- _{n}C_{r} refers to the number of combinations of *n* things taken *r* at a time.
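The counting notation maps directly onto functions in Python's standard `math` module (`math.perm` and `math.comb` require Python 3.8 or later). The values of `n` and `r` below are arbitrary.

```python
import math

n, r = 5, 2  # arbitrary example values

n_factorial = math.factorial(n)  # n!
permutations = math.perm(n, r)   # nPr: ordered selections of r things from n
combinations = math.comb(n, r)   # nCr: unordered selections of r things from n

print(n_factorial, permutations, combinations)  # prints: 120 20 10
```

Permutations count order and combinations do not, so here each combination of 2 things corresponds to 2! = 2 permutations.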

## Set Theory

- A ∩ B refers to the intersection of events A and B.
- A ∪ B refers to the union of events A and B.
- {A, B, C} refers to the set of elements consisting of A, B, and C.
- ∅ (also written { }) refers to the null set.
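Python's built-in `set` type offers a quick way to experiment with these operations. The element values below are arbitrary.

```python
A = {1, 2, 3}
B = {3, 4, 5}
C = {7, 8}

intersection = A & B  # elements in both A and B
union = A | B         # elements in A, in B, or in both
null_set = set()      # the null (empty) set; note that {} would create a dict in Python
no_overlap = A & C    # A and C share no elements, so this is the null set

print(intersection, union, no_overlap == null_set)
```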

## Hypothesis Testing

- H_{0} refers to a null hypothesis.
- H_{1} or H_{a} refers to an alternative hypothesis.
- α refers to the significance level.
- Β refers to the probability of committing a Type II error.

## Random Variables

- *Z* or *z* refers to a standardized score, also known as a z score.
- *z*_{α} refers to the standardized score that has a cumulative probability equal to 1 - α.
- *t*_{α} refers to the t score that has a cumulative probability equal to 1 - α.
- *f*_{α} refers to an f statistic that has a cumulative probability equal to 1 - α.
- *f*_{α}(*v*_{1}, *v*_{2}) is an f statistic with a cumulative probability of 1 - α, and *v*_{1} and *v*_{2} degrees of freedom.
- Χ^{2} refers to a chi-square statistic.
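As a small illustration of *z* and *z*_{α}, the sketch below uses `statistics.NormalDist` from the Python standard library (3.8 or later). The raw score, mean, and standard deviation are invented, and the standardization formula z = (x - μ) / σ is the usual textbook definition, assumed here since this appendix does not state it.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# z_alpha: the standardized score whose cumulative probability is 1 - alpha.
alpha = 0.05
z_alpha = standard_normal.inv_cdf(1 - alpha)  # roughly 1.645

# z score of a hypothetical raw value x from a population with mean mu and SD sigma.
x, mu, sigma = 130, 100, 15
z = (x - mu) / sigma

print(z_alpha, z)
```

The t, f, and chi-square counterparts are not in the standard library; a package such as SciPy would be needed for those.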

## Special Symbols

Throughout the site, certain symbols have special meanings. For example,

- Σ is the summation symbol, used to compute sums over a range of values.
- Σ*x* or Σ*x*_{i} refers to the sum of a set of *n* observations. Thus, Σ*x* = Σ*x*_{i} = *x*_{1} + *x*_{2} + . . . + *x*_{n}.
- *sqrt* refers to the square root function. Thus, sqrt(4) = 2 and sqrt(25) = 5.
- Var(X) refers to the variance of the random variable X.
- SD(X) refers to the standard deviation of the random variable X.
- SE refers to the standard error of a statistic.
- ME refers to the margin of error.
- DF refers to the degrees of freedom.
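Several of these symbols can be tried out in a few lines of Python. The observations are arbitrary, and the random variable is a hypothetical fair six-sided die. The formulas E(X) = Σ x·P(x) and Var(X) = Σ (x - E(X))^{2}·P(x) are the standard definitions for a discrete random variable, assumed here rather than quoted from this appendix.

```python
import math

# Summation over a set of n observations.
observations = [3, 5, 8]
total = sum(observations)  # x_1 + x_2 + x_3

# Square root function.
root_4 = math.sqrt(4)
root_25 = math.sqrt(25)

# Hypothetical discrete random variable X: the outcome of one fair die roll.
outcomes = [1, 2, 3, 4, 5, 6]
prob = 1 / 6  # each outcome is equally likely

expected = sum(value * prob for value in outcomes)                    # E(X)
variance = sum((value - expected) ** 2 * prob for value in outcomes)  # Var(X)
sd = math.sqrt(variance)                                              # SD(X)

print(total, root_4, root_25, expected, variance, sd)
```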