Stat Trek Statistics Notation

This web page describes how symbols are used on the Stat Trek web site to represent numbers, variables, parameters, statistics, etc.

Capitalization

In general, capital letters refer to population attributes (i.e., parameters), and lower-case letters refer to sample attributes (i.e., statistics). For example,

P refers to a population proportion; and p, to a sample proportion.
X refers to a set of population elements; and x, to a set of sample elements.
N refers to population size; and n, to sample size.

Greek vs. Roman Letters

Like capital letters, Greek letters refer to population attributes. Their sample counterparts, however, are usually Roman letters. For example,

μ refers to a population mean; and x̄, to a sample mean.
σ refers to the standard deviation of a population; and s, to the standard deviation of a sample.

Population Parameters

By convention, specific symbols represent certain population parameters. For example,

μ refers to a population mean.
σ refers to the standard deviation of a population.
σ² refers to the variance of a population.
P refers to the proportion of population elements that have a particular attribute.
Q refers to the proportion of population elements that do not have a particular attribute, so Q = 1 - P.
ρ is the population correlation coefficient, based on all of the elements from a population.
N is the number of elements in a population.
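
As an illustration, the population symbols above can be computed for a small data set. This is a minimal sketch: the data values and the attribute ("value is at least 5") are hypothetical, and the variance uses the usual population formula that divides by N.

```python
import math

# Hypothetical population of 8 elements (made up for illustration).
population = [2, 4, 4, 4, 5, 5, 7, 9]

N = len(population)                                     # population size
mu = sum(population) / N                                # population mean
sigma_sq = sum((x - mu) ** 2 for x in population) / N   # population variance
sigma = math.sqrt(sigma_sq)                             # population standard deviation

# Assumed attribute: an element "has the attribute" when its value is >= 5.
P = sum(1 for x in population if x >= 5) / N            # proportion with the attribute
Q = 1 - P                                               # proportion without it

print(N, mu, sigma_sq, sigma, P, Q)
```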

Sample Statistics

By convention, specific symbols represent certain sample statistics. For example,

x̄ refers to a sample mean.
s refers to the standard deviation of a sample.
s² refers to the variance of a sample.
p refers to the proportion of sample elements that have a particular attribute.
q refers to the proportion of sample elements that do not have a particular attribute, so q = 1 - p.
r is the sample correlation coefficient, based on all of the elements from a sample.
n is the number of elements in a sample.
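
The sample symbols can be sketched the same way with Python's standard statistics module. The data and the attribute ("value is greater than 5") are hypothetical. Note one assumption: statistics.variance and statistics.stdev divide by n - 1, the common convention for sample statistics, which this page does not itself spell out.

```python
import statistics

# Hypothetical sample of 4 observations (made up for illustration).
sample = [4, 8, 6, 2]

n = len(sample)                      # sample size
x_bar = statistics.mean(sample)      # sample mean
s_sq = statistics.variance(sample)   # sample variance (divides by n - 1)
s = statistics.stdev(sample)         # sample standard deviation

# Assumed attribute: an observation "has the attribute" when its value is > 5.
p = sum(1 for x in sample if x > 5) / n   # sample proportion with the attribute
q = 1 - p                                 # sample proportion without it

print(n, x_bar, s_sq, s, p, q)
```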

Simple Linear Regression

Β₀ is the intercept constant in a population regression line.
Β₁ is the regression coefficient (i.e., slope) in a population regression line.
R² refers to the coefficient of determination.
b₀ is the intercept constant in a sample regression line.
b₁ refers to the regression coefficient in a sample regression line (i.e., the slope).
s_b₁ refers to the standard error of the slope of a regression line.
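
For illustration, the sample regression quantities can be computed with the standard least-squares formulas. Those formulas are not given on this page, so treat this as a sketch under that assumption; the data are hypothetical.

```python
import math

# Hypothetical paired observations (made up for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_mean) ** 2 for xi in x)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

b1 = s_xy / s_xx              # slope of the sample regression line
b0 = y_mean - b1 * x_mean     # intercept of the sample regression line

# Residual and total sums of squares.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_mean) ** 2 for yi in y)

r_squared = 1 - sse / sst                         # coefficient of determination
s_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(s_xx) # standard error of the slope

print(b0, b1, r_squared, s_b1)
```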

Probability

P(A) refers to the probability that event A will occur.
P(A|B) refers to the conditional probability that event A occurs, given that event B has occurred.
P(A') refers to the probability of the complement of event A.
P(A ∩ B) refers to the probability of the intersection of events A and B.
P(A ∪ B) refers to the probability of the union of events A and B.
E(X) refers to the expected value of random variable X.
b(x; n, P) refers to binomial probability.
b*(x; n, P) refers to negative binomial probability.
g(x; P) refers to geometric probability.
h(x; N, n, k) refers to hypergeometric probability.
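
The binomial, geometric, and hypergeometric entries can be sketched as short functions. The argument meanings are assumptions, since this page does not define them: x successes in n trials with success probability P for the binomial; first success on trial x for the geometric; and x successes in a sample of n drawn without replacement from a population of N that contains k successes for the hypergeometric.

```python
import math

def binomial(x, n, P):
    """b(x; n, P): probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * P ** x * (1 - P) ** (n - x)

def geometric(x, P):
    """g(x; P): probability that the first success occurs on trial x (assumed meaning)."""
    return P * (1 - P) ** (x - 1)

def hypergeometric(x, N, n, k):
    """h(x; N, n, k): probability of x successes in a sample of n drawn without
    replacement from N elements, k of which are successes (assumed meaning)."""
    return math.comb(k, x) * math.comb(N - k, n - x) / math.comb(N, n)

print(binomial(2, 5, 0.5))
print(geometric(3, 0.5))
print(hypergeometric(1, 10, 3, 4))
```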

Counting

n! refers to the factorial value of n.
_nP_r refers to the number of permutations of n things taken r at a time.
_nC_r refers to the number of combinations of n things taken r at a time.
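
As a quick check of the counting notation, Python's math module (version 3.8 or later) provides all three quantities directly. The values n = 5 and r = 2 are arbitrary.

```python
import math

n, r = 5, 2  # arbitrary example values

n_factorial = math.factorial(n)   # n!
permutations = math.perm(n, r)    # permutations of n things taken r at a time
combinations = math.comb(n, r)    # combinations of n things taken r at a time

print(n_factorial, permutations, combinations)
```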

Set Theory

A ∩ B refers to the intersection of sets A and B.
A ∪ B refers to the union of sets A and B.
{A, B, C} refers to the set of elements consisting of A, B, and C.
∅ refers to the null set.
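
These operations map directly onto Python's built-in set type. A minimal sketch, with hypothetical element values:

```python
# Hypothetical sets (made up for illustration).
A = {1, 2, 3}
B = {3, 4}

intersection = A & B   # elements in both A and B
union = A | B          # elements in A, in B, or in both
null_set = set()       # the null (empty) set; note that {} would create a dict

print(intersection, union, null_set)
```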

Hypothesis Testing

H₀ refers to a null hypothesis.
H₁ or H_a refers to an alternative hypothesis.
α refers to the significance level.
β refers to the probability of committing a Type II error.

Random Variables

Z or z refers to a standardized score, also known as a z-score.
z_α refers to the standardized score that has a cumulative probability equal to 1 - α.
t_α refers to the t statistic that has a cumulative probability equal to 1 - α.
f_α refers to an f statistic that has a cumulative probability equal to 1 - α.
f_α(v₁, v₂) is an f statistic with a cumulative probability of 1 - α, and v₁ and v₂ degrees of freedom.
χ² refers to a chi-square statistic.
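
To illustrate the z notation, the sketch below computes a z-score and the critical value z_α with the standard library's NormalDist class. The z-score formula (x minus the mean, divided by the standard deviation) is the usual definition and is assumed here; the numbers are hypothetical. The t and f critical values need a third-party library and are not shown.

```python
from statistics import NormalDist

# Hypothetical population mean, standard deviation, and raw score.
mu, sigma = 100, 15
x = 130

z = (x - mu) / sigma   # standardized score (z-score)

alpha = 0.05
# z_alpha: the standardized score whose cumulative probability is 1 - alpha.
z_alpha = NormalDist().inv_cdf(1 - alpha)

print(z, z_alpha)
```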

Special Symbols

Throughout the site, certain symbols have special meanings. For example,

Σ is the summation symbol, used to compute sums over a range of values.
Σx or Σx_i refers to the sum of a set of n observations. Thus, Σx_i = Σx = x₁ + x₂ + . . . + x_n.
sqrt refers to the square root function. Thus, sqrt(4) = 2 and sqrt(25) = 5.
Var(X) refers to the variance of the random variable X.
SD(X) refers to the standard deviation of the random variable X.
SE refers to the standard error of a statistic.
ME refers to the margin of error.
DF refers to the degrees of freedom.
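
Several of these symbols can be seen working together. The sketch below uses a fair six-sided die as a hypothetical random variable X, and assumes the standard definitions for a discrete random variable: E(X) is the probability-weighted sum of the outcomes, Var(X) is the probability-weighted sum of squared deviations from E(X), and SD(X) is the square root of Var(X).

```python
import math
from fractions import Fraction

# Hypothetical random variable X: the face shown by a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)  # each outcome is equally likely

# The summation symbol corresponds to Python's sum().
expected = sum(x * prob for x in outcomes)                     # E(X)
variance = sum((x - expected) ** 2 * prob for x in outcomes)   # Var(X)
sd = math.sqrt(variance)                                       # SD(X) = sqrt(Var(X))

print(expected, variance, sd)
```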