Statistics Tutorial: Hypothesis Tests
A statistical hypothesis is an assumption about a population
parameter. This assumption may or may not be true.
The best way to determine whether a statistical hypothesis is true
would be to examine the entire population. Since that is often
impractical, researchers
typically examine a random sample from the population. If sample data
are consistent with the statistical hypothesis, the hypothesis is
accepted; if not, it is rejected.
There are two types of statistical hypotheses.
-
Null hypothesis. The null hypothesis, denoted by
H0, is usually the hypothesis
that sample observations result purely from chance.
-
Alternative hypothesis. The alternative hypothesis,
denoted by H1 or Ha, is the
hypothesis that sample observations are influenced by some non-random cause.
For example, suppose we wanted to determine whether a coin was fair and
balanced. A null hypothesis might be that half the flips would result in Heads
and half, in Tails. The alternative hypothesis might be that the number of
Heads and Tails would be very different. Symbolically, these hypotheses would
be expressed as
H0: P = 0.5
Ha: P ≠ 0.5
Suppose we flipped the coin 50 times,
resulting in 40 Heads and 10 Tails. Given this result, we would be inclined to
reject the null hypothesis and accept the alternative hypothesis.
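To see why 40 Heads in 50 flips looks so damaging to the null hypothesis, it can help to compute how likely a fair coin is to produce a split at least that lopsided. The sketch below is only an illustration, not part of the formal procedure: the function name is hypothetical, and it assumes the flips are independent and that "lopsided" is measured by distance from an even 25/25 split.

```python
from math import comb


def prob_at_least_as_lopsided(heads: int, flips: int) -> float:
    """Probability, for a fair coin, of a Heads count at least as far
    from flips/2 as the observed count (in either direction)."""
    distance = abs(heads - flips / 2)
    # Count the equally likely flip sequences whose Heads total is
    # at least as far from an even split as the observed total.
    favorable = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= distance
    )
    return favorable / 2**flips


probability = prob_at_least_as_lopsided(heads=40, flips=50)
print(f"{probability:.2e}")  # about 2.4e-05
```

A fair coin would produce a result this extreme only about twice in 100,000 experiments, which is why the example rejects the null hypothesis.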
Hypothesis Tests
Statisticians follow a formal process to determine whether to accept or reject a
null hypothesis, based on sample data. This process, called hypothesis
testing, consists of four steps.
-
State the hypotheses. This involves stating the null and alternative
hypotheses. The hypotheses are stated in such a way that they are
mutually exclusive. That is, if one is true, the other must
be false.
-
Formulate an analysis plan. The analysis plan
describes how to use sample data to accept or reject
the null hypothesis.
The accept/reject decision often centers on a single
test statistic.
-
Analyze sample data. Find the value of the test statistic
(mean score, proportion, t-score,
z-score, etc.) described in the analysis plan.
Complete other computations, as required by the plan.
-
Interpret results. Apply the decision rule described in the
analysis plan. If the test statistic supports
the null hypothesis, accept the null hypothesis; otherwise,
reject the null hypothesis.
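The four steps can be sketched in code using the coin example from earlier. This is a minimal illustration under assumptions the text does not specify: the analysis plan is taken to be a one-proportion z-test with a significance level of 0.05, and the decision rule compares a two-sided P-value to that level. The variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: state the hypotheses. H0: P = 0.5, Ha: P != 0.5
p_null = 0.5

# Step 2: formulate an analysis plan (assumed here): one-proportion
# z-test, two-sided, significance level 0.05.
alpha = 0.05

# Step 3: analyze sample data (50 flips, 40 Heads).
flips, heads = 50, 40
p_hat = heads / flips
standard_error = sqrt(p_null * (1 - p_null) / flips)
z_score = (p_hat - p_null) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))

# Step 4: interpret results by applying the decision rule.
reject_null = p_value < alpha

print(f"z = {z_score:.2f}, reject H0: {reject_null}")
```

With these inputs the z-score is roughly 4.24, far enough from zero that the null hypothesis is rejected at the assumed 0.05 level.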
Decision Errors
Two types of errors can result from a hypothesis test.
-
Type I error. A Type I error occurs when the researcher
rejects a null hypothesis when it is true. The probability of committing a Type
I error is called the significance level. This probability is
also called alpha, and is often denoted by α.
-
Type II error. A Type II error occurs when the researcher
accepts a null hypothesis that is false. The probability of committing a Type
II error is called beta, and is often denoted by β. The probability of
not committing a Type II error is called the power of the test.
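Both error probabilities can be computed exactly for a simple coin-flip test. The sketch below assumes a hypothetical decision rule (reject the null hypothesis if 50 flips yield 17 or fewer Heads, or 33 or more) and a hypothetical true probability of Heads of 0.7 when evaluating the Type II error. Neither number comes from the text; they are chosen only to make the definitions concrete.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k Heads in n independent flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(n: int, lower: int, upper: int, p: float) -> float:
    """Probability that the Heads count lands in the rejection region
    (count <= lower or count >= upper) when P(Heads) is really p."""
    return sum(
        binomial_pmf(k, n, p)
        for k in range(n + 1)
        if k <= lower or k >= upper
    )


flips = 50
# Type I error: rejecting H0 (P = 0.5) when it is true.
alpha = rejection_probability(flips, 17, 33, p=0.5)
# Power: rejecting H0 when the (assumed) true value is P = 0.7.
power = rejection_probability(flips, 17, 33, p=0.7)
# Type II error: failing to reject H0 when it is false.
beta = 1 - power

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Widening the rejection region raises the power but also raises alpha, which is the trade-off the two error types describe.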
Decision Rules
The analysis plan includes decision rules for accepting or
rejecting the null hypothesis.
In practice, statisticians describe these decision rules in two
ways - with reference to a P-value or with reference to a
region of acceptance.
-
P-value. The strength of evidence in support of a null hypothesis
is measured by the P-value. Suppose the test
statistic is equal to S. The P-value is the probability
of observing a test statistic at least as extreme as S, assuming
the null hypothesis is true. If the P-value is less than the
significance level, we reject the null hypothesis.
-
Region of acceptance. The region of acceptance
is a range of values. If the test statistic falls within the
region of acceptance, the null hypothesis is accepted. The
region of acceptance is defined so that the chance of making a
Type I error is equal to the significance level.
The set of values outside the region of acceptance is called the
region of rejection. If the test statistic falls
within the region of rejection, the null hypothesis is rejected.
In such cases, we say that the hypothesis has been rejected at
the α level of significance.
These approaches are equivalent. Some statistics texts use the
P-value approach; others use the region of acceptance approach.
In subsequent lessons, this tutorial will present examples that illustrate each approach.
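The equivalence of the two approaches can be checked numerically. The sketch below assumes a two-sided z-test whose test statistic follows a standard normal distribution under the null hypothesis. The function names are hypothetical, and the z-scores in the loop are arbitrary illustrative values.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def reject_by_p_value(z_score: float, alpha: float) -> bool:
    """Reject H0 when the two-sided P-value is below alpha."""
    p_value = 2 * (1 - STANDARD_NORMAL.cdf(abs(z_score)))
    return p_value < alpha


def reject_by_region(z_score: float, alpha: float) -> bool:
    """Reject H0 when z falls outside the region of acceptance,
    which here is the interval [-critical, +critical]."""
    critical = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    return abs(z_score) > critical


# Both rules reach the same decision for every z-score tried.
for z_score in (-3.0, -1.0, 0.0, 1.5, 2.5):
    assert reject_by_p_value(z_score, 0.05) == reject_by_region(z_score, 0.05)
```

The two functions agree because a P-value below alpha is just another way of saying that the test statistic lies beyond the critical value.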
One-Tailed and Two-Tailed Tests
A test of a statistical hypothesis, where the region of rejection is on only one
side of the sampling
distribution, is called a one-tailed test. For
example, suppose the null hypothesis states that the mean is less than or equal
to 10. The alternative hypothesis would be that the mean is greater than 10.
The region of rejection would consist of a range of numbers located on
the right side of the sampling distribution; that is, a set of numbers greater than
10.
A test of a statistical hypothesis, where the region of rejection is on both
sides of the sampling distribution, is called a two-tailed test.
For example, suppose the null hypothesis states that the mean is equal to 10.
The alternative hypothesis would be that the mean is less than 10 or greater
than 10. The region of rejection would consist of a range of numbers
located on both sides of the sampling distribution; that is, the region of
rejection would consist partly of numbers that were less than 10 and partly of
numbers that were greater than 10.
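The boundaries of the region of rejection for the two examples can be computed once a sampling distribution is fixed. The sketch below assumes details the text does not give: a normally distributed sample mean, a known population standard deviation of 2, a sample size of 25, and a significance level of 0.05. The function name and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rejection_bounds(mu_null, sigma, n, alpha, tails):
    """Return (lower, upper) cutoffs for the sample mean. The region of
    rejection lies below `lower` and above `upper`; a cutoff of None
    means there is no rejection region on that side."""
    sampling_distribution = NormalDist(mu_null, sigma / sqrt(n))
    if tails == "right":  # H0: mean <= mu_null, Ha: mean > mu_null
        return None, sampling_distribution.inv_cdf(1 - alpha)
    if tails == "two":  # H0: mean = mu_null, Ha: mean != mu_null
        return (
            sampling_distribution.inv_cdf(alpha / 2),
            sampling_distribution.inv_cdf(1 - alpha / 2),
        )
    raise ValueError("tails must be 'right' or 'two'")


one_tailed = rejection_bounds(10, sigma=2, n=25, alpha=0.05, tails="right")
two_tailed = rejection_bounds(10, sigma=2, n=25, alpha=0.05, tails="two")
print(one_tailed)  # upper cutoff near 10.66
print(two_tailed)  # cutoffs near 9.22 and 10.78
```

Under these assumptions the cutoffs sit some distance from 10, so a sample mean only slightly above or below 10 would not fall in the region of rejection.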