Regression Slope: Confidence Interval
This lesson describes how to construct a confidence interval around the slope of a regression line. We focus on the equation for simple linear regression, which is:
ŷ = b0 + b1x
where b0 is a constant, b1 is the slope (also called the regression coefficient), x is the value of the independent variable, and ŷ is the predicted value of the dependent variable.
Estimation Requirements
The approach described in this lesson is valid whenever the standard requirements for simple linear regression are met.
- The dependent variable Y has a linear relationship to the independent variable X.
- For each value of X, the probability distribution of Y has the same standard deviation σ.
- For any given value of X,
Previously, we described how to verify that regression requirements are met.
The Variability of the Slope Estimate
To construct a confidence interval for the slope of the regression line, we need to know the standard error of the sampling distribution of the slope. Many statistical software packages and some graphing calculators provide the standard error of the slope as a regression analysis output. The table below shows hypothetical output for the following regression equation: ŷ = 76 + 35x.
Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 76 | 30 | 2.53 | 0.01 |
X | 35 | 20 | 1.75 | 0.04 |
In the output above, the standard error of the slope is equal to 20 (the "SE Coef" entry in the row for X). In this example, the standard error is labeled "SE Coef". However, other software packages might use a different label for the standard error. It might be "StDev", "SE", "Std Dev", or something else.
If you need to calculate the standard error of the slope (SE) by hand, use the following formula:
SE = sb1 = sqrt[ Σ(yi - ŷi)² / (n - 2) ] / sqrt[ Σ(xi - x̄)² ]
where yi is the value of the dependent variable for observation i, ŷi is the estimated value of the dependent variable for observation i, xi is the observed value of the independent variable for observation i, x̄ is the mean of the independent variable, and n is the number of observations.
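The hand calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `slope_standard_error` and the five data points are made up for the example, and the intercept and slope are obtained with the standard least-squares estimates, which this lesson does not derive.

```python
import math

def slope_standard_error(x, y):
    """Return (b0, b1, se_b1) for a simple linear regression of y on x.

    Hypothetical helper for illustration; b0 and b1 are the usual
    least-squares estimates, and se_b1 follows the formula in the text.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sum of squared deviations of x from its mean: Σ(xi - x̄)²
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept

    # Sum of squared residuals: Σ(yi - ŷi)²
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return b0, b1, se_b1

# Made-up data, used only to exercise the formula.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, se_b1 = slope_standard_error(x, y)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SE of slope = {se_b1:.4f}")
```

For these five points, Σ(xi - x̄)² = 10 and Σ(yi - ŷi)² = 2.4, so the standard error is sqrt(2.4 / 3) / sqrt(10), or about 0.28.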
How to Find the Confidence Interval for the Slope of a Regression Line
Previously, we described how to construct confidence intervals. For convenience, we repeat the five steps below.
- Choose the confidence level. The confidence level describes the uncertainty of a sampling plan. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.
- Compute the standard error. Many statistical software packages and some graphing calculators provide the standard error of the slope as a regression analysis output. Use that, if you can. If you need to calculate the standard error (SE) of the slope by hand, use the following formula, in which x̄ is the mean of the xi values:
  SE = sb1 = sqrt[ Σ(yi - ŷi)² / (n - 2) ] / sqrt[ Σ(xi - x̄)² ]
- Find the critical value. When calculating the margin of error for a regression slope, use a t-score for the critical value, with degrees of freedom (DF) equal to n - 2. We explained how to find the critical t-score value in the lesson on margin of error.
- Find the margin of error. You can compute the margin of error based on the following equation.
  Margin of error = Critical value * Standard error of statistic
- Specify the confidence interval. The uncertainty is denoted by the confidence level. And the range of the confidence interval is defined by the following equation.
  Confidence interval = Slope ± Margin of error
In the next section, we work through a problem that shows how to use this approach to construct a confidence interval for the slope of a regression line. Note that this approach is used for simple linear regression (one independent variable and one dependent variable).
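The last two steps above reduce to a couple of arithmetic operations, sketched here in Python. The function name `slope_confidence_interval` is hypothetical, and the critical value of 2.0 in the usage line is only a placeholder: the actual t-score depends on the chosen confidence level and on n - 2 degrees of freedom, neither of which is given for the hypothetical output shown earlier.

```python
def slope_confidence_interval(slope, standard_error, critical_value):
    """Return (lower, upper) bounds of a confidence interval for a slope.

    Hypothetical helper: the caller supplies the standard error
    (from software output or the hand formula) and the critical t-score.
    """
    if standard_error < 0 or critical_value < 0:
        raise ValueError("standard error and critical value must be non-negative")

    # Margin of error = Critical value * Standard error of statistic
    margin_of_error = critical_value * standard_error

    # Confidence interval = Slope ± Margin of error
    return slope - margin_of_error, slope + margin_of_error

# Slope 35 and SE 20 come from the hypothetical output table;
# the critical value 2.0 is an assumed placeholder, not a looked-up t-score.
lower, upper = slope_confidence_interval(35, 20, 2.0)
print(f"{lower} to {upper}")
```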
Test Your Understanding
Problem 1
The local utility company surveys 101 randomly selected customers. For each survey participant, the company collects the following: annual electric bill (in dollars) and home size (in square feet). Output from a regression analysis appears below.
Regression equation: Annual bill = 0.55 * Home size + 15
Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 15 | 3 | 5.0 | 0.00 |
Home size | 0.55 | 0.24 | 2.29 | 0.01 |
What is the 99% confidence interval for the slope of the regression line?
(A) 0.25 to 0.85
(B) 0.02 to 1.08
(C) -0.08 to 1.18
(D) 0.20 to 1.30
(E) 0.30 to 1.40
Solution
The correct answer is (C). Use the following five-step approach to construct a confidence interval.
- Select a confidence level. In this analysis, the confidence level is defined for us in the problem. We are working with a 99% confidence level.
- Find the standard error. The standard error is given in the regression output. It is 0.24.
- Find the critical value. The critical value is a factor used to compute the margin of error. With simple linear regression, to compute a confidence interval for the slope, the critical value is a t-score with degrees of freedom equal to n - 2. To find the critical value, we take these steps.
  - Compute alpha (α):
    α = 1 - (confidence level / 100)
    α = 1 - 99/100 = 0.01
  - Find the critical probability (p*):
    p* = 1 - α/2 = 1 - 0.01/2 = 0.995
  - Find the degrees of freedom (df):
    df = n - 2 = 101 - 2 = 99
  - The critical value is the t statistic having 99 degrees of freedom and a cumulative probability equal to 0.995. From the t Distribution Calculator, we find that the critical value is about 2.63.
- Compute margin of error (ME):
ME = critical value * standard error
ME = 2.63 * 0.24 = 0.63
- Specify the confidence interval (CI). The range of the confidence interval is defined by the sample statistic ± margin of error. Here, the sample statistic is the regression slope, 0.55; so the confidence interval is:
  CI = Slope ± ME
  CI = 0.55 ± 0.63
  And the uncertainty is denoted by the confidence level, which is 99%.
Therefore, the 99% confidence interval for this sample is 0.55 ± 0.63, which is -0.08 to 1.18.
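The solution can be checked numerically. The sketch below uses only the Python standard library and finds the critical t-score by integrating the t density with Simpson's rule and then bisecting on the cumulative probability. That numerical approach is an assumption of this example, standing in for the t Distribution Calculator; in practice a statistics library (for example SciPy's `scipy.stats.t.ppf`) would normally supply the value. The helper names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t): 0.5 plus the integral of the density from 0 to t (Simpson's rule)."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(p, df):
    """t-score with cumulative probability p (0.5 <= p < 1), found by bisection."""
    if not 0.5 <= p < 1:
        raise ValueError("p must satisfy 0.5 <= p < 1")
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Values taken from Problem 1.
n = 101
slope = 0.55
standard_error = 0.24
confidence_level = 99

alpha = 1 - confidence_level / 100   # 0.01
p_star = 1 - alpha / 2               # 0.995
df = n - 2                           # 99

t_crit = t_critical(p_star, df)
margin = t_crit * standard_error
lower, upper = slope - margin, slope + margin

print(f"critical value = {t_crit:.2f}")
print(f"margin of error = {margin:.2f}")
print(f"CI = {lower:.2f} to {upper:.2f}")
```

Rounded to two decimals, the printed values should agree with the hand solution: a critical value of about 2.63, a margin of error of about 0.63, and an interval from -0.08 to 1.18.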
If we replicated the same study multiple times with different random samples and computed a confidence interval for each sample, we would expect 99% of the confidence intervals to contain the true slope of the regression line.
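That repeated-sampling interpretation can be illustrated with a small simulation. Everything in the sketch below is an assumption made for the example: the true intercept and slope, the error standard deviation, the range of X, and the number of replications are arbitrary, and the errors are drawn from a normal distribution so that the regression requirements hold. The critical value 2.63 is the rounded t-score for 99 degrees of freedom used in the problem above.

```python
import math
import random

def simulate_coverage(true_intercept=15.0, true_slope=0.55, sigma=2.0,
                      n=101, t_crit=2.63, replications=2000, seed=1):
    """Fraction of simulated 99% intervals that contain the true slope.

    All default values are illustrative assumptions, not data from the lesson.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        # Draw a fresh random sample of n (x, y) pairs.
        x = [rng.uniform(0, 10) for _ in range(n)]
        y = [true_intercept + true_slope * xi + rng.gauss(0, sigma) for xi in x]

        x_bar = sum(x) / n
        y_bar = sum(y) / n
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        b0 = y_bar - b1 * x_bar

        # Standard error of the slope, as in the hand formula.
        sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
        se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)

        # Does slope ± margin of error contain the true slope?
        margin = t_crit * se_b1
        if b1 - margin <= true_slope <= b1 + margin:
            hits += 1
    return hits / replications

coverage = simulate_coverage()
print(f"share of intervals containing the true slope: {coverage:.3f}")
```

With enough replications the printed share should land close to 0.99, though any single run will deviate slightly because of simulation noise.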