### Linear Regression

#### Introduction

#### Simple Regression

- Linear Regression
- Regression Example
- Residual Analysis
- Transformations
- Influential Points
- Slope Estimate
- Slope Test

#### Multiple Regression

# Regression Slope: Confidence Interval

This lesson describes how to construct a confidence interval around the slope of a regression line. We focus on the equation for simple linear regression, which is:

$\hat{y} = b_0 + b_1 x$

where $b_0$ is a constant, $b_1$ is the slope (also called the regression coefficient), $x$ is the value of the independent variable, and $\hat{y}$ is the *predicted* value of the dependent variable.

## Estimation Requirements

The approach described in this lesson is valid whenever the standard requirements for simple linear regression are met.

- The dependent variable *Y* has a linear relationship to the independent variable *X*.
- For each value of X, the probability distribution of Y has the same standard deviation σ.
- For any given value of X, the Y values are independent and roughly normally distributed.

Previously, we described how to verify that regression requirements are met.

## The Variability of the Slope Estimate

To construct a confidence interval for the slope of the regression line, we need to know the standard error of the sampling distribution of the slope. Many statistical software packages and some graphing calculators provide the standard error of the slope as a regression analysis output. The table below shows hypothetical output for the following regression equation: ŷ = 76 + 35x.

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 76 | 30 | 2.53 | 0.01 |
X | 35 | 20 | 1.75 | 0.04 |

In the output above, the standard error of the slope is equal to 20; it is the "SE Coef" entry in the row for X. Other software packages might use a different label for the standard error, such as "StDev", "SE", or "Std Dev".

If you need to calculate the standard error of the slope (SE) by hand, use the following formula:

$$SE = s_{b_1} = \frac{\sqrt{\sum (y_i - \hat{y}_i)^2 / (n - 2)}}{\sqrt{\sum (x_i - \bar{x})^2}}$$

where $y_i$ is the value of the dependent variable for observation *i*, $\hat{y}_i$ is the estimated value of the dependent variable for observation *i*, $x_i$ is the observed value of the independent variable for observation *i*, $\bar{x}$ is the mean of the independent variable, and $n$ is the number of observations.
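As a rough illustration of the hand calculation, the sketch below fits a line to a small made-up data set and then applies the standard error formula. The data values and the function name `slope_standard_error` are assumptions for this example only. The intercept and slope come from the usual least-squares formulas, which this lesson does not restate.

```python
import math


def slope_standard_error(x, y):
    """Return (b0, b1, se_b1) for a simple linear regression of y on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares estimates of the slope (b1) and the constant (b0).
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Sum of squared residuals: sum of (y_i - y_hat_i)^2.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    # SE = sqrt[ SSE / (n - 2) ] / sqrt[ sum of (x_i - x_bar)^2 ]
    se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return b0, b1, se_b1


# Made-up observations, used only to exercise the formula.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, se_b1 = slope_standard_error(x, y)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SE(b1) = {se_b1:.4f}")
```

In practice the same number should appear in the standard error column of the regression output, next to the slope coefficient.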

## How to Find the Confidence Interval for the Slope of a Regression Line

Previously, we described how to construct confidence intervals. The confidence interval for the slope of a simple linear regression equation uses the same general approach. Note, however, that the critical value is based on a t score with *n* - 2 degrees of freedom.

- Identify a sample statistic. The sample statistic is the regression slope b<sub>1</sub> calculated from sample data. In the table above, the regression slope is 35.
- Select a confidence level. The confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.
- Find the margin of error. Previously, we showed how to compute the margin of error, based on the critical value and standard error. When calculating the margin of error for a regression slope, use a t score for the critical value, with degrees of freedom (DF) equal to *n* - 2.
- Specify the confidence interval. The range of the confidence interval is defined by the *sample statistic* ± *margin of error*. And the uncertainty is denoted by the confidence level.
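The four steps above reduce to a short calculation once the critical value is known. The sketch below is a minimal illustration, not a definitive implementation: the function name `slope_confidence_interval` is made up, and the critical value of 2.0 is a placeholder assumption, because the hypothetical output table does not state a sample size from which to derive the degrees of freedom.

```python
def slope_confidence_interval(b1, se, t_crit):
    """Return (lower, upper) as sample statistic +/- margin of error."""
    margin_of_error = t_crit * se  # critical value * standard error
    return b1 - margin_of_error, b1 + margin_of_error


# Slope and standard error from the hypothetical output table.
b1 = 35
se = 20

# ASSUMPTION: placeholder critical value. A real analysis would use the
# t score with n - 2 degrees of freedom for the chosen confidence level.
t_crit = 2.0

lower, upper = slope_confidence_interval(b1, se, t_crit)
print(f"Confidence interval for the slope: {lower} to {upper}")
```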

In the next section, we work through a problem that shows how to use this approach to construct a confidence interval for the slope of a regression line. Note that this approach is used for simple linear regression (one independent variable and one dependent variable).

## Test Your Understanding

**Problem 1**

The local utility company surveys 101 randomly selected customers. For each survey participant, the company collects the following: annual electric bill (in dollars) and home size (in square feet). Output from a regression analysis appears below.

Regression equation: Annual bill = 0.55 * Home size + 15

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 15 | 3 | 5.0 | 0.00 |
Home size | 0.55 | 0.24 | 2.29 | 0.01 |

What is the 99% confidence interval for the slope of the regression line?

(A) 0.25 to 0.85

(B) 0.02 to 1.08

(C) -0.08 to 1.18

(D) 0.20 to 1.30

(E) 0.30 to 1.40

**Solution**

The correct answer is (C). Use the following four-step approach to construct a confidence interval.

- Identify a sample statistic. Since we are trying to estimate the slope of the true regression line, we use the regression coefficient for home size (i.e., the sample estimate of slope) as the sample statistic. From the regression output, we see that the slope coefficient is 0.55.
- Select a confidence level. In this analysis, the confidence level is defined for us in the problem. We are working with a 99% confidence level.
- Find the margin of error. Elsewhere on this site, we show how to compute the margin of error. The key steps applied to this problem are shown below.
  - Find standard deviation or standard error. The standard error is given in the regression output. It is 0.24.
  - Find critical value. The critical value is a factor used to compute the margin of error. With simple linear regression, to compute a confidence interval for the slope, the critical value is a t score with degrees of freedom equal to *n* - 2. To find the critical value, we take these steps.
    - Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 99/100 = 0.01
    - Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.01/2 = 0.995
    - Find the degrees of freedom (df): df = *n* - 2 = 101 - 2 = 99
    - The critical value is the t statistic having 99 degrees of freedom and a cumulative probability equal to 0.995. From the t Distribution Calculator, we find that the critical value is 2.63.
  - Compute margin of error (ME): ME = critical value * standard error = 2.63 * 0.24 = 0.63
- Specify the confidence interval. The range of the confidence interval is defined by the *sample statistic* ± *margin of error*. And the uncertainty is denoted by the confidence level.

Therefore, the 99% confidence interval for this sample is 0.55 ± 0.63, which is -0.08 to 1.18.
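The arithmetic in this solution can be checked with a short script. The sketch below uses only the Python standard library and stands in for the t Distribution Calculator by integrating the t density numerically (Simpson's rule) and searching for the critical value by bisection. The helper names `t_pdf`, `t_cdf`, and `t_critical` are made up for this example, and the numerical method is an assumption of the sketch, not the method the calculator uses.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_cdf(t, df, steps=2000):
    """Cumulative probability P(T <= t), using Simpson's rule on [0, |t|]."""
    if t < 0:
        return 1.0 - t_cdf(-t, df, steps)
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(p, df, lo=0.0, hi=50.0):
    """Find t with P(T <= t) = p by bisection (assumes 0.5 <= p < 1)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Values taken from Problem 1.
n = 101
slope = 0.55
standard_error = 0.24
confidence_level = 99

alpha = 1 - confidence_level / 100   # 0.01
p_star = 1 - alpha / 2               # 0.995
df = n - 2                           # 99
t_star = t_critical(p_star, df)
margin_of_error = t_star * standard_error
lower = slope - margin_of_error
upper = slope + margin_of_error

print(f"critical value  = {t_star:.2f}")
print(f"margin of error = {margin_of_error:.2f}")
print(f"99% CI          = {lower:.2f} to {upper:.2f}")
```

Rounded to two decimal places, the results should agree with the critical value, margin of error, and interval found in the solution.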

If we replicated the same study multiple times with different random samples and computed a confidence interval for each sample, we would expect 99% of the confidence intervals to contain the true slope of the regression line.