### Linear Regression

#### Introduction

#### Simple Regression

- Linear Regression
- Regression Example
- Residual Analysis
- Transformations
- Influential Points
- Slope Estimate
- Slope Test

#### Multiple Regression

# Hypothesis Test for Regression Slope

This lesson describes how to conduct a hypothesis test to determine
whether there is a significant linear relationship between
an independent variable *X* and a dependent variable
*Y*.

The test focuses on the slope of the regression line:

Y = Β_{0} + Β_{1}X

where Β_{0} is a constant,
Β_{1} is the slope (also called the regression coefficient),
X is the value of the independent variable, and Y is the
value of the dependent variable.

If we find that the slope of the regression line is significantly different from zero, we will conclude that there is a significant relationship between the independent and dependent variables.

## Test Requirements

The approach described in this lesson is valid whenever the standard requirements for simple linear regression are met.

- The dependent variable *Y* has a linear relationship to the independent variable *X*.
- For each value of X, the probability distribution of Y has the same standard deviation σ.
- For any given value of X, the Y values are independent and roughly normally distributed.

Previously, we described how to verify that regression requirements are met.

The test procedure consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

## State the Hypotheses

If there is a significant linear relationship between the independent
variable *X* and the dependent variable
*Y*, the slope will *not* equal zero.

H_{o}: Β_{1} = 0

H_{a}: Β_{1} ≠ 0

The null hypothesis states that the slope is equal to zero, and the alternative hypothesis states that the slope is not equal to zero.

## Formulate an Analysis Plan

The analysis plan describes how to use sample data to decide whether to reject the null hypothesis. The plan should specify the following elements.

- Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
- Test method. Use a linear regression t-test (described in the next section) to determine whether the slope of the regression line differs significantly from zero.

## Analyze Sample Data

Using sample data, find the standard error of the slope, the slope of the regression line, the degrees of freedom, the test statistic, and the P-value associated with the test statistic. The approach described in this section is illustrated in the sample problem at the end of this lesson.

- Standard error. Many statistical software packages and some graphing calculators provide the standard error of the slope as a regression analysis output. The table below shows hypothetical output for the following regression equation: y = 76 + 35x.

  | Predictor | Coef | SE Coef | T | P |
  |-----------|------|---------|------|------|
  | Constant | 76 | 30 | 2.53 | 0.01 |
  | X | 35 | 20 | 1.75 | 0.04 |

  The standard error of the slope can also be computed from the following equation:

  SE = s_{b1} = sqrt [ Σ(y_{i} - ŷ_{i})^{2} / (n - 2) ] / sqrt [ Σ(x_{i} - x̄)^{2} ]

  where y_{i} is the value of the dependent variable for observation *i*, ŷ_{i} is the estimated value of the dependent variable for observation *i*, x_{i} is the observed value of the independent variable for observation *i*, x̄ is the mean of the independent variable, and n is the number of observations.
- Slope. Like the standard error, the slope of the regression line will be provided by most statistics software packages. In the hypothetical output above, the slope is equal to 35.
- Degrees of freedom. For simple linear regression (one independent and one dependent variable), the degrees of freedom (DF) is equal to:

  DF = n - 2

  where n is the number of observations in the sample.
- Test statistic. The test statistic is a t statistic (t) defined by the following equation.

  t = b_{1} / SE

  where b_{1} is the slope of the sample regression line, and SE is the standard error of the slope.
- P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t statistic, use the t Distribution Calculator to assess the probability associated with the test statistic. Use the degrees of freedom computed above.
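When only raw observations are available rather than software output, the quantities listed above can be computed directly. The following is a minimal sketch using only the Python standard library. The function name `slope_t_statistic` and the five data points are invented for illustration and do not come from this lesson.

```python
import math

def slope_t_statistic(x, y):
    """Return (b1, se, df, t) for the least-squares line of y on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and cross-products of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope of the sample regression line
    b0 = y_bar - b1 * x_bar        # intercept
    # Sum of squared residuals, Σ(y_i - ŷ_i)^2.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2                     # degrees of freedom
    se = math.sqrt(sse / df) / math.sqrt(sxx)  # standard error of the slope
    t = b1 / se                    # test statistic
    return b1, se, df, t

# Hypothetical data, made up for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, se, df, t = slope_t_statistic(x, y)
print(f"b1 = {b1:.3f}, SE = {se:.3f}, DF = {df}, t = {t:.3f}")
# prints: b1 = 0.600, SE = 0.283, DF = 3, t = 2.121
```

The sketch does not guard against a perfect fit, where the standard error is zero and the t statistic is undefined.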

## Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.

## Test Your Understanding

**Problem**

The local utility company surveys 101 randomly selected customers. For each survey participant, the company collects the following: annual electric bill (in dollars) and home size (in square feet). Output from a regression analysis appears below.

Annual bill = 0.55 * Home size + 15

| Predictor | Coef | SE Coef | T | P |
|-----------|------|---------|------|------|
| Constant | 15 | 3 | 5.0 | 0.00 |
| Home size | 0.55 | 0.24 | 2.29 | 0.01 |

Is there a significant linear relationship between annual bill and home size? Use a 0.05 level of significance.

**Solution**

The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

- **State the hypotheses.** The first step is to state the null hypothesis and an alternative hypothesis.

  H_{o}: The slope of the regression line is equal to zero.

  H_{a}: The slope of the regression line is *not* equal to zero.

  If the relationship between home size and electric bill is significant, the slope will *not* equal zero.

- **Formulate an analysis plan.** For this analysis, the significance level is 0.05. Using sample data, we will conduct a linear regression t-test to determine whether the slope of the regression line differs significantly from zero.

- **Analyze sample data.** To apply the linear regression t-test to sample data, we require the standard error of the slope, the slope of the regression line, the degrees of freedom, the t statistic, and the P-value of the test statistic.

  We get the slope (b_{1}) and the standard error (SE) from the regression output.

  b_{1} = 0.55

  SE = 0.24

  We compute the degrees of freedom and the t statistic, using the following equations.

  DF = n - 2 = 101 - 2 = 99

  t = b_{1}/SE = 0.55/0.24 = 2.29

  where DF is the degrees of freedom, n is the number of observations in the sample, b_{1} is the slope of the regression line, and SE is the standard error of the slope.

  Based on the t statistic and the degrees of freedom, we determine the P-value. The P-value is the probability that a t statistic having 99 degrees of freedom is more extreme than 2.29. Since this is a two-tailed test, "more extreme" means greater than 2.29 or less than -2.29. We use the t Distribution Calculator to find P(t > 2.29) = 0.0121 and P(t < -2.29) = 0.0121. Therefore, the P-value is 0.0121 + 0.0121 or 0.0242.

- **Interpret results.** Since the P-value (0.0242) is less than the significance level (0.05), we reject the null hypothesis and conclude that there is a significant linear relationship between annual bill and home size.
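The arithmetic in this solution can be checked without a t Distribution Calculator. The sketch below is one possible approach, not the method used in the lesson: it integrates the density of Student's t distribution numerically with Simpson's rule, using only the Python standard library. The helper names `t_pdf` and `two_tailed_p` are assumptions introduced for this example. Because the script uses the unrounded ratio 0.55/0.24, the P-value should land close to, but not exactly on, the 0.0242 obtained from the rounded value 2.29.

```python
import math

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

def two_tailed_p(t, df, steps=2000):
    """P(|T| > |t|), using Simpson's rule on the density over [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = total * h / 3        # P(0 < T < |t|)
    return 1.0 - 2.0 * central     # both tails, by symmetry

# Values taken from the regression output in the problem.
b1, se, n, alpha = 0.55, 0.24, 101, 0.05
df = n - 2
t = b1 / se
p = two_tailed_p(t, df)
print(f"DF = {df}, t = {t:.2f}, P = {p:.4f}, reject H0: {p < alpha}")
```

Where SciPy is installed, a library routine for the t distribution would be the more usual choice. The hand-rolled integration here only keeps the example self-contained.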

**Note:** If you use this approach on an exam, you may also want to mention
that this approach is only appropriate when the
standard requirements for simple linear regression are satisfied.
