Multiple Linear Regression
This post demonstrates the usage of the Excel
LINEST function to perform multiple linear regression. The data being utilized are tabulated below. It is a subset of the data used in the final course project.
The data shown below is the top 5 rows of 30 rows
LINEST operated on for this explanation.
|Age||Years at Employer||Years at Address||Income||Credit Card Debt||Automobile Debt||Outcome|
The table below contains the formatted output of the
Columns having coefficients and standard errors of zero have high collinearity and therefore add very little to the model.
Key to the Output
The table below contains a key for interpreting the output of the
|x_6 Coeff||x_5 Coeff||x_4 Coeff||x_3 Coeff||x_2 Coeff||x_1 Coeff||Constant|
|x_6 Std Err||x_5 Std Err||x_4 Std Err||x_3 Std Err||x_2 Std Err||x_1 Std Err||Constant Std Err|
|F-Statistic||Degrees of Freedom||#N/A||#N/A||#N/A||#N/A||#N/A|
|“Regression Sum of Squares”||“Residual Sum of Squares”||#N/A||#N/A||#N/A||#N/A||#N/A|
x_6 Coeff is the coefficient associated with the “Automobile Debt” column.
LINEST works differently from many excel functions. Instructions for using it follow.
LINEST takes the following parameters:
LINEST([known_ys], [known_xs], const=true, stats=true)
[known_ys]: the outcome column,
[known_xs]: the left six columns,
const=true: if omitted, Excel will assume true. If false, then the constant term (y-intercept) will be set to 0.
stats=true: if omitted, Excel will assume true. If false, then the extra stats in rows 3 through 5 of the output will not be included.
- Enter the
LINESTfunction in a cell, according to the explanation set forth above.
- Select a 5-row, n-column group of cells with the
LINESTfunction cell in the top left, where n is the total number of columns of data, including both independent and dependent columns.
Note: the commands in step 3 and step 4 are for Excel for Mac only. Other versions will differ.
Some other content is taken from my notes on other aspects of the Coursera course.