Multiple Linear Regression
This post demonstrates the usage of the Excel LINEST
function to perform multiple linear regression. The data being utilized are tabulated below. It is a subset of the data used in the final course project.
Sample Data
The data shown below is the top 5 rows of 30 rows LINEST
operated on for this explanation.
Age | Years at Employer | Years at Address | Income | Credit Card Debt | Automobile Debt | Outcome |
---|---|---|---|---|---|---|
26 | 0.35 | 0.24 | $12,181 | ($2,057) | ($3,696) | 1 |
28 | 4.47 | 0.37 | $38,536 | ($6,970) | ($3,018) | 1 |
37 | 9.30 | 0.02 | $30,602 | ($2,892) | ($1,674) | 0 |
28 | 8.43 | 1.06 | $15,588 | ($38) | ($1,758) | 0 |
39 | 5.40 | 0.74 | $27,599 | ($776) | ($4,374) | 0 |
The Output
The table below contains the formatted output of the LINEST
function.
0.00 | 0.00 | 0.00 | -0.04 | -0.05 | 0.00 | 0.61 |
0.00 | 0.00 | 0.00 | 0.13 | 0.02 | 0.01 | 0.41 |
0.513 | 0.376 | #N/A | #N/A | #N/A | #N/A | #N/A |
4.04 | 23 | #N/A | #N/A | #N/A | #N/A | #N/A |
3.42 | 3.25 | #N/A | #N/A | #N/A | #N/A | #N/A |
Columns having coefficients and standard errors of zero have high collinearity and therefore add very little to the model.
Key to the Output
The table below contains a key for interpreting the output of the LINEST
function.
x_6 Coeff | x_5 Coeff | x_4 Coeff | x_3 Coeff | x_2 Coeff | x_1 Coeff | Constant |
x_6 Std Err | x_5 Std Err | x_4 Std Err | x_3 Std Err | x_2 Std Err | x_1 Std Err | Constant Std Err |
R^2 | SDev(y) | #N/A | #N/A | #N/A | #N/A | #N/A |
F-Statistic | Degrees of Freedom | #N/A | #N/A | #N/A | #N/A | #N/A |
“Regression Sum of Squares” | “Residual Sum of Squares” | #N/A | #N/A | #N/A | #N/A | #N/A |
The x_6 Coeff
is the coefficient associated with the “Automobile Debt” column.
Instructions
LINEST
works differently from many excel functions. Instructions for using it follow.
LINEST
takes the following parameters: LINEST([known_ys], [known_xs], const=true, stats=true)
[known_ys]
: the outcome column, G2:G31
[known_xs]
: the left six columns, A2:F31
const=true
: if omitted, Excel will assume true. If false, then the constant term (y-intercept) will be set to 0.
stats=true
: if omitted, Excel will assume true. If false, then the extra stats in rows 3 through 5 of the output will not be included.
- Enter the
LINEST
function in a cell, according to the explanation set forth above. - Select a 5-row, n-column group of cells with the
LINEST
function cell in the top left, where n is the total number of columns of data, including both independent and dependent columns. - Press
Control-U
. - Press
Control-Shift-Return
.
Note: the commands in step 3 and step 4 are for Excel for Mac only. Other versions will differ.
Excel-Linest-Function.xlsx
Some other content is taken from my notes on other aspects of the Coursera course.