## 1. Definition of Linear Regression Model

Multiple linear regression is a linear model that assesses the relationship between a dependent variable (`DV, or Y`) and multiple independent variables (`IVs, or Xs`).

For instance, you might want to test how consumer purchase intention is affected by both price and household income. In this case, consumer purchase intention is the `DV or Y`, whereas price and household income are the `IVs or Xs`.

## 2. Math Statements of Linear Regression Model

Below is the regression function, in which 𝛽₀ is the intercept, 𝛽₁ and 𝛽₂ are the regression coefficients, and 𝜀 is the random error.

\[y=\beta_0 +\beta_1x_1+\beta_2x_2+\epsilon\]

You might wonder what criterion is used to find the estimates 𝑏₀, 𝑏₁, and 𝑏₂. They are chosen to minimize the sum of squared residuals, which is the same logic as in simple linear regression. The estimated regression function is as follows.

\[f(x)=b_0 +b_1x_1+b_2x_2\]
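To make the criterion concrete, the quantity being minimized can be written out as the sum of squared residuals over the n observations. The name SSE and the observation index i are notation assumed here for illustration; they do not appear in the original text.

\[SSE(b_0,b_1,b_2)=\sum_{i=1}^{n}\left(y_i-(b_0+b_1x_{1i}+b_2x_{2i})\right)^2\]

The least-squares estimates are the values of 𝑏₀, 𝑏₁, and 𝑏₂ for which this sum is smallest.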

## 3. Matrix Solution for Linear Regression Model

You can use pure matrix calculation to obtain the regression coefficients. Below is the process.

\[ Y = \begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ \vdots \\ y_{1n} \end{bmatrix} = \begin{bmatrix} b_0+b_1 x_{11}+b_2 x_{21} \\ b_0+b_1 x_{12}+b_2 x_{22} \\ b_0+b_1 x_{13}+b_2 x_{23} \\ \vdots \\ b_0+b_1 x_{1n}+b_2 x_{2n} \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & x_{21} \\ 1 & x_{12} & x_{22} \\ 1 & x_{13} & x_{23} \\ \vdots & \vdots & \vdots \\ 1 & x_{1n} & x_{2n} \end{bmatrix} \begin{bmatrix} b_0\\ b_1\\ b_2\end{bmatrix} = X B \]

Thus, we can get the following.

\[ Y = XB \]

We can multiply both sides on the left by the transpose of X and get the following.

\[ X^TY = X^TXB \]

Since XᵀX is a square matrix, we can calculate its inverse (assuming it is invertible, which requires the columns of X to be linearly independent) and multiply both sides by it on the left.

\[ (X^T X)^{-1} X^TY =(X^T X)^{-1} X^T X B\]

Since (XᵀX)⁻¹XᵀX is an identity matrix, we can write it as follows.

\[ (X^T X)^{-1} X^TY = B\]

Swapping the left and right sides gives the formula below, which we can use to calculate the regression coefficients of the linear model.

\[B =(X^TX)^{-1}X^TY\]

Where,

\[ B = \begin{bmatrix} b_0\\ b_1\\ b_2\end{bmatrix} \]

\[ X = \begin{bmatrix} 1 & x_{11} & x_{21} \\ 1 & x_{12} & x_{22} \\ 1 & x_{13} & x_{23} \\ \vdots & \vdots & \vdots \\ 1 & x_{1n} & x_{2n} \end{bmatrix} \]

\[ Y = \begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ \vdots \\ y_{1n} \end{bmatrix} \]
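As a minimal sketch of the formula, the normal equations can be wrapped in a small function. The helper name `fit_normal_equation` and the toy data are assumptions made for illustration. The sketch solves the system (XᵀX)B = XᵀY with `np.linalg.solve` instead of forming the inverse explicitly, which gives the same B whenever XᵀX is invertible.

```python
import numpy as np

def fit_normal_equation(X, Y):
    """Hypothetical helper: return B satisfying (X^T X) B = X^T Y."""
    XtX = X.T @ X
    XtY = X.T @ Y
    # Solve the linear system rather than computing the inverse explicitly.
    return np.linalg.solve(XtX, XtY)

# Toy data generated exactly from y = 1 + 2*x, so B should recover [1, 2].
X_toy = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])  # first column is the intercept
Y_toy = np.array([[1.0], [3.0], [5.0]])

B_toy = fit_normal_equation(X_toy, Y_toy)
print(B_toy)
```

Because the toy data lie exactly on a line, the recovered intercept and slope match the values used to generate them.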

## 4. Use Python Numpy for Linear Regression Model

We can use NumPy to do the matrix manipulation and calculation. The following is a linear regression model with price and household income as the `IV`s and purchase intention as the `DV`.

\[f(x)=b_0 +b_1 \times Price+b_2 \times Household \ Income \]

The following is hypothetical data, with purchase intention as the `DV` and price and household income as the `IV`s.

| Price | Household Income | Purchase Intention |
|---|---|---|
| 5 | 7 | 7 |
| 6 | 5 | 6 |
| 7 | 4 | 5 |
| 8 | 6 | 5 |
| 9 | 3 | 3 |
| 10 | 3 | 4 |

### Step 1: Prepare the X matrix and Y vector

```
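import numpy as np

# Rows: a row of ones for the intercept, then prices, then household income
X_rawdata = np.array([np.ones(6), [5, 6, 7, 8, 9, 10], [7, 5, 4, 6, 3, 3]])
# Transpose so that each row is one observation
X_matrix = X_rawdata.T
print("X Matrix:\n", X_matrix)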
```

Output:

X Matrix: [[ 1. 5. 7.] [ 1. 6. 5.] [ 1. 7. 4.] [ 1. 8. 6.] [ 1. 9. 3.] [ 1. 10. 3.]]

```
# Purchase intention as a row, then transposed into a 6x1 column vector
Y_rawdata = np.array([[7, 6, 5, 5, 3, 4]])
Y_vector = Y_rawdata.T
print("Y Vector:\n", Y_vector)
```

Output:

Y Vector: [[7] [6] [5] [5] [3] [4]]

### Step 2: Calculate Xᵀ and XᵀX

```
X_matrix_T=X_matrix.transpose()
print("X Matrix Transpose:\n",X_matrix_T)
```

Output:

X Matrix Transpose: [[ 1. 1. 1. 1. 1. 1.] [ 5. 6. 7. 8. 9. 10.] [ 7. 5. 4. 6. 3. 3.]]

```
X_T_X=np.matmul(X_matrix_T,X_matrix)
print(X_T_X)
```

Output:

[[ 6. 45. 28.] [ 45. 355. 198.] [ 28. 198. 144.]]

### Step 3: Calculate (XᵀX)⁻¹

```
X_T_X_Inv=np.linalg.inv(X_T_X)
print(X_T_X_Inv)
```

Output:

[[22.23134328 -1.74626866 -1.92164179] [-1.74626866 0.14925373 0.13432836] [-1.92164179 0.13432836 0.19589552]]
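One way to sanity-check the inverse is to multiply it back by XᵀX and confirm that the product is the identity matrix up to floating-point error. The sketch below rebuilds the matrices from the same data so that it runs on its own; the variable names `product` and `is_identity` are introduced here for illustration. The condition number is printed only as a rough indicator of how sensitive the inversion is.

```python
import numpy as np

# Rebuild X and X^T X from the same hypothetical data
X_matrix = np.array([np.ones(6), [5, 6, 7, 8, 9, 10], [7, 5, 4, 6, 3, 3]]).T
X_T_X = X_matrix.T @ X_matrix
X_T_X_Inv = np.linalg.inv(X_T_X)

# The inverse times the original matrix should be (numerically) the identity
product = X_T_X_Inv @ X_T_X
is_identity = np.allclose(product, np.eye(3))
print(np.round(product, 6))
print("Close to identity:", is_identity)

# A large condition number would signal a numerically unstable inversion
print("Condition number:", np.linalg.cond(X_T_X))
```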

### Step 4: Calculate (XᵀX)⁻¹XᵀY

`X_T_X_Inv@X_matrix_T@Y_vector`

Output:

array([[ 6.73880597], [-0.44776119], [ 0.34701493]])

### Step 5: Write out the linear regression model

We can see that 𝑏₀ = 6.74, 𝑏₁ = -0.45, and 𝑏₂ = 0.35 after rounding to two decimal places. We can write the estimated regression function below.

\[f(x)=b_0 +b_1x_1+b_2x_2=6.74-0.45 \times Price+0.35 \times Household \ Income\]
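Once the coefficients are known, the estimated function can be used to compute fitted values, residuals, and a prediction for a new case. The following is a sketch under stated assumptions: the new observation (price 6, household income 6) is invented for illustration, and the names `fitted`, `residuals`, `sse`, and `prediction` do not come from the original text.

```python
import numpy as np

# Same hypothetical data as in the steps above
X_matrix = np.array([np.ones(6), [5, 6, 7, 8, 9, 10], [7, 5, 4, 6, 3, 3]]).T
Y_vector = np.array([[7, 6, 5, 5, 3, 4]]).T

# Coefficients from the matrix formula B = (X^T X)^{-1} X^T Y
B = np.linalg.inv(X_matrix.T @ X_matrix) @ X_matrix.T @ Y_vector

# Fitted values and residuals for the six observations
fitted = X_matrix @ B
residuals = Y_vector - fitted
sse = float(np.sum(residuals ** 2))  # sum of squared residuals
print("Residuals:\n", residuals)
print("Sum of squared residuals:", sse)

# Prediction for an assumed new case: price = 6, household income = 6
new_x = np.array([[1.0, 6.0, 6.0]])  # leading 1 multiplies the intercept
prediction = (new_x @ B).item()
print("Predicted purchase intention:", prediction)
```

Because the model includes an intercept, the residuals sum to approximately zero, which is a quick check that the fit is a least-squares solution.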

## 5. Use `numpy.linalg.lstsq` to verify

We can use the NumPy function `numpy.linalg.lstsq` to verify our calculation above. Below is the Python code for the linear regression model.

```
results=np.linalg.lstsq(X_matrix, Y_vector, rcond=None)[0]
print(results)
```

Output:

[[ 6.73880597] [-0.44776119] [ 0.34701493]]

As we can see, the result is the same as that of the matrix calculation shown above, which confirms that the matrix method was carried out correctly.
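Rather than comparing the printed numbers by eye, the two results can be compared programmatically. The sketch below recomputes both solutions from the same data and checks them with `np.allclose`; the variable names are chosen here for illustration. Note that `np.linalg.lstsq` returns a tuple of four items (solution, residual sum of squares, rank, and singular values), of which the earlier code used only the first.

```python
import numpy as np

# Same hypothetical data as above
X_matrix = np.array([np.ones(6), [5, 6, 7, 8, 9, 10], [7, 5, 4, 6, 3, 3]]).T
Y_vector = np.array([[7, 6, 5, 5, 3, 4]]).T

# Matrix method: B = (X^T X)^{-1} X^T Y
B_normal = np.linalg.inv(X_matrix.T @ X_matrix) @ X_matrix.T @ Y_vector

# Least-squares solver; unpack all four return values
B_lstsq, residual_ss, rank, singular_values = np.linalg.lstsq(
    X_matrix, Y_vector, rcond=None
)

# Compare the two coefficient vectors up to floating-point tolerance
match = np.allclose(B_normal, B_lstsq)
print("Coefficients match:", match)
print("Rank of X:", rank)
```

A rank equal to the number of columns of X (three here) indicates that the columns are linearly independent, which is the condition under which XᵀX can be inverted.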