Introduction
We can use sklearn.linear_model.LinearRegression to do linear regression in Python. The following is the core syntax, where lm is a LinearRegression object.
lm.fit(IVs, DV)
Where,
IVs: the independent variables
DV: the dependent variable
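As a minimal sketch of the array shapes that lm.fit() works with, the example below fits a tiny made-up dataset. The names IVs_demo, DV_demo, and lm_demo and the numbers themselves are assumptions for illustration only and are not part of the tutorial's data. The IVs are arranged with one row per observation and one column per variable.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up toy data (illustration only): 4 observations, 2 independent variables
IVs_demo = np.array([[1, 2], [2, 1], [3, 4], [4, 3]])  # shape (4, 2)
DV_demo = np.array([3, 3, 7, 7])                       # shape (4,), equals the row sums

lm_demo = LinearRegression()
lm_demo.fit(IVs_demo, DV_demo)

print(IVs_demo.shape, DV_demo.shape)
print(lm_demo.coef_.shape)  # one coefficient per independent variable
```

The DV can be passed either as a 1-D array, as here, or as a column of shape (n, 1), as in the steps that follow; the second form makes coef_ two-dimensional.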
Example for Linear Regression Model
The following is the linear regression model, including price and household income as IVs and purchase intention as DV.
\[f(x)=b_0 +b_1 \times Price+b_2 \times Household \ Income \]
The following is the hypothetical data, including purchase intention as DV and prices and household income as IVs.
Purchase Intention | Prices | Household Income |
---|---|---|
7 | 5 | 7 |
6 | 6 | 5 |
5 | 7 | 4 |
5 | 8 | 6 |
3 | 9 | 3 |
4 | 10 | 3 |
Step 1: Prepare the data
The following Python code enters the hypothetical data and then reshapes it into the format that lm.fit() expects: one row per observation and one column per variable.
import numpy as np
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
# Input hypothetical data
Purchase_Intention=(7,6,5,5,3,4)
Prices=(5,6,7,8,9,10)
Household_income=(7,5,4,6,3,3)
# Reshape the DV into a column vector of shape (6, 1)
DV= np.transpose([Purchase_Intention])
print("DV: \n",DV)
# Stack the two IVs as rows, then transpose to shape (6, 2): one row per observation
IVs=np.concatenate(([Prices], [Household_income]), axis=0)
IVs=np.transpose(IVs)
print("IVs: \n",IVs)
Output:
DV: 
 [[7]
 [6]
 [5]
 [5]
 [3]
 [4]]
IVs: 
 [[ 5  7]
 [ 6  5]
 [ 7  4]
 [ 8  6]
 [ 9  3]
 [10  3]]
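The same two arrays can also be built more directly. The sketch below is an alternative to the concatenate-and-transpose approach, not the method used in this tutorial; it relies on np.column_stack and reshape, and the names IVs_alt and DV_alt are introduced here only for illustration.

```python
import numpy as np

# Same hypothetical data as above
Purchase_Intention = (7, 6, 5, 5, 3, 4)
Prices = (5, 6, 7, 8, 9, 10)
Household_income = (7, 5, 4, 6, 3, 3)

# Each 1-D sequence becomes one column: shape (6, 2)
IVs_alt = np.column_stack((Prices, Household_income))
# Turn the DV into a single column: shape (6, 1)
DV_alt = np.array(Purchase_Intention).reshape(-1, 1)

print(IVs_alt.shape, DV_alt.shape)
```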
Step 2: Use lm from sklearn.linear_model
The following is the Python code for passing the IVs and DV to lm.fit().
# apply sklearn.linear_model
result = lm.fit(IVs, DV)
print("Result is as follows:")
print("Intercept:\n",result.intercept_)
print("Regression Coefficients:\n", result.coef_)
Output:
Result is as follows:
Intercept:
 [6.73880597]
Regression Coefficients:
 [[-0.44776119  0.34701493]]
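One way to sanity-check these estimates is to solve the same ordinary least-squares problem without sklearn. The sketch below assumes a design matrix with a leading column of ones for the intercept and uses np.linalg.lstsq; the names X and beta are introduced here for illustration.

```python
import numpy as np

# Same hypothetical data as above
Prices = np.array([5, 6, 7, 8, 9, 10], dtype=float)
Household_income = np.array([7, 5, 4, 6, 3, 3], dtype=float)
Purchase_Intention = np.array([7, 6, 5, 5, 3, 4], dtype=float)

# Design matrix: a column of ones (intercept), then one column per IV
X = np.column_stack((np.ones_like(Prices), Prices, Household_income))

# Least-squares solution; beta holds the intercept, then the two slopes
beta, *_ = np.linalg.lstsq(X, Purchase_Intention, rcond=None)
print(beta)
```

The three values in beta should agree with the intercept and the two regression coefficients reported by lm.fit(), up to floating-point error.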
Step 3: Write out the regression model
We can write out a model with the estimated regression coefficients.
\[f(x)=6.74-0.45 \times Price+0.35 \times Household \ Income \]
We can see that price is negatively related to purchase intention, while household income is positively related to purchase intention, in each case holding the other variable constant.
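The fitted model can also be used to predict purchase intention for a new case. The sketch below refits the model on the same data so that it runs on its own, then compares predict() with the value computed by hand from the intercept and coefficients. The new observation (price 7, household income 5) and the names new_obs, predicted, and by_hand are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same hypothetical data as in the steps above
Prices = (5, 6, 7, 8, 9, 10)
Household_income = (7, 5, 4, 6, 3, 3)
Purchase_Intention = (7, 6, 5, 5, 3, 4)

IVs = np.column_stack((Prices, Household_income))  # shape (6, 2)
DV = np.transpose([Purchase_Intention])            # shape (6, 1)
result = LinearRegression().fit(IVs, DV)

# Hypothetical new observation: price 7, household income 5
new_obs = np.array([[7, 5]])
predicted = result.predict(new_obs)

# Same value by hand: b0 + b1 * Price + b2 * Household Income
by_hand = result.intercept_ + new_obs @ result.coef_.T
print(predicted, by_hand)
```

Because the model was estimated from only six hypothetical observations, such a prediction illustrates the mechanics rather than a reliable forecast.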