This tutorial shows how to use sklearn for logistic regression in Python.

Logistic regression models the relationship between a binary outcome Y and one or more predictors X. It is also called *logit regression*. The basic sklearn syntax is as follows.

**LogisticRegression().fit(x, y.values.ravel())**
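The `.values.ravel()` part of the syntax above is there because sklearn expects the target as a 1-D array, while the `.values` of a single-column DataFrame is 2-D with shape `(n, 1)`. The short sketch below shows the difference in shape; the three-row `y_df` is a made-up example for illustration only, not the tutorial's data.

```python
import pandas as pd

# Hypothetical three-row target column, used only to show array shapes.
y_df = pd.DataFrame({'Buy_or_not': [1, 0, 1]})

# .values gives a 2-D column vector: one row per observation, one column.
print(y_df.values.shape)

# .ravel() flattens it to the 1-D array that fit() expects for y.
print(y_df.values.ravel().shape)
```

Passing the 2-D column directly usually still works, but sklearn emits a warning asking for a 1-D array, so flattening up front keeps the output clean.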

## Step 1: Data Sample

Suppose we would like to predict how age and household income impact whether consumers buy a certain brand (1 = bought it before vs. 0 = never bought it).

We can ask 10 people about their age and household income (7 = much higher than average, 4 = average, 1 = much lower than average), as well as whether they have bought this brand.

The following is the hypothetical data.

| Buy or Not | Household Income | Age |
|---|---|---|
| 1 | 7 | 26 |
| 1 | 6 | 23 |
| 0 | 5 | 29 |
| 1 | 5 | 28 |
| 0 | 3 | 50 |
| 0 | 4 | 60 |
| 1 | 2 | 45 |
| 1 | 2 | 19 |
| 0 | 2 | 36 |
| 0 | 0 | 45 |

The following Python code reproduces the data shown above.

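```
import pandas as pd

# Raw observations, one entry per respondent
Buy_or_not = (1, 1, 0, 1, 0, 0, 1, 1, 0, 0)
HouseholdIncome = (7, 6, 5, 5, 3, 4, 2, 2, 2, 0)
Age = (26, 23, 29, 28, 50, 60, 45, 19, 36, 45)

# Predictors (X): household income and age
x_df = pd.DataFrame(
    {'HouseholdIncome': HouseholdIncome,
     'Age': Age})
print(x_df)

# Outcome (Y): whether the respondent bought the brand
y_df = pd.DataFrame(
    {'Buy_or_not': Buy_or_not})
print(y_df)
```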

Output:

       HouseholdIncome  Age
    0                7   26
    1                6   23
    2                5   29
    3                5   28
    4                3   50
    5                4   60
    6                2   45
    7                2   19
    8                2   36
    9                0   45
       Buy_or_not
    0           1
    1           1
    2           0
    3           1
    4           0
    5           0
    6           1
    7           1
    8           0
    9           0

## Step 2: Use sklearn for Logistic Regression In Python

With the sample data in hand, we can fit a logistic regression using sklearn. The code below reuses `x_df` and `y_df` from Step 1.

```
import pandas as pd
from sklearn.linear_model import LogisticRegression
# Use sklearn for Logistic Regression
model = LogisticRegression().fit(x_df, y_df.values.ravel())
# print the intercept
print(model.intercept_)
# print the regression coefficients
print(model.coef_)
```

The output shows the intercept first, followed by the regression coefficients for household income and age, in that order.

    [3.95803785]
    [[ 0.11845729 -0.12390002]]
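Once the model is fitted, it can also score new observations. The sketch below rebuilds the Step 1 data so it runs on its own, then predicts for one hypothetical consumer (household income 5, age 30, a made-up case that is not part of the sample). It also checks that the log-odds returned by `decision_function` match the intercept plus the weighted predictors. Note that `LogisticRegression` applies L2 regularization by default, so the coefficients are shrunk somewhat compared with an unpenalized fit.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Rebuild the sample data from Step 1 so this block is self-contained.
x_df = pd.DataFrame({
    'HouseholdIncome': (7, 6, 5, 5, 3, 4, 2, 2, 2, 0),
    'Age': (26, 23, 29, 28, 50, 60, 45, 19, 36, 45),
})
y_df = pd.DataFrame({'Buy_or_not': (1, 1, 0, 1, 0, 0, 1, 1, 0, 0)})

model = LogisticRegression().fit(x_df, y_df.values.ravel())

# Hypothetical new consumer (an assumption for illustration only).
new_x = pd.DataFrame({'HouseholdIncome': [5], 'Age': [30]})

# predict_proba returns one column per class, ordered as in model.classes_.
proba = model.predict_proba(new_x)
print(model.classes_)
print(proba)

# predict returns the class with the higher predicted probability.
print(model.predict(new_x))

# decision_function gives the log-odds: intercept + coefficients . x
log_odds = model.decision_function(new_x)
manual = model.intercept_ + new_x.values @ model.coef_.T
print(np.allclose(log_odds, manual.ravel()))
```

The second column of `proba` is the estimated probability of class 1, that is, of having bought the brand.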

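We can also write the fitted model as an equation for the log-odds of buying, first in general form and then with the estimated coefficients (rounded to two decimals) plugged in.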

\[\log\frac{p(y=1)}{1-p(y=1)}=\beta_0+\beta_1x_1+\beta_2x_2\]

\[\log\frac{p(\text{bought it before})}{1-p(\text{bought it before})}=3.96+0.12\,\text{Household Income}-0.12\,\text{Age}\]
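To see how the equation turns into a probability, the sketch below plugs the coefficients reported in the output above into the log-odds formula and applies the inverse logit, p = 1 / (1 + e^(-log-odds)). The consumer with household income 5 and age 30 is a hypothetical example, not a row from the sample data.

```python
import math

# Coefficients copied from the model output shown earlier.
intercept = 3.95803785
b_income = 0.11845729
b_age = -0.12390002

# Hypothetical consumer (assumed values for illustration only).
income, age = 5, 30

# Log-odds of having bought the brand.
log_odds = intercept + b_income * income + b_age * age

# Inverse logit converts log-odds into a probability.
p = 1 / (1 + math.exp(-log_odds))
print(round(log_odds, 3), round(p, 3))

# Exponentiated coefficients are odds ratios per one-unit increase.
print(math.exp(b_income), math.exp(b_age))
```

For this consumer the predicted probability comes out at roughly 0.70. An odds ratio above 1 (household income) means the odds of buying rise with that predictor, and one below 1 (age) means they fall.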