## Plot for Interactions of 2 Categorical Variables in Python (with example)

This tutorial shows how to plot interactions of 2 categorical independent variables in Python. The following shows both the ANOVA and linear regression outputs. You will see that ANOVA is also a linear regression model. Thus, it does not matter whether you use ANOVA or linear regression; you can use the same method (i.e., the same … Read more
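As a rough illustration of the idea, an interaction plot can be drawn from the cell means of the dependent variable. The sketch below assumes pandas and matplotlib are installed; the column names (`city`, `brand`, `sales`), the data values, and the output filename are all hypothetical and are not taken from the tutorial.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical 2 x 2 design: two categorical IVs and one numeric DV
df = pd.DataFrame({
    "city": ["city1"] * 4 + ["city2"] * 4,
    "brand": ["b1", "b1", "b2", "b2"] * 2,
    "sales": [10, 12, 20, 22, 15, 17, 14, 16],
})

# Mean of the DV in each cell; rows are city levels, columns are brand levels
cell_means = df.groupby(["city", "brand"])["sales"].mean().unstack("brand")

# One line per brand level; clearly non-parallel lines suggest an interaction
ax = cell_means.plot(marker="o")
ax.set_xlabel("city")
ax.set_ylabel("mean sales")
ax.figure.savefig("interaction_plot.png")  # hypothetical output file
plt.close(ax.figure)
```

The plot itself does not test the interaction; the ANOVA or regression output is still needed for that.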

## How to Calculate Predicted Y in Linear Regression in Python

This tutorial shows how you can calculate predicted Y (or, estimated Y) in linear regression in Python. Steps of Calculating Predicted Y in Linear Model in Python Step 1: Prepare data, X and Y Output: X: [[ 5] [ 2] [ 3] [ 4] [10] [11] [14]] Y: [[ 3] [ 1] [ 2] [ … Read more
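A minimal sketch of the calculation follows, assuming NumPy is available. Because the excerpt's Y values are truncated, the data here are hypothetical (Y is constructed as exactly 1 + 2 * X), and `X_design`, `coef`, and `Y_hat` are illustrative names.

```python
import numpy as np

# Hypothetical data (not the tutorial's): Y is exactly 1 + 2 * X
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
Y = np.array([[3.0], [5.0], [7.0], [9.0], [11.0]])

# Prepend a column of ones so the model has an intercept
X_design = np.hstack([np.ones((X.shape[0], 1)), X])

# Least-squares estimate of [intercept, slope]
coef, *_ = np.linalg.lstsq(X_design, Y, rcond=None)

# Predicted Y is the design matrix times the estimated coefficients
Y_hat = X_design @ coef
print(coef.ravel())
print(Y_hat.ravel())
```

With noise-free data like this, the predicted values reproduce Y; with real data they would differ from Y by the residuals.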

## Linear Regression: Python Numpy Implementation from Scratch

This tutorial shows how you can conduct linear regression in Python with NumPy from scratch. 1. Math and Matrix of Linear Regression We can just use pure matrix calculation to estimate the regression coefficients in a linear regression model. Below is the process. Thus, we can simplify the function above to the function below. We can … Read more
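The excerpt's equations are not shown, so the sketch below assumes the standard closed-form solution, the normal equations (X'X) b = X'y, which may or may not match the tutorial's exact derivation. The function name `ols_normal_equation` and the simulated data are hypothetical.

```python
import numpy as np


def ols_normal_equation(X, y):
    """Solve (X'X) b = X'y for b; X must already contain an intercept column."""
    return np.linalg.solve(X.T @ X, X.T @ y)


# Hypothetical simulated data with two predictors
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 1.5 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=50)

# Design matrix: intercept column followed by the predictors
X = np.column_stack([np.ones(50), x1, x2])

beta = ols_normal_equation(X, y)
print(beta)  # estimates of [intercept, slope for x1, slope for x2]
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit inverse, which tends to be more numerically stable than computing `inv(X.T @ X)` directly.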

## Python: Type I, Type II, and Type III ANOVA

1. Introduction Type I, Type II, and Type III ANOVA are three different ways of calculating sums of squares in ANOVA. Type I ANOVA: SS(A) for factor A; SS(B | A) for factor B; SS(AB | A, B) for interaction AB. Type II ANOVA: SS(A | B) for factor A; SS(B | A) for factor … Read more

## How to Perform Two-Way ANOVA in Python

Introduction A two-way ANOVA is used to test whether the mean of a dependent variable differs significantly across the levels of two categorical variables. We can use anova_lm() from statsmodels to do two-way ANOVA. The following is the core syntax. model = ols('DV ~ C(factor_1) + C(factor_2) + C(factor_1):C(factor_2)', data=df_x).fit() sm.stats.anova_lm(model, typ=2), where typ can be 1, 2, or 3. Hypothetical Data There are two … Read more

## One Sample t-test in Python (with Code Example)

Introduction A one-sample t-test examines whether the mean of a population is statistically different from a known or hypothesized value. In Python, we can use scipy.stats.ttest_1samp() to conduct a one-sample t-test. The syntax is as follows. scipy.stats.ttest_1samp(data, popmean) Where, data: an array of sample observations popmean: the expected value under the null hypothesis. The following … Read more
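A short sketch of the call, with hypothetical sample values and a hypothetical `popmean`, alongside the same t statistic computed by hand for comparison:

```python
import numpy as np
from scipy import stats

# Hypothetical sample and hypothesized population mean
data = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
popmean = 3.0

result = stats.ttest_1samp(data, popmean)

# Same statistic by hand: (sample mean - popmean) / standard error
t_manual = (data.mean() - popmean) / (data.std(ddof=1) / np.sqrt(len(data)))

print(result.statistic, result.pvalue)
print(t_manual)
```

The reported p-value is two-sided by default.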

## Paired t-test in Python (with Code Example)

Introduction A paired t-test is used when data_1 and data_2 come from the same group of people or objects, measured at two different times. In Python, we can use scipy.stats.ttest_rel() to conduct a paired-sample t-test. The syntax is as follows. scipy.stats.ttest_rel(data_1, data_2) Where, data_1: data collected at time point 1 data_2: data collected at time … Read more
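The following sketch uses hypothetical measurements for `data_1` and `data_2`. It also shows that a paired t-test is equivalent to a one-sample t-test on the pairwise differences against zero.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements on the same five subjects at two time points
data_1 = np.array([10.0, 12.0, 9.0, 14.0, 11.0])
data_2 = np.array([12.0, 15.0, 10.0, 15.0, 14.0])

paired = stats.ttest_rel(data_1, data_2)

# Equivalent formulation: one-sample t-test on the differences against 0
diff_test = stats.ttest_1samp(data_1 - data_2, 0.0)

print(paired.statistic, paired.pvalue)
print(diff_test.statistic, diff_test.pvalue)
```

The two arrays must have the same length and be in the same subject order, since the test works on the element-wise differences.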

## Use sklearn for Linear Regression in Python

Introduction We can use sklearn.linear_model.LinearRegression to do linear regression in Python. The following is the core syntax of using sklearn. lm.fit(IVs, DV) Where, IVs: the independent variables DV: the dependent variable Example for Linear Regression Model The following is the linear regression model, with household income as the IV and purchase intention as the DV. The following … Read more
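A minimal sketch of the `lm.fit(IVs, DV)` pattern, assuming scikit-learn is installed. The income and purchase-intention values are hypothetical and constructed so that intention is exactly 1 + 0.1 * income.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: household income (in thousands) and purchase intention
IVs = np.array([[30.0], [40.0], [50.0], [60.0], [70.0]])  # 2-D: one column per IV
DV = np.array([4.0, 5.0, 6.0, 7.0, 8.0])

lm = LinearRegression()
lm.fit(IVs, DV)

print(lm.intercept_, lm.coef_)  # estimated intercept and slope

# Predicted purchase intention for a new household income of 80
predicted = lm.predict(np.array([[80.0]]))
print(predicted)
```

Note that `IVs` must be two-dimensional even with a single predictor, which is why each income value sits in its own inner list.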

## nltk: How to Remove Stop words in Python

This tutorial shows how you can remove stop words using nltk in Python. Stop words are words that carry little important information, such as prepositions (“to”, “with”), articles (“an”, “a”, “the”), or conjunctions (“and”, “or”, “but”). We first need to import the needed packages. We can then set the language to be English. Before removing stop … Read more

## How to Do One-way ANOVA in Python (2 Examples)

1. Introduction One-way ANOVA compares the means of different groups to see whether the differences among them are statistically significant. This tutorial shows two methods of testing one-way ANOVA in Python. Method 1: use scipy.stats for one-way ANOVA f_oneway(level_1,level_2,level_3,…) Method 2: use statsmodels for one-way ANOVA sm.stats.anova_lm(model, typ=2), where typ can be 1, 2, or 3. 2. Sample Data … Read more
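Method 1 can be sketched as follows. The three groups reuse the excerpt's names (`level_1`, `level_2`, `level_3`), but their values are hypothetical.

```python
from scipy import stats

# Hypothetical observations for three groups
level_1 = [1.0, 2.0, 3.0]
level_2 = [2.0, 3.0, 4.0]
level_3 = [3.0, 4.0, 5.0]

# F statistic and p-value for the null hypothesis of equal group means
result = stats.f_oneway(level_1, level_2, level_3)
print(result.statistic, result.pvalue)
```

For these values the between-group mean square is 3 and the within-group mean square is 1, so the F statistic is 3 with (2, 6) degrees of freedom.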