# How to Calculate MSE in R

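MSE stands for Mean Squared Error. It measures how far a model's estimated values are from the observed values. The following is the formula of MSE, where SSR is the sum of squared residuals, n is the number of observations, and p is the number of predictors (so n-p-1 is the residual degrees of freedom).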

$MSE=\frac{SSR}{n-p-1}=\frac{\sum_{i=1}^{n} (\hat{y_i}-y_i)^2 }{n-p-1}$
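To make the formula concrete, here is a minimal sketch that computes each piece by hand. It assumes R's built-in `cars` dataset (dist regressed on speed) purely for illustration, and the variable names (`fit_demo`, `ssr`, `mse_manual`) are my own choices, not required syntax.

```r
# Illustrative fit on the built-in cars data (an assumed example dataset)
fit_demo <- lm(dist ~ speed, data = cars)

n <- nrow(cars)                     # number of observations
p <- length(coef(fit_demo)) - 1     # number of predictors, excluding the intercept
                                    # (assumes no coefficient was dropped as NA)

y     <- cars$dist                  # observed values
y_hat <- fitted(fit_demo)           # estimated values

ssr <- sum((y_hat - y)^2)           # SSR: sum of squared residuals
mse_manual <- ssr / (n - p - 1)     # MSE as defined in the formula

mse_manual
```

Here `n - p - 1` equals the value stored in `fit_demo$df.residual`, so dividing by either gives the same MSE.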

## How to Calculate MSE in R

R can be used to calculate Mean Squared Error (MSE). The following is the core syntax, which calculates the ratio of the sum of squared residuals to the residual degrees of freedom.

sum(residuals(fit)^2)/fit$df.residual

The following are 2 examples showing how to calculate MSE for linear regression models in R.

## Example 1: Use data of mtcars

mtcars is a built-in sample dataset in R. We can fit a linear regression model with mpg as the dependent variable (DV) and hp as the independent variable (IV). We use lm() to estimate the regression coefficients. After getting the fit, we use sum(residuals(fit)^2)/fit$df.residual to calculate MSE.

# use lm() to estimate regression coefficients
fit <- lm(mpg~hp, data=mtcars)

# calculate Mean Squared Error (MSE)
sum(residuals(fit)^2)/fit$df.residual

Output:

14.92248

Thus, the Mean Squared Error (MSE) for the regression model is 14.92.

## Example 2: Hypothetical data

The following hypothetical data has cities and stores as the IVs and sales as the DV. We write them as a linear model in lm() to estimate the regression coefficients. After getting the fit, we use sum(residuals(fit)^2)/fit$df.residual to calculate MSE.

x_1 <- rep(c('City1','City2'), each=5)
x_2 <- rep(c('store1','store2'), 5)
sales <- c(10,20,20,50,30,10,5,4,12,4)

df <- data.frame(cities = x_1,
                 stores = x_2,
                 sales = sales)

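# use lm() to estimate regression coefficients
# (cities*stores fits both main effects and their interaction,
# using the column names of df rather than the global vectors)
fit <- lm(sales~cities*stores, data=df)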

# calculate Mean Squared Error (MSE)
sum(residuals(fit)^2)/fit$df.residual

Output:

116.4167

Thus, the Mean Squared Error (MSE) for the regression model is 116.42.
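As a sanity check, the same number can be recovered from quantities that lm() already reports. The sketch below rebuilds the Example 2 data so it runs on its own, then compares the manual calculation with the squared residual standard error from summary() and with the residual "Mean Sq" entry of the ANOVA table. The variable names (`mse`, `mse_sigma`, `mse_anova`) are just illustrative.

```r
# Rebuild the Example 2 data so this block is self-contained
df <- data.frame(
  cities = rep(c("City1", "City2"), each = 5),
  stores = rep(c("store1", "store2"), 5),
  sales  = c(10, 20, 20, 50, 30, 10, 5, 4, 12, 4)
)
fit <- lm(sales ~ cities * stores, data = df)

# Manual calculation, as in the example
mse <- sum(residuals(fit)^2) / fit$df.residual

# Cross-check 1: the residual standard error, squared
mse_sigma <- summary(fit)$sigma^2

# Cross-check 2: the Mean Sq of the Residuals row in the ANOVA table
mse_anova <- anova(fit)["Residuals", "Mean Sq"]

c(manual = mse, sigma = mse_sigma, anova = mse_anova)
```

All three values should agree, since each one divides the sum of squared residuals by the residual degrees of freedom.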

## MSE denominator: n vs. n-p-1

Note that some people define MSE using n rather than n-p-1 in the denominator. To better understand the nuanced difference, please refer to my other post on this topic (link below).

In that post, I also explain the difference and connection between MSE (Mean Squared Error) and MSR (Mean Squared Residuals). You might find it useful as well.
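For a quick look at how the two denominators differ in R, here is a minimal sketch using the mtcars model of mpg on hp. The names `mse_n` and `mse_df` are hypothetical labels I chose for the n version and the n-p-1 version.

```r
fit <- lm(mpg ~ hp, data = mtcars)
res <- residuals(fit)
n   <- length(res)                       # number of observations

mse_n  <- mean(res^2)                    # denominator n
mse_df <- sum(res^2) / fit$df.residual   # denominator n - p - 1

c(n_version = mse_n, df_version = mse_df)

# The two versions differ only by the factor n / (n - p - 1)
all.equal(mse_n * n / fit$df.residual, mse_df)
```

Because n is larger than n-p-1, the n version is always the smaller of the two here, and the gap shrinks as the sample size grows relative to the number of predictors.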

I also have a post showing how to calculate MSE in Python (link below), in which I show how to calculate both biased MSE and unbiased MSE using Python.