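MSE stands for Mean Squared Error. It measures how far a model's estimated values fall from the observed values. The formula for MSE is as follows.

\[ MSE=\frac{SSR}{n-p-1}=\frac{\sum_{i=1}^{n} (\hat{y}_i-y_i)^2 }{n-p-1}\]

Here SSR is the sum of squared residuals, n is the number of observations, and p is the number of predictors (not counting the intercept).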

## How to Calculate MSE in R

R can be used to calculate Mean Squared Error (MSE). The following is the core syntax, which divides the sum of squared residuals by the residual degrees of freedom.

`sum(residuals(fit)^2) / fit$df.residual`
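To see how this one-liner maps onto the formula, here is a minimal sketch that computes each term by hand. The model `mpg ~ hp + wt` and the names `fit_demo`, `mse_by_hand`, and `mse_one_liner` are illustrative assumptions, not part of the examples in this post.

```r
# Illustrative model (an assumption for this sketch): two predictors, so p = 2
fit_demo <- lm(mpg ~ hp + wt, data = mtcars)

n   <- nobs(fit_demo)                          # number of observations
p   <- length(coef(fit_demo)) - 1              # predictors, excluding the intercept
ssr <- sum((fitted(fit_demo) - mtcars$mpg)^2)  # sum of squared residuals

mse_by_hand   <- ssr / (n - p - 1)
mse_one_liner <- sum(residuals(fit_demo)^2) / fit_demo$df.residual

# Both routes should agree
all.equal(mse_by_hand, mse_one_liner)
```

For a model with an intercept and no dropped coefficients, `fit$df.residual` is the same quantity as `n - p - 1`, which is why the one-liner needs no explicit count of predictors.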

The following two examples show how to calculate MSE for linear regression models in R.

## Example 1: The mtcars dataset

`mtcars` is a built-in sample dataset in R. We can fit a linear regression model with mpg as the `DV` (dependent variable) and hp as the `IV` (independent variable). We can use `lm()` to estimate the regression coefficients.

After getting the `fit`, we use `sum(residuals(fit)^2)/fit$df.residual` to calculate MSE.

```
# use lm() to estimate regression coefficients
fit <- lm(mpg~hp, data=mtcars)
# calculate Mean Squared Error (MSE)
sum(residuals(fit)^2)/fit$df.residual
```

Output:

[1] 14.92248

Thus, the Mean Squared Error (MSE) for the regression model is 14.92.
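As a sanity check, the same value can be recovered from other accessors in base R's `stats` package. This is a minimal sketch that refits the model so it runs on its own; the variable name `mse` is an assumption added here.

```r
# Refit the Example 1 model so this block is self-contained
fit <- lm(mpg ~ hp, data = mtcars)
mse <- sum(residuals(fit)^2) / fit$df.residual

# sigma() returns the residual standard error, so its square should match the MSE
all.equal(mse, sigma(fit)^2)

# deviance() on an lm fit returns the residual sum of squares
all.equal(mse, deviance(fit) / df.residual(fit))
```

Both comparisons should return `TRUE`, which also shows that the MSE is the square of the "Residual standard error" reported by `summary(fit)`.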

## Example 2: Hypothetical data

The following hypothetical data has cities and stores as the `IVs` and sales as the `DV`. We pass them to `lm()` as a linear model with both main effects and their interaction to estimate the regression coefficients.

After getting the `fit`, we use `sum(residuals(fit)^2)/fit$df.residual` to calculate MSE.

```
# build the hypothetical data
x_1 <- rep(c('City1','City2'), each=5)
x_2 <- rep(c('store1','store2'), 5)
sales <- c(10,20,20,50,30,10,5,4,12,4)
df <- data.frame(cities = x_1,
                 stores = x_2,
                 sales = sales)
# use lm() to estimate regression coefficients,
# referring to the columns of df rather than the global vectors
fit <- lm(sales ~ cities*stores, data=df)
# calculate Mean Squared Error (MSE)
sum(residuals(fit)^2)/fit$df.residual
```

Output:

[1] 116.4167

Thus, the Mean Squared Error (MSE) for the regression model is 116.42.
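With categorical predictors and an interaction, it is less obvious what the denominator is. The sketch below rebuilds the same hypothetical data and counts the estimated coefficients, assuming the model is specified on the data frame columns `cities` and `stores`.

```r
# Rebuild the hypothetical data so this block is self-contained
df <- data.frame(
  cities = rep(c("City1", "City2"), each = 5),
  stores = rep(c("store1", "store2"), 5),
  sales  = c(10, 20, 20, 50, 30, 10, 5, 4, 12, 4)
)
fit <- lm(sales ~ cities * stores, data = df)

# Intercept, cities, stores, and the cities:stores interaction
length(coef(fit))

# Observations minus estimated coefficients gives the residual degrees of freedom
nobs(fit) - length(coef(fit))
fit$df.residual
```

Here the 10 observations minus 4 estimated coefficients leave 6 residual degrees of freedom, so the sum of squared residuals is divided by 6.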

## MSE denominator: n vs. n-p-1

Note that some people define MSE using `n` rather than `n-p-1` in the denominator. To better understand the nuanced difference, please refer to my other post on this topic (link below).

In that post, I also explain the difference and connection between MSE (Mean Squared Error) and MSR (Mean Squared Residuals). You might find it useful as well.

I also have a post on MSE in Python (link below), which shows how to calculate both the biased and the unbiased MSE.
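As a quick illustration of the two denominators in R, the sketch below computes both versions for the `mpg ~ hp` model from Example 1. The variable names `mse_n` and `mse_df` are assumptions added for this sketch.

```r
# Refit the Example 1 model so this block is self-contained
fit <- lm(mpg ~ hp, data = mtcars)
ssr <- sum(residuals(fit)^2)

mse_n  <- ssr / nobs(fit)         # n in the denominator; same as mean(residuals(fit)^2)
mse_df <- ssr / fit$df.residual   # n - p - 1 in the denominator

c(n = mse_n, n_minus_p_minus_1 = mse_df)
```

The version that divides by `n` is always the smaller of the two, and the gap shrinks as the sample size grows relative to the number of predictors.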

## Reference

- Mean squared error and the residual sum of squares function (Stack Exchange)
- R – Confused on Residual Terminology (Stack Exchange)