One-way ANOVA compares the means of different groups to test whether the differences among them are statistically significant. For instance, suppose you would like to compare the average household size of three cities. You can collect one sample from each of the three cities and conduct a one-way ANOVA to test whether the mean household sizes differ.

## Formulas of One-way ANOVA

The full name of ANOVA is Analysis of Variance. Thus, ANOVA is about partitioning the total variation in the data into different parts. `Sum of Squares Total (SST)` is the total variation of all the observations around the overall mean. SST can be separated into `Sum of Squares Between (SSB)` and `Sum of Squares Error (SSE)`.

\[SST=SSB+SSE\]

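The formulas of `SSB` and `SSE` are as follows. Here \(k\) is the number of groups, \(n_i\) is the number of observations in group \(i\), \(x_{ij}\) is the \(j\)-th observation in group \(i\), \(\bar{x_i}\) is the mean of group \(i\), and \(\bar{x}\) is the overall mean of all observations.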

\[SSB=\sum_{i=1}^kn_i(\bar{x_i}-\bar{x})^2\]

\[SSE=\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x_i})^2\]

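We also need to consider the degrees of freedom, which are \(k-1\) for `SSB` and \(n-k\) for `SSE`, where \(n\) is the total number of observations. Dividing each sum of squares by its degrees of freedom gives the mean squares, namely `Mean Square Between (MSB)` and `Mean Square Error (MSE)`.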

\[MSB=\frac{SSB}{k-1}\]

\[MSE=\frac{SSE}{n-k}\]

Finally, the F value is the ratio of `MSB` to `MSE`.

\[F(k-1,n-k)=\frac{MSB}{MSE}=\frac{\frac{SSB}{k-1}}{\frac{SSE}{n-k}}=\frac{\frac{\sum_{i=1}^kn_i(\bar{x_i}-\bar{x})^2}{k-1}}{\frac{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x_i})^2}{n-k}}\]
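The formulas above can be sketched as a small Python function. This is a minimal illustration using only the standard library, not a reference implementation; the function name `one_way_anova` and the dictionary keys are hypothetical names chosen for this sketch. The usage lines apply it to the household sizes from the manual calculation example later in this article.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return SSB, SSE, MSB, MSE, F and the degrees of freedom.

    `groups` is a list of lists, one inner list of numbers per group.
    (Hypothetical helper written for illustration.)
    """
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    overall_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]

    # SSB: group size times the squared distance of each group mean
    # from the overall mean
    ssb = sum(len(g) * (m - overall_mean) ** 2
              for g, m in zip(groups, group_means))
    # SSE: squared distance of each observation from its own group mean
    sse = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)

    msb = ssb / (k - 1)
    mse = sse / (n - k)
    return {"SSB": ssb, "SSE": sse, "MSB": msb, "MSE": mse,
            "F": msb / mse, "df": (k - 1, n - k)}


# Household sizes of the three cities from the manual example
cities = [[6, 2, 3, 4, 5], [2, 1, 3, 4, 5], [4, 1, 2, 4, 5]]
result = one_way_anova(cities)
print(result["df"], round(result["F"], 2))
```

For real analyses, an established routine such as `scipy.stats.f_oneway` is usually preferable; the sketch only mirrors the formulas term by term.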

## Manual Calculation Example

Suppose we would like to see whether 3 cities differ in terms of household size. We sample 5 households from each city. The null hypothesis and alternative hypothesis for one-way ANOVA are as follows.

\[H_0: \mu_{city1}=\mu_{city2}=\mu_{city3}\]

\[H_1: \mu_{city1},\mu_{city2},\mu_{city3} \text{ are not all equal.}\]

Group | Household Size | Group Mean | Overall Mean |
---|---|---|---|
City 1 | 6 | 4 | 3.4 |
City 1 | 2 | 4 | 3.4 |
City 1 | 3 | 4 | 3.4 |
City 1 | 4 | 4 | 3.4 |
City 1 | 5 | 4 | 3.4 |
City 2 | 2 | 3 | 3.4 |
City 2 | 1 | 3 | 3.4 |
City 2 | 3 | 3 | 3.4 |
City 2 | 4 | 3 | 3.4 |
City 2 | 5 | 3 | 3.4 |
City 3 | 4 | 3.2 | 3.4 |
City 3 | 1 | 3.2 | 3.4 |
City 3 | 2 | 3.2 | 3.4 |
City 3 | 4 | 3.2 | 3.4 |
City 3 | 5 | 3.2 | 3.4 |
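The group means and the overall mean in the table can be reproduced with a few lines of Python. This is a small sketch using the standard library; the variable names (`household_sizes`, `group_means`, `overall_mean`) are chosen here for illustration.

```python
from statistics import fmean

# Household sizes from the table, keyed by city
household_sizes = {
    "City 1": [6, 2, 3, 4, 5],
    "City 2": [2, 1, 3, 4, 5],
    "City 3": [4, 1, 2, 4, 5],
}

# Mean of each group
group_means = {city: fmean(sizes) for city, sizes in household_sizes.items()}

# Mean of all 15 observations pooled together
overall_mean = fmean([x for sizes in household_sizes.values() for x in sizes])

for city, mean in group_means.items():
    print(city, round(mean, 2))
print("Overall", round(overall_mean, 2))
```

Because every city has the same sample size here, the overall mean equals the average of the three group means; with unequal group sizes that would no longer hold in general.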

\[SSB=\sum_{i=1}^kn_i(\bar{x_i}-\bar{x})^2=5 \times(4-3.4)^2+5\times(3-3.4)^2+5 \times (3.2-3.4)^2=2.8\]

\[\begin{aligned}
SSE=\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x_i})^2= & (6-4)^2+(2-4)^2+(3-4)^2+(4-4)^2+(5-4)^2+\\
&(2-3)^2+(1-3)^2+(3-3)^2+(4-3)^2+(5-3)^2+\\
&(4-3.2)^2+(1-3.2)^2+(2-3.2)^2+(4-3.2)^2+(5-3.2)^2 \\
= &30.8
\end{aligned}\]
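As a sanity check on the two sums of squares, the following sketch recomputes `SSB` and `SSE` from the raw data and also computes `SST` directly as the squared deviations of all observations from the overall mean, so that the partition SST = SSB + SSE can be confirmed numerically. The variable names are illustrative only.

```python
# Household sizes of City 1, City 2 and City 3
groups = [[6, 2, 3, 4, 5], [2, 1, 3, 4, 5], [4, 1, 2, 4, 5]]

all_obs = [x for g in groups for x in g]
overall_mean = sum(all_obs) / len(all_obs)

# Between-group sum of squares
ssb = sum(len(g) * (sum(g) / len(g) - overall_mean) ** 2 for g in groups)

# Within-group (error) sum of squares
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# Total sum of squares, computed directly from the overall mean
sst = sum((x - overall_mean) ** 2 for x in all_obs)

print(round(ssb, 2), round(sse, 2), round(sst, 2))
# The partition holds up to floating-point rounding
print(abs(sst - (ssb + sse)) < 1e-9)
```

Computing `SST` independently, rather than as `ssb + sse`, is what makes this a genuine check of the hand calculation.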

\[MSB=\frac{SSB}{k-1}=\frac{2.8}{3-1}=1.4\]

\[MSE=\frac{SSE}{n-k}=\frac{30.8}{15-3}\approx 2.57\]

Finally, we can calculate the `F-value` by calculating the ratio of `MSB` to `MSE`.

\[F(k-1,n-k)=F(2,12)=\frac{MSB}{MSE}=\frac{1.4}{2.57}\approx 0.55\]

We can then look up the critical value of `F(2,12)` in an F table; at the 0.05 significance level it is 3.89. Since the calculated `F(2,12)` = 0.55 is smaller than 3.89, we fail to reject the null hypothesis. Thus, we do not have sufficient evidence to conclude that the mean household sizes of the three cities differ.
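The table lookup can also be reproduced in code. The sketch below relies on a special case: when the numerator degrees of freedom equal 2 (as here, with three groups), the upper-tail probability of the F distribution has the closed form \(P(F > f) = (1 + 2f/d_2)^{-d_2/2}\), where \(d_2\) is the denominator degrees of freedom. It does not apply to other numerator degrees of freedom, and the function names are hypothetical. The value 3.89 corresponds to a significance level of 0.05.

```python
def f_upper_tail_df1_2(f_value, df2):
    """P(F > f_value) for an F(2, df2) distribution.

    Closed form valid ONLY when the numerator degrees of freedom are 2.
    """
    return (1 + 2 * f_value / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Critical value c with P(F > c) = alpha for F(2, df2)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


f_value = 1.4 / (30.8 / 12)        # MSB / MSE from the manual example
critical = f_critical_df1_2(0.05, 12)
p_value = f_upper_tail_df1_2(f_value, 12)

print(round(critical, 2))          # about 3.89, matching the F table
print(round(p_value, 3))           # about 0.593
print(f_value < critical)          # True: fail to reject the null hypothesis
```

For arbitrary degrees of freedom, a statistics library is the usual route, for example `scipy.stats.f.ppf(0.95, 2, 12)` for the critical value.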