# How to Calculate Means by Group in R (2 Examples)

This tutorial is about how to calculate means by group in R. There are two methods of calculating means by groups in R, either `aggregate()` or `dplyr`.

Method 1: Use base R

aggregate(df\$col_being_aggregated, list(df\$col_1_groupby,df\$col_2_groupby,…), FUN=mean)

Method 2: Use package of dplyr

library(dplyr)
df%>% group_by(col_1_groupby,col_2_groupby,…) %>% summarise_at(vars(col_being_aggregated), list(name = mean))

## Example 1 for Method 1

The following is the data being used for the examples. It has two categorical variables (i.e., City and Brand) and one continuous variable (i.e., sales).

``````# download data from Github

# print out the dataframe
print(df)``````

Output:

```    City  Brand sales
1  City1 brand1    70
2  City1 brand2    10
3  City1 brand1   100
4  City1 brand2     2
5  City1 brand1    30
6  City1 brand2     2
7  City1 brand1    20
8  City1 brand2    10
9  City1 brand1    20
10 City1 brand2    10
11 City2 brand1     9
12 City2 brand2    10
13 City2 brand1     5
14 City2 brand2     4
15 City2 brand1     4
16 City2 brand2     4
17 City2 brand1     5
18 City2 brand2     4
19 City2 brand1    12
20 City2 brand2    11```

The following uses the `aggregate()` from base R to calculate the means. We group by two categorical variables, City and Brand.

``````# use aggregate() in base R to calculate means grouped by City and Brand
aggregate(df\$sales, list(df\$City,df\$Brand), FUN=mean)``````

The following output shows the four means of all 4 cells.

```  Group.1 Group.2    x
1   City1  brand1 48.0
2   City2  brand1  7.0
3   City1  brand2  6.8
4   City2  brand2  6.6```

## Example 2 for Method 2

We are going to use the same data frame, namely `df`. Thus, I am not going to print it out again here.

The following uses `dplyr` to calculate means grouped by two categorical variables, City and Brand.

``````# use dplyr to calculate means grouped by City and Brand
library(dplyr)
df%>% group_by(City,Brand) %>% summarise_at(vars(sales), list(name = mean))``````

The following is the output. We can see that it prints out 4 means for all cells.

```# A tibble: 4 x 3
# Groups:   City [2]
City  Brand   name
<chr> <chr>  <dbl>
1 City1 brand1  48
2 City1 brand2   6.8
3 City2 brand1   7
4 City2 brand2   6.6```