This tutorial is about how to calculate means by group in R. There are two methods of calculating means by groups in R, either
Method 1: Use base R
aggregate(df$col_being_aggregated, list(df$col_1_groupby,df$col_2_groupby,…), FUN=mean)
Method 2: Use package of dplyr
df%>% group_by(col_1_groupby,col_2_groupby,…) %>% summarise_at(vars(col_being_aggregated), list(name = mean))
Example 1 for Method 1
The following is the data being used for the examples. It has two categorical variables (i.e., City and Brand) and one continuous variable (i.e., sales).
# download data from Github df <- read.csv("https://raw.githubusercontent.com/TidyPython/interactions/main/city_brand_sales.csv") # print out the dataframe print(df)
City Brand sales 1 City1 brand1 70 2 City1 brand2 10 3 City1 brand1 100 4 City1 brand2 2 5 City1 brand1 30 6 City1 brand2 2 7 City1 brand1 20 8 City1 brand2 10 9 City1 brand1 20 10 City1 brand2 10 11 City2 brand1 9 12 City2 brand2 10 13 City2 brand1 5 14 City2 brand2 4 15 City2 brand1 4 16 City2 brand2 4 17 City2 brand1 5 18 City2 brand2 4 19 City2 brand1 12 20 City2 brand2 11
The following uses the
aggregate() from base R to calculate the means. We group by two categorical variables, City and Brand.
# use aggregate() in base R to calculate means grouped by City and Brand aggregate(df$sales, list(df$City,df$Brand), FUN=mean)
The following output shows the four means of all 4 cells.
Group.1 Group.2 x 1 City1 brand1 48.0 2 City2 brand1 7.0 3 City1 brand2 6.8 4 City2 brand2 6.6
Example 2 for Method 2
We are going to use the same data frame, namely
df. Thus, I am not going to print it out again here.
The following uses
dplyr to calculate means grouped by two categorical variables, City and Brand.
# use dplyr to calculate means grouped by City and Brand library(dplyr) df%>% group_by(City,Brand) %>% summarise_at(vars(sales), list(name = mean))
The following is the output. We can see that it prints out 4 means for all cells.
# A tibble: 4 x 3 # Groups: City  City Brand name <chr> <chr> <dbl> 1 City1 brand1 48 2 City1 brand2 6.8 3 City2 brand1 7 4 City2 brand2 6.6