Group by function in R using Dplyr - GeeksforGeeks (2024)

The group_by() function belongs to the dplyr package in the R programming language and groups the rows of a data frame by the values in one or more columns. On its own, group_by() does not compute anything; it only marks the groups. It is normally followed by summarise() with an appropriate action to perform on each group. It works much like GROUP BY in SQL or a pivot table in Excel.

Syntax:

group_by(col,…)

Syntax with summarise():

group_by(col, …) %>% summarise(action)
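As a minimal sketch of the two forms of the syntax, the snippet below contrasts group_by() on its own with group_by() followed by summarise(). The small `sales` data frame is made up for illustration and is not the Superstore dataset.

```r
library(dplyr)

# Hypothetical toy data, not the Sample_Superstore file
sales <- data.frame(
  Region = c("East", "West", "East", "West"),
  Sales  = c(100, 200, 50, 25)
)

# group_by() alone keeps every row; it only records the grouping
grouped <- sales %>% group_by(Region)
nrow(grouped)        # same number of rows as `sales`
group_vars(grouped)  # names of the grouping column(s)

# summarise() then collapses each group to a single row
totals <- grouped %>% summarise(total_sales = sum(Sales))
print(totals)
```

The grouped object still holds all four rows; only after summarise() is there one row per Region.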

The dataset in use:

Sample_Superstore

group_by() on a single column

This is the simplest case: pass the name of the column to group by to group_by(), and pass the action to perform on each group to summarise().

Example: Grouping by a single column with group_by()

R

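library(dplyr)

# Load the Superstore data
df = read.csv("Sample_Superstore.csv")

# Total sales and profit for each region
df_grp_region = df %>%
  group_by(Region) %>%
  summarise(total_sales = sum(Sales),
            total_profits = sum(Profit),
            .groups = 'drop')

# Open the result in the data viewer
View(df_grp_region)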

Output:

[Image: df_grp_region, one row per Region with columns total_sales and total_profits]

group_by() on multiple columns

group_by() can also group by two or more columns. The order of the column names matters: rows are grouped by the first column, and then by the second column within each group of the first, so the summary has one row for each combination of values.

Example: Grouping by multiple columns

R

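library(dplyr)

# Load the Superstore data
df = read.csv("Sample_Superstore.csv")

# Total sales and profit for each Region/Category combination
df_grp_reg_cat = df %>%
  group_by(Region, Category) %>%
  summarise(total_Sales = sum(Sales),
            total_Profit = sum(Profit),
            .groups = 'drop')

# Open the result in the data viewer
View(df_grp_reg_cat)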

Output:

[Image: df_grp_reg_cat, one row per Region and Category combination with columns total_Sales and total_Profit]
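The examples pass .groups = 'drop' to summarise() without comment. The sketch below, on a made-up `orders` data frame (an assumption, not the Superstore data), shows what that argument changes when grouping by two columns: with "drop" the result is ungrouped, while with "drop_last" only the last grouping column is removed and the result stays grouped by Region.

```r
library(dplyr)

# Hypothetical toy data for illustration only
orders <- data.frame(
  Region   = c("West", "East", "East", "West", "East"),
  Category = c("Furniture", "Technology", "Furniture", "Furniture", "Technology"),
  Sales    = c(10, 20, 30, 40, 50)
)

# One row per Region/Category combination, returned ungrouped
by_reg_cat <- orders %>%
  group_by(Region, Category) %>%
  summarise(total_sales = sum(Sales), .groups = "drop")

# Same totals, but the result remains grouped by Region
still_grouped <- orders %>%
  group_by(Region, Category) %>%
  summarise(total_sales = sum(Sales), .groups = "drop_last")

print(by_reg_cat)
group_vars(by_reg_cat)     # no grouping columns left
group_vars(still_grouped)  # "Region"
```

Dropping the groups is usually the safer choice when the summary is the final result, since later verbs would otherwise keep operating per Region.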

We can also calculate the mean, count, minimum, or maximum by replacing sum() inside summarise() with mean(), n(), min(), or max(). For example, we will find the mean sales and profits for the same group_by() example above.

Example:

R

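library(dplyr)

# Load the Superstore data
df = read.csv("Sample_Superstore.csv")

# Mean sales and profit for each Region/Category combination
df_grp_reg_cat = df %>%
  group_by(Region, Category) %>%
  summarise(mean_Sales = mean(Sales),
            mean_Profit = mean(Profit),
            .groups = 'drop')

# Open the result in the data viewer
View(df_grp_reg_cat)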

Output:

[Image: df_grp_reg_cat, one row per Region and Category combination with columns mean_Sales and mean_Profit]
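The other aggregates can be combined in a single summarise() call. The following is a small sketch on a made-up `orders` data frame (not the Superstore data) that computes the count with n(), plus the mean, minimum, and maximum of Sales for each Region.

```r
library(dplyr)

# Hypothetical toy data for illustration only
orders <- data.frame(
  Region = c("East", "East", "West", "West", "West"),
  Sales  = c(20, 40, 10, 30, 50)
)

stats <- orders %>%
  group_by(Region) %>%
  summarise(
    n_orders   = n(),          # number of rows in each group
    mean_sales = mean(Sales),
    min_sales  = min(Sales),
    max_sales  = max(Sales),
    .groups = "drop"
  )

print(stats)
```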

Last Updated : 31 Aug, 2021

Author: Francesca Jacobs Ret