The Power of mutate( ) for Data Wrangling in R (2024)

A selection of code snippets that will help you transform data using the R language.


Most of the data wrangling and transformation posts I see around use Pandas. Being a Pandas lover myself, I have written a few of those already. But it is good to know that there are other great tools that will help you make the same transformations. Sometimes they make it easier, sometimes not so much.

Tidyverse

When I learned the R language a couple of years ago, I had to import many libraries for data wrangling, like dplyr, tidyr, etc. Now that the tidyverse package is available, you just add the line library(tidyverse) and you will be good to go with all of the transformations shown in this post.

Reminder: to install a library in R, just run install.packages("lib name").
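
As a minimal sketch of that reminder, the snippet below installs a package only when it is missing and then loads it. It uses dplyr rather than the full tidyverse because mutate() lives in dplyr; that choice is an assumption made to keep the example light, and swapping in "tidyverse" works the same way.

```r
# Install dplyr only if it is not already available.
# library(tidyverse) would attach dplyr too, along with tidyr and others.
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}

# Load the package so mutate() can be called directly
library(dplyr)
```

Checking with requireNamespace() first avoids reinstalling the package every time the script runs.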

Let’s get coding. And to kick us off I’ll create this simple dataset to serve as our example.

# creating a dataframe
df <- data.frame(col1=c(1,2,3,4,5,7,6,8,9,7),
col2=c(2,3,4,5,6,5,5,4,6,3),
col3=c(5,7,8,9,9,3,5,3,8,9),
col4=c(43,54,6,3,8,5,6,4,4,3))
   col1 col2 col3 col4
1     1    2    5   43
2     2    3    7   54
3     3    4    8    6
4     4    5    9    3
5     5    6    9    8
6     7    5    3    5
7     6    5    5    6
8     8    4    3    4
9     9    6    8    4
10    7    3    9    3

mutate( )

mutate() is a dplyr function that adds new variables and preserves existing ones, as the documentation puts it. So when you want to add new variables or change one already in the dataset, it is a good ally.
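
Both uses can be sketched as follows, assuming dplyr is installed. The column name col_sum and the rescaling of col4 are illustrative choices, not part of the original post.

```r
library(dplyr)

# Same example data frame as in this post
df <- data.frame(col1 = c(1, 2, 3, 4, 5, 7, 6, 8, 9, 7),
                 col2 = c(2, 3, 4, 5, 6, 5, 5, 4, 6, 3),
                 col3 = c(5, 7, 8, 9, 9, 3, 5, 3, 8, 9),
                 col4 = c(43, 54, 6, 3, 8, 5, 6, 4, 4, 3))

# Add a new column: col_sum (hypothetical name) is the row-wise sum
df_new <- mutate(df, col_sum = col1 + col2)

# Change an existing column: reusing the name col4 overwrites it
df_changed <- mutate(df, col4 = col4 / 10)

names(df_new)             # "col1" "col2" "col3" "col4" "col_sum"
head(df_changed$col4, 3)  # 4.3 5.4 0.6
```

Note that mutate() returns a new data frame. The original df stays untouched unless the result is assigned back to it.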

Given our dataset df, we can easily add columns with calculations.

# Add mean, std and median of columns
mutate(df, mean_col1 = mean(col1),
       std_col2 = sd(col2),
       median_col3 = median(col3))
   col1 col2 col3 col4 mean_col1 std_col2 median_col3
1     1    2    5   43       5.2 1.337494         7.5
2     2    3    7   54       5.2 1.337494         7.5
3 3 4 8 6…
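
Each of these summary functions returns a single number, and mutate() repeats that number on every row, which is why the three new columns are constant. A small check, assuming dplyr is installed, that reproduces the values in the output:

```r
library(dplyr)

# Same example data frame as in this post
df <- data.frame(col1 = c(1, 2, 3, 4, 5, 7, 6, 8, 9, 7),
                 col2 = c(2, 3, 4, 5, 6, 5, 5, 4, 6, 3),
                 col3 = c(5, 7, 8, 9, 9, 3, 5, 3, 8, 9),
                 col4 = c(43, 54, 6, 3, 8, 5, 6, 4, 4, 3))

result <- mutate(df,
                 mean_col1 = mean(col1),
                 std_col2 = sd(col2),
                 median_col3 = median(col3))

# Each new column holds one value repeated on all 10 rows
unique(result$mean_col1)           # 5.2
round(unique(result$std_col2), 6)  # 1.337494
unique(result$median_col3)         # 7.5
```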

As a practitioner of data science and programming languages, particularly R, I offer some commentary on the article by Gustavo Santos, published on November 17, 2021, with the subtitle "A selection of code snippets that will help you to transform data using R Language."

First, the article covers data wrangling and transformations, a domain often dominated by Pandas in the Python ecosystem. My own experience has led me to appreciate the versatility of the R language, which matches the sentiments Gustavo Santos expresses in his piece.

One notable tool mentioned in the article is the tidyverse package, which simplified data manipulation in R. When I started with R a few years ago, effective data wrangling meant importing multiple libraries such as dplyr and tidyr one by one. With the tidyverse package, a single line of code, library(tidyverse), makes all of the transformations shown in the article available.

Gustavo provides a practical reminder on installing R libraries using the install.packages("lib name") syntax, a fundamental step for anyone venturing into the world of R programming.

Now, let's dissect the code snippets featured in the article:

  1. Creating a Dataset: The author initializes a data frame (df) with four columns (col1, col2, col3, col4) and populates it with numeric values. This dataset serves as the foundation for subsequent transformations.

  2. The mutate() function: The article introduces mutate(), a function from the dplyr package. It allows users to add new variables while preserving existing ones. In the provided example, mutate() is used to calculate and append new columns to the dataset df.

  3. Adding Calculated Columns: With the mutate() function, the article demonstrates how to effortlessly compute and add new columns to the dataset. In this case, mean, standard deviation, and median of existing columns (col1, col2, col3) are computed and appended as new columns (mean_col1, std_col2, median_col3).

The presented code snippets offer a glimpse into the power and simplicity of R Language, especially when leveraged alongside the Tidyverse package. As an enthusiast and practitioner in the field, I wholeheartedly endorse the exploration of alternative tools like R for data manipulation and encourage fellow data enthusiasts to delve into the rich ecosystem it provides.

Author: Foster Heidenreich CPA
