How to Use the mutate() Function in R (2024)

This article demonstrates how to add new variables to a data frame using the mutate() function from R’s dplyr package.

The dplyr package provides the following functions for adding or modifying variables in a data frame.

mutate() – adds new variables to a data frame while keeping the existing ones.

transmute() – adds new variables and drops the existing ones.

mutate_all() – modifies every variable in a data frame at once.

mutate_at() – modifies specific variables selected by name.

mutate_if() – modifies all variables that satisfy a given condition.

The mutate() function adds new variables while preserving a data frame’s existing variables. Its basic syntax is as follows.

new_data <- mutate(data, new_variable = existing_variable / 3)

new_data: the data frame returned by mutate(), containing both the existing and the new variables

data: the existing data frame to which the new variable is added

new_variable: the name of the new variable

existing_variable: the variable in the existing data frame that is transformed to produce the new variable

As an illustration, the code that follows adds a new variable called root_sepal_width to the built-in iris dataset.

First, define a data frame containing the first six rows of the iris dataset.

data <- head(iris)
data

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

library(dplyr)

Define the new column root_sepal_width as the square root of the Sepal.Width variable.

data %>% mutate(root_sepal_width = sqrt(Sepal.Width))

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species root_sepal_width
1          5.1         3.5          1.4         0.2  setosa         1.870829
2          4.9         3.0          1.4         0.2  setosa         1.732051
3          4.7         3.2          1.3         0.2  setosa         1.788854
4          4.6         3.1          1.5         0.2  setosa         1.760682
5          5.0         3.6          1.4         0.2  setosa         1.897367
6          5.4         3.9          1.7         0.4  setosa         1.974842
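Because mutate() evaluates its arguments in order, a variable created earlier in a call can be used by a later one in the same call. The following is a minimal sketch of that behaviour, assuming dplyr is installed; the column root_sepal_width_sq is a hypothetical name introduced only for this illustration.

```r
library(dplyr)

data <- head(iris)

# Arguments are evaluated left to right, so root_sepal_width already
# exists when root_sepal_width_sq is computed.
result <- data %>%
  mutate(
    root_sepal_width = sqrt(Sepal.Width),
    root_sepal_width_sq = root_sepal_width^2  # hypothetical check column
  )

# Squaring the root should recover the original Sepal.Width values
# (up to floating-point tolerance).
all.equal(result$root_sepal_width_sq, result$Sepal.Width)
```

Creating related variables in one call keeps the transformation in a single step instead of chaining several mutate() calls.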


The transmute() function adds new variables to a data frame and drops all existing ones. The code that follows demonstrates how to add two new variables to a dataset while removing every existing variable.

First, define a data frame containing the first six rows of the iris dataset.

data <- head(iris)
data

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Create two new variables, then get rid of all the others.

data %>% transmute(root_sepal_width = sqrt(Sepal.Width),
                   root_petal_width = sqrt(Petal.Width))

  root_sepal_width root_petal_width
1         1.870829        0.4472136
2         1.732051        0.4472136
3         1.788854        0.4472136
4         1.760682        0.4472136
5         1.897367        0.4472136
6         1.974842        0.6324555
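The same result can also be obtained from mutate() itself through its .keep argument, which recent dplyr releases document as an alternative to transmute(). The sketch below assumes dplyr 1.0.0 or later, where .keep is available; the object names via_transmute and via_mutate are chosen here purely for illustration.

```r
library(dplyr)

data <- head(iris)

# transmute(): keeps only the variables created in the call.
via_transmute <- data %>%
  transmute(root_sepal_width = sqrt(Sepal.Width),
            root_petal_width = sqrt(Petal.Width))

# Assumed alternative: mutate() with .keep = "none" (dplyr >= 1.0.0)
# also drops the columns that were not created in the call.
via_mutate <- data %>%
  mutate(root_sepal_width = sqrt(Sepal.Width),
         root_petal_width = sqrt(Petal.Width),
         .keep = "none")

names(via_transmute)
names(via_mutate)
```

Both calls should return a data frame holding only the two new columns, so the choice is mostly a matter of which form reads more clearly in a given pipeline.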


The mutate_all() function modifies every variable in a data frame at once by applying a function you supply to each variable. Older tutorials wrap that function in funs(), but funs() has been deprecated since dplyr 0.8.0; pass a function, a formula such as ~ . / 10, or a named list of these instead.

The code below demonstrates how to use mutate_all() to divide each column in a data frame by ten.

First, define a new data frame containing the first six rows of iris without the Species variable.

data2 <- head(iris) %>% select(-Species)
data2

Divide each of the data frame’s variables by 10.

data2 %>% mutate_all(~ . / 10)

  Sepal.Length Sepal.Width Petal.Length Petal.Width
1         0.51        0.35         0.14        0.02
2         0.49        0.30         0.14        0.02
3         0.47        0.32         0.13        0.02
4         0.46        0.31         0.15        0.02
5         0.50        0.36         0.14        0.02
6         0.54        0.39         0.17        0.04

You can also keep the original variables and add the transformed ones as new columns by naming the function; the name is appended as a suffix to each existing variable name.

data2 %>% mutate_all(list(mod = ~ . / 10))

  Sepal.Length Sepal.Width Petal.Length Petal.Width Sepal.Length_mod
1          5.1         3.5          1.4         0.2             0.51
2          4.9         3.0          1.4         0.2             0.49
3          4.7         3.2          1.3         0.2             0.47
4          4.6         3.1          1.5         0.2             0.46
5          5.0         3.6          1.4         0.2             0.50
6          5.4         3.9          1.7         0.4             0.54
  Sepal.Width_mod Petal.Length_mod Petal.Width_mod
1            0.35             0.14            0.02
2            0.30             0.14            0.02
3            0.32             0.13            0.02
4            0.31             0.15            0.02
5            0.36             0.14            0.02
6            0.39             0.17            0.04
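Since dplyr 1.0.0, the scoped variants such as mutate_all() have been superseded by across() used inside mutate(). The sketch below shows one way the two mutate_all() examples might be written in that style, assuming dplyr 1.0.0 or later; the object names scaled and with_mod are illustrative only.

```r
library(dplyr)

data2 <- head(iris) %>% select(-Species)

# Overwrite every column in place with its value divided by 10.
scaled <- data2 %>%
  mutate(across(everything(), ~ .x / 10))

# Keep the originals and add "_mod" copies; the .names glue pattern
# controls how the new columns are named.
with_mod <- data2 %>%
  mutate(across(everything(), ~ .x / 10, .names = "{.col}_mod"))

names(with_mod)
```

The .names pattern makes the naming rule explicit, which can be easier to read than relying on the suffix derived from a named list.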


The mutate_at() function modifies specific variables selected by name. The code below demonstrates how to use mutate_at() to divide two particular variables by 10:

data2 %>% mutate_at(c("Sepal.Length", "Sepal.Width"), list(mod = ~ . / 10))

  Sepal.Length Sepal.Width Petal.Length Petal.Width Sepal.Length_mod
1          5.1         3.5          1.4         0.2             0.51
2          4.9         3.0          1.4         0.2             0.49
3          4.7         3.2          1.3         0.2             0.47
4          4.6         3.1          1.5         0.2             0.46
5          5.0         3.6          1.4         0.2             0.50
6          5.4         3.9          1.7         0.4             0.54
  Sepal.Width_mod
1            0.35
2            0.30
3            0.32
4            0.31
5            0.36
6            0.39
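Selecting columns by name can also be expressed with across(). The sketch below assumes dplyr 1.0.0 or later and uses all_of() to select from a character vector; the vector name cols and the object name result are hypothetical and introduced only for this example.

```r
library(dplyr)

data2 <- head(iris) %>% select(-Species)

# Hypothetical character vector naming the columns to transform.
cols <- c("Sepal.Length", "Sepal.Width")

# all_of() selects exactly the columns named in cols and raises an
# error if any of them is missing from the data frame.
result <- data2 %>%
  mutate(across(all_of(cols), ~ .x / 10, .names = "{.col}_mod"))

names(result)
```

Storing the column names in a vector is convenient when the same set of variables is reused across several steps.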


All variables that match a specific condition are modified by the mutate_if() function.

The mutate_if() function can be used to change any variables of type factor to type character, as shown in the code below.

data <- head(iris)
sapply(data, class)

Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
   "numeric"    "numeric"    "numeric"    "numeric"     "factor"

Convert every factor variable to a character variable.

new_data <- data %>% mutate_if(is.factor, as.character)
sapply(new_data, class)

Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
   "numeric"    "numeric"    "numeric"    "numeric"  "character"

The mutate_if() function can also round every numeric variable to the nearest whole number, as the following example shows.

Start again with the first six rows of the iris dataset.

data <- head(iris)
data

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Round every numeric variable to the nearest whole number.

data %>% mutate_if(is.numeric, round, digits = 0)

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1            5           4            1           0  setosa
2            5           3            1           0  setosa
3            5           3            1           0  setosa
4            5           3            2           0  setosa
5            5           4            1           0  setosa
6            5           4            2           0  setosa
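Condition-based selection can likewise be written with across() and where(). The sketch below combines the two mutate_if() examples into a single call, assuming dplyr 1.0.0 or later; the object name converted is illustrative only.

```r
library(dplyr)

data <- head(iris)

converted <- data %>%
  mutate(
    # Convert every factor column to character.
    across(where(is.factor), as.character),
    # Round every numeric column to the nearest whole number.
    across(where(is.numeric), ~ round(.x, digits = 0))
  )

sapply(converted, class)
```

Passing the predicate through where() keeps the condition next to the transformation it governs, so several conditions can sit side by side in one mutate() call.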

