How to Create, Rename, Recode and Merge Variables in R (2024)

Creating a new variable, or transforming an existing variable into a new one, is usually a simple task in R.

New variables are usually created with the assignment operator, in the form newvariable <- oldvariable. In a data frame, each new variable is added as a new column. The arithmetic operators * (multiplication), + (addition), - (subtraction) and / (division) are commonly used to compute new variables from existing ones.

Let's create a dataset:

hospital <- c("New York", "California")
patients <- c(150, 350)
costs <- c(3.1, 2.5)
df <- data.frame(hospital, patients, costs)

The dataset we created is called df:

df
hospital    patients  costs
New York    150       3.1
California  350       2.5

Now we will create a new variable called totcosts, as shown below:

df$totcosts <- df$patients * df$costs

Let's see the dataset again:

df
hospital    patients  costs  totcosts
New York    150       3.1    465
California  350       2.5    875
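The other arithmetic operators work the same way. The following is a minimal sketch rather than part of the original example: the new variable names (patient_share, costs_plus_fee, patients_diff) and the flat fee of 0.5 are made up for illustration.

```r
# Recreate the example dataset
hospital <- c("New York", "California")
patients <- c(150, 350)
costs <- c(3.1, 2.5)
df <- data.frame(hospital, patients, costs)

# Division: share of all patients treated in each hospital
df$patient_share <- df$patients / sum(df$patients)

# Addition: costs plus a hypothetical flat fee of 0.5
df$costs_plus_fee <- df$costs + 0.5

# Subtraction: difference from the mean number of patients
df$patients_diff <- df$patients - mean(df$patients)

df
```

Each operation is applied element by element, so every new column has one value per row of df.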

Next, we want to rename and recode a variable in R.
Using the dataset above, we rename the variable costs by first copying it under the new name costs_euro:

df$costs_euro <- df$costs 

Then we delete the old variable by assigning NULL to it:

df$costs <- NULL

Let's see the dataset again:

df
hospital    patients  totcosts  costs_euro
New York    150       465       3.1
California  350       875       2.5
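As an alternative to copying and deleting, a column name can be changed in place with names(). This is a sketch in base R, not the approach used above, and it rebuilds a small version of the dataset so that it runs on its own.

```r
# Rebuild the example dataset with the original column names
df <- data.frame(
  hospital = c("New York", "California"),
  patients = c(150, 350),
  costs = c(3.1, 2.5)
)

# Replace the name "costs" with "costs_euro"; the data are left untouched
names(df)[names(df) == "costs"] <- "costs_euro"

names(df)
```

With this approach the column keeps its position in the data frame, whereas copying and deleting moves the renamed variable to the end.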

Here is an example of how to recode the variable patients:

df$patients <- ifelse(df$patients==150, 100, ifelse(df$patients==350, 300, NA))

Let's see the dataset again:

df
hospital    patients  totcosts  costs_euro
New York    100       465       3.1
California  300       875       2.5

To recode the variable I used the function ifelse(), but you can use other functions as well.
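One possible alternative to nested ifelse() calls is a named lookup vector, which can be easier to read when there are many values to recode. The sketch below is an illustration only; the name lookup is an assumption and does not appear in the original example.

```r
# Rebuild the relevant part of the example dataset
df <- data.frame(
  hospital = c("New York", "California"),
  patients = c(150, 350)
)

# Named vector: the names are the old values, the elements are the new values
lookup <- c("150" = 100, "350" = 300)

# Index the lookup by the old values; values without a match become NA
df$patients <- unname(lookup[as.character(df$patients)])

df
```

As with the ifelse() version, any value that is not listed in the lookup is recoded to NA.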

Merging datasets

Merging datasets means combining different datasets into one. If the datasets are stored in different locations, you first need to import them into R, as we explained previously. You can merge by columns, which adds new variables, or by rows, which adds new observations.

To add columns, use the function merge(), which requires that the datasets you merge have a common variable. If the datasets don't have a common variable, use the function cbind(). However, cbind() requires that the rows of both datasets are in the same order.

Here we merge dataset1 and dataset2 by the variable id, which is present in both datasets. The code below adds new columns:

finaldt <- merge(dataset1, dataset2, by="id")
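To make the behaviour of merge() concrete, here is a small sketch with invented data. The contents of dataset1 and dataset2 and the name finaldt_all are assumptions for illustration. By default merge() keeps only the ids found in both datasets, while all = TRUE keeps every id and fills the gaps with NA.

```r
# Hypothetical example data sharing the key variable "id"
dataset1 <- data.frame(id = c(1, 2, 3), patients = c(150, 350, 200))
dataset2 <- data.frame(id = c(3, 1, 4), costs = c(4.0, 3.1, 1.8))

# Default: keep only the ids present in both datasets
finaldt <- merge(dataset1, dataset2, by = "id")

# all = TRUE keeps every id from both datasets and fills missing values with NA
finaldt_all <- merge(dataset1, dataset2, by = "id", all = TRUE)

finaldt
finaldt_all
```

Note that the rows of dataset2 are deliberately out of order: merge() matches rows by the value of id, not by position.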

Or we can merge datasets by adding columns when we know that both datasets are correctly ordered:

finaldt <- cbind(dataset1, dataset2)
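Because cbind() pairs rows purely by position, it can help to check the datasets before binding them. The sketch below uses invented data and assumes that the rows of both datasets already describe the same hospitals in the same order.

```r
# Hypothetical datasets without a common variable, already in the same row order
dataset1 <- data.frame(hospital = c("New York", "California"),
                       patients = c(150, 350))
dataset2 <- data.frame(costs = c(3.1, 2.5))

# Stop early if the datasets do not have the same number of rows
stopifnot(nrow(dataset1) == nrow(dataset2))

# Row 1 of dataset1 is placed next to row 1 of dataset2, and so on
finaldt <- cbind(dataset1, dataset2)

finaldt
```

The row count check cannot detect rows that are in the wrong order, so that part still has to be verified from how the data were collected.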

To add rows, use the function rbind(). When you merge datasets by rows, it is important that the datasets have exactly the same variable names and the same number of variables.

Here is an example of merging datasets by adding rows:

finaldt <- rbind(dataset1, dataset2)
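A quick check of the variable names before calling rbind() can catch mismatches early. The following sketch uses invented data. For data frames, rbind() matches columns by name, so the column order may differ as long as the names are the same.

```r
# Hypothetical datasets with the same variables
dataset1 <- data.frame(hospital = "New York", patients = 150, costs = 3.1)
# Same variable names, listed in a different column order
dataset2 <- data.frame(costs = 2.5, hospital = "California", patients = 350)

# Confirm that both datasets contain the same set of variable names
stopifnot(setequal(names(dataset1), names(dataset2)))

# The result uses the column order of the first dataset
finaldt <- rbind(dataset1, dataset2)

finaldt
```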

If you have any questions, post a comment below.


Author: Kareem Mueller DO