R - Replace NA with Empty String in a DataFrame - Spark By {Examples} (2024)

How do you replace NA (missing values) with an empty string (blank) in an R dataframe? You can replace NA values with an empty string across the columns of an R dataframe (data.frame) by using the base R is.na() and replace() functions. When the dataframe mixes numeric and character columns, use dplyr::mutate_if() to replace only in character columns, or dplyr::mutate_at() to replace in multiple selected columns by index or name.

  • R base is.na() function
  • R base replace() function
  • dplyr::mutate_if() and tidyr::replace_na()
  • dplyr::mutate_at() and tidyr::replace_na()

Generally, NA values represent missing values, and most operations on them return NA or otherwise misleading results, so it is good practice to handle them before processing the data. The same techniques can also replace NA with zero (0) in R.
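As a small illustration of why missing values need handling, the sketch below uses a hypothetical numeric vector named scores (not part of the article's data) to show how NA propagates through an aggregate and how to count the missing entries first.

```r
# Hypothetical vector with one missing value (illustration only)
scores <- c(10, NA, 30)

total_with_na <- sum(scores)                # NA propagates through sum()
total_without <- sum(scores, na.rm = TRUE)  # NA is dropped before summing
na_count      <- sum(is.na(scores))         # number of missing entries

print(total_with_na)
print(total_without)
print(na_count)
```

Counting NA values up front is a cheap way to decide whether replacement is needed at all.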

1. Quick Examples of Replace NA Values with Empty String

Below are quick examples of how to replace dataframe column values from NA to blank space or an empty string in R.

# Quick Examples

# Example 1 - Replace NA values with blank using is.na()
my_dataframe[is.na(my_dataframe)] <- ""

# Example 2 - By using replace() & is.na()
my_dataframe <- replace(my_dataframe, is.na(my_dataframe), "")

# All examples below need these libraries
library("dplyr")
library("tidyr")

# Example 3 - Replace only string columns
my_dataframe <- my_dataframe %>% mutate_if(is.character, ~replace_na(., ""))

# Example 4 - Replace on selected columns by name
my_dataframe <- my_dataframe %>% mutate_at(c('name','gender'), ~replace_na(., ""))

# Example 5 - Replace on selected columns by index
my_dataframe <- my_dataframe %>% mutate_at(c(1,2), ~replace_na(., ""))

Let’s create a dataframe with some NA values, run these examples and validate the result.

# Create dataframe
my_dataframe <- data.frame(
  name = c('sravan', NA, 'chrisa', 'shivgami', NA),
  gender = c(NA, 'm', NA, 'f', NA)
)

# Display dataframe
print(my_dataframe)

Output:

# Output
      name gender
1   sravan   <NA>
2     <NA>      m
3   chrisa   <NA>
4 shivgami      f
5     <NA>   <NA>

2. Replace NA values with Empty String using is.na()

is.na() checks whether each value in the dataframe is NA. It returns TRUE where the value is NA and FALSE otherwise. Used inside [] (the index operator), the result selects only the NA positions, and the assignment writes an empty string to them. In this way, we can replace NA (missing values) with an empty string in an R dataframe.

Syntax:

# Syntax
df[is.na(df)] <- "value to replace"

where df is the input dataframe. Let’s run an example to replace NA values with an empty string in an R dataframe.

# Replace NA values with blank using is.na()
my_dataframe[is.na(my_dataframe)] <- ""

# Display the dataframe
print(my_dataframe)

Output:

# Output
      name gender
1   sravan
2               m
3   chrisa
4 shivgami      f
5

In the above output, we can see that the NA values are replaced with empty strings, which print as blanks.
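To make the indexing step more concrete, here is a minimal sketch on a smaller, hypothetical dataframe df. It prints the logical matrix that is.na() returns and then uses it as the index for the assignment. The names df and na_mask are illustrative assumptions, and stringsAsFactors = FALSE is set explicitly so the columns are character on older R versions as well.

```r
# Hypothetical three-row dataframe with missing values
df <- data.frame(
  name   = c("sravan", NA, "chrisa"),
  gender = c(NA, "m", NA),
  stringsAsFactors = FALSE
)

# is.na() on a dataframe returns a logical matrix of the same shape
na_mask <- is.na(df)
print(na_mask)

# Using the matrix as an index targets only the TRUE (NA) positions
df[na_mask] <- ""

# Confirm that no NA values remain
print(any(is.na(df)))
print(df)
```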

3. Replace NA values with Blank Space using replace()

Let’s see another way to replace NA values with an empty string, using replace(). It takes three parameters.

Syntax:

# Syntax
replace(df, is.na(df), "value to replace")

Parameters:

  1. the first parameter is the input dataframe.
  2. the second parameter is the result of is.na(), which marks the NA positions.
  3. the last parameter is the replacement value, here "" (blank), which is written to every position marked by the second parameter.

Example: Replace NA with blank space in the dataframe using replace()

# By using replace() & is.na()
my_dataframe <- replace(my_dataframe, is.na(my_dataframe), "")

# Display dataframe
print(my_dataframe)

Yields the same output as above.

Alternatively, you can write the above statement using the %>% (pipe) operator. To use it, load the dplyr library.

# Example 2 - Using %>%
library("dplyr")
my_dataframe <- my_dataframe %>% replace(is.na(my_dataframe), "")
print(my_dataframe)
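Since base R's replace() performs an indexed assignment on a copy of its first argument, the two base approaches should give the same dataframe. The sketch below checks this on a hypothetical dataframe df; the names via_index and via_replace are illustrative only.

```r
# Hypothetical dataframe with missing values
df <- data.frame(
  name   = c("sravan", NA, "chrisa", "shivgami", NA),
  gender = c(NA, "m", NA, "f", NA),
  stringsAsFactors = FALSE
)

# Approach 1: indexed assignment with is.na()
via_index <- df
via_index[is.na(via_index)] <- ""

# Approach 2: replace(), which returns a modified copy
via_replace <- replace(df, is.na(df), "")

# Compare the two results
print(identical(via_index, via_replace))
```

The practical difference is stylistic: replace() returns a new object, so it fits naturally into a pipeline, while the indexed form modifies the variable in place.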

4. Replace NA with Empty String only on Character Columns

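All examples above use a dataframe with only character columns, so replacing NA with an empty string is straightforward. In practice, a dataframe usually mixes numeric and character columns. Running the base R examples above on such data writes "" into the numeric columns as well, which silently converts them to character, and recent versions of tidyr::replace_na() raise a type error when asked to put a string into a numeric column. Hence, we need to restrict the change to character columns and leave numeric columns untouched.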
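The following base R sketch shows what happens when the is.na() replacement from the earlier sections is applied to a mixed dataframe. The dataframe mixed and the variables before and after are hypothetical names used only for this illustration.

```r
# Hypothetical dataframe with one numeric and one character column
mixed <- data.frame(
  id   = c(2, 1, NA),
  name = c("sravan", NA, "chrisa"),
  stringsAsFactors = FALSE
)

before <- class(mixed$id)   # column type before the replacement

# Replace every NA, including the one in the numeric column
mixed[is.na(mixed)] <- ""

after <- class(mixed$id)    # column type after the replacement

print(before)
print(after)
print(mixed$id)
```

Once the id column holds strings, numeric operations on it no longer work as intended, which is why the replacement is better limited to character columns.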

You can apply the change conditionally with dplyr::mutate_if(): is.character checks whether a column is of character type, and tidyr::replace_na() is then applied only to those columns.

# Create dataframe
my_dataframe <- data.frame(
  id = c(2, 1, 3, 4, NA),
  name = c('sravan', NA, 'chrisa', 'shivgami', NA),
  gender = c(NA, 'm', NA, 'f', NA)
)

# Load libraries
library("dplyr")
library("tidyr")

# Replace only character columns
my_dataframe <- my_dataframe %>% mutate_if(is.character, ~replace_na(., ""))
print(my_dataframe)

Yields the below output. Notice that the id column still has an NA value; it was ignored because it holds numeric values.

# Output
  id     name gender
1  2   sravan
2  1               m
3  3   chrisa
4  4 shivgami      f
5 NA
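If you prefer not to load dplyr and tidyr, a similar character-only replacement can be sketched in base R. This is an alternative to the mutate_if() call, not the article's method; the names df and is_chr are hypothetical.

```r
# Hypothetical mixed dataframe, same shape as the example above
df <- data.frame(
  id     = c(2, 1, 3, 4, NA),
  name   = c("sravan", NA, "chrisa", "shivgami", NA),
  gender = c(NA, "m", NA, "f", NA),
  stringsAsFactors = FALSE
)

# Logical vector marking which columns are character
is_chr <- vapply(df, is.character, logical(1))

# Replace NA with "" in the character columns only
df[is_chr] <- lapply(df[is_chr], function(col) replace(col, is.na(col), ""))

print(df)
```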

5. Replace NA with Empty String on Selected Multiple Columns

To replace NA with an empty string on multiple selected columns by name, use the mutate_at() function with a vector c() of column names.

# Replace on selected multiple columns
library("dplyr")
library("tidyr")
my_dataframe <- my_dataframe %>% mutate_at(c('name','gender'), ~replace_na(., ""))
print(my_dataframe)

Yields the same output as above.
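For comparison, selecting columns by name (or by position) can also be done with base R indexing. The sketch below is an assumption-labeled alternative to mutate_at(); the names df, cols, idx, by_name, and by_index are hypothetical.

```r
# Hypothetical mixed dataframe
df <- data.frame(
  id     = c(2, 1, 3, 4, NA),
  name   = c("sravan", NA, "chrisa", "shivgami", NA),
  gender = c(NA, "m", NA, "f", NA),
  stringsAsFactors = FALSE
)

# Select columns by name, then replace NA only within that subset
by_name <- df
cols <- c("name", "gender")
by_name[cols][is.na(by_name[cols])] <- ""

# The same idea with column positions instead of names
by_index <- df
idx <- c(2, 3)
by_index[idx][is.na(by_index[idx])] <- ""

print(by_name)
print(identical(by_name, by_index))
```

Selecting by name is usually more robust than selecting by index, because it keeps working if the column order changes.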

6. Replace NA with Empty String on Multiple Columns Selected by Index

Finally, to replace NA with an empty string on multiple selected R dataframe columns by index, use the mutate_at() function with a vector c() of index values.

# Replace on selected multiple index
library("dplyr")
library("tidyr")
my_dataframe <- my_dataframe %>% mutate_at(c(2,3), ~replace_na(., ""))
print(my_dataframe)

Yields the same output as above. You can find more details on the mutate() function and its variants in the R Documentation.
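As a side note, dplyr 1.0.0 and later mark mutate_if() and mutate_at() as superseded in favor of across(). The sketch below shows how the three selections used in this article might look with across(), assuming dplyr 1.0.0 or newer and tidyr are installed; the names df, by_type, by_name, and by_index are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical mixed dataframe
df <- data.frame(
  id     = c(2, 1, 3, 4, NA),
  name   = c("sravan", NA, "chrisa", "shivgami", NA),
  gender = c(NA, "m", NA, "f", NA),
  stringsAsFactors = FALSE
)

# Character columns only (counterpart of mutate_if)
by_type <- df %>% mutate(across(where(is.character), ~ replace_na(.x, "")))

# Columns selected by name (counterpart of mutate_at with names)
by_name <- df %>% mutate(across(c(name, gender), ~ replace_na(.x, "")))

# Columns selected by position (counterpart of mutate_at with indexes)
by_index <- df %>% mutate(across(c(2, 3), ~ replace_na(.x, "")))

print(by_type)
```

The older functions still work, so switching is a matter of preference for existing code.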

7. Conclusion

In this article, I have explained several ways to replace NA (missing values) with an empty string in an R dataframe by using the is.na() and replace() functions. When the dataframe mixes numeric and character columns, use dplyr::mutate_if() to replace only in character columns, or dplyr::mutate_at() to replace in multiple selected columns by index or name.



Author: Tyson Zemlak
