How to remove '.' from column names in a dataframe?

1) sqldf can deal with names having dots in them if you quote the names:

library(sqldf)
d0 <- read.csv(text = "A.B,C.D\n1,2")
sqldf('select "A.B", "C.D" from d0')

giving:

  A.B C.D
1   1   2

2) When reading the data with read.table or read.csv, pass check.names = FALSE so that the column names are left exactly as they appear in the file.

Compare:

Lines <- "A B,C D
1,2
3,4"
read.csv(text = Lines)
##   A.B C.D
## 1   1   2
## 2   3   4
read.csv(text = Lines, check.names = FALSE)
##   A B C D
## 1   1   2
## 2   3   4

However, in this example the names still contain embedded spaces, so they would still have to be quoted in sqldf.
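As a rough illustration of where the dots come from: with check.names = TRUE, read.csv passes the header through make.names(), which replaces characters that are not valid in R names with a dot. The sketch below shows that, and then one possible way to get quote-free names after reading with check.names = FALSE. Replacing spaces with underscores is my assumption about what you would want here, not something required by sqldf.

```r
# What check.names = TRUE does to the header, via make.names()
raw_names <- c("A B", "C D")
checked <- make.names(raw_names, unique = TRUE)
checked
## [1] "A.B" "C.D"

# Alternative: keep the raw names, then replace the spaces yourself
Lines <- "A B,C D\n1,2\n3,4"
DF <- read.csv(text = Lines, check.names = FALSE)
names(DF) <- gsub(" ", "_", names(DF), fixed = TRUE)
names(DF)
## [1] "A_B" "C_D"
```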

3) To simply remove the periods, if DF is a data frame:

names(DF) <- gsub(".", "", names(DF), fixed = TRUE)

or it might be nicer to convert the periods to underscores so that it is reversible:

names(DF) <- gsub(".", "_", names(DF), fixed = TRUE)

The last line can also be written with chartr:

names(DF) <- chartr(".", "_", names(DF))
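A small sketch of the "reversible" point, using a made-up data frame: translating dots to underscores and back recovers the original names. This only holds if none of the original names already contain an underscore, which is an assumption of the example.

```r
# Hypothetical data frame whose names contain dots but no underscores
DF <- data.frame(abc.def = 1, ewf.asd.fkl = 2)
original <- names(DF)

# Dots to underscores
names(DF) <- chartr(".", "_", names(DF))
converted <- names(DF)   # "abc_def" "ewf_asd_fkl"

# Underscores back to dots
names(DF) <- chartr("_", ".", names(DF))
identical(names(DF), original)   # TRUE
```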

To replace all the dots in the names you need gsub rather than sub, which replaces only the first occurrence in each name. For example:

test <- data.frame(abc.def = NA, ewf.asd.fkl = NA, qqit.vsf.addw.coil = NA)
names(test) <- gsub( ".",  "", names(test), fixed = TRUE)
test
  abcdef ewfasdfkl qqitvsfaddwcoil
1     NA        NA              NA
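To make the difference between sub and gsub concrete, here is a short comparison on two of the names above (the variable names nms, first_only and all_dots are just for illustration):

```r
nms <- c("abc.def", "ewf.asd.fkl")

# sub() stops after the first match in each string
first_only <- sub(".", "_", nms, fixed = TRUE)
first_only   # "abc_def" "ewf_asd.fkl"

# gsub() replaces every match
all_dots <- gsub(".", "_", nms, fixed = TRUE)
all_dots     # "abc_def" "ewf_asd_fkl"
```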

UPDATE (dplyr 0.8.0)

As of dplyr 0.8, funs() is soft-deprecated; use formula notation instead.

A dplyr way to do this, using stringr:

library(dplyr)
library(stringr)

data <- data.frame(abc.def = 1, ewf.asd.fkl = 2, qqit.vsf.addw.coil = 3)
renamed_data <- data %>%
  rename_all(~str_replace_all(.,"\\.","_")) # note we have to escape the '.' character with \\

Make sure you install the packages with install.packages().
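If you are on dplyr 1.0.0 or later, rename_all() is superseded by rename_with(). A minimal sketch of the same renaming with that function, assuming both dplyr (>= 1.0.0) and stringr are installed:

```r
library(dplyr)
library(stringr)

data <- data.frame(abc.def = 1, ewf.asd.fkl = 2, qqit.vsf.addw.coil = 3)

# rename_with() applies the function to all column names by default
renamed_data <- data %>%
  rename_with(~ str_replace_all(.x, "\\.", "_"))

names(renamed_data)
```

The values are untouched; only the names change.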

Remember that functions like str_replace_all use regular expressions, in which . is a wildcard that matches any character, so a literal dot has to be escaped as \\.
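A quick sketch of what goes wrong without the escape, and a few equivalent ways to match a literal dot. The string x is made up for the example.

```r
library(stringr)

x <- "abc.def"

# Unescaped: "." is a regex wildcard, so every character is removed
wildcard <- gsub(".", "", x)                       # ""

# Three ways to match a literal dot in base R
escaped   <- gsub("\\.", "", x)                    # "abcdef"
bracketed <- gsub("[.]", "", x)                    # "abcdef"
literal   <- gsub(".", "", x, fixed = TRUE)        # "abcdef"

# stringr equivalent of fixed = TRUE
with_fixed <- str_replace_all(x, fixed("."), "")   # "abcdef"
```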
