Convert type of multiple columns of a dataframe at once

I run into this a lot as well. It usually comes down to how you import the data. All of the read...() functions have an option (stringsAsFactors = FALSE or as.is = TRUE) that stops character strings from being converted to factors, so text stays character and things that look like numbers stay numeric. A problem arises when you have elements that are empty rather than NA, but na.strings = c("",...) should take care of that as well. I'd start by taking a hard look at your import process and adjusting it accordingly.
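As a rough illustration of fixing the types at import time, here is a minimal sketch using read.csv() on inline text. The sample data and the use of colClasses are assumptions added for illustration; they are not part of the original question.

```r
# Hypothetical CSV content; note the empty field in column y.
csv_text <- "x,y,z
1,red,15254
2,,15255
3,blue,15256"

foo <- read.csv(
  text = csv_text,
  stringsAsFactors = FALSE,            # keep text as character, not factor
  na.strings = c("", "NA"),            # treat empty fields as missing
  colClasses = c("character", "character", "numeric")  # one type per column
)

str(foo)
```

Declaring the types in colClasses means no conversion step is needed afterwards, which is usually simpler than repairing the columns later.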

But you could always write a function and pass the desired column types to it.

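convert.magic <- function(x, y) {
  # x: data frame; y: character vector giving the target type of each column
  for (i in seq_along(y)) {
    if (y[i] == "numeric") {
      # go through character first so factors yield their labels, not codes
      x[[i]] <- as.numeric(as.character(x[[i]]))
    }
    if (y[i] == "character") {
      x[[i]] <- as.character(x[[i]])
    }
  }
  return(x)
}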

foo <- convert.magic(foo, c("character", "character", "numeric"))

> str(foo)
'data.frame':   10 obs. of  3 variables:
 $ x: chr  "1" "2" "3" "4" ...
 $ y: chr  "red" "red" "red" "blue" ...
 $ z: num  15254 15255 15256 15257 15258 ...

Edit: See this related question for some simplifications and extensions on this basic idea.

Here is my comment on Brandon's answer, written out using switch:

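convert.magic <- function(obj, types) {
    for (i in seq_along(obj)) {
        # pick the coercion function by name; fail loudly on an unknown type
        FUN <- switch(types[i],
                      character = as.character,
                      numeric   = as.numeric,
                      factor    = as.factor,
                      stop("unsupported type: ", types[i]))
        obj[[i]] <- FUN(obj[[i]])
    }
    obj
}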

out <- convert.magic(foo,c('character','character','numeric'))
> str(out)
'data.frame':   10 obs. of  3 variables:
 $ x: chr  "1" "2" "3" "4" ...
 $ y: chr  "red" "red" "red" "blue" ...
 $ z: num  15254 15255 15256 15257 15258 ...

For truly large data frames you may want to use lapply instead of the for loop:

convert.magic1 <- function(obj, types) {
    out <- lapply(seq_along(obj), function(i) {
        FUN1 <- switch(types[i], character = as.character,
                       numeric = as.numeric, factor = as.factor)
        FUN1(obj[[i]])
    })
    names(out) <- colnames(obj)
    as.data.frame(out, stringsAsFactors = FALSE)
}
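The same idea can be written without an explicit index by pairing each column with its target type. The sketch below is one possible variant, not taken from the answers above: it uses Map() and looks up the as.*() function by name with match.fun(). The sample foo is hypothetical, since the original data frame is not shown.

```r
# Hypothetical sample data; the original `foo` is not shown in the thread.
foo <- data.frame(
  x = 1:5,
  y = factor(c("red", "red", "red", "blue", "blue")),
  z = c("15254", "15255", "15256", "15257", "15258"),
  stringsAsFactors = FALSE
)
types <- c("character", "character", "numeric")

# Build the name "as.<type>", fetch that function, and apply it to the column.
# Assigning into foo[] keeps the data frame structure and the column names.
foo[] <- Map(function(col, type) match.fun(paste0("as.", type))(col), foo, types)

str(foo)
```

This relies on every entry of types having a matching as.<type>() function, and it shares the usual caveat that as.numeric() applied to a factor returns the level codes.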

When doing this, be aware of some of the intricacies of coercing data in R. For example, converting from factor to numeric usually requires as.numeric(as.character(...)), because as.numeric() on a factor returns the internal level codes. Also be aware that data.frame() and as.data.frame() convert character to factor by default in R versions before 4.0.0.
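The factor pitfall is easy to see on a small vector. This is a minimal sketch with made-up values:

```r
f <- factor(c("10", "20", "10", "30"))

# as.numeric() on a factor returns the internal level codes
codes <- as.numeric(f)
print(codes)    # 1 2 1 3

# Converting to character first recovers the values that were stored
values <- as.numeric(as.character(f))
print(values)   # 10 20 10 30

# An equivalent idiom that converts only the levels, then indexes by the factor
values2 <- as.numeric(levels(f))[f]
```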


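I know I am quite late to answer, but a loop that sets each column's class through the attributes function is a simple solution to your problem. Because this assigns the class attribute rather than calling as.numeric() or as.character(), verify the result with str(foo) before relying on it.

cols <- c("x", "y", "z")   # called cols so that base::names() is not masked
chclass <- c("character", "character", "numeric")

for (i in seq_along(cols)) {
  attributes(foo[, cols[i]])$class <- chclass[i]
}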

If you want to detect each column's data type automatically rather than specify it manually (e.g. after data tidying), the function type.convert() may help.

The function type.convert() takes in a character vector and attempts to determine the optimal type for all elements (meaning that it has to be applied once per column).

df[] <- lapply(df, function(x) type.convert(as.character(x), as.is = TRUE))
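To show what the detection does, here is a small sketch on a hypothetical data frame whose columns are all stored as character. Passing as.is = TRUE is a choice made here so that text columns stay character instead of becoming factors.

```r
# Hypothetical data: every column starts out as character.
df <- data.frame(
  x = c("1", "2", "3"),
  y = c("red", "red", "blue"),
  z = c("1.5", "2.5", "NA"),
  w = c("TRUE", "FALSE", "TRUE"),
  stringsAsFactors = FALSE
)

# Apply type.convert() once per column; df[] keeps the data frame structure.
df[] <- lapply(df, function(col) type.convert(as.character(col), as.is = TRUE))

sapply(df, class)
# x: integer, y: character, z: numeric, w: logical
```

The string "NA" in z is read as a missing value because it matches the default na.strings of type.convert().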

Since I love dplyr, I prefer:

library(dplyr)
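# mutate_all() and funs() are deprecated; across() requires dplyr >= 1.0.0
df <- df %>% mutate(across(everything(), ~ type.convert(as.character(.x), as.is = TRUE)))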