How do I run apply on a data.table?

You can do this:

library("stringr")
dt[, -1] <- lapply(dt[, -1], function(x) str_replace(x," ","_"))

Completely rewritten on 2015-11-24, to fix an error in previous versions.

Also added more modern options on 2019-09-27

You have a few options.

  1. Process all of the target columns with an embedded call to lapply(), using := to assign the modified values in place. This relies on :='s very handy support for simultaneous assignment to several column named on its LHS.

  2. Use a for loop to run through the target columns one at a time, using set() to modify the value of each one in turn.

  3. Use a for loop to iterate over multiple "naive" calls to [.data.table(), each one of which modifies a single column.

These methods all seem about equally fast, so which one you use will be mostly a matter of taste. (1) is nicely compact and expressive. It's what I most often use, though you may find (2) easier to read. Because they process and modify the columns one at a time, (2) or (3) will have an advantage in the rare situation in which your data.table is so large that you are in danger of running up against limits imposed by your R session's available memory.

library(data.table)

## Create three identical 1000000-by-20 data.tables
DT1 <- data.table(1:1e6,
           as.data.table(replicate(1e6, paste(sample(letters, nr, TRUE),
                                             sample(letters, nr, TRUE)))))
cnames <- c("ID", paste0("X", 1:19))
setnames(DT1, cnames)
DT2 <- copy(DT1); DT3 <- copy(DT1)

## Method 1
system.time({
DT1[, .SDcols=cnames[-1L], cnames[-1L] := 
  lapply(.SD, function(x) gsub(" ", "_", x, fixed=TRUE)), ]
})
##   user  system elapsed 
##  10.90    0.11   11.06 

## Method 2
system.time({
    for(cname in cnames[-1]) {
        set(DT2, j=cname, value=gsub(" ", "_", DT2[[cname]], fixed=TRUE))
    }
})
##   user  system elapsed 
##  10.65    0.05   10.70 

## Method 3
system.time({
    for(cname in cnames[-1]) {
        DT3[ , (cname) := gsub(" ", "_", get(cname), fixed=TRUE)]
    }
})
##   user  system elapsed 
##  10.33    0.03   10.37 

For more details on set() and :=, read their help page, gotten by typing ?set or ?":=".