R: apply-like function that returns a data frame?

You can use the apply family but, you're right, the result is either a matrix or a list. Not a big deal though to get back to a data.frame.

Your function needs to return something consistent across columns (raw iris instead of iris[, 1:4] would not work below, because of iris$Species which is a factor with 3 levels where summary returns 6 numeric from a numeric column) and that's where a reproducible would help. Below, I used iris and summary:

  1. apply: as.data.frame(apply(iris[, 1:4], 2, summary))
  2. sapply: as.data.frame(sapply(iris[, 1:4], summary))
  3. lapply: do.call(cbind, lapply(iris[, 1:4], summary))

I have just implemented this function, which applies FUN over rows as lists and concatenates the result to a tibble:

library(magrittr)

lapply_rows <- function(df, return_tibble = TRUE, FUN, ...) {
  df_rownames <- rownames(df)

  res <- lapply(purrr::transpose(df), FUN = FUN, ...) %>%
    purrr::map_depth(2, function(x) {
      if (length(x) != 1) {
        return(list(x))
      } else {
        return(x)
      }
    }) %>%
    dplyr::bind_rows()

  if (!return_tibble) {
    res <- as.data.frame(res)
    rownames(res) <- df_rownames
  }

  return(res)
}

df is converted to a list of lists by purrr::transpose(df), where each sublist is one row of the original df. FUN must return a named list, which can also contain elements with a length other than one. These elements are then wrapped in list() (type of a column of a data.frame-like object could be also a list). If return_tibble is FALSE, result is coerced to data.frame and original rownames are set.

Example:

df <- lapply_rows(mtcars, FUN = function(row_list) {
  row_list$cyl_2 <- row_list$cyl ** 2
  row_list$colors <- c("red", "green", "blue")
  row_list$sublist <- mtcars[1:5, 1:5]
  return(row_list)
})

head(df)
# A tibble: 6 x 14
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb cyl_2 colors    sublist         
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list>    <list>          
1  21       6   160   110  3.9   2.62  16.5     0     1     4     4    36 <chr [3]> <df[,5] [5 × 5]>
2  21       6   160   110  3.9   2.88  17.0     0     1     4     4    36 <chr [3]> <df[,5] [5 × 5]>
3  22.8     4   108    93  3.85  2.32  18.6     1     1     4     1    16 <chr [3]> <df[,5] [5 × 5]>
4  21.4     6   258   110  3.08  3.22  19.4     1     0     3     1    36 <chr [3]> <df[,5] [5 × 5]>
5  18.7     8   360   175  3.15  3.44  17.0     0     0     3     2    64 <chr [3]> <df[,5] [5 × 5]>
6  18.1     6   225   105  2.76  3.46  20.2     1     0     3     1    36 <chr [3]> <df[,5] [5 × 5]>

Example returning a data.frame:

df2 <- lapply_rows(mtcars, return_tibble = FALSE, FUN = function(row_list) {
  row_list$cyl_2 <- row_list$cyl ** 2
  row_list$colors <- c("red", "green", "blue")
  row_list$sublist <- mtcars[1:5, 1:5]
  return(row_list)
})

head(df2)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb cyl_2           colors
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4    36 red, green, blue
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4    36 red, green, blue
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    16 red, green, blue
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1    36 red, green, blue
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2    64 red, green, blue
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1    36 red, green, blue
                                                                                                                                                                                       sublist
Mazda RX4         21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Mazda RX4 Wag     21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Datsun 710        21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Hornet 4 Drive    21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Hornet Sportabout 21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Valiant           21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15

(you can see that tibble is handling the <list> columns much better)

Tags:

R

Dataframe