Is there a function like switch which works inside of dplyr::mutate?

Eons too late for the OP, but in case this shows up in a search ...

dplyr v0.5 has recode(), a vectorized version of switch(), so

library(dplyr)

# tibble() replaces the deprecated data_frame()
tibble(
  x = sample(1:4, 10, replace = TRUE),
  y1 = rnorm(n = 10, mean = 7, sd = 2),
  y2 = rnorm(n = 10, mean = 5, sd = 2),
  y3 = rnorm(n = 10, mean = 7, sd = 1),
  y4 = rnorm(n = 10, mean = 5, sd = 1)
) %>%
  mutate(y = recode(x, y1, y2, y3, y4))

produces, as anticipated:

# A tibble: 10 x 6
       x        y1       y2       y3       y4        y
   <int>     <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
1      2  6.950106 6.986780 7.826778 6.317968 6.986780
2      1  5.776381 7.706869 7.982543 5.048649 5.776381
3      2  7.315477 2.213855 6.079149 6.070598 2.213855
4      3  7.461220 5.100436 7.085912 4.440829 7.085912
5      3  5.780493 4.562824 8.311047 5.612913 8.311047
6      3  5.373197 7.657016 7.049352 4.470906 7.049352
7      2  6.604175 9.905151 8.359549 6.430572 9.905151
8      3 11.363914 4.721148 7.670825 5.317243 7.670825
9      3 10.123626 7.140874 6.718351 5.508875 6.718351
10     4  5.407502 4.650987 5.845482 4.797659 4.797659

(This also works with named arguments, including when x is a character vector or a factor.)
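
As a rough sketch of the named form (the `fruit` values below are made up for illustration and are not from the question), the argument names are matched against the values of x rather than against positions:

```r
library(dplyr)

# Hypothetical toy values, only to illustrate named replacements
fruit <- c("a", "b", "a", "c")

# Character x: names are matched against the values of x;
# values without a matching name ("c" here) are left as they are
recode(fruit, a = "apple", b = "banana")

# Factor x: the levels are recoded and the result is still a factor
recode(factor(fruit), a = "apple", b = "banana")
```

Scalar replacements are used here to keep the example short; the positional call above passes whole columns instead.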

Do the operation once per value of x. This is the data.table version; I assume something similar can be done in dplyr:

library(data.table)

dt = data.table(x = c(1,1,2,2), a = 1:4, b = 4:7)

dt[, newcol := switch(as.character(x), '1' = a, '2' = b, NA), by = x]
dt
#   x a b newcol
#1: 1 1 4      1
#2: 1 2 5      2
#3: 2 3 6      6
#4: 2 4 7      7
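
For comparison, here is one way the same idea might look in dplyr. This is only a sketch under the assumption that grouping by x and reading the first element of x inside each group is acceptable; unlike data.table's `by`, dplyr passes the whole group's x to `mutate()`, so `switch()` needs a single value picked out explicitly:

```r
library(dplyr)

df <- tibble(x = c(1, 1, 2, 2), a = 1:4, b = 4:7)

res <- df %>%
  group_by(x) %>%
  # x is constant within a group, so its first element selects the branch
  mutate(newcol = switch(as.character(x[1]), "1" = a, "2" = b, NA_integer_)) %>%
  ungroup()

res$newcol
# 1 2 6 7
```

The default is `NA_integer_` rather than plain `NA` so that every group returns the same type as a and b.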

You can now use dplyr's function case_when with mutate().

To follow your example in generating the data:

library(dplyr)

df.faithful <- as_tibble(faithful)  # as_tibble() replaces the deprecated tbl_df()
df.faithful$x  <- sample(1:4, 272, replace=TRUE)
df.faithful$y1 <- rnorm(n=272, mean=7, sd=2)
df.faithful$y2 <- rnorm(n=272, mean=5, sd=2)
df.faithful$y3 <- rnorm(n=272, mean=7, sd=1)
df.faithful$y4 <- rnorm(n=272, mean=5, sd=1)

Now we define a new version of pick(), called pick2(), using case_when:

pick2 <- function(x, v1, v2, v3, v4) {
  out = case_when(
    x == 1 ~ v1,
    x == 2 ~ v2,
    x == 3 ~ v3,
    x == 4 ~ v4
  )
  return(out)
}

It can then be used directly within mutate():

df.faithful %>% 
  mutate(y = pick2(x, y1, y2, y3, y4))

And the output is:

# A tibble: 272 x 8
   eruptions waiting     x    y1    y2    y3    y4     y
       <dbl>   <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
 1      3.6       79     3  8.73  7.23  8.89  4.04  8.89
 2      1.8       54     3  9.97  4.31  7.06  5.05  7.06
 3      3.33      74     1  6.65  7.23  4.46  6.49  6.65
 4      2.28      62     1  6.40  4.39  5.41  3.49  6.40
 5      4.53      85     4  3.96  8.85  7.43  6.51  6.51
 6      2.88      55     4  6.36  8.08  5.82  5.06  5.06
 7      4.7       88     1  5.91  6.47  6.43  5.88  5.91
 8      3.6       85     1  7.77  4.55  6.56  5.05  7.77
 9      1.95      51     4  5.74  6.46  6.95  4.26  4.26
10      4.35      85     1  7.04  1.73  5.71  2.53  7.04
# ... with 262 more rows
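
Because the data above is random, the result is hard to verify by eye. A minimal check on made-up values (the `toy` tibble is hypothetical, not part of the original example) might look like this; it also shows that a value of x with no matching condition yields NA:

```r
library(dplyr)

pick2 <- function(x, v1, v2, v3, v4) {
  case_when(
    x == 1 ~ v1,
    x == 2 ~ v2,
    x == 3 ~ v3,
    x == 4 ~ v4
  )
}

# Hypothetical toy data: the tens digit tells you which column a value came from
toy <- tibble(
  x  = c(1, 2, 3, 4, 5),
  y1 = c(11, 12, 13, 14, 15),
  y2 = c(21, 22, 23, 24, 25),
  y3 = c(31, 32, 33, 34, 35),
  y4 = c(41, 42, 43, 44, 45)
)

res <- toy %>%
  mutate(y = pick2(x, y1, y2, y3, y4))

res$y
# 11 22 33 44 NA   (x == 5 matches no condition, so case_when() returns NA)
```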
