How to get frequency counts using column breaks by row?

As an alternative to rle(), you can use diff():

dat %>%
  group_by(name) %>%
  summarise(ever_inv = sum(diff(c(0, srvc_inv)) > 0))

#   A tibble: 1 x 2
#   name  ever_inv
#   <fct>    <int>
# 1 Bob          2

Assuming that srvc_inv is either 0 or 1, a difference equals 1 only where x[i] is 1 and x[i-1] is 0, that is, at the start of a run of 1s. Otherwise it is 0 or -1. I prepended a 0 to srvc_inv to cover the case where the vector starts with a run of 1s.
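To see why the leading 0 matters, here is a minimal base R sketch on a made-up vector (the values in x are hypothetical, not the asker's data):

```r
# Hypothetical 0/1 vector that starts inside a run of 1s
x <- c(1, 1, 0, 0, 1, 0, 1, 1)

# Without padding, the first run is missed: no 0 -> 1 step precedes it
starts_unpadded <- sum(diff(x) > 0)

# A leading 0 turns the start of the first run into a 0 -> 1 step
starts_padded <- sum(diff(c(0, x)) > 0)

print(c(unpadded = starts_unpadded, padded = starts_padded))
# unpadded = 2, padded = 3 (x contains three runs of 1s)
```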

With rle(), in my opinion, there is an even simpler solution:

dat %>%
  group_by(name) %>%
  summarise(ever_inv = sum(rle(srvc_inv)$values))

#   A tibble: 1 x 2
#   name  ever_inv
#   <fct>    <int>
# 1 Bob          2

Assuming that srvc_inv is either 0 or 1, it is enough to sum the values component of the rle object: each run contributes its value once, so the sum is the number of runs of 1s.
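As a rough illustration of what rle() returns, here is a small base R sketch on a hypothetical vector (not the asker's data):

```r
# Hypothetical 0/1 vector with three runs of 1s
x <- c(1, 1, 0, 0, 1, 0, 1, 1)

r <- rle(x)
print(r$lengths)  # 2 2 1 1 2
print(r$values)   # 1 0 1 0 1

# Each run appears once in values, so summing counts the runs of 1s
n_runs <- sum(r$values)
print(n_runs)     # 3
```

This shortcut relies on the values being exactly 0 and 1; with any other coding the sum would no longer be a count.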


One possibility could be:

dat %>%
 group_by(name) %>%
 mutate(rleid = with(rle(srvc_inv), rep(seq_along(lengths), lengths))) %>%
 summarise(ever_inv = n_distinct(rleid[srvc_inv == 1]))

  name  ever_inv
  <fct>    <int>
1 Bob          2
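The run-id step can be checked in isolation with a base R sketch. The vector x is hypothetical, and length(unique()) stands in for dplyr's n_distinct() so the snippet has no dependencies:

```r
# Hypothetical 0/1 vector with three runs of 1s
x <- c(1, 1, 0, 0, 1, 0, 1, 1)

# Give every run (of 0s or 1s) its own consecutive id
run_id <- with(rle(x), rep(seq_along(lengths), lengths))
print(run_id)  # 1 1 2 2 3 4 5 5

# Count the distinct ids that belong to runs of 1s
n_runs <- length(unique(run_id[x == 1]))
print(n_runs)  # 3
```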

One more solution based on base R's rle():

library(dplyr)
dat %>%
  group_by(name) %>%
  summarise(ever_inv = length(with(rle(srvc_inv), lengths[values == 1])))

# A tibble: 1 x 2
  name  ever_inv
  <fct>    <int>
1 Bob          2

Tags:

R