How to select the row with the maximum value in each group

Here's a data.table solution:

require(data.table) ## 1.9.2
group <- as.data.table(group)

If you want to keep all the entries corresponding to max values of pt within each group:

group[group[, .I[pt == max(pt)], by=Subject]$V1]
#    Subject pt Event
# 1:       1  5     2
# 2:       2 17     2
# 3:       3  5     2

If you'd like just the first max value of pt:

group[group[, .I[which.max(pt)], by=Subject]$V1]
#    Subject pt Event
# 1:       1  5     2
# 2:       2 17     2
# 3:       3  5     2

In this case, it doesn't make a difference, as there aren't multiple maximum values within any group in your data.


The most intuitive method is to use group_by and top_n function in dplyr

    group %>% group_by(Subject) %>% top_n(1, pt)

The result you get is

    Source: local data frame [3 x 3]
    Groups: Subject [3]

      Subject    pt Event
        (dbl) (dbl) (dbl)
    1       1     5     2
    2       2    17     2
    3       3     5     2

A shorter solution using data.table:

setDT(group)[, .SD[which.max(pt)], by=Subject]
#    Subject pt Event
# 1:       1  5     2
# 2:       2 17     2
# 3:       3  5     2

Tags:

R

R Faq

Dataframe