How to quickly form groups (quartiles, deciles, etc) by ordering column(s) in a data frame

There's a handy ntile function in package dplyr. It's flexible in the sense that you can very easily define the number of *tiles or "bins" you want to create.

Load the package (install first if you haven't) and add the quartile column:

library(dplyr)
temp$quartile <- ntile(temp$value, 4)  

Or, if you want to use dplyr syntax:

temp <- temp %>% mutate(quartile = ntile(value, 4))

Result in both cases is:

temp
#   name       value quartile
#1     a -0.56047565        1
#2     b -0.23017749        2
#3     c  1.55870831        4
#4     d  0.07050839        2
#5     e  0.12928774        3
#6     f  1.71506499        4
#7     g  0.46091621        3
#8     h -1.26506123        1
#9     i -0.68685285        1
#10    j -0.44566197        2
#11    k  1.22408180        4
#12    l  0.35981383        3

data:

Note that you don't need to create the "quartile" column in advance and use set.seed to make the randomization reproducible:

set.seed(123)
temp <- data.frame(name=letters[1:12], value=rnorm(12))

The method I use is one of these or Hmisc::cut2(value, g=4):

temp$quartile <- with(temp, cut(value, 
                                breaks=quantile(value, probs=seq(0,1, by=0.25), na.rm=TRUE), 
                                include.lowest=TRUE))

An alternate might be:

temp$quartile <- with(temp, factor(
                            findInterval( val, c(-Inf,
                               quantile(val, probs=c(0.25, .5, .75)), Inf) , na.rm=TRUE), 
                            labels=c("Q1","Q2","Q3","Q4")
      ))

The first one has the side-effect of labeling the quartiles with the values, which I consider a "good thing", but if it were not "good for you", or the valid problems raised in the comments were a concern you could go with version 2. You can use labels= in cut, or you could add this line to your code:

temp$quartile <- factor(temp$quartile, levels=c("1","2","3","4") )

Or even quicker but slightly more obscure in how it works, although it is no longer a factor, but rather a numeric vector:

temp$quartile <- as.numeric(temp$quartile)

I'll add the data.table version for anyone else Googling it (i.e., @BondedDust's solution translated to data.table and pared down a tad):

library(data.table)
setDT(temp)
temp[ , quartile := cut(value,
                        breaks = quantile(value, probs = 0:4/4),
                        labels = 1:4, right = FALSE)]

Which is much better (cleaner, faster) than what I had been doing:

temp[ , quartile := 
        as.factor(ifelse(value < quantile(value, .25), 1,
                         ifelse(value < quantile(value, .5), 2,
                                ifelse(value < quantile(value, .75), 3, 4))]

Note, however, that this approach requires the quantiles to be distinct, e.g. it will fail on rep(0:1, c(100, 1)); what to do in this case is open ended so I leave it up to you.