Fast sorted sample with replacement

Implementing @Frank's idea in Rcpp:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector sort_sample(int n) {
  IntegerVector tab(n);
  for (int i = 0; i < n; i++) {
    int k = n * unif_rand();
    tab[k]++;
  }
  return tab;
}

Benchmark:

microbenchmark::microbenchmark(
  sort(sample(n, replace = TRUE)),
  tabulate(sample(n, replace = TRUE), n),
  sort_sample(n)
)

For n = 1e5:

Unit: microseconds
                                   expr      min       lq      mean    median        uq      max neval
        sort(sample(n, replace = TRUE)) 3214.726 3393.606 3667.1001 3507.0525 3716.3560 7007.411   100
 tabulate(sample(n, replace = TRUE), n) 1087.543 1104.215 1245.0722 1136.9085 1218.5865 4265.075   100
                         sort_sample(n)  789.403  812.780  870.2723  837.3445  924.4395 1188.432   100

For n = 1e6:

Unit: milliseconds
                                   expr      min       lq     mean   median       uq       max neval
        sort(sample(n, replace = TRUE)) 49.96651 52.58784 61.19775 55.09312 58.43035 160.16275   100
 tabulate(sample(n, replace = TRUE), n) 13.74391 14.44253 17.22742 15.99816 17.54367  48.72374   100
                         sort_sample(n) 12.80741 14.40371 17.98320 15.31699 17.53548  63.21692   100

Tags:

Performance

R