Sample from vector of varying length (including 1)

When given a single number x, sample behaves like sample.int and draws from 1:x (see ?sample). To make sure it only samples from the vector you give it, work with indices and use this construct:

x[sample(length(x))]

This gives you the correct result regardless of the length of x, without having to add an if-condition that checks the length.
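
To make the difference concrete, here is a minimal sketch comparing a direct call with the index construct on a length-one vector. The seed and the value 10 are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed, only so repeated runs give the same draw

x <- 10

# Direct call: x is treated as an upper bound, so this permutes 1:10
direct <- sample(x)
length(direct)   # 10

# Index construct: permutes the positions of x, so only x itself can come back
indexed <- x[sample(length(x))]
indexed          # 10
```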

Example:

mylist <- list(
  a = 5,
  b = c(2,4),
  d = integer(0)
)

mysample <- lapply(mylist, function(x) x[sample(length(x))])

> mysample
$a
[1] 5

$b
[1] 2 4

$d
integer(0)

Note: you can replace sample with sample.int for a small speed gain.
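
As a sketch of that replacement, the same example can be written with sample.int. The list repeats the one from the example above; the name mysample_int is made up here.

```r
mylist <- list(
  a = 5,
  b = c(2, 4),
  d = integer(0)
)

# sample.int(n) draws from 1:n by definition, so a length-one x needs no special case
mysample_int <- lapply(mylist, function(x) x[sample.int(length(x))])

mysample_int$a   # 5
mysample_int$d   # integer(0)
```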


This is a documented feature of sample (quoted from ?sample):

If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x. Note that this convenience feature may lead to undesired behaviour when x is of varying length in calls such as sample(x).
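
The conditions in that passage can be checked directly. In this sketch the inputs are arbitrary: a length-one value that is not numeric, or is numeric but below 1, comes back unchanged, while a length-one number of at least 1 is expanded to 1:x.

```r
sample("a")    # "a": not numeric, so the shortcut does not apply
sample(0.5)    # 0.5: numeric but below 1, returned as is
sample(-3)     # -3: same reason

# Numeric, length one, and >= 1: sampling takes place from 1:4
s <- sample(4)
sort(s)        # 1 2 3 4
```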

An alternative is to write your own function to avoid the feature:

sample.vec <- function(x, ...) x[sample(length(x), ...)]
sample.vec(10)
# [1] 10
sample.vec(10, 3, replace = TRUE)
# [1] 10 10 10
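
For vectors longer than one element the wrapper should behave like sample itself, since the extra arguments are passed straight through to the index draw. A short check, with arbitrary input values:

```r
sample.vec <- function(x, ...) x[sample(length(x), ...)]

v <- c(3, 7, 9)

# Full permutation: same elements, possibly in a different order
p <- sample.vec(v)
sort(p)                      # 3 7 9

# size and replace are forwarded to sample()
s <- sample.vec(v, 5, replace = TRUE)
length(s)                    # 5
all(s %in% v)                # TRUE

# An empty vector comes back empty
sample.vec(numeric(0))       # numeric(0)
```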

Some functions with similar behavior are listed under the question "seq vs seq_along. When will using seq cause unintended results?"


You could use this 'bugfree' redefinition of the function:

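# Note: this masks base::sample in the environment where it is defined
sample <- function(x, size, replace = FALSE, prob = NULL) {
  # A length-one x is returned as is; size and replace are ignored in this case
  if (length(x) == 1) return(x)
  base::sample(x, size = size, replace = replace, prob = prob)
}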

Test it:

> sapply(1:7, base::sample, size = 1)
[1] 1 2 2 4 4 4 4
> sapply(1:7, sample)
[1] 1 2 3 4 5 6 7
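
One caveat worth checking: as written, the early return hands back x whenever it has length one, whatever size says. The sketch below uses a hypothetical name, sample1, for a copy of the function so that base::sample is not masked, and compares it with drawing indices via sample.int.

```r
# Copy of the redefinition under a hypothetical name, to avoid masking sample
sample1 <- function(x, size, replace = FALSE, prob = NULL) {
  if (length(x) == 1) return(x)
  base::sample(x, size = size, replace = replace, prob = prob)
}

sample1(10, size = 3, replace = TRUE)
# [1] 10

# Drawing indices instead honours size for the same input
x <- 10
x[sample.int(length(x), size = 3, replace = TRUE)]
# [1] 10 10 10
```

If size matters for length-one inputs, the index-based form is the safer of the two.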
