How to randomly select subsets

This question may be a duplicate but for the time being:

list = Range[300];

The number of subsets of length 25:

n = Binomial[300, 25]
1953265141442868389822364184842211512

Five samples:

samp = RandomInteger[{1, n}, 5]
{1097179597483122074395819626389736050,
 1278400886908268917844987164926797363,
 1855898035549513136165016617586671669,
 1005956584417012779260052361741534263,
 1845054078551378518016127833496347335}

Your subsets:

Subsets[list, {25}, {#}][[1]] & /@ samp
{{10,15,57,64,65,73,82,115,120,130,133,160,161,164,178,192,196,218,223,235,238,240,267,271,290},
{12,54,58,81,90,91,115,130,146,181,189,204,205,218,222,230,233,234,235,254,256,268,281,283,284},
{33,42,45,65,78,81,85,118,151,167,172,174,202,203,207,208,211,212,223,239,246,251,254,262,267},
{9,12,35,69,72,77,79,109,113,116,141,144,158,163,195,202,221,228,230,231,254,259,267,280,292},
{32,39,49,53,62,102,104,132,135,159,164,167,169,172,191,211,244,245,253,263,265,271,282,283,286}}

Be aware that RandomInteger can produce duplicate samples, although for the example given this is extremely unlikely. You can generate extra samples and then use DeleteDuplicates and Take as needed.
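
One way to guarantee distinct ranks is to keep drawing until enough unique values have accumulated. A minimal sketch, reusing n from above; the names k and ranks are illustrative and not part of the original code:

```mathematica
n = Binomial[300, 25];
k = 5;        (* number of distinct ranks wanted; assumes k <= n *)
ranks = {};
(* top up with fresh draws until k unique ranks have been collected *)
While[Length[ranks] < k,
 ranks = DeleteDuplicates[
   Join[ranks, RandomInteger[{1, n}, k - Length[ranks]]]]];
(* unrank each value into its 25-element subset *)
First[Subsets[Range[300], {25}, {#}]] & /@ ranks
```

Each pass only redraws as many ranks as were dropped as duplicates, so with n this large the loop will almost always finish after a single iteration.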


I think kguler's answer is the better method, and I wish I had had the insight to realize it myself. However, there is still some value in the method above: referring to subsets by a single number can make them easier to handle.

  • They take less space.
  • Comparison (e.g. for removing duplicates) requires a single numeric comparison rather than a list comparison.
  • A given subset is independent of the input list; only the lengths of the input and the subset matter.

One can "unrank" them at any time using the third parameter of Subsets as shown above.
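
As a small illustration of the last bullet point, the same rank can be unranked against two different lists of equal length, and it picks out the same positions in each. The lists and the name r below are chosen only for this example:

```mathematica
(* one rank among the Binomial[10, 3] = 120 three-element subsets of a 10-element list *)
r = RandomInteger[{1, Binomial[10, 3]}];
First @ Subsets[Range[10], {3}, {r}]
First @ Subsets[CharacterRange["a", "j"], {3}, {r}]
(* both results occupy the same positions of their respective lists *)
```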


I think RandomSample already does exactly what you need. It samples without replacement, so no element appears twice within a single sample:

RandomSample[Range[300], 25]
(* {292, 257, 36, 83, 259, 245, 280, 270, 24, 236, 186, 100, 300, 240, 
    176, 295, 42, 105, 97, 106, 60, 114, 63, 25, 253} *)

Table[RandomSample[Range[300], 25], {5}] 
{{221, 54, 124, 64, 168, 91, 149, 25, 142, 87, 184, 288, 93, 105, 95, 195, 264, 180, 300, 241, 185, 126, 237, 52, 160}, 
 {16, 59, 145, 181,  109, 163, 99, 81, 24, 257, 2, 293, 298, 78, 207, 162, 19, 277, 133, 28, 23, 254, 237, 93, 242}, 
 {210, 62, 76, 122, 196, 90, 84, 117, 256, 216, 95, 197, 107, 260, 78, 241, 64, 173, 169, 224, 160, 265, 163, 39, 261},
 {48, 146, 103, 10, 259, 142, 197, 77, 39, 260, 75, 128, 137, 181, 99, 256, 82, 127, 294, 117, 250, 76, 276, 196, 11}, 
 {193, 275, 48, 299, 42, 159, 75, 251, 241, 179, 165, 169, 87, 224, 288, 133, 94, 72, 274, 109, 296, 264, 82, 51, 124}}

Dealing with possible duplicates:

(* rsF[list, m, k]: k samples of size m, redrawing until all k are distinct *)
rsF = Module[{t = Table[RandomSample[#, #2], {#3}]},
             While[Length[DeleteDuplicates@t] < #3, 
                   t = Join[DeleteDuplicates@t, {RandomSample[#, #2]}]]; t] &;

rsF[Range[300], 10, 5] // Grid
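
Note that DeleteDuplicates compares the samples as ordered lists, so two samples containing the same elements in a different order would count as distinct. If the samples are meant as subsets, one option is to sort each sample before comparing. A sketch under that assumption; distinctSubsets is a hypothetical name, and the input list is assumed to have no repeated elements:

```mathematica
(* hypothetical helper: k distinct m-element subsets of list, compared as sets *)
distinctSubsets[list_List, m_Integer, k_Integer] /; k <= Binomial[Length[list], m] :=
 Module[{t = {}},
  While[Length[t] < k,
   (* sort each sample so that equal subsets become identical lists *)
   t = DeleteDuplicates[
     Join[t, Table[Sort[RandomSample[list, m]], {k - Length[t]}]]]];
  t]

distinctSubsets[Range[300], 10, 5] // Grid
```

The condition on k keeps the loop from running forever when more subsets are requested than exist.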


Combinatorica has a function for doing exactly this:

<< "Combinatorica`"
RandomKSubset[Range@300, 25] & /@ Range@5 // Grid

You may get duplicates, though. Use DeleteDuplicates[] if you consider it necessary.