# Weighted random sample without replacement in Python

## Built-in solution

As suggested by Miriam Farber, you can use numpy's built-in solution:

```
np.random.choice(vec,size,replace=False, p=P)
```

## Pure python equivalent

What follows is close to what *numpy* does internally, except that it uses plain lists and *random.choices()* from the standard library rather than numpy arrays:

```
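from random import choices

def weighted_sample_without_replacement(population, weights, k=1):
    # k must not exceed the number of items with non-zero weight
    weights = list(weights)  # work on a copy so the caller's weights are untouched
    positions = range(len(population))
    indices = []
    while True:
        needed = k - len(indices)
        if not needed:
            break
        # Draw with replacement, then skip positions that were already taken
        for i in choices(positions, weights, k=needed):
            if weights[i]:
                weights[i] = 0.0  # zero weight: cannot be chosen again
                indices.append(i)
    return [population[i] for i in indices]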
```
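As an alternative to the rejection loop, one well-known technique is the Efraimidis-Spirakis key method; it is not part of the answer above and is shown only as a sketch. Each item with a positive weight `w` gets the key `log(u) / w`, where `u` is uniform on (0, 1], and the `k` items with the largest keys form the sample. The function name `weighted_sample_keys` and the `rng` parameter are hypothetical, and non-negative weights are assumed.

```python
import heapq
import math
import random


def weighted_sample_keys(population, weights, k=1, rng=random):
    """Weighted sample without replacement via random keys (a sketch)."""
    keyed = []
    for item, w in zip(population, weights):
        if w > 0:  # zero-weight items can never be selected
            u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is defined
            keyed.append((math.log(u) / w, item))
    if k > len(keyed):
        raise ValueError("k exceeds the number of items with positive weight")
    # The k largest keys form the sample
    return [item for _, item in heapq.nlargest(k, keyed, key=lambda pair: pair[0])]


# 'b' has zero weight, so asking for 3 of the 4 items must return a, c and d
sample_keys = weighted_sample_keys(['a', 'b', 'c', 'd'], [0.5, 0.0, 0.3, 0.2], k=3)
print(sample_keys)
```

This version makes a single pass over the data and raises an error instead of looping when `k` is too large.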

## Related problem: Selection when elements can be repeated

This is sometimes called an *urn* problem. For example, given an urn with 10 red balls, 4 blue balls, and 18 green balls, choose nine balls without replacement.

To do it with *numpy*, generate the unique selections from the total population count with *sample()*. Then, bisect the cumulative weights to get the population indices.

```
import numpy as np
from random import sample
population = np.array(['red', 'blue', 'green'])
counts = np.array([10, 4, 18])
k = 9
cum_counts = np.add.accumulate(counts)
total = cum_counts[-1]
selections = sample(range(total), k=k)
indices = np.searchsorted(cum_counts, selections, side='right')
result = population[indices]
```
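If only the number of balls of each colour matters, and not the order in which they were drawn, the same urn can probably be sampled in one call with `Generator.multivariate_hypergeometric`. This is a sketch that assumes NumPy 1.18 or later; the names `rng` and `drawn` are illustrative.

```python
import numpy as np

population = np.array(['red', 'blue', 'green'])
counts = np.array([10, 4, 18])
k = 9

rng = np.random.default_rng()
# Number of balls of each colour among the k drawn
drawn = rng.multivariate_hypergeometric(counts, k)
# Expand the per-colour counts into individual balls
# (grouped by colour, not in draw order)
result = np.repeat(population, drawn)
print(drawn, result)
```

Unlike the `searchsorted` approach, the result here is grouped by colour, so shuffle it if a draw order is needed.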

To do this without *numpy*, the same approach can be implemented with *bisect()* and *accumulate()* from the standard library:

```
from random import sample
from bisect import bisect
from itertools import accumulate
population = ['red', 'blue', 'green']
weights = [10, 4, 18]
k = 9
cum_weights = list(accumulate(weights))
total = cum_weights.pop()
selections = sample(range(total), k=k)
indices = [bisect(cum_weights, s) for s in selections]
result = [population[i] for i in indices]
```

You can use `np.random.choice` with `replace=False` as follows:

```
np.random.choice(vec,size,replace=False, p=P)
```

where `vec` is your population and `P` is the weight vector.

For example:

```
import numpy as np
vec=[1,2,3]
P=[0.5,0.2,0.3]
np.random.choice(vec,size=2,replace=False, p=P)
```
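The same call is also available on NumPy's newer `Generator` interface. The sketch below assumes a NumPy version that provides `np.random.default_rng`; the seed value is arbitrary and only there to make runs repeatable.

```python
import numpy as np

vec = [1, 2, 3]
P = [0.5, 0.2, 0.3]

rng = np.random.default_rng(seed=42)  # arbitrary seed, for repeatability
# Two distinct elements of vec, drawn according to the weights in P
picked = rng.choice(vec, size=2, replace=False, p=P)
print(picked)
```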