Compute rand7() using rand5()

3 * rand5() + rand5()


is not uniformly distributed. For example, it generates 0 in only one way, but 3 two ways, so 3 is more likely to occur than 0.

It's just like 2 * rand5() * rand5(), 4 * rand5() + rand5(), etc.

But 5 * rand5() + rand5() is uniformly distributed.

It is like generating two random digits of a base-5 number.

00 => 0
01 => 1
02 => 2
03 => 3
04 => 4
10 => 5
11 => 6
12 => 7
...


There is only and only one way to generate each number from 0 to 24.

If X and Y are independent and uniformly distributed on the set S_5 = {0,1,2,3,4}, then

1. 5*X + Y is uniformly distributed on the set {0,...,24}, but
2. 3*X + Y is not uniformly distributed on {0,...,16} and neither is its restriction on {0,...,13}

It's easy to see that (1) is indeed the case, because f(x,y) = 5*x + y is a bijection between S_5 x S_5 and S_25.

If we look at the distribution of 3*X + Y we get:

>>> Counter(3*x + y for x in range(5) for y in range(5))
Counter({3: 2, 4: 2, 6: 2, 7: 2, 9: 2, 10: 2, 12: 2, 13: 2, 0: 1, 1: 1, 2: 1, 5: 1, 8: 1, 11: 1, 14: 1, 15: 1, 16: 1}


The results 3, 4, 6, 7, 9, 10, 12, 13 are twice as likely as 1, 2, 5, 8 or 11. More proof:

>>> def rand7():
...   x = 3*rand5() + rand5()
...   if x < 14: return x % 7
...   return rand7()
...
>>> Counter(rand7() for _ in xrange(100000))
Counter({6: 18219, 3: 18105, 4: 13734, 5: 13715, 2: 13634, 0: 13560, 1: 9033}


6 and 3 have a 4/22 ~ 18.2% chance of occuring, 4, 5, 2 and 0 have an 3/22 ~ 13.6% chance and 1 only has a 2/22 ~ 9.1% chance. That's one rigged dice.