Compute rank of a combination?

I would suggest a specialised hash table. The hash for a combination should be the exclusive-or of the hashes for the values. Hashes for values are basically random bit-patterns.

You could code the table to cope with collisions, but it should be fairly easy to derive a minimal perfect hash scheme - one where no two three-item combinations give the same hash value, and where the hash-size and table-size are kept to a minimum.

This is basically Zobrist hashing - think of a "move" as adding or removing one item of the combination.
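A minimal sketch of the XOR idea, assuming the values come from range(n) and the table stores each combination's lexicographic rank. The names build_xor_table and lookup are made up for illustration, and a plain dict keyed by the hash stands in for a true minimal perfect hash. Keys are simply redrawn if two combinations ever collide.

```python
import random
from itertools import combinations

def build_xor_table(n, k, bits=64, seed=0):
    """Draw one random key per value and map the XOR of each combination's keys to its rank."""
    rng = random.Random(seed)
    while True:
        keys = [rng.getrandbits(bits) for _ in range(n)]
        table = {}
        for rank, combo in enumerate(combinations(range(n), k)):
            h = 0
            for v in combo:
                h ^= keys[v]
            if h in table:
                break  # collision: throw the keys away and draw new ones
            table[h] = rank
        else:
            return keys, table  # every combination got a distinct hash

def lookup(keys, table, combo):
    """XOR the keys of the items (k operations), then do one table access."""
    h = 0
    for v in combo:
        h ^= keys[v]
    return table[h]

keys, table = build_xor_table(13, 3)
print(lookup(keys, table, (0, 1, 2)))
```

Because XOR is commutative, the items can be passed in any order, so the combination does not need to be sorted before the lookup.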

EDIT

The reason to use a hash table is that the lookup performance is O(n), where n is the number of items in the combination (assuming no collisions). Calculating lexicographical indexes into the combinations is significantly slower, IIRC.

The downside is obviously the up-front work done to generate the table.


You can try using the lexicographic index of the combination. Maybe this page will help: http://saliu.com/bbs/messages/348.html

This MSDN page has more details: Generating the mth Lexicographical Element of a Mathematical Combination.

NOTE: The MSDN page has been retired. If you download the documentation at the above link, you will find the article on page 10201 of the pdf that is downloaded.

To be a bit more specific:

When treated as a tuple, you can order the combinations lexicographically.

So (0,1,2) < (0,1,3) < (0,1,4) etc.

Say you had the numbers 0 to n-1 and chose k out of those.

Now if the first element is zero, you know that the combination is one among the first (n-1 choose k-1).

If the first element is 1, then it is one among the next (n-2 choose k-1).

This way you can recursively compute the exact position of the given combination in the lexicographic ordering and use that to map it to your number.
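The counting argument above can be sketched as a loop instead of a recursion. This is one possible implementation, not the code from the linked pages. The name rank_combination is made up for illustration, and the combination is assumed to be sorted in ascending order and drawn from range(n). math.comb requires Python 3.8 or later.

```python
from math import comb

def rank_combination(n, combo):
    """Lexicographic rank (0-based) of a sorted k-combination of range(n)."""
    k = len(combo)
    rank = 0
    prev = -1
    for i, c in enumerate(combo):
        # Every smaller candidate v for position i heads a block of
        # comb(n - 1 - v, k - 1 - i) combinations that come before this one.
        for v in range(prev + 1, c):
            rank += comb(n - 1 - v, k - 1 - i)
        prev = c
    return rank

print(rank_combination(13, (0, 1, 2)))
```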

This works in reverse too and the MSDN page explains how to do that.
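For the reverse direction, here is a sketch that follows the same block-counting argument. It is not taken from the MSDN article. The name unrank_combination is made up for illustration, and ranks are assumed to be 0-based.

```python
from math import comb

def unrank_combination(n, k, rank):
    """Return the sorted k-combination of range(n) with the given 0-based lexicographic rank."""
    if not 0 <= rank < comb(n, k):
        raise ValueError("rank out of range")
    combo = []
    v = 0  # smallest value still available for the next position
    for i in range(k):
        while True:
            # Number of combinations that have v at position i.
            block = comb(n - 1 - v, k - 1 - i)
            if rank < block:
                break
            rank -= block  # skip the whole block and try the next value
            v += 1
        combo.append(v)
        v += 1
    return tuple(combo)

print(unrank_combination(13, 3, 0))
```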


Use a hash table to store the results. A decent hash function could be something like:

h(x) = (x1*p^(k - 1) + x2*p^(k - 2) + ... + xk*p^0) % pp

Where x1 ... xk are the numbers in your combination (for example (0, 1, 2) has x1 = 0, x2 = 1, x3 = 2) and p and pp are primes.

So you would store Hash[h(0, 1, 2)] = 78 and then you would retrieve it the same way.

Note: the hash table is just an array of size pp, not a dict.
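A small sketch of this scheme for 3-item combinations of range(13). The primes p = 13 and pp = 2203 are assumptions picked for illustration: p is larger than every element and pp is larger than the largest polynomial value, so no two combinations can share a slot. The stored value here is the lexicographic rank, standing in for whatever number you want to associate with each combination.

```python
from itertools import combinations

P = 13     # assumed prime base, larger than any element of range(13)
PP = 2203  # assumed prime table size, larger than the largest polynomial value

def h(combo, p=P, pp=PP):
    """Evaluate x1*p^(k-1) + ... + xk*p^0 modulo pp using Horner's rule."""
    acc = 0
    for x in combo:
        acc = (acc * p + x) % pp
    return acc

# The table is a plain array of size pp, indexed directly by the hash.
table = [None] * PP
for rank, combo in enumerate(combinations(range(13), 3)):
    slot = h(combo)
    if table[slot] is not None:
        raise ValueError("collision: pick different primes")
    table[slot] = rank

print(table[h((0, 1, 2))])
```

A smaller pp would save memory, but then collisions become possible and have to be checked for, as the loop above does.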


Here is a conceptual answer and code based on how lex ordering works. (So I guess my answer is like that of "moron", except that I think that he has too few details and his links have too many.) I wrote a function unchoose(n,S) for you that works assuming that S is a sorted list that is a subset of range(n).

The idea: either S contains 0 or it does not. If it does, remove 0 and compute the index for the remaining subset. If it does not, then it comes after the binomial(n-1,k-1) subsets that do contain 0. In either case, the remaining elements are shifted down by 1 so that they form a subset of range(n-1).

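def binomial(n,k):
    # Number of k-element subsets of an n-element set (0 if out of range).
    if n < 0 or k < 0 or k > n: return 0
    b = 1
    # b equals binomial(n,i) before each step, so the integer division is exact.
    for i in range(k): b = b*(n-i)//(i+1)
    return b

def unchoose(n,S):
    # Lexicographic index of the sorted subset S of range(n).
    k = len(S)
    if k == 0 or k == n: return 0
    j = S[0]
    if k == 1: return j
    # Shift everything down by 1 so the rest is a subset of range(n-1).
    S = [x-1 for x in S]
    # S contained 0: drop it and index the remaining subset.
    if not j: return unchoose(n-1,S[1:])
    # S did not contain 0: skip the binomial(n-1,k-1) subsets that do.
    return binomial(n-1,k-1)+unchoose(n-1,S)

def choose(X,k):
    # All k-element sublists of the list X, in lexicographic order.
    n = len(X)
    if k < 0 or k > n: return []
    if not k: return [[]]
    if k == n: return [X]
    return [X[:1] + S for S in choose(X[1:],k-1)] + choose(X[1:],k)

# Python 3: choose needs a list, because a range cannot be concatenated with a list.
(n,k) = (13,3)
for S in choose(list(range(n)),k): print(unchoose(n,S),S)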

Now, it is also true that you can cache or hash values of both functions, binomial and unchoose. And what's nice about this is that you can compromise between precomputing everything and precomputing nothing. For instance you can precompute only for len(S) <= 3.
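As a rough illustration of the caching idea, here is a sketch that memoizes the recursion with functools.lru_cache. The name unchoose_cached is made up, S has to be passed as a tuple so that it is hashable, and math.comb (Python 3.8 or later) stands in for binomial.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def unchoose_cached(n, S):
    """Same recursion as unchoose, with results remembered per (n, S) pair."""
    k = len(S)
    if k == 0 or k == n:
        return 0
    j = S[0]
    if k == 1:
        return j
    shifted = tuple(x - 1 for x in S)  # now a subset of range(n-1)
    if j == 0:
        return unchoose_cached(n - 1, shifted[1:])
    return comb(n - 1, k - 1) + unchoose_cached(n - 1, shifted)

print(unchoose_cached(13, (0, 1, 2)))
```

Passing a bounded maxsize instead of None is one way to sit between precomputing everything and precomputing nothing.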

You can also optimize unchoose so that it adds the binomial coefficients with a loop if S[0] > 0, instead of decrementing and using tail recursion.