Finding n-th permutation without computing others

Here's a solution that lets you select the size of the permutation. For example, apart from generating all permutations of 10 elements, it can generate permutations of pairs among 10 elements. It also permutes lists of arbitrary objects, not just integers.

function nth_permutation($atoms, $index, $size) {
    $result = [];
    for ($i = 0; $i < $size; $i++) {
        $item = $index % count($atoms);            // next choice among the remaining atoms
        $index = floor($index / count($atoms));
        $result[] = $atoms[$item];                 // take the chosen atom...
        array_splice($atoms, $item, 1);            // ...and remove it from the pool
    }
    return $result;
}

Usage example:

for ($i = 0; $i < 6; $i++) {
    echo implode('', nth_permutation(['A', 'B', 'C'], $i, 2)), "\n";
}
// => AB, BA, CA, AC, BC, CB

How does it work?

There's a very interesting idea behind it. Let's take the list A, B, C, D. We can construct a permutation by drawing elements from it like from a deck of cards. Initially we can draw one of the four elements. Then one of the three remaining elements, and so on, until finally we have nothing left.

Decision tree for permutations of 4 elements

Here is one possible sequence of choices. Starting from the top, we take the third path, then the first, then the second, and finally the first. And that's our permutation #13.

Think about how, given this sequence of choices, you would get to the number thirteen algorithmically. Then reverse your algorithm, and that's how you can reconstruct the sequence from an integer.
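
(Peeking ahead a bit: with zero-based choices that path is 2, 0, 1, 0, and 2 * 3! + 0 * 2! + 1 * 1! + 0 * 0! = 12 + 0 + 1 + 0 = 13.)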

Let's try to find a general scheme for packing a sequence of choices into an integer without redundancy, and unpacking it back.

One interesting scheme is called the decimal number system. "27" can be thought of as choosing path #2 out of 10, and then choosing path #7 out of 10.

Decision tree for number 27 in decimal

But each digit can only encode a choice from 10 alternatives. Other systems with a fixed radix, like binary and hexadecimal, can likewise only encode sequences of choices from a fixed number of alternatives. We want a system with a variable radix, kind of like time units: "14:05:29" is hour 14 of 24, minute 5 of 60, second 29 of 60.
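
(Concretely, "14:05:29" packs into seconds as (14 * 60 + 5) * 60 + 29 = 50729, and unpacks back by repeatedly dividing and taking remainders by 60, 60, and then 24.)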

What if we take generic number-to-string and string-to-number functions, and fool them into using mixed radixes? Instead of taking a single radix, like parseInt('beef', 16) and (48879).toString(16), they will take one radix per digit.

function pack(digits, radixes) {
    var n = 0;
    for (var i = 0; i < digits.length; i++) {
        n = n * radixes[i] + digits[i];
    }
    return n;
}

function unpack(n, radixes) {
    var digits = [];
    for (var i = radixes.length - 1; i >= 0; i--) {
        digits.unshift(n % radixes[i]);
        n = Math.floor(n / radixes[i]);
    }
    return digits;
}

Does that even work?

// Decimal system
pack([4, 2], [10, 10]); // => 42

// Binary system
pack([1, 0, 1, 0, 1, 0], [2, 2, 2, 2, 2, 2]); // => 42

// Factorial system
pack([1, 3, 0, 0, 0], [5, 4, 3, 2, 1]); // => 42

And now backwards:

unpack(42, [10, 10]); // => [4, 2]

unpack(42, [5, 4, 3, 2, 1]); // => [1, 3, 0, 0, 0]
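
The mixed-radix time example from earlier works the same way:

// Time units (the "14:05:29" example)
pack([14, 5, 29], [24, 60, 60]); // => 50729 (seconds since midnight)

unpack(50729, [24, 60, 60]); // => [14, 5, 29]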

This is so beautiful. Now let's apply this parametric number system to the problem of permutations. We'll consider length-2 permutations of A, B, C, D. What's the total number of them? Let's see: first we draw one of the 4 items, then one of the remaining 3; that's 4 * 3 = 12 ways to draw 2 items. These 12 ways can be packed into the integers [0..11]. So let's pretend we've packed them already, and try unpacking:

for (var i = 0; i < 12; i++) {
    console.log(unpack(i, [4, 3]));
}

// [0, 0], [0, 1], [0, 2],
// [1, 0], [1, 1], [1, 2],
// [2, 0], [2, 1], [2, 2],
// [3, 0], [3, 1], [3, 2]

These numbers represent choices, not indexes into the original array. [0, 0] doesn't mean taking A, A; it means taking item #0 from A, B, C, D (that's A) and then item #0 from the remaining list B, C, D (that's B). And the resulting permutation is A, B.

Another example: [3, 2] means taking item #3 from A, B, C, D (that's D) and then item #2 from the remaining list A, B, C (that's C). And the resulting permutation is D, C.

This mapping is called the Lehmer code. Let's map all these Lehmer codes to permutations:

AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC

That's exactly what we need. But if you look at the unpack function, you'll notice that it produces digits from right to left (to reverse the actions of pack). The choice from 3 gets unpacked before the choice from 4. That's unfortunate, because we want to choose from 4 elements before choosing from 3. Without being able to do so, we have to compute the Lehmer code first, accumulate it into a temporary array, and then apply it to the array of items to compute the actual permutation.
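
Here's a minimal sketch of that two-step approach (the helper name applyLehmer is mine): unpack the index into a Lehmer code, then apply the code to the item list by repeatedly taking and removing elements. It reproduces the lexicographic list above.

function applyLehmer(code, items) {
    var pool = items.slice();        // work on a copy of the item list
    var result = [];
    for (var i = 0; i < code.length; i++) {
        result.push(pool[code[i]]);  // take the chosen element...
        pool.splice(code[i], 1);     // ...and remove it from the pool
    }
    return result;
}

for (var i = 0; i < 12; i++) {
    console.log(applyLehmer(unpack(i, [4, 3]), ['A', 'B', 'C', 'D']).join(''));
}
// => AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC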

But if we don't care about the lexicographic order, we can pretend that we want to choose from 3 elements before choosing from 4. Then the choice from 4 will come out of unpack first. In other words, we'll use unpack(n, [3, 4]) instead of unpack(n, [4, 3]). This trick lets us compute the next digit of the Lehmer code and immediately apply it to the list. And that's exactly how nth_permutation() works.
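
For comparison, here's the same one-pass trick in the JavaScript style used above (a sketch of mine, essentially a port of the PHP nth_permutation):

function nthPermutation(atoms, index, size) {
    var pool = atoms.slice();
    var result = [];
    for (var i = 0; i < size; i++) {
        var item = index % pool.length;          // next Lehmer digit...
        index = Math.floor(index / pool.length);
        result.push(pool[item]);                 // ...applied to the list immediately
        pool.splice(item, 1);
    }
    return result;
}

for (var i = 0; i < 6; i++) {
    console.log(nthPermutation(['A', 'B', 'C'], i, 2).join(''));
}
// => AB, BA, CA, AC, BC, CB (the same scrambled order as the PHP version)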

One last thing I want to mention is that unpack(i, [4, 3]) is closely related to the factorial number system. Look at that first tree again: if we want permutations of length 2 without duplicates, we can just skip every second permutation index. That gives us 12 permutations of length 4, which can be trimmed to length 2.

for (var i = 0; i < 12; i++) {
    var lehmer = unpack(i * 2, [4, 3, 2, 1]); // Factorial number system
    console.log(lehmer.slice(0, 2));
}
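// => [0, 0], [0, 1], [0, 2],
//    [1, 0], [1, 1], [1, 2],
//    [2, 0], [2, 1], [2, 2],
//    [3, 0], [3, 1], [3, 2]
// The same Lehmer codes as unpack(i, [4, 3]) above.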

As stated by RickyBobby, when considering the lexicographical order of permutations, you should use the factorial decomposition to your advantage.

From a practical point of view, this is how I see it:

  • Perform a sort of Euclidean division, except you do it with factorial numbers, starting with (n-1)!, (n-2)!, and so on.
  • Keep the quotients in an array. The i-th quotient should be a number between 0 and n-i-1 inclusive, where i goes from 0 to n-1.
  • This array is your permutation. The problem is that each quotient does not care for previous values, so you need to adjust them. More explicitly, you need to increment every value as many times as there are previous values that are lower or equal.

The following C code should give you an idea of how this works (n is the number of entries, and i is the index of the permutation):

#include <stdio.h>
#include <stdlib.h>

/**
 * @param n The number of entries
 * @param i The index of the permutation
 */
void ithPermutation(const int n, int i)
{
   int j, k = 0;
   int *fact = (int *)calloc(n, sizeof(int));
   int *perm = (int *)calloc(n, sizeof(int));

   // compute factorial numbers
   fact[k] = 1;
   while (++k < n)
      fact[k] = fact[k - 1] * k;

   // compute factorial code
   for (k = 0; k < n; ++k)
   {
      perm[k] = i / fact[n - 1 - k];
      i = i % fact[n - 1 - k];
   }

   // readjust values to obtain the permutation
   // start from the end and check if preceding values are lower
   for (k = n - 1; k > 0; --k)
      for (j = k - 1; j >= 0; --j)
         if (perm[j] <= perm[k])
            perm[k]++;

   // print permutation
   for (k = 0; k < n; ++k)
      printf("%d ", perm[k]);
   printf("\n");

   free(fact);
   free(perm);
}

For example, ithPermutation(10, 3628799) prints, as expected, the last permutation of ten elements:

9 8 7 6 5 4 3 2 1 0

It depends on the way you "sort" your permutations (lexicographic order for example).

One way to do it is the factorial number system; it gives you a bijection between [0, n! - 1] and all the permutations of n elements.

Then for any number i in [0, n! - 1] you can compute the i-th permutation without computing the others.

This factorial writing is based on the fact that any number between 0 and n! - 1 can be written as:

SUM(a_i * i! for i in range [0, n-1]) where 0 <= a_i <= i

(it's pretty similar to base decomposition)
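
For example, 42 = 1*(4!) + 3*(3!) + 0*(2!) + 0*(1!) + 0*(0!), so the factorial digits of 42 are (1, 3, 0, 0, 0).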

For more information on this decomposition, have a look at this thread: https://math.stackexchange.com/questions/53262/factorial-decomposition-of-integers

Hope it helps.


As stated in this Wikipedia article, this approach is equivalent to computing the Lehmer code:

An obvious way to generate permutations of n is to generate values for the Lehmer code (possibly using the factorial number system representation of integers up to n!), and convert those into the corresponding permutations. However the latter step, while straightforward, is hard to implement efficiently, because it requires n operations each of selection from a sequence and deletion from it, at an arbitrary position; of the obvious representations of the sequence as an array or a linked list, both require (for different reasons) about n²/4 operations to perform the conversion. With n likely to be rather small (especially if generation of all permutations is needed) that is not too much of a problem, but it turns out that both for random and for systematic generation there are simple alternatives that do considerably better. For this reason it does not seem useful, although certainly possible, to employ a special data structure that would allow performing the conversion from Lehmer code to permutation in O(n log n) time.

So the best you can do for a set of n elements is O(n log n) with an adapted data structure.
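
For completeness, here is a rough sketch of what such a data structure can look like (this block is an illustration of mine, not code from the quoted article): a Fenwick tree of 0/1 "still available" flags lets you find and remove the k-th remaining element in O(log n), so turning a Lehmer code of length n into a permutation costs O(n log n) overall.

function lehmerToPermutationFast(code) {
    var n = code.length;
    var tree = [];                              // Fenwick tree over positions 1..n (tree[0] unused)
    for (var i = 0; i <= n; i++) tree.push(0);

    function add(pos, delta) {                  // point update in O(log n)
        for (; pos <= n; pos += pos & -pos) tree[pos] += delta;
    }

    var highBit = 1;                            // largest power of two not exceeding n
    while (highBit * 2 <= n) highBit *= 2;

    function kth(k) {                           // 0-based position of the (k+1)-th still-available element
        var pos = 0, remaining = k + 1;
        for (var step = highBit; step > 0; step >>= 1) {
            if (pos + step <= n && tree[pos + step] < remaining) {
                pos += step;
                remaining -= tree[pos];
            }
        }
        return pos;
    }

    for (var i = 1; i <= n; i++) add(i, 1);     // initially every element is available

    var perm = [];
    for (var k = 0; k < n; k++) {
        var index = kth(code[k]);               // pick the code[k]-th remaining element...
        perm.push(index);
        add(index + 1, -1);                     // ...and remove it from the pool
    }
    return perm;
}

lehmerToPermutationFast([1, 3, 0, 0, 0]); // => [1, 4, 0, 2, 3], i.e. permutation #42 of 0..4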