Rubik's cube genetic algorithm solver?

Disclaimer: I'm by far no expert on Rubik's cubes, I even never solved one

Solving a given Rubik's cube by GA

You have a given configuration and you want to use GA to produce the sequence of steps that solve this particular configuration.

Representation

For each of the axes there are three possible rotations, each of them in two directions. This gives following moves:

X1+, X1-, X2+, X2-, X3+, X3-,
Y1+, Y1-, Y2+, Y2-, Y3+, Y3-,
Z1+, Z1-, Z2+, Z2-, Z3+, Z3-

The genotype is then a sequence of these moves.

Genotype

There are two possibilites:

  • variable-length genotype
  • fixed-length genotype

Variable-length genotype follows the idea that we don't know, in advance, how many moves will the particular configuration take to solve. I'll come back to this variant later.

Fixed-length genotype can be used too. Even though we don't know how many moves will the particular configuration take to solve, we know that any configuration can be solved in 20 moves or less. Since 20 would mean that for some positions the algorithm would be forced to find the optimal solution (which could be quite hard), we can set the genotype length to, say, 50 to give the algorithm some flexibility.

Fitness

You need to find a good fitness measure for judging the quality of the solutions. Currently I can't think of a nice quality measure as I'm no expert on Rubik's cubes. However, you probably should incorporate the number of moves in your fitness measure. I'll come back to it a little bit later.

Also you should decide whether you are looking for any solution or for a good one. If you are looking for any solution, you can stop at the first solution found. If you are looking for a good solution, you don't stop at first solution found but instead you then optimize its length. You then stop when you decide to stop (i.e. after a desired amount of time spent searching).

Operators

The two basic operators - crossover and mutation - can be basically identical to classical GA. The only difference is that you don't have two states for a "bit" but 18. Even when using a variable-length genotype, the crossover can remain the same - you just cut the both genotypes in two pieces and swap the tails. The only difference from the fixed-length case is that the cuts won't be at the same positions, but rather independent of each other.

Bloat

Using the variable-length genotypes brings an unpleasant phenomenon, which is an excessive growth of the size of the solution, without a big impact on fitness. In Genetic Programming (which is quite different topic) this is called bloat. This can be controlled by two mechanisms.

The first mechanism I already mentioned - incorporate the length of the solution into the fitness. If you look for a good solution (i.e. you don't stop at finding the first one), the length is, of course, only the length of the effective part (i.e. the number of moves from the start up to the move that finishes the cube), not counting the rest.

The other mechanism I borrow from Grammatical Evolution (a Genetic Programming algorithm that also uses linear, variable-length genotypes), which is pruning. A pruning operator takes a solution and deletes the non-effective part of the genotype.

Other possibilites

You could also evolve something like a "policy" or strategy - a general rule that would, given a cube, decide what move to do next. That is far more difficult task, but definitely evolvable (meaning that you can use evolution to try to find it, not that you will find it eventually). You might use e.g. neural networks as a representation of the policy and use some neuro-evolution concepts and frameworks to achieve this task.

Or you could evolve a heuristic (using Genetic Progamming) for a given search algorithm, e.g. A*. The A* algorithm needs a heuristic to estimate the distance of a state to the final state. The more accurate this heuristic is, the faster the A* algorithm finds the solution.

Final remarks

From my experience, if you know something about the problem (which in case of Rubik's cube you do know), it is far better to use a dedicated or at least an informed algorithm to solve the problem rather than using an almost blind one like GA. In my opinion, Rubik's cube is not a good problem for GAs, or rather GAs are not a good algorithms for solving Rubik's cube.