I don't understand what label does in Prolog

If you want to learn Prolog and CLP(FD) rapidly, use Prolog's top level shell to play around until you get comfortable with it. In fact, everything you need to know about CLP(FD) and Prolog can be explained there ; or almost. No need to write (what's their name?) files, everything fits in a line. Yes, I know, our parents warned us: My child, promise me, never do a one-liner. But you will learn so much faster.

So do you have the ?- waiting?

In traditional Prolog (without constraints) what you get back from a query is a so called answer substitution. In many situations, this answer substitution already describes a solution. This is the case if for each variable, a variable free term is found. Lets look at a concrete example and describe a list with 5 elements where each element is a value between 1 and 5. In this case, solutions are found for different values for L.

?- N = 5, length(L,N),maplist(between(1,N),L).
   N = 5, L = [1, 1, 1, 1, 1] ;
   N = 5, L = [1, 1, 1, 1, 2] ;
   N = 5, L = [1, 1, 1, 1, 3] ...

Prolog will show you only one solution (secretly hoping that you will be happy with it, its a bit lazy, er non-strict). You get all of them typing SPACE or ;. Try it a bit just to see how many they are...

There is a total of 5^5 solutions. It's not very practical, if you just want to pick a few solutions out of that many. Large sets of solutions are represented quite ineffectively in that manner. And then, think of infinite sets! How can Prolog, or any finite being, enumerate an infinite set? We can only start to do so, finite as we are.

To overcome this, Prolog is not always forced to show concrete values, that is, solutions, but can subsume them a bit by showing answers instead:

?- N = 5, length(L,N).
   N = 5, L = [_A, _B, _C, _D, _E].

This answer (-substitution) contains all 5^5 answers above, and many many more, like L = [stack,over,flow,dot,com]. In fact, it describes an infinite set of solutions! Didn't I say we finite beings can't do this? As long as we insist on concrete solutions we can't, but if we are happy with answers instead, we can do the impossible.

That idea can be extended to describe more specific sets. All with a single answer. For sets about integers, we have library(clpfd). Use it like so:

?- use_module(library(clpfd)).

?- asserta(clpfd:full_answer).  % only necessary for SICStus

We can now restate our original query (in SWI, you can do Cursor up ↑ to get it):

?- N = 5, length(L,N),L ins 1..N.
   N = 5, L = [_A, _B, _C, _D, _E],
   _A in 1..5, _B in 1..5, _C in 1..5, _D in 1..5,_E in 1..5.

Now all 3125 solutions are compactly described with a single answer. (3125? that's 5^5). We can continue to state further requirements, like that they are all different:

?- N = 5, length(L,N),L ins 1..N, all_different(L).
   N = 5, L = [_A, _B, _C, _D, _E],
   _A in 1..5,_B in 1..5,_C in 1..5,_D in 1..5,_E in 1..5,
   all_different([_A, _B, _C, _D, _E]).

What (practically) all constraints have in common is that they do not enumerate solutions, instead, they try to maintain consistency. Let's try this, by stating that the first element should be 1:

?- N = 5, length(L,N),L ins 1..N, all_different(L), L = [1|_].
   N = 5, L = [1, _A, _B, _C, _D],
   _A in 2..5,_B in 2..5,_C in 2..5,_D in 2..5,
   all_different([1, _A, _B, _C, _D]).

Did you see the effect? They promptly changed their domains! Now they are all in 2..5.

And they should all be in 1..4:

?- N = 5, length(L,N),L ins 1..N, all_different(L), L = [1|_], L ins 1..4.
   N = 5, L = [1, _A, _B, _C, _D],
   _A in 2..4,_B in 2..4,_C in 2..4,_D in 2..4,
   all_different([1, _A, _B, _C, _D]).

Again, they are updated. But ... think of it: There are 4 variables left, they should all be different but there are only 3 different values for them.

So we have caught Prolog being a bit too lazy. There actually is a better constraint called all_distinct/1 that would fail now, but no matter how many clever constraints a system has, there will be always such inconsistencies. Ask Professor Gödel. The only ways to bail out would be errors or infinite loops.

So we need another method to be sure that an answer does describe real solutions. Enter labeling! With label/1 or labeling/2 we can eliminate all those strange constraints and bust inconsistencies:

?- N = 5, length(L,N),L ins 1..N, all_different(L), L = [1|_], L ins 1..4, labeling([], L).
   false.

?- N = 5, length(L,N),L ins 1..N, all_different(L), L = [1|_], labeling([], L).
   N = 5, L = [1, 2, 3, 4, 5] ;
   N = 5, L = [1, 2, 3, 5, 4] ;
   N = 5, L = [1, 2, 4, 3, 5] ...

How can we be sure that these are real solutions? Easy: They do not contain any extra goals besides answer substitutions1. For if we forgot some:

?- N = 5, length(L,N),L ins 1..N, all_different(L), L = [1,B,C|_], labeling([],[B,C]).
   N = 5, L = [1, 2, 3, _A, _B], B = 2, C = 3,
   _A in 4..5, _B in 4..5,
   all_different([1, 2, 3, _A, _B]),

They would show.

SWI's labeling/2 has a very useful guarantee:

Labeling is always complete, always terminates, and yields no redundant solutions.


1 Since the SWI toplevel does not show all constraints, you would need to wrap call_residue_vars(Goal, Vs) around it. But for simple top level queries, above is good enough.


Roughly, there are 2 phases in constraint programming: constraint propagation and search.

Constraint propagation alone can give concrete values, and it often does. But in general, it only reduces domain (set of the possible values for a variable) to a smaller subset, and then you need search (label values from a subset, obtained with constraint propagation).

Very simple example (pseudocode):

A #:: 0..1,
B #:: 0..1,
A #\= B

Constraint propagation cannot solve this by itself, it cannot even reduce domain of A or B - it can only create a delayed constraint. After that a search (label) tries a value 0 for A, the delayed constraint fires up, and domain of B reduced to {1}.

In contrast, this can be solved with constraint propagation alone:

A #:: 1,
B #:: 0..1,
A #\= B