What occurs-check optimizations is SWI Prolog using?

The major optimization makes unification of local variables a constant operation.

Many abstract machines like the PLM, ZIP, WAM, VAM provide a special case for logic variables that cannot be subterm of some structure (called local variables). Unification with such variables does not require any occurs-check at all and thus can be constant. In this manner large terms can be "passed back" without an extra occurs-check required.

Without this optimization, the handling of grammars (for parsing a given list) gets an overhead quadratic in the number of tokens. Every time the "input list" is handed back (so, graphically speaking, every time you are crossing a comma after a non-terminal in the grammar body), the remaining input list needs to be scanned for the occurrence of that local variable. (It's better than quadratic in the number of characters, since regular expressions are mostly encoded tail-recursively).

This optimization was introduced 2007 in 5.6.39. It is surprising that your measurements show overheads even in cases like tak, where not a single structure is built at all. As far as I can remember, occurs-check unification in SWI 5.6.39 ran a tiny bit faster than rational tree unification (for simple cases) as (at that time) no extra setup was needed.

There is still ample room for many further occurs-check optimizations. But those will only happen, if people do use this feature. As for SWI, not much happened in the last 13 years.

But think-of-it: The very first Prolog, called Prolog 0 did support the occurs-check by default. But from Prolog I ("Marseille Prolog") on, only excuses (such as those you cite) were made. And at least, the standard did not rule out occurs-check unification as default by only defining NSTO executions and requiring unify_with_occurs_check/2 and acyclic_term/1. And now, Prologs like SWI and Tau provide it optionally via a flag.

A further optimization into the same direction is Joachim Beer's NEW_UNBOUND tag which avoids also occurs-checks of some heap-variables at the expense of a more complex scheme. See The Occur-Check Problem Revisited. JLP 5(3) 1988. And LNCS 404.


Here is a test case where the occurs check doubles the time to execute a query. Take this code here, to compute a negation normal form. Since the (=)/2 is at the end of the rule, the visited compounds becomes quadratic:

/* Variant 1 */
norm(A, R) :- var(A), !, R = pos(A).
norm((A+B), R) :- !, norm(A, C), norm(B, D), R = or(C,D).
norm((A*B), R) :- !, norm(A, C), norm(B, D), R = and(C,D).
Etc..

We can compare with this variant where the (=)/2 is done first while the compound is not yet instantiated:

/* Variant 2 */
norm(A, R) :- var(A), !, R = pos(A).
norm((A+B), R) :- !, R = or(C,D), norm(A, C), norm(B, D).
norm((A*B), R) :- !, R = and(C,D), norm(A, C), norm(B, D).
Etc..

Here are some measurements for SWI-Prolog 8.3.19. For variant 1 setting the occurs check flag to true doubles the time needed to convert some propositional formulas from the principia mathematica:

/* Variant 1 */
/* occurs_check=false */
?- time((between(1,1000,_),test,fail;true)).
% 3,646,000 inferences, 0.469 CPU in 0.468 seconds (100% CPU, 7778133 Lips)
true.

/* occurs_check=true */
?- time((between(1,1000,_),test,fail;true)).
% 6,547,000 inferences, 0.984 CPU in 0.983 seconds (100% CPU, 6650921 Lips)
true.

On the other hand, moving (=)/2 to the front changes the picture favorably:

/* Variant 2 */
/* occurs_check=false */
?- time((between(1,1000,_),test,fail;true)).
% 3,646,000 inferences, 0.453 CPU in 0.456 seconds (99% CPU, 8046345 Lips)
true.

/* occurs_check=true */
?- time((between(1,1000,_),test,fail;true)).
% 6,547,000 inferences, 0.703 CPU in 0.688 seconds (102% CPU, 9311289 Lips)
true.

Open Source:

Negation Normal Form, Non-Tail Recursive
https://gist.github.com/jburse/7705ace654af0df6f4fdd12eee80aaec#file-norm-pl

Negation Normal Form, Tail Recursive
https://gist.github.com/jburse/7705ace654af0df6f4fdd12eee80aaec#file-norm2-pl

193 propositional logic test cases from Principia.
https://gist.github.com/jburse/7705ace654af0df6f4fdd12eee80aaec#file-principia-pl