Set Intersection predicate Prolog using not

Your definition is too general. It admits, e.g., that [] is the intersection of any two lists. I.e. it incorrectly succeeds for intersection([],[a],[a]). It lacks a third "for all" idiom stating that all elements that are in both lists will be in the resulting list.
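The missing third "for all" idiom could be sketched roughly as follows. This is only an illustration for ground arguments: the name intersection_ground/3 and the definition of element/2 are assumptions (neither is taken from the question), and the first two negations assume that the existing conditions state that every element of the intersection occurs in each of the two lists.

```prolog
% element(X, Xs): X occurs in the list Xs (assumed definition).
element(X, [X|_]).
element(X, [_|Xs]) :-
    element(X, Xs).

% intersection_ground(I, L1, L2): hypothetical name, ground arguments only.
intersection_ground(I, L1, L2) :-
    % no element of I is missing from L1
    \+ ( element(X, I), \+ element(X, L1) ),
    % no element of I is missing from L2
    \+ ( element(Y, I), \+ element(Y, L2) ),
    % the third "for all": no common element of L1 and L2 is missing from I
    \+ ( element(Z, L1), element(Z, L2), \+ element(Z, I) ).
```

With the third negation in place, intersection_ground([],[a],[a]) fails while intersection_ground([a],[a],[a]) succeeds. Duplicates such as [2,2,3] for [1,2,3] and [2,3,4] are still accepted, since the sketch only talks about membership.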

But otherwise your definition is fine for the ground case. What is a bit unusual is that the intersection is the first and not the last argument. Quite irritating to me are the variable names. I believe that R means "result", thus the intersection. And L1 and L2 are the two sets whose intersection is built.

It is a bit too general, though - like many Prolog predicates - think of append([], non_list, non_list). Apart from lists, your definition admits also terms that are neither lists nor partial lists:

?- intersection(non_list1,[1,2|non_list2],[3,4|non_list3]).

To make it really safe, use it like so:

?- when(ground(intersection(I, A, B)), intersection(I, A, B)).

or so:

?- (  ground(intersection(I, A, B))
   -> intersection(I, A, B)
   ;  throw(error(instantiation_error, intersection(I, A, B)))
   ).

Or, using iwhen/2:

?- iwhen(ground(intersection(I, A, B)), intersection(I, A, B) ).
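Note that iwhen/2 is not a built-in predicate. To give an idea of what it does, here is a minimal sketch under the assumption that only conditions of the form ground(Term) need to be supported. The name iwhen_sketch/2 and the error contexts are made up for illustration, so this is not the actual definition of iwhen/2.

```prolog
% iwhen_sketch(Cond, Goal): hypothetical, simplified stand-in for iwhen/2.
% Runs Goal if the condition already holds, otherwise raises an
% instantiation error instead of delaying the goal as when/2 would.
iwhen_sketch(Cond, Goal) :-
    (   nonvar(Cond), Cond = ground(Term)
    ->  (   ground(Term)
        ->  call(Goal)
        ;   throw(error(instantiation_error, iwhen_sketch/2))
        )
    ;   % any other condition is outside the scope of this sketch
        throw(error(domain_error(ground_condition, Cond), iwhen_sketch/2))
    ).
```

It behaves like the explicit if-then-else shown before: the goal runs for ground arguments and an error is raised otherwise, so an insufficiently instantiated call can neither succeed nor fail silently.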

As a minor remark, rather write (\+)/1 in place of not/1.


The problem is that not/1 merely negates the outcome of your element/2. It doesn't cause element/2 to backtrack to find other instantiations for which the enclosing not/1 will be true.

Consider the following program.

a(1).
a(2).

b(1).
b(2).
b(3).

And the following queries:

  1. b(X), not(a(X)).
  2. not(a(X)), b(X).

The first one yields X = 3 while the second one yields false. That is because the first query first instantiates X with 1, then with 2, then with 3, until finally not(a(X)) succeeds.
The second query starts with X still unbound: a(X) succeeds with X = 1, so not(a(X)) fails and b(X) is never reached. There is no backtracking done!
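The two queries can be wrapped into predicates to make the difference easy to reproduce. The predicate names below are made up for this illustration, and (\+)/1 is used in place of not/1.

```prolog
a(1).
a(2).

b(1).
b(2).
b(3).

% Generate candidates with b/1 first, then test them with the negation.
in_b_not_a(X) :-
    b(X),
    \+ a(X).

% The negation is called while X is still unbound: a(X) has a solution,
% so \+ a(X) fails right away and b(X) is never tried.
not_a_then_b(X) :-
    \+ a(X),
    b(X).
```

Here in_b_not_a(X) has the single solution X = 3, while not_a_then_b(X) fails. As a rule of thumb, make sure the variables inside a negation are instantiated before the negation is called.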


The lack of backtracking due to negation as pointed out by @SQB is actually not the only problem with your code. If you play around a little with ground queries you find that non-lists and the empty list as pointed out by @false are also not the only issue. Consider the following queries:

   ?- intersection([2,3],[1,2,3],[2,3,4]).
yes
   ?- intersection([2],[1,2,3],[2,3,4]).
yes
   ?- intersection([3],[1,2,3],[2,3,4]).
yes
   ?- intersection([],[1,2,3],[2,3,4]).
yes

The first is what is usually understood as the intersection. The other three are all sublists of the intersection, including the trivial sublist []. This is due to the way the predicate describes what an intersection is: in an intersection it is not the case that an element is in the first but not the second list AND that said element is in the first but not the third list. This description clearly fits the three queries above, hence they succeed. Fooling around a little more with this description in mind, there are some other noteworthy ground queries that succeed:

   ?- intersection([2,2,3],[1,2,3],[2,3,4]).
yes

The question whether the presence of duplicates in the solution is acceptable or not is in fact quite a matter of debate. The lists [2,2,3] and [2,3] although different represent the same set {2,3}. There is this recent answer to a question on Prolog union that is elaborating on such aspects of answers. And of course the sublists of the intersection mentioned above can also contain duplicates or multiples:

   ?- intersection([2,2,2],[1,2,3],[2,3,4]).
yes

But why is this? For the empty list this is quite easy to see. The query

   ?- element(A,[]).
no

fails, hence the conjunction element(A,L1), not(element(A,L2)) also fails for L1=[]. Therefore the negation wrapped around it succeeds. The same is true for the second negation; consequently [] can be derived as intersection. To see why [2] and [3] succeed as intersection, it is helpful to write your predicate as a logic formula with the universal quantifiers written down explicitly:

∀L1 ∀L2 ∀R ∀A (intersection(L1,L2,R) ← ¬ (element(A,L1) ∧ ¬ element(A,L2)) ∧ ¬ (element(A,L1) ∧ ¬ element(A,R)))

If you consult a textbook on logic or one on logic programming that also shows Prolog code as logic formulas you'll find that the universal quantifiers for variables that do not occur in the head of the rule can be moved into the body as existential quantifiers. In this case for A:

∀L1 ∀L2 ∀R (intersection(L1,L2,R) ← ∃A ( ¬ (element(A,L1) ∧ ¬ element(A,L2)) ∧ ¬ (element(A,L1) ∧ ¬ element(A,R))))

So for all arguments L1, L2, R it suffices that there is some A that satisfies the goals, which explains the derivation of the sublists of the intersection and the multiple occurrences of elements.
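To try the ground queries from above, the formula can be transcribed back into a clause. This is a reconstruction under the assumption that your predicate has exactly the shape of the formula. The name intersection_as_formula/3 and the definition of element/2 are mine, and (\+)/1 stands in for not/1.

```prolog
% element(X, Xs): X occurs in the list Xs (assumed definition).
element(X, [X|_]).
element(X, [_|Xs]) :-
    element(X, Xs).

% Direct transcription of the formula; the first argument plays the
% role of the intersection, as in the queries above.
intersection_as_formula(L1, L2, R) :-
    \+ ( element(A, L1), \+ element(A, L2) ),
    \+ ( element(A, L1), \+ element(A, R) ).
```

Under this reading every ground list whose elements all occur in both other lists is accepted, e.g. [2,3], [2], [], [2,2,3] and [2,2,2] for [1,2,3] and [2,3,4], whereas [1] is rejected because 1 does not occur in [2,3,4].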

However, it is much more annoying that the query

   ?- intersection(L1,[1,2,3],[2,3,4]).

loops instead of producing solutions. If you consider that L1 is not instantiated and look at the results for the following query

   ?- element(A,L1).
L1 = [A|_A] ? ;
L1 = [_A,A|_B] ? ;
L1 = [_A,_B,A|_C] ? ;
...

it becomes clear that the query

   ?- element(A,L1),not(element(A,[1,2,3])).

has to loop due to the infinitely many lists L1 that contain A, as described by the first goal. Hence the corresponding conjunction in your predicate has to loop as well. In addition to producing results, it would also be nice if such a predicate mirrored the relational nature of Prolog and worked the other way around too (with the 2nd or 3rd argument being a variable). Let's compare your code with such a solution. (For the sake of comparison the following predicate describes sublists of the intersection just as your code does; for a different definition see further below.)

To reflect its declarative nature, let's call it list_list_intersection/3:

list_list_intersection(_,_,[]).
list_list_intersection(L1,L2,[A|As]) :-
   list_element_removed(L1,A,L1noA),
   list_element_removed(L2,A,L2noA),
   list_list_intersection(L1noA,L2noA,As).

list_element_removed([X|Xs],X,Xs).
list_element_removed([X|Xs],Y,[X|Ys]) :-
   dif(X,Y),
   list_element_removed(Xs,Y,Ys).

Like your predicate this version is also using the elements of the intersection to describe the relation. Hence it's producing the same sublists (including []):

   ?- list_list_intersection([1,2,3],[2,3,4],I).
I = [] ? ;
I = [2] ? ;
I = [2,3] ? ;
I = [3] ? ;
I = [3,2] ? ;
no

but without looping. However, multiple occurrences are not produced anymore as already matched elements are removed by list_element_removed/3. But multiple occurrences in both of the first lists are matched correctly:

   ?- list_list_intersection([1,2,2,3],[2,2,3,4],[2,2,3]).
yes

This predicate also works in the other directions:

   ?- list_list_intersection([1,2,3],L,[2,3]).
L = [2,3|_A] ? ;
L = [2,_A,3|_B],
dif(_A,3) ? ;
L = [2,_A,_B,3|_C],
dif(_A,3),
dif(_B,3) ? ;
...

   ?- list_list_intersection(L,[2,3,4],[2,3]).
L = [2,3|_A] ? ;
L = [2,_A,3|_B],
dif(_A,3) ? ;
L = [2,_A,_B,3|_C],
dif(_A,3),
dif(_B,3) ? ;
...

So this version corresponds to your code without the duplicates. Note how the element A of the intersection explicitly appears in the head of the rule, where all elements of the intersection are walked through recursively. I believe this is what you tried to achieve by utilizing the implicit universal quantifiers in front of Prolog rules.

To come back to a point in the beginning of my answer, this is not what is commonly understood as the intersection. Among all the results list_list_intersection/3 describes for the arguments [1,2,3] and [2,3,4] only [2,3] is the intersection. Here another issue with your code comes to light: If you use the elements of the intersection to describe the relation, how do you make sure you cover all intersecting elements? After all, all elements of [2] occur in [1,2,3] and [2,3,4]. An obvious idea would be to walk through the elements of one of the other lists and describe those occurring in both as also being in the intersection. Here is a variant using if_/3 and memberd_t/3:

list_list_intersection([],_L2,[]).
list_list_intersection([X|Xs],L2,I) :-
   if_(memberd_t(X,L2),
       (I=[X|Is],list_element_removed(L2,X,L2noX)),
       (I=Is,L2noX=L2)),
   list_list_intersection(Xs,L2noX,Is).
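Neither if_/3 nor memberd_t/3 is a built-in predicate. Both are usually loaded from library(reif). In case that library is not at hand, the following self-contained sketch shows simplified stand-ins together with the predicate from above. The helper eq_t/3 is a hypothetical name for reified term equality (library(reif) uses (=)/3 for that), the definitions are simplified compared to the library versions, and dif/2 is assumed to be available, as it is e.g. in SWI-Prolog and SICStus.

```prolog
% if_(If_1, Then_0, Else_0): simplified version; If_1 is called with
% one extra argument that has to become true or false.
if_(If_1, Then_0, Else_0) :-
    call(If_1, T),
    (   T == true  -> call(Then_0)
    ;   T == false -> call(Else_0)
    ;   throw(error(instantiation_error, if_/3))
    ).

% eq_t(X, Y, T): T = true if X and Y are equal, T = false if they differ.
% If that cannot be decided yet, both cases are described in turn.
eq_t(X, Y, T) :-
    (   X == Y -> T = true
    ;   X \= Y -> T = false
    ;   T = true,  X = Y
    ;   T = false, dif(X, Y)
    ).

% memberd_t(E, Xs, T): T = true if E occurs in Xs, T = false otherwise.
memberd_t(_, [], false).
memberd_t(E, [X|Xs], T) :-
    if_(eq_t(X, E), T = true, memberd_t(E, Xs, T)).

list_element_removed([X|Xs], X, Xs).
list_element_removed([X|Xs], Y, [X|Ys]) :-
    dif(X, Y),
    list_element_removed(Xs, Y, Ys).

list_list_intersection([], _L2, []).
list_list_intersection([X|Xs], L2, I) :-
    if_(memberd_t(X, L2),
        ( I = [X|Is], list_element_removed(L2, X, L2noX) ),
        ( I = Is, L2noX = L2 )),
    list_list_intersection(Xs, L2noX, Is).
```

For ground lists such as [1,2,3] and [2,3,4] this yields [2,3] as the only solution. For anything beyond experimenting, prefer the definitions from library(reif) over these stand-ins.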

Note that it is also possible to walk through the elements of the second list instead of the first one. The predicate memberd_t/3 is a reified variant of your predicate element/2, and list_element_removed/3 is again used in the description to avoid duplicates in the solution. Now the solution is unique

   ?- list_list_intersection([1,2,3],[2,3,4],L).
L = [2,3] ? ;
no

and the "problem queries" from above fail as expected:

   ?- list_list_intersection([1,2,3],[2,3,4],[]).
no
   ?- list_list_intersection([1,2,3],[2,3,4],[2]).
no
   ?- list_list_intersection([1,2,3],[2,3,4],[3]).
no
   ?- list_list_intersection([1,2,3],[2,3,4],[2,2,3]).
no
   ?- list_list_intersection([1,2,3],[2,3,4],[2,2,2]).
no

And of course you can also use the predicate in the other directions:

   ?- list_list_intersection([1,2,3],L,[2,3]).
L = [2,3] ? ;
L = [3,2] ? ;
L = [2,3,_A],
dif(_A,1) ? ;
...

   ?- list_list_intersection(L,[2,3,4],[2,3]).
L = [2,3] ? ;
L = [2,3,_A],
dif(4,_A) ? ;
...
