What is the fastest algorithm for intersection of two sorted lists?

Assume that A has m elements and B has n elements, with m ≥ n. Information theoretically, the best we can do is

   (m + n)!
lg -------- = n lg (m/n) + O(n)
    m!  n!

comparisons, since in order to verify an empty intersection, we essentially have to perform a sorted merge. We can get within a constant factor of this bound by iterating through B and keeping a "cursor" in A indicating the position at which the most recent element of B should be inserted to maintain sorted order. We use exponential search to advance the cursor, for a total cost that is on the order of

lg x_1 + lg x_2 + ... + lg x_n,

where x_1 + x_2 + ... + x_n = m + n is some integer partition of m. This sum is O(n lg (m/n)) by the concavity of lg.


I don't know if this is the fastest option but here's one that runs in O(n+m) where n and m are the sizes of your lists:

  • Loop over both lists until one of them is empty in the following way:
  • Advance by one on one list.
  • Advance on the other list until you find a value that is either equal or greater than the current value of the other list.
  • If it is equal, the element belongs to the intersection and you can append it to another list
  • If it is greater that the other element, advance on the other list until you find a value equal or greater than this value
  • as said, repeat this until one of the lists is empty

Tags:

Algorithm