Fastest Algorithm to Find the Minimum Sum of Absolute Differences through List Rotation

Since Python treats negative indexes as counting from the right end, you could sum the absolute value of list1 minus (list2 shifted by k) where 0 ≤ k < len(list1) as

sum(abs(list1[i] - list2[i - k]) for i in range(len(list1)))

If you want the minimum of all of these values

length = len(list1)
min(sum(abs(list1[i] - list2[i - k]) for i in range(length))
    for k in range(length))

This code is still O(n^2), but there is a lot less pushing and popping going on.

I really can't think of any way to make the algorithm faster than O(n^2).


An optimized mix of your original and Frank's accepted answer:

min(list1.append(list1.pop(0)) or
    sum(abs(x - y) for x, y in zip(list1, list2))
    for _ in list1)

Bit dirty to have the rotation in there like that, but hey, you're asking for "Fastest" :-)

Benchmark with lists of length 1000:

    original     Frank_Yellin   superb_rain  
     127 ms         164 ms         125 ms    
     140 ms         170 ms         117 ms    
     134 ms         166 ms         116 ms    
     124 ms         161 ms         126 ms    
     135 ms         164 ms         126 ms    

Benchmark code:

from timeit import repeat
from random import shuffle

def original(list1, list2):
    choices = []                # Put all possible sums into a list to find the minimum value.
    for j in range(len(list1)):  # List1 does a full rotation
        total = 0
        for k in range(len(list1)):
            total += abs(list1[k] - list2[k])
        list1.append(list1.pop(0))
        choices.append(total)
    return min(choices)

def Frank_Yellin(list1, list2):
    length = len(list1)
    return min(sum(abs(list1[i] - list2[i - k]) for i in range(length))
    for k in range(length))

def superb_rain(list1, list2):
    return min(list1.append(list1.pop(0)) or
               sum(abs(x - y) for x, y in zip(list1, list2))
               for _ in list1)

funcs = [
    (10, original),
    (10, Frank_Yellin),
    (10, superb_rain),
    ]

list1 = list(range(1000))
list2 = list1.copy()
shuffle(list2)

for _, f in funcs:
    print(f(list1, list2))

for _, f in funcs:
    print(f.__name__.center(15), end='')
print()

for _ in range(5):
    for number, f in funcs:
        t = min(repeat(lambda: f(list1, list2), number=number)) / number
        print('%8d ms    ' % (t * 1e3), end='')
    print()

I haven't cracked the full problem, but in the special case where the input values are all 0 or 1 (or any two different values, or any of O(1) different values, but we'll need another idea to get much further than that), we can get an O(n log n)-time algorithm by applying fast convolution.

The idea is to compute all of the sums of absolute differences as List1 * reverse(1 - List2) + (1 - List1) * reverse(List2) where 1 - List means doing that operation point-wise and * denotes circular convolution (computable in time O(n log n) using a pair of FFTs). The definition of circular convolution here is

             n-1
             __
             \
(f * g)(i) = /_  f(j) g((i - j) mod n).
             j=0

Substituting List1 for f and reverse(1 - List2) for g, we get

                                  n-1
                                  __
                                  \
(List1 * reverse(1 - List2))(i) = /_ List1(j) (1 - List2((n-1-(i-j)) mod n))
                                  j=0

                                  n-1
                                  __
                                  \
                                = /_ List1(j) (1 - List2((j-(i+1)) mod n)).
                                  j=0

The product List1(j) (1 - List2((j-(i+1)) mod n)) is 1 if and only if List1(j) = 1 and List2((j-(i+1)) mod n) = 0, and 0 otherwise. Thus the i value of the convolution counts the number of places where List1 has a 1 offset i+1 circularly to the left of where List2 has a 0. The other convolution counts 0s corresponding to 1s. Given our input restrictions, this is the sum of absolute differences.

Code:

import numpy


def convolve_circularly(a1, a2):
    return numpy.round(numpy.abs(numpy.fft.ifft(numpy.fft.fft(a1) * numpy.fft.fft(a2))))


def min_sum_abs_diff(a1, a2):
    a1 = numpy.array(a1)
    a2 = numpy.array(a2)[::-1]
    return numpy.min(convolve_circularly(a1, 1 - a2) + convolve_circularly(1 - a1, a2))


def slow_min_sum_abs_diff(a1, a2):
    return min(
        sum(abs(a1[i] - a2[i - k]) for i in range(len(a1))) for k in range(len(a2))
    )


def main():
    n = 100
    for r in range(100000):
        a1 = numpy.random.randint(2, size=n)
        a2 = numpy.random.randint(2, size=n)
        r = min_sum_abs_diff(a1, a2)
        slow_r = slow_min_sum_abs_diff(a1, a2)
        if r != slow_r:
            print(a1, a2, r, slow_r)
            break


if __name__ == "__main__":
    main()