Efficient algorithm to find the sum of all concatenated pairs of integers in a list

The concatenation of two integers:

m ∘ n

is equal to:

10**digit_length(n) * m + n

so the sum of the concatenations of every list item with a given integer:

(a[0] ∘ n) + (a[1] ∘ n) + …

is equal to:

(10**digit_length(n) * a[0] + n) + (10**digit_length(n) * a[1] + n) + …

and you can put all the ns on one side:

(10**digit_length(n) * a[0]) + (10**digit_length(n) * a[1]) + … + n + n + …

and note that each element of the array is multiplied by a value that only depends on n:

10**digit_length(n) * (a[0] + a[1] + …) + n + n + …

simplifying again:

10**digit_length(n) * sum(a) + len(a) * n

sum(a) doesn’t change, and the sum of len(a) * ns across all ns is len(a) * sum(a):

def concatenationsSum(a):
    sum_a = sum(a)
    return sum(10**digit_length(n) * sum_a for n in a) + len(a) * sum_a


def digit_length(n):
    """
    The number of base-10 digits in an integer.

    >>> digit_length(256)
    3

    >>> digit_length(0)
    1
    """
    return len(str(n))

This runs in linear time when the upper bound on the integers involved is constant. You can also use math.log10 to make digit_length faster as long as floating-point math is precise enough for the integer sizes involved (and if not, there are still better ways to implement it than going through a string – but probably no shorter or more understandable ways).


Instead of prepending each number with every number separately, just prepend it once with the sum. Well, then it appears as the tail only once instead of N times, so just add it N-1 more times (or equivalently, overall add the sum N-1 times).

def concatenationsSum(a):
    sum_ = sum(a)
    return sum(int(str(sum_) + str(x)) for x in a) + (len(a) - 1) * sum_

Runtime is O(N). Demo at repl.it for only 1000 values, output:

original result 460505045000 in 3.3822 seconds
  faster result 460505045000 in 0.0017 seconds
Same result? True

It's impossible to efficiently generate each number seperately. What you can do, however, is to try to calculate the result without necessarly generating the individual values.

Numbers in the array are up to 10^6. That means each number has from 1 to 7 digits. Put all the numbers into groups: in a single group there should be numbers with the same amount of digits. There will be up to 7 groups. That you can do in O(n) (for the next steps only the sizes of the groups actually matter, you don't have to physically create 7 lists of numbers)

Consider an integer X in the array. You will concatenate it with the rest of the numbers in the array. Concatenation with an integer Y with K digits can be seen as: X * 10^K + Y. You want to calculate the sum of the concatenations, it's much easier to calculate how many times each digit will actually act as Y (exactly N-1 times, where N is a size of the array) and how many times it will be an X with a specific K value (there are only 7 possible Ks, check how many integers are in each of the groups; for example if you are considering K = 4, the amount is equal to the size of the group 4). You can do that in O(1).

The last step is to calculate the result using the previous computations. This is quite straightforward, for each number V in the array you add to the result V * Y_V, V * 10 * X_V_1, Y * 100 * Y_V_2, ..., where Y_V equals to the number of concatenations where V acts as Y, X_V_K equals to the number of concatenations where V acts as X with an integer Y with K digits. Having all the values already calculated, it takes O(n) time.