What does the built-in function sum do with sum(list, [])?

Don't you think that start should be a number?

start is a number by default: 0, per the documentation you've quoted. Hence, when you do e.g.:

sum((1, 2))

it is evaluated as 0 + 1 + 2 and it equals 3 and everyone's happy. If you want to start from a different number, you can supply that instead:

>>> sum((1, 2), 3)
6

So far, so good.


However, there are other things you can use + on, like lists:

>>> ['foo'] + ['bar']
['foo', 'bar']

If you try to use sum for this, though, expecting the same result, you get a TypeError:

>>> sum((['foo'], ['bar']))

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    sum((['foo'], ['bar']))
TypeError: unsupported operand type(s) for +: 'int' and 'list'

because it's now doing 0 + ['foo'] + ['bar'].

To fix this, you can supply your own start as [], so it becomes [] + ['foo'] + ['bar'] and all is good again. So to answer:

Why [] can be written here?

because although start defaults to a number, it doesn't have to be one; other things can be added too, and that comes in handy for things exactly like what you're currently doing.
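A rough pure-Python sketch of this behaviour may make it easier to see. The helper name my_sum is hypothetical, and the real built-in is written in C with extra fast paths and type checks, so treat this as an approximation rather than the actual implementation:

```python
def my_sum(iterable, start=0):
    """Approximate the built-in sum: fold + over the items, beginning with start."""
    result = start
    for item in iterable:
        # Plain binary +, so the caller's start object is never modified.
        result = result + item
    return result


lists = [['foo'], ['bar']]

print(my_sum((1, 2), 3))   # numbers: 3 + 1 + 2
print(my_sum(lists, []))   # lists: [] + ['foo'] + ['bar']

try:
    my_sum(lists)          # default start: 0 + ['foo'] fails
except TypeError as exc:
    print('TypeError:', exc)
```

With start=[] the very first addition is list + list, which is why the call succeeds.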


First of all, never use sum for concatenating or flattening lists: it takes quadratic time, so it is far less efficient than the alternatives. It is effectively a Schlemiel the Painter's algorithm, since every + builds a new list and copies all the elements accumulated so far.
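For comparison, here is a small sketch of two common linear-time ways to flatten one level of nesting. The variable names are only illustrative:

```python
from itertools import chain

nested = [[1, 2], [3, 4], [5, 6]]

# sum builds a brand-new list at every step, copying everything so far.
flat_sum = sum(nested, [])

# chain.from_iterable walks each inner list exactly once.
flat_chain = list(chain.from_iterable(nested))

# A nested list comprehension does the same without an import.
flat_comp = [item for inner in nested for item in inner]

print(flat_sum == flat_chain == flat_comp)
```

All three give the same list; the difference only shows up in running time as the number of inner lists grows.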

On each iteration, sum adds the next item of the iterable (the first argument) to the running total, which begins as start. For lists, that addition calls the __add__ method of the running total with the item as its argument.

For example:

>>> [].__add__([2,3])
[2, 3]
#OR
>>> [] + [1,2,3]
[1, 2, 3]

And in this case the result would be a concatenated list of your input lists. From an algorithmic perspective, it does the following:

>>> a = [[1, 2], [3, 4], [5, 6]]
>>> start = []
>>> for i in a:
...     start = start + i  # sum uses +, not +=, so a new list is built each time
... 
>>> start
[1, 2, 3, 4, 5, 6]

Note that you can call the sum function on any sequence of objects that have an __add__ attribute, but since the default start argument is 0, it will raise a TypeError if your objects cannot be added to an integer. In that case you need to specify a proper start for the function.

>>> class newObj(object):
...     def __init__(self, val):
...         self.val = val
...     def __add__(self, item):
...         return '{}_____{}'.format(self.val, item)
...
>>> start = newObj('new_obj')
>>> start
<__main__.newObj object at 0x7f75f9241c50>
>>> start + 5
'new_obj_____5'
>>> sum(['1', '2', '3'], start)
'new_obj_____123'

You sum the start with the contents of the iterable you provide as the first argument. sum doesn't restrict the type of start to an int in order to allow for various cases of adding.
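As a quick illustration, a tuple works as start just like a list does. One caveat worth knowing: a str start is rejected by sum on purpose, and str.join is the intended tool for that case. A minimal sketch with made-up sample data:

```python
pairs = [(1, 2), (3, 4)]

# Tuples support +, so an empty tuple works as start.
print(sum(pairs, ()))

# Strings also support +, but sum refuses a str start on purpose.
try:
    sum(['a', 'b'], '')
except TypeError as exc:
    print('TypeError:', exc)

# str.join is the intended tool for strings.
print(''.join(['a', 'b']))
```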

Essentially sum does something like this:

a = [[1, 2], [3, 4], [5, 6]]
sum(a, number)

Roughly translates to:

for value in a:
    number = number + value

Since every value in the list a is itself a list, the previous summation, when expanded, looks like this:

number + [1, 2] + [3, 4] + [5, 6]

So if you pass an int as number, this will result in an unfortunate TypeError, because adding an int and a list is not allowed.

1 + [1, 2] == I hope you like TypeErrors

However, if you pass an empty list [], it simply joins the elements of a together and results in the flattened list we know and love.

The value of start defaults to 0, an int, mainly because the most common case of summation is arithmetic.
