Sum along axis in numpy array

Here is another way to interpret this. You can consider a multi-dimensional array as a tensor, T[i][j][k], where the indices i, j, k correspond to axes 0, 1, 2 respectively.

T.sum(axis = 0) is mathematically equivalent to:

result[j][k] = Σ_i T[i][j][k]

Similarly, T.sum(axis = 1):

result[i][k] = Σ_j T[i][j][k]

And T.sum(axis = 2):

result[i][j] = Σ_k T[i][j][k]

In other words, the axis you specify is the one that gets summed over; for instance, with axis = 0 the first index is summed over. Written as a for loop:

result[j][k] = sum(T[i][j][k] for i in range(T.shape[0])) for all j,k

for axis = 1:

result[i][k] = sum(T[i][j][k] for j in range(T.shape[1])) for all i,k

etc.
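The loop form above can be checked directly against NumPy. This is a minimal sketch; the array T and the names result0 and result1 are chosen here for illustration only:

```python
import numpy as np

T = np.arange(30).reshape(2, 3, 5)

# axis=0: sum over the first index i, leaving a result indexed by (j, k)
result0 = np.zeros((T.shape[1], T.shape[2]), dtype=T.dtype)
for j in range(T.shape[1]):
    for k in range(T.shape[2]):
        result0[j][k] = sum(T[i][j][k] for i in range(T.shape[0]))

# axis=1: sum over the second index j, leaving a result indexed by (i, k)
result1 = np.zeros((T.shape[0], T.shape[2]), dtype=T.dtype)
for i in range(T.shape[0]):
    for k in range(T.shape[2]):
        result1[i][k] = sum(T[i][j][k] for j in range(T.shape[1]))

print(np.array_equal(result0, T.sum(axis=0)))  # True
print(np.array_equal(result1, T.sum(axis=1)))  # True
```

The explicit loops are only there to make the indexing visible; in practice T.sum(axis=...) is the idiomatic (and much faster) way to do this.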


If you want to keep the dimensions you can specify keepdims:

>>> import numpy as np
>>> arr = np.arange(0,30).reshape(2,3,5)
>>> arr.sum(axis=0, keepdims=True)
array([[[15, 17, 19, 21, 23],
        [25, 27, 29, 31, 33],
        [35, 37, 39, 41, 43]]])

Otherwise the axis you sum along is removed from the shape. An easy way to keep track of this is using the numpy.ndarray.shape property:

>>> arr.shape
(2, 3, 5)

>>> arr.sum(axis=0).shape
(3, 5)  # the first entry (index = axis = 0) was removed

>>> arr.sum(axis=1).shape
(2, 5)  # the second entry (index = axis = 1) was removed
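To see both behaviours side by side, a short sketch (using the same arr as above) can print the shape for each axis with and without keepdims. With keepdims=True the summed axis stays in the shape with length 1 instead of disappearing:

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

for axis in range(arr.ndim):
    reduced = arr.sum(axis=axis)               # summed axis is dropped
    kept = arr.sum(axis=axis, keepdims=True)   # summed axis kept with length 1
    print(axis, reduced.shape, kept.shape)
# 0 (3, 5) (1, 3, 5)
# 1 (2, 5) (2, 1, 5)
# 2 (2, 3) (2, 3, 1)
```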

You can also sum along multiple axes if you want (reducing the dimensionality by the number of specified axes):

>>> arr.sum(axis=(0, 1))
array([75, 81, 87, 93, 99])
>>> arr.sum(axis=(0, 1)).shape
(5,)  # the first and second entries were removed
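Summing over a tuple of axes gives the same result as summing one axis at a time. The sketch below (variable names both and step are illustrative) shows this, with one thing to watch: once axis 0 is gone, what used to be axis 1 has become axis 0.

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

both = arr.sum(axis=(0, 1))

# After summing axis 0 the shape is (3, 5), so the old axis 1 is now axis 0.
step = arr.sum(axis=0).sum(axis=0)

print(np.array_equal(both, step))                   # True
print(arr.sum(axis=(0, 1), keepdims=True).shape)    # (1, 1, 5)
```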

numpy displays a (2,3,5) array as 2 blocks (or 'planes') of 3x5 arrays (3 rows, 5 columns). MATLAB would show it as 5 blocks of 2x3.

The numpy display also matches a nested list: a list of two sublists, each containing 3 sublists, each of which is 5 elements long.
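The nested-list correspondence can be checked with tolist(). A small sketch, assuming the same 2x3x5 array as in the question (the name nested is illustrative):

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)
nested = arr.tolist()  # plain Python lists with the same nesting as the display

print(len(nested), len(nested[0]), len(nested[0][0]))  # 2 3 5
print(nested[1][2])  # [25, 26, 27, 28, 29]
```

The lengths at each nesting level are exactly the entries of arr.shape, read from the outside in.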

In the 3x5 2d case, axis 0 sums along the size 3 dimension, resulting in a 5 element array. The descriptions 'sum over rows' or 'sum along columns' are a little vague in English. Focus on the results, the change in shape, and which values are being summed, not on the description.
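For the 2d case, a quick sketch makes the result concrete (the 3x5 array m is just an illustrative example):

```python
import numpy as np

m = np.arange(15).reshape(3, 5)
# [[ 0  1  2  3  4]
#  [ 5  6  7  8  9]
#  [10 11 12 13 14]]

print(m.sum(axis=0))  # [15 18 21 24 27]  -> size 3 dimension removed, 5 elements left
print(m.sum(axis=1))  # [10 35 60]        -> size 5 dimension removed, 3 elements left
```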

Back to the 3d case:

With axis=0, it sums along the 1st dimension, effectively removing it, leaving us with a 3x5 array. 0+15=15, 1+16=17 etc.

Axis 1 condenses the size 3 dimension; the result is 2x5. 0+5+10=15, etc.

Axis 2 condenses the size 5 dimension; the result is 2x3. The first entry is sum((0,1,2,3,4)) = 10.
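The individual values quoted for the 3d case can be confirmed with a few lines, assuming the same arr = np.arange(30).reshape(2, 3, 5) from the question:

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

print(arr.sum(axis=0)[0, :2])  # [15 17]  -> 0+15 and 1+16
print(arr.sum(axis=1)[0, 0])   # 15       -> 0+5+10
print(arr.sum(axis=2)[0, 0])   # 10       -> 0+1+2+3+4
```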

Your example is good, since the 3 dimensions are different, and it is easier to see which one was eliminated during the sum.

With 2d there's some ambiguity; 'sum over rows' - does that mean the rows are eliminated or retained? With 3d there's no ambiguity; with axis=0, you can only remove it, leaving the other 2.