From ND to 1D arrays

I wanted to see a benchmark result of functions mentioned in answers including unutbu's.

Also want to point out that numpy doc recommend to use arr.reshape(-1) in case view is preferable. (even though ravel is tad faster in the following result)


TL;DR: np.ravel is the most performant (by very small amount).

Benchmark

Functions:

  • np.ravel: returns view, if possible
  • np.reshape(-1): returns view, if possible
  • np.flatten: returns copy
  • np.flat: returns numpy.flatiter. similar to iterable

numpy version: '1.18.0'

Execution times on different ndarray sizes

+-------------+----------+-----------+-----------+-------------+
|  function   |   10x10  |  100x100  | 1000x1000 | 10000x10000 |
+-------------+----------+-----------+-----------+-------------+
| ravel       | 0.002073 |  0.002123 |  0.002153 |    0.002077 |
| reshape(-1) | 0.002612 |  0.002635 |  0.002674 |    0.002701 |
| flatten     | 0.000810 |  0.007467 |  0.587538 |  107.321913 |
| flat        | 0.000337 |  0.000255 |  0.000227 |    0.000216 |
+-------------+----------+-----------+-----------+-------------+

Conclusion

ravel and reshape(-1)'s execution time was consistent and independent from ndarray size. However, ravel is tad faster, but reshape provides flexibility in reshaping size. (maybe that's why numpy doc recommend to use it instead. Or there could be some cases where reshape returns view and ravel doesn't).
If you are dealing with large size ndarray, using flatten can cause a performance issue. Recommend not to use it. Unless you need a copy of the data to do something else.

Used code

import timeit
setup = '''
import numpy as np
nd = np.random.randint(10, size=(10, 10))
'''

timeit.timeit('nd = np.reshape(nd, -1)', setup=setup, number=1000)
timeit.timeit('nd = np.ravel(nd)', setup=setup, number=1000)
timeit.timeit('nd = nd.flatten()', setup=setup, number=1000)
timeit.timeit('nd.flat', setup=setup, number=1000)

In [14]: b = np.reshape(a, (np.product(a.shape),))

In [15]: b
Out[15]: array([1, 2, 3, 4, 5, 6])

or, simply:

In [16]: a.flatten()
Out[16]: array([1, 2, 3, 4, 5, 6])

Use np.ravel (for a 1D view) or np.ndarray.flatten (for a 1D copy) or np.ndarray.flat (for an 1D iterator):

In [12]: a = np.array([[1,2,3], [4,5,6]])

In [13]: b = a.ravel()

In [14]: b
Out[14]: array([1, 2, 3, 4, 5, 6])

Note that ravel() returns a view of a when possible. So modifying b also modifies a. ravel() returns a view when the 1D elements are contiguous in memory, but would return a copy if, for example, a were made from slicing another array using a non-unit step size (e.g. a = x[::2]).

If you want a copy rather than a view, use

In [15]: c = a.flatten()

If you just want an iterator, use np.ndarray.flat:

In [20]: d = a.flat

In [21]: d
Out[21]: <numpy.flatiter object at 0x8ec2068>

In [22]: list(d)
Out[22]: [1, 2, 3, 4, 5, 6]

Tags:

Python

Numpy