How to accelerate the application of the following for loop and function?

list_list_int = [1,2,3,4,5,6]
for j in chunks(2, list_list_int):
  for i in j:
    avg_, max_, last_ = foo(bar, i)

I don't have chunks installed, but from the docs I suspect that, for size-2 chunks, it takes alist and produces successive values of j:

alist = [[1,2],[3,4],[5,6],[7,8]]
j = [[1,2],[3,4]]
j = [[5,6],[7,8]]
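
For reference, a minimal sketch of what such a chunks(n, seq) helper presumably does (an assumption on my part; I haven't checked the actual implementation):

def chunks(n, seq):
    # Yield successive size-n slices of seq (assumed behavior of the real chunks)
    for i in range(0, len(seq), n):
        yield seq[i:i + n]

alist = [[1,2],[3,4],[5,6],[7,8]]
for j in chunks(2, alist):
    print(j)
# [[1, 2], [3, 4]]
# [[5, 6], [7, 8]]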

Indexing alist with such a j would produce an error:

In [116]: alist[j]                                                              
TypeError: list indices must be integers or slices, not list

And if your foo can't work with the full list of lists, I don't see how it will work with that list split into chunks. Apparently it can only work with one sublist at a time.
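
If that's the case, and foo itself is expensive, a plain multiprocessing pool over the sublists is one way to get a real speed-up. A sketch, assuming foo is CPU-bound and takes bar plus one sublist (the foo and bar below are stand-ins for yours):

import multiprocessing as mp
from functools import partial

def foo(bar, sublist):
    # Stand-in for your function: returns (avg, max, last) for one sublist
    return sum(sublist) / len(sublist), max(sublist), sublist[-1]

bar = None  # placeholder for whatever your real bar is

if __name__ == "__main__":
    list_list_int = [[1, 2], [3, 4], [5, 6], [7, 8]]
    with mp.Pool() as pool:
        # One task per sublist; the pool spreads foo(bar, sublist) across processes
        results = pool.map(partial(foo, bar), list_list_int)
    for avg_, max_, last_ in results:
        print(avg_, max_, last_)

Note that there's no need for chunks here at all; the pool does its own batching of tasks.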


If you are looking to perform parallel operations on a numpy array, then I would use Dask.

With just a few lines of code, your operation can easily be run on multiple processes, and the highly developed Dask scheduler will balance the load for you. A huge benefit of Dask over other parallel libraries like joblib is that it maintains the native numpy API.

import dask.array as da

# Set up a random array with 10,000 rows and 10 columns.
# The data is split into 10 chunks of shape (1_000, 10), so each chunk keeps all the columns together.
x = da.random.random((10_000, 10), chunks=(1_000, 10))
x = x.persist()  # Allow the entire array to persist in memory to speed up calculation


def foo(x):
    return x / 10


# Using Dask's version of numpy's apply_along_axis, apply foo to each row of the array in parallel
result_foo = da.apply_along_axis(foo, 1, x)

# View original contents
x[0:10].compute()

# View sample of results
result_foo = result_foo.compute()
result_foo[0:10]
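
If your real foo returns several values per sublist, like the avg_, max_, last_ in the question, the same pattern still applies. A sketch with a hypothetical foo_row, assuming the per-row logic can be written in terms of numpy operations:

import numpy as np

def foo_row(row):
    # Hypothetical stand-in: returns (avg, max, last) for one 1-D row
    return np.array([row.mean(), row.max(), row[-1]])

# axis=1 applies foo_row to each of the 10_000 rows; the result has shape (10_000, 3)
stats = da.apply_along_axis(foo_row, 1, x)
stats[0:5].compute()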