Numpy slicing with bound checks

If you used range instead of the common slicing notation you could get the expected behaviour. For example for a valid slicing:

arr[range(2),:]

array([[1., 1.],
       [1., 1.]])

And if we tried to slice with for instance:

arr[range(5),:]

It would throw the following error:

IndexError: index 2 is out of bounds for size 2

My guess on why this throws an error is that slicing with common slice notation is a basic property in numpy arrays as well as lists, and thus instead of throwing an index out of range error when we try to slice with wrong indices, it already contemplates this and cuts to the nearest valid indices. Whereas this is apparently not contemplated when slicing with a range, which is an immutable object.

This ended up a bit longer than expected, but you can write your own wrapper that checks the get operations to make sure that slices do not go beyond limits (indexing arguments that are not slices are already checked by NumPy). I think I covered all cases here (ellipsis, np.newaxis, negative steps...), although there might be some failing corner case still.

import numpy as np

# Wrapping function
def bounds_checked_slice(arr):
    return SliceBoundsChecker(arr)

# Wrapper that checks that indexing slices are within bounds of the array
class SliceBoundsChecker:

    def __init__(self, arr):
        self._arr = np.asarray(arr)

    def __getitem__(self, args):
        # Slice bounds checking
        self._check_slice_bounds(args)
        return self._arr.__getitem__(args)

    def __setitem__(self, args, value):
        # Slice bounds checking
        self._check_slice_bounds(args)
        return self._arr.__setitem__(args, value)

    # Check slices in the arguments are within bounds
    def _check_slice_bounds(self, args):
        if not isinstance(args, tuple):
            args = (args,)
        # Iterate through indexing arguments
        arr_dim = 0
        i_arg = 0
        for i_arg, arg in enumerate(args):
            if isinstance(arg, slice):
                self._check_slice(arg, arr_dim)
                arr_dim += 1
            elif arg is Ellipsis:
                break
            elif arg is np.newaxis:
                pass
            else:
                arr_dim += 1
        # Go backwards from end after ellipsis if necessary
        arr_dim = -1
        for arg in args[:i_arg:-1]:
            if isinstance(arg, slice):
                self._check_slice(arg, arr_dim)
                arr_dim -= 1
            elif arg is Ellipsis:
                raise IndexError("an index can only have a single ellipsis ('...')")
            elif arg is np.newaxis:
                pass
            else:
                arr_dim -= 1

    # Check a single slice
    def _check_slice(self, slice, axis):
        size = self._arr.shape[axis]
        start = slice.start
        stop = slice.stop
        step = slice.step if slice.step is not None else 1
        if step == 0:
            raise ValueError("slice step cannot be zero")
        bad_slice = False
        if start is not None:
            start = start if start >= 0 else start + size
            bad_slice |= start < 0 or start >= size
        else:
            start = 0 if step > 0 else size - 1
        if stop is not None:
            stop = stop if stop >= 0 else stop + size
            bad_slice |= (stop < 0 or stop > size) if step > 0 else (stop < 0 or stop >= size)
        else:
            stop = size if step > 0 else -1
        if bad_slice:
            raise IndexError("slice {}:{}:{} is out of bounds for axis {} with size {}".format(
                slice.start if slice.start is not None else '',
                slice.stop if slice.stop is not None else '',
                slice.step if slice.step is not None else '',
                axis % self._arr.ndim, size))

A small demo:

import numpy as np

a = np.arange(24).reshape(4, 6)
print(bounds_checked_slice(a)[:2, 1:5])
# [[ 1  2  3  4]
#  [ 7  8  9 10]]
bounds_checked_slice(a)[:2, 4:10]
# IndexError: slice 4:10: is out of bounds for axis 1 with size 6

If you wanted, you could even make this a subclass of ndarray, so you get this behavior by default, instead of having to wrap the array every time.

Also, note that there may be some variations as to what you may consider to be "out of bounds". The code above considers that going even one index beyond the size is out of bounds, meaning that you cannot take an empty slice with something like arr[len(arr):]. You could in principle edit the code if you were thinking of a slightly different behavior.

Numpy slicing with bound checks

Tags:

Python

Numpy

Related

Recent Posts