Boolean masking on multiple axes with numpy

The NumPy documentation on boolean array indexing describes X[mask1, mask2] as equivalent to:

In [249]: X[mask1.nonzero()[0], mask2.nonzero()[0]]
Out[249]: array([1, 5])
In [250]: X[[0,1], [0,1]]
Out[250]: array([1, 5])

In effect, it gives you X[0,0] and X[1,1]: the two index arrays are paired element by element (the 0s together and the 1s together).
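The pairing can be reproduced with a small script. The session above does not show how X, mask1 and mask2 were built, so the definitions below are assumptions chosen only to match the printed outputs.

```python
import numpy as np

# Assumed inputs (not shown in the original session), picked to match its outputs.
X = np.arange(1, 10).reshape(3, 3)
mask1 = np.array([True, True, False])
mask2 = np.array([True, True, False])

# Each boolean mask is turned into integer indices, and the two
# index arrays are then paired element by element.
paired = X[mask1, mask2]
same = X[mask1.nonzero()[0], mask2.nonzero()[0]]

print(paired)  # [1 5], i.e. X[0, 0] and X[1, 1]
print(same)    # [1 5]
```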

What you want instead is:

In [251]: X[[[0],[1]], [0,1]]
Out[251]: 
array([[1, 2],
       [4, 5]])

np.ix_ is a handy tool for creating index arrays with the right mix of dimensions:

In [258]: np.ix_([0,1],[0,1])
Out[258]: 
(array([[0],
        [1]]), array([[0, 1]]))
In [259]: X[np.ix_([0,1],[0,1])]
Out[259]: 
array([[1, 2],
       [4, 5]])

That's effectively a column vector for the first axis and a row vector for the second; broadcast together, they define the desired rectangle of values.
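A short sketch of the shapes involved may help. X here is an assumed 3x3 example array, not necessarily the one used in the session.

```python
import numpy as np

X = np.arange(1, 10).reshape(3, 3)  # assumed example array

rows, cols = np.ix_([0, 1], [0, 1])
print(rows.shape, cols.shape)  # (2, 1) (1, 2)

# The (2, 1) and (1, 2) index arrays broadcast to a (2, 2) result.
block = X[rows, cols]

# The same thing written by hand: a column of row indices, a flat list of columns.
manual = X[np.array([0, 1])[:, None], np.array([0, 1])]
```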

Trying to broadcast the boolean arrays themselves in the same way does not work: X[mask1[:,None], mask2]
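As a rough illustration, with assumed inputs (a 3x3 array and two length-3 masks), recent NumPy versions raise an IndexError for this expression, because a 2-D boolean index is treated as covering two axes. Converting the masks to integer indices first makes the broadcasting approach work.

```python
import numpy as np

# Assumed inputs, not taken from the original session.
X = np.arange(1, 10).reshape(3, 3)
mask1 = np.array([True, True, False])
mask2 = np.array([True, True, False])

try:
    X[mask1[:, None], mask2]
    failed = False
except IndexError as exc:
    failed = True
    print("IndexError:", exc)

# Workaround: convert to integer indices, then broadcast those.
rows = mask1.nonzero()[0]
cols = mask2.nonzero()[0]
block = X[rows[:, None], cols]
```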

However, that reference section also says:

Combining multiple Boolean indexing arrays or a Boolean with an integer indexing array can best be understood with the obj.nonzero() analogy. The function ix_ also supports boolean arrays and will work without any surprises.

In [260]: X[np.ix_(mask1, mask2)]
Out[260]: 
array([[1, 2],
       [4, 5]])
In [261]: np.ix_(mask1, mask2)
Out[261]: 
(array([[0],
        [1]], dtype=int32), array([[0, 1]], dtype=int32))

The part of ix_ that handles boolean arrays:

    if issubdtype(new.dtype, _nx.bool_):
        new, = new.nonzero()

So it also works with a mix of boolean and integer indices, such as X[np.ix_(mask1, [0,2])].
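A minimal check of the mixed case, again assuming a 3x3 X and a mask that selects the first two rows:

```python
import numpy as np

# Assumed inputs, not taken from the original session.
X = np.arange(1, 10).reshape(3, 3)
mask1 = np.array([True, True, False])

# The boolean mask picks rows 0 and 1; the integer list picks columns 0 and 2.
mixed = X[np.ix_(mask1, [0, 2])]
print(mixed)
# [[1 3]
#  [4 6]]
```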
