pandas: Boolean indexing with multi index

If you transform your index 'a' back to a column, you can do it as follows:

>>> df = pd.DataFrame({'a':[1,1,1,2,2,2,3,3,3], 
                       'b':[1,2,3,1,2,3,1,2,3], 
                       'c':range(9)})
>>> filt = pd.Series({1:True, 2:False, 3:True})
>>> df[filt[df['a']].values]
   a  b  c
0  1  1  0
1  1  2  1
2  1  3  2
6  3  1  6
7  3  2  7
8  3  3  8

edit. As suggested by @joris, this works also with indices. Here is the code for your sample data:

>>> df[filt[df.index.get_level_values('a')].values]
     c
a b   
1 1  0
  2  1
  3  2
3 1  6
  2  7
  3  8

The more readable (to my liking) solution is to reindex the boolean series (dataframe) to match index of the multi-index df:

df.loc[filt.reindex(df.index, level='a')]

You can use pd.IndexSlicer.

>>> df.loc[pd.IndexSlice[filt[filt].index.values, :], :]
     c
a b   
1 1  0
  2  1
  3  2
3 1  6
  2  7
  3  8

where filt[filt].index.values is just [1, 3]. In other words

>>> df.loc[pd.IndexSlice[[1, 3], :]]
     c
a b   
1 1  0
  2  1
  3  2
3 1  6
  2  7
  3  8

so if you design your filter construction a bit differently, the expression gets shorter. The advantave over Emanuele Paolini's solution df[filt[df.index.get_level_values('a')].values] is, that you have more control over the indexing.

The topic of multiindex slicing is covered in more depth here.

Here the full code

import pandas as pd
import numpy as np
df = pd.DataFrame({'a':[1,1,1,2,2,2,3,3,3], 'b':[1,2,3,1,2,3,1,2,3], 'c':range(9)}).set_index(['a', 'b'])
filt = pd.Series({1:True, 2:False, 3:True})

print(df.loc[pd.IndexSlice[[1, 3], :]])
print(df.loc[(df.index.levels[0].values[filt], slice(None)), :])
print(df.loc[pd.IndexSlice[filt[filt].index.values, :], :])

If the boolean series is not aligned with the dataframe you want to index it with, you can first explicitely align it with align:

In [25]: df_aligned, filt_aligned = df.align(filt.to_frame(), level=0, axis=0)

In [26]: filt_aligned
Out[26]:
         0
a b
1 1   True
  2   True
  3   True
2 1  False
  2  False
  3  False
3 1   True
  2   True
  3   True

And then you can index with it:

In [27]: df[filt_aligned[0]]
Out[27]:
     c
a b
1 1  0
  2  1
  3  2
3 1  6
  2  7
  3  8

Note: the align didn't work with a Series, therefore the to_frame in the align call, and therefore the [0] above to get back the series.

Tags:

Python

Pandas