Pandas Equivalent of R's which()

I may not have understood the question correctly, but the answer looks simpler than you might expect.

Using a pandas DataFrame:

df['colname'] > somenumberIchoose

returns a pandas Series of True/False values that keeps the original index of the DataFrame.

Then you can use that boolean series on the original DataFrame and get the subset you are looking for:

df[df['colname'] > somenumberIchoose]

should be enough.

See http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing
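
If what you actually want is the matching positions themselves (which is what R's which() returns) rather than the filtered rows, here is a minimal sketch. The DataFrame contents, the column name and the threshold are made-up placeholders, and note that Python positions are 0-based whereas R's are 1-based:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'colname' and threshold are placeholders for illustration.
df = pd.DataFrame({"colname": [5, 12, 7, 20]}, index=["a", "b", "c", "d"])
threshold = 6

mask = df["colname"] > threshold       # boolean Series, same index as df
subset = df[mask]                      # rows where the condition holds

# Index labels of the matching rows.
labels = df.index[mask.to_numpy()].tolist()

# 0-based integer positions of the matching rows;
# R's which() would report these plus 1.
positions = np.flatnonzero(mask.to_numpy()).tolist()

print(labels)
print(positions)
```

Labels and positions only coincide when the DataFrame has the default 0..n-1 index, so pick whichever one your downstream code expects.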


From what I know of R, you might be more comfortable working with numpy, a scientific computing package similar to MATLAB.

If you want the indices of the array elements whose values are divisible by two, the following would work:

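import numpy

arr = numpy.arange(10)
truth_table = arr % 2 == 0           # boolean mask, True where the value is even
indices = numpy.where(truth_table)   # tuple holding one index array per dimension
values = arr[indices]                # the even values themselves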
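
One thing that tends to surprise people coming from R: numpy.where with a single argument returns a tuple of arrays (one per dimension), not a bare array. A small sketch of how to get a plain index array, using the same example array as above:

```python
import numpy

arr = numpy.arange(10)
mask = arr % 2 == 0

indices = numpy.where(mask)        # tuple containing a single index array
first_axis = indices[0]            # the index array itself

# flatnonzero returns the flat index array directly, with no tuple to unpack.
flat = numpy.flatnonzero(mask)

print(first_axis)
print(flat)
```

For a 1-D array both give the same 0-based indices, so numpy.flatnonzero is probably the closest drop-in for which() here.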

It's also easy to work with multi-dimensional arrays:

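arr2d = arr.reshape(2, 5)
row_index = 0  # indexing the first axis selects a row
col_indices = numpy.where(arr2d[row_index] % 2 == 0)[0]  # columns holding even values in that row
row_values = arr2d[row_index, col_indices]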
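
If you want the matches across the whole 2-D array rather than a single row, a rough sketch along the same lines (same example data as above) would be:

```python
import numpy

arr2d = numpy.arange(10).reshape(2, 5)
mask = arr2d % 2 == 0

# One index array per dimension: row indices and column indices.
rows, cols = numpy.where(mask)

# The same information as an (n_matches, 2) array of [row, col] pairs.
pairs = numpy.argwhere(mask)

# Indexing with both arrays picks out the matching values.
values = arr2d[rows, cols]

print(pairs)
print(values)
```

This is similar in spirit to which(..., arr.ind = TRUE) in R, again with 0-based indices.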

enumerate() returns an iterator that yields an (index, item) tuple in each iteration, so you can't (and don't need to) call .index() again.

Furthermore, your list comprehension syntax is wrong. It should be:

indexfuture = [(index, x) for (index, x) in enumerate(df['colname']) if x > yesterday]

Test case:

>>> [(index, x) for (index, x) in enumerate("abcdef") if x > "c"]
[(3, 'd'), (4, 'e'), (5, 'f')]

Of course, you don't need to unpack the tuple:

>>> [tup for tup in enumerate("abcdef") if tup[1] > "c"]
[(3, 'd'), (4, 'e'), (5, 'f')]

unless you're only interested in the indices, in which case you could do something like

>>> [index for (index, x) in enumerate("abcdef") if x > "c"]
[3, 4, 5]
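
One caveat when applying this to a pandas column: enumerate() counts positions starting at 0 and does not know about the Series index. If you need the index labels instead, Series.items() yields (label, value) pairs. A small sketch with made-up data (the Series contents and cutoff are placeholders):

```python
import pandas as pd

# Hypothetical Series with a non-default index, for illustration only.
s = pd.Series([3, 9, 1, 7], index=["w", "x", "y", "z"])
cutoff = 5

# enumerate() gives 0-based positions.
positions = [i for i, x in enumerate(s) if x > cutoff]

# Series.items() gives the index labels.
labels = [label for label, x in s.items() if x > cutoff]

print(positions)
print(labels)
```

For large columns the boolean-mask approach is usually faster than a Python-level loop, but the comprehension is fine for small data.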