Select rows from a DataFrame based on the type of the object (i.e. str)

This works:

df[df['A'].apply(lambda x: isinstance(x, str))]
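For anyone who wants to run this end to end, here is a minimal sketch. The question's DataFrame isn't shown here, so the sample frame below is an assumption reconstructed from the output printed further down (row 2 holding the string 'Three'); `mask` and `result` are just illustrative names.

```python
import pandas as pd

# Assumed sample data, reconstructed from the output shown in the answers below
df = pd.DataFrame({'A': [1, 2, 'Three'], 'B': [1, 2, 3]})

# Boolean mask: True where the value stored in column A is a Python str
mask = df['A'].apply(lambda x: isinstance(x, str))

# Keep only the rows where the mask is True
result = df[mask]
print(result)
```

With this sample frame, only the row at index 2 should be selected.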

You can do something similar to what you're asking with

In [14]: df[pd.to_numeric(df.A, errors='coerce').isnull()]
Out[14]: 
       A  B
2  Three  3

Why only similar? Because pandas reports a single dtype per column. Even though you constructed the DataFrame from heterogeneous types, a column that mixes numbers and strings is stored with the most general dtype that fits them all, which is object:

In [16]: df.A.dtype
Out[16]: dtype('O')

Consequently, the column's dtype can't tell you which rows hold what type - it is object for the whole column. What you can do is try to convert the entries to numbers and check where the conversion failed (this is what the code above does).
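To make the conversion step concrete, here is a small sketch, again assuming a sample frame reconstructed from the output above (the names `element_types`, `converted` and `failed` are only for illustration). It prints the column dtype, the Python type of each stored value, and the mask of entries that `pd.to_numeric` could not convert.

```python
import pandas as pd

# Assumed sample data, reconstructed from the output shown above
df = pd.DataFrame({'A': [1, 2, 'Three'], 'B': [1, 2, 3]})

# The column as a whole reports a single dtype
print(df['A'].dtype)

# The values stored in an object column are still ordinary Python objects
element_types = [type(v).__name__ for v in df['A']]
print(element_types)

# errors='coerce' turns anything that cannot be parsed as a number into NaN
converted = pd.to_numeric(df['A'], errors='coerce')

# Rows where the conversion failed
failed = converted.isnull()
print(df[failed])
```

Note that the dtype only describes the column as a whole; the per-value check is what picks out row 2.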


It's generally a bad idea to use a series to hold mixed numeric and non-numeric types. This will cause your series to have dtype object, which is nothing more than a sequence of pointers, much like a list. Indeed, many operations on such a series can be processed more efficiently with a list.

With this disclaimer, you can use Boolean indexing via a list comprehension:

res = df[[isinstance(value, str) for value in df['A']]]

print(res)

       A  B
2  Three  3

The equivalent is possible with pd.Series.apply, but this is no more than a thinly veiled loop and may be slower than the list comprehension:

res = df[df['A'].apply(lambda x: isinstance(x, str))]

If you are certain all non-numeric values must be strings, then you can convert to numeric and look for nulls, i.e. values that cannot be converted:

res = df[pd.to_numeric(df['A'], errors='coerce').isnull()]
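The "if you are certain" caveat matters because the two masks answer different questions. The sketch below uses hypothetical data (not from the question) that adds a numeric-looking string and a missing value, so you can see where the `isinstance` mask and the `pd.to_numeric` mask disagree.

```python
import numpy as np
import pandas as pd

# Hypothetical data chosen to show where the two approaches differ
df = pd.DataFrame({'A': [1, '2', 'Three', np.nan], 'B': [1, 2, 3, 4]})

# Mask 1: the stored value is a Python str
is_str = pd.Series([isinstance(v, str) for v in df['A']], index=df.index)

# Mask 2: the value cannot be converted to a number (or is already missing)
not_numeric = pd.to_numeric(df['A'], errors='coerce').isnull()

# Side-by-side comparison of the two masks
print(pd.DataFrame({'A': df['A'], 'is_str': is_str, 'not_numeric': not_numeric}))
```

Here the string '2' is a str but converts cleanly, so only the first mask selects it, while the NaN is not a str but is null after conversion, so only the second mask selects it. Which behaviour you want depends on your data.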

Tags:

Python

Pandas