Is there a better more readable way to coalese columns in pandas

The Pandas equivalent to COALESCE is the method fillna():

result = column_a.fillna(column_b)

The result is a column where each value is taken from column_a if that column provides a non-null value, otherwise the value is taken from column_b. So your combo1 can be produced with:

df['first'].fillna(df['second']).fillna(df['third'])

giving:

0       A
1       C
2       B
3    None
4       A

And your combo2 can be produced with:

(df['second']).fillna(df['third']).fillna(df['first'])

which returns the new column:

0       C
1       C
2       B
3    None
4       B

If you wanted an efficient operation called coalesce, it could simply combine columns with fillna() from left to right and then return the result:

def coalesce(df, column_names):
    i = iter(column_names)
    column_name = next(i)
    answer = df[column_name]
    for column_name in i:
        answer = answer.fillna(df[column_name])
    return answer

print coalesce(df, ['first', 'second', 'third'])
print coalesce(df, ['second', 'third', 'first'])

which gives:

0       A
1       C
2       B
3    None
4       A

0       C
1       C
2       B
3    None
4       B

Tags:

Python

Pandas