Remove prefix (or suffix) substring from column headers in pandas

python < 3.9, pandas < 1.4

Use str.strip/rstrip:

# df.columns = df.columns.str.strip('_x')
# Or, 
df.columns = df.columns.str.rstrip('_x')  # strip suffix at the right end only.

df.columns
# Index(['W', 'First', 'Last', 'Slice'], dtype='object')

Beware: strip() and rstrip() treat their argument as a set of characters, not as a substring. If a column name has further _ or x characters next to the suffix (or, with strip(), at its start), those are removed as well.
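A quick illustration of that pitfall. The column names here are made up for demonstration and are not from the original question:

```python
import pandas as pd

# Hypothetical column names; 'max_x' and 'index_x' end in 'x' before the suffix.
cols = pd.Index(['W_x', 'max_x', 'index_x'])

# rstrip treats '_x' as the character set {'_', 'x'}, not as a substring,
# so it keeps stripping while the last character is '_' or 'x'.
stripped = cols.str.rstrip('_x')
print(list(stripped))  # ['W', 'ma', 'inde']
```

Only 'W_x' comes out as intended; the other two names lose part of their real content.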

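To remove only the exact suffix, you could use str.replace with a regular expression anchored at the end of the string. Pass regex=True explicitly, since newer pandas versions treat the pattern as a literal string by default:

df.columns = df.columns.str.replace(r'_x$', '', regex=True)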

df.columns
# Index(['W', 'First', 'Last', 'Slice'], dtype='object')
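The same idea should cover prefixes by anchoring at the start with ^ instead. A minimal sketch, assuming a hypothetical 'x_' prefix and made-up column names; re.escape is used so that an affix containing regex metacharacters is still matched literally:

```python
import re
import pandas as pd

# Hypothetical frame whose columns carry an 'x_' prefix.
df = pd.DataFrame(columns=['x_W', 'x_First', 'x_max', 'Slice'])

prefix = 'x_'
# '^' anchors the match at the start, so only a leading 'x_' is removed.
pattern = '^' + re.escape(prefix)
df.columns = df.columns.str.replace(pattern, '', regex=True)
print(list(df.columns))  # ['W', 'First', 'max', 'Slice']
```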

Update: python >= 3.9, pandas >= 1.4

From pandas 1.4 onward, you can use str.removeprefix/str.removesuffix.

Examples:

s = pd.Series(["str_foo", "str_bar", "no_prefix"])
s
0    str_foo
1    str_bar
2    no_prefix
dtype: object

s.str.removeprefix("str_")
0    foo
1    bar
2    no_prefix
dtype: object

s = pd.Series(["foo_str", "bar_str", "no_suffix"])
s
0    foo_str
1    bar_str
2    no_suffix
dtype: object

s.str.removesuffix("_str")
0    foo
1    bar
2    no_suffix
dtype: object

Note that pandas 1.4 has since been released, so these methods are available in a regular pandas installation.
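Applied to the column headers rather than a Series, this might look as follows. This is a sketch with a made-up frame, and it needs pandas 1.4 or later:

```python
import pandas as pd

# Hypothetical frame; requires pandas >= 1.4 for removeprefix/removesuffix.
df = pd.DataFrame(columns=['W_x', 'First_x', 'max_x', 'Slice'])

# Only an exact trailing '_x' is dropped; 'max_x' becomes 'max', not 'ma'.
df.columns = df.columns.str.removesuffix('_x')
print(list(df.columns))  # ['W', 'First', 'max', 'Slice']
```

Names without the suffix, such as 'Slice', pass through unchanged.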


df.columns = [col[:-2] if col[-2:] == '_x' else col for col in df.columns]

or

df.columns = [col.replace('_x', '') for col in df.columns]
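One caveat with the replace variant: str.replace('_x', '') removes every occurrence of '_x', not only a trailing one. A small plain-Python comparison, using made-up names (assign either list to df.columns as needed):

```python
# Hypothetical column names; 'a_x_b_x' also contains '_x' in the middle.
cols = ['W_x', 'a_x_b_x', 'Slice']
suffix = '_x'

# replace() removes all occurrences, including the one in the middle.
replaced = [c.replace(suffix, '') for c in cols]

# endswith() plus slicing removes the suffix only at the very end.
trimmed = [c[:-len(suffix)] if c.endswith(suffix) else c for c in cols]

print(replaced)  # ['W', 'a_b', 'Slice']
print(trimmed)   # ['W', 'a_x_b', 'Slice']
```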

Tags: Python, Pandas