Renaming columns in a Pandas dataframe with duplicate column names?

Here is another dynamic solution that I think is nicer

In [59]: df
Out[59]:
   a  x  x  x  z
0  6  2  7  7  8
1  6  6  3  1  1
2  6  6  7  5  6
3  8  3  6  1  8
4  5  7  5  3  0
In [61]: class renamer():
             def __init__(self):
                  self.d = dict()

              def __call__(self, x):
                  if x not in self.d:
                      self.d[x] = 0
                      return x
                  else:
                      self.d[x] += 1
                      return "%s_%d" % (x, self.d[x])

          df.rename(columns=renamer())
Out[61]:
   a  x  x_1  x_2  z
0  6   2   7   7  8
1  6   6   3   1  1
2  6   6   7   5  6
3  8   3   6   1  8
4  5   7   5   3  0

X_R.columns = ['Retail','Cost']

Here is a dynamic solution:

In [59]: df
Out[59]:
   a  x  x  x  z
0  6  2  7  7  8
1  6  6  3  1  1
2  6  6  7  5  6
3  8  3  6  1  8
4  5  7  5  3  0

In [60]: d
Out[60]: {'x': ['x1', 'x2', 'x3']}

In [61]: df.rename(columns=lambda c: d[c].pop(0) if c in d.keys() else c)
Out[61]:
   a  x1  x2  x3  z
0  6   2   7   7  8
1  6   6   3   1  1
2  6   6   7   5  6
3  8   3   6   1  8
4  5   7   5   3  0

Not directly an answer, but since this a top search result, here is a short and flexible solution to append a suffix to duplicate column names:

# A dataframe with duplicated column names
df = pd.DataFrame([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
df.columns = ['a', 'b', 'b']

# Columns to not rename
excluded = df.columns[~df.columns.duplicated(keep=False)]

# An incrementer
import itertools
inc = itertools.count().__next__

# A renamer
def ren(name):
    return f"{name}{inc()}" if name not in excluded else name

# Use inside rename()
df.rename(columns=ren)

 

    a   b   b              a  b0  b1
0   1   2   3          0   1   2   3
1   4   5   6    =>    1   4   5   6
2   7   8   8          2   7   8   9

Tags:

Python

Pandas