KeyError when indexing Pandas dataframe

As alko mentioned, there is probably an extra character at the beginning of your file. With read_csv, you can pass an encoding that handles this leading character, known as the BOM (byte order mark):

df = pd.read_csv('values.csv', delimiter=',', encoding="utf-8-sig")
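To see why "utf-8-sig" helps, here is a minimal sketch that uses an in-memory file instead of values.csv. The byte string and its column names are made up for illustration. Decoding with plain "utf-8" keeps the BOM as the first character of the text, while "utf-8-sig" drops it. Depending on the pandas version, the parser may strip the BOM on its own, so the sketch only checks the "utf-8-sig" result.

```python
import io

import pandas as pd

# Hypothetical file contents: a UTF-8 BOM (EF BB BF) followed by a small CSV.
raw = b"\xef\xbb\xbfDate,Open,Close\n2013-11-01,10.0,10.5\n"

# Plain utf-8 keeps the BOM as the first character (U+FEFF).
text_plain = raw.decode("utf-8")
# utf-8-sig removes it while decoding.
text_sig = raw.decode("utf-8-sig")

print(ascii(text_plain[:5]))
print(ascii(text_sig[:5]))

# Reading with utf-8-sig yields a clean first column name.
df = pd.read_csv(io.BytesIO(raw), encoding="utf-8-sig")
print(df.columns.tolist())
print(df["Date"])
```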

A related question on Stack Overflow: Pandas seems to ignore first column name when reading tab-delimited data, gives KeyError


It is almost always one of these reasons:

  1. You spelled the column name wrong
  2. There is leading/trailing whitespace
    • in this case, use df.columns = df.columns.str.strip() to remove them, or revisit your pd.read_csv (or other IO function) call to see if you can remove them while parsing input
  3. Your column is not actually a column, but an index level
    • you can check the index level names using df.index.names to see if it is there. Calling .reset_index() before selecting the column should fix it.
  4. Your DataFrame does not have the column, at all
    • it was all just a figment of your imagination. Please turn off your system and take a nap.

Regardless of the reason, the first step is to stop what you're doing, run print(df.columns.tolist()), and eyeball the result to see which of these four reasons applies.
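The checks for reasons 2 and 3 can be sketched as follows. The DataFrame is a made-up example with padded column names and 'Date' stored as the index, so treat the names and values as assumptions rather than your actual data.

```python
import pandas as pd

# Hypothetical frame: column names with stray spaces, and 'Date' as the index.
df = pd.DataFrame(
    {" Open": [10.0, 11.0], "Close ": [10.5, 11.2]},
    index=pd.Index(["2013-11-01", "2013-11-04"], name="Date"),
)

# Step 1: look at the raw column names; the list repr makes spaces visible.
print(df.columns.tolist())
# Step 2: check whether the "missing" column is really an index level.
print(df.index.names)

# Fix for reason 2: strip leading/trailing whitespace from the names.
df.columns = df.columns.str.strip()
# Fix for reason 3: turn the index level back into a regular column.
df = df.reset_index()

dates = df["Date"]  # no KeyError now
print(df.columns.tolist())
```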


You most likely have an extra character at the beginning of your file that is prepended to your first column name, 'Date'. Simply copying and pasting your output into a non-Unicode console produces:

Index([u'?Date', u'Open', u'High', u'Low', u'Close', u'Volume'], dtype='object')
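If the file has already been loaded, the stray character can also be inspected and removed afterwards. This is a minimal sketch that assumes the extra character is a BOM (U+FEFF); the frame is built by hand for illustration.

```python
import pandas as pd

# Hypothetical frame whose first column name carries a BOM (U+FEFF).
df = pd.DataFrame({"\ufeffDate": ["2013-11-01"], "Open": [10.0]})

# ascii() escapes non-ASCII characters, so the invisible prefix shows up.
print([ascii(name) for name in df.columns])

# Strip the BOM from the start of every column name.
df.columns = df.columns.str.lstrip("\ufeff")

print(df.columns.tolist())
print(df["Date"])
```

Fixing the encoding in the read_csv call is usually cleaner, since it addresses the problem at the source rather than after loading.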
