Error: Unsupported format, or corrupt file: Expected BOF record

If you use read_excel() to read a .csv file, you will get this error:

XLRDError: Unsupported format, or corrupt file: Expected BOF record

To read a .csv file, use read_csv() instead:

import pandas as pd
df1 = pd.read_csv("filename.csv")

The error message relates to the BOF (Beginning of File) record of an XLS file. However, the example shows that you are trying to read an XLSX file.

There are two possible reasons for this:

  1. Your version of xlrd is old and doesn't support reading xlsx files.
  2. The XLSX file is encrypted and thus stored in the OLE Compound Document format, rather than a zip format, making it appear to xlrd as an older format XLS file.

Double check that you are in fact using a recent version of xlrd. Opening a new XLSX file with data in just one cell should verify that.

However, I would guess that you are encountering the second condition and that the file is encrypted, since you state above that you are already using xlrd version 0.9.2.

XLSX files are encrypted if you explicitly apply a workbook password, but also if you password-protect some of the worksheet elements. As such, it is possible to have an encrypted XLSX file even if you don't need a password to open it.
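To tell the two container formats apart without involving xlrd, you can look at the first few bytes of the file. The following is only a minimal sketch: `sniff_container` is a hypothetical helper name, and it relies on the well-known signatures `PK\x03\x04` (zip, used by an ordinary XLSX) and `D0 CF 11 E0 A1 B1 1A E1` (OLE Compound Document, used by XLS and by encrypted XLSX).

```python
ZIP_MAGIC = b"PK\x03\x04"                          # ordinary, unencrypted .xlsx
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # .xls, or an encrypted .xlsx


def sniff_container(path):
    """Return 'zip', 'ole2', or 'other' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    # Anything else (CSV, HTML, truncated download, ...) is not a workbook container.
    return "other"
```

If a file named .xlsx comes back as 'ole2', encryption is a likely explanation; 'other' suggests the file is not an Excel workbook at all.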

Update: See @BStew's answer for a third, more probable reason: the file is currently open in Excel.


You can get this error when the .xlsx file is actually HTML; you can open it with a text editor to verify this. When I got this error, I solved it using pandas:

import pandas as pd
# read_html() returns a list of DataFrames, one per <table> found in the file
df_list = pd.read_html('filename.xlsx')
# take the first table; it is already a DataFrame
df = df_list[0]
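If you would rather check programmatically than open the file in a text editor, something along these lines might do. It is a rough heuristic, not a parser: `looks_like_html`, `HTML_MARKERS`, and `probe_size` are names made up for this sketch, and it only looks for a few common tags near the start of the file.

```python
HTML_MARKERS = (b"<!doctype html", b"<html", b"<table")


def looks_like_html(path, probe_size=2048):
    """Heuristic: True if the start of the file contains a common HTML tag."""
    with open(path, "rb") as fh:
        head = fh.read(probe_size).lower()  # case-insensitive match
    return any(marker in head for marker in HTML_MARKERS)
```

When this returns True for a file with an .xlsx extension, pd.read_html() is worth trying before read_excel().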

There is also a third reason: the file is already open in Excel. That generates the same error.
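One way to check for the open-in-Excel case is to look for Excel's owner (lock) file. This sketch assumes the usual convention that Excel writes a hidden file named `~$` plus the workbook's file name in the same folder while the workbook is open; `excel_lock_file` and `seems_open_in_excel` are hypothetical helper names, and a stale lock file left behind by a crash would give a false positive.

```python
from pathlib import Path


def excel_lock_file(path):
    """Path of the '~$<name>' owner file Excel is assumed to create (see note above)."""
    p = Path(path)
    return p.with_name("~$" + p.name)


def seems_open_in_excel(path):
    """True if an owner file exists next to the workbook."""
    return excel_lock_file(path).exists()
```

If this returns True, closing the workbook in Excel and retrying the read is the first thing to try.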