Introduction
Python has a rich ecosystem of libraries that make it an ideal language for data analysis. One of those libraries is pandas, which simplifies reading and writing data between in-memory data structures and different file formats.
However, while working with Excel files using pandas.read_excel, you might run into an error that looks like this:
xlrd.biffh.XLRDError: Excel xlsx file; not supported
In this Byte, we’ll dissect this error message, understand why it occurs, and learn how to fix it.
What is the Error “xlrd.biffh.XLRDError”
The xlrd.biffh.XLRDError is a specific error that you might encounter while working with the pandas library in Python. It is raised when you try to read an Excel file with the .xlsx extension using the pandas.read_excel method and the xlrd library ends up handling the file.
Here's an example of the error:
import pandas as pd
df = pd.read_excel('file.xlsx')
Output:
xlrd.biffh.XLRDError: Excel xlsx file; not supported
Cause of the Error
The xlrd.biffh.XLRDError error is caused by a change in the xlrd library, which pandas can use to read Excel files. The xlrd library now only supports the older .xls file format and no longer supports the newer .xlsx file format.
This change can be a bit of a surprise if you've been using pandas.read_excel with xlrd. Older versions of pandas use the xlrd library by default to read Excel files, but as of xlrd version 2.0.0, this library no longer supports .xlsx files.
As developers, we've all been there…
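To confirm that the installed xlrd version is the culprit, a quick check like the one below can help. This is a minimal sketch: the helper names xlrd_reads_xlsx and installed_xlrd_version are made up for illustration, and the only rule it relies on is the one stated above, that xlrd 2.0.0 and later no longer read .xlsx files.

```python
from importlib.metadata import PackageNotFoundError, version


def xlrd_reads_xlsx(xlrd_version: str) -> bool:
    """Return True if this xlrd version predates 2.0.0 (hypothetical helper)."""
    # Only the major version matters: 1.x could read .xlsx, 2.x and later cannot.
    major = int(xlrd_version.split(".")[0])
    return major < 2


def installed_xlrd_version():
    """Return the installed xlrd version string, or None if xlrd is missing."""
    try:
        return version("xlrd")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_xlrd_version()
    if found is None:
        print("xlrd is not installed")
    else:
        print(f"xlrd {found} reads .xlsx files: {xlrd_reads_xlsx(found)}")
```

If the check reports that your xlrd version does not read .xlsx files, the fix in the next section applies.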
How to Fix the Error
The solution to this error is simple. You just need to install openpyxl and specify the engine argument in the pandas.read_excel method so that it uses the openpyxl library instead of xlrd. The openpyxl library reads the newer .xlsx file format. It does not read legacy .xls files, which xlrd still handles.
Here's how to do it:
First, you need to install the openpyxl library. You can do this using pip:
$ pip install openpyxl
Then, you can specify the engine argument in the pandas.read_excel method like this:
import pandas as pd
df = pd.read_excel('file.xlsx', engine='openpyxl')
This code will read the Excel file using the openpyxl library, and you will no longer encounter the xlrd.biffh.XLRDError error.
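If your code has to handle both old and new Excel files, one option is to choose the engine from the file extension. The sketch below is only an illustration: pick_engine, read_excel_any, and the ENGINE_BY_SUFFIX mapping are hypothetical names, and it assumes both openpyxl and xlrd are installed.

```python
from pathlib import Path

# Hypothetical mapping for illustration: openpyxl for the newer formats,
# xlrd for legacy .xls files.
ENGINE_BY_SUFFIX = {
    ".xlsx": "openpyxl",
    ".xlsm": "openpyxl",
    ".xls": "xlrd",
}


def pick_engine(path):
    """Return the pandas engine name that matches the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in ENGINE_BY_SUFFIX:
        raise ValueError(f"Unsupported Excel extension: {suffix!r}")
    return ENGINE_BY_SUFFIX[suffix]


def read_excel_any(path, **kwargs):
    """Read an Excel file with an engine chosen from its extension."""
    # Imported here so pick_engine can be used without pandas installed.
    import pandas as pd

    return pd.read_excel(path, engine=pick_engine(path), **kwargs)
```

With this in place, read_excel_any('file.xlsx') passes engine='openpyxl' to pandas.read_excel, while read_excel_any('legacy.xls') passes engine='xlrd'.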
Conclusion
In this Byte, we've learned about the xlrd.biffh.XLRDError error that happens when using pandas.read_excel to read .xlsx files. We've learned why this error occurs and how to fix it by using the openpyxl library.