Determine MS Excel file type with Apache POI

Promoting a comment to an answer...

If you're going to be doing something special with the files, then rjokelai's answer is the way to do it.

However, if you're just going to be using the HSSF / XSSF / Common SS usermodel, it's much simpler to let POI do it: use WorkbookFactory, which detects the type and opens the file for you. You'd do something like:

 Workbook wb = WorkbookFactory.create(new File("something.xls"));

or

 Workbook wb = WorkbookFactory.create(request.getInputStream());

Then, if you need to do something special, test whether it's an HSSFWorkbook or an XSSFWorkbook (for example with instanceof). When opening the file, use a File rather than an InputStream if possible, to speed things up and save memory.

If you don't know what your file is at all, use Apache Tika to do the detection - it can detect a huge number of different file formats for you.


You can use:

// For .xlsx
POIXMLDocument.hasOOXMLHeader(new BufferedInputStream( new FileInputStream(file) ));

// For .xls
POIFSFileSystem.hasPOIFSHeader(new BufferedInputStream( new FileInputStream(file) ));

These are essentially the methods that WorkbookFactory#create(InputStream) uses to determine the type.

Please note that both methods only work with streams that support the "mark" feature (or a PushbackInputStream), so a plain FileInputStream is not supported; wrap it in a BufferedInputStream. Because the stream is reset to its starting point after detection, you can simply reuse it afterwards.
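To illustrate the mark/reset behaviour, here is a rough, JDK-only sketch of this kind of header check. It is not POI's implementation: the class name ExcelHeaderSniffer, the Kind enum and the sniff method are made up for this example. It only compares the first bytes against the well-known OLE2 signature (the container used by .xls) and the ZIP local-file signature (the container used by .xlsx), so a match tells you the container type, not that the file is definitely an Excel workbook.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ExcelHeaderSniffer {

    /** Hypothetical result type for this sketch; not a POI class. */
    public enum Kind { OLE2, ZIP, UNKNOWN }

    // OLE2 compound document signature (container used by .xls, but also .doc, .ppt)
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // ZIP local file header signature (container used by .xlsx, but also any other zip)
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Peeks at up to 8 bytes, then resets the stream so the caller can reuse it. */
    public static Kind sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException(
                    "Stream must support mark/reset; wrap it in a BufferedInputStream");
        }
        byte[] header = new byte[8];
        in.mark(header.length);
        int read = 0;
        try {
            // A single read() may return fewer bytes than requested, so loop.
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break; // end of stream
                }
                read += n;
            }
        } finally {
            in.reset(); // rewind to where the stream was before the check
        }
        if (startsWith(header, read, OLE2_MAGIC)) {
            return Kind.OLE2;
        }
        if (startsWith(header, read, ZIP_MAGIC)) {
            return Kind.ZIP;
        }
        return Kind.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int length, int[] magic) {
        if (length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the first bytes of an .xlsx file.
        byte[] zipLike = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00, 0x06, 0x00};
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(zipLike));
        System.out.println(sniff(in));
        // The stream is back at its start, so the next read sees the first byte again.
        System.out.println(Integer.toHexString(in.read()));
    }
}
```

With a real file you would pass new BufferedInputStream(new FileInputStream(file)) instead of the in-memory stream, and then hand the same stream on to whatever opens the workbook.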