Fast CSV parsing

opencsv

Take a look at opencsv.

The blog post "opencsv is an easy CSV parser" has example usage.


The problem with your code is that it uses replaceAll and split, which are both costly operations: each is regex-based and makes its own pass over the line, allocating new strings along the way. You should definitely consider using a CSV parser/reader that parses in a single pass.
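To make "one pass" concrete, here is a minimal sketch of a single-pass line parser. It is only an illustration, not how opencsv or any other library is implemented; the class name OnePassCsv and the method parseLine are made up for this example. It assumes comma-separated fields, double-quote quoting with doubled quotes as the escape, and no line breaks inside fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; a real CSV library handles far more cases.
class OnePassCsv {

    /** Splits one CSV record into fields in a single left-to-right scan. */
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote is an escaped quote
                        i++;               // skip the second quote
                    } else {
                        inQuotes = false;  // closing quote
                    }
                } else {
                    field.append(c);       // commas inside quotes are plain data
                }
            } else if (c == '"') {
                inQuotes = true;           // opening quote
            } else if (c == ',') {
                fields.add(field.toString()); // field boundary
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("a,\"b,c\",\"say \"\"hi\"\"\",d"));
    }
}
```

Each character is examined exactly once and no regex is involved, which is the property the dedicated parsers exploit. For real data, prefer one of the libraries mentioned here over hand-rolled code like this.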

There is a benchmark on GitHub:

https://github.com/uniVocity/csv-parsers-comparison

Unfortunately, it was run under Java 6. The numbers are slightly different under Java 7 and 8. I'm trying to gather more detailed data for different file sizes, but it's a work in progress; see

https://github.com/arnaudroger/csv-parsers-comparison


Apache Commons CSV

Have you seen Apache Commons CSV?

Caveat On Using split

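Bear in mind that split only returns views of the data, meaning that the original line object is not eligible for garbage collection while there is a reference to any of its views. Making a defensive copy of the pieces you keep may help. (Java bug report) Note that this applies to older JVMs: since Java 7 update 6, substring-based results copy their characters instead of sharing the original string's backing array.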

split is also unreliable for CSV in general: a quoted column that contains a comma is broken into several pieces instead of being kept as one field.
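Both caveats can be sketched in a few lines. The class and method names below (SplitCaveats, naiveSplit, detach) are made up for this illustration. The defensive copy uses the String(String) constructor, which on JVMs with shared backing arrays produces a string that no longer pins the original line; on current JVMs it is harmless but unnecessary.

```java
import java.util.Arrays;

// Hypothetical demo class; names are for illustration only.
class SplitCaveats {

    /** Naive field splitting that knows nothing about CSV quoting. */
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    /** Defensive copy, so the piece does not keep a larger string reachable on old JVMs. */
    static String detach(String part) {
        return new String(part);
    }

    public static void main(String[] args) {
        // Three logical columns: a | b,c | d
        String line = "a,\"b,c\",d";

        String[] parts = naiveSplit(line);
        // The quoted column is cut in two, so the array has one element too many.
        System.out.println(parts.length + " parts: " + Arrays.toString(parts));

        // Keep only a detached copy of the field we care about.
        String kept = detach(parts[0]);
        System.out.println(kept);
    }
}
```

The quoted column "b,c" comes back as two separate elements, each still carrying one of the quote characters, which is why a quote-aware parser is the safer choice.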

Tags:

Java

CSV

Parsing