Is java.util.Scanner that slow?

I don't know about Android, but at least in Java SE, Scanner is slow.

Internally, Scanner does UTF-8 conversion, which is useless in a file with floats.

Since all you want to do is read floats from a file, you should go with the java.io package.

The folks on SPOJ struggle with I/O speed. It is a Polish programming contest site with very hard problems. What sets it apart is that it accepts a wider array of programming languages than other sites, and in many of its problems the input is so large that if you don't write efficient I/O, your program will exceed the time limit.

Of course, I advise against writing your own float parser, but if you need speed, that's still a solution.


For the Spotify Challenge they wrote a small Java utility for faster input parsing: http://spc10.contest.scrool.se/doc/javaio The utility is called Kattio.java and uses BufferedReader, StringTokenizer, and Integer.parseInt/Double.parseDouble/Long.parseLong to read numeric values.
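To illustrate the approach, here is a minimal sketch of a reader built on the same pieces (BufferedReader, StringTokenizer, and the parseXxx methods). It is not the actual Kattio source; the class name FastTokenReader and its method names are made up for this example, and it assumes tokens are separated by whitespace.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.StringTokenizer;

// Hypothetical helper for illustration only (not the Kattio implementation).
class FastTokenReader {
    private final BufferedReader in;
    private StringTokenizer tokens; // tokens of the current line; null before the first read

    FastTokenReader(Reader source) {
        this.in = new BufferedReader(source);
    }

    // Returns the next whitespace-delimited token, or null at end of input.
    String next() throws IOException {
        while (tokens == null || !tokens.hasMoreTokens()) {
            String line = in.readLine();
            if (line == null) {
                return null; // no more input
            }
            tokens = new StringTokenizer(line); // blank lines yield no tokens and are skipped
        }
        return tokens.nextToken();
    }

    // The numeric readers throw an exception if the input is exhausted
    // or the token is not a valid number.
    int nextInt() throws IOException {
        return Integer.parseInt(next());
    }

    float nextFloat() throws IOException {
        return Float.parseFloat(next());
    }

    double nextDouble() throws IOException {
        return Double.parseDouble(next());
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for a FileReader here.
        FastTokenReader reader = new FastTokenReader(new StringReader("3 1.5\n2.25 -7"));
        System.out.println(reader.nextInt());    // 3
        System.out.println(reader.nextFloat());  // 1.5
        System.out.println(reader.nextDouble()); // 2.25
        System.out.println(reader.nextInt());    // -7
    }
}
```

For a real file, you would pass something like new FileReader(path) instead of the StringReader.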


Very insightful post. When I worked with Java on a PC, I assumed Scanner was the fastest option. But when I tried to use it in an AsyncTask on Android, it was the worst.

I think Android should offer an alternative to Scanner. I was using scanner.nextFloat(), scanner.nextDouble(), and scanner.nextInt() together, which made my life miserable. After tracing my app, I found that these calls were the hidden culprit.

I changed them to Float.parseFloat(scanner.next()), Double.parseDouble(scanner.next()), and Integer.parseInt(scanner.next()), which made my app noticeably faster, maybe 60% faster.

If anyone has experienced the same, please post here. I'm also looking for an alternative to the Scanner API, so if anyone has good ideas for reading file formats, please post them here.


As other posters have stated, it's more efficient to include the data in a binary format. However, for a quick fix I've found that replacing:

scanner.nextFloat();

with

Float.parseFloat(scanner.next());

is almost 7 times faster.

The source of the performance issue with nextFloat is that it uses a regular expression to search for the next float, which is unnecessary if you know the structure of the data you're reading beforehand.

It turns out most (if not all) of the next* methods use regular expressions for a similar reason, so if you know the structure of your data it's preferable to always use next() and parse the result, i.e. also use Double.parseDouble(scanner.next()) and Integer.parseInt(scanner.next()).
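As a rough sketch of that pattern, the loop below reads every token with next() and parses it directly. The class and method names (TokenParsing, readFloats) are made up for this example, and it assumes every token in the input is a valid float. One difference to be aware of: nextFloat() is locale-aware, while Float.parseFloat() only accepts the Java-style format with '.' as the decimal separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical example class; the names are for illustration only.
class TokenParsing {
    // Reads all whitespace-separated tokens and parses each one as a float.
    // Assumes every token is a valid float; otherwise Float.parseFloat
    // throws NumberFormatException.
    static List<Float> readFloats(Scanner scanner) {
        List<Float> values = new ArrayList<>();
        while (scanner.hasNext()) {
            // next() only splits on the delimiter; the numeric parsing is done
            // by Float.parseFloat instead of Scanner's float pattern.
            values.add(Float.parseFloat(scanner.next()));
        }
        return values;
    }

    public static void main(String[] args) {
        // A String source stands in for a file here.
        Scanner scanner = new Scanner("1.5 -2.25 3e2");
        System.out.println(readFloats(scanner)); // prints [1.5, -2.25, 300.0]
    }
}
```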

Relevant source: https://android.googlesource.com/platform/libcore/+/master/luni/src/main/java/java/util/Scanner.java

Tags:

Java

Android