Java StringTokenizer.nextToken() skips over empty fields

I would use Guava's Splitter, which doesn't need the heavy regex machinery and is better behaved than String's split() method:

Iterable<String> parts = Splitter.on('\t').split(string);

There is an RFE in Sun's bug database about this StringTokenizer issue, with the status "Will not fix".

The evaluation of this RFE states, I quote:

With the addition of the java.util.regex package in 1.4.0, we have basically obsoleted the need for StringTokenizer. We won't remove the class for compatibility reasons. But regex gives you simply what you need.

It then suggests using the String#split(String) method.
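
To illustrate the difference, here is a minimal sketch using only the JDK. The class name TokenizerVsSplit, the helper methods, and the sample line are made up for this example. It compares how StringTokenizer and split() treat two adjacent tabs:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of any library.
public class TokenizerVsSplit {

    // Collects tokens with StringTokenizer; a run of adjacent tabs
    // is treated as a single delimiter, so empty fields disappear.
    static List<String> withTokenizer(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, "\t");
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Splits on every single tab; the negative limit keeps
    // empty fields (including trailing ones) as empty strings.
    static List<String> withSplit(String line) {
        return Arrays.asList(line.split("\t", -1));
    }

    public static void main(String[] args) {
        String line = "a\t\tc\t"; // four fields: "a", "", "c", ""
        System.out.println(withTokenizer(line)); // [a, c]
        System.out.println(withSplit(line));     // [a, , c, ]
    }
}
```

With StringTokenizer the column positions shift as soon as one field is empty, which is why indexing into the result by column number goes wrong.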


Thank you all. Thanks to the first comment and its reference, I was able to find a solution:

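 // Needs: import java.io.File; import java.util.Scanner;
 try (Scanner s = new Scanner(new File("data.txt"))) {
      while (s.hasNextLine()) {
           String line = s.nextLine();
           // The negative limit keeps empty fields, including trailing ones.
           String[] items = line.split("\t", -1);
           System.out.println(items[5]);
           //System.out.println(Arrays.toString(items));
      }
 }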
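
The -1 limit matters here. As far as I can tell, the one-argument split(regex) behaves like a limit of 0 and drops trailing empty strings, so a line whose last columns are empty would give a shorter array. A small JDK-only sketch (the class name SplitLimitDemo and the sample line are just for illustration):

```java
// Hypothetical demo class, not part of any library.
public class SplitLimitDemo {

    // Field count with the default limit (0): trailing empty strings are discarded.
    static int fieldsDefault(String line) {
        return line.split("\t").length;
    }

    // Field count with a negative limit: every empty string is kept.
    static int fieldsKeepAll(String line) {
        return line.split("\t", -1).length;
    }

    public static void main(String[] args) {
        String line = "a\tb\t\t"; // the last two columns are empty
        System.out.println(fieldsDefault(line)); // 2
        System.out.println(fieldsKeepAll(line)); // 4
    }
}
```

So if you index a fixed column such as items[5], the negative limit avoids an ArrayIndexOutOfBoundsException on rows that end in empty fields, as long as the row really has that many columns.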

You can use StringUtils.splitPreserveAllTokens() from Apache Commons Lang. It keeps empty tokens between adjacent separators, which is exactly what you need.