Understanding Spliterator, Collector and Stream in Java 8

You should almost certainly never have to deal with Spliterator as a user; it should only be necessary if you're writing Collection types yourself and also intending to optimize parallelized operations on them.

For what it's worth, a Spliterator is a way of operating over the elements of a collection that makes it easy to split off part of the collection, e.g. because you're parallelizing and want one thread to work on one part of the collection, another thread to work on another part, and so on.

You should essentially never be saving values of type Stream to a variable, either. Stream is sort of like an Iterator, in that it's a one-time-use object that you'll almost always use in a fluent chain, as in the Javadoc example:

int sum = widgets.stream()
                  .filter(w -> w.getColor() == RED)
                  .mapToInt(w -> w.getWeight())
                  .sum();
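To make the "one-time-use" point concrete, here is a minimal sketch of what happens when a Stream is saved to a variable and used twice. The class and method names are made up for illustration. The Javadoc only says an implementation may throw IllegalStateException when it detects reuse; the standard OpenJDK streams do.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class StreamReuseDemo {
    // Hypothetical helper: returns true if a second terminal operation
    // on the same Stream object is rejected with IllegalStateException.
    static boolean secondUseFails(List<String> items) {
        Stream<String> s = items.stream();
        s.count();                 // first terminal operation consumes the stream
        try {
            s.count();             // second use of the same Stream object
            return false;
        } catch (IllegalStateException e) {
            return true;           // "stream has already been operated upon or closed"
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails(Arrays.asList("a", "b", "c")));
    }
}
```

If you need the elements twice, call stream() on the source collection again rather than holding on to the Stream.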

Collector is the most generalized, abstract possible version of a "reduce" operation a la map/reduce; in particular, it needs to support parallelization and finalization steps. Examples of Collectors include:

  • summing, e.g. Collectors.reducing(0, (x, y) -> x + y)
  • StringBuilder appending, e.g. Collector.of(StringBuilder::new, StringBuilder::append, StringBuilder::append, StringBuilder::toString)
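As a rough sketch of how the two collectors listed above are used, both are simply passed to Stream.collect. The class and method names here are made up for illustration, and the StringBuilder collector is written with explicit lambdas and an explicit target type, which is an assumption made to keep overload resolution on append unambiguous.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorExamples {
    // Summing with the built-in reducing collector: identity 0, operator x + y.
    static int sum(List<Integer> xs) {
        return xs.stream().collect(Collectors.reducing(0, (x, y) -> x + y));
    }

    // StringBuilder appending: supplier, accumulator, combiner, finisher.
    static String concat(List<String> parts) {
        Collector<String, StringBuilder, String> appender = Collector.of(
                StringBuilder::new,                    // new empty container
                (sb, s) -> sb.append(s),               // add one element
                (left, right) -> left.append(right),   // merge two partial results
                StringBuilder::toString);              // finalization step
        return parts.stream().collect(appender);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));
        System.out.println(concat(Arrays.asList("a", "b", "c")));
    }
}
```

The combiner is only called when the stream is processed in pieces (e.g. in parallel); the finisher is the "finalization" step that turns the StringBuilder into a String.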

Spliterator basically means "splittable Iterator".

A single thread can traverse and process the entire Spliterator itself, but the Spliterator also has a method trySplit() which will "split off" a section for someone else (typically another thread) to process, leaving the current Spliterator with less work.
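A small sketch of trySplit() in action may help. The class and helper names are hypothetical, and everything runs on one thread just to show which elements end up where. How a source divides its elements is up to the implementation, and trySplit() may return null if it cannot split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class SplitDemo {
    // Collects whatever elements a spliterator still has (none if it is null).
    static List<Integer> drain(Spliterator<Integer> s) {
        List<Integer> out = new ArrayList<>();
        if (s != null) {
            s.forEachRemaining(out::add);
        }
        return out;
    }

    // Splits once and returns [elements split off, elements left behind].
    static List<List<Integer>> splitOnce(List<Integer> source) {
        Spliterator<Integer> rest = source.spliterator();
        Spliterator<Integer> splitOff = rest.trySplit(); // null if it cannot split
        List<List<Integer>> parts = new ArrayList<>();
        parts.add(drain(splitOff)); // work that could go to another thread
        parts.add(drain(rest));     // work that stays with the current one
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitOnce(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8)));
    }
}
```

For an ordered source such as a List, the part split off is a prefix of the elements, so the two parts together cover every element exactly once.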

Collector combines the specification of a reduce function (of map-reduce fame) with an initial value and a function to combine two results, which is what allows the results from split-off streams of work to be merged.

For example, the most basic Collector would have an initial value of 0, would add an integer onto an existing result, and would 'combine' two results by adding them, thus summing a split-up stream of integers.
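The summing collector just described could be sketched roughly as follows. The class and method names are hypothetical, and a one-element int[] is assumed as the mutable result container because Collector.of expects a supplier of containers rather than a bare initial value.

```java
import java.util.stream.Collector;
import java.util.stream.IntStream;

class SumCollectorDemo {
    // Hand-written summing collector built from the pieces described above.
    static Collector<Integer, int[], Integer> summing() {
        return Collector.of(
                () -> new int[1],                       // initial value: 0
                (acc, x) -> acc[0] += x,                // add an integer onto the result
                (a, b) -> { a[0] += b[0]; return a; },  // combine two partial results
                acc -> acc[0]);                         // extract the final sum
    }

    // Sums 1..n on a parallel stream, so the combiner is actually needed.
    static int parallelSum(int n) {
        return IntStream.rangeClosed(1, n).boxed().parallel().collect(summing());
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(100)); // prints 5050
    }
}
```

Each worker gets its own container from the supplier, so the accumulator never needs synchronization; only the combiner brings partial sums together.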

See:

  • Spliterator.trySplit()
  • Collector<T,A,R>