Stream with sorted() before findFirst() is no longer lazy

Here's a smaller example that illustrates the issue:

Stream.of("a", "ab", "abc", "abcd")
    // .sorted() // uncomment and what follows becomes eager
    .filter(s -> s.contains("b"))
    .peek(s -> System.out.println("PEEK: " + s))
    .findFirst()
    .orElse("X");

As expected the output is:

PEEK: ab

If the sorted line is uncommented, the output is:

PEEK: ab
PEEK: abc
PEEK: abcd

(The final result of the entire pipeline is "ab" in both cases, as expected.)

It's true that sorted must consume all of its input before producing its first output element. In that sense it's eager. However, it does seem strange that it affects how elements are sent downstream.

Without sorting, the findFirst operation "pulls" elements from upstream until it finds one, and then it stops. With sorting, the sorted() operation eagerly gathers all the elements, sorts them, and since it has them all right there, it "pushes" them down the stream. Of course, findFirst ignores all but the first element. But this means that intervening operations (such as the filter) may do unnecessary work.

The final result is correct, but the behavior is unexpected. This might be considered a bug. I'll investigate and file a bug if appropriate.


The sorted operation forces traversal of all the items in the stream.

Stateful operations, such as distinct and sorted, may incorporate state from previously seen elements when processing new elements.

Stateful operations may need to process the entire input before producing a result. For example, one cannot produce any results from sorting a stream until one has seen all elements of the stream.

(Source)

I'm not sure, though, why the operations following the sorted are also executed for all the elements in the stream.

If you perform the sort separately, and then use the stream for the rest of the processing, the processing will stop when the first match is found, as expected.

Arrays.sort(dataArray, Comparator.comparing(d -> d.getPriority())); // sort

Arrays.stream(dataArray)
.peek(o -> System.out.println("SORT: " + o))
.map(d -> d.getOriginalURL(shortUrl))
.peek(o -> System.out.println("MAP: " + o))
.filter(u -> u != null && !u.isEmpty())
.peek(o -> System.out.println("FILTER: " + o))
.findFirst().orElse("");

What you want to use is stream.collect(minBy(...)). It has linear performance.

The minBy static method is found on the Collectors class.