What is the difference between a stateful and a stateless lambda expression?

The first problem is this:

    List<Integer> list = new ArrayList<>();

    List<Integer> result = Stream.of(1, 2, 3, 4, 5, 6)
            .parallel()
            .map(x -> {
                list.add(x);
                return x;
            })
            .collect(Collectors.toList());

    System.out.println(list);

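You have no idea what list will contain here, since several threads are adding elements to an ArrayList, which is not thread-safe: elements can be lost, null entries can appear, or add can throw an ArrayIndexOutOfBoundsException.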

But even if you do:

  List<Integer> list = Collections.synchronizedList(new ArrayList<>());

And perform the same operation, the list still has no predictable order, because multiple threads add to this synchronized collection. Using the synchronized collection guarantees that all elements are added (as opposed to the plain ArrayList), but the order in which they will be present is unknown.

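Notice that list has no order guarantees whatsoever; it reflects the order in which the elements happened to be processed, which is called processing order. result, on the other hand, follows the encounter order of the source and is guaranteed to be [1, 2, 3, 4, 5, 6] for this particular example.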
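
To make the difference between the two orders visible, here is a minimal sketch that runs the synchronized variant a few times. The class name OrderDemo and the helper runOnce are made up for this illustration; the pipeline itself is the one from above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OrderDemo {

    // Hypothetical helper: runs the pipeline once and returns
    // [side-effect list, collected result].
    static List<List<Integer>> runOnce() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());

        List<Integer> result = Stream.of(1, 2, 3, 4, 5, 6)
                .parallel()
                .map(x -> {
                    list.add(x); // side effect: filled in processing order
                    return x;
                })
                .collect(Collectors.toList()); // collected in encounter order

        return Arrays.asList(new ArrayList<>(list), result);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            List<List<Integer>> run = runOnce();
            // "list" may differ from run to run; "result" does not.
            System.out.println("list = " + run.get(0) + ", result = " + run.get(1));
        }
    }
}
```

On a multi-core machine the printed list will typically change between runs, while result stays [1, 2, 3, 4, 5, 6]. The only thing you can rely on for list is that it contains all six elements.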

Depending on the problem, you usually can get rid of the stateful operations; for your example returning the synchronized List would be:

    Stream.of(1, 2, 3, 4, 5, 6)
            .filter(x -> x > 2) // for example a filter is present
            .collect(Collectors.collectingAndThen(Collectors.toList(),
                    Collections::synchronizedList));

To try to give an example, let's consider the following Consumer (note: the usefulness of such a function is beside the point here):

public static class StatefulConsumer implements IntConsumer {

    private static final int ARBITRARY_THRESHOLD = 10; // primitive avoids unboxing on every comparison
    private boolean flag = false;
    private final List<Integer> list = new ArrayList<>();

    @Override
    public void accept(int value) {
        if(flag){   // exit condition
            return; 
        }
        if(value >= ARBITRARY_THRESHOLD){
            flag = true;
        }
        list.add(value); 
    }

}

It's a consumer that will add items to a List (let's not consider how to get back the list nor the thread safety) and has a flag (to represent the statefulness).

The logic behind this would be that once the threshold has been reached, the consumer should stop adding items.

What your book was trying to say was that because there is no guaranteed order in which the function will have to consume the elements of the Stream, the output is non-deterministic.

Thus, they advise you to use only stateless functions, meaning functions whose result does not depend on any state that might change while the stream pipeline runs, so they always produce the same result for the same input.
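
The order dependence can be shown without any threads at all. The following is only a sketch: it repeats the consumer from above with a getList() accessor that I added for the demo (it is not part of the original class), and the class name ThresholdDemo and the helper consume are likewise made up. It feeds the same four values in two different orders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;
import java.util.stream.IntStream;

public class ThresholdDemo {

    // Same logic as the consumer above; getList() is a hypothetical
    // accessor added only so the demo can inspect the result.
    static class StatefulConsumer implements IntConsumer {
        private static final int ARBITRARY_THRESHOLD = 10;
        private boolean flag = false;
        private final List<Integer> list = new ArrayList<>();

        @Override
        public void accept(int value) {
            if (flag) { // exit condition
                return;
            }
            if (value >= ARBITRARY_THRESHOLD) {
                flag = true;
            }
            list.add(value);
        }

        List<Integer> getList() {
            return list;
        }
    }

    // Feeds the values to a fresh consumer, sequentially and in the given order.
    static List<Integer> consume(int... values) {
        StatefulConsumer consumer = new StatefulConsumer();
        IntStream.of(values).forEachOrdered(consumer);
        return consumer.getList();
    }

    public static void main(String[] args) {
        System.out.println(consume(1, 5, 12, 3)); // [1, 5, 12]
        System.out.println(consume(3, 12, 1, 5)); // [3, 12]
    }
}
```

Same elements, different order, different outcome. A parallel stream does not promise which of these orders (or any other) the consumer will see, which is exactly why the output becomes non-deterministic.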


Here is an example where a stateful operation can return a different result from one run to the next:

public static void main(String[] args) {

    Set<Integer> seen = new HashSet<>();

    IntStream stream = IntStream.of(1, 2, 3, 1, 2, 3);

    // Stateful lambda expression
    IntUnaryOperator mapUniqueLambda = (int i) -> {
        if (!seen.contains(i)) {
            seen.add(i);
            return i;
        }
        else {
            return 0;
        }
    };

    int sum = stream.parallel()
            .map(mapUniqueLambda)
            .peek(i -> System.out.println("Stream member: " + i))
            .sum();

    System.out.println("Sum: " + sum);
}

In my case when I ran the code I got the following output:

Stream member: 1
Stream member: 0
Stream member: 2
Stream member: 3
Stream member: 1
Stream member: 2
Sum: 9

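Why did I get 9 as the sum, rather than 6, if I'm inserting into a HashSet?
The answer: different threads took different parts of the IntStream. For example, the two occurrences of 1 and the two occurrences of 2 ended up on different threads, and each thread passed the contains check before the other one had added the value, so neither duplicate was mapped to 0. On top of that, HashSet itself is not thread-safe.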
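
If the goal is simply "sum each distinct value once", one way to avoid the shared set entirely is to let the stream do the de-duplication. This is a sketch of a stateless alternative rather than a drop-in replacement for the lambda above (it removes duplicates instead of mapping them to 0, which gives the same sum); the class name DistinctSum and the method sumOfDistinct are made up for the example.

```java
import java.util.stream.IntStream;

public class DistinctSum {

    // No shared mutable state: distinct() handles the de-duplication
    // inside the stream, so the result is the same for sequential
    // and parallel execution.
    static int sumOfDistinct(int... values) {
        return IntStream.of(values)
                .parallel()
                .distinct()
                .sum();
    }

    public static void main(String[] args) {
        System.out.println("Sum: " + sumOfDistinct(1, 2, 3, 1, 2, 3)); // Sum: 6
    }
}
```

Here every run prints 6, because no lambda in the pipeline reads or writes anything outside its own arguments.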