Java Streams GroupingBy and filtering by count (similar to SQL's HAVING)

In general, the filter has to be applied after the grouping, because a group must be fully collected before you can tell whether it meets the criterion.

Instead of collecting a map into another, similar map, you can use removeIf to remove non-matching groups from the result map and inject this finishing operation into the collector:

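// Assumes: import static java.util.stream.Collectors.*;
Map<KeyType, List<ElementType>> result =
    input.stream()
        .collect(collectingAndThen(groupingBy(x -> x.id(), HashMap::new, toList()),
            m -> {
                // drop every group with 5 or fewer elements, i.e. keep size > 5
                m.values().removeIf(l -> l.size() <= 5);
                return m;
            }));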

The groupingBy(Function) collector makes no guarantees about the mutability of the map it creates, so we need to specify a supplier for a mutable map. That in turn requires us to spell out the downstream collector, because there is no groupingBy overload that takes only a classifier function and a map supplier.
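As a rough, self-contained sketch of the approach above: the class name GroupThenFilter, the method groupsLargerThan, the threshold parameter and the word-length grouping are all made up for illustration and stand in for KeyType, ElementType and x.id().

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupThenFilter {

    // Hypothetical helper: groups words by length, then removes every group
    // whose size is not greater than the given threshold.
    static Map<Integer, List<String>> groupsLargerThan(List<String> words, int threshold) {
        return words.stream()
            .collect(Collectors.collectingAndThen(
                // HashMap::new guarantees a mutable map for removeIf below
                Collectors.groupingBy(String::length, HashMap::new, Collectors.toList()),
                m -> {
                    m.values().removeIf(group -> group.size() <= threshold);
                    return m;
                }));
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "c", "dd", "ee", "fff");
        // Only the group of one-letter words has more than 2 members.
        System.out.println(groupsLargerThan(words, 2)); // prints {1=[a, b, c]}
    }
}
```

The filtering happens inside the finisher, so callers only ever see the already reduced map.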

If this is a recurring task, we can write a custom collector that makes the calling code more readable:

public static <T,K,V> Collector<T,?,Map<K,V>> having(
                      Collector<T,?,? extends Map<K,V>> c, BiPredicate<K,V> p) {
    return collectingAndThen(c, in -> {
        Map<K,V> m = in;
        if(!(m instanceof HashMap)) m = new HashMap<>(m);
        m.entrySet().removeIf(e -> !p.test(e.getKey(), e.getValue()));
        return m;
    });
}

For greater flexibility, this collector accepts an arbitrary map-producing collector. Since that does not guarantee any particular map type, the collector enforces mutability afterwards by copying the result into a new HashMap whenever it is not a HashMap already. In practice, the copy won't happen, as groupingBy uses a HashMap by default. It also works without copying when the caller explicitly requests a LinkedHashMap to maintain the order, because LinkedHashMap is a subclass of HashMap. We could even avoid the copy for more map types by changing the check to

if(!(m instanceof HashMap || m instanceof TreeMap
  || m instanceof EnumMap || m instanceof ConcurrentMap)) {
    m = new HashMap<>(m);
}

Unfortunately, there is no standard way to determine whether a map is mutable.
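To illustrate why the defensive copy is there, here is a small sketch (the class UnmodifiableMapDemo and the method removalRejected are hypothetical names). Both maps have the same static type, and the difference only shows up at run time when removeIf is attempted:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class UnmodifiableMapDemo {

    // Returns true if removing entries through the entry set is rejected.
    // Note: on a mutable map this probe really removes the matching entries.
    static boolean removalRejected(Map<String, Integer> map) {
        try {
            map.entrySet().removeIf(e -> e.getValue() > 1);
            return false;
        } catch (UnsupportedOperationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> mutable = new HashMap<>();
        mutable.put("a", 1);
        mutable.put("b", 2);
        Map<String, Integer> readOnly = Collections.unmodifiableMap(new HashMap<>(mutable));

        System.out.println(removalRejected(readOnly)); // prints true
        System.out.println(removalRejected(mutable));  // prints false
        System.out.println(mutable);                   // prints {a=1}
    }
}
```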

The custom collector can now be used nicely as

Map<KeyType, List<ElementType>> result =
    input.stream()
        .collect(having(groupingBy(x -> x.id()), (key,list) -> list.size() > 5));
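For reference, a complete version that should compile on its own might look like the following. The having method is the one from above; the class name HavingDemo, the helper largeGroups and the sample data are assumptions added for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class HavingDemo {

    // Wraps a map-producing collector and keeps only entries accepted by p.
    static <T, K, V> Collector<T, ?, Map<K, V>> having(
            Collector<T, ?, ? extends Map<K, V>> c, BiPredicate<K, V> p) {
        return Collectors.collectingAndThen(c, in -> {
            Map<K, V> m = in;
            // copy into a HashMap if the map might not be mutable
            if (!(m instanceof HashMap)) m = new HashMap<>(m);
            m.entrySet().removeIf(e -> !p.test(e.getKey(), e.getValue()));
            return m;
        });
    }

    // Hypothetical usage: group words by length, keep groups with more than minSize words.
    static Map<Integer, List<String>> largeGroups(List<String> words, int minSize) {
        return words.stream()
            .collect(having(Collectors.groupingBy(String::length),
                            (length, group) -> group.size() > minSize));
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "c", "dd", "ee", "fff");
        System.out.println(largeGroups(words, 2)); // prints {1=[a, b, c]}
    }
}
```

Because the predicate receives both key and value, the same collector can also filter on the key, which the plain removeIf on values() cannot do.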

The only way I am aware of is to use Collectors.collectingAndThen and perform the same filtering inside the finisher function:

Map<Integer, List<Item>> a = input.stream().collect(Collectors.collectingAndThen(
        Collectors.groupingBy(Item::id),
        map -> map.entrySet().stream()
                             .filter(e -> e.getValue().size() > 5)
                             .collect(Collectors.toMap(Entry::getKey, Entry::getValue))));
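A self-contained variant of this second approach could look roughly like the sketch below. The class FilterEntriesInFinisher, the method groupsLargerThan and the word-length grouping are placeholders for Item and Item::id. Unlike removeIf, this builds a second map via toMap, so it does not depend on the grouped map being mutable.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FilterEntriesInFinisher {

    // Hypothetical helper: groups words by length and re-collects only
    // the entries whose group has more than threshold elements.
    static Map<Integer, List<String>> groupsLargerThan(List<String> words, int threshold) {
        return words.stream().collect(Collectors.collectingAndThen(
            Collectors.groupingBy(String::length),
            map -> map.entrySet().stream()
                      .filter(e -> e.getValue().size() > threshold)
                      .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue))));
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "c", "dd", "ee", "fff");
        System.out.println(groupsLargerThan(words, 1)); // prints {1=[a, b, c], 2=[dd, ee]}
    }
}
```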