Does the performance of "filter then map" and "map then filter" differ in a Stream?

In this specific example, where calling Person.getName() has basically no cost at all, it doesn't matter, and you should use whichever you find more readable. Filtering after mapping could even be marginally faster since, as TJ mentions, the mapping operation is part of the filtering operation.
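To make the two orderings concrete, here is a minimal sketch. The `Person` class, the `sample()` data and the "name starts with A" predicate are assumptions made for illustration, since the code from the question is not reproduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FilterMapOrder {
    // Hypothetical Person class standing in for the one from the question.
    static final class Person {
        private final String name;
        Person(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Hypothetical sample data.
    static List<Person> sample() {
        return Arrays.asList(new Person("Alice"), new Person("Bob"), new Person("Anna"));
    }

    // Filter first: getName() runs in the predicate and again in map() for each match.
    static List<String> filterThenMap(List<Person> people) {
        return people.stream()
                .filter(p -> p.getName().startsWith("A"))
                .map(Person::getName)
                .collect(Collectors.toList());
    }

    // Map first: getName() runs exactly once per element.
    static List<String> mapThenFilter(List<Person> people) {
        return people.stream()
                .map(Person::getName)
                .filter(name -> name.startsWith("A"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(filterThenMap(sample())); // [Alice, Anna]
        System.out.println(mapThenFilter(sample())); // [Alice, Anna]
    }
}
```

Both methods return the same list; with a getter this cheap, the difference in work is negligible.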

If the mapping operation has a significant cost however, then filtering first (if possible) is more efficient, since the stream won't have to map the elements that have been filtered out.

Let's take a contrived example: you have a stream of IDs, and for every even ID in the stream, you have to execute an HTTP GET request or a database query to get the details of the item identified by this ID (thus mapping the ID to a detailed object).

Assuming that the stream is composed of half even and half odd IDs, and that each request takes the same time, you would halve the total time by filtering first. If every HTTP request takes 1 second and you have 60 IDs, you would go from 60 seconds to 30 seconds for the same task by filtering first, and you would also reduce the load on the network and on the external HTTP API.
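The arithmetic can be checked with a rough sketch that counts calls instead of making real requests. `fetchDetails` and `Item` are hypothetical stand-ins for the HTTP GET or database query and its result; no actual network access happens.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ExpensiveMapping {
    // Counts how many simulated "requests" were made.
    static final AtomicInteger calls = new AtomicInteger();

    // Hypothetical detailed object returned for an ID.
    static final class Item {
        final int id;
        Item(int id) { this.id = id; }
    }

    // Hypothetical stand-in for an HTTP GET or database query.
    static Item fetchDetails(int id) {
        calls.incrementAndGet();
        return new Item(id);
    }

    // Filter first: only even IDs ever reach the expensive call.
    static int callsWhenFilteringFirst(int n) {
        calls.set(0);
        IntStream.rangeClosed(1, n)
                .filter(id -> id % 2 == 0)
                .mapToObj(ExpensiveMapping::fetchDetails)
                .collect(Collectors.toList());
        return calls.get();
    }

    // Map first: every ID triggers the expensive call, then odd ones are discarded.
    static int callsWhenMappingFirst(int n) {
        calls.set(0);
        IntStream.rangeClosed(1, n)
                .mapToObj(ExpensiveMapping::fetchDetails)
                .filter(item -> item.id % 2 == 0)
                .collect(Collectors.toList());
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(callsWhenFilteringFirst(60)); // 30
        System.out.println(callsWhenMappingFirst(60));   // 60
    }
}
```

With 60 IDs, filtering first makes 30 simulated requests and mapping first makes 60, which matches the 30 seconds versus 60 seconds estimate at one second per request.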


The performance depends largely on:

  • how complex the operations you perform while streaming are (your business logic)
  • how complex your data is

Let's take two simple scenarios.

Scenario 1

If your map function needs to perform some complex operation, such as calling an external REST API to manipulate the stream objects, then I recommend filtering before mapping, since this reduces the number of unwanted, expensive REST calls. Note, however, that if the filter predicate itself has to compute the mapped value, filtering first means the mapping operation runs twice for every matching object.


Scenario 2

Assume that you first need to transform the stream data using external REST API calls or other functions, and then filter on the results. In this scenario you have to map before you filter the stream. This approach can be slightly faster than the previous one, since the mapping operation is part of the filtering operation and its result does not have to be computed a second time.
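As a small illustration of this second scenario, the sketch below filters on a value that only exists after mapping. `lookupPrice` is a hypothetical stand-in for the external call, and the price formula is made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapBeforeFilter {
    // Hypothetical stand-in for an external call that turns an ID into a price.
    static int lookupPrice(int id) {
        return id * 10;
    }

    // The predicate needs the price, so the mapping has to come first.
    static List<Integer> pricesAbove(List<Integer> ids, int threshold) {
        return ids.stream()
                .map(MapBeforeFilter::lookupPrice)
                .filter(price -> price > threshold)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(pricesAbove(Arrays.asList(1, 2, 3, 4), 20)); // [30, 40]
    }
}
```

Here there is no choice of ordering: the filter cannot run until the mapped value is available, and mapping first computes that value only once per element.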
