Why does function concat use foldr? Why not foldl'?
EDIT: I talk about laziness and productivity in this answer, and in my excitement I forgot a very important point that jpmariner focuses on in their answer: left-associating
(++) is quadratic time!
foldl' is appropriate when your accumulator is a strict type, like most small types such as
Int, or even large spine-strict data structures like
Data.Map. If the accumulator is strict, then the entire list must be consumed before any output can be given.
foldl' uses tail recursion to avoid blowing up the stack in these cases, but
foldr doesn't and will perform badly. On the other hand,
foldl' must consume the entire list in this way.
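As a small runnable illustration (my own, not part of the original answer), summing into a strict Int accumulator is exactly the foldl' use case:

```haskell
import Data.List (foldl')

-- With a strict accumulator, foldl' keeps the running total evaluated
-- and runs in constant stack space:
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- Plain foldl would build a chain of (+) thunks before evaluating them,
-- and foldr would nest a million pending applications on the stack.
main :: IO ()
main = print (sumStrict [1 .. 1000000])  -- 500000500000
```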
```
foldl f z []      = z
foldl f z [1]     = f z 1
foldl f z [1,2]   = f (f z 1) 2
foldl f z [1,2,3] = f (f (f z 1) 2) 3
```
The final element of the list is required to evaluate the outermost application, so there is no way to partially consume the list. If we expand this with
(++), we will see:
```
foldl (++) [] [[1,2],[3,4],[5,6]]
    = (([] ++ [1,2]) ++ [3,4]) ++ [5,6]
           ^^
    = ([1,2] ++ [3,4]) ++ [5,6]
    = ((1 : [2]) ++ [3,4]) ++ [5,6]
                 ^^
    = (1 : ([2] ++ [3,4])) ++ [5,6]
                           ^^
    = 1 : (([2] ++ [3,4]) ++ [5,6])
```
(I admit this looks a little magical if you don't have a good feel for cons lists; it's worth getting dirty with the details though)
See how we have to evaluate every
(++) (marked with
^^ when they are evaluated) on the way down before the
1 bubbles out to the front? The
1 is "hiding" under function applications until then.
foldr, on the other hand, is good for non-strict accumulators like lists, because it allows the accumulator to yield information before the entire list is consumed, which can bring many classically linear-space algorithms down to constant space! This also means that if your list is infinite,
foldr is your only choice, unless your goal is to heat your room using your CPU.
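To see that last point concretely, here is a small demo of my own: with foldr we can take a finite prefix of a concatenation of infinitely many lists.

```haskell
-- foldr yields the left operand's elements of the first (++) before
-- ever touching the rest of the infinite input:
lazyConcat :: [[a]] -> [a]
lazyConcat = foldr (++) []

main :: IO ()
main = print (take 6 (lazyConcat [[n, n] | n <- [1 ..]]))  -- [1,1,2,2,3,3]
```

The same expression with foldl or foldl' would loop forever trying to reach the end of the input.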
```
foldr f z []      = z
foldr f z [1]     = f 1 z
foldr f z [1,2]   = f 1 (f 2 z)
foldr f z [1,2,3] = f 1 (f 2 (f 3 z))
foldr f z [1..]   = f 1 (f 2 (f 3 (f 4 (f 5 ...
```
We have no trouble expressing the outermost applications without having to see the entire list. Expanding
foldr the same way we did foldl:
```
foldr (++) z [[1,2],[3,4],[5,6]]
    = [1,2] ++ ([3,4] ++ ([5,6] ++ z))
    = (1 : [2]) ++ ([3,4] ++ ([5,6] ++ z))
                ^^
    = 1 : ([2] ++ ([3,4] ++ ([5,6] ++ z)))
```
1 is yielded immediately without having to evaluate any of the
(++)s but the first one. Because none of those
(++)s are evaluated, and Haskell is lazy, they don't even have to be generated until more of the output list is consumed, meaning
concat can run in constant space for an expression like this

```
concat [ [1..n] | n <- [1..] ]
```
which in a strict language would require intermediate lists of arbitrary length.
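You can check this quickly (my own demo): only the demanded prefix is ever built, so this returns immediately despite the infinite outer list.

```haskell
-- Demanding ten elements forces only the first few inner lists:
prefix :: [Int]
prefix = take 10 (concat [ [1 .. n] | n <- [1 ..] ])

main :: IO ()
main = print prefix  -- [1,1,2,1,2,3,1,2,3,4]
```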
If these reductions look a little too magical, and if you want to go deeper, I suggest examining the source of
(++) and doing some simple manual reductions against its definition to get a feel for it. (Just remember that
[1,2,3,4] is notation for
1 : (2 : (3 : (4 : []))).)
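For reference, (++) is defined in GHC's base library (GHC.Base) essentially like this; the left operand is rebuilt cons by cons while the right operand is reused unchanged:

```haskell
import Prelude hiding ((++))  -- so we can write our own copy of the definition

(++) :: [a] -> [a] -> [a]
[]     ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)  -- one new cons cell per element of the left list

main :: IO ()
main = print ([1,2] ++ [3,4,5])  -- [1,2,3,4,5]
```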
In general, the following seems to be a strong rule of thumb for efficiency: use
foldl' when your accumulator is a strict data structure, and
foldr when it's not. And if you see a friend using regular
foldl and don't stop them, what kind of friend are you?
What is the cause of using foldr in concat instead of foldl'?
What if the result gets fully evaluated?
If you consider
[1,2,3] ++ [6,7,8] within an imperative programming mindset, all you have to do is redirect the next pointer of node 3 towards node 6, assuming of course you may alter your left-side operand.
This being Haskell, you may NOT alter your left side operand, unless the optimizer is able to prove that
++ is the sole user of its left side operand.
Short of such a proof, other Haskell expressions pointing to node 1 have every right to assume that node 1 is forever at the beginning of a list of length 3. In Haskell, the properties of a pure expression cannot be altered during its lifetime.
So, in the general case, operator
++ has to do its job by duplicating its left side operand, and the duplicate of node 3 may then be set to point to node 6. On the other hand, the right side operand can be taken as is.
So if you fold the concat expression starting from the right, each component of the concatenation must be duplicated exactly once. But if you fold the expression starting from the left, you are facing a lot of repetitive duplication work.
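A back-of-envelope count (mine, not from the answer) makes this quantitative for k singleton lists: the right fold copies each singleton exactly once, about k cons cells in total, while the left fold recopies its whole accumulator, of length i at step i, for 0 + 1 + … + (k-1) = k(k-1)/2 cells.

```haskell
-- Cons cells allocated by (++) while concatenating k singleton lists:
leftFoldCopies, rightFoldCopies :: Int -> Int
leftFoldCopies  k = sum [0 .. k - 1]  -- accumulator of length i is recopied at step i
rightFoldCopies k = k                 -- each singleton is copied exactly once

main :: IO ()
main = print (leftFoldCopies 10000, rightFoldCopies 10000)  -- (49995000,10000)
```

That quadratic-versus-linear gap is what the measurements should reflect.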
Let's try to check that quantitatively. To ensure that no optimizer will get in the way by proving anything, we'll just use the ghci interpreter. Its strong point is interactivity not optimization.
So let's introduce the various candidates to ghci, and switch statistics mode on:
```
$ ghci
λ> import qualified Data.List as L
λ> myConcat0 = L.foldr  (++) []
λ> myConcat1 = L.foldl  (++) []
λ> myConcat2 = L.foldl' (++) []
λ> :set +s
λ>
```
We'll force full evaluation by using lists of numbers and printing their sum.
First, let's get baseline performance by folding from the right:
```
λ> sum $ concat [ [x] | x <- [1..10000::Integer] ]
50005000
(0.01 secs, 3,513,104 bytes)
λ> sum $ myConcat0 [ [x] | x <- [1..10000::Integer] ]
50005000
(0.01 secs, 3,513,144 bytes)
λ>
```
Second, let's fold from the left, to see whether that improves matters or not.
```
λ> sum $ myConcat1 [ [x] | x <- [1..10000::Integer] ]
50005000
(1.26 secs, 4,296,646,240 bytes)
λ> sum $ myConcat2 [ [x] | x <- [1..10000::Integer] ]
50005000
(1.28 secs, 4,295,918,560 bytes)
λ>
```
So folding from the left allocates much more transient memory and takes much more time, probably because of this repetitive duplication work.
As a last check, let's double the problem size:
```
λ> sum $ myConcat2 [ [x] | x <- [1..20000::Integer] ]
200010000
(5.91 secs, 17,514,447,616 bytes)
λ>
```
We see that doubling the problem size causes the resource consumption to get multiplied by about 4. Folding from the left has quadratic cost in the case of (++).
Looking at the excellent answer by luqui, we see that both concerns:
- the need to be able to access the beginning of the result list lazily
- the need to avoid quadratic cost for full evaluation
happen to point in the same direction, that is, in favor of folding from the right.
Hence the Haskell library
concat function using
foldr.
After running some tests using GHC v8.6.5 with the -O3 option instead of ghci, it appears that my preconceived idea of the optimizer messing with the measurements was erroneous.
Even with -O3, for a problem size of 20,000, the foldr-based concat function is about 500 times faster than the foldl'-based one.
So either the optimizer fails to prove that it is OK to alter/reuse the left operand, or it just does not try at all.