Number all occurring leaves in a tree from left to right in Haskell

I have the same idea in mind as chepner: to use State. However, you don't have to write the recursion yourself because this is a simple traversal of the tree! Instead, derive Traversable and Foldable for your tree (good ideas anyway), and then lean on them to do the recursion for you:

{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveTraversable #-}

import qualified Control.Monad.Trans.State.Strict as S
data Tree a = Leaf a | Node (Tree a) (Tree a)
            deriving (Show, Functor, Foldable, Traversable)

labelTree :: Tree a -> Tree (a, Int)
labelTree t = S.evalState (traverse applyLabel t) 0
  where applyLabel x = do
          n <- S.get
          S.modify' succ
          pure (x, n)

*Main> labelTree (Node (Node (Leaf 'a') (Leaf 'b')) (Leaf 'c'))
Node (Node (Leaf ('a',0)) (Leaf ('b',1))) (Leaf ('c',2))

One nice feature of this implementation is that it will still work if you change the structure of your tree (e.g., to store data in the interior nodes). It is impossible to make mistakes like swapping the order of nodes, because you don't work at that level at all: Traversable handles it for you.


What you probably need here is some sort of accumulator: a variable that you pass through the recursive calls, and each time you increment each time you "assign" the next id.

We thus define our function in terms of a helper function go. go will return a 2-tuple: the "labeled" tree, and the next id that we will "dispatch". This will later be used since we define a recursive call:

labelTree :: Tree a -> Tree (a, Int)
labelTree = fst . go 0
    where go ...

So go has type Int -> Tree a -> (Int, Tree (a, Int)). In case we see a Leaf, we thus "dispatch" that id, and then return that leaf, together with n + 1 as second part of the tuple, like:

go (Leaf x) n = (Leaf (x, n), n+1)

for a node, we will first dispatch ids to the left subtree, and then take the second item of that tuple as a start to dispatch elements to the right subtree, like:

go (Node l r) n0 = (Node ll lr, n2)
    where (ll, n1) = go l n0
          (lr, n2) = go r n1

We thus first call go l n0 to label the left subtree, and obtain a 2-tuple (ll, n1) that contains ll the labeled left subtree, and n1 the new number to dispatch later. We make a call to go r n1 so we dispatch numbers to the right subtree starting with n1. Our go funcion thus returns a new Node with the labeled subtrees, and the new number to dispatch. This is important for the caller of this function.

So in full, we can label a tree with:

labelTree :: Tree a -> Tree (a, Int)
labelTree = fst . go 0
    where go (Leaf x) n = (Leaf (x, n), n+1)
          go (Node l r) n0 = (Node ll lr, n2)
              where (ll, n1) = go l n0
                    (lr, n2) = go r n1

You can use the State monad to keep track of the number to add to a node.

labelTree :: Tree a -> Tree (a, Int)
labelTree l = evalState (labelTree' l) 0
    where labelTree' :: Tree a -> State Int (Tree (a, Int))
          labelTree' (Node l r) = Node <$> labelTree' l <*> labelTree' r
          labelTree' (Leaf a) = do n <- get
                                   put $ n + 1
                                   return $ Leaf (a, n)

labelTree' builds up a stateful computation which will number the leaves along an in-order traversal. evalState then runs the computation with an initial state of 0, so that leaves are numbered starting at 0.

The recursive case looks much like an ordinary tree function. Instead of simply applying Node to the results of each recursive call, you use the Applicative instance.

The base case numbers each Leaf using the current state and updates the state for the next leaf.

(Note that this is very similar to Willem Van Onsem's answer. Given that State s a is effectively a wrapper around functions of type s -> (a, s), the type of labelTree' :: Tree a -> State Int (Tree (a, Int), Int) can be coaxed into the same type as go:

labelTree' :: Tree a -> State Int (Tree (a, Int)) 
            ~ Tree a -> Int -> (Tree (a, Int), Int)
go ::         Tree a -> Int -> (Tree (a, Int), Int)

)

Tags:

Binary

Haskell