How do I make this algorithm lazier without repeating myself?

Rather than use explicit wrappers, you can leverage the type system. Note that the version of precise that uses Maybe for your first code snippet:

precise :: Maybe (Integer, v) -> TreeMap v -> Maybe (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
  LT -> precise closestSoFar l
  EQ -> Just (k, v)
  GT -> precise (Just (k, v)) r

is almost exactly the same algorithm as the version of precise without Maybe from your second code snippet, which could be written in the Identity functor as:

precise :: Identity (Integer, v) -> TreeMap v -> Identity (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
  LT -> precise closestSoFar l
  EQ -> Identity (k, v)
  GT -> precise (Identity (k, v)) r

These can be unified into a version polymorphic in the Applicative:

precise :: (Applicative f) => f (Integer, v) -> TreeMap v -> f (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
  LT -> precise closestSoFar l
  EQ -> pure (k, v)
  GT -> precise (pure (k, v)) r

By itself, that doesn't accomplish much, but if we know that the GT branch will always return a value, we can force it to run in the Identity functor, regardless of the starting functor. That is, we can start in the Maybe functor but recurse into the Identity functor in the GT branch:

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Nothing
  where
    precise :: (Applicative t) => t (Integer, v) -> TreeMap v -> t (Integer, v)
    precise closestSoFar Leaf = closestSoFar
    precise closestSoFar (Node k v l r) = case i `compare` k of
      LT -> precise closestSoFar l
      EQ -> pure (k, v)
      GT -> pure . runIdentity $ precise (Identity (k, v)) r

This works fine with your test case:

> isJust $ closestLess 5 (Node 3 () Leaf undefined)
True

and is a nice example of polymorphic recursion.

Another nice thing about this approach from a performance point of view is that the -ddump-simpl shows that there are no wrappers or dictionaries. It's all been erased at the type level with specialized functions for the two functors:

closestLess
  = \ @ v i eta ->
      letrec {
        $sprecise
        $sprecise
          = \ @ v1 closestSoFar ds ->
              case ds of {
                Leaf -> closestSoFar;
                Node k v2 l r ->
                  case compareInteger i k of {
                    LT -> $sprecise closestSoFar l;
                    EQ -> (k, v2) `cast` <Co:5>;
                    GT -> $sprecise ((k, v2) `cast` <Co:5>) r
                  }
              }; } in
      letrec {
        $sprecise1
        $sprecise1
          = \ @ v1 closestSoFar ds ->
              case ds of {
                Leaf -> closestSoFar;
                Node k v2 l r ->
                  case compareInteger i k of {
                    LT -> $sprecise1 closestSoFar l;
                    EQ -> Just (k, v2);
                    GT -> Just (($sprecise ((k, v2) `cast` <Co:5>) r) `cast` <Co:4>)
                  }
              }; } in
      $sprecise1 Nothing eta

Starting from my non-lazy implementation, I first refactored precise to receive Just as an argument, and generalized its type accordingly:

data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Just Nothing where
  precise :: ((Integer, v) -> t) -> t -> TreeMap v -> t
  precise _ closestSoFar Leaf = closestSoFar
  precise wrap closestSoFar (Node k v l r) = case i `compare` k of
    LT -> precise wrap closestSoFar l
    EQ -> wrap (k, v)
    GT -> precise wrap (wrap (k, v)) r

Then, I changed it to do wrap early and call itself with id in the GT case:

data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Just Nothing where
  precise :: ((Integer, v) -> t) -> t -> TreeMap v -> t
  precise _ closestSoFar Leaf = closestSoFar
  precise wrap closestSoFar (Node k v l r) = case i `compare` k of
    LT -> precise wrap closestSoFar l
    EQ -> wrap (k, v)
    GT -> wrap (precise id (k, v) r)

This still works exactly as before, except for the benefit of the added laziness.

I think the CPS version you answered with yourself is the best but for completeness here are a few more ideas. (EDIT: Buhr's answer is now the most performant.)

The first idea is to get rid of the "closestSoFar" accumulator, and instead let the GT case handle all the logic of picking the rightmost value smallest than the argument. In this form, the GT case can directly return a Just:

closestLess1 :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess1 _ Leaf = Nothing
closestLess1 i (Node k v l r) =
  case i `compare` k of
    LT -> closestLess1 i l
    EQ -> Just (k, v)
    GT -> Just (fromMaybe (k, v) (closestLess1 i r))

This is simpler, but takes a bit more space on the stack when you hit a lot of GT cases. Technically you could even use that fromMaybe in the accumulator form (i.e., replacing the fromJust implicit in luqui's answer), but that would be a redundant, unreachable branch.

The other idea that there's really two "phases" of the algorithm, one before and one after you hit a GT, so you parameterize it by a boolean to represent these two phases, and use dependent types to encode the invariant that there will always be a result in the second phase.

data SBool (b :: Bool) where
  STrue :: SBool 'True
  SFalse :: SBool 'False

type family MaybeUnless (b :: Bool) a where
  MaybeUnless 'True a = a
  MaybeUnless 'False a = Maybe a

ret :: SBool b -> a -> MaybeUnless b a
ret SFalse = Just
ret STrue = id

closestLess2 :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess2 i = precise SFalse Nothing where
  precise :: SBool b -> MaybeUnless b (Integer, v) -> TreeMap v -> MaybeUnless b (Integer, v)
  precise _ closestSoFar Leaf = closestSoFar
  precise b closestSoFar (Node k v l r) = case i `compare` k of
    LT -> precise b closestSoFar l
    EQ -> ret b (k, v)
    GT -> ret b (precise STrue (k, v) r)

How do I make this algorithm lazier without repeating myself?

Tags:

Haskell

Lazy Evaluation

Dry

Related

Recent Posts