Non-trivial algorithm conversion from imperative to functional

(Disclaimer: I have zero idea of what is being calculated here.)

The access pattern in the array U seems to be as follows: starting from the index i, we consume values outwards, to the left and to the right. We could imagine having two lists which consisted precisely of those sequences of elements. In fact, we could construct such lists out of a source list like this:

pryAt :: Int -> [a] -> ([a], [a])
pryAt i xs = go i ([], xs)
  where
    go 0 a = a
    go n (us, v : vs) = go (pred n) (v : us, vs)
-- pryAt 5 ['a'..'x']
-- ("edcba","fghijklmnopqrstuvwx")

For random-access containers, we could have specialized versions of pryAt, because traversing the whole list up to the i-th element would be inefficient.
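For example, with boxed vectors the split becomes two O(1) slices plus one O(i) reversal (the name pryAtV is mine, not part of the question's code):

```haskell
import qualified Data.Vector as V

-- Vector analogue of pryAt: the left part comes out reversed,
-- matching pryAt's convention of consuming outwards from index i.
pryAtV :: Int -> V.Vector a -> (V.Vector a, V.Vector a)
pryAtV i xs = (V.reverse (V.take i xs), V.drop i xs)

-- pryAtV 5 (V.fromList ['a'..'x'])
--   ≡ (V.fromList "edcba", V.fromList "fghijklmnopqrstuvwx")
```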

In the outer loop, we have arrays N, left and right which grow with each iteration (N seems to be fully reconstructed at each iteration as well). We could represent them as lists. In each iteration we also consume a pair of elements of U, one to the left and one to the right.

The following datatype represents the situation at the beginning of an iteration of the outer loop:

data State = State
  { ns :: [Double],
    left :: [Double],
    right :: [Double],
    us :: ([Double], [Double])
  }

Assuming we already had the outerStep :: State -> State implemented, we could simply turn the crank p times:

basis :: Int -> Double -> Int -> [Double] -> [Double]
basis i u p us =
  ns $ iterate outerStep initial !! p
  where
    initial =
      State
        { ns = [1.0],
          left = [],
          right = [],
          us = pryAt i us
        }

What is done in outerStep? We add new elements to left and right, then we re-create the whole N list from the beginning while threading the saved accumulator along the way. This is a mapAccumR. We need some extra info at each position: the right values (in the same direction as N) and the left values (in reverse direction), so we need to zip them beforehand:

    outerStep (State {ns, left, right, us = (ul : uls, ur : urs)}) =
      let left' = u - ul : left
          right' = ur - u : right
          (saved', ns') = mapAccumR innerStep 0.0 $ zip3 ns right' (reverse left')
       in State
            { ns = saved' : ns',
              left = left',
              right = right',
              us = (uls, urs)
            }

And here are the computations of the inner step:

    innerStep saved (n, r, l) =
        let temp = n / (r + l)
            n' = saved + r * temp
            saved' = l * temp
         in (saved', n')

In addition to correcting possible bugs, more work would remain, because the basis function in its current form is likely to leak memory (in particular, that mapAccumR will create lots of thunks). Perhaps it could be rewritten to use functions like iterate' or foldl' that keep their accumulators strict.
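To illustrate the idea generically (this sketch is not tied to State): iterate' forces each intermediate result to weak head normal form before continuing, so no chain of unevaluated step applications builds up. Note that WHNF only forces the outermost constructor, so for a record like State one would additionally want bang-annotated fields.

```haskell
import Data.List (iterate')

-- Drive a step function n times, forcing each intermediate state.
-- With plain iterate, a thunk chain of length n would accumulate.
nthStrict :: (a -> a) -> a -> Int -> a
nthStrict step x0 n = iterate' step x0 !! n

-- nthStrict (+1) (0 :: Int) 100000  →  100000
```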


Although it is certainly possible to translate numerical algorithms into Haskell starting from the Fortran/Matlab/C “everything is an array” style (and, using unboxed mutable vectors, performance will generally not be much worse), this really misses the point of using a functional language. The underlying math is actually much closer to functional than to imperative programming, so the best thing is to start right there. Specifically, the recurrence formula

\[ N_{i,p}(u) = \frac{u - u_i}{u_{i+p} - u_i}\,N_{i,p-1}(u) + \frac{u_{i+p+1} - u}{u_{i+p+1} - u_{i+1}}\,N_{i+1,p-1}(u) \]

(equation 2.5 from the NURBS book)

can be translated almost literally into Haskell, much better than into an imperative language:

baseFuncs :: [Double]  -- ^ Knots, \(\{u_i\}_i\)
          -> Int       -- ^ Index \(i\) at which to evaluate
          -> Int       -- ^ Spline degree \(p\)
          -> Double    -- ^ Position \(u\) at which to evaluate
          -> Double
baseFuncs us i 0 u
  | u >= us!!i, u < us!!(i+1)  = 1
  | otherwise                  = 0
baseFuncs us i p u
      = (u - us!!i)/(us!!(i+p) - us!!i) * baseFuncs us i (p-1) u
       + (us!!(i+p+1) - u)/(us!!(i+p+1) - us!!(i+1)) * baseFuncs us (i+1) (p-1) u
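A quick sanity check of this direct translation (the definition is repeated so the snippet stands alone): on a uniform knot vector, the degree-p basis functions must sum to 1 everywhere inside the valid parameter range (partition of unity). The knot vector [0..10] and the evaluation points are just arbitrary test choices.

```haskell
-- Cox–de Boor recurrence, list version as above.
baseFuncs :: [Double] -> Int -> Int -> Double -> Double
baseFuncs us i 0 u
  | u >= us!!i, u < us!!(i+1)  = 1
  | otherwise                  = 0
baseFuncs us i p u
      = (u - us!!i)/(us!!(i+p) - us!!i) * baseFuncs us i (p-1) u
       + (us!!(i+p+1) - u)/(us!!(i+p+1) - us!!(i+1)) * baseFuncs us (i+1) (p-1) u

-- Uniform knots 0..10, degree 2: the functions N_{i,2} for i = 0..7
-- should sum to 1 at any u in the valid range [u_2, u_8] = [2,8].
partitionOfUnity :: Double -> Double
partitionOfUnity u = sum [baseFuncs [0..10] i 2 u | i <- [0..7]]
```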

Unfortunately, this will not actually be efficient, for multiple reasons.

First, lists suck at random access. A simple fix is to switch to unboxed (but pure) vectors. While we're at it, let's wrap them in a newtype, because the \(u_i\) are supposed to be strictly increasing. Talking about types: the direct accesses are unsafe; we could fix this by bringing p and the number of segments to the type level and only allowing indices i < n-p, but I won't go into that here.

Also, it's awkward to pass us and u around all the way down the recursion; better to bind them once and then use a helper function to go down:

import Data.Vector.Unboxed (Vector, (!))
import qualified Data.Vector.Unboxed as VU

newtype Knots = Knots {getIncreasingKnotsSeq :: Vector Double}

baseFuncs :: Knots     -- ^ \(\{u_i\}_i\)
          -> Int       -- ^ Index \(i\) at which to evaluate
          -> Int       -- ^ Spline degree \(p\)
          -> Double    -- ^ Position \(u\) at which to evaluate
          -> Double
baseFuncs (Knots us) i₀ p₀ u = go i₀ p₀
 where go i 0
        | u >= us!i
        , i>=VU.length us-1 || u < us!(i+1)  = 1
        | otherwise                          = 0
       go i p
           = (u - us!i)/(us!(i+p) - us!i) * go i (p-1)
            + (us!(i+p+1) - u)/(us!(i+p+1) - us!(i+1)) * go (i+1) (p-1)
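Since nothing enforces the newtype's invariant so far, one might add a checking smart constructor (mkKnots is my name, not part of the code above):

```haskell
import Data.Vector.Unboxed (Vector)
import qualified Data.Vector.Unboxed as VU

newtype Knots = Knots {getIncreasingKnotsSeq :: Vector Double}

-- Hypothetical smart constructor: accept the knot sequence only if it
-- is strictly increasing, as the newtype is meant to guarantee.
mkKnots :: [Double] -> Maybe Knots
mkKnots xs
  | and (zipWith (<) xs (tail xs)) = Just (Knots (VU.fromList xs))
  | otherwise                      = Nothing
```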

The other thing that's not optimal is that we don't share the lower-level evaluations between neighbouring recursive calls. (The evaluation effectively spans a directed acyclic graph with \(p^2\) nodes, but we evaluate it as a tree with \(2^p\) nodes.) That's a massive inefficiency for large p, but actually quite harmless for the typical low-degree splines.

The way to avoid this inefficiency is to memoise. The C version does this explicitly with the N array, but – this being Haskell – we can be lazy and save the effort of allocating the correct size by using a generic memoisation library, e.g. MemoTrie:

import Data.MemoTrie (memo2)

baseFuncs (Knots us) i₀ p₀ u = go' i₀ p₀
 where go i 0
        | u >= us!i
        , i>=VU.length us-1 || u < us!(i+1)  = 1
        | otherwise                          = 0
       go i p
           = (u - us!i)/(us!(i+p) - us!i) * go' i (p-1)
            + (us!(i+p+1) - u)/(us!(i+p+1) - us!(i+1)) * go' (i+1) (p-1)
       go' = memo2 go

That was the no-brains version (“just memoise the entire domain of go”). As dfeuer remarks, it is easy enough to explicitly memoise only the region that actually gets evaluated, and then we can again use an efficient unboxed vector:

baseFuncs (Knots us) i₀ p₀ u = VU.unsafeHead $ gol i₀ p₀
 where gol i 0 = VU.generate (p₀+1) $ \j ->
        if u >= us!(i+j)
            && (i+j>=VU.length us-1 || u < us!(i+j+1))
         then 1 else 0
       gol i p = case gol i (p-1) of
        res' -> VU.izipWith
         (\j l r -> let i' = i+j
            in (u - us!i')/(us!(i'+p) - us!i') * l
            + (us!(i'+p+1) - u)/(us!(i'+p+1) - us!(i'+1)) * r)
         res' (VU.unsafeTail res')

(I can safely use unsafeHead and unsafeTail here, because at each recursion level the zipping reduces the length by 1, so at the top level I still have (p₀+1) - p₀ = 1 element left.)
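That length bookkeeping can be checked in miniature with plain lists (zipping a list against its own tail shortens it by one per level):

```haskell
-- Lengths of the intermediate results, per recursion level: a level-0
-- list of p0+1 elements shrinks by one at each of the p0 zip levels.
levelLengths :: Int -> [Int]
levelLengths p0 = map length $ take (p0 + 1)
                $ iterate step (replicate (p0 + 1) (0 :: Double))
  where step xs = zipWith (+) xs (tail xs)

-- levelLengths 3  →  [4,3,2,1]
```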

This version should, I think, have the same asymptotics as the C version. With some more small improvements like precomputing the interval lengths and pre-checking that the arguments are in the allowed range so all accesses can be made unsafe, it is probably very close in performance to the C version.

As – again – dfeuer remarks, it might not even be necessary to use vectors there, because I just zip together the results. For this kind of stuff, GHC can be very good at optimising code even when using plain lists. But I won't investigate the performance any further here.


The test I used to confirm it actually works:

plot of the base splines

https://gist.github.com/leftaroundabout/4fd6ef8642029607e1b222783b9d1c1e