# Triangularizing a list in Haskell

This appears to be directly related to the set theory argument proving that the set of integer pairs are in one-to-one correspondence with the set of integers (denumerable). The argument involves a so-called Cantor pairing function.

So, out of curiosity, let's see if we can get a diagonalize function that way. Define the infinite list of Cantor pairs recursively in Haskell:

auxCantorPairList :: (Integer, Integer) -> [(Integer, Integer)]
auxCantorPairList (x,y) =
let nextPair = if (x > 0) then (x-1,y+1) else (x+y+1, 0)
in (x,y) : auxCantorPairList nextPair

cantorPairList :: [(Integer, Integer)]
cantorPairList = auxCantorPairList (0,0)


And try that inside ghci:

 λ> take 15 cantorPairList
[(0,0),(1,0),(0,1),(2,0),(1,1),(0,2),(3,0),(2,1),(1,2),(0,3),(4,0),(3,1),(2,2),(1,3),(0,4)]
λ>


We can number the pairs, and for example extract the numbers for those pairs which have a zero x coordinate:

 λ>
λ> xs = [1..]
λ> take 5 $map fst$ filter (\(n,(x,y)) -> (x==0)) $zip xs cantorPairList [1,3,6,10,15] λ>  We recognize this is the top row from the OP's result in the text of the question. Similarly for the next two rows:  λ> λ> makeRow xs row = map fst$ filter (\(n,(x,y)) -> (x==row)) $zip xs cantorPairList λ> take 5$ makeRow xs 1
[2,5,9,14,20]
λ>
λ> take 5 $makeRow xs 2 [4,8,13,19,26] λ>  From there, we can write our first draft of a diagonalize function:  λ> λ> printAsLines xs = mapM_ (putStrLn . show) xs λ> diagonalize xs = takeWhile (not . null)$ map (makeRow xs) [0..]
λ>
λ> printAsLines $diagonalize [1..19] [1,3,6,10,15] [2,5,9,14] [4,8,13,19] [7,12,18] [11,17] [16] λ>  ## EDIT: performance update For a list of 1 million items, the runtime is 18 sec, and 145 seconds for 4 millions items. As mentioned by Redu, this seems like O(n√n) complexity. Distributing the pairs among the various target sublists is inefficient, as most filter operations fail. To improve performance, we can use a Data.Map structure for the target sublists.  {-# LANGUAGE ExplicitForAll #-} {-# LANGUAGE ScopedTypeVariables #-} import qualified Data.List as L import qualified Data.Map as M type MIL a = M.Map Integer [a] buildCantorMap :: forall a. [a] -> MIL a buildCantorMap xs = let ts = zip xs cantorPairList -- triplets (a,(x,y)) m0 = (M.fromList [])::MIL a redOp m (n,(x,y)) = let afn as = case as of Nothing -> Just [n] Just jas -> Just (n:jas) in M.alter afn x m m1r = L.foldl' redOp m0 ts in fmap reverse m1r diagonalize :: [a] -> [[a]] diagonalize xs = let cm = buildCantorMap xs in map snd$ M.toAscList cm



With that second version, performance appears to be much better: 568 msec for the 1 million items list, 2669 msec for the 4 millions item list. So it is close to the O(n*Log(n)) complexity we could have hoped for.

Make increasing size chunks:

chunks :: [a] -> [[a]]
chunks = go 0 where
go n [] = []
go n as = b : go (n+1) e where (b,e) = splitAt n as


Then just transpose twice:

diagonalize :: [a] -> [[a]]
diagonalize = transpose . transpose . chunks


Try it in ghci:

> diagonalize [1..19]
[[1,3,6,10,15],[2,5,9,14],[4,8,13,19],[7,12,18],[11,17],[16]]


It might be a good idea to craete a comb filter.

So what does comb filter do..? It's like splitAt but instead of splitting at a single index it sort of zips the given infinite list with the given comb to separate the items coressponding to True and False in the comb. Such that;

comb :: [Bool]  -- yields [True,False,True,False,False,True,False,False,False,True...]
comb = iterate (False:) [True] >>= id

combWith :: [Bool] -> [a] -> ([a],[a])
combWith _ []          = ([],[])
combWith (c:cs) (x:xs) = let (f,s) = combWith cs xs
in if c then (x:f,s) else (f,x:s)

λ> combWith comb [1..19]
([1,3,6,10,15],[2,4,5,7,8,9,11,12,13,14,16,17,18,19])


Now all we need to do is to comb our infinite list and take the fst as the first row and carry on combing the snd with the same comb.

Lets do it;

diags :: [a] -> [[a]]
diags [] = []
diags xs = let (h,t) = combWith comb xs
in h : diags t

λ> diags [1..19]
[ [1,3,6,10,15]
, [2,5,9,14]
, [4,8,13,19]
, [7,12,18]
, [11,17]
, [16]
]


also seems to be lazy too :)

λ> take 5 . map (take 5) \$ diags [1..]
[ [1,3,6,10,15]
, [2,5,9,14,20]
, [4,8,13,19,26]
, [7,12,18,25,33]
, [11,17,24,32,41]
]


I think the complexity could be like O(n√n) but i can not make sure. Any ideas..?