Implementing a function which generalizes the merging step in merge sort

Preamble

Since I agree that it would be nice to have a generic function of this type, I will provide a general implementation. First, I will give a generic one based on linked lists, then I will add a JIT-compiled one for special numeric types, and lastly, I will bring it all together.

Top-level implementation based on linked lists

Here is a reasonably efficient implementation based on linked lists:

ClearAll[toLinkedList, ll];
SetAttributes[ll, HoldAllComplete];
toLinkedList[s_List] := Fold[ll[#2, #1] &, ll[], Reverse[s]];

and the main function:

ClearAll[merge];
merge[a_ll, ll[], s_, _] := List @@ Flatten[ll[s, a], Infinity, ll];
merge[ll[], b_ll, s_, _] := List @@ Flatten[ll[s, b], Infinity, ll];
merge[ll[a1_, atail_], b : ll[b1_, _], s_, f_: LessEqual] /;f[a1, b1] :=
    merge[atail, b, ll[s, a1], f];
merge[a : ll[a1_, _], ll[b1_, brest_], s_, f_: LessEqual] :=
    merge[a, brest, ll[s, b1], f];
merge[a_List, b_List, f_: LessEqual] :=
    merge[toLinkedList@a, toLinkedList@b, ll[], f];

For example:

merge[{2,5,1},{3,6,4},LessEqual]
 {2,3,5,1,6,4}
merge[{2,5,1},{3,6,4},Greater]
 {3,6,4,2,5,1}

And also for large lists:

large1 = RandomInteger[100, 10000];
large2 = RandomInteger[100, 10000];

Block[{$IterationLimit = Infinity},
   merge[large1,large2,LessEqual]]//Short//AbsoluteTiming
{0.0751953,{70,54,78,84,11,21,41,49,78,93,90,70,19,
      <<19975>>,42,2,10,40,53,12,2,47,89,40,2,80}}

For a complete implementation of merge sort algorithm based on linked lists, see this post (the difference there is that I used repeated rule application instead of recursion. Originally, the goal of that example was to show that ReplaceRepeated is not necessarily slow if the patterns are constructed efficiently).

Full implementation including JIT-compilation

I'd like to show here how one could implement a fairly complete function which would automatically dispatch to an efficient JIT-compiled code when the arguments are appropriate. Compilation will work not just for numeric lists, but for lists of tensors in general, as long as they are of the same shape.

JIT - compilation

First comes the JIT-compiled version, done along the lines discussed in this answer, section "Making JIT-compiled functions"

ClearAll[mergeJIT];
mergeJIT[pred_, listType_, target : ("MVM" | "C") : "MVM"] :=
  mergeJIT[pred, Verbatim[listType], target] =
    Block[{fst, sec},
      With[{decl = {Prepend[listType, fst], Prepend[listType, sec]}},
       Compile @@
         Hold[decl,
           Module[{result = Table[0, {Length[fst] + Length[sec]}], i = 0, 
                fctr = 1, sctr = 1},
             While[fctr <= Length[fst] && sctr <= Length[sec],
               If[pred[fst[[fctr]], sec[[sctr]]],
                   result[[++i]] = fst[[fctr++]],
                   (* else *)
                   result[[++i]] = sec[[sctr++]]
               ]
             ];
             If[fctr > Length[fst],
               result[[i + 1 ;; -1]] = sec[[sctr ;; -1]],
               (* else *)
               result[[i + 1 ;; -1]] = fst[[fctr ;; -1]]
             ];
             result
           ],
           CompilationTarget -> target
         ]]];

You can use this in isolation:

mergeJIT[LessEqual,{_Integer,1},"MVM"][{2,5,1},{3,6,4}]
 {2,3,5,1,6,4}

but it is much better to use as a part of the generic function, which would figure out the types for you automatically.

Generic function implementation

This is a function to find the type of our lists:

Clear[getType, $useCompile];
getType[arg_List] /; $useCompile && ArrayQ[arg, _, IntegerQ] := 
    {_Integer, Length@Dimensions@arg};
getType[arg_List] /; $useCompile && ArrayQ[arg, _, NumericQ] && 
     Re[arg] == arg := 
        {_Real, Length@Dimensions@arg};
getType[_] := General;

This is a function to dispatch to a right type:

Clear[mergeDispatch];
SetAttributes[mergeDispatch, Orderless];
mergeDispatch[{Verbatim[_Integer], n_}, {Verbatim[_Real], n_}, pred_] :=
    mergeDispatch[{Verbatim[_Real], n}, {Verbatim[_Real], n}, pred];

mergeDispatch[f : {Verbatim[_Real], n_}, {Verbatim[_Real], n_}, pred_] :=
    mergeJIT[pred, f, $target];

mergeDispatch[f : {Verbatim[_Integer], n_}, {Verbatim[_Integer], n_}, pred_] :=
    mergeJIT[pred, f, $target];

mergeDispatch[_, _, pred_] :=
    Function[{fst, sec},
      Block[{$IterationLimit = Infinity},
         merge[fst, sec, pred]]];

and this is a function to bring it all together:

ClearAll[mergeList];
Options[mergeList] = 
 {
    CompileToC -> False,
    Compiled -> True
 };
mergeList[f_, s_, pred_, opts : OptionsPattern[]] :=
  Block[{
       $target = If[TrueQ[OptionValue[CompileToC]], "C", "MVM"],
       $useCompile = TrueQ[OptionValue[Compiled]]
    },
    mergeDispatch[getType@f, getType@s, pred][f, s]
  ];

Finally, a helper function to clear the cache of mergeJIT, if that would be desired:

ClearAll[clearMergeJITCache];
clearMergeJITCache[] :=
   DownValues[mergeJIT] = {Last@DownValues[mergeJIT]};

Benchmarks and tests

First, create test data:

clearMergeJITCache[];
huge1 = RandomInteger[1000,1000000];
huge2 = RandomInteger[1000,1000000];

A first call to the function with C compilation target is expensive:

mergeList[huge1,huge2,Less,CompileToC -> True]//Short//AbsoluteTiming
 {3.8652344,{267,461,66,607,797,116,197,474,852,805,135,
     <<1999978>>,266,667,799,280,261,930,241,83,594,904,894}}

But then, for the same types of lists, it will pay off for huge lists:

mergeList[huge1,huge2,Less,CompileToC -> True]//Short//AbsoluteTiming
 {0.0468750,{267,461,66,607,797,116,197,474,852,805,135,
       <<1999978>>,266,667,799,280,261,930,241,83,594,904,894}}

On the other hand, the call with MVM target is fast out of the box, but not as fast as the one with the C target after the "warm-up":

mergeList[huge1,huge2,Less]//Short//AbsoluteTiming
 {0.2138672,{267,461,66,607,797,116,197,474,852,805,135,
       <<1999978>>,266,667,799,280,261,930,241,83,594,904,894}}

The call to generic one is general but comparatively very slow:

mergeList[huge1,huge2,Less,Compiled->False]//Short//AbsoluteTiming
 {5.015,{267,461,66,607,797,116,197,474,852,805,135,
       <<1999978>>,266,667,799,280,261,930,241,83,594,904,894}}

Here's another approach.

mergeLists[lista_, listb_, crit_: LessEqual] :=
 Module[{merge},
  merge[list1_, list2_] /; crit[First[list1], First[list2]] :=

   With[{part = TakeWhile[list1, crit[#, First[list2]] &]},
    Sow[part];
    If[Length[part] == Length[list1],
     Sow[list2],
     merge[list1[[Length[part] + 1 ;;]], list2]]];

  merge[list2_, list1_] /; crit[First[list1], First[list2]] := 
   merge[list1, list2];

  merge[list1_, list2_] := With[
    {part = TakeWhile[list1, Not[crit[First[list2], #]] &]},
    Sow[part];
    If[Length[part] == Length[list1],
     Sow[list2],
     merge[list1[[Length[part] + 1 ;;]], list2]]];

  Flatten[Reap[merge[lista, listb];][[2]]]]

It does give slightly different results from Leonid's code though. For example for

list1 = {1, 4, 3};
list2 = {2, 3, 4};

I get with my code

mergeLists[{1, 4, 3}, {2, 3, 4}, LessEqual]

(* out: {1, 2, 3, 4, 4, 3} *)

whereas with Leonid's code I get

Block[{$IterationLimit = Infinity}, merge[{1, 4, 3}, {2, 3, 4}, LessEqual]]

(* out: {1, 2, 3, 4, 3, 4} *)

If I take Less instead of LessEqual I get the same result for both codes, so I expect that it has to do with a different treatment of border cases where the two sublists start with the same element.

Taking this issue aside, my code does seem to be faster than Leonid's solution. Consider for example (I'm choosing large1 and large2 such that their intersection is empty to avoid the issue above)

{large1, large2} = Partition[RandomSample[Range[20000]], 10000];

then with Leonid's code I get

Block[{$IterationLimit = Infinity}, merge[large1, large2, LessEqual]] // Short // 
  AbsoluteTiming

(* {0.070483,{9941,7246,4261,11184,10148,1867,12324,
    <<19986>>,6927,17973,10762,9165,19379,11449,7735}} *)

and with my code

mergeLists[large1, large2, LessEqual] // Short // AbsoluteTiming

(* {0.039470,{9941,7246,4261,11184,10148,1867,12324,
    <<19986>>,6927,17973,10762,9165,19379,11449,7735}} *)

Forget Leonid and Heike's recursive stuff (okay, actually I upvoted both as they are both good responses). But here is a simple, direct version. Note that it will not sort, so if the inputs are unsorted the result will be as well.

mergeSortedLists[lista_, listb_, crit_: LessEqual] := Module[
  {result, len1 = Length[lista], len2 = Length[listb], i = 1, 
   j = 1},
  result = Reap[While[i <= len1 && j <= len2,
      If[TrueQ[crit[lista[[i]], listb[[j]]]],
        Sow[lista[[i]]]; i++,
        Sow[listb[[j]]]; j++];
      ]][[2, 1]];
  If[i <= len1, result = Join[result, lista[[i ;; -1]]], 
   If[j <= len2, result = Join[result, listb[[j ;; -1]]]]];
  result]

Here is Heike's example modified slightly:

{large1, large2} = 
  Partition[Sort[RandomSample[Range[100000]]], 50000];
In[2236]:= Timing[ml = mergeSortedLists[large1, large2, LessEqual];]
ml === Range[100000]

Out[2236]= {0.21, Null}

Out[2237]= True