How can I regroup elements in a list into a tree based on their values?

This is a job for recursion:

ClearAll[f];    
f[{fst : {{left__}, _}, sub : Longest[{{left__, __}, _} ...], rest___}] :=
    {{fst, f[{sub}]} /. {el_, {}} :> el, Sequence @@ f[{rest}]};
f[x_] := x;

The usage is

f[your-initial-structure]

It would be nice to have a real big sample, but I hope it works anyway. Here is the approach: I would extract the information level by level and build up the structure. Level 1 contains the sections and this is basically your starting point

Select[input, Length[First[#]] === 1 &]

All section number lists which have a length of one are your sections. The first one looks like this

{{"1"}, "Setup code"}

Lets say our current output is a list of all those entries. What you want to do now is to extract all sections of level 2 which have the same first number like the above. Lets say we found 3 subsections call them s1,s2,s3 and we call the section entry from above elm then we want to replace in our current output

elm -> {elm, {s1, s2, s3}}

This can be done level by level until we are finished. To make it a bit more concrete: the front of our rule above must ensure, that we work only on the elements of the current level, therefore, we check the length of the section numbering

elm : {l : {__String}, what_String} /; Length[l] === 1 :> blub

To extract s1,... we Select from our input all entries which are one level deeper than the current one (this means Length[..]===2 here) and which have the same section numbers up to the current level. For instance, all childs of {2,2,3} have to look like {2,2,3,..}. In a generic solution this call is therefore

Select[input, Length[First[#]] === i + 1 && First[#][[Range[i]]] === l &]

where i is the current level. We need the First[#] because the section numbers are in the first part of every entry.

One last thing: If an entry has no sub-entries, the Select returns {} which I simply filter in a second rule. The nice thing is that when we are finished, the output does not change anymore. This is perfect for FixedPoint

TransformDoc[input_] :=
 Block[{i = 0}, 
  FixedPoint[
   (i++; # /. elm : {l : {__String}, what_String} /; Length[l] === i :> 
    {elm,Select[input, Length[First[#]] === i + 1 && 
     First[#][[Range[i]]] === l &]} /. 
    {whatever__, {}} :> whatever) &, 
  Select[input, Length[First[#]] === 1 &]]
 ]

I hope I explained all details enough to understand the rule I used.

input = {{{"1"}, "Setup code"}, {{"1", "1"}, 
    "Local variables"}, {{"1", "2"}, "Option variables"}, {{"1", "3"},
     "Check for null Input"}, {{"2"}, "Argument Logic"}, {{"2", "1"}, 
    "X Spec"}, {{"2", "2"}, "One Dimensional Case"}, {{"2", "2", "1"},
     "Tick and Label Generation"}, {{"2", "2", "2"}, 
    "plot generation"}, {{"2", "2", "3"}, 
    "bin names generation"}, {{"2", "2", "4"}, 
    "bar chart generation"}, {{"2", "3"}, "Y Spec"}};
TransformDoc[input] // ToOutline

Mathematica graphics

Finally, please give me a chance to maybe make you reconsider your approach. Why do you need the section numbering at all? Look at LaTeX, as long as you don't want to mix different sections at different places you only need to markup whether you need a section, a subsection, ... The numbering and the structure can simply be build from this. Wouldn't it be easier to just give a number defining the depth of your comment and the rest is calculated automatically?


ReplaceRepeated with some pre-processing to re-structure outline into the desired form.

The key ideas are

  1. construct a tree with root 0 using the information in the first column of outline
  2. pre-process the vertex list to turn leaf indices to strings
  3. Use ReplaceRepeated starting from the root node, and replacing each processed node i with the list {"i", {children of i}}.

First, few simple helper functions:

ClearAll[fatherF, rootsF, chldrnListF]
fatherF = Function[{item, list},   If[Length[item] == 1, {0}, 
      Pick[list, (item[[;; -2]] == #) & /@ list]]];
(* create a list of children lists - the leaf nodes of the tree marked as strings in each part *)
chldrnListF = Function[{grph}, 
 VertexOutComponent[grph, #, 1]& /@ VertexList[grph] /.
   i_Integer /; VertexOutComponent[grph, {i}, 1] == {i} :> ToString[i]];
(* to extract roots *)
rootsF = Function[{grph}, 
 Pick[VertexList[grph], (VertexInComponent[grph, {#}, 1] == {#}) & /@
  VertexList[grph]]];

and some pre-processing:

indices = First /@ outline;
fathers = fatherF[#, indices] & /@ indices;
edgelist = (Thread[fathers -> indices] /.
   Rule[a_, b_] :>
     Rule[Position[indices, First@a, 1], Position[indices, b, 1]] /. {} -> {{0}}) //
  Map[First@(Sequence @@ #) &, #, {2}] &
(* {0->1,1->2,1->3,1->4,0->5,5->6,5->7,7->8,7->9,7->10,7->11,5->12} *)
(* if needed remove the zero node to get a forest *)
edgelistb = edgelist /. Rule[0, _] :> Sequence[]

and a long "one"-liner:

({0} //. (i_Integer :> {ToString[i], Rest[chldrnListF[Graph@edgelist][[i + 1]]]}) /. 
   x_String :> ToExpression[x] // Part[#, 1, 2] &) /. i_Integer :> outline[[i]];
% == answer
(* True *)

outline as a tree or a forest:

labels = MapIndexed[First[#2] -> Placed[Rotate[ToString[#1], 15 Degree], Below] &,    outline];
{grph, grphb} = {Graph[edgelist, VertexLabels -> Prepend[labels, {"0"}],
    ImagePadding -> 50, ImageSize -> 600], 
 Graph[edgelistb, VertexLabels -> labels, ImagePadding -> 50, ImageSize -> 400]};
Panel@Row[{grph, grphb}, Spacer[10]]

enter image description here

Details of the steps:

grph = Graph[edgelist];
tmp = If[MemberQ[rootsF[grph], 0],
  rootsF[grph] //. (i_Integer :> {ToString[i], Rest[chldrnListF[grph][[i + 1]]]}),
  rootsF[ grph] //. (i_Integer :> {ToString[i], Rest[chldrnListF[grph][[i]]]})] /.
x_String :> ToExpression[x]
(* {{"0",{{"1",{"2","3","4"}},{"5",{"6",{"7",{"8","9","10","11"}},"12"}}}}}*)
(* post process to transform strings  back to integers and get the appropriate part*)
treeList = If[MemberQ[rootsF[grph], 0], Part[#, 1, 2], #] &[tmp]
(*{{1,{2,3,4}},{5,{6,{7,{8,9,10,11}},12}}} *)
(* map back to outline *)
tocTreeList = treeList /. i_Integer :> outline[[i]]
(* {{{{"1"},"Setup code"},
    {{{"1","1"},"Local variables"},
    {{"1","2"},"Option variables"},
    {{"1","3"},"Check for null Input"}}},
  {{{"2"},"Argument Logic"},
   {{{"2","1"},"X Spec"},
   {{{"2","2"},"One Dimensional Case"},
    {{{"2","2","1"},"Tick and Label Generation"},
     {{"2","2","2"},"plot generation"},
     {{"2","2","3"},"bin names generation"},
     {{"2","2","4"},"bar chart generation"}}},
  {{"2","3"},"Y Spec"}}}} *)
 answer == tocTreeList
 (* True *)