Constructing higher order transition probability matrix

Update: Using EmpiricalDistribution and MarginalDistribution to compute the conditional probabilities:

ClearAll[transitionProb]
transitionProb[step_: 1][x_] := Module[{states = DeleteDuplicates@x, 
   ed = EmpiricalDistribution[Partition[ArrayComponents @ x, step + 1, 1]], 
   ordering, tuples, md, condpdF},
  ordering = Ordering[states]; tuples = Tuples[ordering, step];
  md = MarginalDistribution[ed, Range[step]];
  condpdF[u__, w_] := If[PDF[md, {u}] === 0, 0, PDF[ed, {u, w}]/PDF[md, {u}]];
  Prepend[{Row @ states[[{##}]], 
      ## & @@ Table[## & @@ condpdF[##, i], {i, ordering}]} & @@@ tuples, 
   Prepend[states[[ordering]], ""]]]

Examples:

transitionProb[2][x] // Grid[#, Dividers -> All] & // TeXForm

$\begin{array}{|c|c|c|c|c|c|} \hline \text{} & \text{A} & \text{B} & \text{C} & \text{D} & \text{E} \\ \hline \text{A}\text{A} & \frac{3}{5} & 0 & \frac{1}{5} & 0 & \frac{1}{5} \\ \hline \text{A}\text{B} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{A}\text{C} & 0 & 0 & 1 & 0 & 0 \\ \hline \text{A}\text{D} & \frac{1}{2} & \frac{1}{2} & 0 & 0 & 0 \\ \hline \text{A}\text{E} & 0 & 0 & 0 & 1 & 0 \\ \hline \text{B}\text{A} & 0 & 0 & 0 & 1 & 0 \\ \hline \text{B}\text{B} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{B}\text{C} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{B}\text{D} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{B}\text{E} & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} \\ \hline \text{C}\text{A} & 0 & 0 & 0 & 1 & 0 \\ \hline \text{C}\text{B} & 0 & 0 & 0 & 0 & 1 \\ \hline \text{C}\text{C} & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 \\ \hline \text{C}\text{D} & 0 & 0 & 0 & 1 & 0 \\ \hline \text{C}\text{E} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{D}\text{A} & 1 & 0 & 0 & 0 & 0 \\ \hline \text{D}\text{B} & \frac{1}{2} & 0 & 0 & 0 & \frac{1}{2} \\ \hline \text{D}\text{C} & 0 & 1 & 0 & 0 & 0 \\ \hline \text{D}\text{D} & 0 & 0 & \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \hline \text{D}\text{E} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{E}\text{A} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{E}\text{B} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{E}\text{C} & 1 & 0 & 0 & 0 & 0 \\ \hline \text{E}\text{D} & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\ \hline \text{E}\text{E} & 0 & 0 & 0 & \frac{1}{2} & \frac{1}{2} \\ \hline \end{array}$

transitionProb[1][x] // Grid[#, Dividers -> All] & // TeXForm

$\begin{array}{|c|c|c|c|c|c|} \hline \text{} & \text{A} & \text{B} & \text{C} & \text{D} & \text{E} \\ \hline \text{A} & \frac{5}{9} & 0 & \frac{1}{9} & \frac{2}{9} & \frac{1}{9} \\ \hline \text{B} & \frac{1}{3} & 0 & 0 & 0 & \frac{2}{3} \\ \hline \text{C} & \frac{1}{5} & \frac{1}{5} & \frac{2}{5} & \frac{1}{5} & 0 \\ \hline \text{D} & \frac{1}{8} & \frac{1}{4} & \frac{1}{8} & \frac{3}{8} & \frac{1}{8} \\ \hline \text{E} & 0 & 0 & \frac{1}{5} & \frac{2}{5} & \frac{2}{5} \\ \hline \end{array}$

Original answer:

states = DeleteDuplicates[x];
ordering = Ordering[states]; 
data = ArrayComponents@x ;
estproc = EstimatedProcess[data, DiscreteMarkovProcess[Length@states]];
tuples = Tuples[Range[5][[ordering]], {2}];
table = {Row@states[[{##}]], ## & @@ 
      Table[Probability[p[3] == s \[Conditioned] p[1] == # && p[2] == #2, 
        p \[Distributed] estproc], {s, Range[Length @ states]}]} & @@@ tuples ;

TeXForm @ Grid[Prepend[table, Prepend[states[[ordering]], ""]], Dividers -> All]

$\begin{array}{|c|c|c|c|c|c|} \hline \text{} & \text{A} & \text{B} & \text{C} & \text{D} & \text{E} \\ \hline \text{AA} & \frac{5}{9} & \frac{1}{9} & \frac{2}{9} & \frac{1}{9} & 0 \\ \hline \text{AB} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{AC} & \frac{1}{5} & 0 & \frac{1}{5} & \frac{2}{5} & \frac{1}{5} \\ \hline \text{AD} & \frac{1}{8} & \frac{1}{8} & \frac{3}{8} & \frac{1}{8} & \frac{1}{4} \\ \hline \text{AE} & 0 & \frac{2}{5} & \frac{2}{5} & \frac{1}{5} & 0 \\ \hline \text{BA} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{BB} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{BC} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{BD} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{BE} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{CA} & \frac{5}{9} & \frac{1}{9} & \frac{2}{9} & \frac{1}{9} & 0 \\ \hline \text{CB} & \frac{1}{3} & \frac{2}{3} & 0 & 0 & 0 \\ \hline \text{CC} & \frac{1}{5} & 0 & \frac{1}{5} & \frac{2}{5} & \frac{1}{5} \\ \hline \text{CD} & \frac{1}{8} & \frac{1}{8} & \frac{3}{8} & \frac{1}{8} & \frac{1}{4} \\ \hline \text{CE} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{DA} & \frac{5}{9} & \frac{1}{9} & \frac{2}{9} & \frac{1}{9} & 0 \\ \hline \text{DB} & \frac{1}{3} & \frac{2}{3} & 0 & 0 & 0 \\ \hline \text{DC} & \frac{1}{5} & 0 & \frac{1}{5} & \frac{2}{5} & \frac{1}{5} \\ \hline \text{DD} & \frac{1}{8} & \frac{1}{8} & \frac{3}{8} & \frac{1}{8} & \frac{1}{4} \\ \hline \text{DE} & 0 & \frac{2}{5} & \frac{2}{5} & \frac{1}{5} & 0 \\ \hline \text{EA} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{EB} & 0 & 0 & 0 & 0 & 0 \\ \hline \text{EC} & \frac{1}{5} & 0 & \frac{1}{5} & \frac{2}{5} & \frac{1}{5} \\ \hline \text{ED} & \frac{1}{8} & \frac{1}{8} & \frac{3}{8} & \frac{1}{8} & \frac{1}{4} \\ \hline \text{EE} & 0 & \frac{2}{5} & \frac{2}{5} & \frac{1}{5} & 0 \\ \hline \end{array}$


The following code is just brute-force. But at least yields the expected results. Also, it can be used for any order.

The first parameter is the data. The second parameter is the order.

probM[data_, ord_] := 
 Module[{uniques = Union[data], acc = 0, len, trans, trPre, tData, 
   toCount, toGather, toNormalize},
  trans = Dispatch@Thread[uniques -> Range[len = Length[uniques]]];
  trPre = Dispatch@Flatten[Array[{##} -> ++acc &, ConstantArray[len, ord]]];
  tData = Replace[data, trans, {1}];
  toCount = Partition[tData, ord + 1, 1];
  toGather = Map[{Replace[#[[1, ;; -2]], trPre], #[[1, -1]]} -> #[[2]] &, 
    Tally[toCount]];
  toNormalize = GatherBy[toGather, #[[1, 1]] &];
  SparseArray[
   Flatten@Map[
     With[{tot = 1/Plus @@ #[[All, 2]]}, 
       Map[#[[1]] -> #[[2]] tot &, #]] &, toNormalize]]];

Let us check the dimensions of the first three orders.

Table[probM[x, i] // Dimensions, {i, 3}]
(*{{5, 5}, {25, 5}, {125, 5}}*)

As for the efficiency of probM, I tried replacing some of the Map with ParallelMap but it did not yield any improvement. You might want to combine with niceties from the other answer. For example, use ArrayComponents instead of dispatch tables.

In any case, check the second order table:

$$ \begin{array}{cccccc} \text{} & \text{A} & \text{B} & \text{C} & \text{D} & \text{E} \\ \text{AA} & \frac{3}{5} & 0 & \frac{1}{5} & 0 & \frac{1}{5} \\ \text{AB} & 0 & 0 & 0 & 0 & 0 \\ \text{AC} & 0 & 0 & 1 & 0 & 0 \\ \text{AD} & \frac{1}{2} & \frac{1}{2} & 0 & 0 & 0 \\ \text{AE} & 0 & 0 & 0 & 1 & 0 \\ \text{BA} & 0 & 0 & 0 & 1 & 0 \\ \text{BB} & 0 & 0 & 0 & 0 & 0 \\ \text{BC} & 0 & 0 & 0 & 0 & 0 \\ \text{BD} & 0 & 0 & 0 & 0 & 0 \\ \text{BE} & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} \\ \text{CA} & 0 & 0 & 0 & 1 & 0 \\ \text{CB} & 0 & 0 & 0 & 0 & 1 \\ \text{CC} & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 \\ \text{CD} & 0 & 0 & 0 & 1 & 0 \\ \text{CE} & 0 & 0 & 0 & 0 & 0 \\ \text{DA} & 1 & 0 & 0 & 0 & 0 \\ \text{DB} & \frac{1}{2} & 0 & 0 & 0 & \frac{1}{2} \\ \text{DC} & 0 & 1 & 0 & 0 & 0 \\ \text{DD} & 0 & 0 & \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \text{DE} & 0 & 0 & 0 & 0 & 0 \\ \text{EA} & 0 & 0 & 0 & 0 & 0 \\ \text{EB} & 0 & 0 & 0 & 0 & 0 \\ \text{EC} & 1 & 0 & 0 & 0 & 0 \\ \text{ED} & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\ \text{EE} & 0 & 0 & 0 & \frac{1}{2} & \frac{1}{2} \\ \end{array} $$


As a variant of my answer to the linked question, the following should work correctly and efficiently.

Some random data to work with:

x = RandomChoice[Alphabet["English", "IndexCharacters"], 1000000];

Creating the probability tensor P:

n = 2;
data = Flatten[ToCharacterCode[x]] - (ToCharacterCode["A"][[1]] - 1); // AbsoluteTiming // First
A = With[{spopt = SystemOptions["SparseArrayOptions"]}, 
     Internal`WithLocalSettings[
      (*switch to additive assembly*)
      SetSystemOptions["SparseArrayOptions" -> {"TreatRepeatedEntries" -> Total}],

      (*assemble matrix*)
      SparseArray[Partition[data, n + 1, 1] -> 1, ConstantArray[Max[data], n + 1] ],

      (*reset "SparseArrayOptions" to previous value*)
      SetSystemOptions[spopt]]]; // AbsoluteTiming // First
P = #/N[Total[Abs[#], {n + 1}] /. 0 -> 1] &@Flatten[A, n - 1];

0.717521

0.184357

The row labels of P should be

Tuples[Sort[DeleteDuplicates[x]], n]