Dynamic Programming: Why is Knuth's improvement to Optimal Binary Search Tree O(n^2)?

You're correct that the distance from r[i, j - 1] to r[i + 1, j] is not constant in the worst case, but it is constant on average, which suffices to imply a quadratic running time. For a fixed subproblem length l, the total number of inner-loop iterations is

  S = sum_{i = 1}^{n - l + 1} (r[i + 1, j] + 1 - r[i, j - 1]),  j = i + l - 1
    = sum_{i = 1}^{n - l + 1} (r[i + 1, i + l - 1] + 1 - r[i, i + l - 2])
    = r[n - l + 2, n] + n - l + 1 - r[1, l - 1]

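by simplifying the telescoping sum. Since every root index lies between 1 and n, S is at most 2n, i.e. O(n) for each length l. Summing over the n possible lengths gives O(n^2) iterations in total for the O(n^2) subproblems, so each subproblem costs O(1) amortized, even though a single subproblem may scan many candidate roots.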
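The telescoping bound can be checked empirically. Below is a minimal sketch, not a reference implementation: it assumes the simplified version of the problem with only successful-search frequencies, 1-based indices, and ties broken toward the smallest root. The names `optimal_bst_knuth`, `freq`, and `per_length` are made up for illustration. It counts the inner-loop iterations separately for each length l so they can be compared against r[n - l + 2, n] + n - l + 1 - r[1, l - 1].

```python
def optimal_bst_knuth(freq):
    """Optimal BST cost using Knuth's restricted root range.

    Assumption: only successful-search frequencies are given (freq[i-1]
    belongs to key i). Returns (cost, root, per_length), where
    per_length[l] is the number of inner-loop iterations spent on
    subproblems of length l.
    """
    n = len(freq)

    # prefix[j] - prefix[i - 1] is the total frequency of keys i..j.
    prefix = [0] * (n + 1)
    for i in range(1, n + 1):
        prefix[i] = prefix[i - 1] + freq[i - 1]

    # cost[i][j] stays 0 for empty ranges (j < i); the extra row and
    # column make cost[i][i - 1] and cost[j + 1][j] valid lookups.
    cost = [[0] * (n + 2) for _ in range(n + 2)]
    root = [[0] * (n + 2) for _ in range(n + 2)]

    # Base case: a single key is its own root.
    for i in range(1, n + 1):
        cost[i][i] = freq[i - 1]
        root[i][i] = i

    per_length = {}
    for length in range(2, n + 1):
        count = 0
        for i in range(1, n - length + 2):
            j = i + length - 1
            weight = prefix[j] - prefix[i - 1]
            best = None
            # Knuth's bounds: r[i, j - 1] <= r[i, j] <= r[i + 1, j].
            for k in range(root[i][j - 1], root[i + 1][j] + 1):
                count += 1
                candidate = cost[i][k - 1] + cost[k + 1][j] + weight
                if best is None or candidate < best:
                    best = candidate
                    root[i][j] = k
            cost[i][j] = best
        per_length[length] = count

    return cost[1][n], root, per_length
```

For example, `optimal_bst_knuth([34, 8, 50])` returns a cost of 142, with 4 iterations at l = 2 and 3 iterations at l = 3, matching the closed form above in both cases.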


You can find the exact running-time analysis with a Google search, or work it out yourself from the nested for loops. The key point in every such analysis is that the inner-loop counts form a telescoping sum: a single inner loop may be long, but for each iteration of the outermost loop the inner loops together take O(n), so the total is O(n^2).