Procedures on Functions That Remember Values They Have Found

Clear["Global`*"]

f[x_] := f[x] = f[x - 1] + f[x - 2];
f[1] = f[2] = 1;
f[5]
?f

(* 5 *)

enter image description here

Note the saved definitions

Trace[f[6]]

enter image description here

The Trace shows that lookup values were used

ClearAll[f]
f[x_] := f[x - 1] + f[x - 2];
f[1] = f[2] = 1;
f[5]
?f

(* 5 *)

enter image description here

Note that the only saved values are the two initial values.

Trace[f[6]]

enter image description here

The second Trace is much more extensive since there are no saved values other than the two initial values.

EDIT: One approach to implement a pseudo-Trace for this

ClearAll[f, f2, trace]

f[x_] := f[x - 1] + f[x - 2]
f[1] = f[2] = 1;

Format[f2] := f

f2[x_] := Inactive[Plus][Inactive[f2][x - 1], Inactive[f2][x - 2]]
f2[1] = f2[2] = 1;

trace[n_] := 
 Module[{sum = NestList[Activate[#, f2] &, Inactive[f2][n], n - 1]},
  Append[sum, sum[[-1]] // Activate]]

n = 6;

trace[n]

enter image description here

%[[-1]] == f[n]

(* True *)

For purposes of discussion, rather than calling both variations f let us call the memoizing version f and the non-memoizing version g:

ClearAll[f]
f[x_] := f[x] = f[x - 1] + f[x - 2];
f[1] = f[2] = 1;

ClearAll[g]
g[x_] := g[x - 1] + g[x - 2];
g[1] = g[2] = 1;

f[5] with no prior memoization vs g[5]

If we show the traces of the f[5] and g[5] as graphs, we see this (click to enlarge):

f[5] vs. g[5]

The diagrams show 33 evaluation steps for f and 34 for g. In actual fact there are a total of 88 steps for both, but the diagrams suppress inert expressions for readability. Even though the two functions perform the same number of steps, they are qualitatively different. Close inspection will reveal, for example, that the value for f[3] is saved and reused, whereas the value for g[3] is calculated twice. But in this small example, the savings are not apparent because all of the assignments performed by f offset the extra calculations.

f[6] with prior memoization vs g[6]

But the difference is much more apparent if we subsequently calculate f[6] and g[6], where we retain the memoized f results from the previous evaluation:

f[6] vs g[6]

This time, f has far fewer steps due to the reuse of previously calculated results.

f[10] with no prior memoization vs g[10]

The fact that f and g both required 88 steps in the first example was but a coincidence. We a significant different if we clear the definitions of f (so as to forget the memoized values) and then calculate a larger example, f[10] and g[10]:

f[10] vs. g[10]

Labels are suppressed to keep the diagram manageable, f is on the left. With this larger example, the payoff of memoization is much more apparent. f requires 89 non-inert steps vs. 433 for g (228 vs 1138 including inerts).


Update

f[6] with no prior memoization vs g[6]

By request in the comments, here are the graphs for f[6] and g[6] with no prior memoization. They show 45 steps for f and 57 for g (116 and 151 steps respectively including inerts).

f[6] no memoization vs. g[6]