What definition takes up more memory, \def or \chardef (considering \def contains a single character)?

The answer is simple: \chardef uses far fewer resources.

Compiling

%\def\usedef{\expandafter\def\csname\the\count255\endcsname{a}}
\def\usedef{\expandafter\chardef\csname\the\count255\endcsname=`a }

\count255=1
\loop\ifnum\count255<500000
\usedef
\advance\count255 1
\repeat

\tracingstats=1

\bye

I get

Here is how much of TeX's memory you used:
 499994 strings out of 2095124
 2888925 string characters out of 6220659
 5959 words of memory out of 5000000
 500917 multiletter control sequences out of 15000+600000
 14794 words of font info for 50 fonts, out of 8000000 for 9000
 14 hyphenation exceptions out of 8191
 5i,0n,1p,81b,6s stack positions out of 5000i,500n,10000p,200000b,80000s

Switching the comment between the first two lines, so that TeX performs \def instead of \chardef, the log shows

Here is how much of TeX's memory you used:
 499994 strings out of 2095124
 2888921 string characters out of 6220659
 1505944 words of memory out of 5000000
 500917 multiletter control sequences out of 15000+600000
 14794 words of font info for 50 fonts, out of 8000000 for 9000
 14 hyphenation exceptions out of 8191
 5i,0n,1p,82b,6s stack positions out of 5000i,500n,10000p,200000b,80000s

Note. I had to raise max_strings=10000000 in order to make room for the 500000 new control sequences.


As \chardef can only associate a control sequence with an integer value from 0 to 255 while \def can associate it with arbitrary token lists, the latter is more general, and it shouldn't be surprising that it is also more expensive. In fact, The TeXbook introduces \chardef as "for numbers between 0 and 255, as an efficient alternative to" \def, and indeed it is.
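A quick way to see what a \chardef'd token actually holds (the control sequence name \myA here is made up for illustration): it stores just an integer in the range 0 to 255, and can then act both as a number and as a character.

% A \chardef'd token stores an integer 0-255 and can be used
% both as a number and as a character.
\chardef\myA=`a   % equivalent to \chardef\myA=97
\count0=\myA      % used as the number 97
\showthe\count0   % the log shows: > 97.
The letter \myA.  % typesets "a"
\bye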


As a further refinement of @egreg's answer, which uses either \def or \chardef 500000 times and examines the stats, we can instead fix one of the two and compare the stats after using it (say) 1000 and 2000 times (or 1000 and 1001 times, or whatever), to see the additional (incremental) cost of each definition.
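Concretely, the incremental measurement can be done with a sketch like the following: run it once with the bound 1000 and once with 2000, then diff the \tracingstats output in the two logs.

% Incremental-cost experiment (sketch).
\count255=1
\loop\ifnum\count255<1000 % second run: 2000
\expandafter\chardef\csname x\the\count255\endcsname=`a %
\advance\count255 1
\repeat

\tracingstats=1
\bye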

What you'll find is that for either \def\testName{a} or \chardef\testName`\a, the cost in common (for each such definition) is:

  • 1 string entry (namely, the new string testName),

  • 8 string characters (to store the string t e s t N a m e; in general, as many as the length of the name),

  • 1 "multiletter control sequence" (assuming your name was of length greater than 1).

(Note that TeX has some optimizations to avoid storing things that already exist, so in some cases it may be able to avoid some of this cost, but not in general.) For \chardef\testName`\a this is the only cost, but for each \def\testName{a} there is the above cost and also a cost of:

  • 3 words of memory

The reason for this is not hard to see. All that \chardef has to do is store an integer value in a table. For instance, your \chardef\testName`\a is the same as \chardef\testName=97. (Similarly, plain.tex uses \chardef\active=13.) So when you write \chardef\testName=97, the bookkeeping that TeX does internally is roughly the following:

  • in the string pool: append eight characters testName.
  • in the table of string starts (the list of strings itself is not stored explicitly): point from the new string number (the previous highest, incremented by 1) to the starting position of the above 8 characters in the string pool
  • in the list of control sequences: point from the hash of testName to the above string number
  • in the (internal) table of equivalents (eqtb), which (among other things) stores the meanings of control sequences: store a new entry with
    • eq_level the current/appropriate level,
    • eq_type the special code char_given (saying that the meaning of \testName is that it's a chardef'd number),
    • equiv being the value itself (here, the integer 97).
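You can observe the char_given meaning directly: \show reports a \chardef'd token as \char with the stored value.

\chardef\testName=`a % same as \chardef\testName=97
\show\testName       % the log shows: > \testName=\char"61.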

You can see this in §1224 of the program:

[screenshot of the \chardef code from §1224 of tex.web]

With \def, on the other hand, TeX internally has to create a token list containing the token a (i.e. catcode 11 = letter, character code 97). A macro's token list starts with a reference-count word and, even when there are no parameters, contains an end_match word separating the (empty) parameter text from the replacement text; together with the one word for the token a itself, that explains the three words of memory used.
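Contrast this with what \show reports for the macro, which is precisely the stored token list:

\def\testName{a}
\show\testName % the log shows: > \testName=macro:->a.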


So, to summarize the difference, just look at the following two cases (with apologies that the diagram was not made with TikZ): storing the integer 97 directly in the eqtb entry is free (it requires no extra space), while storing a pointer to a 3-word token list requires those three words of space.

[hand-drawn diagram comparing the eqtb entry of a \chardef'd token (integer stored in place) with that of a macro (pointer to a 3-word token list)]