Question about exercise 11.5 in TeXbook

For readers who can't find the line:

\def\\{\if\space\next\ % assume that \next is unexpandable

in their TeXbook, it is in the errata published by Donald E. Knuth (“page A311”).

Suppose you do:

\def\foobar{cat}

\noindent
\demobox{The \foobar\ in the hat}

After \next has grabbed the \foobar token, \\ will expand to something equivalent to:

\if\space\next\ %
\else \setbox0=\hbox{\next}\maketypebox\fi

with \next being \let-equal to \foobar. According to the documentation of \if (TeXbook p. 209), TeX is going to expand tokens following the \if until it finds two non-expandable ones. \space expands to an explicit space token in one step, so TeX goes on with \next (which has the same meaning as \foobar at this point), because it needs one more non-expandable token. After \next has been expanded, the input is equivalent to:

\if〈space token〉cat\ %
\else \setbox0=\hbox{\next}\maketypebox\fi

where 〈space token〉 represents an explicit space token (one could define a control sequence that is \let-equal to an explicit space token and use it instead of 〈space token〉, see footnote 1 below). Now, TeX has two non-expandable tokens following the \if: a space token and a c character token (of category 11 under the normal catcode regime). So, the outcome of the \if can be decided: it is false because the character codes of a 〈space token〉 and of c differ, so TeX will skip to the \else clause.

There is no big problem so far, though we are going to box the whole cat at once instead of each character separately (c, a, and t); but let's back up a little bit. Had we used:

\def\foobar{ cat}

the input would have been equivalent to:

\if〈space token〉〈space token〉cat\ %
\else \setbox0=\hbox{\next}\maketypebox\fi

The test would have been true and TeX would have left cat\ in the input stream, which is plain wrong, because we were supposed to test what we just grabbed in \next, not to insert new text!
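Both cases can be checked without the rest of the \demobox machinery. Here is a minimal plain TeX sketch (my own test file, not taken from the TeXbook). It relies on \message expanding its argument, so the terminal shows what the \if leaves behind; T and F are arbitrary markers for the true and false branches:

\def\foobar{cat}
\message{[\if\space\foobar T\else F\fi]}% terminal shows [F]
\def\foobar{ cat}
\message{[\if\space\foobar T\else F\fi]}% terminal shows [catT]
\bye

In the first case the test is false, so the leftover at is skipped along with the true branch. In the second case the test is true, and the leftover cat from the expansion of \foobar ends up in the output.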

So, the comment “assume that \next is unexpandable” could be rephrased more generally, in my humble opinion, as “assume that \next ultimately expands to either (1) exactly one character token or (2) exactly one \chardef token or (3) a control sequence token that is \let-equal to (1) or (2)” (see footnote 2 below; the character token in (1) is necessarily non-active, because of the “ultimately”). Indeed, you can test that \demobox works perfectly when \next recursively expands to a single character token, as in:

\def\myspacei{\myspace}
\def\myspace{\space}

\noindent
\demobox{Abc def\myspacei pU gHi}


Using \myspacei here gives the same result as using an explicit space token, because it recursively expands to such a token.

Here is another example that additionally uses a control sequence that recursively expands to a non-space character token:

\def\myspacei{\myspace}
\def\myspace{\space}

\def\myxii{\myxi}
\def\myxi{\myx}
\def\myx{X}

\noindent
\demobox{Abc def\myspacei pU\myxii gHi}

[Screenshot: output that also includes a non-space character token]
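The bare \if test can also be tried on macros that ultimately expand to exactly one token. This is a minimal plain TeX sketch, assuming only the definitions above plus a \chardef token named \mychar, which is a hypothetical name I introduce for illustration:

\def\myspacei{\myspace}
\def\myspace{\space}
\def\myxii{\myxi}
\def\myxi{\myx}
\def\myx{X}
\chardef\mychar=`X % hypothetical \chardef token, not part of the original macros
\message{[\if\space\myspacei T\else F\fi]}% terminal shows [T]
\message{[\if\space\myxii T\else F\fi]}% terminal shows [F]
\message{[\if\space\mychar T\else F\fi]}% terminal shows [F]
\bye

In all three cases the \if consumes exactly the two tokens it compares, so nothing extra leaks into the output.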

Your proposal:

\def\\{\expandafter\ifx\space\next\ %
...

would also work, as long as \next has been \let-equal to a space token (explicit or implicit). But it wouldn't work with input containing spaces in the form of macros like \space or our \myspacei macro defined above. Indeed, \ifx distinguishes between character tokens and macros (see specification of \ifx p. 210 of the TeXbook).
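To see the difference, here is a minimal plain TeX sketch of the \expandafter\ifx test in isolation. It assumes \stoken is made an implicit space token with the trick from footnote 1; the words same and different are arbitrary markers:

\def\\{\let\stoken= }\\ % \stoken is now an implicit space token
\message{[\expandafter\ifx\space\stoken same\else different\fi]}% terminal shows [same]
\message{[\expandafter\ifx\space\space same\else different\fi]}% terminal shows [different]
\bye

The first test compares an explicit space token with an implicit one, which \ifx treats as equal. The second compares an explicit space token with the macro \space, which \ifx treats as different because \ifx does not expand the two tokens it compares.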

Finally, although it would work, your replacement of \endlist with \end does not sound like the best coding style to me, because \end is an existing TeX primitive; Knuth chose something more “unique” to mark the end of the text to be worked on. Besides, the name \endlist was evidently chosen to match \dolist: it is a matter of consistency. See in particular:

\def\demobox#1{\setbox0=\hbox{\dolist#1\endlist}%
...

Footnotes

  1. You can define a control sequence \stoken that is \let-equal to an explicit space token like this:

    {\def\\{\global\let\stoken= }\\ }% now, \stoken is an implicit space token
    

    (adapted from the TeXbook p. 376). Two other ways are given in the TeXbook p. 336 (exercise 24.6):

    \def\\{\let\stoken= }\\ %
    

    and

    \def\\#1\\{}\futurelet\stoken\\ \\%
    
  2. This is in particular the case when \next has been \let-equal to a non-active character token—which is, I think, the case Knuth had in mind when he used the word “unexpandable” (indeed, a non-active character token, or a control sequence that has been \let-equal to such a token, never expands). In other words, the condition “\next is unexpandable” from the comment you quoted is a sufficient condition for ensuring that the macros behave sanely, and is only a particular case of the more general condition I gave. :-)


It means what it says: only unexpandable tokens are allowed in the argument to \demobox. More precisely, the argument may contain only character tokens or unexpandable control sequences that correspond (via \chardef or \let) to printable characters (including spaces).

If you try

\demobox{abc def}

\def\expandable{expandable token}

\demobox{\expandable}

you get

[screenshot of the typeset output]

which is probably not what you were expecting. On the other hand, a definition like

\def\expandable{ expandable token}

would yield something very far from the expectations.

So \demobox can be used only when its argument consists of unexpandable tokens, or of macros that each expand to a single unexpandable token.

\chardef tokens are allowed too, as are implicit character tokens. However, \bgroup is problematic even though it is an implicit character token: compare \demobox{A \bgroup AB\egroup} with \demobox{A AB} to see the issue.

You might want to extend \demobox in various ways, but that's not the object of the exercise.

About your suggestion to redefine \dolist to use \end instead of \endlist: you can do it, if you prefer. Knuth doesn't prefer it; instead he uses a control sequence that's specific to \dolist processing. Note that the definition of \endlist will produce an infinite loop whenever \endlist is expanded (probably by a mistake in the macros using \dolist), whereas \end wouldn't.
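As a minimal plain TeX sketch of why the self-referential sentinel is safe in normal use, assume the definition \def\endlist{\endlist} (which is what the infinite-loop remark above implies). The end-of-list check can then be done with \ifx, which compares meanings without expanding either token; stop and continue are arbitrary markers:

\def\endlist{\endlist}% assumed definition: expanding it would loop forever
\let\next=\endlist
\message{[\ifx\next\endlist stop\else continue\fi]}% terminal shows [stop]
\bye

The loop can only start if some macro expands \endlist instead of testing for it, which is exactly the kind of mistake the definition is meant to expose.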