Delimiting a macro argument with the macro itself

With the comments inconclusive, I looked again for an explanation in the TeXbook, without success. The search did, however, turn up another example: in the answer to exercise 24.6, Knuth shows how to make the control sequence \cs an implicit space using \futurelet.

\def\\#1\\{}\futurelet\cs\\ \\

You can try this out with the following minimal example:

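\tt
% Make \cs an implicit space token
\def\\#1\\{}\futurelet\cs\\ \\
% \cs acts as a space between "a." and "b.c"
a.\cs b.c
% For comparison, an empty macro contributes nothing
% (plain TeX already defines \empty this way)
\def\empty{}
a.\empty b.c
\bye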
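
The reason this works can be traced token by token. The sketch below is my own reading of the mechanism, not Knuth's wording, and the \message line is an addition for checking the result on the terminal:

```tex
% \\ takes one argument delimited by \\ itself and expands to nothing.
\def\\#1\\{}
% A space after the control symbol \\ is not skipped, so the tokens
% read next are: \futurelet \cs \\ <space> \\
% \futurelet makes \cs equal to the token that follows the first \\
% (the space). Then \\ runs, absorbs the space as #1 up to the
% closing \\, and expands to nothing.
\futurelet\cs\\ \\
% Report the meaning of \cs; it should read "blank space".
\message{\meaning\cs}
\bye
```

Nothing is typeset by this file; the only visible effect is the message on the terminal and in the log.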

The obvious next port of call was the original TeX source code itself. Since Pascal is a typed language, my suspicion was that using delimiters in this manner would conserve one string name.

Strings are defined in Part 4, String handling. All strings (as a matter of fact, almost everything in TeX) are translated to integers and indexed. As far as TeX is concerned, the name of a primitive or macro and the text of an error message are no different: they are all placed in str_pool.

When TANGLE, a program of the original WEB system, processed the TEX.WEB file, it output a Pascal program, TEX.PAS (modern distributions instead translate the WEB source to C with Web2C), and also a string file called TEX.POOL, which held all the strings used. The INITEX program read the latter file, in which each string appeared as a two-digit decimal length followed by the string itself, and recorded the information in TeX's string memory. INITEX would later produce a binary format file that the TeX engine can subsequently read at high speed. (You can view the TEX.POOL file by searching for it in your distribution.)

Given LaTeX's history, it made sense to use delimiters in this manner as it conserved one string.

I haven't looked carefully at the scanning routines for user-defined strings during normal typesetting, i.e., when the source is not being processed by INITEX. In all probability the same mechanism is used, in which case the technique would save some memory, however little.

Whether one should use delimiters in such a manner is debatable, as it tends to obscure the code; perhaps this is why they are generally absent from packages. Instead, two or three string names should be reserved for this purpose. If I were to make one exception, it would be for macros similar to the one Knuth defined:

\def\\#1\\{}

It has a symmetry which one could argue has a certain beauty!
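
For contrast, the more common convention mentioned above, reserving a dedicated name as the delimiter, might look like the following plain TeX sketch. The names \firstword and \enddelim are hypothetical, chosen only for illustration:

```tex
% \enddelim is a hypothetical name used purely as a delimiter;
% it is never defined and never executed.
% #1 is delimited by the first space, #2 by \enddelim.
\def\firstword#1 #2\enddelim{#1}
% Writes the first word of the phrase to the terminal and log.
\message{\firstword hello brave world\enddelim}
\bye
```

Here the delimiter costs one extra name, but a reader can tell at a glance where the argument ends.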

Special thanks to everyone who posted comments, and especially to Lev Bishop for the additional example.


In my opinion it does provide some advantage, specifically for capturing the contents that follow an argument-less macro when that macro is used repeatedly. One practical example is taken from "Order items in enumerate environment automatically", where the repeated use of \item in a list environment provides a means to delineate (or parameterize) the "argument" used. In general, an item in a list has the form

%...
\item <some stuff>
\item <some more stuff>
%...

or, more aptly written as

%...
\item <some stuff> \item
<some more stuff>
%...

which allows each item to be captured using

\def\item#1\item{<do something with #1>}

Of course, you have to find a way to manage the end of the list, since the final item has no closing \item to match the parameter text of the newly defined \item. That can be achieved by capturing the entire contents of the environment with the help of the environ package and appending \item at the end. More details on this are in the linked post.
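
To make the idea concrete, here is a minimal sketch of my own; it is not the code from the linked post. It assumes the environ package, a non-empty list without nested lists, and two hypothetical names: the environment bracketlist and the end marker \enditems. Each captured item is simply typeset in square brackets on its own paragraph:

```tex
\documentclass{article}
\usepackage{environ}

\makeatletter
% Hypothetical end marker; it is only compared against and gobbled,
% never expanded.
\def\enditems{\enditems}
\NewEnviron{bracketlist}{%
  % \item is delimited by the next \item, so #1 is one item's text.
  \long\def\item##1\item{%
    \par[\ignorespaces##1\unskip]%
    % The delimiter has been consumed. Stop if the end marker follows,
    % otherwise put \item back so the next item is captured too.
    \@ifnextchar\enditems{\@gobble}{\item}}%
  % Append a closing \item for the last item, then the end marker.
  \BODY\item\enditems
}
\makeatother

\begin{document}
\begin{bracketlist}
  \item first
  \item second
  \item third
\end{bracketlist}
\end{document}
```

The redefinition of \item is local to the environment, so ordinary lists elsewhere in the document are unaffected. Replacing the bracket line with code that stores #1 instead of typesetting it would be the natural starting point for sorting or otherwise post-processing the items.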