What exactly do \csname and \endcsname do?

Normally, control sequence names are made only of letters or of one non-letter character.

A letter is, more precisely, a character having category code 11 at the moment the control sequence name is read. So, any character can become part of a control sequence name, provided we change its catcode before the definition and each usage.

With \csname...\endcsname we are freed from this limitation, and any character can go between them to form a control sequence name (except %, of course, because it is discarded together with the rest of the line before TeX ever forms tokens from the characters).
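
As a small illustration, here is a sketch of a control sequence whose name contains spaces, a digit and an exclamation mark. The name "my macro 2!" is made up for this example; the \expandafter is there so that \def sees the finished token.

\documentclass{article}
\begin{document}
% Define a macro whose name is "my macro 2!" (hypothetical name)
\expandafter\def\csname my macro 2!\endcsname{Hello}
% Such a name can only be reached through \csname...\endcsname
\csname my macro 2!\endcsname
\end{document}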

However, this is not the main purpose of \csname...\endcsname. The construction is used to build commands from "variable parts". Think, for instance, of LaTeX's \newcounter: after \newcounter{foo}, TeX knows \thefoo, which is built precisely in this way. Roughly, what LaTeX does is

\newcommand{\newcounter}[1]{%
   \expandafter\newcount\csname c@#1\endcsname
   \expandafter\def\csname the#1\endcsname{\arabic{#1}}%
 }

so that \newcounter{foo} does the right job. It's more complicated than this, of course, but the main things are here; \newcount is the low-level command to allocate a counter. The \expandafter makes TeX build the control sequence first, so that \newcount and \def see the finished token rather than \csname itself.
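
The same pattern works for user-level macros. The following is only a sketch: \definegreeting and the "greet" prefix are hypothetical names chosen for the example. Without the \expandafter, \def would try to redefine \csname itself.

\documentclass{article}
% Hypothetical helper: \definegreeting{<name>}{<text>} defines \greet<name>
\newcommand{\definegreeting}[2]{%
  \expandafter\def\csname greet#1\endcsname{#2}%
}
\definegreeting{alice}{Hello, Alice!}
\begin{document}
% \greetalice was built from the "variable part" alice
\greetalice
\end{document}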

Inside \csname...\endcsname, category codes don't matter (with one main exception: active characters will be expanded if not preceded by \string, see final note). LaTeX exploits this in order to build control sequence names that users won't be able to access (easily). For example, the control sequence to choose the default ten point font is \OT1/cmr/m/n/10, which can be easily split internally (by the "reverse" operation that is \string) and is not available to the casual user.

Another important use is in environments: when you say \newenvironment{foo}, LaTeX really defines \foo and \endfoo. Upon finding \begin{foo}, LaTeX does some bookkeeping and then executes \csname foo\endcsname (that's why one can also say \newenvironment{foo*}); similarly, at \end{foo} LaTeX executes \csname endfoo\endcsname and then does some more bookkeeping.
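
To see the two macros behind a starred environment, one can write their meanings to the log file. This is a minimal sketch; the environment body (a quote in italics) is an arbitrary choice for the example.

\documentclass{article}
\newenvironment{foo*}{\begin{quote}\itshape}{\end{quote}}
\begin{document}
\begin{foo*}
Some quoted text.
\end{foo*}
% Write the meaning of the two generated macros to the log file;
% their names contain *, so they are reachable only via \csname
\typeout{\expandafter\meaning\csname foo*\endcsname}
\typeout{\expandafter\meaning\csname endfoo*\endcsname}
\end{document}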

Other uses: \label{foo} will define control sequences based on foo via \csname...\endcsname that can be used by \ref.

When one says \csname foo\endcsname, TeX checks whether \foo is defined; if it is not, TeX makes \foo equivalent to \relax (respecting grouping), so this and every later use of \foo executes \relax. An interesting use of this feature is that one can say

\chapter*{Introduction}
\csname phantomsection\endcsname
\addcontentsline{toc}{chapter}{Introduction}

and keep hyperref happy if it's loaded, while doing nothing if the package is not loaded.
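
The same feature gives the classic test for whether a command exists: build the name with \csname and compare the result with \relax. The sketch below assumes the article class, where \phantomsection is not defined unless hyperref is loaded. Note that a command that has been explicitly \let to \relax also counts as undefined under this test.

\documentclass{article}
% \usepackage{hyperref} % uncomment to take the other branch
\begin{document}
% \expandafter builds the token first; \ifx then compares it with \relax
\expandafter\ifx\csname phantomsection\endcsname\relax
  \typeout{phantomsection is not available}%
\else
  \typeout{phantomsection is available}%
\fi
\end{document}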

It's possible to give many other interesting uses of this trick. But one should always keep in mind that TeX performs complete expansion of what it finds between \csname and \endcsname and that only character tokens may remain. So

\csname abc\relax def\endcsname

is forbidden. But, after \def\xyz{abc},

\csname \xyz def\endcsname

will be legal and equivalent to saying \csname abcdef\endcsname or \abcdef.
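
A short sketch of this expansion at work; the names \abcdef and \itemiii exist only for the example. Anything that expands to characters is allowed, including \romannumeral.

\documentclass{article}
\def\xyz{abc}
% \xyz expands to abc, so this defines \abcdef
\expandafter\def\csname \xyz def\endcsname{first}
% \romannumeral 3 expands to iii, so this defines \itemiii
\expandafter\def\csname item\romannumeral 3\endcsname{third}
\begin{document}
\abcdef, \itemiii
\end{document}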

Final note

A few more words about category codes are in order. An active character in \csname...\endcsname will be expanded, so to get a literal ~ one has to write \string~. Comment (category 14), ignored (category 9) and invalid (category 15) characters will remain such. So

\csname %\endcsname

will give an error (Missing \endcsname); in \csname ^^@\endcsname there will be no character and \csname ^^?\endcsname will raise an error.
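
For the active character case, a minimal sketch (the name "a~b" is made up for the example):

\documentclass{article}
\begin{document}
% \string turns the active ~ into an ordinary character token
\expandafter\def\csname a\string~b\endcsname{name with a tilde}
\csname a\string~b\endcsname
\end{document}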


For reference, from the TeXbook (with slight formatting changes), Chapter 7: How TeX Reads What You Type (p. 40):

...you can go from a list of character tokens to a control sequence by saying \csname<tokens>\endcsname. The tokens that appear in this construction between \csname and \endcsname may include other control sequences, as long as those control sequences ultimately expand into characters instead of TeX primitives; the final characters can be of any category, not necessarily letters. For example, \csname TeX\endcsname is essentially the same as \TeX; but \csname\TeX\endcsname is illegal, because \TeX expands into tokens containing the \kern primitive. Furthermore, \csname\string\TeX\endcsname will produce the unusual control sequence \\TeX, i.e., the token <\\TeX>, which you can't ordinarily write.

I have used this indirectly by using the \label-\ref system and defining labels based on counters:

\newcounter{mycount}
%...
\newcommand{\mycmd}{%
  \stepcounter{mycount}%
  \label{abc\themycount}%
  %...
}

This creates successive labels abc1, abc2, ... for every call to \mycmd, in order to avoid multiply defined labels with the same name. Indirectly, \label{abc\themycount} calls \@namedef{r@abc\themycount}, which calls

\expandafter\def\csname r@abc\themycount\endcsname

thereby expanding r@abc\themycount to r@abc1 and defining \r@abc1 for the first label, \r@abc2 for the second label, etc. Yes, labels in LaTeX are actually control sequences prefixed with r@; they are constructed using \csname ... \endcsname, which is what allows numerals in the name.
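
The same mechanism can be used directly, outside the \label-\ref system. The following is only a sketch: \storenote, \usenote and the note@ prefix are hypothetical names, while \@nameuse is the kernel shorthand for \csname...\endcsname.

\documentclass{article}
\newcounter{mycount}
\makeatletter
% Hypothetical helper: store a text under a name numbered by mycount
\newcommand{\storenote}[1]{%
  \stepcounter{mycount}%
  \expandafter\gdef\csname note@\themycount\endcsname{#1}%
}
% Hypothetical helper: retrieve note number #1
\newcommand{\usenote}[1]{\@nameuse{note@#1}}
\makeatother
\begin{document}
\storenote{first note}\storenote{second note}
\usenote{2}, \usenote{1}
\end{document}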


Suppose you want to define a command \foo2. You cannot do this directly because 2 is not a letter. However, this construction works: \csname foo2\endcsname. Sometimes this is useful, e.g. when you need a series of commands \foo1, \foo2, etc. (another way is to use roman numerals). As another example, suppose you want to define a series of commands like \endsection, \endsubsection, etc. Then you can use a loop with \expandafter\def\csname end#1\endcsname...
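
A possible way to write such a loop is sketched below. It uses the LaTeX kernel's \@for; the helper \defineend and the text each \end... command prints are assumptions made for the example.

\documentclass{article}
\makeatletter
% Hypothetical helper: \defineend{<name>} defines \end<name>
\newcommand{\defineend}[1]{%
  \expandafter\def\csname end#1\endcsname{\par[end of #1]\par}%
}
% Loop over a comma-separated list with the kernel's \@for;
% \tmp is expanded once so that \defineend receives the plain name
\@for\tmp:=section,subsection,subsubsection\do{%
  \expandafter\defineend\expandafter{\tmp}%
}
\makeatother
% A numbered series, reachable only through \csname...\endcsname
\expandafter\def\csname foo1\endcsname{one}
\expandafter\def\csname foo2\endcsname{two}
\begin{document}
\section{A}
Text.\endsection
\csname foo1\endcsname, \csname foo2\endcsname
\end{document}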