About catcodes and active characters

When one defines a command with an optional argument, say (simplified)

\newcommand{\includegraphics}[2][]{...}

where ... is code to evaluate the material in the optional argument and in the mandatory one, the syntax for \includegraphics is either

\includegraphics[<options>]{filename}

or

\includegraphics{filename}

When \includegraphics is called, TeX looks at the next token. If it is a [ with category code 12 (written [12 below, the number denoting the category code), one path is taken and, basically, a macro defined with delimited arguments is called. Otherwise another macro with a standard argument is called.

So if you change the category code of [ to 13 (active), the call

\includegraphics[<options>]{filename}

will result in TeX not taking the first path, because a [12 doesn't follow. Instead the single-argument macro is called, so the active [ becomes the file name of the graphics to include.

Conclusion: don't make [ active, and of course not ] either.
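The effect can be reproduced without any graphics file. The following is a minimal sketch, not taken from the original answer: \demo is a hypothetical two-argument command standing in for \includegraphics, and the definition of the active [ as "blah" is only for illustration.

```latex
\documentclass{article}
% \demo is a hypothetical stand-in for \includegraphics
\newcommand{\demo}[2][none]{option=(#1), argument=(#2)}
\begin{document}
% Normal catcodes: [ has category code 12, the optional argument is found.
\demo[x]{y}

\begingroup
  \catcode`\[=\active
  \def[{blah}%
  % The test for [12 fails, so the default "none" is used and the
  % active [ itself is grabbed as the mandatory argument.
  \demo[x]{y}
\endgroup
\end{document}
```

The first call should print option=(x), argument=(y). The second one should print option=(none), argument=(blah), followed by the leftover x]y, which is the analogue of the active [ ending up as the file name.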


Some more details. When you do

\newcommand{\foo}[2][baz]{Something with #1 and #2}

roughly the following happens. LaTeX defines two macros, namely \foo and \\foo (with a backslash in the name; I'll write \sfoo for clarity in what follows). The former is defined by

\def\foo{\@protected@testopt\foo\sfoo{baz}}

and the latter by

\def\sfoo[#1]#2{Something with #1 and #2}
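Both macros can be inspected in a test document. This is a sketch under the assumption that the kernel builds the internal name as \csname\string\foo\endcsname, which is how \newcommand has worked for commands with an optional argument in the kernels I know. The argument x is just sample input.

```latex
\documentclass{article}
\newcommand{\foo}[2][baz]{Something with #1 and #2}
\begin{document}
% Write the top-level definition of \foo to the terminal and the log file.
\typeout{\meaning\foo}

% Ordinary call without optional argument:
\foo{x}

% Calling the internal macro \\foo directly, supplying the default by hand:
\csname\string\foo\endcsname[baz]{x}
\end{document}
```

The \typeout line should show something close to macro:->\@protected@testopt \foo \\foo {baz}, and the two calls in the document body should typeset the same text.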

The kernel macro \@protected@testopt is defined by

% latex.ltx, line 885:
\def\@protected@testopt#1{%
  \ifx\protect\@typeset@protect
    \expandafter\@testopt
  \else
    \@x@protect#1%
  \fi}

In standard typesetting, the “true” branch is followed, so \@testopt is called:

% latex.ltx, line 883:
\long\def\@testopt#1#2{%
  \kernel@ifnextchar[{#1}{#1[{#2}]}}

Here's where the testing for [12 happens. The [ in the replacement text of \@testopt was tokenized at definition time, when it had category code 12. The test will not recognize a token with a different meaning as being the same as this [, and [13 is a different token with a different meaning.
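The comparison can be tried in isolation with \ifx, which is also what \kernel@ifnextchar uses internally after its \futurelet. A minimal sketch, where \otherbracket is a hypothetical name introduced only for this illustration:

```latex
\documentclass{article}
\begin{document}
% Implicit copy of [ with category code 12 (hypothetical helper name)
\let\otherbracket=[
\begingroup
  \catcode`\[=\active
  \def[{blah}%
  % \ifx does not expand the active [; it compares its meaning (a macro)
  % with the meaning of \otherbracket (the character [ of category code 12).
  \ifx[\otherbracket treated as equal\else treated as different\fi
\endgroup
\end{document}
```

This should typeset "treated as different", which is why the optional-argument branch is not taken once [ is active.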


Broken down to the essentials, one can say:

For LaTeX macros defined via \newcommand that process an optional argument, the presence of the optional argument is determined at some stage by \kernel@ifnextchar, the LaTeX 2ε wrapper around \futurelet. It tests whether the next token has the same meaning as the explicit opening-square-bracket character token of category code 12 (other).

Then a second macro is called by inserting a control sequence token into the token stream whose name is the name of the command defined via \newcommand with a backslash prepended. E.g., with \newcommand\foo[2][optional-default]{...}, this is \\foo. The replacement text of that second macro is the one given in the \newcommand directive. Its parameter text starts with an explicit opening-square-bracket character token of category code 12 (other), so the control sequence token \\foo must be followed directly by such a token. The first argument of this macro is delimited by an explicit closing-square-bracket character token of category code 12 (other).

If the \kernel@ifnextchar test has revealed that there is no optional argument, this second macro is called with the following token sequence appended after its control sequence token: an explicit opening-square-bracket character token of category code 12 (other), the default value wrapped in braces, and an explicit closing-square-bracket character token of category code 12 (other).

If the \kernel@ifnextchar test has revealed that there probably is an optional argument, this second macro is called without appending any additional tokens.

With macros that process delimited arguments, the tokens that within the definition's parameter-text form the argument-delimiters cannot be replaced by other tokens of the same meaning.

This is because when TeX scans for argument-delimiters, TeX does "look" at tokens, not at the meanings of tokens.

The \\foo-macro causes TeX to scan for argument-delimiters [12 and ]12.

With things like

\let\amacro=[
\let\anothermacro=]
\foo\amacro optional value \anothermacro...

you would require TeX not to "look" at tokens while scanning for argument-delimiters but to "look" at meanings of tokens.
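The difference between "same meaning" and "same token" can be seen with a small delimited macro. This is a sketch; \grab is a hypothetical macro that imitates the parameter text of \\foo, and the failing call is left commented out so that the document compiles.

```latex
\documentclass{article}
% \grab is a hypothetical macro delimited by explicit [ and ] of catcode 12
\def\grab[#1]{(#1)}
\let\amacro=[
\begin{document}
% The meanings agree, so a \kernel@ifnextchar-style test would succeed:
\ifx\amacro[meanings agree\else meanings differ\fi

% Explicit delimiters match the parameter text:
\grab[x]

% Uncommenting the next line should stop with
% "Use of \grab doesn't match its definition",
% because \amacro is not the token [ of category code 12:
% \grab\amacro x]
\end{document}
```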

In some situations you can do some \uppercase or \lowercase trickery. \uppercase and \lowercase leave the category codes of explicit character tokens unchanged while changing their character codes, so you can do something like this:

\documentclass[landscape, a4paper]{article}
%%%%%%% Layout %%%%%%%%%%%%%%%%%%%%%%%%%%%%
\csname @ifundefined\endcsname{pdfpagewidth}{}{\pdfpagewidth=\paperwidth}
\csname @ifundefined\endcsname{pdfpageheight}{}{\pdfpageheight=\paperheight}
\csname @ifundefined\endcsname{pagewidth}{}{\pagewidth=\paperwidth}
\csname @ifundefined\endcsname{pageheight}{}{\pageheight=\paperheight}
\textwidth=\paperwidth
\oddsidemargin=1.5cm
\marginparsep=.2\oddsidemargin
\advance\textwidth -2\oddsidemargin
\marginparwidth=\oddsidemargin
\advance\oddsidemargin-1in
\evensidemargin=\oddsidemargin
\advance\marginparwidth-2\marginparsep
\textheight=\paperheight
\topmargin=1.5cm
\advance\textheight-2\topmargin
\footskip=.5\topmargin
\advance\topmargin-1in
\headheight=0pt
\headsep=0pt
{\normalfont
 \csname @tempdima\endcsname=.5\ht\strutbox
 \expandafter}\expandafter\advance\expandafter\footskip\the\csname @tempdima\endcsname
{\normalfont\expandafter}\expandafter\topskip\expandafter=\the\ht\strutbox
\pagestyle{plain}%
\parindent=0ex
\parskip=\baselineskip
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newcommand\macrowithoptionalarg[1][default]{The things inside the parentheses are taken for the argument: (#1)}

\begin{document}

Test 1: \macrowithoptionalarg

Test 2: \macrowithoptionalarg[{non-default}]%

\begingroup
\catcode`[ = \active 
\catcode`] = \active
\def[{blah}
\def]{pfff}
%...
\begingroup
% catcode of ( is 12 and catcode of ) is 12.
\lccode`\(=`\[
\lccode`\)=`\]
% Now applying \lowercase to ( yields [ of catcode 12 and applying \lowercase to ) yields ]
% of catcode 12, but be aware that AND gets lowercased as well:
Test 3: \lowercase{\endgroup\macrowithoptionalarg({[ AND ]})}%
%...    
\endgroup

\end{document}


If \edef and ε-TeX's \unexpanded may be used, you can apply \string to turn active [ and active ] into their category-code-12 counterparts:

\documentclass[landscape, a4paper]{article}
%%%%%%% Layout %%%%%%%%%%%%%%%%%%%%%%%%%%%%
\csname @ifundefined\endcsname{pdfpagewidth}{}{\pdfpagewidth=\paperwidth}
\csname @ifundefined\endcsname{pdfpageheight}{}{\pdfpageheight=\paperheight}
\csname @ifundefined\endcsname{pagewidth}{}{\pagewidth=\paperwidth}
\csname @ifundefined\endcsname{pageheight}{}{\pageheight=\paperheight}
\textwidth=\paperwidth
\oddsidemargin=1.5cm
\marginparsep=.2\oddsidemargin
\advance\textwidth -2\oddsidemargin
\marginparwidth=\oddsidemargin
\advance\oddsidemargin-1in
\evensidemargin=\oddsidemargin
\advance\marginparwidth-2\marginparsep
\textheight=\paperheight
\topmargin=1.5cm
\advance\textheight-2\topmargin
\footskip=.5\topmargin
\advance\topmargin-1in
\headheight=0pt
\headsep=0pt
{\normalfont
 \csname @tempdima\endcsname=.5\ht\strutbox
 \expandafter}\expandafter\advance\expandafter\footskip\the\csname @tempdima\endcsname
{\normalfont\expandafter}\expandafter\topskip\expandafter=\the\ht\strutbox
\pagestyle{plain}%
\parindent=0ex
\parskip=\baselineskip
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newcommand\macrowithoptionalarg[1][default]{The things inside the parentheses are taken for the argument: (#1)}

\begin{document}

Test 1: \macrowithoptionalarg

Test 2: \macrowithoptionalarg[{non-default}]%

\begingroup
\catcode`[ = \active 
\catcode`] = \active
\def[{blah}
\def]{pfff}
%...
\edef\scratchmacro{\unexpanded{\macrowithoptionalarg}\string[\unexpanded{[ and ]}\string]}%
Test 3: \scratchmacro
%...    
\endgroup

\end{document}



I can read several possible goals into this question, but it seems that the attractive use of active character definitions is in math mode. There is support for that!

\begingroup
 \catcode`[ = \active 
 \catcode`] = \active
 \gdef[{blah}
 \gdef]{pfff}
\endgroup 
% catcodes are back to normal, but the definitions remain.
% special setting to use the definition in math mode
\mathcode`[="8000
\mathcode`]="8000

Now the problems with changed catcodes are gone: the defined meanings are used in math formulas, while the plain characters are used in text mode.
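For completeness, here is a self-contained sketch of this approach using the \mathcode primitive with the special value "8000. The choice of \langle and \rangle as the math-mode meanings and the command \demo are assumptions made for illustration only.

```latex
\documentclass{article}
\begingroup
  \catcode`\[=\active
  \catcode`\]=\active
  % Illustrative definitions; any math-mode material could go here.
  \gdef[{\langle}
  \gdef]{\rangle}
\endgroup
% Catcodes are back to 12. Mathcode "8000 makes the characters behave
% as if they were active, but only when they are typeset in math mode.
\mathcode`\[="8000
\mathcode`\]="8000
% \demo is a hypothetical command to check that optional arguments still work
\newcommand{\demo}[1][default]{(#1)}
\begin{document}
Text mode: [a, b]. Math mode: $[a, b]$.

Optional arguments are unaffected: \demo[x] and $\sqrt[3]{x}$.
\end{document}
```

Since the category codes never change in the document body, the tests for [12 and the delimited arguments of commands such as \demo or \sqrt keep working.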
