Multiple letters without spacing in Math

It's theoretically possible to make LaTeX recognize clusters of letters and typeset them as if they were input as argument to \mathit, but it would be deadly slow: each alphabetic character should be turned into a “math active” one, which checks whether the following item is an alphabetic character; if it is it should typeset itself, starting \mathit if it's the first one, and then pass the same control to the next character; if not followed by an alphabetic character it should end \mathit.

It's instead better to do something like

\newcommand{\mli}[1]{\mathit{#1}}

where \mli stands for MultiLetter Identifier; use any control sequence name. Then you'd input

$\mli{P}=\mli{NP}$

that has a good deal of advantages, but mainly keeps information about the input.

Note: I have a nice proof of the above statement, but unfortunately there are length limitations for posts on this site.


Heiko's solution in expl3.

\documentclass[fleqn]{article}
\usepackage{xcolor}
\usepackage{amsmath}
\usepackage{xparse}

\ExplSyntaxOn

% 1. Define an equivalent for each letter
\cs_new_protected:Nn \__egreg_letter_loop:nn
 {
  \int_step_inline:nnn { `#1 } { `#2 }
   {
    \exp_args:Nc \mathchardef 
     { __egreg_letter_code_##1: } % generate the old code
     =
     \mathcode##1 % the old code
     \scan_stop:
    \cs_new_protected:cx { __egreg_letter_act_##1: }
     {
      \exp_not:N \__egreg_letter_scan:nw { ##1 }
     }
    \char_set_active_eq:nc { ##1 } { __egreg_letter_act_##1: }
   }
 }
\__egreg_letter_loop:nn { A } { Z }
\__egreg_letter_loop:nn { a } { z }

% 2. Define the main macro that does the scanning
\tl_new:N \l__egreg_letter_scanned_tl

\cs_new_protected:Npn \__egreg_letter_scan:nw #1
 {
  \tl_if_empty:NT \l__egreg_letter_scanned_tl { \c_group_begin_token } % to be closed later
  \tl_put_right:Nx \l__egreg_letter_scanned_tl { \exp_not:c { __egreg_letter_code_#1: } }
  \peek_catcode:NF A
   {% next char is not a letter
    \__egreg_letter_deliver:V \l__egreg_letter_scanned_tl
   }
 }
\cs_new_protected:Nn \__egreg_letter_deliver:n
 {
  \tl_if_single:nTF { #1 }
   {
    \__egreg_letter_single:n { #1 }
   }
   {
    \__egreg_letter_group:n { #1 }
   }
  \c_group_end_token % finish the group
 }
\cs_generate_variant:Nn \__egreg_letter_deliver:n { V }

% 3. The interface for defining the actions
\NewDocumentCommand{\definelettersingle}{m}
 {
  \cs_set_protected:Nn \__egreg_letter_single:n { { #1 } }
 }
\NewDocumentCommand{\definelettergroup}{m}
 {
  \cs_set_protected:Nn \__egreg_letter_group:n { { #1 } }
 }

% 4. Make all letters math active
\int_step_inline:nnn { `A } { `Z } { \mathcode#1="8000 }
\int_step_inline:nnn { `a } { `z } { \mathcode#1="8000 }

\ExplSyntaxOff

% initialize

\definelettersingle{#1}
\definelettergroup{\mathit{#1}}

\begin{document}

First a standard formula $NP\ne N P$

\bigskip

Now \textcolor{blue}{single letters are set in blue},
\textcolor{red}{multiple letters in red}.
\begin{itemize}
\item Multiple letters are put in \verb|\mathit|:
  \definelettersingle{{\color{blue}#1}}%
  \definelettergroup{\mathit{\color{red}#1}}%
  \[ NP \neq N P \]

\item Multiple letters are put in \verb|\mathrm|:
  \definelettersingle{{\color{blue}#1}}%
  \definelettergroup{\mathrm{\color{red}#1}}%
  \[ NP\ (text) \neq N P\ (two\ variables) \]
  \[ F_force = m_mass a_acceleration \]
\end{itemize}
\end{document}

enter image description here


Here the solution for "egreg's exercise for the reader". The following example makes the letters "math active" with the following features:

  • The letters are collected until a non-letter is reached. Single letters are passed to macro \mprintsingle, multiple letters are passed to macro \mprintmulti. Both macros have two arguments. The first is intended to output the result in math mode, the second can be used for text mode.
  • Also the result is put into a subformula using \bgroup and \egroup. This allows putting multiple letters into indexes and subscripts without curly braces, e.g. F_strong.

Two applications:

  • The OP would use something like:

    \renewcommand*{\mprintsingle}[2]{#1}% the default
    \renewcommand*{\mprintmulti}[2]{\mathit{#1}}
    

Usually multiple letters are text words and not variables and should be printed upright. This can be achieved by:

    \usepackage{amstext}
    \renewcommand*{\mprintsingle}[2]{#1}% the default
    \renewcommand*{\mprintmulti}[2]{\text{#2}}

The full example file showing both applications:

\documentclass[fleqn]{article}

\makeatletter
\usepackage{etexcmds}
\newtoks\m@toks@math
\newtoks\m@toks@text

\edef\m@tmp@restore{%
  \lccode\number`\X=\the\lccode`\X\relax
  \lccode\number`\~=\the\lccode`\~\relax
}

\newcommand*{\m@activate}{}
\newcommand*{\m@check@letter}{}
\newcommand*{\m@check@fi}{}

\newif\ifm@single

\let\m@start\relax
\def\m@loop{%
  \lccode`\X=\count@
  \lccode`\~=\count@
  \lowercase{%
    \expandafter\mathchardef\csname m@code@X\endcsname=\mathcode\count@
    \edef\m@activate{%
      \etex@unexpanded\expandafter{\m@activate}%
      \mathcode\the\count@="8000\relax
      \def\noexpand~{\m@start\csname m@code@X\endcsname X}%
    }%
    \g@addto@macro\m@check@letter{%
      \ifx\@let@token X\else
    }%
  }%
  \g@addto@macro\m@check@fi{\fi}%
  \advance\count@\@ne
}
% A-Z
\count@=`\A\relax
\@whilenum\count@<\numexpr`\Z+1\relax\do{\m@loop}
% a-z
\count@=`\a\relax
\@whilenum\count@<\numexpr`\z+1\relax\do{\m@loop}

\newcommand*{\m@start}[2]{%
  \bgroup
  \m@toks@math{#1}%
  \m@toks@text{#2}%
  \m@singletrue
  \futurelet\@let@token\m@check
}
\edef\m@check{%
  \etex@unexpanded{%
    \let\m@next\m@add
    \ifx\@let@token\space
      \let\m@next\m@finish
    \else
      \ifx\@let@token\egroup
        \let\m@next\m@finish
      \else
  }%
  \etex@unexpanded\expandafter{%
        \m@check@letter
  }%
  \etex@unexpanded{%
          \let\m@next\m@finish
  }%
  \etex@unexpanded\expandafter{%
        \m@check@fi
  }%
  \etex@unexpanded{%
      \fi
    \fi
    \m@next
  }%
}

\newcommand*{\m@add}[1]{%
  \m@singlefalse
  \m@toks@math\expandafter{%
    \the\expandafter\m@toks@math
    \csname m@code@#1\endcsname
  }%
  \m@toks@text\expandafter{%
    \the\m@toks@text
    #1%
  }%
  \futurelet\@let@token\m@check
}

\newcommand*{\m@finish}{%
  \ifm@single
    \expandafter\mprintsingle\expandafter{%
      \the\expandafter\m@toks@math\expandafter
    }\expandafter{%
      \the\expandafter\m@toks@text\expandafter
    }%
  \else
    \expandafter\mprintmulti\expandafter{%
      \the\expandafter\m@toks@math\expandafter
    }\expandafter{%
      \the\expandafter\m@toks@text\expandafter
    }%
  \fi
  \egroup
}
\let\mprintsingle\@firstoftwo
\let\mprintmulti\@firstoftwo

\everymath{\m@activate}
\everydisplay{\m@activate}

\m@tmp@restore
\makeatother

\usepackage{color}
\usepackage{amstext}

\begin{document}
\textcolor{blue}{Single letters are set in blue},
\textcolor{red}{multiple letters in red}.
\begin{itemize}
\item Multiple letters are put in \verb|\mathit|:
  \renewcommand*{\mprintsingle}[2]{{\color{blue}#1}}%
  \renewcommand*{\mprintmulti}[2]{\mathit{\color{red}#1}}%
  \[ NP \neq N P \]

\item Multiple letters are put in \verb|\text|:
  \renewcommand*{\mprintsingle}[2]{{\color{blue}#1}}%
  \renewcommand*{\mprintmulti}[2]{\text{\color{red}#2}}%
  \[ NP\ (text) \neq N P\ (two\ variables) \]
  \[ F_force = m_mass * a_acceleration \]
\end{itemize}
\end{document}

Result


Asked in comments to expand on my comment about fonts.

TeX does not control the space between letters in math mode, the apparent space appears because usually math italic fonts have wider sidebearings and inter-letter kerning (and often the characters themselves are wider) than a text italic.

\documentclass{article}

\begin{document}

\showoutput

\textit{NP ffiv} $NP ffiv$ $\mathit{NP ffiv}$

\end{document}

Produces

enter image description here

and in the log:

...\hbox(6.94444+1.94444)x345.0, glue set 224.80687fil
....\hbox(0.0+0.0)x15.0
....\OT1/cmr/m/it/10 N
....\OT1/cmr/m/it/10 P
....\glue 3.57774 plus 1.53178 minus 1.02322
....\OT1/cmr/m/it/10 ^^N (ligature ffi)
....\OT1/cmr/m/it/10 v
....\kern 1.07637
....\glue 3.33333 plus 1.66666 minus 1.11111
....\mathon
....\OML/cmm/m/it/10 N
....\kern1.09026
....\OML/cmm/m/it/10 P
....\kern1.3889
....\OML/cmm/m/it/10 f
....\kern1.0764
....\OML/cmm/m/it/10 f
....\kern1.0764
....\OML/cmm/m/it/10 i
....\OML/cmm/m/it/10 v
....\kern0.35878
....\mathoff
....\glue 3.33333 plus 1.66666 minus 1.11111
....\mathon
....\hbox(6.94444+1.94444)x28.70953
.....\OT1/cmr/m/it/10 N
.....\OT1/cmr/m/it/10 P
.....\OT1/cmr/m/it/10 ^^N
.....\OT1/cmr/m/it/10 v
.....\kern1.07637
....\mathoff
....\penalty 10000

Looking at the log you see that the third example uses the same font, .\OT1/cmr/m/it/10 as the first (this is a choice of the default computer modern font setup, other font packages may differ in whether they make \mathit use the same font as \textit or a font chosen to match the main math font).

The fact that there is kern of 1.09026pt between N and P (and the fact that there is no ffi ligature) in the middle example is not under the control of TeX (or visible from TeX macros except in luatex), it is a property of the font metrics. It is not just the kerning and sidebearings that differ, the letter shapes are different as well: most noticeable in v in this example. Of course if the font package changes the text fonts to be other than computer modern, but leaves the math fonts untouched, the difference between \mathit and \textit would be more pronounced.