\immediate\write with plain text

The LaTeX kernel provides the filecontents environment to write to external files without having to worry about catcodes and such. In older LaTeX releases (prior to 2019-10-01) the filecontents package made minimal changes to this environment, allowing it to be used anywhere in the document (LaTeX's own version could only be used in the preamble and didn't allow overwriting). In newer releases, this functionality is included in the LaTeX kernel itself.

To produce

To be or not to be,
that is % the question

you use:

\documentclass{article}
% \usepackage{filecontents} For older LaTeX releases
\begin{document}
\begin{filecontents*}[overwrite]{tmp.txt}
To be or not to be,
that is % the question
\end{filecontents*}
\end{document}

The starred version (filecontents*) omits the heading that is printed in the standard version of the environment:

%% LaTeX2e file `tmp.txt'
%% generated by the `filecontents' environment
%% from source `test' on 2020/01/12.
%%
To be or not to be,
that is % the question
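As a small sketch of the same thing without the star: to my knowledge, the kernel version of the environment (LaTeX 2019-10-01 or newer) also accepts a noheader option in its optional argument, which should behave like the starred form. Treat the option name as an assumption and check it against your LaTeX release; tmp.txt is just the file name from the example above.

```latex
\documentclass{article}
\begin{document}
% Assumption: LaTeX 2019-10-01 or newer, where the kernel's filecontents
% environment accepts an optional argument.
% "noheader" is assumed to suppress the %% header, like filecontents*.
\begin{filecontents}[overwrite,noheader]{tmp.txt}
To be or not to be,
that is % the question
\end{filecontents}
\end{document}
```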

An addendum on my (admittedly lazy) answer:

Should you want to persist in reinventing the wheel (which is much more fun, I must admit), then you can create a command to take care of the \catcodeing for you. Here I provide an implementation of a \verbwrite command which does the job.

The command syntax is somewhat like LaTeX's \verb: you can use it either as \verbwrite\file{<stuff>} or as \verbwrite\file|<stuff>|. For the latter syntax, any character other than { can be used to delimit the contents. This character, obviously, can't appear in <stuff>. The advantage of the second syntax is that { and } do not have to be balanced inside the contents of the command.

\documentclass{article}

\makeatletter
\long\def\@ifnextchar@other@space#1#2#3{%
  \let\reserved@d=#1%
  \def\reserved@a{#2}%
  \def\reserved@b{#3}%
  \futurelet\@let@token\@ifnch@other}
\def\@ifnch@other{%
  \ifx\@let@token\other@sptoken
    \let\reserved@c\@xifnch@other
  \else
    \ifx\@let@token\reserved@d
      \let\reserved@c\reserved@a
    \else
      \let\reserved@c\reserved@b
    \fi
  \fi
  \reserved@c}
{\catcode`\ =12
{\global\let\other@sptoken= }%
\gdef\@xifnch@other {\futurelet\@let@token\@ifnch@other}}%
\def\verbwrite{%
  \kernel@ifnextchar*%
    {\let\@ifnextchar\@ifnextchar@other@space\expandafter\verbwrite@grab\@gobble}%
    {\let\@ifnextchar\kernel@ifnextchar\verbwrite@grab}}
\def\verbwrite@grab#1{%
  \begingroup
    \catcode`\^^M=13
    \newlinechar`\^^M
    \let\do\@makeother \dospecials
    \catcode`\{=1
    \@ifnextchar\bgroup
      {\catcode`\}= 2\relax\verbwrite@brace#1}%
      {\catcode`\{=12\relax\verbwrite@other#1}}
\def\verbwrite@brace#1#2{%
    \immediate\write#1{\unexpanded{#2}}%
  \endgroup}
\def\verbwrite@other#1#2{%
  \def\verbwrite@delim##1##2#2{%
    \verbwrite@brace##1{##2}}%
  \verbwrite@delim#1}
\makeatother

\begin{document}
\newwrite\file
\immediate\openout\file=tmp.txt
\verbwrite\file {1-To be or not to be,
that is % the question}
\verbwrite*\file {2-To be or not to be,
that is % the question}
\verbwrite\file|3-To be or not to be,
that is } the {question|
\verbwrite\file$4-Être ou ne pas être,
вот в чем вопрос$
\verbwrite\file}5-Être ou ne pas être,
вот в чем вопрос}
\immediate\closeout\file
\end{document}

Please beware that I took 68 minutes to write this command, so it is certainly not what you can call robust. Proceed with care :)

Fix 1: Prevent expansion of the text using ε-TeX's \unexpanded (thanks to jfbu :)

Fix 2: Prevent premature tokenization of the delimiter (thanks again to jfbu :)

Feature 1: Added a starred version that ignores spaces before the delimiter of the verbatim content.

Fix 3: Actually allow } as a "other" delimiter (\verbwrite\file}stuff}) (thanks to Ulrich Diez :)

Fix 4: Fix misfeature from Feature 1. The effect of the * argument would remain for further calls to \verbwrite once used.
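
To illustrate what Fix 1 is about, here is a minimal sketch (separate from the \verbwrite code above) of how \write expands its argument and how \unexpanded stops that. The stream name \demo, the file name demo.txt and the macro \foo are made up for this example.

```latex
\documentclass{article}
\begin{document}
\newwrite\demo                      % hypothetical stream name
\immediate\openout\demo=demo.txt    % hypothetical file name
\def\foo{bar}
% \write fully expands its argument, so this line writes "bar":
\immediate\write\demo{\foo}
% \unexpanded (e-TeX) protects the tokens, so this line writes "\foo "
% (TeX prints a space after a control word when writing it out):
\immediate\write\demo{\unexpanded{\foo}}
\immediate\closeout\demo
\end{document}
```

Without the \unexpanded, any active character or macro that sneaks into the text would be expanded at writing time, which is exactly what a verbatim write should avoid.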



With my explanations below I write (La)TeX in places where I wish to indicate that what is written is valid for "pure" TeX and thus is valid for LaTeX also. I do so for people who are not aware that LaTeX basically is TeX plus a collection of macros that forms the LaTeX-format and that gets loaded automatically when executing latex.exe/the latex-binary.


I suggest using the filecontents*-environment.

Be aware that there is also a LaTeX 2ε-package filecontents which does remove some of the limitations that come along with the filecontents*-environment from the LaTeX 2ε-kernel.


If you are in the mood for reinventing the wheel, you can write a macro which does

  • switch to verbatim-catcode-régime,
  • switch the catcode of the endlinechar (usually ^^M/ASCII-Return) to 12 so that ASCII-return is treated like digits and punctuation-marks,
  • read and tokenize under that catcode-régime the argument containing the text that is to be written to file
  • trim leading and trailing endline-chars from that text
  • write the text to file while having \endlinechar also as \newlinechar.

In (La)TeX there are several stages of processing input.

(La)TeX does read TeX-input, e.g., a .tex-input-file, line by line.

In the pre-processing-stage, the single characters that form the line are converted to (La)TeX's internal character-encoding. (With old-school (La)TeX engines, the internal character-encoding is ASCII. With engines based on XeTeX or LuaTeX, it is Unicode, usually read as utf-8, of which ASCII is a subset.) Then all space-characters (code-point-number 32 in both encodings) that occur at the right end of the line are removed. Then a character is inserted at the right end of the line whose code-point-number in (La)TeX's internal character-encoding corresponds to the value of the integer-parameter \endlinechar. Usually the value of \endlinechar is 13, and code-point-number 13 denotes the ⟨RETURN⟩-character in both encodings. This means: Usually a ⟨RETURN⟩-character gets inserted at the right end of the line.
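
If you want to check these values on your own system, a small sketch like the following prints them to the terminal and the .log file. With the default LaTeX format I would expect 13 for \endlinechar and 10 for \newlinechar, but the point of the sketch is that you can look for yourself.

```latex
\documentclass{article}
\begin{document}
% \the turns the current value of an integer parameter into digits;
% \typeout writes them to the terminal and to the .log file.
\typeout{endlinechar=\the\endlinechar, newlinechar=\the\newlinechar}
\end{document}
```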

When this is done, the tokenizing-stage begins: In this stage (La)TeX takes the characters that form the line as instructions for placing tokens into the token-stream. This is the stage when things start to be about so-called tokens, e.g., control-sequence-tokens (which come in two flavors: control-word-tokens and control-symbol-tokens) and character-tokens. Character-tokens consist of a character-code, denoting the code-point-number in (La)TeX's internal character-encoding, and a category-code. Category-codes make it possible for characters to have special meanings for the (La)TeX-engine. E.g., the category-code of the backslash-character usually is 0(escape). A character whose category-code is 0 at tokenizing-time causes (La)TeX to gather the name of a control-sequence-token and afterwards place that control-sequence-token into the token-stream. Likewise, the category-code of the opening curly brace usually is 1(begin grouping) and the category-code of the closing curly brace usually is 2(end grouping). Character-tokens of category-code 1(begin grouping) are used for introducing groups (i.e., macro arguments consisting of several tokens, local scopes for assignments like macro-definitions, or the ⟨balanced text⟩ of things like \scantokens), and character-tokens of category-code 2(end grouping) are used for ending the group in question. More information about category-codes can be found at https://en.wikibooks.org/wiki/TeX/catcode.
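
The category-codes mentioned here can be inspected with \the\catcode. The following is only a sketch for looking at the usual values under the default LaTeX régime; the numbers in the comments are the usual defaults, not something the sketch itself enforces.

```latex
\documentclass{article}
\begin{document}
% `\<char> denotes the code-point-number of <char>;
% \the\catcode<number> yields the category-code currently assigned to it.
\typeout{catcode of backslash: \the\catcode`\\}        % usually 0 (escape)
\typeout{catcode of opening brace: \the\catcode`\{}    % usually 1 (begin grouping)
\typeout{catcode of closing brace: \the\catcode`\}}    % usually 2 (end grouping)
\typeout{catcode of percent: \the\catcode`\%}          % usually 14 (comment)
\typeout{catcode of endline-char: \the\catcode`\^^M}   % usually 5 (end of line)
\end{document}
```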

After tokenizing, there is a "stream of tokens". Processing the stream of tokens includes things like expansion of expandable tokens (e.g., macro-tokens, e.g., expandable primitives like \string or \csname...\endcsname) and (later) carrying out assignments, creating boxes etc.

When reading and tokenizing a .tex-input-file, (La)TeX will, during the pre-processing-stage, remove spaces at every line-ending and insert an endline-character at every line-ending.

Therefore the input-sequence

\immediate\write\file{
To be or not to be,
that is % the question
}

will by (La)TeX at tokenizing-time, i.e., after pre-processing, be treated as

\immediate\write\file{⟨character due to endline-char-insertion⟩
To be or not to be,⟨character due to endline-char-insertion⟩
that is % the question⟨character due to endline-char-insertion⟩
}⟨character due to endline-char-insertion⟩

Usually the endline-character is ^^M, i.e., ⟨RETURN⟩.

Thus the above input-sequence usually will by (La)TeX at tokenizing-time be treated as

\immediate\write\file{⟨^^M/RETURN-character⟩
To be or not to be,⟨^^M/RETURN-character⟩
that is % the question⟨^^M/RETURN-character⟩
}⟨^^M/RETURN-character⟩

(The answer to the question which tokens (La)TeX will insert into the token-stream when encountering a ⟨^^M/RETURN-character⟩ depends on the category-code which at the time of tokenizing is assigned to the ⟨^^M/RETURN-character⟩.

Usually the category-code of the ⟨^^M/RETURN-character⟩ is 5(end of line), which means that depending on the state of (La)TeX's reading apparatus either (in state S=skipping blanks) no token at all, or (in state M=in the middle of a line) a space-token, i.e., a character-token of category-code 10(space) and character-code 32 (32 is the number of the space-character in (La)TeX's internal character-encoding), or (in state N=about to begin new line) a \par-token will be inserted.

In case category-code 12(other) is assigned to the ⟨^^M/RETURN-character⟩, (La)TeX will insert a character-token of category-code 12(other) and character-code 13 (13 is the number of the ⟨RETURN-character⟩ in (La)TeX's internal character-encoding) into the token-stream. Such a token can be processed like any other character-token.)

Besides this, (La)TeX will in any case, at writing-time, attach to the end of the argument of a \write-command the sequence of characters/bytes that on the platform in use serves for ending lines within plain text files.

Thus, assuming that we managed to have LaTeX accept the percent-char as an ordinary character, the \write-command will get something like:

⟨token due to ^^M/RETURN-character⟩To be or not to be,⟨token due to ^^M/RETURN-character⟩that is % the question⟨token due to ^^M/RETURN-character⟩

At writing-time, a

⟨platform-dependent sequence for ending the line⟩
will be attached.

If the category code of the endline-character/of the ⟨^^M/RETURN-character⟩ was 5(end of line) at the time of tokenizing the input, the sequence

⟨space⟩To be or not to be,⟨space⟩that is % the question⟨space⟩⟨platform-dependent sequence for ending the line⟩
will be written to the external file.

If the category code of the endline-character/of the ⟨^^M/RETURN-character⟩ was 12(other) at the time of tokenizing the input, the sequence

^^MTo be or not to be,^^Mthat is % the question^^M⟨platform-dependent sequence for ending the line⟩
will be written to the external file.

You can ensure that at writing-time a ⟨^^M/RETURN-character⟩ also yields the ⟨platform-dependent sequence for ending the line⟩ by assigning the integer-parameter \newlinechar the value of the integer-parameter \endlinechar.

If you do this also, the sequence

⟨platform-dependent sequence for ending the line⟩To be or not to be,⟨platform-dependent sequence for ending the line⟩that is % the question⟨platform-dependent sequence for ending the line⟩⟨platform-dependent sequence for ending the line⟩

will be written to the external file.
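
The effect of \newlinechar can be tried in isolation with a stripped-down sketch like the one below (no verbatim handling of % and the like, no trimming). The stream name \demo and the file name demo.txt are made up for this example. With these settings I would expect demo.txt to contain "first line" and "second line" on two separate lines.

```latex
\documentclass{article}
\begin{document}
\newwrite\demo                     % hypothetical stream name
\immediate\openout\demo=demo.txt   % hypothetical file name
\begingroup
\catcode`\^^M=12 %               from now on line ends are tokenized as other-characters
\newlinechar=\endlinechar\relax %  and \write turns that character into a line break
\immediate\write\demo{first line
second line}%
\endgroup
\immediate\closeout\demo
\end{document}
```

The % at the ends of the lines inside the group matters: once ^^M has category-code 12, an unprotected line end is no longer a harmless space but an other-character that TeX would try to typeset.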

But this way you might get undesired empty lines.

Therefore you may wish to apply a routine for removing leading and trailing ⟨characters due to endline-char-insertion⟩ from the entire argument before letting \write do the writing-job.


A coding-example could look like this:

\documentclass{article}

\makeatletter

\begingroup
\catcode`\^^M=12\relax%
\@firstofone{%
  \endgroup%
  \newcommand*\gobbleendl{}\def\gobbleendl ^^M{}%
  \newcommand\trimendls[2]{\innertrimleadendl{#2}#1^^M\relax{#1}}%
  \newcommand*\innertrimleadendl{}%
  \def\innertrimleadendl#1#2^^M#3\relax#4{%
    \ifx\relax#2\relax\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi%
    {%
      \ifx\relax#4\relax\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi%
      {\trimtrailendl{}{#1}}%
      {\expandafter\trimtrailendl\expandafter{\gobbleendl#4}{#1}}%
    }%
    {\trimtrailendl{#4}{#1}}%
  }%
  \newcommand*\trimtrailendl[2]{%
    \innertrimtrailendl{#2}.#1\relax.^^M\relax.\relax\relax{#1}%
  }%
  \newcommand*\innertrimtrailendl{}%
  \def\innertrimtrailendl#1#2^^M\relax.#3\relax\relax#4{%
    \ifx\relax#3\relax\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi%
    {\def\@tempa{#4}}%
    {\expandafter\def\expandafter\@tempa\expandafter{\@gobble#2}}%
    \@onelevel@sanitize\@tempa%
    \newlinechar=\endlinechar%
    \immediate\write#1{\@tempa}%
  }%
}%

\newcommand\immediateverbatimwrite[1]{%
  \begingroup
  \let\do=\@makeother
  \dospecials
  \catcode`\ =10 %We don't want to allow space as verb-arg-delimiter.
                 %Thus let's remove spaces when grabbing undelimited arguments.
  %\endlinechar=`\^^M%
  %\catcode`\endlinechar=5 %
  \bracefork{#1}%
}%
\begingroup
\catcode`\(=1 %
\catcode`\{=12 %
\@firstofone(%
  \endgroup
  \newcommand\bracefork[2](%
    \catcode`\ =12\relax
    \catcode\endlinechar=12 %
    \ifx{#2\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi
    (%
      \catcode`\{=1 %
      \catcode`\}=2 %
      \internalfilewritercaller(#1}(}%
    }(%
      \internalfilewritercaller(#1}(#2}%
    }%
  }%
}%
\newcommand\internalfilewritercaller[2]{%
  \def\@tempa##1#2{\internalfilewriter{#1}{##1}}%
  \ifx\relax#2\relax\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi
  {\expandafter\expandafter
   \expandafter\@tempa
   \expandafter\expandafter
   \expandafter{%
   \expandafter\@gobble\string}}%
  {\@tempa}%
}
\newcommand\internalfilewriter[2]{%
  \trimendls{#2}{#1}%
  \endgroup
}%
\makeatother

\begin{document}

\newwrite\file
\immediate\openout\file=tmp.txt\relax

A\immediateverbatimwrite{\file}
{
être ou ne pas être.
That is % the question.
}B%
C%
%
D\immediateverbatimwrite{\file}  |
}être ou ne pas être.
That is % the question.
|E%
F

\immediate\closeout\file

\end{document}

With this example you get

  • a pdf-file with the sequence ABCDEF. (This shows that no spurious spaces or other characters get inserted.)
  • a text-file whose name is tmp.txt and whose content is:
    être ou ne pas être.⟨linebreak⟩
    That is % the question.⟨linebreak⟩
    }être ou ne pas être.⟨linebreak⟩
    That is % the question.⟨linebreak⟩
    Due to the linebreaks, editors which also show line-numbers might display that file as
    1 être ou ne pas être.
    2 That is % the question.
    3 }être ou ne pas être.
    4 That is % the question.
    5

By the way: With (La)TeX it is not possible to keep spaces at the ends of lines.

The reason is that (La)TeX does read and tokenize input line by line and one of the first things it does (in the pre-processing-stage) to every line of input (even before adding the endline-character and starting tokenizing the line) is removing all spaces that occur at the ends of lines.

Thus (La)TeX input like

code⟨space⟩⟨space⟩
more code⟨space⟩⟨space⟩⟨space⟩⟨space⟩⟨space⟩
even more code⟨space⟩⟨space⟩

will in any case be pre-processed to

code⟨character due to endline-char-insertion⟩more code⟨character due to endline-char-insertion⟩even more code⟨character due to endline-char-insertion⟩

before any further processing/tokenization etc. takes place.