pdfTeX hang prevention

The issue can be reproduced with a smaller example, showing it has nothing to do with the loaded packages:

\documentclass{article}

\begin{document}

$\left. \begin{array} { l } { a ) A = \{ 2   $

\section{Solution}
${a: 0}$

\end{document}

The errors in the math formula, followed by the section title, make TeX enter an infinite loop announced by

! LaTeX Error: \begin{array} on input line 7 ended by \end{document}.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...                                              

l.12 \end{document}

? 

! Improper \prevdepth.
\newpage ...everypar {}\fi \par \ifdim \prevdepth 
                                                  >\z@ \vskip -\ifdim \prevd...
l.12 \end{document}

Because \end{array} is missing, \par is still defined as “do nothing”, as it always is inside array. The error recovery here is to carry on with \end{document}, so LaTeX tries to finish the page by issuing \par, which does nothing.

If we add \tracingmacros=1 and interrupt the program, the log file shows, after the last error message, a string of

\par ->

\par ->

\par ->

\par ->

\par ->

Solution: don't make silly errors in your input.

Another solution could be running pdflatex with the option -halt-on-error, which would stop it at the first error.

However, this is not foolproof. If the user has \def\foo{\foo} in their preamble, then the first usage of \foo in the document would start an infinite loop with no error.


This is an answer to the actual problem of running TeX as a subprocess on a document containing user-supplied code that you have no control over (rather than to the particular example you've provided).

As already mentioned by others, it's trivially easy to trigger an infinite loop in TeX without generating any errors. Your example shows a plausible user mistake (forgetting the end of an environment) but you also need to guard against a malicious user deliberately triggering an infinite loop.

Whenever you have an application or script that spawns a subprocess that could run indefinitely, it's a good idea to include a timeout. Since you're using Python, you might find the answers to the question “Using module 'subprocess' with timeout” useful.
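As a rough illustration of the timeout idea, here is a minimal sketch using the standard library's subprocess.run. The function name run_with_timeout and the 30-second default are my own choices for the example, not anything TeX-specific.

```python
import subprocess


def run_with_timeout(command, timeout_seconds=30.0):
    """Run `command` and return (finished, returncode).

    `finished` is False when the child had to be killed because it
    exceeded `timeout_seconds`; `returncode` is then None.
    """
    try:
        completed = subprocess.run(
            command,
            stdin=subprocess.DEVNULL,   # never let TeX wait for terminal input
            stdout=subprocess.PIPE,     # capture the log output instead of printing it
            stderr=subprocess.STDOUT,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child here.
        return False, None
    return True, completed.returncode
```

You would call it with something like run_with_timeout(["pdflatex", "-halt-on-error", "job.tex"]), where job.tex is a placeholder file name. Only the direct child is killed on timeout, so this is most dependable when shell escape is disabled and TeX cannot start further processes.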

There are, however, other types of malicious code that you need to consider. Significant security improvements were made in both TeX Live and MiKTeX in 2010, but there have also been some more recent fixes, such as:

  • Buffer overflow in texlive-bin allowed arbitrary code execution when a malicious Type 1 font is loaded.
  • Incorrect handling of certain files in TeX Live on Ubuntu 14.04 LTS

So make sure you have an up-to-date TeX distribution.

The security settings for TeX Live are in the texmf.cnf configuration file. There are two of these files by default and their locations can be found with kpsewhich -a texmf.cnf. One contains the default settings that shouldn't be modified. The other can be used to override specific settings if required.

The security settings for MiKTeX are in the miktex.ini file.

The main source for concern is the shell escape (\write18). There are three modes:

  • Disabled (shell_escape=f in the texmf.cnf file or use -no-shell-escape when running TeX). This will prevent any system commands from being called by TeX. This is the most secure mode.
  • Restricted (shell_escape=p in the texmf.cnf file). This imposes the following restrictions on \write18:
    • Certain characters are forbidden (such as ' and ;) to prevent injection.
    • Only applications on the trusted list can be run. These are identified in the shell_escape_commands setting in the texmf.cnf file. You can list them with kpsewhich -var-value=shell_escape_commands. There are currently eight: bibtex, bibtex8, extractbb, gregorio, kpsewhich, makeindex, repstopdf, texosquery-jre8. These have been evaluated by the TeX Live security team and determined to be safe. (It is, however, still possible to misuse this setting with destructive effect, as I recently demonstrated at the UK TUG meeting.)
  • Unrestricted (shell_escape=t in the texmf.cnf file or use -shell-escape when running TeX). This allows any system command to be called and is therefore insecure.
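For the subprocess case, the flags mentioned above can be put on the command line explicitly, so the job does not depend on whatever texmf.cnf says. The sketch below is only an illustration: build_tex_command and kpathsea_value are hypothetical helper names, and the second function simply wraps the kpsewhich -var-value=... query and returns None when kpsewhich is not installed.

```python
import shutil
import subprocess


def build_tex_command(tex_file, engine="pdflatex"):
    """Return an argument list that disables shell escape for this run."""
    return [
        engine,
        "-no-shell-escape",          # disable \write18 whatever texmf.cnf says
        "-halt-on-error",            # stop at the first error
        "-interaction=nonstopmode",  # never prompt the (absent) user
        tex_file,
    ]


def kpathsea_value(variable):
    """Ask kpsewhich for a configuration value such as shell_escape.

    Returns None if kpsewhich cannot be found on PATH.
    """
    if shutil.which("kpsewhich") is None:
        return None
    result = subprocess.run(
        ["kpsewhich", "-var-value=" + variable],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return result.stdout.strip()
```

Passing a list rather than a single shell string also means the file name is never interpreted by a shell on the Python side.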

Another area of concern is file I/O, which is essential to common document build requirements (such as generating tables of contents, cross-references and indexes) but can be misused. In addition to the operating system's native file permissions, TeX also has settings to determine whether read or write access is allowed.

The texmf.cnf file has two settings openin_any and openout_any that may take one of the following values:

  • a: any file allowed (if permitted by the operating system);
  • r: (restricted) hidden dot files not allowed;
  • p: (paranoid) hidden dot files not allowed, and disallow going to parent directories (..) and restrict absolute paths to be under $TEXMFOUTPUT.

The default values are:

openin_any = a
openout_any = p

The paranoid setting prevents files from being accessed outside of the current working path (the directory that TeX was called from).
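If you would rather not edit texmf.cnf, one option with TeX Live is to set the two values in the environment of the child process. This is a sketch under the assumption that your Kpathsea honours environment variables with the same names as the texmf.cnf settings (check this on your own installation, and note that MiKTeX is configured differently); paranoid_tex_env is a name made up for the example.

```python
import os


def paranoid_tex_env(base_env=None):
    """Return a copy of the environment with both file settings set to paranoid.

    Assumption: Kpathsea reads openin_any/openout_any from the environment
    in preference to texmf.cnf.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["openin_any"] = "p"   # restrict reads as well as writes
    env["openout_any"] = "p"  # already the default, made explicit here
    return env
```

The result can be passed as the env argument of subprocess.run. Test that your legitimate documents still compile afterwards, since restricting reads is stricter than the default.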

For example, suppose you are running TeX on a web server and suppose your home directory on that server is /home/foo and the root for your website is /home/foo/public_html (so, for example, if your website is www.example.com then www.example.com/index.php corresponds to the file /home/foo/public_html/index.php).

If you run TeX from your home directory (/home/foo) then, even with the paranoid setting, malicious code added to your document can overwrite public_html/index.php (if it's not protected by the file system permissions). Your website's home page is now corrupted.

With the file read operation, if the user gets to see the generated PDF, they can use malicious TeX code to access information from your system. Suppose you have a script /home/foo/public_html/foobar.php that accesses a database. This could be input verbatim into the document and the database connection information, including the password, can now be read from the PDF.
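The overwrite scenario above depends on TeX being started in a directory that has important files beneath it. A minimal sketch of running each job in a throwaway directory instead is shown below; run_in_scratch_dir, the texjob- prefix and the input.tex/input.pdf names are all placeholders chosen for the example. It does not by itself stop reads of absolute paths, which is what openin_any is for.

```python
import subprocess
import tempfile
from pathlib import Path


def run_in_scratch_dir(source, command, timeout_seconds=30.0):
    """Write `source` to input.tex in a fresh temporary directory, run
    `command + ["input.tex"]` there, and return (returncode, pdf_bytes).

    pdf_bytes is empty if no input.pdf was produced. The directory and
    everything the job wrote into it are deleted afterwards.
    """
    with tempfile.TemporaryDirectory(prefix="texjob-") as scratch:
        scratch_path = Path(scratch)
        (scratch_path / "input.tex").write_text(source, encoding="utf-8")
        completed = subprocess.run(
            list(command) + ["input.tex"],
            cwd=scratch,                # TeX's working directory is the scratch dir
            stdin=subprocess.DEVNULL,
            capture_output=True,
            timeout=timeout_seconds,
        )
        pdf_file = scratch_path / "input.pdf"
        pdf_bytes = pdf_file.read_bytes() if pdf_file.exists() else b""
    return completed.returncode, pdf_bytes
```

Here command would be something like ["pdflatex", "-no-shell-escape", "-halt-on-error"]. Because the scratch directory has no subdirectories, the paranoid write setting leaves the job nothing of value to overwrite.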

TeX code can be obfuscated so don't rely on using regular expressions to check for certain commands within the user-supplied code.
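To make that concrete, here is a small sketch of a naive blacklist and one input that slips past it. The pattern, the function name and the /etc/passwd path are purely illustrative; the point is only that \csname input\endcsname produces the same \input token without the literal text \input ever appearing.

```python
import re

# A naive blacklist of "dangerous" control sequences (illustrative only).
NAIVE_BLACKLIST = re.compile(r"\\(input|include|openin|openout|write18)\b")


def naive_filter_accepts(source):
    """Return True if the naive regex finds nothing to object to."""
    return NAIVE_BLACKLIST.search(source) is None


plain = r"\input{/etc/passwd}"
obfuscated = r"\csname input\endcsname{/etc/passwd}"

print(naive_filter_accepts(plain))       # the literal \input is caught
print(naive_filter_accepts(obfuscated))  # the \csname form is not
```

Category code changes and macros that assemble names piece by piece offer many more variations, so the restrictions need to be enforced by TeX's own settings and the operating system rather than by pattern matching.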

Summary:

  • Ensure you have an up-to-date TeX installation.
  • Invoke TeX with a timeout that will automatically kill the process if it goes on too long.
  • Run TeX with -no-shell-escape.
  • Run TeX in a safe directory that doesn't have any subdirectories leading to important files.
  • Ensure that both openout_any and openin_any are set to p.
  • If you need to view the generated PDF, make sure that your PDF viewer has JavaScript disabled.

You have unbalanced environments and delimiters: \begin{array} has no \end{array}, and \left. has no matching \right delimiter. Also, load breqn after amsmath, and add lmodern to avoid font size substitution warnings.

\documentclass{article}
\usepackage{graphicx,lmodern}
\usepackage{draftwatermark}
\usepackage{amsmath}
\usepackage{breqn}
\SetWatermarkText{FAST MATH}
\SetWatermarkScale{2}
\SetWatermarkVerCenter{0.6\paperheight}
\SetWatermarkAngle{30}

\begin{document}

\section{Input}
$ \begin{array} { l }  a) A = \{ 2  \end{array}$
\section{Solution}
${a: 0}$

\end{document}
