What are formats in LaTeX and how to manage them?

A format file is just a preloaded set of definitions that speeds up document processing. "pdflatex", for example, is just "pdftex" with the definitions in latex.ltx preloaded. These days, not loading latex.ltx for every document saves only a fraction of a second; when the system was designed, it saved tens of minutes per document.

Related to this, the \patterns instructions that set up language-specific hyphenation can only be used in initex mode (the mode set up for dumping formats), which is why the config file specifies which files to load.


I will restrict my explanations to formats and their handling in TeX Live. MiKTeX has a different approach, I guess.

To add to David's explanation: formats are defined in the so-called TLPOBJs, the stanzas (paragraphs) you can see in texlive.tlpdb (located at PATH/TO/TEXLIVE/2019/tlpkg/texlive.tlpdb). An example:

name aleph
category Package
revision 50602
...
execute AddFormat name=aleph engine=aleph options=*aleph.ini fmttriggers=cm,hyphen-base,knuth-lib,plain
execute AddFormat name=lamed engine=aleph patterns=language.dat options=*lambda.ini fmttriggers=cm,hyphen-base,antomega,lambda,latex,latex-fonts,omega
...

This adds two format definitions. In the TLPDB we use key=value pairs, from which fmtutil.cnf is generated. (Ignore the fmttriggers for now; they are an internal feature for triggering rebuilds.)
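To make the key=value to fmtutil.cnf step concrete, here is a minimal Python sketch. It is not TeX Live's actual code; the function names are made up for illustration. It assumes the fmtutil.cnf column layout of format name, engine, pattern file ("-" when there is none), and options, and it simply drops fmttriggers.

```python
import shlex

PREFIX = "execute AddFormat"


def parse_addformat(line):
    """Split an 'execute AddFormat key=value ...' line into a dict.

    Hypothetical helper for illustration only. shlex is used so that a
    quoted value containing spaces stays in one piece.
    """
    line = line.strip()
    if not line.startswith(PREFIX):
        raise ValueError("not an AddFormat line: " + line)
    fields = {}
    for token in shlex.split(line[len(PREFIX):]):
        # Split on the first '=' only, since values may contain '=' too.
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def to_fmtutil_line(fields):
    """Build a fmtutil.cnf-style line: name engine patterns options.

    Assumption: a missing 'patterns' key is written as '-'.
    fmttriggers is ignored because it is not part of fmtutil.cnf.
    """
    columns = [fields["name"], fields["engine"], fields.get("patterns", "-")]
    if fields.get("options"):
        columns.append(fields["options"])
    return " ".join(columns)


if __name__ == "__main__":
    example = ("execute AddFormat name=lamed engine=aleph "
               "patterns=language.dat options=*lambda.ini "
               "fmttriggers=cm,hyphen-base,antomega,lambda,latex,latex-fonts,omega")
    print(to_fmtutil_line(parse_addformat(example)))
```

For the two aleph entries above, this yields one line per format, with "-" in the patterns column for the aleph format since it loads no pattern file.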

The definitions in turn come from the macro package writers, and we (the TeX Live team) put them into the source files for the tlpobj.

Most of the formats are defined in the respective engine package:

  • the aleph engine is in the aleph package, which defines the aleph and lamed formats
  • the context package provides all kinds of cont-XX formats
  • the cslatex package provides the cslatex and pdfcslatex formats
  • ...
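If you want to see which package defines which formats on your own installation, something like the following Python sketch could be used. It assumes only what is visible in the example stanza: stanzas are separated by blank lines, each has a "name" line, and formats appear on "execute AddFormat" lines. The function name and the second sample package are hypothetical.

```python
import re


def formats_by_package(tlpdb_text):
    """Map package name -> list of format names found in its stanza.

    Illustrative helper, not part of TeX Live. Packages without any
    AddFormat line are left out of the result.
    """
    result = {}
    # Stanzas are assumed to be separated by one or more blank lines.
    for stanza in re.split(r"\n\s*\n", tlpdb_text.strip()):
        package = None
        formats = []
        for line in stanza.splitlines():
            if line.startswith("name "):
                package = line.split(None, 1)[1].strip()
            elif line.startswith("execute AddFormat"):
                match = re.search(r"\bname=(\S+)", line)
                if match:
                    formats.append(match.group(1))
        if package and formats:
            result[package] = formats
    return result


# Shortened sample; 'example-nofmt' is a made-up package with no formats.
SAMPLE = """\
name aleph
category Package
execute AddFormat name=aleph engine=aleph options=*aleph.ini
execute AddFormat name=lamed engine=aleph patterns=language.dat options=*lambda.ini

name example-nofmt
category Package
"""

if __name__ == "__main__":
    print(formats_by_package(SAMPLE))
```

To run it against a real database, read the texlive.tlpdb file into a string and pass that string to the function instead of SAMPLE.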

Some formats, in particular the base LaTeX formats, get special treatment, though.

Concerning the language question: as David said, hyphenation patterns can only be loaded at format creation time. In addition, different engines (binary programs) use different formats of hyphenation pattern definitions, so the configuration file has to say which pattern file to load.