How is the output of XeTeX or LuaTeX tested for fidelity with output from earlier TeX engines?

I suspect that in practice the best test (especially in the case of luatex) is that the LaTeX maintainers regularly run the LaTeX2e and expl3 test suites using all three engines (and also ptex variants in the case of expl3). The l3build test format normalises the log output to some extent to remove known differences, and allows engine-specific versions of the base log files to be stored.

So while we are not directly involved in the development of either luatex or xetex, we can (and do :-) report any issues to the engine developers, often using test releases of texlive, so that things get fixed before the full texlive release.

Not all differences are bugs (and luatex especially is explicitly not designed to be fully compatible), but having the test suite makes it easier to document expected differences and to ensure that the latex macros work in all three cases. The format itself and several packages have variant definitions depending on the engine being used. Notably \showhyphens, the allocation macros such as \newcount, and the initialisation of character properties such as the \lccode lowercase settings are all engine dependent.
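
As a rough illustration of what an engine-dependent definition looks like at the document level, here is a minimal sketch using the iftex package and its \ifLuaTeX and \ifXeTeX conditionals. The macro \enginename is a hypothetical name invented for this example. It is not how the format defines \showhyphens or \newcount internally, only the general pattern of branching on the engine.

```latex
\documentclass{article}
\usepackage{iftex} % provides \ifLuaTeX, \ifXeTeX, \ifPDFTeX

% \enginename is a hypothetical macro, used only for this illustration.
% Each branch supplies the variant definition for one engine.
\ifLuaTeX
  \newcommand\enginename{LuaTeX}
\else
  \ifXeTeX
    \newcommand\enginename{XeTeX}
  \else
    % Fallback branch: pdfTeX or any other engine.
    \newcommand\enginename{pdfTeX or another engine}
  \fi
\fi

\begin{document}
This document was typeset with \enginename.
\end{document}
```

Compiling the same file with pdflatex, xelatex and lualatex in turn should select a different branch each time, which is the kind of per-engine behaviour the test suite has to cover.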


Quoting the excellent UK TeX FAQ:

TeX (and MetaFont and MetaPost) are written in a ‘literate’ programming language called Web which is designed to be portable across a wide range of computer systems. How, then, is a new version of TeX checked?

Of course, any sensible software implementor will have his own suite of tests to check that his software runs: those who port TeX and its friends to other platforms do indeed perform such tests.

Knuth, however, provides a ‘conformance test’ for both TeX (trip) and MetaFont (trap). He characterises these as ‘torture tests’: they are designed not to check the obvious things that ordinary typeset documents, or font designs, will exercise, but rather to explore small alleyways off the main path through the code of TeX. They are, to the casual reader, pretty incomprehensible!

Once an implementation of TeX has passed its trip test, or an implementation of MetaFont has passed its trap test, then it may in principle be distributed as a working version. (In practice, any distributor would test new versions against “real” documents or fonts, too; while trip and trap test bits of pathways within the program, they don’t actually test for any real world problem.)

Here is a link to the documentation for the trip test