Discretionary hyphen destroys kerning and ligature

By default, the control sequence \- is an abbreviation for

\discretionary{-}{}{}

(see p. 95 of the TeXbook). The first argument of \discretionary, the prebreak, specifies what gets typeset immediately before the line break if a break occurs at this point (here: a hyphen character). The second argument, the postbreak, specifies what gets typeset immediately after the line break if a break occurs (here: nothing). The third argument, the nobreak, specifies what gets typeset if no line break occurs (here: nothing or, more precisely and crucially, an empty list, {}).
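To see the three arguments in action, here is a minimal plain TeX sketch. The letters A, B, and C are placeholders chosen only to make each argument visible, and the 1pt line width is just a trick to force a break at the discretionary.

```tex
% Wide line: no break is needed, so only the third argument (C) is typeset.
x\discretionary{A}{B}{C}y

% Very narrow line: TeX has to break at the only available breakpoint,
% so the first argument (A) ends one line and the second (B) starts the next.
\hsize=1pt \parindent=0pt
x\discretionary{A}{B}{C}y
\bye
```

The first paragraph should come out as xCy on one line, the second as xA followed by By on the next line (with overfull box warnings in the log, which are expected here).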

For the sake of specificity, consider the following example: half\-line. This first gets expanded to half\discretionary{-}{}{}line. Suppose the word does not get broken up at the end of a line. What happens next depends on which TeX engine is in use.

  • Assuming pdfLaTeX is in use, what gets typeset is half{}line, and the fl ligature is (correctly!) broken up, since that is what's supposed to happen if f{}l is encountered.

  • In contrast, under LuaLaTeX all text-mode instances of {} are discarded prior to final processing. Therefore, halfline is processed without the {} between f and l, and the fl ligature is (incorrectly in this case) not broken up.

  • What happens under XeLaTeX depends partly on the version of XeLaTeX that's in use on your system. E.g., if you use TeX Live 2018, halfline and half\-line are both typeset with an fl-ligature if no line break occurs between half and line. At some point in the not-too-distant past, though, the fl-ligature was suppressed if half\-line was encountered.

To verify these claims, simply compile the following document under pdfLaTeX, XeLaTeX, and LuaLaTeX:

\documentclass{article}
\begin{document}
halfline half\-line
\end{document}

For "plain" pdfTeX, XeTeX, and LuaTeX, simply run

halfline half\-line 
\bye

To get all three engines to produce the same output, don't write half\-line. Instead, write hal\discretionary{f-}{l}{fl}ine. That way, the fl-ligature will be used whenever no line break occurs.
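If the word occurs more than once, the discretionary can be wrapped in a small macro. The following is only a sketch: the macro name \dfl is made up for this example, and the \parbox with a 1pt width is there just to force the break so that both cases can be inspected.

```latex
\documentclass{article}
% Hypothetical helper: break as "f-" / "l", otherwise typeset "fl"
% as a unit so that the font's fl ligature can form.
\newcommand*{\dfl}{\discretionary{f-}{l}{fl}}
\begin{document}
% Unbroken: compare the fl pair in all three variants.
halfline half\-line hal\dfl ine

% Broken: the extremely narrow box forces a break at the discretionary.
\parbox{1pt}{hal\dfl ine}
\end{document}
```

Compiling this under pdfLaTeX, XeLaTeX, and LuaLaTeX lets you compare the third variant against the first two on your own installation. Note the space after \dfl: it only terminates the macro name and does not appear in the output.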


\discretionary{f-}{i}{fi} seems to work. You can define a macro to make it easier to use:

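% Break as "f-" / "i", otherwise typeset "fi" (ligature preserved)
\def\dfi{\discretionary{f-}{i}{fi}}
% Break as "V-" / "A", otherwise typeset "VA" (kerning preserved)
\def\VA{\discretionary{V-}{A}{VA}}
% Reference line: plain "VA" and "fi" for comparison
VA fi\par
AD\VA NCE Nar\dfi na
\bye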

Not ideal, but if the problem only comes up in a few places, you could work with that.

In any case, hyphenation is in theory not something the user should have to worry about much. Do you have the correct language set in your document? The language comes with many patterns that tell TeX how to break up words. You can also use \hyphenation{wordwithf-ibreak wordwithV-Abreak} to add rules for particular words; this works well (the ligature is kept when the word is not broken, the word is broken correctly otherwise, and you set it up once for the whole document).
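As a rough sketch of the \hyphenation route in LaTeX: the choice of babel with the english option and the word halfline are assumptions made for illustration, so substitute your own language and words.

```latex
\documentclass{article}
\usepackage[english]{babel}
% Exceptions apply to the language that is active when they are declared,
% so declare them after the language has been selected.
\hyphenation{half-line}
\begin{document}
% No \- or \discretionary needed in the running text.
halfline
\end{document}
```

Unlike half\-line under pdfLaTeX, this keeps the fl ligature whenever the word is not broken, which may or may not be what you want for a compound like this one.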

Rant/Question: I don't understand why \- is a primitive. What was the need? Couldn't it be just defined in terms of \discretionary? It's one of those doubts I have about TeX :)