Purpose of “ASCII text, with overstriking” file format

Overstriking is a method used in nroff (see the Troff paper) to offer more typographical possibilities than plain ASCII would allow:

  • bold text (by overstriking the same character)
  • underlined text (by overstriking _)
  • accents and diacritics (e.g. é produced by overstriking e with ´)

and various other symbols, as permitted by the target output device.
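To make the encoding concrete, here is a minimal Python sketch of how bold and underlined text can be built from backspaces. The helper names `bold` and `underline` are made up for illustration, and leaving whitespace alone is a simplification of my own; real nroff output depends on the macros and the output device.

```python
BS = "\b"  # ASCII backspace, 0x08


def bold(text):
    """Strike each non-space character twice: c, backspace, c."""
    return "".join(c if c.isspace() else f"{c}{BS}{c}" for c in text)


def underline(text):
    """Write an underscore, back up, then write the character: _, backspace, c."""
    return "".join(c if c.isspace() else f"_{BS}{c}" for c in text)


if __name__ == "__main__":
    # repr() makes the embedded backspaces visible as \x08.
    print(repr(bold("NAME")))
    print(repr(underline("file")))
```

Writing such strings to a file and opening it with less should show the text in bold or underlined, since less interprets these sequences.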

In the bash source tree, these .0 files are produced directly by nroff, with Makefile rules such as

.1.0:
        $(RM) $@
        -${NROFF} -man $< > $@

You can view such files using less; it will process the overstriking sequences and replace them as appropriate:

less bash.0

Originally nroff's output targeted typewriter-style output devices, which would back up every time they received a backspace character; overstriking would produce the desired visual output. As pointed out by chirlu, striking the same character twice would usually result in a bolder appearance thanks to the inevitable misalignment of the successive strikes; the increase in the amount of ink deposited would also help.

(troff targeted typesetting machines.)


A web search for "backspace" and "overstrike" would get better results.

The file is a manual page, formatted using nroff. Usually files such as bash.0 are simply generated and discarded. In the past, they were saved as preformatted "cat" pages so that the man program did not have to format the page every time it was viewed. Rather than the sources in /usr/share/man/man1, your manual pages would then be read from /usr/share/man/cat1. Read the description of catman for instance.

nroff is the Unix command for formatting manual pages and other files. Back when it was first written, there were several other utilities, each with its own markup language. I've used at least a dozen different ones. But they all solved the problem of printing emphasized text in the same way: using carriage control. Backspaces are just noticeable because they are not used in other plain-text files. Tabs, carriage returns, line-feeds and form-feeds all have a role in plain text files (though form-feeds are far less important than they were originally).

nroff uses underlining to indicate italics and overstriking to represent bold. The technique is dated: it is useful for hard-copy devices where more than one character can be printed in the same position. Very few video terminals do that. In terminfo(5), that would be

   over_strike               os     os   terminal can over-
                                         strike

or more completely:

If the terminal overstrikes (rather than clearing a position when a character is struck over) then it should have the os capability.

In the usual case, the last character written on a given row/column of a video terminal would be all that is shown. nroff organized the output so that an underlined character was written as an underline, a backspace and the actual character. Doing that ensured that terminals without the overstrike feature would print something useful.
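The effect of that ordering can be seen with a small Python sketch that mimics a terminal without the overstrike capability: each cell keeps only the last character written to it. The function name `last_write_wins` is hypothetical, and the model ignores everything except printable characters and backspaces.

```python
def last_write_wins(line):
    """Simulate a display where a new character replaces the old one in a cell."""
    cells = []
    col = 0
    for ch in line:
        if ch == "\b":
            # Backspace moves the cursor left, but never past the first column.
            col = max(col - 1, 0)
        elif col < len(cells):
            cells[col] = ch  # overwrite what was there
            col += 1
        else:
            cells.append(ch)
            col += 1
    return "".join(cells)


if __name__ == "__main__":
    # Underscore first, then the letter: the letter survives.
    print(last_write_wins("_\bf_\bo_\bo"))
    # Letter first, then the underscore: only underscores survive.
    print(last_write_wins("f\b_o\b_o\b_"))
```

With the underscore written first, the readable text is what remains; with the opposite order, the text would be wiped out by underscores. This is also roughly what stripping the formatting (as col -b does) amounts to.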

Among the very few video terminals listed which have the overstrike capability, I see the DEC gt40, which I used for about three years (1976-1979). There was no Unix on that system (it ran RT-11), but I wrote a text formatter, using the same type of overstruck text. Ultimately, I needed hardcopy, and wrote a utility to make that happen — something like col, perhaps — but solving a related problem. The terminal printed very slowly when it had a lot of underlined text, until my program reorganized the text to reduce the amount of switching between forward/backward motion.

With video terminals, there is no need for that. But they do not do overstriking. Instead, we have programs that recognize the overstruck sequences and show real underlines (and bold), or we have groff, which might show colored text instead of underlining (and bold).
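As a rough illustration of what such a program does, the following Python sketch turns overstruck text into ANSI SGR escape sequences (1 for bold, 4 for underline, 0 to reset). It is not how less or any other particular pager is implemented; the function name `overstrike_to_ansi` is invented here, and the rule that an underscore struck over an underscore counts as bold is my own assumption for an ambiguous case.

```python
def overstrike_to_ansi(line):
    """Convert backspace overstriking to ANSI bold/underline escapes."""
    # First pass: collect every character struck at each column.
    cells = []
    col = 0
    for ch in line:
        if ch == "\b":
            col = max(col - 1, 0)
            continue
        if col == len(cells):
            cells.append([])
        cells[col].append(ch)
        col += 1

    # Second pass: choose the visible glyph and its attributes per column.
    out = []
    for strikes in cells:
        glyphs = [c for c in strikes if c != "_"]
        glyph = glyphs[-1] if glyphs else "_"
        codes = []
        if strikes.count(glyph) > 1:
            codes.append("1")  # same character struck twice: bold
        if glyphs and "_" in strikes:
            codes.append("4")  # underscore plus another character: underline
        if codes:
            prefix = "\x1b[" + ";".join(codes) + "m"
            out.append(prefix + glyph + "\x1b[0m")
        else:
            out.append(glyph)
    return "".join(out)


if __name__ == "__main__":
    print(overstrike_to_ansi("N\bNA\bAM\bME\bE  _\bb_\ba_\bs_\bh"))
```

Resetting after every character keeps the sketch short; a real program would more likely merge adjacent characters that share the same attributes.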

Further reading:

  • How do I generate manpages using escape codes for bold, etc.?
  • catman - create or update the pre-formatted manual pages
  • repaginator (for your amusement)

And even earlier, it was a method of printing on golf-ball printers that worked like old typewriters and had a very limited set of characters that they could print. So nroff uses the byte stream of an old teletype printer to represent how the text should look 'on screen'.