Grep: unexpected results when searching for words in heading from man page

If you add a | sed -n l to that tail command, to show non-printable characters, you'll probably see something like:

N\bNA\bAM\bME\bE

That is, each character is written as X Backspace X. On modern terminals, the character ends up being written over itself (as Backspace aka BS aka \b aka ^H is the character that moves the cursor one column to the left) with no difference. But in ancient tele-typewriters, that would cause the character to appear in bold as it gets twice as much ink.
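
As a rough illustration of why grep misses such a heading, here is a minimal sketch that fakes one overstruck line with printf instead of calling man. The helper name bold_name is made up for this example.

```shell
# Emit "NAME" the way nroff renders bold text: each letter, a backspace, the letter again.
# (bold_name is a hypothetical helper, not part of man or roff.)
bold_name() {
  printf 'N\bNA\bAM\bME\bE\n'
}

# The backspaces sit between the letters, so the literal string NAME never occurs.
if bold_name | grep -q 'NAME'; then
  echo "matched"
else
  echo "no match"
fi
```

Piping bold_name through sed -n l should show the same N\bNA\bAM\bME\bE pattern as above.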

Still, pagers like more/less do understand that format to mean bold, so that's still what roff does to output bold text.

Some man implementations call roff in a way that avoids those sequences, or strip them internally with col -b -p -x (as the man-db implementation does, unless the MAN_KEEP_FORMATTING environment variable is set), and don't invoke a pager when they detect that the output is not going to a terminal, so man bash | grep NAME works there. Yours apparently does not.

You can use col -b to remove those sequences (there is another kind as well, _ BS X, used for underline).
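
If col is not at hand, the same stripping can be approximated with sed. This is only a sketch under the assumption that every overstrike is a simple "character, backspace" pair; strip_overstrike is a name invented here, and col -b remains the more thorough tool.

```shell
# Delete every "any character + backspace" pair, keeping the character printed last.
# Covers both bold (X BS X) and underline (_ BS X).
# (strip_overstrike is a hypothetical helper name.)
strip_overstrike() {
  bs=$(printf '\b')   # a literal backspace character
  sed "s/.$bs//g"
}

# Sample input: a bold "NAME" followed by an underlined "bash".
printf 'N\bNA\bAM\bME\bE _\bb_\ba_\bs_\bh\n' | strip_overstrike
```

With that function defined, something like man bash | strip_overstrike | grep NAME would be the equivalent of the col -b pipeline.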

On systems using GNU roff (like GNU or FreeBSD), you can avoid those sequences being used in the first place by making sure the -c, -b and -u options are passed to grotty, which you can do by passing the -P-cbu option to groff.

One way to do that is to create a wrapper script called groff containing:

#! /bin/sh -
exec /usr/bin/groff -P-cbu "$@"

That you put ahead of /usr/bin/groff in $PATH.

With macOS' man (also using GNU roff), you can create a man-no-overstrike.conf with:

NROFF /usr/bin/groff -mandoc -Tutf8 -P-cbu

And call man as:

man -C man-no-overstrike.conf bash | grep NAME

Still with GNU roff, if you set the GROFF_SGR environment variable (or don't set the GROFF_NO_SGR variable, depending on how the defaults have been set at compile time), then grotty (as long as it's not passed the -c option) will use ANSI SGR terminal escape sequences instead of those BS tricks for character attributes. less understands them when called with the -R option.

FreeBSD's man calls grotty with the -c option unless you're asking for colours by setting the MANCOLOR variable (in which case -c is not passed to grotty and grotty reverts to the default of using ANSI SGR escape sequences there).

MANCOLOR=1 man bash | grep NAME

will work there.

On Debian, GROFF_SGR is not the default. If you do:

GROFF_SGR=1 man bash | grep NAME

however, because man's stdout is not a terminal, it takes it upon itself to also pass a GROFF_NO_SGR variable to grotty (I suppose so it can use col -bpx to strip the BS sequences as col doesn't know how to strip the SGR sequences, even though it still does it with MAN_KEEP_FORMATTING) which overrides our GROFF_SGR. You can do instead:

GROFF_SGR=1 MANPAGER='grep NAME' man bash

(in a terminal) to have the SGR escape sequences.

That time, you'll notice that some of those NAMEs do appear in bold on the terminal (and in a less -R pager). If you feed the output to sed -n l (MANPAGER='sed -n /NAME/l'), you'll see something like:

\033[1mNAME\033[0m$

Where \033[1m (ESC followed by [1m) is the sequence to enable bold in ANSI-compatible terminals, and \033[0m the sequence to revert all SGR attributes to the default.

On that text, grep NAME works, as the text does contain NAME, but you could still have problems when looking for text where only part of it is in bold or underlined...
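
For that partially formatted case, one option is to drop the SGR sequences before searching. The following is a minimal sketch that assumes only plain ESC [ ... m sequences are present; strip_sgr is a name chosen for illustration.

```shell
# Remove ANSI SGR sequences: ESC, "[", optional digits and semicolons, then "m".
# (strip_sgr is a hypothetical helper name.)
strip_sgr() {
  esc=$(printf '\033')   # a literal ESC character
  sed "s/$esc\[[0-9;]*m//g"
}

# Sample input: only the first half of the word is in bold.
printf '\033[1mNA\033[0mME\n' | strip_sgr | grep 'NAME'
```

It could be used as a filter inside MANPAGER, for instance from a small wrapper script that runs strip_sgr and then grep.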


If you look at any manual page, you'll notice that the headers are in bold. This is achieved through formatting them with control characters. To be able to grep like you're wanting to, these have to be stripped out.

The col utility may be used for this:

$ man bash | col -b | grep 'NAME'

The -b option has the following description on OpenBSD:

Do not output any backspaces, printing only the last character written to each column position. This can be useful in processing the output of mandoc(1).


On Linux, the col manual (on Ubuntu) doesn't have that last sentence, but the option works in the same way.

On Linux, unsetting the MAN_KEEP_FORMATTING environment variable (or setting it to an empty string) may also help, and will allow you to grep without passing the output of man through col -b.