grep doesn't output until EOF if piped through cat

When grep’s output (GNU grep’s, at least) is not a terminal, it fully buffers that output, which is what causes the behaviour you’re seeing. You can disable this using either GNU grep’s --line-buffered option:

( echo "LINE 1" ; sleep 1 ; echo "LINE 2" ; ) | grep --line-buffered LINE | cat

or the stdbuf utility:

( echo "LINE 1" ; sleep 1 ; echo "LINE 2" ; ) | stdbuf -oL grep LINE | cat

The question "Turn off buffering in pipe" has more on this topic.


Simplified explanation

This is not peculiar to one program: like many utilities, grep switches its standard output between being line buffered and being fully buffered. In the former case, the C library holds output data in memory until either the buffer holding those data is filled or a linefeed character is added to it (or the program ends cleanly), whereupon it calls write() to actually write out the buffer contents. In the latter case, only the in-memory buffer becoming full (or the program ending cleanly) triggers the write().
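The difference between the two modes can be made visible by timestamping each line as it arrives at the end of the pipeline. The following is a minimal sketch, not a rigorous measurement: stamp is a hypothetical helper name, date +%s is assumed to be available (it is on GNU and BSD systems), and the resolution is whole seconds, so the figures are only approximate.

```shell
# stamp: prefix each input line with the whole seconds elapsed since stamp started.
# (Hypothetical helper for illustration; relies on "date +%s".)
stamp() {
    start=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '%ss  %s\n' "$((now - start))" "$line"
    done
}

# grep writes into a pipe, so it is fully buffered:
# both lines tend to arrive together, when grep exits.
( echo "LINE 1"; sleep 2; echo "LINE 2" ) | grep LINE | stamp

# grep flushes after every line:
# "LINE 1" should arrive about two seconds before "LINE 2".
( echo "LINE 1"; sleep 2; echo "LINE 2" ) | grep --line-buffered LINE | stamp
```

In the first pipeline both lines would be expected to carry roughly the same timestamp; in the second they should differ by about the length of the sleep.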

More detailed explanation

The explanation above is the well-known, but slightly wrong, one. In fact, in the GNU C library and the BSD C library, standard output is not line buffered but smart buffered. Standard output is also flushed when reading standard input exhausts its in-memory buffer (of pre-read input), the C library has to call read() to fetch some more input, and it is reading the beginning of a new line. (One reason for this is to prevent deadlock when another program connects itself to both ends of a filter and expects to be able to operate line-by-line, alternating between writing to the filter and reading from it, like "coprocesses" in GNU awk for example.)

C library influence

grep and the other utilities do this (or, more strictly, the C libraries that they use do this, because this is a defined feature of programming in the C language) based upon what they detect their standard output to be. If (and only if) it is not an interactive device, they choose full buffering; otherwise they choose smart buffering. A pipe is not considered to be an interactive device because, at least in the world of Unix and Linux, being an interactive device essentially means that the isatty() call returns true for the relevant file descriptor.
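The same check can be reproduced from the shell, since the POSIX test -t operator reports whether a file descriptor is open on a terminal. This is only an illustration of the decision input, not of what any particular C library does internally; describe_stdout is a hypothetical name, and the report is sent to standard error so that it stays visible when standard output is a pipe.

```shell
# describe_stdout: report (on stderr) whether file descriptor 1 is a terminal.
# "[ -t 1 ]" is the shell-level counterpart of isatty(1).
describe_stdout() {
    if [ -t 1 ]; then
        echo "stdout is a terminal" >&2
    else
        echo "stdout is not a terminal" >&2
    fi
}

describe_stdout          # the answer depends on where this is run from
describe_stdout | cat    # fd 1 is a pipe here, so it is not a terminal
```

A program making the buffering decision described above would pick full buffering in the second case.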

Workarounds to disable full buffering

Some utilities, like grep, have idiosyncratic options such as --line-buffered that change this decision; the option's name, as you can see, is a misnomer. But only a vanishingly small fraction of the filter programs that one could use actually have such an option.

More generally, one can use tools that dig into the specific internals of the C library and change its decision making. These have security problems if the program to be altered is set-UID, are specific to particular C libraries, and indeed are specific to programs written in or layered on top of the C language. Alternatively, one can use tools such as ptybandage that do not change the internals of the program but simply interpose a pseudo-terminal as standard output, so that the decision comes out as "interactive".
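stdbuf, from GNU coreutils, is one tool of the first kind, and it can be applied to a filter that has no buffering option of its own. The sketch below assumes GNU coreutils is installed and that the filter (cut here) does its output through the C library's stdio; first_field is a hypothetical wrapper name used only for illustration.

```shell
# first_field: print the first space-separated field of each input line,
# asking the C library (via stdbuf -oL) to line buffer cut's standard output.
first_field() {
    stdbuf -oL cut -d ' ' -f 1
}

# "alpha" should appear at once, and "beta" about a second later,
# instead of both appearing together when cut exits.
( echo "alpha 1"; sleep 1; echo "beta 2" ) | first_field | cat
```

Because stdbuf works by altering the C library's set-up inside the child process, it has no effect on programs that do their own buffering or bypass stdio altogether.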

Further reading

  • https://unix.stackexchange.com/a/407472/5132
  • https://unix.stackexchange.com/a/249801/5132

Use

grep --line-buffered

to make grep not buffer more than one line at a time.

Tags: Bash, Grep, Pipe