Does tee slow down pipelines?

Yes, it slows things down. And it does effectively have a queue of unwritten data, though that queue is actually maintained by the kernel. All programs have that, unless they explicitly request otherwise.

For example, here is a trivial pipe using pv, which is nice because it displays transfer rate:

$ pv -s 50g -S -pteba /dev/zero | cat > /dev/null 
  50GiB 0:00:09 [ 5.4GiB/s] [===============================================>] 100%

Now, let's add a tee in there, not even writing an extra copy—just forwarding it along:

$ pv -s 50g -S -pteba /dev/zero | tee | cat > /dev/null 
  50GiB 0:00:20 [2.44GiB/s] [===============================================>] 100%            

So, that's less than half the throughput, and tee wasn't even writing a copy anywhere! That's the overhead of tee internally copying STDIN to STDOUT. (Interestingly, putting a second pv in there instead stays at 5.19GiB/s, so pv is substantially faster than tee. pv uses splice(2); tee likely does not.)

Anyway, let's see what happens if I tell tee to write to a file on disk. It starts out fairly fast (~800MiB/s), but as it goes on it keeps slowing down, ultimately to ~100MiB/s, which is basically 100% of the disk write bandwidth. (The fast start is due to the kernel caching the disk writes; the slowdown to disk write speed is the kernel refusing to let that cache grow without bound.)
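The exact command for that disk run isn't shown. Here is a minimal sketch of a pipeline with the same shape, assuming `head -c` as a stand-in data source so that it runs without pv. The helper name `tee_to_disk` and the scratch file are made up for illustration.

```shell
# Hypothetical helper (not from the answer): push SIZE bytes of zeros through
# tee, which writes one copy to FILE and forwards another down the pipe.
# Prints the number of bytes that reached the end of the pipeline.
tee_to_disk() {
    size=$1
    file=$2
    head -c "$size" /dev/zero | tee "$file" | wc -c | tr -d ' '
}

out=$(mktemp)                               # scratch file; assumes mktemp exists
tee_to_disk $((16 * 1024 * 1024)) "$out"    # small size so this finishes quickly
wc -c < "$out" | tr -d ' '                  # size of the on-disk copy
rm -f "$out"
```

Both counts should match, since tee forwards everything it writes to the file. To actually watch the rate fall as the cache fills, you would swap the `head -c` source for the `pv -s 50g -S -pteba /dev/zero` from the earlier commands and point the file at the disk you care about; 16 MiB is far too small to get past the cache.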

Does it matter?

The above is a worst case: it uses a pipe to spew data as fast as possible. The only real-world use I can think of like this is piping raw YUV data to/from ffmpeg.

When you're sending data at slower rates (because you're processing it, etc.), the effect is going to be much less significant.


Nothing surprising here, after all

POSIX says,

> DESCRIPTION
>
> The tee utility shall copy standard input to standard output, making a copy in zero or more files. The tee utility shall not buffer output.

And also that

> RATIONALE
>
> The buffering requirement means that tee is not allowed to use ISO C standard fully buffered or line-buffered writes. It does not mean that tee has to do 1-byte reads followed by 1-byte writes.

So, without going further into the rationale, tee will probably read and write at most as many bytes as fit into your pipe buffer at a time, flushing the output on every single write.
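One rough way to see the "shall not buffer output" behavior from the shell: a writer emits one line, sleeps, then emits another, and the reader notes how long the first line took to come through tee. This is only a sketch. The function name `first_line_delay` and the default 3-second pause are arbitrary choices for illustration, and `date +%s` has one-second resolution.

```shell
# Hypothetical demo (not from the answer): report how many seconds pass before
# the first line written into the pipe comes out of tee.
first_line_delay() {
    pause=${1:-3}                 # seconds the writer waits between lines
    start=$(date +%s)
    { echo first; sleep "$pause"; echo second; } | tee | {
        read -r line              # returns as soon as tee forwards "first"
        now=$(date +%s)
        echo "$line $((now - start))"
        cat > /dev/null           # drain the rest so the writer finishes cleanly
    }
}

first_line_delay
```

On a tee that follows the POSIX text, the reported delay should be well under the pause, because each chunk is written out as soon as it is read. A copier using fully buffered stdio would typically hold the line until its buffer filled or the input ended.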

And yes, depending on the application, this can be rather inefficient, so feel free to simply remove or comment out any of these lines:
https://github.com/coreutils/coreutils/blob/master/src/tee.c#L208
https://github.com/coreutils/coreutils/blob/master/src/tee.c#L224
