IO redirection and the head command

When the shell gets a command line like: command > file.out the shell itself opens (and maybe creates) the file named file.out. The shell then makes file descriptor 1 (standard output) refer to the file descriptor it got from the open. That's how I/O redirection works: every process knows about file descriptors 0, 1 and 2 (standard input, standard output and standard error).

The hard part about this is how to open file.out. Most of the time, you want file.out opened for write at offset 0 (i.e. truncated), and this is what the shell did for you. It truncated .hgignore, opened it for write, dup'ed the file descriptor to 1, then exec'ed head. By the time head read .hgignore, it was already empty. Instant file clobbering.
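To see the ordering for yourself, here is a minimal sketch. The scratch directory and the file name demo.txt are made up for illustration, and it assumes mktemp is available:

```shell
# Scratch directory and file name are illustrative only.
dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$dir/demo.txt"

# The shell truncates demo.txt before head ever runs,
# so head reads an empty file and writes nothing.
head -n 1 "$dir/demo.txt" > "$dir/demo.txt"

# Count the bytes left (tr strips the padding some wc versions add).
size=$(wc -c < "$dir/demo.txt" | tr -d ' ')
echo "bytes left: $size"    # prints: bytes left: 0
```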

In bash, you can run set -o noclobber to change this behavior: the shell then refuses to overwrite an existing file with >.
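A small sketch of what noclobber does, again with a made-up scratch file. The subshell is only there so the failed redirection can be tested without affecting the rest of the script:

```shell
# File names here are illustrative only.
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/demo.txt"

set -o noclobber            # same as: set -C

# With noclobber set, > refuses to overwrite an existing regular file.
# The redirection fails, echo is never run, and the file is untouched.
if ! (echo replaced > "$dir/demo.txt") 2>/dev/null; then
    echo "refused to clobber"
fi

# >| overrides noclobber for a single redirection.
echo replaced >| "$dir/demo.txt"

set +o noclobber            # restore the default
```

Note that noclobber only turns the accident into an error. It does not make head -1 file > file do what you wanted.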


I think Bruce answers what's going on here with the shell pipeline.

One of my favorite little utilities is the sponge command from moreutils. It solves exactly this problem by "soaking up" all available input before it opens the target output file and writes the data. It lets you write pipelines exactly the way you expected to:

$ head -1 .hgignore | sponge .hgignore

The poor man's solution is to send the output to a temporary file and then, once the pipeline is done (for example, in the next command you run), move the temp file back to the original file location.

$ head -1 .hgignore > .hgignore.tmp
$ mv .hgignore{.tmp,}
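A slightly more defensive sketch of the same idea, assuming mktemp is available (it is not in POSIX, but GNU and BSD systems ship it). The file names are made up for illustration. Chaining with && means the original is only replaced if head succeeded:

```shell
# Scratch directory and file name are illustrative only.
dir=$(mktemp -d)
file="$dir/demo.txt"
printf 'one\ntwo\nthree\n' > "$file"

# Create the temp file next to the target so mv is a rename
# on the same filesystem rather than a copy.
tmp=$(mktemp "$file.XXXXXX")

# Only replace the original if head succeeded.
head -n 1 "$file" > "$tmp" && mv "$tmp" "$file"

cat "$file"    # prints: one
```

One thing to watch: the file ends up with the temp file's permissions and inode, not the original's.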

In

head -n 1 file > file

file is truncated before head is started, but if you write it:

head -n 1 file 1<> file

it's not, because <> opens file in read-write mode without truncating it. However, when head finishes writing, it doesn't truncate the file either, so the line above would be a no-op (head would just rewrite the first line over itself and leave the other ones untouched).
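A quick way to check the no-op claim, using a made-up scratch file (mktemp assumed available):

```shell
# Scratch directory and file name are illustrative only.
dir=$(mktemp -d)
file="$dir/demo.txt"
printf 'one\ntwo\nthree\n' > "$file"

# 1<> opens the file read-write on fd 1 without truncating it.
# head overwrites "one\n" with "one\n" and nothing else changes.
head -n 1 "$file" 1<> "$file"

cat "$file"    # still prints all three lines
```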

However, after head has returned and while the fd is still open, you can call another command that does the truncate.

For instance:

{ head -n 1 file; perl -e 'truncate STDOUT, tell STDOUT'; } 1<> file

What matters here is the truncate; head just moves the cursor for fd 1 to the position right after the first line of the file. It does rewrite the first line, which we didn't need it to, but that's not harmful.

With a POSIX head, we could actually get away without rewriting that first line:

{ head -n 1 > /dev/null
  perl -e 'truncate STDIN, tell STDIN'
} <> file

Here, we're using the fact that head moves the cursor position in its stdin. While head would typically read its input in big chunks to improve performance, POSIX would require it (where possible) to seek back to just after the first line if it had read beyond it. Note however that not all implementations do it.

Alternatively, you can use the shell's read command instead in this case:

{ read -r dummy; perl -e 'truncate STDIN, tell STDIN'; } <> file