Overwriting a running executable or .so

It depends on the kernel, and on some kernels it might depend on the type of executable, but I think all modern systems return ETXTBSY ("text file busy") if you try to open a running executable for writing, or to execute a file that's open for writing. Documentation suggests that it's always been the case on BSD, but it wasn't the case on early Solaris (later versions did implement this protection), which matches my memory. It's been the case on Linux since forever, or at least since 1.0.
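
If you want to check this on your own system, here is a minimal sketch. It assumes a sleep binary is on the PATH and that the temporary directory allows execution; the helper name is made up for illustration. It copies the binary, runs the copy, and tries to open the running copy for writing. On a kernel that enforces the protection I'd expect the result to be ETXTBSY, and None means the open was allowed.

```python
import errno
import os
import shutil
import subprocess
import tempfile


def errno_on_write_open_of_running_copy():
    """Return the errno raised when opening a running executable for
    writing, or None if the kernel allowed the open."""
    sleep_path = shutil.which("sleep")  # assumption: a sleep binary exists
    if sleep_path is None:
        raise RuntimeError("no sleep binary found to copy")
    workdir = tempfile.mkdtemp()
    try:
        # Work on a private copy so no system binary is ever touched.
        exe = os.path.join(workdir, "sleep")
        shutil.copy(sleep_path, exe)
        # Popen returns only once the child has actually exec'ed the file.
        proc = subprocess.Popen([exe, "30"])
        try:
            try:
                fd = os.open(exe, os.O_WRONLY)
            except OSError as e:
                return e.errno
            os.close(fd)
            return None
        finally:
            proc.terminate()
            proc.wait()
    finally:
        shutil.rmtree(workdir, ignore_errors=True)


if __name__ == "__main__":
    result = errno_on_write_open_of_running_copy()
    # Print the symbolic name (e.g. ETXTBSY) when there is one.
    print(errno.errorcode.get(result, result))
```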

What goes for executables may or may not go as well for dynamic libraries. Overwriting a dynamic library causes exactly the same problem that overwriting an executable does: instructions will suddenly be loaded from the same old address in the new file, which probably has something completely different there. But the protection is in fact not in place everywhere. In particular, on Linux, the dynamic loader opens a library with the ordinary open system call, using the same flags as for any data file, and then maps it into memory. Linux happily allows you to rewrite the library file even though a running process might load code from it at any time.
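
To see why an in-place overwrite is dangerous for mapped code, here is a rough sketch that maps a small data file privately and read-only (only an approximation of what a dynamic loader does with library text; the function name and file contents are made up), then overwrites the file through a second, ordinary open. POSIX leaves it unspecified whether the mapping sees the change. As far as I know, on Linux it does for pages the process has not modified, so the function would return the new bytes there.

```python
import mmap
import os
import tempfile


def mapped_view_after_overwrite():
    """Map a file privately, overwrite it in place, and return what the
    existing mapping shows afterwards."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"old code")
        # Private, read-only mapping of the whole file (length 0 = all of it).
        view = mmap.mmap(fd, 0, flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ)
        # Overwrite the same inode in place, without truncating it first
        # (shrinking a mapped file risks SIGBUS on access).
        with open(path, "r+b") as f:
            f.write(b"NEW code")
        seen = view[:8]
        view.close()
        return seen
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(mapped_view_after_overwrite())
```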

Most kernels allow removing and renaming files while they're being executed, just like they allow removing and renaming files while they're open for reading or writing. Just like an open file, a file that's removed while it's being executed will not be actually removed from the storage medium as long as it is in use, i.e. until the last instance of the executable exits. Linux and *BSD allow it, but Solaris and HP-UX don't.

Removing a file and writing a new file by the same name is perfectly safe: the association between the code to load and the open (or being-executed) file that contains the code goes by the open file itself (the inode behind the file descriptor), not the file name. It has the additional benefit that it can be done atomically, by writing to a temporary file then moving that file into place (the rename system call atomically replaces an existing destination file with the source file, provided both are on the same filesystem). It's much better than remove-then-open-write since it doesn't temporarily put an invalid, partially-written executable in place.
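
A minimal sketch of the write-to-temp-then-rename pattern, assuming the temporary file can be created in the same directory as the target (so both are on the same filesystem). The function name, the ".tmp-" prefix and the 0o755 mode are my own choices for illustration.

```python
import os
import tempfile


def replace_atomically(target, data, mode=0o755):
    """Write data to a temporary file next to target, then rename it over
    target. Readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(target))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the contents to disk before renaming
        os.chmod(tmp, mode)
        os.rename(tmp, target)  # atomic replacement within one filesystem
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

A process that already has the old file open (or is executing it) keeps the old contents, while anything that opens the name afterwards gets the new file.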

Whether cc and ld overwrite their output file, or remove it and create a new one, depends on the implementation. GCC (at least modern versions) and Clang do the latter, in both cases by calling unlink on the target if it exists then open to create a new file. (I wonder why they don't do write-to-temp-then-rename.)

I don't recommend depending on this behavior except as a safeguard, since it doesn't work on every system (it may work on every modern system for executables, but not for shared libraries), and common toolchains don't do things in the best way. In your build scripts, always generate files under a temporary name, then move them into place, unless you know the underlying tool does this.

What generally happens if a file is removed while it has an open file handle is that its directory entry disappears immediately, but the file itself is only deleted once the last file handle is closed. In the meantime the file no longer appears in directory listings (for instance), but it does show up in e.g. lsof output, marked as a deleted-but-in-use file.

The output of lsof below is trimmed for brevity and clarity:

$ cat - >> foo &
[1] 30779
$ lsof | grep 30779
cat       30779                  ghoti    1w      REG      252,0        0    262155 /home/ghoti/foo

[1]+  Stopped                 cat - >> foo
$ rm foo
$ ls foo
ls: cannot access 'foo': No such file or directory
$ lsof | grep 30779
cat       30779                  ghoti    1w      REG      252,0        0    262155 /home/ghoti/foo (deleted)

If I fg, I can still write to the (deleted) foo with the still-running cat (and indeed I can recover the file from /proc/30779/fd/1 should I need to, so long as I do so while cat still has the file open).
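
The same thing can be shown without lsof. This is a small sketch (the function name is made up) that removes a file while holding a descriptor to it, then keeps writing and reading through that descriptor. On the usual local Linux filesystems I'd expect the link count it reports to be 0 once the name is gone.

```python
import os
import tempfile


def write_after_unlink():
    """Unlink a file while it is open, then keep using the descriptor."""
    fd, path = tempfile.mkstemp()  # opened read-write
    try:
        os.unlink(path)
        gone = not os.path.exists(path)  # the name is gone right away
        os.write(fd, b"still here")      # ...but the file is still usable
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)
        links = os.fstat(fd).st_nlink    # number of names left for the inode
        return gone, data, links
    finally:
        os.close(fd)  # only now can the storage actually be released


if __name__ == "__main__":
    print(write_after_unlink())
```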

Answering my own question (or really, clarifying the question somewhat, and then explaining the answer to the clarified question):

My question was sort of, "I never see ETXTBSY any more. Is it still even an error? Or are modern kernels letting you overwrite running executables without complaint, and (somehow) without breaking the running executables?"

I was beginning to seriously suspect that modern kernels were implementing some kind of fancy copy-on-write semantics when writing to running executables.

But that wasn't it. ETXTBSY is still definitely an error.

The answer to my confusion is simply that writes to running executables hardly ever come up in practice. If you move a new executable into place (and the old one is still running), you're almost never actually overwriting it; you're almost always removing and replacing it. If you're using mv, you're removing and replacing it. If you're using install, or dpkg -i, or anything like that, you're removing and replacing it. Only if for some reason you tried to use cp would you be trying to overwrite it, and running the risk of getting ETXTBSY if the old one were still running.
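
One way to tell the two cases apart is to look at the inode number. In this sketch (both function names are made up), a cp-style overwrite truncates and rewrites the existing file and so keeps the inode, while an mv-style replacement renames a separately written file over the name and so ends up with a different one.

```python
import os
import tempfile


def inode_after_overwrite(path, data):
    """cp-style: truncate and rewrite the existing file in place."""
    with open(path, "wb") as f:  # O_WRONLY | O_CREAT | O_TRUNC
        f.write(data)
    return os.stat(path).st_ino


def inode_after_replace(path, data):
    """mv-style: write a new file alongside, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.rename(tmp, path)
    return os.stat(path).st_ino
```

Only the first case touches the file that a running process is executing, which is why only that case can run into ETXTBSY.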

And then, due to the quiet change to ld, running cc -o on top of a running executable now also falls into the "removing and replacing" category.
