What happens when a file that is 100% paged into the page cache gets modified by another process

Continuous release then replaces /apps/EXE with a brand new executable.

This is the important part.

The way a new file is released is by creating a new file (e.g. /apps/EXE.tmp.20190907080000), writing the contents, setting permissions and ownership, and finally rename(2)ing it to the final name /apps/EXE, replacing the old file.
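
If you want to see that sequence spelled out, here is a rough C sketch of it. The paths are the ones from the question; the payload, the mode bits and the minimal error handling are just placeholders for illustration (a real release tool would also fchown(2) the file and handle partial writes):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    const char *tmp = "/apps/EXE.tmp.20190907080000"; /* temporary name */
    const char *dst = "/apps/EXE";                    /* final name */
    const char payload[] = "\177ELF...";              /* stand-in for the new binary */

    int fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0755); /* create with final permissions */
    if (fd < 0) { perror("open"); return 1; }
    if (write(fd, payload, sizeof payload) < 0) { perror("write"); return 1; }
    if (fsync(fd) < 0) { perror("fsync"); return 1; }       /* make sure the contents hit disk */
    close(fd);

    /* rename(2) atomically points the name /apps/EXE at the new inode.
     * Anyone who already has the old file open keeps the old inode. */
    if (rename(tmp, dst) < 0) { perror("rename"); return 1; }
    return 0;
}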

The result is that the new file has a new inode number (which means, in effect, it's a different file.)

And the old file keeps its own inode number, which is actually still around even though the file name no longer points to it (in fact, there are no file names pointing to that inode anymore.)

So, the key here is that when we talk about "files" in Linux, we're most often really talking about "inodes" since once a file has been opened, the inode is the reference we keep to the file.
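
You can make that visible by comparing fstat(2) on a descriptor you already hold with stat(2) on the path. A small sketch, assuming you replace /apps/EXE from another shell while the program waits:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("/apps/EXE", O_RDONLY);    /* hold a reference to the current inode */
    if (fd < 0) { perror("open"); return 1; }

    struct stat held, named;
    fstat(fd, &held);                        /* the inode our descriptor refers to */

    puts("replace /apps/EXE from another shell, then press Enter");
    getchar();

    stat("/apps/EXE", &named);               /* the inode the name points to now */
    printf("fd inode: %ju, path inode: %ju\n",
           (uintmax_t)held.st_ino, (uintmax_t)named.st_ino);
    /* After a rename(2)-style replacement the two numbers differ:
     * the open descriptor is tied to the old inode, not to the name. */
    close(fd);
    return 0;
}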

Assumption 1: I assume that process P (and anyone else with a file descriptor referencing the old executable) will continue to use the old, in memory /apps/EXE without an issue, and any new process which tries to exec that path will get the new executable.

Correct.

Assumption 2: I assume that if not all pages of the file are mapped into memory, that things will be fine until there is a page fault requiring pages from the file that have been replaced, and probably a segfault will occur?

Incorrect. The old inode is still around, so page faults from the process using the old binary will still be able to find those pages on disk.
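
A small experiment makes this concrete: mmap(2) a file, rename(2) another file over its name, then touch the mapping. The access is satisfied from the old inode (from the page cache or from disk), and there is no SIGSEGV or SIGBUS. The file names below are made up for the example:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Create a small file and map it, much like the kernel maps a binary's text. */
    int fd = open("data.old", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    const char msg[] = "contents of the original inode\n";
    write(fd, msg, sizeof msg);

    char *map = mmap(NULL, sizeof msg, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);                                /* the mapping keeps its own reference */

    /* Replace the name with a completely different file, rename(2)-style. */
    int nfd = open("data.new", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    write(nfd, "something else entirely\n", 24);
    close(nfd);
    rename("data.new", "data.old");

    /* Faulting the mapping in still finds the old inode's data. */
    printf("mapped contents: %s", map);       /* prints the original text, no crash */
    munmap(map, sizeof msg);
    return 0;
}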

You can see some effects of this by looking at the /proc/${pid}/exe symlink (or, equivalently, lsof output) for the process running the old binary, which will show /apps/EXE (deleted) to indicate the name is no longer there but the inode is still around.

You can also see that the disk space used by the binary will only be released after the process dies (assuming it's the only process with that inode open.) Check the output of df before and after killing the process; you'll see it drop by the size of that old binary you thought wasn't around anymore.

BTW, this is not only the case with binaries, but with any open files. If you open a file in a process and remove the file, it will be kept on disk until that process closes the file (or dies.) Similarly to how hard links keep a count of how many names point to an inode on disk, the filesystem driver (in the Linux kernel) keeps a count of how many references exist to that inode in memory, and will only release the inode from disk once all references from the running system have been released as well.
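
You can reproduce that reference counting with a few lines of C: open a file, unlink(2) its only name, and the data stays readable through the descriptor (and keeps occupying disk space) until the descriptor is closed. The file name is made up:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = open("scratch.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    write(fd, "still here\n", 11);

    /* Remove the last name pointing at the inode.  The on-disk link count
     * drops to zero, but our descriptor keeps the in-memory reference alive. */
    unlink("scratch.txt");

    char buf[32];
    ssize_t n = pread(fd, buf, sizeof buf, 0);    /* still readable */
    if (n > 0)
        write(STDOUT_FILENO, buf, (size_t)n);

    close(fd);   /* only now can the filesystem free the inode and its blocks */
    return 0;
}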

Question 1: If you mlock all of the pages of the file with something like vmtouch, does that change the scenario?

This question is based on the incorrect assumption 2 that not locking the pages will cause segfaults. It won't.

Question 2: If /apps/EXE is on a remote NFS, would that make any difference? (I assume not)

It's meant to work the same way and most of the time it does, but there are some "gotchas" with NFS.

Sometimes you can see the artifacts of deleting a file that's still open over NFS (it shows up as a hidden .nfsXXXX file in that directory.)

You also have ways to assign device numbers to NFS exports (e.g. with the fsid= export option), to make sure those won't get "reshuffled" when the NFS server reboots.

But the main idea is the same: the NFS client driver still uses inodes and will try to keep files around (on the server) while the inode is still referenced.


Assumption 2: I assume that if not all pages of the file are mapped into memory, that things will be fine until there is a page fault requiring pages from the file that have been replaced, and probably a segfault will occur?

No, that will not happen, because the kernel will not let you open for writing or replace anything inside a file which is currently being executed. Such an action will fail with ETXTBSY [1]:

cp /bin/sleep sleep; ./sleep 3600 & echo none > ./sleep
[9] 5332
bash: ./sleep: Text file busy

When dpkg, etc. updates a binary, it doesn't overwrite it, but uses rename(2), which simply points the directory entry to a completely different file, and any processes which still have mappings or open handles to the old file will continue to use it without problems.
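
A C sketch of the difference, assuming ./sleep is a copy of /bin/sleep that is currently running as in the example above: opening the executable for writing is refused with ETXTBSY, while rename(2)ing another file over its name goes through.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Attempt 1: modify the running executable in place. */
    int fd = open("./sleep", O_WRONLY);
    if (fd < 0 && errno == ETXTBSY)
        puts("open for write: ETXTBSY (kernel refuses in-place modification)");
    else if (fd >= 0)
        close(fd);

    /* Attempt 2: replace the name, the way dpkg does.  This succeeds;
     * the already-running process keeps using the old inode. */
    int nfd = open("./sleep.new", O_WRONLY | O_CREAT | O_TRUNC, 0755);
    if (nfd < 0) { perror("open new"); return 1; }
    write(nfd, "#!/bin/sh\necho replaced\n", 24);
    close(nfd);
    if (rename("./sleep.new", "./sleep") == 0)
        puts("rename over the running executable: succeeded");
    return 0;
}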

[1] the ETXTBSY protection is not extended to other files which can also be considered "text" (= live code / executable): shared libraries, Java classes, etc; modifying such a file while mapped by another process will cause the process to crash. On Linux, the dynamic linker dutifully passes the MAP_DENYWRITE flag to mmap(2), but make no mistake -- it has no effect whatsoever. Example:

$ cc -xc - <<<'void lib(){}' -shared -o lib.so
$ cc -Wl,-rpath=. lib.so -include unistd.h -xc - <<<'
   extern void lib();
   int main(){ truncate("lib.so", 0); lib(); }
'
$ ./a.out
Bus error

filbranden's answer is correct, assuming the continuous release process does proper atomic replacement of files via rename. If it doesn't, but instead modifies the file in place, things are different. However, your mental model is still mistaken.

There is no possibility of things getting modified on disk and being inconsistent with the page cache, because the page cache is the canonical version and the one that's modified. Any writes to a file take place through the page cache. If it's already present there, the existing pages are modified. If it's not yet present, attempts to modify a partial page will cause the whole page to be cached, followed by modification as if it were already cached. Writes that span a whole page or more can (and almost surely do) optimize out the read step paging them in. In any case, there's only one canonical modifiable version of a file ever(*) in existence, the one in the page cache.

(*) I slightly lied. For NFS and other remote filesystems, there may be more than one, and they typically (depending on which one and what mount and server-side options are used) don't correctly implement atomicity and ordering semantics for writes. That's why a lot of us consider them fundamentally broken and refuse to use them for situations where there will be writes concurrent with use.
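
To see the single-canonical-copy point from the previous paragraph in action on a local filesystem, you can map a file MAP_SHARED and then modify it with an ordinary write(2): the change is visible through the mapping at once, because both paths operate on the same page cache pages. A minimal sketch (file name made up):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("cachedemo.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    write(fd, "old old old\n", 12);

    /* A shared mapping is a window onto the file's page cache pages. */
    char *map = mmap(NULL, 12, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    pwrite(fd, "new", 3, 0);          /* modify the file through the write path */

    /* The mapping reflects the change immediately: write(2) modified the
     * very pages the mapping points at. */
    printf("%.12s", map);             /* prints "new old old" */

    munmap(map, 12);
    close(fd);
    return 0;
}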