Why do read-only memory-mapped regions have dirty pages?

A dirty page does not necessarily require a write-back. A dirty page is one that was written to since the kernel last marked it as clean. The data doesn't always need to be saved back into the original file.

The pages here are private, not shared, so they would never be saved back into the original file. (A dirty page that had to be written back to a read-only file would be impossible.) If such a page needs to be removed from RAM, it is saved in swap instead.
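To make the private-versus-shared point concrete, here is a minimal sketch, assuming a Unix system and Python's standard `mmap` module. The helper name `private_write_leaves_file_untouched` is made up for this illustration. It maps a file that was opened read-only as a private (copy-on-write) mapping, writes to the mapping, and then compares what the mapping holds with what the file holds.

```python
import mmap
import os
import tempfile


def private_write_leaves_file_untouched():
    """Write through a private mapping of a read-only file descriptor.

    Returns (bytes seen through the mapping, bytes still in the file).
    """
    page = mmap.PAGESIZE
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"A" * page)
        tmp.flush()

        # The descriptor is read-only, so nothing can be written back through it.
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            # ACCESS_COPY requests a private, copy-on-write mapping:
            # writes change memory only, never the underlying file.
            with mmap.mmap(fd, page, access=mmap.ACCESS_COPY) as m:
                m[0:5] = b"dirty"   # this page is now private and dirty
                seen = m[0:5]
        finally:
            os.close(fd)

        # Re-read the file itself to see what is on disk.
        with open(tmp.name, "rb") as f:
            on_disk = f.read(5)
    return seen, on_disk
```

The modified page exists only in the process's memory; the file keeps its original contents, which is why swap is the only place such a page could be evicted to.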

Pages that are read-only, private and dirty, but within the range of a memory-mapped file, are typically data pages that contain constants that need to be initialized at run time, but don't change after they have been initialized. For example, they may contain static data that embeds pointers; the pointer values depend on the address at which the program or library is mapped, so they have to be computed after the program has started, while the page is still read-write. After the pointers have been computed, the contents of the page won't ever change in this instance of the program, so the page can be changed to read-only. See “Hunting Down Dirty Memory Pages” by stosb for an example with code fragments.
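The initialize-then-protect sequence can be imitated by hand. The following is only a sketch under stated assumptions, not what a dynamic loader actually runs: it uses an anonymous private page instead of a file-backed one, stores the page's own address as a stand-in for a relocated pointer, and calls the C library's `mprotect` through `ctypes`. The name `init_then_protect` is hypothetical.

```python
import ctypes
import mmap
import struct


def init_then_protect():
    """Fill a private page at run time, then make it read-only.

    Returns (address that was stored, value read back after mprotect).
    """
    # Assumption: the C library is reachable via CDLL(None), as on Linux.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
    libc.mprotect.restype = ctypes.c_int

    page = mmap.PAGESIZE
    ptr_size = struct.calcsize("P")
    m = mmap.mmap(-1, page,
                  flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)
    try:
        # Find where the page was mapped; this is only known at run time.
        view = ctypes.c_char.from_buffer(m)
        addr = ctypes.addressof(view)
        del view  # release the buffer export so m.close() can succeed

        # "Relocation": store an address-dependent value, dirtying the page.
        m[:ptr_size] = struct.pack("P", addr)

        # The contents are final, so drop write permission.
        if libc.mprotect(addr, page, mmap.PROT_READ) != 0:
            raise OSError(ctypes.get_errno(), "mprotect failed")

        # Reading is still allowed; the page is now read-only and dirty.
        (stored,) = struct.unpack("P", m[:ptr_size])
    finally:
        m.close()
    return addr, stored
```

After the `mprotect` call the page is in exactly the state described above: private, modified since it was mapped, and no longer writable.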

You may, more rarely, see read-only, executable, private, dirty pages; these happen with some linkers that mix code and data more freely, or with just-in-time compilation.


In addition to the cases Gilles lists:

When a process forks, the kernel may mark all of its dirty pages as read-only, and they will be shared between the parent and child. When one of the processes writes to such a page, a page fault occurs, and the kernel copies the page and marks the copy writable. This saves the work of copying pages that ultimately are not modified again by either process. (Note that in this situation, the pages are marked read-only in the hardware but are known by the kernel to be writable.)
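The visible effect of this copy-on-write behaviour can be checked from user space. The sketch below assumes a Unix system with `os.fork`; the function name `fork_copy_on_write` is invented for the example. It shows only the outcome (the parent's data is unaffected by the child's write), not the page-table changes themselves.

```python
import mmap
import os


def fork_copy_on_write():
    """Dirty a private page, fork, and let the child overwrite it.

    Returns (child_succeeded, bytes the parent still sees).
    """
    page = mmap.PAGESIZE
    m = mmap.mmap(-1, page,
                  flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)
    m[:6] = b"parent"          # dirty the page before forking

    pid = os.fork()
    if pid == 0:
        # Child: this write typically triggers the copy, so only the
        # child's private copy of the page changes.
        code = 1
        try:
            m[:6] = b"child!"
            code = 0 if m[:6] == b"child!" else 1
        finally:
            os._exit(code)     # leave without running the parent's cleanup

    # Parent: wait for the child, then look at our own view of the page.
    _, status = os.waitpid(pid, 0)
    child_ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    parent_view = m[:6]
    m.close()
    return child_ok, parent_view
```

The child sees its own write while the parent still reads `parent`, consistent with the page having been shared until the first write and copied at that point.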