How does memory mapping a file have significant performance increases over the standard I/O system calls?

Memory mapping a file avoids the buffer copying that happens with read() and write() calls. Calls to read() and write() pass a pointer to a buffer in the process's address space where the data is stored; the kernel has to copy the data to/from those locations. Using mmap() maps the file into the process's address space, so the process can address the file's contents directly and no copies are required.

There is also no system-call overhead when accessing a memory-mapped file after the initial mmap(), provided the file is resident in memory. If a page of the mapped file is not in memory, an access generates a page fault and requires the kernel to load the page. Reading a large block with read() can be faster than mmap() in such cases, if mmap() would generate a significant number of faults to read the file. (It is possible to advise the kernel in advance with madvise() so that it may load pages before they are accessed.)

For more details, there is a related question on Stack Overflow: mmap() vs. reading blocks


First, in most IO operations the characteristics of the underlying storage hardware dominate the performance. A poorly-configured RAID5 array of twenty-nine S-L-O-W 5400 rpm SATA disks on a slow, memory-starved system using S/W RAID with mismatched block sizes and misaligned file systems is going to give you poor performance compared to a properly configured and aligned SSD RAID 1+0 on a high-performance controller, despite any software tuning you might try.

But the only way mmap() can be significantly faster is if you read the same data more than once and the data you read doesn't get paged out between reads because of memory pressure.

Memory map steps:

  1. System call to create virtual mappings - very expensive
  2. Process accesses memory for the first time, causing a page fault - expensive (and may need to be repeated if paged out)
  3. Process actually reads the memory

If the process only does steps 2 and 3 once for each bit of data read, or the data gets dropped from memory because of memory pressure, mmap() is going to be slower.

read() steps:

  1. System call copies data from disk to page cache (may or may not page fault, data may already be in page cache causing this to be skipped)
  2. Data copied from page cache to process memory (may or may not page fault)

Memory mapping is only going to beat this performance-wise by eliminating that extra copy from the page cache to process memory. But saving a mere copy of a page of memory (or less) has to happen multiple times to beat the cost of setting up the mapping - probably. How many times depends on your system: memory bandwidth, how your entire system is being used, everything. For example, if the time used by the kernel's memory management to set up the mapping wouldn't have been used by any other process anyway, the cost of creating the mapping really isn't very high. Conversely, if you have a lot of processing on your system that involves a lot of virtual memory mapping creation/destruction (i.e., lots of short-lived processes), the impact of memory-mapped IO might be significant.

Then there's read() using direct IO:

  1. System call to read from disk into process memory space (may or may not cause a page fault)

Direct IO reads are pretty much impossible to beat performance-wise. But you have to really tune your IO patterns to your hardware to maximize performance.

Note that a process can pretty much control whether reading data causes a page fault for the buffer the process is using to read.

So, is memory-mapped file access faster? Maybe it is, maybe it isn't.

It depends on your access pattern(s). Along with your hardware and everything else in your IO path(s).

If you're streaming a 30 GB video file on a machine with 4 GB of RAM, and you never go back and reread any of the data, memory-mapping the file is probably the worst way to read it.

Conversely, if you have a 100 MB lookup table for some data that you randomly access billions and billions of times in your processing and enough memory that the file never gets paged out, memory mapping will crush all other access methods.

One huge advantage of memory-mapped files

Memory mapping files has a huge advantage over other forms of IO: code simplicity. It's really hard to beat the simplicity of accessing a file as if it's in memory. And most times, the difference in performance between memory-mapping a file and doing discrete IO operations isn't all that much anyway.