Why does partially full RAM cause lag?

There is much involved here but I will try to explain it as simply as I can and in a way applicable to just about any OS.

There are 2 basic principles here:

  1. The sum total of everything that needs to be in RAM and those things that would benefit from being in RAM is almost always greater than the size of RAM. Things that would benefit from being in RAM include process working sets and the standby list. The latter contains data and code that was once in active use but has since lapsed into inactivity. Much of this will be used again, some of it quite soon, so it is beneficial to keep this in RAM. This memory acts as a kind of cache but is not really essential so is in the category of available memory. Like free memory it can be quickly given to any program that needs it. In the interests of performance standby memory should be large.

  2. The frequency of use of memory blocks is far from random but can be predicted with considerable accuracy. Memory is divided into blocks, often 4K bytes. Some blocks are accessed many times per second while others have not been accessed for many minutes, hours, days, or even weeks if the system has been up long enough. There is a wide range of usage between these 2 extremes. The memory manager knows which blocks have been accessed recently and those that have not. It is a reasonable assumption that a memory block that has been accessed recently will be needed again soon. Memory that has not been accessed recently probably won't be needed anytime soon. Long experience has proven this to be a valid principle.
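The second principle is essentially what LRU (least-recently-used) tracking gives you. As a minimal sketch (hypothetical code, not any real OS's memory manager), recency of access can be tracked with an ordered map, and the pages at the cold end are the best candidates to leave out of RAM:

```python
from collections import OrderedDict

class RecencyTracker:
    """Tracks page accesses; oldest access first, newest last."""
    def __init__(self):
        self.pages = OrderedDict()

    def touch(self, page):
        # Move the page to the "most recently used" end.
        self.pages.pop(page, None)
        self.pages[page] = True

    def coldest(self, n):
        # The n pages least recently used, i.e. least likely
        # to be needed again soon.
        return list(self.pages)[:n]

t = RecencyTracker()
for p in [1, 2, 3, 1, 4, 1, 2]:
    t.touch(p)

print(t.coldest(2))  # pages 3 and 4 were touched least recently
```

Real memory managers approximate this with per-page "accessed" bits rather than an exact ordering, but the principle is the same.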

The memory manager takes advantage of the second principle to largely mitigate the undesirable consequences of the first. To do this it performs a balancing act: keeping recently accessed data in RAM while leaving rarely used data in the original files or the pagefile.

When RAM is plentiful this balancing act is easy. Much of the not so recently used data can be kept in RAM. This is a good situation.

Things get more complicated when the workload increases. The sum total of data and code in use grows, but the size of RAM stays the same, so a smaller subset of it can be kept in RAM. Some of the less recently used data can no longer stay in RAM and must be left on disk. The memory manager tries very hard to maintain a good balance between memory in active use and available memory, but as the workload increases it is forced to give more available memory to running processes. This is not a good situation, but the memory manager has no choice.

The problem is that moving data to and from RAM as programs run takes time. When RAM is plentiful this won't happen very often and won't even be noticed. But when RAM usage reaches high levels it will happen much more often. The situation can become so bad that more time is spent moving data to and from RAM than is spent actually using it. This is thrashing, something the memory manager tries very hard to avoid, but under a high workload it often cannot be.
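How sharply this cliff arrives can be shown with a toy simulation (illustrative code, not a model of any particular OS): count the page faults for the same access pattern when RAM holds the whole working set versus when it falls just one page short, using LRU replacement.

```python
from collections import OrderedDict

def page_faults(accesses, frames):
    """Count faults for an access sequence given `frames` pages of RAM,
    evicting the least recently used page when RAM is full."""
    ram = OrderedDict()
    faults = 0
    for page in accesses:
        if page in ram:
            ram.move_to_end(page)        # hit: mark recently used
        else:
            faults += 1                  # miss: must fetch from disk
            if len(ram) >= frames:
                ram.popitem(last=False)  # evict least recently used
            ram[page] = True
    return faults

# A working set of 5 pages, accessed round-robin 20 times (100 accesses).
workload = list(range(5)) * 20

print(page_faults(workload, frames=5))  # 5: only the cold-start faults
print(page_faults(workload, frames=4))  # 100: every single access faults
```

Going from 5 frames to 4 turns a handful of startup faults into a fault on every access; that discontinuity is the essence of thrashing.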

The memory manager is on your side, always trying its best to maintain optimum performance even under adverse conditions. But when the workload is great and available memory runs short, it must do bad things in order to keep functioning. That is in fact the most important thing: the priority is first to keep things running, then to make them as fast as possible.


All modern operating systems use otherwise unused memory for caching data so that it can be accessed from fast RAM instead of slower storage. They will generally report this as free memory, since applications can clear the cache and use it if they need to, but it's still actually being used. The less of it there is, the less data can be cached, and the slower the computer will be.
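The key property of that cache is that it can be given back instantly, because the cached bytes still exist on disk. A hypothetical sketch of the idea (not any real OS's code):

```python
class DiskCache:
    """Spare RAM holds copies of disk blocks; reclaimable at any time."""
    def __init__(self, budget):
        self.budget = budget   # blocks of RAM the cache may use
        self.blocks = {}

    def read(self, block, disk):
        if block in self.blocks:
            return self.blocks[block]     # fast: served from RAM
        data = disk[block]                # slow: fetched from storage
        if len(self.blocks) < self.budget:
            self.blocks[block] = data     # keep a copy while RAM is spare
        return data

    def reclaim(self, n):
        # A program needs n blocks of RAM: drop cached copies.
        # No data is lost, since the same bytes still live on disk.
        for key in list(self.blocks)[:n]:
            del self.blocks[key]
        self.budget -= n

disk = {0: b"boot", 1: b"logs"}
cache = DiskCache(budget=8)
cache.read(0, disk)   # slow, from disk; now cached
cache.read(0, disk)   # fast, from RAM
cache.reclaim(4)      # a program asked for memory; the cache shrinks
```

This is why "free" memory figures can be misleading: the cache memory is both in use and available at the same time.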



Paging is a memory management scheme in which memory is divided into fixed-size blocks (pages) that are assigned to processes. When memory usage rises to a high level (e.g. 80% of capacity), paging begins to extend from RAM into vRAM (virtual RAM).

vRAM is located in system storage, usually within a hard drive, or other sizable storage locations.

Processes are assigned part of your hard drive to run as memory and will treat their section as RAM. This is perfectly normal; however, when the time spent transferring data to and from the vRAM increases, system performance decreases.

While dedicated RAM is accessed directly by the CPU over the motherboard's memory bus, which provides a fast connection, virtual RAM access must traverse the link between the board and the storage device holding the vRAM.

This, however, causes only a slight performance impact. When the rate at which paging to vRAM takes place increases drastically (as dedicated RAM approaches capacity), thrashing takes place.
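The cost can be made concrete with the classic effective-access-time calculation. The timings below are assumed, round-number illustrations (RAM on the order of 100 ns per access, a hard-drive page fault on the order of 10 ms), not measurements from the answer:

```python
T_RAM  = 100            # nanoseconds per RAM access (assumed)
T_DISK = 10_000_000     # nanoseconds per page fault, ~10 ms (assumed)

def effective_access_ns(fault_rate):
    """Average cost of one memory access given a page-fault rate."""
    return (1 - fault_rate) * T_RAM + fault_rate * T_DISK

for p in (0.0, 0.0001, 0.001, 0.01):
    print(f"fault rate {p:.4%}: {effective_access_ns(p):,.1f} ns")
```

Even one fault per thousand accesses makes memory roughly 100 times slower on average, which is why a system that starts paging heavily feels like it has fallen off a cliff.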

Thrashing is the constant, rapid transfer of pages between RAM and virtual memory. It takes a huge toll on performance because ever more time is spent fetching data rather than using it.

Let's say you want to write down a 30-digit number. You could either sit next to your screen with your notepad and write it down (using dedicated memory), or memorise it in chunks of 5 digits, run into the next room, and write each chunk on the notepad in there (using virtual memory). Both get the job done, but which is going to be quicker?

Find out more about thrashing here!

A big thanks to the contributors of this answer including Daniel B, xenoid and Jon Bentley.
