Caching/preloading files on Linux into RAM

Solution 1:

vmtouch seems like a good tool for the job.

Highlights:

  • query how much of a directory is cached
  • query how much of a file is cached (also which pages, graphical representation)
  • load file into cache
  • remove file from cache
  • lock files in cache
  • run as daemon

vmtouch manual

EDIT: Usage as asked in the question is listed in example 5 on vmtouch Hompage

Example 5

Daemonise and lock all files in a directory into physical memory:

vmtouch -dl /var/www/htdocs/critical/

EDIT2: As noted in the comments, there is now a git repository available.

Solution 2:

This is also possible using the vmtouch Virtual Memory Toucher utility.

The tool allows you to control the filesystem cache on a Linux system. You can force or lock a specific file or directory in the VM cache subsystem, or use it to check to see what portions of a file/directory are contained within VM.

How much of the /bin/ directory is currently in cache?

$ vmtouch /bin/
           Files: 92
     Directories: 1
  Resident Pages: 348/1307  1M/5M  26.6%
         Elapsed: 0.003426 seconds

Or...

Let's bring the rest of big-dataset.txt into memory...

$ vmtouch -vt big-dataset.txt
big-dataset.txt
[OOo                                                 oOOOOOOO] 6887/42116
[OOOOOOOOo                                           oOOOOOOO] 10631/42116
[OOOOOOOOOOOOOOo                                     oOOOOOOO] 15351/42116
[OOOOOOOOOOOOOOOOOOOOOo                              oOOOOOOO] 19719/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOo                        oOOOOOOO] 24183/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo                  oOOOOOOO] 28615/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo              oOOOOOOO] 31415/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo      oOOOOOOO] 36775/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo  oOOOOOOO] 39431/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 42116/42116

           Files: 1
     Directories: 0
   Touched Pages: 42116 (164M)
         Elapsed: 12.107 seconds

Solution 3:

A poor man's trick for getting stuff into the filesystem cache is to simply cat it and redirect that to /dev/null.


Solution 4:

Linux will cache as much disk IO in memory as it can. This is what the cache and buffer memory stats are. It'll probably do a better job than you will at storing the right things.

However, if you insist in storing your data in memory, you can create a ram drive using either tmpfs or ramfs. The difference is that ramfs will allocate all the memory you ask for, were as tmpfs will only use the memory that your block device is using. My memory is a little rusty, but you should be able to do:

 # mount -t ramfs ram /mnt/ram 

or

 # mount -t tmpfs tmp /mnt/tmp

and then copy your data to the directory. Obviously, when you turn the machine off or unmount that partition, your data will be lost.


Solution 5:

After some extensive reading on the 2.6 kernel swapping and page-caching features I found 'fcoretools'. Which consists of two tools;

  • fincore: Will reveal how many pages the application has stored in core memory
  • fadvise: Allows you to manipulate the core memory (page-cache).

(In case someone else finds this interesting I'm posting this here)