zram vs zswap vs zcache: the ultimate guide to when to use which one

There is a whole lot of material about these three systems, but none of it makes a simple comparison between them, let alone explains them well. I tried to make sense of it, but my head exploded. Then I thought I had got it, so I tried writing it down, and my head exploded again (see the summary of implementations). I thought it would be useful to post this here, as there are many Stack Exchange questions asking for pairwise comparisons between them.

Summary of what to use when:

  1. ZRAM if you have no swap device on HDD/SSD.
  2. ZSWAP if you do have a swap device on HDD/SSD.
  3. ZCACHE: It does what ZSWAP does and ALSO compresses and speeds up the filesystem page cache. (Internally it is much more complicated, and it is not in the mainline kernel as it is still under development.)

Summary of their implementations:

  1. ZRAM is a compressed, RAM-based swap device.
  2. ZSWAP is a compressed cache that sits in front of a swap device you already have.
  3. ZCache is a backend for a special type of virtual RAM (transcendent memory) that can be used to cache filesystem pages or swap data.

Details:

  • ZRAM: Creates a swap device in RAM. Pages sent to it are compressed as they are stored. It is normally given a higher priority than other swap devices, so pages that are swapped out go to the zram device until it is full; only then are any other swap devices used.

    • Benefits: Independent of other (physical) swap devices. It can expand the available memory even when there is no swap partition.
    • Disadvantages: If other swap devices (HDD/SSD) are present, they are not used optimally. Because the zram device is an independent swap device, once it is full any new pages that need to be swapped out are sent directly to the next swap device. Hence:
      1. There is a real chance of LRU (least recently used) inversion: the most recently swapped-out data goes to the slow disk, while inactive pages that were swapped out long ago remain in the fast zram device.
      2. Data written to and read from the disk consumes a lot of bandwidth because it is uncompressed.
    • Status: Merged into the mainline kernel 3.14. Once enabled on a system, it requires some userspace configuration to set up the swap devices and use them.
  • ZSWAP: The frontswap system hooks attempts to swap out pages and uses zswap as a write-back cache for an HDD/SSD swap device. An attempt is made to compress the page; if it contains poorly compressible data, it is written directly to the disk. If the data compresses well, it is stored in the zswap memory pool. When a page is swapped out while the total size of compressed pages in RAM already exceeds a certain limit, the least recently used (LRU) compressed page is written to the disk, as it is unlikely to be required soon.

    • Benefits: Very efficient use of RAM and disk-based swap. Minimizes disk I/O both by reducing the number of writes and reads required (data is compressed and held in RAM) and by reducing the bandwidth of these I/O operations, as the data is in a compressed form.
    • Limitations: It is an enhancement of disk-based swap and hence depends on a swap partition on the hard disk.
    • Status: Merged into the mainline Linux kernel in 3.11.
  • ZCache: It is a backend for the transcendent memory system. Transcendent memory provides RAM-like memory that can only be accessed a page at a time, using put and get calls, unlike normal memory, which can be accessed a byte at a time. The frontswap and cleancache systems hook attempts to swap out pages and to reclaim filesystem page cache, respectively, and send those pages to the transcendent memory backend. When zcache is used as the backend, the data is compressed and stored in RAM. When it fills up, compressed pages are evicted to the swap device. (An alternative backend is RAMster, which shares a pool of RAM across networked computers.) Using only the frontswap frontend with the zcache backend works just like zswap. (In fact, zswap is a simplified subset of zcache.)

    • Benefits: Provides compressed caching both for swap and for filesystem caches.
    • Status: Still not mainlined as it is very complicated and is being worked on.
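
To make the placement difference between zram and zswap concrete, here is a toy sketch in Python. It is not kernel code, and it rests on loud simplifications: every compressed page takes exactly one slot, nothing is ever swapped back in, and the function names `zram_then_disk` and `zswap_over_disk` are made up for this illustration. It only shows which pages end up on the slow disk under "fill zram first, then spill" versus "evict the oldest entry from the compressed pool".

```python
from collections import OrderedDict


def zram_then_disk(pages, zram_slots):
    """Fill-then-spill: the higher-priority zram device takes pages until it
    is full; every later page goes straight to the disk swap device."""
    zram, disk = [], []
    for page in pages:  # pages are listed oldest swap-out first
        if len(zram) < zram_slots:
            zram.append(page)
        else:
            disk.append(page)
    return zram, disk


def zswap_over_disk(pages, pool_slots):
    """Write-back cache: when the pool is full, the oldest (least recently
    stored) entry is written to disk to make room for the new page."""
    pool, disk = OrderedDict(), []
    for page in pages:
        if len(pool) >= pool_slots:
            oldest, _ = pool.popitem(last=False)  # evict from the LRU end
            disk.append(oldest)
        pool[page] = True
    return list(pool), disk


# p1 was swapped out first, p6 most recently (hypothetical page names).
pages = ["p1", "p2", "p3", "p4", "p5", "p6"]
zram_ram, zram_disk = zram_then_disk(pages, 4)
zswap_ram, zswap_disk = zswap_over_disk(pages, 4)
print("zram :", zram_ram, "| disk:", zram_disk)
print("zswap:", zswap_ram, "| disk:", zswap_disk)
```

With four slots, the zram model leaves the two newest pages (p5, p6) on disk, which is the LRU inversion described above, while the zswap model pushes the two oldest pages (p1, p2) to disk and keeps the recent ones in RAM.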

The best resources I found were:

  • Transcendent memory in a nutshell
  • [PATCH 0/8] zswap: compressed swap caching
  • In-kernel memory compression
  • LSFMM: In-kernel memory compression
  • The zswap compressed swap cache


Regarding point 2: zswap does seem to decompress the pages on write-back, confirming @Cbhihe's comment.

mm/zswap.c, line 828:

/*
 * Attempts to free an entry by adding a page to the swap cache,
 * decompressing the entry data into the page, and issuing a
 * bio write to write the page back to the swap device.
 * ...
 */
static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
{
    ...
    
    case ZSWAP_SWAPCACHE_NEW: /* page is locked */
        /* decompress */
        ...
        
        ret = crypto_comp_decompress(tfm, src, entry->length,
                         dst, &dlen);
        ...
        kunmap_atomic(dst);    


$ git show
commit 1573d2caf713874cfe0d1336c823d0fb548d8bed
Merge: 4cdf8db 0a86248
Author: Linus Torvalds <[email protected]>
Date:   Tue Oct 11 23:59:07 2016 -0700

So zswap is useful in situations where the compressed in-RAM cache is likely to be discarded before it is ever written back to disk. It is not suited to applications with large, long-lived heaps that will eventually need to be backed by the actual swap device.
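
One rough way to see which situation a machine is in is to look at zswap's own counters. The sketch below is only that, a sketch: it assumes debugfs is mounted at `/sys/kernel/debug`, that the kernel exposes a `zswap` directory there containing counters named `stored_pages` and `written_back_pages` (names and availability vary between kernel versions), and that the script runs with enough privileges to read them. The helper names `read_counters` and `writeback_summary` are invented for this example.

```python
from pathlib import Path


def read_counters(directory):
    """Return {file name: integer value} for every numeric file in directory."""
    counters = {}
    for entry in sorted(Path(directory).iterdir()):
        try:
            counters[entry.name] = int(entry.read_text().strip())
        except (OSError, ValueError):
            continue  # skip unreadable or non-numeric entries
    return counters


def writeback_summary(counters):
    """Format the two counters that matter for the write-back question."""
    stored = counters.get("stored_pages", 0)
    written = counters.get("written_back_pages", 0)
    return f"stored_pages={stored} written_back_pages={written}"


def main(debug_dir="/sys/kernel/debug/zswap"):  # assumed debugfs location
    try:
        counters = read_counters(debug_dir)
    except OSError as err:
        print(f"cannot read {debug_dir}: {err}")
        return
    print(writeback_summary(counters))


if __name__ == "__main__":
    main()
```

If `written_back_pages` keeps climbing under your workload, pages are regularly being decompressed and pushed to the swap device, which is the case where zswap helps least.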