Massive, unpredictable I/O performance drop in Linux

I managed to reproduce the problem again, and it was the result of a large disk cache. My disk cache can grow beyond 8 GB, and it seems that some applications do not cope well with that, so I/O suffers.

Dropping the disk caches with echo 3 > /proc/sys/vm/drop_caches as root remedies the problem. I currently don't know why a large disk cache causes this I/O degradation.
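To check whether the cache size really correlates with the slowdown, it can help to read the numbers before and after dropping the caches. The following is a minimal sketch, not a definitive tool: the function names cache_kb, dirty_kb and drop_caches are hypothetical helpers introduced here, and the optional file argument exists only so the parsing can be tried on a saved copy of /proc/meminfo.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the names are not from any package.

# Print the page cache size in kB from a meminfo-style file.
# The file argument is optional and defaults to /proc/meminfo.
cache_kb() {
    awk '$1 == "Cached:" { print $2 }' "${1:-/proc/meminfo}"
}

# Print the amount of dirty (not yet written back) data in kB.
dirty_kb() {
    awk '$1 == "Dirty:" { print $2 }' "${1:-/proc/meminfo}"
}

# Write dirty pages out first, then drop page cache, dentries and inodes.
# Must be run as root. The kernel only discards clean cache here, which
# is why sync comes first.
drop_caches() {
    sync
    echo 3 > /proc/sys/vm/drop_caches
}
```

As root, running cache_kb, then drop_caches, then cache_kb again shows how much cache was released. A large dirty_kb value just before the slowdown would point toward writeback rather than cache size alone.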

Last update: After more investigation, I found that the number of files in the cache was triggering the problem. The system was thrashing the disks while trying to commit many small files back to disk. Since I had been using this installation for ten years, I took the plunge and reinstalled with 64-bit Debian. Now it is working smoothly. The problem was probably a side effect of ten years of upgrades combined with hitting the limits of a 32-bit operating system.


Are there any suspicious messages in dmesg?
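One way to narrow down the dmesg output is to filter it for messages that often accompany disk trouble. This is only a rough sketch: the function name suspicious_kernel_lines is hypothetical, and the pattern list is an illustrative assumption rather than an exhaustive set.

```shell
#!/bin/sh
# Hypothetical helper: filter kernel log text from stdin for lines that
# commonly show up with disk or I/O problems. The patterns are examples
# only and will miss messages from drivers not listed here.
suspicious_kernel_lines() {
    grep -iE 'i/o error|ata[0-9]+.*(error|failed|reset)|blocked for more than|ext[234]-fs error'
}
```

Typical use would be dmesg | suspicious_kernel_lines. Empty output does not prove the hardware is healthy; it only means none of these particular patterns matched.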

Some more tools you could try to gain some insights into your system's bottlenecks:

  • dstat
  • latencytop
  • sysprof