Server overloaded, acts like out of memory, but that's not true

There's a lot of free memory, but these zones are completely fragmented:

Node 0 Normal: 1648026*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6592104kB
Node 1 Normal: 8390977*4kB 1181188*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB

There are very few free blocks above order 0 left: Node 1 has only order-1 (8kB) blocks beyond single pages, and Node 0 has none at all.
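To make the reading of those free lists concrete, here is a rough Python sketch that parses one zone line and reports the total free memory and the largest block size that still has a free block. The helper names `parse_free_list` and `largest_free_block_kb` are made up for illustration, and the regex assumes the `count*sizekB` layout shown above.

```python
import re

def parse_free_list(line):
    """Return {block_size_kB: free_block_count} for one zone line of the report."""
    # Only "count*sizekB" tokens match; the trailing "= totalkB" has no '*'.
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}

def largest_free_block_kb(blocks):
    """Largest block size (kB) with at least one free block, or 0 if none."""
    return max((size for size, count in blocks.items() if count > 0), default=0)

node0 = ("Node 0 Normal: 1648026*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB "
         "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6592104kB")

blocks = parse_free_list(node0)
total_kb = sum(size * count for size, count in blocks.items())
print("free kB:", total_kb)
print("largest free block (kB):", largest_free_block_kb(blocks))
```

For Node 0 the largest free block is a single 4kB page, so an order-2 request (16kB contiguous) cannot be satisfied there without compaction, even though gigabytes are free.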

I can't guarantee anything, but you may want to try turning off ksmd and re-compacting memory. Compaction only gets called automatically on higher-order page allocations and never calls the oom-killer, so I assume the system tried to allocate memory at order 2 or 3 and got stuck.

To compact memory, run:

# echo 1 >/proc/sys/vm/compact_memory
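If you want to script both steps, something like the following Python sketch might do. It assumes KSM is controlled through /sys/kernel/mm/ksm/run (writing 0 stops ksmd but keeps already-merged pages) and that the kernel was built with compaction support so that /proc/sys/vm/compact_memory exists. The function name `stop_ksm_and_compact` is hypothetical, and the paths are parameters only so the logic can be tried against ordinary files.

```python
from pathlib import Path

KSM_RUN = Path("/sys/kernel/mm/ksm/run")
COMPACT_MEMORY = Path("/proc/sys/vm/compact_memory")

def stop_ksm_and_compact(ksm_run=KSM_RUN, compact_memory=COMPACT_MEMORY):
    """Stop ksmd if the KSM control file exists, then request compaction.

    Returns True/False for whether ksmd was running before, or None when
    the KSM control file is absent.
    """
    ksm_run = Path(ksm_run)
    compact_memory = Path(compact_memory)

    ksm_was_running = None
    if ksm_run.exists():
        ksm_was_running = ksm_run.read_text().strip() == "1"
        ksm_run.write_text("0\n")  # 0 = stop ksmd, keep merged pages

    compact_memory.write_text("1\n")  # same effect as: echo 1 > compact_memory
    return ksm_was_running
```

Calling `stop_ksm_and_compact()` with the default paths needs root. The return value tells you whether ksmd was running, so you can decide whether to re-enable it afterwards.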

There's only so much to go on in this question, but I suspect ksmd is causing the fragmentation by scanning for pages duplicated in both VMs and swapping them all around.


@Matthew's answer should be marked as the solution to this problem. /proc/buddyinfo clearly shows fragmentation (due to ksmd or other behaviour), and memory compaction is a valid fix.

We just hit the same problem on our server:

# cat /proc/buddyinfo
Node 0, zone      DMA      1      0      1      0      0      1      0      0      0      1      3
Node 0, zone    DMA32   4941  14025  10661   1462   1715    154      1      0      0      0      0
Node 0, zone   Normal 420283 217678   3852      3      1      0      1      1      1      0      0
Node 1, zone   Normal 1178429 294431  21420    340      7      2      1      2      0      0      0

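This clearly shows fragmentation. Each column counts the free blocks of one order, from order 0 (single 4kB pages) on the left to order 10 (4096kB) on the right, and most of the free memory is split into lots of small blocks (large numbers on the left, zeros on the right).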
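To put a number on "large on the left, zero on the right", here is a small Python sketch that parses buddyinfo-style text and computes what fraction of the free pages sit in blocks of order 2 or higher. The names `parse_buddyinfo` and `high_order_share` are my own, and the threshold of order 2 is just an assumption taken from the order 2 or 3 allocations mentioned in the other answer.

```python
def parse_buddyinfo(text):
    """Map (node, zone) -> list of free block counts, where index = order."""
    zones = {}
    for line in text.splitlines():
        parts = line.replace(",", " ").split()
        # Expected shape: Node <n> zone <name> <order-0 count> <order-1 count> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        zones[(int(parts[1]), parts[3])] = [int(c) for c in parts[4:]]
    return zones

def high_order_share(counts, min_order=2):
    """Fraction of free pages held in blocks of order >= min_order."""
    # A block of order n spans 2**n contiguous pages.
    pages = [count << order for order, count in enumerate(counts)]
    total = sum(pages)
    return sum(pages[min_order:]) / total if total else 0.0

sample = """\
Node 0, zone   Normal 420283 217678   3852      3      1      0      1      1      1      0      0
Node 1, zone   Normal 1178429 294431  21420    340      7      2      1      2      0      0      0
"""

for (node, zone), counts in parse_buddyinfo(sample).items():
    share = high_order_share(counts)
    print(f"node {node} {zone}: {share:.1%} of free pages in order >= 2 blocks")
```

On a live system you could feed it `open("/proc/buddyinfo").read()` instead of the sample. A share close to zero in a large zone is the pattern described above.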

Compaction solves this:

# echo 1 >/proc/sys/vm/compact_memory
# cat /proc/buddyinfo
Node 0, zone      DMA      1      0      1      0      0      1      0      0      0      1      3
Node 0, zone    DMA32    485   1746   8588   3311   2076    505     98     19      3      0      0
Node 0, zone   Normal  83764  22474   8597   3130   1971   1421   1090    808    556    358     95
Node 1, zone   Normal  51928  36053  36093  29024  21498  13148   5719   1405    151      8      0