MySQL InnoDB page_cleaner settings might not be optimal

The default value of innodb_page_cleaners was changed from 1 to 4 in MySQL 5.7.8. If the number of page cleaner threads exceeds the number of buffer pool instances, innodb_page_cleaners is automatically set to the same value as innodb_buffer_pool_instances.

Check innodb_buffer_pool_instances with:

mysql> SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_instances';

You can only set innodb_page_cleaners as high as innodb_buffer_pool_instances. If you want innodb_page_cleaners=4 then you also need innodb_buffer_pool_instances=4.
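
The capping rule described above can be sketched in a few lines of Python. This only illustrates the rule as stated here, not MySQL's actual code, and the function name effective_page_cleaners is made up for the example.

```python
def effective_page_cleaners(requested, buffer_pool_instances):
    """Hypothetical helper: page cleaner threads are capped at the
    number of buffer pool instances, as described above."""
    return min(requested, buffer_pool_instances)


# Asking for 4 cleaners with a single buffer pool instance yields 1.
print(effective_page_cleaners(4, 1))  # 1
# With 8 instances the requested value of 4 is kept.
print(effective_page_cleaners(4, 8))  # 4
```

So if SHOW GLOBAL VARIABLES reports innodb_buffer_pool_instances=1, asking for more page cleaners has no effect until the number of instances is raised as well.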


We ran into the same problem with several clients and traced it to innodb_lru_scan_depth having been lowered from its default of 1024 to as low as 128. Lowering the value reduces the time taken to process a transaction, especially in write-bound workloads, but I believe that setting it too low leaves the buffer pool unable to keep up with freeing buffers and flushing dirty pages.

In our case, raising the value from 128 to 256 brought a drastic improvement, but in general the right value depends on the hardware and the type of load. The trick is to balance OLTP performance against letting MySQL keep the buffer pool clean, so that the page_cleaner does not fall behind and log the message in question ("InnoDB: page_cleaner: 1000ms intended loop took 15888ms").
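
When trying different values, it can help to quantify how far the page cleaner is overrunning. The following is a minimal Python sketch that extracts the two durations from the warning text quoted above. The function names are made up for this example, and the pattern assumes the message format shown here; real error log lines carry a timestamp prefix and may carry extra trailing detail, which the pattern ignores.

```python
import re

# Matches the core of the warning quoted above; anything before or after
# it on the log line is ignored.
PAGE_CLEANER_RE = re.compile(r"page_cleaner: (\d+)ms intended loop took (\d+)ms")


def parse_page_cleaner_warning(line):
    """Return (intended_ms, actual_ms), or None if the line is not this warning."""
    match = PAGE_CLEANER_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def overrun_factor(line):
    """Return actual/intended loop time, or None if the line does not match."""
    parsed = parse_page_cleaner_warning(line)
    if parsed is None:
        return None
    intended_ms, actual_ms = parsed
    return actual_ms / intended_ms


sample = "InnoDB: page_cleaner: 1000ms intended loop took 15888ms"
print(parse_page_cleaner_warning(sample))  # (1000, 15888)
print(overrun_factor(sample))  # 15.888
```

Running this over the error log before and after changing innodb_lru_scan_depth gives a rough way to tell whether the overruns are getting shorter or less frequent.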

The value can be changed dynamically without restarting MySQL, e.g.

SET GLOBAL innodb_lru_scan_depth=256;

This Stack Overflow thread may be useful:

https://stackoverflow.com/questions/41134785/how-to-solve-mysql-warning-innodb-page-cleaner-1000ms-intended-loop-took-xxx

This basically means your database is receiving so many writes that the buffer pool fills up with dirty pages. That triggers the page cleaner to flush them. Because there were more dirty pages than usual, the page cleaner took longer than intended to flush the buffer pool.

The innodb_lru_scan_depth variable controls how far down the buffer pool LRU list the page cleaner scans for dirty pages to flush. The warning therefore suggests either that this value is set too high or that the system's write throughput is very high, producing a large number of dirty pages.
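
One rough way to see whether dirty pages are piling up is to compare the Innodb_buffer_pool_pages_dirty and Innodb_buffer_pool_pages_total counters from SHOW GLOBAL STATUS. The Python sketch below computes the percentage from those two counters. The function name and the sample numbers are made up for illustration, and the status values are assumed to have already been fetched into a dict by whatever client you use.

```python
def dirty_page_percent(status):
    """Percentage of buffer pool pages that are dirty.

    `status` maps status variable names to values, as returned by
    SHOW GLOBAL STATUS (values may arrive as strings).
    """
    dirty = int(status["Innodb_buffer_pool_pages_dirty"])
    total = int(status["Innodb_buffer_pool_pages_total"])
    if total == 0:
        return 0.0
    return 100.0 * dirty / total


# Illustrative numbers only, not taken from a real server.
sample_status = {
    "Innodb_buffer_pool_pages_dirty": "2048",
    "Innodb_buffer_pool_pages_total": "8192",
}
print(dirty_page_percent(sample_status))  # 25.0
```

A percentage that keeps climbing under load points to write throughput outpacing flushing, rather than to a one-off spike.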