ZFS pool slow sequential read

I managed to get speeds very close to the numbers I was expecting.

I was looking for 400 MB/s and managed 392 MB/s, so I'd call that problem solved. With the later addition of a cache device, I managed a 458 MB/s read (cached, I believe).

1. The first gain came simply from increasing the ZFS dataset recordsize to 1M:

zfs set recordsize=1M pool2/test
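To confirm the setting took, and keep in mind that recordsize only applies to files written after the change, so re-create any test files before benchmarking:

zfs get recordsize pool2/test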

I believe this change just results in less disk activity per amount of data transferred, and thus more efficient large sequential reads and writes. Exactly what I was asking for.

Results after the change:

  • bonnie++ = 226 MB/s write, 392 MB/s read
  • dd = 260 MB/s write, 392 MB/s read (see the dd sketch after this list)
  • 2 processes in parallel = 227 MB/s write, 396 MB/s read
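For reference, the dd runs were along these lines (the file path and size are illustrative, not my exact invocation). Write a file larger than RAM so you measure the disks rather than the ARC, and note that /dev/zero will inflate the write numbers if compression is enabled on the dataset:

dd if=/dev/zero of=/pool2/test/bigfile bs=1M count=65536   # sequential write
dd if=/pool2/test/bigfile of=/dev/null bs=1M               # sequential read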

2. I managed even better when I added a cache device (a 120GB SSD). The write is a tad slower; I'm not sure why.
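For completeness, attaching the SSD as a cache (L2ARC) device is a one-liner; the device name here is illustrative:

zpool add pool2 cache /dev/sdX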

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
igor            63G           208325  48 129343  28           458513  35 326.8  16

The trick with the cache device was to set l2arc_noprefetch=0 in /etc/modprobe.d/zfs.conf. By default, ZFS does not cache streaming/sequential reads in L2ARC; this setting allows it to. Only do this if your cache device is faster than your array, as mine is.
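Concretely, that means a module option in the conf file; you can also flip it at runtime, though the runtime change does not survive a reboot:

# /etc/modprobe.d/zfs.conf
options zfs l2arc_noprefetch=0

# or at runtime, without reloading the module:
echo 0 > /sys/module/zfs/parameters/l2arc_noprefetch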

After benefiting from the recordsize change on my dataset, I wondered whether a similar approach could fix poor zvol performance.

I came across several people mentioning that they obtained good performance using volblocksize=64k, so I tried it. No luck:

zfs create -b 64k -V 120G pool/volume

But then I read that ext4 (the filesystem I was testing with) supports RAID-related options such as stride and stripe-width, which I had never used before. I used this site to calculate the settings needed: https://busybox.net/~aldot/mkfs_stride.html, and formatted the zvol again:

mkfs.ext4 -b 4096 -E stride=16,stripe-width=32 /dev/zvol/pool/volume
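The numbers follow from the calculator's formula: stride = chunk size / filesystem block size, and stripe-width = stride × number of data-bearing disks. Treating the 64k volblocksize as the chunk gives stride = 64k / 4k = 16, and the stripe-width of 32 implies two data-bearing disks were entered into the calculator. You can check what the filesystem actually recorded with:

tune2fs -l /dev/zvol/pool/volume | grep -iE 'stride|stripe'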

I ran bonnie++ to do a simple benchmark and the results were excellent. Unfortunately I don't have the output with me, but as I recall the writes were at least 5-6x faster. I'll update this answer if I benchmark again.