How to limit ZFS writes on NVME SSD in RAID1 to avoid rapid disk wear?

There are several reasons why your actual writes are so inflated. Let's establish a few base points:

  • first, a baseline: from your zpool iostat output, we can infer a continuous ~1.5 MB/s write stream to each of the mirror legs. Over 245 days, that adds up to 1.5*86400*245 MB ≈ 32 TB written per leg;

  • the number above already takes into account both ZFS recordsize write amplification and the double data write caused by writing first to the ZIL and then again at txg commit (for writes smaller than zfs_immediate_write_sz).
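
The baseline arithmetic can be checked with a few lines of Python. This is only a sketch: the 1.5 MB/s rate and the 245 days come from the figures above, decimal units (1 TB = 1,000,000 MB) are assumed, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def total_written_tb(rate_mb_per_s: float, days: int) -> float:
    """Total data written, in decimal terabytes, for a constant write rate."""
    total_mb = rate_mb_per_s * SECONDS_PER_DAY * days
    return total_mb / 1_000_000  # 1 TB = 1,000,000 MB (decimal units assumed)


# Figures taken from the answer: ~1.5 MB/s per mirror leg, sustained for 245 days.
per_leg_tb = total_written_tb(1.5, 245)
print(f"{per_leg_tb:.1f} TB written per mirror leg")  # 31.8 TB, i.e. roughly 32 TB
```

Since both mirror legs receive the same stream, each disk sees this amount independently.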

Given the above, to reduce ZFS-induced write amplification, you should:

  • set a small recordsize (e.g. 16K);

  • set logbias=throughput;

  • set compression=lz4 (as suggested by @poige)
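
To see why a smaller recordsize helps, here is a deliberately simplified model of the read-modify-write cost of a small overwrite. It assumes an aligned overwrite inside an existing file, no compression, and it ignores metadata and ZIL traffic. The 4K application write size is a hypothetical example, not a figure from the question; 128K is the default recordsize.

```python
import math

KIB = 1024


def rmw_amplification(write_bytes: int, recordsize_bytes: int) -> float:
    """Bytes rewritten on disk per byte changed by the application.

    Simplified model: every record touched by the write is rewritten in full.
    """
    records_touched = math.ceil(write_bytes / recordsize_bytes)
    return records_touched * recordsize_bytes / write_bytes


app_write = 4 * KIB  # hypothetical small random write (assumption)
for recordsize in (128 * KIB, 16 * KIB):
    factor = rmw_amplification(app_write, recordsize)
    print(f"recordsize={recordsize // KIB}K -> {factor:.0f}x")
# recordsize=128K -> 32x
# recordsize=16K -> 4x
```

Real numbers will differ (compression, metadata and txg aggregation all play a role), but the model shows the direction of the effect.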

EDIT: to estimate write amplification more accurately, please show the output of nvme intel smart-log-add /dev/nvme0
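
Once that output is available, the drive-level write amplification can be estimated as the ratio of NAND writes to host writes. The sketch below assumes the output exposes both counters in the same unit; the function name and the sample values are hypothetical and are not taken from the question.

```python
def write_amplification(nand_written: float, host_written: float) -> float:
    """Ratio of data written to flash versus data written by the host.

    Both arguments must be in the same unit (assumption: the drive reports
    them that way, or they have been converted beforehand).
    """
    if host_written <= 0:
        raise ValueError("host_written must be positive")
    return nand_written / host_written


# Hypothetical counter values, for illustration only.
waf = write_amplification(nand_written=96.0, host_written=32.0)
print(f"write amplification factor: {waf:.2f}")  # 3.00
```

A factor close to 1 would mean the SSD itself adds little on top of what ZFS sends to it.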


In addition to the advice already given to reduce recordsize: there's no reason not to enable LZ4 compression (zfs set compression=lz4 …) by default as well, which reduces the amount of data written even further (sometimes very significantly).


A few items...

If this is a leased server, isn't the provider responsible for the health of the equipment?

It may also make sense to review your ZFS filesystem's ashift value, the pool's txg_timeout, and a few other parameters.