NVMe SSD: Why is 4K writing faster than reading?

4K reads are about the hardest thing the drive can do. They are among the smallest block sizes the drive can handle, and there is no way for the drive to preload large quantities of data. In fact, they are probably quite inefficient if the drive's read-ahead logic is expecting to read anything larger than 4 KB.

"Normal" drive reads are more likely to be larger than 4 KB, as very few files are that small, and even the page file is likely to be read in large chunks, since it would be odd for a program to have "only" 4 KB of memory paged out. This means that, in a 4K test, any preloading the drive attempts will actually penalise its throughput.

4K reads might pass through the drive buffer, but the "random" part of the test makes them entirely unpredictable. The controller won't know when the drive might need the more usual "large" reads again.

4K writes on the other hand can be buffered, queued, and written out sequentially in an efficient manner. The drive buffer can do a lot of the catch-and-write work that it was designed for, and the wear leveller might even allocate all those 4K writes to the same drive erase block, occasionally turning what is a 4K "random" write into something closer to a sequential write.
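
The buffering idea above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how any particular controller works: the 16 KB flash page size, the rule that an unbuffered 4K write costs one page program, and the function names are all hypothetical.

```python
import random

PAGE_SIZE = 16 * 1024   # assumed flash page size (hypothetical value)
WRITE_SIZE = 4 * 1024   # size of each host write in the 4K test


def page_programs_unbuffered(num_writes):
    """Toy assumption: every 4K write is programmed on its own page."""
    return num_writes


def page_programs_buffered(num_writes):
    """Writes are packed into full pages in arrival order before programming."""
    writes_per_page = PAGE_SIZE // WRITE_SIZE
    # Ceiling division: a partially filled last page still needs one program.
    return -(-num_writes // writes_per_page)


def coalesce(lbas):
    """Map each logical address to the next free physical slot.

    Slots are handed out in arrival order, so the physical layout is
    sequential no matter how scattered the logical addresses are.
    """
    mapping = {}
    next_slot = 0
    for lba in lbas:
        mapping[lba] = next_slot
        next_slot += 1
    return mapping


random.seed(0)
lbas = random.sample(range(1_000_000), 64)   # 64 distinct scattered addresses
mapping = coalesce(lbas)
physical_slots = [mapping[lba] for lba in lbas]

print("unbuffered page programs:", page_programs_unbuffered(len(lbas)))
print("buffered page programs:  ", page_programs_buffered(len(lbas)))
print("physical slots sequential:", physical_slots == list(range(len(lbas))))
```

In this model the "random" logical addresses end up in consecutive physical slots, which is the sense in which a burst of 4K random writes can start to look like a sequential write. Reads have no equivalent trick, because the drive cannot choose where previously written data lives.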

In fact, I suspect that this is what is happening in the "4K-64Thrd" writes. The "64Thrd" test apparently uses a large queue depth, signalling to the drive that it has a large amount of data to read or write. This triggers a lot of clustering of writes, so the result approaches the sequential write speed of the drive. There is still an overhead to performing a 4K write, but now the potential of the buffer is fully exploited. In the read version of the test, the drive controller, recognising that it is under constant heavy load, stops preloading data, possibly bypasses the buffer, and switches to a "raw" read mode, again approaching the sequential read speed.

Basically the drive controller can do something to make a 4K write more efficient, especially if a cluster of them arrive at a similar time, while it can't do anything to make a single 4K read more efficient, especially if it is trying to optimise dataflow by pre-loading data into the cache.


Other answers have already explained why writing may be faster than reading. I would like to add that for this drive this is absolutely normal, as confirmed by the benchmarks you can find in reviews.

ArsTechnica's review

ArsTechnica has reviewed the drive, both your version (512 GB) and the 2 TB one:

(ArsTechnica graph. It is not immediately visible in the review: it is the 5th one in the first gallery, and you have to click on it.)

The performance of these 2 models is very similar, and their numbers look like yours: the drive can read at 37 MB/s and write at 151 MB/s.

AnandTech's review

AnandTech has also reviewed the drive: they used the 2TB model, averaging the results of tests with a queue depth of 1, 2 and 4. These are the graphs:

(AnandTech graphs: 4K read and 4K write)

The drive reads at 137 MB/s and writes at 437 MB/s. The numbers are much higher than yours, but that is probably due to the higher queue depths. In any case, the write speed is about 3 times the read speed, as in your case.

PC World's review

One more review, by PC World: they tested the 1 TB version, and the 4K results are 30 MB/s for reading and 155 MB/s for writing.

(PC World graph)

The write speed is in line with yours, but here the drive is even slower at reading, so the ratio is five to one, not three to one.

Conclusion

Reviews confirm that for this drive it is normal that the write speed for random 4K is much faster than the read speed: depending on the test, it can even be 5 times faster.
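
As a quick check of those ratios, here is a small sketch that divides the write figure by the read figure for each review. The numbers are the ones quoted in this answer; the dictionary layout and the function name are just for illustration.

```python
# (read MB/s, write MB/s) for random 4K, as quoted from each review above.
reviews = {
    "ArsTechnica": (37, 151),
    "AnandTech": (137, 437),
    "PC World": (30, 155),
}


def write_to_read_ratio(read_mbps, write_mbps):
    """How many times faster the 4K write result is than the 4K read result."""
    return write_mbps / read_mbps


ratios = {name: write_to_read_ratio(r, w) for name, (r, w) in reviews.items()}

for name, ratio in ratios.items():
    print(f"{name}: writes are {ratio:.1f}x faster than reads")
```

The three reviews land between roughly 3x and 5x, so a result in that range on your own system is what you would expect.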

Your drive is fine. There's no reason to believe it is faulty, or that your system has a problem.


The SSD controller caches writes in its onboard NVRAM and flushes them to the flash media at opportune times. Write latency is thus the cache access latency, typically around 20 µs. Reads, by contrast, are served from the media itself, with an access time of 120-150 µs at best.
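
Those latencies translate directly into throughput when only one request is in flight at a time. The sketch below assumes a queue depth of 1, a 4096-byte block, and decimal megabytes; these are assumptions of the example, and real benchmarks add host-side overhead on top.

```python
BLOCK_BYTES = 4096  # one 4K request


def qd1_throughput_mbps(latency_us):
    """Throughput in MB/s when each request must finish before the next starts.

    Bytes per microsecond is numerically the same as (decimal) MB per second.
    """
    return BLOCK_BYTES / latency_us


write_mbps = qd1_throughput_mbps(20)        # cached write, ~20 us
read_best_mbps = qd1_throughput_mbps(120)   # media read, best case
read_worst_mbps = qd1_throughput_mbps(150)  # media read, slower case

print(f"4K write at 20 us:  {write_mbps:.1f} MB/s")
print(f"4K read at 120 us:  {read_best_mbps:.1f} MB/s")
print(f"4K read at 150 us:  {read_worst_mbps:.1f} MB/s")
```

Under these assumptions the write ceiling is about 205 MB/s and the read ceiling is about 27-34 MB/s, which is the same order of magnitude as the single-queue 4K results people typically report for this kind of drive.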