How can I benchmark my HDD?

I usually use hdparm to benchmark my HDDs. You can benchmark both direct reads and cached reads. You'll want to run each command a couple of times and take the average.

Examples

Here's a direct read.

$ sudo hdparm -t /dev/sda2

/dev/sda2:
 Timing buffered disk reads: 302 MB in  3.00 seconds = 100.58 MB/sec

And here's a cached read.

$ sudo hdparm -T /dev/sda2

/dev/sda2:
 Timing cached reads:   4636 MB in  2.00 seconds = 2318.89 MB/sec

Details

-t     Perform  timings  of  device reads for benchmark and comparison 
       purposes.  For meaningful results, this operation should be repeated
       2-3 times on an otherwise inactive system (no other active processes) 
       with at least a couple of megabytes of free memory.  This displays  
       the  speed of reading through the buffer cache to the disk without 
       any prior caching of data.  This measurement is an indication of how 
       fast the drive can sustain sequential data reads under Linux, without 
       any filesystem overhead.  To ensure accurate  measurements, the 
       buffer cache is flushed during the processing of -t using the 
       BLKFLSBUF ioctl.

-T     Perform timings of cache reads for benchmark and comparison purposes.
       For meaningful results, this operation should be repeated 2-3
       times on an otherwise inactive system (no other active processes) 
       with at least a couple of megabytes of free memory.  This displays
       the speed of reading directly from the Linux buffer cache without 
       disk access.  This measurement is essentially an indication of the
       throughput of the processor, cache, and memory of the system under 
       test.
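
You can also run both timings in one go; the whole-disk device below is just an example, substitute your own drive or partition.

$ sudo hdparm -Tt /dev/sda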

Using dd

I have also used dd for this type of testing. One modification I would make to the command above is to append ; rm ddfile to the end:

$ time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"; rm ddfile

This removes ddfile after the command has completed. Note: ddfile is a transient file that you don't need to keep; it's the file dd writes to (of=ddfile) while it puts your HDD under load.
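
If you also want a rough read figure from dd, here's a minimal sketch along the same lines (ddfile is the same throwaway scratch file, recreated because the command above deletes it; dropping the Linux page cache first ensures you measure the disk rather than RAM):

$ dd if=/dev/zero of=ddfile bs=8k count=250000 conv=fdatasync   # recreate the ~2 GB scratch file
$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches              # flush the page cache
$ time dd if=ddfile of=/dev/null bs=8k; rm ddfile               # timed read, then clean up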

Going beyond

If you need more rigorous testing of your HDDs, you can use Bonnie++.
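
A minimal Bonnie++ sketch, assuming the bonnie++ package is installed; -d points at a directory on the disk under test (/mnt/test is only an example path you can write to), -s is the test file size in MB (roughly twice your RAM, so the page cache can't mask disk speed) and -m is just a label for the results.

$ bonnie++ -d /mnt/test -s 8192 -m mybox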

References

  • How to use 'dd' to benchmark your disk or CPU?
  • Benchmark disk IO with DD and Bonnie++

(This is a very popular question - you can see variations of it on https://stackoverflow.com/q/1198691 , https://serverfault.com/q/219739/203726 and https://askubuntu.com/q/87035/740413 )

Are there better methods [than dd] to [benchmark disks]?

Yes, but they will take longer to run and require knowledge of how to interpret the results - there's no single number that will tell you everything in one go, because the following factors influence the type of test you should run:

  • Are you interested in the performance of I/O that is random, sequential or some mix of the two?
  • Are you reading from or writing to the disk (or some mixture of the two)?
  • Are you concerned about latency, throughput or both?
  • Are you trying to understand how different parts of the same hard disk perform (on spinning disks, speeds are generally faster closer to the start of the disk, which sits on the outer tracks)?
  • Are you interested in how a given filesystem will perform when using your disk or do you want results closer to the disk's raw performance by doing I/O straight to a block device?
  • Are you interested in how a particular size of I/O performs?
  • Are you submitting the I/O synchronously or asynchronously?
  • How much I/O are you submitting (submit too little the wrong way and all the I/O could be cached so you wind up testing the speed of your RAM rather than the speed of the disk)?
  • How compressible is the content of the data you are writing (e.g. zero-only data is highly compressible and some filesystems/disks even have a special fast path for zero-only data, leading to numbers that are unobtainable with other content; see the dd sketch after this list)?

And so on.
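
To illustrate the last two pitfalls with dd, a minimal sketch (testfile is just a throwaway name): conv=fdatasync forces the data onto the disk before dd reports a speed, and /dev/urandom supplies incompressible data so a deduplicating/compressing drive can't cheat - though /dev/urandom itself can become the bottleneck on fast disks, which is one reason fio generates pseudo-random buffers internally.

$ dd if=/dev/urandom of=testfile bs=1M count=1024 conv=fdatasync
$ rm testfile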

Here's a short list of tools, with the easiest to run at the top and the more difficult/more thorough/better ones nearer the bottom:

  1. dd (sequential reads or writes, only shows throughput, can be configured to use a filesystem or block device, can be configured to bypass the block cache/wait for I/O to be really completed)
  2. hdparm (sequential reads only, only shows throughput, never uses a filesystem, can be configured to bypass the block cache, cache test only re-reads the starting 2 MBytes)
  3. GNOME Disk Utility's benchmark (easy to run, never uses a filesystem, graphical but requires a full GNOME install, gives latency and throughput numbers for different types of I/O but write workload is actually doing read/write/fsync per sample size).
  4. fio (can do nearly anything and gives detailed results but requires configuration and an understanding of how to interpret said results). Here's what Linus says about it:

    Greg - get Jens' FIO code. It does things right, including writing actual pseudo-random contents, which shows if the disk does some "de-duplication" (aka "optimize for benchmarks"):

    [ https://github.com/axboe/fio/ ]

    Anything else is suspect - forget about bonnie or other traditional tools.

Source: comment left on Google Plus to Greg Kroah-Hartman by Linus Torvalds.
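
As a starting point, here's a minimal fio sketch (the job name, file name and parameters are only illustrative): a 4 KiB random-read test against a 1 GB scratch file, bypassing the page cache with direct I/O and keeping 32 requests in flight for 60 seconds.

$ fio --name=randread-test --filename=fio-testfile --size=1G \
      --rw=randread --bs=4k --direct=1 --ioengine=libaio \
      --iodepth=32 --runtime=60 --time_based
$ rm fio-testfile

Read the latency percentiles as well as the bandwidth/IOPS lines in the output; that is where fio earns its keep over the simpler tools.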


With the IOPS tool

If you can't be bothered to read all this I'd just recommend the IOPS tool. It will tell you real-world speed depending on block size.
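
A minimal sketch, assuming you have fetched the iops script (for example from its GitHub repository) and made it executable; the device name is only an example, and the tool only issues reads:

$ sudo ./iops /dev/sda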


Otherwise - when doing an IO benchmark I would look at the following things:

  • blocksize/cache/IOPS/direct vs buffered/async vs sync
  • read/write
  • threads
  • latency
  • CPU utilization

  • Which blocksize will you use: If you want to read/write 1 GB from/to disk and can do it in one large I/O operation, it will be quick. But if your application needs to write 512 byte chunks all over the hard disk in non-sequential pieces (called random I/O, although it is not truly random), this will look quite different. Now, databases will do random I/O for the data volume and sequential I/O for the log volume due to their nature. So, first you need to become clear about what you want to measure. If you want to copy large video files, that's different from installing Linux.

    This blocksize affects the number of I/O operations you do. If you do e.g. 8 sequential read (or write, just not mixed) operations, the I/O scheduler of the OS will merge them. If it does not, the controller's cache will do the merge. There is practically no difference between reading 8 sequential blocks of 512 bytes and one 4096 byte chunk. One exception: if you do direct, synchronous IO and wait for the first 512 bytes before you request the next 512 bytes. In that case, increasing the block size is like adding cache.

    Also you should be aware that there is sync and async IO: with sync IO you will not issue the next IO request before the current one returns. With async IO you can request e.g. 10 chunks of data and then wait as they arrive. Distinct database threads will typically use sync IO for the log and async IO for the data. The IOPS tool takes care of that by measuring all relevant block sizes starting from 512 bytes.

  • Will you read or write: Usually reading is faster than writing. But note that caching works quite differently for reads and writes:

    • For writes, the data is handed over to the controller, and if the controller caches, it acknowledges the write before the data is on disk, unless the cache is full. Using the iozone tool you can draw beautiful graphs of cache-effect plateaus (CPU cache effects and buffer cache effects). The caches become less efficient the more data has been written.

    • For reads, the data is held in cache after the first read. The first reads take longest, and caching becomes more and more effective over uptime. Notable caches are the CPU cache, the OS's file system cache, the IO controller's cache and the storage's cache. The IOPS tool only measures reads. This allows it to "read all over the place", which you would not want a benchmark to do with writes.

  • How many threads will you use: If you use one thread (as dd does for disk benchmarks) you will probably get much worse performance than with several threads. The IOPS tool takes this into account and reads with several threads.

  • How important is latency for you: Looking at databases, IO latency becomes enormously important. Any insert/update/delete SQL command is written into the database journal ("log" in database lingo) on commit before it is acknowledged. This means the complete database may be waiting for this IO operation to be completed. I show here how to measure the average wait time (await) using the iostat tool (see the sketch after this list).

  • How important is CPU utilization for you: Your CPU may easily become the bottleneck for your application's performance. In this case you must know how many CPU cycles get burned per byte read/written and optimize in that direction. This can mean deciding for or against PCIe flash memory, depending on your measurement results. Again, the iostat tool can give you a rough estimate of the CPU utilization caused by your IO operations.
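
To cover the last two points, a minimal iostat sketch (iostat comes with the sysstat package); with -x it prints extended per-device statistics plus a CPU summary, sampled here once per second:

$ iostat -x 1

Watch the await (or r_await/w_await) column for average I/O wait time in milliseconds, %util for how busy the device is, and the CPU lines for how many cycles your IO is costing you.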