How do storage IOPS change in response to disk capacity?

Solution 1:

I know this is probably a hypothetical question... But the IT world really doesn't work that way. There are realistic constraints to consider, plus other things that can influence IOPS...

  • 50GB and 100GB disks don't really exist anymore. Think more: 72, 146, 300, 450, 600, 900, 1200GB in enterprise disks and 500, 1000, 2000, 3000, 4000, 6000GB in nearline/midline bulk-storage media.

  • There's so much abstraction in modern storage; disk caching, controller caching, SSD offload, etc. that any differences would be difficult to discern.

  • You have different drive form factors, interfaces and rotational speeds to consider. SATA disks have a different performance profile than SAS or nearline SAS. 7,200RPM disks behave differently than 10,000RPM or 15,000RPM. And the availability of the various rotational speeds is limited to certain capacities.

  • Physical controller layout. SAS expanders, RAID/SAS controllers can influence IOPS, depending on disk layout, oversubscription rates, whether the connectivity is internal to the server or in an external enclosure. Large numbers of SATA disks perform poorly on expanders and during drive error conditions.

  • Some of this can be influenced by fragmentation and the used capacity on the disk array.

  • Ever hear of short-stroking?

  • Software versus hardware RAID, prefetching, adaptive profiling...

What leads you to believe that capacity would have any impact on performance in the first place? Can you provide more context?

Edit:

If the disk type, form factor, interface and used-capacity are the same, then there should be no appreciable difference in IOPS. Let's say you were going from 300GB to 600GB enterprise SAS 10k disks. With the same spindle count, you shouldn't see any performance difference...

However, if the NetApp disk shelves you mention employ 6Gbps or 12Gbps SAS backplanes versus a legacy 3Gbps, you may see a throughput change in going to newer equipment.

Solution 2:

One place where there is a direct relationship between disk size and IOPS is in the Amazon AWS cloud and other "cloudy" services. Two types of AWS services (Elastic Block Store and Relational Database Service) provide higher IOPS for larger disk sizes.

Note that this is an artificial restriction placed by Amazon on their services. There is no hardware-bound reason for this to be the case. However, I have seen devops types who are unfamiliar with unvirtualized hardware believe this restriction also applies to desktop systems and the like. The disk size / IOPS relationship is a cloud marketing restriction, not a hardware restriction.
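As a rough illustration of how that artificial scaling works, here is a minimal sketch of the capacity-to-IOPS relationship AWS documents for gp2 EBS volumes (roughly 3 IOPS per GiB of baseline, clamped between a floor and a ceiling; the exact cap has changed over time, so treat these constants as assumptions):

```python
def gp2_baseline_iops(size_gib, floor=100, cap=16000):
    """Approximate baseline IOPS for an EBS gp2 volume.

    gp2 baseline IOPS scale at roughly 3 IOPS per GiB of provisioned
    capacity, clamped to a floor and a cap (both assumed here, since
    AWS has changed the numbers over the years).
    """
    return max(floor, min(cap, 3 * size_gib))

for size in (50, 100, 500, 1000, 6000):
    print(f"{size:>5} GiB -> {gp2_baseline_iops(size):>6} baseline IOPS")
```

The point stands either way: the scaling comes from the provider's pricing model, not from the physics of the underlying disks.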


Solution 3:

To answer your question directly: all other things being equal, there is no change whatsoever when the GB changes.

You don't calculate IOPS from GB. You use the average seek time and the rotational latency.

I could re-write it all here but these examples below do all that already and I would simply be repeating it:

https://ryanfrantz.com/posts/calculating-disk-iops.html

http://www.big-data-storage.co.uk/how-to-calculate-iops/

http://www.wmarow.com/strcalc/

http://www.thecloudcalculator.com/calculators/disk-raid-and-iops.html
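For completeness, the formula those calculators apply boils down to one line. A minimal sketch, using typical (assumed, not measured) figures for enterprise drives; note that capacity appears nowhere in it:

```python
def drive_iops(avg_seek_ms, rpm):
    """Estimate random IOPS for a single spinning disk.

    IOPS ~= 1 / (average seek time + average rotational latency),
    where average rotational latency is half a revolution.
    """
    rotational_latency_ms = (60_000 / rpm) / 2   # half a rotation, in ms
    return 1000 / (avg_seek_ms + rotational_latency_ms)

print(round(drive_iops(avg_seek_ms=8.5, rpm=7200)))    # ~ 79 IOPS
print(round(drive_iops(avg_seek_ms=3.8, rpm=10000)))   # ~ 147 IOPS
print(round(drive_iops(avg_seek_ms=3.0, rpm=15000)))   # ~ 200 IOPS
```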


Solution 4:

I should point out that IOPS are not a great measurement of speed on sequential writes, but let's just go with it.

I suspect the seek and write times of disk heads are pretty consistent regardless of the size of the disks. 20 years ago we were all using 60GB disks with (roughly, certainly not linearly) the same read/write speeds.

I am making an educated guess, but I don't think that the density of the disk relates linearly with the performance of the disk.

For example, take an array with 10 X 100GB disks.

Measure IOPS for sequential 256kb block writes (or any IOPS metric)

Let's assume the resulting measured IOPS is 1000 IOPS.

OK

Change the array for one with 10 X 200GB disks. Format with same RAID configuration, same block size, etc.

Would one expect the IOPS to remain the same, increase, or decrease?

Probably remain roughly equivalent to one another.

Would the change be roughly linear?

The history of spinning media tells me there is probably no relationship.

Repeat these questions with 10 X 50GB disks

Again, roughly equivalent.

Your speed, in all these cases, comes from the fact that the RAID acts like one single disk with ten write heads, so you can send one-tenth of the work in parallel to each disk.
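A back-of-the-envelope sketch of that thought experiment, assuming a crude model where array IOPS scale with spindle count and ignore capacity entirely (the per-disk figure is an assumption, not a measurement):

```python
def array_iops(per_disk_iops, spindles):
    """Crude model: a striped array delivers roughly per-disk IOPS
    multiplied by spindle count, independent of spindle capacity."""
    return per_disk_iops * spindles

# 10 x 50GB, 10 x 100GB and 10 x 200GB arrays all land in the same place
# if the disks share the same mechanics (RPM, seek time).
for capacity_gb in (50, 100, 200):
    print(f"10 x {capacity_gb}GB -> ~{array_iops(100, 10)} IOPS")
```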

Whilst I have no hard numbers to show you, my past experience tells me that increasing your disks performance is not quite so simple as getting more capacity.

Despite what the marketing people tell you is innovation, there was little significant development in the performance of spinning media in the 20 years before cheap(er) solid-state disks arrived. Presumably there's only so much you can get out of rust, and only so fast we can get our current models of disk heads to go.


Solution 5:

The performance added to the storage scales with each spindle added. The rotational speed of the drive is the biggest factor, so adding a 10k RPM drive will give more performance (in terms of IO/s in random IO or MB/s in streaming IO) than a 7.2k RPM drive. The size of the drive has virtually no effect.

People say small drives go faster simply because you require more spindles per usable TB. Increasing the drive size of those spindles won't decrease performance, but it will allow you to fit more data on the disks, which may result in an increased workload.
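To illustrate the "more spindles per usable TB" point, here is a quick sketch (the per-drive IOPS figure and capacities are assumptions, and RAID write penalties are ignored) showing why a pool built from smaller drives ends up with more aggregate IOPS for the same usable space:

```python
import math

def pool_iops(usable_tb, drive_tb, per_drive_iops):
    """Spindles needed to reach a target usable capacity, and the
    aggregate random IOPS those spindles bring."""
    spindles = math.ceil(usable_tb / drive_tb)
    return spindles, spindles * per_drive_iops

for drive_tb in (0.3, 0.6, 1.2):   # 300GB, 600GB, 1.2TB 10k SAS drives
    spindles, iops = pool_iops(usable_tb=12, drive_tb=drive_tb, per_drive_iops=140)
    print(f"{drive_tb}TB drives: {spindles} spindles, ~{iops} IOPS")
```

The smaller drives win on IOPS only because you had to buy more of them to hit the same capacity, not because each individual small drive is any faster.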