What is the difference between sector and cluster?

File systems treat a cluster (also called an allocation unit or block) as the smallest unit because addressing the entire disk per sector would require more bits to index it all. More bits mean more addresses and more bookkeeping, which slows things down. It's far more efficient to address (and index!) locations using, say, 48 bits (2^48 ≈ 2.8e14) than 64 or more bits (2^64 ≈ 1.8e19) for every single access of the device.
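To make the addressing argument concrete, here is a minimal sketch that counts how many bits are needed to give every unit on a disk its own index. The function name `address_bits` and the 4 TiB disk size are assumptions chosen for illustration; real file systems pick fixed index widths rather than the bare minimum.

```python
def address_bits(disk_bytes: int, unit_bytes: int) -> int:
    """Minimum number of bits needed to index every unit on the disk."""
    # Ceiling division: a partial unit at the end still needs an index.
    units = -(-disk_bytes // unit_bytes)
    # (units - 1).bit_length() is the bit count of the largest index.
    return max(1, (units - 1).bit_length())


DISK = 4 * 2**40  # hypothetical 4 TiB disk

for label, unit in [("512-byte sector", 512),
                    ("4 KiB cluster", 4096),
                    ("64 KiB cluster", 65536)]:
    print(f"{label}: {address_bits(DISK, unit)} bits")
```

On this hypothetical disk, indexing by 512-byte sector needs 33 bits, while 4 KiB clusters need 30 and 64 KiB clusters need 26.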

But yes, cluster size, also called allocation unit size (Windows) or block size (Linux), is adjustable depending on the file system defined, and it is the smallest unit an OS can normally allocate to store file data. "Defining a filesystem" means formatting the disk (or the specification of that format), which implies erasing all data on the disk. So on a disk with a cluster size of 4 KiB, a 1-byte file would indeed take up an entire 4 KiB cluster, as in your example. Yes, the OS could write to a specific sector within that cluster, but the file still occupies every sector of that cluster (a file's size on disk is always a multiple of the cluster size, regardless of what data is in it). Changing the cluster size means re-formatting the disk, which is why all data must be erased.
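The rounding-up behaviour can be sketched in a few lines. This is a simplified model, not how any particular file system is implemented: the name `size_on_disk` is hypothetical, and it ignores metadata and the fact that some file systems store very small files inside their own metadata records.

```python
def size_on_disk(file_bytes: int, cluster_bytes: int) -> int:
    """Space a file occupies when allocation happens in whole clusters."""
    # Ceiling division: any partly filled cluster is still fully allocated.
    clusters = -(-file_bytes // cluster_bytes)
    return clusters * cluster_bytes


print(size_on_disk(1, 4096))     # 1-byte file on 4 KiB clusters
print(size_on_disk(4096, 4096))  # exactly one full cluster
print(size_on_disk(4097, 4096))  # one byte over spills into a second cluster
```

The first call returns 4096, matching the 1-byte file that takes up a whole 4 KiB cluster.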

Incidentally, smaller cluster sizes store small files more efficiently. However, the disk will generally run slower as a consequence, because of the increased number of clusters. When your PC is just sitting there grinding on the disk for a long time, one common cause is that it's trying to read or write a huge number of small blocks, and the sheer number of them slows everything down.

Ex: 100,000 768-byte files, stored on a disk with 1 KiB clusters:

  • 76.8 MB of actual file data

  • 102.4 MB of the disk used, because each file uses 1024 bytes of the disk.

  • Space efficiency = 76.8/102.4 = 75% (not bad...)

And likewise, larger clusters are better for disks with fewer, larger files on them like movies, images, and audio. Since there are fewer clusters, the disk is generally faster. But be careful putting lots of small files on it:

Ex: 100,000 768-byte files, stored on a disk with 64 KiB clusters:

  • 76.8 MB of actual file data

  • 6.55 GB of the disk used, because each file uses 65,536 bytes of the disk.

  • Space efficiency = 76.8/6553.6 ≈ 1.2% !!!
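Both examples can be recomputed with a short sketch. The helper name `space_efficiency` is hypothetical, and the model assumes every file is rounded up to whole clusters with no other overhead.

```python
def space_efficiency(file_count: int, file_bytes: int, cluster_bytes: int):
    """Return (data bytes, bytes used on disk, data/used ratio)."""
    data = file_count * file_bytes
    # Each file is rounded up to a whole number of clusters.
    clusters_per_file = -(-file_bytes // cluster_bytes)
    used = file_count * clusters_per_file * cluster_bytes
    return data, used, data / used


for cluster in (1024, 65536):
    data, used, ratio = space_efficiency(100_000, 768, cluster)
    print(f"{cluster}-byte clusters: {data / 1e6:.1f} MB of data, "
          f"{used / 1e6:.1f} MB used, efficiency {ratio:.2%}")
```

Trying other cluster sizes or file sizes in the loop shows how quickly slack space grows once the cluster is much larger than the typical file.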

Disks with mixed content, such as an operating system, generally have medium-to-small cluster/block sizes, as most of the files are medium-to-small in size. The end result is a compromise between space utilization and speed.

The disks themselves prefer anywhere from 32kB to 256kB blocks, as that allows them to transfer the most data per second.

This all concerns traditional mechanical, rotating-platter magnetic hard disks. SSDs (Solid-State Drives) are quickly replacing traditional hard disks and boast much faster read/write/seek speeds. So is cluster size important on an SSD today? Well, I'd say it is less important to the average user, but only because SSDs (and modern computers) are much faster already. Who is going to notice an SSD slow-down of 10% when it is already 5x faster than a magnetic hard disk?

What might influence the choice of cluster size on an SSD more is throughput. You might find (by formatting and benchmarking) that a certain cluster size works far better than others for that SSD. For example, some SSDs are optimized for 8 KiB or 4 KiB transfers. This has to do with how big a block of data the electronics inside are prepared to transfer per request. Matching what the OS is attempting to use (the cluster size) to the optimal size for that SSD gives the fastest transfer speed.

Cluster size is still important on SSDs for file "overhead" reasons, however.

I've found a great tool for benchmarking SSDs is AS-SSD for Windows and these on Linux.
