How can I visualize hard disk space with millions of files?

Solution 1:

Assuming your OS is Windows...

Any way you slice it, tabulating millions of files is always going to take a long time and will be limited by the I/O of the disk itself. I recommend TreeSize Professional, or perhaps SpaceObServer. You could give the freeware version of TreeSize a try as well.

Solution 2:

Definitely try WinDirStat: it gives a fantastic visualization of disk use by depicting each file as a rectangle drawn to scale, color-coded by file type. Click any item in the visualization and you'll see it in the directory tree.

The standard 32-bit build is limited to 10 million files and 2 GB of RAM usage, but the source code will build successfully as a 64-bit application. The fact that the server in question has only 2 GB of RAM may be a problem in this specific case, but most servers with such large numbers of files will have much more RAM.

Edit #1: Unfortunately, when tested on a 4 TB volume containing millions of files, WinDirStat Portable crashed after indexing about 6.5 million files. It may not work for the original question if the drive contains 6+ million files.

Edit #2: The full version of WinDirStat also crashes, at 10 million files with 1.9 GB of RAM in use.

Edit #3: I got in touch with the WinDirStat developers: (1) they agree that this is caused by the memory limitations of the x86 architecture, and (2) they mentioned that it can be compiled as 64-bit without errors. More soon.

Edit #4: Testing a 64-bit build of WinDirStat was successful: in 44 minutes, it indexed 11.4 million files while consuming 2.7 GB of RAM.


Solution 3:

I regularly use FolderSizes on several 1TB drives with several million files with no problems.


Solution 4:

+1 for the TreeSize products, but...

Your sentence about "not cleaning enough space" makes me wonder: could you have run out of NTFS MFT reserved space? If the filesystem grabs more MFT space than is initially allocated, that space is not returned to regular file space, and it is not shown in defrag operations.

http://support.microsoft.com/kb/174619

"Volumes with a small number of relatively large files exhaust the unreserved space first, while volumes with a large number of relatively small files exhaust the MFT zone space first. In either case, fragmentation of the MFT starts to take place when one region or the other becomes full. If the unreserved space becomes full, space for user files and directories starts to be allocated from the MFT zone competing with the MFT for allocation. If the MFT zone becomes full, space for new MFT entries is allocated from the remainder of the disk, again competing with other files."


Solution 5:

  1. cd \
  2. dir /s > out.txt
  3. poof! Magic happens; or a perl hacker shows up
  4. Results!

Seriously. I've done this with 5 or 6 million files; I'm not sure exactly what you're looking for, but a good scripting language will eat this up.
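In case no perl hacker shows up for step 3, here's a minimal Python sketch of the tabulation — it walks the tree directly rather than parsing out.txt, and sums bytes per top-level directory. The `tabulate_sizes` helper and its `depth` parameter are my own illustration, not a standard tool:

```python
import os
from collections import defaultdict

def tabulate_sizes(root, depth=1):
    """Sum file sizes under root, grouped by the first `depth` path components."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Files directly in root go under ".", everything else under its
        # top-level (or deeper, per `depth`) directory name.
        key = "." if rel == "." else os.sep.join(rel.split(os.sep)[:depth])
        for name in filenames:
            try:
                totals[key] += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # file vanished or access denied mid-scan; skip it
    return totals

if __name__ == "__main__":
    # Print the 20 largest top-level directories, biggest first.
    for key, size in sorted(tabulate_sizes("C:\\").items(),
                            key=lambda kv: kv[1], reverse=True)[:20]:
        print(f"{size / 2**30:9.2f} GiB  {key}")
```

Raise `depth` to drill further down, and expect the scan to be bounded by disk I/O, just as with the GUI tools above.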