Quicker (quickest?) way to get number of files in a directory with over 200,000 files

The code you've got is slow because it first gets an array of all the available files, then takes the length of that array.

However, you're almost certainly not going to find any solutions that work much faster than that.

Why?

Access controls.

Each file in a directory may have an access control list - which may prevent you from seeing the file at all.

The operating system itself can't just say "hey, there are 100 file entries here" because some of them may represent files you're not allowed to know exist - they shouldn't be shown to you at all. So the OS itself has to iterate over the files, checking access permissions file by file.

For a discussion that goes into more detail around this kind of thing, see two posts from The Old New Thing:

  • Why doesn't the file system have a function that tells you the number of files in a directory?
  • Why doesn't Explorer show recursive directory size as an optional column?

[As an aside, if you want to improve performance of a directory containing a lot of files, limit yourself to strictly 8.3 filenames. No, I'm not kidding: it's faster, because the OS doesn't have to generate an 8.3 filename itself, and the algorithm it uses to generate one is braindead. Try a benchmark and you'll see.]


FYI, .NET 4 includes a new method, Directory.EnumerateFiles, that is awesome: it hands back file names lazily as you enumerate, rather than building the whole array up front. Chances are you're not using .NET 4, but it's worth remembering anyway!

Edit: I now realise that the OP wanted the NUMBER of files. However, this method is so useful I'm keeping this post here.
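
For what it's worth, the same streaming idea can be used for counting: walk the entries one at a time and keep a running total, so no 200,000-element array is ever built. Here's a minimal sketch of that, in Python purely for illustration, with os.scandir standing in for Directory.EnumerateFiles (count_files is a made-up helper name, not anything from .NET). The OS still has to visit every entry, so don't expect miracles on speed; the saving is mostly memory.

```python
import os


def count_files(path):
    """Count regular files directly inside `path` without building a list.

    Hypothetical helper for illustration. os.scandir yields directory
    entries lazily, so only a running total is kept in memory.
    """
    with os.scandir(path) as entries:
        # Subdirectories are skipped; only regular files are counted.
        return sum(1 for entry in entries if entry.is_file())
```

Calling count_files on the big directory returns the number of files at the top level only; it does not recurse into sub-directories.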


I had a very similar problem with a directory containing (we think) ~300,000 files.

After messing with lots of methods for speeding up access (all unsuccessful) we solved our access problems by reorganising the directory into something more hierarchical.

We did this by creating directories a-z, representing the first letter of the file name, then sub-directories inside each of those, also a-z, for the second letter of the file name. Then we moved each file into the matching directory,

e.g.

gbp32.dat

went in

g/b/gbp32.dat

We then re-wrote our file access routines to match. This made a massive difference, and it's relatively trivial to do (I think we moved each file using a 10-line Perl script).
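
The original Perl script is long gone, so here's a rough sketch of the same idea in Python instead. Everything named here (shard_path, reorganise, the "_" fallback bucket) is my own invention for illustration; in particular, the "_" bucket for characters outside a-z and for one-character names is an assumption, since our real files always started with two letters.

```python
import os
import shutil

FALLBACK = "_"  # hypothetical bucket for anything that is not a-z


def bucket(ch):
    """Map one character to its directory name (a-z, else the fallback)."""
    ch = ch.lower()
    return ch if "a" <= ch <= "z" else FALLBACK


def shard_path(root, name):
    """Return root/<first letter>/<second letter>/<name>."""
    first = bucket(name[0]) if len(name) > 0 else FALLBACK
    second = bucket(name[1]) if len(name) > 1 else FALLBACK
    return os.path.join(root, first, second, name)


def reorganise(root):
    """Move every file sitting directly in `root` into its two-level bucket."""
    # Snapshot the names first so the directory is not modified mid-scan.
    with os.scandir(root) as entries:
        names = [entry.name for entry in entries if entry.is_file()]
    for name in names:
        target = shard_path(root, name)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.move(os.path.join(root, name), target)
    return len(names)
```

The re-written access routines then only need to call something like shard_path(root, "gbp32.dat") wherever they used to join the root directory and the file name directly.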

Tags:

.Net

File Io