Faster alternatives to "find" and "locate"?

Searching for source files in a project

Use a simpler command

Generally, a project's source is likely to live in one place, perhaps in a few subdirectories nested no more than two or three levels deep, so you can use a (possibly) faster command such as

(cd /path/to/project; ls *.c */*.c */*/*.c)

Make use of project metadata

In a C project you'd typically have a Makefile; other projects often have something similar. Build metadata can be a fast way to extract a list of files (and their locations), so write a script that makes use of this information to locate files. I have a "sources" script that lets me write commands like grep variable $(sources programname).
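As a minimal sketch of such a script (the ~/projects layout, the SRCS variable name, and the script itself are assumptions, not the original author's code; requires GNU make with --eval support):

#!/bin/sh
# sources: print a project's source files, with full paths, by asking
# make to expand the Makefile's SRCS variable.
# Assumes projects live under ~/projects/<name> and each Makefile
# defines SRCS; both are hypothetical conventions for this sketch.
dir=$HOME/projects/$1
make -C "$dir" -s \
    --eval 'print-srcs: ; @printf "%s\n" $(addprefix $(CURDIR)/,$(SRCS))' \
    print-srcs

With something like this in place, grep variable $(sources programname) expands to a grep over exactly the files the build system already knows about, with no directory traversal at all.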

Speeding up find

Search fewer places: instead of find / … use find /path/to/project … where possible. Simplify the selection criteria as much as possible, and use a pipeline to defer the more expensive criteria when that is more efficient.
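For instance (a sketch; the path and some_variable are placeholders), a cheap name test can stay inside find while the expensive content test moves into a pipeline stage rather than a per-file -exec:

find /path/to/project -name '*.c' -print0 | xargs -0 grep -l some_variable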

You can also limit the depth of the search with the -maxdepth switch (for example, -maxdepth 5); for me, this speeds up find considerably.
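For example, to look no more than five levels below the project root (the path and pattern are placeholders):

find /path/to/project -maxdepth 5 -name '*.c'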

Speeding up locate

Ensure it is indexing the locations you are interested in. Read the man page and make use of whatever options are appropriate to your task.

   -U <dir>
          Create slocate database starting at path <dir>.

   -d <path>
          --database=<path> Specifies the path of databases to search in.

   -l <level>
          Security level. 0 turns security checks off; this will make
          searches faster. 1 turns security checks on (the default).

Remove the need for searching

Maybe you are searching because you have forgotten where something is, or because you were never told. In the former case, write notes (documentation); in the latter, ask. Conventions, standards and consistency can help a lot.


I used the "speeding up locate" part of RedGrittyBrick's answer. I created a smaller db:

updatedb -o /home/benhsu/ben.db -U /home/benhsu/ -e "uninteresting/directory1 uninteresting/directory2"

then pointed locate at it: locate -d /home/benhsu/ben.db
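If you query that database often, a shell alias (bash/zsh; just a convenience, not part of the original answer) saves retyping the -d flag:

alias locate='locate -d /home/benhsu/ben.db'

After that, a plain locate pattern searches only the personal database.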


A tactic that I use is to apply the -maxdepth option with find:

find -maxdepth 1 -iname "*target*"

Repeat with increasing depths until you find what you are looking for, or you get tired of looking. The first few iterations are likely to return instantaneously.

This ensures that you don't waste up-front time looking through the depths of massive sub-trees when what you are looking for is more likely to be near the base of the hierarchy.


Here's an example script to automate this process (Ctrl-C when you see what you want):

(
TARGET="*target*"
# Search one level at a time, shallowest first.
for i in $(seq 1 9) ; do
   echo "=== search depth: $i"
   find -mindepth "$i" -maxdepth "$i" -iname "$TARGET"
done
# Anything deeper than nine levels is swept up in a final pass.
echo "=== search depth: 10+"
find -mindepth 10 -iname "$TARGET"
)

Note that the inherent redundancy (each pass re-traverses the directories processed in previous passes) is largely hidden by disk caching.

Why doesn't find have this breadth-first search order as a built-in feature? Perhaps because it would be complicated or impossible to implement without accepting the redundant traversal. The existence of the -depth option hints at the possibility, but alas...