How to make this search faster in fgrep/Ag?

Since you're using ack and The Silver Searcher (ag), it seems that you are OK with using additional tools.

A new tool in this space is ripgrep (rg). It is designed to be fast both at finding the files to search (like ag) and at searching the files themselves (like plain old GNU grep).

For the example in your question, you might use it something like this:

rg --files-with-matches --glob "*.tex" "and" "$HOME"
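If you want fgrep-style literal matching and case-insensitivity, the long-form flags can be spelled out. The sketch below is only an illustration: search_tex is a hypothetical function name, and the find/grep fallback is my own addition for machines without rg.

```shell
# search_tex PATTERN DIR: list .tex files under DIR that contain PATTERN
# as a literal string, ignoring case. search_tex is a hypothetical name.
search_tex() {
  if command -v rg >/dev/null 2>&1; then
    # Long forms of -l, -F and -i; --glob restricts the search to *.tex
    rg --files-with-matches --fixed-strings --ignore-case \
       --glob '*.tex' -- "$1" "$2"
  else
    # Fallback when rg is not installed: plain find + grep
    find "$2" -type f -name '*.tex' -exec grep -Fli -- "$1" {} +
  fi
}
```

You would call it as search_tex and "$HOME". Keep in mind that rg skips hidden files and honors .gitignore rules by default, so its results can differ from find/grep; --hidden and --no-ignore turn those filters off.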

The author of ripgrep posted a detailed analysis of how the different searching tools work, along with benchmark comparisons.

One of the benchmarks, linux-literal-casei, is somewhat similar to the task you describe. It searches over a large number of files in a lot of nested directories (the Linux codebase), searching for a case-insensitive string literal.

In that benchmark, rg ran fastest when given a whitelist (like your "*.tex" example), recording the best sample time. The ucg tool, which was also run with a whitelist, had the best mean time.

rg (ignore)         0.345 +/- 0.073 (lines: 370)
rg (ignore) (mmap)  1.612 +/- 0.011 (lines: 370)
ag (ignore) (mmap)  1.609 +/- 0.015 (lines: 370)
pt (ignore)        17.204 +/- 0.126 (lines: 370)
sift (ignore)       0.805 +/- 0.005 (lines: 370)
git grep (ignore)   0.343 +/- 0.007 (lines: 370)
rg (whitelist)      0.222 +/- 0.021 (lines: 370)+
ucg (whitelist)     0.217 +/- 0.006 (lines: 370)* 

* - Best mean time. + - Best sample time.

The author excluded ack from the benchmarks because it was much slower than the others.


You could probably make it a little bit faster by running multiple find calls in parallel. For example, first get all top-level directories and run N find calls, one for each directory. If you run them in a subshell, you can collect the output and pass it to vim or anything else:

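shopt -s dotglob ## So the glob also finds hidden dirs
## Note: .tex files directly in $HOME (not in a subdirectory) are not searched
( for dir in "$HOME"/*/; do
    ## One background find per top-level directory
    find -L "$dir" -xtype f -name "*.tex" -exec grep -Fli and {} + &
  done
) | vim -R -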

Or, to be sure you only start getting output once all the finds have finished:

( for dir in "$HOME"/*/; do
    find -L "$dir" -xtype f -name "*.tex" -exec grep -Fli and {} + &
  done; wait  ## Block until every background find has exited
) | vim -R -

I ran a few tests and the above was indeed slightly faster than the single find. On average, over 10 runs, the single find call took 0.898 seconds and the subshell above running one find per dir took 0.628 seconds.
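If you want to repeat this kind of comparison on your own machine, a small timing helper is enough. This is a rough sketch, not the script behind the numbers above: avg_ms, single_find and parallel_find are names I made up for illustration, single_find assumes the original command was one find over the whole tree, and the timing relies on GNU date's %N (nanoseconds).

```shell
# avg_ms RUNS CMD [ARGS...]: print the mean wall-clock time of CMD in
# milliseconds over RUNS runs. Hypothetical helper; needs GNU date for %N.
# The body runs in a subshell so its variables do not leak to the caller.
avg_ms() (
  runs=$1; shift
  i=0
  total_ns=0
  while [ "$i" -lt "$runs" ]; do
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1          # discard output, only time the command
    end=$(date +%s%N)
    total_ns=$(( total_ns + end - start ))
    i=$(( i + 1 ))
  done
  echo $(( total_ns / runs / 1000000 ))
)

# Assumed baseline: a single find over the whole tree.
single_find() {
  find -L "$1" -xtype f -name "*.tex" -exec grep -Fli and {} +
}

# One background find per top-level directory, then wait for all of them.
# Hidden directories are only included if dotglob is set (bash).
parallel_find() {
  for dir in "$1"/*/; do
    find -L "$dir" -xtype f -name "*.tex" -exec grep -Fli and {} + &
  done
  wait
}
```

Running avg_ms 10 single_find "$HOME" and then avg_ms 10 parallel_find "$HOME" prints the two averages in milliseconds. The very first run may be slower than the rest if the disk cache is cold, so it can be worth doing one untimed warm-up run first.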

I assume the details will always depend on how many directories you have in $HOME, how many of them could contain .tex files and how many might match, so your mileage may vary.