grep: find multiple AND patterns in any order using a single condition

If your version of grep supports PCRE (GNU grep does this with the -P or --perl-regexp option), you can use lookaheads to match multiple words in any order:

grep -P '(?=.*?word1)(?=.*?word2)(?=.*?word3)^.*$'

This won't highlight the individual words, though. Lookaheads are zero-width assertions, so they are not part of the match; only the ^.*$ part consumes text, which means the whole line is what gets matched.
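As a quick check, here is a small sketch of the lookahead pattern run against made-up input (the sample lines are my own, and it assumes a grep with PCRE support, such as GNU grep):

```shell
# Made-up sample input: only the first two lines contain all three words.
sample='word3 then word1 then word2
word2 word1 word3
word1 and word2 only
nothing relevant here'

# -P needs a grep built with PCRE support (e.g. GNU grep).
all_three=$(printf '%s\n' "$sample" |
    grep -P '(?=.*?word1)(?=.*?word2)(?=.*?word3)^.*$')

printf '%s\n' "$all_three"
```

Only the lines that contain all three words should be printed, whatever order the words appear in.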

I think your piping solution should work for that. By default, grep only colors the output when it's going to a terminal, so only the last command in the pipeline does highlighting, but you can override this with --color=always.

grep --color=always foo | grep --color=always bar
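To see the difference, this sketch (with made-up input) captures grep's output in a variable, so that standard output is not a terminal, once with --color=auto and once with --color=always:

```shell
# Output goes to a command substitution, not to a terminal.
plain=$(printf 'foo bar\n' | grep --color=auto foo)
colored=$(printf 'foo bar\n' | grep --color=always foo)

# cat -v makes any escape sequences visible as ^[... text.
printf '%s\n' "$plain" | cat -v
printf '%s\n' "$colored" | cat -v
```

The first line should come out as plain text, while the second should show the escape sequences that grep wrapped around foo.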

A complex option would be to generate all permutations of the patterns, supply them to grep and hope that the regexp compiler generates a reasonably optimized search tree.

But even with a small number of patterns, say six, there would be 6! = 720 permutations, which makes for an awkward command line.
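For just two patterns the permutation approach is still manageable. A minimal sketch, with made-up input and the placeholder patterns foo and bar:

```shell
# Both orderings of the two patterns, joined with alternation (-E).
both=$(printf '%s\n' 'foo then bar' 'bar then foo' 'foo only' |
    grep -E 'foo.*bar|bar.*foo')

printf '%s\n' "$both"
```

With three patterns the same idea already needs 3! = 6 alternatives.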

But seeing as you seem to have no quarrels with piped grep except

    "and this will work but I would like to see colored output"

then, provided that:

  • the patterns do not overlap
  • the patterns do not contain terminal control characters

an acceptable solution would be to pipe several greps, each with one pattern, ordered from the least likely pattern to the most likely, so that the earliest greps discard as many lines as possible and the later ones have less to scan.
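The first condition matters because the color escape sequences are inserted into the line itself. A small sketch, using the made-up overlapping patterns foo and oba on the input foobar:

```shell
# Without color, both greps see the literal text "foobar".
if printf 'foobar\n' | grep foo | grep -q oba; then
    plain_result='match'
else
    plain_result='no match'
fi

# With color, the first grep wraps "foo" in escape sequences, so "oba"
# is no longer contiguous when the second grep reads the line.
if printf 'foobar\n' | grep --color=always foo | grep -q oba; then
    colored_result='match'
else
    colored_result='no match'
fi

echo "plain: $plain_result, colored: $colored_result"
```

With GNU grep's default colors, the plain pipeline should find a match and the colored one should not.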

To make sure the colorization works, you either have to make grep believe that its standard output is a terminal (using faketerm or similar) or, if it is available (it should be), use the --color=always option:

grep --color=always pattern1 file \
    | grep --color=always pattern2 \
    ...
    | grep --color=always patternN

(A nice twist would be to wrap the greps into a single string to be executed by a subshell, and generate the string programmatically using e.g. sed).
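One way to sketch this idea, swapping the generated command string for a recursive shell function so that the patterns need no extra quoting. The name grep_all is hypothetical and the sample input is made up:

```shell
# grep_all PATTERN...: keep only the lines of stdin that match every
# pattern, chaining one colorizing grep per pattern.
grep_all() {
    if [ "$#" -eq 0 ]; then
        echo 'usage: grep_all PATTERN...' >&2
        return 2
    elif [ "$#" -eq 1 ]; then
        grep --color=always -e "$1"
    else
        first=$1
        shift
        # The remaining patterns are handled by the rest of the pipeline.
        grep --color=always -e "$first" | grep_all "$@"
    fi
}

printf '%s\n' 'foo bar' 'bar only' 'bar then foo' | grep_all foo bar
```

As with the hand-written pipeline, this assumes the patterns do not overlap and contain no terminal control characters.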
