Grep pairs of patterns and files

With GNU awk (gawk) you could use a BEGINFILE rule to read a new pattern each time the input file changes:

$ gawk 'BEGINFILE{getline pat < "search.patterns"} $0 ~ pat' file\ {1..3}.txt
home 3
dog 1
cat 4

You should really check that getline actually read a new pattern, for example:

gawk '
  BEGINFILE {
    if((getline pat < "search.patterns") <= 0) {
      print "Error reading pattern" > "/dev/stderr"
      exit 1
    }
  } 
  $0 ~ pat
' file\ {1..3}.txt

Note that awk patterns are extended regular expressions, similar to those supported by grep with the -E option.
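
The dialect matters when comparing this with a plain grep call, because grep without -E uses basic regular expressions. A small sketch with made-up input (the two sample lines and the variable names are only for illustration), where + repeats the previous character for awk and grep -E but is a literal character for plain grep:

```shell
# Two made-up input lines: one with repeated "a", one containing a literal "+".
input=$(printf '%s\n' 'aaa' 'a+')

# Plain grep: basic regular expressions, so "+" is an ordinary character
# and only the line containing the literal text "a+" matches.
bre=$(printf '%s\n' "$input" | grep -e 'a+')

# grep -E and awk: extended regular expressions, so "a+" means
# "one or more a" and both lines match.
ere=$(printf '%s\n' "$input" | grep -E -e 'a+')
awk_out=$(printf '%s\n' "$input" | awk -v pat='a+' '$0 ~ pat')

echo "grep:";    printf '%s\n' "$bre"
echo "grep -E:"; printf '%s\n' "$ere"
echo "awk:";     printf '%s\n' "$awk_out"
```

If the patterns in search.patterns were written with plain grep in mind, characters such as +, ?, |, ( and ) may therefore behave differently in the awk solution.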

You could achieve the same in non-GNU awk by passing search.patterns as the first file and using NR and FNR appropriately to either read the patterns into an indexed array, or look up the next pattern in the array.
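
A minimal sketch of that NR/FNR approach, assuming every input file has at least one line and search.patterns is not empty. The sample data is made up for illustration, and the names pats, n and pat are arbitrary:

```shell
# Made-up sample data in a temporary directory, for illustration only.
tmpdir=$(mktemp -d)
printf '%s\n' 'home' 'dog' 'cat' > "$tmpdir/search.patterns"
printf '%s\n' 'home 3' 'work 2'  > "$tmpdir/file 1.txt"
printf '%s\n' 'dog 1' 'bird 7'   > "$tmpdir/file 2.txt"
printf '%s\n' 'cat 4' 'fish 9'   > "$tmpdir/file 3.txt"

result=$(awk '
  # First file on the command line (search.patterns): store pattern number FNR.
  NR == FNR { pats[FNR] = $0; next }
  # First line of each later file: switch to the next stored pattern.
  FNR == 1  { pat = pats[++n] }
  # Print the lines of the current file that match the current pattern.
  $0 ~ pat
' "$tmpdir/search.patterns" "$tmpdir/file 1.txt" "$tmpdir/file 2.txt" "$tmpdir/file 3.txt")

printf '%s\n' "$result"
rm -r "$tmpdir"
```

An empty input file never triggers the FNR == 1 rule, so the pattern counter would fall out of step with the files. If there are more files than patterns, pat becomes an empty string, which matches every line.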


Using bash:

#!/bin/bash

files=( 'file 1.txt' 'file 2.txt' 'file 3.txt' )

while IFS= read -r pattern; do
    grep -e "$pattern" "${files[0]}"
    files=( "${files[@]:1}" )
done <search.patterns

Testing it:

$ bash script.sh
home 3
dog 1
cat 4

The script saves the relevant filenames in the files array, and then proceeds to read patterns from the search.patterns file. For each pattern, the first file in the files list is queried. The processed file is then deleted from the files list (yielding a new first filename in the list).

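If there are more patterns than files, the files array runs empty, "${files[0]}" expands to an empty string, and grep reports an error for each extra pattern. If there are fewer patterns than files, the remaining files are silently skipped.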
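
One way to guard against running out of files is to stop as soon that happens. The sketch below is a variation rather than the script above: it uses the positional parameters in place of the files array, so it also runs in plain sh. grep_pairs is a hypothetical helper name, and the sample data is made up.

```shell
# grep_pairs PATTERNS_FILE FILE...
# Hypothetical helper: pairs line N of PATTERNS_FILE with the Nth FILE.
grep_pairs() {
    patterns=$1
    shift
    while IFS= read -r pattern; do
        # No files left: report the unmatched pattern and give up.
        if [ "$#" -eq 0 ]; then
            echo "No file left for pattern: $pattern" >&2
            return 1
        fi
        grep -e "$pattern" "$1"
        shift   # drop the processed file; the next one becomes $1
    done < "$patterns"
}

# Made-up sample data: four patterns but only three files.
tmpdir=$(mktemp -d)
printf '%s\n' 'home' 'dog' 'cat' 'extra' > "$tmpdir/search.patterns"
printf '%s\n' 'home 3' 'work 2' > "$tmpdir/file 1.txt"
printf '%s\n' 'dog 1' 'bird 7'  > "$tmpdir/file 2.txt"
printf '%s\n' 'cat 4' 'fish 9'  > "$tmpdir/file 3.txt"

status=0
out=$(grep_pairs "$tmpdir/search.patterns" \
      "$tmpdir/file 1.txt" "$tmpdir/file 2.txt" "$tmpdir/file 3.txt") || status=$?
printf '%s\n' "$out"
echo "exit status: $status"
rm -r "$tmpdir"
```

With this data, the first three patterns are processed normally, and the fourth produces a message on standard error and a non-zero exit status instead of an error from grep.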


You could use paste to match the pattern with the file:

paste <(printf "%s\n" *.txt) search.patterns | while IFS=$'\t' read -r file pattern; do
    grep -- "$pattern" "$file"
done

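I'm assuming the filenames contain no tabs or newlines, and that *.txt expands to exactly the files to search, in the same order as the lines of search.patterns.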
