completely ignore lines that start with a specific pattern

To ignore some lines on a line-by-line basis, add /unwanted pattern/ {next} or ! /wanted pattern/ {next} at the beginning of the script.
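As a minimal sketch of this approach, assume a hypothetical input in which comment lines start with # and the rest of the script sums the first field. The sample data and the variable names are illustrative only; the ^ anchor restricts the match to the start of the line.

```shell
# Hypothetical sample input: lines starting with "#" should be skipped.
input='# header comment
3 apples
# another comment
4 pears'

# The first rule discards unwanted lines before any later rule sees them.
total=$(printf '%s\n' "$input" | awk '
  /^#/ { next }        # skip lines that start with "#"
  { sum += $1 }        # main processing runs only on the remaining lines
  END { print sum }
')
echo "$total"   # prints 7
```

Because next abandons the current record immediately, the rule has to come before any rule that should not see the ignored lines.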

Alternatively, filter with grep: grep -v 'unwanted pattern' | awk … or grep 'wanted pattern' | awk …. This may be faster if grep eliminates a lot of lines, because grep is typically faster than awk for the same task (grep is more specialized, so it can be optimized for its task; awk is a full programming language, so it can do a lot more but is less efficient).
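The grep variant might look like the sketch below, again with hypothetical sample data in which the unwanted lines start with #. Whether it is actually faster depends on the input and the implementations involved, so it is worth measuring.

```shell
# Hypothetical sample input: lines starting with "#" should be skipped.
input='# header comment
3 apples
# another comment
4 pears'

# grep -v drops the lines matching the pattern; awk only sees what is left.
grep_total=$(printf '%s\n' "$input" | grep -v '^#' | awk '{ sum += $1 } END { print sum }')
echo "$grep_total"   # prints 7
```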

If you want to ignore a block of consecutive lines, awk has a convenient facility for that: add /^IRRELEVENT DATA/,/^END/ {next} at the top of the script to ignore everything from a line starting with IRRELEVENT DATA (sic) through the first following line that starts with END, both boundary lines included. You can't do that with grep; you can do it with sed (sed '/^IRRELEVENT DATA/,/^END/d' | awk …), but sed is less likely than grep to give a performance gain.
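A small sketch of the range pattern, next to the sed pre-filter, on sample data made up for illustration (the spelling IRRELEVENT is kept to match the pattern above):

```shell
# Illustrative input with one block to skip between two wanted lines.
input='GOOD STUFF
IRRELEVENT DATA
IGNORE ALL THESE
END OF IT
GOOD STUFF'

# awk range pattern: skip from the start line through the end line, inclusive.
kept_awk=$(printf '%s\n' "$input" | awk '
  /^IRRELEVENT DATA/,/^END/ { next }
  { print }
')

# Same range deleted by sed before awk runs.
kept_sed=$(printf '%s\n' "$input" | sed '/^IRRELEVENT DATA/,/^END/d' | awk '{ print }')

echo "$kept_awk"
```

Both variables end up holding only the two GOOD STUFF lines, since the range covers the start line, the end line, and everything between them.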


Another option is to skip next and negate the pattern instead, so that only the non-matching lines are printed.

input:

$ cat f.txt 
GOOD STUFF
----------------
IRRELEVENT DATA
----------------
IGNORE ALL THESE
----------------
END OF IT
----------------
GOOD STUFF

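I want to ignore lines starting with the string IRRELEVENT, IGNORE, or END. The ^ anchors the match to the start of the line; without it, the pattern would also drop lines that contain one of these words anywhere:

$ awk '!/^(IRRELEVENT|IGNORE|END)/ { print }' f.txt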
GOOD STUFF
----------------
----------------
----------------
----------------
GOOD STUFF
