Deleting lines from one file which are in another file

Try comm instead (assuming f1 and f2 are "already sorted")

comm -2 -3 f1 f2

grep -v -x -f f2 f1 should do the trick.

Explanation:

  • -v to select non-matching lines
  • -x to match whole lines only
  • -f f2 to get patterns from f2

One can instead use grep -F or fgrep to match fixed strings from f2 rather than patterns (in case you want remove the lines in a "what you see if what you get" manner rather than treating the lines in f2 as regex patterns).


For exclude files that aren't too huge, you can use AWK's associative arrays.

awk 'NR == FNR { list[tolower($0)]=1; next } { if (! list[tolower($0)]) print }' exclude-these.txt from-this.txt 

The output will be in the same order as the "from-this.txt" file. The tolower() function makes it case-insensitive, if you need that.

The algorithmic complexity will probably be O(n) (exclude-these.txt size) + O(n) (from-this.txt size)

Tags:

Scripting

Bash

Sh