Find any line in VI that has something other than ATCG

First of all, you definitely do not want to open the file in an editor (it's much too large to edit that way).

Instead, if you just want to identify whether the file contains anything other than A, T, C and G, you may do that with

grep '[^ATCG]' filename

This would return all lines that contain anything other than those four characters.
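If the file is large, printing every offending line may be more than you want. A minimal sketch of some variations follows, using a small made-up sample file (the file name and its contents are assumptions for illustration only). The `-n` and `-c` options are standard; `-o` is available in GNU and BSD grep but is not required by POSIX.

```shell
# Hypothetical sample data, created in a temporary directory.
tmpdir=$(mktemp -d)
printf 'ATCGATCG\nATCNATCG\nGGCCTTAA\nATC-ATCG\n' > "$tmpdir/sample.txt"

# -n prefixes each matching line with its line number.
bad_lines=$(grep -n '[^ATCG]' "$tmpdir/sample.txt")
printf '%s\n' "$bad_lines"

# -c prints only the number of lines that match.
bad_count=$(grep -c '[^ATCG]' "$tmpdir/sample.txt")
printf '%s\n' "$bad_count"

# -o prints each offending character on its own line;
# sort | uniq -c then tallies how often each one occurs.
grep -o '[^ATCG]' "$tmpdir/sample.txt" | sort | uniq -c

rm -rf "$tmpdir"
```

The tally from the last command is often the quickest way to see whether the unexpected characters are, for example, `N` placeholders, lowercase bases, or stray carriage returns.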

If you want to delete these characters from the file, you may do so with

tr -c -d 'ATCG\n' <filename >newfilename

(Whether this is the right way to "correct" the file, I don't know.)

This would remove all characters in the file that are not one of the four, and it would also retain newlines (\n). The edited file would be written to newfilename.
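To see what the `tr` command actually does before running it on the real data, it may help to try it on a tiny example first. The sketch below uses made-up input (the file names and contents are assumptions), then checks the result with the same `grep` pattern as above.

```shell
# Hypothetical input containing an N, a dash, a carriage return and lowercase letters.
tmpdir=$(mktemp -d)
printf 'ATCGNATCG\nAT-CG\r\nggATCC\n' > "$tmpdir/in.txt"

# Delete (-d) every character in the complement (-c) of the set A, T, C, G, newline.
tr -c -d 'ATCG\n' < "$tmpdir/in.txt" > "$tmpdir/out.txt"

cleaned=$(cat "$tmpdir/out.txt")
printf '%s\n' "$cleaned"

# Count lines that still contain an unexpected character.
# grep -c prints 0 but exits non-zero when nothing matches, hence the "|| true".
remaining=$(grep -c '[^ATCG]' "$tmpdir/out.txt" || true)

# Number of bytes that tr removed.
removed=$(( $(wc -c < "$tmpdir/in.txt") - $(wc -c < "$tmpdir/out.txt") ))
printf 'lines still bad: %s, bytes removed: %s\n' "$remaining" "$removed"

rm -rf "$tmpdir"
```

Note that lowercase `g` is deleted along with everything else outside the set, so if the data may contain lowercase bases that should be kept, the set given to `tr` would need to include them.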

If it's a systematic error that has added something to the file, then this could possibly be corrected by sed or awk, but we don't yet know what your data looks like.


If you have the file open in vi or vim, then the command

/[^ATCG]

will find the next character in the editing buffer that is not an A, T, C or G.

And :%s/[^ATCG]//g will remove them all.