Separate fields with a multi-character delimiter using awk

I think the problem you are facing is related to the following statement in the (GNU) awk manpage [1]:

If FS is a single character, fields are separated by that character. If FS is the null string, then each individual character becomes a separate field. Otherwise, FS is expected to be a full regular expression.

Since your field-delimiting pattern contains characters that have a special meaning in regular expressions (the | and the ^), you need to escape them. Because awk parses the value of FS twice (first as a string literal, where escape sequences are resolved, and then as a regular expression), each escaping backslash has to be doubled, as in

awk -F '\\|~\\^' '{print $2}' input.txt

Resulting output for your example:

20200425
abc
abc
abc
abc
abc
abc
20200425
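
To make the "parsed twice" point more concrete, here is a minimal sketch that compares the string form of the separator with a regex literal passed to split(). The sample line and the variable names (line, via_fs, via_split) are made up for illustration, since the original input file is not reproduced here.

```shell
# Made-up sample line; the real input.txt is not shown in this answer.
line='T|~^20200425|~^abc'

# String form: escape sequences are resolved first, then the result is
# used as a regex, so each backslash has to be doubled.
via_fs=$(printf '%s\n' "$line" | awk -F '\\|~\\^' '{print $2}')

# Regex-literal form: /.../ is only parsed as a regex, so single
# backslashes are enough.
via_split=$(printf '%s\n' "$line" | awk '{split($0, f, /\|~\^/); print f[2]}')

echo "$via_fs"
echo "$via_split"
```

Both commands should extract the same second field. The doubling is only needed where the pattern passes through string parsing, which is the case for -F and for assignments to FS.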

To consider only those lines starting with T, use

awk -F '\\|~\\^' '/^T/ {print $2}' input.txt

or alternatively, by selecting only lines where a certain field (here, the first field) has a value of T:

awk -F '\\|~\\^' '$1=="T" {print $2}' input.txt

Result for your example in both cases:

20200425
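
The two filters agree on your example, but they are not equivalent in general: /^T/ matches every line that begins with T, while $1=="T" requires the first field to be exactly T. A small sketch with invented data (the TX line and the variable names are assumptions for illustration only):

```shell
# Invented sample data; note the third line, whose first field merely
# starts with T.
sample='T|~^20200425|~^abc
D|~^abc|~^xyz
TX|~^other|~^abc'

# Regex match on the whole line: selects both the T and the TX line.
by_regex=$(printf '%s\n' "$sample" | awk -F '\\|~\\^' '/^T/ {print $2}')

# Exact comparison on the first field: selects only the T line.
by_field=$(printf '%s\n' "$sample" | awk -F '\\|~\\^' '$1=="T" {print $2}')

echo "$by_regex"
echo "$by_field"
```

If other record types in your data can start with the letter T, the field comparison is probably the safer choice.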

Note that, in general, combining awk, grep and sed in one pipeline is rarely necessary, since awk can do the pattern matching and text substitution itself. Furthermore, all these tools can read files directly, so using cat to feed them the text to process is also unnecessary.
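
As a rough illustration of that point, the following sketch compares a hypothetical cat/grep/awk pipeline (not necessarily the one from your question) with a single awk call that reads the file itself. The temporary file and its contents are made up for the demonstration.

```shell
# Create a throwaway input file with made-up contents.
tmp=$(mktemp)
printf '%s\n' 'T|~^20200425|~^abc' 'D|~^abc|~^xyz' > "$tmp"

# Three processes: cat feeds grep, grep feeds awk.
piped=$(cat "$tmp" | grep '^T' | awk -F '\\|~\\^' '{print $2}')

# One process: awk opens the file and does the filtering itself.
direct=$(awk -F '\\|~\\^' '/^T/ {print $2}' "$tmp")

echo "$piped"
echo "$direct"

rm -f "$tmp"
```

Both variants should print the same value; the second one simply avoids the two extra processes.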

[1]: As an (unrelated) side note: The part with the "null string" does not work on all Awk variants. The GNU Awk manual states "This is a common extension; it is not specified by the POSIX standard".
