Awk seems to be confused about what $1 is

Those files contain the output from curl downloading files. During a download, curl updates its progress information by outputting a carriage return (commonly represented as \r, the escape used to produce it in a number of contexts), which moves the cursor back to the start of the line, and then writing the new progress values over the old ones.

When you run grep 100 *.dl.tst, each line that’s output starts with the file name, but that’s followed by multiple updates which return the cursor to the start of the line, so you don’t see the file name — it’s overwritten by subsequent output. In more detail, the output looks like

shpr002.20201124_141036.dl.tst:

followed by a carriage return, followed by the first progress output from curl,

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0

followed by a carriage return, etc., until the percentage reaches 100. Because all this is only separated by carriage returns, not line feeds, it counts as a single line, and grep matches that in its entirety.
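To see this for yourself, the following sketch builds a made-up line in the same shape (the name `demo.dl.tst` and the shortened progress values are only illustrative, not taken from the real files) and makes the carriage returns visible:

```shell
# Hypothetical sample: a file name and colon, then two curl-style progress
# updates, separated only by carriage returns and ended by one line feed.
sample() {
    printf 'demo.dl.tst:\r  0     0\r100  720k\n'
}

# Carriage returns do not end a line, so this counts a single line.
sample | wc -l

# od -c shows each carriage return as \r instead of letting the
# terminal act on it.
sample | od -c
```

Running `sample` on its own in a terminal should show only the last update, since each carriage return lets the following text overwrite what came before.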

The same effect explains the output of grep 100 *.dl.tst|awk '{print$0}'.

When you ask AWK to output $1, it outputs the first field, and now you can see it: it contains the file name, a colon, a carriage return, and that’s it. Curl’s output then begins with a space (to leave room for the percentage count), which is a field separator. When you ask for $2, AWK outputs the second field, which is the first percentage count, 0:

shpr002.20201124_141036.dl.tst:\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0

<--          Field 1          -->  !     !    !     !  ...
                                   $2    $3   $4    $5 ...
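The field layout above can be checked with a small sketch. The sample line and the helper name `strip_cr_field1` are hypothetical, and the length check assumes an awk (such as gawk or mawk) that splits fields on spaces, tabs and newlines only, so that the carriage return stays inside $1:

```shell
# Hypothetical sample line: file name, colon, carriage return, then the
# start of a curl progress update.
sample_line() {
    printf 'demo.dl.tst:\r  0     0    0\n'
}

# Length of $1. "demo.dl.tst:" is 12 characters; if the carriage return
# is part of the field, as described above, the result is one more.
sample_line | awk '{ print length($1) }'

# Hypothetical helper: drop a trailing carriage return from the first
# field before printing it, so the terminal shows the file name.
strip_cr_field1() {
    awk '{ sub(/\r$/, "", $1); print $1 }'
}

sample_line | strip_cr_field1
```

With the carriage return removed, nothing moves the cursor back, so the file name and colon stay visible.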

Working from Stephen's description of the issue, a simple way to make the output easier to process would be to translate all the carriage returns into newlines. That leaves you with curl's progress report as a series of individual lines, which you can then use awk on:

$ for f in *.dl; do < "$f" tr '\r' '\n' | awk '$1 == "100" {print $0}' ; done
100  720k  100  720k    0     0  22.5M      0 --:--:-- --:--:-- --:--:-- 22.7M
100 23.6M  100 23.6M    0     0   372M      0 --:--:-- --:--:-- --:--:--  369M

(Though if curl rounds the percentage it prints to the nearest integer instead of rounding down, huge files might show multiple lines with 100 in the first column.)

On the other hand, if it's known that the files contain nothing but the output from curl, then we might as well just pick the last line instead of looking at the contents:

$ for f in *.dl; do < "$f" tr '\r' '\n' | tail -n1  ; done
100  720k  100  720k    0     0  22.5M      0 --:--:-- --:--:-- --:--:-- 22.7M
100 23.6M  100 23.6M    0     0   372M      0 --:--:-- --:--:-- --:--:--  369M
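If you also want the file name next to each final progress line, one possible sketch wraps the same pipeline in a small function. The name `last_progress` is hypothetical, and this assumes, as above, that each file contains nothing but curl's output and ends with the final progress update:

```shell
# Hypothetical helper: for each file given, print its name followed by
# the last progress update that curl wrote to it.
last_progress() {
    for f in "$@"; do
        # Turn carriage returns into newlines, then keep the last line.
        printf '%s: %s\n' "$f" "$(tr '\r' '\n' < "$f" | tail -n 1)"
    done
}
```

It could be called as `last_progress *.dl`. Because the carriage returns are gone by the time the line is printed, the file name is no longer overwritten on the terminal.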
