How do you remove the dot character from a string without calling sed or awk again?

The sed command, the awk command, and the removal of the trailing period can all be combined into a single awk command:

while read -r host; do dig +search "$host" ALL; done <hostlist.txt | awk 'f{sub(/\.$/,"",$1); print $1", "$NF; f=0} /ANSWER SECTION/{f=1}'

Or, as spread out over multiple lines:

while read -r host
do
    dig +search "$host" ALL
done <hostlist.txt | awk 'f{sub(/\.$/,"",$1); print $1", "$NF; f=0} /ANSWER SECTION/{f=1}'

Because the awk command follows the done statement, only one awk process is invoked for the whole loop. Efficiency may not matter here, but this is cheaper than starting a new sed or awk process on every iteration.

Example

With this test file:

$ cat hostlist.txt 
www.google.com
fd-fp3.wg1.b.yahoo.com

The command produces:

$ while read -r host; do dig +search "$host" ALL; done <hostlist.txt | awk 'f{sub(/\.$/,"",$1); print $1", "$NF; f=0} /ANSWER SECTION/{f=1}'
www.google.com, 216.58.193.196
fd-fp3.wg1.b.yahoo.com, 206.190.36.45

How it works

awk implicitly reads its input one record (line) at a time. This awk script uses a single variable, f, which signals whether the previous line was an answer section header or not.

f{sub(/\.$/,"",$1); print $1", "$NF; f=0}

If the previous line was an answer section header, then f is true and the commands in curly braces are executed. The first removes the trailing period from the first field. The second prints the first field, followed by a comma and a space, followed by the last field. The third resets f to zero (false).

In other words, f here functions as a logical condition. The commands in curly braces are executed if f is nonzero (which, in awk, means 'true').

/ANSWER SECTION/{f=1}

If the current line contains the string ANSWER SECTION, then the variable f is set to 1 (true).

Here, /ANSWER SECTION/ serves as a logical condition. It evaluates to true if the current line matches the regular expression ANSWER SECTION. If it does, then the command in curly braces is executed.
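To watch the flag logic run without querying a DNS server, here is a minimal sketch that feeds a canned, heavily abbreviated stand-in for dig output to the awk program. The function names sample_dig_output and extract_answers, the TTL value, and the shortened layout are assumptions made for illustration; real dig output contains more sections. The dot in the regular expression is written as \. so that only a literal period is removed.

```shell
#!/bin/sh
# Hypothetical, abbreviated stand-in for dig output (real output is longer).
sample_dig_output() {
    printf '%s\n' \
        ';; QUESTION SECTION:' \
        ';www.google.com. IN A' \
        '' \
        ';; ANSWER SECTION:' \
        'www.google.com. 300 IN A 216.58.193.196' \
        '' \
        ';; Query time: 12 msec'
}

# f is set when the header line is seen, so the f{...} block
# fires on the line that follows it and then clears the flag.
extract_answers() {
    awk 'f{sub(/\.$/,"",$1); print $1", "$NF; f=0} /ANSWER SECTION/{f=1}'
}

sample_dig_output | extract_answers
```

Only the line immediately after the header is printed, because f is reset to 0 as soon as that line has been handled.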

dig can read in a file containing a list of hostnames and process them one by one. You can also tell dig to suppress all output except the answer section.

This should give you the output you want:

dig -f hostlist.txt +noall +answer +search | 
    awk '{sub(/\.$/,"",$1); print $1","$5}'

awk's sub() function is used to strip the literal period . from the end of the first field. Then awk prints fields 1 and 5 separated by a comma.
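The backslash in /\.$/ matters: an unescaped . matches any character, so /.$/ would delete the last character of the field whatever it is. The sketch below runs the same awk program on two canned lines shaped like answer-section output. The function names, the TTL values, and the second line (a name deliberately written without a trailing period, using a documentation-range address) are assumptions for illustration, not captured dig output.

```shell
#!/bin/sh
# Hypothetical input: name, TTL, class, type, address.
sample_answer_lines() {
    printf '%s\n' \
        'www.google.com. 300 IN A 216.58.193.196' \
        'example.test 300 IN A 192.0.2.1'
}

# Strip a literal trailing period from field 1, then print fields 1 and 5.
strip_trailing_dot() {
    awk '{sub(/\.$/,"",$1); print $1","$5}'
}

sample_answer_lines | strip_trailing_dot
```

The second name passes through unchanged because it has no trailing period to strip.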

NOTE: entries in hostlist.txt that do not resolve are completely discarded - they do not appear on stdout OR stderr.

(Tested on Linux and FreeBSD)

Change your invocation of gawk to the following:

| gawk '{print substr($1,1,length($1)-1)","$NF}' >fqdn-ip.csv
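Note that substr($1,1,length($1)-1) drops the last character of the first field unconditionally, which is fine as long as every name ends in a period. A small sketch of that behaviour on a canned line follows; the function name drop_last_char and the TTL value are assumptions, and plain awk is used because nothing here depends on gawk extensions.

```shell
#!/bin/sh
# Print field 1 minus its final character, a comma, then the last field.
drop_last_char() {
    awk '{print substr($1,1,length($1)-1)","$NF}'
}

printf '%s\n' 'www.google.com. 300 IN A 216.58.193.196' | drop_last_char
```

If a name could ever arrive without the trailing period, this version would cut off a real character, so the sub() form with an escaped dot is the safer choice in that case.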
