Echo multiple variables on the same line in Bash

Consider:

while read filename count
do
    count_dt=$(date "+%Y-%m-%d %H:%M:%S")
    echo "db|Abhi_Ram|record_count|${filename}||${count}||${count_dt}"
done <sample.txt >>output.txt

This produces the file:

$ cat output.txt 
db|Abhi_Ram|record_count|2015-03-04.01.Abhi_Ram.json||10||2015-08-10 14:42:39
db|Abhi_Ram|record_count|2015-03-04.02.Abhi_Ram.json||70||2015-08-10 14:42:39

Notes:

  1. It is best practice to use lower or mixed case for your shell variables. The system uses upper case variables and you don't want to accidentally overwrite one.

  2. The many double-quotes in the echo statement were unnecessary. The whole of the output string can be inside one double-quoted string.

  3. If you want to read a file one line at a time, it is safer to use the while read ... done <inputfile construct. The read statement also allows us to easily define the filename and count variables.

  4. For command substitution, many prefer the form $(...) over the backtick form. This is because (a) the $(...) makes the beginning and end of the command substitution visually distinct, (b) the $(...) form nests well, and (c) not all fonts clearly show backticks as different from regular ticks. (Thanks Chepner.)

  5. For efficiency, the redirection to output.txt has been moved to the end of the loop. In this way, the file is only opened and closed once. (Thanks Charles Duffy.)

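  6. Unless you need count_dt updated for each individual entry, the assignment can be placed before the loop and set just once every time sample.txt is processed. If you have a recent version of bash (4.2 or later, which rules out the bash 3.2 that ships with Mac OS X), then the count_dt assignment can be replaced (thanks, Charles Duffy) with a native bash statement, with no shelling out required. The -1 argument means "the current time"; bash 4.3 and later assume it when the argument is omitted, so passing it explicitly is the safer choice:

    printf -v count_dt '%(%Y-%m-%d %H:%M:%S)T' -1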
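
As a rough sketch of notes 5 and 6 together, the timestamp can be computed once before the loop and the output file opened once after it. The scratch directory, the workdir variable, and the sample rows are assumptions made up for illustration. The fallback to date is likewise only an assumption about how one might cope with a shell that lacks the %(...)T format:

```shell
#!/usr/bin/env bash
# Hypothetical scratch area and input rows, modelled on the sample output above.
workdir=$(mktemp -d)
printf '%s\n' \
    '2015-03-04.01.Abhi_Ram.json 10' \
    '2015-03-04.02.Abhi_Ram.json 70' > "$workdir/sample.txt"

# Compute the timestamp once, before the loop.
# The explicit -1 means "current time"; if this shell's printf does not
# understand -v or %(...)T, fall back to the external date command.
if ! printf -v count_dt '%(%Y-%m-%d %H:%M:%S)T' -1 2>/dev/null; then
    count_dt=$(date "+%Y-%m-%d %H:%M:%S")
fi

# The redirections sit on "done", so each file is opened only once.
while read -r filename count
do
    echo "db|Abhi_Ram|record_count|${filename}||${count}||${count_dt}"
done < "$workdir/sample.txt" >> "$workdir/output.txt"

cat "$workdir/output.txt"
```

Every row then carries the same timestamp, which is usually what you want when the rows describe a single run. The -r option to read is an addition here; it stops read from treating backslashes in the input as escape characters.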
    

John1024 has explained how to do this correctly; I'd like to take a look at why the original version didn't work. The basic problem is that a for loop iterates over words, not over lines. The file has two words on each line (a filename and a count), so the loop body runs twice per line. To see this, try:

for line in `hadoop fs -cat sample.txt`
do
    echo "$line"
done

...and it'll print something like:

2015-03-04.01.Abhi_Ram.txt
10
2015-03-04.02.Abhi_Ram.txt
70

...which isn't what you want at all. It also has some other unpleasant quirks: for example, if the input file contained the word "*", the shell would expand it into a list of the filenames in the current directory.
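
The "*" quirk is easy to reproduce. The following is a small sketch using a made-up scratch directory (demo_dir), two made-up files, and a one-line input file containing "report *". None of these names come from the original question:

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory holding two files for * to match.
demo_dir=$(mktemp -d)
input=$(mktemp)              # kept outside demo_dir so it is not matched by *
cd "$demo_dir" || exit 1
touch a.txt b.txt
printf '%s\n' 'report *' > "$input"

# for + unquoted command substitution: split into words, then glob-expand.
for_out=""
for word in $(cat "$input")
do
    for_out="${for_out}[${word}]"
done
echo "for:   $for_out"

# while read: one iteration per line, no splitting (IFS=) and no globbing.
read_out=""
while IFS= read -r line
do
    read_out="${read_out}[${line}]"
done < "$input"
echo "while: $read_out"
```

The for loop prints [report][a.txt][b.txt], because the "*" has been replaced by the directory listing, while the while loop prints [report *], the line exactly as it was written.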

The while read ... done <file approach is the right way to iterate over lines in a shell script. It just happens to also be able to split each line into fields without having to mess with awk (in this case, read filename count does it).
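
To illustrate the field splitting, here is a minimal sketch that feeds two made-up lines to read through a here-document; the second line has extra words on purpose. The split_out variable is hypothetical and exists only to collect the results:

```shell
#!/usr/bin/env bash
# read assigns the first word to filename and everything that is left
# on the line (including any extra words) to the last variable, count.
split_out=""
while read -r filename count
do
    split_out="${split_out}${filename}=${count};"
done <<'EOF'
2015-03-04.01.Abhi_Ram.json 10
2015-03-04.02.Abhi_Ram.json 70 extra words
EOF
echo "$split_out"
```

This prints 2015-03-04.01.Abhi_Ram.json=10;2015-03-04.02.Abhi_Ram.json=70 extra words; so a line with more fields than variables does not produce extra loop iterations. The surplus simply ends up in the last variable, where it can be checked or discarded.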
