system() yields inconsistent results

Using shell

Try:

$ while read -r line; do date +%s -d "${line%%,*}"; done < input.csv
1597725964
1597726023
1597726083
1597726144

How it works

  1. while read -r line; do starts a while loop and reads a line from stdin.

  2. "${line%%,*}" strips the first comma and everything after it from the line, leaving only the date field.

  3. date +%s -d "${line%%,*}" prints the date as epoch.

  4. done completes the while loop.

  5. <input.csv provides the stdin to the loop.
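
As a small sketch of step 2, the snippet below contrasts %% (remove the longest matching suffix) with a single % (remove the shortest). The sample record is assumed to have the same format as input.csv, and the variable names are only for illustration.

```shell
# A sample record in the format assumed for input.csv.
line='08/17/2020 21:46:04 -700 , 1 , 2 , 3'

# %% removes the longest suffix matching ",*": everything from the FIRST comma on.
first_field=${line%%,*}

# A single % removes the shortest such suffix: only from the LAST comma on.
all_but_last=${line%,*}

# Brackets make the trailing space visible.
printf '[%s]\n' "$first_field" "$all_but_last"
```

Note that the space before the comma is kept in both results; GNU date does not mind the trailing whitespace.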

Variation

This prints the full line and adds the epoch as the final column:

$ while read -r line; do printf "%s, %s\n" "$line" "$(date +%s -d "${line%%,*}")"; done < input.csv
08/17/2020 21:46:04 -700 , 1 , 2 , 3, 1597725964
08/17/2020 21:47:03 -700 , 1 , 2 , 3, 1597726023
08/17/2020 21:48:03 -700 , 1 , 2, 1597726083
08/17/2020 21:49:04 -700 , 1 , 2, 1597726144

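In awk you can pipe the command's output into getline instead of using system():

< input.csv awk -F' , ' '{
    cmd = "date +%s -d \047" $1 "\047"
    if ((cmd | getline date) > 0)
        print date
    close(cmd)  # each line builds a new command, so close it to free the pipe
}'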
1597725964
1597726023
1597726083
1597726144

With the help of Inian and oguz ismail in the comments, we came up with a better solution using gawk: it writes the date strings to date's stdin instead of passing them on the command line. That's better because interpolating variables into a command line always carries the risk of shell command injection (via input.csv).

< input.csv gawk -F' , ' '{
    cmd = "date +%s -f-";
    print $1 |& cmd;
    close(cmd, "to");
    if ((cmd |& getline line) > 0)
        print line;
    close(cmd)
}'
1597725964
1597726023
1597726083
1597726144
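
To illustrate the injection risk mentioned above, here is a minimal sketch. It uses echo and cat as harmless stand-ins for date, and the hostile field is a made-up example, not something from the actual input.csv. It also uses a plain one-way pipe (print | cmd) rather than the gawk coprocess, which is enough when awk does not need to read the result back.

```shell
# A hostile "date" field, as it might appear in a hypothetical input.csv.
# The single quote closes the quoting that the awk script adds around $0.
evil="x'; echo INJECTED; echo '"

# Unsafe: the field is pasted into the command line, so the shell
# executes the embedded "echo INJECTED".
unsafe=$(printf '%s\n' "$evil" | awk '{
    cmd = "echo \047" $0 "\047"
    while ((cmd | getline out) > 0) print out
    close(cmd)
}')

# Safer: the field travels over the command stdin and is never parsed by a shell.
safe=$(printf '%s\n' "$evil" | awk '{ print $0 | "cat" } END { close("cat") }')

printf 'unsafe: %s\nsafe:   %s\n' "$unsafe" "$safe"
```

In the unsafe case the output contains a separate INJECTED line produced by the smuggled command; in the safe case the field comes back verbatim.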

Thanks to both!


The call to system(...) returns the command's exit status, which is zero here, so tmp is assigned $(0), i.e. the whole input line. Observe:

$ echo a b c d | awk '{ x = $(system("exit 3")); print x }'
c

You can't capture a shell command's output using the system function in awk; hek2mgl's answer demonstrates how to do it correctly.
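
For a quick side-by-side, the following sketch (using echo hello as a placeholder command) shows what each mechanism actually hands back to the awk program: system() gives the exit status, while cmd | getline var gives a line of output.

```shell
# system() runs the command and returns its exit status; the command output
# goes straight to stdout (discarded here), never into rc.
# cmd | getline var reads one line of the command output into var.
result=$(awk 'BEGIN {
    rc = system("echo hello >/dev/null")
    cmd = "echo hello"
    cmd | getline out
    close(cmd)
    print "rc=" rc, "out=" out
}')
echo "$result"
```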

Then in the printf(...) call, $tmp is expanded to $8, because tmp holds $0 and the longest prefix of it that constitutes a valid number is 08; hence the commas in the output. This can be demonstrated like so:

$ echo foo bar | awk '{ x = "0002junk"; print $x }'
bar
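
The same conversion can be sketched with a date-like string. The field names f1 to f9 are placeholders, chosen only so that the selected field is easy to spot.

```shell
# Adding 0 forces the string-to-number conversion that $x performs implicitly.
as_number=$(awk 'BEGIN { x = "08/17/2020 21:46:04 -700"; print x + 0 }')

# With eight or more fields on the line, $x therefore selects field 8.
picked=$(echo 'f1 f2 f3 f4 f5 f6 f7 f8 f9' | awk '{ x = "08/17/2020"; print $x }')

echo "$as_number $picked"
```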

Anyway, to achieve the task described in the question, you don't really need awk. A combination of cut and GNU date yields the desired output.

$ cut -d, -f1 input.csv | date -f- +%s
1597725964
1597726023
1597726083
1597726144

And using paste, you can append these timestamps to corresponding records if you don't mind missing spaces around commas.

$ cut -d, -f1 input.csv | date -f- +%s | paste -d, input.csv -
08/17/2020 21:46:04 -700 , 1 , 2 , 3,1597725964
08/17/2020 21:47:03 -700 , 1 , 2 , 3,1597726023
08/17/2020 21:48:03 -700 , 1 , 2,1597726083
08/17/2020 21:49:04 -700 , 1 , 2,1597726144
