Number formatting and rounding issue with awk

... | awk '{ sum+=$1} END { print sum/NR}'

By default, (GNU) awk prints numbers with up to 6 significant digits (plus the exponent part). This comes from the default value of the OFMT variable, "%.6g". The docs don't spell it out, but OFMT only applies to non-integer values; integer-valued numbers are printed as integers.

You could change OFMT to affect all print statements, but it's better to just use printf here, so that the format also applies if the average happens to be an integer. Something like %.3f prints the number with three digits after the decimal point.

...| awk '{ sum+=$1} END { printf "%.3f\n", sum/NR }'
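
To make the difference concrete, here is a small sketch comparing plain print, print with a changed OFMT, and printf. The sample numbers and the shell variable names are made up for illustration; they are not from the question.

```shell
# Three values whose average (1000001.333...) is not an integer.
data='1000000
1000001
1000003'

# Default print: OFMT is "%.6g", so only 6 significant digits survive.
default_out=$(printf '%s\n' "$data" | awk '{ sum += $1 } END { print sum/NR }')

# Explicit printf: three digits after the decimal point.
printf_out=$(printf '%s\n' "$data" | awk '{ sum += $1 } END { printf "%.3f\n", sum/NR }')

# Changing OFMT affects print, but only for non-integer values.
ofmt_out=$(printf '%s\n' "$data" | awk -v OFMT='%.3f' '{ sum += $1 } END { print sum/NR }')

# Integer-valued average (1000001): print ignores OFMT, printf does not.
int_print=$(printf '1000000\n1000002\n' | awk -v OFMT='%.3f' '{ sum += $1 } END { print sum/NR }')
int_printf=$(printf '1000000\n1000002\n' | awk '{ sum += $1 } END { printf "%.3f\n", sum/NR }')

printf '%s\n' "$default_out" "$printf_out" "$ofmt_out" "$int_print" "$int_printf"
```

With these inputs I'd expect 1e+06, 1000001.333, 1000001.333, 1000001 and 1000001.000, in that order.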

See the docs for the meaning of the f and g control letters and of the precision modifier (.prec in the second link):

  • https://www.gnu.org/software/gawk/manual/html_node/Control-Letters.html
  • https://www.gnu.org/software/gawk/manual/html_node/Format-Modifiers.html

awk 'NR == 1 { max=$1; min=$1; sum=0 } ...'

This doesn't initialize NR. Instead, it checks whether NR is equal to one, i.e. whether we're on the first line. (== is comparison, = is assignment.) If so, it initializes max and min to the first value and sum to zero. Without that, max and min would start out as zero (an unset awk variable is treated as 0 in numeric comparisons), so you could never get a negative maximum or a positive minimum.
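
A quick way to see why the initialization matters is to feed awk only negative numbers. This is just a sketch with made-up input; the max + 0 in the second command is there only to show the unset variable's numeric value.

```shell
# All-negative input: the true maximum is -1.
with_init=$(printf '%s\n' -5 -1 -3 |
  awk 'NR == 1 { max = $1 } $1 > max { max = $1 } END { print max }')

# No initialization: max is unset, compares as 0, and is never replaced.
without_init=$(printf '%s\n' -5 -1 -3 |
  awk '$1 > max { max = $1 } END { print max + 0 }')

printf 'with init: %s\nwithout init: %s\n' "$with_init" "$without_init"
```

The first command reports -1, the second reports 0, which isn't even in the input.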


If you're using GNU awk, try this. It adds the thousands separators (commas here) by making use of the ' modifier.

$ awk '{sum+=$1}END{printf "%'\''.2f\n",sum/NR}' filename
1,316,375.05
$

If you've got jq, try this.

$ jq -s min,max,add/length filename
1153022
1439480
1316375.05
$
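
For comparison, roughly the same min, max and average can be computed in a single awk pass. This is a minimal sketch, not the asker's original script: the three input values and the variable names (v, stats) are assumptions, and it assumes integer input with at least one line (on empty input NR is 0 and the division fails).

```shell
stats=$(printf '%s\n' 1153022 1439480 1316375 |
  awk 'NR == 1 { min = max = $1 + 0 }      # seed min/max from the first line
       {
         v = $1 + 0                        # force a numeric value
         sum += v
         if (v < min) min = v
         if (v > max) max = v
       }
       END { printf "%d\n%d\n%.2f\n", min, max, sum/NR }')

printf '%s\n' "$stats"
```

For these three values it should print 1153022, 1439480 and 1302959.00 on separate lines.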

From gnu.org : gawk Format Modifiers

A single quote or apostrophe character is a POSIX extension to ISO C. It indicates that the integer part of a floating-point value, or the entire part of an integer decimal value, should have a thousands-separator character in it. This only works in locales that support such characters. For example: