Handy parsing for numbers with unit suffixes?

Solution 1:

Based on my answer at one of the questions you linked to:

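awk '{
    # ex is 1 for K, 2 for M, and so on; 0 means the field has no suffix
    ex = index("KMGTPEZY", substr($1, length($1)))
    # awk strings are 1-indexed, so the number starts at position 1
    val = ex ? substr($1, 1, length($1) - 1) : $1

    prod = val * 10^(ex * 3)

    sum += prod
}
END {print sum}'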

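Another common approach is to rewrite the suffixes into an arithmetic expression and let bc evaluate it (the one-line $a0 form needs GNU sed):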

sed 's/G/ * 1000 M/;s/M/ * 1000 K/;s/K/ * 1000/; s/$/ +\\/; $a0' | bc
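
Both snippets above use decimal multipliers (powers of 1000). If your sizes come from a tool that reports powers of 1024 (GNU du -h and ls -h do), the same awk idea can take the base as a parameter. This is only a minimal sketch: sum_sizes and its base argument are names made up for this example, and it assumes one size per line in the first field.

```shell
# Hypothetical helper: sum_sizes [base]; base defaults to 1024
sum_sizes() {
    awk -v base="${1:-1024}" '{
        # Last character of the field, upper-cased so "k" and "K" both match
        suffix = toupper(substr($1, length($1)))
        # 1 for K, 2 for M, and so on; 0 when there is no known suffix
        ex = index("KMGTPEZY", suffix)
        # Drop the suffix only when one was found
        val = ex ? substr($1, 1, length($1) - 1) : $1
        sum += val * base^ex
    }
    END { printf "%d\n", sum }'
}

printf '1K\n2M\n512\n' | sum_sizes        # powers of 1024: prints 2098688
printf '1K\n2M\n512\n' | sum_sizes 1000   # powers of 1000: prints 2001512
```

The final printf uses %d, so any fractional byte left over from inputs such as 2.7M is truncated.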

Solution 2:

You can use Perl regular expressions to do this. For example:

$value = 0;
# Capture the number and an optional K/M/G suffix
if ($line =~ /(\d+(?:\.\d+)?)([KMG]?)/) {
   $amplifier = 1;                               # no suffix: plain bytes
   $amplifier = 1024 if ($2 eq 'K');
   $amplifier = 1024 * 1024 if ($2 eq 'M');
   $amplifier = 1024 * 1024 * 1024 if ($2 eq 'G');
   $value = $1 * $amplifier;
}

This is a simple script; treat it as a starting point. Hope it helps!


Solution 3:

Personally, I'd just not use the -h flag in the first place. The "human readable" output is already rounded, so converting it back to bytes and rounding again only makes the result less accurate. (For instance, 2.7MiB is 2831155.2 bytes. What happened to the other 0.8 of a byte?!)
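
To put numbers on that, here is a small sketch of the arithmetic. mib_to_bytes is a hypothetical helper name, and it assumes the suffix means MiB (1024^2 bytes).

```shell
# Hypothetical helper: convert a MiB figure to bytes, keeping one decimal place
mib_to_bytes() {
    awk -v mib="$1" 'BEGIN { printf "%.1f\n", mib * 1024 ^ 2 }'
}

mib_to_bytes 2.7    # prints 2831155.2
mib_to_bytes 0.1    # prints 104857.6
```

The second figure is the width of one displayed step: with a single decimal place, byte counts that differ by up to about 100 KiB can share the same human-readable value, so a sum of such values can drift noticeably from the true total.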

Otherwise, you can ask units to convert KiB/MiB/GiB to plain "B" and it will handle this, but you'd have to do something like the following (assuming your output is tab-separated; otherwise adjust the cut accordingly):

{your output} | cut -f1 '-d{tab}' | xargs -I {} units -1t {}iB B | awk '{s+=$1}END{printf "%d\n",s}'

Solution 4:

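VALUE=$1

# Rewrite each suffix as a multiplication. Every step feeds the next one
# (g -> m -> k), so a single pass is enough and no loop is needed.
VALUE=${VALUE//[gG]/*1024m}
VALUE=${VALUE//[mM]/*1024k}
VALUE=${VALUE//[kK]/*1024}

# With the asterisks removed, a valid size leaves only digits
[ "${VALUE//\*/}" -gt 0 ] 2>/dev/null && echo "VALUE=$((VALUE))" || echo "ERROR: size invalid, please enter a correct size"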
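
For reuse, the same substitutions could be wrapped in a function. This is only a sketch and needs bash, since it relies on the ${var//pattern/replacement} expansion. The name to_bytes and the case-based validation are assumptions for this example, not part of the script above; the check is slightly different from the -gt 0 test in that it also rejects input that does not start with a digit.

```shell
# Hypothetical wrapper around the suffix substitutions (bash only)
to_bytes() {
    local value=$1
    value=${value//[gG]/*1024m}
    value=${value//[mM]/*1024k}
    value=${value//[kK]/*1024}
    # Accept only digits and asterisks, starting with a digit
    case $value in
        ''|[!0-9]*|*[!0-9*]*)
            echo "ERROR: size invalid" >&2
            return 1
            ;;
    esac
    echo $((value))
}

to_bytes 2G      # prints 2147483648
to_bytes 512k    # prints 524288
to_bytes 1.5M || echo "fractions are not supported"
```

Because the result comes from shell integer arithmetic, fractional sizes such as 1.5M are rejected rather than rounded.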