How do I trim leading and trailing whitespace from each line of some output?

awk '{$1=$1;print}'

or shorter:

awk '{$1=$1};1'

Would trim leading and trailing space or tab characters1 and also squeeze sequences of tabs and spaces into a single space.

That works because when you assign something to one of the fields, awk rebuilds the whole record (as printed by print) by joining all fields ($1, ..., $NF) with OFS (space by default).

1(and possibly other blank characters depending on the locale and the awk implementation)


The command can be condensed like so if you're using GNU sed:

$ sed 's/^[ \t]*//;s/[ \t]*$//' < file

Example

Here's the above command in action.

$ echo -e " \t   blahblah  \t  " | sed 's/^[ \t]*//;s/[ \t]*$//'
blahblah

You can use hexdump to confirm that the sed command is stripping the desired characters correctly.

$ echo -e " \t   blahblah  \t  " | sed 's/^[ \t]*//;s/[ \t]*$//' | hexdump -C
00000000  62 6c 61 68 62 6c 61 68  0a                       |blahblah.|
00000009

Character classes

You can also use character class names instead of literally listing the sets like this, [ \t]:

$ sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//' < file

Example

$ echo -e " \t   blahblah  \t  " | sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'

Most of the GNU tools that make use of regular expressions (regex) support these classes (here with their equivalent in the typical C locale of an ASCII-based system (and there only)).

 [[:alnum:]]  - [A-Za-z0-9]     Alphanumeric characters
 [[:alpha:]]  - [A-Za-z]        Alphabetic characters
 [[:blank:]]  - [ \t]           Space or tab characters only
 [[:cntrl:]]  - [\x00-\x1F\x7F] Control characters
 [[:digit:]]  - [0-9]           Numeric characters
 [[:graph:]]  - [!-~]           Printable and visible characters
 [[:lower:]]  - [a-z]           Lower-case alphabetic characters
 [[:print:]]  - [ -~]           Printable (non-Control) characters
 [[:punct:]]  - [!-/:-@[-`{-~]  Punctuation characters
 [[:space:]]  - [ \t\v\f\n\r]   All whitespace chars
 [[:upper:]]  - [A-Z]           Upper-case alphabetic characters
 [[:xdigit:]] - [0-9a-fA-F]     Hexadecimal digit characters

Using these instead of literal sets always seems like a waste of space, but if you're concerned with your code being portable, or having to deal with alternative character sets (think international), then you'll likely want to use the class names instead.

References

  • Section 3 of the sed FAQ

xargs without arguments do that.

Example:

trimmed_string=$(echo "no_trimmed_string" | xargs)