Is it a sane approach to "back up" the $IFS variable?

In general, it is a good practice to return conditions to default.

However, in this case, not so much.

Why?

  • Every time a script starts (in bash), IFS is set to $' \t\n'.
  • Just executing unset IFS makes it act as if it was set to default.
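As a quick check of the second point, the following sketch counts the fields produced by an unquoted expansion, first with IFS explicitly set to the default value and then with IFS unset. The helper count_fields and the sample string are made up for illustration.

```shell
# count_fields is a hypothetical helper: it prints how many fields the
# shell produces when splitting its first argument with the current IFS.
count_fields() {
    # Unquoted on purpose so that word splitting takes place.
    set -- $1
    echo "$#"
}

# Sample text containing a space, a tab, and a newline.
sample=$(printf 'one two\tthree\nfour')

# Explicit default: space, tab, newline (built portably with printf;
# the trailing x protects the newline from being stripped).
IFS=$(printf ' \t\nx'); IFS=${IFS%x}
fields_default=$(count_fields "$sample")

# Unset: splitting behaves as if the default were in place.
unset IFS
fields_unset=$(count_fields "$sample")

echo "default=$fields_default unset=$fields_unset"
```

Both counts should come out as 4, since the sample contains four words separated by default IFS characters.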

Also, storing the value of IFS has a pitfall.
If the original IFS was unset, the code IFS="$OldIFS" will set IFS to "", not unset it.
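The pitfall can be reproduced in a few lines. This sketch starts from an unset IFS, performs the naive save and restore, and then inspects the result; the variable names are illustrative.

```shell
unset IFS               # starting state: IFS is unset (default splitting)

OldIFS="$IFS"           # naive save: an unset IFS is stored as ""
IFS=:                   # temporary change
IFS="$OldIFS"           # naive restore: IFS is now set to the empty string

# ${IFS+x} expands to "x" only if IFS is set, even when it is empty.
if [ -n "${IFS+x}" ]; then
    state="set but empty"
else
    state="unset"
fi

# With an empty IFS no word splitting happens at all.
words="a b c"
set -- $words
count=$#

unset IFS               # leave the shell in the default state again
echo "IFS is $state; fields: $count"
```

The unquoted $words stays a single field here, whereas it would have been split into three before the save/restore.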

To actually keep the value of IFS (even if unset), use this:

${IFS+"false"} && unset oldifs || oldifs="$IFS"    # correctly store IFS.

IFS="error"                 ### change and use IFS as needed.

${oldifs+"false"} && unset IFS || IFS="$oldifs"    # restore IFS.
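The ${var+word} expansion used above yields word only when var is set. When IFS is set, the line therefore runs false and falls through to the || branch; when IFS is unset, it runs an empty command, which succeeds, and the && branch is taken. The same logic written with explicit if statements may be easier to read. The function names save_ifs and restore_ifs are hypothetical.

```shell
# save_ifs / restore_ifs are hypothetical helper names.
save_ifs() {
    if [ -n "${IFS+x}" ]; then
        oldifs=$IFS          # IFS is set (possibly to ""): remember the value
    else
        unset oldifs         # IFS is unset: record that by unsetting oldifs
    fi
}

restore_ifs() {
    if [ -n "${oldifs+x}" ]; then
        IFS=$oldifs
    else
        unset IFS
    fi
}

# Case 1: IFS starts out unset and must end up unset.
unset IFS
save_ifs; IFS=:; restore_ifs
after_unset=${IFS+set}       # empty string means "still unset"

# Case 2: IFS starts out as "," and must end up as ",".
IFS=,
save_ifs; IFS=:; restore_ifs
after_comma=$IFS

unset IFS
echo "case 1: '${after_unset:-unset}'  case 2: '$after_comma'"
```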

You can save and assign to IFS as needed. There is nothing wrong with doing so. It's not uncommon to save its value for restoration subsequent to a temporary, expeditious modification, like your array assignment example.

As @llua mentions in his comment to your question, simply unsetting IFS will restore the default behavior, equivalent to assigning a space-tab-newline.

It's worth considering how it can be more problematic to not explicitly set/unset IFS than it is to do so.

From the POSIX 2013 edition, 2.5.3 Shell Variables:

Implementations may ignore the value of IFS in the environment, or the absence of IFS from the environment, at the time the shell is invoked, in which case the shell shall set IFS to <space> <tab> <newline> when it is invoked.

A POSIX-compliant, invoked shell may or may not inherit IFS from its environment. From this follows:

  • A portable script cannot dependably inherit IFS via the environment.
  • A script that intends to use only the default splitting behavior (or joining, in the case of "$*"), but which may run under a shell which initializes IFS from the environment, must explicitly set/unset IFS to defend itself against environmental intrusion.
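A minimal defensive preamble along these lines might look like the following sketch. The first assignment only simulates a shell that picked up IFS=: from its environment; it would not appear in a real script.

```shell
# Simulated pollution: pretend the shell initialized IFS from the environment.
IFS=:

# Defensive preamble: from here on, default splitting and joining apply.
unset IFS

# "$*" joins the positional parameters with the first character of IFS,
# or with a space when IFS is unset.
set -- alpha beta gamma
joined="$*"

# Unquoted expansions split on space, tab, and newline again, not on ':'.
list="x y:z"
set -- $list
fields=$#

echo "$joined ($fields fields)"
```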

N.B. It is important to understand that for this discussion the word "invoked" has a particular meaning. A shell is invoked only when it is explicitly called using its name (including a #!/path/to/shell shebang). A subshell -- such as might be created by $(...) or cmd1 || cmd2 & -- is not an invoked shell, and its IFS (along with most of its execution environment) is identical to its parent's. An invoked shell sets the value of $$ to its PID, while subshells inherit it.
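The distinction can be observed directly. In this sketch, sh stands for whatever shell is installed under that name, and the variable names are illustrative.

```shell
parent_pid=$$

# Command substitution creates a subshell: $$ still reports the parent's PID.
subshell_pid=$(echo "$$")

# An invoked shell is a new shell instance with its own $$.
# Single quotes keep the parent from expanding $$ itself.
invoked_pid=$(sh -c 'echo "$$"')

# A subshell also sees the parent's IFS without any export.
IFS=:
subshell_ifs=$(printf '%s' "$IFS")
unset IFS

echo "parent=$parent_pid subshell=$subshell_pid invoked=$invoked_pid"
```

The first two numbers should match, while the third belongs to a separate process.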


This is not merely a pedantic disquisition; there is actual divergence in this area. Here is a brief script which tests the scenario using several different shells. It exports a modified IFS (set to :) to an invoked shell which then prints its default IFS.

$ cat export-IFS.sh
export IFS=:
for sh in bash ksh93 mksh dash busybox:sh; do
    printf '\n%s\n' "$sh"
    # $sh is deliberately unquoted: with IFS=:, busybox:sh splits into "busybox sh".
    $sh -c 'printf %s "$IFS"' | hexdump -C
done

IFS is not generally marked for export, but, if it were, note how bash, ksh93, and mksh ignore their environment's IFS=:, while dash and busybox honor it.

$ sh export-IFS.sh

bash
00000000  20 09 0a                                          | ..|
00000003

ksh93
00000000  20 09 0a                                          | ..|
00000003

mksh
00000000  20 09 0a                                          | ..|
00000003

dash
00000000  3a                                                |:|
00000001

busybox:sh
00000000  3a                                                |:|
00000001

Some version info:

bash: GNU bash, version 4.3.11(1)-release
ksh93: sh (AT&T Research) 93u+ 2012-08-01
mksh: KSH_VERSION='@(#)MIRBSD KSH R46 2013/05/02'
dash: 0.5.7
busybox: BusyBox v1.21.1

Even though bash, ksh93, and mksh do not initialize IFS from the environment, they re-export their modified IFS.

If for whatever reason you need to portably pass IFS via the environment, you cannot do so using IFS itself; you will need to assign the value to a different variable and mark that variable for export. Children will then need to explicitly assign that value to their IFS.
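One way to sketch this hand-off is shown below. The variable name PARENT_IFS is invented for the example, and sh stands for any POSIX shell.

```shell
# PARENT_IFS is a hypothetical name; any name other than IFS will do.
PARENT_IFS=:
export PARENT_IFS

# The child assigns the value to its own IFS before relying on it.
# The word after the script text becomes $0, and 'a:b:c' becomes $1.
field_count=$(sh -c 'IFS=$PARENT_IFS; set -- $1; echo "$#"' sh 'a:b:c')

echo "child split the input into $field_count fields"
```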


You are right to be hesitant about clobbering a global. Fear not, it is possible to write clean working code without ever modifying the actual global IFS, or doing a cumbersome and error-prone save/restore dance.

You can:

  • set IFS for a single invocation:

    IFS=value command_or_function
    

    or

  • set IFS inside a subshell:

    (IFS=value; statement)
    $(IFS=value; statement)
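Both forms can be checked to leave the caller's IFS alone. The sketch below uses the read built-in for the first form and a command substitution for the second; the sample data and variable names are illustrative.

```shell
unset IFS                       # start from the default state

# Form 1: the assignment applies only while read runs.
IFS=: read -r user pass uid rest <<EOF
root:x:0:0:root:/root:/bin/sh
EOF
after_read=${IFS+set}           # empty: IFS is still unset

# Form 2: the assignment lives and dies with the subshell.
set -- 2024 01 15
date_str=$(IFS=-; echo "$*")
after_subshell=${IFS+set}       # empty: the parent never saw IFS=-

echo "user=$user uid=$uid date=$date_str"
```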
    

Examples

  • To obtain a comma-delimited string from an array:

    str="$(IFS=, ; echo "${array[*]-}")"
    

    Note: The - is only to protect an empty array against set -u by providing a default value when unset (that value being the empty string in this case).

    The IFS modification is only applicable inside the subshell spawned by the $() command substitution. This is because subshells have copies of the invoking shell's variables and can therefore read their values, but any modifications performed by the subshell only affect the subshell's copy and not the parent's variable.

    You might also be thinking: why not skip the subshell and just do this:

    IFS=, str="${array[*]-}"  # Don't do this!
    

    There is no command invocation here, so this line is instead interpreted as two independent, consecutive variable assignments, as if it were:

    IFS=,                     # Oops, global IFS was modified
    str="${array[*]-}"
    

    Finally, let's explain why this variant will not work:

    # Notice missing ';' before echo
    str="$(IFS=, echo "${array[*]-}")" # Don't do this! 
    

    The echo command will indeed be called with its IFS variable set to ,, but echo does not care about or use IFS. The magic of expanding "${array[*]}" to a string is done by the (sub-)shell itself before echo is even invoked.

  • To read in a whole file (that does not contain NUL bytes) into a single variable named VAR:

    IFS= read -r -d '' VAR < "${filepath}"
    

    Note: IFS= is the same as IFS="" and IFS='', all of which set IFS to the empty string. This is very different from unset IFS: if IFS is not set, all bash functionality that internally uses IFS behaves exactly as if IFS had the default value of $' \t\n'.

    Setting IFS to the empty string ensures leading and trailing whitespace is preserved.

    The -d '' or -d "" tells read to only stop its current invocation on a NUL byte, instead of the usual newline.

  • To split $PATH along its : delimiters:

    IFS=":" read -r -d '' -a paths <<< "$PATH"
    

    This example is purely illustrative. In the general case where you are splitting along a delimiter, it may be possible for the individual fields to contain (an escaped version of) that delimiter. Think of trying to read-in a row of a .csv file whose columns may themselves contain commas (escaped or quoted in some way). The above snippet will not work as intended for such cases.

    That said, you are unlikely to encounter such :-containing paths within $PATH. While UNIX/Linux pathnames are allowed to contain a :, it seems bash would not be able to handle such paths anyway if you added them to your $PATH and stored executable files in them, as there is no code to parse escaped/quoted colons (see the source code of bash 4.4).

    Finally, note that the snippet appends a trailing newline to the last element of the resulting array (as called out by @StéphaneChazelas in now-deleted comments), and that if the input is the empty string, the output will be a single-element array, where the element will consist of a newline ($'\n').
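The examples above can be tried together in one short bash script. This is a sketch under a few assumptions: the array contents, the temporary file, and the sample path string are made up, and the last read shows one possible way (not taken from the text above) to avoid the trailing newline, by feeding read a NUL-terminated string through process substitution.

```shell
# Requires bash (arrays, read -d, read -a, here-strings).
array=(a b c)

# Correct: IFS is changed in the subshell before "${array[*]-}" is expanded.
joined="$(IFS=, ; echo "${array[*]-}")"

# Missing ';': the expansion happens before IFS=, ever reaches echo.
not_joined="$(IFS=, echo "${array[*]-}")"

# Read a whole file, keeping leading and trailing whitespace.
# read returns non-zero because it hits end-of-file before a NUL byte,
# but the variable is still filled in.
tmpfile=$(mktemp)
printf '  indented\nlast line\n\n' > "$tmpfile"
IFS= read -r -d '' content < "$tmpfile" || true
rm -f "$tmpfile"

# Split a PATH-like string; the here-string adds a trailing newline,
# which ends up in the last array element.
sample_path="/usr/bin:/bin"
IFS=: read -r -d '' -a paths <<< "$sample_path" || true

# Possible workaround (an assumption of this sketch): supply the NUL
# delimiter explicitly, so no newline is appended.
IFS=: read -r -d '' -a clean < <(printf '%s\0' "$sample_path")

printf '%s\n' "$joined" "$not_joined"
printf '%q\n' "${paths[@]}" "${clean[@]}"
```

The %q format makes the stray newline in the last element of paths visible, while clean holds the two directories unchanged.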

Motivation

The basic old_IFS="${IFS}"; command; IFS="${old_IFS}" approach that touches the global IFS will work as expected for the simplest of scripts. However, as soon as you add any complexity, it can easily break apart and cause subtle issues:

  • If command is a bash function that also modifies the global IFS (either directly or, hidden from view, inside yet another function that it calls), and while doing so mistakenly uses the same global old_IFS variable to do the save/restore, you get a bug.
  • As pointed out in this comment by @Gilles, if the original state of IFS was unset, the naive save-and-restore won't work, and will even result in outright failures if the commonly (mis-)used set -u (a.k.a. set -o nounset) shell option is in force.
  • It is possible for some shell code to execute asynchronously to the main execution flow, such as with signal handlers (see help trap). If that code also modifies the global IFS or assumes it has a particular value, you can get subtle bugs.
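The first two problems are easy to reproduce. In the sketch below, inner and outer are hypothetical functions that both use the naive pattern with the same global old_IFS. The second part runs the naive save under set -u inside a subshell, so that the failure does not end the script.

```shell
inner() {
    old_IFS="$IFS"; IFS=,          # overwrites the value saved by outer
    : "work that needs IFS=,"
    IFS="$old_IFS"                 # restores ':' (outer's temporary value)
}

outer() {
    old_IFS="$IFS"; IFS=:
    inner
    IFS="$old_IFS"                 # old_IFS is now ':', not the original
}

IFS=' '                            # some known starting value
outer
leaked_ifs=$IFS                    # ':' has leaked into the global state

# Naive save with IFS unset and nounset active: the expansion is an error
# and the (sub)shell exits with a non-zero status.
unset IFS
( set -u; old_IFS="$IFS" ) 2>/dev/null
nounset_status=$?

echo "IFS after outer: '$leaked_ifs', status under set -u: $nounset_status"
```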

You could devise a more robust save/restore sequence (such as the one proposed in this other answer) to avoid some or all of these problems. However, you would have to repeat that piece of noisy boilerplate code wherever you temporarily need a custom IFS. This reduces code readability and maintainability.

Additional considerations for library-like scripts

IFS is especially a concern for authors of shell function libraries who need to ensure their code works robustly regardless of the global state (IFS, shell options, ...) imposed by their invokers, and also without disturbing that state at all (the invokers might rely on it to always remain static).

When writing library code, you cannot rely on IFS having any particular value (not even the default one) or even being set at all. Instead, you need to explicitly set IFS for any snippet whose behavior depends on IFS.

If IFS is explicitly set to the necessary value (even if that happens to be the default one) on every line of code where the value matters, using whichever of the two mechanisms described in this answer is appropriate to localize the effect, then the code is both independent of global state and avoids clobbering it altogether. This approach has the added benefit of making it very explicit to a person reading the script that IFS matters for precisely this one command/expansion, at minimal textual cost (compared to even the most basic save/restore).
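As an illustration, a library function can pin IFS for exactly the one expansion that needs it. The function name lib_join is hypothetical; its body is a subshell (parentheses instead of braces), so the assignment cannot escape to the caller.

```shell
# lib_join SEP ARG...: print the ARGs joined by SEP (first character only).
# The ( ... ) body runs in a subshell, so the caller's IFS is never touched.
lib_join() (
    IFS=$1
    shift
    printf '%s' "$*"
)

# The result does not depend on the caller's IFS ...
IFS='|'
with_odd_ifs=$(lib_join , a b c)
caller_ifs_after=$IFS            # ... and the caller's IFS is left as it was.

unset IFS
with_unset_ifs=$(lib_join , a b c)

echo "$with_odd_ifs $with_unset_ifs"
```

The price of this design is one extra process per call, which is usually acceptable for library helpers that are not called in tight loops.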

What code is affected by IFS anyway?

Fortunately, there are not that many scenarios where IFS matters (assuming you always quote your expansions):

  • "$*" and "${array[*]}" expansions
  • invocations of the read built-in targeting multiple variables (read VAR1 VAR2 VAR3) or an array variable (read -a ARRAY_VAR_NAME)
  • invocations of read targeting a single variable, where IFS decides how leading/trailing whitespace and non-whitespace IFS characters are handled
  • word-splitting (such as for unquoted expansions, which you might want to avoid like the plague)
  • some other less common scenarios (See: IFS @ Greg's Wiki)
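Two of the read cases are sketched below with made-up input: assigning to several variables, and assigning to a single variable with the default IFS and with an empty IFS.

```shell
unset IFS                          # default splitting for this demo

# Several variables: fields are split on IFS; the last variable gets the rest.
read -r first rest <<EOF
alpha   beta gamma
EOF

# A line with leading and trailing spaces (kept in a variable so the
# whitespace is visible in the source).
line='   padded text   '

# Single variable, default IFS: leading and trailing whitespace is trimmed.
read -r trimmed <<EOF
$line
EOF

# Single variable, empty IFS: the line is stored exactly as read.
IFS= read -r exact <<EOF
$line
EOF

printf '[%s] [%s] [%s] [%s]\n' "$first" "$rest" "$trimmed" "$exact"
```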