Serialize shell variable in bash or zsh

Warning: With any of these solutions, the data files are executed as shell code when your script reads them back, so you are trusting their contents completely. Securing them is paramount to your script's security!

Simple inline implementation for serializing one or more variables

Yes, in both bash and zsh you can serialize the contents of a variable in a way that is easy to retrieve using the typeset builtin and the -p argument. The output format is such that you can simply source the output to get your stuff back.

# You have variable(s) $FOO and $BAR already with your stuff
typeset -p FOO BAR > ./serialized_data.sh

You can get your stuff back like this either later in your script or in another script altogether:

# Load up the serialized data back into the current shell
source ./serialized_data.sh

This works in bash, zsh and ksh, and for simple values the file can often be passed between different shells. Bash prints the declarations using its declare builtin while zsh prints typeset, but each of those two shells accepts both names, so the output can be sourced either way; we use typeset to produce it for ksh compatibility.
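To see what actually lands in the file, here is a minimal round trip. It is only a sketch: it assumes bash (array indexing differs in zsh), and demo_file, FOO and BAR are illustrative names.

```shell
# Assumes bash; demo_file, FOO and BAR are illustrative names.
demo_file=$(mktemp)

FOO=(an "array with spaces" 'and a "quote"')
BAR=$'two\nlines'

# Write the declarations; the file now contains plain shell code.
typeset -p FOO BAR > "$demo_file"
cat "$demo_file"

# Discard the originals, then load them back at top level.
unset FOO BAR
source "$demo_file"

printf 'FOO has %d elements, the second is: %s\n' "${#FOO[@]}" "${FOO[1]}"
printf 'BAR: %s\n' "$BAR"
rm -f "$demo_file"
```

Looking at the output of cat makes it obvious why the file must be trusted: it is ordinary shell code that gets executed on load.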

More complex generalized implementation using functions

The above implementation is really simple, but if you call it frequently you might want to give yourself a utility function to make it easier. Additionally if you ever try to include the above inside custom functions you will run into issues with variable scoping. This version should eliminate those issues.

Note for all of these, in order to maintain bash/zsh cross-compatibility we will be fixing both the cases of typeset and declare so the code should work in either or both shells. This adds some bulk and mess that could be eliminated if you were only doing this for one shell or another.

The main problem with using functions for this (or including the code in other functions) is that the typeset function generates code that, when sourced back into a script from inside a function, defaults to creating a local variable rather than a global one.
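The scoping problem can be reproduced in a few lines. This is a sketch that assumes bash; load_inside_function and demo_file are illustrative names.

```shell
# Assumes bash; load_inside_function and demo_file are illustrative names.
demo_file=$(mktemp)
FOO="saved value"
typeset -p FOO > "$demo_file"
unset FOO

load_inside_function() {
    source "$demo_file"
    # Inside the function the variable is visible...
    printf 'inside:  %s\n' "${FOO-<unset>}"
}

load_inside_function
# ...but it was created as a local, so it is gone again out here.
printf 'outside: %s\n' "${FOO-<unset>}"
rm -f "$demo_file"
```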

This can be fixed with one of several hacks. My initial attempt to fix this was to pipe the output of the serialize step through sed to add the -g flag, so that the generated code defines a global variable when sourced back in.

serialize() {
    typeset -p "$1" | sed -E '0,/^(typeset|declare)/{s/ / -g /}' > "./serialized_$1.sh"
}
deserialize() {
    source "./serialized_$1.sh"
}

Note that the funky sed expression (which relies on the 0,/regex/ address form, a GNU sed extension) is there to match only the first occurrence of either 'typeset' or 'declare' and add -g as its first argument. It is necessary to match only the first occurrence because, as Stéphane Chazelas rightly pointed out in comments, it would otherwise also match cases where the serialized string contains a literal newline followed by the word declare or typeset.

In addition to correcting my initial parsing faux pas, Stéphane suggested a less brittle way to hack this: use wrapper functions to redefine what declare and typeset do while the data is sourced back in. This not only sidesteps the issues with parsing the strings but could also be a useful hook for adding further functionality. It assumes you are not playing any other games with the declare or typeset commands, but it is easier to implement when you are including this functionality as part of another function of your own, or when you are not in control of the data being written and whether or not it had the -g flag added. Something similar could also be done with aliases; see Gilles's answer for an implementation.

To make the result even more useful, we can iterate over multiple variables passed to our functions by assuming that each word in the argument array is a variable name. The result becomes something like this:

serialize() {
    for var in "$@"; do
        typeset -p "$var" > "./serialized_$var.sh"
    done
}

deserialize() {
    declare() { builtin declare -g "$@"; }
    typeset() { builtin typeset -g "$@"; }
    for var in "$@"; do
        source "./serialized_$var.sh"
    done
    unset -f declare typeset
}

With either solution, usage would look like this:

# Load some test data into variables
FOO=(an array or something)
BAR=$(uptime)

# Save it out to our serialized data files
serialize FOO BAR

# For testing purposes unset the variables so we know if it worked
unset FOO BAR

# Load the data back in from our data files
deserialize FOO BAR

# printf behaves the same in bash and zsh; ${FOO[*]} prints every array element
printf 'FOO: %s\nBAR: %s\n' "${FOO[*]}" "$BAR"
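Because these helpers build file names from their arguments and later execute those files, it may be worth rejecting anything that is not a plain variable name before using it. A minimal sketch in POSIX sh, where is_valid_name is a hypothetical helper and not part of either solution above:

```shell
# Hypothetical helper: succeed only for names made of letters, digits and
# underscores that do not start with a digit.
is_valid_name() {
    case $1 in
        '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

for candidate in FOO _bar2 2fast '../etc/passwd' 'FOO;rm'; do
    if is_valid_name "$candidate"; then
        printf 'ok:       %s\n' "$candidate"
    else
        printf 'rejected: %s\n' "$candidate"
    fi
done
```

A check like this could be called at the top of the loop in both serialize and deserialize; it does not replace protecting the data files themselves.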

Use redirection, command substitution, and parameter expansion. Double quotes are needed to preserve whitespace and special characters. The trailing x saves the trailing newlines which would be otherwise removed in the command substitution.

#!/bin/bash
# printf avoids echo's special handling of options and backslashes
printf '%sx\n' "$var" > file
unset var
var="$(< file)"
var=${var%x}
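The effect is easy to check. The following sketch, written for any POSIX sh with tmp and value as illustrative names, compares a plain command substitution with the sentinel version:

```shell
# POSIX sh sketch; tmp and value are illustrative names.
tmp=$(mktemp)
value='ends with two newlines

'
printf '%s' "$value" > "$tmp"

# Plain command substitution drops every trailing newline.
plain=$(cat "$tmp")

# Appending a sentinel character protects them; strip it afterwards.
guarded=$(cat "$tmp"; printf x)
guarded=${guarded%x}

if [ "$plain" = "$value" ]; then
    echo "plain: identical"
else
    echo "plain: trailing newlines lost"
fi
if [ "$guarded" = "$value" ]; then
    echo "guarded: identical"
else
    echo "guarded: differs"
fi
rm -f "$tmp"
```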

Serialize all — POSIX

In any POSIX shell, you can serialize all environment variables with export -p. This doesn't include non-exported shell variables. The output is properly quoted so that you can read it back in the same shell and get exactly the same variable values. The output might not be readable in another shell, for example ksh uses the non-POSIX $'…' syntax.

save_environment () {
  export -p >my_environment
}
restore_environment () {
  . ./my_environment
}
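As a quick check of what export -p does and does not capture, here is a small sketch for a POSIX sh. The names env_file, DEMO_SETTING and LOCAL_ONLY are made up for the illustration, and the file is read back in the same shell that wrote it.

```shell
# POSIX sh sketch; env_file, DEMO_SETTING and LOCAL_ONLY are illustrative names.
env_file=$(mktemp)

DEMO_SETTING="spaces and a 'quote'"
export DEMO_SETTING
LOCAL_ONLY='never exported'

export -p > "$env_file"

# Only the exported variable shows up in the file.
if grep -q 'DEMO_SETTING' "$env_file"; then echo "DEMO_SETTING saved"; fi
if ! grep -q 'LOCAL_ONLY' "$env_file"; then echo "LOCAL_ONLY not saved"; fi

# Drop the variable, then restore it by sourcing the file in the same shell.
unset DEMO_SETTING
. "$env_file"
printf 'restored: %s\n' "$DEMO_SETTING"
rm -f "$env_file"
```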

Serialize some or all — ksh, bash, zsh

Ksh (both pdksh/mksh and ATT ksh), bash and zsh provide a better facility with the typeset builtin. typeset -p prints out all defined variables and their values (zsh omits the values of variables that have been hidden with typeset -H). The output contains proper declarations, so that environment variables are exported when read back (but if a variable is already exported when read back, it won't be unexported), arrays are read back as arrays, and so on. Here also, the output is properly quoted but is only guaranteed to be readable in the same shell. You can pass a set of variables to serialize on the command line; if you don't pass any variable then all are serialized.

save_some_variables () {
  typeset -p VAR OTHER_VAR >some_vars
}

In bash and zsh, restoring can't simply be done from a function, because typeset statements inside a function are scoped to that function. You need to run . ./some_vars in the context where you want to use the variables' values, taking care that variables that were global when saved are redeclared as global. If you want to read back the values within a function and make them global, you can declare a temporary alias or function. In zsh:

restore_and_make_all_global () {
  alias typeset='typeset -g'
  . ./some_vars
  unalias typeset
}

In bash (which uses declare rather than typeset):

restore_and_make_all_global () {
  alias declare='declare -g'
  shopt -s expand_aliases
  . ./some_vars
  unalias declare
}

In ksh, typeset declares local variables in functions defined with function function_name { … } and global variables in functions defined with function_name () { … }.

Serialize some — POSIX

If you want more control, you can export the content of a variable manually. To print the content of a variable exactly into a file, use the printf builtin (echo has a few special cases such as echo -n on some shells and adds a newline):

printf %s "$VAR" >VAR.content

You can read this back with $(cat VAR.content), except that the command substitution strips off trailing newlines. To avoid this wrinkle, arrange for the command's output never to end with a newline, by appending an extra character that you strip afterwards.

VAR=$(cat VAR.content && echo a)
if [ $? -ne 0 ]; then echo 1>&2 "Error reading back VAR"; exit 2; fi
VAR=${VAR%?}

If you want to print multiple variables, you can quote them with single quotes, and replace all embedded single quotes with '\''. This form of quoting can be read back into any Bourne/POSIX-style shell. The following snippet works in any POSIX shell. It only works for string variables (and numeric variables in shells that have them, though they'll be read back as strings), it doesn't try to deal with array variables in shells that have them.

serialize_variables () {
  for __serialize_variables_x do
    eval "printf $__serialize_variables_x=\\'%s\\'\\\\n \"\$${__serialize_variables_x}\"" |
    sed -e "s/'/'\\\\''/g" -e '1 s/=.../=/' -e '$ s/...$//'
  done
}

Here's another approach which doesn't fork a subprocess but is heavier on string manipulation.

serialize_variables () {
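  for __serialize_variables_var do
    # Reset the accumulator so earlier variables don't leak into this one
    __serialize_variables_quoted=
    eval "__serialize_variables_tail=\${$__serialize_variables_var}"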
    while __serialize_variables_quoted="$__serialize_variables_quoted${__serialize_variables_tail%%\'*}"
          [ "${__serialize_variables_tail%%\'*}" != "$__serialize_variables_tail" ]; do
      __serialize_variables_tail="${__serialize_variables_tail#*\'}"
      __serialize_variables_quoted="${__serialize_variables_quoted}'\\''"
    done
    printf "$__serialize_variables_var='%s'\n" "$__serialize_variables_quoted"
  done
}

Note that on shells that allow read-only variables, you'll get an error if you try to read back a variable that's read-only.
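A round trip with the single-quote form might look like the following sketch. The function is repeated from the sed-based version above so that the block runs on its own; state_file, NAME and NOTE are illustrative names.

```shell
# POSIX sh sketch; state_file, NAME and NOTE are illustrative names.
# serialize_variables is copied from the sed-based version above.
serialize_variables () {
  for __serialize_variables_x do
    eval "printf $__serialize_variables_x=\\'%s\\'\\\\n \"\$${__serialize_variables_x}\"" |
    sed -e "s/'/'\\\\''/g" -e '1 s/=.../=/' -e '$ s/...$//'
  done
}

state_file=$(mktemp)
NAME="O'Brien"
NOTE='first line
second line'

# Each variable becomes a NAME='...' assignment, with embedded quotes as '\''.
serialize_variables NAME NOTE > "$state_file"
cat "$state_file"

# Read the assignments back in and show the restored values.
unset NAME NOTE
. "$state_file"
printf '%s\n' "$NAME" "$NOTE"
rm -f "$state_file"
```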