If processes inherit the parent's environment, why do we need export?

Your assumption is that shell variables are in the environment. This is incorrect. The export command is what defines a name to be in the environment at all. Thus:

a=1 b=2
export b

results in the current shell knowing that $a expands to 1 and $b to 2, but subprocesses will not know anything about a because it is not part of the environment (even in the current shell).
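A quick way to check this for yourself is to ask a child shell what it can see. This is only a minimal sketch: the `child_view` name is made up for the demo, and the `unset` at the top just guards against `a` or `b` having been inherited from wherever you run it.

```shell
#!/bin/sh
# Start clean in case a or b was already in the inherited environment.
unset a b

a=1 b=2
export b

# Single quotes: the expansions happen in the child, not in this shell.
# ${a-unset} expands to the word "unset" when a is not set at all.
child_view=$(sh -c 'echo "a=${a-unset} b=${b-unset}"')

echo "parent: a=$a b=$b"
echo "child:  $child_view"
```

The parent line shows both values, while the child only reports a value for b.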

Some useful tools:

set: Useful for viewing the current shell's parameters, exported-or-not
set -k: Sets assigned args in the environment. Consider f() { set -k; env; }; f a=1
set -a: Tells the shell to put any name that gets set into the environment. Like putting export before every assignment. Useful for .env files, as in set -a; . .env; set +a.
export: Tells the shell to put a name in the environment. Export and assignment are two entirely different operations.
env: As an external command, env can only tell you about the environment it inherited, so it's useful for sanity checking.
env -i: Useful for clearing the environment before starting a subprocess.
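To make the set -a idiom concrete, here is a small sketch. The temporary file stands in for a real .env file, and the APP_MODE, APP_PORT and DEMO_AFTER names are invented for the example.

```shell
#!/bin/sh
# Hypothetical .env file, created only for this demo.
envfile=$(mktemp)
printf 'APP_MODE=test\nAPP_PORT=8080\n' > "$envfile"

set -a          # from here on, every assignment is also exported
. "$envfile"    # the assignments in the file land in the environment
set +a          # back to normal: assignments stay shell-only

DEMO_AFTER=not_exported   # assigned after set +a, so not exported

# Ask a child process which of the three names it inherited.
seen=$(sh -c 'echo "${APP_MODE-unset} ${APP_PORT-unset} ${DEMO_AFTER-unset}"')
echo "$seen"

rm -f "$envfile"
```

The child sees the two names from the file but not the one assigned after set +a.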

Alternatives to export:

name=val command # Assignment before command exports that name to the command.
declare/local -x name # Exports name, particularly useful in shell functions when you want to avoid exposing the name to outside scope.
set -a # Exports every following assignment.
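The first two alternatives can be compared side by side. This sketch assumes bash (local -x is not POSIX sh), and GREETING, SCOPED and f are placeholder names, not anything standard.

```shell
#!/usr/bin/env bash
unset GREETING SCOPED

# 1. Assignment before a command: exported to that one command only.
one_shot=$(GREETING=hello sh -c 'echo "${GREETING-unset}"')
after=${GREETING-unset}          # the current shell never had GREETING

# 2. local -x: exported, but only while the function is running.
f() {
    local -x SCOPED=inside
    in_func=$(sh -c 'echo "${SCOPED-unset}"')
}
f
outside=$(sh -c 'echo "${SCOPED-unset}"')

echo "one_shot=$one_shot after=$after"
echo "in_func=$in_func outside=$outside"
```

In both cases the name reaches the child process without leaking into the rest of the script.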

Motivation

So why do shells need their own variables and an environment that is separate from them? I'm sure there are some historical reasons, but I think the main reason is scoping. The environment is for subprocesses, but there are lots of operations you can do in the shell without forking a subprocess. Suppose you loop:

for i in {0..50}; do
    somecommand
done

Why waste memory for somecommand by including i, making its environment any bigger than it needs to be? What if the variable name you chose in the shell just happens to mean something unintended to the program? (Personal favorites of mine include DEBUG and VERBOSE. Those names are used everywhere and are rarely namespaced adequately.)

What is the environment if not the shell?

Sometimes to understand Unix behavior you have to look at the syscalls, the basic API for interacting with the kernel and OS. Here, we're looking at the exec family of calls, which is what the shell uses when it creates a subprocess. Here's a quote from the manpage for exec(3):

The execle() and execvpe() functions allow the caller to specify the environment of the executed program via the argument envp. The envp argument is an array of pointers to null-terminated strings and must be terminated by a NULL pointer. The other functions take the environment for the new process image from the external variable environ in the calling process.

So writing export somename in the shell would be equivalent to copying the name into the global environ array in C (an array of "name=value" strings). But assigning somename without exporting it would be just like assigning an ordinary variable in C, without copying it into environ.
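You can see both behaviours from the shell without writing any C. The sketch below is only a loose analogy: a plain command gets a copy of the exported names, much like the exec functions that read environ, while env -i hands the child an environment built from scratch, roughly like passing your own envp. LOCAL_ONLY, INHERITED and ONLY_THIS are made-up names, and /bin/sh is assumed to exist.

```shell
#!/bin/sh
unset LOCAL_ONLY INHERITED ONLY_THIS

LOCAL_ONLY=shell        # shell variable only, never exported
export INHERITED=env    # placed in this shell's environment

# Default case: the child receives the exported names and nothing else.
default_view=$(sh -c 'echo "${LOCAL_ONLY-unset} ${INHERITED-unset}"')

# Explicit case: env -i starts from an empty environment and passes
# only the assignments listed on its command line.
explicit_view=$(env -i ONLY_THIS=1 /bin/sh -c 'echo "${INHERITED-unset} ${ONLY_THIS-unset}"')

echo "default:  $default_view"
echo "explicit: $explicit_view"
```

In the default case the child sees INHERITED but not LOCAL_ONLY; in the explicit case it sees only ONLY_THIS.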

There's a difference between shell variables and environment variables. If you define a shell variable without exporting it, it is not added to the process's environment and thus is not inherited by its children.

Using export, you tell the shell to add the shell variable to the environment. You can test this with printenv, which just prints its environment to stdout. Since it runs as a child process, you see the effect of exporting variables:

#!/bin/sh

MYVAR="my cool variable"

echo "Without export:"
printenv | grep MYVAR

echo "With export:"
export MYVAR
printenv | grep MYVAR

A variable, once exported, is part of the environment. PATH is exported in the shell itself, while custom variables can be exported as needed. Using some setup code:

$ cat subshell.sh 
#!/usr/bin/env bash
declare | grep -e '^PATH=' -e '^foo='

Compare

$ cat test.sh 
#!/usr/bin/env bash
export PATH=/bin
export foo=bar
declare | grep -e '^PATH=' -e '^foo='
./subshell.sh
$ ./test.sh 
PATH=/bin
foo=bar
PATH=/bin
foo=bar

With

$ cat test2.sh 
#!/usr/bin/env bash
PATH=/bin
foo=bar
declare | grep -e '^PATH=' -e '^foo='
./subshell.sh
$ ./test2.sh 
PATH=/bin
foo=bar
PATH=/bin

Since foo is not exported by the shell, and test2.sh never exported it, it was not part of the environment of subshell.sh in the last run.
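If you'd rather not run a second script, bash can also tell you directly whether a name carries the export attribute. A small sketch, assuming bash and reusing foo from above (the before and after names are only for the demo):

```shell
#!/usr/bin/env bash
unset foo

foo=bar
before=$(declare -p foo)   # declare -- foo="bar"   (no attributes)

export foo
after=$(declare -p foo)    # declare -x foo="bar"   (-x marks it exported)

echo "$before"
echo "$after"
```

The -x flag in the second line is the same attribute that declare -x or local -x would set.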
