How can I harden bash scripts against causing harm when changed in the future?

set -u

or

set -o nounset

This would make the current shell treat expansions of unset variables as an error:

$ unset build
$ set -u
$ rm -rf "$build"/*
bash: build: unbound variable

set -u and set -o nounset are POSIX shell options.

An empty value would not trigger an error though.

For that, use

$ rm -rf "${build:?Error, variable is empty or unset}"/*
bash: build: Error, variable is empty or unset

${variable:?word} expands to the value of variable unless it is empty or unset. In that case, word is written to standard error and the shell treats the expansion as an error: the command is not executed, and a non-interactive shell terminates. Leaving out the : triggers the error only for an unset variable, just like under set -u.

${variable:?word} is a POSIX parameter expansion.
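To see the difference between the two forms, here is a minimal sketch. The helper name check and the path are made up for illustration; each expansion is evaluated in a subshell so that a failed expansion cannot take down the calling shell.

```shell
#!/usr/bin/env bash
# check: hypothetical helper. Evaluates an expansion in a subshell so that
# a failed ${...?} expansion cannot terminate the calling shell.
check() {
    # $1: label; $2: expansion text, run through eval inside a subshell
    if ( eval ": $2" ) 2>/dev/null; then
        printf '%-22s accepted\n' "$1"
    else
        printf '%-22s rejected\n' "$1"
    fi
}

unset build
check 'unset, with colon'    '"${build:?}"'
check 'unset, without colon' '"${build?}"'

build=
check 'empty, with colon'    '"${build:?}"'
check 'empty, without colon' '"${build?}"'

build=/tmp/example-build     # made-up path, never touched
check 'set, with colon'      '"${build:?}"'
```

Only the "empty, without colon" and "set, with colon" cases should be reported as accepted.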

Neither of these would cause an interactive shell to terminate unless set -e (or set -o errexit) was also in effect. In a script (a non-interactive shell), ${variable:?word} causes the shell to exit if the variable is empty or unset, and set -u causes it to exit when an unset variable is expanded; set -e is not needed for either.
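How a script (a non-interactive shell) reacts can be checked directly by running each case in a child bash process. This is only a sketch; the helper name survives is made up.

```shell
#!/usr/bin/env bash
# survives: hypothetical helper. Runs a snippet in a child, non-interactive
# bash and reports whether that child reached its final "exit 0".
survives() {
    if bash -c "$1; exit 0" 2>/dev/null; then
        echo "ran to completion"
    else
        echo "terminated early"
    fi
}

echo "set -u, unset variable:   $(survives 'set -u; unset build; : "$build"')"
echo "\${build:?}, empty value: $(survives 'build=; : "${build:?}"')"
echo "set -u, empty value:      $(survives 'set -u; build=; : "$build"')"
```

As far as the bash manual describes it, the first two cases should report "terminated early" and the third "ran to completion": set -u on its own stops a script on an unset variable, while an empty value passes.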

As for your second question: there is no way to restrict rm so that it does not work outside of the current directory.

The GNU implementation of rm has a --one-file-system option that stops it from recursively deleting mounted filesystems, but that's as close as I believe we can get without wrapping the rm call in a function that actually checks the arguments.
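Such a wrapper could look roughly like the sketch below. The name safe_rm is made up, and the function assumes realpath from GNU coreutils (its -m option resolves a path even if parts of it do not exist); other systems may need a different way to canonicalize paths.

```shell
#!/usr/bin/env bash
# safe_rm: hypothetical wrapper around rm -rf. Refuses to run unless every
# argument resolves to a path strictly below the current directory.
safe_rm() {
    local base target arg
    base=$(pwd -P) || return 1
    for arg in "$@"; do
        # Assumes GNU realpath; an empty argument makes it fail, which we
        # treat as a refusal.
        target=$(realpath -m -- "$arg") || return 1
        case $target in
            "$base"/?*) ;;    # strictly inside the current directory
            *)
                printf 'safe_rm: refusing to remove %s\n' "$arg" >&2
                return 1
                ;;
        esac
    done
    rm -rf -- "$@"
}
```

With this, safe_rm "$build"/* fails instead of deleting when the variable is empty and the glob expands to entries under /. The check is deliberately conservative: it also refuses a symbolic link that points outside the current directory, and it refuses everything when the current directory is / itself.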

As a side note: ${build} is exactly equivalent to $build unless the expansion occurs as part of a string where the immediately following character is a valid character in a variable name, such as in "${build}x".
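A small illustration of that difference, with made-up values:

```shell
#!/usr/bin/env bash
build=out            # made-up value for illustration
buildx=other         # a different variable whose name starts with "build"

echo "$build"        # prints: out
echo "${build}"      # prints: out
echo "$buildx"       # prints: other (the shell reads the name "buildx")
echo "${build}x"     # prints: outx  (value of "build" followed by a literal x)
```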

I'm going to suggest normal validation checks using test/[ ].

You would have been safe if you had written your script like this:

[ -n "${build}" ] || exit 1
rm -rf "${build}/"*

The [ -n "${build}" ] checks that "${build}" is a non-zero length string.

The || is the logical OR operator in bash. It causes another command to be run if the first one failed.

In this way, had ${build} been empty/undefined/etc. the script would have exited (with a return code of 1, which is a generic error).

This also would have protected you in case you removed all uses of ${build}, because [ -n "" ] is always false.

The advantage of using test/[ ] is there are many other more meaningful checks that it can also use.

For example:

[ -f FILE ] True if FILE exists and is a regular file.
[ -d FILE ] True if FILE exists and is a directory.
[ -O FILE ] True if FILE exists and is owned by the effective user ID.
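Several of these checks can be combined before anything is deleted. A minimal sketch, where the function name is_safe_build_dir is made up:

```shell
#!/usr/bin/env bash
# is_safe_build_dir: hypothetical helper that combines several test checks.
# Returns 0 only if the argument is a non-empty string naming an existing
# directory owned by the effective user.
is_safe_build_dir() {
    local dir=${1-}
    [ -n "$dir" ] || { echo "build directory is empty or unset" >&2; return 1; }
    [ -d "$dir" ] || { echo "$dir is not a directory" >&2; return 1; }
    [ -O "$dir" ] || { echo "$dir is not owned by the current user" >&2; return 1; }
}

# Possible use in a script:
#   is_safe_build_dir "${build-}" || exit 1
#   rm -rf "${build}/"*
```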

In your specific case, I've reworked 'deletion' in the past to move files/directories instead (assuming /tmp is on the same partition as your directory):

# mktemp -d is also a good, reliable choice
trashdir="/tmp/.trash-$USER"    # assumed location, matching the path used in the cron note below
mkdir -p "$trashdir"
mv "${build}"/* "$trashdir"

Behind the scenes, mv only relinks the top-level files and directories into $trashdir, because source and destination are on the same partition. It does not walk the directory tree and free each file's disk blocks right then and there. This gives a much faster cleanup while the system is in active use, in exchange for a slightly slower reboot (/tmp is cleaned on reboots).
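The mktemp -d variant could be wrapped up roughly like this. The function name move_to_trash and the trash.XXXXXX template are made up, and the sketch assumes the temporary directory is on the same partition as the directory being emptied.

```shell
#!/usr/bin/env bash
# move_to_trash: hypothetical helper. Moves the contents of a directory into
# a fresh directory created by mktemp -d and prints the name of that
# directory, so the contents can still be recovered later.
move_to_trash() {
    local dir=${1-} trash
    [ -d "$dir" ] || { echo "move_to_trash: not a directory: $dir" >&2; return 1; }
    trash=$(mktemp -d "${TMPDIR:-/tmp}/trash.XXXXXX") || return 1
    # The * glob skips dotfiles; with an empty directory the glob stays
    # unexpanded and mv fails.
    mv -- "$dir"/* "$trash"/ || return 1
    printf '%s\n' "$trash"
}
```

Called as trash=$(move_to_trash "$build"), an empty or unset $build is rejected by the directory check before mv ever runs.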

Alternatively, a cron entry to periodically clean /tmp/.trash-$USER will keep /tmp from filling up, for processes (e.g., builds) that consume a lot of disk space. If your directory is on a different partition than /tmp, you could create a similar /tmp-like directory on your partition and have cron clean that instead.
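The script run by such a cron entry could be as small as the sketch below. The function name clean_trash and the default age are made up; -mindepth and -delete are find extensions available in the GNU and BSD implementations, not POSIX.

```shell
#!/usr/bin/env bash
# clean_trash: hypothetical helper meant to be called from a cron job.
# Deletes entries below a trash directory whose age in whole days is
# greater than the given number (find's -mtime +N semantics).
clean_trash() {
    local dir=${1-} days=${2:-1}
    [ -d "$dir" ] || return 0    # nothing to clean, also covers an empty argument
    # -mindepth 1 keeps the trash directory itself in place.
    find "$dir" -mindepth 1 -mtime +"$days" -delete
}

# Possible use from the cron script:
#   clean_trash "/tmp/.trash-$USER" 1
```

Note that cron jobs run with a minimal environment, so it may be safer to spell out the directory name than to rely on $USER being set there.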

Most importantly, though, if you screw up the variables in any way, you can recover the contents before the cleanup happens.