How are double quotation marks in bash matched (paired)?

Any of the nesting constructs that can be interpolated inside strings can have further strings inside them: they are parsed like a new script, up to the closing marker, and can even be nested multiple levels deep. All but one of them start with a $. All of them are documented in a combination of the Bash manual and the POSIX shell command language specification.

There are a few cases of these constructs:

  • Command substitution with $( ... ), as you've found. POSIX specifies this behaviour:

    With the $(command) form, all characters following the open parenthesis to the matching closing parenthesis constitute the command. Any valid shell script can be used for command ...

    Quotes are part of valid shell scripts, so they're allowed with their normal meaning.

  • Command substitution using backticks (`...`), too.
  • The "word" element of advanced parameter substitution instances such as ${parameter:-word}. The definition of "word" is:

    A sequence of characters treated as a unit by the shell

    - which includes quoted text and even mixed quotes such as a"b"c'd'e. The actual behaviour of the expansions is a little more liberal than that, though: for example, ${x:-hello world} works too.

  • Arithmetic expansion with $(( ... )), although quoting is largely useless there (but you can nest command substitutions or variable expansions too, and then have quotes usefully inside those). POSIX states that:

    The expression shall be treated as if it were in double-quotes, except that a double-quote inside the expression is not treated specially. The shell shall expand all tokens in the expression for parameter expansion, command substitution, and quote removal.

    so this behaviour is explicitly required. That means echo "abc $((4 "*" 5))" does arithmetic, rather than globbing.

    Note, though, that the old-style $[ ... ] arithmetic expansion is not treated the same way: quotes are an error if they appear, regardless of whether the expansion is quoted or not. This form isn't documented at all any more, and isn't meant to be used anyway.

  • Locale-specific translation with $"...", which actually uses the " as a core element. $" is treated as a single unit.
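
A minimal sketch of three of the cases above, assuming a Bash or POSIX-like shell. The variable names a, b, c and x are made up for illustration. In each assignment the inner double quotes open a new string inside the construct instead of closing the outer one, so the doubled spaces survive:

```shell
# Command substitution: the inner quotes open a new string inside $( ... ).
a="$(printf '%s' "inner  value")"

# Parameter expansion: the "word" after :- may itself contain a quoted string.
unset x
b="${x:-"default  value"}"

# Arithmetic expansion containing a command substitution with its own quotes.
c="$(( $(printf '%s' "4") * 5 ))"

printf '<%s>\n' "$a" "$b" "$c"
```

If the quotes paired up from left to right instead, the first assignment would break into separate words at the spaces.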

There's one further nesting case you may not expect, not involving quotes, which is with brace expansion: {a,b{c,d},e} expands to "a bc bd e". ${x:-a{b,c}d} does not nest, however; it is treated as a parameter substitution giving "a{b,c", followed by "d}". That is also documented:

When braces are used, the matching ending brace is the first ‘}’ not escaped by a backslash or within a quoted string, and not within an embedded arithmetic expansion, command substitution, or parameter expansion.
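
The two brace cases can be tried side by side. This is only a sketch and assumes bash, since brace expansion is not part of POSIX sh. The names nested, flat and x are hypothetical:

```shell
#!/usr/bin/env bash
# Assumes bash: plain POSIX sh has no brace expansion.

# Nested brace expansion: the inner {c,d} is expanded within the outer list.
nested=$(echo {a,b{c,d},e})

# With x unset, the parameter expansion ends at the first }, so the
# default word is "a{b,c" and the trailing "d}" is literal text.
unset x
flat=$(echo ${x:-a{b,c}d})

printf '%s\n' "$nested" "$flat"
```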


As a general rule, all delimited constructs parse their bodies independently of the surrounding context (and exceptions are treated as bugs). In essence, on seeing $( the command-substitution code just asks the parser to consume what it can from the body as though it's a new program, and then checks that the expected terminating marker (an unescaped ) or )) or }) appears once the sub-parser runs out of things it can consume.
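
One way to see the sub-parser at work is to put the terminating character inside an inner string. In this small sketch (the variable name paren is made up) the quoted ) is consumed as part of the inner string, and only the later unquoted ) closes the substitution:

```shell
# The ) inside the inner quotes belongs to the string, so it does not end $( ... ).
paren="$(printf '%s' "a ) b")"
printf '<%s>\n' "$paren"
```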

If you think about the functioning of a recursive-descent parser, that's just a simple recursion to the base case. It's actually easier to do than the other way, once you've got string interpolation at all. Regardless of the underlying parsing technique, shells supporting these constructs give the same result.

You can nest quoting as deeply as you like through these constructs and it will work as expected. Nothing gets confused by seeing a quote in the middle; instead, that quote is the start of a new quoted string in the interior context.
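
As an illustration of that, here is a sketch with three levels of command substitution, each with its own pair of double quotes (the variable name deep is arbitrary):

```shell
# Every $( opens a fresh quoting context, so each " pairs with the next "
# at the same nesting level rather than with the nearest one to its left.
deep="$(printf '%s' "outer $(printf '%s' "middle $(printf '%s' "inner  most")")")"
printf '<%s>\n' "$deep"
```

The doubled space in the innermost string is preserved, which shows that it was still inside quotes at that level.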


Perhaps looking at the two examples with printf (instead of echo) will help:

$ printf '<%s> ' "(echo " * ")"; echo
<(echo > <test.txt> <ppcg.sh> <file1> <file2> <file3> <)>

It prints (echo  (the first word, including a trailing space), some files, and the closing word ).
The parenthesis is just part of the quoted string (echo .
The asterisk (now unquoted, as the two double quotes are paired) is expanded as a glob to a list of matching files.
Finally, the closing parenthesis is a quoted string of its own.

However, your second command works as follows:

$ printf '<%s> ' "$(echo " * ")" ; echo
< * >

The $( starts a command substitution, which starts the quoting afresh.
Inside it, the asterisk is quoted as " * ", and that is what echo (here a command, not part of a quoted string) outputs. Finally, printf formats that string and prints it as < * >.
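
Because the first result depends on which files are in the current directory, a reproducible variant can be sketched in a scratch directory. The directory and the file names file1 and file2 are assumptions made for the example, as are the variable names:

```shell
# Work in an empty temporary directory so that the glob result is predictable.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch file1 file2

# No $ before the parenthesis: three separate words, with * unquoted in the middle.
first=$(printf '<%s> ' "(echo " * ")")

# With $( ... ): one command substitution whose body quotes the asterisk.
second=$(printf '<%s> ' "$(echo " * ")")

printf '%s\n' "$first" "$second"

# Clean up the scratch directory.
cd / && rm -r "$dir"
```

The first line of output lists the two files between <(echo > and <)>, and the second is just < * >.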