Passing generated empty strings as command line arguments

In

./input $(cmd)

because $(cmd) is unquoted, it is a split+glob operator. The shell retrieves the output of cmd, removes all trailing newline characters, splits the result based on the value of the $IFS special parameter, and then performs filename generation on the resulting words (for instance, turning *.txt into the list of non-hidden txt files in the current directory); zsh skips that last step. ksh additionally performs brace expansion (turning a{b,c} into ab and ac, for instance).

The default value of $IFS contains the SPC, TAB and NL characters (also NUL in zsh; other shells either remove NULs or choke on them). Those three (not NUL) also happen to be IFS-whitespace characters, which are treated specially when it comes to IFS-splitting.

If the output of cmd is " a b\nc \n", that split+glob operator will generate "a", "b" and "c" arguments to ./input. With IFS-whitespace characters, it's impossible for split+glob to generate an empty argument, because any sequence of one or more IFS-whitespace characters is treated as a single delimiter. To generate an empty argument, you need to choose a separator that is not an IFS-whitespace character. Any character other than SPC, TAB or NL will do (though it's best to also avoid multi-byte characters, which not all shells support here).
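To see the whitespace case for yourself, here is a minimal sketch. The names count_args, fake_cmd and default_split are made up for illustration; fake_cmd stands in for cmd and emits the " a b\nc \n" output discussed above.

```shell
# count_args (hypothetical helper): prints how many arguments it received.
count_args() {
  echo "$#"
}

# fake_cmd (hypothetical stand-in for cmd): outputs " a b\nc \n".
fake_cmd() {
  printf ' a b\nc \n'
}

# default_split runs in a subshell, so its settings don't leak out.
# With IFS unset, splitting behaves as with the default SPC/TAB/NL value.
default_split() (
  unset IFS
  set -o noglob            # keep filename generation out of the picture
  count_args $(fake_cmd)   # unquoted on purpose: this is split+glob
)

default_split   # prints 3: the arguments are "a", "b" and "c"
```

The leading space, the doubled-up delimiters and the trailing space never yield an empty argument.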

So for instance if you do:

IFS=:          # split on ":" which is not an IFS-whitespace character
set -o noglob  # disable globbing (also brace expansion in ksh)
./input $(cmd)

If cmd then outputs a::b\n, that split+glob operator will result in "a", "" and "b" arguments (note that the "s are not part of the values; I'm only using them here to help show the values).

With a:b:\n, depending on the shell, that will result in "a" and "b" or "a", "b" and "". You can make it consistent across all shells with

./input $(cmd)""

(which also means that for an empty output of cmd (or an output consisting only of newline characters), ./input will receive one empty argument as opposed to no argument at all).
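A small sketch of that trailing "" trick follows. The helper names show_args and split_with_trailing are hypothetical, and printf stands in for cmd so that the output to split can be passed in as a parameter.

```shell
# show_args (hypothetical helper): reports the arguments it was given.
show_args() {
  printf 'I got %d arguments:\n' "$#"
  [ "$#" -eq 0 ] || printf ' - <%s>\n' "$@"
}

# split_with_trailing (hypothetical helper): splits "$1" on ":" with ""
# appended to the unquoted command substitution.
# The subshell body keeps the IFS and noglob changes local.
split_with_trailing() (
  IFS=:
  set -o noglob
  show_args $(printf '%s\n' "$1")""
)

split_with_trailing 'a:b:'   # 3 arguments: <a>, <b> and <>
split_with_trailing ''       # 1 argument: <>
```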

Example:

cmd() {
  printf 'a b:: c\n'
}
input() {
  printf 'I got %d arguments:\n' "$#"
  [ "$#" -eq 0 ] || printf ' - <%s>\n' "$@"
}
IFS=:
set -o noglob
input $(cmd)

gives:

I got 3 arguments:
 - <a b>
 - <>
 - < c>

Also note that when you do:

./input ""

Those " are part of the shell syntax: they are shell quoting operators, and they are not passed to input.


You could generate the whole command line programmatically, and either copy-paste it, or run it through eval, e.g.:

$ perl -e 'printf "./args.sh %s\n", q/"" / x 10' 
./args.sh "" "" "" "" "" "" "" "" "" "" 

$ eval "$(perl -e 'printf "./args.sh %s\n", q/"" / x 100')"
$#: 100
$1: ><

(q/"" / is one of Perl's ways of quoting a string; x 100 makes a hundred copies of it and concatenates them.)

eval processes its argument(s) as shell commands, running all quote processing and expansions. This means that if any of the input comes from untrusted sources, you'll need to be careful in generating the evaluated code to prevent vulnerabilities.

If you want the number of empty arguments to be variable, that should be doable without issues (at least, I can't come up with a way the second operand to Perl's x could be misused, as Perl converts that operand to an integer):

$ n=33
$ eval "$(perl -e 'printf "./args.sh %s\n", q/"" / x $ARGV[0]' "$n")"
$#: 33
$1: ><
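If you'd rather avoid eval altogether, one possible alternative (not part of the approach above, just a sketch) is to build the argument list with set -- in a loop. The names call_with_empties and print_argc are made up for this example; print_argc merely imitates the kind of output ./args.sh gives above.

```shell
# print_argc (hypothetical stand-in for ./args.sh): shows the argument
# count and the first argument between > and <.
print_argc() {
  printf '$#: %d\n' "$#"
  printf '$1: >%s<\n' "$1"
}

# call_with_empties N CMD [ARGS...] (hypothetical helper): runs CMD with
# N empty-string arguments appended. Note that n and i are plain global
# variables here, as POSIX sh has no local.
call_with_empties() {
  n=$1
  shift
  i=0
  while [ "$i" -lt "$n" ]; do
    set -- "$@" ""     # append one empty argument
    i=$((i + 1))
  done
  "$@"                 # run the command with the arguments built so far
}

call_with_empties 33 print_argc
```

Since nothing is re-parsed as shell code, the count is only ever used as a number in the loop test.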

But what do you actually want to pass: empty quotes or empty strings? Both are valid arguments, and this simple bash script can help illustrate the difference:

#!/bin/bash

printf "Argument count: %d.\n" "$#"

It just prints the number of arguments passed to it. I'll call it s for brevity.

$ ./s a
Argument count: 1.
$ ./s a b
Argument count: 2.
$ ./s a b ""
Argument count: 3.
$ ./s a b "" ""
Argument count: 4.
$ ./s a b "" "" \"\"
Argument count: 5.

As you can see, the empty strings are just empty strings: the quotes are removed at parsing time, and what remains is still a valid argument that the shell passes to the command. A literal "" (written \"\" above) can be passed as well, but it's not an empty string: it contains two characters.
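To make the difference visible, here is a small sketch that prints the length of each argument. arg_lengths is a name made up for this example.

```shell
# arg_lengths (hypothetical helper): prints the length, in characters,
# of each argument it receives, one per line.
arg_lengths() {
  for arg in "$@"; do
    printf '%d\n' "${#arg}"
  done
}

arg_lengths a "" \"\"
# prints 1, 0 and 2 on separate lines: "" arrives as an empty string,
# while \"\" arrives as two literal quote characters
```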

Under the hood, in C, strings are NUL (\0) terminated, so no quotes are needed to represent them: an empty argument is simply a string whose first byte is the terminator.
