Shell valid function name characters

Since the POSIX specification allows it as an extension, nothing prevents an implementation from behaving that way.

A simple check (run in zsh):

$ for shell in /bin/*sh 'busybox sh'; do
    printf '[%s]\n' $shell
    $=shell -c 'á() { :; }'
  done
[/bin/ash]
/bin/ash: 1: Syntax error: Bad function name
[/bin/bash]
[/bin/dash]
/bin/dash: 1: Syntax error: Bad function name
[/bin/ksh]
[/bin/lksh]
[/bin/mksh]
[/bin/pdksh]
[/bin/posh]
/bin/posh: á: invalid function name
[/bin/yash]
[/bin/zsh]
[busybox sh]
sh: syntax error: bad function name

shows that bash, zsh, yash, ksh93 (which ksh is linked to on my system), and pdksh and its derivatives mksh and lksh allow multibyte characters in function names.

yash was designed to support multibyte characters from the beginning, so it's no surprise that it works.
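
For a script that has to run on every shell in the list above, one conservative option is to keep function names within what POSIX calls a name: ASCII letters, digits and underscores, not starting with a digit. A minimal sketch of such a check follows; `is_portable_name` is a hypothetical helper, not part of any shell.

```shell
# Hypothetical helper: succeed if $1 is a POSIX "name"
# (ASCII letters, digits, underscores; first character not a digit).
# The body runs in a subshell so that forcing the C locale, which keeps
# ranges such as A-Z limited to ASCII, does not leak into the caller.
is_portable_name() (
  LC_ALL=C
  case $1 in
    '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) exit 1 ;;
  esac
  exit 0
)

# Try a few candidate names.
for name in foo_1 1foo 'á' /bin/ls; do
  if is_portable_name "$name"; then
    printf '%s: portable\n' "$name"
  else
    printf '%s: not portable\n' "$name"
  fi
done
```

Names that pass this check are accepted by the stricter shells such as dash and posh; names that fail it rely on the extension discussed here.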

Another piece of documentation you can refer to is that of ksh93:

A blank is a tab or a space. An identifier is a sequence of letters, digits, or underscores starting with a letter or underscore. Identifiers are used as components of variable names. A vname is a sequence of one or more identifiers separated by a . and optionally preceded by a .. Vnames are used as function and variable names. A word is a sequence of characters from the character set defined by the current locale, excluding non-quoted metacharacters.

So setting the locale to C:

$ export LC_ALL=C
$ á() { echo 1; }
ksh: á: invalid function name

makes it fail.
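
The same comparison can be scripted. Here is a minimal sketch, assuming the shell under test accepts `-c` like the ones above and that a locale named `en_US.UTF-8` is installed (both are assumptions, adjust as needed). `accepts_name` is a hypothetical helper. Unlike the interactive session, it starts a fresh shell in each locale rather than changing the locale of a running one.

```shell
# Hypothetical helper: does shell $1 accept $2 as a function name when
# started with LC_ALL set to $3?
# The name is interpolated into code, so only pass trusted values.
accepts_name() {
  LC_ALL=$3 "$1" -c "$2() { :; }" 2>/dev/null
}

# Compare the C locale with a UTF-8 one (assumed to be installed).
for locale in C en_US.UTF-8; do
  if accepts_name sh 'á' "$locale"; then
    printf '%s: accepted\n' "$locale"
  else
    printf '%s: rejected\n' "$locale"
  fi
done
```

Replace `sh` with `ksh`, `bash` and so on to compare shells; the result depends on which shell and which locales are installed.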


Note that functions share the same namespace as other commands including commands in the file system, which on most systems have no limitation on the characters or even bytes they may contain in their path.
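
The shared namespace is easy to see with a name every shell accepts. In this small sketch a function shadows the external `uname`, and the standard `command` utility skips function lookup to reach the real one.

```shell
# A function takes precedence over an external command of the same name.
uname() { echo 'fake uname'; }

uname            # runs the function
command uname    # bypasses the function and runs the real utility
```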

So while most shells restrict the characters allowed in function names, there's no really good reason for them to do so. It means that in those shells, there are commands you can't replace with a function.

zsh and rc allow anything in function names, including names containing / and the empty string. zsh even allows NUL bytes.

$ zsh
$ $'\0'() echo nul
$ ^@
nul
$ ""() uname
$ ''
Linux
$ /bin/ls() echo test
$ /bin/ls
test

A simple command in a shell is a list of arguments, and the first argument is used to derive the command to execute. So it's only logical that those arguments and function names share the same possible values, and in zsh, arguments to builtins and functions can be any byte sequence.

There's no security issue here, as the functions you (the script author) define are the ones you invoke.

Where there may be security issues is when parsing is affected by the environment, for instance in shells where the set of valid function names depends on the locale.
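
One defensive habit, sketched here under the assumption that the shell reads and parses a script one command at a time, is to pin the locale before any function is defined, so that the names the parser accepts don't depend on the caller's environment. `greet` is a placeholder function used only for illustration.

```shell
#!/bin/sh
# Pin the locale first: every definition below is then parsed under the
# C locale regardless of what the caller exported.
LC_ALL=C
export LC_ALL

# Placeholder function with a name that is valid in every locale.
greet() { printf 'hello, %s\n' "$1"; }

greet world
```

Combined with names restricted to ASCII letters, digits and underscores, this leaves little for the environment to influence.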