Why does quoting not protect filenames that start with "-" against mis-interpretation?

The reason is that quoting only affects what the shell itself does. Single quotes do remove the special meaning of characters, but that refers to characters that are special to the shell and that the shell interprets before the result is passed to the program: those involved in variable expansion, globbing and word splitting, as well as shell metacharacters (such as the |).

The - is not a "special character" in that sense. What makes it special is that it is a de facto standard way to indicate to a program that the string started by it is an "option" argument. In fact, many (if not most) programs in the Unix/Linux ecosystem rely on the external getopt() function for that purpose [1]. That function is part of the POSIX specification and provides a standardized way to handle command-line parameters, and the convention of interpreting parameters that start with - as "options" is embedded there.

So, the single quotes ensure that the -|h4ker|- is passed verbatim from the shell to the program (cat in your case), but the shell also removes the quotes in that process. Hence the program still sees a parameter that starts with a - and treats it as an "option" argument, not as an operand (like a file name to process).
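To make the quote removal visible, here is a minimal sketch. The function show_args is a hypothetical helper (not a standard utility) that prints every argument it receives in brackets:

```shell
# Hypothetical helper: print each received argument on its own line, in brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# Three different ways of quoting the same string.
show_args '-|h4ker|-'
show_args "-|h4ker|-"
show_args -\|h4ker\|-
```

All three calls print [-|h4ker|-]: the quotes and backslashes are gone by the time the argument arrives, but the leading - is still there.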

This is the reason why many programs (again, all that rely on getopt()) interpret the -- token on the command line as a special "end-of-options" indicator, so that they can safely be applied to files whose names start with -.
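The effect of -- can be sketched in the shell itself with the getopts builtin. The function parse below is a hypothetical example, not part of any real program; it accepts a single option -n and treats everything else as an operand:

```shell
# Hypothetical function: accepts one option (-n), everything else is an operand.
parse() {
  OPTIND=1                      # reset getopts state for each call
  while getopts "n" opt; do
    case $opt in
      n) echo "option: -n" ;;
      *) return 1 ;;            # unknown option; getopts has already printed a message
    esac
  done
  shift $((OPTIND - 1))         # drop the parsed options (and a "--", if present)
  for operand in "$@"; do
    echo "operand: $operand"
  done
}

parse -n file       # -n is parsed as an option
parse -- -n file    # after "--", -n is an ordinary operand
```

The first call reports -n as an option and file as an operand; the second reports both -n and file as operands, because option parsing stops at --.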

Another possibility, which you already explored in your investigation, is to "protect" the filename by either stating it as an absolute filename ('/path/to/file/-|h4ker|-') or by prepending the current directory ('./-|h4ker|-'), because then the argument no longer starts with the "ambiguous" -. Note that quoting/escaping is still necessary in this example because the | is a shell metacharacter.

A nice demonstration is trying to list a file named -l:

~$ touch '-l'
touch: Invalid option -- l
~$ touch -- '-l'
~$ ls '-l'

< ... the entire directory content in long list format ... >

~$ ls -- '-l'
-l
~$ ls -l -- '-l'
-rw-r--r-- 1 user user 0 Feb 24 17:16 -l

[1] For shell scripts, there is an equivalent getopts builtin command.


There are two programs at work here.
One is the shell, the other is the cat program/executable/utility.

The shell treats single quotes as quoting syntax; cat doesn't.

The shell gives no special meaning to an initial dash (-) (in arguments), cat does.

The shell just changes the arguments passed to the executed program according to some rules:

$ echo '-|h4k3r|-'
-|h4k3r|-

$ echo "-|h4k3r|-"
-|h4k3r|-

$ echo -\|h4k3r\|-
-|h4k3r|-

The echo command receives the same argument -|h4k3r|- (after removing quotes) and prints the same result in all three cases above. The quoting of | removes its special meaning to the shell.

The echo command has only a few options, and -|h4k3r|- does not match any of them, so the initial dash (-) triggers no additional action and echo simply prints the argument.

But cat does have many options, as explained in its man page (man cat). Options usually start with a -, and all letters following the - are processed as options by the cat command (which options exist differs between OSs). For example, cat -n file numbers each line of the file.

$ cat file
Line 1
Line 2

$ cat -n file
    1 Line 1
    2 Line 2

But if we end options with -- and still add a -n, it will be interpreted as a file name by cat:

$ cat -- -n file
cat: -n: No such file or directory
Line 1
Line 2

It is also possible to give the file name as a path that starts with ./, ../ or / (the current directory, the parent directory and the root directory), so that the argument no longer starts with a -.
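Both ways of reaching a file that is literally named -n can be tried safely in a scratch directory. This is only a sketch; the directory comes from mktemp and the file content is made up for the demonstration:

```shell
# Work in a scratch directory so nothing in the real working directory is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create a file literally named "-n". The redirection target is handled
# by the shell, so no option parsing is involved here.
printf 'Line 1\n' > ./-n

cat ./-n      # the argument starts with ".", so cat treats it as a file name
cat -- -n     # "--" ends option parsing, so -n is a file name
```

Each of the two cat calls prints the content of the file instead of treating -n as an option.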
