Linux/Windows/Unix/... file names: Which characters are allowed? Which are unescaped?

Solution 1:

The only characters not allowed in a filename on *nix are NUL and /. On Windows, the Win32 API rejects NUL, the control characters 1 through 31, and \ / : * ? " < > |. Many apps restrict names further, also rejecting characters such as + and %.

The file system itself never requires any character in a filename to be escaped. Escaping is needed only to keep the shell from interpreting the character.
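As a rough illustration of shell-only escaping, here is a minimal Python sketch using the standard library's shlex.quote. The file name is made up for the example, and the quoting shown applies to POSIX shells only, not to cmd.exe or PowerShell.

```python
import shlex

# Hypothetical file name containing characters a POSIX shell would interpret
# (spaces, a redirection operator, and a glob character).
name = "my report > final*.txt"

# shlex.quote wraps the name so the shell passes it through as one
# literal argument.
quoted = shlex.quote(name)
command = f"cat {quoted}"
print(command)  # cat 'my report > final*.txt'
```

The name stored on disk is unchanged; only the command line handed to the shell carries the quotes.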

Solution 2:

There's a discussion of filename characters in the Wikipedia article on File Names.

You may find this essay informative: Fixing Unix/Linux/POSIX Filenames.

This article compares OS X and Windows XP: X vs. XP: Forbidden Characters in Filenames (PDF, see approx. pp. 64-66).

Things That Shouldn’t Be in File Names for $1,000 Alex

I don't know which characters must be un-escaped. Perhaps you mean "escaped" rather than "unescaped". In Linux, it's probably not a good idea to escape characters that would then take on a special meaning, such as "\n" (newline), "\t" (tab) and others, but that's generally not a problem in file operations. The characters that most commonly need escaping are the ones the shell will interpret, such as space, ">", "<", etc. See some of the articles I linked for a discussion of those.


Solution 3:

If you create a file on Windows with Explorer using one of the following characters, it will complain that the characters are not allowed:

\ / : * ? " < > |

A good reference is here:

Naming Files, Paths, and Namespaces
http://msdn.microsoft.com/en-us/library/aa365247%28VS.85%29.aspx

Microsoft further states:

"... on Windows-based desktop platforms, invalid path characters might include ASCII/Unicode characters 1 through 31, as well as quote ("), less than (<), greater than (>), pipe (|), backspace (\b), null (\0) and tab (\t)."

http://msdn.microsoft.com/en-us/library/system.io.path.getinvalidpathchars.aspx
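A minimal Python sketch of a check based on the two lists above might look like this. The function name invalid_windows_chars is made up for illustration, and the check is not complete: it looks only at individual characters and ignores other Windows rules such as reserved device names (CON, NUL, ...) and trailing dots or spaces.

```python
# Characters Explorer rejects, taken from the list above.
WINDOWS_FORBIDDEN = set('\\/:*?"<>|')


def invalid_windows_chars(name):
    """Return the characters in `name` that Windows would likely reject.

    Hypothetical helper: flags the nine forbidden punctuation characters
    plus NUL and the control characters 1 through 31.
    """
    return [ch for ch in name if ch in WINDOWS_FORBIDDEN or ord(ch) < 32]


print(invalid_windows_chars("report.txt"))   # []
print(invalid_windows_chars("a:b?.txt"))     # [':', '?']
```

An empty list means no forbidden character was found; it does not guarantee that Windows will accept the name.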


Solution 4:

On Linux and other POSIX-compatible systems, "/" is reserved because it is the directory separator, and "\0" (the NUL character) marks the end of the string. Everything else is allowed.
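That rule is short enough to sketch directly in Python. The function name is_valid_posix_component is hypothetical, and the sketch checks characters only: it does not cover length limits or the special names "." and "..".

```python
def is_valid_posix_component(name):
    """True if `name` is non-empty and contains neither '/' nor NUL.

    Hypothetical helper for a single path component, not a full path.
    """
    return bool(name) and "/" not in name and "\0" not in name


print(is_valid_posix_component("we*ird\nname?"))  # True
print(is_valid_posix_component("a/b"))            # False
```

Note that even a newline or "*" passes, which is why the shell, not the file system, is where such names usually cause trouble.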