Are there any sequences other than ../ which will be interpreted as directory traversal in *nix or Windows?

Using Unicode, it's possible to encode \ and / as multi-byte sequences. If the string comparison functions are not Unicode-aware, a bug could let these characters through.

Wikipedia has a section on this in relation to an old attack on Windows servers:

When Microsoft added Unicode support to their Web server, a new way of encoding ../ was introduced into their code, causing their attempts at directory traversal prevention to be circumvented.

Multiple percent encodings, such as

  • %c1%1c
  • %c0%af

translated into / or \ characters.

Technically this is still using the slash character for the directory traversal. It's just not the true single-byte character, which may confuse some code.

I believe the best advice for avoiding such characters would be to disallow any characters for file system paths except a safe subset of ASCII characters. You can also sidestep other issues of allowed characters in some operating systems and file systems at the same time.
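As a rough illustration of that allowlist idea, here is a minimal Python sketch. The exact character set, the 255-character limit, and the function name is_safe_filename are my own assumptions for the example, not a standard; adjust them to whatever your file system and application actually need.

```python
import re

# Hypothetical allowlist: ASCII letters, digits, underscore, hyphen and dot.
# Path separators, percent signs and all non-ASCII characters are excluded.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]{1,255}")


def is_safe_filename(name: str) -> bool:
    """Return True only if name is a single component of allowlisted characters."""
    # "." and ".." consist of allowed characters but have special meaning.
    if name in (".", ".."):
        return False
    # fullmatch requires the whole string to match, so a trailing newline fails.
    return _SAFE_NAME.fullmatch(name) is not None
```

Anything that fails the check is rejected outright rather than "cleaned up", which avoids having to reason about what the cleaned-up string might turn into.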


On Windows and Unix, no. There may be obscure operating systems that use different path separators.

To handle encoding securely there is a simple rule: fully decode before doing sanitisation. If you fail to do this, your sanitisation can be circumvented. Imagine an application that does open(urldecode(normalize(path))). If the path contains ../ then normalize will remove it. But if it contains %2e%2e%2f then normalize will do nothing, and urldecode will then convert it to ../. This error has led to a number of real-world vulnerabilities, including the well-known IIS Unicode bug.
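To make the ordering concrete, here is a small Python sketch that decodes first and only then normalises and checks the result. The base directory /srv/files and the function name resolve_request_path are made up for the example, and it assumes POSIX-style paths and a single layer of percent-encoding.

```python
import posixpath
from urllib.parse import unquote


def resolve_request_path(raw: str, base: str = "/srv/files") -> str:
    """Decode, then normalise, then check that the result stays under base."""
    decoded = unquote(raw)  # step 1: decode BEFORE any sanitisation
    if "\x00" in decoded or "\\" in decoded:
        raise ValueError("forbidden character in path")
    # step 2: normalise the fully decoded path relative to the base directory
    candidate = posixpath.normpath(posixpath.join(base, decoded.lstrip("/")))
    # step 3: reject anything that ended up outside the base directory
    if candidate != base and not candidate.startswith(base + "/"):
        raise ValueError("path escapes base directory")
    return candidate
```

With this ordering, %2e%2e%2fetc/passwd is decoded to ../etc/passwd before the check runs, so it is rejected instead of slipping past the normalisation step.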

Another issue that sometimes appears is nested sequences. Suppose your normalize function does path.replaceAll('../', ''). This can be circumvented by trying ....// - the inner ../ is removed, leaving ../. The solution is either to completely reject strings that contain forbidden sequences, or to apply the normalisation function repeatedly until the string stops changing.
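Here is a short Python sketch of the nested-sequence problem and the two fixes mentioned above. The function names are hypothetical, and Python's str.replace stands in for replaceAll.

```python
def naive_strip(path: str) -> str:
    """Single pass: removes each '../' once, which nested input can survive."""
    return path.replace("../", "")


def strip_until_stable(path: str) -> str:
    """Repeat the removal until a pass no longer changes the string."""
    previous = None
    while previous != path:
        previous = path
        path = path.replace("../", "")
    return path


def reject_traversal(path: str) -> str:
    """Alternative: refuse the input instead of trying to repair it."""
    if "../" in path:
        raise ValueError("path contains a forbidden sequence")
    return path


print(naive_strip("....//etc/passwd"))         # ../etc/passwd
print(strip_until_stable("....//etc/passwd"))  # etc/passwd
```

Of the two, rejecting is usually the easier one to reason about, since you never have to think about what the rewritten string looks like.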

There are other characters that can have surprising results in paths. A null byte is generally allowed within strings in high-level languages, but when the string is passed to the C library, the null byte is treated as a terminator. File names like evil.php%00.jpg can therefore bypass file extension checks: the check sees .jpg, while the file system sees evil.php. There was also the IIS semicolon bug.
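A minimal Python sketch of the null byte case follows. The extension allowlist and the name check_upload_name are assumptions for illustration. The point is that an extension check on its own is satisfied by evil.php\x00.jpg, so the null byte has to be rejected explicitly, after decoding.

```python
import os

# Hypothetical allowlist of extensions for an image upload feature.
ALLOWED_EXTENSIONS = {".jpg", ".png"}


def check_upload_name(name: str) -> str:
    """Reject names containing a null byte, then check the extension."""
    if "\x00" in name:
        raise ValueError("null byte in file name")
    extension = os.path.splitext(name)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed")
    return name


# The extension check alone would accept this name:
print(os.path.splitext("evil.php\x00.jpg")[1])  # .jpg
```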

In general, file names are not a good place for untrusted data. There is the potential for second-order attacks, where other processes that read the directory listing have vulnerabilities. There might be cross-site scripting in a web page that lists files; there have been Windows Explorer vulnerabilities that malicious file names can exploit; and attacking a Unix shell through escape sequences is a recent concern. Instead, I recommend storing the user-supplied file name in a database, and using the row's primary key as the file name on disk.
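A rough sketch of that last recommendation, using Python's sqlite3 module. The table name uploads, the column names, and the .bin suffix are all invented for the example. The untrusted name only ever lives in a database column, and the name used on disk is derived from the integer primary key.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        "id INTEGER PRIMARY KEY, original_name TEXT NOT NULL)"
    )


def register_upload(conn: sqlite3.Connection, original_name: str) -> str:
    """Store the untrusted name as data and return the on-disk name to use."""
    cursor = conn.execute(
        "INSERT INTO uploads (original_name) VALUES (?)", (original_name,)
    )
    conn.commit()
    # The on-disk name contains nothing supplied by the user.
    return f"{cursor.lastrowid}.bin"


def lookup_original_name(conn: sqlite3.Connection, upload_id: int) -> str:
    """Fetch the user-supplied name for display (escape it for the output context)."""
    row = conn.execute(
        "SELECT original_name FROM uploads WHERE id = ?", (upload_id,)
    ).fetchone()
    if row is None:
        raise KeyError(upload_id)
    return row[0]
```

The original name still has to be escaped wherever it is shown, for example in an HTML page, but it can no longer influence which path is opened.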