Directory traversal attack
A directory traversal (or path traversal) attack exploits insufficient validation or sanitization of user-supplied file names, such that characters representing "traverse to parent directory" are passed through to the file APIs.
The goal of this attack is to use an affected application to gain unauthorized access to the file system. This attack exploits a lack of security (the software is acting exactly as it is supposed to) as opposed to exploiting a bug in the code.
A typical example of a vulnerable application is a PHP script that passes a user-supplied file name to a file API such as include without validating it.
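An attack against this system could be to send an HTTP request in which the user-supplied file name uses repeated ../ sequences to climb out of the intended directory and reach /etc/passwd.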
This generates a server response such as:
<source lang="http">
HTTP/1.0 200 OK
Content-Type: text/html
Server: Apache

root:fi3sED95ibqR6:0:1:System Operator:/:/bin/ksh
daemon:*:1:1::/tmp:
phpguru:f8fk3j1OIf31.:182:100:Developer:/home/users/phpguru/:/bin/csh
</source>
However, on more recent Unix systems, the passwd file does not contain the hashed passwords. They are instead located in the shadow file, which cannot be read by unprivileged users on the machine. The passwd file is still useful for account enumeration, however, because it still lists the user accounts on the system.
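To make the mechanism concrete, here is a minimal Python sketch of the same kind of flaw. The template directory, the function name, and the file name red.php are hypothetical and chosen only for illustration; the point is that joining unvalidated input onto a base directory lets ../ components climb out of it.

```python
import posixpath

# Hypothetical base directory; an assumption for this sketch only.
TEMPLATE_DIR = "/srv/app/templates"


def naive_template_path(user_value: str) -> str:
    """Join user input onto the base directory with no validation at all."""
    joined = posixpath.join(TEMPLATE_DIR, user_value)
    # normpath collapses the ".." components the way path resolution would.
    return posixpath.normpath(joined)


benign = naive_template_path("red.php")
hostile = naive_template_path("../../../../../../etc/passwd")

print(benign)   # stays inside the template directory
print(hostile)  # resolves to a file outside it
```

Extra ../ components beyond the root are simply ignored, which is why attackers can over-supply them without knowing how deep the base directory is.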
Variations of directory traversal
Listed below are some known directory traversal attack strings:
Directory traversal on Unix
Common Unix-like directory traversal uses the ../ character sequence.
Sudo, a privilege management program ubiquitous in Unix, is vulnerable to this attack when users use the glob wildcard (e.g. chown /opt/myapp/myconfig/* could be exploited with the command sudo chown baduser /opt/myapp/myconfig/../../../etc/passwd).
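The wildcard problem can be illustrated with Python's fnmatch module. It is used here only as a stand-in for shell-style pattern matching and is not sudo's own matcher. In this style of matching, * matches any run of characters, including / and .., so the traversal path satisfies the pattern while referring to a file elsewhere.

```python
import fnmatch
import posixpath

# Pattern and argument taken from the example in the text.
PATTERN = "/opt/myapp/myconfig/*"
ARGUMENT = "/opt/myapp/myconfig/../../../etc/passwd"

# "*" is not limited to a single path component in this kind of match.
matches_rule = fnmatch.fnmatchcase(ARGUMENT, PATTERN)

# What the argument actually refers to once ".." components are resolved.
resolved = posixpath.normpath(ARGUMENT)

print(matches_rule)
print(resolved)
```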
Directory traversal on Microsoft Windows
Each partition has a separate root directory (labeled C:\ for a particular partition C) and there is no common root directory above that. This means that for most directory vulnerabilities on Windows, the attack is limited to a single partition.
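A small sketch of the single-partition limit, using Python's ntpath module, which applies Windows path rules on any platform. The web root below is a hypothetical value. However many parent-directory components are supplied, normalization stops at the root of the drive.

```python
import ntpath

# Hypothetical web root; an assumption for this sketch only.
WEB_ROOT = r"C:\inetpub\wwwroot"

# Backslash traversal out of the web root.
escaped = ntpath.normpath(ntpath.join(WEB_ROOT, r"..\..\Windows\win.ini"))

# Forward slashes are accepted as separators too and normalize the same way.
escaped_fwd = ntpath.normpath(ntpath.join(WEB_ROOT, "../../Windows/win.ini"))

# Over-supplying ".." cannot climb above the drive's root directory.
clamped = ntpath.normpath(
    ntpath.join(WEB_ROOT, r"..\..\..\..\..\Windows\win.ini")
)
drive, _rest = ntpath.splitdrive(clamped)

print(escaped, escaped_fwd, clamped, drive)
```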
URI encoded directory traversal
Some web applications scan the query string for dangerous character sequences such as ../ and ..\ to prevent directory traversal. However, the query string is usually URI decoded before use. Therefore, these applications are vulnerable to percent-encoded directory traversal such as:
- %2e%2e%2f, which translates to ../
- %2e%2e/, which translates to ../
- ..%2f, which translates to ../
- %2e%2e%5c, which translates to ..\
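The ordering problem can be sketched in Python. The blocklist and both function names are hypothetical. The first filter inspects the raw, still-encoded value and lets every string above through. The second decodes first, so it checks what the file API would actually receive.

```python
from urllib.parse import unquote

# Hypothetical blocklist of traversal sequences.
BLOCKLIST = ("../", "..\\")


def naive_filter_allows(raw_value: str) -> bool:
    """Flawed: scans the raw value before percent-decoding."""
    return not any(bad in raw_value for bad in BLOCKLIST)


def decoded_filter_allows(raw_value: str) -> bool:
    """Decodes first, then scans the value that will actually be used."""
    decoded_value = unquote(raw_value)
    return not any(bad in decoded_value for bad in BLOCKLIST)


samples = ["%2e%2e%2f", "%2e%2e/", "..%2f", "%2e%2e%5c"]
decoded = [unquote(s) for s in samples]
naive_results = [naive_filter_allows(s) for s in samples]
decoded_results = [decoded_filter_allows(s) for s in samples]

print(decoded)
print(naive_results)
print(decoded_results)
```

Decoding before checking closes this particular gap, but a blocklist can still be sidestepped if a later layer decodes again. Resolving the final path and comparing it against an allowed root is generally the more robust check.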
Unicode / UTF-8 encoded directory traversal
When Microsoft added Unicode support to their Web server, a new way of encoding ../ was introduced into their code, causing their attempts at directory traversal prevention to be circumvented.
Multiple percent encodings, such as %c0%af, translated into the / character.
Percent encodings were decoded into the corresponding 8-bit characters by the Microsoft web server. This had historically been correct behavior, as Windows and DOS traditionally used canonical 8-bit character sets based upon ASCII.
However, the original UTF-8 was not canonical, and several distinct byte sequences could now be decoded to the same string. Microsoft performed the anti-traversal checks without UTF-8 canonicalization, and therefore did not notice that (hex) C0AF and (hex) 2F were the same character when doing string comparisons. Malformed percent encodings, such as %c0%9v, were also used.
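The C0AF case can be reproduced with a deliberately lenient decoder. The function below is a hypothetical illustration of non-canonical decoding, not Microsoft's code: it applies the two-byte UTF-8 bit layout without rejecting overlong forms. Python's built-in strict decoder is shown for contrast.

```python
def lenient_decode_2byte(data: bytes) -> str:
    """Decode a two-byte UTF-8 sequence WITHOUT rejecting overlong forms.

    Illustrative only: a conforming decoder must refuse such input.
    """
    if len(data) != 2 or data[0] & 0xE0 != 0xC0 or data[1] & 0xC0 != 0x80:
        raise ValueError("not a two-byte UTF-8 sequence")
    # 110xxxxx 10yyyyyy -> code point xxxxxyyyyyy
    code_point = ((data[0] & 0x1F) << 6) | (data[1] & 0x3F)
    return chr(code_point)


overlong = bytes([0xC0, 0xAF])
lenient_result = lenient_decode_2byte(overlong)

# A strict decoder treats the same bytes as invalid.
try:
    overlong.decode("utf-8")
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True

print(repr(lenient_result), strict_rejects)
```

A check that compares raw bytes against 2F never sees a slash in C0 AF, yet a lenient decoder later produces one. This is why validation has to happen after canonicalization.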
Zip/archive traversal attacks
The use of archive formats like zip allows for directory traversal attacks: files in the archive can be written such that they overwrite files on the filesystem by backtracking. Code that uncompresses archive files can be written to check that the paths of the files in the archive do not engage in path traversal.
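One way such a check might look in Python is sketched below. The function name, the destination directory, and the archive contents are all hypothetical. Each member name is resolved against the destination, and any entry whose resolved path falls outside it is flagged before anything is extracted.

```python
import io
import os
import zipfile


def unsafe_members(archive: zipfile.ZipFile, dest_dir: str) -> list:
    """Return member names that would resolve outside dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    flagged = []
    for name in archive.namelist():
        target = os.path.realpath(os.path.join(dest_root, name))
        # Compare whole path components rather than raw string prefixes.
        if os.path.commonpath([dest_root, target]) != dest_root:
            flagged.append(name)
    return flagged


# Build a small in-memory archive with one benign and one hostile entry.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("docs/readme.txt", "ok")
    zf.writestr("../../etc/evil.conf", "overwrite attempt")

with zipfile.ZipFile(buffer) as zf:
    flagged = unsafe_members(zf, "/tmp/extract-here")

print(flagged)
```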
Possible methods to prevent directory traversal
A possible algorithm for preventing directory traversal would be to:
- Process URI requests that do not result in a file request, e.g., executing a hook into user code, before continuing below.
- When a URI request for a file/directory is to be made, build a full path to the file/directory if it exists, and normalize all characters (e.g., %20 converted to spaces).
- It is assumed that a fully qualified, normalized 'Document Root' path is known, and that this string has a length N. Assume that no files outside this directory can be served.
- Ensure that the first N characters of the fully qualified path to the requested file are exactly the same as the 'Document Root'.
- If so, allow the file to be returned.
- If not, return an error, since the request is clearly out of bounds from what the web-server should be allowed to serve.
- Using a hard-coded predefined file extension to suffix the path does not limit the scope of the attack to files of that file extension.
<?php include($_GET['file'] . '.html');
The user can use the NULL character (indicating the end of the string) in order to bypass everything after the $_GET value. (This is PHP-specific.)
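The document-root check described in the steps above can be sketched in Python as follows. The document root and the function name are hypothetical. This sketch departs from the text in one respect: it compares whole path components with os.path.commonpath instead of the first N characters, since a raw string prefix would also accept a sibling directory whose name merely starts with the document root. It also rejects NUL bytes outright.

```python
import os
from urllib.parse import unquote

# Hypothetical document root; an assumption for this sketch only.
DOCUMENT_ROOT = os.path.realpath("/var/www/html")


def resolve_request(uri_path: str) -> str:
    """Map a URI path to a file path, or raise if it leaves DOCUMENT_ROOT."""
    decoded = unquote(uri_path)  # normalize characters, e.g. %20 -> space
    if "\x00" in decoded:
        raise PermissionError("NUL byte in requested path")
    # Build the fully qualified, normalized path to the requested file.
    candidate = os.path.realpath(
        os.path.join(DOCUMENT_ROOT, decoded.lstrip("/"))
    )
    # The request is in bounds only if the document root is an ancestor.
    if os.path.commonpath([DOCUMENT_ROOT, candidate]) != DOCUMENT_ROOT:
        raise PermissionError("request is outside the document root")
    return candidate


print(resolve_request("/my%20file.txt"))
```

A request such as /%2e%2e/%2e%2e/etc/passwd is decoded, resolved to a path outside the root, and refused with an error rather than served.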