An escape sequence is a series of characters used to change the state of computers and their attached peripheral devices, rather than to be displayed or printed as regular data bytes would be. Escape sequences are also known as control sequences, reflecting their use in device control. Originally they began with the escape character, ASCII code 27 (decimal), often written "Esc" on keycaps. With the introduction of ANSI terminals, most escape sequences began either with the two characters "ESC" and "[" or with a specially allocated single-character Control Sequence Introducer (CSI), code 155 (decimal). Not all control sequences used an escape character (for example, modem control sequences and Data General terminal control sequences), but they were often still called escape sequences. In the very common modern practice of "escaping" special characters in programming languages and command-line parameters, the backslash character often begins the sequence.
Escape sequences in communications are commonly used when a computer and a peripheral have only a single channel through which to send information back and forth (so escape sequences are an example of in-band signaling). They were common when most dumb terminals used ASCII with 7 data bits for communication, and were sometimes used to switch to a different character set for "foreign" or graphics characters that would otherwise have been restricted by the 128 codes available in 7 data bits. Even relatively "dumb" terminals responded to some escape sequences: the original mechanical Teletype printers (on which "glass Teletypes" or VDUs were based) responded to characters 27 and 31 to alternate between letters and figures modes.
An escape character is usually assigned to the Esc key on a computer keyboard, and can be sent in other ways than as part of an escape sequence. For example, the Esc key may be used as an input character in editors such as vi, or for backing up one level in a menu in some applications. The Hewlett Packard HP 2640 terminals had a key for a "display functions" mode which would display graphics for all control characters, including Esc, to aid in debugging applications.
If the Esc key and other keys that send escape sequences are both supposed to be meaningful to an application, an ambiguity arises if a character terminal is in use. When the application receives the ASCII escape character, it is not clear whether that character is the result of the user pressing the Esc key or whether it is the initial character of an escape sequence (e.g., resulting from an arrow key press). The traditional method of resolving the ambiguity is to observe whether or not another character quickly follows the escape character. If not, the escape character is assumed not to be part of an escape sequence. This heuristic can fail under some circumstances, especially on slow communication links, where the characters of a genuine escape sequence may arrive far apart.
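The timing heuristic can be sketched in Python as follows. This is only an illustration: the function name `classify_input`, the 50 ms timeout, and the representation of input as (arrival time, character) pairs are assumptions made for the example, not part of any particular terminal library.

```python
ESC = "\x1b"  # ASCII escape character, code 27

def classify_input(events, timeout=0.05):
    """Split timed input into plain characters, Esc key presses and sequences.

    `events` is a list of (arrival_time_in_seconds, character) pairs.
    The 50 ms default timeout is an arbitrary value chosen for this sketch.
    """
    tokens = []
    i = 0
    while i < len(events):
        when, ch = events[i]
        i += 1
        if ch != ESC:
            tokens.append(("char", ch))
            continue
        # Does another character follow the escape character quickly?
        follows_quickly = i < len(events) and events[i][0] - when <= timeout
        if not follows_quickly:
            tokens.append(("key", "Esc"))
            continue
        seq = ESC + events[i][1]
        i += 1
        if seq[-1] == "[":
            # ESC [ introduces a longer sequence: read up to its final byte.
            while i < len(events):
                seq += events[i][1]
                i += 1
                if "@" <= seq[-1] <= "~":
                    break
        tokens.append(("sequence", seq))
    return tokens
```

Given an arrow-key sequence arriving within a few milliseconds, followed a second later by a lone escape character, the function reports one sequence and one Esc key press. On a slow link the same arrow key could be misread as an Esc key press followed by ordinary characters.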
The Hayes command set, for instance, defines a single escape sequence, +++. (So that a +++ occurring as part of the data is not mistaken for the escape sequence, the sender stops communication for one second before and after the +++.) When the modem encounters this sequence in a stream of data, it switches from its normal mode of operation, which simply sends any characters to the phone, to a command mode in which the following data is assumed to be part of the command language. The modem can be switched back to online mode by sending the O command.
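A simplified sketch of the guard-time rule, in Python, is shown below. The function name `hayes_escape_detected` and the timed-event input format are hypothetical, and the sketch treats the start of the stream as silence. It illustrates the idea rather than any real modem's firmware.

```python
def hayes_escape_detected(events, guard=1.0, end_time=None):
    """Return True if "+++" occurs with `guard` seconds of silence on both sides.

    `events` is a list of (arrival_time_in_seconds, character) pairs.
    `end_time` is the moment observation stops, used to judge the trailing
    silence when "+++" is the last thing received.
    """
    for i in range(len(events) - 2):
        if "".join(ch for _, ch in events[i:i + 3]) != "+++":
            continue
        start = events[i][0]
        stop = events[i + 2][0]
        # Assumption: the very start of the stream counts as silence.
        quiet_before = i == 0 or start - events[i - 1][0] >= guard
        if i + 3 < len(events):
            quiet_after = events[i + 3][0] - stop >= guard
        else:
            quiet_after = end_time is not None and end_time - stop >= guard
        if quiet_before and quiet_after:
            return True
    return False
```

A "+++" embedded in continuously flowing data fails both silence checks and is passed through as ordinary data.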
The Hayes command set is modal, switching between command mode and online mode. This is not appropriate where commands and data must alternate rapidly. An example of a non-modal escape sequence control language is that of the VT100, which used a series of commands prefixed by a Control Sequence Introducer.
Comparison with control characters
A control character is a character that, in isolation, has some control function, such as carriage return (CR). Escape sequences, by contrast, consist of one or more escape characters which change the interpretation of subsequent characters.
ASCII video data terminals
The VT52 terminal used simple digraph commands like escape-A: in isolation, "A" simply meant the letter "A", but as part of the escape sequence "escape-A", it had a different meaning. The VT52 also supported parameters: it was not a straightforward control language encoded as substitution.
The later VT100 terminal implemented the more sophisticated ANSI escape sequences standard (now ECMA-48) for functions such as controlling cursor movement, character set, and display enhancements. The Hewlett Packard HP 2640 series had perhaps the most elaborate escape sequences for block and character modes, programming keys and their soft labels, graphics vectors, and even saving data to tape or disk files.
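As a brief illustration of ANSI (ECMA-48) sequences for cursor movement and display enhancements, the following Python sketch builds two common ones. The helper names `cursor_to` and `sgr` are invented for this example. The sequences themselves (CSI row;col H for cursor position, CSI codes m for graphic rendition) are standard.

```python
ESC = "\x1b"        # ASCII escape character, code 27
CSI = ESC + "["     # two-character Control Sequence Introducer

def cursor_to(row, col):
    """Build the sequence that moves the cursor to a 1-based row and column."""
    return f"{CSI}{row};{col}H"

def sgr(*codes):
    """Build a Select Graphic Rendition sequence, e.g. 1 = bold, 31 = red, 0 = reset."""
    return CSI + ";".join(str(code) for code in codes) + "m"

if __name__ == "__main__":
    # On a terminal that honours ANSI sequences this prints bold red text
    # at row 5, column 10, then resets the attributes.
    print(cursor_to(5, 10) + sgr(1, 31) + "warning" + sgr(0))
```

A terminal that does not interpret these sequences will display the raw characters instead.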
Use in DOS and Windows
A device driver, ANSI.SYS, can be used to enable interpretation of the ANSI (ECMA-48) terminal escape sequences under DOS (where, for example, $e in the PROMPT command emits the escape character) or in command windows in 16-bit Windows. The rise of GUI applications, which write directly to display cards, has greatly reduced the use of escape sequences on Microsoft platforms, but they can still be used to create interactive random-access character-based screen interfaces with character-based library routines such as printf, without resorting to a GUI program.
Use in Linux and Unix displays
The default text terminal and terminal windows (such as xterm) respond to ANSI escape sequences.
A common use of escape sequences is in fact to remove control characters found in a binary data stream so that they will not cause their control function by mistake. In this case, the control character is replaced by a defined "escape character" (which need not be the US-ASCII escape character) and one or more other characters; after exiting the context where the control character would have caused an action, the sequence is recognized and replaced by the removed character. To transmit the "escape character" itself, two copies are sent.
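The scheme can be sketched in Python as below. The specific choices here are assumptions made for illustration: the escape byte 0x7D, the treatment of bytes 0x00 to 0x1F as the control characters to remove, and the offset of 0x40 used to turn them into printable bytes. Real protocols choose their own values.

```python
ESCAPE = 0x7D                 # hypothetical escape byte ('}')
CONTROL = range(0x00, 0x20)   # bytes treated as control characters in this sketch
OFFSET = 0x40                 # maps a control byte to a printable stand-in

def stuff(data: bytes) -> bytes:
    """Replace control bytes with ESCAPE + stand-in; double the escape byte itself."""
    out = bytearray()
    for b in data:
        if b == ESCAPE:
            out += bytes([ESCAPE, ESCAPE])
        elif b in CONTROL:
            out += bytes([ESCAPE, b + OFFSET])
        else:
            out.append(b)
    return bytes(out)

def unstuff(data: bytes) -> bytes:
    """Reverse stuff(): recognise each escaped pair and restore the removed byte."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b != ESCAPE:
            out.append(b)
            continue
        nxt = next(it, None)
        if nxt is None:
            raise ValueError("input ends in the middle of an escape sequence")
        out.append(ESCAPE if nxt == ESCAPE else nxt - OFFSET)
    return bytes(out)
```

Because the stand-in bytes fall in the range 0x40 to 0x5F, they can never be confused with the doubled escape byte, so decoding is unambiguous.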
In many programming languages and command-line interfaces, escape sequences are used in character literals and string literals to express characters which are not printable or which clash with the syntax of characters or strings. For example, the editor program might not allow control characters themselves to be placed in the program code, or they may have undesirable side-effects if typed into a command. The end-of-quote character is also a problem for programmers that can be solved by escaping it. In most contexts the escape character is the backslash ("\").
For example, the single quotation mark character might be expressed as '\'' since writing ''' is not acceptable.
Many modern programming languages specify the doublequote character (") as a delimiter for a string literal. The backslash escape character typically provides ways to include doublequotes inside a string literal, such as by modifying the meaning of the doublequote character embedded in the string (\"), or by modifying the meaning of a sequence of characters including the hexadecimal value of a doublequote character (\x22). Both sequences encode a literal doublequote ("). For example, the statement
print "Nancy said "Hello World!" to the crowd.";
produces a syntax error, whereas:
print "Nancy said \"Hello World!\" to the crowd."; ### example of \"
produces the intended output. Another alternative:
print "Nancy said \x22Hello World!\x22 to the crowd."; ### example of \x22
uses "\x" to indicate the following two characters are hexadecimal digits, "22" being the ASCII value for a doublequote in hexadecimal.
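The equivalence of the two styles can be checked directly in a language that supports both. The short Python sketch below is an illustration only, and the variable names are arbitrary.

```python
# The same sentence written with each escape style.
with_backslash = "Nancy said \"Hello World!\" to the crowd."
with_hex = "Nancy said \x22Hello World!\x22 to the crowd."

# Both literals denote the same string, containing two doublequote characters.
same_string = with_backslash == with_hex
quote_count = with_hex.count('"')

# 0x22 is the ASCII code of the doublequote character.
quote_code = ord('"')
```

Here `same_string` is true, `quote_count` is 2, and `quote_code` equals 0x22 (34 in decimal).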
C, C++, and Ruby allow both of these backslash escape styles. Java allows \" but has no \x escape, offering octal escapes such as \42 instead. The PostScript language and Microsoft Rich Text Format also use backslash escapes. The quoted-printable encoding uses the equals sign as an escape character.
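The role of the equals sign in quoted-printable can be seen with the `quopri` module from the Python standard library. In this small sketch the sample data is arbitrary.

```python
import quopri

original = b"1+1=2"

# A literal equals sign must itself be escaped, as "=" followed by its
# hexadecimal code 3D, because "=" introduces every escape sequence.
encoded = quopri.encodestring(original)

# Decoding recognises the escape and restores the original byte.
decoded = quopri.decodestring(encoded)
```

After this runs, `encoded` contains the three characters `=3D` in place of the single `=`, and `decoded` is equal to `original`.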
Another similar (and partially overlapping) syntactic trick is stropping.
Some programming languages also provide other ways to represent special characters in literals, without requiring an escape character (see e.g. delimiter collision).