Chomsky–Schützenberger representation theorem

In formal language theory, the Chomsky–Schützenberger representation theorem is a theorem derived by Noam Chomsky and Marcel-Paul Schützenberger about representing a given context-free language in terms of two simpler languages. These two simpler languages, namely a regular language and a Dyck language, are combined by means of an intersection and a homomorphism.

A few notions from formal language theory are in order. A context-free language is regular, if it can be described by a regular expression, or, equivalently, if it is accepted by a finite automaton. A homomorphism is based on a function $h$ which maps symbols from an alphabet $\Gamma$ to words over another alphabet $\Sigma$ ; If the domain of this function is extended to words over $\Gamma$ in the natural way, by letting $h(xy)=h(x)h(y)$ for all words $x$ and $y$ , this yields a homomorphism $h:\Gamma ^{*}\to \Sigma ^{*}$ . A matched alphabet $T\cup {\overline {T}}$ is an alphabet with two equal-sized sets; it is convenient to think of it as a set of parentheses types, where $T$ contains the opening parenthesis symbols, whereas the symbols in ${\overline {T}}$ contains the closing parenthesis symbols. For a matched alphabet $T\cup {\overline {T}}$ , the Dyck language $D_{T}$ is given by