Cone (formal languages)

From Wikipedia, the free encyclopedia

This article may require cleanup to meet Wikipedia's quality standards.
Please improve this article if you can. (October 2007)

In formal language theory, a cone is a set of formal languages that has some desirable closure properties enjoyed by some well-known sets of languages, in particular by the families of regular languages, context-free languages and the recursive languages^[1]. The concept of a cone is a more abstract notion that subsumes all of these families.

More precisely, a cone is a non-empty family $\mathcal{S}$ of languages such that, for any $L \in \mathcal{S}$ over some alphabet $Σ$ ,

if $h$ is an homomorphism from $\Sigma^\ast$ to some $\Delta^\ast$ , the language $h (L)$ is in $\mathcal{S}$ ;
if $h$ is an homomorphism from some $\Delta^\ast$ to $\Sigma^\ast$ , the language $h - 1 (L)$ is in $\mathcal{S}$ ;
if $R$ is any regular language over $Σ$ , then $L\cap R$ is in $\mathcal{S}$ .

The family of all regular languages is contained in any cone.

If one restricts the definition to homomorphisms that do not introduce the empty word $λ$ then one speaks of a faithful cone; the inverse homomorphisms are not restricted. Within the Chomsky hierarchy, the regular languages, the context-free languages, and the recursively enumerable languages are all cones, whereas the context sensitive languages and the recursive languages are only faithful cones.

The terminology cone has a French origin. In the American oriented literature one usually speaks of a full trio. The trio corresponds to the faithful cone.

[edit] Relation to Transducers

A finite state transducer is a finite state automaton that has both input and output. It defines a transduction $T$ , mapping a language $K$ over the input alphabet into another language $T (K)$ over the output alphabet. Each of the cone operations (homomorphism, inverse homomorphism, intersection with a regular language) can be implemented using a finite state transducer. And, since finite state transducers are closed under composition, every sequence of cone operations can be performed by a finite state transducer.

Conversely, every finite state transduction $T$ can be decomposed into cone operations. In fact, there exists a normal form for this decomposition $T(K) = g(h^{-1}(K \cap R))$ , where $g, h$ are homomorphisms, and $R$ is a regular language determined by $T$ .

All together this means that a family of languages is a cone iff it is closed under finite state transductions. This is a very powerful set of operations. For instance one easily writes a (nondeterministic) finite state transducer with alphabet ${a, b}$ that removes every second $b$ in words of even length (and does not change words otherwise). Since the context-free languages form a cone, they are closed under this exotic operation.