HFST: Definition of other and its relation to any

This gives one consistent view of the other and any symbols when used in transitions. In FSAs, the other symbol refers to any other input which is not explicitly mentioned as a label in that automaton. In FSTs there may be either pairs with other on the input or on the output side, or on both. Note that the any symbol of regular expressions of symbols (or symbol pairs) represents a different concept referring to known symbols (or symbol pairs).

Definition of other in FSA:
Let E be the set of all possible symbols and Σ(A) the set of explicitly known or listed symbols in automaton A. When used in A, the other symbol represents U(A) = { x ∊ E | x ∉ Σ(A) }

Definition of other in FST:
1. Let I(P) be the set of all possible input symbols and T a transducer. The input language of T is I(T) and the set of explicitly known or listed input symbols in T is Σ(I(T)). When used on the input side of T, the other symbol represents U(I(T)) = { x ∊ I(P) | x ∉ Σ(I(T)) ⋀ x ≠ other }
2. Let O(P) be the set of all possible output symbols and and T a transducer. The output language of T is O(T) and the set of explicitly known or listed output symbols in T is Σ(O(T)). When used on the output side of T, the other symbol represents U(O(T)) = { x ∊ O(P) | x ∉ Σ(O(T)) ⋀ x ≠ other }
3. Let P be the set of all possible input and output symbol pairs and Π(T) the set of explicitly known or listed symbol pairs in transducer T. When used in T, the other:other symbol pair represents U(T) = { x:y ∊ P | x:y ∉ Π(T) ⋀ x ≠ other ⋀ y ≠ other ⋀ x ≠ y }
4. For convenience, we denote x:x as x.

Corollary:
When used to indicate transitions:
1. other represents { x:x ∊ P | x ∊ U(I(T)) ⋀ x ∊ U(O(T)) }
2. other:a represents { x:a ∊ P | x ∊ U(I(T)) ⋀ a ∊ Σ(O(T)) ⋀ (x ≠ a ⋁ a ≠ other) }
3. a:other represents { a:x ∊ P | x ∊ U(O(T)) ⋀ a ∊ Σ(I(T)) ⋀ (x ≠ a ⋁ a ≠ other) }
4. other:other represents { x:y ∊ P | x:y ∊ U(T) }

NOTE1.:
We keep the semantics of the other symbol clean (and distinct from the the convenience introduced by the any symbol for regular expressions):
1. other ∩ other:a = ⊘
2. other ∩ a:other = ⊘
3. other ∩ other:other = ⊘
4. other:a ∩ other:other = ⊘
5. a:other ∩ other:other = ⊘

NOTE2.:
As the other symbol in a transducer also covers unknown pairs of identical symbols, it can be used for indicating that we wish to copy unknown input symbols to the output tape.

NOTE3.:
The other:other symbol pair representing two different and unknown symbols of the cross-product in E:E should not be confused with the convenience of the any:any symbol pair for a regular language referring to all pairs.

NOTE4.:
Even if the input and the output alphabet are disjoint, the other symbol can be used for indicating that unknown symbols on the input tape are copied to the output tape.

NOTE5.:
When speaking of an FSA or an FST, it is convenient to refer to all the symbols known to the machine with the any-known symbol <∙>. This meta symbol is distinct from the any symbol of regular expressions. When referring to the unknown symbols in an FSA, we can use the meta symbol <=>. For an FST, we will take this to mean an unknown symbol that is identical to the unknown symbol on the opposite tape. For FST, we also introduce <≠> meaning an unknown symbol that is different from the unknown symbol on the opposite tape.

One can give an overview of the relation between the symbol domains, in FST, of the any-known symbols (<∙>) and the other symbols (<=>, <≠>) in the following table:

Output
any-known other
<∙> <=> <≠>
Input any-known <∙> <∙>:<∙> N/A <∙>:<≠>
other <=> N/A <=>:<=> N/A
<≠> <≠>:<∙> N/A <≠>:<≠>

In the notes and definitions, above, of FSA, we have used the following:
x has the domain <∙> (i.e. any-known),
other has the domain <=>,

In the notes and definitions, above, of FST, we have used the following:
x, a, x:x and x:y have the domain <∙>:<∙>,
other has the domain <=>:<=> (aka reflexive-other),
x:other has the domain <∙>:<≠>,
other:x has the domain <≠>:<∙>, and
other:other has the domain <≠>:<≠> (aka irreflexive-other).

As a symbol becomes known to an automaton, it is included in the any-known domain and no longer part of the unknown domain of the other symbols. One can say that the any-known domain increases monotonically as more symbols become known to an automaton whereas the other domain decreases monotonically.


-- KristerLinden - 27 Sep 2008
Topic revision: r3 - 2008-09-29 - KristerLinden
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2018 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback