Difference: HfstTerminology (2 vs. 3)

Revision 3 - 2008-12-10 - KristerLinden

Line: 1 to 1
 
META TOPICPARENT name="HfstHome"

HFST: Terminology

Line: 6 to 6
  Below we list some of the key concepts used in morphological descriptions. Some of them are used differently in SFST, which is one of the underlying libraries of HFST. When creating finite-state morphologies, we draw on ideas from several different domains, and the concepts we import from these domains bring some baggage with them. We need to be aware of this baggage in order to speak with all of the communities we have borrowed from.
Changed:
<
<

Generative Phonology

>
>

Generative Phonology

 
Changed:
<
<
In Generative Phonology, the idea is that we have un underlying deep level, from which all forms on the surface level are generated. The deep level may never have been seen. It only manifests itself as surface forms. In two-level morphology, the deep level is represented by what Koskenniemi (1983) called the lexical level.
>
>
In Generative Phonology, the idea is that we have an underlying deep level, from which all forms on the surface level are generated. The deep level may never have been seen in practice. It only manifests itself as surface forms.
 
Changed:
<
<

Lexicography

>
>
Note. In two-level morphology, the deep level is represented by what Koskenniemi (1983) named the lexical level. He chose this term because the lexical level is not as abstract as some proponents of generative phonology might want the deep level to be.
 
Changed:
<
<
In Lexicography, the lemma (lexicon form, dictionary form, look-up form) represents all the forms of a word within a paradigm. The word forms in a paradigm are called inflected forms. The lemma is often chosen to be one of the forms in the inflectional paradigm of a word, in which case the lemma is also called the base form. In some languages, the lemma is a root, which different from any of the inflected forms.
>
>

Graphically

 
Changed:
<
<
Note. The lexicon form (in Lexicography) is different from the lexical level (in Two-level Morphology).
>
>
Conventionally, Generative Phonology has written the deep level on top and the surface level at the bottom, which today might be perceived as somewhat counterintuitive, but at the time was probably dictated by typewriter conventions and the lack of modern tools for visual graphics. As most cultures fill the writing surface from the top, the order of the levels can also be seen to reflect the dogma that the hypothetical deep level is primary and the surface level is always generated from the deep level.

| deep level | kauppa+Nom (grocery store) | kauppa+Gen (of the grocery store) |
| lexical level | kaupPA 0 | kaupPA n |
| surface level | kauppa | kaupan |
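
The correspondence between the lexical level and the surface level can be sketched as a sequence of symbol pairs. The following Python fragment is only an illustration and not HFST code: the particular alignment, the use of `0` for the empty symbol, and the names `GENITIVE_PAIRS`, `NOMINATIVE_PAIRS` and `project` are assumptions made for this example.

```python
# A minimal sketch, assuming a hand-written alignment between the
# lexical level and the surface level of the example word above.
# Each pair is (lexical symbol, surface symbol); "0" is the empty symbol.
EPSILON = "0"

# Assumed alignment for the genitive: lexical "kaupPA n" vs. surface "kaupan".
GENITIVE_PAIRS = [
    ("k", "k"), ("a", "a"), ("u", "u"), ("p", "p"),
    ("P", EPSILON),  # assumption: P has no surface realization here
    ("A", "a"),
    ("n", "n"),
]

# Assumed alignment for the nominative: lexical "kaupPA 0" vs. surface "kauppa".
# The zero ending contributes no symbols, so it is simply left out.
NOMINATIVE_PAIRS = [
    ("k", "k"), ("a", "a"), ("u", "u"), ("p", "p"),
    ("P", "p"),
    ("A", "a"),
]


def project(pairs, side):
    """Read one level off a pair sequence: side 0 = lexical, side 1 = surface."""
    return "".join(pair[side] for pair in pairs if pair[side] != EPSILON)


if __name__ == "__main__":
    for pairs in (NOMINATIVE_PAIRS, GENITIVE_PAIRS):
        print(project(pairs, 0), "->", project(pairs, 1))
```

Reading only the first members of the pairs gives the lexical level, and reading only the second members gives the surface level, which is why a single pair sequence can stand for both rows of the table.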

Lexicography

In Lexicography, the lemma (lexicon form, dictionary form or look-up form) represents all the forms of a word within the paradigm of a lexeme. The word forms in a paradigm are called inflected forms. The lemma is often chosen to be one of the forms in the inflectional paradigm of a word, in which case the lemma is also called the base form. In some languages, the lemma is a root, which is different from any of the inflected forms.

Note 1. In order to refer to the position of an inflected form in a paradigm, it is often useful to give the base form or the root with the corresponding morphological features, e.g. geese is goose+Plural+Nom. We refer to this base form with features as a grammatical word.

Note 2. The lexicon form (in Lexicography) is different from the lexical level (in Two-level Morphology).

Graphically

| grammatical words | kauppa+Nom (grocery store) | kauppa+Gen (of the grocery store) | ... |
| lexeme | kauppa (base form) | kaupan (inflected form) | ... |

Note. The base form and the inflected forms together constitute the full paradigm of the word forms of the lexeme.
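
As a rough illustration of these terms, a paradigm can be modelled as a table from grammatical words to word forms. The sketch below is hypothetical: the `+` separator follows the notation used above, while the names `PARADIGM`, `split_grammatical_word` and `word_forms` are invented for the example, and only the two forms listed above are included.

```python
# A minimal sketch, assuming a paradigm is stored as a mapping from
# grammatical words (base form + features) to word forms.
PARADIGM = {
    "kauppa+Nom": "kauppa",  # the base form
    "kauppa+Gen": "kaupan",  # an inflected form
}


def split_grammatical_word(grammatical_word):
    """Split 'kauppa+Gen' into the lemma 'kauppa' and the features ['Gen']."""
    lemma, *features = grammatical_word.split("+")
    return lemma, features


def word_forms(paradigm):
    """Return the distinct word forms of the paradigm in sorted order."""
    return sorted(set(paradigm.values()))


if __name__ == "__main__":
    print(split_grammatical_word("kauppa+Gen"))
    print(word_forms(PARADIGM))
```

The keys play the role of grammatical words, and the set of values is the paradigm of word forms of the lexeme.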

 

Transducer Technology

Changed:
<
<
Transducers have an input and an output tapes. The input tape is always the tape from which data to be processed is originally read and the output tape is the tape to which the final result is written. Intermediary results may be stored on either tape.
>
>
Transducers have an input and an output tape. The input tape is always the tape from which data to be processed is originally read and the output tape is the tape to which the final result is written. Intermediary results may be stored on either tape.

Graphically

 
Added:
>
>
When adapting a morphological lexicon to the world of transducers, we have programs that analyze word forms into grammatical words and generate inflected forms from grammatical words. In this case, we still keep the vertical graphical order introduced by generative phonology, but we may switch what is seen as the input and output tapes as needed.
 
Added:
>
>
| generator input | kauppa+Nom (grocery store) | kauppa+Gen (of the grocery store) | ... |
| generator output | kauppa | kaupan | ... |
 
Changed:
<
<
Graphical representation: upper side and lower side (or upper tape and lower tape)
>
>
| analyzer output | kauppa+Nom (grocery store) | kauppa+Gen (of the grocery store) | ... |
| analyzer input | kauppa | kaupan | ... |
 
Changed:
<
<
Writing symbol pairs: input symbol and output symbol
>
>
However, if we represent the transducers horizontally, we are free from the conventions of Generative Phonology and may write the input to the left and the output to the right:
 
Added:
>
>
| generator input | generator output |
| kauppa+Nom (grocery store) | kauppa |
| kauppa+Gen (of the grocery store) | kaupan |
| ... | ... |
 
Added:
>
>
| analyzer input | analyzer output |
| kauppa | kauppa+Nom (grocery store) |
| kaupan | kauppa+Gen (of the grocery store) |
| ... | ... |
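
The idea that the analyzer and the generator are the same relation with the input and output tapes switched can be sketched in a few lines of Python. This is a toy model using a plain dictionary, not an HFST transducer; the names `GENERATOR`, `invert`, `generate` and `analyze` are assumptions made for the illustration.

```python
from collections import defaultdict

# A minimal sketch, assuming the generator is a finite relation from
# grammatical words (input tape) to word forms (output tape).
GENERATOR = {
    "kauppa+Nom": "kauppa",
    "kauppa+Gen": "kaupan",
}


def invert(relation):
    """Swap the input and output tapes of a relation.

    One word form may have several analyses, so each output is a list.
    """
    inverse = defaultdict(list)
    for upper, lower in relation.items():
        inverse[lower].append(upper)
    return dict(inverse)


# The analyzer is the generator with the two tapes exchanged.
ANALYZER = invert(GENERATOR)


def generate(grammatical_word):
    """Map a grammatical word to its word form."""
    return GENERATOR[grammatical_word]


def analyze(word_form):
    """Map a word form to all of its grammatical words (possibly none)."""
    return ANALYZER.get(word_form, [])


if __name__ == "__main__":
    print(generate("kauppa+Gen"))
    print(analyze("kaupan"))
```

Nothing in the relation itself changes between the two uses; only the choice of which side is read as input differs, which matches the horizontal tables above.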

-- KristerLinden - 12 Dec 2008


 

FST Descriptions

Line: 39 to 73
 
    • diamond, a kind of auxiliary marker; the name was introduced in [Yli-Jyrä and Koskenniemi 2005]
    • joiner, a kind of auxiliary marker, introduced in the HfstLexC documentation
Added:
>
>
-- AnssiYliJyra - 13 Aug 2008
 
Added:
>
>
 
Line: 43 to 79
 
Deleted:
<
<
-- AnssiYliJyra - 13 Aug 2008 - KristerLinden - 12 Dec 2008
 
META TOPICMOVED by="KristerLinden" date="1212070604" from="KitWiki.HFSTTopicTemplate" to="KitWiki.HfstTopicTemplate"
 