The input is an inflected form of a word that is not in the analyser lexicon and is therefore passed to the weighted analyser.


The output displays three types of information for each analysis: the base form, the paradigm tags, and the analysis tags.

The base form is the normal lexical lookup form delimited by:

<base> ... </base>

The paradigm tags are optional and they are delimited by:

<par> ... </par>

The analysis tags relate the base form to the inflected form within one paradigm and they are delimited by:

<anl> ... </anl>
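
As an illustration, one analysis in this format could be read back as sketched below. This is only a sketch under assumptions: the page does not say how the three fields are laid out on a line, so the code assumes one analysis per line with the fields in the order base, par, anl. The names `Analysis` and `parse_analysis` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Analysis(NamedTuple):
    base: str                # content of <base> ... </base>
    paradigm: Optional[str]  # content of <par> ... </par>, or None if absent
    tags: str                # content of <anl> ... </anl>


# Assumption: one analysis per line, fields in the order base, par, anl.
_LINE = re.compile(
    r"<base>(?P<base>.*?)</base>\s*"
    r"(?:<par>(?P<par>.*?)</par>\s*)?"  # the paradigm tags are optional
    r"<anl>(?P<anl>.*?)</anl>"
)


def parse_analysis(line: str) -> Analysis:
    """Split one output line into base form, paradigm tags and analysis tags."""
    m = _LINE.search(line)
    if m is None:
        raise ValueError(f"not a guesser analysis: {line!r}")
    par = m["par"]
    return Analysis(
        m["base"].strip(),
        par.strip() if par is not None else None,
        m["anl"].strip(),
    )
```

The base form, paradigm name and tag used when calling `parse_analysis` are up to the lexicon; the function only relies on the three delimiters.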


  • The output provides the analyses in order of base form and paradigm likelihood for the inflected form.
  • The most likely combination of base form and paradigm is listed first.
  • The different analyses which have the same base form and paradigm may appear in any order.
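
The ordering rules above can be sketched as follows. The sketch assumes that each analysis carries a numeric weight where a lower value means a more likely analysis, and that the likelihood of a base form and paradigm pair is the best weight among its analyses. Neither assumption comes from this page, and the names `WeightedAnalysis` and `order_analyses` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class WeightedAnalysis(NamedTuple):
    base: str
    paradigm: Optional[str]
    tags: str
    weight: float  # assumption: lower weight means more likely


def order_analyses(analyses: List[WeightedAnalysis]) -> List[WeightedAnalysis]:
    """Order analyses so the most likely (base form, paradigm) pair comes first."""
    # Best (lowest) weight seen for each (base form, paradigm) pair.
    best: Dict[Tuple[str, str], float] = {}
    for a in analyses:
        key = (a.base, a.paradigm or "")
        best[key] = min(a.weight, best.get(key, float("inf")))

    def sort_key(a: WeightedAnalysis) -> Tuple[float, str, str]:
        key = (a.base, a.paradigm or "")
        # Pair weight first; base and paradigm keep equal-weight pairs grouped.
        return (best[key], key[0], key[1])

    # Analyses sharing a pair get the same key, so their relative order
    # is simply whatever the input had (sorted() is stable).
    return sorted(analyses, key=sort_key)
```

Sorting on the pair rather than on each analysis keeps all analyses of one base form and paradigm together, which matches the rule that their internal order is free.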

Implementation Note

Ideally, this method is implemented as a weighted lexicon transducer. If the lexicographer has already created a guesser for some other purpose, however, the method may instead be implemented as a language-dependent shell script that transforms the existing guesser's output into the format described above.
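
A transformation of this kind could look roughly like the sketch below, written in Python rather than as a shell script. It assumes a purely hypothetical legacy guesser that prints one analysis per line as three tab-separated columns (base form, paradigm, analysis tags), with an empty paradigm column when no paradigm is known, and that already lists the most likely analysis first. A real guesser will have its own output format, so the parsing step has to be adapted.

```python
from typing import Iterable, List


def convert(legacy_lines: Iterable[str]) -> List[str]:
    """Rewrite hypothetical tab-separated guesser output into the tagged format.

    Assumed input columns: base form, paradigm (may be empty), analysis tags.
    The input order is kept, on the assumption that it is already most-likely-first.
    """
    out = []
    for line in legacy_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        base, par, tags = line.split("\t")
        # The paradigm tags are optional, so omit the element when the column is empty.
        par_part = f"<par>{par}</par>" if par else ""
        out.append(f"<base>{base}</base>{par_part}<anl>{tags}</anl>")
    return out
```

The function accepts any iterable of lines, so it can be applied directly to an open file or to standard input.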

-- KristerLinden - 24 Apr 2008

Topic revision: r3 - 2008-05-28 - KristerLinden