HFST: Esperanto

We exemplify the use of the HFST command-line tools with a set of examples taken from Beesley & Karttunen (Finite-State Morphology, pages 476-482). See the solutions in the book for more information on the examples. $FORMAT is the implementation format that each tool receives through its -f option. The solutions given here can also be executed as single scripts.
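
The commands on this page expect the shell variable FORMAT to be set before they are run. A minimal sketch follows; the value openfst-tropical is an assumption about which back ends your HFST installation was built with, so substitute another format name if that one is not available.

```shell
# Assumption: openfst-tropical is a format name accepted by the -f option
# on this installation. Other names (for example sfst or foma) depend on
# how HFST was built.
FORMAT=openfst-tropical
export FORMAT
```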

Esperanto Nouns

Noun roots.

echo 'bird
hund
kat
elefant' | hfst-strings2fst -j -f $FORMAT > Nouns

Optional feminine, diminutive and augmentative suffixes.

echo '[ [i n] | [e t] | [e g] ]*' | hfst-regexp2fst -f $FORMAT > Nmf

Required noun suffix.

echo '[o]' | hfst-regexp2fst -f $FORMAT > Nend

Optional plural suffix.

echo '[(j)]' | hfst-regexp2fst -f $FORMAT > Number

Optional accusative case suffix.

echo '[(n)]' | hfst-regexp2fst -f $FORMAT > Case

Concatenate the roots and suffixes into a single transducer.

echo '[0]' | hfst-regexp2fst -f $FORMAT > Result
for i in Nouns Nmf Nend Number Case
do
  hfst-concatenate Result "$i" > TMP
  mv TMP Result
done

Minimize the result.

hfst-minimize Result > EsperantoNouns
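
As a cross-check that does not depend on HFST, the language built above (root, any number of in/et/eg suffixes, the required o, optional j, optional n) can be written as an extended regular expression and tried with grep. This is only a sketch: the helper name is_noun is hypothetical, and the expression is a hand translation of the steps above, not something produced by the HFST tools.

```shell
# Hypothetical helper, not part of HFST: succeeds if the argument is a word
# of the noun language described above.
is_noun() {
  printf '%s\n' "$1" | grep -Eqx '(bird|hund|kat|elefant)(in|et|eg)*oj?n?'
}

for w in hundo katinojn elefantegeto hund birdjo
do
  if is_noun "$w"; then echo "$w: accepted"; else echo "$w: rejected"; fi
done
```

The last two words are rejected because the noun ending o is required and must come before the plural j.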

Esperanto Adjectives

Optional adjective prefixes.

echo '[( [m a l] | [n e] )]' | hfst-regexp2fst -f $FORMAT -j > AdjPrefix

Adjective roots.

echo "bon
long
jun
alt
grav" | hfst-strings2fst -f $FORMAT -j > Adjectives

Optional augmentative and diminutive suffixes.

echo '[[e g] | [e t]]*' | hfst-regexp2fst -f $FORMAT -j > Adj

Required adjective suffix.

echo '[a]' | hfst-regexp2fst -f $FORMAT -j > AdjEnd

Optional plural suffix.

echo '[(j)]' | hfst-regexp2fst -f $FORMAT -j > Number

Optional accusative-case suffix.

echo '[(n)]' | hfst-regexp2fst -f $FORMAT -j > Case

Concatenate the roots and suffixes into a single transducer.

echo '[0]' | hfst-regexp2fst -f $FORMAT > Result
for i in AdjPrefix Adjectives Adj AdjEnd Number Case
do
  hfst-concatenate Result "$i" > TMP
  mv TMP Result
done

Minimize the result.

hfst-minimize Result > EsperantoAdjectives
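
The adjective language can be cross-checked in the same HFST-independent way with grep. The sketch below assumes nothing beyond a POSIX shell; the helper name is_adjective is hypothetical, and the expression is a hand translation of the concatenation order AdjPrefix, Adjectives, Adj, AdjEnd, Number, Case.

```shell
# Hypothetical helper, not part of HFST: succeeds if the argument is a word
# of the adjective language described above.
is_adjective() {
  printf '%s\n' "$1" | grep -Eqx '(mal|ne)?(bon|long|jun|alt|grav)(eg|et)*aj?n?'
}

for w in bona malbonaj nelongegan bono malnebona
do
  if is_adjective "$w"; then echo "$w: accepted"; else echo "$w: rejected"; fi
done
```

The form malnebona is rejected because the prefix is optional but not repeatable.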

Esperanto Nouns and Adjectives

Noun and adjective roots.

echo "bird
hund
kat
elefant" | hfst-strings2fst -j -f $FORMAT > NounRoots

echo "bon
long
jun
alt
grav" | hfst-strings2fst -j -f $FORMAT > AdjRoots

Optional prefix ge- ("male and female").

echo '[(g e)]' | hfst-regexp2fst -f $FORMAT > NounPrefix

Adjective prefixes, noun and adjective endings, and number and case suffixes.

echo '[ ([m a l] | [n e]) ]' | hfst-regexp2fst -f $FORMAT > AdjPrefixes

echo '[o]' | hfst-regexp2fst -f $FORMAT -j > Nend

echo '[a]' | hfst-regexp2fst -f $FORMAT -j > Adjend

echo '[(j)]' | hfst-regexp2fst -f $FORMAT -j > Number

echo '[(n)]' | hfst-regexp2fst -f $FORMAT -j > Case

Feminine, diminutive and augmentative suffixes. Note that Adjend is included so that an adjective can be derived from a noun.

echo '[ ([i n] | [e t] | [e g])* [@"Nend" | @"Adjend"] ]' | \
     hfst-regexp2fst -f $FORMAT > Nmf

Augmentative and diminutive suffixes. Note that Nend, preceded by the suffix ec, is included so that a noun can be derived from an adjective.

echo '([e g] | [e t])* [@"Adjend" | [e c @"Nend"]]' | \
     hfst-regexp2fst -f $FORMAT > Adj

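Concatenate the roots and suffixes into a single transducer. With -2 FILE the second operand is read from FILE and the first from standard input; with -1 FILE the first operand is read from FILE and the second from standard input.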

hfst-concatenate NounPrefix NounRoots | hfst-concatenate -2 Nmf > NounStem
hfst-concatenate AdjPrefixes AdjRoots | hfst-concatenate -2 Adj > AdjectiveStem
hfst-disjunct NounStem AdjectiveStem > Stems
hfst-concatenate Number Case | hfst-concatenate -1 Stems > Result

Minimize the result.

hfst-minimize Result > EsperantoNounsAndAdjectives
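
The combined language, including the two cross-category derivations (noun root with the adjective ending a, adjective root with ec followed by the noun ending o), can be approximated without HFST as follows. This is a sketch under the assumption that the hand-written expression mirrors the concatenation and disjunction order used above; the helper name is_noun_or_adjective is hypothetical.

```shell
# Hypothetical helper, not part of HFST. The first alternative is the noun
# stem, the second the adjective stem; number and case follow either one.
is_noun_or_adjective() {
  printf '%s\n' "$1" | grep -Eqx \
    '((ge)?(bird|hund|kat|elefant)(in|et|eg)*[oa]|(mal|ne)?(bon|long|jun|alt|grav)(eg|et)*(a|eco))j?n?'
}

for w in gehundoj hunda malbonecojn bono gebona
do
  if is_noun_or_adjective "$w"; then echo "$w: accepted"; else echo "$w: rejected"; fi
done
```

Here bono is rejected because an adjective root can take the noun ending only after ec, and gebona is rejected because ge- attaches to noun roots only.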

Esperanto Nouns and Adjectives with Upper-Side Tags

echo '([MF%+:g 0:e])' | hfst-regexp2fst -f $FORMAT > NounPrefix

echo "bird
hund
kat
elefant" | hfst-strings2fst -j -f $FORMAT > NounRoots

echo "bon
long
jun
alt
grav" | hfst-strings2fst -j -f $FORMAT > AdjRoots

echo '%+Noun:0' | hfst-regexp2fst -f $FORMAT > Nmf

echo '%+NSuff:o' | hfst-regexp2fst -f $FORMAT > Nend

echo '%+ASuff:a' | hfst-regexp2fst -f $FORMAT > Adjend

echo '[[%+Fem:i 0:n] | [%+Dim:e 0:t] | [%+Aug:e 0:g]]* [@"Nend" | @"Adjend"]' \
     | hfst-regexp2fst -f $FORMAT > AugDimFem

echo '([Op%+:m 0:a 0:l] | [Neg%+:n 0:e])' | hfst-regexp2fst -f $FORMAT \
     > AdjPrefixes

echo '[[%+Aug:e 0:g] | [%+Dim:e 0:t]]* [[[%+Nize:e 0:c] @"Nend"]|[@"Adjend"]]' \
     | hfst-regexp2fst -f $FORMAT > AugDimNize

echo '%+Adj:0' | hfst-regexp2fst -f $FORMAT > Adj

echo '%+Pl:j | %+Sg:0' | hfst-regexp2fst -f $FORMAT > Number

echo '(%+Acc:n)' | hfst-regexp2fst -f $FORMAT > Case


hfst-concatenate NounPrefix NounRoots | hfst-concatenate -2 Nmf | \
  hfst-concatenate -2 AugDimFem  > NounStem
hfst-concatenate AdjPrefixes AdjRoots | hfst-concatenate -2 Adj | \
  hfst-concatenate -2 AugDimNize > AdjectiveStem
hfst-disjunct NounStem AdjectiveStem > Stems
hfst-concatenate Number Case | hfst-concatenate -1 Stems > Result

hfst-minimize Result > EsperantoNounsAndAdjectivesWithTags
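
To see how the upper-side tags relate to the lower-side letters, the pairs used in the regular expressions above can be replayed with sed. The sketch below is only an illustration: the helper name realize is hypothetical, and unlike the transducer it does not check that the analysis string is well-formed, it merely replaces each tag with the letters it is paired with.

```shell
# Hypothetical helper, not part of HFST: rewrites an analysis (upper-side)
# string into its surface (lower-side) form, tag by tag.
realize() {
  printf '%s\n' "$1" | sed \
    -e 's/MF[+]/ge/' \
    -e 's/Op[+]/mal/' \
    -e 's/Neg[+]/ne/' \
    -e 's/[+]Noun//' \
    -e 's/[+]Adj//' \
    -e 's/[+]Fem/in/g' \
    -e 's/[+]Dim/et/g' \
    -e 's/[+]Aug/eg/g' \
    -e 's/[+]Nize/ec/' \
    -e 's/[+]NSuff/o/' \
    -e 's/[+]ASuff/a/' \
    -e 's/[+]Pl/j/' \
    -e 's/[+]Sg//' \
    -e 's/[+]Acc/n/'
}

realize 'hund+Noun+Fem+NSuff+Pl+Acc'   # prints hundinojn
realize 'Op+bon+Adj+Nize+NSuff+Sg'     # prints malboneco
```

Note that Number is not optional in this version: every analysis carries either +Pl or +Sg, and +Sg corresponds to no letter on the lower side.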

Esperanto Nouns, Adjectives and Verbs

echo '([MF%+:g 0:e])' | hfst-regexp2fst -f $FORMAT > NounPrefix

echo "bird
hund
kat
elefant" | hfst-strings2fst -j -f $FORMAT > NounRoots

echo "bon
long
jun
alt
grav" | hfst-strings2fst -j -f $FORMAT > AdjRoots

echo "don
est
pens
dir
fal" | hfst-strings2fst -j -f $FORMAT > VerbRoots

echo '([Op%+:m 0:a 0:l] | [Neg%+:n 0:e])' | hfst-regexp2fst -f $FORMAT \
     > VerbPrefixes

echo '[%+Verb:0]' | hfst-regexp2fst -f $FORMAT > Verb

echo '([%+Cont:a 0:d])' | hfst-regexp2fst -f $FORMAT > Aspect

echo '[%+Inf:i] | [%+Pres:a 0:s] | [%+Past:i 0:s] | [%+Fut:o 0:s] | ' \
     '[%+Cond:u 0:s] | [%+Subj:u]' | hfst-regexp2fst -f $FORMAT > Vend


echo '%+Noun:0' | hfst-regexp2fst -f $FORMAT > Nmf

echo '%+NSuff:o' | hfst-regexp2fst -f $FORMAT > Nend

echo '%+ASuff:a' | hfst-regexp2fst -f $FORMAT > Adjend

echo '[[%+Fem:i 0:n] | [%+Dim:e 0:t] | [%+Aug:e 0:g]]* [@"Nend" | @"Adjend"]' \
     | hfst-regexp2fst -f $FORMAT > AugDimFem

echo '([Op%+:m 0:a 0:l] | [Neg%+:n 0:e])' | hfst-regexp2fst -f $FORMAT \
     > AdjPrefixes

echo '[[%+Aug:e 0:g] | [%+Dim:e 0:t]]* [[[%+Nize:e 0:c] @"Nend"]|[@"Adjend"]]' \
     | hfst-regexp2fst -f $FORMAT > AugDimNize

echo '%+Adj:0' | hfst-regexp2fst -f $FORMAT > Adj

echo '%+Pl:j | %+Sg:0' | hfst-regexp2fst -f $FORMAT > Number

echo '(%+Acc:n)' | hfst-regexp2fst -f $FORMAT > Case


hfst-concatenate NounPrefix NounRoots | hfst-concatenate -2 Nmf | \
  hfst-concatenate -2 AugDimFem  > NounStem

hfst-concatenate AdjPrefixes AdjRoots | hfst-concatenate -2 Adj | \
  hfst-concatenate -2 AugDimNize > AdjectiveStem

hfst-disjunct NounStem AdjectiveStem > NounAdjStems
hfst-concatenate Number Case | hfst-concatenate -1 NounAdjStems > NounAdjs

hfst-concatenate VerbPrefixes VerbRoots | hfst-concatenate -2 Verb | \
  hfst-concatenate -2 Aspect | hfst-concatenate -2 Vend > Verbs

hfst-disjunct NounAdjs Verbs | hfst-minimize \
  > EsperantoNounsAdjectivesAndVerbs
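
The lower side of the verb part (optional prefix, root, optional ad, and one of the endings i, as, is, os, us, u) can be cross-checked without HFST in the same spirit. This sketch covers only the Verbs transducer, not the nouns and adjectives that the final disjunction also contains; the helper name is_verb is hypothetical and the expression is a hand translation of the definitions above.

```shell
# Hypothetical helper, not part of HFST: succeeds if the argument is a
# surface form of the verb sub-language described above.
is_verb() {
  printf '%s\n' "$1" | grep -Eqx '(mal|ne)?(don|est|pens|dir|fal)(ad)?(i|as|is|os|us|u)'
}

for w in donas pensadis nediri falu falo donadad
do
  if is_verb "$w"; then echo "$w: accepted"; else echo "$w: rejected"; fi
done
```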


-- ErikAxelson - 2011-10-20