HFST: Numbers to English Numerals

We illustrate the use of the HFST command line tools with an example taken from Beesley & Karttunen. It creates a transducer that maps the numbers "1" ... "99" to the corresponding English numerals "one" ... "ninety-nine". $FORMAT is the implementation type of the transducer. The solution given on this page can also be executed with a single script.
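
The commands on this page expect $FORMAT to be set in the shell. A minimal sketch, assuming openfst-tropical as the backend; this particular choice is an assumption, and any other format name accepted by the -f option of the HFST tools (for example sfst or foma) should work the same way:

```shell
# Assumption: openfst-tropical is the desired backend format.
# An already exported FORMAT value is kept as it is.
FORMAT="${FORMAT:-openfst-tropical}"
export FORMAT
echo "Using transducer format: $FORMAT"
```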

From one to nine.

echo "1:one
2:two
3:three
4:four
5:five
6:six
7:seven
8:eight
9:nine" | hfst-strings2fst -j -f $FORMAT > OneToNine.hfst

Numbers mapped to the prefixes that can precede "-teen" or "-ty".

echo "3:thir
5:fif
6:six
7:seven
8:eigh
9:nine" | hfst-strings2fst -j -f $FORMAT > TeenTen.hfst

Special numbers.

echo "10:ten
11:eleven
12:twelve
14:fourteen" | hfst-strings2fst -j -f $FORMAT > Special.hfst

Here we build the regular teens ("1" mapped to epsilon, then a TeenTen prefix, then epsilon mapped to "teen") and disjunct them with the special numbers.

echo ":teen" | hfst-strings2fst -f $FORMAT > Epsilon2Teen.hfst

echo "1:" | hfst-strings2fst -f $FORMAT > One2Epsilon.hfst

hfst-concatenate One2Epsilon.hfst TeenTen.hfst | hfst-concatenate Epsilon2Teen.hfst > Teens_.hfst

hfst-disjunct Special.hfst Teens_.hfst > Teens.hfst
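
To make the concatenation concrete, the following plain-shell sketch lists the string pairs that Teens.hfst is expected to contain. It does not call HFST at all, and the function name expected_teens is made up for this illustration:

```shell
# Plain-shell cross-check (no HFST needed): print the pairs that
# Teens.hfst is expected to contain, one "input:output" pair per line.
expected_teens() {
    # Special numbers, listed verbatim.
    printf '%s\n' "10:ten" "11:eleven" "12:twelve" "14:fourteen"
    # Regular teens: "1" plus the digit on the input side,
    # the prefix plus "teen" on the output side.
    for pair in 3:thir 5:fif 6:six 7:seven 8:eigh 9:nine; do
        digit=${pair%%:*}    # text before the colon
        prefix=${pair#*:}    # text after the colon
        printf '1%s:%steen\n' "$digit" "$prefix"
    done
}

expected_teens
```

The list can be compared by eye with the paths printed by hfst-fst2strings Teens.hfst; the exact output format of that tool may differ from the one used here.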

Special stems for the tens: "twen" and "for" are added to the TeenTen prefixes.

echo "2:twen
4:for" | hfst-strings2fst -j -f $FORMAT | hfst-disjunct TeenTen.hfst > TenStem.hfst

TenStem is followed either by "0" mapped to "ty", or by an epsilon mapped to "ty-" and then one number from one to nine.

echo ":ty-" | hfst-strings2fst -f $FORMAT | hfst-concatenate OneToNine.hfst > TMP
echo "0:ty" | hfst-strings2fst -f $FORMAT | hfst-disjunct TMP | hfst-concatenate -1 TenStem.hfst > Tens.hfst
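
The two branches can be spelled out in plain shell. This is only a sketch of the expected string pairs, not of the transducer's internal symbol alignment; it does not call HFST, and the function name expected_tens is hypothetical:

```shell
# Plain-shell cross-check (no HFST needed): print the pairs that
# Tens.hfst is expected to contain, one "input:output" pair per line.
expected_tens() {
    for stem in 2:twen 3:thir 4:for 5:fif 6:six 7:seven 8:eigh 9:nine; do
        tens_digit=${stem%%:*}
        tens_stem=${stem#*:}
        # Branch 1: "0" paired with "ty", e.g. 20:twenty.
        printf '%s0:%sty\n' "$tens_digit" "$tens_stem"
        # Branch 2: epsilon paired with "ty-", followed by one to nine.
        for unit in 1:one 2:two 3:three 4:four 5:five 6:six 7:seven 8:eight 9:nine; do
            printf '%s%s:%sty-%s\n' "$tens_digit" "${unit%%:*}" "$tens_stem" "${unit#*:}"
        done
    done
}

# 8 stems times 10 continuations gives 80 lines, covering 20 to 99.
expected_tens | wc -l
```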

We finally disjunct all numbers.

hfst-disjunct OneToNine.hfst Teens.hfst | hfst-disjunct Tens.hfst > OneToNinetyNine.hfst

Let's test the result with random mappings and some test cases.

hfst-fst2strings -r 10 OneToNinetyNine.hfst

echo "21
44
62
77
90
12
6" > test_strings.txt
hfst-lookup OneToNinetyNine.hfst -I test_strings.txt
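
To judge the lookup results, it helps to have the expected numerals at hand. The following is a reference sketch in plain shell that does not use HFST; the function name numeral is hypothetical, and it is only meant for inputs from 1 to 99:

```shell
# Reference sketch (no HFST needed): print the English numeral for 1..99.
numeral() {
    n=$1
    ones="one two three four five six seven eight nine"
    stems="twen thir for fif six seven eigh nine"   # stems for digits 2..9
    # Special numbers are listed verbatim.
    case $n in
        10) echo ten; return ;;
        11) echo eleven; return ;;
        12) echo twelve; return ;;
        14) echo fourteen; return ;;
    esac
    tens_digit=$((n / 10))
    ones_digit=$((n % 10))
    if [ "$tens_digit" -eq 0 ]; then
        # 1..9
        echo "$ones" | cut -d' ' -f"$ones_digit"
    elif [ "$tens_digit" -eq 1 ]; then
        # Regular teens: stem of the ones digit plus "teen".
        echo "$(echo "$stems" | cut -d' ' -f"$((ones_digit - 1))")teen"
    else
        # 20..99: stem of the tens digit plus "ty" or "ty-" and a unit.
        stem=$(echo "$stems" | cut -d' ' -f"$((tens_digit - 1))")
        if [ "$ones_digit" -eq 0 ]; then
            echo "${stem}ty"
        else
            echo "${stem}ty-$(echo "$ones" | cut -d' ' -f"$ones_digit")"
        fi
    fi
}

# Expected numerals for the test inputs used above.
for n in 21 44 62 77 90 12 6; do
    printf '%s\t%s\n' "$n" "$(numeral "$n")"
done
```

The printed pairs can be compared with the output of hfst-lookup, which may add further columns such as weights.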


-- ErikAxelson - 2011-08-10