HFST: Hfst for Windows

Functionalities offered for Windows

Currently we offer experimental installers for Windows on our SourceForge page. The file install-hfst-V.V.V_[32|64]-bit.exe contains all HFST command line tools; the file install-hfst-xfst-V.V.V_[32|64]-bit.exe is a standalone package for hfst-xfst. We are also planning to offer standalone packages for hfst-proc, hfst-lexc, hfst-twolc and the tagger tools. The 32-bit installers should work on both 32- and 64-bit systems.

We are also planning to release SWIG bindings for Windows.

Usage

At the moment, the installers are very simple. They install the utilities in a directory of your choice and, if you select the option during installation, create a shortcut to a batch script named something like hfst.bat or hfst-xfst.bat. The script opens a Command Prompt window, sets all necessary environment variables and changes to the directory where you installed HFST. You should then be able to use the command line tools in much the same way as on Unix or Mac. We are planning to create a more Windows-like graphical interface in the future.

Examples

Once you have a Command Prompt window open (either by clicking on the HFST shortcut or by running the batch script directly), you can try, for example:

echo foo:bar| hfst-strings2fst | hfst-fst2fst -f optimized-lookup-weighted > foobar.ol
hfst-lookup foobar.ol

This runs hfst-lookup in interactive mode. You can now look up words, e.g. "foo" and "baz", in the transducer foobar.ol. Below is what is printed in the window; the lines following the prompt ">" are typed by the user. The word "baz" is not in the transducer, so it is echoed with "+?" appended and an infinite weight.

> foo
foo     bar     0,000000

> baz
baz     baz+?   inf

>

To show how redirecting to and from standard streams works on Windows, here is an alternative version of the above. Note the option --pipe-mode, which suppresses the ">" prompts.

echo foo> words.txt
echo baz>> words.txt
type words.txt| hfst-lookup --pipe-mode foobar.ol > results.txt
type results.txt
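
Once the results are in a file such as results.txt, they can be post-processed with a script. The following Python code is only a minimal sketch. It assumes that each result line has three whitespace-separated columns (input, output, weight) as in the interactive sample above, that no column contains whitespace itself, and that the weight may be printed with a decimal comma depending on the Windows locale. The function names parse_lookup_line and parse_lookup_output are hypothetical and not part of HFST.

```python
import math


def parse_lookup_line(line):
    """Split one result line into (input, output, weight).

    Returns None for blank or unexpected lines. The column layout is
    assumed from the sample output on this page, not from a format spec.
    """
    fields = line.split()
    if len(fields) != 3:
        return None
    word, analysis, weight_text = fields
    # Some locales print a decimal comma ("0,000000"); normalise it.
    weight = float(weight_text.replace(",", "."))
    return word, analysis, weight


def parse_lookup_output(text):
    """Map each input word to a list of (output, weight) pairs.

    Lines with an infinite weight are treated as "not found" and yield
    an empty list for that word (an assumption based on the sample).
    """
    results = {}
    for line in text.splitlines():
        parsed = parse_lookup_line(line)
        if parsed is None:
            continue
        word, analysis, weight = parsed
        entries = results.setdefault(word, [])
        if not math.isinf(weight):
            entries.append((analysis, weight))
    return results


if __name__ == "__main__":
    # Sample text copied from the interactive session shown earlier.
    sample = "foo\tbar\t0,000000\n\nbaz\tbaz+?\tinf\n"
    print(parse_lookup_output(sample))
```

To process the real file, read results.txt with open("results.txt").read() and pass the text to parse_lookup_output.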

Here is another example, this time using hfst-xfst. First start the program with

hfst-xfst -w

The option -w means that weights are printed. Next we show what is printed in the window; the lines following the prompt hfst[N] are user input.

We first create two transducers, one mapping "cat" to "chat" with weight 3 and the other "cat" to "gato" with weight 5.

hfst[0]: regex cat:chat::3;
2 states, 1 arcs
hfst[1]: regex cat:gato::5;
2 states, 1 arcs

We then disjunct the transducers on the stack, getting one transducer that contains both mappings.

hfst[2]: disjunct
2 states, 2 arcs

Next we save the transducer, clear the stack and load the transducer back onto the stack. This is just to show how files can be saved and read.

hfst[1]: save stack dictionary.hfst
hfst[1]: clear stack
hfst[0]: load stack dictionary.hfst
2 states, 2 arcs
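
The number in the hfst[N] prompt follows the number of transducers on the stack, which is why it changes after each command above. The Python code below is a toy model of that bookkeeping, not the HFST API: a "transducer" is just a set of (input, output, weight) triples, files are simulated with a dictionary, and disjunct covers only the two-transducer case shown in this session. All function names are hypothetical.

```python
# Toy model only: none of these names exist in HFST.
stack = []   # the transducer stack; the last element is the top
saved = {}   # stands in for files on disk


def prompt():
    """Return the prompt text for the current stack depth."""
    return "hfst[%d]:" % len(stack)


def regex(inp, out, weight):
    """Push a one-mapping 'transducer' onto the stack."""
    stack.append(frozenset({(inp, out, weight)}))


def disjunct():
    """Replace the two topmost entries with their union."""
    top = stack.pop()
    below = stack.pop()
    stack.append(top | below)


def save_stack(name):
    saved[name] = list(stack)


def clear_stack():
    del stack[:]


def load_stack(name):
    stack.extend(saved[name])


def apply_up(word):
    """Look up a word in the top entry, lowest weight first."""
    matches = [(out, w) for (inp, out, w) in stack[-1] if inp == word]
    return sorted(matches, key=lambda pair: pair[1])


# Replay the session above and record the prompt before each command.
prompts = [prompt()]
for step in (lambda: regex("cat", "chat", 3.0),
             lambda: regex("cat", "gato", 5.0),
             disjunct,
             lambda: save_stack("dictionary.hfst"),
             clear_stack,
             lambda: load_stack("dictionary.hfst")):
    step()
    prompts.append(prompt())

print(prompts)
print(apply_up("cat"))
```

The recorded prompts run 0, 1, 2, 1, 1, 0, 1, matching the session: each regex adds one transducer, disjunct merges two into one, clear stack empties the stack and load stack puts the saved transducer back.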

Now we can perform lookup on the transducer. Lines following the apply prompt "apply up>" are user input.

hfst[1]: apply up
apply up> cat
chat    3.000000
gato    5.000000
apply up> dog
???
apply up> <ctrl-d>
hfst[1]:

Note the special string <ctrl-d>, which can be typed literally to signal end of input. The actual CTRL+D key combination does not necessarily work in Command Prompt, and pressing CTRL+C would end the whole hfst-xfst program. The string can also be used in XFST script files.

For bigger transducers it is advisable to first convert the stack into optimized-lookup format:

hfst[1]: lookup-optimize
converting transducer type from openfst-tropical to hfst-optimized-lookup-weighted,
this might take a while...
hfst[1]: apply up
apply up> cat
chat    3.000000
gato    5.000000
apply up> dog
???
apply up> <ctrl-d>
hfst[1]: print net
Operation not supported for optimized lookup format. Consider 'remove-optimization'
to convert into ordinary format.
hfst[1]: remove-optimization
converting transducer type from hfst-optimized-lookup-weighted to openfst-tropical,
this might take a while...
hfst[1]: print net
Ss0:    <cat:chat> -> fs1, <cat:gato> -> fs1.
fs1:    (no arcs).
hfst[1]:

Note that the optimized-lookup format supports only lookup, conversion, saving and loading, so a warning message is given if the user tries to apply any other operation.

Shortcomings

The installers are very experimental and not extensively tested.

One known shortcoming of hfst-xfst is that the readline library is not enabled, so command autocompletion is not available. However, Command Prompt's own line-editing functionality lets you browse the command history with the arrow keys.


-- ErikAxelson - 2014-02-24