hfst-determinize

Purpose

Determinize a transducer, i.e. create an equivalent, epsilon-free transducer in which no state has two or more outgoing transitions with the same input and output symbols.

Usage

The help message:

Usage: hfst-determinize [OPTIONS...] [INFILE]
Determinize a transducer

Common options:
  -h, --help             Print help message
  -V, --version          Print version info
  -v, --verbose          Print verbosely while processing
  -q, --quiet            Only print fatal errors and requested output
  -s, --silent           Alias of --quiet
Input/Output options:
  -i, --input=INFILE     Read input transducer from INFILE
  -o, --output=OUTFILE   Write output transducer to OUTFILE
Command-specific options:
  -E, --encode-weights         Encode weights when determinizing
                               (default is false).


If OUTFILE or INFILE is missing or -, standard streams will be used.
Format of result depends on format of INFILE

Report bugs to <hfst-bugs@helsinki.fi> or directly to our bug tracker at:
<https://sourceforge.net/tracker/?atid=1061990&group_id=224521&func=browse>


Details

Before determinization, the input and output symbol of each transition are encoded as a single symbol, i.e. the transducer is treated as an automaton. After determinization, the encoded symbols are decoded back into the original input and output symbols.
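
The encode, determinize and decode steps can be illustrated with a minimal Python sketch. It is not the HFST implementation: it ignores weights, assumes the input is already epsilon-free, and uses a plain subset construction. The function name determinize and the tuple layout (source, input, output, target) are hypothetical and chosen only for this illustration.

```python
from collections import deque


def determinize(start, finals, transitions):
    """Subset construction over encoded symbol pairs (unweighted sketch).

    transitions: iterable of (source, input_symbol, output_symbol, target).
    Returns (start, finals, transitions) of the deterministic machine,
    with states numbered from 0 in discovery order.
    """
    # Step 1: encode each (input, output) pair as a single label.
    arcs = {}
    for src, isym, osym, dst in transitions:
        label = (isym, osym)  # the pair itself serves as the encoded symbol
        arcs.setdefault(src, {}).setdefault(label, set()).add(dst)

    # Step 2: ordinary subset construction on the resulting automaton.
    start_set = frozenset([start])
    ids = {start_set: 0}
    queue = deque([start_set])
    det_arcs = []
    while queue:
        subset = queue.popleft()
        by_label = {}
        for state in subset:
            for label, targets in arcs.get(state, {}).items():
                by_label.setdefault(label, set()).update(targets)
        for label in sorted(by_label):
            target = frozenset(by_label[label])
            if target not in ids:
                ids[target] = len(ids)
                queue.append(target)
            # Step 3: decode the label back into input and output symbols.
            isym, osym = label
            det_arcs.append((ids[subset], isym, osym, ids[target]))

    det_finals = {i for subset, i in ids.items() if subset & set(finals)}
    return 0, det_finals, det_arcs


# State 0 has two a:b transitions, so the input is not deterministic.
example = [(0, "a", "b", 1), (0, "a", "b", 2), (1, "c", "c", 3), (2, "d", "d", 3)]
result = determinize(0, {3}, example)
print(result)
```

In the result, the two a:b transitions from state 0 are merged into one transition that leads to a state standing for the set {1, 2}.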

With the option --encode-weights, the symbols and the weight of each transition are encoded together as a single symbol. Some weighted transducers are not, strictly speaking, determinizable, but with this option it is possible to get an almost deterministic result. If hfst-determinize is run on such a transducer without --encode-weights, the determinization algorithm does not terminate and eventually runs out of memory. For more information, see e.g. Allauzen & Mohri.
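
The effect of encoding weights into the symbols can be illustrated with a small Python sketch. It only shows how the encoded labels differ. It is not the HFST implementation, and the helper names encode and is_deterministic as well as the tuple layout (source, input, output, weight, target) are hypothetical.

```python
def encode(transitions, encode_weights=False):
    """Turn each transition into (source, label, target, remaining_weight).

    Without encode_weights the label is the (input, output) pair and the
    weight stays on the transition. With encode_weights the weight becomes
    part of the label and nothing remains for the algorithm to combine.
    """
    encoded = []
    for src, isym, osym, weight, dst in transitions:
        if encode_weights:
            encoded.append((src, (isym, osym, weight), dst, None))
        else:
            encoded.append((src, (isym, osym), dst, weight))
    return encoded


def is_deterministic(encoded):
    """True if no state has two transitions with the same encoded label."""
    seen = set()
    for src, label, _dst, _weight in encoded:
        if (src, label) in seen:
            return False
        seen.add((src, label))
    return True


# Two a:b transitions from state 0 that differ only in weight and target.
arcs = [(0, "a", "b", 1.0, 1), (0, "a", "b", 2.0, 2)]
print(is_deterministic(encode(arcs)))                       # labels clash
print(is_deterministic(encode(arcs, encode_weights=True)))  # labels differ
```

Under this reading, two transitions with equal symbols but different weights count as different symbols once weights are encoded, so they are never merged. After decoding, a state may still have several transitions with the same input and output symbols, which is one way to understand "almost deterministic".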

-- ErikAxelson - 09 Jul 2008