hfst-minimize

Purpose

Minimize a transducer, i.e. create an equivalent, epsilon-free, deterministic transducer that has as few states as possible.

Usage

The help message:

Usage: hfst-minimize [OPTIONS...] [INFILE]
Minimize a transducer

Common options:
  -h, --help             Print help message
  -V, --version          Print version info
  -v, --verbose          Print verbosely while processing
  -q, --quiet            Only print fatal errors and requested output
  -s, --silent           Alias of --quiet
Input/Output options:
  -i, --input=INFILE     Read input transducer from INFILE
  -o, --output=OUTFILE   Write output transducer to OUTFILE
Command-specific options:
  -E, --encode-weights         Encode weights when minimizing
                               (default is false).

If OUTFILE or INFILE is missing or -, standard streams will be used.
Format of result depends on format of INFILE

Report bugs to <hfst-bugs@helsinki.fi> or directly to our bug tracker at:
<https://sourceforge.net/tracker/?atid=1061990&group_id=224521&func=browse>

Details

Before the actual minimization, epsilon transitions are removed from each transducer, and the transducer is then determinized.
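The minimization step itself can be illustrated on an ordinary deterministic automaton. The following is a minimal sketch of textbook partition refinement (Moore's algorithm), assuming a complete, epsilon-free, deterministic automaton as input. The function name and data layout are hypothetical and are not taken from HFST, whose back ends may use a different algorithm.

```python
def minimize_dfa(states, alphabet, delta, finals):
    """Moore-style partition refinement on a complete DFA (illustration only).

    delta maps (state, symbol) -> state. Returns a dict that maps each
    state to the index of its equivalence class.
    """
    # Start by separating final states from non-final states.
    block = {s: int(s in finals) for s in states}
    while True:
        # Two states stay together only if they are in the same block and
        # every symbol leads them to the same block.
        signature = {
            s: (block[s],) + tuple(block[delta[s, a]] for a in alphabet)
            for s in states
        }
        ids = {}
        new_block = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        # The partition can only get finer; stop when no block was split.
        if len(ids) == len(set(block.values())):
            return new_block
        block = new_block


# Hypothetical example: states 1 and 2 behave identically and are merged.
states = [0, 1, 2, 3]
alphabet = ["a"]
delta = {(0, "a"): 1, (1, "a"): 3, (2, "a"): 3, (3, "a"): 3}
classes = minimize_dfa(states, alphabet, delta, finals={3})
```

States that end up in the same class can be merged into one state of the minimal automaton.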

Before determinization, each transition symbol pair is encoded into a single symbol, i.e. the transducers are treated as automata. After determinization, the encoded symbol pairs are transformed back to the original input and output symbols.
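The encoding and decoding of symbol pairs can be sketched as follows. This is only an illustration of the idea, assuming transitions are given as plain tuples. The helper names `encode_pairs` and `decode_pairs` are hypothetical and are not part of HFST.

```python
def encode_pairs(transitions):
    """Map each (input, output) symbol pair to a single integer label.

    transitions: list of (source, input_symbol, output_symbol, target).
    Returns the relabelled transitions and the table needed to decode them.
    """
    pair_to_code = {}
    encoded = []
    for source, isym, osym, target in transitions:
        # Reuse the code if this pair was seen before, else assign the next one.
        code = pair_to_code.setdefault((isym, osym), len(pair_to_code))
        encoded.append((source, code, target))
    code_to_pair = {code: pair for pair, code in pair_to_code.items()}
    return encoded, code_to_pair


def decode_pairs(encoded, code_to_pair):
    """Restore the original input and output symbols from the codes."""
    return [(s, *code_to_pair[c], t) for s, c, t in encoded]


# Hypothetical example transducer with three transitions.
transitions = [(0, "a", "b", 1), (0, "a", "c", 2), (1, "a", "b", 2)]
encoded, table = encode_pairs(transitions)
restored = decode_pairs(encoded, table)
```

Between the two calls, the encoded transitions carry one label each, so algorithms written for automata can be applied to them unchanged.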

It is also possible to encode both the transition symbols and the weight of each transition as a single symbol with the option --encode-weights. Some weighted transducers are not, strictly speaking, determinizable, but with this option it is possible to get an almost deterministic result. If hfst-minimize is run on such a transducer without --encode-weights, the determinization algorithm does not terminate and eventually runs out of memory. For more information, see e.g. Allauzen & Mohri.
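The effect of folding the weight into the label can be illustrated with a small sketch. It assumes transitions are plain tuples and only checks whether two transitions leaving the same state share a label. The function names are hypothetical and do not reflect how HFST implements --encode-weights.

```python
def encode_label(isym, osym, weight, with_weights):
    """Build the single label that the automaton algorithms would see."""
    return (isym, osym, weight) if with_weights else (isym, osym)


def is_deterministic(transitions, with_weights):
    """True if no state has two outgoing transitions with the same label.

    transitions: list of (source, input_symbol, output_symbol, weight, target).
    """
    seen = set()
    for source, isym, osym, weight, _target in transitions:
        key = (source, encode_label(isym, osym, weight, with_weights))
        if key in seen:
            return False
        seen.add(key)
    return True


# Hypothetical example: the same symbol pair leaves state 0 with two weights.
arcs = [(0, "a", "b", 1.0, 1), (0, "a", "b", 2.0, 2)]
plain = is_deterministic(arcs, with_weights=False)
with_encoded_weights = is_deterministic(arcs, with_weights=True)
```

With the weight as part of the label, the two transitions count as different symbols, so the machine is handled as an unweighted automaton. The result may still contain transitions from one state that share a symbol pair and differ only in weight, which is one way to read "almost deterministic".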

-- ErikAxelson - 09 Jul 2008