OMorFi: hfst-string2fst

Purpose

Convert a string of whitespace-separated symbol pairs into a transducer.

Usage

hfst-string2fst [ -weight=INTEGER ] -input="..." -out=FILE_NAME

E.g.

hfst-string2fst -weight=1 -input="k a ~N:m p:m a n" -out=transducer
is a typical way of using the program. It produces a file named transducer, which contains an SFST- or OpenFst-style transducer with the following state transitions:
0       k       1
1       a       2
2       ~N:m    3
3       p:m     4
4       a       5
5       n       6
final   6

If the program has been compiled using the HFST-library hofst, an additional file symbol_table will be written and the final state of the transducer will have weight 1.

If the library hsfst is used to compile the program, no weight will be added to the transducer (since SFST doesn't support weights). No file symbol_table will be written either.
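The shape of the result can be illustrated with a short Python sketch. It is not the program's implementation and does not use the HFST library: the function names linear_chain and format_chain, the dictionary layout and the tab-separated output are assumptions made only for this illustration. Each whitespace-separated token gets one transition from state i to state i+1, and the last state is final. The optional weight is simply stored with the final state, mirroring the hofst behaviour described above.

```python
def linear_chain(tokens, weight=None):
    """Build a linear chain of states, one transition per token.

    Hypothetical helper for illustration; not part of HFST.
    """
    transitions = [(i, token, i + 1) for i, token in enumerate(tokens)]
    return {
        "transitions": transitions,
        "final": len(tokens),       # the last state is the only final state
        "final_weight": weight,     # None models the weightless hsfst build
    }


def format_chain(chain):
    """Render the chain in the same layout as the listing above."""
    lines = [f"{src}\t{label}\t{dst}" for src, label, dst in chain["transitions"]]
    lines.append(f"final\t{chain['final']}")
    return "\n".join(lines)


chain = linear_chain("k a ~N:m p:m a n".split(), weight=1)
print(format_chain(chain))
```

For the example input this prints six transitions, from state 0 to state 6, followed by the line marking state 6 as final.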

Notes

The input has to be a string of ASCII or UTF-8 characters. Any sequence of characters not containing spaces, tabs, newlines or colons is treated as a symbol in the alphabet of the transducer being constructed (e.g. p). A colon separates the upper (deep) symbol of a pair from its lower (surface) symbol.

A symbol that stands alone, that is, one separated from its context by spaces, newlines or tabs and not joined to another symbol by a colon (e.g. ~N), denotes a pair whose deep symbol and surface symbol are equal (here ~N:~N).

The input string must be delimited by " characters and must not contain any itself, not even escaped ones. So the input strings "a " e " b" and "a \" e \" b" are both illegal, whereas "a e b" is fine.
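The rules in these notes can be summarised in a small Python sketch. It is only an illustration of the input format, not the parser used by hfst-string2fst. The function name parse_pairs is hypothetical, and rejecting tokens such as "a:" or "a:b:c" is an assumption, since the notes do not say how the program treats them.

```python
import re


def parse_pairs(text):
    """Split an input string into (deep, surface) symbol pairs.

    Hypothetical helper for illustration; not part of HFST.
    """
    if '"' in text:
        # The notes forbid " inside the input, even when escaped.
        raise ValueError('input must not contain " characters')
    pairs = []
    # Symbols are separated by spaces, tabs or newlines.
    for token in re.findall(r"[^ \t\n]+", text):
        parts = token.split(":")
        if len(parts) == 1:
            # A lone symbol denotes an identity pair, e.g. ~N means ~N:~N.
            pairs.append((token, token))
        elif len(parts) == 2 and parts[0] and parts[1]:
            pairs.append((parts[0], parts[1]))
        else:
            # Assumption: anything else is treated as malformed.
            raise ValueError(f"malformed pair: {token!r}")
    return pairs


print(parse_pairs("k a ~N:m p:m a n"))
# [('k', 'k'), ('a', 'a'), ('~N', 'm'), ('p', 'm'), ('a', 'a'), ('n', 'n')]
```

Under these assumptions, parse_pairs('a \\" e \\" b') raises a ValueError, matching the restriction on " characters stated above.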

Getting the program

The program is in the CVS-repository on corpus in the directory

/c/appl/ling/koskenni/cvsrepo/hfst-tools/

Building the program

The program is distributed with a Makefile, which you might have to change a bit.

The line

HFSTPATH=../hfst
should be changed to point to the directory where HFST is installed.

To build with the library hsfst instead of the library hofst, comment out the line

INCLUDES=-I$(HFSTPATH)
uncomment the line
#INCLUDES=-I$(HFSTPATH)/ -I$(SFST_INCLUDE_PATH)/
comment out the line
LIBS=-static -L$(HFSTPATH) -l$(OPEN_FST_LIB) -lpthread -lm -ldl
and uncomment the line
#LIBS=-static -L$(HFSTPATH) -l$(SFST_LIB)

-- MiikkaSilfverberg - 14 May 2008
Topic revision: r1 - 2008-05-14 - MiikkaSilfverberg
 