fst-compiler-compose: A Compiler Implementing the Intersecting Composition Operation

General Outline

The program fst-compiler-compose is a compiler-tool intended for two-level grammars. It is built using the SFST-toolbox for finite-state-transducers, developed by Helmut Schmid at the University of Stuttgart.

fst-compiler-compose also includes a test-facility for two-level-grammars. More information about the test-tool may be found on the page OMorFiRegressionTest.

The program is given a lexicon-file and a grammar of two-level-rules. These it compiles into a single transducer. It may additionally be given a test-file, containing pairs of analysis and surface-representations of words, which the grammar should recognize. These pairs are used to test the grammar.

fst-compiler-compose is a temporary program implementing some of the functionality present in the future program htwolc -- an open-source two-level rule-compiler implemented using the HFST interface.


You need to have SFST-1.2 installed.

The CVS-repository on corpus.csc.fi

The program fst-compiler-compose can be obtained from the CVS-repository

from the directory fst-compiler-compose. The directory now contains:
-rw-r--r--  1 silfverb kikosken   899 24. huhti  16:43 binSearch.C
-rw-r--r--  1 silfverb kikosken   309 24. huhti  16:43 binSearch.h
drwxr-xr-x  2 silfverb kikosken  1024  5. touko  17:21 CVS
drwxr-xr-x  3 silfverb kikosken  1024  5. touko  17:21 example
-rw-r--r--  1 silfverb kikosken 17357 28. huhti  09:56 fst-compiler-compose.yy
-rw-r--r--  1 silfverb kikosken 13876 25. huhti  14:22 fst-lex.C
-rw-r--r--  1 silfverb kikosken  1834 24. huhti  16:43 fst-lex.h
-rw-r--r--  1 silfverb kikosken 26098 24. huhti  16:43 fst-regression.C
-rw-r--r--  1 silfverb kikosken  1811 24. huhti  16:43 fst-regression.h
-rw-r--r--  1 silfverb kikosken  2819 28. huhti  23:01 Makefile
-rw-r--r--  1 silfverb kikosken   574 24. huhti  16:43 README
-rw-r--r--  1 silfverb kikosken  6105 25. huhti  14:22 scanner-pck.ll

The files fst-compiler-compose.yy and scanner-pck.ll are the Bison- and Flex-files used to generate the compiler.

The files fst-lex.h and fst-lex.C implement the intersecting composition.

The files fst-regression.C and fst-regression.h implement the test-tool for two-level grammars included in the compiler.

Please report bugs to the e-mail address in README.

The contents of the directory example are documented in the file example/README. The Makefile in the directory example illustrates the use of the compiler and the test-facility.

tpirinen@corpus3 cvs (518) $ CVSROOT=/c/appl/ling/koskenni/cvsrepo/ cvs co fst-compiler-compose
cvs checkout: Updating fst-compiler-compose
cvs checkout: failed to create lock directory for `/c/appl/ling/koskenni/cvsrepo/fst-compiler-compose' (/c/appl/ling/koskenni/cvsrepo/fst-compiler-compose/#cvs.lock): Permission denied
cvs checkout: failed to obtain dir lock in repository `/c/appl/ling/koskenni/cvsrepo/fst-compiler-compose'
cvs [checkout aborted]: read lock failed - giving up

(chown xxx:omorf && chmod 775)

-- TommiPirinen - 06 May 2008

The Makefile

The Makefile supplied (in the directory fst-compiler-compose) might have to be modified, depending on the C++ compiler, the platform and the versions of Flex and Bison you are using. It has been tested using

  • version 3.4.6 of the GNU C++ compiler g++
  • version 2.5.4 of Flex
  • version 1.875c of GNU Bison

on a 32-bit platform Red Hat 3.4.6-9.

It is assumed that you install fst-compiler-compose in a sub-directory of the directory containing your SFST-binaries. If you install it somewhere else, you need to change the line


Installing fst-compiler-compose

Run make in the directory containing the sources for fst-compiler-compose. This will give you the binaries fst-compiler-compose and fst-compiler-compose-utf8. The second compiler is UTF-8 compatible, whereas the first one isn't.

Status of fst-compiler-compose

The program is a temporary tool implementing some of the functionality of a two-level rule-compiler. It will be replaced by something much better. It is actually mostly intended for testing the modules fst-regression and fst-lex.

Using fst-compiler-compose and the test-tool

Usage: fst-compiler-compose-utf8 [ -t test-file ] lexiconfile grammarfile outfile
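
As a hedged illustration of the usage line above, the following shell function wraps the call so that the -t option is passed only when a test-file is given. The function name compile_grammar, the COMPILER variable and the file names in the note below are assumptions made for this sketch; only the argument order is taken from the usage line.

```shell
#!/bin/sh
# Sketch only: compile_grammar and COMPILER are hypothetical names, not part
# of fst-compiler-compose. The argument order follows the usage line.
COMPILER=${COMPILER:-fst-compiler-compose-utf8}

# compile_grammar LEXICON GRAMMAR OUTFILE [TESTFILE]
compile_grammar() {
    if [ "$#" -lt 3 ] || [ "$#" -gt 4 ]; then
        echo "usage: compile_grammar lexiconfile grammarfile outfile [test-file]" >&2
        return 2
    fi
    if [ "$#" -eq 4 ]; then
        # Optional test-file: pass it with -t before the positional arguments.
        "$COMPILER" -t "$4" "$1" "$2" "$3"
    else
        "$COMPILER" "$1" "$2" "$3"
    fi
}
```

With hypothetical file names, compile_grammar lexicon.txt rules.txt out.fst pairs.txt would run fst-compiler-compose-utf8 -t pairs.txt lexicon.txt rules.txt out.fst.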

It is probably worthwhile to study the Makefile in the directory example/.

The UTF-8 version of the compiler doesn't handle Latin-X files. In some cases a Latin-X character may be spotted and an error-message given, but in most cases the compiler just falls apart. If you get weird error-messages, you might have Latin-X encoding in your grammar- or test-file.
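
Since the compiler does not reliably report Latin-X input, it may help to check the encoding of the grammar- and test-files before compiling. The sketch below uses the standard iconv utility for this. The function names is_utf8 and check_encoding are assumptions made for this example, and the check is not part of fst-compiler-compose.

```shell
#!/bin/sh
# Sketch only: is_utf8 and check_encoding are hypothetical helpers, not part
# of the compiler. iconv exits with a non-zero status when the input is not
# valid UTF-8, so its exit status serves as the result of the check.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# check_encoding FILE...: report every file that is not valid UTF-8.
# Returns 0 if all files pass and 1 otherwise.
check_encoding() {
    status=0
    for f in "$@"; do
        if ! is_utf8 "$f"; then
            echo "not valid UTF-8: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

A file that fails the check and is known to be Latin-1 could then be converted with iconv -f ISO-8859-1 -t UTF-8 old-file > new-file (file names hypothetical).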

-- MiikkaSilfverberg - 05 May 2008
Topic revision: r2 - 2008-05-06 - TommiPirinen
Copyright © 2008-2019 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.