HFST: How to install HFST to CSC's servers

This guide describes the installation of HFST on CSC servers, at present taito-shell.csc.fi. You might want to read this guide if you compile HFST yourself and want to make sure you have a working environment. (There is also an LGPL version of HFST available that does not contain the foma or sfst back-ends or the command line tools. It can be installed with the same instructions given on this page, with some modifications that are given in parentheses.)

Installing HFST library

Fetch the newest HFST source tarball hfst-X.Y.Z.tar.gz from https://github.com/hfst/hfst/releases, extract it, go to the directory where you extracted it, and run

autoreconf -i && \
./configure --enable-all-tools --with-readline --enable-fsmbook-tests --disable-static && \
make && make check && make install 

(LGPL package: plain ./configure is probably enough.)

If you wish to install to a directory other than the default one, add the --prefix option to ./configure:

--prefix="full-path-to-hfst-installation-dir"
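When HFST is installed under a non-default prefix, the shell also has to find the installed tools and the shared library. A minimal sketch, assuming the usual autotools layout (bin and lib directly under the prefix); HFST_PREFIX and its default value are illustrative names, not something the HFST build defines:

```shell
# HFST_PREFIX is an assumed example location; use the value you passed
# to ./configure --prefix.
HFST_PREFIX="${HFST_PREFIX:-$HOME/hfst}"

# Prepend the tool directory so that the installed hfst-* commands are found first.
export PATH="$HFST_PREFIX/bin:$PATH"

# Prepend the library directory; the ${VAR:+...} form avoids a trailing ':'
# when LD_LIBRARY_PATH was previously unset or empty.
export LD_LIBRARY_PATH="$HFST_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

These lines would typically go into a shell startup file so that they apply to every session.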

You may have to comment out the following lines in configure after running autoreconf -i:

# remove if not needed
if test "x$with_unicode_handler" != "xglib"; then

fi
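Commenting out these lines can be scripted. The following is only a sketch: comment_out_unicode_check is a hypothetical helper, and it assumes that the block begins with exactly the test line shown above and ends at the next line consisting solely of fi. If the real block contains a nested fi on a line of its own, the range ends too early, so inspect the result (a backup is kept as configure.bak).

```shell
# Hypothetical helper: prefix every line from the with_unicode_handler
# test up to the next line that is exactly "fi" with "# ".
# $1 is the file to edit; a backup copy is written to "$1.bak".
comment_out_unicode_check() {
    sed -i.bak \
        '/^if test "x\$with_unicode_handler" != "xglib"; then$/,/^fi$/ s/^/# /' \
        "$1"
}

# Usage, from the HFST source directory:
#   comment_out_unicode_check configure
```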

(LGPL package: you will get warnings about missing tools when configure is run. They can be ignored.)

During ./configure you will probably get the following warnings, which can safely be ignored:

configure: WARNING: HFST only supports basic unicode handling with limited case mapping tables etc.; for better support consider using glib or ICU --with-unicode-handler
configure: WARNING: Python bindings for HFST are not under autotools; see python/README for instructions about how to build and install them
configure: WARNING: automake version < 1.12; using .h extension for yacc/bison generated header files; if you are building with pre-generated files, modifying them will make building fail, because they use .hh extension

During make, you will probably get warnings like the following, which can be ignored (but should be fixed at some point):

iface.o: In function `view_net':
/wrk/axelson/hfst-3.12.0/back-ends/foma/iface.c:1705: warning: the use of `tempnam' is dangerous, better use `mkstemp'
conflicts: 48 shift/reduce, 23 reduce/reduce
conflicts: 105 shift/reduce, 23 reduce/reduce
conflicts: 524 shift/reduce, 7 reduce/reduce
parsers/.libs/libhfstparsers.a(XfstCompiler.o): In function `hfst::xfst::XfstCompiler::view_net()':
/wrk/axelson/hfst-3.12.0/libhfst/src/parsers/XfstCompiler.cc:3497: warning: the use of `tempnam' is dangerous, better use `mkstemp'
hfst-compiler.cc: In function ‘int yyparse()’:
hfst-compiler.cc:2494:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
       yyerror (YY_("syntax error"));
                                   ^
hfst-compiler.cc:2636:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
   yyerror (YY_("memory exhausted"));
                                   ^
conflicts: 320 shift/reduce, 151 reduce/reduce
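With this much expected noise it is easy to miss a warning that is new. One possible approach is to save the build output and filter out the known messages. In the sketch below, unexpected_warnings is a hypothetical helper and the patterns are substrings taken from the messages quoted above; adjust them if your compiler words things differently.

```shell
# Hypothetical helper: read a build log on stdin and print only the lines
# that mention a warning and are not among the known, ignorable ones.
# The bison "conflicts:" lines do not contain the word "warning" and are
# therefore dropped by the first grep.
unexpected_warnings() {
    grep -i 'warning' | grep -v \
        -e 'is dangerous, better use' \
        -e 'deprecated conversion from string constant' \
        -e 'HFST only supports basic unicode handling' \
        -e 'Python bindings for HFST are not under autotools' \
        -e 'automake version < 1.12' || true
}

# Usage (make.log is an assumed file name):
#   make 2>&1 | tee make.log
#   unexpected_warnings < make.log
```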

Installing Python bindings

We recommend generating the Python bindings for python3, as it offers better support for Unicode characters. Go to the directory python (under the HFST source top directory) and load the modules needed for generating the bindings:

module unload python
module load python/3.4.0
module load swig/3.0

Then generate the bindings (the parameter --local-hfst links against the HFST C++ library located in ../libhfst/src; if you have a compatible version of the HFST C++ library installed on LD_LIBRARY_PATH, this parameter can be omitted):

python3 setup.py build_ext --inplace [--local-hfst]
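To decide whether --local-hfst is needed, you can look for an installed library on LD_LIBRARY_PATH first. A rough sketch: find_libhfst is a hypothetical helper, it assumes the shared library file name starts with libhfst.so, and it only checks that such a file exists, not that its version is compatible.

```shell
# Hypothetical helper: print the first directory in a colon-separated
# search path (default: LD_LIBRARY_PATH) that contains libhfst.so*.
# Returns non-zero if no directory matches.
find_libhfst() {
    search_path=${1-${LD_LIBRARY_PATH-}}
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        IFS=$old_ifs
        [ -n "$dir" ] || continue
        for lib in "$dir"/libhfst.so*; do
            if [ -e "$lib" ]; then
                printf '%s\n' "$dir"
                return 0
            fi
        done
    done
    IFS=$old_ifs
    return 1
}

# Usage:
#   if find_libhfst >/dev/null; then
#       python3 setup.py build_ext --inplace
#   else
#       python3 setup.py build_ext --inplace --local-hfst
#   fi
```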

Swig may print some warnings, for example

libhfst.i:447: Warning 321: 'compile' conflicts with a built-in name in python
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
These should be harmless (although they should be fixed at some point too).

Go to the directory test. Then run the tests:

./test.sh --python python3 --pythonpath PYTHONPATH

where PYTHONPATH is the full path to the parent directory (the one where the bindings were just created).
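Rather than typing the full path by hand, you can let the shell compute it. A small sketch, assuming it is run from the test directory; BINDINGS_DIR is an illustrative variable name:

```shell
# Resolve the parent of the current directory to an absolute path.
# The subshell keeps the current directory unchanged.
BINDINGS_DIR=$(cd .. && pwd)
printf 'Using bindings from %s\n' "$BINDINGS_DIR"

# Then pass it to the test script:
#   ./test.sh --python python3 --pythonpath "$BINDINGS_DIR"
```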

(LGPL package: some cases of test_stream.py are skipped because foma and sfst implementation types are not available. This is as it should be.)

Go back to the directory python and install the Python bindings:

python3 setup.py install

Updating morphologies

Morphologies can be fetched from http://sourceforge.net/projects/hfst/files/resources/morphological-transducers/. There is a script for that. By default, the script installs morphologies to the same place where hfst command line tools are located (determined by calling which hfst-lookup). If you want to install to a different location, edit the script. Currently, the following morphologies are installed on Hippu: english, finnish, french, german, italian, omorfi, swedish, and turkish. (Should english-bnc be added?) All morphologies come in two variants, analyze and generate.
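To see in advance where the script would install by default, you can reproduce its lookup yourself. The sketch below uses command -v instead of which (a substitution made here for portability, not what the script itself does); default_morphology_dir is a hypothetical helper name.

```shell
# Hypothetical helper: print the directory that holds the first
# hfst-lookup found on PATH, or return non-zero if there is none.
default_morphology_dir() {
    tool=$(command -v hfst-lookup) || return 1
    dirname "$tool"
}

# Usage:
#   default_morphology_dir || echo "hfst-lookup is not on PATH" >&2
```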

Installing other tools

hfstospell

Fetch the newest hfstospell source tarball hfstospell-X.Y.Z.tar.gz from https://github.com/hfst/hfst-ospell/releases, extract it, go to the directory where you extracted it, and run

autoreconf -i && ./configure --enable-hfst-ospell-office=no (--prefix="/path/to/hfstospell/installation/") && make && make check && make install

foma

Fetch the source tarball from https://bitbucket.org/mhulden/foma/downloads and unpack it. Modify the prefix on the first line of Makefile, if needed. Then run

make && make install

The tools that get installed are cgflookup, flookup and foma.
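The prefix edit mentioned above can also be scripted. This is a sketch under the assumption that the first line of Makefile has the form prefix = /some/path; set_makefile_prefix is a hypothetical helper, and the new prefix must not contain the characters | or &, which are special in the sed expression.

```shell
# Hypothetical helper: rewrite the prefix assignment on line 1 of a Makefile.
# $1 = path to the Makefile, $2 = new installation prefix.
# A backup copy is written to "$1.bak".
set_makefile_prefix() {
    sed -i.bak "1s|^prefix[[:space:]]*=.*|prefix = $2|" "$1"
}

# Usage, from the unpacked foma source directory (the path is an example):
#   set_makefile_prefix Makefile "$HOME/foma"
```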

xfst

Go to http://web.stanford.edu/~laurik/fsmbook/home.html, follow the link NewSoftware and click Accept. Choose 'Binaries Only' for 'Linux64'. Unpack bin.tar.gz. Manually copy /bin/xfst and the other tools (lexc, lookup, tokenize and twolc) to the location where you want to install them.

Checking the installation

After installing, get the newest HFST check package check-hfst-X.Y.Z.tar.gz (hidden in location http://hfst.github.io/downloads/check-hfst-X.Y.Z.tar.gz), unpack it and run

./check-tools.sh (--prefix full-path-to-hfst-installation-bin) &&
./check-morphologies.sh (--script-prefix full-path-to-morphology-scripts --hfst-tool-prefix full-path-to-hfst-installation-bin) &&
./check-ospell.sh (--prefix full-path-to-hfstospell-installation-bin)

The option --prefix full-path-to-hfst-installation-bin should not be needed if the HFST tools are installed so that they can be called directly from the command line. The option --script-prefix full-path-to-morphology-scripts can be used for testing a local installation of morphologies.


-- ErikAxelson - 2011-03-25
Topic revision: r48 - 2017-09-30 - ErikAxelson
 