HFST: Python bindings

This page explains how to install and compile the HFST Python interface. The interface works with Python version 3 (tested and developed with versions 3.4 and higher), and most features are also supported for Python version 2 (tested with 2.7). For information on the interface, see the Doxygen-generated documentation on our GitHub pages.

(Option 1) Installing the Debian package on Linux

Fetch the newest release (named python3-libhfst for Python version 3 and python-libhfst for Python version 2) and install it with

dpkg --install  python[3]-libhfst_***.deb

When choosing the right package, the command

lsb_release -a

might be helpful. It prints something like

 No LSB modules are available.
 Distributor ID: Ubuntu
 Description:    Ubuntu 12.04.4 LTS
 Release:        12.04
 Codename:       precise

In this example, the Codename line shows that the right package has the form *~precise_*.deb.

The command file /usr/bin/file is one way to check whether your system is 64-bit or 32-bit. It will print something like:

/usr/bin/file: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked ...

In the case above, a package ending in amd64 is the right choice.
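The two checks above can be combined in a small script. The following is a minimal sketch and not part of HFST: the helper names are hypothetical, and the file name pattern is an assumption pieced together from the package names, the *~precise_*.deb form and the amd64 suffix mentioned above.

```python
import platform

# Machine names as reported by the kernel, mapped to Debian architecture names.
DEB_ARCH = {"x86_64": "amd64", "i686": "i386", "i386": "i386"}


def parse_codename(lsb_output):
    """Return the value of the 'Codename:' line of `lsb_release -a` output, or None."""
    for line in lsb_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Codename":
            return value.strip()
    return None


def package_pattern(codename, machine, python3=True):
    """Build a shell-style pattern for the .deb file (assumed naming scheme)."""
    name = "python3-libhfst" if python3 else "python-libhfst"
    arch = DEB_ARCH.get(machine, machine)
    return "{}_*~{}_{}.deb".format(name, codename, arch)


# Sample output copied from the example above.
sample = """Distributor ID: Ubuntu
Description:    Ubuntu 12.04.4 LTS
Release:        12.04
Codename:       precise
"""
print(package_pattern(parse_codename(sample), platform.machine()))
```

On a real system the text passed to parse_codename would come from running lsb_release -a. Compare the printed pattern against the file names on the release page before downloading anything.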

(Option 2) Installing via PyPI to OS X or Windows

Python bindings are available via PyPI (Python Package Index) as wheels for Python versions 2.7, 3.4, 3.5 and 3.6 for Windows and OS X. The Windows wheels are 64-bit and the OS X ones 32/64-bit (universal binaries). The name of the package is hfst. Version numbering follows that of HFST in general: every time a new release x.y.z of HFST is made, there will also be a new release of the Python bindings on PyPI. Because the bindings are still being developed, they will probably need releases more often than HFST itself, at least in the beginning. The bindings therefore use a four-part numbering scheme: release x.y.z of HFST corresponds to release x.y.z.0 of the bindings, possibly followed by x.y.z.1, x.y.z.2 and so on until the next HFST release.
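To illustrate the numbering scheme, here is a small sketch that splits a binding version into the HFST release it belongs to and the binding revision. The function names and the version numbers are made up for the example and do not refer to actual releases.

```python
def split_binding_version(version):
    """Split 'x.y.z.n' into the HFST release 'x.y.z' and the binding revision n."""
    parts = version.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError("expected a four-part version such as '3.12.1.0'")
    return ".".join(parts[:3]), int(parts[3])


def newest_binding(versions):
    """Pick the newest version by comparing the components as numbers."""
    return max(versions, key=lambda v: tuple(int(p) for p in v.split(".")))


# Illustrative version strings only.
print(split_binding_version("3.12.1.2"))
print(newest_binding(["3.12.1.0", "3.12.1.10", "3.12.1.9"]))
```

Comparing the components as integers rather than as strings matters once a revision reaches two digits: as text, "10" would sort before "9".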

pip is the preferred installer program. Starting with Python versions 2.7.9 and 3.4.0, it is included by default so you can call it via python with the switch -m pip. Just run

python[3] -m pip install [--upgrade] hfst

You can also call pip directly:

pip[3] install [--upgrade] hfst
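If the installation is driven from a Python script, using the -m pip form with the running interpreter helps ensure that the package ends up in the same Python that will import it. A minimal sketch follows; the function name is hypothetical, and the command is only printed, not executed.

```python
import sys


def pip_install_command(package="hfst", upgrade=False):
    """Return the argument list for 'python -m pip install [--upgrade] <package>'."""
    # sys.executable is the interpreter running this script; fall back to "python".
    command = [sys.executable or "python", "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command.append(package)
    return command


# Only print the command here; pass the list to subprocess.check_call to run it.
print(" ".join(pip_install_command(upgrade=True)))
```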

Troubleshooting

If the pip installation downloads the source package and starts to build it from scratch instead of using the available wheel, try upgrading pip:

python[3] -m pip install --upgrade pip

or

pip[3] install --upgrade pip

If there isn't a wheel available for your operating system, pip will compile the hfst package from scratch. In that case, you need a recent version of setuptools:

python[3] -m pip install --upgrade setuptools

or

pip[3] install --upgrade setuptools

You also need swig (tested with version 3.0) and a C++ compiler.
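Before letting pip build from source, it can save time to check that the build tools are on PATH. The sketch below is one way to do that. The executable names are assumptions (compilers and swig are installed under different names on different systems), and the helper is not part of HFST.

```python
import shutil


def missing_tools(candidates, which=shutil.which):
    """Return the labels for which none of the listed executables is found on PATH."""
    return [label for label, names in candidates
            if not any(which(name) for name in names)]


# Assumed executable names; adjust them to match the local toolchain.
REQUIRED = [
    ("swig", ["swig", "swig3.0"]),
    ("C++ compiler", ["g++", "clang++", "c++"]),
]

print("Missing:", missing_tools(REQUIRED) or "nothing")
```

The which parameter exists only so that the lookup can be replaced when testing; in normal use the default shutil.which is enough.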

(Option 3) Installing from source on Linux (or OS X)

This section shows how to install the HFST C++ library and the Python bindings globally and assumes that you have the rights to do that on the computer you are working on. For local installation, see HfstPython#Local_installation for additional information.

Compiling and installing HFST C++ library

Requirements:

C/C++ compiler: C++11 features must be supported. Tested with gcc 4.6.3 and g++ 4.8.2 and clang (Apple LLVM version 7.0.2, clang-700.1.81).
swig: Tested with 3.0.
python: Tested with python3.4.
lex/flex: Needed only if .ll or .yy files are modified.
yacc/bison: Needed only if .ll or .yy files are modified.

Fetch the newest HFST source tarball hfst-X.Y.Z.tar.gz from https://github.com/hfst/hfst/releases, extract it, go to the directory where you extracted it, and first run

autoreconf -i

You may have to comment out the following lines in the configure script (which was generated when autoreconf was run) so that it will work:

# remove if not needed
if test "x$with_unicode_handler" != "xglib"; then

fi

Then configure HFST. If you are interested in using only the Python interface, you probably want to disable all command line tools and their tests as well as the readline library with:

./configure --enable-no-tools --enable-fsmbook-tests=no --with-readline=no

Any warnings about missing tools can be ignored when configure is run, if --enable-no-tools was requested.

Then compile HFST:

make

Then check that HFST works as expected:

make check

Note that most tests are performed with command line tools, so only a few tests will be run when --enable-no-tools is defined. (There are some warnings about missing tools during make check, but they can be ignored.) However, the Python interface that we compile next has more tests that make sure that everything works as it should.

Then install HFST:

make install

(Note that the python bindings will by default use the HFST library found on LD_LIBRARY_PATH. If there is an earlier version of HFST installed on that path, make sure the current version of HFST is listed first on the path.)
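To see which HFST installation would be picked up first, the directories on LD_LIBRARY_PATH can be scanned in order. This is only a sketch: the function name is hypothetical, the file name prefix libhfst. is an assumption about how the library file is named, and the dynamic linker also searches locations outside LD_LIBRARY_PATH that are not checked here.

```python
import os


def dirs_with_libhfst(path_value):
    """List the directories of a PATH-style string that contain a libhfst file, in search order."""
    found = []
    for directory in path_value.split(os.pathsep):
        # Skip empty entries and entries that are not directories.
        if not directory or not os.path.isdir(directory):
            continue
        # "libhfst." is an assumed prefix of the shared library file name.
        if any(name.startswith("libhfst.") for name in os.listdir(directory)):
            found.append(directory)
    return found


hits = dirs_with_libhfst(os.environ.get("LD_LIBRARY_PATH", ""))
print("First match:", hits[0] if hits else "no libhfst found on LD_LIBRARY_PATH")
```

If more than one directory is listed, the first one is the one that matters, so the current version should appear there.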

Installing Python bindings

We recommend generating the Python bindings for python3, as it offers better support for Unicode characters. The bindings are in the directory python under the top directory of the HFST source. To generate them you need python and swig as well as a recent version of python's setuptools package. We have used python 3.4 and swig 3.0 for generating and testing.

Go to directory python and first build the bindings:

python3 setup.py build_ext --inplace

Swig may print some warnings, for example

libhfst.i:447: Warning 321: 'compile' conflicts with a built-in name in python

These warnings should be harmless, although they should be fixed at some point.

For some version combinations of swig and python, you need to make HfstException and its subclasses available to Python with the following hack:

sed -i 's/class HfstException(_object):/class HfstException(Exception):/' libhfst.py

Go to directory python/test. Then run the tests:

./test.sh --python python3 --pythonpath PYTHONPATH

where PYTHONPATH is the full path to the parent folder (the one where the bindings were just created).

Then install the Python bindings. Go back to directory python and run:

python3 setup.py install

(You may have to run the following again

sed -i 's/class HfstException(_object):/class HfstException(Exception):/' libhfst.py

and then manually install just the file libhfst.py, because the installation command probably regenerates libhfst.py.)

Local installation

HFST C++ library

If you wish to install the C++ side of HFST in a directory other than the default one, add the option --prefix to ./configure:

--prefix="full-path-to-hfst-installation-dir"

In this case, you must add "full-path-to-hfst-installation-dir" to LD_LIBRARY_PATH so that python will find HFST library:

    LD_LIBRARY_PATH="full/path/to/hfst/installation/dir:"$LD_LIBRARY_PATH

If you do not want to install HFST, it is also possible to link to the built C++ library (that is created in the folder ../libhfst/src/) from python by hard-coding its path. This is done by changing the following line in setup.py:

extra_link_arguments = []

to

extra_link_arguments = ["-Wl,-rpath=" + absolute_libhfst_src_path + "/.libs"]
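The value of absolute_libhfst_src_path has to be an absolute path. One way to derive it is sketched below, under the assumption that the library was built in libhfst/src next to the python directory, as described above. The function name and the example checkout location are made up.

```python
import os


def rpath_link_arguments(python_dir):
    """Return linker arguments pointing at the uninstalled library in <top>/libhfst/src/.libs."""
    # abspath also removes the ".." component from the joined path.
    absolute_libhfst_src_path = os.path.abspath(
        os.path.join(python_dir, "..", "libhfst", "src"))
    return ["-Wl,-rpath=" + absolute_libhfst_src_path + "/.libs"]


# Example with a made-up checkout location.
print(rpath_link_arguments("/home/user/hfst/python"))
```

In setup.py the argument would typically be the directory containing setup.py itself, for example os.path.dirname(os.path.abspath(__file__)).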

Python bindings

If you do not want to install Python bindings, you can also use them locally. Either add the absolute path to folder python to PYTHONPATH, e.g. by executing

    PYTHONPATH="path/to/hfst-top-dir/python:"$PYTHONPATH

or do the following in Python before importing hfst:

    import sys
    sys.path.insert(1, 'path/to/hfst-top-dir/python')

Known bugs

Some version combinations of SWIG and Python make HFST exception classes subclasses of Python's _object instead of Exception. Then you will get an error like

    TypeError: catching classes that do not inherit from BaseException is not allowed

If this is the case, run

    sed -i 's/class HfstException(_object):/class HfstException(Exception):/' libhfst.py

after build/installation to be able to use HfstException and its subclasses in Python.
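The same substitution can be done in Python, which may be convenient on systems where sed -i behaves differently (BSD sed on OS X expects a backup suffix argument after -i). This is a sketch with hypothetical helper names. It performs the same textual replacement as the sed command and nothing more.

```python
OLD = "class HfstException(_object):"
NEW = "class HfstException(Exception):"


def patch_exception_base(source):
    """Return the module source with HfstException deriving from Exception."""
    return source.replace(OLD, NEW)


def patch_file(path):
    """Rewrite the file in place; return True if anything changed."""
    with open(path, encoding="utf-8") as handle:
        original = handle.read()
    patched = patch_exception_base(original)
    if patched != original:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(patched)
    return patched != original


# Demonstration on a string; call patch_file("libhfst.py") to patch the real file.
print(patch_exception_base("class HfstException(_object):\n    pass\n"))
```

Running patch_file twice is harmless: the second call finds nothing to replace and leaves the file untouched.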

Using the bindings

After installation, the bindings should be usable in Python with the command:

  import hfst
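If the import fails, it helps to find out whether Python can locate the package at all. The following sketch uses the standard library's importlib.util.find_spec (available in Python 3.4 and later); the function name is hypothetical.

```python
import importlib.util


def bindings_available(name="hfst"):
    """Return True if a top-level module or package with this name can be found."""
    return importlib.util.find_spec(name) is not None


if bindings_available():
    print("hfst can be located")
else:
    print("hfst was not found on sys.path; check the installation or PYTHONPATH")
```

A positive result only means that the package was located. Problems with the underlying C++ library, such as a wrong LD_LIBRARY_PATH, show up only when import hfst is actually executed.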

The python interface

C++ side functions and classes are wrapped with SWIG under module 'libhfst'. It is possible to use this module directly, but there is a package named 'hfst' which encapsulates the libhfst module in a more user-friendly manner. The structure of the package is

  • hfst
    • hfst.exceptions
    • hfst.sfst_rules
    • hfst.xerox_rules

The module hfst.exceptions contains HfstException and its subclasses. The modules hfst.sfst_rules and hfst.xerox_rules have functions that create transducers implementing two-level and replace rules. All other functions and classes are in module hfst.

For more information and examples, see the Doxygen-generated documentation. You can also use the help and dir commands in Python, e.g.

dir(hfst)
help(hfst.HfstTransducer)
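The output of dir also contains names that start with an underscore, which are usually not of interest when exploring a package. A small filter is sketched below. A stand-in module is used so that the example runs without hfst installed; with the bindings installed you would pass hfst or one of its submodules instead.

```python
import types


def public_names(module):
    """Return the names listed by dir() that do not start with an underscore."""
    return [name for name in dir(module) if not name.startswith("_")]


# Stand-in module for demonstration; the attribute names are placeholders.
demo = types.ModuleType("demo")
demo.HfstTransducer = object
demo._private = 1
print(public_names(demo))
```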

-- ErikAxelson - 2016-05-27

Topic revision: r16 - 2017-05-27 - ErikAxelson
 