hfst-tag
The tagger programs are part of the HFST (University of Helsinki Finite State Transducer interface) finite-state toolkit distribution; a tool that creates weighted transducers for suffix-based guessing. This tool is licensed under the GNU GPL version 3 (other licences may be available on request). The licence can be found in the file COPYING.
Installation
Configure hfst using --enable-tagger.
Usage
The tool tags text read from STDIN using a tagger that has been compiled with HfstTrainTagger. From its training data file, HfstTrainTagger produces two files, some_tagger_name.lex and some_tagger_name.seq (currently it also produces an auxiliary file, some_tagger_name). HfstTag is invoked as follows:
hfst-tag -i some_tagger_name
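For scripted use, the invocation above can be wrapped in a small Python helper. This is a minimal sketch and not part of HFST: the function names build_command and tag_text are hypothetical, and the code assumes that hfst-tag is on the PATH and that the tagger files produced by HfstTrainTagger are reachable under the given name. Only the -i option shown above is used.

```python
import subprocess


def build_command(tagger_name):
    """Return the argument list for running hfst-tag with the given tagger.

    tagger_name is the common prefix of the .lex and .seq files,
    e.g. "some_tagger_name".
    """
    return ["hfst-tag", "-i", tagger_name]


def tag_text(tagger_name, text):
    """Send text (one token per line) to hfst-tag on stdin and return its stdout.

    Assumption: hfst-tag is installed and on the PATH. A missing binary
    raises FileNotFoundError; a non-zero exit status raises
    subprocess.CalledProcessError.
    """
    result = subprocess.run(
        build_command(tagger_name),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Keeping the command construction in its own function makes it possible to inspect or log the exact command line without running the tagger.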
The input data should contain one word (or other token, such as a comma) per line. Sentences should be separated by empty lines. E.g.
This
is
a
sentence
.

This
is
another
sentence
.
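The input format described above can be produced from already tokenized text with a few lines of Python. This is a sketch under stated assumptions: the function name to_tagger_input is hypothetical, tokenization is assumed to have been done beforehand, and ending the output with a single newline is a choice made here rather than a documented requirement of hfst-tag.

```python
def to_tagger_input(sentences):
    """Render tokenized sentences in the one-token-per-line input format.

    sentences is a list of sentences, each a list of token strings.
    Tokens are written one per line; sentences are separated by an
    empty line. Empty sentences are skipped.
    """
    blocks = ["\n".join(tokens) for tokens in sentences if tokens]
    if not blocks:
        return ""
    # Assumption: a single trailing newline terminates the last token line.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    example = [
        ["This", "is", "a", "sentence", "."],
        ["This", "is", "another", "sentence", "."],
    ]
    print(to_tagger_input(example), end="")
```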
The program prints the tagged input to STDOUT. Words and tags are separated by tabs. E.g.
This DT
is VBZ
a DT
sentence NN
. .
This DT
is VBZ
another DT
sentence NN
. .
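The tab-separated output can be read back into a data structure for further processing. The following is a minimal sketch: the function name parse_tagged_output is hypothetical, and the code assumes that sentence boundaries, if present in the output, appear as empty lines. If the output contains no empty lines, all tokens end up in a single sentence.

```python
def parse_tagged_output(text):
    """Parse word<TAB>tag lines into a list of sentences.

    Each sentence is a list of (word, tag) tuples. Empty lines are
    treated as sentence separators (an assumption about the output).
    A line without a tab yields an empty tag.
    """
    sentences = []
    current = []
    for line in text.splitlines():
        if not line.strip():
            # An empty line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        word, _, tag = line.partition("\t")
        current.append((word, tag))
    if current:
        sentences.append(current)
    return sentences
```

Using partition rather than split keeps the parser from raising an error on a malformed line; such lines can be detected afterwards by checking for an empty tag.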
Help message
Usage: hfst-tag [OPTIONS...] [INFILE]
Tag a text file given from stdin using an hfst tagger.
Common options:
-h, --help Print help message
-V, --version Print version info
-v, --verbose Print verbosely while processing
-q, --quiet Only print fatal errors and requested output
-s, --silent Alias of --quiet
Input/Output options:
-i, --input=INFILE Read input transducer from INFILE
-o, --output=OUTFILE Write output transducer to OUTFILE
Report bugs to <hfst-bugs@helsinki.fi> or directly to our bug tracker at:
<https://sourceforge.net/tracker/?atid=1061990&group_id=224521&func=browse>
-- MiikkaSilfverberg - 2012-08-28