Difference: HfstInteractiveToolsTutorial (1 vs. 8)

Revision 8 - 2016-05-20 - ErikAxelson

Line: 1 to 1
 
META TOPICPARENT name="HfstAllPages"

HFST: Tutorial for Interactive HFST Tools

Line: 9 to 9
 

Examples

Added:
>
>
 All tools can be invoked from the command line and, by default, take input from the user. Pressing Ctrl+C exits the program.
Added:
>
>

hfst-lookup

  hfst-lookup is the simplest of the tools: it looks up words in a transducer file FILE, as shown below (lines beginning with > are user input):
Line: 31 to 36
  For each word, the tool prints the results it can find in the transducer file and their weights. If a word is not found, +? is appended to the result and an infinite weight inf is printed.
Added:
>
>

hfst-proc

 hfst-proc is a similar tool, but designed for text streams.
Line: 81 to 91
 
Added:
>
>

hfst-xfst

 hfst-xfst is the most complex of these three tools. Below is a simple example of looking up words using hfst-xfst; lines beginning with the prompt hfst[N]: or apply up> are user input. Note that the option --print-weight must be specified to see weights. Pressing Ctrl+D in apply up mode exits that mode and returns to normal mode, where pressing Ctrl+C or typing exit exits the program.

Revision 7 - 2016-02-26 - ErikAxelson

Line: 1 to 1
 
META TOPICPARENT name="HfstAllPages"

HFST: Tutorial for Interactive HFST Tools

Line: 196 to 196
  -- ErikAxelson - 2014-02-24
Changed:
<
<
META PREFERENCE name="VIEW_TEMPLATE" title="VIEW_TEMPLATE" type="Set" value="FinCLARIN.ViewFinClarinTemplate"
>
>
META PREFERENCE name="VIEW_TEMPLATE" title="VIEW_TEMPLATE" type="Set" value="FinCLARIN.ViewFinClarinWideTemplate"

Revision 6 - 2016-02-25 - ErikAxelson

Line: 1 to 1
 
META TOPICPARENT name="HfstAllPages"

HFST: Tutorial for Interactive HFST Tools

Line: 182 to 193
  -- ErikAxelson - 2014-02-24

META PREFERENCE name="VIEW_TEMPLATE" title="VIEW_TEMPLATE" type="Set" value="FinCLARIN.ViewFinClarinTemplate"

Revision 5 - 2016-02-24 - ErikAxelson

Line: 1 to 1
 
META TOPICPARENT name="HfstAllPages"

HFST: Tutorial for Interactive HFST Tools

Line: 183 to 183
 
-- ErikAxelson - 2014-02-24
Added:
>
>
META PREFERENCE name="VIEW_TEMPLATE" title="VIEW_TEMPLATE" type="Set" value="FinCLARIN.ViewFinClarinTemplate"

Revision 4 - 2014-03-26 - ErikAxelson

Line: 1 to 1
 
META TOPICPARENT name="HfstAllPages"

HFST: Tutorial for Interactive HFST Tools

Line: 9 to 9
 

Examples

Changed:
<
<
All tools can be invoked from the command line and, by default, take input from the user. hfst-lookup is the simplest of the tools: it looks up words in a transducer file FILE, as shown below (lines beginning with > are user input):
>
>
All tools can be invoked from the command line and, by default, take input from the user. Pressing Ctrl+C exits the program.

hfst-lookup is the simplest of the tools: it looks up words in a transducer file FILE, as shown below (lines beginning with > are user input):

 
hfst-lookup FILE
Line: 19 to 21
 > dog
dog chien 2.000000
Added:
>
>
> DOG
DOG DOG+? inf
 > mouse
mouse mouse+? inf
Line: 32 to 37
 ^Cat/Chat~1~$ ^Dog/Chien~2~$ ^mouse/*mouse$, ^cat/chat~1~$ ^DOG/CHIEN~2~$.
Changed:
<
<
Note that the tool by default recognizes words in upper and lower case. Weights are printed only if --show-weights is used.
>
>
Note that the tool by default recognizes words in upper and lower case. Weights are printed only if --show-weights is used; the CG format, however, does not support weights. The output format can be controlled through options:

| *option* | *explanation* |
| -p, --apertium | Apertium output format for analysis (default) |
| -C, --cg | Constraint Grammar output format for analysis |
| -x, --xerox | Xerox output format for analysis |

The Apertium format keeps all punctuation and whitespace characters as they were in the input; the CG and Xerox formats discard them:

echo "Cat Dog mouse, cat DOG." | hfst-proc FILE --cg --print-weights
"<Cat>"
        "chat"
"<Dog>"
        "chien"
"<mouse>"
        "*mouse"
"<cat>"
        "chat"
"<DOG>"
        "chien"

echo "Cat Dog mouse, cat DOG." | hfst-proc FILE --xerox --print-weights
Cat             1
Cat     Chat

Dog             2
Dog     Chien

mouse   +?

cat             1
cat     chat

DOG             2
DOG     CHIEN

hfst-xfst is the most complex of these three tools. Below is a simple example of looking up words using hfst-xfst; lines beginning with the prompt hfst[N]: or apply up> are user input. Note that the option --print-weight must be specified to see weights. Pressing Ctrl+D in apply up mode exits that mode and returns to normal mode, where pressing Ctrl+C or typing exit exits the program.

hfst-xfst --print-weight
hfst[0]: load stack FILE
hfst[1]: apply up
apply up> cat
chat    1.00000
apply up> dog
chien   2.00000
apply up> Dog
???
apply up> mouse
???
apply up> [user presses Ctrl+D here]
hfst[1]: exit

You can find more hfst-xfst examples here.

Optimized lookup (OL) format

HFST has a special transducer format designed for fast lookup: the optimized lookup (OL) format. hfst-proc supports only transducers in OL format. hfst-lookup also supports transducers in other formats, but is much faster with OL transducers. hfst-xfst offers many operations, most of which are not implemented for the OL format; the lookup operation, however, also works with OL transducers.

How to find out which format a transducer is in

The tool hfst-format can be used:

hfst-format transducer.ofst
Transducers in transducer.ofst are of type OpenFST, std arc, tropical semiring
hfst-format transducer.ol
Transducers in transducer.ol are of type Hfst's lookup optimized, weighted

How to convert between transducer formats

The tool hfst-fst2fst can be used:

hfst-fst2fst --format optimized-lookup-weighted transducer.ofst > transducer.ol
hfst-fst2fst --format openfst-tropical transducer.ol > transducer.ofst
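
The check and the conversion above can be combined in a script. The following is only a sketch: the function names is_ol_description and ensure_ol are hypothetical, and the test for OL format assumes that the hfst-format description contains the words "lookup optimized", as in the sample output above. Only the hfst-format and hfst-fst2fst invocations already shown are used.

```shell
# Decide from an hfst-format description whether a transducer is in OL format.
# Assumption: OL descriptions contain "lookup optimized", as in the sample above.
is_ol_description() {
    case "$1" in
        *"lookup optimized"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Write a weighted OL version of SRC to DST (hypothetical helper).
# Copies the file unchanged if hfst-format already reports OL format.
ensure_ol() {
    src=$1
    dst=$2
    if is_ol_description "$(hfst-format "$src")"; then
        cp "$src" "$dst"
    else
        hfst-fst2fst --format optimized-lookup-weighted "$src" > "$dst"
    fi
}

# Demo on the two descriptions quoted above; this part needs no HFST installation.
for d in "Transducers in transducer.ofst are of type OpenFST, std arc, tropical semiring" \
         "Transducers in transducer.ol are of type Hfst's lookup optimized, weighted"; do
    if is_ol_description "$d"; then
        echo "OL"
    else
        echo "not OL"
    fi
done
```

With HFST installed, ensure_ol transducer.ofst transducer.ol would then prepare a file for hfst-proc, which requires OL format.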
 
Changed:
<
<
Examples of hfst-xfst.
>
>
In hfst-xfst, there are special commands to convert between formats.
 

Interactive vs. non-interactive mode

Line: 62 to 149
 2 states, 1 arcs
Changed:
<
<
In hfst-xfst, there is an apply up mode in which the user can look up words in a transducer; pressing CTRL+D exits that mode. Because this is difficult in scripts (and possibly in the Windows Command Prompt in general), the special string <ctrl-d> is reserved for this purpose. Below is an example in interactive mode:
>
>
In hfst-xfst, there is an apply up mode in which the user can look up words in a transducer; pressing Ctrl+D exits that mode. Because this is difficult in scripts (and possibly in the Windows Command Prompt in general), the special string <ctrl-d> is reserved for this purpose. Below is an example in interactive mode:
 
$ hfst-xfst

Revision 3 - 2014-03-25 - ErikAxelson

Line: 1 to 1
 
META TOPICPARENT name="HfstAllPages"

HFST: Tutorial for Interactive HFST Tools

Line: 23 to 23
 mouse mouse+? inf

Changed:
<
<
For each word, the tool prints the results it can find in the transducer file and their weights. If a word is not found, +? is appended to the result and an infinite weight inf is printed.
>
>
For each word, the tool prints the results it can find in the transducer file and their weights. If a word is not found, +? is appended to the result and an infinite weight inf is printed.
  hfst-proc is a similar tool, but designed for text streams.
Changed:
<
<
echo -e "Cat Dog mouse, cat DOG." | hfst-proc FILE --show-weights
>
>
echo "Cat Dog mouse, cat DOG." | hfst-proc FILE --show-weights
 ^Cat/Chat~1~$ ^Dog/Chien~2~$ ^mouse/*mouse$, ^cat/chat~1~$ ^DOG/CHIEN~2~$.
Changed:
<
<
Note that the tool by default recognizes upper and lower case words.
>
>
Note that the tool by default recognizes words in upper and lower case. Weights are printed only if --show-weights is used.

Examples of hfst-xfst.

 

Interactive vs. non-interactive mode

Revision 2 - 2014-03-25 - ErikAxelson

Line: 1 to 1
 
META TOPICPARENT name="HfstAllPages"

HFST: Tutorial for Interactive HFST Tools

Line: 7 to 7
 Interactive HFST tools include hfst-xfst, hfst-proc and hfst-lookup.
Added:
>
>

Examples

All tools can be invoked from the command line and, by default, take input from the user. hfst-lookup is the simplest of the tools: it looks up words in a transducer file FILE, as shown below (lines beginning with > are user input):

hfst-lookup FILE
> cat
cat     chat    1.000000

> dog
dog     chien   2.000000

> mouse
mouse   mouse+? inf

For each word, the tool prints the results it can find in the transducer file and their weights. If a word is not found, +? is appended to the result and an infinite weight inf is printed.
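
Because unknown words are marked so regularly, they are easy to filter out of saved output. The sketch below assumes that the three columns (input, result, weight) are separated by tab characters; the function name unknown_words is hypothetical, and the sample data is copied from the session above.

```shell
# List the input words for which no analysis was found.
# Assumption: output lines have the form input<TAB>result<TAB>weight.
unknown_words() {
    awk -F'\t' '$3 == "inf" && $2 ~ /\+\?$/ { print $1 }'
}

# Sample output from the session above, with tabs between the columns.
printf 'cat\tchat\t1.000000\n\ndog\tchien\t2.000000\n\nmouse\tmouse+?\tinf\n' | unknown_words
```

The same filter could be applied to the output of a real hfst-lookup run. Whether prompt characters also appear in piped output depends on the mode the tool runs in, so that use is untested here.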

hfst-proc is a similar tool, but designed for text streams.

echo -e "Cat Dog mouse, cat DOG." | hfst-proc FILE --show-weights
^Cat/Chat~1~$ ^Dog/Chien~2~$ ^mouse/*mouse$, ^cat/chat~1~$ ^DOG/CHIEN~2~$.

Note that the tool by default recognizes upper and lower case words.
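
In the output line above, the word that was not found appears as ^mouse/*mouse$. Assuming that every token has the form ^surface/analysis$ and that an analysis starting with * always means "not found" (an inference from this one example), the unknown words can be extracted as sketched below. The function name unknown_tokens is hypothetical, and grep -o is a common extension rather than part of POSIX.

```shell
# Print the surface form of each token marked as unknown.
# Assumptions: tokens look like ^surface/analysis$, and an analysis
# beginning with * marks a word that was not found.
unknown_tokens() {
    grep -o '\^[^$]*\$' | sed -n 's|^\^\([^/]*\)/\*.*\$$|\1|p'
}

# Sample line copied from the hfst-proc output above.
printf '%s\n' '^Cat/Chat~1~$ ^Dog/Chien~2~$ ^mouse/*mouse$, ^cat/chat~1~$ ^DOG/CHIEN~2~$.' | unknown_tokens
```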

 

Interactive vs. non-interactive mode

All three tools can be given user input through standard input (the default) or from a file (which must be specified with an option or command-line parameter):

Revision 1 - 2014-02-24 - ErikAxelson

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="HfstAllPages"

HFST: Tutorial for Interactive HFST Tools

Interactive HFST tools include hfst-xfst, hfst-proc and hfst-lookup.

Interactive vs. non-interactive mode

All three tools can be given user input through standard input (the default) or from a file (which must be specified with an option or command-line parameter):

| *option/parameter* | *explanation* |
| hfst-xfst --scriptfile=FILE | Read commands from FILE, and quit |
| hfst-xfst --startupfile=FILE | Read commands from FILE on startup |
| hfst-proc transducer_file [input_file] | Read input from input_file |
| hfst-lookup --input-strings=SFILE | Read lookup strings from SFILE |

When reading user input from standard input, hfst-xfst and hfst-lookup distinguish between interactive mode (the default) and pipe mode: in interactive mode, both tools print a prompt unless the option --silent or --quiet is used. An example with hfst-xfst in interactive mode:

$ hfst-xfst
hfst[0]: regex foo:bar::3;
2 states, 1 arcs
hfst[1]:

and the same in pipe mode:

$ echo "regex foo:bar::3;" | hfst-xfst --pipe-mode
2 states, 1 arcs

In hfst-xfst, there is an apply up mode in which the user can look up words in a transducer; pressing CTRL+D exits that mode. Because this is difficult in scripts (and possibly in the Windows Command Prompt in general), the special string <ctrl-d> is reserved for this purpose. Below is an example in interactive mode:

$ hfst-xfst
hfst[0]: regex foo:bar;
2 states, 1 arcs
hfst[1]: apply up
apply up> foo
bar
apply up> [user presses ctrl+d here]
hfst[1]: echo done
done
hfst[1]: exit
.

and the same in pipe mode:

$ echo -e "regex a:b;
apply up
a
<ctrl-d>
echo done
exit" | hfst-xfst --pipe-mode
2 states, 1 arcs
b
done
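
For scripts that look up many words, the command stream can be generated instead of typed into one long echo -e string, whose handling of escapes differs between shells. The sketch below uses printf; the function name xfst_lookup_commands is hypothetical, FILE is a placeholder transducer name, and the load stack command is the one used in the hfst-xfst lookup example of this tutorial.

```shell
# Emit an hfst-xfst command stream that looks up each given word.
# First argument: transducer file; remaining arguments: words to look up.
xfst_lookup_commands() {
    file=$1
    shift
    printf 'load stack %s\n' "$file"
    printf 'apply up\n'
    for word in "$@"; do
        printf '%s\n' "$word"
    done
    printf '<ctrl-d>\n'   # reserved string that leaves apply up mode
    printf 'exit\n'
}

# Show the generated commands; this part needs no HFST installation.
xfst_lookup_commands FILE cat dog
```

With HFST installed, the stream would be piped into the tool as in the example above: xfst_lookup_commands FILE cat dog | hfst-xfst --pipe-mode.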

-- ErikAxelson - 2014-02-24
 