Perform matching/transformation on text streams with an RTN (recursive transition network) system.


The help message:

Usage: hfst-pmatch [OPTIONS...] TRANSDUCER
perform matching/lookup on text streams

Common options:
  -h, --help             Print help message
  -V, --version          Print version info
  -v, --verbose          Print verbosely while processing
  -q, --quiet            Only print fatal errors and requested output
  -s, --silent           Alias of --quiet
  -n  --newline          Newline as input separator (default is blank line)
  -x  --extract-tags     Only print tagged parts in output
  -l  --locate           Only print locations of matches
  -p  --profile          Produce profiling data
Use standard streams for input and output.

Report bugs to <hfst-bugs@helsinki.fi> or directly to our bug tracker at:
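
The options listed in the help message can be combined on one command line. The following is a minimal sketch, not output from a real run: it assumes a compiled ruleset named streets.hfst exists in the current directory, and the wrapper function `pmatch_tags_only` and the sample sentences are made up for illustration. The function skips quietly when hfst-pmatch or the ruleset is missing.

```shell
# Hypothetical wrapper: feed one sentence per line and print only tagged parts.
# Assumption: streets.hfst is a compiled pmatch ruleset in the current directory.
pmatch_tags_only() {
  if ! command -v hfst-pmatch >/dev/null 2>&1 || [ ! -f streets.hfst ]; then
    echo "hfst-pmatch or streets.hfst not found; skipping" >&2
    return 0
  fi
  # --newline: treat each line as a separate input (default separator is a blank line)
  # --extract-tags: only print the tagged parts of the output
  hfst-pmatch --newline --extract-tags streets.hfst
}

printf '%s\n' \
  "Je marche seul dans l'avenue des Ternes" \
  "Il habite rue de la Paix" | pmatch_tags_only
```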


Given an input file streets.txt for hfst-pmatch2fst:


define CapWord UppercaseAlpha Alpha* ;
define StreetWordFr [{avenue} | {boulevard} | {rue}] ;
define DeFr [ [{de} | {du} | {des} | {de la}] Whitespace ] | [{d'} | {l'}] ;
define StreetFr StreetWordFr (Whitespace DeFr) CapWord+ ;
regex StreetFr EndTag(FrenchStreetName) ;
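
To get a feel for which strings this ruleset is meant to accept, the rules can be roughly approximated with a POSIX extended regular expression. This is only an illustrative sketch and not how pmatch works internally: the names `street_re` and `match_street` are made up here, `[[:alpha:]]` and `[[:upper:]]` depend on the locale and may not coincide with pmatch's Alpha and UppercaseAlpha, and `grep -o` is a common extension rather than part of POSIX.

```shell
# Rough ERE counterpart of StreetFr:
#   StreetWordFr              -> (avenue|boulevard|rue)
#   (Whitespace DeFr)         -> optional group with "de ", "du ", "des ", "de la ", "d'", "l'"
#   CapWord+                  -> one or more capitalised words
street_re="(avenue|boulevard|rue)([[:space:]]((de|du|des|de la)[[:space:]]|d'|l'))?([[:upper:]][[:alpha:]]*)+"

# Hypothetical helper: print the part of a sentence that the approximation matches.
match_street() {
  printf '%s\n' "$1" | grep -oE "$street_re"
}

match_street "Je marche seul dans l'avenue des Ternes"   # prints: avenue des Ternes
```

Unlike this sketch, the pmatch ruleset also wraps the match in tags via EndTag(FrenchStreetName).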

An interactive session with hfst-pmatch might look like:

hfst-pmatch streets.pmatch

> Je marche seul dans l'avenue des Ternes
Je marche seul dans l'<FrenchStreetName>avenue des Ternes</FrenchStreetName>

A pipeline version of the same looks as follows:

hfst-pmatch2fst < streets.txt > streets.hfst

echo "Je marche seul dans l'avenue des Ternes" | hfst-pmatch -v streets.hfst 

Je marche seul dans l'<FrenchStreetName>avenue des Ternes</FrenchStreetName>
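
Tagged output like the line above is easy to post-process with standard tools. The sketch below uses a hypothetical helper, `extract_tag`, that pulls the text out of a named tag with sed. It assumes at most one tagged span per line (with several, only the last is printed) and a tag name without regex metacharacters. The --extract-tags option listed in the help message is the built-in way to print only tagged parts.

```shell
# Hypothetical helper: $1 = tag name, $2 = one line of hfst-pmatch output.
# Prints the text between <tag> and </tag>, or nothing if the tag is absent.
extract_tag() {
  printf '%s\n' "$2" | sed -n "s/.*<$1>\(.*\)<\/$1>.*/\1/p"
}

# Output line copied from the pipeline example above
extract_tag FrenchStreetName \
  "Je marche seul dans l'<FrenchStreetName>avenue des Ternes</FrenchStreetName>"
# prints: avenue des Ternes
```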

See also