hfst-diff-test

Purpose

Test the regular relation defined by a cascade of one or more transducers. The transducers are applied one after another to each input string to produce a set of output strings, and this result is compared to a predefined set of output strings. The input strings and the expected output strings are given in a test file.

Each pair of an input form and a set of output forms in the test file constitutes a test. The test succeeds if the predefined set of output forms is the same as the set of output forms obtained by applying the transducer cascade to the input form. Otherwise it fails.

When a test fails, the program reports whether forms were missing, whether there were extra forms, or both. If the program is run in verbose mode (-v or --verbose), it also lists the missing forms and the extra forms.

If no test file is given, the program runs in interactive mode: it prompts for input forms, applies the transducer cascade to each form and displays the resulting output forms.

Usage

[ cat FST_FILE | ] ./hfst-diff-test [ OPTIONS ] [ FST_FILE ]

Parameters

Parameters common to all command-line programs:

-h, --help Print help message
-V, --version Print version info
-v, --verbose Print verbosely while processing
-q, --quiet Do not print output
-s, --silent Alias of --quiet

Parameters common to all command-line programs that take one input stream and write text as output:

-i, --input=FILENAME Read input transducer from FILENAME
-o, --output=FILENAME Write output to text-file FILENAME
-R, --read-symbols=FILENAME Read symbol table from FILENAME

Parameters specific to hfst-diff-test:

-d, --debug Display debug information (mainly for development).
-r, --relation=FILE The file with input strings and corresponding output string sets.
-S, --spaces Give symbol-pairs in the input strings separated by spaces.
-t, --test=TTYPE Use test type TTYPE when comparing result sets (see Test types)

Notes

  • The test forms are given in a test file. The test file should contain input forms and the corresponding output forms. The forms should be grouped into blocks, so that each input form is followed by its output forms, one form per line. The blocks should be separated by blank lines (empty or whitespace only).
  • The order of the output forms is not significant, but the input form should always be the first form of a block.
  • The test form file is supplied using the option -r (--relation).
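The block structure described above can be sketched in a few lines of Python. This is only an illustration of the file format, not the parser used by hfst-diff-test; the function name parse_test_file is hypothetical, and stripping surrounding whitespace from each form is an assumption.

```python
def parse_test_file(text):
    """Split test-file text into (input_form, set_of_output_forms) pairs.

    Blocks are separated by blank (whitespace-only) lines. The first line
    of a block is the input form, the remaining lines are output forms.
    Stripping surrounding whitespace is an assumption of this sketch.
    """
    tests = []
    block = []
    for line in text.splitlines():
        if line.strip():
            block.append(line.strip())
        elif block:
            # A blank line closes the current block.
            tests.append((block[0], set(block[1:])))
            block = []
    if block:
        # The last block need not be followed by a blank line.
        tests.append((block[0], set(block[1:])))
    return tests


sample = """dog+Noun+Pl
dogs

volcano+Noun+Pl
volcanos
volcanoes
"""

tests = parse_test_file(sample)
for input_form, output_forms in tests:
    print(input_form, sorted(output_forms))
```

The output forms are stored in a set because their order within a block is not significant.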

Test types

The parameter -t specifies the type of test used to compare the result set R with the reference set E of expected results. There are currently three test types.

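exactly
Tests that E = R. If not, prints E \ R and R \ E when verbose mode is on.
at-least
Tests that E \ R is empty, i.e. that every form in E is produced. If not, prints E \ R when verbose mode is on.
none
Tests that E & R is empty, i.e. that no form in E is produced. If not, prints E & R when verbose mode is on.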
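The three comparisons can be written down with ordinary set operations. The following Python sketch only mirrors the definitions above and is not the program's implementation; the function name compare and the labels "missing", "extra" and "forbidden" are made up for this illustration, and the result sets used in the last two calls are hypothetical.

```python
def compare(test_type, expected, result):
    """Compare reference set E (expected) with result set R (result).

    Returns (passed, report), where report maps a label to the set of
    forms that verbose mode would print for the given test type.
    """
    if test_type == "exactly":
        # E = R: nothing may be missing and nothing may be extra.
        report = {"missing": expected - result, "extra": result - expected}
    elif test_type == "at-least":
        # E \ R is empty: every expected form is produced.
        report = {"missing": expected - result}
    elif test_type == "none":
        # E & R is empty: no form in E is produced.
        report = {"forbidden": expected & result}
    else:
        raise ValueError("unknown test type: " + test_type)
    passed = not any(report.values())
    return passed, report


# glass+Noun+Pl from the example below: E = {glasses}, R = {glasss}.
print(compare("exactly", {"glasses"}, {"glasss"}))
print(compare("none", {"glasses"}, {"glasss"}))
# Hypothetical result set with one extra form.
print(compare("at-least", {"volcanos"}, {"volcanos", "volcanoes"}))
```

With the type none, E acts as a list of forms that must not be generated, so a test of this type can pass even when the cascade produces other forms.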

Testing a generator for English noun-forms

Consider the small HfstLexC example lexicon english.nouns.lexc below. It is meant to generate the singular and plural forms of the nouns dog, volcano and glass. Since its only mechanism for doing this is to append an s to the end of word stems, it will not generate the correct plural forms volcanoes and glasses, but it will generate the incorrect plural form glasss.

english.nouns.lexc:

Multichar_Symbols

+Noun +Pl +Sg

LEXICON Root
        dog+Noun:dog         Number ;
        volcano+Noun:volcano Number ;
        glass+Noun:glass     Number ;

LEXICON Number
        +Sg:    # ;
        +Pl:s   # ;

We compile the lexicon using the command:

cat english.nouns.lexc | hfst-lexc > english.nouns.lexc.hfst

We use the file english.nouns.test as a diff-test of the lexicon.

english.nouns.test:

dog+Noun+Sg
dog

volcano+Noun+Sg
volcano

glass+Noun+Sg
glass

dog+Noun+Pl
dogs

volcano+Noun+Pl
volcanos
volcanoes

glass+Noun+Pl
glasses

We run the test in verbose mode and store the results in the file english.nouns.test.results:

hfst-diff-test -v -i english.nouns.lexc.hfst -r english.nouns.test -o english.nouns.test.results

The result file is self-explanatory.

english.nouns.test.results:

>> dog+Noun+Sg
OK

>> volcano+Noun+Sg
OK

>> glass+Noun+Sg
OK

>> dog+Noun+Pl
OK

>> volcano+Noun+Pl
MISSING FORMS
volcanoes

>> glass+Noun+Pl
MISSING FORMS
glasses
EXTRA FORMS
glasss
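To see why the last two tests fail, the behaviour of the lexicon can be imitated in Python. The sketch below is a toy model written for this page: it does not use HFST, the names STEMS, NUMBER, generate and report are hypothetical, and the report layout simply copies the result file shown above.

```python
# Toy model of english.nouns.lexc: every stem takes the same number endings.
STEMS = ["dog", "volcano", "glass"]
NUMBER = {"+Sg": "", "+Pl": "s"}


def generate(analysis):
    """Return the set of surface forms the toy lexicon gives for an analysis."""
    forms = set()
    for stem in STEMS:
        for tag, suffix in NUMBER.items():
            if analysis == stem + "+Noun" + tag:
                forms.add(stem + suffix)
    return forms


def report(input_form, expected):
    """Format one test result in the style of the verbose result file."""
    result = generate(input_form)
    missing = sorted(expected - result)
    extra = sorted(result - expected)
    lines = [">> " + input_form]
    if not missing and not extra:
        lines.append("OK")
    if missing:
        lines.append("MISSING FORMS")
        lines.extend(missing)
    if extra:
        lines.append("EXTRA FORMS")
        lines.extend(extra)
    return "\n".join(lines)


print(report("dog+Noun+Pl", {"dogs"}))
print()
print(report("volcano+Noun+Pl", {"volcanos", "volcanoes"}))
print()
print(report("glass+Noun+Pl", {"glasses"}))
```

Because the model only ever appends s, volcanoes and glasses are never produced, and glasss appears as an extra form.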

Obtaining the program and installing

hfst-diff-test is a part of HfstCommandLineTools.


-- MiikkaSilfverberg - 2009-03-12
Topic revision: r8 - 2009-10-02 - ErikAxelson