HFST: Tool for Regression Testing

see also:

OMorFi: Tool for testing a morphological parser

General outline

The program fst-regression-test is a tool for testing two-level grammars. As arguments, it takes a file containing twol rules and a test-file containing examples of how the rules should work. Each example consists of an analysis form of a word followed by a list of the surface forms corresponding to it (the analysis form belongs to the analysis language of the two-level grammar and the remaining forms to its surface language). The program shows how well the two-level grammar works with respect to the test-file, i.e. which surface forms should be generated but aren't, and which shouldn't be generated but are. The tool also attempts to give helpful diagnostics.

Each pair of an analysis string and a surface string accepted by a two-level grammar has to be accepted by each and every rule in the grammar. Hence it is easy to test why a particular correspondence is not accepted by a two-level grammar: each of the rules in the grammar is applied in turn to the pair of strings, and the rules that disallow the correspondence are singled out. This is what the program fst-regression-test does.

It is often enlightening to know where in a particular pair of strings a two-level rule fails. This is also a part of the diagnostics given by fst-regression-test.
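
The rule-by-rule check and the position diagnostics can be illustrated with a small Python sketch. It does not reflect how fst-regression-test is implemented: each rule is a hand-written predicate over symbol pairs, and the names align, make_e_deletion and failing_rules are hypothetical. It also assumes that a plain e in a rule context stands for the pair e:e.

```python
import re

EPS = "<>"  # the empty symbol, written <> in the grammar and test-file


def align(analysis, surface):
    """Pair the two forms symbol by symbol; "<>" counts as one symbol."""
    a = re.findall(r"<>|.", analysis)
    s = re.findall(r"<>|.", surface)
    if len(a) != len(s):
        raise ValueError("forms must have the same number of symbols")
    return list(zip(a, s))


def make_e_deletion(right_context):
    """Build a check for  e:<> <=> __ right_context  (hypothetical helper)."""
    n = len(right_context)

    def rule(pairs):
        for i, (a, s) in enumerate(pairs):
            if a != "e":
                continue
            in_context = pairs[i + 1:i + 1 + n] == right_context
            # <=> means: e is deleted exactly when the context matches.
            if (s == EPS) != in_context:
                return i  # first position where the rule rejects the pair
        return None  # the rule accepts the whole pair

    return rule


def failing_rules(rules, analysis, surface):
    """Apply every rule in turn; collect (name, position) for each failure."""
    pairs = align(analysis, surface)
    failures = []
    for name, rule in rules:
        position = rule(pairs)
        if position is not None:
            failures.append((name, position))
    return failures


boundary, e = ("#", EPS), ("e", "e")
flawed = [("e-deletion", make_e_deletion([boundary, e, e]))]
fixed = [("e-deletion", make_e_deletion([boundary, e]))]

print(failing_rules(flawed, "eerie#er", "eeri<><>er"))  # [('e-deletion', 4)]
print(failing_rules(fixed, "eerie#er", "eeri<><>er"))   # []
```

Position 4 is the fifth symbol pair, the deleted stem-final e. With the extra e in the context, the rule rejects the deletion at that point.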

If a two-level grammar is leaking, i.e. unwanted surface forms are generated, then every rule in the grammar accepts the unwanted forms, so each of the rules is leaking. This is problematic from the standpoint of diagnostics, since there is no single culprit among the rules for this kind of error. For the time being, fst-regression-test does nothing more than display a list of all the word-forms that are generated by the grammar but not licensed by the test-file.

Usage.

fst-regression-test TEST-FILE GRAMMAR-FILE

The output of the program goes to the standard output stream.

The grammar-file.

The grammar is given as a succession of two-level rules (now in SFST-format). Here is a small example of a grammar for English adjectives (originally written by Helmut Schmid, then modified by Miikka Silfverberg).

ASSIGNMENTS

ALPHABET = [A-Za-z] y:i e:<> #:<>

RULES

NAME "y:i variation"
y:i ^=> __ #:<> e

NAME "e-deletion"
e:<> ^<=>  __ #:<> e e

NAME "t-insertion"
<>:t ^<=>  t __ #:<>

NAME "l-insertion"
<>:l ^<=> l __ #:<>
Between the declarations ASSIGNMENTS and RULES the ALPHABET should be defined (now it is defined as an SFST transducer). Other assignments to ranges of the form #VAR# can also be made between ASSIGNMENTS and RULES. These may be used as variables in the rule transducers.

The two-level rules follow the declaration RULES. Each rule consists of two lines: the first gives the name of the rule in the form NAME "..." and the second gives the rule itself.

The test-file.

The test-file consists of blocks of lines. Each block begins with the analysis (phonological) form of a word, which is followed by one or more surface (phonetic) forms. The forms are separated by single newlines and the blocks by multiple newlines, i.e. by one or more empty lines. An example of a test-file for the grammar above:

plain#
plain<>

eerie#er
eeri<><>er

prickly#est
prickli<>est
The list of surface forms associated with an analysis form in the test-file should be exhaustive. It should contain all of the surface forms that the two-level grammar is to generate from the analysis form (and only those). When fst-regression-test reports no errors, the two-level grammar agrees with the relation given by the test-file.
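
The block structure of the test-file can be read with a few lines of Python. This is a minimal sketch for illustration only; the function name parse_test_file is hypothetical and is not part of the tool.

```python
import re


def parse_test_file(text):
    """Return a list of (analysis_form, [surface_forms]) blocks."""
    if not text.strip():
        return []
    blocks = []
    # Blocks are separated by one or more empty lines.
    for chunk in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in chunk.splitlines() if line.strip()]
        if len(lines) < 2:
            # Every analysis form needs at least one surface form.
            raise ValueError(f"block needs a surface form: {chunk!r}")
        blocks.append((lines[0], lines[1:]))
    return blocks


example = """plain#
plain<>

eerie#er
eeri<><>er

prickly#est
prickli<>est
"""

for analysis, surfaces in parse_test_file(example):
    print(analysis, "->", surfaces)
```

The first line of each block is taken as the analysis form and all remaining lines as its surface forms.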

The output of the program.

The second rule in the grammar above

e:<> ^<=>  __ #:<> e e
is obviously flawed (the second e shouldn't be there, since stem-final e-deletion in English adjectives is triggered by a single e after the morpheme boundary). This will cause the correspondence between the analysis form eerie#er and the surface form eeri<><>er not to be recognized.

The first rule

y:i ^=> __ #:<> e
is flawed as well. The arrow ^=> licenses the correspondence y:i in the context __ #:<> e but doesn't force it, so the correspondence y:y will also be allowed in this context. Hence a superfluous surface form prickly<>est will be generated for the analysis form prickly#est.
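
The difference between licensing and forcing a correspondence can be made concrete with a Python sketch that enumerates candidate surface forms for prickly#est and filters them with the y:i rule. Everything here is an assumption made for illustration: the table REALIZATIONS, the functions y_to_i_ok and generate, and the reading of a plain e in the context as e:e. Setting obligatory=True models a variant of the rule that also forces y:i in the context.

```python
import re
from itertools import product

EPS = "<>"
# Assumed surface realizations per analysis symbol (only what this example
# needs); every other symbol is realized as itself.
REALIZATIONS = {"y": ["y", "i"], "#": [EPS]}


def y_to_i_ok(pairs, obligatory):
    """Check the y:i rule in the context  __ #:<> e  on aligned pairs."""
    context = [("#", EPS), ("e", "e")]
    for i, (a, s) in enumerate(pairs):
        if a != "y":
            continue
        in_context = pairs[i + 1:i + 3] == context
        if s == "i" and not in_context:
            return False  # y:i is licensed only in the context
        if obligatory and in_context and s != "i":
            return False  # the obligatory variant also forces y:i there
    return True


def generate(analysis, obligatory):
    """List the surface forms of an analysis form that the rule accepts."""
    symbols = re.findall(r"<>|.", analysis)
    options = [REALIZATIONS.get(sym, [sym]) for sym in symbols]
    results = []
    for surface in product(*options):
        pairs = list(zip(symbols, surface))
        if y_to_i_ok(pairs, obligatory):
            results.append("".join(surface))
    return results


print(generate("prickly#est", obligatory=False))  # ['prickly<>est', 'prickli<>est']
print(generate("prickly#est", obligatory=True))   # ['prickli<>est']
```
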

With the grammar and test-file above, fst-regression-test gives the following output:

Testing the grammar in the file "easy12.fst" using the test file
"test-file".

The test file includes the analysis forms and surface forms:

plain#
  plain<>

eerie#er
  eeri<><>er

prickly#est
  prickli<>est


THE ANALYSIS FORM plain#

  All surface forms were generated.

  There were no superfluous surface forms generated.


THE ANALYSIS FORM eerie#er

   Rule 2  "e-deletion" isn't working. It fails
   to generate the form:
   eeri<><>er
       ^ THE GENERATION PROCESS STOPS HERE.

  There were superfluous forms generated:
    eerie<>er


THE ANALYSIS FORM prickly#est

  All surface forms were generated.

  There were superfluous forms generated:
    prickly<>est

The program lists each analysis form. Some of the surface forms given for the analysis form in the test-file may not be generated by all of the rules. In that case, diagnostics are given about which rules fail to generate the form and where in the string the error might occur (e.g. the analysis form eerie#er above).

Given an analysis form, there might be surface forms generated by the two-level grammar that don't correspond to any of the surface forms given for the analysis form in the test-file. A list of these is also displayed (see the analysis forms eerie#er and prickly#est above).
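
The two kinds of errors in the report amount to two set differences between the surface forms listed in the test-file and the surface forms the grammar generates. The following Python sketch shows the comparison. The function compare is hypothetical, and the generated forms are the ones implied by the sample output above, not produced by running a grammar.

```python
def compare(expected, generated):
    """Return (missing, superfluous) surface forms, each sorted."""
    expected, generated = set(expected), set(generated)
    return sorted(expected - generated), sorted(generated - expected)


# analysis form -> (forms listed in the test-file, forms generated)
cases = {
    "plain#": (["plain<>"], ["plain<>"]),
    "eerie#er": (["eeri<><>er"], ["eerie<>er"]),
    "prickly#est": (["prickli<>est"], ["prickli<>est", "prickly<>est"]),
}

for analysis, (expected, generated) in cases.items():
    missing, superfluous = compare(expected, generated)
    print(analysis, "missing:", missing, "superfluous:", superfluous)
```
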

-- MiikkaSilfverberg - 27 Mar 2008


-- KristerLinden - 27 May 2008
Topic revision: r2 - 2008-05-29 - KristerLinden
 