
pc-parse


PC-PARSE, a set of tools for morphological and syntactic analysis

Description

PC-Parse is a collection of tools for morphological and syntactic analysis. It includes AMPLE (an Item-and-Arrangement morphological analyzer), PC-KIMMO (a two-level morphological analyzer and generator), PC-PATR (a unification-based syntactic parser), and tools to help with translation.

Parts:

  • pc-kimmo - Two-level morphological analyzer
  • pc-patr - Syntactic parser
  • ktext - Text analysis with the PC-KIMMO parser
  • ktagger - A part-of-speech tagger based on PC-KIMMO
  • ample - A morphological Item-and-Arrangement parser for linguistic exploration
  • anadiff
  • intergen
  • stamp - Morphological transfer and synthesis for adapting text to a related language
  • tonegen - Allows modeling autosegmental tonology with STAMP
  • tonepars - Allows modeling autosegmental tonology with AMPLE
  • convlex
  • xample

Version and Copyright Information

version: v. 20051207

  • PC-KIMMO version 2.1.13
  • PC-PATR version 1.3.13
  • KTEXT version 2.1.8
  • KTAGGER version 1.0.10
  • AMPLE version 3.10.1
  • ANADIFF version 1.0.6
  • INTERGEN version 2.2.0
  • STAMP version 2.2.1
  • ToneGen version 1.0b20
  • TonePars version 1.0.19

copyright:

Usage

GETTING STARTED WITH PC-KIMMO

Here are instructions for trying out PC-KIMMO with Englex, a PC-KIMMO description of English morphology (rules, lexicon, grammar). Englex is in the pckimmo/test/eng subdirectory.

After getting the englex archive and unpacking it, go to the englex subdirectory and edit the file englex.tak. Fix the file paths for your local system using either absolute or relative paths. Here is one strategy: move the englex.tak file out of the Englex subdirectory into the directory just above it and modify the paths like this:

        load rules englex/english.rul
        load lexicon englex/english.lex
        load grammar englex/english.grm
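
If you would rather generate the relocated take file than edit it by hand, the three load commands above can be produced by a short script. This is a minimal Python sketch and not part of PC-Parse: the function name `take_file_text` and its parameter are hypothetical, and it assumes the Englex files keep the names shown above.

```python
def take_file_text(englex_dir: str) -> str:
    """Return take-file text that loads the Englex rules, lexicon, and grammar.

    englex_dir is the path (absolute or relative) that PC-KIMMO should use
    to reach the Englex files. The function name is illustrative only.
    """
    # The three load commands and file names come from the instructions above.
    targets = [
        ("rules", "english.rul"),
        ("lexicon", "english.lex"),
        ("grammar", "english.grm"),
    ]
    lines = [f"load {kind} {englex_dir}/{name}" for kind, name in targets]
    return "\n".join(lines) + "\n"
```

Saving the result of `take_file_text("englex")` as englex.tak in the directory above the englex subdirectory should match the strategy described above.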

Launch PC-KIMMO and at the prompt type:

take englex

(The Take command expects .tak as the default file extension.)

Now use the Recognize command to recognize (parse) words. For example, at the prompt type:

recognize foxes

The command keyword "recognize" can be shortened to "r". Better, at the prompt type "recognize" (or "r") and press return. A special "recognizer" prompt will appear. Now you can keep typing words without repeating the "recognize" command. Note: use only lower case letters and no punctuation (except apostrophe and hyphen).
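
The input restriction in the note above (lower case letters only, with apostrophe and hyphen as the only punctuation) can be applied to a word list before feeding it to the recognizer prompt. The following is a rough Python sketch, assuming plain ASCII English words; the helper name `clean_word` is hypothetical and is not a PC-KIMMO facility.

```python
import re


def clean_word(raw: str) -> str:
    """Lowercase a word and drop characters the recognizer note excludes.

    Keeps the letters a-z, apostrophes, and hyphens; everything else
    (digits, other punctuation, whitespace) is removed.
    """
    lowered = raw.lower()
    # The trailing hyphen inside the character class is a literal hyphen.
    return re.sub(r"[^a-z'-]", "", lowered)
```

For example, `clean_word("Foxes,")` gives `foxes`, which can then be typed at the recognizer prompt.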

GETTING STARTED WITH PC-PATR

Here are instructions for trying out PC-PATR with the supplied toy English grammar.

The directory doc/pcpatr/english contains a toy English sentence grammar and lexicon. This grammar can also be used with Englex, a morphological description of English for PC-KIMMO (see above). Englex will provide a morphological parse of words on the fly, thus building up a word lexicon as you parse sentences. Note that you do not have to have the stand-alone PC-KIMMO executable in order to use Englex with PC-PATR; PC-KIMMO is built into PC-PATR.

Go to the doc/pcpatr/english subdirectory and start PC-PATR:

 
        % cd doc/pcpatr/english
        % pcpatr
        PC-PATR>take english

(The Take command expects .tak as the default file extension.)

Now use the Parse command to parse sentences. For example, at the prompt type:

PC-PATR>parse uther stormed cornwall

The command keyword "parse" can be shortened to "p". Better, at the prompt type "parse" (or "p") and press return. A special "parse" prompt will appear. Now you can keep typing sentences without repeating the "parse" command. Note: use only lower case letters and no punctuation (except apostrophes and hyphens inside words). Try sentences such as these (but remember that you are limited to words that are in the lexicon):

        uther sleeps
        the knights sleep
        uther storms cornwall
        the brave knights have stormed cornwall
        i sleep
        he sleeps
        he was sleeping
        he slept
        he has slept
        i see him
        he sees me
        i was seen
        i was seen by him
        i was seen by him clearly
        i saw the man with a telescope
        the tall man on the hill saw me with a telescope
        i saw uther before he stormed cornwall

USING PC-PATR WITH ENGLEX

Obtain Englex and install it (see the PC-KIMMO instructions above). Go to the doc/pcpatr/english subdirectory under PC-PATR and edit the file english2.tak, adjusting the file paths in it for your local system.

        % emacs english2.tak
or
        % vi english2.tak

Fix the first three lines starting with "load kimmo" so that they point to the Englex files. For example, if the englex subdirectory is on the same level as the doc and src subdirectories the default paths would be okay:

 
        load kimmo rules   ../../../englex/english.rul
        load kimmo lexicon ../../../englex/english.lex
        load kimmo grammar ../../../englex/english.grm

Note that the "kimmo mapping" file is found in the doc/pcpatr/english directory:

load kimmo mapping english2.map
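
Before starting PC-PATR, it can help to confirm that every file named in the take file actually exists. The following is a minimal Python sketch, not a PC-PATR feature. The function name `missing_load_targets` is hypothetical, and the sketch assumes that each load command is spelled out in full, ends with a single file path containing no spaces, and that relative paths are resolved against the directory in which PC-PATR is started.

```python
from pathlib import Path
from typing import List


def missing_load_targets(take_text: str, base_dir: str = ".") -> List[str]:
    """Return the file arguments of "load" lines that do not exist on disk.

    Assumption: relative paths are resolved against base_dir, taken here
    to be the directory PC-PATR is started in.
    """
    missing = []
    for line in take_text.splitlines():
        words = line.split()
        # Only inspect load commands that have a keyword and a file argument.
        if len(words) < 3 or words[0].lower() != "load":
            continue
        target = words[-1]
        # Joining with an absolute target leaves the target unchanged.
        if not (Path(base_dir) / target).is_file():
            missing.append(target)
    return missing
```

Calling `missing_load_targets(Path("english2.tak").read_text())` from the doc/pcpatr/english directory lists any paths that still need fixing; an empty list means all referenced files were found.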

Now you can parse sentences as described above. The difference is that any words you use which are not in the word lexicon (i.e. english.lex) are parsed by Englex. When you are done parsing sentences, you can save the modified word lexicon using the "Save Lexicon" command.

Help, Manuals and Documentation

help commands:

further information:
See:

Bugs

License Text

The software is available under the SIL Language Freeware End User License Agreement. The full license text is available at http://www.sil.org/computing/catalog/freeware.html.

Other Information

Field of science: Linguistics

Available:
corpus

License: LicenseTypeAFreeSiteLicense

To be copied to: https://wwwk.csc.fi/english/research/software/pc-parse
To be seen at: http://www.csc.fi/english/research/software/pc-parse
See also: KitWiki.SuomenKielipankki:Dev:Linguistics_Software, Old.ToolResources
The users may add their own comments to: ToolResource_pc-parse_Comments

Topic revision: r10 - 2008-11-21 - HennaRiikkaLaitinen
 
Copyright © 2008-2019 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.