Tools in the Language Bank: XML tools

The following tools exist on the Corpus server:
  • xmllint (/usr/bin) - With this program you can process an XML document in various ways. For Language Bank use, the most important options are probably --format, which re-indents the XML document; --encode encoding, which converts the XML document to the character encoding encoding (e.g. UTF-8 or ISO-8859-1); and --loaddtd, which ensures that the DTD of the Language Bank XML format is loaded, so that xmllint does not behave strangely because the DTD is missing.
  • xmln - This program converts an XML file into PYX format, which is simply an encoding in which structural information and data are each placed on their own lines, so that the file can be processed with ordinary Unix tools (grep, awk, sort, etc.). The xmln program is useful if, for example, you want to compute frequency data directly from morphosyntactically annotated XML corpus files without programming. Examples of using xmln with ordinary Unix tools
  • tei2snt (/l/appl/ling/contrib/bin/)
  • ict (/mnt/corpus/appl/ling/tools) .... has bad library paths, see README
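
To make the PYX idea behind xmln concrete, here is a minimal sketch in Python. It is not xmln itself and only approximates its output: it uses the standard xml.sax module to emit one line per event, with "(" for a start tag, "A" for an attribute, "-" for character data and ")" for an end tag, and then counts attribute values much as one would with grep, sort and uniq. The sample sentence, the element names s and w, the attribute name lemma, and the function names to_pyx and lemma_frequencies are all assumptions made for illustration; they are not taken from the Language Bank XML format.

```python
import xml.sax
from collections import Counter

# Hypothetical sample: element and attribute names are illustrative only.
SAMPLE = (
    '<s><w lemma="talo">talossa</w>'
    '<w lemma="olla">on</w>'
    '<w lemma="talo">talo</w></s>'
)


class PyxHandler(xml.sax.ContentHandler):
    """Collect SAX events as PYX-style lines, one event per line."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def startElement(self, name, attrs):
        self.lines.append("(" + name)
        for key in attrs.getNames():
            self.lines.append("A" + key + " " + attrs.getValue(key))

    def endElement(self, name):
        self.lines.append(")" + name)

    def characters(self, content):
        # Escape backslashes and newlines so every event stays on one line.
        escaped = content.replace("\\", "\\\\").replace("\n", "\\n")
        self.lines.append("-" + escaped)


def to_pyx(xml_text):
    """Return a list of PYX-style lines for the given XML string."""
    handler = PyxHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.lines


def lemma_frequencies(pyx_lines, attribute="lemma"):
    """Count the values of one attribute, like grep '^Alemma ' | sort | uniq -c."""
    prefix = "A" + attribute + " "
    return Counter(
        line[len(prefix):] for line in pyx_lines if line.startswith(prefix)
    )


pyx_lines = to_pyx(SAMPLE)
print("\n".join(pyx_lines))
print(lemma_frequencies(pyx_lines).most_common())
```

Because every event sits on its own line, the filtering step is just a prefix test, which is why line-oriented tools are enough for this kind of frequency count. For the sample above, the count is 2 for talo and 1 for olla.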

Traditional Kielipankki tools (/l/appl/ling/tools/) original documentation

Please enter your wishes below:

 
  • Support for editing XML, XSL and XHTML files is definitely needed. Emacs with nXML mode is quite usable but this mode is not yet installed at Corpus, see http://www.thaiopensource.com/nxml-mode/ -- KimmoKoskenniemi - 10 May 2006 - 10:26
  • xsltproc would be the standard tool for converting XML files into other XML forms, into HTML or into text. It is restricted to XSLT 1.0 transformations, which means, among other things, that a source file yields only one result file. But xsltproc is easy to use and efficient (compared to the Java-based systems). -- KimmoKoskenniemi - 10 May 2006 - 10:12
  • SSAX - SSAX is a full-featured, algorithmically optimal, pure-functional parser that can act as a stream processor. The SSAX functional XML parsing framework consists of a DOM/SXML parser, a SAX parser, and a supporting library of lexing and parsing procedures. The procedures in the package can be used separately to tokenize or parse various pieces of XML documents. The framework supports XML Namespaces, character, internal and external parsed entities, attribute value normalization, processing instructions and CDATA sections. The package includes a semi-validating SXML parser: a DOM-mode parser that is an instantiation of a SAX parser (called SSAX). This SourceForge project offers tools to convert between angle-bracket notation and a more efficient S-expression-based notation for markup documents, and to manipulate and query XML data in Scheme. The main components of the project are SSAX, SXML, SXPath, and SXSLT. See also XML and S-Expressions.

-- AnssiYliJyra - 05 Sep 2006

Topic revision: r3 - 2006-12-19 - AnssiYliJyra
 
Copyright © 2008-2019 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.