LT Research and Development

Current situation in 2006

Multilingualism and the interplay between academic and research parties make reuse and interoperability more difficult and demanding than is customary in other environments. Several aspects need attention:

  • Awareness of existing standards, recommendations and standardization efforts should be promoted.
  • Documentation of the resources and the annotation and coding used in them is vital.
  • Standardized resources and APIs, as well as tools for interchanging and converting data from one format to another, should be readily available.
  • Knowledge and information for integrating LT with other technologies and design disciplines should be easily accessible.
  • The lack of low-cost language resources for most small languages is a major obstacle to both research and development.
  • Lack of cooperation between different research groups is a weakness in the region (both nationally and regionally).

We need stimulating LT research for various application areas. National funding programs should provide the basis, and a Nordic/Baltic framework program for networking could provide the necessary regional infrastructure and communication.

Comments:

  • Preference should be given to research funding that integrates all research groups working in a given area within a country, or across the Nordic area, rather than to a centralized funding approach.
  • Sufficient funding is needed for both long-term (university) research and industrial development.
  • Good progress in the LT field requires support for joint projects and networks at the Nordic level.
  • In addition to open source, we also need open standards and publicly available APIs.

Vision for 2016

In 2016, basic tools and resources are available as open source, thanks to a substantial economic effort by the governments of the Nordic and Baltic countries, and they provide a platform for further innovation and new products. The availability of the necessary language resources improves the quality of LT research and application development, and both can develop freely in several directions in a stimulating research and business environment. Mono- and multilingual LT modules with uniform APIs for a wide array of languages are easy to integrate into software products and services. LT modules are integrated into multimedia systems (e.g. aligned with video systems for video retrieval), and the usability of LT systems is high, so that the citizens of the region can access software-mediated services in their mother tongue. Permanent LT research and development forums have been set up in the larger Nordic countries to support the Nordic and Baltic languages that are smaller in both economic and human terms. Public funding of research and development projects requires that the projects either make the publicly funded results openly available or contribute resources to an ongoing open source software project.

Recommendations

The academic funding institutions ought to adopt recommendations or rules concerning linguistic resources that will be (or have been) developed using public funding. It ought to be a normal requirement that researchers make their linguistic resources (e.g. tools and annotated corpora) available to the rest of the research community under conditions or licenses that are as free as possible. All Nordic countries ought to share the goal of collecting, producing and making available linguistic resources on terms that allow both academic use and the use of the resources for creating language technology products, even commercial ones, provided that the resources are used within the limits of copyright law. In addition, we may need to open up publicly funded language resources at all levels (lexicons, grammars, written language corpora, speech corpora, etc.). Common interfaces and tools should be created in cooperation between commercial and academic parties.

| Key Area | Magnitude of funding needed | Parties involved | Mode of cooperation |
| Recommendations for research result materials | 50 kEUR | funding organizations, universities, NEALT | working groups |
| Joint effort for standardization | 15 MEUR | universities and industry | Academia/industry collaboration |
| Basic technology research | 15 MEUR | Universities | Joint programme, Researcher exchange, workshop, division of research tasks |
| R&D Funding | 50-80 MEUR | Universities, Research institutes, industry | Nordic projects |

The R&D funding can be broken down further into various fields of services and applications for society:

  • (statistical) machine translation and automatic methods for multilingual information processing
  • information retrieval
    • public information tools adapted to the mobile life of users
    • cross-language information retrieval (CLIR) tools, focused CLIR tools for recent immigrants
    • bioinformatics
  • speech technology in multimodal applications
  • language learning

| Key Area | Magnitude of funding needed | Parties involved | Mode of cooperation |
| Several | 5-10 MEUR per area | public bodies, research partners, industry | projects |

-- KristerLinden - 12 Jun 2006

Topic revision: r7 - 2006-06-19 - KristerLinden