Wednesday March 28:


09:30 - 15:00: WP Overview (WP Interdependencies by Peter)

WP0 - WP7 (see documents on the wiki)

15:00 - 18:00: Overview of drafting and administrative tasks

to produce the proposal & task allocation

Lunch: if and when we feel we deserve it

Thursday March 29:


09:30 - 10:30: State of affairs in the various countries

10:30 - 12:30: Relationship between tasks funded by EC and national funding agencies

12:30 - 14:00: Lunch

14:00 - 16:00: External affairs: relationships with other projects, other organisations, international affairs




  • funders board: (twice a year)
    • budget decisions: not funding suggestions
    • legal framework issues
    • decision taking group, use in WP 1, also as a decision organ
    • (EC wants to have a presence here) give stamp of approval
    • STRATEGIC decisions about fund allocations: large-scale

clarin is not supposed to be a project, but should live beyond the funding phase. it can only do so through national funding. the funders board should exist in order to secure long-term support from national forces, who would then be much more involved in matters of clarin

substantial scientific representation on this board (? 3 people) at start of project, for contract negotiations: (don't want german money to go to other countries) involve funders board

  • acceptance for scientific suggestions: get a paper from scientists to agree upon. decisions must agree with national policies.

overall control of preparatory phase.

  • scientific (supervisory) board: (more than 20)
    • composed of scientific people
    • decisions about objectives
    • standards decisions
    • need chairperson, 2-3 vice chairs

  • executive (scientific) board
    • do the work, present something to the scientific board
    • TACTICAL decisions about fund allocations: detailed-scale

  • must have statement about members' meeting (80 meetings), provide a platform, e.g., a meeting at LREC, or EACL/ key conferences. (possibly in dissemination WP)

  • PW: self reflection: periodic auditing: to make sure we are going where we planned to go, etc.
  • MW: bring in sociology of science experts who will evaluate the organization (may look better on the proposal).


specification of the common legal framework to be adopted. will have strong link with WP7.

NC: updating strategic plan, with objectives, milestones, to keep ourselves informed.

PW: legal framework/agreements of legal sort for the operation of the infrastructure, (e.g., exchange passwords may interfere w european law)

(NOT IPR issues, copyrights of individual resources, that goes to WP7.)

KK: agreements among organizations: spans both types of legal work (WP1 and WP7)

PW: is there duplication with humanities interoperation -- but here we are at POLICY level, not at detail level.


6 objectives corresponding to 6 tasks:

task 1: (includes a "recruitment" plan, to find proper people.)

MW: ? technical infrastructure: no mention of advice/training, the "soft" side, that is in WP6. ought to have more interlinking between these.

EH: (much follow up funding will depend on what we do now) one bad outcome: disconnected resources, not much use/visibility. another bad outcome: infrastructure but no cars on the road. must make it attractive for people to make their resources available. need to make sure infrastructure is well connected to tools and dissemination process. --> links to WP5.

KK: much will depend on AUTHORIZATION, even the specification of infrastructure. person from WP7 will need to be present on WP2.

TV: what is the concept we address here as INFRASTRUCTURE: must clarify labels/concepts/terms. infrastructure is too broad/general, call this WP "TECHNICAL INFRASTRUCTURE".

KK: DAMLR scheme may be appropriate for concordancing, but too slow for massive processing of large corpora. would like to see that task 5 does not EXCLUDE the other view (centralized/mirrored) as an alternative solution. e.g., we know CSC is more willing to participate if both aspects are covered.

WP3: Humanities Projects

EH: ? say we have 5 proposals from malta, who makes the decision on the outcome of this call? is it in WP4 (selection committee), or scientific board, or who?

PW: as long as there is national money, it's up to the national boards to decide. from our side, they must be clarin compliant, else they cannot join. WP3 will establish criteria, evaluate proposals, make suggestion to executive board, scientific board has to decide. if they want to join clarin they must be clarin compliant. as soon as they ask for EU money, they must be clarin-compliant.

TV: ? are we setting aside money in this WP3 preparatory phase to finance R&D support for this.

PW: no, must go through national money. (but we may have more national money than will be covered by matching funds. that money may be usable for this. do not want to make too definitive statements now, work this out together with funders, leave the decisions to the boards).

NC: we need some example projects to prove the concept: should be funded by EU money, (possibly in part national) a few prototypical cases.

EH: who would join the project with no carrot? it is also in our interest that we have some showcases. need to put some money there. there should be national (or other FP7) carrots, but they will be eaten by the nation providing the carrots. there should also be some EU carrots, which will be usable for countries that have good showcase projects, but no funds to cover integration into clarin.

there is agreement that some money should be set aside for interfacing with such exemplary, international projects.

MP: should the calls be internal or external?

target audience is humanities, should be open. a language-resource/technology institute should be involved.

should focus on national money: e.g., if germans say we have interesting project, and we have funds, who will complain?

WP4: Exploring Humanities


it is about communicating, understanding each other.

identify potential collaboration partners to integrate the communities. this is in the large! WP3 deals with specific cases, interfacing CLARIN with projects.

therefore WP4 must start somewhat before WP3.

WP3 is more about concrete solutions: BRICKS is WP3, DALOS is WP4, for which we can provide support immediately, to validate the concept.

WP4 is more about (building) communities, on a broader scale. (SK: only meetings, discussions, etc. whereas WP3 is implementation).

PW: add to task e: training applies to TWO DIFFERENT communities: humanities (going out to humanities conferences), and our own people, language people, training them.

  • MW: may include performance arts (drama) - may be interested in language resources.

links w/WP6 (dissemination)

DT: requirements may differ from language to language (even in the same humanities discipline).

(may resort to "emergency" solutions, for resource-poor languages, quick-dirty solutions, etc.)

MP: during WP4 should have broad investigation of possible solutions, create a roadmap for what should be implemented.

(Q: who selects what we implement? A: if a group of people from some country favor an idea, it will be implemented with national money.)

  • cover in detail links between WP4-WP3 and WP4-WP6

WP5: Language Resource inventory and paving the way for multilinguality


NC: should have some synergies with ELRA, BLARK, etc.

want to make a distinction between resources for which we only have metadata (put a link to the publisher, etc.) vs. those that are available through clarin.

CLARIN will maintain a list of all resources. registered resources, to be usable in the infrastructure, have to adhere to some standards (together with IPR issues).

6-7: quality assessment == compliance with standards.

8 interrelation of resources == interoperability and conversion

there is some technical work in this WP.

PW: when do we do this?

1. take a limited set of resource-rich languages, and showcase them (hard to justify further development for these languages).
2. then identify some languages (e.g., Maltese) to do a little development and show what can happen with this kind of language. (Saami is another good candidate).

make complete registry (not catalog) of resources that we have.

KK: wiki is efficient tool for maintaining the registry.

should mention:

  • initially, resources will come from linguistic researchers, but will eventually include data from other humanities.
  • in introduction, mention history, about the two communities (NC)

WP5: Multilinguality (KK)


task 1:

NC: discuss during the project, do not commit at the level of proposal.

SK: emergency project to be conducted during the preparatory phase

DT: multilingual potential is a major differentiating factor of CLARIN! monolingual approach is not good goal for this consortium. we cover all languages in Europe, multilinguality should be very important point to be underlined. proposals to select subset of languages for proof of concept. but then we want to extend coverage to all the minor languages as well. a lot of attention should be devoted to collecting PARALLEL/COMPARABLE data!

distinguish: multilingual vs. poly-lingual (= broad language coverage)

RY: write into the proposal:

  • proof-of-concept languages (resources and technologies)
  • provide a framework for incorporating all relevant, resource-poor languages, by providing the enabling technologies for them.

KK: primary question: this issue may be reworded (in the proposal) but must NOT BE DELAYED, solutions for adding minor languages must have first priority.

MW: uncomfortable with focus on EU languages. (yes, useful and important, and would please funders.) but humanities researchers work on diachronic corpora and non-eu languages.

SK: EU: ok to spend money on eu languages, but ...

WP6: Dissemination


(in a. workshops: unclear text after colon...)

b. training:

PW: set up curriculum, for an entire semester. "new computational paradigms" convince universities to give this class, clarin members can give guest lectures. summer school.

KK: to do this ought to have some fraction of the CLARIN prototype already working!! also some tools! then can have a workshop on: "equip your language with the appropriate tools".

SK: should not focus exclusively on WRITTEN language. include the relevant conferences of these communities. (this is true for all workpackages).

PW: make a really PROFESSIONAL site, for spreading the word.

WP7: Intellectual Property Rights, Authorization, Licensing and Ethical Issues


EH: add language: welcome some advice and interaction with existing initiatives for this kind of work, funding board, to make sure we follow best practices. explicitly ELRA, ELDA.

PW: BUSINESS MODELS, accounting. if we change the game to more flexible services, people want to access just 10 words, then they pay only for 10 words. flexible accounting. in the objectives: make statements about OPEN ACCESS. trust agreements: authorization agreements, point in the proposal to TERENA, pick up models and existing practices (funders can help with this).

NC: integrating multiple different resources, how to charge for access. (now working on this question in ELDA).

EH: we have experience with licensing. it is easy with academic users, complex with commercial parties. what are the cost/accounting models. how do we position ourselves? do not take a stance "we want industry to pay", but mention the issue, and say this is to be worked out during preparatory phase, looking at other communities. add a few words about these possibilities that must be explored.

PW: in the general discussion mention that the intention of clarin is to produce OPEN SOURCE, based on OPEN STANDARDS, because of the continuity -- work of one party will be continued by other parties. this might be a clause in the AGREEMENT FOR JOINING clarin.

SK: state the ideal, and that we will strive for the ideal, but there are constraints.


Points to be discussed:

1. who is in the consortium at the beginning?

  • those who have a signed letter and are mentioned as delegates (scientific, funding)
  • those who will be mentioned with specific roles

2. questionnaire to all with commitment letters: how much money was "discussed"?

3. Work out cost estimates for those areas where national money will be involved (range, exclusion)

  • repository federation - number of
  • LR adaptation, encapsulation and registration
  • LT "" "" ""
  • humanities projects
  • filtered registry services
  • language-specific ontology services
  • filling in the gaps in existing LR/LTs

(This can only be a vague estimate at this point.)

4. example letter to all members, stating that the current partnership is not exclusive, that all will be part of the game, and that there "should" be a national group

SK will write an example letter.

create a template letter, on which individual letters will be based. individual letters will be written by the country's representative.

invite all members to contact the leader of the WP in which they would like to participate to express interest.

5. new versions of WP descriptions should be ready by tuesday evening -- otherwise writing will begin with what there is now.

6. SK should email to all members when WP documents and some materials are on the wiki, asking for comments

- special page for all to write comments (all cannot edit WP pages!!!)

7. who takes which roles now:

WP1 U Utrecht (coordinator) steven
WP2 MPI peter
WP3 HAS tamas
WP4 OTA martin
WP5 U Tuebingen erhard
WP7 U Helsinki kimmo

WP leader decides what other partners have what roles in the WP.

scientific board:

  • chair: nicoletta
  • VC LR: erhard
  • VC LT: dan

roles of VC:

  • control activities on executive board
  • person well known in the field, able to establish contacts

8. for each WP: make comments about timing/phasing, amount of effort + plan for first 6/12 months, global estimate for years 2 and 3. first year - detailed milestones for 6 months.

9. SK will ask for A2 Forms and self-descriptions where necessary

External/international affairs: relationships with other projects, other organisations, international affairs

(? relationship with WP0/WP4)

demonstrate keen interest from prominent centers throughout the world.

  • apr 16: alpha version of proposal, to be distributed.
  • apr 23: beta version

-- RomanYangarber - 28 Mar 2007

Topic revision: r7 - 2007-12-03 - KimmoKoskenniemi