Requirements specification


DATABASE: CSC's customer database, called Askare; referred to below simply as "the database".
HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.

Schedule and costs

The purpose is to plan, build and take into production an upgraded system for Language Bank users. New features of the upgraded system include SAML2/Shibboleth support, web application forms and a referee process. There are also small enhancements, including email address verification for non-Shibboleth applications and an upgrade of the Resource Manager demo to production level.

The tasks and their estimated schedule for 2009 are:

  • Add required tables and fields to the Askare database (2 pm)
  • Contents of web forms, emails and web pages (2-4 pm)
  • Programs that check and store web forms' data in the database and send emails (2 pm)
  • Resource Manager Demo into production (1 pm)

The costs will be 59 500-76 500 euros (equals 7-9 pm).
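The quoted cost range is consistent with a flat rate per person-month. The following is a quick check, assuming that "pm" stands for person-month and that one rate applies to all tasks; the rate itself is not stated in this document and is only derived from the two totals.

```python
# Task estimates in person-months (low, high), taken from the task list.
tasks = {
    "database tables and fields": (2, 2),
    "web forms, emails and web pages": (2, 4),
    "form-handling programs": (2, 2),
    "Resource Manager Demo into production": (1, 1),
}

low_pm = sum(low for low, _ in tasks.values())
high_pm = sum(high for _, high in tasks.values())

# Implied rate per person-month, derived from the stated totals
# (an inference, not a figure given in the source).
rate_low = 59_500 / low_pm
rate_high = 76_500 / high_pm

print(low_pm, high_pm)      # 7 9
print(rate_low, rate_high)  # 8500.0 8500.0
```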

Automatic access to resources

Figure: User process for linguistics with Shibboleth authentication and automatic access to resources

Shibboleth authentication

Automatic access to resources

For authenticated users:

User information will not need to be saved in the database nor will the user need a CSC user account.

Owner-controlled access to resources

Figure: User process for linguistics with Shibboleth authentication, electronic applications and referees

Electronic application form processing

There are three different electronic application forms, each in English and Finnish. There are also forms for CSC's internal use to follow the application workflow status. Each form requires a program to handle it. The email responses also require handling. Commercial users need to sign a contract to access the resources.

After the Shibboleth authentication:

  • Electronic Application Form (as registered and prefilled)
    • Required attributes
  • Available linguistic research resources (with limitations defined by the owners)
  • Send

If the user cannot be authenticated with Shibboleth:

If the user already has a CSC account, after logging onto the CSC Scientist's Interface (

  • Electronic Application Form for CSC users
  • Personal and project information update (if needed)
  • Available linguistic research resources (with limitations defined by the owners)
  • Send

After Send:

The electronic application form can also be used to collect new referee information. Each electronic application form can contain a checkbox with which a referee candidate expresses his or her willingness to act as a referee.

Referee's authorization and authentication

An applying user becomes trusted by being approved by a referee. A new electronic form with a referee list is needed in English and Finnish. The form requires a program to handle it. The email responses also require handling.

The referee's procedure to authorize and authenticate an applying user could be the following:

  1. The user is forwarded to the Referee List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
  2. If the user expects that a referee knows him or her, he or she selects that referee. The application will be sent by email to the referee with links for recommending and denying.
    • The referee candidates select a referee, too.
  3. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the owner and the Language Bank administrator to be accepted.
    • If the user does not know any referee, the application will be forwarded straight to the owner (or contact person) of the corpus and the Language Bank administrator.
    • If the referee selects the deny link, a rejection message will be sent to the owner and the administrator.

In this model, a referee who loses his or her status also affects the users associated with that referee. Loss of status for natural reasons (e.g. retirement, transition) does not have this effect.

Recommend and Deny

For the referee recommendation, the system has a very secret passphrase which is SHA-hashed with the applid value. There are two web programs: Recommend and Deny. The email message generated for the referee contains links to both of them together with the application data. When the referee recommends that the application be accepted, he or she clicks the recommend link that is parametrized with an applid and a SHA hash value, and the hash will be checked.

When the hash matches, the recommend program increments the CAC field value by 32.

If the referee fails to reply within, e.g., one week, he or she will receive a reminder. If the referee still fails to reply, the application will be forwarded to the owner and the administrator after a predefined delay (e.g. one week).

If the hash does not match, the programs do nothing or warn the staff about abuse.
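The link check described in this section could be sketched as follows. This is a minimal sketch under stated assumptions, not the actual implementation: the text only says "SHA-hashed", so HMAC-SHA-256 is used here as one reasonable choice; the passphrase, the base URL and all function names are hypothetical; and the action name is mixed into the hash so that a recommend link cannot be reused as a deny link, which the text does not require.

```python
import hashlib
import hmac

# Hypothetical values: the real passphrase and URLs are not given in the source.
SECRET_PASSPHRASE = b"replace-with-the-real-secret"
BASE_URL = "https://example.org/lb"

REFEREE_RECOMMENDATION = 32  # CAC value for a referee recommendation


def link_hash(applid: int, action: str) -> str:
    """Return a hex digest binding the secret to one application and one action."""
    message = f"{action}:{applid}".encode("ascii")
    return hmac.new(SECRET_PASSPHRASE, message, hashlib.sha256).hexdigest()


def make_link(applid: int, action: str) -> str:
    """Build the link that is embedded in the referee's email."""
    return f"{BASE_URL}/{action}?applid={applid}&hash={link_hash(applid, action)}"


def hash_is_valid(applid: int, action: str, received_hash: str) -> bool:
    """Compare the hash from the clicked link in constant time."""
    expected = link_hash(applid, action).encode("utf-8")
    return hmac.compare_digest(expected, received_hash.encode("utf-8"))


def recommend(cac: int, applid: int, received_hash: str) -> int:
    """Return the new CAC value; leave it unchanged when the hash does not match."""
    if not hash_is_valid(applid, "recommend", received_hash):
        return cac  # the programs may also warn the staff about abuse here
    # Bitwise OR instead of "+ 32", so that clicking the link twice
    # does not add the value twice (a choice made in this sketch).
    return cac | REFEREE_RECOMMENDATION
```

For example, `make_link(42, "recommend")` yields a URL carrying `applid=42` and a 64-character hex hash, and `recommend(1, 42, link_hash(42, "recommend"))` returns 33.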

Emails of the referee procedure

  1. Referee Form sends an email to the referee for recommending or denying.
  2. Reminder email to the referee, if (s)he fails to reply (automatically after a delay).
  3. Referee's Recommend email to owner.
  4. Referee's Recommend email to administrator.
  5. Referee's Deny email to owner.
  6. Referee's Deny email to administrator.
  7. Referee's No reply email to owner (automatically after a delay).
  8. Referee's No reply email to administrator (automatically after a delay).

In the case of a referee candidate, emails to the owner will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics.
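The choice of recipients for emails 3 to 8 can be expressed compactly. The sketch below is only an illustration: the enum, the function name and the recipient labels are hypothetical, and emails 1 and 2 (sent to the referee) are left out.

```python
from enum import Enum


class RefereeEvent(Enum):
    RECOMMEND = "Recommend"
    DENY = "Deny"
    NO_REPLY = "No reply"


def referee_notifications(event, is_referee_candidate=False):
    """Return (recipient, subject) pairs for one outcome of the referee step."""
    # For referee candidates the nominator takes the owner's place.
    first_recipient = "nominator" if is_referee_candidate else "owner"
    subject = f"Referee's {event.value} email"
    return [(first_recipient, subject), ("administrator", subject)]


print(referee_notifications(RefereeEvent.DENY))
print(referee_notifications(RefereeEvent.RECOMMEND, is_referee_candidate=True))
```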

Owner's and Administrator's acceptance

If both the administrator and the owner accept the application (ref. Accept and Reject), the user will receive access with the required permissions. Even if the referee has denied the application, the administrator retains the option to accept it, provided the owner agrees.

After the Owner's and Administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. Opening a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account. The user can then log onto the CSC Scientist's Interface using a HAKA login or a CSC user account login to access the resources.

The referees will be nominated by the Helsinki University Department of General Linguistics and the Administrator.

Accept and Reject

The program then sends the application by email to the owner (or contact person) of the corpus and the Language Bank administrator to be accepted. If both accept, the Accept program copies the application data into the database tables kayttajat (users), osoitteet (address) etc., and sends an acceptance email to the user.

What else does the Reject program do other than send a rejection email to the user? Will the application be deleted?

Emails of the Owner's and Administrator's procedure

  1. Owner's Accept email to administrator.
  2. Administrator's Accept email to (save the user's data in the database).
  3. Accept email to user.
  4. Owner's Reject email to administrator.
  5. Administrator's Reject email to user.

In the case of referee candidates, owner's emails will be replaced by emails of the nominator from the Helsinki University Department of General Linguistics, who accepts new referees.

Database changes

Email confirmation field

The application table has an email confirmation field that contains at least 128 bits of random data, generated when the application form is stored. When the application form is used by a non-registered user, the random data value is emailed to the user. The user receives a link to the confirmation form, where he or she confirms the email address by entering the random data value (compare the KITWIKI registration). Submitting the confirmation form increments the CAC field value in the application table by 1 or 2, depending on the email address (1 for any address, 2 for an address at a well-known CSC customer organization).

If the user's email address is invalid, the application stays unconfirmed. Unconfirmed applications are dropped from the database by a clean-up that runs once a day.
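The confirmation step could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the domain list is an invented placeholder (the document does not say how a well-known customer organization is recognized), and the random value is encoded as hexadecimal.

```python
import hmac
import secrets

USER_VERIFIED_EMAIL = 1          # CAC value 1
ORGANIZATION_VERIFIED_EMAIL = 2  # CAC value 2

# Hypothetical placeholder list of customer organization domains.
KNOWN_ORGANIZATION_DOMAINS = {"helsinki.fi", "csc.fi"}


def new_confirmation_value() -> str:
    """Return 128 bits of random data as 32 hexadecimal characters."""
    return secrets.token_hex(16)


def confirm_email(cac: int, email: str, stored_value: str, entered_value: str) -> int:
    """Return the new CAC value after the confirmation form is submitted."""
    matches = hmac.compare_digest(
        stored_value.encode("utf-8"), entered_value.encode("utf-8")
    )
    if not matches:
        return cac  # wrong value: nothing is confirmed
    domain = email.rsplit("@", 1)[-1].lower()
    if domain in KNOWN_ORGANIZATION_DOMAINS:
        return cac | ORGANIZATION_VERIFIED_EMAIL
    return cac | USER_VERIFIED_EMAIL
```

Note that 128 bits in hexadecimal take 32 characters, which is more than the size 20 listed for emailconfirmation in the application table; either the field size or the encoding would have to be adjusted.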

CSC Authentication Classes (CAC field)

The CAC field in the application table describes how the user's identity has been verified. Information on how each user is authenticated needs to be stored in the database, because stronger authentication than the currently used personal signature may be required. The field should also be added to the database table kayttajat (users). The CAC field can take one or several of the values listed below. If several values apply, they are summed; for example, a user-verified email (1) together with a referee recommendation (32) gives 33.

  • 0. Not authenticated (data stored from web form).
  • 1. User-verified email. Authentication by an email confirmation from any address.
  • 2. Organization-verified email. Authentication by an email confirmation from a well-known CSC customer organization.
  • 4. Authentication using a credit card or a good certificate issued by a well-known CA.
  • 8. Scanned signature in a PDF document.
  • 16. Personal signature (default value for current CSC customers).
  • 32. Referee recommendation: a known professor or research director recommends that the application be accepted.
  • 64. Strong authentication using SAML2/Shibboleth or grid certificates (in the USA: urn:mace:incommon:iap:bronze). Alternatively, official identification (photo ID) verified by a referee.
  • 128. Official identification verified by a bank account (tupas) or more secure certificates (in the USA: urn:mace:incommon:iap:silver).
  • 256. CSC-checked official identification card or passport.
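Because every value is a distinct power of two, the CAC field behaves like a set of bit flags. The following sketch models it with Python's `enum.IntFlag`; the member names and the helper `granted_names` are hypothetical labels chosen for this illustration.

```python
from enum import IntFlag


class CAC(IntFlag):
    NOT_AUTHENTICATED = 0
    USER_VERIFIED_EMAIL = 1
    ORGANIZATION_VERIFIED_EMAIL = 2
    CREDIT_CARD_OR_CERTIFICATE = 4
    SCANNED_SIGNATURE = 8
    PERSONAL_SIGNATURE = 16
    REFEREE_RECOMMENDATION = 32
    STRONG_AUTHENTICATION = 64
    BANK_VERIFIED_ID = 128
    CSC_CHECKED_ID = 256


def granted_names(value: int) -> list:
    """List the names of all authentication classes contained in a stored value."""
    return [m.name for m in CAC if m.value and m.value & value]


stored = CAC.USER_VERIFIED_EMAIL | CAC.REFEREE_RECOMMENDATION
print(int(stored))                           # 33
print(granted_names(33))
print(CAC.REFEREE_RECOMMENDATION in stored)  # True
```

The largest possible sum is 511, which fits into the smallint type used for the CAC column.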

Application table

When the user sends the application, where should application data that has not yet been accepted be stored? It can be stored in the existing tables with new status fields, or new table(s) can be created. We recommend that a new application table be created.

field               type      size  null  comment
applid              int             no
arrivaldate         date            no
usernamecantidate   varchar   8     yes
CAC                 smallint        no
display name        varchar   20    no
familyname          varchar   25    no
nationality         smallint        no    phone code or TLD
position            varchar   40    no
organization        varchar   40    no
faculty             varchar   40    yes
phone               varchar   20    yes
gsm                 varchar   20    yes
email               varchar   60    no
emailconfirmation   varchar   20    yes   only needed during confirmation process
referee             smallint        yes
datetime            datetime        no
projectname         varchar         yes
projectdescription  text            yes
newreferee          char      1     yes

  • A postal address is required for sending the password, magazines and Christmas cards.
  • Will the applied resources be stored here?
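As an illustration, the proposed application table can be written as a table definition and tried out in an in-memory SQLite database. This is only a sketch: the target database product is not named here, SQLite is used just because it ships with Python, making applid the primary key is an assumption, and the function name is hypothetical. Field names are copied as listed, including the space in "display name".

```python
import sqlite3

APPLICATION_DDL = """
CREATE TABLE application (
    applid             INTEGER PRIMARY KEY,   -- key choice is an assumption
    arrivaldate        DATE NOT NULL,
    usernamecantidate  VARCHAR(8),
    CAC                SMALLINT NOT NULL,
    "display name"     VARCHAR(20) NOT NULL,
    familyname         VARCHAR(25) NOT NULL,
    nationality        SMALLINT NOT NULL,     -- phone code or TLD
    position           VARCHAR(40) NOT NULL,
    organization       VARCHAR(40) NOT NULL,
    faculty            VARCHAR(40),
    phone              VARCHAR(20),
    gsm                VARCHAR(20),
    email              VARCHAR(60) NOT NULL,
    emailconfirmation  VARCHAR(20),           -- only during confirmation
    referee            SMALLINT,
    datetime           DATETIME NOT NULL,
    projectname        VARCHAR,
    projectdescription TEXT,
    newreferee         CHAR(1)
)
"""


def create_application_table(conn: sqlite3.Connection) -> None:
    """Create the application table in the given database connection."""
    conn.execute(APPLICATION_DDL)


conn = sqlite3.connect(":memory:")
create_application_table(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(application)")]
print(len(columns))  # 19
```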

Referees table

CSC has to add a new table, referees, to the database. The referees table must have the ID and status fields. The ID field is just a number that connects the table to the henkilo (person) table (which includes e.g. first name and last name) and to the osoitteet (address) table (which includes e.g. email and phone). The status field can have the values 0 (no longer trusted), 1 (active) and 2 (retired).

It is necessary to document who was the referee for each user. The referees table needs to be connected to the kayttajat (users) table by adding the ID field of the referee table into the kayttajat table.

field   type      size  null
ID      smallint        no
status  char      1     no
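The status values and their effect on associated users can be sketched as below. The mapping of "no longer trusted" to a review of the associated users is an interpretation of the referee model described earlier, not a stated rule, and all names are hypothetical.

```python
from enum import IntEnum


class RefereeStatus(IntEnum):
    NO_LONGER_TRUSTED = 0
    ACTIVE = 1
    RETIRED = 2


def parse_status(stored: str) -> RefereeStatus:
    """Convert the one-character status column into an enum member."""
    return RefereeStatus(int(stored))


def affects_associated_users(status: RefereeStatus) -> bool:
    """Only a loss of trust affects users; retirement (a natural reason) does not."""
    return status == RefereeStatus.NO_LONGER_TRUSTED


print(affects_associated_users(parse_status("0")))  # True
print(affects_associated_users(parse_status("2")))  # False
```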

Topic revision: r54 - 2008-11-21 - SatuTorikka