Difference: RequirementsDocumentation (1 vs. 34)

Revision 34 - 2010-02-08 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation

Line: 14 to 14
 
Database MySQL Language Bank database; all later references to "the database" in this document mean this MySQL database
HAKA Identity federation of the Finnish universities, polytechnics and research institutions
IdF Identity federation
Added:
>
>
IdM Identity Manager
 
IdP Identity provider
LRT Language resources and technology
SP Service provider
Line: 42 to 43
  (1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.) Whether there will be resources falling in this category must be studied.
Changed:
<
<
(2) LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will prevent continuation of the resource access process - One-sided: commitment by user to predetermined CP terms. See Automatic authorization
>
>
(2) LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will prevent continuation of the resource access process - One-sided: commitment by user to predetermined CP terms.
  (3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled authorization
Line: 130 to 131
 

Referee's authorization

Changed:
<
<
An applying user becomes trusted by being approved by a referee. A new electronic form with a referee list is needed in English and Finnish. The form requires a program to handle it. The response to the email sent by the system will be via a web form (not by replying to the mail).
>
>
An applying user becomes trusted by being approved by a referee. A new electronic form with a referee list is needed in English and Finnish. The form requires a program to handle it. The response to the email sent by the system will be via a web form (not by replying to the mail). When the IdM system is running, it will be used for the Referee's authorization.
  The referee's procedure to authorize an applying user could be the following:
Line: 188 to 189
 

CP's and administrator's acceptance

Changed:
<
<
If both the CP and the administrator accept the application (ref. Accept and Reject), the user will receive the access with the required permissions. Despite being rejected by the referee, the administrator still retains the option to accept the application, providing the CP agrees.
>
>
If both the CP and the administrator accept the application (ref. Accept and Reject), the user will receive the access with the required permissions. Despite being rejected by the referee, the administrator still retains the option to accept the application, providing the CP agrees. When the IdM system is running, it will be used for the CP's and administrator's acceptance.
  After the CP's and administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. CSC's current UNIX/Linux-based environment uses Unix groups for user management (e.g. Lemmie and DMA). The CSC user account allows command line
Changed:
<
<
access to a server. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account.
>
>
access to a server. Opening up a normal CSC user account would offer tools for monitoring. When the IdM system is running, it can create the account.
  If the user's home organization is a member of Haka, (s)he can log onto the Scientist's Interface using the username and password issued by his/her home organization. During the first visit the user is also asked for the CSC user account, so that the user's ePPN can be linked to the CSC user account. On later visits the CSC user account will no longer be needed to log onto the CSC Scientist's Interface.

Revision 33 - 2009-11-16 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"
Changed:
<
<

Requirements Documentation DRAFT

>
>

Requirements Documentation

  This document offers the requirements documentation for the development project of AAI for Finnish language resources.

Line: 20 to 20
 
SUI New Scientist's interface
Changed:
<
<

Common features for Automatic and Controlled authorization

>
>

Common features

 

Shibboleth authentication

Line: 35 to 35
 
  • CSC will implement an architecture that will support the addition of other national Identity Federations in the future in a relatively easy manner.
Deleted:
<
<

Resources

 
Changed:
<
<

Categories

>
>

Resource categories

  Linguistic resources (corpora) have to be equipped with access information divided into three categories:
Line: 50 to 49
 According to CLARIN policy, general metadata, including knowledge of the existence of a resource, should be publicly available for all prospective users.

Changed:
<
<

Terms and Conditions

>
>

Terms and Conditions

  (a) Terms of Access
Line: 82 to 81
  CSC will monitor and gather usage statistics.
Deleted:
<
<

Automatic authorization

Figure: User process for linguistics with Shibboleth authentication and automatic authorization

  • After Shibboleth authentication (Shibboleth authentication at the user's home organization is required)
  • Available linguistic resources of Category 2 with limitations defined by the CPs (for example, the resource can be accessible to users from University A only). See Resource Manager documentation below. How will the user view a list of resources before being authorized?
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
  • After being authorized, the user can download the resources of Category 2 and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in to CSC's computing environment.
  • The user will not need a CSC user account. Whether the ePPN should be linked with the resource must be studied.

Resource Manager Documentation

The term Resource Manager is used for compatibility with the CLARIN Language Resource and Technology Federation document, in which the topic 5 (Requirements) leaves resource management to the centers. The Resource Manager is the authorization component that automatically allows or denies access to files according to user attributes. A linguistic resource or corpus can contain one or several files.

  • Required attributes (per resource or file) must be determined.

Demo

In the Demo, access is controlled by a relational database that stores stat and hash data of files. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list. The MySQL Language Bank database will be the actual platform.

The source code of the demo is attached:

  • dl: Python program to download files
  • list: Python program to show the allowed files

The current database structure includes a table called resurssi > resourcedetails:

describe resurssi;
+------------+-----------------------+------+-----+---------+-------+
| Field      | Type                  | Null | Key | Default | Extra |
+------------+-----------------------+------+-----+---------+-------+
| path_hash  | varchar(32)           | NO   | PRI |         |       | 
| path       | text                  | YES  |     | NULL    |       | 
| path_utf8  | text                  | YES  |     | NULL    |       | 
| moderator  | varchar(64)           | YES  |     | NULL    |       | 
| right_type | mediumint(8) unsigned | YES  |     | NULL    |       | 
| rights     | varchar(255)          | YES  |     | NULL    |       | 
+------------+-----------------------+------+-----+---------+-------+

Each record in the resurssi table contains a file description. A resource can contain one or several files.
The path_hash is an index and ensures the security of the demo system. It is generated with Python's md5 module by calling md5.new(realname).hexdigest(), where realname is realpath(join(root, name)).
The moderator is identified by a Shibboleth ePPN (eduPersonPrincipalName). The moderator can set the rights.
Only path information is shown to the user.
Only right_type 0 is used.
The rights field contains a Shibboleth attribute key-value string. It can contain one of the following sample strings:

HTTP_SHIB_SCHACHOMEORGANIZATIONTYPE=urn:mace:terena.org:schac:homeOrganizationType:fi:university
HTTP_SHIB_SCHACHOMEORGANIZATION=csc.fi
REMOTE_USER=pj@csc.fi
The program list only shows the user the files that the user has the right to access. The list of files has links to the dl program, which can send the requested file to the user if the rights allow sending.
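
A minimal sketch of the access check described above, not the attached demo source. It uses hashlib.md5 in place of the legacy md5.new() call (the md5 module exists only in Python 2). The helper names path_hash and is_allowed, the example path, and the environment dictionary are assumptions made for illustration.

```python
import hashlib
from os.path import join, realpath


def path_hash(root, name):
    """Return the 32-character hex MD5 digest of the file's real path."""
    realname = realpath(join(root, name))
    return hashlib.md5(realname.encode("utf-8")).hexdigest()


def is_allowed(rights, environ):
    """Check a right_type 0 string of the form KEY=value against the
    Shibboleth attributes found in the request environment."""
    key, _, expected = rights.partition("=")
    return environ.get(key) == expected


# Hypothetical request environment, as a Shibboleth-protected CGI might see it.
environ = {
    "HTTP_SHIB_SCHACHOMEORGANIZATION": "csc.fi",
    "REMOTE_USER": "pj@csc.fi",
}

print(len(path_hash("/data/corpora", "sample.txt")))  # 32, fits varchar(32)
print(is_allowed("HTTP_SHIB_SCHACHOMEORGANIZATION=csc.fi", environ))  # True
print(is_allowed("REMOTE_USER=someone@example.org", environ))  # False
```

In this reading, list would call is_allowed for every record and show only the matching paths, and dl would repeat the same check before sending a file.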

Required features for production

We recommend that the Resource Manager model described under Demo be chosen for production to grant automatic authorization to the chosen resources. In addition to the features of the Demo, the following features are needed for production:

Option for setting rights (to be specified)

  • CP sees a list of all of his/her files/resources.
  • Moderator (e.g. the Language Bank Administrator) can edit the rights field on behalf of the CP. This should include some (limited) prescribed usage right types, see Resource categories. Moreover, this option should allow for the deposition of the specific terms of usage, which the applicant may sign electronically.

Other required features for production

  • MySQL Language Bank database will be used instead of the current demo database.
  • Adding AND and OR operations for the rights. This may be implemented as a new right_type 1 or just by adding some parsing for right_type 0, and possibly in the new MySQL Language Bank database. This will be specified later.
  • Careful planning of the database structure.
  • Recursive views and functionality per resource for all subdirectories and files under them, like Unix chmod -R. There will probably be no need to deny access to certain files within a resource.
  • Showing the user the file sizes and adding the size information to the database.
  • An interface for adding resources to a server (planning and implementation). This may be a command line program because linguistic resources are static.
  • Usage statistics (the httpd server logs are already published by Analog, but is that enough?).
  • Groups. A group is functionally equal to an OR operation over a list of users, but long lists are more efficient and user-friendly when stored in their own tables.
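
One possible shape for the AND/OR extension, given only as a sketch because the syntax is still to be specified. The assumed format (clauses joined by " OR ", each clause a series of KEY=value terms joined by " AND ") and the function names check_rights and _term_matches are hypothetical and not part of the demo.

```python
def _term_matches(term, environ):
    """Compare one KEY=value term with the request environment."""
    key, _, expected = term.partition("=")
    return environ.get(key) == expected


def check_rights(expression, environ):
    """Evaluate an assumed 'A=1 AND B=2 OR C=3' rights expression.

    AND binds tighter than OR: access is granted when every term of at
    least one OR-separated clause matches the request environment.
    """
    for clause in expression.split(" OR "):
        terms = [term.strip() for term in clause.split(" AND ")]
        if all(_term_matches(term, environ) for term in terms):
            return True
    return False


# Hypothetical attributes and rule, for illustration only.
environ = {
    "HTTP_SHIB_SCHACHOMEORGANIZATION": "csc.fi",
    "REMOTE_USER": "pj@csc.fi",
}
rule = (
    "HTTP_SHIB_SCHACHOMEORGANIZATION=helsinki.fi"
    " OR HTTP_SHIB_SCHACHOMEORGANIZATION=csc.fi AND REMOTE_USER=pj@csc.fi"
)
print(check_rights(rule, environ))
```

Under this format a group is simply a long OR list of REMOTE_USER terms, which illustrates why storing groups in their own table would be easier to maintain than one very long rights string.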
 

Controlled authorization

Line: 169 to 89
 

Electronic application form processing

Changed:
<
<
There are three different electronic application forms, both in English and Finnish. There are also forms for CSC's internal use to follow the application workflow status. Each form requires a program to handle it. Also the email responses require handling.
>
>
There are two different electronic application forms, both in English and Finnish. There are also forms for CSC's internal use to follow the application workflow status. Each form requires a program to handle it. Also the email responses require handling.
  Commercial users need to contact CSC sales and sign a contract to access the resources. In the Language Bank the following types of licenses are currently available: A License (Academic License) and B License (Extended Commercial License).
Line: 182 to 102
 
Changed:
<
<
If the user cannot be authenticated with Shibboleth:

If the user already has a CSC account, after logging onto the CSC Scientist's Interface (https://hotpage.csc.fi/):

>
>
If the user already has a CSC user account, after logging onto the CSC Scientist's Interface (https://hotpage.csc.fi/):
 
  • Electronic Application Form for CSC users
  • Personal and project information update (if needed)
  • Available linguistic research resources (with limitations defined by the CPs)
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
  • Send

Changed:
<
<

After Send:

>
>
If the user cannot be authenticated with Shibboleth or CSC user account:
  • Current Application form for the Language Bank (as non-registered)
  • Available linguistic research resources (with limitations defined by CPs)
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
  • Continue current process (emails, send paper form with signature)
  • The following chapters don't describe this feature.

After Send

 
Line: 212 to 132
  An applying user becomes trusted by being approved by a referee. A new electronic form with a referee list is needed in English and Finnish. The form requires a program to handle it. The response to the email sent by the system will be via a web form (not by replying to the mail).
Changed:
<
<
The referee's procedure to authorize and authenticate an applying user could be the following:
>
>
The referee's procedure to authorize an applying user could be the following:
 
  1. The user is forwarded to the Referees List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
Changed:
<
<
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application will be sent by email to the referee with links for recommending and denying, plus authenticating, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
>
>
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application will be sent by email to the referee with links for recommending and denying. A timer-process has to be initiated when the email is sent to the referee.
 
    • The referee candidates select a referee, too.
  1. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the CP and the Language Bank administrator to be accepted.
    • If the user does not know any referee, the application will be forwarded straight to the CP (or contact person) of the corpus and the Language Bank administrator.
Line: 223 to 143
  In this model, a referee losing status would affect the associated users as well. Loss of status due to natural reasons (e.g. retirement, transition) lacks this effect.
Deleted:
<
<

Authentication of the user by a referee

A referee can authenticate an applying user if the user cannot be authenticated via Shibboleth.

The referee should accept responsibility for an applicant by first agreeing to a statement such as the one below when authenticating the applicant: "I (the referee) confirm that I have satisfied myself as to the identity of the applicant by checking his/her official identification (photo id)."

 

Recommend and Deny

Revision 32 - 2009-07-02 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 215 to 215
 The referee's procedure to authorize and authenticate an applying user could be the following:

  1. The user is forwarded to the Referees List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
Changed:
<
<
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application will be sent by email to the referee with links for recommending and denying and authenticating, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
>
>
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application will be sent by email to the referee with links for recommending and denying, plus authenticating, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
 
    • The referee candidates select a referee, too.
  1. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the CP and the Language Bank administrator to be accepted.
    • If the user does not know any referee, the application will be forwarded straight to the CP (or contact person) of the corpus and the Language Bank administrator.
Line: 225 to 225
 

Authentication of the user by a referee

Changed:
<
<
A referee can authenticate an applying user if the user cannot be authenticated via Shibboleth.
>
>
A referee can authenticate an applying user if the user cannot be authenticated via Shibboleth.
  The referee should accept responsibility for an applicant by first agreeing to a statement such as the one below when authenticating the applicant: "I (the referee) confirm that I have satisfied myself as to the identity of the applicant by checking his/her official identification (photo id)."
Changed:
<
<
>
>
 

Recommend and Deny

Line: 374 to 374
 
projectdescription text yes
newreferee char 1 yes
Changed:
<
<
  • A postal address is required for sending the password, magazines and Christmas cards. Fields must be rechecked.
>
>
  • A postal address is required for sending the password, magazines and Christmas cards. Fields must be rechecked.
 
  • Will the applied resources be stored here?

Revision 31 - 2009-07-01 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 47 to 47
  (3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled authorization
Added:
>
>
According to CLARIN policy, general metadata, including knowledge of the existence of a resource, should be publicly available for all prospective users.
 

Terms and Conditions

Line: 62 to 64
 will apply for the resource in question, or a resource-specific license agreement that the CP provides
Changed:
<
<
Must be specified later.
>
>
This must be specified later.
 

Language selection: Finnish/English

Line: 74 to 76
 

Loading linguistic resources

Changed:
<
<
Process for the Language Bank Administrator to add resources to a server will be specified later. Whether CP or other people can upload resources must be studied, there may be safety and copyright considerations.
>
>
Process for the Language Bank Administrator to add resources to a server will be specified later. Whether CP or other people can upload resources must be studied, there may be safety and copyright considerations.
 

Monitoring and statistics

Line: 86 to 88
 Figure: User process for linguistics with Shibboleth authentication and automatic authorization

Changed:
<
<
  • Available linguistic resources of Category 2 with limitations defined by the CPs (for example, the resource can be accessible to users from University A only). See Resource Manager documentation below. How will the user view a list of resources before/after being authorized?
  • Acceptance by the user of Terms and Conditions attached to the resource is required. Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
>
>
  • Available linguistic resources of Category 2 with limitations defined by the CPs (for example, the resource can be accessible to users from University A only). See Resource Manager documentation below. How will the user view a list of resources before being authorized?
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
 
  • After being authorized, the user can download the resources of Category 2 and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in CSC's computing environment.
  • The user will not need a CSC user account. Whether the ePPN should be linked with the resource, must be studied.
Line: 152 to 154
 
Other required features for production

  • MySQL Language Bank database will be used instead of the current demo database.
Changed:
<
<
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0. This will be specified later. This may be implemented in the new MySQL Language Bank database.
>
>
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0. This will be specified later. This may be implemented in the new MySQL Language Bank database.
 
  • Really carefully planning the database structure.
Changed:
<
<
  • Recursive views and functionality per resources for all subdirectories and files under them like unix chmod -r. Probably there will not be a need to be denied access to certain files within a resource.
>
>
  • Recursive views and functionality per resources for all subdirectories and files under them like unix chmod -r. Probably there will not be a need to deny access to certain files within a resource.
 
  • Showing the user the file sizes and adding the size information into the database.
  • An interface to add resources to a server, planning and implementation, may be a command line program because linguistic resources are static.
  • Usage statistics (they are already httpd server log published by analog, but is it enough?).
Line: 176 to 178
 After the Shibboleth authentication:
Changed:
<
<
  • Available linguistic research resources (with limitations defined by the CPs) How will the user view a list of resources before/after being authenticated and authorized?
>
>
  • Available linguistic research resources (with limitations defined by the CPs). How will the user view a list of resources before being authorized?
 
Line: 281 to 283
 groups for user management (e.g Lemmie and DMA). The CSC user account allows command line access to a server. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account.
Changed:
<
<
If the user's home organization is a member of Haka, (s)he can log onto Scientist's interface using the username and password issued by his/her home organization. During the first visit the user is also asked for the CSC user account, so that user's ePPN can be linked to CSC user account. The next time the CSC user account will no longer be needed to log onto CSC Scientist's Interface. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
>
>
If the user's home organization is a member of Haka, (s)he can log onto Scientist's interface using the username and password issued by his/her home organization. During the first visit the user is also asked for the CSC user account, so that user's ePPN can be linked to CSC user account. The next time the CSC user account will no longer be needed to log onto CSC Scientist's Interface.
  The referees will be nominated by the Helsinki University Department of General Linguistics and the Administrator.
Line: 303 to 305
 In the case of referee candidates, CP's emails will be replaced by emails of the nominator from the Helsinki University Department of General Linguistics, who accepts new referees.

Timer-process

Changed:
<
<
>
>
 Timer-process for the CP and administrator has to be initiated after referee's response. Time limits can be adjusted as desired.

  • a reminder will be sent if the CP or administrator has not answered within a certain time = 8 days
  • the timer will expire after a delay = 15 days (a Reject email will be sent to the administrator)
  • timer will be cancelled if the CP or administrator sends Accept or Reject
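
The timer rules above could be sketched as a small decision function that a periodic job calls for each open application. This is an illustration only: the function name timer_action, its parameters, and the returned action labels are assumptions, and the two limits are kept as constants because the text says they can be adjusted.

```python
from datetime import datetime, timedelta

REMINDER_AFTER = timedelta(days=8)   # adjustable reminder limit
EXPIRE_AFTER = timedelta(days=15)    # adjustable expiry limit


def timer_action(started_at, now, answered, reminder_sent):
    """Return the action to take for one application."""
    if answered:
        return "cancel"   # Accept or Reject was received
    elapsed = now - started_at
    if elapsed >= EXPIRE_AFTER:
        return "expire"   # send the Reject email to the administrator
    if elapsed >= REMINDER_AFTER and not reminder_sent:
        return "remind"
    return "wait"


start = datetime(2009, 7, 1)
print(timer_action(start, start + timedelta(days=3), False, False))   # wait
print(timer_action(start, start + timedelta(days=8), False, False))   # remind
print(timer_action(start, start + timedelta(days=15), False, True))   # expire
print(timer_action(start, start + timedelta(days=20), True, True))    # cancel
```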
Changed:
<
<
>
>
 

Database changes

Line: 329 to 332
 The CAC field in the application table describes how the user's identity is verified. Information on how each user is authenticated needs to be stored in the database, because stronger authentication than the currently used personal signature may be required. It should be added into the database table kayttajat (users).

  • The minimum level of trust for authentication (expressed by CAC values) is 32.
Changed:
<
<
  • Required CAC values per resource have to be defined.
>
>
  • Required CAC values per resource have to be defined by the CP at the time of deposition.
  The CAC field can take one or several of the values listed below. If several values are selected, they will be summed.
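
A small sketch of how a combined CAC value might be computed and checked, reading the sentence above as "the selected values are added together". The numbers used here are placeholders, not the real CAC values, and the numeric comparison against the minimum of 32 is an assumption about how the requirement would be enforced. The function names are hypothetical.

```python
MIN_CAC = 32  # minimum level of trust stated in the text


def combined_cac(selected_values):
    """Add the distinct selected CAC values into the number stored in the field."""
    return sum(set(selected_values))


def meets_requirement(cac, required=MIN_CAC):
    """Assumed check: the stored value must reach the required level."""
    return cac >= required


# Placeholder values only; the real CAC values are defined in the document.
cac = combined_cac([32, 64])
print(cac, meets_requirement(cac))             # 96 True
print(meets_requirement(combined_cac([16])))   # False
```

The required parameter stands in for the per-resource CAC value that the CP is expected to define at the time of deposition.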

Revision 30 - 2009-07-01 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 88 to 88
 
  • After Shibboleth authentication (Shibboleth authentication at the user's home organization is required)
  • Available linguistic resources of Category 2 with limitations defined by the CPs (for example, the resource can be accessible to users from University A only). See Resource Manager documentation below. How will the user view a list of resources before/after being authorized?
  • Acceptance by the user of Terms and Conditions attached to the resource is required. Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
Changed:
<
<
  • After being authorized, the user can download the resources of Category 2 and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in CSC's computing environment.
  • The user will not need to be saved in the database nor will the user need a CSC user account.
>
>
  • After being authorized, the user can download the resources of Category 2 and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in CSC's computing environment.
  • The user will not need a CSC user account. Whether the ePPN should be linked with the resource, must be studied.
 

Resource Manager Documentation

Line: 153 to 154
 
  • MySQL Language Bank database will be used instead of the current demo database.
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0. This will be specified later. This may be implemented in the new MySQL Language Bank database.
  • Really carefully planning the database structure.
Changed:
<
<
  • Recursive views and functionality per resources for all subdirectories and files under them like unix chmod -r. Is there a need to be denied access to certain files within a resource?
>
>
  • Recursive views and functionality per resources for all subdirectories and files under them like unix chmod -r. Probably there will not be a need to be denied access to certain files within a resource.
 
  • Showing the user the file sizes and adding the size information into the database.
  • An interface to add resources to a server, planning and implementation, may be a command line program because linguistic resources are static.
  • Usage statistics (they are already httpd server log published by analog, but is it enough?).

Revision 29 - 2009-06-18 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 25 to 25
 

Shibboleth authentication

Changed:
<
<
Shibboleth authentication means here the HAKA authentication for users of Finnish universities, polytechnics and research institutions. Shibboleth authentication is a common feature preceding both the Automatic and Controlled authorization.
>
>
Shibboleth authentication means here the HAKA authentication for users of Finnish universities, polytechnics and research institutions. Shibboleth authentication is a common feature preceding both the Automatic and Controlled authorization.
 
  • Available linguistic research resources (current www location)
  • HAKA as Identity Provider Federation (Haka pages)
  • HAKA login (WAYF Service, later to be replaced by Shibboleth2 Discovery Service)
  • Provided attributes (funetEduPerson schema)
  • In the CLARIN community, the ePPN attribute is currently seen as the minimum necessary attribute. The rest is dependent on how well the attributes sets and their semantics can be harmonized, something we hope will happen via the eduGAIN 3.0 project.
Added:
>
>
  • CSC will implement an architecture that will support the addition of other national Identity Federations in the future in a relatively easy manner.
 

Resources

Line: 40 to 41
  Linguistic resources (corpora) have to be equipped with access information divided into three categories:
Changed:
<
<
(1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.) Whether there will be resources falling in this category must be studied. LRT to which the CP can grant an access automatically if the user has an affiliation with an IdP.
>
>
(1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.) Whether there will be resources falling in this category must be studied.
  (2) LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will prevent continuation of the resource access process - One-sided: commitment by user to predetermined CP terms. See Automatic authorization
Line: 86 to 86
 Figure: User process for linguistics with Shibboleth authentication and automatic authorization

Changed:
<
<
  • Available linguistic resources of Category 2 with limitations defined by the CPs (for example, the resource can be accessible to users from University A only). See Resource Manager documentation below. How will the user view a list of resources before/after being authorized?
>
>
  • Available linguistic resources of Category 2 with limitations defined by the CPs (for example, the resource can be accessible to users from University A only). See Resource Manager documentation below. How will the user view a list of resources before/after being authorized?
 
  • Acceptance by the user of Terms and Conditions attached to the resource is required. Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
  • After being authorized, the user can download the resources of Category 2 and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in CSC's computing environment.
Changed:
<
<
  • User information will not need to be saved in the database nor will the user need a CSC user account.
>
>
  • The user will not need to be saved in the database nor will the user need a CSC user account.
 

Resource Manager Documentation

Revision 28 - 2009-05-29 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 212 to 212
 The referee's procedure to authorize and authenticate an applying user could be the following:

  1. The user is forwarded to the Referees List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
Changed:
<
<
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application The application will be sent by email to the referee with links for recommending and denying and authenticating, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
>
>
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application will be sent by email to the referee with links for recommending and denying and authenticating, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
 
    • The referee candidates select a referee, too.
  1. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the CP and the Language Bank administrator to be accepted.
    • If the user does not know any referee, the application will be forwarded straight to the CP (or contact person) of the corpus and the Language Bank administrator.
Line: 276 to 276
 If both the CP and the administrator accept the application (ref. Accept and Reject), the user will receive the access with the required permissions. Despite being rejected by the referee, the administrator still retains the option to accept the application, providing the CP agrees.
Changed:
<
<
After the CP's and administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. CSC's current UNIX/LINUX based environment uses unix
>
>
After the CP's and administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. CSC's current UNIX/LINUX based environment uses unix
 groups for user management (e.g. Lemmie and DMA). The CSC user account allows command line
Changed:
<
<
access to a server. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account.
>
>
access to a server. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account.
  If the user's home organization is a member of Haka, (s)he can log onto Scientist's interface using the username and password issued by his/her home organization. During the first visit the user is also asked for the CSC user account, so that user's ePPN can be linked to CSC user account. The next time the CSC user account will no longer be needed to log onto CSC Scientist's Interface. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.

Revision 27 - 2009-05-29 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 86 to 86
 Figure: User process for linguistics with Shibboleth authentication and automatic authorization

Changed:
<
<
  • Available linguistic resources of Category 2 with limitations defined by the CPs (for example, the resource can be accessible to users from University A only). See Resource Manager documentation below. How will the user view a list of resources before/after being authenticated and authorized? Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
  • After being authorized, the user can download the resources of Category 2 (corpora) and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in to CSC's computing environment.
>
>
  • Available linguistic resources of Category 2 with limitations defined by the CPs (for example, the resource can be accessible to users from University A only). See Resource Manager documentation below. How will the user view a list of resources before/after being authorized?
  • Acceptance by the user of Terms and Conditions attached to the resource is required. Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
  • After being authorized, the user can download the resources of Category 2 and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in to CSC's computing environment.
 
  • User information will not need to be saved in the database nor will the user need a CSC user account.

Line: 257 to 257
  Timer-process has to be initiated when the email is sent to the referee. Time limits can be adjusted as desired.
Deleted:
<
<
  • if the referee/CP/nominator/administrator has not answered in a certain time (reminder)= 8 days
  • timer will expire after a delay = 15 days (email will be sent to the CP and administrator)
  • timer will be cancelled if the referee sends Recommend or Deny

New:

Timer-process has to be initiated when the email is sent. Time limits can be adjusted as desired. In the case of referee candidates, the CP will be replaced by the nominator who accepts new referees.

Timer process for the referee:

 
  • if the referee has not answered in a certain time (reminder)= 8 days
Changed:
<
<
  • timer will expire after a delay = 15 days (email will be sent to the CP and administrator)
>
>
  • timer will expire after a delay = 15 days (email will be sent to the CP)
 
  • timer will be cancelled if the referee sends Recommend or Deny
Deleted:
<
<
Timer process for the CP and administrator:

  • if the CP or administrator has not answered in a certain time (reminder)= 8 days
  • timer will expire after a delay = 15 days (Reject email will be sent to the administrator and user)
  • timer will be cancelled if the CP or administrator sends Accept or Reject
 

Web forms (AA work flow)

Line: 297 to 280
 groups for user management (e.g. Lemmie and DMA). The CSC user account allows command line access to a server. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account.
Changed:
<
<
If the user's home organization is a member of Haka, (s)he can log onto Scientist's interface using the username and password issued by his/her home university. During the first visit the user is also asked for the CSC user account, so that user's ePPN can be linked to CSC user account. The next time the CSC user account will no longer be needed to log onto CSC Scientist's Interface. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
>
>
If the user's home organization is a member of Haka, (s)he can log onto Scientist's interface using the username and password issued by his/her home organization. During the first visit the user is also asked for the CSC user account, so that user's ePPN can be linked to CSC user account. The next time the CSC user account will no longer be needed to log onto CSC Scientist's Interface. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
  The referees will be nominated by the Helsinki University Department of General Linguistics and the Administrator.
Line: 318 to 301
  In the case of referee candidates, CP's emails will be replaced by emails of the nominator from the Helsinki University Department of General Linguistics, who accepts new referees.
Added:
>
>

Timer-process

Timer-process for the CP and administrator has to be initiated after referee's response. Time limits can be adjusted as desired.

  • if the CP or administrator has not answered in a certain time (reminder)= 8 days
  • timer will expire after a delay = 15 days (Reject email will be sent to the administrator)
  • timer will be cancelled if the CP or administrator sends Accept or Reject
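
The timer rules above can be sketched in a few lines of Python. This is only an illustration, not part of the specified system: the class name ReviewTimer, the state names, and the reading of "8 days" and "15 days" as "at least that many days elapsed" are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Time limits from the requirement text; both are meant to be adjustable.
REMINDER_AFTER = timedelta(days=8)
EXPIRE_AFTER = timedelta(days=15)


@dataclass
class ReviewTimer:
    """Hypothetical timer for one pending answer (CP or administrator)."""
    started_at: datetime                    # when the request email was sent
    answered_at: Optional[datetime] = None  # set when Accept or Reject arrives

    def cancel(self, when: datetime) -> None:
        # An Accept or Reject response cancels the timer.
        self.answered_at = when

    def state(self, now: datetime) -> str:
        if self.answered_at is not None:
            return "cancelled"
        elapsed = now - self.started_at
        if elapsed >= EXPIRE_AFTER:
            return "expired"       # Reject email goes to the administrator
        if elapsed >= REMINDER_AFTER:
            return "reminder_due"  # send a reminder
        return "waiting"
```

A scheduled job could call state() once a day for every open application and send the reminder or Reject email when the state changes.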
 

Database changes

Revision 26 - 2009-05-29 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 212 to 212
 The referee's procedure to authorize and authenticate an applying user could be the following:

  1. The user is forwarded to the Referees List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
Changed:
<
<
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application The application will be sent by email to the referee with links for recommending and denying , and for authenticating and denying, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
>
>
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application The application will be sent by email to the referee with links for recommending and denying and authenticating, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
 
    • The referee candidates select a referee, too.
  1. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the CP and the Language Bank administrator to be accepted.
    • If the user does not know any referee, the application will be forwarded straight to the CP (or contact person) of the corpus and the Language Bank administrator.

Revision 25 - 2009-05-28 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 215 to 215
 
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application The application will be sent by email to the referee with links for recommending and denying , and for authenticating and denying, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
    • The referee candidates select a referee, too.
  2. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the CP and the Language Bank administrator to be accepted.
Changed:
<
<
    • If the user does not know any referee, the application will be forwarded straight to the owner (or contact person) of the corpus and the Language Bank administrator.
>
>
    • If the user does not know any referee, the application will be forwarded straight to the CP (or contact person) of the corpus and the Language Bank administrator.
 
    • If the referee selects the deny link, a rejection message will be sent to the CP and the administrator.

In this model, a referee losing status would affect the associated users as well. Loss of status due to natural reasons (e.g. retirement, transition) does not have this effect.

Line: 224 to 224
  A referee can authenticate an applying user if the user cannot be authenticated via Shibboleth.
Changed:
<
<
The referee should undertake to accept responsibility for an applicant by first agreeing to, for example, the statement given below, when authenticating and granting authorization to an applicant:
>
>
The referee should undertake to accept responsibility for an applicant by first agreeing to, for example, the statement given below, when authenticating an applicant:
 
Changed:
<
<
"I (the referee) confirm that I have satisfied myself as to the identity of the applicant and recommend that he be granted authorization to the resources applied for by checking his/her official identification (photo id)."
>
>
"I (the referee) confirm that I have satisfied myself as to the identity of the applicant by checking his/her official identification (photo id)."
 

Line: 298 to 298
 groups for user management (e.g. Lemmie and DMA). The CSC user account allows command line access to a server. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account.
Changed:
<
<
If the user's home organization is a member of Haka, (s)he can log onto Scientist's interface using the username and password issued by his/her home university. During the first visit the user is also asked for the CSC user account, so that user's ePPN?CHECK can be linked to CSC user account. The next time the CSC user account will no longer be needed. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
>
>
If the user's home organization is a member of Haka, (s)he can log onto Scientist's interface using the username and password issued by his/her home university. During the first visit the user is also asked for the CSC user account, so that user's ePPN can be linked to CSC user account. The next time the CSC user account will no longer be needed to log onto CSC Scientist's Interface. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
  The referees will be nominated by the Helsinki University Department of General Linguistics and the Administrator.

Revision 24 - 2009-05-28 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 162 to 162
 

Controlled authorization

Changed:
<
<
Figure: User process for linguistics with Shibboleth authentication, electronic applications and referees
>
>
Figure: User process for linguistics with Shibboleth authentication, electronic applications and referees
 

Electronic application form processing

Line: 212 to 212
 The referee's procedure to authorize and authenticate an applying user could be the following:

  1. The user is forwarded to the Referees List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
Changed:
<
<
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application The application will be sent by email to the referee with links for recommending and denying and authenticating, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
>
>
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application The application will be sent by email to the referee with links for recommending and denying , and for authenticating and denying, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
 
    • The referee candidates select a referee, too.
  1. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the CP and the Language Bank administrator to be accepted.
    • If the user does not know any referee, the application will be forwarded straight to the owner (or contact person) of the corpus and the Language Bank administrator.
Line: 262 to 262
 
  • timer will expire after a delay = 15 days (email will be sent to the CP and administrator)
  • timer will be cancelled if the referee sends Recommend or Deny
Added:
>
>
New:

Timer-process has to be initiated when the email is sent. Time limits can be adjusted as desired. In the case of referee candidates, the CP will be replaced by the nominator who accepts new referees.

Timer process for the referee:

  • if the referee has not answered in a certain time (reminder)= 8 days
  • timer will expire after a delay = 15 days (email will be sent to the CP and administrator)
  • timer will be cancelled if the referee sends Recommend or Deny

Timer process for the CP and administrator:

  • if the CP or administrator has not answered in a certain time (reminder)= 8 days
  • timer will expire after a delay = 15 days (Reject email will be sent to the administrator and user)
  • timer will be cancelled if the CP or administrator sends Accept or Reject
 

Web forms (AA work flow)

Line: 376 to 393
 
field type size null
ID smallint no
status char 1 no
Added:
>
>

META FILEATTACHMENT attachment="linguistics_user_controlled_process_draft.png" attr="" comment="Controlled authorization drawing" date="1243527600" name="linguistics_user_controlled_process_draft.png" path="linguistics_user_controlled_process_draft.png" size="129352" stream="linguistics_user_controlled_process_draft.png" tmpFilename="/usr/tmp/CGItemp17957" user="SatuTorikka" version="2"

Revision 23 - 2009-05-28 - SeanCrowe

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 212 to 212
 The referee's procedure to authorize and authenticate an applying user could be the following:

  1. The user is forwarded to the Referees List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
Changed:
<
<
  1. If the user expects that a referee knows him or her, he or she selects that referee. The application will be sent by email to the referee with links for recommending and denying. A timer-process has to be initiated when the email is sent to the referee.
>
>
  1. If the user expects that a referee knows him or her, he or she selects that referee. A notification of an application The application will be sent by email to the referee with links for recommending and denying and authenticating, if the user cannot be authenticated via Shibboleth. A timer-process has to be initiated when the email is sent to the referee.
 
    • The referee candidates select a referee, too.
  1. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the CP and the Language Bank administrator to be accepted.
    • If the user does not know any referee, the application will be forwarded straight to the owner (or contact person) of the corpus and the Language Bank administrator.
Line: 222 to 222
 

Authentication of the user by a referee

Changed:
<
<
A referee can authenticate an applying user if the user cannot be authenticated via Shibboleth. This will be specified later.
>
>
A referee can authenticate an applying user if the user cannot be authenticated via Shibboleth.
 
Added:
>
>
The referee should undertake to accept responsibility for an applicant by first agreeing to, for example, the statement given below, when authenticating and granting authorization to an applicant:

"I (the referee) confirm that I have satisfied myself as to the identity of the applicant and recommend that he be granted authorization to the resources applied for by checking his/her official identification (photo id)."

 

Recommend and Deny

Line: 253 to 258
  Timer-process has to be initiated when the email is sent to the referee. Time limits can be adjusted as desired.
Changed:
<
<
  • if the referee has not answered in a certain time (reminder)= 8 days
>
>
  • if the referee/CP/nominator/administrator has not answered in a certain time (reminder)= 8 days
 
  • timer will expire after a delay = 15 days (email will be sent to the CP and administrator)
  • timer will be cancelled if the referee sends Recommend or Deny

Revision 22 - 2009-05-26 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 10 to 10
 

Definitions

CO Copyright owner
Changed:
<
<
CP Content provider
>
>
CP Content provider, who acquires linguistic resources and sufficient rights to use them from the Copyright owner.
 
Database MySQL Language Bank database, future references to the database will refer to this MySQL database in this document
HAKA Identity federation of the Finnish universities, polytechnics and research institutions
IdF Identity federation
Line: 20 to 20
 
SUI New Scientist's interface
Changed:
<
<

Common features for Automatic access and Owner controlled access

>
>

Common features for Automatic and Controlled authorization

 
Added:
>
>

Shibboleth authentication

 
Changed:
<
<
The term Owner is replaced by the term Content provider (CP) in this document. The content provider acquires linguistic resources and sufficient rights to use them from the copyright owner.
>
>
Shibboleth authentication means here the HAKA authentication for users of Finnish universities, polytechnics and research institutions. Shibboleth authentication is a common feature preceding both the Automatic and Controlled authorization.
 
Deleted:
<
<

Shibboleth authentication

 
Changed:
<
<
  • In the CLARIN communtiy, the ePPN attribute is currently seen as the minimum necessary attribute. The rest is dependent on how well the attributes sets and their semantics can be harmonized, something we hope will happen via the eduGAIN 3.0 project.
>
>
  • In the CLARIN community, the ePPN attribute is currently seen as the minimum necessary attribute. The rest is dependent on how well the attributes sets and their semantics can be harmonized, something we hope will happen via the eduGAIN 3.0 project.
 

Resources

Categories

Changed:
<
<
Linguistic resources have to be equipped with access information divided into three categories:
>
>
Linguistic resources (corpora) have to be equipped with access information divided into three categories:
 
Changed:
<
<
(1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.) LRT to which the CP can grant an access automatically if the user has an affiliation with an IdP. See Automatic access to resources
>
>
(1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.) Whether there will be resources falling in this category must be studied. LRT to which the CP can grant an access automatically if the user has an affiliation with an IdP.
 
Changed:
<
<
(2) LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will prevent continuation of the resource access process - One-sided: commitment by user to predetermined CP terms. See Automatic access to resources
>
>
(2) LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will prevent continuation of the resource access process - One-sided: commitment by user to predetermined CP terms. See Automatic authorization
 
Changed:
<
<
(3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled access to resources
>
>
(3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled authorization
 

Terms and Conditions

Line: 80 to 80
  CSC will monitor and gather usage statistics.
Changed:
<
<
>
>
 
Changed:
<
<

Automatic access to resources

Figure: User process for linguistics with Shibboleth authentication and automatic access to resources
>
>

Automatic authorization

Figure: User process for linguistics with Shibboleth authentication and automatic authorization
 
Changed:
<
<
  • Shibboleth authentication (Shibboleth authentication at the user's home organization is required)
  • Category 1: Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
  • Category 2: Acceptance by the user of Terms and Conditions attached to the resource is required.
  • User information will not need to be saved in the database nor will the user need a CSC user account in categories 1 and 2. The user can download the resources (corpora) and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in to CSC's computing environment.
>
>
  • After Shibboleth authentication (Shibboleth authentication at the user's home organization is required)
  • Available linguistic resources of Category 2 with limitations defined by the CPs (for example, the resource can be accessible to users from University A only). See Resource Manager documentation below. How will the user view a list of resources before/after being authenticated and authorized? Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
  • After being authorized, the user can download the resources of Category 2 (corpora) and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in to CSC's computing environment.
  • User information will not need to be saved in the database nor will the user need a CSC user account.
 
Added:
>
>
 

Resource Manager Documentation

The term Resource Manager is used for compatibility with the CLARIN Language Resource and Technology Federation document, in which topic 5 (Requirements) leaves resource management to the centers. The Resource Manager is the authorization component that automatically allows or denies access to files according to user attributes. A linguistic resource or corpus can contain one or several files.
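
As a rough illustration of attribute-based authorization, the sketch below compares one rule against the attributes released for a user. The "key=value" rule syntax, the function name is_access_allowed and the attribute name homeOrganization are assumptions made for this example only; the actual format of the rights field is defined elsewhere in this document.

```python
from typing import Mapping


def is_access_allowed(rights: str, user_attributes: Mapping[str, str]) -> bool:
    """Allow access only if the user's attribute matches the rule.

    Assumes a single hypothetical rule of the form "key=value".
    """
    key, separator, value = rights.partition("=")
    if not separator:
        # Malformed or empty rule: deny by default.
        return False
    return user_attributes.get(key.strip()) == value.strip()


# Example: a resource limited to users from "University A" (made-up value).
rule = "homeOrganization=university-a.example"
print(is_access_allowed(rule, {"homeOrganization": "university-a.example"}))
print(is_access_allowed(rule, {"homeOrganization": "university-b.example"}))
```

Denying on a malformed rule keeps a typing error in the rights field from opening a resource by accident.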

Line: 137 to 139
 

Required features for production

Changed:
<
<
We recommend that the Resource Manager model described as Demo will be chosen for production to grant automatic access to the chosen resources. In addition to the features of Demo, the following features are needed for production:
>
>
We recommend that the Resource Manager model described as Demo will be chosen for production to grant automatic authorization to the chosen resources. In addition to the features of Demo, the following features are needed for production:
 
Option for setting rights (to be specified)
Line: 157 to 159
 
  • Usage statistics (the httpd server log is already published by analog, but is that enough?).
  • Groups. Groups are functionally equal to an OR operation over a list of users, but long lists are stored more efficiently and conveniently in their own tables.
Changed:
<
<

OwnerControlled access to resources

>
>

Controlled authorization

  Figure: User process for linguistics with Shibboleth authentication, electronic applications and referees
Line: 272 to 274
  After the CP's and administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. CSC's current UNIX/LINUX based environment uses unix groups for user management (e.g. Lemmie and DMA). The CSC user account allows command line
Changed:
<
<
access to a server. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
>
>
access to a server. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account.

If the user's home organization is a member of Haka, (s)he can log onto Scientist's interface using the username and password issued by his/her home university. During the first visit the user is also asked for the CSC user account, so that user's ePPN?CHECK can be linked to CSC user account. The next time the CSC user account will no longer be needed. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.

  The referees will be nominated by the Helsinki University Department of General Linguistics and the Administrator.

Revision 21 - 2009-05-26 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 35 to 35
 

Resources

Added:
>
>

Categories

 
Changed:
<
<

Categories

Linguistic resources have to be equipped with access information divided into three categories:

>
>
Linguistic resources have to be equipped with access information divided into three categories:
  (1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.) LRT to which the CP can grant an access automatically if the user has an affiliation with an IdP. See Automatic access to resources
Line: 47 to 47
  (3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled access to resources
Changed:
<
<
>
>
 

Terms and Conditions

(a) Terms of Access

Line: 124 to 124
  Each record in the resurssi table contains a file description. A resource can contain one or several files.
The path_hash is an index and ensures the security of the demo system. It is generated with Python's md5 module by calling md5.new(realname).hexdigest(), where realname is realpath(join(root, name)).
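
For readers on a current Python version, the same computation can be sketched with hashlib, which replaced the old md5 module (removed in Python 3). The function name path_hash and the UTF-8 encoding of the path are assumptions of this sketch; the demo system itself uses md5.new as described above.

```python
import hashlib
from os.path import join, realpath


def path_hash(root: str, name: str) -> str:
    """Return the hex MD5 digest of a file's resolved absolute path."""
    # Resolve symlinks and relative components, as in the description above.
    realname = realpath(join(root, name))
    # hashlib works on bytes; UTF-8 is an assumed encoding for the path.
    return hashlib.md5(realname.encode("utf-8")).hexdigest()
```

The result is a 32-character hexadecimal string, so it fits a fixed-width index column.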
Changed:
<
<
The owner moderator is Shibboleth EPPN (EduPersonPrincipalName). In the future, the ownerThe moderator can set the rights.
>
>
The moderator is Shibboleth EPPN (EduPersonPrincipalName). The moderator can set the rights.
 Only path information is shown to the user.
Only right_type 0 is used.
The rights field contains a Shibboleth attribute key value string. The rights field can contain one of the following sample strings:
Line: 143 to 143
 
  • CP sees a list of all of his/her files/resources.
  • Moderator (e.g. the Language Bank Administrator) can edit the rights field on behalf of the CP.
Changed:
<
<
* this should include some (limited) prescribed usage right types (e.g. 1.free for all purposes; 2. free for research and education; 3. restricted; consent to specific terms required). Moreover, this option should allow for the deposition of the specific terms for usage which the applicant may sign electronically.
>
>
* this should include some (limited) prescribed usage right types, see Resource categories. Moreover, this option should allow for the deposition of the specific terms for usage which the applicant may sign electronically.
 

Other required features for production
Line: 174 to 174
 
  • Electronic Application Form (as Shibboleth authenticated and prefilled)
    • Required attributes
  • Available linguistic research resources (with limitations defined by the CPs) How will the user view a list of resources before/after being authenticated and authorized?
Changed:
<
<
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
>
>
 
  • Send

If the user cannot be authenticated with Shibboleth:

Line: 182 to 182
 
  • Email confirmation required
  • Update the CAC field in the database
  • Available linguistic research resources (with limitations defined by CPs)
Changed:
<
<
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
>
>
 
  • Send

If the user already has a CSC account, after logging onto the CSC Scientist's Interface (https://hotpage.csc.fi/):

  • Electronic Application Form for CSC users
  • Personal and project information update (if needed)
  • Available linguistic research resources (with limitations defined by the CPs)
Changed:
<
<
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
>
>
 
  • Send

After Send:

Revision 20 - 2009-05-25 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Changed:
<
<
This document offers the requirements documentation for the development project of AAI for Finnish language resources.
>
>
This document offers the requirements documentation for the development project of AAI for Finnish language resources.
  Contents
Line: 31 to 31
 
  • HAKA as Identity Provider Federation (Haka pages)
  • HAKA login (WAYF Service, later to be replaced by Shibboleth2 Discovery Service)
  • Provided attributes (funetEduPerson schema)
Changed:
<
<
  • In the CLARIN communtiy, ePPN@Domain is currently seen as the minimum necessary attribute. The rest is dependent on how well the attributes sets and their semantics can be harmonized, something we hope will happen via the eduGAIN 3.0 project.
>
>
  • In the CLARIN community, the ePPN attribute is currently seen as the minimum necessary attribute. The rest is dependent on how well the attribute sets and their semantics can be harmonized, something we hope will happen via the eduGAIN 3.0 project.
 

Resources

Line: 43 to 43
  (1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.) LRT to which the CP can grant an access automatically if the user has an affiliation with an IdP. See Automatic access to resources
Changed:
<
<
(2) LRT to which the CP can grant a license automatically pending electronic signature by the user - One-sided: commitment by user to predetermined CP terms. LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will prevent continuation of the resource access process - One-sided: commitment by user to predetermined CP terms. See Automatic access to resources
>
>
(2) LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will prevent continuation of the resource access process - One-sided: commitment by user to predetermined CP terms. See Automatic access to resources
 
Changed:
<
<
(3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled access to resources
>
>
(3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled access to resources
 
Changed:
<
<

Terms and Conditions

>
>

Terms and Conditions

  (a) Terms of Access
Line: 66 to 62
 will apply for the resource in question, or a resource-specific license agreement that the CP provides
Changed:
<
<
Must be specified later.
>
>
Must be specified later.
 

Language selection: Finnish/English

Line: 76 to 72
 
Changed:
<
<

Loading linguistic resources

>
>

Loading linguistic resources

 
Changed:
<
<
Process for the Language Bank Administrator to add resources to a server will be specified later. Whether CP or other people can upload resources must be studied, there may be safety considerations.
>
>
The process for the Language Bank Administrator to add resources to a server will be specified later. Whether the CP or other people can upload resources must be studied; there may be safety and copyright considerations.
 
Changed:
<
<

Monitoring and statistics

>
>

Monitoring and statistics

 
Changed:
<
<
CSC will monitor and gather usage statistics.
>
>
CSC will monitor and gather usage statistics.
 
Line: 146 to 142
 
Option for setting rights (to be specified)

  • CP sees a list of all of his/her files/resources.
Changed:
<
<
  • Moderator (e.g. the Language Bank Administrator) can edit the rights field on behalf of the CP.
>
>
  • Moderator (e.g. the Language Bank Administrator) can edit the rights field on behalf of the CP.
  * this should include some (limited) prescribed usage right types (e.g. 1. free for all purposes; 2. free for research and education; 3. restricted; consent to specific terms required). Moreover, this option should allow for the deposition of the specific terms for usage which the applicant may sign electronically.
Line: 170 to 166
  There are three different electronic application forms, both in English and Finnish. There are also forms for CSC's internal use to follow the application workflow status. Each form requires a program to handle it. Also the email responses require handling.
Changed:
<
<
Commercial users need to contact CSC sales and sign a contract to access the resources. In the Language Bank the following types of licenses are currently available: A License (Academic License) and B License (Extended Commercial License).
>
>
Commercial users need to contact CSC sales and sign a contract to access the resources. In the Language Bank the following types of licenses are currently available: A License (Academic License) and B License (Extended Commercial License).
 
Line: 203 to 199
 
Changed:
<
<
The electronic application form can also be used to collect new referee information. Each electronic application form can contain a checkbox for the referee candidate to express his or her willingness to function as a referee. It will be clearly indicated on the web page whether user access or referee promotion is being applied for.
>
>
The electronic application form can also be used to collect new referee information. Each electronic application form can contain a checkbox for the referee candidate to express his or her willingness to function as a referee. It will be clearly indicated on the web page whether user access or referee promotion is being applied for.
 

Line: 229 to 225
 

Recommend and Deny

Changed:
<
<
For the referee recommendation, the system has a very secret passphrase which is SHA-hashed with the applid value.
>
>
For the referee recommendation, the system has a secret passphrase which is SHA-hashed with the applid value.
 There are two web programs: Recommend and Deny. The email message generated for the referee contains links to both of them together with the application data. When the referee recommends that the application be accepted, he or she clicks the recommend link that is parametrized with an applid and a SHA hash value, and the hash will be checked.

When the hash matches, the recommend program increments the CAC field value by 32.
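The link check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual code: the document only says "SHA-hashed", so SHA-256 is an assumption, and the names `SECRET_PASSPHRASE`, `make_token`, `check_token` and `recommend` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical placeholder; the real passphrase would live in server-side configuration.
SECRET_PASSPHRASE = b"change-me"

REFEREE_RECOMMENDATION = 32  # CAC increment applied by the Recommend program


def make_token(applid: int, passphrase: bytes = SECRET_PASSPHRASE) -> str:
    """Hash the secret passphrase together with the applid (SHA-256 is an assumption)."""
    return hashlib.sha256(passphrase + str(applid).encode("ascii")).hexdigest()


def check_token(applid: int, token: str, passphrase: bytes = SECRET_PASSPHRASE) -> bool:
    """Recompute the hash for this applid and compare it in constant time."""
    expected = make_token(applid, passphrase).encode("ascii")
    return hmac.compare_digest(expected, token.encode("utf-8"))


def recommend(applid: int, token: str, cac_value: int) -> int:
    """Return the new CAC value if the link parameters are valid."""
    if not check_token(applid, token):
        raise PermissionError("hash does not match applid")
    return cac_value + REFEREE_RECOMMENDATION
```

A keyed construction such as `hmac.new(passphrase, message, hashlib.sha256)` is the more conventional way to combine a secret with a public value; plain concatenation is kept here only because it mirrors the wording of the requirement.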

Line: 247 to 243
 
  1. Referee's Deny email to CP.
  2. Referee's Deny email to administrator.
  3. Referee's No reply email to CP (automatically after a delay).
Changed:
<
<
  1. Referee's No reply email to administrator (automatically after a delay 7 days).
>
>
  1. Referee's No reply email to administrator (automatically after a delay).
  In the case of a referee candidate, emails to the CP will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics.

Timer-process

Changed:
<
<
Timer-process has to be initiated when the email is sent to the referee.
>
>
Timer-process has to be initiated when the email is sent to the referee. Time limits can be adjusted as desired.
 
  • a reminder is sent if the referee has not answered within a certain time = 8 days
  • timer will expire after a delay = 15 days (email will be sent to the CP and administrator)
  • timer will be cancelled if the referee sends Recommend or Deny
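The timer rules listed above could be expressed as a small decision function. This is only a sketch: the function name `timer_action` and the returned labels are hypothetical, and the 8 and 15 day limits are the adjustable values quoted in the list.

```python
from datetime import datetime, timedelta

# Adjustable limits taken from the list above.
REMINDER_AFTER = timedelta(days=8)
EXPIRE_AFTER = timedelta(days=15)


def timer_action(sent_at: datetime, now: datetime,
                 replied: bool = False, reminder_sent: bool = False) -> str:
    """Decide what the timer process should do for one referee request."""
    if replied:
        # Recommend or Deny was received: the timer is cancelled.
        return "cancel"
    elapsed = now - sent_at
    if elapsed >= EXPIRE_AFTER:
        # No reply in time: notify the CP and the administrator.
        return "expire"
    if elapsed >= REMINDER_AFTER and not reminder_sent:
        return "remind"
    return "wait"
```

A periodic job (for example a daily cron run) could call this for every open request and act on the returned label.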
Line: 273 to 270
 If both the CP and the administrator accept the application (ref. Accept and Reject), the user will receive the access with the required permissions. Despite being rejected by the referee, the administrator still retains the option to accept the application, providing the CP agrees.
Changed:
<
<
After the CP's and administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
>
>
After the CP's and administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. CSC's current UNIX/Linux-based environment uses Unix groups for user management (e.g. Lemmie and DMA). The CSC user account allows command line access to a server. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
  The referees will be nominated by the Helsinki University Department of General Linguistics and the Administrator.
Line: 311 to 310
  The CAC field in the application table describes how the user's identity is verified. Information on how each user is authenticated needs to be stored in the database, because stronger authentication than the currently used personal signature may be required. It should be added into the database table kayttajat (users).
Changed:
<
<
  • The minimum level of trust for authentication (expressed by CAC values) is 32.
  • Required CAC values per resource have to be defined.
>
>
  • The minimum level of trust for authentication (expressed by CAC values) is 32.
  • Required CAC values per resource have to be defined.
  The CAC field can take one or several of the values listed below. If several values are selected, they will be summed.

  • 0. Not authenticated (data stored from web form).
  • 1. User-verified email. Authentication by an email confirmation from any address.
Changed:
<
<
  • 2. Organization-verified email. Authentication by an email confirmation from a well-known CSC customer organization. HAKA members and state institutions can be considered as well-known CSC customer organizations. Well-known CSC customer organizations must be defined.
>
>
  • 2. Organization-verified email. Authentication by an email confirmation from a well-known CSC customer organization. HAKA members and state institutions can be considered as well-known CSC customer organizations.
 
  • 4. Authentication using a credit card or a good certificate issued by a well-known CA.
  • 8. Scanned signature in a pdf-document.
  • 16. Personal signature (default value for current CSC customers).
Changed:
<
<
  • 32. Referee recommendation: a known professor or research director recommends that the application be accepted. In addition, official identification (photo ID) can be verified by a referee.
  • 64. Strong authentication using SAML2/Shibboleth or grid certificates (in the USA: urn:mace:incommon:iap:bronze). Alternatively, official identification (photo ID) verified by a referee.
>
>
  • 32. Referee recommendation: a known professor or research director recommends that the application be accepted. In addition, official identification (photo ID) can be verified by a referee.
  • 64. Strong authentication using SAML2/Shibboleth or grid certificates (in the USA: urn:mace:incommon:iap:bronze).
 
  • 128. Official identification verified by a bank account (tupas) or more secure certificates (in the USA: urn:mace:incommon:iap:silver).
  • 256. CSC-checked official identification card or passport.
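Because every CAC value is a distinct power of two, the summed field behaves like a set of bit flags. The sketch below illustrates this; the class and member names are hypothetical, and it assumes that "minimum level of trust is 32" means a summed value of at least 32. Since 1 + 2 + 4 + 8 + 16 = 31, that threshold can only be reached with at least one method of class 32 or higher.

```python
from enum import IntFlag


class CAC(IntFlag):
    """CSC Authentication Classes; member names are illustrative only."""
    NOT_AUTHENTICATED = 0
    USER_VERIFIED_EMAIL = 1
    ORG_VERIFIED_EMAIL = 2
    CARD_OR_CERTIFICATE = 4
    SCANNED_SIGNATURE = 8
    PERSONAL_SIGNATURE = 16
    REFEREE_RECOMMENDATION = 32
    STRONG_AUTHENTICATION = 64
    BANK_VERIFIED_ID = 128
    CSC_CHECKED_ID = 256


MINIMUM_TRUST = 32


def add_method(cac_value: int, method: CAC) -> int:
    """Record one more verification method in the summed CAC value."""
    # Bitwise OR equals addition for a new method and avoids double counting a repeated one.
    return int(cac_value | method)


def meets_minimum(cac_value: int, minimum: int = MINIMUM_TRUST) -> bool:
    """Assumed interpretation: the summed value must reach the minimum."""
    return cac_value >= minimum


def methods_in(cac_value: int) -> list:
    """List the names of the methods contained in a summed CAC value."""
    return [m.name for m in CAC if m.value and (cac_value & m.value)]
```

For example, a user with a verified email (1) and a personal signature (16) has CAC 17, which is below the minimum; a referee recommendation raises it to 49.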

Revision 19 2009-05-18 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Added:
>
>
This document offers the requirements documentation for the development project of AAI for Finnish language resources.

 Contents

Definitions

Added:
>
>
CO Copyright owner
 
CP Content provider
Database MySQL Language Bank database, future references to the database will refer to this MySQL database in this document
HAKA Identity federation of the Finnish universities, polytechnics and research institutions
Line: 38 to 41
 Linguistic resources have to be equipped with access information divided into three categories:

(1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.)

Changed:
<
<
LRT to which the CP can grant an access automatically if the user has an affiliation with an IdP. This has to be specified later.
>
>
LRT to which the CP can grant an access automatically if the user has an affiliation with an IdP. See Automatic access to resources
  (2) LRT to which the CP can grant a license automatically pending electronic signature by the user - One-sided: commitment by user to predetermined CP terms.
Changed:
<
<
LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will result prevent continuation of the corpora access process - One-sided: commitment by user to predetermined CP terms. See Automatic access to resources
>
>
LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will prevent continuation of the resource access process - One-sided: commitment by user to predetermined CP terms. See Automatic access to resources
 

(3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled access to resources

Line: 49 to 52
 

Terms and Conditions

Changed:
<
<
(a) Terms of Access
>
>

(a) Terms of Access

  A description of all the requirements that the applicant has to
Changed:
<
<
satisfy in order to gain access, containing e.g. the applicant's motivation, need for referee, other necessary information such as a CV, List of Publications, Research plan (we may have to discuss whether links or simple text [with ample length] will suffice or whether some standard file format such as pdf is necessary)
>
>
satisfy in order to gain access
  (b) Terms of Use: Code-of-Conduct/License Agreement

Here the alternatives are either some of a very few general research purpose EULAs that the applicant might already have signed, and which will apply for the resource in question, or a resource-specific

Changed:
<
<
license agreement that the owner provides
>
>
license agreement that the CP provides
  Must be specified later.
Line: 91 to 92
 
  • Shibboleth authentication (Shibboleth authentication at the user's home organization is required)
  • Category 1: Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
  • Category 2: Acceptance by the user of Terms and Conditions attached to the resource is required.
Changed:
<
<
  • User information will not need to be saved in the database nor will the user need a CSC user account.
>
>
  • User information will not need to be saved in the database nor will the user need a CSC user account in categories 1 and 2. The user can download the resources (corpora) and save them to his/her own machine, but cannot execute applications (Lemmie or DMA) or log in to CSC's computing environment.
 

Resource Manager Documentation

Line: 177 to 178
 
  • Electronic Application Form (as Shibboleth authenticated and prefilled)
    • Required attributes
  • Available linguistic research resources (with limitations defined by the CPs) How will the user view a list of resources before/after being authenticated and authorized?
Changed:
<
<
>
>
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
 
  • Send

If the user cannot be authenticated with Shibboleth:

Line: 185 to 186
 
  • Email confirmation required
  • Update the CAC field in the database
  • Available linguistic research resources (with limitations defined by CPs)
Added:
>
>
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
 
  • Send

If the user already has a CSC account, after logging onto the CSC Scientist's Interface (https://hotpage.csc.fi/):

  • Electronic Application Form for CSC users
  • Personal and project information update (if needed)
  • Available linguistic research resources (with limitations defined by the CPs)
Added:
>
>
  • Acceptance by the user of Terms and Conditions attached to the resource is required.
 
  • Send

After Send:

Line: 200 to 203
 
Changed:
<
<
The electronic application form can also be used to collect new referee information. Each electronic application form can contain a checkbox for the referee candidate to express his or her willingness to function as a referee.
>
>
The electronic application form can also be used to collect new referee information. Each electronic application form can contain a checkbox for the referee candidate to express his or her willingness to function as a referee. It will be clearly indicated on the web page whether user access or referee promotion is being applied for.
 

Changed:
<
<

Referee's authorization and authentication

>
>

Referee's authorization

  An applying user becomes trusted by being approved by a referee. A new electronic form with a referee list is needed in English and Finnish. The form requires a program to handle it. The response to the email sent by the system will be via a web form (not by replying to the mail).
Line: 219 to 222
  In this model, a referee losing status would affect the associated users as well. Loss of status due to natural reasons (e.g. retirement, transition) lacks this effect.
Added:
>
>

Authentication of the user by a referee

A referee can authenticate an applying user if the user cannot be authenticated via Shibboleth. This will be specified later.

 

Recommend and Deny

Changed:
<
<
For the referee recommendation, the system has a very secret passphrase which is SHA-hashed with the applid value.
>
>
For the referee recommendation, the system has a very secret passphrase which is SHA-hashed with the applid value.
 There are two web programs: Recommend and Deny. The email message generated for the referee contains links to both of them together with the application data. When the referee recommends that the application be accepted, he or she clicks the recommend link that is parametrized with an applid and a SHA hash value, and the hash will be checked.

When the hash matches, the recommend program increments the CAC field value by 32.

Line: 243 to 249
 
  1. Referee's No reply email to CP (automatically after a delay).
  2. Referee's No reply email to administrator (automatically after a delay 7 days).
Changed:
<
<
In the case of a referee candidate, emails to the CP will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics. Alternatively, an option to have e.g. one nominator in each country can be studied if necessary.
>
>
In the case of a referee candidate, emails to the CP will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics.
 

Timer-process

Revision 18 2009-05-18 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 17 to 17
 
SUI New Scientist's interface
Changed:
<
<

Common features for Automatic access and Content provider controlled access

>
>

Common features for Automatic access and Owner controlled access

 

The term Owner is replaced by the term Content provider (CP) in this document. The content provider acquires linguistic resources and sufficient rights to use them from the copyright owner.

Line: 37 to 37
  Linguistic resources have to be equipped with access information divided into three categories:
Changed:
<
<
(1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.)
>
>
(1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.) LRT to which the CP can grant an access automatically if the user has an affiliation with an IdP. This has to be specified later.
 
Changed:
<
<
(2) LRT to which the CP can grant a license automatically / automatic access pending electronic signature / HAKA authentication by the user - One-sided: commitment by user to predetermined CP terms. See Automatic access to resources
>
>
(2) LRT to which the CP can grant a license automatically pending electronic signature by the user - One-sided: commitment by user to predetermined CP terms. LRT to which the CP can grant an access automatically pending acceptance by the user of Terms and Conditions attached to the corpus/resource. Failure to accept the Terms and Conditions will result prevent continuation of the corpora access process - One-sided: commitment by user to predetermined CP terms. See Automatic access to resources
  (3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled access to resources
Changed:
<
<

Terms of use and other conditions

>
>

Terms and Conditions

 
Added:
>
>
(a) Terms of Access
 
Added:
>
>
A description of all the requirements that the applicant has to satisfy in order to gain access, containing e.g. the applicant's motivation, need for referee, other necessary information such as a CV, List of Publications, Research plan (we may have to discuss whether links or simple text [with ample length] will suffice or whether some standard file format such as pdf is necessary)
 
Added:
>
>
(b) Terms of Use: Code-of-Conduct/License Agreement

Here the alternatives are either some of a very few general research purpose EULAs that the applicant might already have signed, and which will apply for the resource in question, or a resource-specific license agreement that the owner provides

Must be specified later.

 

Language selection: Finnish/English

Line: 56 to 74
 
  • CSC will implement an architecture that will support the addition of more languages in the future in a relatively easy manner.
Deleted:
<
<

Automatic access to resources

 
Added:
>
>

Loading linguistic resources

 
Changed:
<
<
Figure: User process for linguistics with Shibboleth authentication and automatic access to resources
>
>
The process for the Language Bank Administrator to add resources to a server will be specified later. Whether the CP or other people can upload resources must be studied; there may be safety considerations.

Monitoring and statistics

CSC will monitor and gather usage statistics.

 
Changed:
<
<
Shibboleth authentication (Shibboleth authentication at the user's home organization is required)
>
>

Automatic access to resources

Figure: User process for linguistics with Shibboleth authentication and automatic access to resources
 
Changed:
<
<
User information will not need to be saved in the database nor will the user need a CSC user account.
>
>
  • Shibboleth authentication (Shibboleth authentication at the user's home organization is required)
  • Category 1: Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
  • Category 2: Acceptance by the user of Terms and Conditions attached to the resource is required.
  • User information will not need to be saved in the database nor will the user need a CSC user account.
 

Resource Manager Documentation

The term Resource Manager is used for compatibility with the CLARIN Language Resource and Technology Federation document, in which the topic 5 (Requirements) leaves resource management to the centers. The Resource Manager is the authorization component that automatically allows or denies access to files according to user attributes. A linguistic resource or corpus can contain one or several files.

  • Required attributes (per resource or file) must be solved.
Deleted:
<
<
  • Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
 

Line: 126 to 152
 
Other required features for production

  • MySQL Language Bank database will be used instead of the current demo database.
Changed:
<
<
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0. This may be implemented in the new MySQL Language Bank database.
>
>
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0. This will be specified later. This may be implemented in the new MySQL Language Bank database.
 
  • Careful planning of the database structure.
  • Recursive views and functionality per resource for all subdirectories and files under them, like Unix chmod -R. Is there a need to deny access to certain files within a resource?
  • Showing the user the file sizes and adding the size information into the database.
Line: 141 to 167
 

Electronic application form processing

Changed:
<
<
There are three different electronic application forms, both in English and Finnish. There are also forms for CSC's internal use to follow the application workflow status. Each form requires a program to handle it. Also the email responses require handling. Commercial users need to sign a contract to access the resources.
>
>
There are three different electronic application forms, both in English and Finnish. There are also forms for CSC's internal use to follow the application workflow status. Each form requires a program to handle it. Also the email responses require handling.

Commercial users need to contact CSC sales and sign a contract to access the resources. In the Language Bank the following types of licenses are currently available: A License (Academic License) and B License (Extended Commercial License).

  After the Shibboleth authentication:
Line: 213 to 243
 
  1. Referee's No reply email to CP (automatically after a delay).
  2. Referee's No reply email to administrator (automatically after a delay 7 days).
Changed:
<
<
In the case of a referee candidate, emails to the CP will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics.
>
>
In the case of a referee candidate, emails to the CP will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics. Alternatively, an option to have e.g. one nominator in each country can be studied if necessary.
 

Timer-process

Line: 318 to 348
 
projectdescription text yes
newreferee char 1 yes
Changed:
<
<
  • A postal address is required for sending the password, magazines and Christmas cards.
>
>
  • A postal address is required for sending the password, magazines and Christmas cards. Fields must be rechecked.
 
  • Will the applied resources be stored here?

Revision 17 2009-05-15 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 18 to 18
 

Common features for Automatic access and Content provider controlled access

Changed:
<
<
>
>
  The term Owner is replaced by the term Content provider (CP) in this document. The content provider acquires linguistic resources and sufficient rights to use them from the copyright owner.

Shibboleth authentication

Changed:
<
<
>
>
 
  • Available linguistic research resources (current www location)
  • HAKA as Identity Provider Federation (Haka pages)
  • HAKA login (WAYF Service, later to be replaced by Shibboleth2 Discovery Service)
Line: 39 to 39
  (1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.)
Changed:
<
<
(2) LRT to which the CP can grant a license automatically / automatic access pending electronic signature / HAKA authentication by the user - One-sided: commitment by user to predetermined CP terms
>
>
(2) LRT to which the CP can grant a license automatically / automatic access pending electronic signature / HAKA authentication by the user - One-sided: commitment by user to predetermined CP terms. See Automatic access to resources

(3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP. See Controlled access to resources

 
Deleted:
<
<
(3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP
 

Terms of use and other conditions

Line: 55 to 56
 
  • CSC will implement an architecture that will support the addition of more languages in the future in a relatively easy manner.
Changed:
<
<
>
>
 

Automatic access to resources

Added:
>
>
 Figure: User process for linguistics with Shibboleth authentication and automatic access to resources

Shibboleth authentication (Shibboleth authentication at the user's home organization is required)

Line: 132 to 134
 
  • Usage statistics (the httpd server log is already published by Analog, but is that enough?).
  • Groups. Groups are functionally equal to an OR operation over a list of users, but long lists are more efficient and user-friendly when stored in their own tables.
Changed:
<
<

Content provider controlled access to resources

>
>

OwnerControlled access to resources

  Figure: User process for linguistics with Shibboleth authentication, electronic applications and referees
Line: 271 to 273
 

CSC Authentication Classes (CAC field)

Changed:
<
<
The CAC field in the application table describes how the user's identity is verified. Information on how each user is authenticated needs to be stored in the database, because stronger authentication than the currently used personal signature may be required. It should be added into the database table kayttajat (users). The CAC field can take one or several of the values listed below. If several values are selected, they will be summed.
>
>
The CAC field in the application table describes how the user's identity is verified. Information on how each user is authenticated needs to be stored in the database, because stronger authentication than the currently used personal signature may be required. It should be added into the database table kayttajat (users).

  • The minimum level of trust for authentication (expressed by CAC values) is 32.
  • Required CAC values per resource have to be defined.

The CAC field can take one or several of the values listed below. If several values are selected, they will be summed.

 
  • 0. Not authenticated (data stored from web form).
  • 1. User-verified email. Authentication by an email confirmation from any address.
Changed:
<
<
  • 2. Organization-verified email. Authentication by an email confirmation from a well-known CSC customer organization. Well-known CSC customer organizations must be defined.
>
>
  • 2. Organization-verified email. Authentication by an email confirmation from a well-known CSC customer organization. HAKA members and state institutions can be considered as well-known CSC customer organizations. Well-known CSC customer organizations must be defined.
 
  • 4. Authentication using a credit card or a good certificate issued by a well-known CA.
  • 8. Scanned signature in a pdf-document.
  • 16. Personal signature (default value for current CSC customers).
Changed:
<
<
  • 32. Referee recommendation: a known professor or research director recommends that the application be accepted.
  • 64. Strong authentication using SAML2/Shibboleth or grid certificates (in the USA: urn:mace:incommon:iap:bronze). Alternatively, official identification (photo ID) verified by a referee.
>
>
  • 32. Referee recommendation: a known professor or research director recommends that the application be accepted. In addition, official identification (photo ID) can be verified by a referee.
  • 64. Strong authentication using SAML2/Shibboleth or grid certificates (in the USA: urn:mace:incommon:iap:bronze). Alternatively, official identification (photo ID) verified by a referee.
 
  • 128. Official identification verified by a bank account (tupas) or more secure certificates (in the USA: urn:mace:incommon:iap:silver).
  • 256. CSC-checked official identification card or passport.

Revision 16 2009-05-14 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 33 to 33
 

Resources

Added:
>
>

Categories

 Linguistic resources have to be equipped with access information divided into three categories:

(1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.)

Changed:
<
<
(2) LRT to which the CP can grant a license automatically pending electronic signature??CHECK by the user - One-sided: commitment by user to predetermined CP terms
>
>
(2) LRT to which the CP can grant automatic access pending HAKA authentication by the user - One-sided: commitment by user to predetermined CP terms
  (3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP
Added:
>
>

Terms of use and other conditions

 

Language selection: Finnish/English

  • There are three different electronic application forms, both in English and Finnish.
Line: 92 to 99
  Each record in the resurssi table contains a file description. A resource can contain one or several files.
The path_hash is an index and ensures the security of the demo system. It's generated by a python md5 object by the md5.new(realname).hexdigest() command, where realname is realpath(join(root, name)).
Changed:
<
<
The owner>moderator is Shibboleth EPPN (EduPersonPrincipalName). The moderator can set the rights.
>
>
The owner moderator is Shibboleth EPPN (EduPersonPrincipalName). In the future, the ownerThe moderator can set the rights.
 Only path information is shown to the user.
Only right_type 0 is used.
The rights field contains a Shibboleth attribute key value string. The rights field can contain one of the following sample strings :

Revision 152009-05-13 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Added:
>
>
Contents

 

Definitions

CP Content provider
Line: 34 to 37
  (1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.)
Changed:
<
<
(2) LRT to which the CP can grant a license automatically pending ??CHECK electronic signature by the user - One-sided: commitment by user to predetermined CP terms
>
>
(2) LRT to which the CP can grant a license automatically pending electronic signature??CHECK by the user - One-sided: commitment by user to predetermined CP terms
  (3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP
Line: 107 to 110
 
Option for setting rights (to be specified)

  • CP sees a list of all of his/her files/resources.
Changed:
<
<
  • Moderator: The Language Bank Administrator can edit the rights field on behalf of the CP.
>
>
  • Moderator (e.g. the Language Bank Administrator) can edit the rights field on behalf of the CP.
  * this should include some (limited) prescribed usage right types (e.g. 1.free for all purposes; 2. free for research and education; 3. restricted; consent to specific terms required). Moreover, this option should allow for the deposition of the specific terms for usage which the applicant may sign electronically.

Revision 142009-05-12 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Definitions

Changed:
<
<
CP: Content provider DATABASE: MySQL Language Bank database, future references to the database will refer to this MySQL database in this document.
HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.
IdF: Identity federation IdP: Identity provider LRT: Language resources and technology SP: SUI: New Scientist's interface
>
>
CP Content provider
Database MySQL Language Bank database, future references to the database will refer to this MySQL database in this document
HAKA Identity federation of the Finnish universities, polytechnics and research institutions
IdF Identity federation
IdP Identity provider
LRT Language resources and technology
SP Service provider
SUI New Scientist's interface
 
Changed:
<
<

Common features for Automatic access and Owner > ??CP EVERYWHERE-controlled access

>
>

Common features for Automatic access and Content provider controlled access

 
Added:
>
>
The term Owner is replaced by the term Content provider (CP) in this document. The content provider acquires linguistic resources and sufficient rights to use them from the copyright owner.
 

Shibboleth authentication

Line: 27 to 30
 

Resources

Changed:
<
<
Linguistic resources have to be equipped with access information:
>
>
Linguistic resources have to be equipped with access information divided into three categories:
  (1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.)
Changed:
<
<
(2) LRT to which the CP can grant a license automatically pending ??CHECK electronic signature by the user - One-sided: commitment by user to predetermined CP terms
>
>
(2) LRT to which the CP can grant a license automatically pending ??CHECK electronic signature by the user - One-sided: commitment by user to predetermined CP terms
  (3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP
Line: 56 to 59
 The term Resource Manager is used for compatibility with the CLARIN Language Resource and Technology Federation document, in which the topic 5 (Requirements) leaves resource management to the centers. The Resource Manager is the authorization component that automatically allows or denies access to files according to user attributes. A linguistic resource or corpus can contain one or several files.

  • Required attributes (per resource or file) must be solved.
Changed:
<
<
  • Whether automatic access on the basis of university affiliation alone may be given for research and education purposes, must be studied.
>
>
  • Whether automatic access on the basis of university affiliation alone may be given for research and education purposes must be studied.
 

Line: 78 to 81
 
path_hash varchar(32) NO PRI    
path text YES   NULL  
path_utf8 text YES   NULL  
Changed:
<
<
owner varchar(64) YES   NULL  
>
>
moderator varchar(64) YES   NULL  
 
right_type mediumint(8) unsigned YES   NULL  
rights varchar(255) YES   NULL  
+------------+-----------------------+------+-----+---------+-------+
Line: 86 to 89
  Each record in the resurssi table contains a file description. A resource can contain one or several files.
The path_hash is an index and ensures the security of the demo system. It's generated by a python md5 object by the md5.new(realname).hexdigest() command, where realname is realpath(join(root, name)).
Changed:
<
<
The owner is Shibboleth EPPN (EduPersonPrincipalName). In the future, the owner can set rights.
>
>
The owner>moderator is Shibboleth EPPN (EduPersonPrincipalName). The moderator can set the rights.
 Only path information is shown to the user.
Only right_type 0 is used.
The rights field contains a Shibboleth attribute key value string. The rights field can contain one of the following sample strings :
Line: 103 to 106
 
Option for setting rights (to be specified)
Changed:
<
<
  • The owner sees a list of all of his/her files/resources.
  • The Language Bank Administrator can edit the rights field on behalf of the owner.
>
>
  • CP sees a list of all of his/her files/resources.
  • Moderator: The Language Bank Administrator can edit the rights field on behalf of the CP.
  * this should include some (limited) prescribed usage right types (e.g. 1.free for all purposes; 2. free for research and education; 3. restricted; consent to specific terms required). Moreover, this option should allow for the deposition of the specific terms for usage which the applicant may sign electronically.

Other required features for production
Changed:
<
<
  • Using the CRAS database instead of the current demo database > New MySQL Language Bank database will be used instead of the current demo database.
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0. This may be implemented in the new MySQL Language Bank database.
>
>
  • MySQL Language Bank database will be used instead of the current demo database.
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0. This may be implemented in the new MySQL Language Bank database.
 
  • Really carefully planning the database structure.
  • Recursive views and functionality per resource for all subdirectories and files under them, like Unix chmod -R. Is there a need to deny access to certain files within a resource?
  • Showing the user the file sizes and adding the size information into the database.
Line: 120 to 123
 
  • Groups. Groups are functionally equal to an OR operation over a list of users, but long lists are stored more efficiently and conveniently in their own tables.
Changed:
<
<

Owner-controlled access to resources

>
>

Content provider controlled access to resources

  Figure: User process for linguistics with Shibboleth authentication, electronic applications and referees
Line: 131 to 134
 After the Shibboleth authentication:
Changed:
<
<
  • Available linguistic research resources (with limitations defined by the owners) How will the user view a list of resources before/after being authenticated and authorized?
>
>
  • Available linguistic research resources (with limitations defined by the CPs) How will the user view a list of resources before/after being authenticated and authorized?
 
  • Send
Line: 139 to 142
 
Changed:
<
<
  • Available linguistic research resources (with limitations defined by owners)
>
>
  • Available linguistic research resources (with limitations defined by CPs)
 
  • Send

If the user already has a CSC account, after logging onto the CSC Scientist's Interface (https://hotpage.csc.fi/):

Changed:
<
<
  • Available linguistic research resources (with limitations defined by the owners)
>
>
  • Available linguistic research resources (with limitations defined by the CPs)
 
  • Send

After Send:

Line: 153 to 156
 
Changed:
<
<
>
>
  The electronic application form can also be used to collect new referee information. Each electronic application form can contain a checkbox for the referee candidate to express his or her willingness to function as a referee.
Line: 165 to 168
  The referee's procedure to authorize and authenticate an applying user could be the following:
Changed:
<
<
  1. The user is forwarded to the Referee List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
>
>
  1. The user is forwarded to the Referees List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
 
  1. If the user expects that a referee knows him or her, he or she selects that referee. The application will be sent by email to the referee with links for recommending and denying. A timer-process has to be initiated when the email is sent to the referee.
    • The referee candidates select a referee, too.
Changed:
<
<
  1. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the owner and the Language Bank administrator to be accepted.
>
>
  1. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the CP and the Language Bank administrator to be accepted.
 
    • If the user does not know any referee, the application will be forwarded straight to the owner (or contact person) of the corpus and the Language Bank administrator.
Changed:
<
<
    • If the referee selects the deny link, a rejection message will be sent to the owner and the administrator.
>
>
    • If the referee selects the deny link, a rejection message will be sent to the CP and the administrator.
  In this model, a referee losing status would affect the associated users as well. Loss of status due to natural reasons (e.g. retirement, transition) lacks this effect.
Line: 183 to 186
  When the hash matches, the recommend program increments the CAC field value by 32.
Changed:
<
<
If the referee fails to reply in e.g. one week, he will receive a reminder. If the referee still fails to reply, the application will be forwarded to the owner and the administrator after a predefined delay (e.g. one week).
>
>
If the referee fails to reply in e.g. one week, he will receive a reminder. If the referee still fails to reply, the application will be forwarded to the CP and the administrator after a predefined delay (e.g. one week).
  If the hash does not match, the programs do nothing or warn the staff about abuse.
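
The hash check on the recommend link could be sketched as follows. The requirements do not say how the hash is computed, so this sketch assumes an HMAC-SHA256 token derived from an application identifier. SECRET_KEY, make_token and handle_recommend are hypothetical names. Only the increment of 32 and the "do nothing on mismatch" behaviour come from the text above.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical server-side secret
REFEREE_CAC = 32  # CAC value of a referee recommendation


def make_token(application_id, action):
    """Create the hash embedded in the link that is mailed to the referee."""
    message = f"{application_id}:{action}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def handle_recommend(application_id, token, cac_value):
    """Return (new_cac_value, accepted) for a Recommend request.

    When the hash does not match, the CAC value is left unchanged and the
    caller can decide whether to warn the staff about abuse.
    """
    expected = make_token(application_id, "recommend")
    if not hmac.compare_digest(expected, token):
        return cac_value, False
    # Setting the bit equals adding 32 when it was not set before, and it
    # avoids counting the same recommendation twice.
    return cac_value | REFEREE_CAC, True
```

Using hmac.compare_digest instead of == is a design choice of this sketch; it compares the two strings in constant time.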
Line: 191 to 194
 
  1. Referee Form sends an email to the referee for recommending or denying.
  2. Reminder email to the referee, if (s)he fails to reply (automatically after a delay).
Changed:
<
<
  1. Referee's Recommend email to owner.
>
>
  1. Referee's Recommend email to CP.
 
  1. Referee's Recommend email to administrator.
Changed:
<
<
  1. Referee's Deny email to owner.
>
>
  1. Referee's Deny email to CP.
 
  1. Referee's Deny email to administrator.
Changed:
<
<
  1. Referee's No reply email to owner (automatically after a delay).
>
>
  1. Referee's No reply email to CP (automatically after a delay).
 
  1. Referee's No reply email to administrator (automatically after a delay of 7 days).
Changed:
<
<
In the case of a referee candidate, emails to the owner will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics.
>
>
In the case of a referee candidate, emails to the CP will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics.
 
Deleted:
<
<
 

Timer-process

Timer-process has to be initiated when the email is sent to the referee.

  • if the referee has not answered in a certain time (reminder)= 8 days
Changed:
<
<
  • timer will expire after a delay = 15 days (email will be sent to the Owner and Administrator)
>
>
  • timer will expire after a delay = 15 days (email will be sent to the CP and administrator)
 
  • timer will be cancelled if the referee sends Recommend or Deny
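
The timer rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not the planned implementation: the function name timer_action, its parameters and the returned action strings are hypothetical. The 8-day and 15-day delays are the ones listed above.

```python
from datetime import datetime, timedelta

REMINDER_AFTER = timedelta(days=8)   # reminder delay from the list above
EXPIRE_AFTER = timedelta(days=15)    # expiry delay from the list above


def timer_action(sent_at, now, replied, reminder_sent):
    """Decide what the timer-process should do for one pending referee email.

    sent_at       -- when the email was sent to the referee
    now           -- current time
    replied       -- True if the referee has sent Recommend or Deny
    reminder_sent -- True if the reminder has already been sent
    """
    if replied:
        return "cancel"
    elapsed = now - sent_at
    if elapsed >= EXPIRE_AFTER:
        # The No reply email goes to the CP and the administrator.
        return "forward_no_reply"
    if elapsed >= REMINDER_AFTER and not reminder_sent:
        return "send_reminder"
    return "wait"
```

A periodic job (for example a daily cron run) could call this function for every open application and act on the result.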
Line: 217 to 219
 (A CAPTCHA test could also be used on the web form.)
Changed:
<
<

Owner's and Administrator's acceptance

>
>

CP's and administrator's acceptance

 
Changed:
<
<
If both the administrator and the owner accept the application (ref. Accept and Reject), the user will receive the access with the required permissions. Despite being rejected by the referee, the administrator still retains the option to accept the application, providing the owner agrees.
>
>
If both the CP and the administrator accept the application (ref. Accept and Reject), the user will receive access with the required permissions. Even if the referee rejects the application, the administrator retains the option to accept it, provided the CP agrees.
 
Changed:
<
<
After the Owner's and Administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
>
>
After the CP's and administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.
  The referees will be nominated by the Helsinki University Department of General Linguistics and the Administrator.
Changed:
<
<
>
>
 

Accept and Reject

Changed:
<
<
The program then sends the application by email to the owner (or contact person) of the corpus and the Language Bank administrator to be accepted. If both accept, the Accept program copies the application data into the database tables kayttajat (users), osoitteet (address) etc., and sends an acceptance email to the user.
>
>
The program then sends the application by email to the CP (or contact person) of the corpus and the Language Bank administrator to be accepted. If both accept, the Accept program copies the application data into the database tables kayttajat (users), osoitteet (address) etc., and sends an acceptance email to the user.
  What else does the Reject program do other than send a rejection email to the user? Will the application be deleted?
Changed:
<
<

Emails of the Owner's and Administrator's procedure

>
>

Emails of the CP's and administrator's procedure

 
Changed:
<
<
  1. Owner's Accept email to administrator.
>
>
  1. CP's Accept email to administrator.
 
  1. Administrator's Accept email to usermgr@csc.fi (save the user's data in the database).
  2. Accept email to user.
Changed:
<
<
  1. Owner's Reject email to administrator.
>
>
  1. CP's Reject email to administrator.
 
  1. Administrator's Reject email to user.
Changed:
<
<
In the case of referee candidates, owner's emails will be replaced by emails of the nominator from the Helsinki University Department of General Linguistics, who accepts new referees.
>
>
In the case of referee candidates, CP's emails will be replaced by emails of the nominator from the Helsinki University Department of General Linguistics, who accepts new referees.
 

Database changes

Changed:
<
<
These changes will be made in the MySQL Language Bank database.
>
>
These changes will be made in the MySQL Language Bank database.
 

Email confirmation field

Revision 132009-05-07 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 22 to 22
 
  • HAKA as Identity Provider Federation (Haka pages)
  • HAKA login (WAYF Service, later to be replaced by Shibboleth2 Discovery Service)
  • Provided attributes (funetEduPerson schema)
Added:
>
>
  • In the CLARIN community, ePPN@Domain is currently seen as the minimum necessary attribute. The rest depends on how well the attribute sets and their semantics can be harmonized, something we hope will happen via the eduGAIN 3.0 project.
 

Resources

Line: 54 to 56
 The term Resource Manager is used for compatibility with the CLARIN Language Resource and Technology Federation document, in which the topic 5 (Requirements) leaves resource management to the centers. The Resource Manager is the authorization component that automatically allows or denies access to files according to user attributes. A linguistic resource or corpus can contain one or several files.

  • Required attributes (per resource or file) must be solved.
Changed:
<
<
  • Automatic access can be given for research and education purposes for the following resources:
    • Speech corpora
    • Ajatella, miettiä, pohtia, harkita Corpus (amph)
>
>
  • Whether automatic access on the basis of university affiliation alone may be given for research and education purposes, must be studied.
 

Demo

Changed:
<
<
CSC has created the CRAS system (CSC Resource Accounting System, https://wiki.csc.fi/twiki/bin/view/Storage/CRAS) DELETE
In the Demo, a database which can store stat and hash data of files in a relational database was implemented to control access. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list. The MySQL Language Bank database will be the actual platform.
>
>
In the Demo, a database which can store stat and hash data of files in a relational database was implemented to control access. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list. The MySQL Language Bank database will be the actual platform.
  The source code of the demo is attached:
  • dl: Python program to download files
Line: 85 to 84
 +------------+-----------------------+------+-----+---------+-------+
Changed:
<
<
Each record in the resurssi table contains a file description. A resource can contain one or several files.
>
>
Each record in the resurssi table contains a file description. A resource can contain one or several files.
 The path_hash is an index and ensures the security of the demo system. It's generated by a python md5 object by the md5.new(realname).hexdigest() command, where realname is realpath(join(root, name)).
The owner is Shibboleth EPPN (EduPersonPrincipalName). In the future, the owner can set rights.
Only path information is shown to the user.
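
The path_hash computation described above can be sketched in current Python. The md5 module named in the text exists only in Python 2; hashlib.md5 is its standard replacement. The function name path_hash and the UTF-8 encoding of the path are assumptions of this sketch.

```python
import hashlib
from os.path import join, realpath


def path_hash(root, name):
    """Return the 32-character hex digest used as the path_hash key.

    realname is the resolved absolute path, as in the demo:
    realpath(join(root, name)).
    """
    realname = realpath(join(root, name))
    # hashlib.md5 needs bytes; UTF-8 is an assumption of this sketch.
    return hashlib.md5(realname.encode("utf-8")).hexdigest()
```

The digest is always 32 hexadecimal characters, which matches the varchar(32) type of the path_hash column.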
Line: 105 to 104
 
Option for setting rights (to be specified)

  • The owner sees a list of all of his/her files/resources.
Changed:
<
<
  • Page > Option to edit the rights field for the owner (or the Language Bank Administrator?)
>
>
  • The Language Bank Administrator can edit the rights field on behalf of the owner.
  * this should include some (limited) prescribed usage right types (e.g. 1.free for all purposes; 2. free for research and education; 3. restricted; consent to specific terms required). Moreover, this option should allow for the deposition of the specific terms for usage which the applicant may sign electronically.
Line: 116 to 115
 
  • Really carefully planning the database structure.
  • Recursive views and functionality per resource for all subdirectories and files under them, like Unix chmod -R. Is there a need to deny access to certain files within a resource?
  • Showing the user the file sizes and adding the size information into the database.
Changed:
<
<
  • An interface to add resources to database (or rather to a server as corpus.csc.fi currently?), planning and implementation, may be a command line program because linguistic resources are static.
>
>
  • An interface to add resources to a server, planning and implementation, may be a command line program because linguistic resources are static.
 
  • Usage statistics (they are already httpd server log published by analog, but is it enough?).
  • Groups. Groups are functionally equal to an OR operation over a list of users, but long lists are stored more efficiently and conveniently in their own tables.
Line: 205 to 204
 

Timer-process

Timer-process has to be initiated when the email is sent to the referee.

Changed:
<
<
  • if the referee has not answered in a certain time (reminder)= 5 days
  • timer will expire after a delay = 7 days
>
>
  • if the referee has not answered in a certain time (reminder)= 8 days
  • timer will expire after a delay = 15 days (email will be sent to the Owner and Administrator)
 
  • timer will be cancelled if the referee sends Recommend or Deny
Deleted:
<
<
 
Changed:
<
<
>
>
 

Web forms (AA work flow)

The web application can process web forms, i.e. webform submissions.

Changed:
<
<
These webforms could include a text box field if there is a need for the referee to provide comments, and also a checkbox field if there is a need to flag that the candidate’s application should not continue to be processed automatically and should be subject to a further administrator decision.
>
>
These webforms could include a text box field if there is a need for the referee to provide comments, and also a checkbox field (size to be decided later) if there is a need to flag that the candidate’s application should not continue to be processed automatically and should be subject to a further administrator decision.
 (This should eliminate the need to deal with spam email if no email addresses are used in the application.) (A CAPTCHA test could also be used on the web form.)
Changed:
<
<
>
>
 

Owner's and Administrator's acceptance

Revision 122009-05-07 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Definitions

Changed:
<
<
DATABASE: New MySQL Language Bank database, future references to the database will refer to this MySQL database in this document.
>
>
CP: Content provider DATABASE: MySQL Language Bank database, future references to the database will refer to this MySQL database in this document.
 HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.
Added:
>
>
IdF: Identity federation IdP: Identity provider LRT: Language resources and technology SP:
 SUI: New Scientist's interface
Deleted:
<
<

Schedule > Moved to Project Plan and Project Schedule, remove from here?

 
Changed:
<
<

Common features for Automatic access and Owner-controlled access

>
>

Common features for Automatic access and Owner > ??CP EVERYWHERE-controlled access

 

Shibboleth authentication

Line: 23 to 27
  Linguistic resources have to be equipped with access information:
Changed:
<
<
  1. free for all purposes;
  2. free for research and education;
  3. restricted; consent to specific terms required
>
>
(1) LRT which can be freely used by anyone (including resources with open licenses such as Open Access etc.)

(2) LRT to which the CP can grant a license automatically pending ??CHECK electronic signature by the user - One-sided: commitment by user to predetermined CP terms

(3) LRT which can only be accessed according to an individual application by the user and after (any) individual consideration by the CP - Two-sided: commitment by user to terms and permission by CP

 

Language selection: Finnish/English

Revision 112009-05-05 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Definitions

Changed:
<
<
DATABASE: New MySQL Language Bank database, future references to the database will refer to this MySQL database in this document.
>
>
DATABASE: New MySQL Language Bank database, future references to the database will refer to this MySQL database in this document.
 HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.
SUI: New Scientist's interface
Line: 30 to 30
 

Language selection: Finnish/English

  • There are three different electronic application forms, both in English and Finnish.
Changed:
<
<
  • Emails bilingual?
>
>
  • Emails will be bilingual (English and Finnish).
  • CSC will implement an architecture that will support the addition of more languages in the future in a relatively easy manner.
 

Automatic access to resources

Revision 102009-04-28 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 8 to 8
 HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.
SUI: New Scientist's interface
Changed:
<
<

Schedule > Moved to Project Plan and Project Schedule

>
>

Schedule > Moved to Project Plan and Project Schedule, remove from here?

 

Common features for Automatic access and Owner-controlled access

Line: 29 to 29
 

Language selection: Finnish/English

Changed:
<
<
There are three different electronic application forms, both in English and Finnish. Emails bilingual?
>
>
  • There are three different electronic application forms, both in English and Finnish.
  • Emails bilingual?
 

Automatic access to resources

Figure: User process for linguistics with Shibboleth authentication and automatic access to resources

Changed:
<
<
Shibboleth authentication at the user's home organization is required. Automatic access means here an attribute-based access to a chosen resource. User information will not need to be saved in the database nor will the user need a CSC user account. Once a user is authenticated, he can access any other Shibboleth-enabled resources without entering his/her login name and password again, providing that (s)he is authorized to access these resources.
>
>
Shibboleth authentication (Shibboleth authentication at the user's home organization is required)
 
Added:
>
>
User information will not need to be saved in the database nor will the user need a CSC user account.
 

Resource Manager Documentation

Deleted:
<
<
 The term Resource Manager is used for compatibility with the CLARIN Language Resource and Technology Federation document, in which the topic 5 (Requirements) leaves resource management to the centers. The Resource Manager is the authorization component that automatically allows or denies access to files according to user attributes. A linguistic resource or corpus can contain one or several files.
Changed:
<
<
  • Required attributes (per resource or file) to be solved
>
>
  • Required attributes (per resource or file) must be solved.
  • Automatic access can be given for research and education purposes for the following resources:
    • Speech corpora
    • Ajatella, miettiä, pohtia, harkita Corpus (amph)
 

Demo

Changed:
<
<
CSC has created the CRAS system (CSC Resource Accounting System, https://wiki.csc.fi/twiki/bin/view/Storage/CRAS) > MySQL Language Bank database. In the Demo, a database which can store stat and hash data of files in a relational database was implemented to control access. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list.
>
>
CSC has created the CRAS system (CSC Resource Accounting System, https://wiki.csc.fi/twiki/bin/view/Storage/CRAS) DELETE
In the Demo, a database which can store stat and hash data of files in a relational database was implemented to control access. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list. The MySQL Language Bank database will be the actual platform.
  The source code of the demo is attached:
  • dl: Python program to download files
Line: 72 to 76
 +------------+-----------------------+------+-----+---------+-------+
Changed:
<
<
Each record in the resurssi table contains a file. A resource can contain one or several files.
>
>
Each record in the resurssi table contains a file description. A resource can contain one or several files.
 The path_hash is an index and ensures the security of the demo system. It's generated by a python md5 object by the md5.new(realname).hexdigest() command, where realname is realpath(join(root, name)).
The owner is Shibboleth EPPN (EduPersonPrincipalName). In the future, the owner can set rights.
Only path information is shown to the user.
Line: 85 to 89
  The program list only shows the user the files that the user has the right to access. The list of files has links to the dl program, which can send the requested file to the user if the rights allow sending.
Deleted:
<
<
The implementation of the demo took less than one week's work.
 

Required features for production

We recommend that the Resource Manager model described as Demo will be chosen for production to grant automatic access to the chosen resources. In addition to the features of Demo, the following features are needed for production:

Line: 94 to 96
 
Option for setting rights (to be specified)

  • The owner sees a list of all of his/her files/resources.
Changed:
<
<
  • Option to edit the rights field for the owner (or the Language Bank Administrator?) * this should include some (limited) prescribed usage right types (e.g. 1.free for all purposes; 2. free for research and education; 3. restricted; consent to specific terms required).
  • Moreover, this option should allow for the deposition of the specific terms for usage which the applicant may sign electronically.
>
>
  • Page > Option to edit the rights field for the owner (or the Language Bank Administrator?) * this should include some (limited) prescribed usage right types (e.g. 1.free for all purposes; 2. free for research and education; 3. restricted; consent to specific terms required). Moreover, this option should allow for the deposition of the specific terms for usage which the applicant may sign electronically.
 
Changed:
<
<

Other required features for production

>
>
Other required features for production
 
Changed:
<
<
  • Using the CRAS database instead of the current demo database > A new MySQL Language Bank database will be used instead.
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0 > This may be implemented in the new MySQL Language Bank database.
>
>
  • Using the CRAS database instead of the current demo database > New MySQL Language Bank database will be used instead of the current demo database.
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0. This may be implemented in the new MySQL Language Bank database.
 
  • Really carefully planning the database structure.
  • Recursive views and functionality per resource for all subdirectories and files under it, like Unix chmod -R. Is there a need to deny access to certain files within a resource?
  • Showing the user the file sizes and adding the size information into the database.
Changed:
<
<
  • An interface to add resources to database (or rather to a server as corpus.csc.fi currently?), planning and implementation, may be a command line program because linguistic resources are static.
>
>
  • An interface to add resources to database (or rather to a server as corpus.csc.fi currently?), planning and implementation, may be a command line program because linguistic resources are static.
 
  • Usage statistics (they are already published from the httpd server log by analog, but is that enough?).
  • Groups. Groups are functionally equivalent to an OR operation over a list of users, but long user lists are more efficient and user-friendly when stored in their own tables.
Line: 191 to 192
  In the case of a referee candidate, emails to the owner will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics.
Added:
>
>
 

Timer-process

Timer-process has to be initiated when the email is sent to the referee.

  • a reminder is sent if the referee has not answered within 5 days
  • the timer expires after a delay of 7 days
  • the timer is cancelled if the referee sends Recommend or Deny
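
The timer rules listed above could be expressed as a small decision function, sketched below. The function name timer_action, its parameters and the returned action names are hypothetical; only the 5-day and 7-day thresholds and the cancellation rule come from this document.

```python
from datetime import datetime, timedelta

REMINDER_AFTER = timedelta(days=5)  # reminder threshold stated in the list
EXPIRE_AFTER = timedelta(days=7)    # expiry delay stated in the list


def timer_action(sent_at, now, referee_answered, reminder_sent=False):
    """Decide what the timer process should do for one referee request."""
    if referee_answered:
        return "cancel"  # Recommend or Deny was received
    elapsed = now - sent_at
    if elapsed >= EXPIRE_AFTER:
        return "expire"  # no reply within the delay
    if elapsed >= REMINDER_AFTER and not reminder_sent:
        return "remind"
    return "wait"
```

A periodic job could call this function for every open referee request and act on the returned value.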
Added:
>
>
 
Changed:
<
<

Web form (AA work flow)

>
>

Web forms (AA work flow)

  The web application can process web forms, i.e. webform submissions. These webforms could include a text box field if there is a need for the referee to provide comments, and also a checkbox field if there is a need to flag that the candidate’s application should not continue to be processed automatically and should be subject to a further administrator decision. (This should eliminate the need to deal with spam email if no email addresses are used in the application.) (A CAPTCHA test could also be used on the web form.)
Added:
>
>
 

Owner's and Administrator's acceptance

Line: 235 to 240
 

Database changes

Changed:
<
<
These changes will be made in the MYSQL Language Bank database.
>
>
These changes will be made in the MySQL Language Bank database.
 

Email confirmation field

Line: 251 to 256
 
  • 0. Not authenticated (data stored from web form).
  • 1. User-verified email. Authentication by an email confirmation from any address.
Changed:
<
<
  • 2. Organization-verified email. Authentication by an email confirmation from a well-known CSC customer organization.
>
>
  • 2. Organization-verified email. Authentication by an email confirmation from a well-known CSC customer organization. Well-known CSC customer organizations must be defined.
 
  • 4. Authentication using a credit card or a good certificate issued by well-known CA.
  • 8. Scanned signature in a pdf-document.
  • 16. Personal signature (default value for current CSC customers).
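
The values above are powers of two, which suggests they could be combined as bit flags. The document does not state this, so the following is only a sketch under that assumption. The class name Confirmation, the member names and the helper meets are hypothetical.

```python
from enum import IntFlag


class Confirmation(IntFlag):
    """Email confirmation values from the list above (names are illustrative)."""
    NOT_AUTHENTICATED = 0
    USER_EMAIL = 1
    ORGANIZATION_EMAIL = 2
    CARD_OR_CERTIFICATE = 4
    SCANNED_SIGNATURE = 8
    PERSONAL_SIGNATURE = 16


def meets(stored, required):
    """Return True if the stored field value includes the required flag."""
    return bool(Confirmation(stored) & required)
```

Under this reading, a stored value of 3 would mean that both a user-verified and an organization-verified email confirmation have been recorded.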

Revision 92009-04-28 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Definitions

Changed:
<
<
DATABASE: New MySQL Language Bank database, future references to the database will refer to this MySQL database in this document.
>
>
DATABASE: New MySQL Language Bank database, future references to the database will refer to this MySQL database in this document.
 HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.
SUI: New Scientist's interface
Changed:
<
<

Schedule

The purpose is to plan, build and take into production an upgraded system for Language Bank users. New features of the upgraded system include SAML2/Shibboleth support, web application forms and a referee process. There are also small enhancements including email address verification for non-Shibboleth applications and an upgrade of the Resource Manager demo onto production level.

The tasks and their estimated schedule for 2009 are:

  • Add required tables and fields to the Askare database (2 pm)
  • Contents of web forms, emails and web pages (2-4 pm)
  • Programs that check and store web forms' data in the database and send emails (2 pm)
  • Resource Manager Demo into production (1 pm)
>
>

Schedule > Moved to Project Plan and Project Schedule

 

Common features for Automatic access and Owner-controlled access

Line: 36 to 27
 
  1. free for research and education;
  2. restricted; consent to specific terms required
Added:
>
>

Language selection: Finnish/English

There are three different electronic application forms, both in English and Finnish. Emails bilingual?

 

Automatic access to resources

Figure: User process for linguistics with Shibboleth authentication and automatic access to resources

Line: 43 to 39
 Shibboleth authentication at the user's home organization is required. Automatic access means here an attribute-based access to a chosen resource. User information will not need to be saved in the database nor will the user need a CSC user account. Once a user is authenticated, he can access any other Shibboleth-enabled resources without entering his/her login name and password again, providing that (s)he is authorized to access these resources.
Changed:
<
<

Resource Manager Description

>
>

Resource Manager Documentation

 

The term Resource Manager is used for compatibility with the CLARIN Language Resource and Technology Federation document, in which the topic 5 (Requirements) leaves resource management to the centers. The Resource Manager is the authorization component that automatically allows or denies access to files according to user attributes. A linguistic resource or corpus can contain one or several files.

Revision 82009-04-28 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Definitions

Changed:
<
<
DATABASE: New MYSQL Language Bank database, future references to the database will refer to this MySQL database in this document.
HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.
>
>
DATABASE: New MySQL Language Bank database, future references to the database will refer to this MySQL database in this document.
HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.
SUI: New Scientist's interface
 

Schedule

Revision 72009-04-28 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 52 to 52
 

Demo

Changed:
<
<
CSC has created the CRAS system (CSC Resource Accounting System, https://wiki.csc.fi/twiki/bin/view/Storage/CRAS). In the Demo, a database which can store stat and hash data of files in a relational database was implemented to control access. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list.
>
>
CSC has created the CRAS system (CSC Resource Accounting System, https://wiki.csc.fi/twiki/bin/view/Storage/CRAS) > MySQL Language Bank database. In the Demo, a database which can store stat and hash data of files in a relational database was implemented to control access. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list.
  The source code of the demo is attached:
  • dl: Python program to download files
  • list: Python program to show the allowed files
Changed:
<
<
The current database structure includes a table called resurssi:
>
>
The current database structure includes a table called resurssi > resourcedetails:
 
describe resurssi;

Revision 62009-04-28 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Definitions

Changed:
<
<
DATABASE: CSC's customer database is called Askare, later database in this document
>
>
DATABASE: New MYSQL Language Bank database, future references to the database will refer to this MySQL database in this document.
 HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.

Schedule

Line: 18 to 18
 
  • Resource Manager Demo into production (1 pm)
Added:
>
>

Common features for Automatic access and Owner-controlled access

 
Changed:
<
<

Shibboleth authentication

>
>

Shibboleth authentication

 
Changed:
<
<

Automatic access to resources

>
>

Resources

 
Changed:
<
<
Figure: User process for linguistics with Shibboleth authentication and automatic access to resources
>
>
Linguistic resources have to be equipped with access information:
 
Changed:
<
<
Shibboleth authentication at the user's home organization is required. Automatic access means here an attribute-based access to a chosen resource.
>
>
  1. free for all purposes;
  2. free for research and education;
  3. restricted; consent to specific terms required
 
Changed:
<
<
User information will not need to be saved in the database nor will the user need a CSC user account.
>
>

Automatic access to resources

 
Changed:
<
<
Once a user is authenticated, he can access any other Shibboleth-enabled resources without entering his/her login name and password again, providing that (s)he is authorized to access these resources.
>
>
Figure: User process for linguistics with Shibboleth authentication and automatic access to resources

Shibboleth authentication at the user's home organization is required. Automatic access means here an attribute-based access to a chosen resource. User information will not need to be saved in the database nor will the user need a CSC user account. Once a user is authenticated, he can access any other Shibboleth-enabled resources without entering his/her login name and password again, providing that (s)he is authorized to access these resources.

 

Resource Manager Description

Line: 81 to 85
 HTTP_SHIB_SCHACHOMEORGANIZATION=csc.fi REMOTE_USER=pj@csc.fi
Changed:
<
<
The program list only shows the user the files that the user has the ight to access. The list of files has links to the dl program, which can send the requested file to the user if the rights allow sending.
>
>
The program list only shows the user the files that the user has the right to access. The list of files has links to the dl program, which can send the requested file to the user if the rights allow sending.
  The implementation of the demo took less than one week's work.
Line: 99 to 103
 

Other required features for production

Changed:
<
<
  • Using the CRAS database instead of the current demo database.
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0.
>
>
  • Using the CRAS database instead of the current demo database > A new MySQL Language Bank database will be used instead.
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0 > This may be implemented in the new MySQL Language Bank database.
 
  • Really carefully planning the database structure.
Changed:
<
<
  • Recursive views and functionality per resources for all subdirectories and files under them like unix chmod -r

>
>
  • Recursive views and functionality per resources for all subdirectories and files under them like unix chmod -r. Is there a need to be denied access to certain files within a resource?
 
  • Showing the user the file sizes and adding the size information into the database.
  • An interface to add resources to database (or rather to a server as corpus.csc.fi currently?), planning and implementation, may be a command line program because linguistic resources are static.
  • Usage statistics (they are already httpd server log published by analog, but is it enough?).
Line: 234 to 237
 

Database changes

Added:
>
>
These changes will be made in the MYSQL Language Bank database.
 

Email confirmation field

Revision 52009-04-28 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 109 to 109
 
  • Usage statistics (they are already httpd server log published by analog, but is it enough?).
  • Groups. Groups are functionally equal to an OR operation for the list of users, but long lists are more efficient and user-friendly for storing their own tables.
Deleted:
<
<
Doing everything above will take about a month.

Shibboleth Service Provider (SP) (Excluded from the contract)

Shibboleth SP has the Resource Manager functionality which is controlled by XML settings:

<?xml version="1.0" encoding="UTF-8"?>
<AccessControl xmlns="urn:mace:shibboleth:target:config:1.0">
   <Rule require="schacHomeOrganizationType">fi:university</Rule>
</AccessControl>
The rules can be combined with And and Or tags. Rules are called by the Shibboleth configuration file shibboleth.xml
<Path name="shib/appl/ling/kielipankki/amph" authType="shibboleth" requireSession="true">
    <AccessControlProvider uri="/v/net/hotpage.csc.fi/sec/shiblingacl.xml" type="edu.internet2.middleware.shibboleth.sp.provider.XMLAccessControl"/>
</Path>
This example does not work for an unknown reason. A similar example has worked on the test machine. The server uses Shibboleth 1.3 and it may work better with Shibboleth 2. Every change also requires restarting the Shibboleth SP, which is not acceptable in production use. Administering the Shibboleth SP will be very difficult and there is no sense to use insecure authorization.

If it is possible to get the Shibboleth Access Control working, it will require at least a week of work.

 

Owner-controlled access to resources

Line: 141 to 119
 There are three different electronic application forms, both in English and Finnish. There are also forms for CSC's internal use to follow the application workflow status. Each form requires a program to handle it. Also the email responses require handling. Commercial users need to sign a contract to access the resources.

After the Shibboleth authentication:

Changed:
<
<
>
>
 
    • Required attributes
  • Available linguistic research resources (with limitations defined by the owners) How will the user view a list of resources before/after being authenticated and authorized?

  • Send

If the user cannot be authenticated with Shibboleth:

Changed:
<
<
>
>
 
  • Email confirmation required
  • Update the CAC field in the database
  • Available linguistic research resources (with limitations defined by owners)
  • Send

If the user already has a CSC account, after logging onto the CSC Scientist's Interface (https://hotpage.csc.fi/):

Changed:
<
<
>
>
 
  • Personal and project information update (if needed)
  • Available linguistic research resources (with limitations defined by the owners)
  • Send

Revision 42009-04-24 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 30 to 30
  Figure: User process for linguistics with Shibboleth authentication and automatic access to resources
Changed:
<
<
Automatic access means here an attribute-based access to a chosen resource. Shibboleth authentication at the user's home organization is required. User information will not need to be saved in the database nor will the user need a CSC user account.
>
>
Shibboleth authentication at the user's home organization is required. Automatic access means here an attribute-based access to a chosen resource.

User information will not need to be saved in the database nor will the user need a CSC user account.

  Once a user is authenticated, he can access any other Shibboleth-enabled resources without entering his/her login name and password again, providing that (s)he is authorized to access these resources.
Line: 45 to 48
 

Demo

Changed:
<
<
CSC has created the CRAS system (CSC Resource Accounting System, https://wiki.csc.fi/twiki/bin/view/Storage/CRAS) which can store stat and hash data of files in a relational database. In the Demo, the same database was implemented to control access. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list.
>
>
CSC has created the CRAS system (CSC Resource Accounting System, https://wiki.csc.fi/twiki/bin/view/Storage/CRAS). In the Demo, a database which can store stat and hash data of files in a relational database was implemented to control access. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list.
  The source code of the demo is attached:
  • dl: Python program to download files
Line: 86 to 89
  We recommend that the Resource Manager model described as Demo will be chosen for production to grant automatic access to the chosen resources. In addition to the features of Demo, the following features are needed for production:
Added:
>
>
Option for setting rights (to be specified)

  • The owner sees a list of all of his/her files/resources.
  • Option to edit the rights field for the owner (or the Language Bank Administrator?) * this should include some (limited) prescribed usage right types (e.g. 1.free for all purposes; 2. free for research and education; 3. restricted; consent to specific terms required).
  • Moreover, this option should allow for the deposition of the specific terms for usage which the applicant may sign electronically.

Other required features for production

 
  • Using the CRAS database instead of the current demo database.
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0.
  • Really carefully planning the database structure.
Deleted:
<
<
  • An owner's page to set the rights. * this should include some (limited) prescribed usage right types (e.g. 1.free for all purposes; 2. free for research and education; 3) restricted; consent to specific terms required). Moreover, this page should allow for the deposition of the specific terms for usage which the applicant may sign electronically.
 
  • Recursive views and functionality per resources for all subdirectories and files under them like unix chmod -r
Changed:
<
<
  • Showing the owner a list of all of his/her files/resources.
>
>
 
  • Showing the user the file sizes and adding the size information into the database.
Changed:
<
<
  • An interface to add resources to database, planning and implementation, may be a command line program because linguistic resources are static.
>
>
  • An interface to add resources to database (or rather to a server as corpus.csc.fi currently?), planning and implementation, may be a command line program because linguistic resources are static.
 
  • Usage statistics (they are already httpd server log published by analog, but is it enough?).
  • Groups. Groups are functionally equal to an OR operation for the list of users, but long lists are more efficient and user-friendly for storing their own tables.

Revision 32009-04-24 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 7 to 7
 DATABASE: CSC's customer database is called Askare, later database in this document
HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.
Changed:
<
<

Schedule and costs

>
>

Schedule

  The purpose is to plan, build and take into production an upgraded system for Language Bank users. New features of the upgraded system include SAML2/Shibboleth support, web application forms and a referee process. There are also small enhancements including email address verification for non-Shibboleth applications and an upgrade of the Resource Manager demo onto production level.
Line: 18 to 18
 
  • Resource Manager Demo into production (1 pm)
Deleted:
<
<

Automatic access to resources

Figure: User process for linguistics with Shibboleth authentication and automatic access to resources

 
Changed:
<
<

Shibboleth authentication

>
>

Shibboleth authentication

 
Changed:
<
<

Automatic access to resources

>
>

Automatic access to resources

Figure: User process for linguistics with Shibboleth authentication and automatic access to resources

Automatic access means here an attribute-based access to a chosen resource. Shibboleth authentication at the user's home organization is required. User information will not need to be saved in the database nor will the user need a CSC user account.

Once a user is authenticated, he can access any other Shibboleth-enabled resources without entering his/her login name and password again, providing that (s)he is authorized to access these resources.

Resource Manager Description

 
Deleted:
<
<
For authenticated users:
 
Changed:
<
<
>
>
The term Resource Manager is used for compatibility with the CLARIN Language Resource and Technology Federation document, in which the topic 5 (Requirements) leaves resource management to the centers. The Resource Manager is the authorization component that automatically allows or denies access to files according to user attributes. A linguistic resource or corpus can contain one or several files.

* Required attributes (per resource or file) to be solved

Demo

CSC has created the CRAS system (CSC Resource Accounting System, https://wiki.csc.fi/twiki/bin/view/Storage/CRAS) which can store stat and hash data of files in a relational database. In the Demo, the same database was implemented to control access. The URL of the demo is https://hotpage.csc.fi/shib-cgi-bin/r/list.

The source code of the demo is attached:

  • dl: Python program to download files
  • list: Python program to show the allowed files

The current database structure includes a table called resurssi:

describe resurssi;
+------------+-----------------------+------+-----+---------+-------+
| Field      | Type                  | Null | Key | Default | Extra |
+------------+-----------------------+------+-----+---------+-------+
| path_hash  | varchar(32)           | NO   | PRI |         |       | 
| path       | text                  | YES  |     | NULL    |       | 
| path_utf8  | text                  | YES  |     | NULL    |       | 
| owner      | varchar(64)           | YES  |     | NULL    |       | 
| right_type | mediumint(8) unsigned | YES  |     | NULL    |       | 
| rights     | varchar(255)          | YES  |     | NULL    |       | 
+------------+-----------------------+------+-----+---------+-------+

Each record in the resurssi table contains a file. A resource can contain one or several files.
The path_hash is an index and ensures the security of the demo system. It is generated with Python's md5 module by calling md5.new(realname).hexdigest(), where realname is realpath(join(root, name)).
The owner is Shibboleth EPPN (EduPersonPrincipalName). In the future, the owner can set rights.
Only path information is shown to the user.
Only right_type 0 is used.
The rights field contains a Shibboleth attribute key value string. The rights field can contain one of the following sample strings :

HTTP_SHIB_SCHACHOMEORGANIZATIONTYPE=urn:mace:terena.org:schac:homeOrganizationType:fi:university
HTTP_SHIB_SCHACHOMEORGANIZATION=csc.fi
REMOTE_USER=pj@csc.fi
The program list only shows the user the files that the user has the ight to access. The list of files has links to the dl program, which can send the requested file to the user if the rights allow sending.
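
The right_type 0 check described above can be sketched as follows. This assumes that the Shibboleth attributes reach the programs as CGI environment variables, which the variable names in the sample strings suggest but the document does not state. The function names is_allowed and visible_paths and the dictionary layout of the records are hypothetical.

```python
def is_allowed(rights, environ):
    """Check a right_type 0 rule of the form KEY=value against request variables."""
    key, sep, expected = rights.partition("=")
    if not sep:
        return False  # malformed rule: deny by default (an assumption)
    return environ.get(key) == expected


def visible_paths(records, environ):
    """Return the paths of the records whose rule matches the user's attributes."""
    return [r["path"] for r in records if is_allowed(r["rights"], environ)]
```

In a CGI program, environ would typically be os.environ. A download program would repeat the same is_allowed check before sending a file, so that a guessed link alone does not grant access.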

The implementation of the demo took less than one week's work.

Required features for production

We recommend that the Resource Manager model described under Demo be chosen for production to grant automatic access to the chosen resources. In addition to the features of the Demo, the following features are needed for production:

  • Using the CRAS database instead of the current demo database.
  • Adding AND and OR operations for the rights, may be implemented as a new right_type 1 or just by adding some parsing for right_type 0.
  • Really carefully planning the database structure.
  • An owner's page to set the rights. This should include a limited set of prescribed usage right types (e.g. 1. free for all purposes; 2. free for research and education; 3. restricted; consent to specific terms required). Moreover, this page should allow for the deposition of the specific terms for usage which the applicant may sign electronically.
  • Recursive views and functionality per resource for all subdirectories and files under it, like Unix chmod -R
  • Showing the owner a list of all of his/her files/resources.
  • Showing the user the file sizes and adding the size information into the database.
  • An interface to add resources to database, planning and implementation, may be a command line program because linguistic resources are static.
  • Usage statistics (they are already published from the httpd server log by analog, but is that enough?).
  • Groups. Groups are functionally equivalent to an OR operation over a list of users, but long user lists are more efficient and user-friendly when stored in their own tables.
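
One possible shape for the AND and OR operations mentioned in the list is sketched below. The syntax is invented for illustration only: KEY=value terms joined by the words AND and OR, with AND binding tighter and no parentheses. The function names matches and evaluate are hypothetical, and the document does not specify any of this.

```python
def matches(rule, environ):
    """Check one KEY=value term against the request variables."""
    key, sep, expected = rule.strip().partition("=")
    return bool(sep) and environ.get(key) == expected


def evaluate(rights, environ):
    """Evaluate 'A=x AND B=y OR C=z'; AND binds tighter than OR."""
    return any(
        all(matches(term, environ) for term in group.split(" AND "))
        for group in rights.split(" OR ")
    )
```

With this syntax a group could be written as REMOTE_USER=a OR REMOTE_USER=b, which illustrates why long user lists quickly become unwieldy in a single rights field. Values that themselves contain " AND " or " OR " would need escaping, which this sketch does not handle.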

Doing everything above will take about a month.

Shibboleth Service Provider (SP) (Excluded from the contract)

Shibboleth SP has the Resource Manager functionality which is controlled by XML settings:

<?xml version="1.0" encoding="UTF-8"?>
<AccessControl xmlns="urn:mace:shibboleth:target:config:1.0">
   <Rule require="schacHomeOrganizationType">fi:university</Rule>
</AccessControl>
The rules can be combined with And and Or tags. Rules are referenced from the Shibboleth configuration file shibboleth.xml:
<Path name="shib/appl/ling/kielipankki/amph" authType="shibboleth" requireSession="true">
    <AccessControlProvider uri="/v/net/hotpage.csc.fi/sec/shiblingacl.xml" type="edu.internet2.middleware.shibboleth.sp.provider.XMLAccessControl"/>
</Path>
This example does not work, for an unknown reason. A similar example has worked on the test machine. The server uses Shibboleth 1.3, and it may work better with Shibboleth 2. Every change also requires restarting the Shibboleth SP, which is not acceptable in production use. Administering the Shibboleth SP will be very difficult, and it makes no sense to use insecure authorization.
 
Changed:
<
<
User information will not need to be saved in the database nor will the user need a CSC user account.
>
>
If it is possible to get the Shibboleth Access Control working, it will require at least a week of work.
 

Owner-controlled access to resources

Revision 22009-04-24 - SatuTorikka

Line: 1 to 1
 
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Line: 17 to 17
 
  • Programs that check and store web forms' data in the database and send emails (2 pm)
  • Resource Manager Demo into production (1 pm)
Deleted:
<
<
The costs will be 59 500-76 500 euros (equals 7-9 pm).
 

Automatic access to resources

Line: 51 to 50
 After the Shibboleth authentication:
Changed:
<
<
  • Available linguistic research resources (with limitations defined by the owners)
>
>
  • Available linguistic research resources (with limitations defined by the owners) How will the user view a list of resources before/after being authenticated and authorized?
 
  • Send

If the user cannot be authenticated with Shibboleth:

Line: 80 to 80
 

Referee's authorization and authentication

Changed:
<
<
An applying user becomes trusted by being approved by a referee. A new electronic form with a referee list is needed in English and Finnish. The form requires a program to handle it. Also the email responses require handling.
>
>
An applying user becomes trusted by being approved by a referee. A new electronic form with a referee list is needed in English and Finnish. The form requires a program to handle it. The response to the email sent by the system will be via a web form (not by replying to the mail).
  The referee's procedure to authorize and authenticate an applying user could be the following:

  1. The user is forwarded to the Referee List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
Changed:
<
<
  1. If the user expects that a referee knows him or her, he or she selects that referee. The application will be sent by email to the referee with links for recommending and denying.
>
>
  1. If the user expects that a referee knows him or her, he or she selects that referee. The application will be sent by email to the referee with links for recommending and denying. A timer-process has to be initiated when the email is sent to the referee.
 
    • The referee candidates select a referee, too.
  1. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the owner and the Language Bank administrator to be accepted.
    • If the user does not know any referee, the application will be forwarded straight to the owner (or contact person) of the corpus and the Language Bank administrator.
Line: 115 to 115
 
  1. Referee's Deny email to owner.
  2. Referee's Deny email to administrator.
  3. Referee's No reply email to owner (automatically after a delay).
Changed:
<
<
  1. Referee's No reply email to administrator (automatically after a delay).
>
>
  1. Referee's No reply email to administrator (automatically after a delay of 7 days).
  In the case of a referee candidate, emails to the owner will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics.
Added:
>
>

Timer-process

Timer-process has to be initiated when the email is sent to the referee.

  • a reminder is sent if the referee has not answered within 5 days
  • the timer expires after a delay of 7 days
  • the timer is cancelled if the referee sends Recommend or Deny

Web form (AA work flow)

The web application can process web forms, i.e. webform submissions. These webforms could include a text box field if there is a need for the referee to provide comments, and also a checkbox field if there is a need to flag that the candidate’s application should not continue to be processed automatically and should be subject to a further administrator decision. (This should eliminate the need to deal with spam email if no email addresses are used in the application.) (A CAPTCHA test could also be used on the web form.)

 

Owner's and Administrator's acceptance

Revision 12009-04-23 - SatuTorikka

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="KieliaineistojenK"

Requirements Documentation DRAFT

Definitions

DATABASE: CSC's customer database, called Askare; referred to as the database later in this document
HAKA: Identity federation of the Finnish universities, polytechnics and research institutions.

Schedule and costs

The purpose is to plan, build and take into production an upgraded system for Language Bank users. New features of the upgraded system include SAML2/Shibboleth support, web application forms and a referee process. There are also small enhancements including email address verification for non-Shibboleth applications and an upgrade of the Resource Manager demo onto production level.

The tasks and their estimated schedule for 2009 are:

  • Add required tables and fields to the Askare database (2 pm)
  • Contents of web forms, emails and web pages (2-4 pm)
  • Programs that check and store web forms' data in the database and send emails (2 pm)
  • Resource Manager Demo into production (1 pm)

The costs will be 59 500-76 500 euros (equals 7-9 pm).
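
The totals can be checked by summing the task estimates. In the short sketch below, the rate of 8 500 euros per person-month is an assumption: it is not stated in the text but follows from the totals (59 500 / 7 = 76 500 / 9 = 8 500). The task labels are abbreviated.

```python
# (min, max) person-months per task, taken from the list above
tasks = {
    "database tables and fields": (2, 2),
    "web forms, emails and web pages": (2, 4),
    "form-handling programs": (2, 2),
    "Resource Manager demo into production": (1, 1),
}
RATE_EUR_PER_PM = 8500  # assumption: inferred from the stated totals

pm_min = sum(low for low, _ in tasks.values())
pm_max = sum(high for _, high in tasks.values())
cost_min = pm_min * RATE_EUR_PER_PM
cost_max = pm_max * RATE_EUR_PER_PM
print(pm_min, pm_max, cost_min, cost_max)  # 7 9 59500 76500
```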

Automatic access to resources

Figure: User process for linguistics with Shibboleth authentication and automatic access to resources

Shibboleth authentication

Automatic access to resources

For authenticated users:

User information will not need to be saved in the database nor will the user need a CSC user account.

Owner-controlled access to resources

Figure: User process for linguistics with Shibboleth authentication, electronic applications and referees

Electronic application form processing

There are three different electronic application forms, each in English and Finnish. There are also forms for CSC's internal use to follow the application workflow status. Each form requires a program to handle it. The email responses also require handling. Commercial users need to sign a contract to access the resources.

After the Shibboleth authentication:

  • Electronic Application Form (as registered and prefilled)
    • Required attributes
  • Available linguistic research resources (with limitations defined by the owners)
  • Send

If the user cannot be authenticated with Shibboleth:

If the user already has a CSC account, after logging onto the CSC Scientist's Interface (https://hotpage.csc.fi/):

  • Electronic Application Form for CSC users
  • Personal and project information update (if needed)
  • Available linguistic research resources (with limitations defined by the owners)
  • Send

After Send:

The electronic application form can also be used to collect new referee information. Each electronic application form can contain a checkbox for the referee candidate to express his or her willingness to function as a referee.

Referee's authorization and authentication

An applying user becomes trusted by being approved by a referee. A new electronic form with a referee list is needed in English and Finnish. The form requires a program to handle it. Also the email responses require handling.

The referee's procedure to authorize and authenticate an applying user could be the following:

  1. The user is forwarded to the Referee List Form containing a list of referees ordered by country (ref. Referees table). Some applications may skip the referee procedure.
  2. If the user expects that a referee knows him or her, he or she selects that referee. The application will be sent by email to the referee with links for recommending and denying.
    • The referee candidates select a referee, too.
  3. If the referee recommends that the application be accepted (ref. Recommend and Deny), the application will be forwarded to the owner and the Language Bank administrator to be accepted.
    • If the user does not know any referee, the application will be forwarded straight to the owner (or contact person) of the corpus and the Language Bank administrator.
    • If the referee selects the deny link, a rejection message will be sent to the owner and the administrator.

In this model, a referee losing his or her status would affect the associated users as well. Loss of status for natural reasons (e.g. retirement, transition) does not have this effect.

Recommend and Deny

For the referee recommendation, the system has a very secret passphrase which is SHA-hashed with the applid value. There are two web programs: Recommend and Deny. The email message generated for the referee contains links to both of them together with the application data. When the referee recommends that the application be accepted, he or she clicks the recommend link that is parametrized with an applid and a SHA hash value, and the hash will be checked.

When the hash matches, the recommend program increments the CAC field value by 32.

If the referee fails to reply within e.g. one week, he or she will receive a reminder. If the referee still fails to reply, the application will be forwarded to the owner and the administrator after a predefined delay (e.g. one week).

If the hash does not match, the programs do nothing or warn the staff about abuse.
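
The hash check described in this section could look roughly like the following sketch. Several details are assumptions, not part of the requirements: SHA-256 as the SHA variant, the way the passphrase and applid are combined, and the names `PASSPHRASE`, `make_token` and `handle_recommend`.

```python
import hashlib
import hmac

PASSPHRASE = b"replace-with-the-real-secret"  # placeholder, not a real value


def make_token(applid: int) -> str:
    """SHA-256 hex digest of the passphrase combined with the applid."""
    data = PASSPHRASE + b":" + str(applid).encode("ascii")
    return hashlib.sha256(data).hexdigest()


def handle_recommend(applid: int, token: str, cac: int) -> int:
    """Return the new CAC value; it is unchanged if the hash does not match."""
    expected = make_token(applid).encode("ascii")
    # Constant-time comparison of the expected and the received hash.
    if hmac.compare_digest(expected, token.encode("utf-8")):
        return cac | 32  # same as adding 32 when the referee flag is not yet set
    return cac
```

Setting the flag with a bitwise OR rather than a plain addition keeps the result unchanged if the referee clicks the link twice. A keyed HMAC is the more usual construction for this kind of link; plain SHA hashing is shown here only because it is what the text describes.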

Emails of the referee procedure

  1. Referee Form sends an email to the referee for recommending or denying.
  2. Reminder email to the referee, if (s)he fails to reply (automatically after a delay).
  3. Referee's Recommend email to owner.
  4. Referee's Recommend email to administrator.
  5. Referee's Deny email to owner.
  6. Referee's Deny email to administrator.
  7. Referee's No reply email to owner (automatically after a delay).
  8. Referee's No reply email to administrator (automatically after a delay).

In the case of a referee candidate, emails to the owner will be replaced by emails to the nominator from the Helsinki University Department of General Linguistics.

Owner's and Administrator's acceptance

If both the administrator and the owner accept the application (ref. Accept and Reject), the user will receive access with the required permissions. Even if the referee has denied the application, the administrator retains the option to accept it, provided the owner agrees.

After the Owner's and Administrator's acceptance, all information will automatically be copied to the database tables User, Address etc. The CSC user manager process will create a new CSC user account with the appropriate rights and associate the new customer with a new or existing project. Opening up a normal CSC user account would offer tools for monitoring. If the IdM system is running, it could create the account. The user can then log onto CSC Scientist's Interface using a HAKA login or CSC user account login to access the resources.

The referees will be nominated by the Helsinki University Department of General Linguistics and the Administrator.

Accept and Reject

The program then sends the application by email to the owner (or contact person) of the corpus and the Language Bank administrator to be accepted. If both accept, the Accept program copies the application data into the database tables kayttajat (users), osoitteet (address) etc., and sends an acceptance email to the user.

What else does the Reject program do other than send a rejection email to the user? Will the application be deleted?

Emails of the Owner's and Administrator's procedure

  1. Owner's Accept email to administrator.
  2. Administrator's Accept email to usermgr@csc.fi (save the user's data in the database).
  3. Accept email to user.
  4. Owner's Reject email to administrator.
  5. Administrator's Reject email to user.

In the case of referee candidates, owner's emails will be replaced by emails of the nominator from the Helsinki University Department of General Linguistics, who accepts new referees.

Database changes

Email confirmation field

The application table has an email confirmation field which contains at least 128 bits of random data generated when the application form is stored. When the application form is used by a non-registered user, the random data value will be emailed to the user. The user will receive a link to the confirmation form, where he or she needs to confirm the email address by entering the random data value (refer to the KITWIKI registration). Submitting the confirmation form increments the CAC field value of the application table by 1 or 2 depending on the email address.

If the user's email address is invalid, the application remains unconfirmed. Unconfirmed applications will be dropped from the database once a day.
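
The confirmation step could be sketched as follows. This is a sketch under stated assumptions: the function names and the `org_verified` parameter are hypothetical, and hexadecimal encoding of the random data is one possible choice.

```python
import secrets


def new_confirmation_code() -> str:
    """Return 128 bits of random data as 32 hexadecimal characters."""
    return secrets.token_hex(16)


def confirm_email(stored: str, entered: str, cac: int, org_verified: bool) -> int:
    """Return the new CAC value after the confirmation form is submitted."""
    if not secrets.compare_digest(stored.encode("utf-8"), entered.encode("utf-8")):
        return cac  # wrong code: the application stays unconfirmed
    # 2 for an address at a well-known customer organization, otherwise 1
    return cac + (2 if org_verified else 1)
```

Note that 128 bits encoded as hexadecimal take 32 characters, which is longer than the varchar 20 size listed for the emailconfirmation field in the application table.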

CSC Authentication Classes (CAC field)

The CAC field in the application table describes how the user's identity has been verified. Information on how each user is authenticated needs to be stored in the database, because stronger authentication than the currently used personal signature may be required. The field should also be added to the database table kayttajat (users). The CAC field can hold one or several of the values listed below. If several values apply, they are summed.

  • 0. Not authenticated (data stored from web form).
  • 1. User-verified email. Authentication by an email confirmation from any address.
  • 2. Organization-verified email. Authentication by an email confirmation from a well-known CSC customer organization.
  • 4. Authentication using a credit card or a good certificate issued by a well-known CA.
  • 8. Scanned signature in a pdf-document.
  • 16. Personal signature (default value for current CSC customers).
  • 32. Referee recommendation: a known professor or research director recommends that the application be accepted.
  • 64. Strong authentication using SAML2/Shibboleth or grid certificates (in the USA: urn:mace:incommon:iap:bronze). Alternatively, official identification (photo ID) verified by a referee.
  • 128. Official identification verified by a bank account (tupas) or more secure certificates (in the USA: urn:mace:incommon:iap:silver).
  • 256. CSC-checked official identification card or passport.
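
Because every value is a distinct power of two, a summed CAC value can be decoded unambiguously. The sketch below models the list as bit flags; the class and member names are hypothetical and only the numeric values come from the list above.

```python
from enum import IntFlag


class CAC(IntFlag):
    NOT_AUTHENTICATED = 0
    USER_VERIFIED_EMAIL = 1
    ORG_VERIFIED_EMAIL = 2
    CARD_OR_CERTIFICATE = 4
    SCANNED_SIGNATURE = 8
    PERSONAL_SIGNATURE = 16
    REFEREE_RECOMMENDATION = 32
    STRONG_AUTHENTICATION = 64
    BANK_VERIFIED_ID = 128
    CSC_CHECKED_ID = 256


# A user with a verified email (1) and a referee recommendation (32)
stored = int(CAC.USER_VERIFIED_EMAIL | CAC.REFEREE_RECOMMENDATION)  # 33
has_referee = bool(CAC(stored) & CAC.REFEREE_RECOMMENDATION)        # True
has_strong = bool(CAC(stored) & CAC.STRONG_AUTHENTICATION)          # False
```

The largest possible sum is 511, which fits in a smallint field.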

Application table

When the user sends the application, what should be done with application data that has not yet been accepted? It can be stored in the existing tables with new status fields, or new table(s) can be created. We recommend that a new application table be created.

field               type      size  null  comment
applid              int             no
arrivaldate         date            no
usernamecantidate   varchar   8     yes
CAC                 smallint        no
display name        varchar   20    no
familyname          varchar   25    no
nationality         smallint        no    phone code or TLD
position            varchar   40    no
organization        varchar   40    no
faculty             varchar   40    yes
phone               varchar   20    yes
gsm                 varchar   20    yes
email               varchar   60    no
emailconfirmation   varchar   20    yes   only needed during confirmation process
referee             smallint        yes
datetime            datetime        no
projectname         varchar         yes
projectdescription  text            yes
newreferee          char      1     yes

  • A postal address is required for sending the password, magazines and Christmas cards.
  • Will the applied resources be stored here?

Referees table

CSC has to add the new table referees to the database. The referees table must have the ID and status fields. The ID field is just a number which connects the table to the henkilo (person) table (includes e.g. first name and last name) and to the osoitteet (address) table (includes e.g. email, phone etc.). The status field can have the values 0 (no longer trusted), 1 (active) and 2 (retired).

It is necessary to document who was the referee for each user. The referees table needs to be connected to the kayttajat (users) table by adding the ID field of the referee table into the kayttajat table.

field   type      size  null
ID      smallint        no
status  char      1     no
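
The link between the referees and kayttajat tables can be illustrated with an in-memory SQLite database, used here only as a stand-in for the real database. In this sketch the `username` column, the name of the `referee` column in kayttajat and the sample rows are hypothetical. The query finds the users whose referee is no longer trusted (status 0).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE referees (
        ID     SMALLINT NOT NULL PRIMARY KEY,
        status CHAR(1)  NOT NULL  -- 0 no longer trusted, 1 active, 2 retired
    );
    CREATE TABLE kayttajat (
        username TEXT NOT NULL,   -- hypothetical column
        referee  SMALLINT REFERENCES referees(ID)
    );
    INSERT INTO referees VALUES (1, '1'), (2, '0'), (3, '2');
    INSERT INTO kayttajat VALUES ('anna', 1), ('ben', 2), ('carl', 3);
""")

# Users affected by a referee who has lost his or her trusted status
affected = [row[0] for row in conn.execute(
    "SELECT k.username FROM kayttajat k "
    "JOIN referees r ON r.ID = k.referee "
    "WHERE r.status = '0' ORDER BY k.username")]
print(affected)  # ['ben']
```

The user whose referee has retired (status 2) is not selected, in line with the rule that loss of status for natural reasons does not affect the associated users.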
 