The Corpus Server of the Language Bank:

How Are the Common Directories Organized?

In addition to the directories provided by the Linux system, the server has the following installation directories:

directory | purpose | owner | real location
/v/ | used to create a multi-platform metacomputer environment at CSC | root | /v/
/f/bin | the first binaries in the path | root | /fs/corpus/slash-f/
/usr/bin | user binaries that belong to the Red Hat Enterprise Linux distribution | root | /usr/bin/
/p/bin | the CSC meta machine binaries | ling | /fs/corpus/slash-p/
/l/bin or /usr/local/bin | the local binary directory | ling | /fs/corpus/slash-l/bin/
/c/bin or /usr/local/contrib/bin | the CSC approved links to the user-contributed binaries | ling | /fs/corpus/slash-c/bin/
/c/man or /usr/local/contrib/man | the CSC approved links to the user-contributed man pages | ling | /fs/corpus/slash-c/bin/
/c/appl/CLASS/USERNAME/APPNAME/VERSION/bin | the user-contributed binary directories | users | /fs/corpus/slash-c/appl/...
/c/appl/CLASS/USERNAME/APPNAME/VERSION/lib | the user-contributed lib directories | users | /fs/corpus/slash-c/appl/...
/c/appl/CLASS/USERNAME/APPNAME/VERSION/man | the user-contributed man directories | users | /fs/corpus/slash-c/appl/...
/c/appl/CLASS/USERNAME/APPNAME/VERSION/lic | the user-contributed information on the license and conditions of use | users | /fs/corpus/slash-c/appl/...
/c/appl/CLASS/USERNAME/APPNAME/VERSION/doc | the user-contributed additional documentation | users | /fs/corpus/slash-c/appl/...
/c/appl/CLASS/USERNAME/APPNAME/VERSION/README | the user-contributed information on the purpose of use, the maintenance plan, and the contributor's contact information | users | /fs/corpus/slash-c/appl/...
/c/appl/CLASS/USERNAME/APPNAME/VERSION/REBUILD | the user-contributed information on the actions needed to rebuild and reinstall the software | users | /fs/corpus/slash-c/appl/...
/c/appl/CLASS/USERNAME/APPNAME/latest | a symbolic link to the latest 32-bit i386/linux version /c/appl/CLASS/USERNAME/APPNAME/VERSION | users | /fs/corpus/slash-c/appl/...
/c/appl/CLASS/USERNAME/APPNAME/latest64 | a symbolic link to the latest 64-bit i386/linux version /c/appl/CLASS/USERNAME/APPNAME/VERSION | users | /fs/corpus/slash-c/appl/...
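
The latest entry is an ordinary symbolic link that sits next to the version directories. The following is a minimal Python sketch of repointing such a link when a new version is installed. It works in a throwaway temporary directory rather than the real /c/appl/ tree; the function name point_latest, the user "someuser" and the application "sometool" are hypothetical, and using a relative link target is an assumption, not something this page prescribes.

```python
import os
import tempfile


def point_latest(app_dir, version, link_name="latest"):
    """Make app_dir/link_name a symbolic link to app_dir/version."""
    target = os.path.join(app_dir, version)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)
    link = os.path.join(app_dir, link_name)
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(version, tmp)  # relative target (assumption)
    os.replace(tmp, link)     # rename over the old link, if any
    return link


# Demonstration in a scratch directory (hypothetical names throughout).
demo_root = tempfile.mkdtemp()
app_dir = os.path.join(demo_root, "c", "appl", "ling", "someuser", "sometool")
for v in ("1.0", "1.1"):
    os.makedirs(os.path.join(app_dir, v, "bin"))

point_latest(app_dir, "1.0")
point_latest(app_dir, "1.1")
print(os.readlink(os.path.join(app_dir, "latest")))
```

Passing link_name="latest64" would handle the 64-bit link in the same way.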

The user-contributed installations use the following directory naming conventions:

variable | meaning
CLASS | the class of software: ling = linguistics, lang = programming languages, comm = internet, ...
USERNAME | the Unix account of the contributing user
APPNAME | the name of the application
VERSION | the version of the application
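
As an illustration of the naming convention, the sketch below composes the installation root and the standard entries under it from the four variables. The function names contrib_root and contrib_layout, and the example values "someuser" and "sometool", are hypothetical; only the /c/appl/CLASS/USERNAME/APPNAME/VERSION pattern and the entry names come from this page.

```python
from pathlib import PurePosixPath

# Entries listed for each contributed version on this page.
ENTRIES = ("bin", "lib", "man", "lic", "doc", "README", "REBUILD")


def contrib_root(cls, username, appname, version):
    """Return /c/appl/CLASS/USERNAME/APPNAME/VERSION as a path object."""
    return PurePosixPath("/c/appl") / cls / username / appname / version


def contrib_layout(cls, username, appname, version):
    """Return the full paths of the entries expected under one version."""
    root = contrib_root(cls, username, appname, version)
    return [str(root / name) for name in ENTRIES]


for path in contrib_layout("ling", "someuser", "sometool", "1.0"):
    print(path)
```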

When a user contributes software to /c/ and asks the superuser to approve it and include it in the global path, there are two possible outcomes:

  1. ADOPTION: Depending on the type of the software and the contributor, the contributed software may be adopted as such. In that case, the administrator of the Language Bank takes over the ownership of the contributed version and links the executables to /l/cbin.
  2. REINSTALLATION: At times, the administrator may choose to reinstall the software. All the information needed for reinstalling the software must be provided to the Language Bank administrator in the text file REBUILD. The installation may go either to /c/ or to /l/, depending on the support level that CSC will guarantee. In /c/, support is essentially delegated to the original contributor, while in /l/, support is the responsibility of CSC.
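
Before asking for approval, a contributor might check that a version directory contains the entries described above. The sketch below is one way to do that in Python. Treating every listed entry as expected is an assumption made for illustration (this page only states that REBUILD must be provided for a reinstallation), and the function name missing_entries is hypothetical.

```python
import os
import tempfile

# Entries listed for each contributed version on this page.
EXPECTED = ("bin", "lib", "man", "lic", "doc", "README", "REBUILD")


def missing_entries(version_dir):
    """Return the expected entries that do not exist under version_dir."""
    return [name for name in EXPECTED
            if not os.path.exists(os.path.join(version_dir, name))]


# Demonstration on a scratch directory holding only bin/ and README.
demo = tempfile.mkdtemp()
os.makedirs(os.path.join(demo, "bin"))
with open(os.path.join(demo, "README"), "w") as fh:
    fh.write("purpose of use, maintenance plan, contact information\n")

print(missing_entries(demo))
```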

What directories are in /usr/local/, i.e. /l/?

directory | abbreviation | function
/l/kielipankki/ | | The linguistic resources maintained at CSC, the core of the Language Bank of Finland.
/l/venus/ | /proj contains symlinks, /corp is a symlink | The linguistic resources maintained by the Department of General Linguistics in the University of Helsinki. This directory is planned as a new location for the University of Helsinki Corpus Server UHLCS. This directory is not maintained by CSC.
/l/contrib/ | i.e. /c/ | Contributed applications approved by CSC.
/l/appl/ | executables linked to /l/bin/ | Local applications maintained centrally by CSC.
/l/bin/ | in the path | Symbolic links to executables of centrally maintained local applications. Contains pointers to executables in /l/bin/appl/.../.../.../bin/...

According to the CSC design principles, system-dependent but machine-independent software will be installed in the /v/ directory tree. Therefore, /l/ will eventually develop into a tree of links into /v/. Its function will be to present the subset of /v/ that is relevant to the corpus server.

Some "frequently asked" questions

  • Where are /mnt/corpus/ and /fs/corpus/? The former is a symbolic link to the latter, which in turn is a symbolic link to /fs/corpus2/.
  • Where is /corp/? It is a symbolic link to /l/venus/corp/ and is provided for compatibility.
  • Where is /proj/? It is a directory containing symbolic links to subdirectories of /l/venus/proj/. It is provided for compatibility.
  • What is the difference between /fs/kielipankki/ and /fs/corpus2/slash-l/kielipankki/ (or /l/kielipankki/ for short)? The content of the former is linked into the latter as a subset. The former is used by the scientist's interface, while the latter is visible only to the corpus machine.
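
Chains of symbolic links such as the ones described in these answers can be followed one hop at a time. The sketch below recreates the /mnt/corpus to /fs/corpus to /fs/corpus2 chain inside a temporary directory and prints each hop. The function name link_chain is hypothetical, and the scratch tree only mimics the real layout.

```python
import os
import tempfile


def link_chain(path, max_hops=16):
    """Follow symbolic links from path and return every hop, in order."""
    chain = [path]
    while os.path.islink(path) and len(chain) <= max_hops:
        target = os.readlink(path)
        # A relative target is resolved against the link's own directory.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
        chain.append(path)
    return chain


# Scratch tree that mimics the layout described above.
root = tempfile.mkdtemp()
corpus2 = os.path.join(root, "fs", "corpus2")
corpus = os.path.join(root, "fs", "corpus")
mnt_corpus = os.path.join(root, "mnt", "corpus")
os.makedirs(corpus2)
os.makedirs(os.path.join(root, "mnt"))
os.symlink(corpus2, corpus)
os.symlink(corpus, mnt_corpus)

chain = link_chain(mnt_corpus)
for hop in chain:
    print(os.path.relpath(hop, root))
```

Pass the path without a trailing slash: with a trailing slash the operating system follows the link before it can be inspected. On the server itself, os.path.realpath gives only the final location, without the intermediate hops.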

Multi-platform support

(Draft text:) Additional information for those contributing to the /c/ directory: if the installation can be done for the 64-bit Linux platform as well, and doing so is worthwhile, contributors should do it at the same time, so that we can use corpus4 as a 64-bit machine once a 64-bit Linux server becomes available. The 64-bit installation is made as a separate version in the directory /c/appl/ling/username/application/version-64/ and so on. Note, however, that compiling a 64-bit version is not possible on corpus3.

