Helsinki Corpus of English Texts

Department: Department of English
Contact person: Arja Nurmi

1. Linguistic research resource

w. Official or identifying name and acronym: Helsinki Corpus of English Texts (HC)
x. Short description of content: A balanced multi-genre corpus of English texts dating from c. 730 to 1710.
y. Originality status: Primary/original location: Research Unit for Variation, Contacts and Change in English (VARIENG), Department of English, University of Helsinki.
Copy: Distributed worldwide through the Oxford Text Archive and ICAME.
z. Description of size and extent: 1.6 million words
å. Storage format: Electronic (plain text)
bb. (Estimated) time invested in the collection and processing of the resource: 240 person-months
cc. Contact person(s) and their contact information (e-mail and telephone); may be the same for all points (a), (b) and (c):
a) Person who in practice administers the resource and grants (possibly required) usage permits: Oxford Text Archive and ICAME.
b) Person who has physical possession of the contracts concerning the resource, by which the resource has been acquired for use at the department: Heli Tissari (09-19123104).
c) Person(s) who originally contracted, acquired, collected, compiled and/or annotated the resource, who thus hold copyright to the material, and whose permission is (possibly) required to access the resource: Corpus compilers: Matti Rissanen, Merja Kytö; (Old English part) Leena Kahlas-Tarkka, Matti Kilpiö; (Middle English part) Päivi Pahta, Kirsti Peitsara, Irma Taavitsainen; (Early Modern English part) Terttu Nevalainen, Helena Raumolin-Brunberg.
dd. (Main) references to published articles or other written works describing the resource itself or research based on its use:
Kytö, Merja (comp.). 1996. Manual to the Diachronic Part of the Helsinki Corpus of English Texts: Coding Conventions and Lists of Source Texts. 3rd ed.
Rissanen, Matti, Merja Kytö & Minna Palander (eds.). 1993. Early English in the Computer Age: Explorations through the Helsinki Corpus. Berlin: Mouton de Gruyter.
Rissanen, Matti, Merja Kytö & Kirsi Heikkonen (eds.). 1997. English in Transition: Corpus-based Studies in Linguistic Variation and Genre Styles. Berlin and New York: Mouton de Gruyter.
Rissanen, Matti, Merja Kytö & Kirsi Heikkonen (eds.). 1997. Grammaticalization at Work: Studies of Long-Term Developments in English (Topics in English Linguistics 24). Berlin: Mouton de Gruyter.
ee. Link(s) to more extensive/thorough descriptions of the resource on the Internet (which may be in any language):
ff. Physical location of resource (server and directory path or Internet address, or room/person in the case of non-electronic materials): Several copies on computers at VARIENG. Copies are available from the Oxford Text Archive (free of charge) and ICAME.
gg. Miscellaneous other notes:

Resource Name Helsinki Corpus of English Texts (HC)
Resource Type Written Corpus
Languages English
Languages (other)

Description A balanced multi-genre corpus of English texts dating from c. 730 to 1710.

Institute Department of English, University of Helsinki
Contact Person

Begin year of resource creation

Finalization year

Format Electronic (plain text)
Metadata Link


Reference Link

Collection Working Languages

Collection Long term preservation by

Collection Location

Collection Content Type

Collection Format Detailed

Collection Quality

Collection Applications

Collection Project

Collection Size 1.6 million words
Collection Distribution Form

Collection Access

Collection Source

IPR Ethical Reference

IPR Legal Reference

IPR License Type

IPR Description

IPR Contact Person

Topic revision: r3 - 2011-11-14 - KimmoKoskenniemi