Survey of English Usage

The Survey of English Usage was the first research centre in Europe to carry out corpus-based research. The Survey is based in the Department of English Language and Literature at University College London.


The Survey of English Usage was founded in 1959 by Randolph (now Lord) Quirk. Many well-known linguists have spent time doing research at the Survey, including Valerie Adams, John Algeo, Dwight Bolinger, Noël Burton-Roberts, David Crystal, Derek Davy, Jan Firbas, Sidney Greenbaum, Liliane Haegeman, Robert Ilson, Ruth Kempson, Geoffrey Leech, Jan Rusiecki, Jan Svartvik, Joe Taglicht and many others.

The original Survey Corpus predated modern computing. It was recorded on reel-to-reel tapes, transcribed on paper, filed in filing cabinets, and indexed on paper cards. Transcriptions were marked up using a detailed prosodic and paralinguistic annotation scheme developed by Crystal and Quirk (1964). [Crystal, David, and Quirk, Randolph (1964). "Systems of Prosodic and Paralinguistic Features in English". The Hague: Mouton.] Sets of paper cards were manually annotated for grammatical structures and filed, so that, for example, all noun phrases could be found in the noun phrase filing cabinet in the Survey. Corpus searches therefore required a visit to the Survey.

This corpus is now more widely known as the London-Lund Corpus (LLC), because co-workers in Lund, Sweden, were responsible for computerising it. Thirty-four of the spoken texts were published in book form as Svartvik and Quirk (1984), [Svartvik, Jan and Quirk, Randolph (1984) (eds.). "A Corpus of English Conversation". Lund: CWK Gleerup.] and the corpus was used as the basis for the famous "Comprehensive Grammar" (Quirk et al. 1985). [Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey and Svartvik, Jan (1985). "A Comprehensive Grammar of the English Language". London: Longman.]

More information is available on the Survey of English Usage history pages, which include reflections by Randolph Quirk and others on the early days of corpus linguistics.

Current Research

Constructing Corpora

In 1988 Sidney Greenbaum proposed a new project, ICE, the International Corpus of English. ICE was to be an international project, carried out at research centres around the world, to compile corpora of English varieties where English was the first or second official language. ICE texts would contain spoken and written English in a balanced sample of one million words per component so that these samples could be compared in a wide variety of ways. The ICE project continues around the world to the present day.

ICE-GB, the British Component of ICE, was compiled at the Survey. ICE-GB was annotated to a very detailed level, including a full grammatical analysis (parse) of every sentence in the corpus. The first release of ICE-GB took place in 1998. ICE-GB was distributed with ICECUP, software for searching and exploring the parsed corpus. Release 2 of ICE-GB is now available on CD.

As well as contrasting varieties of English, many researchers are interested in language development and change over time. A recent project at the Survey undertook the parsing of a large (400,000-word) selection of the spoken part of the LLC in a manner directly comparable with ICE-GB, forming a new, 800,000-word diachronic corpus called the Diachronic Corpus of Present-Day Spoken English (DCPSE). DCPSE is now available on CD from the Survey.

These two corpora comprise the largest collection of parsed and corrected, orthographically transcribed spoken English language data in the world, with over one million words of spoken English in this form.

Exploring Corpora

Parsed corpora are large databases containing detailed grammatical tree structures. One of the consequences of forming large collections of valuable linguistic data is a pressing need for methods and tools to help researchers and other users make the most of them. So in parallel with the parsing of natural language data, the Survey team have carried out research and development of software tools to help linguists use these corpora. The ICECUP research platform uses an intuitive grammatical query representation called Fuzzy Tree Fragments (FTFs) to search parsed corpora.
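The general idea of querying a parsed corpus with a partial tree can be illustrated with a short Python sketch. This is not ICECUP's implementation or its query format. The class names, the matching rule (child patterns must match in order, with gaps allowed) and the node labels below are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a parse tree: a grammatical function and a category label."""
    function: str
    category: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Fragment:
    """A partial tree pattern (hypothetical); None means 'any value'."""
    function: Optional[str] = None
    category: Optional[str] = None
    children: List["Fragment"] = field(default_factory=list)


def matches_at(frag: Fragment, node: Node) -> bool:
    """True if the fragment matches with its root placed on this node."""
    if frag.function is not None and frag.function != node.function:
        return False
    if frag.category is not None and frag.category != node.category:
        return False
    # Assumed rule: child patterns match distinct children in order, gaps allowed.
    return _match_children(frag.children, node.children)


def _match_children(frags: List[Fragment], nodes: List[Node]) -> bool:
    if not frags:
        return True
    for i, child in enumerate(nodes):
        if matches_at(frags[0], child) and _match_children(frags[1:], nodes[i + 1:]):
            return True
    return False


def find_all(frag: Fragment, root: Node) -> List[Node]:
    """Return every node in the tree at which the fragment matches."""
    hits = [root] if matches_at(frag, root) else []
    for child in root.children:
        hits.extend(find_all(frag, child))
    return hits


# Illustrative tree for a clause with a subject, a verb and a direct object.
tree = Node("PU", "CL", [
    Node("SU", "NP", [Node("NPHD", "N")]),
    Node("VB", "VP"),
    Node("OD", "NP", [Node("DT", "DTP"), Node("NPHD", "N")]),
])

# Query: any noun phrase that contains a noun among its children.
np_query = Fragment(category="NP", children=[Fragment(category="N")])
np_hits = [n.function for n in find_all(np_query, tree)]
print(np_hits)  # ['SU', 'OD']

# Query: a clause with a subject followed, not necessarily directly, by an object.
clause_query = Fragment(category="CL",
                        children=[Fragment(function="SU"), Fragment(function="OD")])
clause_hits = find_all(clause_query, tree)
```

Leaving a label as None is what makes the pattern partial in this sketch: the first query ignores grammatical function entirely, so it finds noun phrases in both subject and object position.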

Linguistic Research with Corpora

As well as distributing corpora and tools to the Corpus Linguistics research community, the SEU carries out research into the English language. Recent projects include research on the English Noun Phrase, Subordination in Spoken and Written English, and the English Verb Phrase. The Survey also provides support for a small number of PhD students who carry out research into English language corpora.


External Links

* The Survey of English Usage website

Wikimedia Foundation. 2010.
