- British National Corpus
The British National Corpus (or just BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources. It was compiled as a general corpus (text collection) in the field of corpus linguistics. The corpus covers British English of the late twentieth century from a wide variety of genres, with the intention that it be a representative sample of spoken and written British English of that time.

Of the two parts of the 10-million-word spoken corpus, one is a demographic part, containing transcriptions of spontaneous natural conversations made by members of the public, and the other is a context-governed part, containing transcriptions of recordings made at specific types of meeting and event. All the original recordings transcribed for inclusion in the BNC have been deposited at the National Sound Archives of the British Library.

The corpus is marked up following the recommendations of the Text Encoding Initiative and includes full linguistic annotation and contextual information. The most recent edition, from March 2007, is distributed in XML format along with the XAIRA software. It is freely available under a licence and is very widely distributed.

See also
* Corpus of Contemporary American English (COCA): 360 million words, 1990-2007. Freely available online.
* American National Corpus
* Oxford English Corpus

External links
* [http://www.natcorp.ox.ac.uk British National Corpus website]
* [http://corpus.byu.edu/bnc Free BNC interface]
Wikimedia Foundation. 2010.