TY - JOUR
T1 - The Spoken BNC2014
T2 - Designing and building a spoken corpus of everyday conversations
AU - Love, Robbie
AU - Dembry, Claire
AU - Hardie, Andrew
AU - Brezina, Vaclav
AU - McEnery, Tony
N1 - © John Benjamins Publishing Company
This is an open access article under an OA CC BY license
PY - 2017/12/31
Y1 - 2017/12/31
N2 - This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.
AB - This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.
UR - https://benjamins.com/catalog/ijcl.22.3.02lov/fulltext
U2 - 10.1075/ijcl.22.3.02lov
DO - 10.1075/ijcl.22.3.02lov
M3 - Article
SN - 1384-6655
VL - 22
SP - 319
EP - 344
JO - International Journal of Corpus Linguistics
JF - International Journal of Corpus Linguistics
IS - 3
ER -