Corpus-driven lexicography

Ramesh Krishnamurthy

Research output: Contribution to journal › Article


This paper discusses three important aspects of John Sinclair’s legacy: the corpus, lexicography, and the notion of ‘corpus-driven’. The corpus represents his concern with the nature of linguistic evidence. Lexicography is for him the canonical mode of language description at the lexical level. And his belief that the corpus should ‘drive’ the description is reflected in his constant attempts to utilize the emergent computer technologies to automate the initial stages of analysis and defer the intuitive, interpretative contributions of linguists to increasingly later stages in the process. Sinclair’s model of corpus-driven lexicography has spread far beyond its initial implementation at Cobuild, to most EFL dictionaries, to native-speaker dictionaries (e.g. the New Oxford Dictionary of English, and many national language dictionaries in emerging or re-emerging speech communities) and bilingual dictionaries (e.g. Collins, Oxford-Hachette).
Original language: English
Pages (from-to): 231-242
Number of pages: 12
Journal: International Journal of Lexicography
Issue number: 3
Publication status: Published - Sep 2008

Bibliographical note

This is an electronic version of an article published by Oxford University Press in International Journal of Lexicography, volume 21(3), pp. 231-242.


  • lexicography
  • linguistic
  • technologies
  • dictionaries
  • native
