Corpus Linguistics in Authorship Identification

Krzysztof J Kredens; Richard M Coulthard

doi:10.1093/oxfordhb/9780199572120.013.0037

Research output: Chapter in Book/Published conference output, Chapter (peer-reviewed)

Abstract

Corpus linguistics is basically ‘an empirical approach to studying language, which uses observations of attested data in order to make generalisations about lexis, grammar, and semantics’, and which, in the context of forensic linguistics, offers much more than explanatory possibilities. It provides methods for processing naturally occurring language data with a view to describing the nature of particular instances of language use and the behavior of particular (groups of) language users. Language corpora can thus be used for a variety of forensic linguistic tasks. This article explores how corpora and corpus methodology can aid the forensic linguist: in authorship identification; to analyze texts comparatively in order to comment on the authorship of questioned documents; to interpret the meaning of disputed utterances; and to investigate and describe language use in legal and forensic settings. After discussing authorship attribution, it looks at disputed meanings, corpora in language and law research, corpora for forensic applications, and the Internet as a corpus.

Original language	English
Title of host publication	Oxford Handbook of Language and Law
Editors	Lawrence M Solan, Peter M Tiersma
Place of Publication	Oxford
Publisher	Oxford University Press
ISBN (Print)	9780199572120
DOIs	https://doi.org/10.1093/oxfordhb/9780199572120.013.0037
Publication status	Published - Mar 2012

Keywords

forensic linguistics
forensic authorship analysis

Cite this

@inbook{7e98f9b920164568be3f22f434a90e7c,
  title = "Corpus Linguistics in Authorship Identification",
  abstract = "Corpus linguistics is basically {\textquoteleft}an empirical approach to studying language, which uses observations of attested data in order to make generalisations about lexis, grammar, and semantics{\textquoteright}, and which, in the context of forensic linguistics, offers much more than explanatory possibilities. It provides methods for processing naturally occurring language data with a view to describing the nature of particular instances of language use and the behavior of particular (groups of) language users. Language corpora can thus be used for a variety of forensic linguistic tasks. This article explores how corpora and corpus methodology can aid the forensic linguist: in authorship identification; to analyze texts comparatively in order to comment on the authorship of questioned documents; to interpret the meaning of disputed utterances; and to investigate and describe language use in legal and forensic settings. After discussing authorship attribution, it looks at disputed meanings, corpora in language and law research, corpora for forensic applications, and the Internet as a corpus.",
  keywords = "forensic linguistics, forensic authorship analysis",
  author = "Kredens, {Krzysztof J} and Coulthard, {Richard M}",
  year = "2012",
  month = mar,
  doi = "10.1093/oxfordhb/9780199572120.013.0037",
  language = "English",
  isbn = "9780199572120",
  editor = "Solan, {Lawrence M} and Tiersma, {Peter M}",
  booktitle = "Oxford Handbook of Language and Law",
  publisher = "Oxford University Press",
  address = "United Kingdom",
}

TY - CHAP
T1 - Corpus Linguistics in Authorship Identification
AU - Kredens, Krzysztof J
AU - Coulthard, Richard M
PY - 2012/3
Y1 - 2012/3
AB - Corpus linguistics is basically ‘an empirical approach to studying language, which uses observations of attested data in order to make generalisations about lexis, grammar, and semantics’, and which, in the context of forensic linguistics, offers much more than explanatory possibilities. It provides methods for processing naturally occurring language data with a view to describing the nature of particular instances of language use and the behavior of particular (groups of) language users. Language corpora can thus be used for a variety of forensic linguistic tasks. This article explores how corpora and corpus methodology can aid the forensic linguist: in authorship identification; to analyze texts comparatively in order to comment on the authorship of questioned documents; to interpret the meaning of disputed utterances; and to investigate and describe language use in legal and forensic settings. After discussing authorship attribution, it looks at disputed meanings, corpora in language and law research, corpora for forensic applications, and the Internet as a corpus.
KW - forensic linguistics
KW - forensic authorship analysis
UR - https://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780199572120.001.0001/oxfordhb-9780199572120
U2 - 10.1093/oxfordhb/9780199572120.013.0037
DO - 10.1093/oxfordhb/9780199572120.013.0037
M3 - Chapter (peer-reviewed)
SN - 9780199572120
BT - Oxford Handbook of Language and Law
A2 - Solan, Lawrence M
A2 - Tiersma, Peter M
PB - Oxford University Press
CY - Oxford
ER -
