Native Language Identification (NLID) for Forensic Authorship Analysis of Weblogs

Ria C Perkins

doi:10.4018/978-1-4666-8345-7.ch012

Corresponding author: Ria C Perkins

School of Social Sciences and Humanities

Research output: Chapter in Book/Published conference output › Entry for encyclopedia/dictionary

Abstract

This chapter introduces Native Language Identification (NLID) and considers its casework applications in the authorship analysis of online material. It presents findings from research identifying which linguistic features best indicate native (L1) Persian speakers blogging in English, and analyses how well these features distinguish between influences from native languages that are linguistically and culturally related. The first section outlines the field of Native Language Identification and demonstrates its potential for application through a discussion of relevant case history. The next section discusses the development of a methodology for identifying L1 Persian influence in an anonymous blog author and presents the findings. The third section discusses the application of these features to casework, shows how the identified features can form an easily applicable model, and demonstrates that model's use in casework. The research presented in this chapter can be considered a case study for the wider potential application of NLID.

Original language	English
Title of host publication	New Threats and Countermeasures in Digital Crime and Cyber Terrorism
Editors	Maurice Dawson, Marwan Omar
Publisher	IGI Global
Pages	213-234
Number of pages	22
ISBN (Electronic)	9781466683464
ISBN (Print)	1466683457, 9781466683457
DOIs	https://doi.org/10.4018/978-1-4666-8345-7.ch012
Publication status	Published - 30 Apr 2015

Publication series

Name	Premier Reference Source
Publisher	IGI Global

Cite this

@inbook{866b8933dfac430bacc4c96295999abb,

title = "Native Language Identification (NLID) for Forensic Authorship Analysis of Weblogs",

abstract = "This chapter introduces Native Language Identification (NLID) and considers the casework applications with regard to authorship analysis of online material. It presents findings from research identifying which linguistic features were the best indicators of native (L1) Persian speakers blogging in English, and analyses how these features cope at distinguishing between native influences from languages that are linguistically and culturally related. The first chapter section outlines the area of Native Language Identification, and demonstrates its potential for application through a discussion of relevant case history. The next section discusses a development of methodology for identifying influence from L1 Persian in an anonymous blog author, and presents findings. The third part discusses the application of these features to casework situations as well as how the features identified can form an easily applicable model and demonstrates the application of this to casework. The research presented in this chapter can be considered a case study for the wider potential application of NLID.",

author = "Perkins, {Ria C}",

year = "2015",

month = apr,

day = "30",

doi = "10.4018/978-1-4666-8345-7.ch012",

language = "English",

isbn = "1466683457",

series = "Premier Reference Source",

publisher = "IGI Global",

pages = "213--234",

editor = "Maurice Dawson and Marwan Omar",

booktitle = "New Threats and Countermeasures in Digital Crime and Cyber Terrorism",

address = "United States",

}

TY - CHAP

T1 - Native Language Identification (NLID) for Forensic Authorship Analysis of Weblogs

AU - Perkins, Ria C

PY - 2015/4/30

Y1 - 2015/4/30

N2 - This chapter introduces Native Language Identification (NLID) and considers the casework applications with regard to authorship analysis of online material. It presents findings from research identifying which linguistic features were the best indicators of native (L1) Persian speakers blogging in English, and analyses how these features cope at distinguishing between native influences from languages that are linguistically and culturally related. The first chapter section outlines the area of Native Language Identification, and demonstrates its potential for application through a discussion of relevant case history. The next section discusses a development of methodology for identifying influence from L1 Persian in an anonymous blog author, and presents findings. The third part discusses the application of these features to casework situations as well as how the features identified can form an easily applicable model and demonstrates the application of this to casework. The research presented in this chapter can be considered a case study for the wider potential application of NLID.

AB - This chapter introduces Native Language Identification (NLID) and considers the casework applications with regard to authorship analysis of online material. It presents findings from research identifying which linguistic features were the best indicators of native (L1) Persian speakers blogging in English, and analyses how these features cope at distinguishing between native influences from languages that are linguistically and culturally related. The first chapter section outlines the area of Native Language Identification, and demonstrates its potential for application through a discussion of relevant case history. The next section discusses a development of methodology for identifying influence from L1 Persian in an anonymous blog author, and presents findings. The third part discusses the application of these features to casework situations as well as how the features identified can form an easily applicable model and demonstrates the application of this to casework. The research presented in this chapter can be considered a case study for the wider potential application of NLID.

UR - http://www.scopus.com/inward/record.url?scp=84958252560&partnerID=8YFLogxK

U2 - 10.4018/978-1-4666-8345-7.ch012

DO - 10.4018/978-1-4666-8345-7.ch012

M3 - Entry for encyclopedia/dictionary

AN - SCOPUS:84958252560

SN - 1466683457

SN - 9781466683457

T3 - Premier Reference Source

SP - 213

EP - 234

BT - New Threats and Countermeasures in Digital Crime and Cyber Terrorism

A2 - Dawson, Maurice

A2 - Omar, Marwan

PB - IGI Global

ER -
