Native Language Influence Detection for Forensic Authorship Analysis: Identifying L1 Persian Bloggers.

Research output: Contribution to journal (Article)


This article demonstrates and examines the potential use of interlingual identifiers for forensic authorship analysis and Native Language Influence Detection (NLID). The work focuses on the practical application of native language (L1) identifiers by a human analyst in investigative situations. Using naturally occurring blog posts in which the writer self-identifies as a native Persian speaker, a human analyst derived and coded sets of non-native features. Two logistic regression models were built. The first was used to select features that distinguish L1 Persian speakers from L1 English speakers in their English writings. The second developed a feature list to contrast L1 languages that are geographically and linguistically close to Persian. The results demonstrate that interlingual identifiers have the potential to aid in determining the L1 of an anonymous author and can be applied by a human analyst to a short, forensically realistic example text. The article shows that Native Language Influence Detection is possible beyond the more common computational approaches and can form a useful tool in the forensic linguist’s toolbox. This study is not a statistical validation study; instead, it demonstrates how a sociolinguistic approach can complement more traditional computational approaches.
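As a rough illustration of the kind of model the abstract describes, the sketch below fits a logistic regression to binary, analyst-coded feature indicators and scores a new text. It is only a minimal sketch under stated assumptions: the feature names and the eight data rows are invented for illustration and are not taken from the study, and the fitting routine is plain batch gradient descent rather than whatever procedure the authors used.

```python
import math

# Hypothetical feature names and toy data, NOT from the study.
# Each row is one text; each column is 1 if the coded feature is present, else 0.
FEATURES = ["feature_a", "feature_b", "feature_c"]
X = [
    [1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 1, 1],  # texts labelled L1 Persian (1)
    [0, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0],  # texts labelled L1 English (0)
]
y = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Estimated probability that a text with features x is labelled 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # derivative of log-loss w.r.t. the logit
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


w, b = fit_logistic(X, y)

# Score a new, unseen text in which two of the three toy features were coded.
p_new = predict_proba(w, b, [1, 1, 0])
for name, weight in zip(FEATURES, w):
    print(f"{name}: weight {weight:.2f}")
print(f"P(label = 1) for the new text: {p_new:.2f}")
```

In a setup like this, the fitted weights indicate how strongly each coded feature pushes a text toward one label, which is one plausible way a model could support feature selection; the abstract does not say how the authors carried that step out.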



  • Perkins and Grant NLID persian final edits (1)

    Accepted author manuscript, 70 KB, Word-document

    Embargo ends: 1/01/50


Original language: English
Journal: International Journal of Speech, Language and the Law
State: Accepted/In press - 30 Apr 2018

