Authorship profiling in a forensic context

  • Andrea Nini

    Student thesis: Doctoral Thesis, Doctor of Philosophy


    There are several unresolved problems in forensic authorship profiling, including a lack of research focusing on the types of texts that are typically analysed in forensic linguistics (e.g. threatening letters, ransom demands) and a general disregard for the effect of register variation when testing linguistic variables for use in profiling. The aim of this dissertation is therefore to make a first step towards filling these gaps by testing whether established patterns of sociolinguistic variation appear in malicious forensic texts that are controlled for register. This dissertation begins with a literature review that highlights a series of correlations between language use and various social factors, including gender, age, level of education and social class. This dissertation then presents the primary data set used in this study, which consists of a corpus of 287 fabricated malicious texts from 3 different registers produced by 96 authors stratified across the 4 social factors listed above. Since this data set is fabricated, its validity was also tested through a comparison with another corpus consisting of 104 naturally occurring malicious texts, which showed that no important differences exist between the language of the fabricated malicious texts and the authentic malicious texts. The dissertation then reports the findings of the analysis of the corpus of fabricated malicious texts, which shows that the major patterns of sociolinguistic variation identified in previous research are valid for forensic malicious texts and that controlling register variation greatly improves the performance of profiling. In addition, it is shown that through regression analysis it is possible to use these patterns of linguistic variation to profile the demographic background of authors across the four social factors with an average accuracy of 70%. Overall, the present study therefore makes a first step towards developing a principled model of forensic authorship profiling.
    Date of Award: 26 Feb 2015
    Original language: English
    Supervisors: Jack W Grieve, Tim Grant & Richard M Coulthard


    • forensic linguistics
    • authorship profiling
    • authorship analysis
    • threatening texts
    • stylometry
    • register variation
    • stylistics
