Comparing sentence-level features for authorship analysis in Portuguese

Rui Sousa-Silva*, Luís Sarmento, Tim Grant, Eugénio Oliveira, Belinda Maia

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference publication

Abstract

In this paper we compare the robustness of several types of stylistic markers to help discriminate authorship at sentence level. We train an SVM-based classifier using each set of features separately and perform sentence-level authorship analysis over a corpus of editorials published in a Portuguese quality newspaper. Results show that features based on POS information, punctuation and word/sentence length contribute to a more robust sentence-level authorship analysis.
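As an illustration of the kind of surface features the abstract mentions, the following is a minimal sketch of extracting punctuation and word/sentence length features from a single sentence. The abstract does not define the features exactly, so the function name `surface_features`, the punctuation inventory and the whitespace tokenisation are all assumptions made for this example, not the authors' implementation. POS-based features are left out because they would require a Portuguese tagger.

```python
import string

# Hypothetical punctuation inventory: the abstract names the feature
# families but does not list the marks that were actually counted.
PUNCTUATION = ",;:.!?"


def surface_features(sentence: str) -> dict:
    """Map one sentence to punctuation-count and length features."""
    # Assumption: whitespace tokenisation, with ASCII punctuation
    # stripped from token edges (non-ASCII quotes are not handled).
    words = [w.strip(string.punctuation) for w in sentence.split()]
    words = [w for w in words if w]

    # One count per punctuation mark in the assumed inventory.
    features = {f"count_{mark}": float(sentence.count(mark)) for mark in PUNCTUATION}

    # Sentence length in words and in characters.
    features["n_words"] = float(len(words))
    features["n_chars"] = float(len(sentence))

    # Mean word length; 0.0 for a sentence with no words.
    features["mean_word_len"] = (
        sum(len(w) for w in words) / len(words) if words else 0.0
    )
    return features
```

Each sentence's feature dictionary could then be turned into a vector and passed to an SVM classifier, one feature set at a time, which is the comparison the abstract describes.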

Original language: English
Title of host publication: Computational processing of the Portuguese language
Subtitle of host publication: 9th International Conference, PROPOR 2010, Porto Alegre, RS, Brazil, April 27-30, 2010. Proceedings
Editors: Thiago Alexandre Salgueiro Pardo, António Branco, Aldebaro Klautau, et al.
Place of Publication: Berlin (DE)
Publisher: Springer
Pages: 51-54
Number of pages: 4
ISBN (Electronic): 978-3-642-12320-7
ISBN (Print): 978-3-642-12319-1
Publication status: Published - 23 Dec 2010
Event: 9th International Conference on Computational Processing of the Portuguese Language - Porto Alegre, RS, Brazil
Duration: 27 Apr 2010 to 30 Apr 2010

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 6001
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th International Conference on Computational Processing of the Portuguese Language
Abbreviated title: PROPOR 2010
Country: Brazil
City: Porto Alegre, RS
Period: 27/04/10 to 30/04/10
