Offensive language identification with multi-task learning

Marcos Zampieri, Tharindu Ranasinghe, Diptanu Sarkar, Alex Ororbia

Research output: Contribution to journal › Article › peer-review

Abstract

The widespread presence of offensive content is a major issue in social media. This has motivated the development of computational models to identify such content in posts or conversations. Most of these models, however, treat offensive language identification as an isolated task. Very recently, a few datasets have been annotated with post-level offensiveness and related phenomena, such as offensive tokens, humor, and engaging content. This creates the opportunity to model related tasks jointly, which can improve the explainability of offensive language detection systems and potentially aid human moderators. This study proposes a novel multi-task learning (MTL) architecture that can predict: (1) offensiveness at both post and token levels in English; and (2) offensiveness and related subjective tasks such as humor, engaging content, and gender bias identification in multilingual settings. Our results show that the proposed MTL architecture outperforms current state-of-the-art methods trained to identify offense at the post level. We further demonstrate that MTL outperforms single-task learning (STL) across different tasks and language combinations.
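
As an illustration of what joint post-level and token-level training can look like, the following is a minimal sketch of a combined objective. It is not the architecture proposed in the article: the abstract does not state how the task losses are combined, so the weighted sum, the `token_weight` parameter, and the function names are all assumptions made for this example.

```python
import math


def binary_cross_entropy(prob, label):
    """Cross-entropy of one predicted probability against a 0/1 label."""
    eps = 1e-12  # keep log() away from zero
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def joint_loss(post_prob, post_label, token_probs, token_labels, token_weight=0.5):
    """Weighted sum of a post-level loss and a mean token-level loss.

    token_weight is a hypothetical hyperparameter, not taken from the article.
    """
    if len(token_probs) != len(token_labels):
        raise ValueError("token_probs and token_labels must have equal length")
    post_loss = binary_cross_entropy(post_prob, post_label)
    if token_probs:
        token_loss = sum(
            binary_cross_entropy(p, y) for p, y in zip(token_probs, token_labels)
        ) / len(token_probs)
    else:
        token_loss = 0.0  # a post with no tokens contributes no token-level loss
    return (1.0 - token_weight) * post_loss + token_weight * token_loss


# One offensive post (label 1) with three tokens, the second marked offensive.
good = joint_loss(0.9, 1, [0.1, 0.8, 0.2], [0, 1, 0])
bad = joint_loss(0.2, 1, [0.7, 0.3, 0.6], [0, 1, 0])
print(good < bad)
```

Setting `token_weight` to 0 reduces the objective to single-task post-level training, which is the kind of STL baseline the abstract compares against.
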
Original language: English
Pages (from-to): 613-630
Journal: Journal of Intelligent Information Systems
Volume: 60
Issue number: 3
Early online date: 29 Apr 2023
DOI: https://doi.org/10.1007/s10844-023-00787-z
Publication status: Published - Jun 2023

Bibliographical note

Copyright © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at:
https://doi.org/10.1007/s10844-023-00787-z

Keywords

  • Deep learning
  • Multi-task learning
  • Offensive language identification
  • Transformers
