Overview of the HASOC Subtracks at FIRE 2023: Hate Speech and Offensive Content Identification in Assamese, Bengali, Bodo, Gujarati and Sinhala

Tharindu Ranasinghe, Koyel Ghosh, Aditya Shankar Pal, Apurbalal Senapati, Alphaeus Eric Dmonte, Marcos Zampieri, Sandip Modha, Shrey Satapara

Research output: Chapter in Book/Published conference output (Conference publication)

Abstract

The evaluation of content moderation systems requires reliable benchmark data. This task becomes particularly formidable for low-resource languages, where obtaining or curating such data poses significant challenges. Addressing this issue, HASOC 2023 organised various shared tasks focused on identifying offensive content in low-resource languages. This paper reports on tasks for hate speech detection in several Indo-Aryan languages (Assamese, Bengali, Gujarati, and Sinhala) as well as a Sino-Tibetan language, Bodo, for which limited linguistic resources currently exist. The shared task involved the compilation of multiple datasets. In total, nearly 200 runs were submitted by more than 30 teams, which are presented and analysed in this report.
Original language: English
Title of host publication: FIRE '23: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation
Editors: Debasis Ganguly, Srijoni Majumdar, Bhaskar Mitra, Parth Gupta, Surupendu Gangopadhyay, Prasenjit Majumder
Publisher: ACM
Pages: 13-15
Number of pages: 3
ISBN (Electronic): 9798400716324
Publication status: Published - 12 Feb 2024
Event: FIRE 2023: Forum for Information Retrieval Evaluation - Panjim, India
Duration: 15 Dec 2023 to 18 Dec 2023

Publication series

Name: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation
Publisher: ACM

Conference

Conference: FIRE 2023: Forum for Information Retrieval Evaluation
Country/Territory: India
City: Panjim
Period: 15/12/23 to 18/12/23

Keywords

  • Assamese
  • Bengali
  • Bodo
  • Gujarati
  • Hate speech
  • Multilingual Datasets
  • Sinhala
  • Social media
  • Under-resourced languages
