TY - GEN
T1 - Target-Based Offensive Language Identification
AU - Zampieri, Marcos
AU - Morgan, Skye
AU - North, Kai
AU - Ranasinghe, Tharindu
AU - Simmons, Austin
AU - Khandelwal, Paridhi
AU - Rosenthal, Sara
AU - Nakov, Preslav
N1 - Materials published in or after 2016 are licensed under a Creative Commons Attribution 4.0 International License.
PY - 2023/7/9
Y1 - 2023/7/9
N2 - We present TBO, a new dataset for Target-based Offensive language identification. TBO contains post-level annotations regarding the harmfulness of an offensive post and token-level annotations comprising the target and the offensive argument expression. Popular offensive language identification datasets for social media focus on annotation taxonomies only at the post level; more recently, some datasets have been released that feature only token-level annotations. TBO is an important resource that bridges the gap between post-level and token-level annotation datasets by introducing a single comprehensive unified annotation taxonomy. We use the TBO taxonomy to annotate post-level and token-level offensive language in English Twitter posts. We release an initial dataset of over 4,500 instances collected from Twitter, and we carry out multiple experiments to compare the performance of different models trained and tested on TBO.
AB - We present TBO, a new dataset for Target-based Offensive language identification. TBO contains post-level annotations regarding the harmfulness of an offensive post and token-level annotations comprising the target and the offensive argument expression. Popular offensive language identification datasets for social media focus on annotation taxonomies only at the post level; more recently, some datasets have been released that feature only token-level annotations. TBO is an important resource that bridges the gap between post-level and token-level annotation datasets by introducing a single comprehensive unified annotation taxonomy. We use the TBO taxonomy to annotate post-level and token-level offensive language in English Twitter posts. We release an initial dataset of over 4,500 instances collected from Twitter, and we carry out multiple experiments to compare the performance of different models trained and tested on TBO.
UR - http://www.scopus.com/inward/record.url?scp=85172243992&partnerID=8YFLogxK
UR - https://aclanthology.org/2023.acl-short.66/
U2 - 10.18653/v1/2023.acl-short.66
DO - 10.18653/v1/2023.acl-short.66
M3 - Conference publication
AN - SCOPUS:85172243992
VL - 2
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 762
EP - 770
BT - Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Y2 - 9 July 2023 through 14 July 2023
ER -