Approaches to automated detection of cyberbullying: A Survey

Semiu Salawu; Yulan He; Joanna Lumsden

doi:10.1109/TAFFC.2017.2761757

Semiu Salawu^*, Yulan He, Joanna Lumsden

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Research into cyberbullying detection has increased in recent years, due in part to the proliferation of cyberbullying across social media and its detrimental effect on young people. A growing body of work is emerging on automated approaches to cyberbullying detection. These approaches utilise machine learning and natural language processing techniques to identify the characteristics of a cyberbullying exchange and automatically detect cyberbullying by matching textual data to the identified traits. In this paper, we present a systematic review of published research (as identified via Scopus, ACM and IEEE Xplore bibliographic databases) on cyberbullying detection approaches. On the basis of our extensive literature review, we categorise existing approaches into four main classes, namely: supervised learning, lexicon-based, rule-based and mixed-initiative approaches. Supervised learning-based approaches typically use classifiers such as SVM and Naïve Bayes to develop predictive models for cyberbullying detection. Lexicon-based systems utilise word lists and use the presence of words within the lists to detect cyberbullying. Rule-based approaches match text to predefined rules to identify bullying, and mixed-initiative approaches combine human-based reasoning with one or more of the aforementioned approaches. We found that a lack of quality representative labelled datasets and non-holistic consideration of cyberbullying by researchers when developing detection systems are two key challenges facing cyberbullying detection research. This paper essentially maps out the state of the art in cyberbullying detection research and serves as a resource for researchers to determine where to best direct their future research efforts in this field.

Original language	English
Pages (from-to)	3-24
Number of pages	22
Journal	IEEE Transactions on Affective Computing
Volume	11
Issue number	1
Early online date	10 Oct 2017
DOIs	https://doi.org/10.1109/TAFFC.2017.2761757
Publication status	Published - 1 Mar 2020

Bibliographical note

© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Keywords

Abuse and crime involving computers
computers
data mining
Electronic mail
machine learning
natural language processing
Sentiment analysis
sentiment analysis
Social network services
social networking
supervised learning

Access to Document

10.1109/TAFFC.2017.2761757

Approaches to Automated Detection of Cyberbullying
Accepted author manuscript, 1.41 MB

Cite this

@article{ff370ce150b04e61a4ad05ba0b294876,

title = "Approaches to automated detection of cyberbullying: A Survey",

abstract = "Research into cyberbullying detection has increased in recent years, due in part to the proliferation of cyberbullying across social media and its detrimental effect on young people. A growing body of work is emerging on automated approaches to cyberbullying detection. These approaches utilise machine learning and natural language processing techniques to identify the characteristics of a cyberbullying exchange and automatically detect cyberbullying by matching textual data to the identified traits. In this paper, we present a systematic review of published research (as identified via Scopus, ACM and IEEE Xplore bibliographic databases) on cyberbullying detection approaches. On the basis of our extensive literature review, we categorise existing approaches into 4 main classes, namely; supervised learning, lexicon based, rule based and mixed-initiative approaches. Supervised learning-based approaches typically use classifiers such as SVM and Na{\"i}ve Bayes to develop predictive models for cyberbullying detection. Lexicon based systems utilise word lists and use the presence of words within the lists to detect cyberbullying. Rules-based approaches match text to predefined rules to identify bullying and mixed-initiatives approaches combine human-based reasoning with one or more of the aforementioned approaches. We found lack of quality representative labelled datasets and non-holistic consideration of cyberbullying by researchers when developing detection systems are two key challenges facing cyberbullying detection research. This paper essentially maps out the state-of-the-art in cyberbullying detection research and serves as a resource for researchers to determine where to best direct their future research efforts in this field.",

keywords = "Abuse and crime involving computers, computers, data mining, Electronic mail, machine learning, natural language processing, Sentiment analysis, sentiment analysis, Social network services, social networking, supervised learning",

author = "Semiu Salawu and Yulan He and Joanna Lumsden",

note = "{\textcopyright} 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.",

year = "2020",

month = mar,

day = "1",

doi = "10.1109/TAFFC.2017.2761757",

language = "English",

volume = "11",

pages = "3--24",

journal = "IEEE Transactions on Affective Computing",

issn = "1949-3045",

publisher = "IEEE",

number = "1",

}

TY - JOUR

T1 - Approaches to automated detection of cyberbullying

T2 - A Survey

AU - Salawu, Semiu

AU - He, Yulan

AU - Lumsden, Joanna

N1 - © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

PY - 2020/3/1

Y1 - 2020/3/1

N2 - Research into cyberbullying detection has increased in recent years, due in part to the proliferation of cyberbullying across social media and its detrimental effect on young people. A growing body of work is emerging on automated approaches to cyberbullying detection. These approaches utilise machine learning and natural language processing techniques to identify the characteristics of a cyberbullying exchange and automatically detect cyberbullying by matching textual data to the identified traits. In this paper, we present a systematic review of published research (as identified via Scopus, ACM and IEEE Xplore bibliographic databases) on cyberbullying detection approaches. On the basis of our extensive literature review, we categorise existing approaches into 4 main classes, namely; supervised learning, lexicon based, rule based and mixed-initiative approaches. Supervised learning-based approaches typically use classifiers such as SVM and Naïve Bayes to develop predictive models for cyberbullying detection. Lexicon based systems utilise word lists and use the presence of words within the lists to detect cyberbullying. Rules-based approaches match text to predefined rules to identify bullying and mixed-initiatives approaches combine human-based reasoning with one or more of the aforementioned approaches. We found lack of quality representative labelled datasets and non-holistic consideration of cyberbullying by researchers when developing detection systems are two key challenges facing cyberbullying detection research. This paper essentially maps out the state-of-the-art in cyberbullying detection research and serves as a resource for researchers to determine where to best direct their future research efforts in this field.

AB - Research into cyberbullying detection has increased in recent years, due in part to the proliferation of cyberbullying across social media and its detrimental effect on young people. A growing body of work is emerging on automated approaches to cyberbullying detection. These approaches utilise machine learning and natural language processing techniques to identify the characteristics of a cyberbullying exchange and automatically detect cyberbullying by matching textual data to the identified traits. In this paper, we present a systematic review of published research (as identified via Scopus, ACM and IEEE Xplore bibliographic databases) on cyberbullying detection approaches. On the basis of our extensive literature review, we categorise existing approaches into 4 main classes, namely; supervised learning, lexicon based, rule based and mixed-initiative approaches. Supervised learning-based approaches typically use classifiers such as SVM and Naïve Bayes to develop predictive models for cyberbullying detection. Lexicon based systems utilise word lists and use the presence of words within the lists to detect cyberbullying. Rules-based approaches match text to predefined rules to identify bullying and mixed-initiatives approaches combine human-based reasoning with one or more of the aforementioned approaches. We found lack of quality representative labelled datasets and non-holistic consideration of cyberbullying by researchers when developing detection systems are two key challenges facing cyberbullying detection research. This paper essentially maps out the state-of-the-art in cyberbullying detection research and serves as a resource for researchers to determine where to best direct their future research efforts in this field.

KW - Abuse and crime involving computers

KW - computers

KW - data mining

KW - Electronic mail

KW - machine learning

KW - natural language processing

KW - Sentiment analysis

KW - sentiment analysis

KW - Social network services

KW - social networking

KW - supervised learning

UR - http://www.scopus.com/inward/record.url?scp=85031765182&partnerID=8YFLogxK

UR - https://ieeexplore.ieee.org/document/8063898

U2 - 10.1109/TAFFC.2017.2761757

DO - 10.1109/TAFFC.2017.2761757

M3 - Article

AN - SCOPUS:85031765182

SN - 1949-3045

VL - 11

SP - 3

EP - 24

JO - IEEE Transactions on Affective Computing

JF - IEEE Transactions on Affective Computing

IS - 1

ER -
