Approaches to automated detection of cyberbullying: A Survey

Semiu Salawu, Yulan He, Joanna Lumsden

Research output: Contribution to journal › Article

Abstract

Research into cyberbullying detection has increased in recent years, due in part to the proliferation of cyberbullying across social media and its detrimental effect on young people. A growing body of work is emerging on automated approaches to cyberbullying detection. These approaches utilise machine learning and natural language processing techniques to identify the characteristics of a cyberbullying exchange and automatically detect cyberbullying by matching textual data to the identified traits. In this paper, we present a systematic review of published research (as identified via the Scopus, ACM and IEEE Xplore bibliographic databases) on cyberbullying detection approaches. On the basis of our extensive literature review, we categorise existing approaches into four main classes, namely supervised learning, lexicon-based, rule-based and mixed-initiative approaches. Supervised learning-based approaches typically use classifiers such as SVM and Naïve Bayes to develop predictive models for cyberbullying detection. Lexicon-based systems utilise word lists and use the presence of words within the lists to detect cyberbullying. Rule-based approaches match text to predefined rules to identify bullying, and mixed-initiative approaches combine human-based reasoning with one or more of the aforementioned approaches. We found that a lack of quality, representative labelled datasets and the non-holistic consideration of cyberbullying by researchers when developing detection systems are two key challenges facing cyberbullying detection research. This paper essentially maps out the state of the art in cyberbullying detection research and serves as a resource for researchers to determine where to best direct their future research efforts in this field.
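To make the distinction between the lexicon-based and rule-based classes concrete, the following is a minimal sketch in Python and is not taken from the surveyed systems. The word list, the threshold, the single rule (a second-person pronoun followed within a few words by a listed term) and all function names are hypothetical and chosen only for illustration. A real system would rely on a curated lexicon and a much richer rule set.

```python
import re

# Hypothetical word list; a real lexicon would be far larger and curated.
LEXICON = {"loser", "stupid", "ugly"}

# Hypothetical rule: a second-person pronoun followed, within at most
# three intervening words, by a term from the word list.
SECOND_PERSON_INSULT = re.compile(
    r"\b(you|u|your)\b(?:\W+\w+){0,3}?\W+(" + "|".join(sorted(LEXICON)) + r")\b",
    re.IGNORECASE,
)


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_flag(text, threshold=1):
    """Lexicon-based check: flag text with at least `threshold` listed words."""
    hits = sum(1 for token in tokenize(text) if token in LEXICON)
    return hits >= threshold


def rule_flag(text):
    """Rule-based check: flag text that matches the predefined pattern."""
    return SECOND_PERSON_INSULT.search(text) is not None


if __name__ == "__main__":
    for message in ["That movie was stupid", "You are such a loser", "Have a nice day"]:
        print(message, "| lexicon:", lexicon_flag(message), "| rule:", rule_flag(message))
```

In this sketch the lexicon check flags any message containing a listed word, including one aimed at a film rather than a person, whereas the rule only fires when the term is directed at someone. This kind of difference is one reason such techniques are often combined with each other or with human judgement.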

Original language: English
Journal: IEEE Transactions on Affective Computing
Volume: early online
DOIs: https://doi.org/10.1109/TAFFC.2017.2761757
Publication status: Published - 10 Oct 2017

Fingerprint

Supervised learning
Learning systems
Classifiers
Processing

Bibliographical note

© Copyright 2017 IEEE - All rights reserved.

Keywords

  • Abuse and crime involving computers
  • computers
  • data mining
  • Electronic mail
  • machine learning
  • natural language processing
  • Sentiment analysis
  • sentiment analysis
  • Social network services
  • social networking
  • supervised learning

Cite this

Salawu, S., He, Y., & Lumsden, J. (2017). Approaches to automated detection of cyberbullying: A Survey. IEEE Transactions on Affective Computing, early online. https://doi.org/10.1109/TAFFC.2017.2761757
Salawu, Semiu; He, Yulan; Lumsden, Joanna. / Approaches to automated detection of cyberbullying: A Survey. In: IEEE Transactions on Affective Computing. 2017; Vol. early online.
@article{ff370ce150b04e61a4ad05ba0b294876,
title = "Approaches to automated detection of cyberbullying: A Survey",
keywords = "Abuse and crime involving computers, computers, data mining, Electronic mail, machine learning, natural language processing, Sentiment analysis, sentiment analysis, Social network services, social networking, supervised learning",
author = "Semiu Salawu and Yulan He and Joanna Lumsden",
note = "{\circledC} Copyright 2017 IEEE - All rights reserved.",
year = "2017",
month = "10",
day = "10",
doi = "10.1109/TAFFC.2017.2761757",
language = "English",
volume = "early online",
journal = "IEEE Transactions on Affective Computing",
}

Salawu, S, He, Y & Lumsden, J 2017, 'Approaches to automated detection of cyberbullying: A Survey', IEEE Transactions on Affective Computing, vol. early online. https://doi.org/10.1109/TAFFC.2017.2761757

TY - JOUR

T1 - Approaches to automated detection of cyberbullying

T2 - A Survey

AU - Salawu, Semiu

AU - He, Yulan

AU - Lumsden, Joanna

N1 - © Copyright 2017 IEEE - All rights reserved.

PY - 2017/10/10

Y1 - 2017/10/10

KW - Abuse and crime involving computers

KW - computers

KW - data mining

KW - Electronic mail

KW - machine learning

KW - natural language processing

KW - Sentiment analysis

KW - sentiment analysis

KW - Social network services

KW - social networking

KW - supervised learning

UR - http://www.scopus.com/inward/record.url?scp=85031765182&partnerID=8YFLogxK

U2 - 10.1109/TAFFC.2017.2761757

DO - 10.1109/TAFFC.2017.2761757

M3 - Article

AN - SCOPUS:85031765182

VL - early online

ER -

Salawu S, He Y, Lumsden J. Approaches to automated detection of cyberbullying: A Survey. IEEE Transactions on Affective Computing. 2017 Oct 10;early online. https://doi.org/10.1109/TAFFC.2017.2761757