High Resolution Sentiment Analysis by Ensemble Classification

Jordan J. Bird*, Anikó Ekárt, Christopher D. Buckingham, Diego R. Faria

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This study proposes an ensemble approach to classifying the sentiment of a text on a negative-to-positive scale of 1–5. A high-performing model is produced from TripAdvisor restaurant reviews via a generated dataset of 684 word-stems, selected by information gain attribute selection from the entire corpus. The best-performing classifier was an ensemble of RandomForest, Naive Bayes Multinomial and Multilayer Perceptron (Neural Network) methods combined via a Vote on Average Probabilities approach. This ensemble achieved a classification accuracy of 91.02%, higher than the best single classifier, a Random Tree model with an accuracy of 78.6%. Other ensembles built through Adaptive Boosting, Random Forests and Voting are explored with ten-fold cross-validation. All ensemble methods far outperformed the best single classifiers. Although very high accuracy is achieved, analysis of the model’s error matrix shows that the few misclassified instances almost entirely fall close to their real class.
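
As an illustration of the Vote on Average Probabilities idea named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes each ensemble member outputs a probability for every score from 1 to 5; the function names and the example probabilities are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def vote_average_probabilities(
    member_probs: Sequence[Dict[int, float]],
) -> Dict[int, float]:
    """Average the per-class probabilities reported by each ensemble member."""
    if not member_probs:
        raise ValueError("need at least one member distribution")
    # Collect every class label that any member mentions.
    classes = sorted(set().union(*member_probs))
    n_members = len(member_probs)
    # A class missing from a member's output counts as probability 0.0.
    return {
        c: sum(p.get(c, 0.0) for p in member_probs) / n_members
        for c in classes
    }


def predict_score(member_probs: Sequence[Dict[int, float]]) -> int:
    """Return the score (1-5) with the highest averaged probability."""
    averaged = vote_average_probabilities(member_probs)
    # On an exact tie, the lowest score wins because classes are sorted.
    return max(averaged, key=averaged.get)


# Hypothetical outputs of three members for one review (illustrative only).
forest = {1: 0.05, 2: 0.10, 3: 0.20, 4: 0.40, 5: 0.25}
bayes = {1: 0.02, 2: 0.08, 3: 0.15, 4: 0.30, 5: 0.45}
mlp = {1: 0.05, 2: 0.05, 3: 0.10, 4: 0.45, 5: 0.35}

print(predict_score([forest, bayes, mlp]))
```

In this made-up case one member prefers a score of 5, but the averaged distribution favours 4, which is the kind of smoothing that averaging probabilities provides over taking any single member's top choice.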

Original language: English
Title of host publication: Intelligent Computing - Proceedings of the 2019 Computing Conference
Editors: Kohei Arai, Rahul Bhatia, Supriya Kapoor
Publisher: Springer-Verlag Wien
Pages: 593-606
Number of pages: 14
ISBN (Print): 9783030228705
DOIs: https://doi.org/10.1007/978-3-030-22871-2_40
Publication status: Published - 23 Jun 2019
Event: Computing Conference, 2019 - London, United Kingdom
Duration: 16 Jul 2019 to 17 Jul 2019

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 997
ISSN (Print): 2194-5357
ISSN (Electronic): 2194-5365

Conference

Conference: Computing Conference, 2019
Country: United Kingdom
City: London
Period: 16/07/19 to 17/07/19

Fingerprint

Classifiers
Adaptive boosting
Multilayer neural networks
Neural networks

Keywords

  • Classification
  • Ensemble learning
  • Machine learning
  • Opinion mining
  • Sentiment analysis

Cite this

Bird, J. J., Ekárt, A., Buckingham, C. D., & Faria, D. R. (2019). High Resolution Sentiment Analysis by Ensemble Classification. In K. Arai, R. Bhatia, & S. Kapoor (Eds.), Intelligent Computing - Proceedings of the 2019 Computing Conference (pp. 593-606). (Advances in Intelligent Systems and Computing; Vol. 997). Springer-Verlag Wien. https://doi.org/10.1007/978-3-030-22871-2_40
@inproceedings{317cf5a417b343f49c58317cdc532dd7,
title = "High Resolution Sentiment Analysis by Ensemble Classification",
abstract = "This study proposes an approach to ensemble sentiment classification of a text to a score in the range of 1–5 of negative-positive scoring. A high-performing model is produced from TripAdvisor restaurant reviews via a generated dataset of 684 word-stems, gathered by information gain attribute selection from the entire corpus. The best performing classification was an ensemble classifier of RandomForest, Naive Bayes Multinomial and Multilayer Perceptron (Neural Network) methods ensembled via a Vote on Average Probabilities approach. The best ensemble produced a classification accuracy of 91.02{\%} which scored higher than the best single classifier, a Random Tree model with an accuracy of 78.6{\%}. Other ensembles through Adaptive Boosting, Random Forests and Voting are explored with ten-fold cross-validation. All ensemble methods far outperformed the best single classifier methods. Even though extremely high results are achieved, analysis documents the few mis-classified instances as almost entirely being close to their real class via the model’s given error matrix.",
keywords = "Classification, Ensemble learning, Machine learning, Opinion mining, Sentiment analysis",
author = "Bird, {Jordan J.} and Anik{\'o} Ek{\'a}rt and Buckingham, {Christopher D.} and Faria, {Diego R.}",
year = "2019",
month = "6",
day = "23",
doi = "10.1007/978-3-030-22871-2_40",
language = "English",
isbn = "9783030228705",
series = "Advances in Intelligent Systems and Computing",
publisher = "Springer-Verlag Wien",
pages = "593--606",
editor = "Kohei Arai and Rahul Bhatia and Supriya Kapoor",
booktitle = "Intelligent Computing - Proceedings of the 2019 Computing Conference",
address = "Austria",

}

TY - GEN
T1 - High Resolution Sentiment Analysis by Ensemble Classification
AU - Bird, Jordan J.
AU - Ekárt, Anikó
AU - Buckingham, Christopher D.
AU - Faria, Diego R.
PY - 2019/6/23
Y1 - 2019/6/23

AB - This study proposes an approach to ensemble sentiment classification of a text to a score in the range of 1–5 of negative-positive scoring. A high-performing model is produced from TripAdvisor restaurant reviews via a generated dataset of 684 word-stems, gathered by information gain attribute selection from the entire corpus. The best performing classification was an ensemble classifier of RandomForest, Naive Bayes Multinomial and Multilayer Perceptron (Neural Network) methods ensembled via a Vote on Average Probabilities approach. The best ensemble produced a classification accuracy of 91.02% which scored higher than the best single classifier, a Random Tree model with an accuracy of 78.6%. Other ensembles through Adaptive Boosting, Random Forests and Voting are explored with ten-fold cross-validation. All ensemble methods far outperformed the best single classifier methods. Even though extremely high results are achieved, analysis documents the few mis-classified instances as almost entirely being close to their real class via the model’s given error matrix.

KW - Classification
KW - Ensemble learning
KW - Machine learning
KW - Opinion mining
KW - Sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=85069154045&partnerID=8YFLogxK
UR - https://link.springer.com/chapter/10.1007%2F978-3-030-22871-2_40
U2 - 10.1007/978-3-030-22871-2_40
DO - 10.1007/978-3-030-22871-2_40
M3 - Conference contribution
AN - SCOPUS:85069154045
SN - 9783030228705
T3 - Advances in Intelligent Systems and Computing
SP - 593
EP - 606
BT - Intelligent Computing - Proceedings of the 2019 Computing Conference
A2 - Arai, Kohei
A2 - Bhatia, Rahul
A2 - Kapoor, Supriya
PB - Springer-Verlag Wien
ER -
