Sarcasm detection in microblogs using Naïve Bayes and fuzzy clustering

Shubhadeep Mukherjee; Pradip Kumar Bala

doi:10.1016/j.techsoc.2016.10.003

Shubhadeep Mukherjee^*, Pradip Kumar Bala

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Sarcasm detection in online text is a task of growing importance in the globalized world. Large corporations want to know how consumers perceive their products, based on analysis of microblogs such as Twitter. These reviews, comments, and posts are under constant threat of being classified in the wrong category because of sarcasm in the sentences. Automatic detection of sarcasm in microblogs such as Twitter is a difficult task. It requires a system that can use some knowledge to interpret the linguistic styles of authors. In this work, we try to provide this knowledge to the system by considering different sets of features that are relatively independent of the text, namely function words and part-of-speech n-grams. We test a range of feature sets using the Naïve Bayes and fuzzy clustering algorithms. Our results show that the sarcasm detection task benefits from the inclusion of features that capture the authorial style of microblog authors. We achieve an accuracy of approximately 65%, which is on the higher side of the sarcasm detection literature.
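As an illustration only, the idea of classifying microblog posts from function-word features with Naïve Bayes might be sketched as follows. This is not the authors' implementation: the function-word list, the whitespace tokenizer, the add-one smoothing, and all function names are assumptions made for the example, and the part-of-speech n-gram features and the fuzzy clustering step are not covered.

```python
import math
from collections import Counter

# Hypothetical, tiny function-word list chosen for illustration;
# the list used in the paper is not given in this record.
FUNCTION_WORDS = ["the", "a", "of", "to", "and", "so", "just", "oh", "i", "that"]


def function_word_counts(text):
    """Return per-word counts of FUNCTION_WORDS in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
    return [counts[w] for w in FUNCTION_WORDS]


def train_naive_bayes(texts, labels):
    """Fit a multinomial Naive Bayes model with add-one (Laplace) smoothing."""
    log_prior = {}
    log_likelihood = {}
    for c in sorted(set(labels)):
        docs = [function_word_counts(t) for t, y in zip(texts, labels) if y == c]
        log_prior[c] = math.log(len(docs) / len(texts))
        # Total count of each function word across all posts of class c.
        totals = [sum(col) for col in zip(*docs)]
        denom = sum(totals) + len(FUNCTION_WORDS)
        log_likelihood[c] = [math.log((n + 1) / denom) for n in totals]
    return log_prior, log_likelihood


def predict(model, text):
    """Return the class with the highest posterior log-score for the text."""
    log_prior, log_likelihood = model
    x = function_word_counts(text)
    scores = {
        c: log_prior[c] + sum(n * ll for n, ll in zip(x, log_likelihood[c]))
        for c in log_prior
    }
    return max(scores, key=scores.get)
```

To use the sketch, call `train_naive_bayes` with a list of post texts and a parallel list of labels (for example `"sarcastic"` and `"literal"`), then pass the returned model and a new post to `predict`. Because the features ignore content words, the classifier relies only on stylistic signal, which is the property the abstract describes as being relatively independent of the text.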

Original language	English
Pages (from-to)	19-27
Number of pages	9
Journal	Technology in Society
Volume	48
DOIs	https://doi.org/10.1016/j.techsoc.2016.10.003
Publication status	Published - 1 Feb 2017

Cite this

@article{a0e5a80e754049ad87af18fd001a418c,

title = "Sarcasm detection in microblogs using Na{\"i}ve Bayes and fuzzy clustering",

abstract = "Sarcasm detection in online text is a task of growing importance in the globalized world. Large corporations want to know how consumers perceive their products, based on analysis of microblogs such as Twitter. These reviews, comments, and posts are under constant threat of being classified in the wrong category because of sarcasm in the sentences. Automatic detection of sarcasm in microblogs such as Twitter is a difficult task. It requires a system that can use some knowledge to interpret the linguistic styles of authors. In this work, we try to provide this knowledge to the system by considering different sets of features that are relatively independent of the text, namely function words and part-of-speech n-grams. We test a range of feature sets using the Na{\"i}ve Bayes and fuzzy clustering algorithms. Our results show that the sarcasm detection task benefits from the inclusion of features that capture the authorial style of microblog authors. We achieve an accuracy of approximately 65\%, which is on the higher side of the sarcasm detection literature.",

author = "Mukherjee, Shubhadeep and Bala, {Pradip Kumar}",

year = "2017",

month = feb,

day = "1",

doi = "10.1016/j.techsoc.2016.10.003",

language = "English",

journal = "Technology in Society",

issn = "0160-791X",

volume = "48",

pages = "19--27",

}

TY - JOUR

T1 - Sarcasm detection in microblogs using Naïve Bayes and fuzzy clustering

AU - Mukherjee, Shubhadeep

AU - Bala, Pradip Kumar

PY - 2017/2/1

Y1 - 2017/2/1

N2 - Sarcasm detection in online text is a task of growing importance in the globalized world. Large corporations want to know how consumers perceive their products, based on analysis of microblogs such as Twitter. These reviews, comments, and posts are under constant threat of being classified in the wrong category because of sarcasm in the sentences. Automatic detection of sarcasm in microblogs such as Twitter is a difficult task. It requires a system that can use some knowledge to interpret the linguistic styles of authors. In this work, we try to provide this knowledge to the system by considering different sets of features that are relatively independent of the text, namely function words and part-of-speech n-grams. We test a range of feature sets using the Naïve Bayes and fuzzy clustering algorithms. Our results show that the sarcasm detection task benefits from the inclusion of features that capture the authorial style of microblog authors. We achieve an accuracy of approximately 65%, which is on the higher side of the sarcasm detection literature.

AB - Sarcasm detection in online text is a task of growing importance in the globalized world. Large corporations want to know how consumers perceive their products, based on analysis of microblogs such as Twitter. These reviews, comments, and posts are under constant threat of being classified in the wrong category because of sarcasm in the sentences. Automatic detection of sarcasm in microblogs such as Twitter is a difficult task. It requires a system that can use some knowledge to interpret the linguistic styles of authors. In this work, we try to provide this knowledge to the system by considering different sets of features that are relatively independent of the text, namely function words and part-of-speech n-grams. We test a range of feature sets using the Naïve Bayes and fuzzy clustering algorithms. Our results show that the sarcasm detection task benefits from the inclusion of features that capture the authorial style of microblog authors. We achieve an accuracy of approximately 65%, which is on the higher side of the sarcasm detection literature.

UR - http://www.scopus.com/inward/record.url?scp=84995487650&partnerID=8YFLogxK

UR - https://www.sciencedirect.com/science/article/pii/S0160791X16300070?via%3Dihub

U2 - 10.1016/j.techsoc.2016.10.003

DO - 10.1016/j.techsoc.2016.10.003

M3 - Article

AN - SCOPUS:84995487650

SN - 0160-791X

VL - 48

SP - 19

EP - 27

JO - Technology in Society

JF - Technology in Society

ER -
