Incorporating sentiment prior knowledge for weakly supervised sentiment analysis

Yulan He

doi:10.1145/2184436.2184437

Computer Science Research Group

Research output: Contribution to journal › Article › peer-review

Abstract

This article presents two novel approaches for incorporating sentiment prior knowledge into the topic model for weakly supervised sentiment analysis where sentiment labels are considered as topics. One is by modifying the Dirichlet prior for topic-word distribution (LDA-DP), the other is by augmenting the model objective function through adding terms that express preferences on expectations of sentiment labels of the lexicon words using generalized expectation criteria (LDA-GE). We conducted extensive experiments on English movie review data and multi-domain sentiment dataset as well as Chinese product reviews about mobile phones, digital cameras, MP3 players, and monitors. The results show that while both LDA-DP and LDA-GE perform comparably to existing weakly supervised sentiment classification algorithms, they are much simpler and computationally efficient, rendering them more suitable for online and real-time sentiment classification on the Web. We observed that LDA-GE is more effective than LDA-DP, suggesting that it should be preferred when considering employing the topic model for sentiment analysis. Moreover, both models are able to extract highly domain-salient polarity words from text.

Original language	English
Article number	4
Journal	ACM Transactions on Asian Language Information Processing
Volume	11
Issue number	2
DOIs	https://doi.org/10.1145/2184436.2184437
Publication status	Published - Jun 2012

Cite this

@article{c93819552da745f386af80df3771b0ae,

title = "Incorporating sentiment prior knowledge for weakly supervised sentiment analysis",

abstract = "This article presents two novel approaches for incorporating sentiment prior knowledge into the topic model for weakly supervised sentiment analysis where sentiment labels are considered as topics. One is by modifying the Dirichlet prior for topic-word distribution (LDA-DP), the other is by augmenting the model objective function through adding terms that express preferences on expectations of sentiment labels of the lexicon words using generalized expectation criteria (LDA-GE). We conducted extensive experiments on English movie review data and multi-domain sentiment dataset as well as Chinese product reviews about mobile phones, digital cameras, MP3 players, and monitors. The results show that while both LDA-DP and LDA-GE perform comparably to existing weakly supervised sentiment classification algorithms, they are much simpler and computationally efficient, rendering them more suitable for online and real-time sentiment classification on the Web. We observed that LDA-GE is more effective than LDA-DP, suggesting that it should be preferred when considering employing the topic model for sentiment analysis. Moreover, both models are able to extract highly domain-salient polarity words from text.",

author = "Yulan He",

year = "2012",

month = jun,

doi = "10.1145/2184436.2184437",

language = "English",

volume = "11",

journal = "ACM Transactions on Asian Language Information Processing",

issn = "1530-0226",

publisher = "ACM",

number = "2",

}

TY - JOUR

T1 - Incorporating sentiment prior knowledge for weakly supervised sentiment analysis

AU - He, Yulan

PY - 2012/6

Y1 - 2012/6

N2 - This article presents two novel approaches for incorporating sentiment prior knowledge into the topic model for weakly supervised sentiment analysis where sentiment labels are considered as topics. One is by modifying the Dirichlet prior for topic-word distribution (LDA-DP), the other is by augmenting the model objective function through adding terms that express preferences on expectations of sentiment labels of the lexicon words using generalized expectation criteria (LDA-GE). We conducted extensive experiments on English movie review data and multi-domain sentiment dataset as well as Chinese product reviews about mobile phones, digital cameras, MP3 players, and monitors. The results show that while both LDA-DP and LDA-GE perform comparably to existing weakly supervised sentiment classification algorithms, they are much simpler and computationally efficient, rendering them more suitable for online and real-time sentiment classification on the Web. We observed that LDA-GE is more effective than LDA-DP, suggesting that it should be preferred when considering employing the topic model for sentiment analysis. Moreover, both models are able to extract highly domain-salient polarity words from text.

AB - This article presents two novel approaches for incorporating sentiment prior knowledge into the topic model for weakly supervised sentiment analysis where sentiment labels are considered as topics. One is by modifying the Dirichlet prior for topic-word distribution (LDA-DP), the other is by augmenting the model objective function through adding terms that express preferences on expectations of sentiment labels of the lexicon words using generalized expectation criteria (LDA-GE). We conducted extensive experiments on English movie review data and multi-domain sentiment dataset as well as Chinese product reviews about mobile phones, digital cameras, MP3 players, and monitors. The results show that while both LDA-DP and LDA-GE perform comparably to existing weakly supervised sentiment classification algorithms, they are much simpler and computationally efficient, rendering them more suitable for online and real-time sentiment classification on the Web. We observed that LDA-GE is more effective than LDA-DP, suggesting that it should be preferred when considering employing the topic model for sentiment analysis. Moreover, both models are able to extract highly domain-salient polarity words from text.

UR - http://www.scopus.com/inward/record.url?scp=84863688889&partnerID=8YFLogxK

UR - http://dl.acm.org/citation.cfm?doid=2184436.2184437

U2 - 10.1145/2184436.2184437

DO - 10.1145/2184436.2184437

M3 - Article

AN - SCOPUS:84863688889

SN - 1530-0226

VL - 11

JO - ACM Transactions on Asian Language Information Processing

JF - ACM Transactions on Asian Language Information Processing

IS - 2

M1 - 4

ER -