This article presents two novel approaches for incorporating sentiment prior knowledge into the topic model for weakly supervised sentiment analysis, where sentiment labels are treated as topics. The first modifies the Dirichlet prior for the topic-word distribution (LDA-DP); the second augments the model objective function with terms that express preferences on the expected sentiment labels of lexicon words, using generalized expectation criteria (LDA-GE). We conducted extensive experiments on English movie review data and the multi-domain sentiment dataset, as well as on Chinese product reviews of mobile phones, digital cameras, MP3 players, and monitors. The results show that while both LDA-DP and LDA-GE perform comparably to existing weakly supervised sentiment classification algorithms, they are much simpler and more computationally efficient, rendering them more suitable for online and real-time sentiment classification on the Web. We observed that LDA-GE is more effective than LDA-DP, suggesting that it should be preferred when employing the topic model for sentiment analysis. Moreover, both models are able to extract highly domain-salient polarity words from text.
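As a rough illustration of the LDA-DP idea only, the sketch below builds an asymmetric Dirichlet prior over words for each sentiment topic from a small polarity lexicon. The function name `build_sentiment_prior`, the toy lexicon, and the values of `base`, `boost`, and `damp` are all assumptions made for illustration; the abstract does not give the actual prior values or procedure used in the article.

```python
def build_sentiment_prior(vocab, lexicon, labels, base=0.01, boost=10.0, damp=0.1):
    """Return prior[label][word], a Dirichlet pseudo-count per sentiment topic.

    Hypothetical scheme (not taken from the article):
      - words absent from the lexicon keep the symmetric value `base`;
      - a lexicon word gets `base * boost` under its own sentiment label
        and `base * damp` under every other label.
    All values stay strictly positive, as a Dirichlet prior requires.
    """
    prior = {label: {} for label in labels}
    for word in vocab:
        polarity = lexicon.get(word)  # None if the word is not a lexicon word
        for label in labels:
            if polarity is None:
                prior[label][word] = base
            elif polarity == label:
                prior[label][word] = base * boost
            else:
                prior[label][word] = base * damp
    return prior


# Toy inputs, invented for this example.
vocab = ["good", "bad", "camera"]
lexicon = {"good": "pos", "bad": "neg"}
prior = build_sentiment_prior(vocab, lexicon, labels=["pos", "neg"])
```

With a prior of this shape, a sampler for the topic model would be nudged toward assigning "good" to the positive topic and "bad" to the negative one, while neutral words such as "camera" remain free to be assigned by the data.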
|Journal||ACM Transactions on Asian Language Information Processing|
|Publication status||Published - Jun 2012|