Abstract
We propose a novel framework in which an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon. Preferences on expectations of sentiment labels of those lexicon words are expressed using generalized expectation criteria. Documents classified with high confidence are then used as pseudo-labeled examples for automatic domain-specific feature acquisition. The word-class distributions of such self-learned features are estimated from the pseudo-labeled examples and are used to train another classifier by constraining the model's predictions on unlabeled instances. Experiments on both the movie review data and the multi-domain sentiment dataset show that our approach attains comparable or better performance than existing weakly-supervised sentiment classification methods despite using no labeled documents.
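The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the generalized expectation criteria are replaced here by simple lexicon vote counting for the initial classifier and by a smoothed log-odds score for the second one, purely to show the data flow. The toy `LEXICON`, the function names, and the `threshold` and `min_count` parameters are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the paper draws on an existing sentiment lexicon.
LEXICON = {"good": "pos", "great": "pos", "bad": "neg", "awful": "neg"}


def lexicon_score(tokens):
    """Initial classifier (stand-in): majority vote of lexicon words.

    Returns (label, confidence), or (None, 0.0) if no lexicon word occurs.
    """
    pos = sum(1 for t in tokens if LEXICON.get(t) == "pos")
    neg = sum(1 for t in tokens if LEXICON.get(t) == "neg")
    total = pos + neg
    if total == 0:
        return None, 0.0
    label = "pos" if pos >= neg else "neg"
    return label, max(pos, neg) / total


def pseudo_label(docs, threshold=0.9):
    """Keep only documents the initial classifier labels with high confidence."""
    labeled = []
    for tokens in docs:
        label, conf = lexicon_score(tokens)
        if label is not None and conf >= threshold:
            labeled.append((tokens, label))
    return labeled


def word_class_distributions(pseudo_labeled, min_count=2):
    """Estimate P(class | word) from pseudo-labeled documents.

    Uses document frequency per class with add-one smoothing; words seen in
    fewer than min_count pseudo-labeled documents are dropped.
    """
    counts = {"pos": Counter(), "neg": Counter()}
    for tokens, label in pseudo_labeled:
        counts[label].update(set(tokens))
    dist = {}
    for word in set(counts["pos"]) | set(counts["neg"]):
        p, n = counts["pos"][word], counts["neg"][word]
        if p + n >= min_count:
            dist[word] = {"pos": (p + 1) / (p + n + 2),
                          "neg": (n + 1) / (p + n + 2)}
    return dist


def classify(tokens, dist):
    """Second classifier (stand-in): sum of log-odds of self-learned features."""
    score = sum(math.log(dist[t]["pos"] / dist[t]["neg"])
                for t in tokens if t in dist)
    return "pos" if score >= 0 else "neg"
```

In this sketch, domain words such as "plot" or "pacing" that co-occur with lexicon words in confidently labeled documents become features of the second classifier, so it can label documents that contain no lexicon word at all.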
Original language | English |
---|---|
Title of host publication | Proceeding : CIKM '10 proceedings of the 19th ACM international conference on information and knowledge management |
Place of Publication | New York (US) |
Publisher | ACM |
Pages | 1685-1688 |
Number of pages | 4 |
ISBN (Print) | 978-1-4503-0099-5 |
Publication status | Published - 2010 |
Event | 19th ACM international conference on information and knowledge management, CIKM '10 - Toronto, Canada. Duration: 26 Oct 2010 → 30 Oct 2010 |
Conference
Conference | 19th ACM international conference on information and knowledge management, CIKM '10 |
---|---|
Country/Territory | Canada |
City | Toronto |
Period | 26/10/10 → 30/10/10 |
Keywords
- sentiment analysis
- opinion mining
- generalized expectation
- self-learned features
- weakly-supervised classification