Unsupervised event exploration from social text streams

Research output: Contribution to journal (Article)

Abstract

Social media provides unprecedented opportunities for people to disseminate information and share their opinions and views online. Extracting events from social media platforms such as Twitter could help in understanding what is being discussed. However, event extraction from social text streams poses huge challenges due to the noisy nature of social media posts and the dynamic evolution of language. We propose a generic unsupervised framework for exploring events on Twitter which consists of four major steps: filtering, pre-processing, extraction and categorization, and post-processing. In the filtering step, tweets published in a certain time period are aggregated and noisy tweets which do not contain newsworthy events are filtered out. The remaining tweets are pre-processed by temporal resolution, part-of-speech tagging and named entity recognition in order to identify the key elements of events. An unsupervised Bayesian model is proposed to automatically extract structured representations of events in the form of quadruples <entity, keyword, date, location> and to further categorize the extracted events into event types. Finally, the categorized events are assigned event type labels without human intervention. The proposed framework has been evaluated on over 60 million tweets collected over one month, December 2010. A precision of 78.01% is achieved for event extraction using our proposed Bayesian model, outperforming a competitive baseline by nearly 13.6%. Moreover, events are clustered into coherent groups with automatically assigned event type labels, with an accuracy of 42.57%.
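
The quadruple representation mentioned in the abstract can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions, not the authors' Bayesian model: the `Event` class, the `filter_tweets` length heuristic, the annotation field names and the sample tweets are all hypothetical, and the sketch assumes the pre-processing step has already attached entity, keyword, date and location to each tweet.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Structured event representation: <entity, keyword, date, location>."""
    entity: str
    keyword: str
    date: str
    location: Optional[str] = None


def filter_tweets(tweets, min_tokens=4):
    # Hypothetical stand-in for the filtering step: drop very short posts.
    # The paper's actual filtering criterion is not given in the abstract.
    return [t for t in tweets if len(t["text"].split()) >= min_tokens]


def to_event(tweet):
    # Assumes pre-processing (temporal resolution, POS tagging, NER) has
    # already annotated the tweet; these field names are illustrative only.
    return Event(tweet["entity"], tweet["keyword"], tweet["date"],
                 tweet.get("location"))


def count_events(tweets):
    # Count how many retained tweets support each quadruple.
    return Counter(to_event(t) for t in filter_tweets(tweets))


# Invented sample data, for illustration only.
tweets = [
    {"text": "Heavy snow closes the main airport today", "entity": "ExampleAirport",
     "keyword": "snow", "date": "2010-12-18", "location": "ExampleCity"},
    {"text": "Airport shut because of snow again", "entity": "ExampleAirport",
     "keyword": "snow", "date": "2010-12-18", "location": "ExampleCity"},
    {"text": "so bored", "entity": "", "keyword": "bored", "date": "2010-12-18"},
]
events = count_events(tweets)
```

Making `Event` a frozen dataclass keeps each quadruple hashable, so identical events reported by different tweets collapse onto one key when counted.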

Documents

  • Unsupervised event exploration from social

    Rights statement: Copyright: 2017 – IOS Press and the authors. The final publication is available at IOS Press through http://dx.doi.org/10.3233/IDA-160048

    Accepted author manuscript, 3 MB, PDF-document

Details

Original language: English
Pages (from-to): 849-866
Number of pages: 18
Journal: Intelligent Data Analysis
Volume: 21
Issue: 4
DOIs: http://dx.doi.org/10.3233/IDA-160048
State: Published - 19 Aug 2017

Bibliographic note

Funding: This work was funded by the National Natural Science Foundation of China (61528302), the Natural Science Foundation of Jiangsu Province of China (BK20161430), Innovate UK under grant number 101779, and the Collaborative Innovation Center of Wireless Communications Technology.

Keywords

  • Bayesian model, event extraction, social media, unsupervised learning
