This paper addresses the problem of forecasting consumer expenditure from social media data. Previous research on the topic exploited the intuition that search engine traffic reflects purchase intentions, and constructed predictive models of consumer behaviour from search query volumes. In contrast, we derive predictors from explicit expressions of purchase intentions found in social media posts. We explore two types of predictors created from these expressions: those based on word embeddings and those based on topical word clusters. We introduce a new clustering method that takes into account the temporal co-occurrence of words, in addition to their semantic similarity, in order to create predictors relevant to the forecasting problem. The predictors are evaluated against baselines that use only macroeconomic variables and against models trained on search traffic data. In experiments with three different regression methods on Facebook and Twitter data, we find that both word embeddings and word clusters help to reduce forecasting errors compared with purely macroeconomic models. In most experimental settings, the error reduction is statistically significant and is comparable to the error reduction achieved with search traffic variables.
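The abstract does not spell out the clustering method, so the following is only an illustrative sketch of the general idea of combining semantic similarity with temporal co-occurrence. Everything in it is an assumption made for illustration and is not taken from the paper: the weighted-sum score, the `alpha` and `threshold` parameters, the use of Pearson correlation between per-period word counts as the temporal signal, the simple threshold-based grouping, and the toy data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def pearson(x, y):
    """Pearson correlation between two equally long count series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def combined_similarity(w1, w2, embeddings, counts, alpha=0.5):
    """Hypothetical score: weighted sum of semantic and temporal similarity."""
    semantic = cosine(embeddings[w1], embeddings[w2])
    temporal = pearson(counts[w1], counts[w2])
    return alpha * semantic + (1 - alpha) * temporal


def cluster_words(words, embeddings, counts, alpha=0.5, threshold=0.6):
    """Group words whose combined similarity reaches the threshold.

    Uses union-find, so clusters are the connected components of the
    thresholded similarity graph (an assumed, deliberately simple scheme).
    """
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path compression
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if combined_similarity(a, b, embeddings, counts, alpha) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for w in words:
        clusters.setdefault(find(w), []).append(w)
    return sorted(sorted(c) for c in clusters.values())


# Toy data (invented): 2-d embeddings and word counts over four periods.
embeddings = {
    "laptop": [1.0, 0.1],
    "notebook": [0.9, 0.2],
    "holiday": [0.1, 1.0],
    "flight": [0.2, 0.9],
}
counts = {
    "laptop": [5, 7, 9, 4],
    "notebook": [4, 6, 8, 3],
    "holiday": [9, 3, 2, 8],
    "flight": [8, 4, 1, 9],
}

clusters = cluster_words(list(embeddings), embeddings, counts)
print(clusters)
```

In a sketch like this, the per-period frequency of each resulting cluster could then serve as one candidate predictor alongside the macroeconomic variables. Raising `alpha` weights semantic similarity more heavily, and lowering it favours words whose usage rises and falls together over time.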
|Title of host publication||Proceedings of the Fourteenth International AAAI Conference on Web and Social Media|
|Number of pages||12|
|Publication status||Accepted/In press - 1 May 2020|
|Event||14th International Conference on Web and Social Media - Atlanta, United States (Duration: 8 Jun 2020 → 11 Jun 2020; Conference number: 14)|
|Name||Proceedings of the International AAAI Conference on Web and Social Media|
|Conference||14th International Conference on Web and Social Media|
|Period||8/06/20 → 11/06/20|
Bibliographical note
Copyright 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
- Social media
- Natural Language Processing
- Macroeconomic forecasting