Abstract
Multi-class labelling in the absence of ground truth is a known hard problem in computational intelligence. The problem is amplified in e-commerce by both the high volume and the high velocity of information. Specifically, it is hard to obtain labels for mass online reviews, rendering them unsuitable for supervised learning. To date, the most sought-after solution is manual labelling, which remains a labour-intensive and time-consuming task. The purpose of this study is to develop an end-to-end approach for identifying the sources of quality-stimulated customer dissatisfaction and automatically assigning them in the context of e-commerce. This objective is achieved by using a novel ensemble-based semi-supervised pseudo-labelling technique on a large corpus of Amazon.com reviews. As a first step, a subset is manually labelled; an ensemble approach that retains the commonly assigned (pseudo) class is then used to iteratively label the entire dataset. We then apply Large Language Models (LLMs) and Deep Learning (DL) architectures to the (pseudo) labelled data to solve a multi-class classification problem. We contrast these with baseline machine learning models and show a statistically significant improvement, with pre-trained transformer models demonstrating the best performance. Our approach offers a roadmap for automatically identifying sources of quality-related dissatisfaction in e-commerce channels using an amalgamation of ensemble and sophisticated computational techniques. We believe that our approach, if adopted, can bolster grievance redressal for online customers.
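The iterative pseudo-labelling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the ensemble members or the exact agreement rule, so the two toy classifiers, the unanimous-agreement criterion, the function names, and the example class labels below are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def fit_word_votes(texts, labels):
    """Toy model 1 (assumed): each known word votes for the classes it was seen with."""
    word_classes = defaultdict(Counter)
    for text, label in zip(texts, labels):
        for tok in set(tokenize(text)):
            word_classes[tok][label] += 1

    def predict(text):
        votes = Counter()
        for tok in tokenize(text):
            votes.update(word_classes.get(tok, {}))
        return votes.most_common(1)[0][0] if votes else None

    return predict


def fit_nearest_neighbour(texts, labels):
    """Toy model 2 (assumed): label of the training text with the highest Jaccard overlap."""
    train = [(set(tokenize(t)), label) for t, label in zip(texts, labels)]

    def predict(text):
        toks = set(tokenize(text))
        best_score, best_label = 0.0, None
        for train_toks, label in train:
            union = toks | train_toks
            if not union:
                continue
            score = len(toks & train_toks) / len(union)
            if score > best_score:
                best_score, best_label = score, label
        return best_label

    return predict


def pseudo_label(seed_texts, seed_labels, unlabelled, fitters, max_rounds=10):
    """Grow the labelled set by keeping only items on which all models agree."""
    texts, labels = list(seed_texts), list(seed_labels)
    pool = list(unlabelled)
    for _ in range(max_rounds):
        if not pool:
            break
        # Refit every ensemble member on the current (seed + pseudo) labels.
        models = [fit(texts, labels) for fit in fitters]
        remaining = []
        for text in pool:
            predictions = {model(text) for model in models}
            if len(predictions) == 1 and None not in predictions:
                texts.append(text)
                labels.append(predictions.pop())
            else:
                remaining.append(text)
        if len(remaining) == len(pool):
            break  # no new agreement in this round, so stop iterating
        pool = remaining
    return texts, labels, pool


# Hypothetical seed labels and reviews, for illustration only.
seed_texts = ["screen arrived cracked", "battery died quickly", "package arrived late"]
seed_labels = ["damage", "defect", "delivery"]
unlabelled = ["cracked screen", "battery died", "xyz"]

texts, labels, leftover = pseudo_label(
    seed_texts, seed_labels, unlabelled, [fit_word_votes, fit_nearest_neighbour]
)
```

Reviews on which the models disagree, or for which no model produces a prediction, stay in `leftover` rather than receiving a label. In a fuller pipeline the resulting `texts` and `labels` would then serve as training data for the downstream multi-class classifiers.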
| Original language | English |
|---|---|
| Pages (from-to) | 545-574 |
| Number of pages | 30 |
| Journal | Annals of Operations Research |
| Volume | 353 |
| Issue number | 2 |
| Early online date | 8 Sept 2025 |
| DOIs | |
| Publication status | Published - 1 Oct 2025 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
Keywords
- Customer dissatisfaction
- Deep learning
- E-Commerce
- Ensemble learning
- Large language models
- Pseudo-labelling
- Semi-supervised learning
- Transformers