TY - JOUR
T1 - An Inception Convolutional Autoencoder Model for Chinese Healthcare Question Clustering
AU - Dai, Dan
AU - Tang, Juan
AU - Yu, Zhiwen
AU - Wong, Hau-San
AU - You, Jane
AU - Cao, Wenming
AU - Hu, Yang
AU - Chen, C. L. Philip
PY - 2021/4
Y1 - 2021/4
N2 - Healthcare question answering (HQA) systems play a vital role in encouraging patients to seek professional consultation. However, learning and representing the question corpus of HQA datasets is challenging because of factors such as high dimensionality, sparseness, noise, and nonprofessional expression. To address these issues, we propose an inception convolutional autoencoder model for Chinese healthcare question clustering (ICAHC). First, we select a set of kernels with different sizes using convolutional autoencoder networks to explore both the diversity and quality in the clustering ensemble. These kernels thus encourage the model to capture diverse representations. Second, we design four ensemble operators to merge representations based on whether they are independent, and input them into the encoder using different skip connections. Third, the model maps features from the encoder into a lower-dimensional space, followed by clustering. We conduct comparative experiments against other clustering algorithms on a Chinese healthcare dataset. Experimental results show the effectiveness of ICAHC in discovering better clustering solutions. The results can be used in the prediction of patients’ conditions and the development of an automatic HQA system.
AB - Healthcare question answering (HQA) systems play a vital role in encouraging patients to seek professional consultation. However, learning and representing the question corpus of HQA datasets is challenging because of factors such as high dimensionality, sparseness, noise, and nonprofessional expression. To address these issues, we propose an inception convolutional autoencoder model for Chinese healthcare question clustering (ICAHC). First, we select a set of kernels with different sizes using convolutional autoencoder networks to explore both the diversity and quality in the clustering ensemble. These kernels thus encourage the model to capture diverse representations. Second, we design four ensemble operators to merge representations based on whether they are independent, and input them into the encoder using different skip connections. Third, the model maps features from the encoder into a lower-dimensional space, followed by clustering. We conduct comparative experiments against other clustering algorithms on a Chinese healthcare dataset. Experimental results show the effectiveness of ICAHC in discovering better clustering solutions. The results can be used in the prediction of patients’ conditions and the development of an automatic HQA system.
UR - https://ieeexplore.ieee.org/document/8730479
U2 - 10.1109/TCYB.2019.2916580
DO - 10.1109/TCYB.2019.2916580
M3 - Article
SN - 2168-2267
VL - 51
SP - 2019
EP - 2031
JO - IEEE Transactions on Cybernetics
JF - IEEE Transactions on Cybernetics
IS - 4
ER -