TY - GEN
T1 - Learning task specific distributed paragraph representations using a 2-tier convolutional neural network
AU - Chen, Tao
AU - Xu, Ruifeng
AU - He, Yulan
AU - Wang, Xuan
PY - 2015/11/12
Y1 - 2015/11/12
N2 - We introduce a 2-tier convolutional neural network model for learning distributed paragraph representations for a specific task (e.g., paragraph- or short-document-level sentiment analysis and text topic categorization). We decompose paragraph semantics into three cascaded constituents: word representation, sentence composition and document composition. Specifically, we learn distributed word representations with a continuous bag-of-words model from a large unstructured text corpus. Then, using these word representations as pre-trained vectors, distributed task-specific sentence representations are learned from a sentence-level corpus with task-specific labels by the first tier of our model. Using these sentence representations as inputs, distributed paragraph representations are learned from a paragraph-level corpus by the second tier of our model. The model is evaluated on the DBpedia ontology classification dataset and the Amazon review dataset. Empirical results show the effectiveness of the proposed model for learning distributed paragraph representations.
AB - We introduce a 2-tier convolutional neural network model for learning distributed paragraph representations for a specific task (e.g., paragraph- or short-document-level sentiment analysis and text topic categorization). We decompose paragraph semantics into three cascaded constituents: word representation, sentence composition and document composition. Specifically, we learn distributed word representations with a continuous bag-of-words model from a large unstructured text corpus. Then, using these word representations as pre-trained vectors, distributed task-specific sentence representations are learned from a sentence-level corpus with task-specific labels by the first tier of our model. Using these sentence representations as inputs, distributed paragraph representations are learned from a paragraph-level corpus by the second tier of our model. The model is evaluated on the DBpedia ontology classification dataset and the Amazon review dataset. Empirical results show the effectiveness of the proposed model for learning distributed paragraph representations.
KW - convolutional neural network
KW - distributed representation
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=84952778703&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-26532-2_51
DO - 10.1007/978-3-319-26532-2_51
M3 - Conference publication
AN - SCOPUS:84952778703
SN - 978-3-319-26531-5
T3 - Lecture Notes in Computer Science
SP - 467
EP - 475
BT - Neural Information Processing
PB - Springer
CY - Cham (CH)
T2 - 22nd International Conference on Neural Information Processing
Y2 - 9 November 2015 through 12 November 2015
ER -