Exploring differential topic models for comparative summarization of scientific papers

Lei He, Wei Li, Hai Zhuge

Research output: Chapter in Book/Published conference output › Conference publication

Abstract

This paper investigates differential topic models (dTM) for summarizing the differences among document groups. Starting from a simple probabilistic generative model, we propose dTM-SAGE, which explicitly models the deviations of group-specific word distributions from a background word distribution to indicate how words are used differently across document groups. This makes it more effective at capturing the unique characteristics needed to compare document groups. To generate dTM-based comparative summaries, we propose two sentence scoring methods for measuring the discriminative capacity of a sentence. Experimental results on a dataset of scientific papers show that our dTM-based comparative summarization methods significantly outperform the generic baselines and the state-of-the-art comparative summarization methods under ROUGE metrics.
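
The idea of modeling group-specific deviations from a background word distribution can be illustrated with a small sketch. This is not the authors' implementation: it assumes a SAGE-style additive form in log space, p(w | g) proportional to exp(m_w + eta_{g,w}), and the sentence score shown (mean deviation of the sentence's words) is a hypothetical stand-in, since the abstract does not specify the two scoring methods. The vocabulary, the numbers, and the names `sage_word_distribution` and `score_sentence` are made up for illustration.

```python
import numpy as np

def sage_word_distribution(background_log_freq, deviation):
    """Assumed SAGE-style form: p(w | g) is proportional to exp(m_w + eta_{g,w})."""
    logits = background_log_freq + deviation
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

def score_sentence(tokens, vocab_index, deviation):
    """Hypothetical score: mean group deviation over in-vocabulary tokens."""
    ids = [vocab_index[t] for t in tokens if t in vocab_index]
    if not ids:
        return 0.0  # no known words, so no evidence either way
    return float(np.mean(deviation[ids]))

# Toy vocabulary and values, invented for this example.
vocab = ["model", "topic", "graph", "summary"]
vocab_index = {w: i for i, w in enumerate(vocab)}
background = np.log(np.array([0.4, 0.3, 0.2, 0.1]))  # m: background log-frequencies
eta_group_a = np.array([0.0, 1.0, -1.0, 0.0])        # eta: deviations for one group

p_a = sage_word_distribution(background, eta_group_a)
score_topic = score_sentence(["topic", "summary"], vocab_index, eta_group_a)
score_graph = score_sentence(["graph"], vocab_index, eta_group_a)
```

In this toy setting, words with a positive deviation ("topic") gain probability in the group relative to the background, and sentences dominated by such words receive higher scores, which is the intuition behind ranking sentences by how discriminative they are for a group.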
Original language: English
Title of host publication: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics
Subtitle of host publication: technical papers
Publisher: Association for Computational Linguistics
Pages: 1028-1038
Number of pages: 10
ISBN (Print): 978-4-87974-702-0
Publication status: Published - 11 Dec 2016
Event: 26th International Conference on Computational Linguistics: COLING 2016 - Osaka, Japan
Duration: 11 Dec 2016 to 16 Dec 2016
Conference number: 26

Conference

Conference: 26th International Conference on Computational Linguistics
Abbreviated title: COLING 2016
Country: Japan
City: Osaka
Period: 11/12/16 to 16/12/16

Bibliographical note

This work is licenced under a Creative Commons Attribution 4.0 International License. License details: http://creativecommons.org/licenses/by/4.0/
