Summarization of scientific documents by detecting common facts in citations

Jingqiang Chen, Hai Zhuge*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Reading scientific articles is more time-consuming than reading news because readers need to search for and read many citations. This paper proposes a citation-guided method for summarizing multiple scientific papers. We observe that citation sentences in the same paragraph or section usually discuss a common fact, which is typically represented as a set of noun phrases co-occurring in the citation texts and is discussed from different aspects. We design a multi-document summarization system based on common fact detection. One challenge is that citations may not use the same terms to refer to a common fact. We therefore use a term association discovery algorithm to expand terms based on a large set of scientific article abstracts. Citations can then be clustered by common fact. Each common fact is used as a salient term set to retrieve relevant sentences from the corresponding cited articles to form a summary. Experiments show that our method outperforms three baseline methods on the ROUGE metric.
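The pipeline outlined in the abstract (expand terms, cluster citations by common fact, then pull matching sentences from the cited articles) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `associations` lookup, the Jaccard similarity, the greedy clustering, and the `threshold` and `min_support` values are all assumptions made for illustration, and noun phrases are assumed to have been extracted already.

```python
from collections import Counter


def expand(terms, associations):
    """Add associated terms to a term set (associations is a hypothetical lookup)."""
    expanded = set(terms)
    for term in terms:
        expanded |= associations.get(term, set())
    return expanded


def jaccard(a, b):
    """Overlap of two term sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_citations(citations, associations, threshold=0.3):
    """Greedy single pass: join the first cluster whose seed overlaps enough."""
    clusters = []  # each cluster is a list of expanded term sets
    for terms in citations:
        expanded = expand(terms, associations)
        for cluster in clusters:
            if jaccard(expanded, cluster[0]) >= threshold:
                cluster.append(expanded)
                break
        else:
            clusters.append([expanded])
    return clusters


def common_fact(cluster, min_support=2):
    """Terms shared by at least min_support citations in a cluster (assumed rule)."""
    counts = Counter(term for terms in cluster for term in terms)
    return {term for term, n in counts.items() if n >= min_support}


def select_sentences(fact, sentences, k=1):
    """Rank cited-article sentences by how many common-fact terms they contain."""
    def score(sentence):
        text = sentence.lower()
        return sum(1 for term in fact if term in text)
    return sorted(sentences, key=score, reverse=True)[:k]
```

For example, given citation term sets `{"multi-document summarization", "citation"}` and `{"multi-document summary", "citation"}` plus an association from "multi-document summary" to "multi-document summarization", the two citations fall into one cluster whose common fact can then be matched against sentences of the cited article.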

Original language: English
Pages (from-to): 246-252
Number of pages: 7
Journal: Future Generation Computer Systems
Volume: 32
Early online date: 22 Aug 2013
Publication status: Published - Mar 2014

Keywords

  • natural language processing
  • semantic link network
  • summarization
