Similarity-Based Query-Focused Multi-document Summarization Using Crowdsourced and Manually-built Lexical-Semantic Resources

Muhidin A. Mohamed, Mourad Oussalah

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper we present an approach to extractive query-focused multi-document summarization that builds on enhanced knowledge-based short-text semantic similarity measures. We combine the WordNet taxonomy with the Categorial Variation Database (CatVar) and Morphosemantic Links to determine query-to-sentence and sentence-to-sentence similarities. In addition, we enrich the WordNet-derived similarity with named-entity semantic relatedness inferred from Wikipedia and underpinned by the Normalized Google Distance. We show that our summarizer, which relies primarily on this improved semantic similarity measure to model relevance, centrality and diversity, outperforms the best-performing relevant DUC systems and recent closely related studies on at least one of the investigated ROUGE metrics. An anti-redundancy mechanism based on the Maximum Marginal Relevance (MMR) algorithm is added to the proposed summarizer design.
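
The abstract names two standard building blocks, Normalized Google Distance and Maximum Marginal Relevance, without showing how they work. The following is a minimal sketch of both, not the authors' implementation. It assumes a plain token-overlap (Jaccard) similarity as a stand-in for the paper's WordNet, CatVar and Wikipedia-based measure. The function names, the trade-off weight `lam` and all page counts passed to `ngd` are hypothetical.

```python
import math
from typing import Callable, List


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Normalized Google Distance computed from occurrence counts.

    fx, fy: number of pages containing each term; fxy: pages containing
    both; n: total number of indexed pages (assumed larger than fx and fy).
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")  # the terms never occur together
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def jaccard(a: str, b: str) -> float:
    """Placeholder similarity: token overlap between two short texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def mmr_select(
    query: str,
    sentences: List[str],
    k: int,
    sim: Callable[[str, str], float] = jaccard,
    lam: float = 0.7,
) -> List[str]:
    """Greedy MMR: pick k sentences that are relevant to the query
    but dissimilar to the sentences already selected."""
    selected: List[int] = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:

        def score(i: int) -> float:
            relevance = sim(sentences[i], query)
            # Highest similarity to anything already in the summary.
            redundancy = max(
                (sim(sentences[i], sentences[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

Setting `lam` close to 1 ranks sentences by query relevance alone, while lower values penalize near-duplicates more strongly. A knowledge-based similarity measure could be passed through the `sim` argument in place of `jaccard`.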

Original language: English
Title of host publication: Proceedings - 9th IEEE International Conference on Big Data Science and Engineering, BigDataSE 2015
Publisher: IEEE
Pages: 80-87
Number of pages: 8
Volume: 2
ISBN (Electronic): 9781467379519
DOIs: https://doi.org/10.1109/Trustcom.2015.565
Publication status: Published - 2 Dec 2015
Event: 14th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2015 - Helsinki, Finland
Duration: 20 Aug 2015 to 22 Aug 2015

Conference

Conference: 14th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2015
Country: Finland
City: Helsinki
Period: 20/08/15 to 22/08/15

Fingerprint

Semantics
Taxonomies
Redundancy

Keywords

  • knowledge-enriched similarity
  • named entity relatedness
  • query-based summarization
  • Word category subsumption

Cite this

Mohamed, M. A., & Oussalah, M. (2015). Similarity-Based Query-Focused Multi-document Summarization Using Crowdsourced and Manually-built Lexical-Semantic Resources. In Proceedings - 9th IEEE International Conference on Big Data Science and Engineering, BigDataSE 2015 (Vol. 2, pp. 80-87). [7345478] IEEE. https://doi.org/10.1109/Trustcom.2015.565
@inproceedings{aba8151958c54335a14551647b51b6d3,
title = "Similarity-Based Query-Focused Multi-document Summarization Using Crowdsourced and Manually-built Lexical-Semantic Resources",
abstract = "In this paper we present an approach to extractive query-focused multi-document summarization that builds on enhanced knowledge-based short-text semantic similarity measures. We combine the WordNet taxonomy with the Categorial Variation Database (CatVar) and Morphosemantic Links to determine query-to-sentence and sentence-to-sentence similarities. In addition, we enrich the WordNet-derived similarity with named-entity semantic relatedness inferred from Wikipedia and underpinned by the Normalized Google Distance. We show that our summarizer, which relies primarily on this improved semantic similarity measure to model relevance, centrality and diversity, outperforms the best-performing relevant DUC systems and recent closely related studies on at least one of the investigated ROUGE metrics. An anti-redundancy mechanism based on the Maximum Marginal Relevance (MMR) algorithm is added to the proposed summarizer design.",
keywords = "knowledge-enriched similarity, named entity relatedness, query-based summarization, Word category subsumption",
author = "Mohamed, {Muhidin A.} and Mourad Oussalah",
year = "2015",
month = "12",
day = "2",
doi = "10.1109/Trustcom.2015.565",
language = "English",
volume = "2",
pages = "80--87",
booktitle = "Proceedings - 9th IEEE International Conference on Big Data Science and Engineering, BigDataSE 2015",
publisher = "IEEE",
address = "United States",

}


TY - GEN
T1 - Similarity-Based Query-Focused Multi-document Summarization Using Crowdsourced and Manually-built Lexical-Semantic Resources
AU - Mohamed, Muhidin A.
AU - Oussalah, Mourad
PY - 2015/12/2
Y1 - 2015/12/2
AB - In this paper we present an approach to extractive query-focused multi-document summarization that builds on enhanced knowledge-based short-text semantic similarity measures. We combine the WordNet taxonomy with the Categorial Variation Database (CatVar) and Morphosemantic Links to determine query-to-sentence and sentence-to-sentence similarities. In addition, we enrich the WordNet-derived similarity with named-entity semantic relatedness inferred from Wikipedia and underpinned by the Normalized Google Distance. We show that our summarizer, which relies primarily on this improved semantic similarity measure to model relevance, centrality and diversity, outperforms the best-performing relevant DUC systems and recent closely related studies on at least one of the investigated ROUGE metrics. An anti-redundancy mechanism based on the Maximum Marginal Relevance (MMR) algorithm is added to the proposed summarizer design.
KW - knowledge-enriched similarity
KW - named entity relatedness
KW - query-based summarization
KW - Word category subsumption
UR - http://www.scopus.com/inward/record.url?scp=84969165398&partnerID=8YFLogxK
UR - https://ieeexplore.ieee.org/document/7345478
U2 - 10.1109/Trustcom.2015.565
DO - 10.1109/Trustcom.2015.565
M3 - Conference contribution
AN - SCOPUS:84969165398
VL - 2
SP - 80
EP - 87
BT - Proceedings - 9th IEEE International Conference on Big Data Science and Engineering, BigDataSE 2015
PB - IEEE
ER -
