A framework for automated construction of resource space based on background knowledge

Xu Yu; Li Peng; Zhixing Huang; Hai Zhuge

doi:10.1016/j.future.2013.07.017

Corresponding author: Zhixing Huang

Computer Science Research Group

Research output: Contribution to journal › Special issue › peer-review

Abstract

The Resource Space Model is a data model that can effectively and flexibly manage digital resources in cyber-physical systems from multidimensional and hierarchical perspectives. This paper focuses on constructing a resource space automatically. We propose a framework that organizes a set of digital resources along different semantic dimensions by drawing on human background knowledge in WordNet and Wikipedia. The construction process has four steps: extracting candidate keywords, building semantic graphs, detecting semantic communities, and generating the resource space. An unsupervised statistical topic model, Latent Dirichlet Allocation (LDA), is applied to extract candidate keywords for the facets. To better interpret the meanings of the facets found by LDA, we map the keywords to Wikipedia concepts, calculate word relatedness using WordNet's noun synsets, and construct the corresponding semantic graphs. Semantic communities are then identified with the GN (Girvan-Newman) algorithm. After candidate axes are extracted based on the Wikipedia concept hierarchy, the final axes of the resource space are sorted and selected using three different ranking strategies. The experimental results demonstrate that the proposed framework can organize resources automatically and effectively.
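
The community-detection step named in the abstract can be illustrated with a small sketch. The abstract says only that the "GN algorithm" is used; the code below assumes this means Girvan-Newman edge-betweenness splitting and is not the authors' implementation. The keyword graph, its unweighted edges, and the function names (`connected_components`, `edge_betweenness`, `girvan_newman_split`) are hypothetical; the paper derives its edges from WordNet relatedness, which is not reproduced here. Only a single split is performed.

```python
from collections import deque, defaultdict


def connected_components(adj):
    """Return the connected components of an undirected graph as a list of sets."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in component:
                    component.add(nbr)
                    queue.append(nbr)
        seen |= component
        components.append(component)
    return components


def edge_betweenness(adj):
    """Unweighted edge betweenness via BFS shortest-path counting (Brandes-style)."""
    score = defaultdict(float)
    for source in adj:
        order, preds = [], {v: [] for v in adj}
        paths = dict.fromkeys(adj, 0)   # number of shortest paths from source
        dist = dict.fromkeys(adj, -1)   # BFS distance from source (-1 = unseen)
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Push path credit back from the farthest nodes toward the source.
        credit = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + credit[w])
                score[frozenset((v, w))] += share
                credit[v] += share
    return score


def girvan_newman_split(edges):
    """Remove highest-betweenness edges until the graph gains one component.

    Assumes an undirected graph without self-loops.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    initial = len(connected_components(adj))
    while True:
        score = edge_betweenness(adj)
        if not score:  # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
        if len(connected_components(adj)) > initial:
            break
    return connected_components(adj)


# Hypothetical keyword graph: two triangles joined by one bridging edge.
edges = [
    ("topic", "model"), ("model", "keyword"), ("keyword", "topic"),
    ("graph", "edge"), ("edge", "community"), ("community", "graph"),
    ("model", "graph"),
]
communities = girvan_newman_split(edges)
print(sorted(sorted(c) for c in communities))
```

In this toy graph the bridging edge carries every shortest path between the two triangles, so it has the highest betweenness and is removed first. Repeating the split on each resulting community would produce a hierarchy of finer groups.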

Original language	English
Pages (from-to)	222-231
Number of pages	10
Journal	Future Generation Computer Systems
Volume	32
Early online date	5 Aug 2013
DOIs	https://doi.org/10.1016/j.future.2013.07.017
Publication status	Published - Mar 2014

Keywords

latent Dirichlet allocation
resource space model
semantic graph
Wikipedia

Cite this

@article{c7199a249d95456c87d1ec9027a13e36,

title = "A framework for automated construction of resource space based on background knowledge",

abstract = "Resource Space Model is a kind of data model which can effectively and flexibly manage the digital resources in cyber-physical system from multidimensional and hierarchical perspectives. This paper focuses on constructing resource space automatically. We propose a framework that organizes a set of digital resources according to different semantic dimensions combining human background knowledge in WordNet and Wikipedia. The construction process includes four steps: extracting candidate keywords, building semantic graphs, detecting semantic communities and generating resource space. An unsupervised statistical language topic model (i.e., Latent Dirichlet Allocation) is applied to extract candidate keywords of the facets. To better interpret meanings of the facets found by LDA, we map the keywords to Wikipedia concepts, calculate word relatedness using WordNet's noun synsets and construct corresponding semantic graphs. Moreover, semantic communities are identified by GN algorithm. After extracting candidate axes based on Wikipedia concept hierarchy, the final axes of resource space are sorted and picked out through three different ranking strategies. The experimental results demonstrate that the proposed framework can organize resources automatically and effectively.",

keywords = "latent Dirichlet allocation, resource space model, semantic graph, Wikipedia",

author = "Xu Yu and Li Peng and Zhixing Huang and Hai Zhuge",

year = "2014",

month = mar,

doi = "10.1016/j.future.2013.07.017",

language = "English",

volume = "32",

pages = "222--231",

journal = "Future Generation Computer Systems",

issn = "0167-739X",

publisher = "Elsevier",

}

TY - JOUR

T1 - A framework for automated construction of resource space based on background knowledge

AU - Yu, Xu

AU - Peng, Li

AU - Huang, Zhixing

AU - Zhuge, Hai

PY - 2014/3

Y1 - 2014/3

N2 - Resource Space Model is a kind of data model which can effectively and flexibly manage the digital resources in cyber-physical system from multidimensional and hierarchical perspectives. This paper focuses on constructing resource space automatically. We propose a framework that organizes a set of digital resources according to different semantic dimensions combining human background knowledge in WordNet and Wikipedia. The construction process includes four steps: extracting candidate keywords, building semantic graphs, detecting semantic communities and generating resource space. An unsupervised statistical language topic model (i.e., Latent Dirichlet Allocation) is applied to extract candidate keywords of the facets. To better interpret meanings of the facets found by LDA, we map the keywords to Wikipedia concepts, calculate word relatedness using WordNet's noun synsets and construct corresponding semantic graphs. Moreover, semantic communities are identified by GN algorithm. After extracting candidate axes based on Wikipedia concept hierarchy, the final axes of resource space are sorted and picked out through three different ranking strategies. The experimental results demonstrate that the proposed framework can organize resources automatically and effectively.

AB - Resource Space Model is a kind of data model which can effectively and flexibly manage the digital resources in cyber-physical system from multidimensional and hierarchical perspectives. This paper focuses on constructing resource space automatically. We propose a framework that organizes a set of digital resources according to different semantic dimensions combining human background knowledge in WordNet and Wikipedia. The construction process includes four steps: extracting candidate keywords, building semantic graphs, detecting semantic communities and generating resource space. An unsupervised statistical language topic model (i.e., Latent Dirichlet Allocation) is applied to extract candidate keywords of the facets. To better interpret meanings of the facets found by LDA, we map the keywords to Wikipedia concepts, calculate word relatedness using WordNet's noun synsets and construct corresponding semantic graphs. Moreover, semantic communities are identified by GN algorithm. After extracting candidate axes based on Wikipedia concept hierarchy, the final axes of resource space are sorted and picked out through three different ranking strategies. The experimental results demonstrate that the proposed framework can organize resources automatically and effectively.

KW - latent Dirichlet allocation

KW - resource space model

KW - semantic graph

KW - Wikipedia

UR - http://www.scopus.com/inward/record.url?scp=84891634777&partnerID=8YFLogxK

U2 - 10.1016/j.future.2013.07.017

DO - 10.1016/j.future.2013.07.017

M3 - Special issue

AN - SCOPUS:84891634777

SN - 0167-739X

VL - 32

SP - 222

EP - 231

JO - Future Generation Computer Systems

JF - Future Generation Computer Systems

ER -
