Ontologies, taxonomies, thesauri: learning from texts

Christopher Brewster; Yorick Wilks

Research output: Chapter in Book/Published conference output › Chapter

Abstract

The use of ontologies as representations of knowledge is widespread but their construction, until recently, has been entirely manual. We argue in this paper for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three major steps in ontology building: associating terms, constructing hierarchies and labelling relations. A number of methods are presented for these purposes but we conclude that the issue of data-sparsity is still a major challenge. We argue for the use of resources external to the domain-specific corpus.

Original language	English
Title of host publication	Proceedings of the Use of Computational Linguistics in the Extraction of Keyword Information from Digital Library Content Workshop
Editors	Marilyn Deegan
Publication status	Published - Feb 2004
Event	The Keyword Project: Unlocking Content through Computational Linguistics - London, United Kingdom. Duration: 5 Feb 2004 → 6 Feb 2004

Conference

Conference	The Keyword Project: Unlocking Content through Computational Linguistics
Country/Territory	United Kingdom
City	London
Period	5/02/04 → 6/02/04

Keywords

ontologies
knowledge
text corpora
automated natural language
construction of ontologies
associating terms
constructing hierarchies
labelling relations
data-sparsity

Access to Document

KeyWord_FMO.pdf
The Keyword Project: Unlocking Content through Computational Linguistics, 5-6 February 2004, King's College, London (UK).

Cite this

@inbook{5d42a014b50d47b4bd8ffadbcf4b4709,
  title = "Ontologies, taxonomies, thesauri: learning from texts",
  abstract = "The use of ontologies as representations of knowledge is widespread but their construction, until recently, has been entirely manual. We argue in this paper for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three major steps in ontology building: associating terms, constructing hierarchies and labelling relations. A number of methods are presented for these purposes but we conclude that the issue of data-sparsity is still a major challenge. We argue for the use of resources external to the domain-specific corpus.",
  keywords = "ontologies, knowledge, text corpora, automated natural language, construction of ontologies, associating terms, constructing hierarchies, labelling relations, data-sparsity",
  author = "Christopher Brewster and Yorick Wilks",
  year = "2004",
  month = feb,
  language = "English",
  editor = "Marilyn Deegan",
  booktitle = "Proceedings of the Use of Computational Linguistics in the Extraction of Keyword Information from Digital Library Content Workshop",
  note = "The Keyword Project: Unlocking Content through Computational Linguistics; Conference date: 05-02-2004 through 06-02-2004",
}

TY - CHAP
T1 - Ontologies, taxonomies, thesauri: learning from texts
T2 - The Keyword Project: Unlocking Content through Computational Linguistics
AU - Brewster, Christopher
AU - Wilks, Yorick
PY - 2004/2
Y1 - 2004/2
N2 - The use of ontologies as representations of knowledge is widespread but their construction, until recently, has been entirely manual. We argue in this paper for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three major steps in ontology building: associating terms, constructing hierarchies and labelling relations. A number of methods are presented for these purposes but we conclude that the issue of data-sparsity is still a major challenge. We argue for the use of resources external to the domain-specific corpus.
AB - The use of ontologies as representations of knowledge is widespread but their construction, until recently, has been entirely manual. We argue in this paper for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three major steps in ontology building: associating terms, constructing hierarchies and labelling relations. A number of methods are presented for these purposes but we conclude that the issue of data-sparsity is still a major challenge. We argue for the use of resources external to the domain-specific corpus.
KW - ontologies
KW - knowledge
KW - text corpora
KW - automated natural language
KW - construction of ontologies
KW - associating terms
KW - constructing hierarchies
KW - labelling relations
KW - data-sparsity
M3 - Chapter
BT - Proceedings of the Use of Computational Linguistics in the Extraction of Keyword Information from Digital Library Content Workshop
A2 - Deegan, Marilyn
Y2 - 5 February 2004 through 6 February 2004
ER -
