Abstract
The use of ontologies as representations of knowledge is widespread, but their construction has, until recently, been entirely manual. In this paper we argue for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three major steps in ontology building: associating terms, constructing hierarchies and labelling relations. A number of methods are presented for these purposes, but we conclude that data sparsity is still a major challenge. We argue for the use of resources external to the domain-specific corpus.
Original language | English |
---|---|
Title of host publication | Proceedings of the Use of Computational Linguistics in the Extraction of Keyword Information from Digital Library Content Workshop |
Editors | Marilyn Deegan |
Publication status | Published - Feb 2004 |
Event | The Keyword Project: Unlocking Content through Computational Linguistics - London, United Kingdom. Duration: 5 Feb 2004 → 6 Feb 2004 |
Conference
Conference | The Keyword Project: Unlocking Content through Computational Linguistics |
---|---|
Country/Territory | United Kingdom |
City | London |
Period | 5/02/04 → 6/02/04 |
Keywords
- ontologies
- knowledge
- text corpora
- automated natural language processing
- construction of ontologies
- associating terms
- constructing hierarchies
- labelling relations
- data-sparsity