Natural language processing as a foundation of the semantic Web

Yorick Wilks; Christopher Brewster

doi:10.1561/1800000002

Research output: Contribution to journal › Article › peer-review

Abstract

The main argument of this paper is that Natural Language Processing (NLP) does, and will continue to, underlie the Semantic Web (SW), including its initial construction from unstructured sources like the World Wide Web (WWW), whether its advocates realise this or not. Chiefly, we argue, such NLP activity is the only way up to a defensible notion of meaning at conceptual levels (in the original SW diagram) based on lower level empirical computations over usage. Our aim is definitely not to claim logic-bad, NLP-good in any simple-minded way, but to argue that the SW will be a fascinating interaction of these two methodologies, again like the WWW (which has been basically a field for statistical NLP research) but with deeper content. Only NLP technologies (and chiefly information extraction) will be able to provide the requisite RDF knowledge stores for the SW from existing unstructured text databases in the WWW, and in the vast quantities needed. There is no alternative at this point, since a wholly or mostly hand-crafted SW is also unthinkable, as is a SW built from scratch and without reference to the WWW. We also assume that, whatever the limitations on current SW representational power we have drawn attention to here, the SW will continue to grow in a distributed manner so as to serve the needs of scientists, even if it is not perfect. The WWW has already shown how an imperfect artefact can become indispensable.

Original language	English
Pages (from-to)	199-327
Number of pages	129
Journal	Foundations and Trends in Web Science
Volume	1
Issue number	3-4
DOIs	https://doi.org/10.1561/1800000002
Publication status	Published - 15 Apr 2009

Bibliographical note

Authors' copyright

Keywords

natural language processing
semantic web
unstructured sources
empirical computations

Cite this

@article{535a6447ce1b480b99565eeb8bba219f,
title = "Natural language processing as a foundation of the semantic Web",
abstract = "The main argument of this paper is that Natural Language Processing (NLP) does, and will continue to, underlie the Semantic Web (SW), including its initial construction from unstructured sources like the World Wide Web (WWW), whether its advocates realise this or not. Chiefly, we argue, such NLP activity is the only way up to a defensible notion of meaning at conceptual levels (in the original SW diagram) based on lower level empirical computations over usage. Our aim is definitely not to claim logic-bad, NLP-good in any simple-minded way, but to argue that the SW will be a fascinating interaction of these two methodologies, again like the WWW (which has been basically a field for statistical NLP research) but with deeper content. Only NLP technologies (and chiefly information extraction) will be able to provide the requisite RDF knowledge stores for the SW from existing unstructured text databases in the WWW, and in the vast quantities needed. There is no alternative at this point, since a wholly or mostly hand-crafted SW is also unthinkable, as is a SW built from scratch and without reference to the WWW. We also assume that, whatever the limitations on current SW representational power we have drawn attention to here, the SW will continue to grow in a distributed manner so as to serve the needs of scientists, even if it is not perfect. The WWW has already shown how an imperfect artefact can become indispensable.",
keywords = "natural language processing, semantic web, unstructured sources, empirical computations",
author = "Yorick Wilks and Christopher Brewster",
note = "Authors' copyright",
year = "2009",
month = apr,
day = "15",
doi = "10.1561/1800000002",
language = "English",
volume = "1",
pages = "199--327",
journal = "Foundations and Trends in Web Science",
issn = "1555-007X",
publisher = "Now Publishers Inc",
number = "3-4",
}

TY - JOUR
T1 - Natural language processing as a foundation of the semantic Web
AU - Wilks, Yorick
AU - Brewster, Christopher
N1 - Authors' copyright
PY - 2009/4/15
Y1 - 2009/4/15
N2 - The main argument of this paper is that Natural Language Processing (NLP) does, and will continue to, underlie the Semantic Web (SW), including its initial construction from unstructured sources like the World Wide Web (WWW), whether its advocates realise this or not. Chiefly, we argue, such NLP activity is the only way up to a defensible notion of meaning at conceptual levels (in the original SW diagram) based on lower level empirical computations over usage. Our aim is definitely not to claim logic-bad, NLP-good in any simple-minded way, but to argue that the SW will be a fascinating interaction of these two methodologies, again like the WWW (which has been basically a field for statistical NLP research) but with deeper content. Only NLP technologies (and chiefly information extraction) will be able to provide the requisite RDF knowledge stores for the SW from existing unstructured text databases in the WWW, and in the vast quantities needed. There is no alternative at this point, since a wholly or mostly hand-crafted SW is also unthinkable, as is a SW built from scratch and without reference to the WWW. We also assume that, whatever the limitations on current SW representational power we have drawn attention to here, the SW will continue to grow in a distributed manner so as to serve the needs of scientists, even if it is not perfect. The WWW has already shown how an imperfect artefact can become indispensable.
KW - natural language processing
KW - semantic web
KW - unstructured sources
KW - empirical computations
UR - https://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=8187064
U2 - 10.1561/1800000002
DO - 10.1561/1800000002
M3 - Article
SN - 1555-007X
VL - 1
SP - 199
EP - 327
JO - Foundations and Trends in Web Science
JF - Foundations and Trends in Web Science
IS - 3-4
ER -
