Protein-protein interactions classification from text via local learning with class priors

Yulan He, Chenghua Lin

Research output: Chapter in Book/Report/Conference proceeding › Other chapter contribution

Abstract

Text classification is essential for narrowing down the number of documents relevant to a particular topic for further pursual, especially when searching through large biomedical databases. Protein-protein interactions are an example of such a topic, with databases being devoted specifically to them. This paper proposes a semi-supervised learning algorithm via local learning with class priors (LL-CP) for biomedical text classification, where unlabeled data points are classified in a vector space based on their proximity to labeled nodes. The algorithm has been evaluated on a corpus of biomedical documents to identify abstracts containing information about protein-protein interactions, with promising results. Experimental results show that LL-CP outperforms traditional semi-supervised learning algorithms such as SVM, and it also performs better than local learning without incorporating class priors.
Original language: English
Title of host publication: Natural language processing and information systems
Subtitle of host publication: 14th international conference on applications of natural language to information systems, NLDB 2009, Saarbrücken, Germany, June 24-26, 2009. Revised Papers
Editors: Helmut Horacek, Elisabeth Métais, Rafael Muñoz, Magdalena Wolska
Publisher: Springer
Pages: 182-191
Number of pages: 10
Volume: 5723
ISBN (Print): 3-642-12549-2, 978-3-642-12549-2
DOIs: https://doi.org/10.1007/978-3-642-12550-8_15
Publication status: Published - 2009
Event: 14th international conference on applications of natural language to information systems, NLDB 2009 - Saarbrücken, Germany
Duration: 24 Jun 2009 to 26 Jun 2009

Publication series

Name: Lecture notes in computer science
Publisher: Springer
Volume: 5723
ISSN (Print): 0302-9743

Conference

Conference: 14th international conference on applications of natural language to information systems, NLDB 2009
Country: Germany
City: Saarbrücken
Period: 24/06/09 to 26/06/09

Fingerprint

Proteins
Learning algorithms
Supervised learning
Vector spaces

Keywords

  • text classification
  • protein-protein interactions extraction
  • semi-supervised learning
  • local learning

Cite this

He, Y., & Lin, C. (2009). Protein-protein interactions classification from text via local learning with class priors. In H. Horacek, E. Métais, R. Muñoz, & M. Wolska (Eds.), Natural language processing and information systems: 14th international conference on applications of natural language to information systems, NLDB 2009, Saarbrücken, Germany, June 24-26, 2009. Revised Papers (Vol. 5723, pp. 182-191). (Lecture notes in computer science; Vol. 5723). Springer. https://doi.org/10.1007/978-3-642-12550-8_15
He, Yulan ; Lin, Chenghua. / Protein-protein interactions classification from text via local learning with class priors. Natural language processing and information systems: 14th international conference on applications of natural language to information systems, NLDB 2009, Saarbrücken, Germany, June 24-26, 2009. Revised Papers. editor / Helmut Horacek ; Elisabeth Métais ; Rafael Muñoz ; Magdalena Wolska. Vol. 5723 Springer, 2009. pp. 182-191 (Lecture notes in computer science).
@inbook{932b747ae4034b21b147e04ad86f360d,
title = "Protein-protein interactions classification from text via local learning with class priors",
abstract = "Text classification is essential for narrowing down the number of documents relevant to a particular topic for further pursual, especially when searching through large biomedical databases. Protein-protein interactions are an example of such a topic, with databases being devoted specifically to them. This paper proposes a semi-supervised learning algorithm via local learning with class priors (LL-CP) for biomedical text classification, where unlabeled data points are classified in a vector space based on their proximity to labeled nodes. The algorithm has been evaluated on a corpus of biomedical documents to identify abstracts containing information about protein-protein interactions, with promising results. Experimental results show that LL-CP outperforms traditional semi-supervised learning algorithms such as SVM, and it also performs better than local learning without incorporating class priors.",
keywords = "text classification, protein-protein interactions extraction, semi-supervised learning, local learning",
author = "Yulan He and Chenghua Lin",
year = "2009",
doi = "10.1007/978-3-642-12550-8_15",
language = "English",
isbn = "3-642-12549-2",
volume = "5723",
series = "Lecture notes in computer science",
publisher = "Springer",
pages = "182--191",
editor = "Helmut Horacek and Elisabeth M{\'e}tais and Rafael Mu{\~n}oz and Magdalena Wolska",
booktitle = "Natural language processing and information systems",
address = "Germany",
}

He, Y & Lin, C 2009, Protein-protein interactions classification from text via local learning with class priors. in H Horacek, E Métais, R Muñoz & M Wolska (eds), Natural language processing and information systems: 14th international conference on applications of natural language to information systems, NLDB 2009, Saarbrücken, Germany, June 24-26, 2009. Revised Papers. vol. 5723, Lecture notes in computer science, vol. 5723, Springer, pp. 182-191, 14th international conference on applications of natural language to information systems, NLDB 2009, Saarbrücken, Germany, 24/06/09. https://doi.org/10.1007/978-3-642-12550-8_15

Protein-protein interactions classification from text via local learning with class priors. / He, Yulan; Lin, Chenghua.

Natural language processing and information systems: 14th international conference on applications of natural language to information systems, NLDB 2009, Saarbrücken, Germany, June 24-26, 2009. Revised Papers. ed. / Helmut Horacek; Elisabeth Métais; Rafael Muñoz; Magdalena Wolska. Vol. 5723 Springer, 2009. p. 182-191 (Lecture notes in computer science; Vol. 5723).


TY - CHAP

T1 - Protein-protein interactions classification from text via local learning with class priors

AU - He, Yulan

AU - Lin, Chenghua

PY - 2009

Y1 - 2009

N2 - Text classification is essential for narrowing down the number of documents relevant to a particular topic for further pursual, especially when searching through large biomedical databases. Protein-protein interactions are an example of such a topic, with databases being devoted specifically to them. This paper proposes a semi-supervised learning algorithm via local learning with class priors (LL-CP) for biomedical text classification, where unlabeled data points are classified in a vector space based on their proximity to labeled nodes. The algorithm has been evaluated on a corpus of biomedical documents to identify abstracts containing information about protein-protein interactions, with promising results. Experimental results show that LL-CP outperforms traditional semi-supervised learning algorithms such as SVM, and it also performs better than local learning without incorporating class priors.

AB - Text classification is essential for narrowing down the number of documents relevant to a particular topic for further pursual, especially when searching through large biomedical databases. Protein-protein interactions are an example of such a topic, with databases being devoted specifically to them. This paper proposes a semi-supervised learning algorithm via local learning with class priors (LL-CP) for biomedical text classification, where unlabeled data points are classified in a vector space based on their proximity to labeled nodes. The algorithm has been evaluated on a corpus of biomedical documents to identify abstracts containing information about protein-protein interactions, with promising results. Experimental results show that LL-CP outperforms traditional semi-supervised learning algorithms such as SVM, and it also performs better than local learning without incorporating class priors.

KW - text classification

KW - protein-protein interactions extraction

KW - semi-supervised learning

KW - local learning

UR - http://www.scopus.com/inward/record.url?scp=78651250571&partnerID=8YFLogxK

UR - http://www.springerlink.com/content/n7127894u7g80h32/

U2 - 10.1007/978-3-642-12550-8_15

DO - 10.1007/978-3-642-12550-8_15

M3 - Other chapter contribution

AN - SCOPUS:78651250571

SN - 3-642-12549-2

SN - 978-3-642-12549-2

VL - 5723

T3 - Lecture notes in computer science

SP - 182

EP - 191

BT - Natural language processing and information systems

A2 - Horacek, Helmut

A2 - Métais, Elisabeth

A2 - Muñoz, Rafael

A2 - Wolska, Magdalena

PB - Springer

ER -

He Y, Lin C. Protein-protein interactions classification from text via local learning with class priors. In Horacek H, Métais E, Muñoz R, Wolska M, editors, Natural language processing and information systems: 14th international conference on applications of natural language to information systems, NLDB 2009, Saarbrücken, Germany, June 24-26, 2009. Revised Papers. Vol. 5723. Springer. 2009. p. 182-191. (Lecture notes in computer science). https://doi.org/10.1007/978-3-642-12550-8_15