On the hierarchical classification of G protein-coupled receptors

Matthew N. Davies, Andrew Secker, Alex A. Freitas, Miguel Mendao, Jon Timmis, Darren R. Flower

Research output: Contribution to journal (Article)

Abstract

MOTIVATION: G protein-coupled receptors (GPCRs) play an important role in many physiological systems by transducing an extracellular signal into an intracellular response. Over 50% of all marketed drugs are targeted towards a GPCR. There is considerable interest in developing an algorithm that could effectively predict the function of a GPCR from its primary sequence. Such an algorithm is useful not only in identifying novel GPCR sequences but also in characterizing the interrelationships between known GPCRs.

RESULTS: An alignment-free approach to GPCR classification has been developed using techniques drawn from data mining and proteochemometrics. A dataset of over 8000 sequences, one of the largest GPCR datasets currently available, was constructed to train the algorithm. A predictive algorithm was developed based upon the simplest reasonable numerical representation of the protein's physicochemical properties. A selective top-down approach was developed, which used a hierarchical classifier to assign sequences to subdivisions within the GPCR hierarchy. The predictive performance of the algorithm was assessed against several standard data mining classifiers and further validated against Support Vector Machine-based GPCR prediction servers. The selective top-down approach achieves significantly higher accuracy than standard data mining methods in almost all cases.
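The top-down idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: amino-acid composition stands in for the physicochemical representation (which the abstract does not specify), "selective" is read as choosing, at each node of the hierarchy, whichever candidate classifier scores best on held-out validation data, the two candidate classifiers are simple stand-ins, and the toy sequences and class labels are invented.

```python
from collections import Counter, defaultdict
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def featurize(seq):
    """Alignment-free stand-in feature: fraction of each of the 20 amino acids."""
    counts = Counter(seq)
    n = max(len(seq), 1)
    return [counts[a] / n for a in AMINO_ACIDS]

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

class NearestCentroid:
    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(x)
        self.centroids = {lab: [sum(col) / len(rows) for col in zip(*rows)]
                          for lab, rows in groups.items()}
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda lab: dist(x, self.centroids[lab]))

class OneNearestNeighbour:
    def fit(self, X, y):
        self.data = list(zip(X, y))
        return self

    def predict(self, x):
        return min(self.data, key=lambda pair: dist(x, pair[0]))[1]

# Hypothetical candidate pool; any classifiers with fit/predict would do.
CANDIDATES = [NearestCentroid, OneNearestNeighbour]

def accuracy(model, X, y):
    return sum(model.predict(x) == t for x, t in zip(X, y)) / len(y)

def train_node(train, valid, depth=0):
    """Build one node of the hierarchy; items are (features, path-of-labels)."""
    train = [(x, p) for x, p in train if len(p) > depth]
    if not train:
        return None  # no deeper level to predict
    valid = [(x, p) for x, p in valid if len(p) > depth]
    X = [x for x, _ in train]
    y = [p[depth] for _, p in train]
    models = [cls().fit(X, y) for cls in CANDIDATES]
    if valid:
        vX = [x for x, _ in valid]
        vy = [p[depth] for _, p in valid]
        # Selective step: keep the candidate that does best at this node.
        best = max(models, key=lambda m: accuracy(m, vX, vy))
    else:
        best = models[0]
    children = {
        label: train_node([(x, p) for x, p in train if p[depth] == label],
                          [(x, p) for x, p in valid if p[depth] == label],
                          depth + 1)
        for label in set(y)
    }
    return {"model": best, "children": children}

def predict_path(node, x):
    """Route a feature vector from the root down to a leaf."""
    path = []
    while node is not None:
        label = node["model"].predict(x)
        path.append(label)
        node = node["children"][label]
    return tuple(path)

# Invented toy data: (sequence, (class, subfamily)).
TRAIN = [("AAALLLAAAL", ("A", "amine")), ("AALLLLAAAL", ("A", "amine")),
         ("AAAKKKAAAK", ("A", "peptide")), ("AAKKKKAAAK", ("A", "peptide")),
         ("GGGSSSGGGS", ("C", "glutamate")), ("GGSSSSGGGS", ("C", "glutamate"))]
VALID = [("AAALLLLAAA", ("A", "amine")), ("AAAKKKKAAA", ("A", "peptide")),
         ("GGGSSSSGGG", ("C", "glutamate"))]

root = train_node([(featurize(s), p) for s, p in TRAIN],
                  [(featurize(s), p) for s, p in VALID])
print(predict_path(root, featurize("ALALALALAA")))  # -> ('A', 'amine')
```

Each node only has to separate its own children, so a sequence is first assigned to a class and then to a subfamily within that class. A flat classifier, by contrast, would have to discriminate among all leaf labels at once.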
Original language: English
Pages (from-to): 3113-3118
Number of pages: 6
Journal: Bioinformatics
Volume: 23
Issue number: 23
Early online date: 22 Oct 2007
DOI: https://doi.org/10.1093/bioinformatics/btm506
Publication status: Published - Dec 2007

Cite this

Davies, M. N., Secker, A., Freitas, A. A., Mendao, M., Timmis, J., & Flower, D. R. (2007). On the hierarchical classification of G protein-coupled receptors. Bioinformatics, 23(23), 3113-3118. https://doi.org/10.1093/bioinformatics/btm506
@article{48d5b4d59cae4b938dc7c453e67a721f,
title = "On the hierarchical classification of G protein-coupled receptors",
author = "Davies, {Matthew N.} and Andrew Secker and Freitas, {Alex A.} and Miguel Mendao and Jon Timmis and Flower, {Darren R.}",
year = "2007",
month = "12",
doi = "10.1093/bioinformatics/btm506",
language = "English",
volume = "23",
pages = "3113--3118",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "23",

}

TY - JOUR
T1 - On the hierarchical classification of G protein-coupled receptors
AU - Davies, Matthew N.
AU - Secker, Andrew
AU - Freitas, Alex A.
AU - Mendao, Miguel
AU - Timmis, Jon
AU - Flower, Darren R.
PY - 2007/12
Y1 - 2007/12
UR - http://bioinformatics.oxfordjournals.org/content/23/23/3113
U2 - 10.1093/bioinformatics/btm506
DO - 10.1093/bioinformatics/btm506
M3 - Article
C2 - 17956878
VL - 23
SP - 3113
EP - 3118
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 23
ER -
