Peptide-binding MHC proteins are thought to be the most variable proteins across the human population; the extreme MHC polymorphism observed is functionally important and results from constrained divergent evolution. MHCs have vital functions in immunology and homeostasis: cell-surface MHC class I molecules report cell status to CD8+ T cells, NKT cells and NK cells, and so play key roles in pathogen defence, as well as mediating smell recognition, mate choice, adverse drug reactions, and transplant rejection. MHC peptide specificity falls into several supertypes, each exhibiting commonality of binding. It seems likely that other supertypes exist that are relevant to other functions. Since comprehensive experimental characterization is intractable, structure-based bioinformatics is the only viable solution. We modelled functional MHC proteins by homology and used calculated Poisson-Boltzmann electrostatics, projected from the top surface of the MHC, as multi-dimensional descriptors, analysing them with state-of-the-art dimensionality reduction techniques and clustering algorithms. We were able to recover the three MHC loci as separate clusters and to identify clear sub-groups within them, which vindicates our choice of both data representation and clustering strategy. We expect this approach to make a profound contribution to the study of MHC polymorphism and its functional consequences and, by extension, to the study of other burgeoning structural systems, such as GPCRs.
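The analysis step described above (multi-dimensional descriptors, then dimensionality reduction, then clustering) can be illustrated with a minimal sketch. It is not the authors' method: principal component analysis and k-means are used here as simple stand-ins for the techniques the abstract leaves unnamed, the input is synthetic placeholder data rather than Poisson-Boltzmann electrostatics, and every function name is hypothetical.

```python
import math
import random


def synthetic_descriptors(per_group=20, dim=6, seed=0):
    """Placeholder data: three groups of noisy vectors (NOT real electrostatics)."""
    rng = random.Random(seed)
    rows = []
    for g in range(3):
        for _ in range(per_group):
            row = [rng.gauss(0.0, 0.3) for _ in range(dim)]
            row[g] += 4.0  # group-specific offset along one axis
            rows.append(row)
    return rows


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def top_components(centered, k, iters=200, seed=0):
    """Leading k principal directions via power iteration with deflation."""
    n, d = len(centered), len(centered[0])
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    comps = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient estimates the eigenvalue of the direction just found.
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        comps.append(v)
        # Deflate so the next pass converges to the next direction.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return comps


def project(centered, comps):
    """Coordinates of each row along the chosen principal directions."""
    return [[sum(a * b for a, b in zip(r, c)) for c in comps] for r in centered]


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(x) / len(members) for x in zip(*members)]
    return assign(points, centers)


def cluster_descriptors(rows, n_components=2, n_clusters=3):
    """Reduce the descriptors to a few dimensions, then cluster them."""
    centered = center(rows)
    low_dim = project(centered, top_components(centered, n_components))
    return kmeans(low_dim, n_clusters)


if __name__ == "__main__":
    labels = cluster_descriptors(synthetic_descriptors())
    print(labels)
```

In a real analysis, each row would hold the electrostatic potential of one modelled MHC protein sampled at a fixed set of points, and the linear projection would be replaced by whichever nonlinear latent variable model is actually being used.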
Bibliographical note: © 2017, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International licence: http://creativecommons.org/licenses/by-nc-nd/4.0/
- major histocompatibility complex
- probabilistic visualisation
- HLA supertypes
- multi-level Gaussian process latent variable model