Abstract
Motivation: In silico identification of linear B-cell epitopes represents an important step in the development of diagnostic tests and vaccine candidates, by providing potential high-probability targets for experimental investigation. Current predictive tools were developed under a generalist approach, training models with heterogeneous datasets to develop predictors that can be deployed for a wide variety of pathogens. However, continuous advances in processing power and the increasing amount of epitope data for a broad range of pathogens indicate that training organism- or taxon-specific models may become a feasible alternative, with unexplored potential gains in predictive performance.

Results: This article shows how organism-specific training of epitope prediction models can yield substantial performance gains across several quality metrics when compared to models trained with heterogeneous and hybrid data, and with a variety of widely used predictors from the literature. These results suggest a promising alternative for the development of custom-tailored predictive models with high predictive power, which can be easily implemented and deployed for the investigation of specific pathogens.
Original language | English |
---|---|
Pages (from-to) | 4826–4834 |
Number of pages | 9 |
Journal | Bioinformatics |
Volume | 37 |
Issue number | 24 |
Early online date | 21 Jul 2021 |
Publication status | Published - 15 Dec 2021 |
Bibliographical note
© The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.