Enhancing Linear B-cell Epitope Prediction Through Organism-Specific Training

  • Jodie S.M. Ashford

Student thesis: Doctoral Thesis, Doctor of Philosophy


B-cell epitopes play a crucial role in immune responses, with their identification being
a vital activity for numerous medical endeavours, including developing diagnostic tests,
therapeutic antibodies, and vaccines. Linear B-cell epitopes (LBCE) are often prioritised
as targets for epitope predictors over conformational epitopes due to the availability
of data, lower experimental complexity for determination and their stability in various
conditions, facilitating easier storage and transport. Despite advancements in computational
techniques, existing LBCE prediction methods still exhibit suboptimal performance. This
thesis explores the efficacy of organism-specific training in improving the accuracy and
efficiency of linear B-cell epitope prediction models.

Most LBCE prediction tools adopt a generalist approach, training models on large heterogeneous
data sets from numerous organisms to develop predictors that are applicable
across a wide variety of pathogens. In contrast, this work investigates the training of
bespoke, organism-specific LBCE prediction models. The main hypothesis posits
that using smaller, but potentially more directly relevant, organism-specific data sets for
training could yield predictors that demonstrate superior predictive performance for new
epitopes of the target organism over a single generalist model.

The main research objectives of this work were: to investigate whether training linear
B-cell epitope prediction models using organism-specific data leads to improved prediction
performance compared to models trained on heterogeneous or hybrid data, and against
well-established epitope predictors from the literature; and to investigate the limits of this
organism-specific training approach by systematically quantifying the effect of the amount
of training data on the performance of the models developed.
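
The comparison described above can be sketched as follows. This is a minimal illustration only, not the pipeline used in the thesis: the `Peptide` record, the function names, and the organism labels are hypothetical, and the definitions used here (heterogeneous data excludes the target organism, hybrid data combines both) are assumptions. The `subsample` helper illustrates one way to vary the amount of training data when studying its effect on performance.

```python
import random
from collections import namedtuple

# Hypothetical record: a labelled peptide and the organism it came from.
Peptide = namedtuple("Peptide", ["sequence", "organism", "is_epitope"])


def build_training_sets(records, target_organism):
    """Split labelled peptides into the three training regimes being compared.

    Assumption: 'heterogeneous' excludes the target organism and
    'hybrid' is the union of the other two sets.
    """
    specific = [r for r in records if r.organism == target_organism]
    heterogeneous = [r for r in records if r.organism != target_organism]
    return {
        "organism_specific": specific,
        "heterogeneous": heterogeneous,
        "hybrid": specific + heterogeneous,
    }


def subsample(records, fraction, seed=0):
    """Draw a reproducible random subset containing a fraction of the records."""
    if not records:
        return []
    rng = random.Random(seed)
    k = min(len(records), max(1, round(len(records) * fraction)))
    return rng.sample(records, k)


# Toy data with made-up sequences and organism names.
records = [
    Peptide("AKLVDT", "Organism A", True),
    Peptide("GGSPQE", "Organism A", False),
    Peptide("MTRLLK", "Organism B", True),
    Peptide("PPQWEN", "Organism C", False),
]

sets = build_training_sets(records, "Organism A")
half = subsample(sets["hybrid"], 0.5)
```

Each of the three sets would then be used to train a separate model, with all models evaluated on held-out data from the target organism.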

Results indicate that organism-specific training significantly enhances the prediction performance
of linear B-cell epitopes, even for organisms with limited training data. Comparative
analysis demonstrates the superiority of organism-specific models over heterogeneous, hybrid
and other conventional predictors, highlighting the potential of tailored modelling
approaches in epitope prediction.

Date of Award: Sept 2023
Original language: English
Supervisor: Felipe Campelo (Supervisor) & Aniko Ekárt (Supervisor)


  • Epitope Prediction
  • Machine Learning
  • Computational Biology
