Mason Archival Repository Service

Machine Learning Models of B-cell and T-cell Epitopes Using Sequence and Structure Information

dc.contributor.advisor Vaisman, Iosif
dc.contributor.author Sewsankar, Kiran
dc.creator Sewsankar, Kiran
dc.date.accessioned 2022-08-03T20:18:39Z
dc.date.available 2022-08-03T20:18:39Z
dc.date.issued 2022
dc.identifier.uri http://hdl.handle.net/1920/12963
dc.description.abstract Epitopes, the regions of antigens that are detected by the immune system, have garnered considerable scientific interest in recent years because of their potential to influence the development of novel medical countermeasures, such as epitope-based vaccines that can be both safer and more efficacious than those currently available. Innovative vaccines are urgently needed to keep pace with the ever-changing landscape of global infectious disease. The drive towards creating new vaccines and treatments is critical to preserving world health and will be aided by studying epitopes. Epitopes play an important role in the immune response: they are recognized by B-cells and T-cells and are the site of antibody binding. Identifying epitopes can therefore help researchers better understand how foreign disease agents cause illness and how the host immune system reacts against them. Traditional methods for identifying epitopes have centered on experimental structural studies, including X-ray crystallography and NMR techniques, which are time-consuming and costly. Bioinformatics and computational approaches have thus been explored to facilitate the epitope identification process. In this work, models of B-cell and T-cell epitopes were developed to assist in the prediction and classification of potential epitope protein sequences that have not yet been validated. Specifically, machine learning algorithms were trained on a diverse set of representative epitope/non-epitope feature vectors to reliably predict epitope sequences and residues. The feature vectors were composed of sequence-derived features based on reduced amino acid alphabets and n-grams, and structure-derived features based on Delaunay tessellation and amino acid propensity scores. Feature vectors were constructed according to the specific problem at hand, either linear or conformational epitope prediction.
The epitope sequence and structure data were obtained from publicly available databases, and several machine learning algorithms, including Random Forest, Gaussian Naïve Bayes, and Support Vector Machine, were applied to the descriptor space. The best-performing epitope prediction models trained here can be used to identify unknown epitope sequences or residues, consequently reducing the search space for candidate epitopes that will serve as the basis for the development of new vaccines and other medical countermeasures. The models are incorporated into TESSETOPE V1.0, a freely available, web-accessible API for epitope prediction at http://omics.gmu.edu/tessetope/.
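Illustrative note (not part of the dissertation record): the abstract mentions sequence-derived features based on reduced amino acid alphabets and n-grams. The following is a minimal sketch of how such a feature vector might be built. The six-group alphabet, the group labels, the choice of n = 2, and the function names are all assumptions made for illustration; the dissertation's actual groupings and parameters are not given in this record.

```python
from collections import Counter
from itertools import product

# Hypothetical reduced alphabet: the 20 standard amino acids mapped to
# 6 coarse classes. This grouping is an assumption for illustration only.
REDUCED_GROUPS = {
    "A": "AVLIMC",  # aliphatic / hydrophobic
    "R": "FWYH",    # aromatic
    "P": "STNQ",    # polar, uncharged
    "K": "KR",      # positively charged
    "D": "DE",      # negatively charged
    "G": "GP",      # glycine / proline
}
AA_TO_GROUP = {aa: label for label, members in REDUCED_GROUPS.items() for aa in members}


def reduce_sequence(seq):
    """Translate a peptide into the reduced alphabet, skipping unknown symbols."""
    return "".join(AA_TO_GROUP[aa] for aa in seq.upper() if aa in AA_TO_GROUP)


def ngram_features(seq, n=2):
    """Return relative n-gram frequencies over the reduced alphabet.

    The vector has len(REDUCED_GROUPS) ** n entries in a fixed, sorted order,
    so vectors from different peptides are directly comparable.
    """
    reduced = reduce_sequence(seq)
    alphabet = sorted(REDUCED_GROUPS)
    keys = ["".join(p) for p in product(alphabet, repeat=n)]
    counts = Counter(reduced[i:i + n] for i in range(len(reduced) - n + 1))
    total = max(sum(counts.values()), 1)  # avoid division by zero for short input
    return [counts[k] / total for k in keys]


if __name__ == "__main__":
    vec = ngram_features("AVKRDE", n=2)
    print(len(vec), sum(vec))
```

Vectors of this kind could then be passed to a classifier such as a random forest; that step is omitted here because the record does not describe the training setup.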
dc.format.extent 134 pages
dc.language.iso en
dc.rights Copyright 2022 Kiran Sewsankar
dc.subject Bioinformatics
dc.subject epitope
dc.subject epitope prediction
dc.subject epitope-based vaccines
dc.subject Immunoinformatics
dc.subject machine learning
dc.title Machine Learning Models of B-cell and T-cell Epitopes Using Sequence and Structure Information
dc.type Dissertation
thesis.degree.name Ph.D. in Bioinformatics and Computational Biology
thesis.degree.level Ph.D.
thesis.degree.discipline Bioinformatics and Computational Biology
thesis.degree.grantor George Mason University

