Structural and Topological Variations in Amino Acids Encoded by Synonymous Codons




Wang, Shengyuan




This dissertation explores the relationship between structural and topological properties of proteins using a computational geometry approach. A representative nonredundant dataset of 10,220 individual protein chains with known structures was created, and each amino acid residue in the set was matched to its corresponding codon. Delaunay tessellation of all proteins in the dataset yielded four-body statistical potentials in both a 20-letter amino acid alphabet and a 61-letter codon alphabet. Compositional, geometric, and topological patterns in the codon-based representation were identified, and the influence of synonymous codons on protein structure was assessed. Both the amino acid and the codon-based potentials were extensively tested for reliability and consistency, and their performance was evaluated in a number of applications. A computational mutagenesis approach, in which the new potentials were used in machine learning models for predicting changes in protein fitness and activity caused by mutations, demonstrated high prediction accuracy. In addition, a new method was developed for accurate identification of kinked α-helices using both geometric and topological parameters.
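
To make the two alphabets concrete, the following minimal sketch builds the standard genetic code, confirms that it contains 61 sense codons encoding 20 amino acids, and counts how many distinct residue quadruplets a four-body potential would have to cover in each alphabet. It is an illustration only, not the dissertation's code: the names `CODON_TABLE`, `SENSE_CODONS`, and `quadruplet_types` are hypothetical, and the count assumes that a quadruplet is identified by its unordered composition, which the abstract does not state.

```python
from itertools import product
from math import comb

# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with bases cycling in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Drop the three stop codons ('*') to obtain the codon alphabet.
SENSE_CODONS = {codon: aa for codon, aa in CODON_TABLE.items() if aa != "*"}


def synonymous_codons(amino_acid):
    """Return all sense codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in SENSE_CODONS.items() if aa == amino_acid)


def quadruplet_types(alphabet_size):
    """Count unordered quadruplets with repetition over an alphabet.

    Assumption: a four-body potential treats a quadruplet as a multiset of
    four letters, so the count is C(n + 3, 4).
    """
    return comb(alphabet_size + 3, 4)


if __name__ == "__main__":
    print(len(SENSE_CODONS), len(set(SENSE_CODONS.values())))
    print(synonymous_codons("L"))
    print(quadruplet_types(20), quadruplet_types(61))
```

Under that assumption, moving from the amino acid alphabet to the codon alphabet increases the number of quadruplet types from 8,855 to 635,376, which gives a sense of why a large structural dataset is needed to estimate codon-based potentials.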



Bioinformatics, Computational mutagenesis, Delaunay tessellation, Kinked alpha-helix, Machine learning, Synonymous mutations