Mason Archival Repository Service

Computational Geometry Approach to the Analysis of Organism-Dependent Features in Protein Structures

dc.contributor.author Luo, Yong
dc.creator Luo, Yong
dc.date 2009-12-11
dc.date.accessioned 2010-02-16T20:14:08Z
dc.date.available NO_RESTRICTION en_US
dc.date.available 2010-02-16T20:14:08Z
dc.date.issued 2010-02-16T20:14:08Z
dc.identifier.uri https://hdl.handle.net/1920/5691
dc.description.abstract Recent research on organism-dependent features of proteins has focused mostly on the analysis of their primary sequences, while studies of their structural differences are rare. General organism-dependent structural features in proteins are obscured by the strong sequence and structural similarities between homologous proteins across different genomes. In this work we implemented protein structure descriptors based on Delaunay tessellation of the structures. Delaunay tessellation identifies quadruplets of nearest-neighbor residues, for which we enumerated all possible residue compositions using the full 20-letter alphabet as well as a number of reduced alphabets. Feature vectors based on these descriptors were generated to represent the organism-dependent features of an individual protein structure. Protein and domain structures from a series of organisms were collected. We applied supervised machine learning techniques to develop classifiers for proteins from different organisms. This result strongly indicates the presence of organism-dependent signals in protein structure. The discrimination capability of the machine learning models depends strongly on the reduced residue alphabet used in the modeling. Comparison of model performance across different amino acid alphabet reduction schemes and organism pairs provides novel insights into the evolution of protein structure.
dc.language.iso en_US en_US
dc.subject protein structure en_US
dc.subject computational geometry en_US
dc.subject residue alphabet reduction en_US
dc.subject Delaunay tessellation en_US
dc.subject organism-dependent features en_US
dc.subject Machine learning en_US
dc.title Computational Geometry Approach to the Analysis of Organism-Dependent Features in Protein Structures en_US
dc.type Dissertation en
dc.description.note Supporting data in the form of Microsoft Excel documents is included. en_US
thesis.degree.name Doctor of Philosophy in Bioinformatics and Computational Biology en_US
thesis.degree.level Doctoral en
thesis.degree.discipline Bioinformatics and Computational Biology en
thesis.degree.grantor George Mason University en
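
The descriptor construction outlined in the abstract (quadruplets of neighboring residues, counted by residue composition under a full or reduced alphabet) can be illustrated with a minimal sketch. It is not the dissertation's implementation. It assumes the quadruplets have already been computed elsewhere (for example, from the simplices of a 3D Delaunay tessellation of residue coordinates), that a composition is order-independent, and that the two-group reduction REDUCED is a hypothetical scheme chosen only for illustration. The names composition_index and feature_vector are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import comb

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical two-group reduction, for illustration only
# (not one of the reduction schemes studied in the dissertation).
HYDROPHOBIC = set("AVLIMFWC")
REDUCED = {aa: ("H" if aa in HYDROPHOBIC else "P") for aa in AMINO_ACIDS}


def composition_index(alphabet):
    """Map each unordered 4-letter composition over `alphabet` to a vector position."""
    letters = sorted(set(alphabet))
    return {c: i for i, c in enumerate(combinations_with_replacement(letters, 4))}


def feature_vector(sequence, quadruplets, mapping=None):
    """Count quadruplet compositions.

    `sequence` is a string of one-letter residue codes, `quadruplets` is an
    iterable of 4-tuples of residue indices (assumed to come from a
    tessellation computed elsewhere), and `mapping` optionally maps each
    residue to a reduced-alphabet letter.
    """
    mapping = mapping or {aa: aa for aa in AMINO_ACIDS}
    index = composition_index(mapping.values())
    counts = Counter(
        tuple(sorted(mapping[sequence[i]] for i in quad)) for quad in quadruplets
    )
    vector = [0] * len(index)
    for composition, n in counts.items():
        vector[index[composition]] = n
    return vector


# Toy input: five residues and two made-up quadruplets of residue indices.
toy_sequence = "ACDKL"
toy_quadruplets = [(0, 1, 2, 3), (1, 2, 3, 4)]

full = feature_vector(toy_sequence, toy_quadruplets)
reduced = feature_vector(toy_sequence, toy_quadruplets, REDUCED)

# Multisets of size 4 drawn from k letters number C(k + 3, 4).
assert len(full) == comb(20 + 3, 4)
assert len(reduced) == comb(2 + 3, 4)
```

Under these assumptions the full alphabet yields a vector with 8855 entries, while a k-letter reduced alphabet yields C(k + 3, 4) entries, which is why the choice of reduction scheme changes the size and content of the feature vector so much.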

