A Probabilistic Search Algorithm for Protein-protein Docking




Hashmi, Irina




Computational methods able to assist or complement wet-laboratory experiments in structural characterization of molecular assemblies promise to provide detailed insight into molecular interactions, drug design, and biological function in the living and diseased cell. Methods that predict three-dimensional structures of protein-protein assemblies are abundant in computational structural biology. However, challenges remain in accurately detecting the interacting interface between participating units in an assembly. For search algorithms, the task of predicting the biologically active structure of an assembly poses particular challenges due to the high dimensionality of the search space where potentially relevant assembly configurations lie. The work presented in this thesis is a step towards developing a new set of computational techniques and algorithms for structural characterization of protein-protein assemblies. Specifically, the work here focuses on modeling the three-dimensional quaternary structure of a protein dimer, a complex formed by interactions between two participating protein chains. This problem is commonly known as protein-protein docking. This work addresses the problem of rigid protein-protein docking, where the given unbound structures of the protein units about to dimerize are expected to be the same as the bound ones after dimerization. In addition to techniques proposed to alleviate certain computational aspects related to finding the right docking interface in protein dimers, this thesis proposes a new probabilistic search algorithm that employs both geometry and energy to sample low-energy configurations of a protein dimer. Analysis of evolutionary conservation and a geometric treatment of the molecular surface are combined in order to identify potentially relevant contact interfaces between the two units in the dimer.
Docking is focused only on evolutionarily conserved, geometrically complementary regions between the units' molecular surfaces, resulting in a narrower search space of rigid-body motions matching only such regions. This treatment is the first contribution of this work. The second contribution is a probabilistic search algorithm that efficiently explores the space of rigid-body motions corresponding to local minima in an energy function capturing interactions in a dimeric configuration. The proposed algorithm is an adaptation of the Basin Hopping (BH) framework. This thesis details the implementation and careful analysis of the components that result in an effective BH algorithm for rigid protein-protein docking. Application to a diverse list of proteins shows that the algorithm is able to recover the native dimeric configuration as well as produce other relevant minima near the native configuration of a given dimer. A detailed analysis is presented that shows the algorithm reproduces known properties of the BH framework in other contexts and applications, most notably the relationship between the adjacency of consecutively obtained local minima and their proximity to the known native dimeric configuration. Taken together, the results presented show that the algorithm can be employed as a first stage in a computational docking protocol to sample low-energy near-native dimeric configurations that can then be further refined and discriminated with more computationally intensive optimization protocols.
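
To make the Basin Hopping idea concrete, the following is a minimal sketch of the generic framework: perturb the current local minimum, minimize locally, and accept or reject the new minimum with a Metropolis criterion. It is not the thesis implementation. The names `toy_energy`, `local_minimize`, and `basin_hopping`, the parameters `jump` and `temperature`, and the rugged toy function are all assumptions made for illustration; a docking application would instead apply rigid-body perturbations to one unit and score configurations with an interaction energy function.

```python
import math
import random


def toy_energy(x):
    """Hypothetical rugged energy surface (not a docking energy function)."""
    return sum(xi * xi + 2.0 * (1.0 - math.cos(3.0 * xi)) for xi in x)


def local_minimize(energy, x, rng, step=0.1, iters=200):
    """Greedy random descent; a stand-in for a real local minimizer."""
    best, best_e = list(x), energy(x)
    for _ in range(iters):
        cand = [xi + rng.uniform(-step, step) for xi in best]
        cand_e = energy(cand)
        if cand_e < best_e:  # keep only downhill moves
            best, best_e = cand, cand_e
    return best, best_e


def basin_hopping(energy, x0, n_hops=100, jump=1.0, temperature=1.0, seed=0):
    """Generic Basin Hopping: returns the best minimum and the accepted trajectory."""
    rng = random.Random(seed)
    current, current_e = local_minimize(energy, x0, rng)
    best, best_e = current, current_e
    minima = [(current, current_e)]  # trajectory of accepted local minima
    for _ in range(n_hops):
        # Perturbation: jump out of the current basin.
        perturbed = [xi + rng.uniform(-jump, jump) for xi in current]
        # Minimization: map the perturbed point to a nearby local minimum.
        cand, cand_e = local_minimize(energy, perturbed, rng)
        # Metropolis criterion: always accept downhill, sometimes uphill.
        delta = cand_e - current_e
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = cand, cand_e
            minima.append((current, current_e))
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e, minima


if __name__ == "__main__":
    start = [2.5, -1.5, 0.5]
    best, best_e, minima = basin_hopping(toy_energy, start, n_hops=50, seed=1)
    print("start energy:", toy_energy(start))
    print("best energy: ", best_e)
    print("accepted minima:", len(minima))
```

In this sketch the list of accepted minima is the trajectory of consecutively obtained local minima; the size of the perturbation (`jump` here) controls how far apart adjacent minima in that trajectory tend to be.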



Protein-protein docking, Evolutionary conservation, Basin hopping, Geometric hashing, Rigid-body transformation