Protein sequence-structure-dynamics-function relationships: The close association of dynamics with protein function
Abstract
The intrinsic dynamics of globular proteins, a consequence of protein structure and geometry, is key to understanding their function. The view of native protein structures has recently shifted from a single rigid, static object to an ensemble of coexisting conformations. Moreover, allostery, the transmission of signals from a distant site to the active site, is a direct outcome of the detailed dynamics of a given protein. Investigating how dynamics controls protein function is one of the overall aims of our studies. Probing protein function requires combining three types of data: sequence, structure and dynamics, which together define function. The abundant protein sequence data in repositories such as UniProt and Pfam strongly complement the rich structural data in the PDB. Exploiting this wealth of information and coupling it with molecular simulations that provide information on protein dynamics facilitates the understanding and prediction of protein function, which is the underlying motivation and overall objective of the present work.
The dynamic behavior of proteins is often altered upon the binding of ligands, partner proteins or other biological macromolecules such as DNA and RNA. This work describes the influence of binding on the intrinsic dynamics of proteins through studies of homooligomeric protein assemblies, which are composed of multiple subunits of the same protein. Specifically, it compares the dynamics of functionally important residues of a single subunit in isolation with those of the same subunit in its assembled form. Next, we present a systematic investigation of the extent of similarity between the protein dynamic communities obtained from molecular dynamics and those obtained from a simpler molecular simulation method, elastic network models. The focus is on the separate dynamic communities: groups of residues that are highly cohesive in their motions and move like a rigid unit. Elastic network models are models of protein cohesion and are therefore particularly appropriate for this task. We also show that they can effectively capture the differences in community distributions between the mutant and wild-type forms of T4 lysozyme. Finally, a machine learning classification method is developed in which protein dynamics information is coupled with structural, evolutionary and physicochemical properties to predict regulatory and functional binding sites.
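As an illustration of the kind of elastic network calculation referred to above, the following is a minimal sketch of one common variant, the Gaussian network model. It assumes one node per residue (for example at the C-alpha position) and a 7 Å contact cutoff. The function name `gnm_cross_correlations`, the cutoff value and the toy coordinates are illustrative assumptions and are not taken from this work; the community detection step, which would cluster the resulting correlation matrix, is not shown.

```python
import numpy as np


def gnm_cross_correlations(coords, cutoff=7.0):
    """Normalized residue-residue cross-correlations from a Gaussian network model.

    coords: (N, 3) array of node positions, one per residue (assumed C-alpha).
    cutoff: contact distance in the same units as coords (7.0 is an assumption).
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)

    # Pairwise distances between all nodes.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Kirchhoff (connectivity) matrix: -1 for each contact, node degree on the diagonal.
    contact = (dist <= cutoff) & ~np.eye(n, dtype=bool)
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))

    # The pseudo-inverse is proportional to the covariance of residue fluctuations.
    cov = np.linalg.pinv(kirchhoff)

    # Normalize to correlations in [-1, 1].
    amplitude = np.sqrt(np.diag(cov))
    return cov / np.outer(amplitude, amplitude)


# Toy input (hypothetical): six nodes on a straight line, 3.8 units apart.
toy_coords = np.array([[3.8 * i, 0.0, 0.0] for i in range(6)])
toy_corr = gnm_cross_correlations(toy_coords)
print(np.round(toy_corr, 2))
```

Residues whose pairwise correlations are strongly positive tend to move together, which is the property a dynamic community groups on. In practice the coordinates would come from a PDB structure instead of a toy chain.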
This work emphasizes the collective interplay between sequence, structure and dynamics as the key to understanding protein function. It also highlights the use of simplified molecular representations for simulations, namely the elastic network model, which can often serve as a substitute for atomistic molecular dynamics. The machine learning models developed as part of this work underscore the importance of including protein dynamics to improve predictions. The methods developed have potential practical applications, for instance as predictive models for identifying hot spot residues for site-directed mutagenesis, for predicting sites where potential therapeutics could bind to restore dynamics and other disturbed functions, or for suggesting ways to generate new functions.