Analysis on protein structures using statistical and bioinformatical methods

Yan, Aimin

This PhD dissertation focuses on the statistical analysis of protein structure data. The first research project addresses data mining and prediction of side chain orientation in protein structures. In this study, we find that the general side chain orientation can be viewed as a manifestation of the hydrophobic force. As part of this work, we also developed software for visualizing the general side chain vector and applied statistical machine learning methods to fit several models for predicting general side chain orientation. In the second project, we studied the motion of the partially assembled 30S ribosomal subunit using a coarse-grained elastic network model. In addition to the ribosome study, we applied principal component analysis to 176 NMR structure ensembles to extract the essential conformational changes and used them to validate the motions generated by the elastic network model. We also examined how different superposition methods affect the correspondence between the observed conformational changes and the simulated motions. The principal component shaving method is often used in microarray data analysis to cluster genes with similar expression patterns. In the third research project, we applied this method to cluster the structures within an NMR structure ensemble and demonstrated that it can identify clusters of similar structures.
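As an illustration of the principal component analysis step described in the abstract, the following is a minimal sketch rather than the dissertation's actual implementation. It assumes the models of an ensemble have already been superposed and are stored as a NumPy array of shape (models, atoms, 3); the function name `ensemble_pca` and the toy data are hypothetical.

```python
import numpy as np

def ensemble_pca(coords):
    """PCA of a structure ensemble (hypothetical helper, not from the dissertation).

    coords: array of shape (n_models, n_atoms, 3), assumed already superposed.
    Returns (modes, variances, fractions), where each row of `modes` is one
    principal component in flattened (n_atoms * 3) coordinate space.
    """
    coords = np.asarray(coords, dtype=float)
    n_models = coords.shape[0]
    # Flatten each model into a single coordinate vector.
    flat = coords.reshape(n_models, -1)
    # Remove the mean structure so that only deviations remain.
    centered = flat - flat.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal components.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s ** 2 / (n_models - 1)
    total = variances.sum()
    fractions = variances / total if total > 0 else np.zeros_like(variances)
    return vt, variances, fractions

# Toy ensemble: 5 models of 4 atoms displaced along a single direction.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 3))
direction = rng.normal(size=(4, 3))
amplitudes = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
toy = base[None] + amplitudes[:, None, None] * direction[None]

modes, variances, fractions = ensemble_pca(toy)
```

In this toy case all of the variance lies along one direction, so the first component recovers that direction up to sign. For a real NMR ensemble, the leading components would be the ones to compare against the modes of an elastic network model.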

Bioinformatics and computational biology; Biochemistry, Biophysics and molecular biology