Protein wild-type and mutant ensemble database
Protein structures are being determined and deposited into the Protein Data Bank (PDB) at an increasing rate. In this work, we organize all the protein structures in the PDB into a wild-type and mutant structure database, which groups the wild-type and mutant structures of the same protein together. One direct benefit of the database is easy access to the structure ensembles of all proteins. Such ensembles are known to be highly useful for representing the native states of proteins and for understanding their functions. For each protein, mutants are sorted by the number of mutations and by the location(s) of the mutations. What distinguishes our work from other mutation databases is that it is structure-based and includes all the existing structures in the PDB. The database will be kept synchronized with the PDB. As an application, we carry out a statistical analysis, based on experimental structures, of the effects of mutations on both protein structure and protein dynamics. A key question we address in this work is: is it valid to use mutant structures (or variants from different species) as samples of the native state of a given protein? Our results indicate that mutations can cause significant changes in structure and dynamics, larger than commonly expected. This implies that caution must be taken when mutant structures are considered for inclusion as representative samples of the conformation space of a given protein.
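The grouping and ordering described above can be illustrated with a minimal sketch. It is not the database's actual implementation: the names `Structure`, `find_mutations`, and `build_ensembles`, the toy identifiers and sequences, and the restriction to equal-length sequences (point substitutions only) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Structure(NamedTuple):
    """Hypothetical record for one deposited structure."""
    entry_id: str   # toy identifier, not a real PDB ID
    protein: str    # toy protein name used as the grouping key
    sequence: str   # one-letter amino-acid sequence


def find_mutations(wild_type: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (position, wild-type residue, variant residue), 1-based.

    Assumption: both sequences have equal length, so only point
    substitutions are detected (no insertions or deletions).
    """
    if len(wild_type) != len(variant):
        raise ValueError("this sketch handles substitutions only")
    return [
        (pos, wt, mut)
        for pos, (wt, mut) in enumerate(zip(wild_type, variant), start=1)
        if wt != mut
    ]


def build_ensembles(structures, wild_types):
    """Group structures by protein, then order each group by
    number of mutations and, as a tie-breaker, mutation positions."""
    groups = defaultdict(list)
    for s in structures:
        mutations = find_mutations(wild_types[s.protein], s.sequence)
        groups[s.protein].append((s.entry_id, mutations))
    for entries in groups.values():
        entries.sort(key=lambda e: (len(e[1]), [m[0] for m in e[1]]))
    return dict(groups)


# Toy data: one protein with a wild-type entry and three mutants.
wild_types = {"toyA": "MKTAYIAK"}
structures = [
    Structure("s1", "toyA", "MKTAYIAK"),  # wild type
    Structure("s2", "toyA", "MKTGYIAR"),  # double mutant: A4G, K8R
    Structure("s3", "toyA", "MKTAYIAR"),  # single mutant: K8R
    Structure("s4", "toyA", "MATAYIAK"),  # single mutant: K2A
]

ensembles = build_ensembles(structures, wild_types)
for entry_id, mutations in ensembles["toyA"]:
    labels = [f"{wt}{pos}{mut}" for pos, wt, mut in mutations]
    print(entry_id, labels or ["wild type"])
```

With this ordering the wild-type entry comes first, followed by single mutants ordered by position, then the double mutant. A real pipeline would need sequence alignment to handle insertions, deletions, and numbering differences between entries.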