09:51 min
July 16th, 2017
0:05
Title
0:40
Structure Search and Selection (SEARCH)
1:46
Multiple Sequence Alignment Analysis (ALIGN)
2:51
Structure Fitting and Analysis (FIT)
3:52
Principal Component Analysis (PCA)
5:24
Ensemble Normal Mode Analysis (eNMA)
6:52
Results: Bio3D-web Analysis of Adenylate Kinase
8:01
Conclusion
Transcript
The overall goal of Bio3D-web is to enable interactive analysis of sequence, structure, and conformational diversity of protein families. Bio3D-web is a new web app that can help us understand the relationships between sequence, structure, and dynamics within large protein families. The main advantage of Bio3D-web is that it provides interactive and reproducible online sequence, structure, and dynamics analysis requiring no prior programming knowledge on the user's part.
To input a structure for analysis, obtain the PDB ID of adenylate kinase by searching the Protein Data Bank. Enter the four-character PDB ID for adenylate kinase or paste a protein sequence in the text box of the input structure or sequence panel. Click the blue next hit selection button in the first panel, or simply scroll down to panel B, hit selection, for further analysis.
Make sure the limit total number of included structures slider is set to its maximum value to include all structures above the cutoff. Lower the inclusion BitScore cutoff to include more distantly related hits, or increase it to exclude them. Make sure the selected hits represent relevant structures by inspecting details of the table such as PDB name, species, and bound ligands.
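As a rough illustration of how the BitScore cutoff and the structure limit interact, the selection can be sketched in a few lines of Python. This is not Bio3D-web's own code: the hit IDs, scores, and the `select_hits` helper are hypothetical and exist only for this example.

```python
# Hypothetical search hits as (chain ID, bit score). IDs and scores are
# placeholders invented for this sketch, not real search results.
hits = [
    ("HIT1_A", 450.0),
    ("HIT2_A", 448.5),
    ("HIT3_A", 210.3),
    ("HIT4_A", 95.1),
    ("HIT5_A", 60.7),
]


def select_hits(hits, bitscore_cutoff, max_structures):
    """Keep hits at or above the cutoff, best first, capped at max_structures."""
    kept = [hit for hit in hits if hit[1] >= bitscore_cutoff]
    kept.sort(key=lambda hit: hit[1], reverse=True)
    return kept[:max_structures]


# A higher cutoff keeps only close relatives; a lower one admits distant hits.
strict = select_hits(hits, bitscore_cutoff=200.0, max_structures=10)
relaxed = select_hits(hits, bitscore_cutoff=50.0, max_structures=10)
print([chain for chain, _ in strict])
print([chain for chain, _ in relaxed])
```

Setting the structure limit to its maximum, as recommended above, corresponds to a `max_structures` value large enough that only the cutoff decides which hits are kept.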
Click the align tab to perform sequence alignment of the selected structures from the search tab. Alignment errors can arise, especially when analyzing more distantly related proteins, such as when sequence similarity is below 30%. Always double-check and correct the sequence alignment in the align tab. Review the alignment summary in panel A, alignment summary.
Make sure that the regions of interest are aligned and not masked by gaps in one or more structures. Click the blue next analysis button to perform sequence-based clustering analysis of the collected structures. Then, click the blue next conservation button to calculate the column-wise residue conservation.
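To make the idea of column-wise conservation concrete, here is a minimal Python sketch. It assumes a simple identity-based score (the fraction of sequences sharing the most common residue in each column), which may differ from the scoring Bio3D-web actually applies. The toy sequences and the helper names `column_conservation` and `gap_columns` are invented for this example.

```python
from collections import Counter

# Toy alignment with made-up sequences; '-' marks a gap.
alignment = [
    "MRIIL-GAPG",
    "MRIILLGAPG",
    "MKLVL-GPPG",
    "MRIVLLGAPG",
]


def column_conservation(alignment):
    """Per column, the fraction of sequences sharing the most common residue.

    Gaps count toward the denominator but never as the conserved residue.
    """
    n_seqs = len(alignment)
    scores = []
    for column in zip(*alignment):
        residues = [res for res in column if res != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / n_seqs)
    return scores


def gap_columns(alignment):
    """Indices of columns where at least one sequence has a gap."""
    return [i for i, column in enumerate(zip(*alignment)) if "-" in column]


scores = column_conservation(alignment)
print(scores)
print(gap_columns(alignment))
```

Columns reported by `gap_columns` are the ones worth checking by eye, since a region of interest masked by gaps in some structures cannot be compared across the whole set.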
Select the aligned structure sets to generate a plot of the residue conservation at each alignment position. Perform structure superimposition by entering the fit tab. Make sure the protein structures are superimposed on corresponding and relevant regions by visual inspection.
Click and drag the mouse over the structure to rotate, and scroll to zoom. Adjust the coloring of the structures by clicking on the color options. Then, click the blue next analysis button to perform structure-based clustering of the collected PDB structures.
Toggle the RMSD heat map in the plot options dropdown menu. Click the blue next RMSF button to view the structure variability of each residue with major secondary structure elements shown in the marginal regions of the X-axis. Perform principal component analysis by entering the PCA tab.
To visualize the principal components, toggle the show PC trajectory checkbox, which displays the motions described by the principal components in the in-browser visualization tool. Make sure principal component one is chosen from the first drop-down menu. To visualize the motions described by other principal components, choose the desired PC from the choose principal component drop-down menu.
Project the individual structures onto two selected PCs by clicking on the blue next plot button. Make sure PC on X-axis is set to one and PC on Y-axis to two. To project the structures onto other principal components, adjust the PC numbering accordingly.
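Projecting structures onto a principal component can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used by Bio3D-web: the coordinates are invented, the structures are assumed to be superimposed already, and the first PC is found with plain power iteration rather than a full eigendecomposition. All function names are hypothetical.

```python
import math

# Three hypothetical superimposed structures (four atoms each, made-up values).
structures = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 1.0, 0.0), (11.4, 2.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 2.0, 0.0), (11.4, 4.0, 0.0)],
]


def flatten(structure):
    """Turn a list of (x, y, z) atoms into one 3N-long coordinate vector."""
    return [coord for atom in structure for coord in atom]


def center(rows):
    """Subtract the mean structure from every row."""
    n = len(rows)
    mean = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, mean)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of the centered coordinate vectors."""
    n = len(centered)
    dim = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def first_pc(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


centered = center([flatten(s) for s in structures])
pc1 = first_pc(covariance(centered))
# Each structure's PC1 score is the dot product of its centered vector and PC1.
pc1_scores = [sum(x * p for x, p in zip(row, pc1)) for row in centered]
print([round(score, 3) for score in pc1_scores])
```

A conformer plot repeats the same projection for a second component and draws one point per structure. The sign of a principal component is arbitrary, so only the relative positions of the points carry meaning.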
Click on any individual points in the plot to label the structures. Alternatively, highlight one or more structures in the table PCA conformer plot annotation below the plot. Calculate the residue contributions to the individual principal components by clicking the blue next residue contributions button.
Plot the contributions for additional principal components by including the PC number in the choose principal component text box. Click the eNMA tab to initiate the normal modes calculation. Adjust the number of structures by lowering or increasing the cutoff for structure inclusion or exclusion.
Click the green run ensemble NMA button to start the normal mode analysis calculation. Scroll down to the second panel of the eNMA tab, called normal modes visualization, for visualization of the normal modes. Click the blue next fluctuations button to calculate the residue-wise fluctuations of structures selected for ensemble normal mode analysis.
Toggle the cluster by RMSD to color the fluctuation profiles by RMSD-based clustering. Toggle the spread lines checkbox to plot the grouped fluctuation profiles apart from each other. Then, click the blue next PCA versus NMA button to calculate the similarity between the individual normal modes and principal components.
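The narration does not say which similarity measure is used to compare normal modes with principal components. Two measures commonly used for this kind of comparison are the overlap between a pair of vectors and the root mean square inner product (RMSIP) between two sets of vectors. The Python sketch below illustrates both on made-up three-dimensional vectors, assuming each set is orthonormal after normalization. The function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def overlap(mode, pc):
    """Absolute cosine between one normal mode and one principal component."""
    return abs(dot(normalize(mode), normalize(pc)))


def rmsip(modes, pcs):
    """Root mean square inner product between two sets of vectors.

    Assumes each set is orthogonal; a value of 1 means identical subspaces.
    """
    modes = [normalize(m) for m in modes]
    pcs = [normalize(p) for p in pcs]
    total = sum(dot(m, p) ** 2 for m in modes for p in pcs)
    return math.sqrt(total / len(modes))


# Made-up vectors standing in for two normal modes and two PCs.
modes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
pcs = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(round(overlap(modes[0], pcs[0]), 3))
print(round(rmsip(modes, pcs), 3))
```

An overlap near 1 means a single normal mode points along the same direction as a principal component, while RMSIP summarizes how well the two sets of vectors span the same subspace.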
Click the blue next button at the bottom to generate a report of analysis results, user defined input, and parameter choices. The report can be downloaded in shareable PDF, Word, and HTML formats. The report also includes a link to revisit the analysis session at a later time, which can also be shared with collaborators.
Shown here are available PDB structures of adenylate kinase superimposed on the identified invariant core. Structures are colored according to RMSD-based clustering provided in the fit tab. Visualization of the principal components is available from the PCA tab to characterize the major conformational variations in the data set.
Here, the trajectory corresponding to the first principal component is shown in tube representation showing the large-scale closing motion of the protein. Structures are projected onto their first two principal components in a conformer plot showing a low-dimensional representation of the conformational variability. Each dot, or structure, is colored according to user-specified criteria, in this case, PCA-based clustering results.
Normal mode analysis in the eNMA tab suggests enhanced local and global dynamics for structures in the open state in comparison to the closed state. Bio3D-web can be used to explore sequence and structure sets and help understand general trends in the protein family of interest within minutes. Bio3D-web has helped me inform structural analysis of the kinesin superfamily.
Bio3D-web provides a complete workflow for fully customized investigation of sequence-structure-dynamics relationships. This workflow provides functionality for inter-conformer relationship mapping with PCA and quantitative comparison of local and global dynamics across large sets of heterogeneous structures using ensemble NMA. It is envisaged that most researchers will use Bio3D-web to understand general trends in their protein family of interest, which can then inform further specialized analyses.
Bio3D-web is therefore designed as a hypothesis-generating tool with shareable summary reports that include all user-defined parameters. The results from Bio3D-web are also downloadable and are fully compatible with the Bio3D R package for further in-depth analysis. Many structures are now available for a given protein family, often determined under different conditions.
Comparing these large numbers of structures can tell us about structural mechanisms critical for their function, such as ligand binding, catalysis, and allosteric regulation. Bio3D-web is widely accessible and easy to use without installing any dedicated software or learning a new programming language. Bio3D-web is supported on all major browsers.
Full source code and instructions for local setup are available free of charge.
A protocol for the online investigation of protein sequence-structure-dynamics relationships using Bio3D-web is presented.