Automated methods for NMR structure determination of proteins are continuously becoming more robust. However, current methods addressing larger, more complex targets rely on analyzing 6–10 complementary spectra, suggesting the need for alternative approaches. Here, we describe 4D-CHAINS/autoNOE-Rosetta, a complete pipeline for NOE-driven structure determination of medium- to larger-sized proteins. The 4D-CHAINS algorithm analyzes two 4D spectra recorded using a single, fully protonated protein sample in an iterative ansatz where common NOEs between different spin systems supplement conventional through-bond connectivities to establish assignments of sidechain and backbone resonances at high levels of completeness and with a minimum error rate. The 4D-CHAINS assignments are then used to guide automated assignment of long-range NOEs and structure refinement in autoNOE-Rosetta. Our results on four targets ranging in size from 15.5 to 27.3 kDa illustrate that the structures of proteins can be determined accurately and in an unsupervised manner in a matter of days.
The Arg119 side chain must be properly situated for efficient catalysis, but for other debilitating variants, the functional defects could be explained by structural perturbations and/or associated decamer destabilization rather than direct effects. This underscores the importance of our comprehensive approach. A remarkable new finding was the preference of the reductant for decamers. Antioxid. Redox Signal. 28, 521-536.
Highly accurate protein structures reveal how large bond angle distortions enable “disallowed” protein folding transitions.
Ensembles of protein structures are increasingly used to represent the conformational variation of a protein as determined by experiment and/or by molecular simulations, as well as uncertainties that may be associated with structure determinations or predictions. Making the best use of such information requires the ability to quantitatively compare entire ensembles. For this reason, we recently introduced the Ensemblator (Clark et al., Protein Sci 2015; 24:1528), a novel approach to compare user-defined groups of models, in residue level detail. Here we describe Ensemblator v3, an open-source program that employs the same basic ensemble comparison strategy but includes major advances that make it more robust, powerful, and user-friendly. Ensemblator v3 carries out multiple sequence alignments to facilitate the generation of ensembles from non-identical input structures, automatically optimizes the key global overlay parameter, optionally performs "ensemble clustering" to classify the models into subgroups, and calculates a novel "discrimination index" that quantifies similarities and differences, at residue or atom level, between each pair of subgroups. The clustering and automatic options mean that no preknowledge about an ensemble is required for its analysis. After describing the novel features of Ensemblator v3, we demonstrate its utility using three case studies that illustrate the ease with which complex analyses are accomplished, and the kinds of insights derived from clustering into subgroups and from the detailed information that locates significant differences. 
The Ensemblator v3 enhances the structural biology toolbox by greatly expanding the kinds of problems to which this ensemble comparison strategy can be applied.
Keywords: protein structure comparison; superposition; clustering; ensemble clustering; python; NMR ensemble; Rosetta; template-based modeling; structure prediction
Broad Statement: To compare ensembles of protein structures with residue level detail and without losing the ensemble information, we have developed Ensemblator v3. It is a versatile, user-friendly, and open-source tool that allows the facile assembly of related protein models to create an ensemble, the automatic
Ultra-high resolution protein crystal structures have been considered as relatively reliable sources for defining details of protein geometry, such as the extent to which the peptide unit deviates from planarity. Chellapa and Rose (Proteins 2015; 83:1687) recently called this into question, reporting that for a dozen representative protein structures determined at 1 Å resolution, the diffraction data could be equally well fit with models restrained to have highly planar peptides, i.e. having a standard deviation of the ω torsion angles of only 1° instead of the typically observed value of 6°. Here, we document both conceptual and practical shortcomings of that study and show that the more tightly restrained models are demonstrably incorrect and do not fit the diffraction data equally well. We emphasize the importance of inspecting electron density maps when investigating the agreement between a model and its experimental data. Overall, this report reinforces that modern standard refinement protocols have been well-conceived and that ultra-high resolution protein crystal structures, when evaluated carefully and used with an awareness of their levels of coordinate uncertainty, are powerful sources of information for providing reliable information about the details of protein geometry.
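The quantity at issue in the abstract above, the spread of peptide ω torsion angles about the planar trans value of 180°, can be illustrated with a minimal sketch. This is not code from the study: the function names `dihedral`, `planarity_deviation`, and `omega_spread` are hypothetical, the coordinates are assumed to be plain (x, y, z) tuples for the Cα(i), C(i), N(i+1), Cα(i+1) atoms, and cis peptides (ω near 0°) are assumed to have been excluded beforehand.

```python
import math
import statistics

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four 3D points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def planarity_deviation(omega):
    """Signed deviation of a trans omega angle from 180 degrees, wrapped to [-180, 180)."""
    return (omega % 360.0) - 180.0

def omega_spread(omegas):
    """Population standard deviation of the deviations from planarity (assumes trans peptides only)."""
    return statistics.pstdev(planarity_deviation(o) for o in omegas)
```

For example, `omega_spread([179, -179, 180, 174, -174])` treats the angles as deviations of -1, 1, 0, -6, and 6 degrees rather than as values scattered across the ±180° wrap, which is the reason for measuring the spread relative to 180° instead of taking a naive standard deviation of the raw angles.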
Here, we report the solution NMR structure of the isolated thumb subdomain of HIV-1 reverse transcriptase (RT). A detailed comparison of the current structure with dozens of the highest resolution crystal structures of this domain in the context of the full-length enzyme reveals that the overall structures are very similar, with only two regions exhibiting local conformational differences. The C-terminal capping pattern of the αH helix is subtly different, and the loop connecting the αI and αJ helices in the p51 chain of the full-length p51/p66 heterodimeric RT differs from our NMR structure due to unique packing interactions in mature RT. Overall, our data show that the thumb subdomain folds independently and essentially the same in isolation as in its natural structural context.
The prediction of absorption, distribution, metabolism, excretion, and toxicity (ADMET) of small molecules from their molecular structure is a central problem in medicinal chemistry with great practical importance in drug discovery. Creating predictive models conventionally requires substantial trial-and-error for the selection of molecular representations, machine learning (ML) algorithms, and hyperparameter tuning. A generally applicable method that performs well on all datasets without tuning would be of great value but is currently lacking. Here, we describe Pareto-Optimal Embedded Modeling (POEM), a similarity-based method for predicting molecular properties. POEM is a non-parametric, supervised ML algorithm developed to generate reliable predictive models without need for optimization. POEM's predictive strength is obtained by combining multiple different representations of molecular structures in a context-specific manner, while maintaining low dimensionality. We benchmark POEM relative to industry-standard ML algorithms and published results across 17 classification tasks. POEM performs well in all cases and reduces the risk of overfitting.
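The abstract does not spell out how POEM combines representations, so the following is only a hedged illustration of the general idea of Pareto optimality applied to similarity-based prediction, not the published algorithm. It assumes each training molecule has already been scored for similarity to a query under several representations (one tuple per molecule), keeps the molecules that no other molecule beats in every representation, and averages their labels. The names `dominates`, `pareto_front`, and `predict` are hypothetical.

```python
def dominates(a, b):
    """True if similarity tuple a is at least as high as b everywhere and strictly higher somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(sim_rows):
    """Indices of training molecules whose similarity tuples are not dominated by any other."""
    return [
        i for i, a in enumerate(sim_rows)
        if not any(dominates(b, a) for j, b in enumerate(sim_rows) if j != i)
    ]

def predict(sim_rows, labels):
    """Mean label of the Pareto-optimal neighbours (a score in [0, 1] for 0/1 labels)."""
    front = pareto_front(sim_rows)
    return sum(labels[i] for i in front) / len(front)

# Toy data (invented): similarities of four training molecules to one query
# under two hypothetical representations, with binary activity labels.
sims = [(0.9, 0.2), (0.5, 0.8), (0.4, 0.1), (0.3, 0.7)]
labels = [1, 0, 1, 0]
front = pareto_front(sims)
score = predict(sims, labels)
```

In this toy case the neighbour set is chosen without a tunable k or distance weighting, which is one way a similarity-based method could avoid hyperparameter optimization; whether POEM does it this way is an assumption here.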