Metalloproteins are proteins capable of binding one or more metal ions, which may be required for their biological function, for the regulation of their activities, or for structural purposes. Genome sequencing projects have provided a huge number of protein primary sequences, but, even though several elaborate analyses and annotations have been enabled by a rich and ever-increasing portfolio of bioinformatic tools, metal-binding properties remain difficult to predict as well as to investigate experimentally. Consequently, present knowledge about metalloproteins is only partial. The present bioinformatic research proposes a strategy to answer the question of how many and which proteins encoded in the human genome may require zinc for their physiological function. This is achieved by a combination of approaches, which include: (i) searching the proteome for zinc-binding patterns that are, in turn, obtained from all available X-ray data; (ii) using libraries of metal-binding protein domains based on multiple sequence alignments of known metalloproteins obtained from the Pfam database; and (iii) mining the annotations of human gene sequences, which are based on any type of information available. It is found that 1684 proteins in the human proteome are independently identified by all three approaches as zinc-proteins, 746 are identified by two, and 777 are identified by only one method. By assuming that all proteins identified by at least two approaches are truly zinc-binding and inspecting the proteins identified by a single method, it can be proposed that ca. 2800 human proteins are potentially zinc-binding in vivo, corresponding to 10% of the human proteome, with an uncertainty of 400 sequences. Available functional information suggests that the large majority of human zinc-binding proteins are involved in the regulation of gene expression. The most abundant class of zinc-binding proteins in humans is that of zinc-fingers, with Cys4 and Cys2His2 being the most common types of coordination environment.
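The consensus step described in this abstract (counting proteins found by all three, two, or only one approach) can be illustrated with a minimal Python sketch. The function names and the toy protein identifiers below are hypothetical and are not taken from the paper; the sketch only shows how agreement between three independent hit lists could be tallied, not how the authors implemented it.

```python
from collections import Counter


def _votes(hit_sets):
    """Count, for every protein, how many methods identified it."""
    votes = Counter()
    for hits in hit_sets:
        votes.update(set(hits))  # each method votes at most once per protein
    return votes


def consensus_counts(*hit_sets):
    """Map k -> number of proteins identified by exactly k methods."""
    by_support = Counter(_votes(hit_sets).values())
    return {k: by_support.get(k, 0) for k in range(len(hit_sets), 0, -1)}


def confident_set(*hit_sets, min_methods=2):
    """Proteins identified by at least `min_methods` methods."""
    return {p for p, n in _votes(hit_sets).items() if n >= min_methods}


# Toy, made-up identifiers standing in for the three approaches.
pattern_hits = {"P1", "P2", "P3", "P4"}     # (i) zinc-binding patterns
domain_hits = {"P1", "P2", "P5"}            # (ii) domain libraries
annotation_hits = {"P1", "P3", "P6"}        # (iii) annotation mining

counts = consensus_counts(pattern_hits, domain_hits, annotation_hits)
confident = confident_set(pattern_hits, domain_hits, annotation_hits)
print(counts)             # {3: 1, 2: 2, 1: 3}
print(sorted(confident))  # ['P1', 'P2', 'P3']
```

With the figures quoted in the abstract, the proteins supported by at least two approaches number 1684 + 746 = 2430; the estimate of ca. 2800 then follows from the additional inspection of the 777 single-method hits, a manual step that this sketch does not model.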
Zinc is one of the metal ions essential for life, as it is required for the proper functioning of a large number of proteins. Despite its importance, the annotation of zinc-binding proteins in gene banks or protein domain databases still has significant room for improvement. In the present work, we compiled a list of known zinc-binding protein domains and of known zinc-binding sequence motifs (zinc-binding patterns), and then used them jointly to analyze the proteomes of 57 different organisms to obtain an overview of zinc usage by archaeal, bacterial, and eukaryotic organisms. Zinc-binding proteins are an abundant fraction of these proteomes, ranging between 4% and 10%. The number of zinc-binding proteins correlates linearly with the total number of proteins encoded by the genome of an organism, but the proportionality constant of Eukaryota (8.8%) is significantly higher than that observed in Bacteria and Archaea (from 5% to 6%). Most of this enrichment is due to the larger portfolio of regulatory proteins in Eukaryota.
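As an illustration of what a "proportionality constant" between zinc-protein count and proteome size means, the following Python sketch fits a straight line through the origin by least squares. Fitting through the origin is an assumption made here for simplicity (the abstract does not state the fitting procedure), and the numbers are made-up toy values, not data from the study.

```python
def proportionality_constant(proteome_sizes, zinc_counts):
    """Least-squares slope of a line through the origin: y ~ k * x.

    Minimizing sum((y - k*x)**2) gives k = sum(x*y) / sum(x*x).
    """
    if len(proteome_sizes) != len(zinc_counts):
        raise ValueError("inputs must have the same length")
    sxy = sum(x * y for x, y in zip(proteome_sizes, zinc_counts))
    sxx = sum(x * x for x in proteome_sizes)
    if sxx == 0:
        raise ValueError("at least one proteome size must be non-zero")
    return sxy / sxx


# Hypothetical organisms: total proteins and predicted zinc-binding proteins.
sizes = [1000, 2000, 4000]
zinc = [50, 100, 200]

k = proportionality_constant(sizes, zinc)
print(f"zinc-binding fraction ~ {100 * k:.1f}%")
```

Running the same fit separately on eukaryotic and prokaryotic proteomes is how one would compare group-specific constants such as the 8.8% and 5% to 6% values quoted above.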
The solution structure of oxidized horse heart cytochrome c was obtained at pH 7.0 in 100 mM phosphate buffer from 2278 NOEs and 241 pseudocontact shift constraints. The final structure was refined through restrained energy minimization. A 35-member family, with RMSD values with respect to the average structure of 0.70 ± 0.11 Å and 1.21 ± 0.14 Å for the backbone and all heavy atoms, respectively, and with an average penalty function of 130 ± 4.0 kJ/mol and 84 ± 3.7 kJ/mol for NOE and pseudocontact shift constraints, respectively (corresponding to a target function of 0.9 Å² and 0.2 Å²), was obtained. The solution structure is somewhat different from that recently reported (Qi et al., 1996) and appears to be similar to the X-ray structure of the same oxidation state (Bushnell et al., 1990). A noticeable difference is a rotation of 17 ± 8° of the imidazole plane between the solid and solution structures. Detailed and accurate structural determinations are important within the frame of the current debate on the structural rearrangements occurring upon oxidation or reduction. From the obtained magnetic susceptibility tensor a separation of the hyperfine shifts into their contact and pseudocontact contributions is derived and compared to that of the analogous isoenzyme from S. cerevisiae and to previous results.
Cellular systems allow transition-metal ions to reach or leave the cell or intracellular locations through metal transfer between proteins. By coupling mutagenesis and advanced NMR experiments, we structurally characterized the adduct between the copper chaperone Atx1 and the first copper(I)-binding domain of the Ccc2 ATPase. Copper was required for the interaction. This study provides an understanding of metal-mediated protein-protein interactions in which the metal ion is essential for the weak, reversible interaction between the partners.
The full series of lanthanide ions (except the radioactive promethium and the S-state gadolinium) has been incorporated into the C-terminal calcium binding site of the dicalcium protein calbindin D9k. A fairly constant coordination environment is maintained throughout the series. At variance with several lanthanide complexes with small chelating ligands investigated in the past, the large protein moiety provides a large number of NMR signals whose hyperfine shifts can be exclusively ascribed to pseudocontact shifts (PCS). The chemical shifts of 1H and 15N backbone and side chain amide NH groups were accurately measured through HSQC experiments. A total of 1097 PCS were estimated from these by subtracting the diamagnetic contributions measured on HSQC spectra of either the 4f⁰ lanthanum(III) or the 4f¹⁴ lutetium(III) derivatives and were used to define a quality factor for the structure. The differences in diamagnetic chemical shifts between the two diamagnetic blanks were relatively small, although some were not negligible, especially for the nuclei closest to the metal center. These differences were used as a tolerance for the PCS. The magnetic susceptibility tensor anisotropies for each paramagnetic lanthanide ion were obtained as the result of the solution structure determination performed by using the NOEs of the cerium(III) derivative and the PCS of all lanthanides simultaneously. This set of reliable magnetic data permits an experimental assessment of Bleaney's theory relative to the magnetic properties for an extended series of lanthanide complexes in solution. All of the obtained tensors show some rhombicity, as could be expected from the lack of symmetry of the protein environment. The directions of the largest magnetic susceptibility component for Ce, Pr, Nd, Sm, Tb, Dy, and Ho and of the smallest magnetic susceptibility component for Eu, Er, Tm, and Yb were all found to be within 15 degrees of their average (within 20 degrees for Sm), confirming the essential similarity of the coordination environment for all lanthanides. Bleaney's theory is in excellent qualitative agreement with the observed pattern of axial anisotropies. Its quantitative agreement is substantially better than that suggested by previous analyses performed on more limited sets of PCS data for small lanthanide complexes, the so-called crystal field parameter varying only within ±30% from one lanthanide to another. These variations are even smaller (±15%) if a reasonable T⁻³ correction is taken into consideration. A knowledge of magnetic susceptibility anisotropy properties of lanthanides is essential in determining the self-orienting properties of lanthanide complexes in solution when immersed in magnetic fields.
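The PCS estimation described in this abstract (paramagnetic shift minus a diamagnetic blank, with the La/Lu difference used as a tolerance) can be sketched in a few lines of Python. The function name, the `reference` parameter, and the example shift values are hypothetical; the sketch mirrors only the subtraction and tolerance steps stated in the abstract and does not reproduce the authors' software.

```python
def pcs_with_tolerance(delta_para, delta_la, delta_lu, reference="La"):
    """Estimate a pseudocontact shift (ppm) for one nucleus.

    delta_para : shift observed in the paramagnetic lanthanide derivative
    delta_la   : shift in the diamagnetic lanthanum(III) blank
    delta_lu   : shift in the diamagnetic lutetium(III) blank
    reference  : which blank to subtract, "La" or "Lu" (illustrative choice)

    Returns (pcs, tolerance), where the tolerance is the absolute
    difference between the two diamagnetic blanks.
    """
    blanks = {"La": delta_la, "Lu": delta_lu}
    if reference not in blanks:
        raise ValueError("reference must be 'La' or 'Lu'")
    pcs = delta_para - blanks[reference]
    tolerance = abs(delta_la - delta_lu)
    return pcs, tolerance


# Made-up amide proton shifts in ppm, for illustration only.
pcs, tol = pcs_with_tolerance(delta_para=8.50, delta_la=8.20, delta_lu=8.26)
print(f"PCS = {pcs:+.2f} ppm, tolerance = {tol:.2f} ppm")
```

A nucleus close to the metal would typically show a larger La/Lu difference and therefore a looser tolerance, which matches the abstract's remark that the differences are not negligible near the metal center.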
Structural biology aims at characterizing the structural and dynamic properties of biological macromolecules in atomic detail. Gaining insight into the three-dimensional structures of biomolecules and their interactions is critical for understanding the vast majority of cellular processes, with direct applications in health and food sciences. Since 2010, the WeNMR project (www.wenmr.eu) has implemented numerous web-based services to facilitate the use of advanced computational tools by researchers in the field, using the high throughput computing infrastructure provided by EGI. These services have been further developed in subsequent initiatives under H2020 projects and are now operating as Thematic Services in the European Open Science Cloud portal (www.eosc-portal.eu), sending more than 12 million jobs and using around 4,000 CPU-years per year. Here we review 10 years of successful e-infrastructure solutions serving a large worldwide community of over 23,000 users to date, providing them with user-friendly, web-based solutions that run complex workflows in structural biology. The current set of active WeNMR portals is described, together with the complex backend machinery that allows distributed computing resources to be harvested efficiently.
Genome-wide studies are providing researchers with a potentially complete list of the molecular components present in living systems. It is now evident that several metal ions are essential to life and that metalloproteins, that is, proteins that require a metal ion to perform their physiological function, are widespread in all organisms. However, there is currently a lack of well-established experimental methods aimed at analyzing the complete set of metalloproteins encoded by an organism (the metalloproteome). This information is essential for a comprehensive understanding of all the processes occurring in living systems. Predictive tools must thus be applied to define metalloproteomes. In this Account, we discuss the current progress in the development of bioinformatics methods for the prediction, based solely on protein sequences, of metalloproteins. With these methods, it is possible to scan entire proteomes for metalloproteins, such as zinc proteins or copper proteins, which are identified by the presence of specific metal-binding sites, metal-binding domains, or both. The predicted metalloproteins can then be analyzed to obtain information on their function and evolution. For example, the comparative analysis of the content and usage of different metalloproteins across living organisms can be used to obtain hints on the evolution of metalloproteomes. As case studies, we predicted the content of zinc, nonheme iron, and copper proteins in a representative set of organisms taken from the three domains of life. The zinc proteome represents about 9% of the entire proteome in eukaryotes, but it ranges from 5% to 6% in prokaryotes, therefore indicating a substantial increase in the number of zinc proteins in higher organisms. In contrast, the number of nonheme iron proteins is relatively constant in eukaryotes and prokaryotes, and therefore their relative share diminishes in passing from archaea (about 7%), to bacteria (about 4%), to eukaryotes (about 1%). Copper proteins represent less than 1% of the proteomes in all the organisms studied. We also discuss the limits of these methods, the approaches used to overcome some of these limits to improve our predictions, and possible future developments in the field of bioinformatics-based investigation of metalloproteins. As a long-standing goal of the biological sciences, the understanding of life at the systems level, or systems biology, is experiencing a rekindling of interest; ready access to complete information on metalloproteomes is crucial to correctly represent the role of metal ions in living organisms.