The Cambridge Structural Database (CSD) contains a complete record of all published organic and metal-organic small-molecule crystal structures. The database has been in operation for over 50 years and continues to be the primary means of sharing structural chemistry data and knowledge across disciplines. As well as structures that are made public to support scientific articles, it includes many structures published directly as CSD Communications. All structures are processed both computationally and by expert structural chemistry editors prior to entering the database. A key component of this processing is the reliable association of the chemical identity of the structure studied with the experimental data. This important step helps ensure that data are widely discoverable and readily reusable. Content is further enriched through selective inclusion of additional experimental data. Entries are available to anyone through free CSD community web services. Linking services developed and maintained by the CCDC, combined with the use of standard identifiers, facilitate discovery from other resources. Data can also be accessed through CCDC and third-party software applications and through an application programming interface.
An assessment of the number of molecular targets that represent an opportunity for therapeutic intervention is crucial to the development of post-genomic research strategies within the pharmaceutical industry. Now that we know the size of the human genome, it is interesting to consider just how many molecular targets this opportunity represents. We start from the position that we understand the properties that are required for a good drug, and therefore must be able to understand what makes a good drug target.
The Cambridge Crystallographic Data Centre (CCDC) was established in 1965 to record numerical, chemical and bibliographic data relating to published organic and metal-organic crystal structures. The Cambridge Structural Database (CSD) now stores data for nearly 700,000 structures and is a comprehensive and fully retrospective historical archive of small-molecule crystallography. Nearly 40,000 new structures are added each year. As X-ray crystallography celebrates its centenary as a subject, and the CCDC approaches its own 50th year, this article traces the origins of the CCDC as a publicly funded organization and its onward development into a self-financing charitable institution. Principally, however, we describe the growth of the CSD and its extensive associated software system, and summarize its impact and value as a basis for research in structural chemistry, materials science and the life sciences, including drug discovery and drug development. Finally, the article considers the CCDC's funding model in relation to open access and open data paradigms.
The results of the sixth blind test of organic crystal structure prediction methods are presented and discussed, highlighting progress for salts, hydrates and bulky flexible molecules, as well as on-going challenges.
Over the past decade, pharmaceutical companies have seen a decline in the number of drug candidates successfully passing through clinical trials, though billions are still spent on drug development. Poor aqueous solubility leads to low bioavailability, reducing pharmaceutical effectiveness. The human cost of inefficient drug candidate testing is of great medical concern, with fewer drugs making it to the production line, slowing the development of new treatments. In biochemistry and biophysics, water-mediated reactions and interactions within active sites and protein pockets are an active area of research, in which methods for modelling solvated systems are continually pushed to their limits. Here, we discuss a range of methods for solvent modelling and solubility prediction, aiming to inform the reader of the options available and outlining the advantages and disadvantages of each approach.
The principal protein excreted in male rat urine, urinary alpha 2-globulin, and the homologous mouse protein, major urinary protein, have been well characterized, although their functions remain unclear. Male rat urine affects the behaviour and sexual response of female rats, leading to the proposal that rodent urinary proteins are responsible for binding pheromones and their subsequent release from drying urine. Urinary alpha 2-globulin is also involved in hyaline droplet nephropathy, an important toxicological syndrome in male rats resulting from exposure to a number of industrial chemicals and characterized by the accumulation of liganded urinary alpha 2-globulin in lysosomes in the kidney, followed by the induction of renal cancer. We now report the three-dimensional structures of mouse major urinary protein (at 2.4 Å resolution) and rat urinary alpha 2-globulin (at 2.8 Å resolution). The results corroborate the role of these proteins in pheromone transport and elaborate the structural basis of ligand binding.
Small aromatic ring systems are of central importance in the development of novel synthetic protein ligands. Here we generate a complete list of 24,847 such ring systems. We call this list and associated annotations VEHICLe, which stands for virtual exploratory heterocyclic library. Searches of literature and compound databases, using this list as substructure queries, identified only 1701 as synthesized. Using a carefully validated machine learning approach, we were able to estimate that the number of unpublished, but synthetically tractable, VEHICLe rings could be over 3000. However, analysis also shows the rate of publication of novel examples to be as low as 5-10 per year. With this work, we aim to provide fresh stimulus to creative organic chemists by highlighting a small set of apparently simple ring systems that are predicted to be tractable but are, to the best of our knowledge, unconquered.