Mass spectrometry is a fundamental tool for discovery and analysis in the life sciences. With the rapid advances in mass spectrometry technology and methods, it has become imperative to provide a standard output format for mass spectrometry data that will facilitate data sharing and analysis. Initially, the efforts to develop a standard format for mass spectrometry data resulted in multiple formats, each designed with a different underlying philosophy. To resolve the issues associated with having multiple formats, vendors, researchers, and software developers convened under the banner of the HUPO PSI to develop a single standard. The new data format incorporated many of the desirable technical attributes from the previous data formats, while adding a number of improvements, including features such as a controlled vocabulary with validation tools to ensure consistent usage of the format, improved support for selected reaction monitoring data, and immediately available implementations to facilitate rapid adoption by the community. The resulting standard data format, mzML, is a well tested open-source format for mass spectrometer output files that can be readily utilized by the community and easily adapted for incremental advances in mass spectrometry technology.
Motivation: The human leukocyte antigen (HLA) gene cluster plays a crucial role in adaptive immunity and is thus relevant in many biomedical applications. While next-generation sequencing data are often available for a patient, deducing the HLA genotype is difficult because of substantial sequence similarity within the cluster and exceptionally high variability of the loci. Established approaches, therefore, rely on specific HLA enrichment and sequencing techniques, coming at an additional cost and extra turnaround time. Result: We present OptiType, a novel HLA genotyping algorithm based on integer linear programming, capable of producing accurate predictions from NGS data not specifically enriched for the HLA cluster. We also present a comprehensive benchmark dataset consisting of RNA, exome and whole-genome sequencing data. OptiType significantly outperformed previously published in silico approaches with an overall accuracy of 97%, enabling its use in a broad range of applications. Contact: szolek@informatik.uni-tuebingen.de Supplementary information: Supplementary data are available at Bioinformatics online.
We present, on behalf of EuroGentest and the European Society of Human Genetics, guidelines for the evaluation and validation of next-generation sequencing (NGS) applications for the diagnosis of genetic disorders. The work was performed by a group of laboratory geneticists and bioinformaticians, and discussed with clinical geneticists, industry and patients' representatives, and other stakeholders in the field of human genetics. The statements that were written during the elaboration of the guidelines are presented here. The background document and full guidelines are available as supplementary material. They include many examples to assist the laboratories in the implementation of NGS and accreditation of this service. The work and ideas presented by others in guidelines that have emerged elsewhere in the course of the past few years were also considered and are acknowledged in the full text. Interestingly, a few new insights that have not been cited before have emerged during the preparation of the guidelines. The most important new feature is the presentation of a 'rating system' for NGS-based diagnostic tests. The guidelines and statements have been applauded by the genetic diagnostic community, and thus seem to be valuable for the harmonization and quality assurance of NGS diagnostics in Europe.
Background: Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever increasing amounts of data. Consequently, analysis of the data is currently often the bottleneck for experimental studies. Although software tools for many data analysis tasks are available today, they are often hard to combine with each other or not flexible enough to allow for rapid prototyping of a new analysis workflow. Results: We present OpenMS, a software framework for rapid application development in mass spectrometry. OpenMS has been designed to be portable, easy-to-use and robust while offering a rich functionality ranging from basic data structures to sophisticated algorithms for data analysis. This has already been demonstrated in several studies. Conclusion: OpenMS is available under the Lesser GNU Public License (LGPL) from the project website at .
The TOPP components are available as open-source software under the Lesser GNU Public License (LGPL). Source code is available from the project website at www.OpenMS.de.
Objective: To delineate the full phenotypic spectrum, discriminative features, piloting longitudinal progression data, and sample size calculations of RFC1-repeat expansions, recently identified as causing cerebellar ataxia, neuropathy, vestibular areflexia syndrome (CANVAS). Methods: Multimodal RFC1 repeat screening (PCR, Southern blot, whole-exome/genome (WES/WGS)-based approaches) combined with cross-sectional and longitudinal deep-phenotyping in (i) cross-European cohort A (70 families) with ≥2 features of CANVAS and/or ataxia-with-chronic-cough (ACC); and (ii) Turkish cohort B (105 families) with unselected late-onset ataxia. Results: Prevalence of RFC1-disease was 67% in cohort A, 14% in unselected cohort B, 68% in clinical CANVAS, and 100% in ACC. RFC1-disease was also identified in Western and Eastern Asians, and even by WES. Visual compensation, sensory symptoms, and cough were strong positive discriminative predictors (>90%) against RFC1-negative patients. The phenotype across 70 RFC1-positive patients was mostly multisystemic (69%), including dysautonomia (62%) and bradykinesia (28%) (=overlap with cerebellar-type multiple system atrophy [MSA-C]), postural instability (49%), slow vertical saccades (17%), and chorea and/or dystonia (11%). Ataxia progression was ∼1.3 SARA points/year (32 cross-sectional, 17 longitudinal assessments, follow-up ≤9 years [mean 3.1]), but also included early falls, variable non-linear phases of MSA-C-like progression (SARA 2.5-5.5/year), and premature death. Treatment trials require 330 (1-year-trial) and 132 (2-year-trial) patients in total to detect 50% reduced progression. Conclusions: RFC1-disease is frequent and occurs across continents, with CANVAS and ACC as highly diagnostic phenotypes, yet as variable, overlapping clusters along a continuous multisystemic disease spectrum, including MSA-C-overlap. Our natural history data help to inform future RFC1-treatment trials. Classification of Evidence: This study provides Class II evidence that RFC1-repeat expansions are associated with CANVAS and ACC.
We present a computational pipeline for the quantification of peptides and proteins in label-free LC-MS/MS data sets. The pipeline is composed of tools from the OpenMS software framework and is applicable to the processing of large experiments (50+ samples). We describe several enhancements that we have introduced to OpenMS to realize the implementation of this pipeline. They include new algorithms for centroiding of raw data, for feature detection, for the alignment of multiple related measurements, and a new tool for the calculation of peptide and protein abundances. Where possible, we compare the performance of the new algorithms to that of their established counterparts in OpenMS. We validate the pipeline on the basis of two small data sets that provide ground truths for the quantification. There, we also compare our results to those of MaxQuant and Progenesis LC-MS, two popular alternatives for the analysis of label-free data. We then show how our software can be applied to a large heterogeneous data set of 58 LC-MS/MS runs.
Despite extensive efforts, half of patients with rare movement disorders such as hereditary spastic paraplegias and cerebellar ataxias remain genetically unexplained, implicating novel genes and unrecognized mutations in known genes. Non-coding DNA variants are suspected to account for a substantial part of undiscovered causes of rare diseases. Here we identified mutations located deep in introns of POLR3A to be a frequent cause of hereditary spastic paraplegia and cerebellar ataxia. First, whole-exome sequencing findings in a recessive spastic ataxia family turned our attention to intronic variants in POLR3A, a gene previously associated with hypomyelinating leukodystrophy type 7. Next, we screened a cohort of hereditary spastic paraplegia and cerebellar ataxia cases (n = 618) for mutations in POLR3A and identified compound heterozygous POLR3A mutations in ∼3.1% of index cases. Interestingly, >80% of POLR3A mutation carriers presented the same deep-intronic mutation (c.1909+22G>A), which activates a cryptic splice site in a tissue and stage of development-specific manner and leads to a novel distinct and uniform phenotype. The phenotype is characterized by adolescent-onset progressive spastic ataxia with frequent occurrence of tremor, involvement of the central sensory tracts and dental problems (hypodontia, early onset of severe and aggressive periodontal disease). Instead of the typical hypomyelination magnetic resonance imaging pattern associated with classical POLR3A mutations, cases carrying c.1909+22G>A demonstrated hyperintensities along the superior cerebellar peduncles. These hyperintensities may represent the structural correlate to the cerebellar symptoms observed in these patients. The associated c.1909+22G>A variant was significantly enriched in 1139 cases with spastic ataxia-related phenotypes as compared to unrelated neurological and non-neurological phenotypes and healthy controls (P = 1.3 × 10⁻⁴). In this study we demonstrate that (i) autosomal-recessive mutations in POLR3A are a frequent cause of hereditary spastic ataxias, accounting for about 3% of hitherto genetically unclassified autosomal recessive and sporadic cases; and (ii) hypomyelination is frequently absent in POLR3A-related syndromes, especially when intronic mutations are present, and thus can no longer be considered as the unifying feature of POLR3A disease. Furthermore, our results demonstrate that substantial progress in revealing the causes of Mendelian diseases can be made by exploring the non-coding sequences of the human genome.