Comprehensive identification and cataloging of copy number variations (CNVs) is required to provide a complete view of human genetic variation. The resolution of CNV detection in previous experimental designs has been limited to tens or hundreds of kilobases. Here we present PennCNV, a hidden Markov model (HMM) based approach, for kilobase-resolution detection of CNVs from Illumina high-density SNP genotyping data. This algorithm incorporates multiple sources of information, including total signal intensity and allelic intensity ratio at each SNP marker, the distance between neighboring SNPs, the allele frequency of SNPs, and the pedigree information where available. We applied PennCNV to genotyping data generated for 112 HapMap individuals; on average, we detected ∼27 CNVs for each individual with a median size of ∼12 kb. Excluding common rearrangements in lymphoblastoid cell lines, the fraction of CNVs in offspring not detected in parents (CNV-NDPs) was 3.3%. Our results demonstrate the feasibility of whole-genome fine-mapping of CNVs via high-density SNP genotyping.
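As a rough illustration of the HMM idea described above, the following Python sketch decodes a copy-number state path from per-SNP log R ratio values with the Viterbi algorithm. It is a toy under stated assumptions, not PennCNV's actual model: the three states, the Gaussian emission means and standard deviation, the initial distribution, and the fixed transition probability are all hypothetical. It also ignores the allelic intensity ratio, inter-SNP distance, allele frequency, and pedigree information that the abstract says the real algorithm incorporates.

```python
import math

# Hypothetical states and emission parameters (assumptions for illustration only).
STATES = ["deletion", "normal", "duplication"]
LRR_MEAN = {"deletion": -0.6, "normal": 0.0, "duplication": 0.4}
LRR_SD = 0.2
STAY_PROB = 0.98  # assumed probability of keeping the same state at the next SNP
INIT_PROB = {"deletion": 0.01, "normal": 0.98, "duplication": 0.01}  # assumed


def log_gauss(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def log_trans(prev, cur):
    """Log transition probability; switches are split evenly among the other states."""
    if prev == cur:
        return math.log(STAY_PROB)
    return math.log((1 - STAY_PROB) / (len(STATES) - 1))


def viterbi(lrr_values):
    """Return the most likely state path for a sequence of log R ratios."""
    if not lrr_values:
        return []
    score = {s: math.log(INIT_PROB[s]) + log_gauss(lrr_values[0], LRR_MEAN[s], LRR_SD)
             for s in STATES}
    backpointers = []
    for x in lrr_values[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            # Best predecessor state for the current state at this SNP.
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            pointers[cur] = best_prev
            new_score[cur] = (score[best_prev] + log_trans(best_prev, cur)
                              + log_gauss(x, LRR_MEAN[cur], LRR_SD))
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Made-up intensities: four consecutive markers with a clearly reduced log R ratio.
lrr = [0.02, -0.05, 0.01, -0.62, -0.58, -0.65, -0.60, 0.03, -0.01, 0.04]
print(viterbi(lrr))
```

With these made-up values, the four consecutive markers near -0.6 are labeled as a deletion and the flanking markers as normal. The high self-transition probability is what stops a single noisy marker from being called as its own CNV.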
We executed a genome-wide association scan for age-related macular degeneration (AMD) in 2,157 cases and 1,150 controls. Our results validate AMD susceptibility loci near CFH (P < 10^−75), ARMS2 (P < 10^−59), C2/CFB (P < 10^−20), C3 (P < 10^−9), and CFI (P < 10^−6). We compared our top findings with the Tufts/Massachusetts General Hospital genome-wide association study of advanced AMD (821 cases, 1,709 controls) and genotyped 30 promising markers in additional individuals (up to 7,749 cases and 4,625 controls). With these data, we identified a susceptibility locus near TIMP3 (overall P = 1.1 × 10), a metalloproteinase involved in degradation of the extracellular matrix and previously implicated in early-onset maculopathy. In addition, our data revealed strong association signals with alleles at two loci (LIPC, P = 1.3 × 10^−7; CETP, P = 7.4 × 10^−7) that were previously associated with high-density lipoprotein cholesterol (HDL-c) levels in blood. Consistent with the hypothesis that HDL metabolism is associated with AMD pathogenesis, we also observed association with AMD of HDL-c-associated alleles near LPL (P = 3.0 × 10^−3) and ABCA1 (P = 5.6 × 10^−4). Multilocus analysis including all susceptibility loci showed that 329 of 331 individuals (99%) with the highest-risk genotypes were cases, and 85% of these had advanced AMD. Our studies extend the catalog of AMD associated loci, help identify individuals at high risk of disease, and provide clues about underlying cellular pathways that should eventually lead to new therapies.
genome-wide association study | single nucleotide polymorphism
Age-related macular degeneration (AMD) is a progressive neurodegenerative disease and a common cause of blindness in the elderly population, particularly in developed countries (1). The disease affects primarily the macular region of the retina, which is necessary for sharp central vision.
An early hallmark of AMD is the appearance of drusen, which are extracellular deposits of proteins and lipids under the retinal pigment epithelium (RPE). As the disease progresses, drusen grow in size and number. In advanced stages of AMD, atrophy of the RPE (geographic atrophy) and/or development of new blood vessels (neovascularization) result in death of photoreceptors and central vision loss.
The ability to computationally predict whether a compound treats a disease would improve the economy and success rate of drug approval. This study describes Project Rephetio to systematically model drug efficacy based on 755 existing treatments. First, we constructed Hetionet (neo4j.het.io), an integrative network encoding knowledge from millions of biomedical studies. Hetionet v1.0 consists of 47,031 nodes of 11 types and 2,250,197 relationships of 24 types. Data were integrated from 29 public resources to connect compounds, diseases, genes, anatomies, pathways, biological processes, molecular functions, cellular components, pharmacologic classes, side effects, and symptoms. Next, we identified network patterns that distinguish treatments from non-treatments. Then, we predicted the probability of treatment for 209,168 compound–disease pairs (het.io/repurpose). Our predictions were validated on two external sets of treatments and provided pharmacological insights on epilepsy, suggesting they will help prioritize drug repurposing candidates. This study was entirely open and received real-time feedback from 40 community members.
The genetics underlying the autism spectrum disorders (ASDs) is complex and remains poorly understood. Previous work has demonstrated an important role for structural variation in a subset of cases, but has lacked the resolution necessary to move beyond detection of large regions of potential interest to identification of individual genes. To pinpoint genes likely to contribute to ASD etiology, we performed high density genotyping in 912 multiplex families from the Autism Genetics Resource Exchange (AGRE) collection and contrasted results to those obtained for 1,488 healthy controls. Through prioritization of exonic deletions (eDels), exonic duplications (eDups), and whole gene duplication events (gDups), we identified more than 150 loci harboring rare variants in multiple unrelated probands, but no controls. Importantly, 27 of these were confirmed on examination of an independent replication cohort comprised of 859 cases and an additional 1,051 controls. Rare variants at known loci, including exonic deletions at NRXN1 and whole gene duplications encompassing UBE3A and several other genes in the 15q11–q13 region, were observed in the course of these analyses. Strong support was likewise observed for previously unreported genes such as BZRAP1, an adaptor molecule known to regulate synaptic transmission, with eDels or eDups observed in twelve unrelated cases but no controls (p = 2.3 × 10^−5). Less is known about MDGA2, likewise observed to be case-specific (p = 1.3 × 10^−4). But, it is notable that the encoded protein shows an unexpectedly high similarity to Contactin 4 (BLAST E-value = 3 × 10^−39), which has also been linked to disease. That hundreds of distinct rare variants were each seen only once further highlights complexity in the ASDs and points to the continued need for larger cohorts.
Purpose: To develop and validate a deep learning algorithm that predicts the final diagnosis of Alzheimer disease (AD), mild cognitive impairment, or neither at fluorine 18 (18F) fluorodeoxyglucose (FDG) PET of the brain and compare its performance to that of radiologic readers. Materials and Methods: Prospective 18F-FDG PET brain images from the Alzheimer's Disease Neuroimaging Initiative (ADNI) (2109 imaging studies from 2005 to 2017, 1002 patients) and retrospective independent test set (40 imaging studies from 2006 to 2016, 40 patients) were collected. Final clinical diagnosis at follow-up was recorded. Convolutional neural network of InceptionV3 architecture was trained on 90% of ADNI data set and tested on the remaining 10%, as well as the independent test set, with performance compared to radiologic readers. Model was analyzed with sensitivity, specificity, receiver operating characteristic (ROC), saliency map, and t-distributed stochastic neighbor embedding. Results: The algorithm achieved area under the ROC curve of 0.98 (95% confidence interval: 0.94, 1.00) when evaluated on predicting the final clinical diagnosis of AD in the independent test set (82% specificity at 100% sensitivity), an average of 75.8 months prior to the final diagnosis, which in ROC space outperformed reader performance (57% [four of seven] sensitivity, 91% [30 of 33] specificity; P < .05). Saliency map demonstrated attention to known areas of interest but with focus on the entire brain. Conclusion: By using fluorine 18 fluorodeoxyglucose PET of the brain, a deep learning algorithm developed for early prediction of Alzheimer disease achieved 82% specificity at 100% sensitivity, an average of 75.8 months prior to the final diagnosis.
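The reader performance figures quoted above follow directly from the reported counts. The short Python sketch below recomputes them. The function names are illustrative, and the false-negative and false-positive counts (3 and 3) are derived from "four of seven" and "30 of 33" rather than stated in the abstract.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased cases that were correctly identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-diseased cases that were correctly ruled out."""
    return true_neg / (true_neg + false_pos)


# Reader counts from the abstract: 4 of 7 AD cases detected,
# 30 of 33 non-AD cases correctly classified.
reader_sens = sensitivity(true_pos=4, false_neg=7 - 4)
reader_spec = specificity(true_neg=30, false_pos=33 - 30)

print(f"reader sensitivity: {reader_sens:.0%}")  # 57%
print(f"reader specificity: {reader_spec:.0%}")  # 91%
```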
We examined the burden of large, rare, copy-number variants (CNVs) in 192 individuals with renal hypodysplasia (RHD) and replicated findings in 330 RHD cases from two independent cohorts. CNV distribution was significantly skewed toward larger gene-disrupting events in RHD cases compared to 4,733 ethnicity-matched controls (p = 4.8 × 10^−11). This excess was attributable to known and novel (i.e., not present in any database or in the literature) genomic disorders. Altogether, 55/522 (10.5%) RHD cases harbored 34 distinct known genomic disorders, which were detected in only 0.2% of 13,839 population controls (p = 1.2 × 10^−58). Another 32 (6.1%) RHD cases harbored large gene-disrupting CNVs that were absent from or extremely rare in the 13,839 population controls, identifying 38 potential novel or rare genomic disorders for this trait. Deletions at the HNF1B locus and the DiGeorge/velocardiofacial locus were most frequent. However, the majority of disorders were detected in a single individual. Genomic disorders were detected in 22.5% of individuals with multiple malformations and 14.5% of individuals with isolated urinary-tract defects; 14 individuals harbored two or more diagnostic or rare CNVs. Strikingly, the majority of the known CNV disorders detected in the RHD cohort have previous associations with developmental delay or neuropsychiatric diseases. Up to 16.6% of individuals with kidney malformations had a molecular diagnosis attributable to a copy-number disorder, suggesting kidney malformations as a sentinel manifestation of pathogenic genomic imbalances. A search for pathogenic CNVs should be considered in this population for the diagnosis of their specific genomic disorders and for the evaluation of the potential for developmental delay.
Attention deficit hyperactivity disorder (ADHD) is a common, heritable neuropsychiatric disorder of unknown etiology. We performed a whole-genome copy number variation (CNV) study on 1,013 cases with ADHD and 4,105 healthy children of European ancestry using 550,000 SNPs. We evaluated statistically significant findings in multiple independent cohorts, with a total of 2,493 cases with ADHD and 9,222 controls of European ancestry, using matched platforms. CNVs affecting metabotropic glutamate receptor genes were enriched across all cohorts (P = 2.1 × 10^−9). We saw GRM5 (encoding glutamate receptor, metabotropic 5) deletions in ten cases and one control (P = 1.36 × 10^−6). We saw GRM7 deletions in six cases, and we saw GRM8 deletions in eight cases and no controls. GRM1 was duplicated in eight cases. We experimentally validated the observed variants using quantitative RT-PCR. A gene network analysis showed that genes interacting with the genes in the GRM family are enriched for CNVs in ~10% of the cases (P = 4.38 × 10^−10) after correction for occurrence in the controls. We identified rare recurrent CNVs affecting glutamatergic neurotransmission genes that were overrepresented in multiple ADHD cohorts.
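Case-control carrier counts like the GRM5 result above (ten cases, one control) are commonly assessed with Fisher's exact test. The Python sketch below implements a one-sided version from the hypergeometric distribution. It is an illustration only: the abstract does not say which test produced the reported P value, and the pooled totals used here (3,506 cases and 13,327 controls, obtained by summing the discovery and replication cohorts) are an assumption about which denominators apply.

```python
from math import comb


def fisher_one_sided(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test for carrier enrichment among cases.

    Returns the probability, under the hypergeometric null, that at least
    `case_carriers` of all observed carriers fall in the case group.
    """
    n_total = case_total + control_total
    carriers = case_carriers + control_carriers
    denom = comb(n_total, case_total)
    upper = min(carriers, case_total)
    tail = sum(
        comb(carriers, k) * comb(n_total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom


# Assumed pooled denominators (discovery + replication); not stated
# as such in the abstract.
ASSUMED_CASES = 1013 + 2493      # 3,506
ASSUMED_CONTROLS = 4105 + 9222   # 13,327

p_grm5 = fisher_one_sided(10, ASSUMED_CASES, 1, ASSUMED_CONTROLS)
print(f"one-sided P for GRM5 deletions: {p_grm5:.2e}")
```

Under these assumed denominators the result comes out on the order of 10^−6, the same order as the reported value. A different choice of cohort totals or a two-sided test would give a different number.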
To identify genetic variants associated with head circumference in infancy, we performed a meta-analysis of seven genome-wide association (GWA) studies (N=10,768 individuals of European ancestry enrolled in pregnancy/birth cohorts) and followed up three lead signals in six replication studies (combined N=19,089). Two SNPs, rs7980687 on chromosome 12q24 (P=8.1×10^−9) and rs1042725 on chromosome 12q15 (P=2.8×10^−10), were robustly associated with head circumference in infancy. Although these loci have previously been associated with adult height (1), their effects on infant head circumference were largely independent of height (P=3.8×10^−7 for rs7980687, P=1.3×10^−7 for rs1042725 after adjustment for infant height). A third signal, rs11655470 on chromosome 17q21, showed suggestive evidence of association with head circumference (P=3.9×10^−6). SNPs correlated with the 17q21 signal show genome-wide association with adult intracranial volume (2), Parkinson’s disease and other neurodegenerative diseases (3-5), indicating that a common genetic variant in this region might link early brain growth with neurological disease in later life.