SUMMARY Enhancers control the correct temporal and cell type-specific activation of gene expression in higher eukaryotes. Knowing their properties, regulatory activity and targets is crucial to understand the regulation of differentiation and homeostasis. We use the FANTOM5 panel of samples covering the majority of human tissues and cell types to produce an atlas of active, in vivo transcribed enhancers. We show that enhancers share properties with CpG-poor mRNA promoters but produce bidirectional, exosome-sensitive, relatively short unspliced RNAs, the generation of which is strongly related to enhancer activity. The atlas is used to compare regulatory programs between different cells at unprecedented depth, identify disease-associated regulatory single nucleotide polymorphisms, and classify cell type-specific and ubiquitous enhancers. We further explore the utility of enhancer redundancy, which explains gene expression strength rather than expression patterns. The online FANTOM5 enhancer atlas represents a unique resource for studies on cell type-specific enhancers and gene regulation.
Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly ‘housekeeping’, whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
Approximately one third of all mammalian genes are essential for life. Phenotypes resulting from mouse knockouts of these genes have provided tremendous insight into gene function and congenital disorders. As part of the International Mouse Phenotyping Consortium effort to generate and phenotypically characterize 5000 knockout mouse lines, we have identified 410 lethal genes during the production of the first 1751 unique gene knockouts. Using a standardised phenotyping platform that incorporates high-resolution 3D imaging, we identified novel phenotypes at multiple time points for previously uncharacterized genes and additional phenotypes for genes with previously reported mutant phenotypes. Unexpectedly, our analysis reveals that incomplete penetrance and variable expressivity are common even on a defined genetic background. In addition, we show that human disease genes are enriched for essential genes identified in our screen, thus providing a novel dataset that facilitates prioritization and validation of mutations identified in clinical sequencing efforts.
The FANTOM5 project investigates transcription initiation activities in more than 1,000 human and mouse primary cells, cell lines and tissues using CAGE. Based on manual curation of sample information and development of an ontology for sample classification, we assemble the resulting data into a centralized data resource (http://fantom.gsc.riken.jp/5/). This resource contains web-based tools and data-access points for the research community to search and extract data related to samples, genes, promoter activities, transcription factors and enhancers across the FANTOM5 atlas.Electronic supplementary materialThe online version of this article (doi:10.1186/s13059-014-0560-6) contains supplementary material, which is available to authorized users.
Although it is generally accepted that cellular differentiation requires changes to transcriptional networks, dynamic regulation of promoters and enhancers at specific sets of genes has not been previously studied en masse. Exploiting the fact that active promoters and enhancers are transcribed, we simultaneously measured their activity in 19 human and 14 mouse time courses covering a wide range of cell types and biological stimuli. Enhancer RNAs, then messenger RNAs encoding transcription factors, dominated the earliest responses. Binding sites for key lineage transcription factors were simultaneously overrepresented in enhancers and promoters active in each cellular system. Our data support a highly generalizable model in which enhancer transcription is the earliest event in successive waves of transcriptional change during cellular differentiation or activation.
The International Mouse Phenotyping Consortium (IMPC) web portal (http://www.mousephenotype.org) provides the biomedical community with a unified point of access to mutant mice and rich collection of related emerging and existing mouse phenotype data. IMPC mouse clinics worldwide follow rigorous highly structured and standardized protocols for the experimentation, collection and dissemination of data. Dedicated ‘data wranglers’ work with each phenotyping center to collate data and perform quality control of data. An automated statistical analysis pipeline has been developed to identify knockout strains with a significant change in the phenotype parameters. Annotation with biomedical ontologies allows biologists and clinicians to easily find mouse strains with phenotypic traits relevant to their research. Data integration with other resources will provide insights into mammalian gene function and human disease. As phenotype data become available for every gene in the mouse, the IMPC web portal will become an invaluable tool for researchers studying the genetic contributions of genes to human diseases.
A large proportion of biomedical research and the development of therapeutics is focused on a small fraction of the human genome. In a strategic effort to map the knowledge gaps around proteins encoded by the human genome and to promote the exploration of currently understudied, but potentially druggable, proteins, the US National Institutes of Health launched the Illuminating the Druggable Genome (IDG) initiative in 2014. In this article, we discuss how the systematic collection and processing of a wide array of genomic, proteomic, chemical and disease-related resource data by the IDG Knowledge Management Center have enabled the development of evidence-based criteria for tracking the target development level (TDL) of human proteins, which indicates a substantial knowledge deficit for approximately one out of three proteins in the human proteome. We then present spotlights on the TDL categories as well as key drug target classes, including G protein-coupled receptors, protein kinases and ion channels, which illustrate the nature of the unexplored opportunities for biomedical research and therapeutic development.
The role of sex in biomedical studies has often been overlooked, despite evidence of sexually dimorphic effects in some biological studies. Here, we used high-throughput phenotype data from 14,250 wildtype and 40,192 mutant mice (representing 2,186 knockout lines), analysed for up to 234 traits, and found a large proportion of mammalian traits both in wildtype and mutants are influenced by sex. This result has implications for interpreting disease phenotypes in animal models and humans.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.