DISEASES: Text mining and data integration of disease–gene associations

Pletscher-Frankild, Sune; Pallejà, Albert; Tsafou, Kalliopi; Binder, Janos X.; Jensen, Lars Juhl

doi:10.1016/j.ymeth.2014.11.020

Cited by 508 publications

(466 citation statements)

References 48 publications

Mentioning: 457


“…To this end, we used a single, monotonic calibration function for all datasets from all organisms, which we calibrated on the text-mining results for human gene–tissue associations (Supplementary Figure S3). We chose to use this particular set of associations, because it is large, facilitating robust results, and because text-mining is also available for other types of associations, allowing unified confidence scores across TISSUES and the related databases COMPARTMENTS (27) and DISEASES (49). The calibrated functions used for transforming raw expression values into final confidence star scores are available in Supplementary Table S4.…”

Section: Methods (mentioning)

confidence: 99%

TISSUES 2.0: an integrative web resource on mammalian tissue expression

et al. 2018

Self Cite


Physiological and molecular similarities between organisms make it possible to translate findings from simpler experimental systems—model organisms—into more complex ones, such as human. This translation facilitates the understanding of biological processes under normal or disease conditions. Researchers aiming to identify the similarities and differences between organisms at the molecular level need resources collecting multi-organism tissue expression data. We have developed a database of gene–tissue associations in human, mouse, rat and pig by integrating multiple sources of evidence: transcriptomics covering all four species and proteomics (human only), manually curated and mined from the scientific literature. Through a scoring scheme, these associations are made comparable across all sources of evidence and across organisms. Furthermore, the scoring produces a confidence score assigned to each of the associations. The TISSUES database (version 2.0) is publicly accessible through a user-friendly web interface and as part of the STRING app for Cytoscape. In addition, we analyzed the agreement between datasets, across and within organisms, and identified that the agreement is mainly affected by the quality of the datasets rather than by the technologies used or organisms compared. Database URL: http://tissues.jensenlab.org/

“…Because prior depression is a risk factor for PPD (Miller, 2002) and depression per se is much more studied than PPD, many depression genes could be relevant to PPD. We also found enrichment with PPD genes that were produced by combining two small lists from DISEASES (Copenhagen) (Pletscher-Frankild et al, 2015) and Malacards (Rappaport et al, 2013). By combining matches of top 700 maternal genes with either depression (2 or more databases) or PPD, we identify a subset of genes that may be new high priority PPD genes (Fig.…”

Section: Depression and Postpartum Depression (mentioning)

confidence: 70%

Genetic and neuroendocrine regulation of the postpartum brain

Gammie, Driessen, Zhao, et al. 2016

Frontiers in Neuroendocrinology


Changes in expression of hundreds of genes occur during the production and function of the maternal brain that support a wide range of processes. In this review, we synthesize findings from four microarray studies of different maternal brain regions and identify a core group of 700 maternal genes that show significant expression changes across multiple regions. With those maternal genes, we provide new insights into reward-related pathways (maternal bonding), postpartum depression, social behaviors, mental health disorders, and nervous system plasticity/developmental events. We also integrate the new genes into well-studied maternal signaling pathways, including those for prolactin, oxytocin/vasopressin, endogenous opioids, and steroid receptors (estradiol, progesterone, cortisol). A newer transcriptional regulation model for the maternal brain is provided that incorporates recent work on maternal microRNAs. We also compare the top 700 genes with other maternal gene expression studies. Together, we highlight new genes and new directions for studies on the postpartum brain.


“…Counted at the level of individual mentions, the SPECIES and ENVIRONMENTS taggers showed precision of 83.9 and 87.8%, recall of 72.6 and 77.0%, and F1 scores of 78.8 and 82.0%, respectively. The quality of the NER of tissues and diseases has not been benchmarked directly; however, these NER components have been shown to give good results when used for co-mentioning-based extraction of protein–tissue and protein–disease associations (31, 32). In terms of perception metrics, the evaluators generally found the system to be intuitive, useful, well documented and sufficiently accurate to be helpful in spotting relevant text passages and extracting organism and environment terms (Figure 3 and Table 7).…”

Section: Results (mentioning)

confidence: 99%

Overview of the interactive task in BioCreative V

et al. 2016


Fully automated text mining (TM) systems promote efficient literature searching, retrieval, and review but are not sufficient to produce ready-to-consume curated documents. These systems are not meant to replace biocurators, but instead to assist them in one or more literature curation steps. To do so, the user interface is an important aspect that needs to be considered for tool adoption. The BioCreative Interactive task (IAT) is a track designed for exploring user-system interactions, promoting development of useful TM tools, and providing a communication channel between the biocuration and the TM communities. In BioCreative V, the IAT track followed a format similar to previous interactive tracks, where the utility and usability of TM tools, as well as the generation of use cases, have been the focal points. The proposed curation tasks are user-centric and formally evaluated by biocurators. In BioCreative V IAT, seven TM systems and 43 biocurators participated. Two levels of user participation were offered to broaden curator involvement and obtain more feedback on usability aspects. The full level participation involved training on the system, curation of a set of documents with and without TM assistance, tracking of time-on-task, and completion of a user survey. The partial level participation was designed to focus on usability aspects of the interface and not the performance per se. In this case, biocurators navigated the system by performing pre-designed tasks and then were asked whether they were able to achieve the task and the level of difficulty in completing the task. In this manuscript, we describe the development of the interactive task, from planning to execution and discuss major findings for the systems tested. Database URL: http://www.biocreative.org

