Daniel Domingo‐Fernández scite author profile

Pathway-centric approaches are widely used to interpret and contextualize -omics data. However, databases contain different representations of the same biological pathway, which may lead to different results of statistical enrichment analysis and predictive models in the context of precision medicine. We have performed an in-depth benchmarking of the impact of pathway database choice on statistical enrichment analysis and predictive modeling. We analyzed five cancer datasets using three major pathway databases and developed an approach to merge several databases into a single integrative one: MPath. Our results show that equivalent pathways from different databases yield disparate results in statistical enrichment analysis. Moreover, we observed a significant dataset-dependent impact on the performance of machine learning models on different prediction tasks. In some cases, MPath significantly improved prediction performance and also reduced the variance of prediction performances. Furthermore, MPath yielded more consistent and biologically plausible results in statistical enrichment analyses. In summary, this benchmarking study demonstrates that pathway database choice can influence the results of statistical enrichment analysis and predictive modeling. Therefore, we recommend the use of multiple pathway databases or integrative ones.
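
To make the abstract's central claim concrete, the following is a minimal sketch of an over-representation (hypergeometric) test applied to two hypothetical representations of the same pathway and to their union. The gene identifiers, the gene sets, and the helper names `hypergeom_pvalue` and `enrich` are illustrative assumptions, not data or code from the study, and the plain set union is only a stand-in for an integrative database such as MPath, whose actual merging procedure is not described here.

```python
from math import comb


def hypergeom_pvalue(universe_size, pathway_size, hits_size, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    X is the number of pathway genes among hits_size genes drawn without
    replacement from a universe that contains pathway_size pathway genes.
    """
    max_overlap = min(pathway_size, hits_size)
    total = comb(universe_size, hits_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, hits_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def enrich(universe, pathway, hits):
    """Restrict both gene sets to the universe and test their overlap."""
    pathway = pathway & universe
    hits = hits & universe
    return hypergeom_pvalue(
        len(universe), len(pathway), len(hits), len(pathway & hits)
    )


# Hypothetical toy data: 100 background genes and 6 differentially expressed genes.
universe = {f"G{i}" for i in range(1, 101)}
hits = {"G1", "G2", "G3", "G4", "G5", "G6"}

# Two made-up representations of the "same" pathway from different databases.
pathway_db_a = {"G1", "G2", "G3", "G10", "G11"}
pathway_db_b = {"G4", "G20", "G21", "G22", "G23", "G24", "G25", "G26"}

# Naive integration: the union of both representations (an assumption).
pathway_merged = pathway_db_a | pathway_db_b

p_values = {
    "database A": enrich(universe, pathway_db_a, hits),
    "database B": enrich(universe, pathway_db_b, hits),
    "merged": enrich(universe, pathway_merged, hits),
}

for name, p in p_values.items():
    print(f"{name}: p = {p:.4g}")
```

In this toy setting, the same list of hits is scored differently depending on which representation of the pathway is used, which is the kind of discrepancy the benchmarking study examines at scale.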

COVID-19 Knowledge Graph: a computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology

Domingo‐Fernández

Baksi

Schultz

et al. 2020

Summary: The COVID-19 crisis has elicited a global response by the scientific community that has led to a burst of publications on the pathophysiology of the virus. However, without coordinated efforts to organize this knowledge, it can remain hidden away from individual research groups. By extracting and formalizing this knowledge in a structured and computable form, such as a knowledge graph, researchers can readily reason over and analyze this information on a much larger scale. Here, we present the COVID-19 Knowledge Graph, an expansive cause-and-effect network constructed from scientific literature on the new coronavirus that aims to provide a comprehensive view of its pathophysiology. To make this resource available to the research community and facilitate its exploration and analysis, we also implemented a web application and released the KG in multiple standard formats. Availability: The COVID-19 Knowledge Graph is publicly available under a CC-0 license at https://github.com/covid19kg and https://bikmi.covid19-knowledgespace.de. Supplementary information: Supplementary data are available online.

BioKEEN: a library for learning and evaluating biological knowledge graph embeddings

Ali

Hoyt

Domingo‐Fernández

et al. 2019

Summary: Knowledge graph embeddings (KGEs) have received significant attention in other domains due to their ability to predict links and create dense representations for graphs’ nodes and edges. However, the software ecosystem for their application to bioinformatics remains limited and inaccessible for users without expertise in programming and machine learning. Therefore, we developed BioKEEN (Biological KnowlEdge EmbeddiNgs) and PyKEEN (Python KnowlEdge EmbeddiNgs) to facilitate their easy use through an interactive command line interface. Finally, we present a case study in which we used a novel biological pathway mapping resource to predict links that represent pathway crosstalks and hierarchies. Availability and implementation: BioKEEN and PyKEEN are open source Python packages publicly available under the MIT License at https://github.com/SmartDataAnalytics/BioKEEN and https://github.com/SmartDataAnalytics/PyKEEN. Supplementary information: Supplementary data are available at Bioinformatics online.

A Computational Approach for Mapping Heme Biology in the Context of Hemolytic Disorders

Humayun

Domingo‐Fernández

George

et al. 2020

Front. Bioeng. Biotechnol.

Heme is an iron ion-containing molecule found within hemoproteins such as hemoglobin and cytochromes that participates in diverse biological processes. Although excessive heme has been implicated in several diseases including malaria, sepsis, ischemia-reperfusion, and disseminated intravascular coagulation, little is known about its regulatory and signaling functions. Furthermore, the limited understanding of heme's role in regulatory and signaling functions is in part due to the lack of curated pathway resources for heme cell biology. Here, we present two resources aimed at exploiting this unexplored information to model heme biology. The first resource is a terminology covering heme-specific terms not yet included in standard controlled vocabularies. Using this terminology, we curated and modeled the second resource, a mechanistic knowledge graph representing the heme interactome based on a corpus of 46 scientific articles. Finally, we demonstrated the utility of these resources by investigating the role of heme in the Toll-like receptor signaling pathway. Our analysis proposed a series of crosstalk events that could explain the role of heme in activating the TLR4 signaling pathway. In summary, the presented work opens the door to the scientific community for exploring the published knowledge on heme biology.

Multimodal mechanistic signatures for neurodegenerative diseases (NeuroMMSig): a web server for mechanism enrichment

Domingo‐Fernández

Kodamullil

Iyappan

et al. 2017

Motivation: The concept of a ‘mechanism-based taxonomy of human disease’ is currently replacing the outdated paradigm of diseases classified by clinical appearance. We have tackled the paradigm of mechanism-based patient subgroup identification in the challenging area of research on neurodegenerative diseases. Results: We have developed a knowledge base representing essential pathophysiology mechanisms of neurodegenerative diseases. Together with dedicated algorithms, this knowledge base forms the basis for a ‘mechanism-enrichment server’ that supports the mechanistic interpretation of multiscale, multimodal clinical data. Availability and implementation: NeuroMMSig is available at http://neurommsig.scai.fraunhofer.de/. Supplementary information: Supplementary data are available at Bioinformatics online.

Using Multi-Scale Genetic, Neuroimaging and Clinical Data for Predicting Alzheimer’s Disease and Reconstruction of Relevant Biological Mechanisms

Khanna

Domingo‐Fernández

Iyappan

et al. 2018

Sci Rep

Alzheimer’s Disease (AD) is among the most frequent neuro-degenerative diseases. Early diagnosis is essential for successful disease management and for the chance to attenuate symptoms with disease-modifying drugs. In the past, a number of cerebrospinal fluid (CSF), plasma and neuro-imaging based biomarkers have been proposed. Still, in current clinical practice, AD diagnosis cannot be made until the patient shows clear signs of cognitive decline, which can partially be attributed to the multi-factorial nature of AD. In this work, we integrated genotype information, neuro-imaging as well as clinical data (including neuro-psychological measures) from ~900 normal and mild cognitively impaired (MCI) individuals and developed a highly accurate machine learning model to predict the time until AD is diagnosed. We performed an in-depth investigation of the relevant baseline characteristics that contributed to the AD risk prediction. More specifically, we used Bayesian Networks to uncover the interplay across biological scales between neuro-psychological assessment scores, single genetic variants, pathways and neuro-imaging related features. Together with information extracted from the literature, this allowed us to partially reconstruct biological mechanisms that could play a role in the conversion of normal/MCI into AD pathology. This in turn may open the door to novel therapeutic options in the future.

Evaluating the Alzheimer's disease data landscape

Birkenbihl

Salimi

Domingo‐Fernández

et al. 2020

A&D Transl Res & Clin Interv

Introduction: Numerous studies have collected Alzheimer's disease (AD) cohort data sets. To achieve reproducible, robust results in data‐driven approaches, an evaluation of the present data landscape is vital. Methods: Previous efforts relied exclusively on metadata and literature. Here, we evaluate the data landscape by directly investigating nine patient‐level data sets generated in major clinical cohort studies. Results: The investigated cohorts differ in key characteristics, such as demographics and distributions of AD biomarkers. Analyzing the ethnoracial diversity revealed a strong bias toward White/Caucasian individuals. We described and compared the measured data modalities. Finally, the available longitudinal data for important AD biomarkers was evaluated. All results are explorable through our web application ADataViewer (https://adata.scai.fraunhofer.de). Discussion: Our evaluation exposed critical limitations in the AD data landscape that impede comparative approaches across multiple data sets. Comparison of our results to those gained by metadata‐based approaches highlights that thorough investigation of real patient‐level data is imperative to assess a data landscape.

Linking COVID-19 and heme-driven pathophysiologies: A combined computational-experimental approach

Hopp

Domingo‐Fernández

Gadiya

et al. 2021

Preprint

The SARS-CoV-2 outbreak was declared a worldwide pandemic in 2020. Infection triggers the respiratory tract disease COVID-19, which is accompanied by serious changes in clinical biomarkers such as hemoglobin and interleukins. The same parameters are altered during hemolysis, which is characterized by an increase in labile heme. We present two computational-experimental approaches that aim at analyzing a potential link between heme-related and COVID-19 pathophysiologies. In the first approach, we performed a detailed analysis of the common pathways induced by heme and SARS-CoV-2 by superimposing knowledge graphs covering heme biology and COVID-19 pathophysiology, focusing on inflammatory pathways and distinct biomarkers as the linking elements. In the second approach, four COVID-19-related proteins, the host cell proteins ACE2 and TMPRSS2 as well as the viral protein 7a and S protein, were computationally analyzed as potential heme-binding proteins with an experimental validation. The results contribute to the understanding of the progression of COVID-19 infections in patients with different clinical backgrounds and might allow for a more individual diagnosis and therapy in the future.
