2016
DOI: 10.1016/j.jbi.2016.10.007

Semi-supervised learning of the electronic health record for phenotype stratification

Abstract: Patient interactions with health care providers result in entries to electronic health records (EHRs). EHRs were built for clinical and billing purposes but contain many data points about an individual. Mining these records provides opportunities to extract electronic phenotypes, which can be paired with genetic data to identify genes underlying common human diseases. This task remains challenging: high quality phenotyping is costly and requires physician review; many fields in the records are sparsely filled;…

Cited by 149 publications (122 citation statements)
References 25 publications
“…To tap this potential, we will require algorithms like eADAGE that robustly integrate these diverse datasets in a manner that is not limited to well-understood aspects of biology. Furthermore, while public compendia tend to be dominated by expression data, autoencoders have also been successfully applied to datasets based on large collections of electronic health records where they are particularly effective at dealing with missing data (Beaulieu-Jones et al, 2016; Miotto et al, 2016, Beaulieu-Jones and Moore, 2017). These features, along with their unsupervised nature, make DAs a promising approach for the integration of heterogeneous data types.…”
Section: Discussion (mentioning)
confidence: 99%
“…As reflected by the name, it belongs to the class of deep neural-network models (Ching, Zhu, et al, 2018; Alakwaa et al, 2018; Chaudhary et al, 2018). Recent years, deep learning and related deep neural network algorithms have gained much interest in the biomedical field (Ching, Himmelstein, et al, 2018), ranging from applications from extracting stable gene expression signatures in large sets of public data (Tan et al, 2017) to stratify phenotypes (Beaulieu-Jones et al, 2016) or impute missing values (Beaulieu-Jones and Moore, 2017) using electronic health record (EHR) data. Here, we construct DeepImpute models by splitting the genes into subsets and builds sub-networks to increase its efficacy and efficiency.…”
Section: Introduction (mentioning)
confidence: 99%
“…Interactive development tools, such as Jupyter 19,20, RMarkdown 21,22 and Sweave 23 can be incorporated to present the code and analysis in a logical graphical manner. For example, we recently used Jupyter with continuous analysis in our own publication 24 and corresponding repository 25. Reviewers can follow what was done in an audit fashion without having to install and run software while having confidence that analyses are reproducible.…”
Section: Results (mentioning)
confidence: 99%