DNA microarray technology has led to an explosion of oncogenomic analyses, generating a wealth of data and uncovering the complex gene expression patterns of cancer. Unfortunately, due to the lack of a unifying bioinformatic resource, the majority of these data sit stagnant and disjointed following publication, massively underutilized by the cancer research community. Here, we present ONCOMINE, a cancer microarray database and web-based data-mining platform aimed at facilitating discovery from genome-wide expression analyses. To date, ONCOMINE contains 65 gene expression datasets comprising nearly 48 million gene expression measurements from over 4700 microarray experiments. Differential expression analyses comparing most major types of cancer with respective normal tissues, as well as a variety of cancer subtypes and clinical-based and pathology-based analyses, are available for exploration. Data can be queried and visualized for a selected gene across all analyses or for multiple genes in a selected analysis. Furthermore, gene sets can be limited to clinically important annotations including secreted, kinase, membrane, and known gene-drug target pairs to facilitate the discovery of novel biomarkers and therapeutic targets.
Characterization of the prostate cancer transcriptome and genome has identified chromosomal rearrangements and copy number gains/losses, including ETS gene fusions, PTEN loss and androgen receptor (AR) amplification, that drive prostate cancer development and progression to lethal, metastatic castrate-resistant prostate cancer (CRPC) [1]. As less is known about the role of mutations [2–4], here we sequenced the exomes of 50 lethal, heavily pretreated metastatic CRPCs obtained at rapid autopsy (including three different foci from the same patient) and 11 treatment-naïve, high-grade localized prostate cancers. We identified low overall mutation rates even in heavily treated CRPC (2.00/Mb) and confirmed the monoclonal origin of lethal CRPC. Integrating exome copy number analysis identified disruptions of CHD1, which define a subtype of ETS fusion-negative prostate cancer. Similarly, we demonstrate that ETS2, which is deleted in ~1/3 of CRPCs (commonly through TMPRSS2:ERG fusions), is also deregulated through mutation. Further, we identified recurrent mutations in multiple chromatin/histone modifying genes, including MLL2 (mutated in 8.6% of prostate cancers), and demonstrate interaction of the MLL complex with AR, which is required for AR-mediated signaling. We also identified novel recurrent mutations in the AR collaborating factor FOXA1, which is mutated in 5 of 147 (3.4%) prostate cancers (both untreated localized prostate cancer and CRPC), and showed that mutated FOXA1 represses androgen signaling and increases tumour growth. Proteins that physically interact with AR, such as the ERG gene fusion product, FOXA1, MLL2, UTX, and ASXL1, were found to be mutated in CRPC. In summary, we describe the mutational landscape of a heavily treated metastatic cancer, identify novel mechanisms of AR signaling deregulated in prostate cancer, and prioritize candidates for future study.
Recurrent chromosomal rearrangements have not been well characterized in common carcinomas. We used a bioinformatics approach to discover candidate oncogenic chromosomal aberrations on the basis of outlier gene expression. Two ETS transcription factors, ERG and ETV1, were identified as outliers in prostate cancer. We identified recurrent gene fusions of the 5' untranslated region of TMPRSS2 to ERG or ETV1 in prostate cancer tissues with outlier expression. By using fluorescence in situ hybridization, we demonstrated that 23 of 29 prostate cancer samples harbor rearrangements in ERG or ETV1. Cell line experiments suggest that the androgen-responsive promoter elements of TMPRSS2 mediate the overexpression of ETS family members in prostate cancer. These results have implications in the development of carcinomas and the molecular diagnosis and treatment of prostate cancer.
DNA microarrays have been widely applied to cancer transcriptome analysis; however, the majority of such data are not easily accessible or comparable. Furthermore, several important analytic approaches have been applied to microarray analysis; however, their application is often limited. To overcome these limitations, we have developed Oncomine, a bioinformatics initiative aimed at collecting, standardizing, analyzing, and delivering cancer transcriptome data to the biomedical research community. Our analysis has identified the genes, pathways, and networks deregulated across 18,000 cancer gene expression microarrays, spanning the majority of cancer types and subtypes. Here, we provide an update on the initiative, describe the database and analysis modules, and highlight several notable observations. Results from this comprehensive analysis are available at http://www.oncomine.org.
Supplemental Figure 1 Method: All MS runs were compared and clustered using standard artMS (https://github.com/biodavidjm/artMS) procedures on observed feature intensities computed by MaxQuant. Supplemental Figure 1 shows all pairwise Pearson correlations between MS runs, clustered according to similar correlation patterns. Supplemental Figure 2 Method: See main text. Supplemental Figure 3 Method: PFAM domain enrichment analysis. The enrichment of individual PFAM domains (or PFAM clans) [1] was calculated with a hypergeometric test in which success is defined as the number of domains and the number of trials is the number of individual preys pulled down with each viral bait. The population values were the numbers of individual PFAM domains and clans in the human proteome. To ensure that the p-values signifying enrichment were meaningful, we considered only PFAM domains that were pulled down at least three times with any SARS-CoV-2 protein and that occur in the human proteome at least five times. In SI Figure 3 we show the PFAM domains/clans with the lowest p-value for a given viral bait protein.
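The hypergeometric enrichment test described for Supplemental Figure 3 can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, argument names, and the example counts are hypothetical, and it assumes that the population is counted in the same units as the successes (domain occurrences in the human proteome). The p-value is the upper tail P(X >= k), the probability of seeing at least the observed number of domain hits among the preys of one bait by chance.

```python
from math import comb


def enrichment_pvalue(population_size, population_hits, n_trials, observed_hits):
    """Upper-tail hypergeometric p-value P(X >= observed_hits).

    population_size : total count in the background (assumed: human proteome)
    population_hits : how often the PFAM domain/clan occurs in the background
    n_trials        : number of preys pulled down with one viral bait
    observed_hits   : how many of those preys carry the domain/clan
    """
    max_hits = min(population_hits, n_trials)
    total = comb(population_size, n_trials)
    tail = sum(
        comb(population_hits, i)
        * comb(population_size - population_hits, n_trials - i)
        for i in range(observed_hits, max_hits + 1)
    )
    return tail / total


def passes_filters(times_pulled_down, times_in_proteome):
    """Filters stated in the text: pulled down at least three times with any
    SARS-CoV-2 protein, and present at least five times in the human proteome."""
    return times_pulled_down >= 3 and times_in_proteome >= 5


if __name__ == "__main__":
    # Illustrative numbers only (not taken from the study).
    if passes_filters(times_pulled_down=4, times_in_proteome=60):
        p = enrichment_pvalue(
            population_size=20000,
            population_hits=60,
            n_trials=30,
            observed_hits=4,
        )
        print(f"enrichment p-value: {p:.3g}")
```

Exact integer binomial coefficients are used here so the sketch needs only the standard library; for large screens a statistics library's hypergeometric survival function would be the more usual choice.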