Name: Analyzing Tumor Gene Expression Factors with the CorExplorer Web Portal
Uploaded: 2019-10-11T15:00:00.000+00:00
Duration: 8 min
Description: Watch this Scientific Journal Video about Analyzing Tumor Gene Expression Factors with the CorExplorer Web Portal at JoVE.com

GV was supported by DARPA award W911NF-16-0575.

1. Exploring factors containing a gene of interest
<ol>
	<li>Open a web browser and go to http://corex.isi.edu, the CorExplorer home page.</li>
	<li>On the right side under Quick Links, click on the + expand button next to Ovarian (TCGA-OV) to see a summary of the CorEx factor graph that was trained on the TCGA ovarian cancer data (shown in Figure 1). Optionally, click on others to compare.</li>
	<li>Once finished inspecting the factor graphs, click on Lung (TCGA-LUAD) to access the CorExplorer page for lung cancer RNA-seq.
	<ol>
		<li>Explore the CorEx factor graph for a gene of interest using the CorExplorer &#8216;Factor Graph&#8217; window.
		<ol>
			<li>Move the mouse cursor over the factor graph display window. Zoom into the factor graph using the mouse scroll wheel or trackpad to see details of the graph such as the most important genes in each factor and the connections between nodes at different layers. Alternatively, click and drag to move the view area or any node.</li>
			<li>To find a target gene (here we&#39;ll use BRCA1), click on the Gene dropdown menu at the top of the factor graph window. Type &#8216;BRCA1&#8217; to select it in the dropdown list and press Return to make the view zoom to factor 26, the factor with which BRCA1 is most strongly correlated.</li>
			<li>Reposition the mouse over the graph display and scroll to zoom out to see the Level 2 node, L2_8, and its associated factors that are neighbors to factor 26. Note that only genes with weight greater than the threshold indicated on the Min link weight slider are shown.</li>
			<li>To see all of the genes associated with the factor, click on the L1_26 node and select Load additional genes in the pop-up window. When the word &#8216;Done&#8217; appears, close the pop-up window.</li>
			<li>Return to the header section above the factor graph window and drag the Min link weight slider down to 0.05. As the threshold is lowered, other genes in factor L1_26, including BRCA2, appear in order of weight. Optionally, reposition nodes by dragging them to improve the layout.</li>
		</ol>
		</li>
		<li>Determine how stratification of patients with respect to the factor affects survival by querying in the survival window.
		<ol>
			<li>In the survival window, uncheck Sort by p-val, then select factor 26 in the Single Factor dropdown menu in order to show survival curves for factor 26.</li>
			<li>Scroll down the survival graph to show the number of patients at risk along the x-axis.</li>
		</ol>
		</li>
		<li>Find associations with biological function by querying within the Annotation window.
		<ol>
			<li>In the annotation window, to sort the Factor dropdown menu by factor number rather than False Discovery Rate (FDR), uncheck FDR sort.</li>
			<li>Scroll and click to select factor 26 in the annotation window dropdown to show enrichment annotations for the factor.</li>
			<li>Scroll down the annotation list until DNA repair is visible and click on it to immediately see associated genes highlighted in yellow on the graph display. See the middle panel of Figure 2.</li>
			<li>Note that factors disappear or appear as different GO terms are selected, according to whether or not they are enriched for genes with the selected annotation, e.g. &#8216;intrinsic apoptotic signaling pathway in response to DNA damage.&#8217;</li>
		</ol>
		</li>
		<li>Explore the factors further by adding windows with different functionality.
		<ol>
			<li>From the top menu bar, add a protein-protein interaction network (PPI) window by selecting PPI from the Add Window dropdown, then click the Add button to add a PPI graph window to the display area. In the PPI graph window, choose factor &#8216;Layer1: 26&#8217; to show the protein-protein interactions. Note the density of connections.</li>
			<li>From the top menu bar, select Heatmap instead of PPI from the Add Window dropdown, then click the Add button to add a heatmap window to the display area. In the heatmap window, choose factor &#8216;Layer1: 26&#8217; to show the gene expression patterns.</li>
			<li>Grab and reposition the heatmap window so that the survival window is also visible. Along the top of the heatmap, observe how the orange/blue/grey colored bar corresponds to the patient risk strata on the survival graph. Results are shown at the bottom of Figure 2.</li>
		</ol>
		</li>
	</ol>
	</li>
</ol>
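The effect of the Min link weight slider in steps 1.3.1.3 to 1.3.1.5 can be pictured as a simple threshold on gene weights. The following is a minimal sketch, not the portal's own code: the function name, gene names, and weights are hypothetical and serve only to show that lowering the threshold reveals additional genes in weight order.

```python
from typing import Dict, List, Tuple


def genes_above_threshold(
    weights: Dict[str, float], min_weight: float
) -> List[Tuple[str, float]]:
    """Return (gene, weight) pairs whose weight exceeds min_weight, heaviest first."""
    kept = [(gene, w) for gene, w in weights.items() if w > min_weight]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights for one factor; these are NOT values taken from the portal.
factor_weights = {"GENE_A": 0.42, "GENE_B": 0.18, "GENE_C": 0.07, "GENE_D": 0.03}

# A higher threshold shows only the strongest genes.
visible_default = genes_above_threshold(factor_weights, 0.15)

# Lowering the threshold to 0.05 reveals more genes, still in weight order.
visible_lowered = genes_above_threshold(factor_weights, 0.05)
```

The same idea underlies the cluster-quality check in section 2: a factor with many genes but no high weights leaves nothing visible once the threshold is raised.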
2. Filtering and interpreting CorEx factors using gene weight, survival, and annotation data
<ol>
	<li>Filter for factors of interest using survival and cluster quality.
	<ol>
		<li>From the Dataset dropdown menu at the top, select TCGA_OVCA to go to the CorExplorer page for the TCGA ovarian cancer RNA-seq.</li>
		<li>Once the page has loaded, note from the survival window that the factor with the largest survival differential for different strata is 114.</li>
		<li>At the top of the factor graph window select &#8216;Layer1: 114&#8217; from the Factor dropdown.</li>
		<li>Grab the link weight slider with the mouse and move it up to 0.5. Note that the large number of genes in factor 114 (1609), with none having weight &#62;0.35, indicates a relatively weak clustering.</li>
		<li>Next, expand the list of factors in the survival window and select the next best factor in the survival window dropdown, factor 39, to show its associated survival curves.</li>
		<li>Select factor 39 in the annotation window by clicking on it. The significant GO and KEGG annotations are shown.</li>
	</ol>
	</li>
	<li>To gain a better understanding of the biological role of genes in factor 39, interpret the factors using neighborhood annotation information as follows.
	<ol>
		<li>At the top of the factor graph window, select factor &#8216;Layer1: 39&#8217; in the factor dropdown. Then, move the mouse over the factor graph window and zoom out to reveal the entire L2_14 cluster with 6 factors: 14, 32, 39, 42, 52, and 82 (shown in Figure 3).</li>
		<li>To understand the relative significance of the factors linked to the L2_14 node, start by viewing survival differentials for each of the L2_14 factors. Uncheck Sort by p-val in the survival window and then click on each of the factor numbers in succession. Doing this, note that only factors 14, 32, and 39 display a survival association.</li>
		<li>Now from the top menu bar, select PPI from the Add Window dropdown once again. Press Add to add a PPI graph window to the display area. In the PPI graph window, select factor &#8216;Layer1: 52&#8217; to show the protein-protein interactions that are significant. An example layout of windows at this point is shown in Figure 3.</li>
		<li>Click the View at StringDB link at the bottom of the PPI window to link out to the StringDB online database. Click Continue from the first screen, then select the Analysis tab below the network graph to get an online GO analysis for the PPI network genes. The top cellular component is &#8216;MHC class II protein complex.&#8217;</li>
		<li>Return to the CorExplorer tab and, in the PPI window, select factor 32, this time from the factor dropdown. Click the View at StringDB link to link out to the StringDB analysis. The top cellular component is &#8216;MHC class I protein complex,&#8217; in contrast to class II for factor 52 in the previous step.</li>
		<li>Finally, go back to the PPI window and select &#8216;Layer1: 39&#8217; from the factor dropdown menu at the top. Click the link View at StringDB to link out to the StringDB analysis.</li>
		<li>Click Continue from the first screen, then select the Analysis tab below the network graph to get an online GO analysis for the PPI network genes. Observe that the top molecular function is &#8216;CXCR3 chemokine receptor binding.&#8217;</li>
	</ol>
	</li>
</ol>
3. Using survival and database annotations to look for promising therapeutic combinations
<ol>
	<li>Switch to the TCGA melanoma CorExplorer by selecting TCGA_SKCM from the Dataset dropdown menu.</li>
	<li>Note that the factor with the largest survival differential is factor 171. Examine the factor 171 annotations by scrolling and note that &#8216;immune response&#8217; and &#8216;cytokine-mediated signaling pathway&#8217; are near the top (as they were for the top ovarian factor).</li>
	<li>To find a complementary factor, examine the top survival-associated factors along with their top annotation terms. To do this, click on the Dataset overview link in the top menu bar to open a separate tab containing a table with dataset processing details as well as a summary of top factors according to p-value of the survival differential. Note that the first non-immune factor is 88.</li>
	<li>Return to the TCGA_SKCM browser tab.</li>
	<li>Select factor 88 in the survival, annotation, and graph windows. The top several GO terms are related to &#8216;rRNA processing&#8217; and &#8216;mitochondrion organization,&#8217; confirming it as distinct from the immune-related factors.</li>
	<li>In the survival window, on the paired factors dropdown, select &#8216;88_171&#8217; to see how survival is improved for patients in the middle stratum for the combined 171 and 88 expression factors. Annotation and survival comparisons are illustrated in Figure 4.</li>
</ol>
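The survival window compares curves for patient strata and lists the number of patients at risk. As a rough illustration of how such a curve can be computed, the sketch below implements the standard Kaplan-Meier (product-limit) estimator. It assumes the displayed curves are of this type, which the protocol does not state, and the follow-up times and event flags are invented for illustration.

```python
from typing import List, Tuple


def kaplan_meier(
    times: List[float], events: List[int]
) -> List[Tuple[float, float, int]]:
    """Return (time, survival probability, number at risk) at each event time.

    events[i] is 1 if a death was observed at times[i] and 0 if the
    patient was censored at that time.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    n_at_risk = len(times)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(order) and times[order[i]] == t:
            deaths += events[order[i]]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival, n_at_risk))
        n_at_risk -= removed
    return curve


# Hypothetical follow-up data (months) for one stratum; 1 = death, 0 = censored.
demo_curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

Running the estimator separately on each stratum (for example, the strata defined by a single factor or by a factor pair such as 88_171) gives the curves to compare.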
4. Finding commonalities and differences of gene expression variation across tumor types using the Search page
<ol>
	<li>Click on the CorExplorer heading to return to the front page.</li>
	<li>Click on Search on the top menu bar to go to a page allowing searching over all the datasets on the CorExplorer site.</li>
	<li>In the Gene search box, enter &#8216;FLT1&#8217; (VEGFR1) and hit Return or press Search. FLT1 is found with a relatively high weight in the following factors: OVCA - 76, LUAD - 162, SKCM - 195 and SKCM - 184, as well as COAD - 112 and COAD - 74.</li>
	<li>Alternatively, search for a related GO term across all the datasets. In the &#8216;GO Search&#8217; box, type &#8216;angiogenesis&#8217; and hit Return or press Search. All FLT1 factors, with the exception of SKCM-195, are listed as statistically enriched for &#8216;angiogenesis&#8217; genes; factor 195 does, in fact, have the annotation, but it does not pass the default 10<sup>-8</sup> threshold. Search results for this and the prior step are shown in Figure 5.</li>
	<li>As further examples, in the GO search box, first type &#8216;epidermal growth factor receptor.&#8217; Only LUAD is enriched for this term, a well-known stratification factor for lung cancer. Next, type &#8216;mesenchymal&#8217; in the search box. This term is enriched in gene expression groups for OVCA, where it is a well-studied stratification factor.</li>
</ol>

<ol>
	<li>Petryszak, R., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=The+RNASeq-er+API-a+gateway+to+systematically+updated+analysis+of+public+RNA-seq+data.">The RNASeq-er API-a gateway to systematically updated analysis of public RNA-seq data.</a> Bioinformatics. 33, 2218-2220 (2017).</li><li>Steeg, G. V., Galstyan, A. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Maximally+Informative+Hierarchical+Representations+of+High-Dimensional+Data.">Maximally Informative Hierarchical Representations of High-Dimensional Data.</a> Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS). , (2015).</li><li>Ver Steeg, G., Galstyan, A. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Discovering+structure+in+high-dimensional+data+through+correlation+explanation.">Discovering structure in high-dimensional data through correlation explanation.</a> Advances in Neural Information Processing Systems. , (2014).</li><li>Pepke, S., Ver Steeg, G. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Comprehensive+discovery+of+subsample+gene+expression+components+by+information+explanation:+therapeutic+implications+in+cancer.">Comprehensive discovery of subsample gene expression components by information explanation: therapeutic implications in cancer.</a> BMC medical Genomics. 10, 12 (2017).</li><li>Byron, S. A., Van Keuren-Jensen, K. R., Engelthaler, D. M., Carpten, J. D., Craig, D. W. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Translating+RNA+sequencing+into+clinical+diagnostics:+opportunities+and+challenges.">Translating RNA sequencing into clinical diagnostics: opportunities and challenges.</a> Nature Reviews Genetics. 17, 257 (2016).</li><li>Cancer Genome Atlas Research Network. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Comprehensive+molecular+profiling+of+lung+adenocarcinoma.">Comprehensive molecular profiling of lung adenocarcinoma.</a> Nature. 511, 543 (2014).</li><li>Cancer Genome Atlas Network. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Comprehensive+molecular+characterization+of+human+colon+and+rectal+cancer.">Comprehensive molecular characterization of human colon and rectal cancer.</a> Nature. 487, 330 (2012).</li><li>Akbani, R., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Genomic+classification+of+cutaneous+melanoma.">Genomic classification of cutaneous melanoma.</a> Cell. 161, 1681-1696 (2015).</li><li>Cancer Genome Atlas Research Network. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Integrated+genomic+analyses+of+ovarian+carcinoma.">Integrated genomic analyses of ovarian carcinoma.</a> Nature. 474, 609 (2011).</li><li>Grossman, R. L., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Toward+a+shared+vision+for+cancer+genomic+data.">Toward a shared vision for cancer genomic data.</a> New England Journal of Medicine. 375, 1109-1112 (2016).</li><li>Moynahan, M. 
E., Chiu, J. W., Koller, B. H., Jasin, M. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Brca1+controls+homology-directed+DNA+repair.">Brca1 controls homology-directed DNA repair.</a> Molecular Cell. 4, 511-518 (1999).</li><li>Szklarczyk, D., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=STRING+v11:+protein&#8211;protein+association+networks+with+increased+coverage,+supporting+functional+discovery+in+genome-wide+experimental+datasets.">STRING v11: protein&#8211;protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets.</a> Nucleic Acids Research. 47, 607-613 (2018).</li><li>Durgeau, A., Virk, Y., Corgnac, S., Mami-Chouaib, F. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Recent+advances+in+targeting+CD8+T-cell+immunity+for+more+effective+cancer+immunotherapy.">Recent advances in targeting CD8 T-cell immunity for more effective cancer immunotherapy.</a> Frontiers in Immunology. 9, 14 (2018).</li><li>Sato, E., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Intraepithelial+CD8++tumor-infiltrating+lymphocytes+and+a+high+CD8+/regulatory+T+cell+ratio+are+associated+with+favorable+prognosis+in+ovarian+cancer.">Intraepithelial CD8+ tumor-infiltrating lymphocytes and a high CD8+/regulatory T cell ratio are associated with favorable prognosis in ovarian cancer.</a> Proceedings of the National Academy of Sciences of the United States of America. 102, 18538-18543 (2005).</li><li>De Moura, M. B., et al. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Mitochondrial+respiration-an+important+therapeutic+target+in+melanoma.">Mitochondrial respiration-an important therapeutic target in melanoma.</a> PLoS One. 7, 40690 (2012).</li><li>Folkman, J., Merler, E., Abernathy, C., Williams, G. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Isolation+of+a+tumor+factor+responsible+for+angiogenesis.">Isolation of a tumor factor responsible for angiogenesis.</a> Journal of Experimental Medicine. 133, 275-288 (1971).</li><li>Takahashi, S. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Vascular+endothelial+growth+factor+(VEGF),+VEGF+receptors+and+their+inhibitors+for+antiangiogenic+tumor+therapy.">Vascular endothelial growth factor (VEGF), VEGF receptors and their inhibitors for antiangiogenic tumor therapy.</a> Biological and Pharmaceutical Bulletin. 34, 1785-1788 (2011).</li><li>Subramanian, A., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Gene+set+enrichment+analysis:+a+knowledge-based+approach+for+interpreting+genome-wide+expression+profiles.">Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.</a> Proceedings of the National Academy of Sciences of the United States of America. 102, 15545-15550 (2005).</li><li>Cerami, E., et al. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=The+cBio+Cancer+Genomics+Portal:+An+Open+Platform+for+Exploring+Multidimensional+Cancer+Genomics+Data.">The cBio Cancer Genomics Portal: An Open Platform for Exploring Multidimensional Cancer Genomics Data.</a> Cancer Discovery. 2, 401-404 (2012).</li><li>Gao, J., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Integrative+Analysis+of+Complex+Cancer+Genomics+and+Clinical+Profiles+Using+the+cBioPortal.">Integrative Analysis of Complex Cancer Genomics and Clinical Profiles Using the cBioPortal.</a> Science Signaling. 6, 1 (2013).</li><li>Reich, M., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=GenePattern+2.0.">GenePattern 2.0.</a> Nature Genetics. 38, 500 (2006).</li><li>Wang, Y. E., Kuznetsov, L., Partensky, A., Farid, J., Quackenbush, J. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=WebMeV:+A+Cloud+Platform+for+Analyzing+and+Visualizing+Cancer+Genomic+Data.">WebMeV: A Cloud Platform for Analyzing and Visualizing Cancer Genomic Data.</a> Cancer Research. 77, 11-14 (2017).</li><li>Morpheus. Available from: <a target="_blank" href="https://software.broadinstitute.org/morpheus">https://software.broadinstitute.org/morpheus</a> (2019)</li><li>Weitschek, E., Lauro, S. D., Cappelli, E., Bertolazzi, P., Felici, G. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=CamurWeb:+a+classification+software+and+a+large+knowledge+base+for+gene+expression+data+of+cancer.">CamurWeb: a classification software and a large knowledge base for gene expression data of cancer.</a> BMC Bioinformatics. 19, 354 (2018).</li><li>Chou, P. -. H., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=tACCo,+a+Database+Connecting+transcriptome+Alterations,+pathway+Alterations+and+Clinical+outcomes+in+Cancers.">tACCo, a Database Connecting transcriptome Alterations, pathway Alterations and Clinical outcomes in Cancers.</a> Scientific Reports. 9, 3877 (2019).</li></ol>

The authors declare that they have no competing financial interests.

We have presented the CorExplorer site, a publicly accessible web server for interactive exploration of maximally correlated gene expression factors learned from tumor RNA-seq by the CorEx algorithm. We have shown how the website may be used to stratify patients according to tumor gene expression, and how such stratification corresponds to biological function and survival.
Other web servers for RNA-seq analysis have been built. Differential and co-expression analysis for tumors can be examined and integrated with other data types in cBioPortal<sup>19,20</sup>. The servers GenePattern<sup>21</sup>, MeV<sup>22</sup>, and Morpheus<sup>23</sup> incorporate established clustering techniques such as principal component analysis (PCA), k-means, or self-organizing maps (SOMs). More innovative efforts include CamurWeb<sup>24</sup>, based upon an automated rule-generating classifier, and tACCo<sup>25</sup>, which implements random forest classifiers and the lasso. The CorEx algorithm used here optimizes multivariate information in order to find a hierarchy of factors that explain patterns in data. The nonlinear and hierarchical factor learning appears to yield improved interpretability relative to the linear global factors found via PCA<sup>4</sup>. Additionally, the technique&#39;s fine-grained parsing of sample signals allows precise tumor comparisons vis-&#224;-vis more commonly used broad subtypes. This combination of overlapping and hierarchical factor analysis distinguishes the CorExplorer from most other approaches and necessitates new tools for visualization and summarization.
A critical part of the CorExplorer factor analysis is the capability to explore not just several, but over 100 factors with informative gene patterns that are placed within an overlapping hierarchy. The CorExplorer facilitates the mining of these myriad factors for biological and clinical associations and allows for exceptionally detailed characterization of individual tumors. The unsupervised learning of such a large number of factors means that not all will be relevant to disease biology. It is therefore essential either to use annotations or known genes to pull out factors of interest, or to search for factors associated with clinical data such as survival. The CorExplorer allows users to carry out this important filtering step. The presence of factor gene patterns in a tumor may even suggest an approach to personalized oncology treatment. Further, the multiplicity of factor scores for each tumor allows for discovery of potentially useful therapeutic combinations.
It is sometimes the case that no significant GO annotations appear for factors highly correlated with survival. While this may occur due to noisy or under-sampled data, there are other possible causes, such as a cluster size that is too small to register significant enrichment scores or the group being a &#8216;basket&#8217; of single genes from diverse pathways without coherent biological association. Additionally, a category of annotation different from KEGG and GO biological process, e.g., cellular component, may be appropriate. These can be accessed by linking out to StringDB as demonstrated in the protocol. The Gene Ontology enrichment analysis on the CorExplorer site currently does not account for the gene weighting in a factor, though this will likely be remedied in the near future. Note that a gene list option is available under &#8216;Add Window&#8217; that allows for download of the complete factor gene list for further analysis with external tools.
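To illustrate why a very small cluster may fail to register a significant enrichment score, the sketch below computes a generic over-representation p-value with the hypergeometric test. This is an assumption made for illustration: the portal's actual enrichment calculation and its multiple-testing correction are not described here, and the background size and annotation counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(
    n_background: int, n_annotated: int, n_factor: int, n_overlap: int
) -> float:
    """P(overlap >= n_overlap) when n_factor genes are drawn without replacement
    from n_background genes, of which n_annotated carry the annotation."""
    total = comb(n_background, n_factor)
    upper = min(n_annotated, n_factor)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_factor - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical setting: 20,000 background genes, 500 of them annotated.
# A 4-gene factor in which every gene carries the annotation:
p_small = hypergeom_pvalue(20000, 500, 4, 4)

# A 100-gene factor in which 30 genes carry the annotation:
p_large = hypergeom_pvalue(20000, 500, 100, 30)
```

Under these assumed numbers, even a perfectly annotated 4-gene factor cannot reach a 10^-8 cutoff, whereas the larger factor passes it easily.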
For the purposes of the website, CorEx was run on each of the datasets five times and the run that resulted in the greatest overall Total Correlation was retained. Having a statistical representation of the results of multiple runs may be more informative and is a goal for future work. Additionally, the set of tumor types available on the server is rather small, but we expect this to expand over time according to user interest.
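The run-selection procedure described above amounts to keeping the best of several randomly initialized fits. The following is a minimal sketch under stated assumptions: `fit` stands in for whatever routine trains CorEx with a given seed and returns its Total Correlation together with the fitted model, and the numbers in `fake_tc` are invented so that the example runs without any external library.

```python
from typing import Callable, Iterable, Tuple


def best_of_n_runs(
    fit: Callable[[int], Tuple[float, object]], seeds: Iterable[int]
) -> Tuple[int, float, object]:
    """Run fit(seed) for each seed and keep the run with the highest score."""
    best = None
    for seed in seeds:
        total_correlation, model = fit(seed)
        if best is None or total_correlation > best[1]:
            best = (seed, total_correlation, model)
    if best is None:
        raise ValueError("at least one seed is required")
    return best


# Stand-in for a real training call; the scores below are hypothetical.
fake_tc = {0: 10.2, 1: 11.7, 2: 9.9, 3: 11.1, 4: 10.8}


def fake_fit(seed: int) -> Tuple[float, str]:
    return fake_tc[seed], f"model-{seed}"


best_seed, best_tc, best_model = best_of_n_runs(fake_fit, range(5))
```

Retaining all five results instead of only the maximum would be the starting point for the statistical summary of multiple runs mentioned as future work.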
As outlined above, the CorExplorer visualizes CorEx RNA-seq factor relationships along with clinical and database information, thus enabling a variety of different modes of interrogation. We are hopeful that this tool will lead to further work to utilize the power of RNA-seq analysis for discovery and clinical application in oncology.

Since its introduction just over a decade ago, RNA-seq has become a ubiquitous tool for measuring gene expression<sup>1</sup>. This is because it enables rapid and cheap de novo profiling of the entire transcriptome of a sample. However, RNA-seq tumor data reflects an underlying biology that is intrinsically complex and often under-sampled, while the data itself is high-dimensional and noisy. This presents a significant challenge for extracting reliable signals. The CorEx algorithm leverages multivariate mutual information to find subtle patterns in such situations<sup>2,3</sup>. This technique was previously adapted to analyze ovarian tumor RNA-seq samples from The Cancer Genome Atlas (TCGA), and in this context it appeared to have significant advantages over more commonly used analysis methods<sup>4</sup>.
Though the use of RNA-seq is enormously widespread in research applications, including in oncology, those efforts have not led to broad utilization for the purposes of clinical interventions<sup>5</sup>. Part of the reason for this is a lack of user-friendly algorithms and software targeted to these specific problems. To help bridge this gap, we have designed the CorExplorer web portal to enable researchers from a variety of backgrounds to study gene expression factors of tumor RNA-seq samples as found by the CorEx machine learning algorithm. The CorExplorer portal supports interactive visualization and querying of factors from several different tumor types including lung, colon, melanoma, and ovarian<sup>6,7,8,9,10</sup>, with the intent of helping researchers to sift through the data correlations and identify candidate pathways to stratify patients for therapeutic purposes.
We expect the CorExplorer portal may be useful to several types of users. The portal was designed with the user in mind who wishes to understand the broad factors driving tumoral gene expression differences in public databases and possibly also place individual gene expression profiles in the context of tumors with similar characteristics. In addition to the representative protocols outlined here, CorExplorer investigations may serve as a starting point to suggest hypotheses for further testing, to compare and contrast CorEx findings on datasets outside of the CorExplorer, and to connect pathological expression signatures of one or a few genes in an individual tumor to larger groups that may be coordinately affected. Finally, it may serve as a user-friendly introduction to the application of machine learning to RNA-seq for those getting started in the field.

<table><tbody><tr><td>Public server for CorExplorer website</td><td>USC</td><td>http://corex.isi.edu</td><td>Intel Xeon E5-2690 4-core 2.6 GHz, 8GB RAM. Backend architecture is LAMP: Linux, Apache, MySQL, PHP.</td></tr><tr><td>Web browser</td><td>Google/Apple</td><td>Chrome/Safari</td><td>Verified web browsers.</td></tr></tbody></table>


Differential gene expression analysis is an important technique for understanding disease states. The machine learning algorithm CorEx has shown utility in analyzing differential expression of groups of genes in tumor RNA-seq in a way that may be helpful for advancing precision oncology. However, CorEx produces many factors that can be challenging to analyze and connect to existing understanding. To facilitate such connections, we have built a website, CorExplorer, that allows users to interactively explore the data and answer common questions related to its analysis. We trained CorEx on RNA-seq gene expression data for four tumor types: ovarian, lung, melanoma, and colorectal. We then incorporated corresponding survival, protein-protein interactions, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichments, and heatmaps into the website for association with the factor graph visualization. Here we employ example protocols to illustrate use of the database for comprehending the significance of the learned tumor factors in the context of this external data.

Searching for the gene &#8216;BRCA1&#8217; in the lung cancer dataset reveals it to be most strongly associated with CorEx factor 26 (Figure 2). GO term enrichment for this factor is seen to be extremely high, with DNA repair exhibiting an FDR of only 1 x 10<sup>-19</sup>. The selection also draws attention to the second-level cluster L2_8, which has six closely related factors as children. Selecting &#8216;DNA repair&#8217; in either the GO term annotations or the factor graph&#39;s GO enriched dropdown highlights associated genes in each of the factors, with factor 26 having by far the most, as expected<sup>11</sup>. The protein-protein interaction network is strongly connected, further supporting the tightly linked functionality of the genes in factor 26. The associated survival graph suggests a possible association with patient survival, but this would have to be confirmed in a larger dataset.
Starting with survival can allow dissection of reasons for improved survival associated with particular gene expression groups. As an example, the top factor influencing survival for ovarian cancer is seen to be number 39, which is strongly enriched for genes associated with the immune system (Figure 3). Five other factors associated with the same level 2 node are also indicated to be immune-related; however, the survival impact appears to vary strongly among them, with 39 being the highest and 52 being the lowest. Adding a protein-protein interaction window for a factor shows the immediate interaction network and allows for link out to the StringDB<sup>12</sup> website to query various enrichments for the PPI network genes. Doing this for each of the L2_14 factors in turn, one finds that the StringDB enrichments suggest the following possible explanation for the associations with survival. Factor 32 contains genes that make up the major histocompatibility complex (MHC) class I protein complex, which is recognized by cytotoxic T lymphocytes. Factor 39 corresponds to cytokine signaling and CXCR3 receptor binding, related to CD8+ T lymphocytes. Both of these factors appear to confer a significant survival advantage for patients exhibiting relatively high expression of the corresponding genes. Cytotoxic CD8+ T lymphocytes are primarily responsible for anti-tumor immunity. Factor 52, on the other hand, is composed of genes coding for proteins in the MHC class II complex, which are recognized primarily by CD4+ T helper cells rather than directly by cytotoxic T lymphocytes. The remaining L2_14 factors reflect generalized immune system activation that doesn&#39;t differentiate the two types of lymphocyte populations. A survival association specific to cytotoxic T lymphocyte recognition of MHC class I cellular antigens is consistent with our understanding of antitumor immunity in general and from other cancers such as melanoma<sup>13,14</sup>.
The web portal supports the discovery of pairs of factors with complementary functions that may suggest effective tumor-specific combination therapies. The dataset overview can be scanned for factors that show a correlation with survival yet have distinct GO enrichments. For melanoma (TCGA_SKCM; Figure 4), the top survival factor, 171, is immune related, while factor 88, further down the list, shows enrichment for genes related to mitochondrion organization. Indeed, mitochondrion organization has been suggested as a target in melanoma<sup>15</sup>. Adding survival windows to the CorExplorer page allows comparison of stratification using the factor pair to that of each factor individually, showing that patients with favorable gene expression patterns for both factors trend toward better survival than those stratified by either factor alone. The top stratum does not appear to improve, however, suggesting that immunotherapy alone may be the best option for some patients.
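The comparison of single-factor and paired-factor stratification can be sketched offline. The Python below is a minimal illustration only, not the portal's implementation: the function names, the toy cohort, the 0.5 cutoffs, and the assumption that a high factor score is the favorable direction are all hypothetical. It labels each patient by how many of two factor scores exceed their cutoffs and computes a Kaplan-Meier estimate per stratum.

```python
from collections import defaultdict


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = defaultdict(int)
    leaving = defaultdict(int)
    for t, e in zip(times, events):
        leaving[t] += 1   # every patient leaves the risk set at their own time
        deaths[t] += e    # only observed deaths reduce the survival estimate
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        if deaths[t]:
            surv *= 1.0 - deaths[t] / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


def stratify_by_pair(score_a, score_b, cut_a, cut_b):
    """Label patients by whether neither, one, or both scores exceed the cutoffs."""
    labels = []
    for a, b in zip(score_a, score_b):
        n_high = (a > cut_a) + (b > cut_b)
        labels.append(("neither", "one", "both")[n_high])
    return labels


# Hypothetical toy cohort (not taken from TCGA or the portal).
times = [5, 8, 12, 12, 20, 25, 30, 34]
events = [1, 1, 1, 0, 1, 0, 1, 0]
score_a = [0.1, 0.2, 0.9, 0.3, 0.8, 0.7, 0.9, 0.8]
score_b = [0.2, 0.7, 0.1, 0.4, 0.3, 0.9, 0.8, 0.9]

labels = stratify_by_pair(score_a, score_b, cut_a=0.5, cut_b=0.5)
for group in ("neither", "one", "both"):
    idx = [i for i, lab in enumerate(labels) if lab == group]
    curve = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    print(group, curve)
```

With real data, the per-stratum curves would be compared against those from each factor alone, and a formal test such as the log-rank test would be needed before drawing conclusions.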
Commonalities and differences among tumors can be seen by searching across datasets for genes or GO terms (Figure 5). As an example, FLT1 (also known as VEGFR1) is a well-studied pro-angiogenic marker<sup>16,17</sup>. When it is entered into the search bar, all of the tumors are seen to have factors in which FLT1 plays a major role. Conversely, when the GO term &#8216;angiogenesis&#8217; is entered on the search page, five of the six FLT1 factors appear with that enrichment; the exception is SKCM-195. This sixth factor does, in fact, carry the annotation, but it does not pass the default 10<sup>-8</sup> threshold. When the weighting within the factor list is used in an alternative enrichment calculator, e.g., Gene Set Enrichment Analysis (GSEA)<sup>18</sup>, the sixth factor is found to be significantly enriched for &#8216;angiogenesis&#8217; genes as well.
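To illustrate why a factor can carry an annotation yet miss a strict cutoff, here is a minimal sketch of an unweighted over-representation test (a one-sided hypergeometric tail). It is not the portal's enrichment procedure and it is not GSEA, which additionally uses the gene weights; the function name and all counts are hypothetical, and the 10<sup>-8</sup> cutoff is applied here to a raw p-value for simplicity, whereas the portal reports FDR values.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_annotated carry the GO term."""
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


CUTOFF = 1e-8  # default threshold mentioned in the text

# Hypothetical counts: 20000 background genes, 300 annotated with the term,
# 50 genes in the factor, and two possible overlaps with the term.
for overlap in (5, 12):
    p = hypergeom_enrichment_p(20000, 300, 50, overlap)
    print(overlap, p, "passes" if p < CUTOFF else "does not pass")
```

A modest overlap can be real yet fail a stringent cutoff, which is one reason a weight-aware method may reach a different conclusion for the same factor.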
It is important to check the heatmaps to ensure the gene expression pattern is of adequate quality to support biological interpretations. Heatmaps that show strong, clear variation may exhibit either coordinated expression of the factor genes ranging from low to high or more complex patterns in which low expression of some genes is correlated with high expression of others (Figure 6). A key marker of a high-quality grouping is the presence of several genes with a smooth variation in expression as a function of factor score. The factor heatmaps show samples ordered according to factor score, so there should be a smooth gradient moving from left to right. However, this can fail to happen in at least two different ways. Most commonly, the correlations may be extremely noisy (Figure 6, factor 161), calling into question the robustness and utility of any inferences regarding survival and/or biological function. Also, patterns that occur only in a small minority of samples may not conform to the model of three expression states assumed by the CorEx algorithm, resulting in a misleading classification of the samples (right side of the factor 9 heatmap in Figure 6).
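The visual check for a smooth left-to-right gradient can be complemented by a rough numeric proxy. The sketch below is an assumption-laden illustration, not something the portal computes: it uses the Spearman rank correlation between the factor score and each gene's expression, on made-up values, with hypothetical function and gene names.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical factor scores and expression values for two genes.
factor_score = [0.1, 0.4, 0.5, 0.8, 0.9]
expression = {
    "smooth_gene": [1.0, 2.0, 3.0, 4.0, 5.0],
    "noisy_gene": [3.0, 1.0, 5.0, 2.0, 4.0],
}
for gene, values in expression.items():
    print(gene, round(spearman(factor_score, values), 3))
```

A low absolute correlation flags the noisy case. It does not flag the second failure mode, since an abrupt change confined to a small subgroup can still yield a high rank correlation, so inspecting the heatmap itself remains necessary.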
<img alt="Figure 1" class="xfigimg" src="/files/ftp_upload/60431/60431fig01.jpg" /> 
Figure 1: CorExplorer front page.&#160;After clicking on + next to Ovarian Cancer under Quick Links, factor graph details are shown. The CorEx hierarchical model is made up of input variables (gene expression in this case) on the bottom layer and inferred latent factors in the higher layers. <a href="https://www.jove.com/files/ftp_upload/60431/60431fig01large.jpg" target="_blank">Please click here to view a larger version of this figure.</a>
<img alt="Figure 2" class="xfigimg" src="/files/ftp_upload/60431/60431fig02.jpg" /> 
Figure 2: Using a gene name to guide exploration.&#160;The figure shows a series of screenshots illustrating exploration of CorEx lung cancer factors strongly related to BRCA1. First, selecting &#8216;BRCA1&#8217; in the Gene dropdown box for the factor graph causes the graph view to zoom in on the factor for which BRCA1 has greatest weight. Zooming out a bit frames the layer two node L2_8 connecting that factor to other related ones. Survival and annotations can be compared: clicking on the GO term DNA repair highlights annotated genes. A PPI window is added to show the network interactions for genes in the factor. Using the Add Window button to add a heat map shows association of expression patterns with survival, suggesting increased expression of DNA repair genes may be associated with decreased survival. <a href="https://www.jove.com/files/ftp_upload/60431/60431fig02large.jpg" target="_blank">Please click here to view a larger version of this figure.</a>
<img alt="Figure 3" class="xfigimg" src="/files/ftp_upload/60431/60431fig03.jpg" /> 
Figure 3: Using clinical data (survival) to guide exploration.&#160;Exploring the top survival-associated factor (39) for ovarian cancer reveals interesting relationships among neighboring factors. After selecting factor 39 in the factor graph and zooming out a bit, the layer two factor linked to factor 39 is seen to have five other associated factors. An additional survival window allows direct comparison of the associated survival differentials. Factors 39 and 32 both show a positive survival correlation, in contrast to factor 52, which does not. The protein-protein interaction networks are all well-defined. Linking out to StringDB allows comparison of the GO annotations (not shown): Factor 39 is associated with a cytokine signaling network related to cytotoxic CD8+ T lymphocyte activation and factor 32 is dominated by MHC class I antigen presenting proteins that trigger recognition by such lymphocytes; the neighboring factors, however, are dominated by other immune system components such as CD4+ helper T cells and show no survival correlation. <a href="https://www.jove.com/files/ftp_upload/60431/60431fig03large.jpg" target="_blank">Please click here to view a larger version of this figure.</a>
<img alt="Figure 4" class="xfigimg" src="/files/ftp_upload/60431/60431fig04v3.jpg" /> 
Figure 4: Exploring top survival factors suggests potential therapeutic combinations.&#160;The &#8216;Datasets&#8217; link on the home page menu bar leads to a concise table of survival factors ordered by p-value, along with the top GO annotation (not shown). Using this information for melanoma, the combination of factor 171 for immune function with factor 88 for mitochondrion organization appears complementary. The figure shows annotation windows for each of the factors side-by-side to contrast them. Survival curves for patients stratified by the two factors individually or together indicate that the combination increases the survival differential compared to either factor alone. <a href="https://www.jove.com/files/ftp_upload/60431/60431fig04large.jpg" target="_blank">Please click here to view a larger version of this figure.</a>
<img alt="Figure 5" class="xfigimg" src="/files/ftp_upload/60431/60431fig05.jpg" /> 
Figure 5: The Search page facilitates pan-cancer analysis.&#160;Genes or GO biological process terms can be searched for across all datasets using the Search link from the home page. The figure shows search results for the gene FLT1 and the GO term &#8216;angiogenesis&#8217;. The results show the presence of FLT1 in factors annotated with the term &#8216;angiogenesis&#8217; across cancers. <a href="https://www.jove.com/files/ftp_upload/60431/60431fig05large.jpg" target="_blank">Please click here to view a larger version of this figure.</a>
<img alt="Figure 6" class="xfigimg" src="/files/ftp_upload/60431/60431fig06v2.jpg" /> 
Figure 6: Heatmaps can be used to qualitatively assess correlations among genes and samples according to factor score.&#160;High-quality gene expression relationships are shown by smooth gradation when patients are ordered by factor score in the heatmaps. The leftmost heatmap for factor 18 is one example. The patterns may also encompass complex signatures of up and down expression, as in the large middle heatmap for factor 11. Lower-quality patterns sometimes show abrupt changes in expression for a subgroup of patients, as in the factor 9 heatmap on the right, or simply very noisy correlations, as in the factor 161 heatmap at the lower right. <a href="https://www.jove.com/files/ftp_upload/60431/60431fig06large.jpg" target="_blank">Please click here to view a larger version of this figure.</a>

Analyzing Tumor Gene Expression Factors with the CorExplorer Web Portal

We introduce the CorExplorer web portal, a resource for exploration of tumor RNA sequencing factors found by the machine learning algorithm CorEx (Correlation Explanation), and show how factors can be analyzed relative to survival, database annotations, protein-protein interactions, and one another to gain insight into tumor biology and therapeutic interventions.

Differential gene expression analysis is an important technique for understanding disease states. The machine learning algorithm CorEx has shown utility ...

JoVE Journal

Cancer Research

Lyrid LLC.
