GV Exercise.5


Analyze human cancer data and find tumor markers

Vitruvian_man-col.jpg

last edit: October 31, 2014



GV_contrasts.png

Handicon.png When setting up enrichment analyses, remember that the GV data used here originate from the Affy 47k chip, also known as the GeneChip® Human Genome U133 Plus 2.0 Array

Find liver neoplasm specific markers as compared to normal liver

Find malignant liver cell markers absent in normal liver tissue

Handicon.png When you read NOS in GV, it means Not Otherwise Specified

We want to get markers differentially expressed between liver neoplasm cells and normal liver cells. At this stage, we do not care about expression of these genes outside of the liver context.

create a sample list with all human 47k samples

try it first

human_47k.png


gene-neopl.png
  • Using human_47k, start the Gene Search Neoplasm tool
  • search for the top 100 liver neoplasm markers
  • not considering metastatic entries - (this 'may' correspond to primary tumor markers)
  • take normal liver as background
  • do not include 'cell lines' - (note that this is a NEW feature since the last training)

Handicon.png Check the top bar and choose meaningful options: first 'collapse all' to shorten the list, then use the search box to find what you need

try it first

liver_neoplasm_search.png

try it first

first-liver-results.png

Run the tool and inspect the results to see whether these markers appear in other conditions. Then refine the selection to exclude cholangiocarcinoma markers, and search for the top 100 markers that are now specific to hepatocellular carcinoma (HCC).

try it first

refined-liver-results.png
  • Save the last probe list to a text file

Handicon.png GV cannot save probe lists to a file directly; we need to do this in a few easy steps

  • copy the probes to your clipboard
  • open a text editor, paste and save the list; name it hepathocellular_carcinoma-vs-liver.txt
  • also create the corresponding list in GV with the New button and name it hepathocellular_carcinoma-vs-liver
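Once the list is saved, a quick sanity check outside GV can catch copy-paste accidents (blank lines, duplicated probes, stray text). The sketch below is only an illustration: the function names are hypothetical, and the rule that probe set IDs on this array end in `_at` is an assumed heuristic rather than something GV enforces. The probe IDs in the usage note are example IDs, not results of this exercise.

```python
from pathlib import Path


def clean_probe_lines(lines):
    """Strip whitespace, drop blank lines and duplicates, keep first-seen order."""
    seen = set()
    probes = []
    for line in lines:
        probe = line.strip()
        if probe and probe not in seen:
            seen.add(probe)
            probes.append(probe)
    return probes


def suspicious_ids(probes):
    """Return IDs that do not end in '_at' (assumed heuristic, not a GV rule)."""
    return [p for p in probes if not p.endswith("_at")]


if __name__ == "__main__":
    # File name taken from the exercise; adapt the path to where you saved it.
    path = Path("hepathocellular_carcinoma-vs-liver.txt")
    if path.exists():
        probes = clean_probe_lines(path.read_text(encoding="utf-8").splitlines())
        print(len(probes), "unique probe IDs")
        print("IDs to double-check:", suspicious_ids(probes))
```

For example, `clean_probe_lines(["1007_s_at", "", " 1053_at ", "1007_s_at"])` keeps each of the two probe IDs once, in their original order.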

 

Find normal tissues that express HCC-specific markers

We just identified HCC markers and wish to know if a majority of these are found in some other 'normal' tissue(s). In order to do so, we create a new sample list with all human 47k samples that are not from tumor and not from cell lines (name it: human_47k-noTumor-noCellLines). We then build a heatmap with all markers from the list and all tissues from the new sample group.

try it first

human_non-tumor.png


hr-clust.png

Perform clustering using

  • the hepathocellular_carcinoma-vs-liver list
  • the human_non-tumor sample set
  • find in which 'normal' tissues (Anatomy) these markers (or part of) are differentially expressed
  • search 'hepatocytes' in the large heatmap to control your initial filtering

Handicon.png Use the 'SimilaritySearch' hierarchical Anatomy and cluster in both directions

try it first

hcc_vs_normal-tissues.png

 

Find markers specific for HCC and absent in other tumors

gene-neopl.png

From here you can proceed in two ways: create a sample list with only tumor experiments, OR use the full human sample list and restrict your search to neoplasms. We take the first method, but you are free to try the second.

  • create a new sample list with all human tumor samples

try it first

human-neoplasm_samples.png
  • select the human neoplasm samples and search for hepathocellular_carcinoma specific markers

try it first

hcc_vs_neoplasm.png

Starting from all samples this would show a longer list but with the same annotations

hcc_vs_neoplasm-all.png
  • run the tool and inspect the results with neoplasms

try it first

hepathocellular_carcinoma-vs-all-neoplasms.png
  • create a new gene list (hepathocellular_carcinoma-vs-all-neoplasms) with the top 100 markers

try it first

genes_hepathocellular_carcinoma-vs-all-neoplasms.png
  • save the probe list to a text file and name it hepathocellular_carcinoma-vs-all-neoplasms.txt

 

Perform functional analysis using free web-resources

The canonical DAVID, the more modern Enrichr or WebGestalt, and other web enrichment tools make complex enrichment computations easy, starting from a list of IDs. The enrichment step is vital because humans cannot efficiently interpret raw gene lists and rely on biological functions to understand the biology.
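To give a feeling for what an enrichment computation does, here is a minimal sketch of a generic over-representation test based on the hypergeometric distribution. It is not the exact statistic of any tool named on this page (each tool has its own variant and multiple-testing correction); the function name and the parameter names `k`, `n`, `K`, `N` are chosen here for illustration only.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Probability of seeing at least k annotated genes in a list of n genes,
    when K of the N background genes carry the annotation."""
    upper = min(n, K)          # cannot observe more hits than this
    total = comb(N, n)         # number of possible lists of size n
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


if __name__ == "__main__":
    # Made-up numbers: 100 markers, 8 of them in a category that
    # covers 300 of 20000 background genes.
    print(hypergeom_enrichment_p(k=8, n=100, K=300, N=20000))
```

The background size `N` matters as much as the list itself, which is why the tools ask for a background that matches the array used to produce the list.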

Technical.png We illustrate here the first step of such an analysis using one of our lists and invite users to explore these tools further

 

Using DAVID to perform functional enrichment

DAVID is the oldest of these tools but is still appreciated by many biologists for its ease of use.

Access DAVID at http://david.abcc.ncifcrf.gov/home.jsp

  • upload the hepathocellular_carcinoma-vs-all-neoplasms.txt list to DAVID and set it as a gene-list

try it first

01-upload.png
  • run the enrichment using standard parameters (or tune them!)

try it first

02-run.png
  • review clusters

try it first

03-func-annot-cluster.png
  • review charts

try it first

04-funct-annot-chart.png
  • review tables

try it first

05-funct-annot-table.png

Handicon.png Each output type has its own specifics and extras. This is NOT a DAVID training; extensive documentation is available online

 

DAVID and BioMart conversion from probe IDs to gene symbols

The DAVID built-in ID convertor

DAVID not only performs functional enrichment from many kinds of ID lists, it can also be used simply to convert IDs from one type to another (http://david.abcc.ncifcrf.gov/conversion.jsp)

Handicon.png Use the saved probe list as input and set the input and output formats in order to get gene symbols

try it first

First upload the list from the saved text file and identify the IDs as probe IDs (automatic process). Then call the Convertor from the menu and proceed as shown below.


david-select-conversion.png


david-set-convertion.png


david_results.png


david_results_export.png

BioMart conversion from probe IDs to gene symbols

Besides its extensive database export capabilities, BioMart recently gained a web portal for ID conversion (http://central.biomart.org/converter/#!/ID_converter/gene_ensembl_config_2)

Using this portal does not require any knowledge of Ensembl. It is illustrated below by converting our list of probe IDs to the list of gene symbols (HUGO) that is required in the next exercise.


try it first

biomart_conv-01.png


biomart_conv-02.png


biomart_conv-03.png

Performing enrichment with the BioMart enrichment tool

This recent tool aggregates several sources for enrichment.

Access the BioMart Enrichment tool at http://central.biomart.org/enrichment/#/gui/Enrichment/. The BioMart tool is relatively simple in design and offers only a limited number of annotations.

We can try the tool with the first exported list, hepathocellular_carcinoma-vs-liver.txt, not forgetting to specify the matching background (Affymetrix human u133_a).

Setup, Gene Ontology, and MIM results
upload.png
setup.png
GO-table.png
GO-graph.png
MIM-table.png
MIM-graph.png

 

Performing enrichment with Enrichr

This recent tool aggregates several sources for enrichment and returns very dynamic content.

Access Enrichr at http://amp.pharm.mssm.edu/Enrichr/

Technical.png Enrichr, unlike DAVID, does not support probe IDs. We first need to convert our probes to gene symbols using BioMart or DAVID, and then de-duplicate the resulting list
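De-duplication is needed because several probe sets can map to the same gene, so the converted list usually contains repeated symbols. A minimal sketch of this step is given below; the function name is hypothetical, the input is assumed to be one symbol per line as copied from the conversion result, and comparing symbols case-insensitively is a choice made for this sketch. The symbols in the usage note are placeholders.

```python
def unique_symbols(symbols):
    """Return gene symbols without blanks or repeats, in first-seen order.

    Symbols are compared case-insensitively (a choice made for this sketch);
    the first spelling encountered is the one that is kept.
    """
    seen = set()
    result = []
    for raw in symbols:
        symbol = raw.strip()
        if not symbol:
            continue  # probes without a symbol often yield empty lines
        key = symbol.upper()
        if key not in seen:
            seen.add(key)
            result.append(symbol)
    return result


if __name__ == "__main__":
    converted = ["GENE_A", "GENE_B", "GENE_A", "", " gene_b "]
    print("\n".join(unique_symbols(converted)))
```

The printed list (one symbol per line) can be pasted as such in the submission box of the tool.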

Setup
biomart-convert-list1.png
enrich-submit.png
Result categories (try it to get the results)
results1.png
results2.png
results3.png
results4.png
results5.png
results6.png

 

Performing enrichment with WebGestalt

WebGestalt is another recent tool that also aggregates several sources for enrichment and returns very dynamic content. Please register first (free: http://bioinfo.vanderbilt.edu/webgestalt/login.php) before using this intuitive tool.

Setup, Gene Ontology, and MIM results
setup.png
ID-remapping.png
webgesalt-list1.png
KEGG.png
DiseaseAssociation.png

Handicon.png Many more such tools exist, as well as Bioconductor packages that produce excellent results once you have invested some time in learning them

 

[VIB license required] Upload gene lists to IPA and run a core analysis

Technical.png Due to the VIB concurrent IPA license limit, we should not all try this at the same time. Please review the pictures and tables generated for you and included here, especially if you have no experience working in IPA

click here to go to the IPA login page

Handicon.png Use the 100 probes saved as hepathocellular_carcinoma-vs-all-neoplasms.txt [1] or hepathocellular_carcinoma-vs-liver.txt [2]; copy and paste them, or upload them to IPA


IPAKB.jpg

 

We first report here the results of the Venn intersection of the hepathocellular_carcinoma-vs-all-neoplasms.txt and hepathocellular_carcinoma-vs-liver.txt lists with the IPA knowledge base HCC biomarker list. Only a few biomarkers are specifically expressed in HCC. This may seem strange at first sight but is in fact very common, since tumor-specific antigens do not really exist.

IPA venn diagram from three GV lists

IPA-venn.png
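The overlap between the two saved GV lists can also be checked without IPA. The sketch below only compares two lists of IDs with plain set operations; it does not reproduce the third circle of the diagram, because the IPA knowledge base biomarker list is not available outside IPA. The function name and the dictionary keys are hypothetical.

```python
def compare_lists(first, second):
    """Split the IDs of two lists into shared and list-specific groups."""
    set_first, set_second = set(first), set(second)
    return {
        "both": sorted(set_first & set_second),
        "only_first": sorted(set_first - set_second),
        "only_second": sorted(set_second - set_first),
    }


if __name__ == "__main__":
    from pathlib import Path

    # File names taken from the exercise; both files must be in the
    # current directory for this part to do anything.
    names = ["hepathocellular_carcinoma-vs-liver.txt",
             "hepathocellular_carcinoma-vs-all-neoplasms.txt"]
    if all(Path(n).exists() for n in names):
        lists = [Path(n).read_text(encoding="utf-8").split() for n in names]
        groups = compare_lists(*lists)
        for label, ids in groups.items():
            print(label, len(ids))
```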

We now reproduce some of the other types of results one can obtain in IPA. We aim to show that IPA can uncover the biology behind the different lists and inform the user about what may be happening in the system (in this case, in HCC tumors).

Networks

  • IPA networks are pre-built entities showing known relations between proteins that are common to known functions or processes. Networks are often more informative than 'canonical pathways' as they group proteins that play together in a shared context rather than showing knowledge assembled from encyclopedic sources.
top IPA Networks
hcc-vs-neoplasms
hcc_vs_neoplasms-networks.png
hcc-vs-liver
hcc_vs_liver-networks.png
Best Network from each core analysis
hcc-vs-neoplasms
hcc_vs_neoplasm-NW1.png
hcc-vs-liver
hcc_vs_liver-NW1.png

Biological and Tox functions

The Biological and Tox functions enriched in these two lists are very relevant given the origin of the data.

Best tox-lists from each core analysis
hcc-vs-neoplasms
hcc_vs_neoplasms-toxlists.png
hcc-vs-liver
hcc_vs_liver-toxlists.png

In the comparison with liver, one top Tox annotation is 'cholangiocarcinoma' which was not specified in GV but is apparent here. This could be due to some GV samples being mislabeled and in fact belonging to this type or simply to a large overlap in features between the two tumor types.

Handicon.png IPA demonstrated its superiority over free tools and was able to identify the very nature of the data based on a simple list of <100 markers selected by GV

 

Download the exercise files

Try it by yourself before expanding on the right!

  • download hepathocellular_carcinoma-vs-liver.txt and open it with your default worksheet application
  • download hepathocellular_carcinoma-vs-all-neoplasms.txt and open it with your default worksheet application

IPA-results for carcinoma-vs-liver & carcinoma-vs-all-neoplasms

  • download the IPA_core-hcc_vs_liver.pdf report file
  • download the IPA_core-hcc_vs_neoplasms.pdf report file

Genevestigator workspace

  • download ex5.gv4 and open it from within Genevestigator (File > Load Workspace)
