Page 1: EGAN Tutorial: A Basic Use-case October, 2009 Jesse Paquette UCSF Helen Diller Family Comprehensive Cancer Center jesse.paquette@cc.ucsf.edu

EGAN Tutorial: A Basic Use-case

October, 2009

Jesse Paquette

UCSF Helen Diller Family Comprehensive Cancer Center

jesse.paquette@cc.ucsf.edu

Preamble

• This document has many slides with multi-step animations
  – Best viewed in Slide Show mode
• The EGAN graphical user interface is evolving
  – Icons may change
  – Menus may change
  – Button/widget placement may change
  – This document probably won’t change as quickly
  – Please contact the developers if you notice major discrepancies between this and EGAN

A basic EGAN use-case: Overview

• This document will guide you through a brief demonstration of EGAN functionality; you will
  – Select gene nodes using experiment results
  – Show gene nodes on the Network View
  – Perform an automatic layout of the Network View
  – Save a custom gene set
  – Navigate the Network View
  – Link out to Entrez Gene
  – Link out to PubMed
  – Calculate enrichment scores for association nodes
  – Show association nodes on the Network View
  – Export a screenshot of the Network View
  – Export a Node Table with enrichment statistics
• Yes, that’s just the brief demonstration
  – There is a large amount of functionality that won’t be covered
• Launch the EGAN Demo to follow along
  – Try to make your screen match the screenshots on each slide

A basic EGAN use-case: Select gene nodes

Here’s where we begin. You should see this screen after the EGAN demo loads. In step one we will select gene nodes to be placed on the graph.

Drag the vertical divider to the left in order to give the Entrez Gene Node Table maximum visible width.

Next, click the Expts. tab to show the Experiments Table in the bottom panel.

The Experiments Table shows two experiment results from Neve (2006): one focused (blue) and one unfocused (gray).

For each experiment result there are three columns in the Entrez Gene Node Table.

Column one shows the summary statistic value from the experiment for each gene. In this example, values represent each gene’s expression correlation with Herceptin sensitivity.

Column two indicates the sign of each gene’s summary statistic value and a color indicating the position of that value in the overall distribution. Positive values are green, negative values are blue, and brighter colors indicate statistics near the tails.

Column three indicates the p-value from the experiment for each gene.

Click on the header of column three to sort the Entrez Gene Node Table by p-value.

These are the most significant genes in the focused experiment. Note that there are genes that correlate with both Herceptin sensitivity and Herceptin resistance. We are going to construct a network using only the genes associated with Herceptin resistance.

Click on the header of column two to sort the Entrez Gene Node Table by the sign of the correlation statistic.

This sorts the gene nodes into two groups, - and +. Within each group, the nodes are still sorted by p-value. Now we can easily select all nodes in the - group.

Left-click on the top gene row (POLR2G) and drag downward until you reach gene rows that have p-values greater than 0.0.
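The two-step sort described above (first by p-value, then by sign, with the p-value order preserved inside each sign group) behaves like a stable sort. The sketch below illustrates that idea in Python; it is not EGAN's code. The gene names and numbers are made up, and the 0.01 cut-off is an assumption borrowed from the gene set name suggested later in this tutorial.

```python
from dataclasses import dataclass


@dataclass
class GeneRow:
    symbol: str       # hypothetical gene name
    statistic: float  # hypothetical correlation with drug sensitivity
    p_value: float    # hypothetical p-value


def sort_like_the_table(rows):
    """Sort by p-value, then by sign, keeping p-value order within each sign."""
    by_p_value = sorted(rows, key=lambda r: r.p_value)
    # sorted() is stable: rows that share a sign keep their p-value order.
    # False sorts before True, so the negative (-) group comes first.
    return sorted(by_p_value, key=lambda r: r.statistic >= 0)


rows = [
    GeneRow("GENE_A", 0.62, 0.004),
    GeneRow("GENE_B", -0.71, 0.001),
    GeneRow("GENE_C", -0.35, 0.200),
    GeneRow("GENE_D", 0.80, 0.0005),
    GeneRow("GENE_E", -0.55, 0.008),
]

ordered = sort_like_the_table(rows)

# The contiguous block at the top of the table: negative sign, small p-value.
negative_block = [r.symbol for r in ordered
                  if r.statistic < 0 and r.p_value < 0.01]
```

Because the negative group ends up as one contiguous block at the top, a single click-and-drag is enough to select it.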

Your selection block will end with the gene SEC61A1. To confirm the number of gene nodes now selected, click on the Nodes tab below to show the Node Types Table.

A basic EGAN use-case: Show selected gene nodes

The Selected Nodes column value for the Entrez Gene row shows that there are 41 genes selected. Remember, these are the top 41 genes having expression values correlated with Herceptin resistance.

We’re ready to show these gene nodes on the graph. Drag the Node Table divider back to the right to give room to the Network View.

Click the Show selected button to show all selected nodes on the graph.

A basic EGAN use-case: Perform layout

And there they are! All stacked on top of each other. To separate them, click the Force layout button above.

A basic EGAN use-case: Group selected gene nodes

Ok, we’re almost ready to explore the graph. But first, let’s save this group of 41 genes so we can quickly retrieve it later. Click the Group selected button to the right.

Give this gene set a very descriptive name – you may not revisit this set until a future analysis, at which point you will need to know exactly what this set represents. Suggested: “Top 41 genes with expression correlation to Herceptin resistance (Neve 2006), p < 0.01”

To confirm that this set was created, left-click the Custom Node row in the Node Types Table. This will show the Custom Node Node Table to the right.

The group now appears as an association node in the Custom Node Node Table.

Now that our new gene set is saved, deselect all nodes by clicking the Deselect all button.

A basic EGAN use-case: Navigating the Network View

Now for some Network View basics: right-click on the Network View in empty space (i.e. not on a node or an edge). While holding down the button, drag downward.

You can also zoom in and out with the mouse wheel or the buttons above the Network View. Pan and zoom the Network View to focus on the cluster of 4 inter-connected genes to the left.

A basic EGAN use-case: Link out to Entrez Gene

Right-click on the node TAP1.

This brings up the Node Menu. The first item, Summary, shows the Entrez Gene summary information for TAP1. Next, select Link out -> Link out ‘TAP1’ from this Node Menu.

If you have an active internet connection, using Link out from the Node Menu will connect you to the database entry corresponding with each node. In this case, Link out will load the Entrez Gene entry for TAP1.

A basic EGAN use-case: Link out to PubMed

Now let’s consider the edges connecting these gene nodes. You will notice that there are three different edge colors: pink, orange and gray. To understand what each edge represents, click the Edges tab to bring up the Edge Types Table.

The demo version of EGAN contains 3 pre-collated edge types:

Protein-protein interactions (defined by BIND, BioGRID, HPRD, IntAct and MINT)

Chromosomal sequence proximity (an edge exists if the genes are adjacent on the chromosome)

PubMed co-occurrence (an edge exists if the genes are discussed in the same article)

To view how many articles support each edge, click Display options -> Edges -> Reference count labels.
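As a rough illustration of the PubMed co-occurrence edge type, the sketch below builds an edge for every pair of genes that share an article and counts the supporting articles per edge. It is only an assumption about how such edges could be derived, not a description of how EGAN collates them. The article identifiers and the gene lists per article are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_edges(article_genes):
    """Map each unordered gene pair to the articles that mention both genes."""
    edges = defaultdict(list)
    for article_id, genes in article_genes.items():
        # Sort so that (A, B) and (B, A) land on the same key.
        for gene_a, gene_b in combinations(sorted(set(genes)), 2):
            edges[(gene_a, gene_b)].append(article_id)
    return {pair: sorted(refs) for pair, refs in edges.items()}


# Hypothetical article -> gene annotations.
article_genes = {
    "article-1": ["PSMB8", "PSMB10"],
    "article-2": ["PSMB10", "PSMB8", "TAP1"],
    "article-3": ["TAP1"],
}

edges = cooccurrence_edges(article_genes)

# Analogous to a reference count label on each edge.
reference_counts = {pair: len(refs) for pair, refs in edges.items()}
```

An article that mentions only one gene, such as the hypothetical "article-3", contributes no edge.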

We can now see that EGAN is aware of three articles in PubMed that mention both PSMB10 and PSMB8. Right-click on that edge and select Link out -> PubMed pages for all references for this edge. Those articles should load in your browser.

We’re almost done with the tutorial. Two things left to cover: enrichment statistics and exporting results. To calculate association node enrichment (i.e. over-representation) in the set of visible gene nodes, click Enrichment options -> Association visible enrichment below.

Next, click the Nodes tab to bring up the Node Types Table.

A basic EGAN use-case: Calculate enrichment scores

Enrichment scores have been calculated for all association nodes (i.e. gene sets) in EGAN. Let’s explore this information. Click the Gene Ontology Process row in the Node Types Table.

There are now two new columns in the Gene Ontology Process Node Table.

Column one shows the number of genes in the Network View that are also connected to each Gene Ontology Process association node.

Column two shows the p-value for the corresponding hypergeometric enrichment test.

Click the Visible Enrichment column header to sort the Gene Ontology Process Node Table by enrichment p-value.
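For readers unfamiliar with the hypergeometric enrichment test, the sketch below computes a one-sided (over-representation) p-value from four counts. The tutorial does not state which tail or which background population EGAN uses, so both are assumptions here, and the function name and example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, set_size, visible, overlap):
    """P(X >= overlap) when `visible` genes are drawn without replacement
    from `population` genes, `set_size` of which belong to the gene set."""
    upper = min(set_size, visible)
    total = comb(population, visible)
    tail = sum(
        comb(set_size, k) * comb(population - set_size, visible - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 genes overall, 4 in the gene set,
# 3 visible on the network, 2 of those 3 in the gene set.
p_value = hypergeom_enrichment_p(population=10, set_size=4, visible=3, overlap=2)
```

A small p-value means the visible genes overlap the gene set more than random sampling would typically produce, which is why sorting by this column brings the most over-represented association nodes to the top.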

A basic EGAN use-case: Show enriched association nodes

Click the checkboxes in the Visible column to selectively show response to biotic stimulus, cellular macromolecule catabolic process and negative regulation of ubiquitin-protein ligase activity during mitotic cell cycle on the Network View.

Next, click the Force layout button.

Congratulations, you’ve just created your first gene association network. Let’s add some enriched KEGG association nodes. Click the KEGG row in the Node Types Table.

Then sort the KEGG Node Table by the Visible Enrichment column.

Click the corresponding checkboxes to show Glycan structures – biosynthesis 1 and N-Glycan biosynthesis on the Network View. Then click the Force layout button above.

Now let’s show enriched Cytoband association nodes. Show the Cytoband Node Table, sort the table by Visible Enrichment and selectively show Cytoband association nodes enriched with p-values less than 0.01. After that, click the Force layout button.

A basic EGAN use-case: Export to PDF

You should have shown nodes 1p36 and 6p21.3. Easy, eh? Note that your layout might look slightly different than this screenshot – this is because the Force layout algorithm is non-deterministic.
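To see why a force-directed layout can differ from run to run, consider the generic sketch below. It is a simplified Fruchterman-Reingold-style layout written for illustration, not EGAN's Force layout algorithm. Nodes start at random positions, so a different random seed usually ends in a different (but equally valid) arrangement. Apart from the PSMB10/PSMB8 edge mentioned earlier, the nodes and edges are hypothetical.

```python
import math
import random


def force_layout(nodes, edges, iterations=100, seed=None):
    """Return {node: [x, y]} after a simple force-directed relaxation."""
    rng = random.Random(seed)  # seed=None picks a fresh seed on every call
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred distance between nodes
    for step in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Every pair of nodes repels.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = (k * k / dist) / dist
                disp[a][0] += dx * push
                disp[a][1] += dy * push
                disp[b][0] -= dx * push
                disp[b][1] -= dy * push
        # Connected nodes attract.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist / k / dist * dist  # attraction grows with distance
            disp[a][0] -= dx * pull / dist
            disp[a][1] -= dy * pull / dist
            disp[b][0] += dx * pull / dist
            disp[b][1] += dy * pull / dist
        # Cap the step size; the cap shrinks so the layout settles.
        limit = 0.1 * (1.0 - step / iterations)
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            scale = min(length, limit) / length
            pos[n][0] += disp[n][0] * scale
            pos[n][1] += disp[n][1] * scale
    return pos


nodes = ["TAP1", "PSMB8", "PSMB10", "GENE_X"]
edges = [("PSMB10", "PSMB8"), ("TAP1", "PSMB8"), ("TAP1", "GENE_X")]

layout_a = force_layout(nodes, edges, seed=1)
layout_b = force_layout(nodes, edges, seed=1)  # same seed, same layout
layout_c = force_layout(nodes, edges, seed=2)  # different seed, different layout
```

In this sketch, fixing the seed makes the result repeatable; leaving it unset reproduces the run-to-run variation described above.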

Let’s export the Network View to PDF. First, we want to manipulate the Network View so that we can take the best screenshot. Click the Maximize button above to give full screen space to the Network View.

Let’s assume that we’ll want to print this network to paper at some point. It’s best to switch to the white color scheme to save on black ink. Click Display options -> Background -> White.

To export the Network View to PDF, click Screenshot options -> Network-only PDF…

A basic EGAN use-case: Export the Node Table

Ok, almost done…just one last useful tip. You can also export any Node Table to a tab-delimited (Excel-ready) file. Click the Show all tables button above.

Then drag the divider to the left to give more space to the Node Table.

Finally, click the Export button at the top of the Node Table. You can export the Node Table for every node type shown in the Node Types Table below.
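If you want to post-process or reproduce such an export outside EGAN, a tab-delimited file is easy to write and read with standard tools. The sketch below writes a small table with Python's csv module. The column names and values are hypothetical and are not meant to match EGAN's actual export layout.

```python
import csv
import io


def export_node_table(rows, columns, handle):
    """Write rows (a list of dicts) as tab-delimited text with a header row."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([row[column] for column in columns])


# Hypothetical column names and values.
columns = ["Name", "Visible Genes", "Visible Enrichment"]
rows = [
    {"Name": "N-Glycan biosynthesis", "Visible Genes": 3, "Visible Enrichment": 0.002},
]

buffer = io.StringIO()  # stands in for an open file
export_node_table(rows, columns, buffer)
exported_text = buffer.getvalue()
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` in place of the in-memory buffer.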

That’s all for this tutorial. Thanks for taking the time to learn EGAN!

Questions/comments?

• Visit http://groups.google.com/group/ucsf-egan for downloads, documentation and discussion
  – Requires an account with Google Groups