Presenting Results
Laura Biggins, laura.biggins@babraham.ac.uk
v1.0

I have my results in a table… what next?

Plot everything?

Artefacts

Artefacts in the data can arise for many reasons, at any stage from library preparation to the final step of the analysis where the gene lists are produced.

• RNA-seq – transcript length, expression level
  – Ribosomal, cytoskeleton, extracellular, secreted
  – multi-mapping reads – multi vs single
  – ribosomal, translation
• Bisulphite – CpG density
• GC content – low and high GC fragments are underrepresented in libraries
• Location, average copy number
• Starting population of cells – remember to include background
• Completely random genes….
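To see why the background matters, here is a minimal sketch of a one-sided hypergeometric enrichment test run against two different backgrounds. All the counts are made up for illustration, and this is a plain hypergeometric test, not necessarily the statistic that any particular enrichment tool reports.

```python
from math import comb

def hypergeom_pvalue(list_hits, list_total, pop_hits, pop_total):
    """P(X >= list_hits) when list_total genes are drawn from a background of
    pop_total genes, pop_hits of which carry the annotation."""
    upper = min(list_total, pop_hits)
    numer = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(list_hits, upper + 1)
    )
    return numer / comb(pop_total, list_total)

# Hypothetical example: 10 of 100 genes in the list carry the annotation.
# Background 1: whole genome (20000 genes, 500 annotated).
genome_p = hypergeom_pvalue(10, 100, 500, 20000)
# Background 2: only genes detectable in this cell type (6000 genes, 450 annotated).
expressed_p = hypergeom_pvalue(10, 100, 450, 6000)

print(f"whole-genome background: p = {genome_p:.2e}")
print(f"expressed background:    p = {expressed_p:.2e}")
```

With these invented numbers the same gene list looks strongly enriched against the whole genome and unremarkable against the expressed genes.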

Differential power
• RNA-seq – transcript length, expression level
• Bisulphite – CpG density
• Non-random distribution
  – CpG density

• Mapping
  – multi-mapping
  – genome
• Splice variants
  – Analysis at transcript vs gene level

Copy number variation

Categories to be wary of

• ribosomal
• cytoskeleton
• extracellular
• secreted
• translation
• glycoprotein
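One simple way to act on this list is to flag any enriched term whose name contains one of these words, so that it gets a second look before it goes into a figure. The sketch below is only a keyword match; the keyword stems and the function name are assumptions made for illustration.

```python
# Keyword stems taken from the categories listed above (stems are an assumption,
# chosen so that e.g. "ribosome" and "ribosomal" both match).
WARY_KEYWORDS = ("ribosom", "cytoskelet", "extracellular", "secreted",
                 "translation", "glycoprotein")

def flag_wary_terms(terms):
    """Return the terms whose name contains any of the wary keywords."""
    return [t for t in terms if any(k in t.lower() for k in WARY_KEYWORDS)]

example_terms = [
    "structural constituent of ribosome",
    "immune response",
    "Extracellular region",
]
for term in flag_wary_terms(example_terms):
    print("check this category:", term)
```

A flagged term is not necessarily an artefact; it is a prompt to check the background and the properties of the genes behind it.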

Beware…

• GC < 0.35
• GC > 0.6
• All genes on chr 2, 8, 13
• No of transcripts > 4

Random sets of 1000 genes put through DAVID
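Control sets like these can be drawn with a few lines of code and then submitted to an enrichment tool. The sketch below assumes a gene annotation held as a list of dictionaries; the field names (`gc`, `chr`, `n_transcripts`) and the toy values are invented for illustration, and a real table would come from your own genome annotation.

```python
import random

def sample_gene_set(genes, predicate, size=1000, seed=None):
    """Draw `size` gene names at random from the genes that satisfy `predicate`."""
    eligible = [g["name"] for g in genes if predicate(g)]
    if len(eligible) < size:
        raise ValueError(f"only {len(eligible)} eligible genes, need {size}")
    return random.Random(seed).sample(eligible, size)

# Toy annotation with made-up values (hypothetical field names).
rng = random.Random(1)
chromosomes = [str(c) for c in range(1, 20)] + ["X", "Y"]
genes = [
    {
        "name": f"gene{i}",
        "gc": rng.uniform(0.3, 0.7),
        "chr": rng.choice(chromosomes),
        "n_transcripts": rng.randint(1, 8),
    }
    for i in range(20000)
]

# One biased random set per property on the slide.
control_sets = {
    "GC < 0.35": sample_gene_set(genes, lambda g: g["gc"] < 0.35, seed=0),
    "GC > 0.6": sample_gene_set(genes, lambda g: g["gc"] > 0.6, seed=0),
    "chr 2, 8, 13": sample_gene_set(genes, lambda g: g["chr"] in {"2", "8", "13"}, seed=0),
    "transcripts > 4": sample_gene_set(genes, lambda g: g["n_transcripts"] > 4, seed=0),
}

for label, names in control_sets.items():
    print(label, len(names), names[:3])
```

If a set chosen only by GC content or chromosome returns significant categories, those categories deserve suspicion when they appear in a real gene list.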

Artefacts – checking your gene list

• Make sure background is appropriate
• Be suspicious of some ontology categories – ribosomal, cytoskeleton, extracellular, secreted, translation

http://www.bioinformatics.babraham.ac.uk/shiny/gene_screen/

gene_screen – Shiny app to check for obvious differences in target genes compared to background population
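The same kind of check can be sketched by hand for a single property such as GC content or transcript length. The example below is not how gene_screen works internally; it is a generic resampling test, with hypothetical function and variable names, that asks how often a random draw from the background differs from the background mean as much as the target list does.

```python
import random
from statistics import mean

def resampling_pvalue(target_values, background_values, n_draws=2000, seed=0):
    """Two-sided resampling test: fraction of random same-size draws from the
    background whose mean is at least as far from the background mean as the
    target's mean."""
    rng = random.Random(seed)
    bg_mean = mean(background_values)
    observed = abs(mean(target_values) - bg_mean)
    k = len(target_values)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(background_values, k)
        if abs(mean(draw) - bg_mean) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_draws + 1)

# Made-up GC fractions for a background of 1000 genes, evenly spread from 0.3 to 0.7.
background_gc = [0.3 + 0.4 * i / 999 for i in range(1000)]
low_gc_targets = background_gc[:50]      # a target list biased towards low GC
spread_targets = background_gc[::20]     # a target list spread across the range

print("biased list:  ", resampling_pvalue(low_gc_targets, background_gc))
print("unbiased list:", resampling_pvalue(spread_targets, background_gc))
```

A very small value for a property like GC content suggests the gene list differs from the background for technical reasons, which is worth ruling out before interpreting the enriched categories.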

What next?

Figure examples

GO graph

Genes are often annotated with many functions

Displaying Results

Interpreting and exploring results
• How can the results be displayed so that I can interpret and explore them most easily?
  – Understanding the functional terms (incl GO hierarchy)
  – Finding relevant information amongst the masses (GOslim, redundant terms, clustering)

Presenting results
• How should I present my results?
• What information should I include?

Interpreting and Exploring Results
• How can the results be displayed so that I can interpret them most easily?
• Understanding the functional categories
  – GOrilla – hierarchical map
  – Panther – interactive pie charts
• Reducing redundancy
  – DAVID – clusters of similar functions
  – REVIGO – semantic similarity
  – GOslims

GOrilla

cbl-gorilla.cs.technion.ac.il/

Panther

Exploring Results
• How can the results be displayed so that I can interpret them most easily?
• Understanding the functional categories
  – GOrilla – hierarchical map
  – Panther – interactive pie charts
• Reducing redundancy
  – DAVID – clusters of similar functions
  – REVIGO – semantic similarity
  – GOslims

Exploring results

Reducing redundancy

http://revigo.irb.hr/

Giraph.jar

genelist3.txt

mouse_genes_seqmonk.txt

Reducing redundancy

• Use a clustering tool
• Use a GOslim – various versions available, may lose the interesting detail
• Select non-redundant terms yourself – be consistent
  – P-value filter, top x number of categories, largest categories, most enriched
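Selecting terms yourself is easier to keep consistent if the rule is written down as code. The sketch below applies an FDR cut-off, keeps the top few terms, and skips any term whose gene list overlaps heavily with one already kept. The overlap measure (Jaccard index) and the thresholds are assumptions chosen for illustration; this is not the method used by DAVID clustering or REVIGO. The example rows are four molecular function terms from the Exercise 2 table.

```python
def jaccard(a, b):
    """Shared genes divided by all genes in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def select_terms(results, max_fdr=0.05, top_n=10, max_overlap=0.8):
    """Keep the most significant terms, dropping those that are near-duplicates
    (by gene overlap) of a term already kept."""
    kept = []
    ranked = sorted((r for r in results if r["fdr"] <= max_fdr), key=lambda r: r["fdr"])
    for r in ranked:
        if any(jaccard(r["genes"], k["genes"]) >= max_overlap for k in kept):
            continue
        kept.append(r)
        if len(kept) == top_n:
            break
    return kept

gtp_genes = set(
    "GBP6 GM12185 EIF2S3Y GBP5 GIMAP7 GBP9 IFI47 IGTP GVIN1 GM4841 GBP10 "
    "IIGP1 TGTP1 TGTP2 GBP4 GBP3 GM4951 GBP2 GM4070".split()
)
gtpase_genes = set(
    "GBP6 IGTP GBP5 EIF2S3Y GBP9 GBP10 IIGP1 TGTP1 TGTP2 GBP4 GBP3 GBP2".split()
)
results = [
    {"term": "GO:0005525~GTP binding", "fdr": 1.64e-08, "genes": gtp_genes},
    {"term": "GO:0032561~guanyl ribonucleotide binding", "fdr": 2.44e-08, "genes": gtp_genes},
    {"term": "GO:0019001~guanyl nucleotide binding", "fdr": 2.44e-08, "genes": gtp_genes},
    {"term": "GO:0003924~GTPase activity", "fdr": 3.75e-06, "genes": gtpase_genes},
]

kept = select_terms(results)
for r in kept:
    print(r["term"], r["fdr"])
```

Whatever rule is chosen, applying the same one to every gene list in a study is what keeps the comparison fair.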

What information should be included?

Figure examples

Summary

• Beware of artefacts – if something looks too good to be true it probably is….
• Remember your background population
• Do not try to plot absolutely everything
• Choose a method to deal with redundant terms
• Think about what you’re plotting and whether it makes sense
• Do not be afraid of including tables

Exercise 2

Category | Term | Count | % | PValue | Genes | List Total | Pop Hits | Pop Total | Fold Enrichment | Benjamini | FDR
GOTERM_BP_FAT | GO:0006955~immune response | 30 | 29 | 1.86E-22 | CSF2, C3, LY86, H2-D1, OAS3, OAS2, CD74, B2M, LIF, OASL2, OASL1, GBP10, H2-K1, CIITA, ICAM1, H2-Q10, GBP6, GBP5, GBP9, H2-Q6, H2-Q7, PSMB9, SERPINA3G, H2-EB1, IRF8, H2-T22, TGTP1, TGTP2, OAS1A, GBP4, GBP3, GBP2 | 81 | 471 | | 10.68 | 1.59E-19 | 2.88E-19
GOTERM_MF_FAT | GO:0005525~GTP binding | 18 | 17 | 1.34E-11 | GBP6, GM12185, EIF2S3Y, GBP5, GIMAP7, GBP9, IFI47, IGTP, GVIN1, GM4841, GBP10, IIGP1, TGTP1, TGTP2, GBP4, GBP3, GM4951, GBP2, GM4070 | 78 | 354 | | 8.662 | 2.32E-09 | 1.64E-08
GOTERM_MF_FAT | GO:0032561~guanyl ribonucleotide binding | 18 | 17 | 2.00E-11 | GBP6, GM12185, EIF2S3Y, GBP5, GIMAP7, GBP9, IFI47, IGTP, GVIN1, GM4841, GBP10, IIGP1, TGTP1, TGTP2, GBP4, GBP3, GM4951, GBP2, GM4070 | 78 | 363 | | 8.448 | 1.73E-09 | 2.44E-08
GOTERM_MF_FAT | GO:0019001~guanyl nucleotide binding | 18 | 17 | 2.00E-11 | GBP6, GM12185, EIF2S3Y, GBP5, GIMAP7, GBP9, IFI47, IGTP, GVIN1, GM4841, GBP10, IIGP1, TGTP1, TGTP2, GBP4, GBP3, GM4951, GBP2, GM4070 | 78 | 363 | | 8.448 | 1.73E-09 | 2.44E-08
GOTERM_BP_FAT | GO:0019882~antigen processing and presentation | 10 | 9.5 | 1.90E-09 | H2-K1, ICAM1, H2-Q10, H2-EB1, H2-D1, H2-T22, H2-Q6, H2-Q7, CD74, B2M, PSMB9 | 81 | 87 | | 19.28 | 8.10E-07 | 2.94E-06
GOTERM_BP_FAT | GO:0048002~antigen processing and presentation of peptide antigen | 7 | 6.7 | 4.88E-08 | H2-K1, H2-Q10, H2-EB1, H2-D1, H2-Q6, H2-Q7, CD74, B2M | 81 | 35 | | 33.55 | 1.39E-05 | 7.55E-05
GOTERM_BP_FAT | GO:0001916~positive regulation of T cell mediated cytotoxicity | 4 | 3.8 | 1.61E-05 | H2-K1, P2RX7, H2-Q6, H2-Q7, B2M | 81 | 9 | | 74.56 | 0.002288 | 0.024911
GOTERM_BP_FAT | GO:0006952~defense response | 13 | 12 | 1.12E-05 | CIITA, H2-K1, LYZ2, C3, LY86, H2-D1, IFI47, H2-Q6, H2-Q7, CD74, B2M, P2RX7, CD44, IRF8 | 81 | 448 | | 4.868 | 0.001916 | 0.017378
GOTERM_BP_FAT | GO:0001914~regulation of T cell mediated cytotoxicity | 4 | 3.8 | 3.13E-05 | H2-K1, P2RX7, H2-Q6, H2-Q7, B2M | 81 | 11 | | 61 | 0.003817 | 0.048513
GOTERM_MF_FAT | GO:0032555~purine ribonucleotide binding | 32 | 30 | 5.16E-09 | OAS3, HSPA1A, HSPA1B, OAS2, CKB, IGTP, OASL2, GM4841, OASL1, DDX3Y, GBP10, IIGP1, TOP2A, GM4070, CIITA, GM12185, GBP6, EIF2S3Y, MYO6, GBP5, GIMAP7, GBP9, IFI47, PSMB9, MYO10, P2RX7, GVIN1, TGTP1, TGTP2, OAS1A, GBP4, GM4951, GBP3, GBP2 | 78 | 1796 | | 3.035 | 2.23E-07 | 6.31E-06
GOTERM_MF_FAT | GO:0003924~GTPase activity | 11 | 10 | 3.07E-09 | GBP6, IGTP, GBP5, EIF2S3Y, GBP9, GBP10, IIGP1, TGTP1, TGTP2, GBP4, GBP3, GBP2 | 78 | 128 | | 14.64 | 1.77E-07 | 3.75E-06
GOTERM_CC_FAT | GO:0009897~external side of plasma membrane | 12 | 11 | 3.14E-09 | H2-K1, LY6A, LY6C1, ICAM1, P2RX7, IL12RB1, S1PR1, CD44, CD274, H2-D1, H2-Q6, H2-Q7, CD74 | 61 | 206 | | 11.94 | 3.46E-07 | 3.55E-06
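Before plotting, values from a table like this usually need a little preparation. The sketch below takes a handful of rows from the Exercise 2 table (term, Count, Fold Enrichment, FDR), converts FDR to -log10(FDR), and orders the terms by significance. The choice of rows is arbitrary, and the use of base 10 for the logarithm is an assumption.

```python
from math import log10

# (term, Count, Fold Enrichment, FDR) copied from a few rows of the Exercise 2 table.
rows = [
    ("immune response", 30, 10.68, 2.88e-19),
    ("GTP binding", 18, 8.662, 1.64e-08),
    ("antigen processing and presentation", 10, 19.28, 2.94e-06),
    ("defense response", 13, 4.868, 0.017378),
    ("regulation of T cell mediated cytotoxicity", 4, 61.0, 0.048513),
]

def plot_values(rows):
    """Return one record per term with -log10(FDR) added, most significant first."""
    values = [
        {"term": term, "count": count, "fold_enrichment": fold,
         "neg_log10_fdr": -log10(fdr)}
        for term, count, fold, fdr in rows
    ]
    return sorted(values, key=lambda v: v["neg_log10_fdr"], reverse=True)

for v in plot_values(rows):
    print(f"{v['term']:<45} {v['count']:>3} {v['fold_enrichment']:>7.2f} {v['neg_log10_fdr']:>6.2f}")
```

Note how the three measures rank the terms differently: the term with the highest fold enrichment here has the smallest count and the weakest FDR, which is worth bearing in mind when deciding what to plot.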

[Figure: plots of the Exercise 2 terms by Count, -log(FDR) and Fold Enrichment]

Panther plots

Recommended