GENEVESTIGATOR TUTORIAL VIB - Gent 12.04.2011

BITS - Genevestigator to easily access transcriptomics data

DESCRIPTION

These are the presentation slides of the BITS training session about 'Genevestigator'. Many thanks to Nebion for contributing these slides.

Page 1: BITS - Genevestigator to easily access transcriptomics data

GENEVESTIGATOR TUTORIAL

VIB - Gent

12.04.2011

Page 2: BITS - Genevestigator to easily access transcriptomics data

Goals

Understand what Genevestigator is and why it has been developed

Understand the function of the tools provided by the software

Learn how to use Genevestigator to find genes of interest

Page 3: BITS - Genevestigator to easily access transcriptomics data

Content

Microarray technology

Concept of Genevestigator

Data curation

Tools:
– Meta-profile analysis
– Biomarker search
– RefGenes
– Clustering analysis

Page 4: BITS - Genevestigator to easily access transcriptomics data

Microarray technology

Advantages:
– Genome wide
– Relatively cheap
– Standardized streamlined handling
– Use of an optimized system based on oligonucleotide sequences
– Possibility to store data in publicly available repositories

Disadvantages:
– Sequence must be known in advance
– Hybridization reaction

Page 5: BITS - Genevestigator to easily access transcriptomics data

Workflow of a microarray experiment

Each pixel intensity is determined by the expression level of a gene in the specific sample hybridized on the array

Conditions selection and experiments

RNA extraction, amplification and labelling

Hybridization on chips

Raw Data (Probe level)

Quality Control

Normalization

Normalized Data

Analysis

Validation (Q-PCR)

Submission to repository

Files: DAT file (scanned raw image), CEL file, TXT file

Page 6: BITS - Genevestigator to easily access transcriptomics data

Concept of Genevestigator

Thousands of microarray experiments exist world-wide

Tissue type 1, Tissue type 2, Tissue type 3, Tissue type 4, …, Tissue type 200

=> Summarize information from thousands of public experiments into easily interpretable results

Model of a summarized output

Page 7: BITS - Genevestigator to easily access transcriptomics data

Concept of Genevestigator

Data repositories: meta-analysis?

Curation:
– Data quality control
– Expert annotation with systematic ontologies (anatomy, development, condition, genotype)

Genevestigator: meta-analysis!

Build a systematic database of gene expression information

Page 8: BITS - Genevestigator to easily access transcriptomics data

1. Data Curation - Overview

Quality control all sample data

Collect raw data files and normalize data

Read and understand the experiment

Manually annotate experiments using structured vocabularies (ontologies)

Final goal of curation: translate experimental information into a computer-readable and "statistically usable" form

1. Data Curation: quality control + normalization, then expert annotation with systematic ontologies (anatomy, development, condition, genotype)

Page 9: BITS - Genevestigator to easily access transcriptomics data

Curation: Quality control

Unprocessed probe intensity

RNA degradation plots

Probe-level analysis (RLE, NUSE)

Border element analysis

Array-array correlation plots
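One of the checks listed above, relative log expression (RLE), can be sketched in a few lines. This is a minimal illustration and not Genevestigator's own procedure: it assumes log2 expression values per probe set are already available, the array names and the 0.5 cut-off are hypothetical, and real pipelines derive RLE and NUSE from probe-level model fits.

```python
from statistics import median

def rle_by_array(log_expr):
    """log_expr: dict array_name -> list of log2 values (same probe-set order)."""
    arrays = list(log_expr)
    n_probesets = len(log_expr[arrays[0]])
    # Reference: median log2 expression of each probe set across all arrays.
    ref = [median(log_expr[a][i] for a in arrays) for i in range(n_probesets)]
    # RLE: deviation of each array from that per-probe-set median.
    return {a: [log_expr[a][i] - ref[i] for i in range(n_probesets)]
            for a in arrays}

def flag_suspect_arrays(log_expr, max_abs_median=0.5):
    """Flag arrays whose median RLE is far from zero (hypothetical cut-off)."""
    rle = rle_by_array(log_expr)
    return [a for a, vals in rle.items() if abs(median(vals)) > max_abs_median]

# Toy data: array_C is shifted upwards relative to the other two arrays.
toy = {
    "array_A": [7.0, 8.0, 9.0, 10.0],
    "array_B": [7.1, 7.9, 9.1, 10.0],
    "array_C": [8.5, 9.4, 10.6, 11.5],
}
suspects = flag_suspect_arrays(toy)
print(suspects)
```

A good array has RLE values centred on zero with a small spread; an array whose values sit well above or below zero is a candidate for exclusion.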

Page 10: BITS - Genevestigator to easily access transcriptomics data

Curation: normalization models

Multi-array models
– e.g. dChip, RMA, gcRMA
– all arrays from an experiment are normalized simultaneously
– cannot easily be used to create large databases
– RMA and gcRMA use perfect-match information only (background estimation by statistical approaches)

Single-array models
– e.g. MAS5
– normalize each array independently
– do not correct for biases between experiments
– MAS5 uses both perfect-match and mismatch probe information (mismatch is used to model background: a biochemical approach)
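Why multi-array models are awkward for a growing database can be illustrated with quantile normalization, which is one step of RMA. The sketch below is a simplified illustration only: it ignores ties, and it leaves out RMA's background correction and median-polish summarization. The intensity values are made up.

```python
def quantile_normalize(arrays):
    """arrays: list of equal-length lists of intensities, one list per array."""
    n = len(arrays[0])
    # Sort each array, then average across arrays rank by rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[r] for s in sorted_arrays) / len(arrays)
                  for r in range(n)]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]  # replace value by the mean at its rank
        normalized.append(out)
    return normalized

# Two arrays normalized together ...
two = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
# ... and the same two arrays after a third one joins the batch.
three = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0],
                            [10.0, 20.0, 30.0]])
print(two[0], three[0])
```

The first array's normalized values change as soon as a third array is added, so every new experiment would force renormalization of everything already stored. A single-array method avoids this because each array is processed on its own.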

Page 11: BITS - Genevestigator to easily access transcriptomics data

Curation: Ontologies

Ontologies built for
– Anatomical parts
– Stages of development
– Perturbations (diseases, chemicals, etc.)

Ontologies
– Were compiled from various public ontology sources and own developments
– Are built using tree structures

Anatomy Ontology (version 2008):
- Arabidopsis
- Rice
- Barley

Development Ontology:
- Mouse

Page 12: BITS - Genevestigator to easily access transcriptomics data

Curation: Meta-profiles

sample meta-data

expression data

[space] [time] [response] [response]

summarized results
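The idea of combining sample meta-data with expression data into summarized results can be sketched as a simple group-by. This is only an illustration under assumptions: the record layout, field names, and signal values are hypothetical and do not reflect Genevestigator's internal data model.

```python
from collections import defaultdict
from statistics import mean, stdev

def meta_profile(samples, dimension):
    """Summarize one gene's signal per annotation category of a dimension."""
    groups = defaultdict(list)
    for s in samples:
        groups[s["annotation"][dimension]].append(s["signal"])
    return {
        category: {
            "n": len(values),
            "mean": mean(values),
            # A standard deviation needs at least two arrays.
            "sd": stdev(values) if len(values) > 1 else 0.0,
        }
        for category, values in groups.items()
    }

# Hypothetical curated samples for a single gene (log2 signal).
samples = [
    {"annotation": {"anatomy": "root", "development": "seedling"}, "signal": 10.0},
    {"annotation": {"anatomy": "root", "development": "flowering"}, "signal": 12.0},
    {"annotation": {"anatomy": "leaf", "development": "seedling"}, "signal": 6.0},
]
by_anatomy = meta_profile(samples, "anatomy")          # [space]
by_development = meta_profile(samples, "development")  # [time]
print(by_anatomy)
```

The same samples yield a different profile for each annotation dimension, which is why consistent annotation matters as much as normalization.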

Page 13: BITS - Genevestigator to easily access transcriptomics data

Curation: Data content

As of December 2010: > 54’000 Affymetrix arrays Total 1’742 54’786

World’s largest standardized, quality controlled, and manually annotated gene expression compendium for plants, animals, and microorganisms!

Page 14: BITS - Genevestigator to easily access transcriptomics data

Genevestigator application

Database and analysis engine

Website with user support

Analysis tool for the user

Requirements

Browser
– Genevestigator works in Internet Explorer, Firefox, Safari, Opera, and Chrome

Java
– Sun Microsystems; minimal: Java 1.4.2 or higher

Computer
– 500 MB RAM or more

Page 15: BITS - Genevestigator to easily access transcriptomics data

Toolsets

Page 16: BITS - Genevestigator to easily access transcriptomics data

Analytical approach 1

genes → which conditions?

Anatomy [space]

Development [time]

Condition / Genotype [response]

Page 17: BITS - Genevestigator to easily access transcriptomics data

Meta-Profile Analysis

1. Choose an organism

2. Enter the genes you wish to work with

Page 18: BITS - Genevestigator to easily access transcriptomics data

Meta-Profile Analysis tools

View and interpret the results across:
– Anatomical categories (Anatomy tab)
– Developmental stages (Development tab)
– Chemicals, diseases, tumors, etc. (Conditions tab)
– Genetic modifications (Genotype tab)
– Tumors (Neoplasm tab, only for Human)

Page 19: BITS - Genevestigator to easily access transcriptomics data

Note: Select by experiment or annotation

Page 20: BITS - Genevestigator to easily access transcriptomics data

Meta-Profile Analysis: Anatomy tool

Looks at how genes are expressed in different tissues

Mean and standard deviation

Anatomy categories as a tree (ontology); expand / collapse

Number of arrays per category is indicated
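How a tree of anatomy categories can carry a mean, a standard deviation, and an array count at every level can be sketched as follows. The mini-ontology and signal values are hypothetical, and counting a child category's arrays towards its parent is an assumption of this sketch, not a documented Genevestigator rule.

```python
from statistics import mean, stdev

# Hypothetical mini-ontology: child -> parent (None marks the root).
PARENT = {
    "plant": None,
    "root": "plant",
    "shoot": "plant",
    "leaf": "shoot",
    "flower": "shoot",
}

def ancestors_and_self(term):
    """Yield a term followed by all of its ancestors up to the root."""
    while term is not None:
        yield term
        term = PARENT[term]

def rollup(arrays):
    """arrays: list of (anatomy_term, signal). Returns term -> (n, mean, sd)."""
    values = {term: [] for term in PARENT}
    for term, signal in arrays:
        # Assumption: an array counts for its own category and every ancestor.
        for t in ancestors_and_self(term):
            values[t].append(signal)
    return {
        t: (len(v), mean(v), stdev(v) if len(v) > 1 else 0.0)
        for t, v in values.items() if v
    }

arrays = [("root", 4.0), ("leaf", 8.0), ("leaf", 10.0), ("flower", 12.0)]
summary = rollup(arrays)
for term, (n, m, sd) in summary.items():
    print(f"{term}: n={n} mean={m:.2f} sd={sd:.2f}")
```

Collapsing a node in the tree then corresponds to reading the parent's summary instead of its children's.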

Page 21: BITS - Genevestigator to easily access transcriptomics data

Meta-Profile Analysis: Neoplasm tool

Looks at how genes are expressed in different tumors

Clinical parameters of the tumors are available

Mean and standard deviation

Anatomy categories as a tree (ontology); expand / collapse

Number of arrays per category is indicated

Expression profile of NPY across different tumor types

Page 22: BITS - Genevestigator to easily access transcriptomics data

Meta-Profile Analysis: Development tool

Looks at how genes are expressed during the life cycle of an organism

Example for barley

Example for mouse / rat

Page 23: BITS - Genevestigator to easily access transcriptomics data

Meta-Profile Analysis: Conditions and Genotype tools

List (or tree) of various conditions

Spots indicate the responses of the selected gene(s) to the list of conditions

Most upregulating conditions

Most downregulating conditions
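Ranking conditions from most upregulating to most downregulating can be sketched with log2 ratios of treated versus control arrays. This is an illustrative assumption about how a response could be computed, not the tool's documented formula; the condition names are borrowed from examples elsewhere in these slides and the signal values are made up.

```python
from math import log2
from statistics import mean

def log2_ratio(treated, control):
    """Response of one gene: log2 of mean treated over mean control signal."""
    return log2(mean(treated) / mean(control))

def rank_conditions(experiments):
    """experiments: condition -> (treated signals, control signals)."""
    responses = {name: log2_ratio(t, c) for name, (t, c) in experiments.items()}
    # Highest log2 ratio first: most upregulating at the top.
    return sorted(responses.items(), key=lambda item: item[1], reverse=True)

# Toy linear-scale signals for a single gene.
experiments = {
    "cold": ([800.0, 800.0], [200.0, 200.0]),     # 4-fold up
    "salt": ([100.0, 100.0], [200.0, 200.0]),     # 2-fold down
    "drought": ([200.0, 200.0], [200.0, 200.0]),  # unchanged
}
ranked = rank_conditions(experiments)
most_up, most_down = ranked[0], ranked[-1]
print(ranked)
```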

Page 24: BITS - Genevestigator to easily access transcriptomics data

Meta-Profile Analysis: Scanner tool

All arrays are represented on a single screen

Easily find and select experiments in which expression is particularly high (screen for peaks)

Magnifying glass and tooltip let you inspect the details of signals, arrays, and experiments.

Page 25: BITS - Genevestigator to easily access transcriptomics data

Meta-Profile Analysis: Samples tool

All arrays are represented in a single plot, scroll down

Look at expression level and “absent / present” calls

Tooltips let you inspect the details of arrays and experiments.

Page 26: BITS - Genevestigator to easily access transcriptomics data

Analytical approach 2

conditions → which genes?

Anatomy [space]

Development [time]

Conditions / Genotypes [response]

Page 27: BITS - Genevestigator to easily access transcriptomics data

Biomarker search

1. Choose an organism

2. Choose conditions and run analysis

3. Save target genes for further analysis

Page 28: BITS - Genevestigator to easily access transcriptomics data

Biomarker Search

Identify genes that exhibit specific expression characteristics

Anatomy

Development

Conditions / Genotype

Page 29: BITS - Genevestigator to easily access transcriptomics data

Classical biomarker search

Most biomarker search approaches look for the genes that respond most strongly to a given condition

This condition may include multiple similar studies

How these genes respond to other conditions is unknown, because those conditions were not included in the analysis

[Figure: gene 1 to gene 17 plotted against condition 1 to condition 17, with "??" for the conditions that were not analyzed]

Page 30: BITS - Genevestigator to easily access transcriptomics data

Biomarker validation in Genevestigator

[Figure: gene 1 to gene 17 plotted against condition 1 to condition 17]

Genevestigator lets you find out how specific these genes are (Meta-Profile Analysis -> Stimulus/Mutation tools)

Only a few genes respond solely to condition 9 (black arrows). All others are also sensitive to one (grey arrows) or more other conditions.

Page 31: BITS - Genevestigator to easily access transcriptomics data

Biomarker Search in Genevestigator

[Figure: gene 1 to gene 17 plotted against condition 1 to condition 17]

The Genevestigator Biomarker Search tools identify genes that are specifically responsive to the chosen condition (they respond minimally to other conditions).

These genes are not necessarily the ones with the strongest response to the chosen condition.

The Genevestigator Biomarker Search tools usually find different target candidates than classical tools, which analyze only a subset of experiments.
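The difference between "strongest response" and "most specific response" can be made concrete with a toy scoring function. The score below (target response minus the largest absolute off-target response) is a hypothetical illustration, not Genevestigator's actual algorithm, and all log2 ratios are made up.

```python
def specificity_score(responses, target):
    """responses: condition -> log2 ratio for one gene."""
    off_target = max(abs(v) for cond, v in responses.items() if cond != target)
    # High score: strong response to the target, weak response elsewhere.
    return responses[target] - off_target

TARGET = "condition 9"
genes = {
    # Strongest responder, but it also reacts to condition 3.
    "gene 1": {"condition 9": 4.0, "condition 3": 3.5, "condition 12": 0.2},
    # Weaker responder, but almost silent under the other conditions.
    "gene 2": {"condition 9": 2.5, "condition 3": 0.1, "condition 12": -0.2},
}

# Classical ranking: strongest response to the target condition only.
strongest = max(genes, key=lambda g: genes[g][TARGET])
# Specificity ranking: penalize responses to any other condition.
most_specific = max(genes, key=lambda g: specificity_score(genes[g], TARGET))
print(strongest, most_specific)
```

The two rankings disagree as soon as the strongest responder also reacts to another condition, and that can only be detected when many conditions are included in the analysis.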

Page 32: BITS - Genevestigator to easily access transcriptomics data

Biomarker Search in Genevestigator

Imagine extending this to a much wider set of conditions…
– you may find other conditions to which the set of genes responds

[Figure: gene 1 to gene 17 plotted against condition 1 to condition 75, marking the target condition and the other conditions to which the genes are responding]

Page 33: BITS - Genevestigator to easily access transcriptomics data

Biomarker Search: example

Search for genes that are associated with a set of conditions, e.g. how do abiotic stresses relate to hormonal responses?

hormonal responses | abiotic stresses
ABA (+) | salt (+), osmotic (+)
--- | salt (-), osmotic (-)
ABA (+) | salt (+), osmotic (+), cold (+)
MeJA (+) | salt (+), drought (+)
BL / H3BO3 (+) | anoxia (-), hypoxia (-)
ethylene (+) | hypoxia (-)

Page 34: BITS - Genevestigator to easily access transcriptomics data

Biomarker Search in Genevestigator

Example: human genes responsive to Actinomycin-D

Actinomycin-D

Cell cycle inhibition

Echinomycin

Chemical: ARC

Sapphyrin

Propiconazole

Oncolytic herpes simplex virus

vMyb

target condition(s)

co-inducing conditions

Page 35: BITS - Genevestigator to easily access transcriptomics data

RefGenes

Goal: identify reference genes for use in qPCR.

Solution: search the Genevestigator database for genes that show constant expression in a certain category of arrays.
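The search for constantly expressed genes within one category of arrays can be sketched as a ranking by standard deviation. This is a simplified assumption about the selection criterion, not the RefGenes implementation; the gene names, array names, and log2 signals are hypothetical.

```python
from statistics import mean, stdev

def rank_reference_candidates(expr, selected_arrays, top_n=2):
    """expr: gene -> {array_name: log2 signal}. Lowest variability first."""
    stats = []
    for gene, by_array in expr.items():
        values = [by_array[a] for a in selected_arrays]
        stats.append((stdev(values), gene, mean(values)))
    stats.sort()  # tuples sort by standard deviation first
    return [(gene, sd, m) for sd, gene, m in stats[:top_n]]

expr = {
    "gene_A": {"liver_1": 10.0, "liver_2": 10.1, "liver_3": 9.9, "brain_1": 6.0},
    "gene_B": {"liver_1": 8.0, "liver_2": 11.0, "liver_3": 9.5, "brain_1": 9.5},
    "gene_C": {"liver_1": 12.0, "liver_2": 12.0, "liver_3": 12.3, "brain_1": 12.1},
}
# Restrict the search to the biological context of interest (liver arrays).
liver_arrays = ["liver_1", "liver_2", "liver_3"]
candidates = rank_reference_candidates(expr, liver_arrays)
for gene, sd, m in candidates:
    print(f"{gene}: sd={sd:.3f} mean={m:.2f}")
```

gene_A ranks first even though it changes strongly in the brain array, because only the selected category counts. A gene can therefore be a good reference in one context and a poor one in another.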

Page 36: BITS - Genevestigator to easily access transcriptomics data

RefGenes: validation experiment with mouse liver

Validation experiment on mouse liver

geNorm selection of the most stable reference genes within this experiment

Dataset: 197 arrays from mouse liver

Page 37: BITS - Genevestigator to easily access transcriptomics data

Clustering Analysis

Goal: to identify groups of genes that have similar expression characteristics

Tools:
– Hierarchical clustering (with leaf ordering)
– Biclustering (BiMax algorithm)

Page 38: BITS - Genevestigator to easily access transcriptomics data

Biclustering

Search for biclusters in a list of 64 genes responsive to myocardial infarction

One of many possible biclusters

Development profile of these 7 genes
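What a bicluster is can be shown on a tiny binary matrix. The sketch below finds inclusion-maximal all-ones biclusters, the kind of result BiMax reports, but it does so by brute-force enumeration of gene subsets rather than by BiMax's divide-and-conquer strategy, so it only suits toy input. Gene and condition names are hypothetical.

```python
from itertools import combinations

def maximal_biclusters(binary, min_genes=2, min_conditions=2):
    """binary: gene -> set of conditions in which the gene is 'on'."""
    genes = sorted(binary)
    found = set()
    for size in range(1, len(genes) + 1):
        for subset in combinations(genes, size):
            # Conditions in which every gene of the subset is 'on'.
            shared = set.intersection(*(binary[g] for g in subset))
            if not shared:
                continue
            # Extend to every gene that is 'on' in all shared conditions,
            # which makes the (genes, conditions) pair inclusion-maximal.
            members = frozenset(g for g in genes if shared <= binary[g])
            found.add((members, frozenset(shared)))
    return sorted(
        (sorted(g), sorted(c)) for g, c in found
        if len(g) >= min_genes and len(c) >= min_conditions
    )

binary = {
    "g1": {"c1", "c2", "c3"},
    "g2": {"c1", "c2"},
    "g3": {"c2", "c3"},
    "g4": {"c4"},
}
biclusters = maximal_biclusters(binary)
print(biclusters)
```

Unlike hierarchical clustering, a gene may belong to several biclusters (here g1 appears in both), and each bicluster only requires similar behaviour across a subset of conditions.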

Page 39: BITS - Genevestigator to easily access transcriptomics data

Advantages of using Genevestigator

Benefit from the normalized data from 54’000 arrays on 12 organisms

Extended and precise gene search according to:
- Anatomy
- Development
- Stimulus / Mutation

Find genes that might be interesting for further study

Gain further information about specific gene sets

Find appropriate reference genes for the conditions you study

Rapidly compare, validate and extend data

Page 40: BITS - Genevestigator to easily access transcriptomics data

QUESTIONS?

Page 41: BITS - Genevestigator to easily access transcriptomics data

Supplementary Slides

Page 42: BITS - Genevestigator to easily access transcriptomics data

Select Genes

Page 43: BITS - Genevestigator to easily access transcriptomics data

Problems with classical reference genes

Most groups use common housekeeping genes such as β-Actin or GAPDH to normalize qPCR data

Depending on the condition studied, these genes can themselves be regulated, which makes them unsuitable for normalization

Hypothesis: for each biological context, there is a subset of genes that are most suitable to normalize expression data from this context.

Page 44: BITS - Genevestigator to easily access transcriptomics data

Summary

Page 45: BITS - Genevestigator to easily access transcriptomics data

Affymetrix GeneChip®

Scan

Page 46: BITS - Genevestigator to easily access transcriptomics data

Affymetrix GeneChip® scanned image

Each pixel intensity is determined by the expression level of a gene in the specific sample hybridized on the array

Raw Data (Probe level)

Quality Control

Normalization

Normalized Data

Into repository

Files: DAT file (scanned raw image), CEL file, TXT file