Dianne Ford DNA microarrays and high-throughput sequencing approaches for analysing patterns of gene expression

Cmb2003 lecture 12 2013


Page 1: Cmb2003 lecture 12 2013

Dianne Ford

DNA microarrays and high-throughput sequencing approaches for analysing patterns of gene expression

Page 2: Cmb2003 lecture 12 2013

Functional genomics

• Experimental methods of identifying the function and expression pattern of genomic sequences
  – Bioinformatics (CMB2005)
  – Genome projects – Professor Morgan
  – Mouse knockout models (MMed lecture 11)
  – DNA microarrays and high-throughput (“next generation”) sequencing

• Detect patterns of gene expression
  – E.g. compare different tissues (see Mortazavi A et al (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621-8)
  – E.g. compare normal and abnormal states (e.g. cancer)
  – E.g. compare effects in specific tissue of pharmaceutical/dietary intervention (see Lagouge et al (2006) Resveratrol improves mitochondrial function and protects against metabolic disease by activating Sirt1 and PGC-1alpha. Cell 127: 1109-1122)
  – E.g. compare same tissue/cell line before and after a specific treatment (e.g. nutrient starvation; oxidative stress)

Page 3: Cmb2003 lecture 12 2013

So what is a DNA microarray?

• A glass slide (like a microscope slide) spotted at high density with individual DNA sequences (up to approximately 40,000), each of which corresponds to a known gene product
  – Oligonucleotides (50-70 mers)
  – PCR products

Page 4: Cmb2003 lecture 12 2013

Why is that useful?

• Incubate microarray with fluorescently-labelled cDNA from the tissue/cell line of interest.

• Labelled cDNA will hybridise (stick) only to DNA spots on the array to which it is complementary.

• Detect which (known) spots the test cDNA has hybridised to, in order to determine which genes are expressed.

Page 5: Cmb2003 lecture 12 2013

Sounds too easy!

• True – that is an over-simplification
  – It is more usual to hybridise cDNA samples prepared separately from two different tissues/cell types, or from the same tissue in two different states (e.g. normal and diseased), or from the same tissue/cell type before and after a specific treatment.
  – By incorporating a different coloured fluorescent dye into each sample, the relative level of expression of each gene on the array in each of the two samples can be compared.

• Other platforms (e.g. Affymetrix) use just a single dye and samples are hybridised to separate arrays, which are then compared.

Page 6: Cmb2003 lecture 12 2013

Principle of analysis of relative levels of gene expression by DNA microarray hybridisation

[Figure: workflow]

• RNA isolated from sample A: reverse transcribe (to produce cDNA), incorporating green fluorescent dye

• RNA isolated from sample B: reverse transcribe (to produce cDNA), incorporating red fluorescent dye

• Mix, hybridise to microarray and scan

Key to spots on the scanned array:

– Expressed in neither sample
– Expressed only in sample A
– Expressed only in sample B
– Expressed equally in both samples
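The four outcomes in the figure can be sketched in code. This is only an illustration, not the analysis used by any particular scanner software: it assumes sample A carries the green dye and sample B the red dye (as the figure order suggests), and the background threshold, the two-fold cut-off and the function name `classify_spot` are all hypothetical choices.

```python
import math


def classify_spot(green, red, background=100.0, fold=2.0):
    """Classify one array spot from its two dye intensities.

    green: signal from sample A (assumed to be labelled green)
    red:   signal from sample B (assumed to be labelled red)
    background and fold are illustrative cut-offs, not standard values.
    """
    a_on = green > background
    b_on = red > background
    if not a_on and not b_on:
        return "expressed in neither sample"
    if a_on and not b_on:
        return "expressed only in sample A"
    if b_on and not a_on:
        return "expressed only in sample B"
    # Both samples give signal: compare them on a log2 scale so that
    # two-fold up and two-fold down are symmetrical (+1 and -1).
    log_ratio = math.log2(red / green)
    if abs(log_ratio) < math.log2(fold):
        return "expressed equally in both samples"
    return "higher in sample B" if log_ratio > 0 else "higher in sample A"


for green, red in [(50, 60), (5000, 40), (30, 4000), (2000, 2200), (500, 4000)]:
    print(green, red, "->", classify_spot(green, red))
```

The last case shows why the log ratio is useful: genes expressed in both samples are rarely exactly equal, so a real analysis asks how large the difference is rather than simply whether a spot is lit.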

Page 7: Cmb2003 lecture 12 2013

The image generated by a microarray experiment actually looks like this:

Page 8: Cmb2003 lecture 12 2013

An example of a microarray experiment

• Middle-aged (1 y) male mice provided with standard diet or high fat (60% energy) diet

• Resveratrol added to the diet of half of the mice on each diet

• Resveratrol shifted physiology of mice on high-fat diet towards mice on standard diet and increased survival significantly

• Microarray analysis of gene expression in liver showed resveratrol opposed effects of high-calorie diet on 144 out of 153 significantly-altered pathways

Lagouge et al (2006) Cell 127: 1109-1122

Page 9: Cmb2003 lecture 12 2013

An example of a microarray experiment

• Microarray analysis of gene expression in liver showed resveratrol opposed effects of high-calorie diet on 144 out of 153 significantly-altered pathways

Lagouge et al (2006) Cell 127: 1109-1122

Parametric analysis of gene-set enrichment (PAGE) comparing every pathway significantly upregulated (red) or downregulated (blue) by either the HC diet or resveratrol (153 in total, with 144 showing opposing effects).
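The slide does not give the PAGE calculation, so the following is a minimal sketch of the statistic as it is commonly described: the mean log fold change of the genes in one pathway is compared with the mean of all genes on the array and scaled to a Z score. The gene names, the values and the function name `page_z_score` are hypothetical.

```python
import math
import statistics


def page_z_score(all_log_ratios, gene_set):
    """Z score for one gene set (pathway).

    all_log_ratios: dict of gene name -> log2 fold change for every gene on the array
    gene_set:       names of the genes belonging to the pathway
    """
    values = list(all_log_ratios.values())
    mu = statistics.fmean(values)        # mean fold change of all genes
    sigma = statistics.pstdev(values)    # standard deviation of all genes
    members = [all_log_ratios[g] for g in gene_set if g in all_log_ratios]
    m = len(members)                     # number of pathway genes measured
    set_mean = statistics.fmean(members)
    return (set_mean - mu) * math.sqrt(m) / sigma


# Hypothetical data: positive Z = pathway upregulated, negative Z = downregulated.
log_ratios = {"g1": 2.0, "g2": 2.0, "g3": -2.0, "g4": -2.0}
print(page_z_score(log_ratios, ["g1", "g2"]))
print(page_z_score(log_ratios, ["g3", "g4"]))
```

In a real analysis the gene sets would contain many more genes (the normal approximation behind the Z score is poor for very small sets), and a pathway would be coloured red or blue only if its Z score passed a significance threshold.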

Page 10: Cmb2003 lecture 12 2013

Next generation DNA sequencing: extension of the approach to “counting” transcript numbers in an mRNA sample

• Also known as “massively parallel” DNA sequencing

• Different commercial platforms are available
  – E.g. Illumina Solexa Genome Analyzer

• Achieves parallel (simultaneous) short (35-75 bp) sequencing of hundreds of millions of random fragments of DNA (or, for determining transcript (mRNA) numbers, of cDNA).
  – Fragments for sequencing are arrayed randomly in clusters of around 10^3-10^6 copies, produced by “bridge amplification” of single fragments bound to a solid support (flow cell). The support is covered with oligonucleotides that pair with adapter oligonucleotides ligated to each end of the fragmented DNA (or of cDNA copies of short (e.g. approx. 200 base) mRNA fragments).
  – Then uses “DNA sequencing by synthesis” technology
    » All 4 nucleotides are added together, with DNA polymerase; each carries a base-unique fluorescent label and has its 3’ OH group blocked chemically, so only one base is incorporated at a time.
    » Flow cell imaged by sophisticated optics after laser excitation.
    » The large number of copies sequenced in each cluster is required to generate a sufficiently strong signal for detection.
    » Then the 3’ blocking group is removed chemically and the next round proceeds.

• Sequence reads are aligned against a reference genome.

• Depending on initial sample preparation, gives information on genomic sequence variations, splice variants or transcript (mRNA) numbers.
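The cycle of incorporate, image and unblock can be sketched as a toy loop. This is a deliberately simplified model of a single cluster, not the instrument's base-calling software; the function name `sequence_by_synthesis` is hypothetical, and the template is assumed to be written 3’ to 5’ so that the read comes out 5’ to 3’.

```python
# Watson-Crick pairing: the base incorporated opposite each template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template, cycles):
    """Return the read obtained from one cluster after the given number of cycles."""
    read = []
    for position in range(min(cycles, len(template))):
        # 1. All four labelled nucleotides are offered, but only the one
        #    complementary to the template base is incorporated; its blocked
        #    3' OH stops the polymerase adding a second base in this cycle.
        incorporated = COMPLEMENT[template[position]]
        # 2. Imaging: the base-unique fluorescent label identifies the base.
        read.append(incorporated)
        # 3. The blocking group is removed; the loop moves on to the next base.
    return "".join(read)


print(sequence_by_synthesis("TACGGA", 4))
```

The number of cycles sets the read length, which is why the reads from this platform are short (35-75 bp in the lecture's figures).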

Page 11: Cmb2003 lecture 12 2013

New generation (Solexa) sequencing: step 1

OR cDNA sample (copy of mRNA, so representative of number of copies of each mRNA in the sample)

Page 12: Cmb2003 lecture 12 2013

New generation (Solexa) sequencing: step 2

Page 13: Cmb2003 lecture 12 2013

New generation (Solexa) sequencing: step 3

Page 14: Cmb2003 lecture 12 2013

New generation (Solexa) sequencing: step 4

Page 15: Cmb2003 lecture 12 2013

New generation (Solexa) sequencing: step 5

Page 16: Cmb2003 lecture 12 2013

New generation (Solexa) sequencing: step 6

Page 17: Cmb2003 lecture 12 2013

New generation (Solexa) sequencing: step 1 (reminder)

OR cDNA sample (copy of mRNA, so representative of number of copies of each mRNA in the sample)

Page 18: Cmb2003 lecture 12 2013

New generation (Solexa) sequencing: use for RNA-seq to determine transcript (mRNA) copy number

Mortazavi A et al (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621-8

• mRNA fragmented by Mg-catalysed hydrolysis, to give fragments of approximately 200 bases

• cDNA produced by random-primed reverse transcription; then Solexa sequencing

RPKM = Reads Per Kilobase of transcript per Million mapped reads
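The definition above translates directly into a calculation: divide the number of reads mapped to a transcript by the transcript length in kilobases and by the total number of mapped reads in millions. A minimal sketch follows; the function name `rpkm` and the example numbers are made up for illustration.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    read_count:           reads mapped to this transcript
    transcript_length_bp: transcript length in bases
    total_mapped_reads:   all mapped reads in the sample
    """
    kilobases = transcript_length_bp / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions_of_reads)


# Hypothetical example: 500 reads on a 2,000-base transcript,
# in a run with 10 million mapped reads in total.
print(rpkm(500, 2_000, 10_000_000))
```

Dividing by length matters because a long transcript yields more fragments than a short one present at the same copy number; dividing by the total number of mapped reads allows samples sequenced to different depths to be compared.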

Page 19: Cmb2003 lecture 12 2013

Example of the use of sequence data from multiple genomes: deducing gene (protein) function

– Functionally-linked proteins should have homologues in all organisms with that function

• E.g. Flagella proteins should be only in bacteria with flagella

[Figure: six organisms scored for flagella] Flagella? No, Yes, Yes, No, Yes, No
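The reasoning can be sketched as a comparison of presence/absence patterns across genomes. The flagella pattern below is the one from the slide; the protein names and their patterns are invented for illustration, and `matching_proteins` is a hypothetical helper.

```python
# Presence (True) or absence (False) of flagella in the six organisms on the slide.
FLAGELLA = (False, True, True, False, True, False)

# Hypothetical profiles: is a homologue of each protein found in each genome?
PROFILES = {
    "protein_1": (False, True, True, False, True, False),  # same pattern as flagella
    "protein_2": (True, True, True, True, True, True),     # found in every genome
    "protein_3": (False, True, False, False, True, True),  # different pattern
}


def matching_proteins(trait, profiles):
    """Return proteins whose presence/absence pattern exactly matches the trait."""
    return sorted(name for name, profile in profiles.items() if profile == trait)


print(matching_proteins(FLAGELLA, PROFILES))
```

Only protein_1 is a candidate flagella protein here. A protein found in every genome (protein_2) is more likely to have a general housekeeping function, so its presence in flagellated bacteria is uninformative.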

Page 20: Cmb2003 lecture 12 2013

References

Mardis ER (2008) Next-generation DNA sequencing methods. Annu Rev Genomics Hum Genet. 9:387-402

Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621-8

Noordewier M & Warren P (2001) Gene expression microarrays and the integration of biological knowledge. Trends in Biotechnology 19:412-415

Lagouge et al (2006) Resveratrol improves mitochondrial function and protects against metabolic disease by activating Sirt1 and PGC-1alpha. Cell 127: 1109-1122