
Prof. Yechiam Yemini (YY)

Computer Science Department, Columbia University

Chapter 5: Microarray Techniques

5.1 Introduction


Overview

• Introduction
• Cancer example
• Technologies basics


Gene Expression

DNA → transcription → mRNA → translation → protein

E. Southern: Using Hybridization to Measure Gene Expression

Probe strands + target strands → hybridize → measurement
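Hybridization rests on complementary base pairing: a probe strand binds a target strand that contains its reverse complement. The following is a minimal sketch of that idea, not a model of real hybridization chemistry. The function names and the two sequences are hypothetical, and exact matching is an assumed simplification (real hybridization tolerates some mismatches and depends on temperature and salt conditions).

```python
# Watson-Crick pairing rules for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the sequence that pairs with `strand`, read in the 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hybridizes(probe: str, target: str) -> bool:
    """Idealized test: the target contains the exact reverse complement of the probe."""
    return reverse_complement(probe) in target


probe = "ATGCCGTT"         # hypothetical probe strand
target = "GGAACGGCATCC"    # hypothetical target strand

print(reverse_complement(probe))
print(hybridizes(probe, target))
```

On an array, each spot holds many copies of one probe, so the amount of labeled target bound at a spot serves as the measurement for that gene.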


Spotting Microarray

RNA extraction → reverse transcription → PCR amplification → spotting → microarray


Microarray Experiments

Data analysis

Normalization

Image processing

Clustering

Classification, statistical analysis, machine learning

Biological analysis


Application Example: Cancer Analysis

Alizadeh et al.: “Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling,” Nature 403, Feb 2000.

Diffuse large B-cell lymphoma (DLBCL), the most common subtype of non-Hodgkin's lymphoma, is clinically heterogeneous: 40% of patients respond well to current therapy and have prolonged survival, whereas the remainder succumb to the disease… We identified two molecularly distinct forms of DLBCL which had gene expression patterns indicative of different stages of B-cell differentiation. One type expressed genes characteristic of germinal centre B cells ('germinal centre B-like DLBCL'); the second type expressed genes normally induced during in vitro activation of peripheral blood B cells ('activated B-like DLBCL'). Patients with germinal centre B-like DLBCL had a significantly better overall survival than those with activated B-like DLBCL. The molecular classification of tumours on the basis of gene expression can thus identify previously undetected and clinically significant subtypes of cancer.


Cancer Analysis

• Considered 3 types of non-Hodgkin lymphoma
    Diffuse large B-cell lymphoma (DLBCL); follicular lymphoma (FL); chronic lymphocytic leukemia (CLL)
• Created microarray with 18k probes
    1/4 of the genes were replicated to assure reproducibility
    128 array experiments using 96 test samples and 1 control sample
    log(T/C) measures relative level of fluorescence of test/control
• Clustered co-expressed genes based on expression profile
    Gene profile = vector of expression level per test sample
• Clustered test samples based on test profile
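As a rough illustration of the log(T/C) measure and the two kinds of profile, the sketch below builds gene profiles and test-sample profiles from a tiny table of intensities. All gene names and intensity values are hypothetical, and base 2 is an assumption: the slide does not state the base of the logarithm.

```python
import math

# Hypothetical fluorescence intensities: one list per gene, one entry per test sample.
test = {
    "gene_a": [800.0, 1600.0, 400.0],
    "gene_b": [200.0, 100.0, 400.0],
}
# Hypothetical intensity of each gene in the single control sample.
control = {"gene_a": 400.0, "gene_b": 200.0}


def log_ratio(t: float, c: float) -> float:
    """log2(T/C): positive if higher in test, negative if lower, 0 if unchanged."""
    return math.log2(t / c)


# Gene profile = vector of log-ratios, one entry per test sample.
gene_profiles = {
    gene: [log_ratio(t, control[gene]) for t in values]
    for gene, values in test.items()
}

# Test-sample profile = vector of log-ratios, one entry per gene.
genes = sorted(gene_profiles)
n_samples = len(test["gene_a"])
sample_profiles = [[gene_profiles[g][j] for g in genes] for j in range(n_samples)]

print(gene_profiles)
print(sample_profiles)
```

Clustering genes operates on the rows of this table (gene profiles), while clustering test samples operates on its columns (test-sample profiles).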


Clustering
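One common way to cluster expression profiles is agglomerative (hierarchical) clustering with a correlation-based distance. The sketch below assumes that approach, with average linkage and distance = 1 - Pearson correlation; the slide itself does not name the method. The profiles are hypothetical values, and the function names are illustrative only.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance(x, y):
    """Correlation distance: near 0 for co-expressed profiles, near 2 for opposite ones."""
    return 1.0 - pearson(x, y)


def average_linkage(a, b, profiles):
    """Mean pairwise distance between the members of clusters a and b."""
    total = sum(distance(profiles[p], profiles[q]) for p in a for q in b)
    return total / (len(a) * len(b))


def cluster(profiles, k):
    """Repeatedly merge the two closest clusters until only k clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > k:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return clusters


# Hypothetical gene profiles (log-ratios across four test samples).
profiles = {
    "gene_a": [1.0, 2.0, 0.0, -1.0],
    "gene_b": [0.9, 2.2, 0.1, -0.8],
    "gene_c": [-1.0, -2.0, 0.2, 1.1],
    "gene_d": [-0.8, -1.9, 0.0, 1.0],
}
result = cluster(profiles, 2)
print(result)
```

The same routine applies to test-sample profiles by passing vectors indexed by gene instead of by sample.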


Discovering Two Types of DLBCL

76% of GC B-like DLBCL patients were still alive after five years, as compared with only 16% of activated B-like DLBCL patients


Short Oligonucleotide Microarrays

• Key idea: short sequences (25 bp) fingerprint genes
• Probes are carefully selected
    Provide fingerprinting
    Avoid cross-hybridizing
• Construct array with photolithography
• Probe = perfect match + mismatch
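The perfect match / mismatch idea can be sketched as follows. This assumes the common convention in which the mismatch probe is a copy of the perfect-match probe with its central base replaced by the complementary base; the slide does not spell out how the mismatch is built. The sequences are hypothetical, the function names are illustrative, and strand orientation is ignored for simplicity.

```python
from typing import List

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
PROBE_LENGTH = 25  # short oligonucleotide length from the slide


def perfect_match_probes(gene: str, length: int = PROBE_LENGTH) -> List[str]:
    """All windows of the given length along the gene (candidate PM probes)."""
    return [gene[i:i + length] for i in range(len(gene) - length + 1)]


def mismatch_probe(pm: str) -> str:
    """Copy of the PM probe with its central base swapped for the complement."""
    mid = len(pm) // 2
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]


def is_fingerprint(probe: str, other_genes: List[str]) -> bool:
    """True if the probe occurs in none of the other genes (no exact cross-match)."""
    return all(probe not in other for other in other_genes)


gene = "ATGGCGTACGTTAGCCTAGGCTTAACGGAT"          # hypothetical 30-base gene fragment
others = ["TTTTGGGGCCCCAAAATTTTGGGGCCCCAAAA"]    # hypothetical other gene

candidates = perfect_match_probes(gene)
pm = candidates[0]
mm = mismatch_probe(pm)
print(pm)
print(mm)
print(is_fingerprint(pm, others))
```

Comparing the signal of a perfect-match probe with that of its mismatch partner gives an estimate of how much of the signal comes from non-specific binding.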


Applying Oligonucleotide Microarrays


Using Inkjet Printers to Build Microarrays

• Inkjet printer provides small aperture spot size
• Boundaries are very sharp
• Density is improving exponentially fast
• Use longer oligonucleotides


Comparison

In-situ Synthesis / Oligos

Advantages
• No need to isolate and purify cDNAs; oligonucleotides are synthesized.
• Oligonucleotides are less likely to have cross-reactivity with target sequences.
• Density of chips is higher than with cDNAs.

Limitations
• The sequence has to be known.
• Synthesis can be expensive and time-consuming.
• Short sequences are not as specific for target DNA.

PCR Products / cDNA Probes

Advantages
• Flexibility to study cDNAs from any source.
• cDNAs do not require any a priori information about the corresponding genes.
• Longer sequences increase hybridization specificity, which reduces false positives.

Limitations
• Isolation of individual cDNAs to immobilize at each spot can be cumbersome.
• Density is lower than synthesizing oligonucleotides on the surface of the chip.
• cDNAs are long sequences and are more likely to randomly contain target sequences.
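The link between sequence length and specificity can be made concrete with a back-of-the-envelope calculation. The sketch below estimates how often one fixed sequence would be expected to occur by chance in a random background. It assumes independent bases, each with probability 1/4, which real genomes do not satisfy; the function name and the background length are hypothetical.

```python
def expected_chance_matches(probe_length: int, background_length: int) -> float:
    """Expected exact occurrences of one fixed sequence in a random background.

    Assumes independent, uniformly distributed bases (probability 1/4 each).
    """
    positions = max(background_length - probe_length + 1, 0)
    return positions / 4 ** probe_length


background = 10 ** 9  # hypothetical total length of background sequence, in bases

for length in (10, 25, 50):
    print(length, expected_chance_matches(length, background))
```

Under these assumptions the expected number of chance matches falls by a factor of 4 for every added base, which is one way to read the claim that longer sequences increase hybridization specificity.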