Computational Laboratory: aCGH Data Analysis
Feb. 4, 2011
Per Chia-Chin Wu
Today’s Topics
• Review aCGH and its data analysis
• Homework: aCGH data analysis using tools in Genboree and Ruby
Chromosomal Aberrations
REF: Albertson et al.
Array CGH
• Label patient DNA with Cy3
• Label control DNA with Cy5
• Hybridize DNA to genomic clone microarray
• Analyze Cy3/Cy5 fluorescence ratio of patient to control (log of Cy3/Cy5)
Workflow of aCGH Analysis
Finished chips → (scanner) → Raw image data (experiment info) → (image processing software)
→ Probe level raw intensity data
→ Background adjustment, normalization, transformation
→ Raw copy number (CN) data [log ratio of tumor/normal intensities]
→ Segmentation and boundary determination; estimation of CN
→ Characterizing individual genomic profiles
• Background Adjustment/Correction: reduces unevenness of a single chip
[Figure: before adjustment vs. after adjustment]
Corrected Intensity (S’) = Observed Intensity (S) – Background Intensity (B)
Eliminates non-specific hybridization signal
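The subtraction S’ = S – B can be sketched in Ruby as follows. This is a minimal illustration, not the method of any particular image-processing package: the function name `background_correct`, the sample intensities, and the `min_value` floor (added so that a later log2 transform stays defined) are all assumptions.

```ruby
# Background correction for one chip: S' = S - B for each probe.
# Assumption: corrected values are floored at min_value so that a
# later log2 transform never sees zero or a negative number.
def background_correct(observed, background, min_value: 1.0)
  unless observed.length == background.length
    raise ArgumentError, "observed and background must have the same length"
  end

  observed.zip(background).map do |s, b|
    corrected = s - b
    corrected < min_value ? min_value : corrected
  end
end

# Made-up intensities for three probes.
observed   = [520.0, 1310.0, 95.0]
background = [100.0, 110.0, 120.0]

corrected = background_correct(observed, background)
puts corrected.inspect
```

The third probe has a background above its observed signal, so it is clamped to the floor rather than going negative.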
Normalization
• Normalization: reduces technical variation between chips
[Figure: before vs. after normalization]
S’ = (S – Mean of S) / (STD of S)
S’ ~ N(0, 1)
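The standardization S’ = (S – Mean of S) / (STD of S) might look like this in Ruby. It is a sketch under stated assumptions: the function name `standardize` is hypothetical, the data are made up, and the population standard deviation (dividing by n) is used because the slide does not say which variant is intended.

```ruby
# Standardize the values of one chip to mean 0 and standard deviation 1.
# Assumption: population standard deviation (divide by n, not n - 1).
def standardize(values)
  raise ArgumentError, "need at least one value" if values.empty?

  n = values.length.to_f
  mean = values.sum / n
  variance = values.sum { |v| (v - mean)**2 } / n
  sd = Math.sqrt(variance)
  raise ArgumentError, "all values are identical (STD is zero)" if sd.zero?

  values.map { |v| (v - mean) / sd }
end

# Made-up intensities with mean 5 and population STD 2.
z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
puts z.inspect
```

Applying the same transform to every chip puts them on a common scale, which is what makes between-chip comparison reasonable.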
Normalization
• Log Transformation
[Figure: S before log transformation vs. Log(S) after log transformation]
S: probe raw intensity; S’: log-transformed intensity, S’ = log2(S)
CN = S’tumor – S’normal = log2(Stumor/Snormal)
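The raw CN value per probe, CN = log2(Stumor) – log2(Snormal), can be computed as sketched below. The function name `log2_ratio` and the intensities are illustrative assumptions; `Math.log2` is part of Ruby's standard `Math` module.

```ruby
# Raw copy-number value per probe: CN = log2(S_tumor) - log2(S_normal).
# Intensities must be positive for the logarithm to be defined.
def log2_ratio(tumor, normal)
  unless tumor.length == normal.length
    raise ArgumentError, "tumor and normal must have the same length"
  end

  tumor.zip(normal).map do |t, n|
    raise ArgumentError, "intensities must be positive" unless t > 0 && n > 0
    Math.log2(t) - Math.log2(n)
  end
end

# Made-up intensities: 4x tumor signal, equal signal, half signal.
cn = log2_ratio([400.0, 100.0, 50.0], [100.0, 100.0, 100.0])
puts cn.map { |v| v.round(3) }.inspect
```

On the log2 scale a doubling of signal is +1 and a halving is -1, so gains and losses of the same fold change are symmetric around 0.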
Segmentation/Smoothing
[Figure: two panels, CN (y-axis) vs. Clone/Chromosome (x-axis)]
• Goal: to partition the clones into sets with the same copy number and to characterize the genomic segments:
  • Noise reduction
  • Detection of Loss, Normal, Gain, Amplification
  • Breakpoint analysis
• Biological model: genomic rearrangements lead to gains or losses of sizable contiguous parts of the genome. Recurrent (over tumors) aberrations may indicate an oncogene or a tumor suppressor gene
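Once a segment has an estimated log2 ratio, the Loss / Normal / Gain / Amplification call can be as simple as a set of cutoffs. The sketch below assumes hypothetical thresholds (-0.3, 0.3, and 1.0) that are not from the slides; real cutoffs depend on platform and noise level.

```ruby
# Classify a segment-level log2 ratio into one of four states.
# Assumption: the default thresholds are illustrative only.
def call_state(log_ratio, loss_max: -0.3, gain_min: 0.3, amp_min: 1.0)
  if log_ratio >= amp_min
    :amplification
  elsif log_ratio >= gain_min
    :gain
  elsif log_ratio <= loss_max
    :loss
  else
    :normal
  end
end

# Made-up segment means.
calls = [-0.8, 0.05, 0.45, 1.6].map { |v| call_state(v) }
puts calls.inspect
```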
Segmentation Methods
• AWS - Adaptive Weights Smoothing
• CBS - Circular Binary Segmentation
• HMM - Hidden Markov Model partitioning
• Many more
All existing methods amount to unsupervised, location-specific partitioning and operate on individual chromosomes.
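To make the idea of partitioning one chromosome concrete, here is a deliberately simplified recursive binary segmentation in Ruby. It is not AWS, CBS, or an HMM: it splits wherever the split most reduces the within-segment sum of squared errors, and the fixed `min_gain` threshold is a hypothetical stand-in for a proper statistical test. The function names and data are assumptions.

```ruby
# Sum of squared deviations from the mean.
def sse(values)
  return 0.0 if values.empty?

  mean = values.sum / values.length.to_f
  values.sum { |v| (v - mean)**2 }
end

# Recursively split an ordered list of log ratios (one chromosome).
# Returns [first_index, last_index, mean] for each segment.
# Assumption: min_gain and min_size are illustrative tuning knobs.
def segment(values, offset = 0, min_gain: 0.5, min_size: 2)
  raise ArgumentError, "need at least one value" if values.empty?

  n = values.length
  total = sse(values)
  best_gain = 0.0
  best_split = nil

  # Try every split that leaves at least min_size probes on each side.
  (min_size..(n - min_size)).each do |i|
    gain = total - sse(values[0...i]) - sse(values[i..-1])
    if gain > best_gain
      best_gain = gain
      best_split = i
    end
  end

  # Stop when no split reduces the error enough.
  if best_split.nil? || best_gain < min_gain
    return [[offset, offset + n - 1, values.sum / n.to_f]]
  end

  left  = segment(values[0...best_split], offset,
                  min_gain: min_gain, min_size: min_size)
  right = segment(values[best_split..-1], offset + best_split,
                  min_gain: min_gain, min_size: min_size)
  left + right
end

# Made-up profile: normal, a gained region, then normal again.
profile = [0.0, 0.1, -0.1, 0.0, 1.0, 1.1, 0.9, 1.0, 0.0, 0.1]
segments = segment(profile)
segments.each do |first, last, mean|
  puts "#{first}-#{last}\t#{mean.round(2)}"
end
```

The segment boundaries are the estimated breakpoints, and each segment mean is the smoothed CN estimate for every clone inside it.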
Homework: Analyze TCGA Data
The Cancer Genome Atlas Project (TCGA)
• Goal: find genomic alterations that cause cancer (mutations, CNA, methylation, …)
• Pilot project
  1. brain (glioblastoma multiforme): 186 pairs of tumor and normal samples
  2. lung (squamous)
  3. ovarian (serous cystadenocarcinoma)
Flowchart of Data Analysis
Raw copy number (CN) data [log ratio of tumor/normal intensities]
Segmentation and boundary determination; estimation of CN
Characterizing individual genomic profiles
Annotation
Identify Recurrent Genes
Ruby: Mapping Probes
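A probe-mapping step could be sketched as below: each probe ID in the data is looked up in an annotation table to get its genomic coordinates. Both input layouts (tab-delimited `probe, chromosome, start, stop` and `probe, log2 ratio`), the probe IDs, and the function names are hypothetical; the actual TCGA and platform annotation files may be laid out differently.

```ruby
# Hypothetical annotation: probe ID, chromosome, start, stop.
annotation_text = "P1\tchr1\t1000\t1060\nP2\tchr1\t5000\t5060\nP3\tchr2\t200\t260\n"
# Hypothetical data: probe ID, log2 ratio.
data_text = "P1\t0.12\nP3\t-0.85\nP9\t0.40\n"

# Build a hash from probe ID to its genomic position.
def load_probe_positions(text)
  positions = {}
  text.each_line do |line|
    id, chrom, start, stop = line.chomp.split("\t")
    next if id.nil? || stop.nil? # skip blank or short lines
    positions[id] = { chrom: chrom, start: Integer(start), stop: Integer(stop) }
  end
  positions
end

# Attach coordinates to each data row; collect IDs with no annotation.
def map_probes(text, positions)
  mapped = []
  unmapped = []
  text.each_line do |line|
    id, value = line.chomp.split("\t")
    next if id.nil? || value.nil?
    pos = positions[id]
    if pos
      mapped << pos.merge(probe: id, log_ratio: Float(value))
    else
      unmapped << id
    end
  end
  [mapped, unmapped]
end

positions = load_probe_positions(annotation_text)
mapped, unmapped = map_probes(data_text, positions)
puts "mapped #{mapped.length} probes, #{unmapped.length} without coordinates"
```

Reporting the unmapped probes separately, rather than dropping them silently, makes it easier to spot a mismatch between the data file and the annotation file.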
LFF format
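Writing one mapped probe as a tab-delimited LFF record might look like the sketch below. The column order used here (class, name, type, subtype, chromosome, start, stop, strand, phase, score) is an assumption to be checked against the Genboree LFF documentation, and the class, type, and subtype values are placeholders rather than required names.

```ruby
# Format one probe as a tab-delimited annotation line.
# Assumption: the 10-column order below matches Genboree's LFF layout;
# verify against the Genboree documentation before uploading.
# klass, type, and subtype defaults are placeholders.
def lff_line(probe:, chrom:, start:, stop:, score:,
             klass: "aCGH", type: "Sample", subtype: "LogRatio")
  [klass, probe, type, subtype, chrom, start, stop, "+", ".", score].join("\t")
end

# Made-up probe record.
line = lff_line(probe: "P1", chrom: "chr1", start: 1000, stop: 1060, score: 0.12)
puts line
```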
Upload Data
Data Analysis: Segmentation
Data Analysis: Combine Tracks
Data Analysis: Annotation Selector
Data Analysis: Mapping Genes
Data Analysis: Recurrent Genes
Overview of Data Analysis
Raw copy number (CN) data [log ratio of tumor/normal intensities]
Data Preprocessing (Ruby) and uploading data to Genboree
Segmentation (Segmentation Tool)
Characterizing individual genomic profiles
Combining data
Annotation (Annotation Selector; Attribute Lifter)
Identify Recurrent Genes (Ruby)
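The final step, identifying recurrent genes, amounts to counting in how many samples each gene is altered and printing a two-column table. The sketch below assumes the annotated table has already been parsed into `[sample, gene]` pairs; the pair layout, the placeholder gene and sample names, and the `min_samples` cutoff are hypothetical.

```ruby
require "set"

# Count the number of distinct samples in which each gene is altered.
# Returns [gene, count] pairs, most recurrent first.
# Assumption: min_samples is an illustrative cutoff for "recurrent".
def recurrent_genes(rows, min_samples: 2)
  samples_by_gene = Hash.new { |hash, gene| hash[gene] = Set.new }
  rows.each { |sample, gene| samples_by_gene[gene] << sample }

  samples_by_gene
    .map { |gene, samples| [gene, samples.size] }
    .select { |_gene, count| count >= min_samples }
    .sort_by { |gene, count| [-count, gene] }
end

# Made-up rows; GENE_C appears twice in the same sample and counts once.
rows = [
  ["S1", "GENE_A"], ["S2", "GENE_A"], ["S3", "GENE_A"],
  ["S1", "GENE_B"], ["S3", "GENE_B"],
  ["S2", "GENE_C"], ["S2", "GENE_C"]
]

result = recurrent_genes(rows)
result.each { |gene, count| puts "#{gene}\t#{count}" }
```

Using a set per gene keeps a gene hit by several segments in one sample from being counted more than once.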
You Need To Submit
1. Ruby script from step 1 that creates your LFF file
2. Ruby script from step 5 that parses your table
3. two-column final output from step 5