Lesson 5
Genetic Variant Annotation
Linlin Yan (颜林林)
Center for Bioinformatics, Peking University
Jun 13, 2011
CBI Tech. Workshop - NGS Special Session
Outline
Review & Overview
Thoughts & Methods
  Variant Browsing
  Variant Annotation
  Association Study
  More Beyond
Demos & Exercises
Part I: Review & Overview
Workshop Schedule
Lesson  Topic                    Title                           Speaker  Date
0       Warm-up                  Warm-up and Introduction        GaoG     4-25
1       Basic                    File Format & Reads Mapping     YanLL    5-9
2                                Solexa Pipeline                 CaiT     5-16
3       Genetics                 Alignment File Manipulate       YeYX     5-23
4                                Genetic Variant Caller          LiuH     5-30
5                                Genetic Variant Annotation      YanLL    6-13
6                                Genome Assembling               LiZ      6-20
7       Transcriptome (RNA-Seq)  ...                             CaiT     6-27
8                                Transcript Mapping              ZhaoHQ   7-4
9                                Transcript Assembling           LiuXQ    7-11
10                               Differential Expression Caller  ChenWB   7-18
11      ChIP-Seq                 Peak Caller                     TangX    7-25
NGS Analysis Workflow
[Flowchart] Sequencer -> Short Reads, then:
  Assembling -> Contigs / Scaffolds
  Mapping -> Alignments, then:
    Call Variants -> SNV / CNV / SV
    Calculate Expression -> Expression Profile
    Call Peaks -> Peaks / Regions
  Final step: Annotation
Genetic Variant Analysis Workflow
Solexa Pipeline (Lesson 2)
File Format (Lesson 1): FASTQ / Quality / SAM / ...
Reads Mapping (Lesson 1): Maq / Bowtie / BWA
Alignment File Manipulate (Lesson 3): Samtools / BEDTools / FASTX-Toolkit
Genetic Variant Caller (Lesson 4): GATK
Genetic Variant Annotation (Lesson 5): PolyPhen / SIFT / ANNOVAR / PLINK / ...
[Flowchart] Sequencer -> Short Reads -> Mapping -> Alignments -> Call Variants -> SNV / CNV / SV -> Annotation
Part II: Thoughts & Methods
What Could Be Inferred from Variants
What is at these positions? => Genome Browser
How do they affect function? => Variant Annotation
What is related to the phenotype? => Association Study
More beyond ... => Disease: CDCV vs. CDRV
[Diagram labels: Genetic Variants (SNV / CNV / SV), Genome Annotation, Mutation Effects, Phenotype, Disease]
Genome Browser
Online Browsers:
  UCSC Genome Browser: http://genome.ucsc.edu/
  Ensembl Genome Browser: http://www.ensembl.org/
  DNAnexus: https://dnanexus.com/genomes/hg18/public_browse
Local Browsers:
  IGV (Integrative Genomics Viewer): http://www.broadinstitute.org/igv/
UCSC Genome Browser
(http://genome.ucsc.edu/cgi-bin/hgTracks?clade=mammal&org=Human&db=hg19)
UCSC Genome Browser (cont.)
Supported Formats:
  BED / bigBed
  bedGraph
  GFF
  GTF
  WIG / bigWig
  MAF
  BAM
  BED detail
  Personal Genome SNP
  PSL
(http://genome.ucsc.edu/)
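As a small illustration of the BED format listed above, the sketch below turns 1-based variant positions into 0-based, half-open BED lines of the kind that can be uploaded as a custom track. The function name `snvs_to_bed`, the track name, and the example positions are hypothetical and only serve the illustration.

```python
def snvs_to_bed(snvs, track_name="my_variants"):
    """Return BED text for a list of (chrom, pos_1based, label) SNVs.

    BED intervals are 0-based and half-open, so a SNV at 1-based
    position p becomes chromStart = p - 1 and chromEnd = p.
    """
    lines = ['track name="%s"' % track_name]
    for chrom, pos, label in snvs:
        lines.append("%s\t%d\t%d\t%s" % (chrom, pos - 1, pos, label))
    return "\n".join(lines) + "\n"


# Hypothetical example positions, not real variant calls.
example = [("chr1", 1000, "snv_a"), ("chr2", 2500, "snv_b")]
bed_text = snvs_to_bed(example)
print(bed_text, end="")
```

The off-by-one shift is the part that is easiest to get wrong when converting caller output into a browser track.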
IGV (Integrative Genomics Viewer)
(http://www.broadinstitute.org/igv/)
UCSC: Table Browser & Public DB
Retrieve track data in batch
Retrieve sequences in specific regions
Combine regions and/or annotations
Query track data in public MySQL database
(http://genome.ucsc.edu/cgi-bin/hgTables)
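A region query against a track table might look like the sketch below. It does not contact the UCSC public MySQL server: an in-memory SQLite table with made-up rows stands in for a track table, and the table name `demo_snps` and the column names `chrom`, `chromStart`, `chromEnd`, `name` are assumptions modeled on BED-style track tables.

```python
import sqlite3


def variants_in_region(conn, table, chrom, start, end):
    """Return names of features overlapping [start, end) on chrom.

    Coordinates are 0-based and half-open. The table name is pasted
    into the SQL text, so only pass trusted table names.
    """
    sql = ("SELECT name FROM %s "
           "WHERE chrom = ? AND chromStart < ? AND chromEnd > ? "
           "ORDER BY chromStart" % table)
    return [row[0] for row in conn.execute(sql, (chrom, end, start))]


# In-memory stand-in for a track table; rows are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_snps "
             "(chrom TEXT, chromStart INTEGER, chromEnd INTEGER, name TEXT)")
conn.executemany("INSERT INTO demo_snps VALUES (?, ?, ?, ?)", [
    ("chr1", 99, 100, "demo1"),
    ("chr1", 499, 500, "demo2"),
    ("chr2", 99, 100, "demo3"),
])
hits = variants_in_region(conn, "demo_snps", "chr1", 0, 200)
print(hits)
```

The overlap condition (`chromStart < end AND chromEnd > start`) is the usual test for half-open intervals and carries over unchanged to a real MySQL query.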
These are KNOWN variants.
How about UNKNOWN variants?
Mutation Effects Prediction
SIFT (Sorting Intolerant From Tolerant): http://sift.jcvi.org/
PolyPhen (Polymorphism Phenotyping): http://genetics.bwh.harvard.edu/pph/
MAPP (Multivariate Analysis of Protein Polymorphism): http://mendel.stanford.edu/SidowLab/downloads/MAPP/index.html
SNPs3D: http://www.snps3d.org/
Automatic Variant Annotation
ANNOVAR (ANNOtate VARiation): http://www.openbioinformatics.org/annovar/
Gene-based annotation
  SNPs/CNVs that affect protein coding
Region-based annotation
  Variants in specific regions
Filter-based annotation
  Variants reported in dbSNP, 1000 Genomes
  Filter by SIFT score
Others
  Retrieve sequences or candidate gene lists in batch
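To make the idea of gene-based annotation concrete, here is a minimal sketch that classifies a single-base substitution inside a coding sequence using the standard genetic code. It is not how ANNOVAR is implemented; the function name `classify_coding_snv`, the labels it returns, and the toy sequence are all assumptions made for illustration.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def classify_coding_snv(cds, pos, alt):
    """Classify a substitution at 0-based offset `pos` of a coding sequence.

    `cds` is given on the coding strand and starts in frame.
    """
    cds, alt = cds.upper(), alt.upper()
    start = pos - pos % 3                      # first base of the codon
    ref_codon = cds[start:start + 3]
    offset = pos % 3
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop-gain"
    if ref_aa == "*":
        return "stop-loss"
    return "nonsynonymous"


toy_cds = "ATGGCCTGGTAA"  # toy sequence: Met-Ala-Trp-stop
for pos, alt in [(5, "T"), (3, "A"), (8, "A"), (9, "C")]:
    print(pos, alt, classify_coding_snv(toy_cds, pos, alt))
```

A real annotator also has to map genomic coordinates onto transcripts, handle the reverse strand, and deal with splice sites and indels, none of which is attempted here.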
Between Patients and Normals
Too many variants detected
Most variants are not related to the target disease
Comparing MAF (Minor Allele Frequency) between patients and normals can indicate related variants
MAF   Patients  Normals  Related
SNP1  5%        5%       No
SNP2  40%       10%      Yes
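The MAF comparison above can be sketched as a chi-square test on allele counts. The slide gives only percentages, so the sample sizes here (100 patients and 100 normals, i.e. 200 alleles per group) are an assumption, and `allelic_chi2` is a hypothetical helper, not a function from PLINK or any other tool.

```python
import math


def allelic_chi2(case_minor, case_total, ctrl_minor, ctrl_total):
    """Pearson chi-square (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). Counts are numbers of alleles, not people.
    """
    a, b = case_minor, case_total - case_minor
    c, d = ctrl_minor, ctrl_total - ctrl_minor
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # a row or column is empty: nothing to test
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Assumed sample: 200 alleles per group, MAFs taken from the table above.
snp1 = allelic_chi2(10, 200, 10, 200)   # 5% vs 5%
snp2 = allelic_chi2(80, 200, 20, 200)   # 40% vs 10%
print("SNP1", snp1)
print("SNP2", snp2)
```

With equal frequencies the statistic is zero, while the 40% versus 10% difference gives a large statistic and a very small p-value under these assumed sample sizes.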
Association Study Tools
PLINK: http://pngu.mgh.harvard.edu/~purcell/plink/
gPLINK: http://pngu.mgh.harvard.edu/~purcell/plink/gplink.shtml
Haploview: http://www.broadinstitute.org/scientific-community/science/programs/medical-and-population-genetics/haploview/haploview
More Beyond: Finding the Causal Gene
Two Disease Hypothesis Models:
  CDCV: Common Disease, Common Variant
  CDRV: Common Disease, Rare Variant
To Find Rare Variants:
  From GWAS (Microarray) to Sequencing
  More Samples
  Pool-up analysis methods
Rare Variant Analysis
Gene-Based Method
(PMID:17660818)
Pool Up The Rare Variants
Fixed-Threshold Method (Li et al., 2008)
Weighted Approach (Madsen et al., 2009)
Variable-Threshold Method (VT-Test) (Price et al., 2010): http://genetics.bwh.harvard.edu/rare_variants/
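The pooling idea can be illustrated with a toy burden score: keep only variants whose MAF falls below a fixed threshold and add up each individual's minor alleles across them. This is a simplified sketch in the spirit of the fixed-threshold approach, not the published method, which also includes a proper statistical test. The function name, the genotype layout, and the threshold of 0.1 (chosen only because the toy sample is tiny) are assumptions.

```python
def burden_scores(genotypes, maf_threshold):
    """Sum minor-allele counts per individual over rare variants.

    genotypes: dict mapping variant id -> list of minor-allele counts
               (0, 1 or 2), one entry per individual, same order for all.
    A variant is pooled when 0 < MAF < maf_threshold.
    """
    if not genotypes:
        return []
    n_individuals = len(next(iter(genotypes.values())))
    scores = [0] * n_individuals
    for counts in genotypes.values():
        maf = sum(counts) / (2.0 * len(counts))
        if 0 < maf < maf_threshold:
            for i, g in enumerate(counts):
                scores[i] += g
    return scores


# Toy data: individuals 0-2 are patients, 3-5 are normals (invented values).
toy = {
    "rare1":  [1, 0, 0, 0, 0, 0],
    "rare2":  [0, 1, 0, 0, 0, 0],
    "common": [1, 1, 2, 1, 0, 1],
}
scores = burden_scores(toy, maf_threshold=0.1)
patients, normals = scores[:3], scores[3:]
print("patients:", sum(patients), "normals:", sum(normals))
```

Individually, each rare variant is seen in only one person and carries little signal; pooled per gene, the two together separate the groups in this toy sample.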
Part III: Demos & Exercises
Demos
Data Preparation
  Reads Mapping
  Variant Calling
  BED/Wig generation
Demos (cont.)
UCSC Genome Browser
  Uploading BAM/BED/Wig
IGV Genome Browser
  Loading BAM/BED/Wig
UCSC Table Browser
  Retrieve track data
  Retrieve coding sequences
UCSC Public Database
Demos (cont.)
SIFT & PolyPhen
ANNOVAR
PLINK
VT-Test
Thanks for your attention!