VIZBI 2014 - Visualizing Genomic Variation

This talk was given at the VIZBI 2014 conference. See vizbi.org/2014.

Visualizing Genomic Variation

Prof Jan Aerts
Faculty of Engineering - ESAT/STADIUS
iMinds Medical ICT Department
KU Leuven
jan.aerts@esat.kuleuven.be
http://visualanalyticsleuven.be

What is genomic variation?

Aerts & Tyler-Smith, In: Encyclopedia of Life Sciences, 2009

“copy number variation”

transitions / transversions
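A minimal sketch of the distinction, using the standard definition: a transition swaps a purine for a purine (A<->G) or a pyrimidine for a pyrimidine (C<->T); any other single-base substitution is a transversion. The function name `classify_substitution` is hypothetical and only serves this illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("ref and alt must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical: not a substitution")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
        print(ref, ">", alt, classify_substitution(ref, alt))
```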

Effects of variation on phenotype

• change in protein abundance

  • level of transcription or translation (loss/gain)

  • stability

• change in protein structure (partly deleted, fusion genes, …)

What are we interested in?

• multiple samples

• show all affected genes (or functional units)

• cluster individuals

• functional effect of structural variation

• gene-centric instead of positionally ordered: coordinate-free view

• high-level annotations (pathways, GO-terms)

• uncertainty (statistical & positional) and underlying evidence

DNA sequencing

read mapping

variant calling

what is the effect of the variant?

check signal

QC QC

variant filtering

Single Nucleotide Polymorphisms

General approach: reference-based

UCSC

Ensembl

Ferstay et al, IEEE InfoVis, 2013

Variant View: sequence variants in gene context

Integrative Genomics Viewer (IGV)

Sequence logo
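As a rough illustration of what a sequence logo encodes: the total stack height at each alignment column is commonly taken as the information content, log2(alphabet size) minus the Shannon entropy of the column, and each letter gets a share proportional to its frequency. The sketch below assumes a DNA alphabet of 4, gap-free columns, and no small-sample correction; the function names are hypothetical.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 4) -> float:
    """Information content in bits of one alignment column (no small-sample correction)."""
    if not column:
        raise ValueError("column must contain at least one symbol")
    counts = Counter(column.upper())
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def letter_heights(column: str, alphabet_size: int = 4) -> dict:
    """Height of each letter in the logo stack: frequency times column information."""
    counts = Counter(column.upper())
    total = sum(counts.values())
    info = column_information(column, alphabet_size)
    return {base: (n / total) * info for base, n in counts.items()}


if __name__ == "__main__":
    for col in ["AAAA", "AACC", "ACGT"]:
        print(col, column_information(col), letter_heights(col))
```

A fully conserved column reaches the maximum of 2 bits, while a column with all four bases equally frequent carries no information and would be drawn with zero height.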

Sequence Diversity Diagram

Structural Variation

dotplot

Pevzner & Tesler, Genome Research, 2003
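For readers unfamiliar with dotplots, a minimal sketch: place one sequence on each axis and mark every cell where the k-mers starting at those two positions are identical. Conserved regions then show up as diagonal runs. The function names and the exact-match criterion are assumptions made for this illustration; real tools typically add scoring, filtering, and reverse-complement matching.

```python
def dotplot(seq_a: str, seq_b: str, k: int = 1) -> list:
    """Return a boolean grid: grid[i][j] is True if the k-mers at i and j match."""
    if k < 1:
        raise ValueError("k must be at least 1")
    rows = len(seq_a) - k + 1
    cols = len(seq_b) - k + 1
    return [
        [seq_a[i:i + k] == seq_b[j:j + k] for j in range(cols)]
        for i in range(rows)
    ]


def render(grid: list) -> str:
    """Draw the grid as text, '*' for a match and '.' for a mismatch."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in grid)


if __name__ == "__main__":
    # Identical sequences give an unbroken main diagonal.
    print(render(dotplot("ACGTTGCA", "ACGTTGCA", k=2)))
```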

read depth information: arrayCGH and next-generation sequencing

Xie & Tammi, BMC Bioinformatics, 2009
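One common way to turn read depth into a copy-number signal, sketched here under simplifying assumptions, is to compare per-window depth in a sample against a control as a log2 ratio. The window counts, the pseudocount, and the 0.3 threshold are illustrative values chosen for this sketch, not values from the talk or the cited paper, and the function names are hypothetical.

```python
import math


def log2_depth_ratios(sample, control, pseudocount=0.5):
    """Per-window log2(sample depth / control depth); the pseudocount avoids division by zero."""
    if len(sample) != len(control):
        raise ValueError("sample and control must have the same number of windows")
    return [
        math.log2((s + pseudocount) / (c + pseudocount))
        for s, c in zip(sample, control)
    ]


def label_windows(ratios, threshold=0.3):
    """Label each window as gain, loss, or neutral using a symmetric, arbitrary threshold."""
    labels = []
    for ratio in ratios:
        if ratio > threshold:
            labels.append("gain")
        elif ratio < -threshold:
            labels.append("loss")
        else:
            labels.append("neutral")
    return labels


if __name__ == "__main__":
    sample = [20, 21, 10, 9, 40, 19]    # made-up read counts per window
    control = [20, 20, 20, 20, 20, 20]
    print(label_windows(log2_depth_ratios(sample, control)))
```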

next-generation sequencing: read-pair information

Medvedev, Nature Methods, 2009
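The read-pair idea can be sketched as follows: pairs whose mapped distance or orientation disagrees with the sequencing library hint at a structural variant. The rules below are a deliberately simplified version of the commonly described signatures (mapped too far apart suggests a deletion, too close an insertion, same-strand mates an inversion, mates on different chromosomes an interchromosomal event). The `ReadPair` type, the expected insert size of 400 bp, and the tolerance of 100 bp are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair: ReadPair, expected_insert: int = 400, tolerance: int = 100) -> str:
    """Assign a coarse structural-variation signature to one mapped read pair."""
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    if pair.strand1 == pair.strand2:
        # A normal pair maps to opposite strands; same-strand mates suggest an inversion.
        return "inversion"
    span = abs(pair.pos2 - pair.pos1)
    if span > expected_insert + tolerance:
        return "deletion"
    if span < expected_insert - tolerance:
        return "insertion"
    return "concordant"


if __name__ == "__main__":
    pairs = [
        ReadPair("chr1", 1000, "+", "chr1", 1400, "-"),
        ReadPair("chr1", 1000, "+", "chr1", 5000, "-"),
        ReadPair("chr1", 1000, "+", "chr1", 1350, "+"),
        ReadPair("chr1", 1000, "+", "chr7", 900, "-"),
    ]
    for p in pairs:
        print(classify_pair(p))
```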

Stephens et al, Cell, 2011

Integrate read-depth and read-pair information

Pavlopoulos et al, Nucleic Acids Research, 2013

Stephens et al, Cell, 2010

Meander

From data generation to data interpretation: understanding the effect of structural variation

linearity of reference chromosome broken by structural variation, but still using the reference for comparison

=> the domain expert needs to try to “wrap their head around” the data

=> need to lessen the cognitive load in interpretation: change a cognitive task into a perceptual one

UCSC Genome Browser

Nielsen & Wong, Nat Methods, 2012

represent the chromosome as it is in vivo (≈ FISH)

Feuk, Nature Reviews Genetics, 2006

reconstruct rearranged chromosome based on graph structure of segments

breakpoint graph

Pevzner & Tesler, Genome Research, 2003

focus on functional impact - Pipit

Sakai et al, submitted

Challenges

• visual and interaction scalability

  • genome size: HSA1 (human chromosome 1) = 240 Mb = 240,000 screens at 1 pixel/bp = 72 km

  • deep sequencing => very high depth per track

  • high-dimensional data: many tracks (n=98!)

  • compare multiple samples

• computational scalability

  • how to compute fast enough to make interactivity possible? (e.g. switching between data resolutions)
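The chromosome 1 figures in the list above can be reproduced with simple arithmetic. The slide does not state the screen dimensions; a screen of 1,000 pixels and 30 cm width is inferred here because it makes the numbers agree, so treat both values as assumptions. The function name is hypothetical.

```python
def browsing_distance(genome_bp: int,
                      pixels_per_bp: float = 1.0,
                      screen_px: int = 1000,       # assumed screen width in pixels
                      screen_width_m: float = 0.30):  # assumed physical screen width
    """Return (number of screens, total width in km) needed to show a sequence end to end."""
    total_px = genome_bp * pixels_per_bp
    screens = total_px / screen_px
    kilometres = screens * screen_width_m / 1000
    return screens, kilometres


if __name__ == "__main__":
    screens, km = browsing_distance(240_000_000)  # 240 Mb, as on the slide
    print(f"{screens:,.0f} screens, {km:.0f} km")
```

Changing `pixels_per_bp` shows how quickly zooming out reduces the problem: at 0.001 pixels per base pair the same chromosome fits on 240 screens.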

Thank you

• Authors of papers mentioned

• Bioinformatics/Visual Analytics Leuven

  • Ryo Sakai

  • Raf Winand

  • Thomas Boogaerts

  • Toni Verbeiren

  • Georgios Pavlopoulos

• Data Visualization Lab (datavislab.org)

  • Erik Duval

  • Andrew Vande Moere

Questions?