Designing Interactive Visualisations to Solve Analytical Problems in Biology


(Designing) Interactive Visualisations to Solve Analytical Problems (in biology)

CAGATAY TURKAY, giCentre, City University London

Who?

• Lecturer in Applied Data Science @ the giCentre, CUL

• PhD @ VisGroup at Univ. of Bergen, Norway

• Research interests:

– Integrating Computational Tools in Interactive Visual Analysis Methods

– Perceptually Optimized Visualization

• Methods for several domains:

– Biology, transport, intelligence, neuroscience

giCentre (www.giCentre.net)

• 6 academics

• 2 researchers

• 5 PhDs

Data supported science

• Data analysis in almost all scientific fields

–Biology, medicine, astronomy, psychology,…

• Data driven science

• Research in several fields

–Visualization

–Data Mining

–Machine Learning

–Statistics

Visualization ?

“Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively.” [Tamara Munzner, 2014]

“The use of computer-generated, interactive, visual representations of data to amplify cognition” [Card, Mackinlay, & Shneiderman 1999]

VIS -- a mature field already

Biological data + VIS:

A good synergy

.. but why?

Why is biology interesting for VIS?

Datasets are large & heterogeneous

Yeast Protein interaction network, Barabási & Oltvai, 2004

Clustering miR expressions

http://gdac.broadinstitute.org/

Why is biology interesting for VIS?

Things happen at multiple scales

[O’Donoghue et al., 2010]

[Nye, 2008]

Why is biology interesting for VIS?

Processes are dynamic (spatio-temporal complexity)

Neutrophil chasing a bacteria by David Rogers

Why is biology interesting for VIS?

• Computational methods are central in analysis

–Uncertainties hinder reliability

– Interpretation is a problem (black-box algorithms, little context)

Comprehensive molecular portraits of human breast tumours, TCGA Network, Nature, 2012

How can visualisation help?

• Ease of cognition & communication

• Relating multiple aspects

• Compare multiple computational outputs

• Investigate uncertainties

• Seamless integration of computation

and …

• Enable & foster hypothesis generation

Forms of visualisation support

VIS as a presentation medium

+

VIS with interaction

+

VIS with integrated computations

Visualisation as a presentation medium

Cross-section of Escherichia coli cell, Illustration by David S. Goodsell, the Scripps Research Institute

10⁶ diffusing and reacting molecules in real-time, Muzic et al., 2014

NATURE METHODS: POINTS OF VIEW, by Wong et al.

http://blogs.nature.com/methagora/2013/07/data-visualization-points-of-view.html

Why is VIS good here?

• Analysts’ perceptual & cognitive capabilities

• Better interpretation

• Communication

Visualisation with interaction

Example: MizBee - Synteny Browser

Meyer et al., MizBee: A Multiscale Synteny Browser, 2009

Why is VIS good here?

• Linking multiple aspects

• Interactively varying the focus

• Display multiple-scales concurrently

Visualisation with integrated computations

Combine the best of two worlds: human capabilities and computational power

Facilitate the informed use of computation through interactive visual methods (a.k.a. Visual Analytics)

Example: StratomeX, Caleydo

http://caleydo.org

[Figure labels: Patients (samples); Genes; Candidate Subtype / Heat Map; Header / Summary of whole Stratification]

Cancers have subtypes

• different histology

• different molecular alterations

Subtypes are identified by stratifying datasets, e.g., based on

• an expression pattern

• a mutation status

• a copy number alteration

• a combination of these

Case: Cancer Subtype Analysis

Multiple Stratifications

Many shared Patients

Clustering 1 Clustering 2

Sample Overlaps

Dependent Pathways

Slide by Alex Lex
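The sample overlaps between two stratifications can be computed as a small contingency table. The following is a minimal sketch, not StratomeX code: the dictionary layout (patient id to group label), the function names, and the toy patients are all assumptions made for illustration.

```python
from collections import Counter


def overlap_table(clustering_a, clustering_b):
    """Count shared patients for every pair of groups from two stratifications.

    Both arguments map a patient id to a group label (hypothetical layout).
    Only patients present in both stratifications are counted.
    """
    shared = clustering_a.keys() & clustering_b.keys()
    return Counter((clustering_a[p], clustering_b[p]) for p in shared)


def jaccard(clustering_a, group_a, clustering_b, group_b):
    """Jaccard index of two groups: |intersection| / |union| of their patients."""
    set_a = {p for p, g in clustering_a.items() if g == group_a}
    set_b = {p for p, g in clustering_b.items() if g == group_b}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Toy data, invented for illustration only.
expression_clusters = {"P1": "A", "P2": "A", "P3": "B", "P4": "B", "P5": "B"}
mutation_status = {
    "P1": "mutated", "P2": "mutated", "P3": "mutated",
    "P4": "wild-type", "P5": "wild-type",
}

table = overlap_table(expression_clusters, mutation_status)
for (group_a, group_b), count in sorted(table.items()):
    print(f"{group_a} / {group_b}: {count} shared patients")
```

Each entry of the table corresponds to one band between two columns in a view of this kind; the Jaccard index is one possible way to normalise the raw counts by group size.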

Multiple Stratifications (again)

Many shared Patients

Clustering 1 Clustering 2

Sample Overlaps

Gene Overlaps ??

Finding distinctive genes

Characterizing cancer subtypes using dual analysis in Caleydo StratomeX, Turkay et al., IEEE CG&A, 2014

Finding distinctive genes (ex. BRCA types)

[*] Cancer Genome Atlas Network. (2012). Comprehensive molecular portraits of human breast tumours. Nature, 490(7418), 61-70.

Luminal-A

underexpressed genes

Luminal-A

overexpressed genes

Basal-like

overexpressed

Basal-like

underexpressed
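One way to look for distinctive genes is to compare, per gene, the samples inside a candidate subtype with all remaining samples. The sketch below uses Welch's t statistic as the comparison; this is an assumption for illustration and not necessarily the statistic used in the dual analysis paper cited above. The data layout and gene names are hypothetical, and no p-values are computed.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group, rest):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = sqrt(variance(group) / len(group) + variance(rest) / len(rest))
    return (mean(group) - mean(rest)) / se if se > 0 else 0.0


def rank_distinctive_genes(expression, subtype_samples):
    """Rank genes by |t| between samples inside and outside a subtype.

    expression: gene -> {sample id -> expression value} (hypothetical layout)
    subtype_samples: set of sample ids in the candidate subtype
    Each side needs at least two samples, otherwise variance() raises.
    """
    scores = {}
    for gene, values in expression.items():
        inside = [v for s, v in values.items() if s in subtype_samples]
        outside = [v for s, v in values.items() if s not in subtype_samples]
        scores[gene] = welch_t(inside, outside)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy data, invented for illustration only.
expression = {
    "GENE_UP":   {"S1": 9.0, "S2": 10.0, "S3": 11.0, "S4": 1.0, "S5": 2.0, "S6": 3.0},
    "GENE_DOWN": {"S1": 1.0, "S2": 2.0, "S3": 3.0, "S4": 5.0, "S5": 6.0, "S6": 7.0},
    "GENE_FLAT": {"S1": 4.0, "S2": 5.0, "S3": 6.0, "S4": 4.0, "S5": 5.0, "S6": 6.0},
}
ranked = rank_distinctive_genes(expression, {"S1", "S2", "S3"})
for gene, t in ranked:
    direction = "over" if t > 0 else "under" if t < 0 else "not differentially "
    print(f"{gene}: t = {t:.2f} ({direction}expressed in subtype)")
```

A positive score marks a gene as overexpressed in the subtype and a negative score as underexpressed, which mirrors the over/under split shown for Luminal-A and Basal-like.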

Ex: Cavity analysis in molecular simulations

Cavities on molecular surfaces

• Important in ligand binding

• Drug design, etc.

Long molecular simulations

Cavities are dynamic, hard to track

Amino-acids to characterize the cavity

• hydrophobicity (grey)

• polarity (green)

• positively charged (blue)

• negatively charged (red)

Visual Cavity Analysis in Molecular Simulations, J. Parulek, C. Turkay, N. Reuter, I. Viola. BMC Bioinformatics, 2013.

1. Run the simulation

2. Fit graphs to cavities

3. Compute measures

4. Find touching amino-acids

5. Perform visual analysis
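Step 4 can be illustrated for a single simulation frame. This is a minimal sketch under stated assumptions, not the method from the cited paper: a residue counts as "touching" if any of its atoms lies within a distance cutoff of a cavity sample point, the 4.0 cutoff is arbitrary, and the grouping of amino acids into the four property classes is one common convention (sources differ on borderline residues). All names and coordinates are hypothetical.

```python
from collections import Counter
from math import dist

# Assumed grouping of the 20 amino acids by side-chain property.
# Borderline residues (e.g. GLY, CYS, TYR, HIS) are classified differently elsewhere.
RESIDUE_CLASS = {
    **dict.fromkeys(
        ["ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "PRO", "GLY"], "hydrophobic"
    ),
    **dict.fromkeys(["SER", "THR", "ASN", "GLN", "TYR", "CYS"], "polar"),
    **dict.fromkeys(["LYS", "ARG", "HIS"], "positive"),
    **dict.fromkeys(["ASP", "GLU"], "negative"),
}


def touching_residues(cavity_points, atoms, cutoff=4.0):
    """Return {residue id: residue name} for residues near the cavity.

    cavity_points: list of (x, y, z) points sampling the cavity
    atoms: list of (residue id, residue name, (x, y, z))
    cutoff: distance threshold in the same units as the coordinates
    """
    touching = {}
    for res_id, res_name, pos in atoms:
        if any(dist(pos, p) <= cutoff for p in cavity_points):
            touching[res_id] = res_name
    return touching


def cavity_profile(touching):
    """Count touching residues per property class."""
    return Counter(RESIDUE_CLASS.get(name, "unknown") for name in touching.values())


# Toy frame, invented for illustration only.
cavity_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
atoms = [
    (10, "LEU", (0.0, 3.0, 0.0)),
    (10, "LEU", (0.0, 9.0, 0.0)),
    (11, "ASP", (2.0, 2.0, 0.0)),
    (12, "LYS", (20.0, 0.0, 0.0)),
    (13, "SER", (-3.0, 0.0, 0.0)),
]
touching = touching_residues(cavity_points, atoms)
profile = cavity_profile(touching)
print(dict(profile))
```

Running this per frame gives a time series of class counts for each cavity, which is the kind of derived data a linked visual analysis can then explore.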

Analysis of Proteinase 3

A hydrophobic cavity

Why is VIS good here?

• Multiple linked data sets – improve interpretation

• Multiple computational results – deal with uncertainty

• Integrate computation outputs, e.g., clusters, derived data

• Allows a fast-paced iterative process

• Quick idea prototyping

Wrap up !

VIS as a presentation medium

+

VIS with interaction

+

VIS with integrated computations

Visualisation is very good at answering HOW & WHY questions ..

- How do these genomes overlap?

- Why is this a cluster?

....

Outlook

• Interaction and exploratory analysis are key!

• Seamless support from integrated computation, e.g., t-tests

• Visual analysis as an everyday tool for analysts

Thanks ! (& more biovis ?)

http://www.biovis.net

#biovis

Paper deadline: February 15, 2015

Data & Design Contests: May 1, 2015

• VisGroup (Helwig Hauser, Julius Parulek & Ivan Viola) and Nathalie Reuter from University of Bergen

• Caleydo team (Alex Lex, Hanspeter Pfister, Nils Gehlenborg, Marc Streit)