2013.10.24 Big Data Visualization

Visualizing “Big” Data
Sean Kandel & Jeffrey Heer
Trifacta Inc. @trifacta

How can we visualize and interact with billion+ record databases in real time?

Two Challenges:
1. Effective visual encoding
2. Real-time interaction

Perceptual and interactive scalability should be limited by the chosen resolution of the visualized data, not the number of records.

Perception

Data Sampling

Modeling

Binning

Google Fusion Tables (Sampling)

imMens (Binned Aggregation)

Bin > Aggregate (> Smooth) > Plot

1. Bin: Divide data domain into discrete “buckets”
Categories: Already discrete (but check cardinality)
Numbers: Choose bin intervals (uniform, quantile, ...)
Time: Choose time unit: Hour, Day, Month, etc.
Geo: Bin x, y coordinates after cartographic projection
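The binning choices for numbers and time can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used by imMens; the function names `uniform_bins`, `quantile_bins`, and `hour_bins` are hypothetical.

```python
import numpy as np

def uniform_bins(values, n_bins, lo=None, hi=None):
    """Map each value to a bin index in [0, n_bins) using equal-width bins."""
    values = np.asarray(values, dtype=float)
    lo = values.min() if lo is None else lo
    hi = values.max() if hi is None else hi
    width = (hi - lo) / n_bins  # assumes hi > lo
    idx = np.floor((values - lo) / width).astype(int)
    # Clip so the maximum value lands in the last bin instead of overflowing.
    return np.clip(idx, 0, n_bins - 1)

def quantile_bins(values, n_bins):
    """Map each value to a bin whose edges are empirical quantiles."""
    values = np.asarray(values, dtype=float)
    edges = np.quantile(values, np.linspace(0, 1, n_bins + 1))
    # Use interior edges only; a value equal to an edge goes to the upper bin.
    return np.searchsorted(edges[1:-1], values, side="right")

def hour_bins(timestamps):
    """Bin timestamps by hour of day (0..23)."""
    ts = np.asarray(timestamps, dtype="datetime64[s]")
    since_midnight = ts - ts.astype("datetime64[D]")
    return (since_midnight // np.timedelta64(1, "h")).astype(int)
```

Uniform bins keep the axis easy to read, while quantile bins even out the number of records per bin for skewed data.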

Number of Bins?

100,000 Data Points: Rectangular Bins vs. Hexagonal Bins

Hexagonal or Rectangular Bins?

Hex bins better estimate density for 2D plots, but the improvement is marginal [Scott 92], while rectangles support reuse and query processing.

2. Aggregate: Count, Sum, Average, Min, Max, ...
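Once every record has a bin index, the aggregates listed above reduce to grouped reductions. The sketch below assumes integer bin indices in `[0, n_bins)` and uses a hypothetical helper name, `aggregate`.

```python
import numpy as np

def aggregate(bin_idx, values, n_bins):
    """Compute count, sum, mean, min and max of `values` within each bin."""
    bin_idx = np.asarray(bin_idx)
    values = np.asarray(values, dtype=float)
    count = np.bincount(bin_idx, minlength=n_bins)
    total = np.bincount(bin_idx, weights=values, minlength=n_bins)
    # The average is undefined for empty bins, so those stay NaN.
    mean = np.full(n_bins, np.nan)
    np.divide(total, count, out=mean, where=count > 0)
    # Unbuffered reductions; empty bins keep their +inf / -inf sentinels.
    lo = np.full(n_bins, np.inf)
    hi = np.full(n_bins, -np.inf)
    np.minimum.at(lo, bin_idx, values)
    np.maximum.at(hi, bin_idx, values)
    return {"count": count, "sum": total, "mean": mean, "min": lo, "max": hi}
```

The output size depends only on the number of bins, not on the number of records, which is the property the talk relies on.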

(3. Smooth Optional: smooth aggregates [Wickham ’13])

[1] Wickham 2013
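As a rough illustration of the optional smoothing step, the sketch below runs a small triangular kernel over 1-D bin counts. It is a stand-in chosen for brevity, not the smoothing method of [Wickham ’13]; `smooth_counts` and its `bandwidth` parameter are hypothetical.

```python
import numpy as np

def smooth_counts(counts, bandwidth=1):
    """Smooth 1-D bin counts with a normalized triangular kernel."""
    counts = np.asarray(counts, dtype=float)
    offsets = np.arange(-bandwidth, bandwidth + 1)
    kernel = (bandwidth + 1 - np.abs(offsets)).astype(float)
    kernel /= kernel.sum()
    # mode="same" keeps one output per bin; the implicit zero padding
    # means bins near the edges lose some mass.
    return np.convolve(counts, kernel, mode="same")
```

Because smoothing operates on the bin aggregates, its cost also scales with the number of bins rather than the number of records.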

4. Plot: Visualize the aggregate summary values

Plot: Visual Encoding

Choose Most Effective Encoding [Cleveland & McGill ’84]

1D Plot -> Position or Length Encoding
Histograms, line charts, etc.

2D Plot -> Area or Color Encoding
Spatial dimensions (x, y) already allocated.
While less effective than area for magnitude estimation, color can be used at the per-pixel level and provides an overall “gestalt”.

Standard Color Ramp
Counts near zero are white.

-> Outliers are missed

Add Discontinuity after Zero
Counts near zero remain visible.

-> Outliers can be seen

Linear Alpha Interpolation is not perceptually linear.

Cube-Root Alpha Interpolation approximates perceptual linearity.

Color Encoding

Luminance (in range 0-1)

Min. Non-Zero Intensity (α=0.15) [1]
Perceptual Scaling (γ=1/3) [2]

User-Adjustable Min/Max Values [3]

[1] Keep small non-zero values visible (outliers!)
[2] Match color ramp to perceptual distances
[3] Enable exploration across value ranges
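One way to combine the three ingredients above into a single intensity mapping is sketched below. The values α=0.15 and γ=1/3 come from the slide; how they are combined into one formula is an assumption, as are the names `intensity`, `vmin`, and `vmax`.

```python
import numpy as np

ALPHA_MIN = 0.15   # minimum non-zero intensity (from the slide)
GAMMA = 1.0 / 3.0  # cube-root perceptual scaling (from the slide)

def intensity(values, vmin=0.0, vmax=None):
    """Map aggregate values to an intensity in [0, 1].

    ASSUMPTION: intensity = ALPHA_MIN + (1 - ALPHA_MIN) * t**GAMMA for t > 0,
    where t is the value normalized to the adjustable [vmin, vmax] range.
    """
    values = np.asarray(values, dtype=float)
    vmax = values.max() if vmax is None else vmax
    t = np.clip((values - vmin) / (vmax - vmin), 0.0, 1.0)
    scaled = ALPHA_MIN + (1.0 - ALPHA_MIN) * t ** GAMMA
    # Values at or below vmin stay at zero, which creates the
    # discontinuity after zero that keeps small counts visible.
    return np.where(t > 0, scaled, 0.0)
```

With this mapping a bin holding a single record out of a maximum of 1,000 still renders at about 0.235 intensity instead of vanishing into the background.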

Design Space of Binned Plots

Interaction

Interaction Techniques?
1. Select: Detail-on-Demand
2. Navigate: Pan & Zoom
3. Query: Brush & Link

5-D Data Cube
Month, Day, Hour, X, Y

12 x 31 x 24 x 512 x 512 = ~2.3 billion cells

Brushing January
Month, Day, Hour, X, Y

31 x 24 x 512 x 512 = ~195 million cells
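The two cell counts can be checked directly from the per-dimension bin counts given on the slides. The variable names below are illustrative only.

```python
from math import prod

# Bins per dimension, as given on the slides.
dims = {"Month": 12, "Day": 31, "Hour": 24, "X": 512, "Y": 512}

# Full 5-D cube: one cell per combination of bins.
full_cube = prod(dims.values())

# Brushing a single month fixes the Month dimension, so only the
# remaining four dimensions contribute cells.
january = prod(n for name, n in dims.items() if name != "Month")

print(f"{full_cube:,}")  # 2,340,421,632
print(f"{january:,}")    # 195,035,136
```

Even after brushing one month, roughly 195 million cells remain to be summed, which is why the full cube is too large to query interactively.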

Multivariate Data Tiles
1. Send data, not pixels
2. Embed multi-dim data

Full 5-D Cube

For any pair of 1D or 2D binned plots, the maximum number of dimensions needed to support brushing & linking is four.

X : 512 bins

Full 5-D Cube: ~2.3B bins

13 3-D Data Tiles: ~17.6M bins (in 352KB!)
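The drop from ~2.3B to ~17.6M bins can be reproduced with a short calculation. The sketch assumes the linked views are one X/Y map plus three histograms (Month, Day, Hour), so each brushing pair needs at most three dimensions. The 256 x 256 spatial tile size used to arrive at 13 tiles is also an assumption; the slides do not state it.

```python
from math import prod

bins = {"Month": 12, "Day": 31, "Hour": 24, "X": 512, "Y": 512}

# ASSUMPTION: these four 3-D cubes cover every pair of linked views.
cubes = [
    ("X", "Y", "Month"),
    ("X", "Y", "Day"),
    ("X", "Y", "Hour"),
    ("Month", "Day", "Hour"),
]

def cells(cube):
    """Number of bins in a cube over the given dimensions."""
    return prod(bins[d] for d in cube)

decomposed = sum(cells(c) for c in cubes)
full = prod(bins.values())

# ASSUMPTION: the map is split into 256 x 256 spatial tiles.
TILE = 256
spatial_tiles = (bins["X"] // TILE) * (bins["Y"] // TILE)
tile_count = 3 * spatial_tiles + 1  # three map cubes plus one time cube

print(f"{decomposed:,} of {full:,} bins in {tile_count} tiles")
```

Under these assumptions the decomposition holds 17,572,576 bins, consistent with the ~17.6M on the slide, in 13 tiles.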

Query & Render on GPU via WebGL

Pack data tiles as PNG image files, bind to WebGL as image textures.
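To show what packing counts into image channels can look like, the sketch below splits 32-bit counts across the four 8-bit channels of an RGBA pixel and reverses the process. The slides do not state the channel layout imMens uses, so this layout and the names `pack_rgba` and `unpack_rgba` are assumptions. Writing the array to an actual PNG file is left out.

```python
import numpy as np

def pack_rgba(counts):
    """Pack uint32 counts into an (..., 4) uint8 array, one byte per channel."""
    counts = np.asarray(counts, dtype=np.uint32)
    # Least significant byte goes to the first channel.
    channels = [(counts >> shift) & 0xFF for shift in (0, 8, 16, 24)]
    return np.stack(channels, axis=-1).astype(np.uint8)

def unpack_rgba(rgba):
    """Recover uint32 counts from an (..., 4) uint8 array."""
    rgba = np.asarray(rgba, dtype=np.uint32)
    return (rgba[..., 0]
            | (rgba[..., 1] << 8)
            | (rgba[..., 2] << 16)
            | (rgba[..., 3] << 24))
```

PNG is lossless, so byte values packed this way would survive compression unchanged.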

Invoke program for each output bin. Executes in parallel on GPU.
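The per-output-bin computation can be mimicked on the CPU with NumPy. This is an analogy to the GPU program, not the WebGL shader itself. The sketch assumes a 3-D tile of counts laid out as (X, Y, T); the function names are hypothetical.

```python
import numpy as np

def brush_rollup(tile, brushed):
    """Roll a (X, Y, T) tile up to an (X, Y) map over the brushed T bins.

    Each output bin (x, y) sums the tile cells inside the brush; on the
    GPU that sum would run once per output bin, in parallel.
    """
    tile = np.asarray(tile)
    return tile[:, :, brushed].sum(axis=2)

def linked_histogram(tile, x_range, y_range):
    """Roll the same tile up to a T histogram for a brushed map rectangle."""
    tile = np.asarray(tile)
    x0, x1 = x_range
    y0, y1 = y_range
    return tile[x0:x1, y0:y1, :].sum(axis=(0, 1))
```

For example, brushing January on a Month histogram corresponds to `brush_rollup(tile, [0])` on the (X, Y, Month) tile, and brushing a map region updates the Month histogram via `linked_histogram`.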

Performance Benchmarks
Simulate interaction: brushing & linking across binned plots.

- imMens vs. Profiler
- 4x4 and 5x5 plots
- 10 to 50 bins

Measure time from selection to render.

Test setup:
2.3 GHz MacBook Pro (4-core)
NVIDIA GeForce GT 650M
Google Chrome v.23.0

~50fps querying of visual summaries of 1B data points.

Benchmark chart (5 dimensions x 50 bins/dim x 25 plots): In-Memory Data Cube vs. imMens, by Number of Data Points

[1] Lins et al. InfoVis 2013

[2] Sismanis et al. SIGMOD 2002

NanoCubes

Resources
imMens: vis.stanford.edu/projects/immens
Tableau Public: tableausoftware.com/public
BigVis (R): github.com/hadley/bigvis
Nanocubes: nanocubes.net
BlinkDB: blinkdb.org
MapD: geops.csail.mit.edu/docs/

Acknowledgments
Zhicheng “Leo” Liu
Biye Jiang
