A Web Language for Big Data Visualization. Leo Meyerovich, Matthew Torok, Eric Atkinson, Rastislav Bodik. Parallelism Lab, UC Berkeley. SUPERCONDUCTOR

Superconductor: A Language for Interactive Big Data Visualization


Download ("save") PPTX file to see animations. Presented at O'Reilly Fluent 2013 by Leo Meyerovich (UC Berkeley). See http://www.sc-lang.com for more.


Page 1: Superconductor: A Language for Interactive Big Data Visualization

1

A Web Language for Big Data Visualization

Leo Meyerovich, Matthew Torok, Eric Atkinson, Rastislav Bodik

Parallelism Lab, UC Berkeley

SUPERCONDUCTOR

Page 2: Superconductor: A Language for Interactive Big Data Visualization

2

“Well-designed graphics are usually the simplest”

Big Data is Different: going from Data Reporting to Knowledge Discovery

… small & static charts enough?

Page 3: Superconductor: A Language for Interactive Big Data Visualization

3

Ex: How to Report Voter Turnout

Swedes Like Voting

[Chart: # Votes (y-axis) against Voter Turnout from 0% to 100% (x-axis), with curves for Mexico, Sweden, and a Mystery Country]

Democracy? Bell Curve

Abnormal curve; can be voter fraud!

Page 4: Superconductor: A Language for Interactive Big Data Visualization

4

Precrafted message, not knowledge discovery!

demo: fraud analysis

Page 5: Superconductor: A Language for Interactive Big Data Visualization

5

Page 6: Superconductor: A Language for Interactive Big Data Visualization

6

Platform: OpenGL, JavaScript

Demands: Interactive, Scale, Customizable

Page 7: Superconductor: A Language for Interactive Big Data Visualization

7

Big Data Viz

Parallel JS

Parallel Framework

*Effective* Parallel JS

Page 8: Superconductor: A Language for Interactive Big Data Visualization

8

Platform: JavaScript is the New Assembly

parallel

multicore:

SIMD:

HTML5 Hardware Access

GPU:

Low-level, how to exploit?

Page 9: Superconductor: A Language for Interactive Big Data Visualization

9

Data Viz Parallel JS

Parallel Framework

*Effective* Parallel JS

Superconductor specializes for data visualization

Page 10: Superconductor: A Language for Interactive Big Data Visualization

10

Superconductor’s Domain Specific Languages

data

paint

layout

stylize

Parallel & High-Level Language for Each?

Page 11: Superconductor: A Language for Interactive Big Data Visualization

11

DSL 1: Data via JSON

JavaScript, Ruby, Python, Java, …

Easy… until 1-10s data loading

Page 12: Superconductor: A Language for Interactive Big Data Visualization

12

Parsing Demo

Page 13: Superconductor: A Language for Interactive Big Data Visualization

13

Optimizing JSON Parsing

server: raw.json: 23MB -> partition -> raw1.json (1.9MB), …, raw12.json -> compress + zip -> csr1.zip (0.2MB), …, csr12.zip

browser: parallel parse (one per worker) -> big JavaScript object

Each worker:
1. native JSON parse (csr.json)
2. decompress (obj.json)
3. 0-copy return: typed arrays!

Parallel parsing easy! … when you fix the format
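The per-worker steps above can be sketched roughly as follows. This is a minimal illustration, not Superconductor's actual parser: the `parseChunk` name and the `{ w: [...], h: [...] }` chunk layout are assumptions made for the example, and the decompression step is left out. The point is only that packing parsed values into typed arrays lets the worker hand the result back by ownership transfer instead of copying.

```javascript
// Hypothetical worker-side step: parse one JSON partition and pack it into
// typed arrays so the result can be transferred rather than copied.
// The { w: [...], h: [...] } chunk layout is an assumption for illustration.
function parseChunk(jsonText) {
  const obj = JSON.parse(jsonText);        // native JSON parse
  const w = Float32Array.from(obj.w);      // one typed array per attribute
  const h = Float32Array.from(obj.h);
  return {
    message: { w, h },
    transfer: [w.buffer, h.buffer],        // buffers to hand over by ownership
  };
}

// Inside a Web Worker this would be wired up roughly as:
//   self.onmessage = (e) => {
//     const { message, transfer } = parseChunk(e.data);
//     self.postMessage(message, transfer); // ownership transfer, no copy
//   };

const sample = parseChunk('{"w":[10,5],"h":[2,3]}');
console.log(sample.message.w.length, sample.transfer.length);
```

Listing the underlying `ArrayBuffer`s as the second argument of `postMessage` is what makes the return zero-copy; a plain nested object would be structured-cloned instead.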

Page 14: Superconductor: A Language for Interactive Big Data Visualization

14

DSL 2: Custom Layout/Rendering

offline: treemap.ftl -> Compiler -> Parallel code (WebCL + WebGL)

browser: tree: SC_DOM.js, LayoutEngine.js

Page 15: Superconductor: A Language for Interactive Big Data Visualization

15

Writing a Custom Layout: Super CSS!

class HBox : Node
  children:
    left : Node
    right : Node
  constraints:
    w := left.w + right.w

input: x, y
var: w, h

[Diagram: document tree with a Root, HBox nodes with left and right children, and Leaf nodes; each node carries x, y, w, h attributes, with example sizes 10px and 5px]

1. Local

2. Single-assignment

[Kastens 1980, Saraiva 2003] [WWW 2010, PPOPP 2013]

Page 16: Superconductor: A Language for Interactive Big Data Visualization

16

Compute: Layout as Tree Traversals

[Diagram: document tree down to Leaf nodes; one traversal computes w,h at every node, a later one computes x,y, …]

document tree: constraints on node attributes, e.g. h0 = max(h1, h2), w0 = f(w1, w2)

logical joins, logical spawns

Parallelism in each traversal!

1. Works for all data sets
2. Compiler automatically parallelizes!

[WWW 2010]
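As a rough illustration of layout as tree traversals, the sketch below runs one bottom-up pass for sizes and one top-down pass for positions. It is hand-written, not compiler output: the node shape `{ w, h, x, y, children }` is assumed, `f` is taken to be a sum (as in the HBox constraint), and leaves are assumed to carry their own `w` and `h`.

```javascript
// Hypothetical node shape: { w, h, x, y, children: [...] }.
// Pass 1 (bottom-up): a parent's size depends only on its children's sizes.
function computeSizes(node) {
  if (node.children.length === 0) return;   // leaves carry input w, h (assumption)
  node.children.forEach(computeSizes);      // logical spawn: siblings are independent
  // logical join: every child is finished before the parent reads it
  node.w = node.children.reduce((sum, c) => sum + c.w, 0);  // w0 = w1 + w2
  node.h = Math.max(...node.children.map((c) => c.h));      // h0 = max(h1, h2)
}

// Pass 2 (top-down): a child's position depends on its parent and earlier siblings.
function computePositions(node, x, y) {
  node.x = x;
  node.y = y;
  let cursor = x;
  for (const child of node.children) {
    computePositions(child, cursor, y);
    cursor += child.w;                      // children sit side by side, as in an HBox
  }
}

const leafA = { w: 10, h: 2, children: [] };
const leafB = { w: 5, h: 3, children: [] };
const hbox = { w: 0, h: 0, children: [leafA, leafB] };
computeSizes(hbox);
computePositions(hbox, 0, 0);
console.log(hbox.w, hbox.h, leafB.x);
```

Because each attribute is assigned once and reads only neighbouring nodes, sibling subtrees in either pass never touch each other's data, which is what leaves room for parallel execution.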

Page 17: Superconductor: A Language for Interactive Big Data Visualization

17

Layout DSL is Flexible!

[Gallery of example layouts, three labeled multicore and four labeled GPU]

Page 18: Superconductor: A Language for Interactive Big Data Visualization

18

Big Data Viz

Parallel JS

Parallel Framework

*Effective* Parallel JS

Page 19: Superconductor: A Language for Interactive Big Data Visualization

19

Animation & Interaction

Layout

Modification

Layout fast enough for real-time loop!

Page 20: Superconductor: A Language for Interactive Big Data Visualization

20

First Rule of GPU Club: Don’t Talk to the GPU

Budget: 30ms = 33fps

Maxed out by 300 small messages!
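The arithmetic behind this slide, worked out as a small sketch. The per-message cost is inferred from the two numbers on the slide rather than stated there, so treat it as an estimate.

```javascript
const budgetMs = 30;                        // per-frame budget from the slide
const framesPerSecond = 1000 / budgetMs;    // roughly 33 frames per second
const messages = 300;                       // small messages that exhaust the budget
const msPerMessage = budgetMs / messages;   // implied cost: about 0.1 ms per message
console.log(Math.floor(framesPerSecond), msPerMessage);
```

At roughly a tenth of a millisecond each, a few hundred individual reads or writes per frame leave no time for layout or rendering, which is why traffic to the GPU has to be kept to a minimum.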

Page 21: Superconductor: A Language for Interactive Big Data Visualization

21

Small Interactions: JavaScript Proxy

1. Small read/writes: JavaScript

var w = root.getWidth(); // sc.js proxies read from GPU

2. Animation: rerun layout!

root.setHeight(0.5); // sc.js proxies write to GPU
layout();
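One way to picture the proxy idea is the sketch below. It is not sc.js: `FakeGpu` and `NodeProxy` are hypothetical names, and plain typed arrays stand in for GPU buffers so the example runs anywhere. Only `getWidth` and `setHeight` come from the slide.

```javascript
// Stand-in for GPU memory: one typed array per attribute. This is an assumption
// for illustration; in a real system each read or write is a GPU transfer.
class FakeGpu {
  constructor(nodeCount) {
    this.w = new Float32Array(nodeCount);
    this.h = new Float32Array(nodeCount);
    this.messages = 0;                      // counts round trips to the device
  }
  read(attr, index) { this.messages++; return this[attr][index]; }
  write(attr, index, value) { this.messages++; this[attr][index] = value; }
}

// Hypothetical proxy: looks like an ordinary object, forwards to the device.
class NodeProxy {
  constructor(gpu, index) { this.gpu = gpu; this.index = index; }
  getWidth() { return this.gpu.read("w", this.index); }
  setHeight(value) { this.gpu.write("h", this.index, value); }
}

const gpu = new FakeGpu(4);
const root = new NodeProxy(gpu, 0);
root.setHeight(0.5);                        // one write message
const w = root.getWidth();                  // one read message
console.log(w, gpu.messages);
```

The message counter makes the cost visible: every proxied call is one round trip, so this style suits a handful of reads and writes per frame, not bulk updates.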

Page 22: Superconductor: A Language for Interactive Big Data Visualization

22

Bigger Interactions: CSS Selectors*

state precinct { height: 5 }

* buggy

selectors.js

myStylesheet.webCL

… tree traversal, same as layout!

[WWW 2010]
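To show why selector matching is a tree traversal like layout, here is a minimal sketch for a two-part descendant selector such as `state precinct { height: 5 }`. The node shape `{ tag, style, children }` and the function name are assumptions for the example; it handles only this one selector form and says nothing about how selectors.js works internally.

```javascript
// Hypothetical node shape: { tag, style: {}, children: [] }.
// One top-down traversal, carrying "have we passed a matching ancestor?" downward.
function applyDescendantRule(node, ancestorTag, targetTag, declarations, insideAncestor = false) {
  if (insideAncestor && node.tag === targetTag) {
    Object.assign(node.style, declarations);   // rule matches: copy its declarations
  }
  const nowInside = insideAncestor || node.tag === ancestorTag;
  for (const child of node.children) {
    // Each child needs only the flag from its parent, so siblings are independent.
    applyDescendantRule(child, ancestorTag, targetTag, declarations, nowInside);
  }
}

const precinct = { tag: "precinct", style: {}, children: [] };
const state = { tag: "state", style: {}, children: [precinct] };
applyDescendantRule(state, "state", "precinct", { height: 5 });
console.log(precinct.style.height);
```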

Page 23: Superconductor: A Language for Interactive Big Data Visualization

23

Layout on GPU: Must “Flatten” Tree

[Diagram: JSON Tree, level 1 through level n, flattened into arrays]

Nodes in arrays

Array per attribute: w, h, x, y

Benefits
1. Parallelism!
2. Data never leaves GPU

Superconductor does this for you.
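A minimal sketch of what flattening could look like: nodes are laid out in breadth-first order with one typed array per attribute, plus index arrays that record the tree shape. The function name, the input node shape `{ w, h, children }`, and the `parent` and `levelStart` arrays are assumptions for illustration, not Superconductor's actual memory layout.

```javascript
// Hypothetical input: nested nodes { w, h, children: [...] }.
// Output: nodes in breadth-first order, one typed array per attribute,
// plus index arrays describing the tree shape.
function flatten(rootNode) {
  const order = [rootNode];
  const parentOf = [-1];                    // the root has no parent
  const levelStart = [0];                   // index where each level begins
  let begin = 0;
  while (begin < order.length) {
    const end = order.length;               // nodes in [begin, end) form one level
    for (let i = begin; i < end; i++) {
      for (const child of order[i].children) {
        order.push(child);
        parentOf.push(i);
      }
    }
    begin = end;
    if (begin < order.length) levelStart.push(begin);
  }
  levelStart.push(order.length);            // sentinel: end of the last level
  return {
    w: Float32Array.from(order, (n) => n.w),
    h: Float32Array.from(order, (n) => n.h),
    parent: Int32Array.from(parentOf),
    levelStart: Int32Array.from(levelStart),
  };
}

const flat = flatten({ w: 0, h: 0, children: [
  { w: 10, h: 2, children: [] },
  { w: 5, h: 3, children: [{ w: 1, h: 1, children: [] }] },
] });
console.log(Array.from(flat.parent), Array.from(flat.levelStart));
```

Keeping each attribute in its own array means a pass that touches only widths reads one contiguous buffer, and the whole structure can be uploaded once and left on the device.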

Page 24: Superconductor: A Language for Interactive Big Data Visualization

24

How to Compute Layout on GPU: Level-synchronous Breadth-First

[Diagram: JSON Tree, level 1 through level n]

parallel for loop (level synchronous)

[Blelloch 93]
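A sequential sketch of a level-synchronous bottom-up pass over a flattened tree. The array names (`firstChild`, `childCount`, `levelStart`) are hypothetical, and the width rule (parent width is the sum of its children) is borrowed from the HBox example; on a GPU the inner loop would become one work-item per node.

```javascript
// Assumed flattened layout (hypothetical names): nodes in breadth-first order,
// so each level is a contiguous index range and so are each node's children.
function bottomUpWidths(tree) {
  const { w, firstChild, childCount, levelStart } = tree;
  // Levels run deepest-first, one after another; nodes within a level are independent.
  for (let level = levelStart.length - 2; level >= 0; level--) {
    // This inner loop is the "parallel for": on a GPU, one work-item per node i.
    for (let i = levelStart[level]; i < levelStart[level + 1]; i++) {
      if (childCount[i] === 0) continue;     // leaves keep their input width
      let sum = 0;
      for (let c = firstChild[i]; c < firstChild[i] + childCount[i]; c++) sum += w[c];
      w[i] = sum;                            // each node writes only its own slot
    }
  }
  return w;
}

const tree = {
  w: new Float32Array([0, 10, 0, 1, 2]),     // root, leaf, inner node, leaf, leaf
  firstChild: new Int32Array([1, -1, 3, -1, -1]),
  childCount: new Int32Array([2, 0, 2, 0, 0]),
  levelStart: new Int32Array([0, 1, 3, 5]),  // level boundaries plus end sentinel
};
console.log(Array.from(bottomUpWidths(tree)));
```

Having each parent gather from its children, instead of each child adding into its parent, avoids two nodes of the same level writing to the same slot.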

Page 25: Superconductor: A Language for Interactive Big Data Visualization

25

Problem: Layout->Rendering Buffer Allocation?

[Diagram: tree nodes issuing render calls: circ(…); … square(…) rect(…); line(…); … rect(…); … oval(…)]

function circ(x,y,w,h) {
  buffer = malloc(w*10);
  loop:
    buffer[i] = cos(i); …
} //alloc + tessellate + …

Dynamic allocation

Page 26: Superconductor: A Language for Interactive Big Data Visualization

26

Optimizing Buffer Allocation & Passing

allocCirc(…); …

allocRect(…); …

allocLine(…); …

allocRect(…); …

fillCirc(…); …

fillRect(…); …

fillLine(…); …

fillRect(…); …

1. Prefix sum for needed space

2. Allocate buffers

3. Fill vertex buffers in parallel

4. Give OpenGL buffers pointer
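The four steps above can be sketched as follows. The shape objects (`vertexCount`, `fill`) are hypothetical, and the prefix sum here is a plain sequential loop; a GPU implementation would use a parallel scan for step 1 and run step 3 with one work-item per shape.

```javascript
// Exclusive prefix sum: offsets[i] is the total space needed by shapes before i.
function exclusivePrefixSum(counts) {
  const offsets = new Uint32Array(counts.length);
  let total = 0;
  for (let i = 0; i < counts.length; i++) {
    offsets[i] = total;
    total += counts[i];
  }
  return { offsets, total };
}

// Hypothetical shape records: { vertexCount, fill(v) -> value for vertex v }.
function buildVertexBuffer(shapes) {
  // 1. Prefix sum over the space each shape needs.
  const { offsets, total } = exclusivePrefixSum(shapes.map((s) => s.vertexCount));
  // 2. One allocation for everything.
  const buffer = new Float32Array(total);
  // 3. Each shape fills its own disjoint slice, so the shapes could run in parallel.
  shapes.forEach((s, i) => {
    for (let v = 0; v < s.vertexCount; v++) buffer[offsets[i] + v] = s.fill(v);
  });
  // 4. The finished buffer is what would be handed to the graphics API.
  return { buffer, offsets };
}

const built = buildVertexBuffer([
  { vertexCount: 3, fill: () => 1 },
  { vertexCount: 2, fill: () => 2 },
]);
console.log(Array.from(built.offsets), Array.from(built.buffer));
```

Splitting each draw call into a size query and a fill is what removes the per-shape dynamic allocation: sizes are known up front, so no shape has to allocate while rendering.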

Page 27: Superconductor: A Language for Interactive Big Data Visualization

27

CPU vs. GPU for Election Treemap: 5 traversals over 100K nodes

[Bar chart, logarithmic axis: Time (ms) from 1 to 10,000 for layout (4 passes), rendering pass, and TOTAL; series: Naïve JS (Chrome 26), Arrays (Chrome 26), GPU (Safari + WebCL 11/3); reference line at 24fps]

Array-based: 14X speedup

WebCL: 31X

WebCL: 5X

COMBINED: 54X !

Page 28: Superconductor: A Language for Interactive Big Data Visualization

28

Multicore Parsing w/ Web Workers

[Bar chart: Time (ms) for three configurations: runtime flattening (BASELINE), + pre-processing, + parallelization; legend: ownership transfer (multicore msg copy); library init, GPU transfer; bar labels: 290ms, 600ms, 2.7s]

2012 MacBook Pro (2.6GHz quadcore i7 w/ 8GB)

Page 29: Superconductor: A Language for Interactive Big Data Visualization

29

Recap: Parallel Arch

webpage: HTML (data), CSS (styling), JS (script)

Pipeline: Parser -> Selectors -> Layout -> Renderer -> Pixels

JavaScript VM: Parser.js, superconductor.js

GPU: Selectors.CL, Layout.CL, Renderer.GL

[Diagram labels: data, styling, widgets, data viz, Compiler]

Data stays on GPU!

Page 30: Superconductor: A Language for Interactive Big Data Visualization

30

GE Demo

Page 31: Superconductor: A Language for Interactive Big Data Visualization

31

Data Viz Parallel JS

Parallel Framework

*Effective* Parallel JS