Techniques for Visualizing Massive Data Sets

Leilani Battle, Mike Stonebraker

Context

[Diagram: the Visualization System sends a query to the Database, and the Database returns a result.]

Problem
• Performance
  • Vis systems don’t scale well for big data
  • Or are turning into databases
• Over-plotting
  • Makes visualizations unreadable
  • Waste of time/resources

Solution: Resolution Reduction

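[Diagram: a Resolution Reduction Layer sits between the Visualization System and the Database. The Visualization System sends its query to the layer. The layer sends a query-plan query to the Database and receives a query-plan result, then issues a modified query. The Database returns a reduced result, which is passed back to the Visualization System.]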
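The idea behind the Resolution Reduction Layer can be sketched in a few lines. The slides do not say which reduction is applied, so this sketch assumes block averaging over a 2D array. The in-memory grid stands in for the database, and `estimate_cells` is a hypothetical stand-in for the size estimate a real query plan would supply. All function names are illustrative.

```python
from typing import List

Grid = List[List[float]]


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def estimate_cells(rows: int, cols: int) -> int:
    """Hypothetical stand-in for the result-size estimate from a query plan."""
    return rows * cols


def choose_block_size(rows: int, cols: int, max_cells: int) -> int:
    """Smallest block edge whose reduced grid has at most max_cells cells."""
    if max_cells < 1:
        raise ValueError("max_cells must be at least 1")
    block = 1
    while ceil_div(rows, block) * ceil_div(cols, block) > max_cells:
        block += 1
    return block


def reduce_by_average(grid: Grid, block: int) -> Grid:
    """Replace each block x block region with the mean of its cells."""
    rows, cols = len(grid), len(grid[0])
    reduced = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            cells = [
                grid[r][c]
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            ]
            out_row.append(sum(cells) / len(cells))
        reduced.append(out_row)
    return reduced


def resolution_reduction_layer(grid: Grid, max_cells: int) -> Grid:
    """Return the result as-is if it is small enough, else a reduced version."""
    rows, cols = len(grid), len(grid[0])
    if estimate_cells(rows, cols) <= max_cells:
        return grid  # small enough: pass the original result through
    block = choose_block_size(rows, cols, max_cells)
    return reduce_by_average(grid, block)
```

In a real deployment the reduction would be pushed into the modified query so that the database never ships the full result. Here it runs client-side only to keep the sketch self-contained.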

ScalaR
• Scalable vis system for data exploration
• Web front-end
• Uses SciDB (www.scidb.org)
• Visualizes query results
• Performs Resolution Reduction

Demo of ScalaR

Array Browser
• Collaboration with:
  • Brown: Justin DeBrabant, Stan Zdonik, Ugur Cetintemel
  • Stanford: Zhicheng Liu, Jeff Heer
• Google Maps-style exploration experience
• Fetches subsets of the data (aka data tiles)
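As a rough illustration of tile-based fetching, the sketch below maps a viewport to the tiles that overlap it. The `(zoom, row, col)` key layout, the 256-cell tile edge, and the function name are all assumptions for illustration; the slides do not describe how Array Browser addresses its tiles.

```python
from typing import List, Tuple

TileKey = Tuple[int, int, int]  # (zoom, tile_row, tile_col), a hypothetical layout


def tiles_for_viewport(
    zoom: int, x0: int, y0: int, x1: int, y1: int, tile_size: int = 256
) -> List[TileKey]:
    """Keys of the tiles overlapping the half-open viewport [x0, x1) x [y0, y1).

    Coordinates are non-negative integer cell positions at the given zoom level.
    """
    if x1 <= x0 or y1 <= y0:
        return []  # empty viewport: nothing to fetch
    first_col, last_col = x0 // tile_size, (x1 - 1) // tile_size
    first_row, last_row = y0 // tile_size, (y1 - 1) // tile_size
    return [
        (zoom, row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

Panning or zooming then only changes the set of keys, so the front-end needs to request just the tiles it does not already hold.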

Array Browser Example

Array Browser Architecture

Demo of Array Browser

Future Work: Prefetching
• Goal: Reduce user-wait time by prefetching tiles
• Cache tiles in the tile buffer
• Need algorithms to decide what to pre-fetch
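A tile buffer with prefetching could look roughly like the following. The eviction policy is not specified in the slides, so this sketch assumes least-recently-used (LRU) eviction purely as a placeholder. The class name, the `fetch` callback, and the hit/miss counters are hypothetical.

```python
from collections import OrderedDict
from typing import Callable, Hashable, Iterable


class TileBuffer:
    """Fixed-capacity tile cache with LRU eviction (an assumed policy)."""

    def __init__(self, capacity: int, fetch: Callable[[Hashable], object]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._fetch = fetch  # called on a miss, e.g. a database query
        self._tiles: OrderedDict = OrderedDict()
        self.hits = 0
        self.misses = 0

    def __contains__(self, key: Hashable) -> bool:
        return key in self._tiles

    def get(self, key: Hashable) -> object:
        """Return a tile, fetching it only if it is not already buffered."""
        if key in self._tiles:
            self._tiles.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._tiles[key]
        self.misses += 1
        tile = self._fetch(key)
        self._store(key, tile)
        return tile

    def prefetch(self, keys: Iterable[Hashable]) -> None:
        """Load predicted tiles ahead of time without counting hits or misses."""
        for key in keys:
            if key not in self._tiles:
                self._store(key, self._fetch(key))

    def _store(self, key: Hashable, tile: object) -> None:
        self._tiles[key] = tile
        self._tiles.move_to_end(key)
        while len(self._tiles) > self._capacity:
            self._tiles.popitem(last=False)  # evict the least recently used tile
```

A request that lands on a prefetched tile is served without calling `fetch`, which is where the reduction in user-wait time would come from.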

User Behavior Predictor (Seer)

• Learn common query sequences from user traces
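One simple way to learn common sequences from traces is a first-order Markov model over user moves. This is an assumed technique for illustration, not necessarily how Seer works. The class name and the move labels such as `pan_right` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


class MovePredictor:
    """Counts which move tends to follow which in recorded user traces."""

    def __init__(self) -> None:
        self._next: Dict[str, Counter] = defaultdict(Counter)

    def train(self, trace: Sequence[str]) -> None:
        """Record every consecutive (move, next move) pair in one trace."""
        for current, following in zip(trace, trace[1:]):
            self._next[current][following] += 1

    def predict(self, last_move: str, k: int = 1) -> List[str]:
        """Return up to k moves most often seen after last_move."""
        counts = self._next.get(last_move)
        if not counts:
            return []  # this move never appeared in the training traces
        return [move for move, _ in counts.most_common(k)]
```

Each predicted move can be translated into the tile it would reveal, giving the prefetcher a candidate to load before the user asks for it.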

Statistical Analysis Predictor

• Look for statistical similarities in tiles
• Try to guess what’s important based on patterns
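As one possible reading of "statistical similarities", the sketch below ranks candidate tiles by how closely their value histograms match the tile the user is currently viewing. The choice of histograms, the L1 distance, and all names here are assumptions; the slides do not name a specific statistic.

```python
from typing import Dict, List, Sequence


def histogram(values: Sequence[float], bins: int, lo: float, hi: float) -> List[float]:
    """Normalized histogram of values over [lo, hi], clamping outliers."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        index = int((v - lo) / width)
        counts[max(0, min(index, bins - 1))] += 1
    total = len(values)
    if total == 0:
        return [0.0] * bins
    return [c / total for c in counts]


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of absolute differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(p, q))


def rank_by_similarity(
    current: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    bins: int = 4,
    lo: float = 0.0,
    hi: float = 1.0,
) -> List[str]:
    """Candidate tile names, most similar to the current tile first."""
    reference = histogram(current, bins, lo, hi)
    return sorted(
        candidates,
        key=lambda name: l1_distance(
            reference, histogram(candidates[name], bins, lo, hi)
        ),
    )
```

The tiles at the front of the ranking would be the first ones handed to the prefetcher.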

Using Multiple Predictors
• Run multiple predictors (or experts) in parallel
• Compare predictions to user’s actual behavior
• Use predictions from best performing expert
  • May change over time based on user’s goals
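The expert-selection loop described above can be sketched as follows. The exponentially decayed accuracy score is an assumption, chosen so that the preferred expert can change as the user's goals shift. The class name and the example experts are hypothetical.

```python
from typing import Callable, Dict, Hashable, Sequence

Expert = Callable[[Sequence[Hashable]], Hashable]


class ExpertSelector:
    """Runs every expert, then trusts whichever has been most accurate lately."""

    def __init__(self, experts: Dict[str, Expert], decay: float = 0.9) -> None:
        self._experts = dict(experts)
        self._decay = decay  # closer to 1.0 means a longer memory
        self._scores = {name: 0.0 for name in self._experts}
        self._pending: Dict[str, Hashable] = {}

    def best_expert(self) -> str:
        """Name of the expert with the highest recent accuracy score."""
        return max(self._scores, key=self._scores.get)

    def predict(self, history: Sequence[Hashable]) -> Hashable:
        """Collect a prediction from every expert and return the best one's."""
        self._pending = {
            name: expert(history) for name, expert in self._experts.items()
        }
        return self._pending[self.best_expert()]

    def observe(self, actual: Hashable) -> None:
        """Score each expert's last prediction against what the user did."""
        for name, guess in self._pending.items():
            hit = 1.0 if guess == actual else 0.0
            self._scores[name] = (
                self._decay * self._scores[name] + (1.0 - self._decay) * hit
            )
        self._pending = {}
```

For example, with one expert that always predicts a move to the right and another that always predicts a move to the left, the selector switches to the second as soon as the user starts moving left.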

Other Challenges
• Lots of interesting problems left to address
  • Best eviction policy for the tile buffer?
  • How to share data between multiple users?
  • More predictors?

Questions?

Prefetching Experts
• User behavior predictor (Seer)
  • Learn common query sequences from user traces
• Stats analysis predictor
  • Look for statistical similarities in tiles
  • Try to guess what’s important based on patterns