TECHNIQUES FOR VISUALIZING MASSIVE DATA SETS Leilani Battle, Mike Stonebraker


Page 1: Techniques for Visualizing Massive Data Sets

TECHNIQUES FOR VISUALIZING MASSIVE DATA SETS
Leilani Battle, Mike Stonebraker

Page 2: Techniques for Visualizing Massive Data Sets

Context

[Diagram: the Visualization System sends a query to the Database and receives a result.]

Page 3: Techniques for Visualizing Massive Data Sets

Problem

• Performance
  • Vis systems don’t scale well for big data
  • Or are turning into databases
• Over-plotting
  • Makes visualizations unreadable
  • Waste of time/resources

Page 4: Techniques for Visualizing Massive Data Sets

Solution: Resolution Reduction

[Diagram: a Resolution Reduction Layer sits between the Visualization System and the Database. Arrow labels: query, query-plan query, query-plan result, modified query, reduced result.]
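The decision the reduction layer makes can be pictured with a minimal Python sketch. It assumes the layer knows roughly how many cells a query would return (the role the query-plan step appears to play in the diagram) and that averaging over square blocks is an acceptable reduction. The function names, the in-memory grid, and the `max_cells` threshold are illustrative assumptions, not ScalaR's actual implementation.

```python
import math

def choose_block_size(estimated_cells, max_cells, ndims=2):
    """Pick a per-dimension block edge that brings the result to roughly max_cells."""
    if estimated_cells <= max_cells:
        return 1  # small enough: pass the query through unchanged
    return math.ceil((estimated_cells / max_cells) ** (1.0 / ndims))

def aggregate_blocks(grid, block):
    """Average a non-empty 2-D list of numbers over block x block windows."""
    rows, cols = len(grid), len(grid[0])
    reduced = []
    for r in range(0, rows, block):
        out_row = []
        for c in range(0, cols, block):
            cells = [grid[i][j]
                     for i in range(r, min(r + block, rows))
                     for j in range(c, min(c + block, cols))]
            out_row.append(sum(cells) / len(cells))
        reduced.append(out_row)
    return reduced

def reduce_result(grid, max_cells):
    """Return the grid unchanged if it is small, otherwise a block-averaged version."""
    estimated = len(grid) * len(grid[0])  # stand-in for a query-plan size estimate
    block = choose_block_size(estimated, max_cells)
    return grid if block == 1 else aggregate_blocks(grid, block)
```

In a real deployment the averaging would be pushed into the database as part of the modified query rather than done on data already fetched; the sketch only shows how a size estimate can drive the choice of reduction.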

Page 5: Techniques for Visualizing Massive Data Sets

ScalaR

• Scalable vis system for data exploration
  • Web front-end
  • Uses SciDB (www.scidb.org)
• Visualizes query results
• Performs Resolution Reduction

Page 6: Techniques for Visualizing Massive Data Sets

Demo of ScalaR

Page 7: Techniques for Visualizing Massive Data Sets

Array Browser

• Collaboration with:
  • Brown: Justin DeBrabant, Stan Zdonik, Ugur Cetintemel
  • Stanford: Zhicheng Liu, Jeff Heer
• Google Maps-style exploration experience
• Fetches subsets of the data (aka data tiles)
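To make the idea of data tiles concrete, here is a small Python sketch of how a viewport could be mapped to the tiles that cover it. It assumes integer array coordinates, fixed-size square tiles, and a `(zoom, col, row)` key; all three are assumptions for illustration, since the slides do not specify the Array Browser's tiling scheme.

```python
def tiles_for_viewport(zoom, x0, y0, x1, y1, tile_size):
    """Return (zoom, col, row) keys for every tile overlapping the
    half-open viewport [x0, x1) x [y0, y1), in row-major order."""
    first_col, last_col = x0 // tile_size, (x1 - 1) // tile_size
    first_row, last_row = y0 // tile_size, (y1 - 1) // tile_size
    return [(zoom, col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

A viewport that straddles a tile boundary yields more than one key, and each key can be fetched (or looked up in a cache) independently.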

Page 8: Techniques for Visualizing Massive Data Sets

Array Browser Example

Page 9: Techniques for Visualizing Massive Data Sets

Array Browser Architecture

Page 10: Techniques for Visualizing Massive Data Sets

Demo of Array Browser

Page 11: Techniques for Visualizing Massive Data Sets

Future Work: Prefetching

• Goal: Reduce user-wait time by prefetching tiles
• Cache tiles in the tile buffer
• Need algorithms to decide what to pre-fetch
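A tile buffer with prefetching might look like the following Python sketch. The `TileBuffer` class, its `fetch` callback, and the least-recently-used eviction rule are all assumptions made for illustration; the slides leave the eviction policy as an open question.

```python
from collections import OrderedDict

class TileBuffer:
    """Fixed-capacity tile cache with LRU eviction (an assumed policy)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch          # callable: tile key -> tile data (e.g. a database query)
        self.tiles = OrderedDict()  # least recently used entry first

    def _store(self, key, data):
        self.tiles[key] = data
        self.tiles.move_to_end(key)
        while len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)  # evict the least recently used tile

    def get(self, key):
        """Return (data, hit); on a miss the tile is fetched and cached."""
        if key in self.tiles:
            self.tiles.move_to_end(key)
            return self.tiles[key], True
        data = self.fetch(key)
        self._store(key, data)
        return data, False

    def prefetch(self, keys):
        """Fetch predicted tiles ahead of time so a later get() is a hit."""
        for key in keys:
            if key not in self.tiles:
                self._store(key, self.fetch(key))
```

The user only waits when `get` reports a miss, so the quality of the keys passed to `prefetch` determines how much wait time is saved.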

Page 12: Techniques for Visualizing Massive Data Sets

User Behavior Predictor (Seer)

• Learn common query sequences from user traces
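One simple way to learn common sequences from traces is a first-order Markov model over user moves, sketched below in Python. The choice of model, the move names, and the function names are assumptions for illustration; the slide does not say how Seer represents or learns sequences.

```python
from collections import Counter, defaultdict

def learn_transitions(traces):
    """Count how often each move is followed by each other move."""
    counts = defaultdict(Counter)
    for trace in traces:
        for prev, nxt in zip(trace, trace[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, last_move, k=1):
    """Return up to k moves most often seen after last_move (empty if unseen)."""
    followers = counts.get(last_move)
    if not followers:
        return []
    return [move for move, _ in followers.most_common(k)]
```

Given traces such as `["zoom_in", "pan_right", "pan_right"]`, the predicted next move can be translated into the tile keys that move would reveal and handed to the prefetcher.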

Page 13: Techniques for Visualizing Massive Data Sets

Statistical Analysis Predictor

• Look for statistical similarities in tiles
• Try to guess what’s important based on patterns
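As a rough Python illustration of "statistical similarity", candidate tiles can be ranked by how closely their value histograms match the tile the user is currently viewing. Using normalized histograms with an L1 distance is an assumption chosen for simplicity; the slide does not name a particular statistic. The sketch also assumes every tile is a non-empty list of numbers.

```python
def histogram(values, bins, lo, hi):
    """Normalized histogram of a non-empty list; out-of-range values are clamped."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    return [c / len(values) for c in counts]

def rank_by_similarity(current, candidates, bins=4, lo=0.0, hi=1.0):
    """Return candidate tile keys ordered from most to least similar to `current`.

    `candidates` maps a tile key to that tile's list of values.
    """
    ref = histogram(current, bins, lo, hi)

    def distance(key):
        h = histogram(candidates[key], bins, lo, hi)
        return sum(abs(a - b) for a, b in zip(ref, h))

    return sorted(candidates, key=distance)
```

The top-ranked keys are the tiles this predictor would suggest prefetching, on the guess that a user studying one pattern will want to see similar data next.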

Page 14: Techniques for Visualizing Massive Data Sets

Using Multiple Predictors

• Run multiple predictors (or experts) in parallel
• Compare predictions to user’s actual behavior
• Use predictions from best performing expert
  • May change over time based on user’s goals
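The three steps above can be sketched in Python as a small pool that scores each expert against what the user actually did. The `ExpertPool` class and the plain hit-count scoring are assumptions for illustration, not the authors' design.

```python
class ExpertPool:
    """Run several predictors and follow whichever has been right most often."""

    def __init__(self, experts):
        self.experts = experts                      # name -> callable(history) -> prediction
        self.hits = {name: 0 for name in experts}   # correct predictions per expert
        self.pending = {}                           # latest prediction from each expert

    def predict(self, history):
        """Ask every expert, then return the current best expert's prediction."""
        self.pending = {name: fn(history) for name, fn in self.experts.items()}
        best = max(self.hits, key=self.hits.get)    # ties go to the first expert listed
        return self.pending[best]

    def observe(self, actual):
        """Credit every expert whose latest prediction matched the user's action."""
        for name, guess in self.pending.items():
            if guess == actual:
                self.hits[name] += 1
        self.pending = {}
```

Because the best expert may change as the user's goals change, a practical version would likely weight recent hits more heavily, for example by decaying old counts; the plain counter here never forgets.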

Page 15: Techniques for Visualizing Massive Data Sets

Other Challenges

• Lots of interesting problems left to address
  • Best eviction policy for the tile buffer?
  • How to share data between multiple users?
  • More predictors?

Page 16: Techniques for Visualizing Massive Data Sets

Questions?

Page 17: Techniques for Visualizing Massive Data Sets
Page 18: Techniques for Visualizing Massive Data Sets

Gemini Sagittarius

Dogs Cats

Page 19: Techniques for Visualizing Massive Data Sets
Page 20: Techniques for Visualizing Massive Data Sets

Prefetching Experts

• User behavior predictor (Seer)
  • Learn common query sequences from user traces
• Stats analysis predictor
  • Look for statistical similarities in tiles
  • Try to guess what’s important based on patterns