Interactive Query Processing in Scientific Applications
David Liu
UC Berkeley
Computer Science Division
Problem: Data management in Scientific Experiments
Big science is about managing big data
Scientific Applications
  lots of storage
  distributed architectures
  lots of computation
  Embarrassingly parallel
  Embarrassingly slow
Interactive Query Processing + $$$
Can locality and random sampling go together?
[Figure: three Query Processor configurations, by panel label]
  1. data, Query Processor, interactive random samples
  2. data, Query Processor, cache, batch
  3. data, Query Processor, cache, interactive random samples
Fine-grain computation scheduling
Three queries active in the system
Process A first, then B
  Q1: G, B, A
  Q2: D, B, A
  Q3: E, C, A
Fine-grain computation scheduling
All queries partially satisfied
Q1 aborted at 66% complete
  Q1: G, B, A
  Q2: D, B, A
  Q3: E, C, A
Fine-grain computation scheduling
D, E, and C each satisfy one query apiece; which one to go for?
E or C, because either one improves the more neglected query (Q3, which has only A so far)
  Q2: D, B, A
  Q3: E, C, A
Fine-grain scheduling
Use fine-grain scheduling to improve overall mirth
  Schedule more popular tuples first
  Schedule neglected queries first
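The two rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual scheduler: the names `next_item` and `schedule` are hypothetical, each query is modeled as a set of item labels, "popularity" is taken to mean the number of active queries still waiting on an item, and "neglect" is taken to mean the lowest completed fraction. Applying popularity first and neglect as the tie-breaker is also an assumption; the slides do not say how the two rules are combined.

```python
from typing import Dict, List, Optional, Set


def next_item(queries: Dict[str, Set[str]], done: Set[str]) -> Optional[str]:
    """Pick the next data item to process, or None if nothing is pending.

    Assumed policy: most popular item first; among equally popular items,
    prefer one needed by the most neglected query (lowest completed fraction).
    Remaining ties are broken alphabetically to keep the result deterministic.
    """
    def progress(name: str) -> float:
        items = queries[name]
        return len(items & done) / len(items)

    # Map each unprocessed item to the active queries still waiting on it.
    waiting: Dict[str, List[str]] = {}
    for name, items in queries.items():
        for item in items - done:
            waiting.setdefault(item, []).append(name)
    if not waiting:
        return None

    def rank(item: str):
        names = waiting[item]
        return (-len(names), min(progress(n) for n in names), item)

    return min(waiting, key=rank)


def schedule(queries: Dict[str, Set[str]]) -> List[str]:
    """Return the full processing order for a fixed set of active queries."""
    done: Set[str] = set()
    order: List[str] = []
    while (item := next_item(queries, done)) is not None:
        order.append(item)
        done.add(item)
    return order


# The three queries from the slides.
active = {
    "Q1": {"A", "B", "G"},
    "Q2": {"A", "B", "D"},
    "Q3": {"A", "C", "E"},
}
order = schedule(active)

# After A and B are processed and Q1 is aborted, only Q2 and Q3 remain.
after_abort = next_item({"Q2": active["Q2"], "Q3": active["Q3"]}, {"A", "B"})
```

With the slide's example, the sketch processes A first (needed by all three queries), then B (needed by two). Once Q1 is aborted, it picks an item belonging to Q3, which agrees with the slide's "E or C" answer.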
Vision: A Flexible Querying System
Flexible Querying: sacrifice fidelity for performance
Ways to sacrifice fidelity:
  partial answer
  approximate answer
  alternative answer
Another way to improve perceived performance: interactive results
A Flexible Querying System
[Diagram labels: Query Processor; User [Agent]; Costs, Utilities; Partial, Approximate, Alternative Answers; Preferences; Cache; Flexibility Rules; Data]