Exploiting Type and Space in a Main Memory Query Engine
Thomas Schwarz, Matthias Grossmann, Daniela Nicklas, Bernhard Mitschang
Universität Stuttgart, Institute of Parallel and Distributed Systems
University of Stuttgart, Center of Excellence 627
Outline
Motivation and Scenarios
Index Structures
Related Work
Experiments
Conclusion
City-Guide Scenario
Query: How do I get to the closest hotel?
[Figure labels: Hotel, Youth-Hostel, Museum]
Typical Data and Type Hierarchy
[Figure: typical data, shown as numbered objects]
Typical type hierarchy (node labels: type name and ID)
  Root 0
  Building 1: Hotel 2, Museum 3, Restaurant 4
  Road 5: LocalRoad 6, MainRoad 7
  YouthHostel 8
Type Hierarchy of TIGER/Line Data Sets
CFCC   Type ID   Description
       0         the root type
A      1         Road
A1     4         Primary Highway With Limited Access
A11    5         Primary ..., unseparated
A15    9         Primary ..., separated
A19    13        Prim.., bridge
A4     34        Local, Neighborhood, and Rural Road
B      63        Railroad
B1     66        Railroad Main Line
B12    68        ..., in tunnel
D      97        Landmark
D4     122       Educational or religious Institution
D5     128       Transportation Terminal
D51    130       Airport or airfield
D52    131       Train station
D53    132       Bus terminal
H      213       Hydrography
258 types
Typical Queries
Typical queries ask for
  Gas stations next to the planned route
  Nearest base stations for wireless internet
  Sights / landmarks / buildings in a given area
  All roads / only major roads in a given area
Disjunctive queries
  Restrict type of queried objects
  Restrict location of queried objects
Exploit these characteristics for speedup
  Leverage a dedicated index structure
  Combine both primary access paths
System Architecture
[Figure: system architecture with Data Providers, a Discovery Service, Integration Middleware, and Mobile Devices running Applications; label: main memory query engine]
Summary of the Requirements
Simple query capabilities suffice
Combine Type and Space
Cope with different workloads
Fast response times
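Separate Indexes
Type index: array over the type IDs, each entry pointing to a separate list of objects
  (Root) 0, (Building) 1, (Hotel) 2, (Museum) 3, (Restaurant) 4, (Road) 5, (Local road) 6, (Main road) 7
Spatial index (Quadtree)
Cost-based optimizer chooses one of the two indexes to produce the candidates
Candidates are filtered with the remaining predicate (type predicate or spatial predicate) to give the final result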
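The separate-indexes approach can be sketched as follows. This is a minimal illustration, not the engine from the talk: the names `Obj`, `GridIndex`, and `SeparateIndexes` are hypothetical, a uniform grid stands in for the quadtree, and the cost model (number of candidates each index would return) is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    name: str
    type_id: int
    x: float
    y: float


class GridIndex:
    """Uniform grid, used here as a simple stand-in for the quadtree."""

    def __init__(self, cell=1.0):
        self.cell = cell
        self.cells = defaultdict(list)

    def _key(self, x, y):
        return (int(x // self.cell), int(y // self.cell))

    def insert(self, obj):
        self.cells[self._key(obj.x, obj.y)].append(obj)

    def _cells_in(self, box):
        x0, y0, x1, y1 = box
        kx0, ky0 = self._key(x0, y0)
        kx1, ky1 = self._key(x1, y1)
        for kx in range(kx0, kx1 + 1):
            for ky in range(ky0, ky1 + 1):
                yield self.cells.get((kx, ky), [])

    def estimate(self, box):
        # Number of candidates the spatial index would hand back.
        return sum(len(cell) for cell in self._cells_in(box))

    def candidates(self, box):
        for cell in self._cells_in(box):
            yield from cell


class SeparateIndexes:
    def __init__(self, cell=1.0):
        self.by_type = defaultdict(list)  # one list of objects per type ID
        self.spatial = GridIndex(cell)

    def insert(self, obj):
        self.by_type[obj.type_id].append(obj)
        self.spatial.insert(obj)

    def query(self, type_ids, box):
        type_ids = list(type_ids)
        x0, y0, x1, y1 = box

        def in_box(o):
            return x0 <= o.x <= x1 and y0 <= o.y <= y1

        # Assumed cost model: pick the index that yields fewer candidates.
        type_cost = sum(len(self.by_type.get(t, [])) for t in type_ids)
        space_cost = self.spatial.estimate(box)
        if type_cost <= space_cost:
            # Candidates from the type lists, filtered by the spatial predicate.
            hits = [o for t in type_ids for o in self.by_type.get(t, []) if in_box(o)]
            return "type", hits
        # Candidates from the spatial index, filtered by the type predicate.
        wanted = set(type_ids)
        hits = [o for o in self.spatial.candidates(box)
                if o.type_id in wanted and in_box(o)]
        return "spatial", hits


idx = SeparateIndexes()
idx.insert(Obj("Grand Hotel", 2, 1.5, 1.5))
idx.insert(Obj("City Museum", 3, 1.2, 1.8))
idx.insert(Obj("Far Hotel", 2, 9.0, 9.0))
for i in range(5):
    idx.insert(Obj(f"Main road {i}", 7, 1.1 + 0.1 * i, 1.1))

# Building + all subtypes (type IDs 1..4) inside a small area.
chosen, hits = idx.query(range(1, 5), (1.0, 1.0, 2.0, 2.0))
```

In this toy data set the type lists for IDs 1..4 hold fewer objects than the grid cells covering the query area, so the sketch picks the type index and then applies the spatial predicate.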
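Real 3D Index
[Figure: objects in a three-dimensional space spanned by the two spatial dimensions and the type dimension; a query for Building + all subtypes covers the type IDs 1..4 in the type dimension]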
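To make the idea of a three-dimensional query concrete, the sketch below treats the type ID as a third coordinate and expresses a query as one box over x, y, and type. The names `Point3D` and `box_query` are hypothetical, and a linear scan stands in for the actual 3D index structure, which the slides do not specify in detail.

```python
from typing import NamedTuple


class Point3D(NamedTuple):
    x: float
    y: float
    t: int      # type ID, used as the third coordinate
    name: str


def box_query(points, x_range, y_range, t_range):
    """Return all points inside the 3D box (bounds are inclusive)."""
    (x0, x1), (y0, y1), (t0, t1) = x_range, y_range, t_range
    return [p for p in points
            if x0 <= p.x <= x1 and y0 <= p.y <= y1 and t0 <= p.t <= t1]


points = [
    Point3D(1.5, 1.5, 2, "Grand Hotel"),
    Point3D(1.2, 1.8, 3, "City Museum"),
    Point3D(1.1, 1.1, 7, "Main Street"),
    Point3D(9.0, 9.0, 2, "Far Hotel"),
]

# Building + all subtypes corresponds to the type interval 1..4.
hits = box_query(points, (1.0, 2.0), (1.0, 2.0), (1, 4))
```

A real index would prune whole subtrees whose bounding boxes miss the query box instead of testing every point.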
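Traversing the Index
[Figure, repeated over several slides: traversal of the index for a query region, drawn over the spatial dimension (x-axis) and the type dimension (y-axis)]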
Type Hierarchy Linearization
Treat type information like a spatial dimension
Type hierarchy (type name and ID)
  Root 0
  Building 1: Hotel 2, Museum 3, Restaurant 4
  Road 5: LocalRoad 6, MainRoad 7
Type dimension: Building + all subtypes form one contiguous range of type IDs (1..4)
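One way to obtain such a linearization is a pre-order numbering of the hierarchy, which reproduces the IDs shown for the example types. The function name `linearize` and the dictionary layout are assumptions made for this sketch; the slides do not state how the IDs are assigned.

```python
def linearize(tree, root):
    """Assign pre-order IDs and return, per type, the interval
    (own ID, ID of its last subtype)."""
    first, last = {}, {}
    counter = 0

    def visit(node):
        nonlocal counter
        first[node] = counter
        counter += 1
        for child in tree.get(node, []):
            visit(child)
        # Every ID handed out since entering this node belongs to its subtree.
        last[node] = counter - 1

    visit(root)
    return {node: (first[node], last[node]) for node in first}


tree = {
    "Root": ["Building", "Road"],
    "Building": ["Hotel", "Museum", "Restaurant"],
    "Road": ["LocalRoad", "MainRoad"],
}
ranges = linearize(tree, "Root")
```

With this numbering a query for a type and all its subtypes becomes a single interval in the type dimension, so it can be answered like a range query on a spatial axis.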
Effects of the Spacing in the Type Dimension
[Figure: two index trees drawn over the spatial dimension (x-axis) and the type dimension (y-axis), showing objects and inner nodes of the index tree]
Wide spacing between mapped values: objects are primarily grouped by their type
Narrow spacing between mapped values: objects are primarily grouped by their position
Affects clustering of objects
Determine best type mapping range
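Type Mapping Variant: Equal Spread (ES)
Example hierarchy: type 0 (root) with subtypes 1, 5, 8; type 1 with subtypes 2, 3, 4; type 5 with subtypes 6, 7; type 8 with subtype 9
Type ID        0    1    2    3    4    5    6    7    8    9
Mapped value   0  100  200  300  400  500  600  700  800  900
Type mapping range: 0 to 1000
[Figure: the range containing all subtypes of a type is marked on the mapped-value axis]
Same gap between all mapped values
The simplest variant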
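The Equal Spread values in the example (ten types, mapped to 0, 100, ..., 900 within a range of 1000) are consistent with dividing the type mapping range evenly by the number of types. The sketch below encodes that reading; the function name `equal_spread` and the rounding to integers are assumptions.

```python
def equal_spread(num_types, mapping_range):
    """Map linearized type IDs 0..num_types-1 to evenly spaced values."""
    gap = mapping_range / num_types
    return [round(type_id * gap) for type_id in range(num_types)]


mapped = equal_spread(10, 1000)

# Interval covering type 1 and all its subtypes (IDs 1..4 in the example).
building_interval = (mapped[1], mapped[4])
```

A larger `mapping_range` stretches the type axis relative to the spatial axes, which is the knob that the spacing discussion refers to.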
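Type Mapping Variant: Type Hierarchy (TH)
Example hierarchy: type 0 (root) with subtypes 1, 5, 8; type 1 with subtypes 2, 3, 4; type 5 with subtypes 6, 7; type 8 with subtype 9
Type ID        0    1    2    3    4    5    6    7    8    9
Mapped value   0  250  313  375  438  500  583  666  750  875
Type mapping range: 0 to 1000
[Figure: the range containing all subtypes of a type is marked on the mapped-value axis]
Same gap between a type and its direct subtypes
Cluster objects with same supertype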
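The Type Hierarchy values in the example can be reproduced by a recursive split: a type with k direct subtypes divides its own interval into k + 1 equal slots, keeps the first slot for itself, and hands one slot to each subtype. This rule is inferred from the numbers on the slide and is not stated there; the function name `type_hierarchy_mapping` is hypothetical, and exact fractions are used because the slide's rounding rule is not recoverable.

```python
from fractions import Fraction


def type_hierarchy_mapping(children, root, mapping_range):
    """Return the (unrounded) mapped value for every type."""
    mapped = {}

    def assign(node, lo, hi):
        mapped[node] = lo
        kids = children.get(node, [])
        # Equal slots for the type itself and each direct subtype.
        slot = (hi - lo) / (len(kids) + 1)
        for i, kid in enumerate(kids, start=1):
            assign(kid, lo + i * slot, lo + (i + 1) * slot)

    assign(root, Fraction(0), Fraction(mapping_range))
    return mapped


# Example hierarchy from the slide.
children = {0: [1, 5, 8], 1: [2, 3, 4], 5: [6, 7], 8: [9]}
mapped = type_hierarchy_mapping(children, 0, 1000)
```

The results agree with the values on the slide to within one unit, for example 312.5 against 313 for type 2 and 666 2/3 against 666 for type 7.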
Type Mapping Variant: Object Distribution (OD)
Example hierarchy: type 0 (root) with subtypes 1, 5, 8; type 1 with subtypes 2, 3, 4; type 5 with subtypes 6, 7; type 8 with subtype 9
Type ID        0    1    2    3    4    5    6    7    8    9
Instances      2    2   10    3    7    8    4    6    5    9
Mapped value   0   36   71  250  304  429  571  643  750  839
Type mapping range: 0 to 1000
[Figure: the range containing all subtypes of a type is marked on the mapped-value axis]
Size of gap corresponds to the number of instances of a type
Cluster infrequent objects by location, cluster frequent objects by type
Requires additional histogram information
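The Object Distribution values in the example match a cumulative histogram: each type starts where the instances of all preceding types end, scaled to the type mapping range. The sketch below uses the instance counts read from the example figure (2, 2, 10, 3, 7, 8, 4, 6, 5, 9 for type IDs 0 to 9); the function name `object_distribution` and the rounding are assumptions.

```python
from itertools import accumulate


def object_distribution(counts, mapping_range):
    """Map each type ID to the scaled number of instances preceding it."""
    total = sum(counts)
    # Number of instances of all types with a smaller linearized ID.
    preceding = [0] + list(accumulate(counts))[:-1]
    return [round(mapping_range * p / total) for p in preceding]


counts = [2, 2, 10, 3, 7, 8, 4, 6, 5, 9]   # histogram per type ID
mapped = object_distribution(counts, 1000)
```

Frequent types get wide gaps and are therefore clustered by type, while infrequent types end up close together in the type dimension and are clustered mainly by location.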
Related Work
Spatial Indexes
  We use them, but don't build one
Object-oriented Databases
  Use only point access methods
Object-relational Databases
  Separate table for each type
    Query many tables for all subtypes
  Single global table
    Use point access methods
Experimental Setup
Data sets from 9 counties in California (TIGER/Line 2003)
Universe
  Width: 15 to 100 km
  Height: 26 to 115 km
12k to 203k objects
258 types
Comparing the Type Mapping Ranges
[Figure: bar chart of relative response time (y-axis, 100% to 160%) for the type mapping variants ES, TH, and OD, grouped by Data Provider, Discovery Service, Integration Middleware, and Mobile Device, with one bar per type mapping range]
Type mapping ranges: A ≙ 15 km, B ≙ 150 km, C ≙ 1500 km, D ≙ 15000 km, E ≙ 60000 km
An almost-best type mapping range is sufficient
Comparing the Approaches
[Figure: bar chart of relative response time (y-axis, 100% to 140%, continued on a compressed scale from 150% to 700%) for the indexing approaches SEP, R3D.1:1, R3D.ES, R3D.TH, and R3D.OD, grouped by Data Provider, Discovery Service, Integration Middleware, and Mobile Device; off-scale bars are labeled 481%, 518%, 690%, and 446%]
Type mapping does matter!
Object Density is the best variant
More impact with low type selectivity
Resource Consumption
[Figure: two charts over data set size (12k, 17k, 23k, 46k, 54k, 72k, 151k, 175k, 203k): insertion time per object (nanoseconds) and memory per object (bytes), for the indexing approaches SEP, R3D.OD.B, R3D.OD.C, R3D.OD.D, and R3D.OD.E]
Scales well with larger data sets
Speed costs resources
Conclusion
Location-conscious main memory query engine
  Exploits characteristics of typical queries
  Deployable to many components
Real 3D Index
  Best performance
  Type mapping range: Larger than expected
  Type mapping variant: Object Density
Separate Indexes
  Best resource consumption