Trajectory Data Modeling and Processing in HBase

Faisal Moeen Orakzai

Fachgebiet Datenbanksysteme und Informationsmanagement
Technische Universität Berlin

A thesis submitted for the degree of Master of Science (M.Sc.) in Computer Science

August 10th, 2014

Advisor
Dipl.-Inf. Alexander S. Alexandrov

Reviewers
Prof. Dr. rer. nat. Volker Markl
Prof. Dr. Esteban Zimányi
Abstract

The number of devices equipped with location sensors has increased exponentially in the last couple of years. These devices generate huge amounts of movement data that are difficult to process or query because of the lack of scalability of existing approaches and systems. There have been efforts in this direction, but these are limited to research prototypes, and such systems have neither attracted users nor made it into industry. In this thesis, we design and implement Güting's moving objects algebra on the open-source key-value store HBase. We discuss various spatial indexing strategies to improve query performance and present our strategy based on space-filling curves. To enable efficient querying using space-filling curves, we present the design and implementation of a query processing layer on top of Apache Phoenix and compare the performance of our implementation with existing work.
Zusammenfassung

The number of devices equipped with location sensors has increased exponentially in recent years. These devices generate enormous amounts of movement data that are difficult to process or query because of the lack of scalability of existing approaches and systems. There have been efforts in this direction, but they are limited to research prototypes, and such systems have neither attracted users nor been adopted by industry. In this thesis, we design and implement Güting's moving objects algebra on the open-source key-value store HBase. We discuss various spatial indexing strategies to improve query performance and present our strategy based on space-filling curves. To enable efficient querying using space-filling curves, we present the design and implementation of a query processing layer on top of Apache Phoenix and compare the performance of our implementation with existing work.
Acknowledgements

I would like to thank Prof. Dr. Ralf Hartmut Güting for his help during the course of this thesis. I especially appreciate Jiamin Lu for his help with Parallel Secondo and for answering all my questions immediately, regardless of the time, even while he was on holiday. I would also like to thank Johannes Kirschnick and all colleagues from the database systems research group (DIMA) at TU Berlin for their continuous support and the numerous comments in the past months.
Contents

List of Figures
List of Tables

1 Introduction

2 Background
2.1 Spatio-Temporal Algebras
2.1.1 Allen's Temporal Concepts
2.1.2 SQL-2011
2.1.3 Güting's Spatio-temporal Algebra
2.1.4 Hermes
2.2 Güting's Spatio-temporal Algebra
2.2.1 Data Types
2.2.1.1 mpoint & mregion
2.2.1.2 Other Data-Types
2.2.2 Operators
2.2.3 Example Queries
2.2.4 SECONDO
2.3 Spatio-temporal Indexes [1]
2.3.1 Multidimensional Indexes
2.3.1.1 R-Tree
2.3.1.2 3D R-Tree
2.3.1.3 STR-Tree
2.3.1.4 TB-Tree
2.3.2 Multi-Version R-Trees
2.3.2.1 HR-Tree
2.3.2.2 HR+-Tree
2.3.2.3 MVR-Tree
2.3.3 Grid Based Index
2.3.3.1 SETI
2.3.3.2 MTSB-Tree
2.3.3.3 CSE-Tree
2.4 Space-Filling Curves
2.4.1 Z-Order Curve
2.4.2 Hilbert Curve
2.4.3 GeoHash
2.4.3.1 Introduction
2.4.3.2 How to Calculate Geo-Hash
2.4.3.3 Objectives to Achieve in Geo-Hash Indexing

3 Distributed Platforms for Querying
3.1 Distributed Spatial Data Processing Platforms
3.1.1 Spatial-Hadoop
3.1.2 Hadoop-GIS
3.2 Distributed Online Querying Platforms
3.2.1 Cassandra
3.2.2 Stinger
3.2.3 HBase
3.2.4 Phoenix
3.3 Parallel Secondo
3.3.1 Architecture
3.3.2 Parallel Query Execution
3.3.2.1 PS-Matrix
3.3.2.2 Distributed Data Types
3.3.2.3 Distributed Operators
3.4 HBase
3.4.1 Log Structured Merge Trees
3.4.2 Architecture
3.4.3 Write Process
3.4.4 Read Process
3.4.5 Data Model
3.5 Choice of Platform for Güting's Algebra
3.5.1 Schema Design
3.5.2 Indexing
3.5.3 Partitioning Control
3.5.4 Co-location
3.5.5 Scan Performance
3.5.6 Transactions
3.5.7 Latency
3.6 Summary

4 Algebra Implementation
4.1 Motivation behind the use of Apache Phoenix
4.2 Implementation Approaches
4.2.1 Use of Struct
4.2.2 Binary Objects
4.2.3 Data Type Flattening
4.3 Data Structures
4.3.1 Spatial Data Types
4.3.1.1 Point
4.3.1.2 Points
4.3.1.3 Line
4.3.1.4 DLine
4.3.1.5 Region
4.3.2 Basic Unit Data Types
4.3.3 Spatial Unit Data Types
4.3.4 Basic Range Data Types
4.3.5 Temporal Range Data Types
4.3.6 Basic Temporal Data Types
4.3.7 Spatio-Temporal Data Types
4.3.7.1 MPoint
4.4 Operators

5 Indexing Strategy & Querying Framework
5.1 Indexing in HBase
5.2 Indexing Strategies
5.2.1 Maintaining a Global Index
5.2.2 Maintaining Local Indexes
5.2.3 Maintaining Distributed Indexes
5.2.4 SFC-based Indexing for HBase
5.3 Spatial Index Design for LSMT
5.3.1 Co-location
5.3.2 Lesser Size of Unwanted Data Scan
5.3.3 Lesser Scans
5.4 Our Approach
5.4.1 Preliminary Choices
5.4.2 Choice of SFC Index
5.4.2.1 Choice of Geo-Hash
5.4.3 Indexing a Region
5.4.3.1 Single-Level Single-Hash (SLSH)
5.4.3.2 Multiple Hashes per Region
5.4.4 Physical Approaches for Building the Index
5.4.4.1 Single-Index Approach
5.4.4.2 Multi-Index Approach
5.4.5 Optimization
5.4.6 Index Implementation
5.4.6.1 Schema Design for GET Requests
5.4.6.2 Schema Design for SCAN Requests
5.5 The Querying Framework
5.5.1 Güting's Algebra
5.5.2 SFC Plugins
5.5.3 Query Translator
5.5.4 Query Optimizer
5.5.5 Stats-Store
5.5.6 Hash Coverage Algorithm
5.5.7 Client-side Filter
5.5.8 Meta-Store
5.5.8.1 Schema Meta-Data
5.5.8.2 Algebra Meta-Data
5.6 Future Work

6 Benchmark & Results
6.1 Experimental Setup
6.2 The Dataset
6.3 Query Selection
6.4 Results
6.4.1 Query-1
6.4.2 Query-2
6.4.3 Query-3
6.4.4 Query-4
6.4.5 Query-5

7 Conclusion
7.1 Summary
7.2 Outlook

References
List of Figures

2.1 A moving point
2.2 Type Constructors
2.3 Operators for moving types
2.4 Operators with moving results
2.5 Operators
2.6 Two views of R-Tree [1]: (a) Objects & Minimum Bounding Boxes; (b) R-Tree
2.7 TB-Tree
2.8 HR-Tree
2.9 MV3R-Tree
2.10 Spatial Query using Hilbert Curve
2.11 Z-Order Calculation
2.12 Z-Order Curve
2.13 Hilbert Curve
2.14 Geo-Hash Precision Coverage
2.15 Relative Distances
2.16 Geo-Hash Calculation
3.1 Hadoop-GIS Architecture
3.2 PS-Matrix
3.3 Parallel Secondo Infrastructure
3.4 Some examples of the flist data type
3.5 Multipage blocks iteratively merged across LSM-trees
3.6 HBase Architecture
3.7 An HBase table with two column families
5.1 Implementation Block Diagram
5.2 GeoHash Edge Case
5.3 Meta-Store
6.1 Query-1 Results
6.2 Query-2 Results
6.3 Query-3 Performance
6.4 Query-4 Performance
6.5 Query-5 Performance
List of Tables

3.1 Platform Selection Criteria
4.1 Unit Data Types
4.2 Spatial Unit Data Types
4.3 Basic Range Data Types
4.4 Basic Temporal Data Types
5.1 Number of points in each grid level
5.2 Index for Movement Table along with hash-length
5.3 Constant Spatial Entities
5.4 Index Types
6.1 BerlinMOD Datasets
6.2 Benchmark queries by index and input types
1 Introduction
Research on Moving Object Databases has been going on since 1995. These databases are more complex than relational databases because of the continuously changing dimension of time. The main goal has been to represent moving entities in databases and to enable a user to ask all kinds of questions about such movements. This requires extensions of the DBMS data model and query language. Further, the DBMS implementation needs to be extended at all levels, for example by providing data structures for representing moving objects, efficient algorithms for query operations, indexing and join techniques, extensions of the query optimizer, and extensions of the user interface to visualize and animate moving objects.
A model of data together with some operations on it is captured by the concept of an abstract data type (ADT). Güting et al. [2, 3] proposed in 2000 a type system and operations carefully designed for handling the temporal aspect of the data. Further work by the same group defines the discrete model and develops algorithms for the operations [4]. The model has also been extended to a network-based representation of moving objects (or trajectories) [5]. SECONDO [6, 7, 8, 9] and Hermes [10] are two prototypical implementations of this data model. SECONDO, developed at the University of Hagen, is very extensible; it provides querying at two levels, SQL and its executable language, and uses Berkeley DB as the underlying database engine. Hermes, in contrast, has been developed at the University of Piraeus, Greece, and uses SQL as its query language with Oracle as the underlying database.
Trajectories are hard to process because of the continuously changing time dimension, especially at large data sizes. Little work deals with processing trajectories in a distributed fashion. Parallel SECONDO, the parallel/distributed version of SECONDO, is an effort in this direction, but it lacks a user base and open-source community support. It is a research project that has neither been benchmarked for performance and scalability against other systems nor been proven in industry. It has its own executable query language (no SQL support) that gives the user the power to run queries at the slave databases and collect the results at the master, but the user has to parallelize queries manually to get the required results. It would therefore be interesting to see how well the known open-source distributed data stores tackle the problem of trajectory storage and querying.
HBase is an open-source implementation of Google's Bigtable [11] key-value store. It is a distributed, scalable, and fault-tolerant system that gives fast query response times over massive data. Data is automatically sharded to various nodes based on 'regions', which are horizontal partitions of the data. Data is kept sorted, which means that similar data is stored together; this results in faster range queries. On the one hand HBase shines because of its scalability, fault tolerance, and efficiency in online querying, but on the other hand it has limited support for defining a schema, which makes it difficult to model spatio-temporal data storage. Being a key-value store, HBase supports querying only by the key, which makes it difficult to design an effective storage strategy with acceptable performance for the spatial and temporal domains. There is limited support for indexing, and HBase does not allow plugging in custom indexes.
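HBase's sorted row-key order is what makes key-range scans cheap. The effect can be sketched with a sorted list of keys standing in for an HBase region (a toy illustration, not the HBase client API; the row keys are made up for the example):

```python
import bisect

# Toy model of HBase's sorted key space: rows are kept ordered by row key,
# so a range scan is a contiguous slice rather than a full-table filter.
rows = sorted([
    ("geo:u336x:t1", "p1"), ("geo:u336x:t2", "p2"),
    ("geo:u339k:t1", "p3"), ("geo:z0000:t1", "p4"),
])
keys = [k for k, _ in rows]

def scan(start_key, stop_key):
    """Return all rows with start_key <= key < stop_key, like an HBase Scan."""
    lo = bisect.bisect_left(keys, start_key)
    hi = bisect.bisect_left(keys, stop_key)
    return rows[lo:hi]
```

A query for every row under a common key prefix thus touches only one contiguous block, which is why a row-key design that sorts similar data together is crucial when the key is the only access path.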
In this thesis we present the design and implementation of Güting's algebra in HBase and compare its performance with Parallel SECONDO. We discuss possible spatial index integration scenarios, devise a spatial indexing strategy based on space-filling curves, and present the design and implementation of a query pre-processing layer on top of Apache Phoenix that enables support for spatial indexing in HBase. We benchmark the performance of both systems using the parallel BerlinMOD benchmark for parallel moving object databases. We also investigate the effect of our work, i.e. the implementation of the spatial index and Güting's algebra, on query performance when compared to raw HBase, and present the results.
The remainder of this thesis is structured as follows: Chapter 2 introduces spatio-temporal algebras and spatio-temporal indexing structures with a focus on Güting's algebra and the GeoHash index. Chapter 3 describes some of the relevant distributed platforms for online querying and the criteria that led us to the choice of HBase. Chapter 4 presents our implementation of Güting's algebra. Chapter 5 proposes various indexing approaches in HBase using space-filling curves and presents our index design and querying framework. Chapter 6 presents our experimental design and results. Finally, Chapter 7 concludes the discussion and provides some ideas for future development.
2 Background
This chapter reviews the background required to understand the ideas presented in the rest of the thesis. We first introduce some existing spatio-temporal algebras in section 2.1 and then explain in more detail Güting's algebra, which we have implemented as part of this thesis, in section 2.2. Section 2.3 gives an overview of various types of spatio-temporal indexes and briefly explains a few indexing structures. We discuss space-filling curves in section 2.4 as an alternative strategy for indexing spatial data using conventional one-dimensional indexes. The GeoHash index, our choice for the implementation in this thesis, is explained in more detail in section 2.4.3.
2.1 Spatio-Temporal Algebras

2.1.1 Allen's Temporal Concepts

Allen et al. [12] were among the first to propose temporal concepts such as intervals and the relationships between them. The concepts of recent temporal SQL standards resemble their work to a great extent. They proposed relations such as during, before, after, overlaps, equals and meets. Most of the temporal concepts of SQL-2011 can be represented using the relations they proposed.
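On closed intervals represented as (start, end) pairs, these relations can be sketched in a few lines. This is an illustrative Python sketch using our own tuple encoding, not code from Allen's paper:

```python
# Allen's interval relations on closed intervals (start, end).
# Predicate names follow the relations listed above; the tuple
# representation is our own choice for this sketch.

def equals(x, y):
    return x == y

def before(x, y):       # x ends strictly before y starts
    return x[1] < y[0]

def after(x, y):        # inverse of before
    return before(y, x)

def meets(x, y):        # x ends exactly where y starts
    return x[1] == y[0]

def overlaps(x, y):     # x starts first; they share a proper sub-interval
    return x[0] < y[0] < x[1] < y[1]

def during(x, y):       # x lies strictly inside y
    return y[0] < x[0] and x[1] < y[1]
```

For example, (1, 3) meets (3, 5), and (2, 3) is during (1, 5); temporal predicates of the SQL standard can be expressed as combinations of such relations.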
2.1.2 SQL-2011

SQL-2011 was published in December 2011, replacing SQL-2008 as the SQL standard. This version added the ability to create and manipulate temporal tables. Many temporal concepts were added that make querying over temporal data very easy. For completeness' sake, a few of the data types and operators are listed below [13]:
PERIOD Defines a period with a start and an end date: PERIOD(DATE '2010-01-01', DATE '2011-01-01')

OVERLAPS The predicate X OVERLAPS Y returns true if X and Y have at least one time point in common.

CONTAINS The predicate X CONTAINS Y returns true if Y is a subset of X.

PRECEDES The predicate X PRECEDES Y returns true if X occurs before Y and they do not overlap.

SUCCEEDS The predicate X SUCCEEDS Y returns true if X occurs after Y and they do not overlap.
Since SQL-2011 handles data that changes over time, movement data can also be modeled in a temporal database and queried. One drawback of this approach is that it focuses more on the state or version of the data than on its movement aspects. The operators are also not intuitive for querying moving objects, which makes it really hard to design analytical queries.
2.1.3 Güting's Spatio-temporal Algebra

This algebra focuses on moving objects and was proposed by Prof. Güting from the University of Hagen in 2000. From here onwards we refer to it as Güting's algebra to differentiate it from other algebras. During the course of this thesis, this algebra was implemented in HBase. Section 2.2 discusses it in detail with examples.
2.1.4 Hermes

HERMES [14] is a prototype database engine that defines a powerful query language for trajectory databases, enabling mobility-centric applications such as Location-Based Services (LBS). The querying model of HERMES is an extended version of Güting's algebra. HERMES extends the data definition and manipulation language of an object-relational DBMS (ORDBMS) with spatio-temporal semantics and functionality based on advanced spatio-temporal indexing and query processing techniques. HERMES has been implemented on top of the Oracle RDBMS.
2.2 Güting's Spatio-temporal Algebra

This section describes and discusses Güting's approach to modeling moving and evolving spatial objects based on the use of abstract data types. We introduce data types for moving points together with a set of operations on such entities. Some of the related auxiliary data types, such as pure spatial or temporal types and time-dependent real numbers, are also discussed. Chapter 4 discusses the implementation of a collection of these types and operations in HBase using the Apache Phoenix framework to obtain a complete data model and query language.
2.2.1 Data Types

2.2.1.1 mpoint & mregion

Both mpoint and mregion are extensions of the purely spatial data types point and region, respectively. An mpoint and an mregion can be described as mappings from time into space, that is

mpoint = time → point
mregion = time → region

More generally, we can introduce a type constructor τ which transforms any given atomic data type α into a type τ(α) with semantics

τ(α) = time → α

and we can denote the types mpoint and mregion also as τ(point) and τ(region), respectively.
A value of type mpoint describes the position of a point as a function of time. This can be represented as a curve in the 3-D space (x, y, t), as shown in figure 2.1. The assumption is that the space as well as the time dimension is continuous. This means that if the position of a point is asked for at a time instant that lies between the recorded timestamps, the data type will still return a position. A value of type mregion is a set of volumes in the 3-D space (x, y, t). Any intersection of that set of volumes with a plane t = t0 yields a region value, describing the moving region at time t0. It is possible that this intersection is empty, and an empty region is also a proper region value.

Figure 2.1: A moving point. [15]

Figure 2.2: Type Constructors [2]
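The view of an mpoint as a continuous function from time into space can be sketched as piecewise-linear interpolation over recorded (t, x, y) samples. This is only an illustrative Python sketch (the thesis implementation targets HBase/Phoenix); the class name and sample representation are our own, with at() loosely mirroring the algebra's at operator:

```python
import bisect

class MPoint:
    """A moving point: time-sorted (t, x, y) samples, linearly interpolated."""

    def __init__(self, samples):
        self.samples = sorted(samples)            # order by timestamp
        self.times = [s[0] for s in self.samples]

    def at(self, t):
        """Position at instant t, or None outside the definition time
        (an mpoint may be a partial function)."""
        if not self.times or t < self.times[0] or t > self.times[-1]:
            return None
        i = bisect.bisect_left(self.times, t)
        if self.times[i] == t:                    # exact sample hit
            return self.samples[i][1:]
        t0, x0, y0 = self.samples[i - 1]          # interpolate between the
        t1, x1, y1 = self.samples[i]              # two enclosing samples
        f = (t - t0) / (t1 - t0)
        return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
```

Asking for the position between two recorded timestamps, e.g. MPoint([(0, 0.0, 0.0), (10, 10.0, 20.0)]).at(5), returns the interpolated point (5.0, 10.0), matching the continuity assumption above.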
2.2.1.2 Other Data-Types

Figure 2.2 shows the type constructors used to construct data-types of various forms, together with their signatures. The data-types are divided into five categories. BASE types include the basic types that every database supports, e.g. int, real, string and bool. SPATIAL data-types include conventional spatial types like point, line and region. The difference between these types and the ones used in other spatial databases is that here a line or a region can be disconnected: a line is in fact a collection of possibly disconnected lines, and a region is a collection of possibly disconnected regions. As can be seen in figure 2.2, points is an additional data-type that represents a collection of points. The third category is TIME. It contains only one data-type, instant, which corresponds to the timestamp type supported by conventional databases and represents time as a long value. The fourth category is RANGE. RANGE data-types can be built over either BASE or TIME types; they represent intervals. periods is a data-type belonging to the temporal range category, while the basic range types include rint, rbool, rreal, etc. The fifth category is TEMPORAL, which is further divided into spatio-temporal and basic temporal types. mpoint and mregion, discussed previously, belong to the spatio-temporal category; the basic temporal types mint, mbool and mreal are the moving versions of the basic types.

Figure 2.3: Operators for moving types [2]
2.2.2 Operators

Figure 2.3 shows some of the operators that can be applied to a moving data-type. at gives the value of a moving object at a particular point in time. minvalue and maxvalue give the minimum and maximum values of a moving object; for both functions, a total order must exist on α. start and stop return the minimum and maximum of a moving value's (time) domain, and duration gives the total length of the time intervals over which a moving object is defined. We can also use the functions startvalue(x) and stopvalue(x) as abbreviations for at(x, start(x)) and at(x, stop(x)), respectively. Whereas all these operations assume the existence of moving objects, const offers a canonical way to build spatio-temporal objects: const(x) is the "moving" object that yields x at any time.
In particular, for moving spatial objects we have operations such as mdistance and visits. mdistance computes the distance between two moving points at all times and hence returns a time-changing real number, a type that we call mreal ("moving real"; mreal = τ(real)). visits returns the positions of the moving point given as the first argument at the times when it was inside the moving region provided as the second argument. Here it becomes clear that a value of type mpoint may also be a partial function, in the extreme case a function where the point is undefined at all times. Operations may also involve pure spatial or pure temporal types and other auxiliary types. line is a data-type describing a curve in 2-D space that may consist of several disjoint pieces and may also be self-intersecting. region is a type for regions in the plane that may consist of several disjoint faces with holes. Figure 2.5 summarizes the operators that are part of the BerlinMOD benchmark and have been implemented by us.

Figure 2.4: Operators with moving results
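Over a sampled representation, several of these operators reduce to one-liners. The following Python sketch approximates mdistance by evaluating the distance only at shared sample instants, whereas the algebra defines it at all times; the function names mirror the operators above, but the list-of-samples encoding is our own:

```python
import math

# A moving point is a time-sorted list of (t, x, y) samples;
# a moving real (mreal) is a list of (t, value) samples.

def start(mp):
    return mp[0][0]                # minimum of the time domain

def stop(mp):
    return mp[-1][0]               # maximum of the time domain

def duration(mp):
    return stop(mp) - start(mp)    # total length of the definition time

def startvalue(mp):
    return mp[0][1:]               # shorthand for at(mp, start(mp))

def mdistance(a, b):
    """Sampled approximation of the time-changing distance between
    two moving points, evaluated at their shared instants only."""
    pos_b = {t: (x, y) for t, x, y in b}
    return [(t, math.hypot(x - pos_b[t][0], y - pos_b[t][1]))
            for t, x, y in a if t in pos_b]

def minvalue(mreal):
    return min(v for _, v in mreal)
```

With these, a "came closer than some distance" condition becomes a simple filter on minvalue(mdistance(a, b)).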
2.2.3 Example Queries

When the above-mentioned data types and operators are implemented in a DBMS, we can have a relation as follows:

flights(id: string; from: string; to: string; route: mpoint)

Now the query "Give me all flights from Düsseldorf that are longer than 5000 km" can be asked:

SELECT id
FROM flights
WHERE from = "DUS" AND length(trajectory(route)) > 5000

This query projects the route, i.e. an mpoint, into space. We can also project it onto the time dimension as follows:

SELECT to
FROM flights
WHERE from = "SFO" AND duration(route) ≤ 2.0

We can use the projections into space and time to solve spatio-temporal questions like "Find all pairs of planes that during their flight came closer to each other than 500 meters!":

SELECT A.id, B.id
FROM flights A, flights B
WHERE A.id ≠ B.id AND minvalue(mdistance(A.route, B.route)) ≤ 0.5

Figure 2.5: Operators [16]
2.2.4 SECONDO
SECONDO [6] is an extensible DBMS developed at University of Hagen. SECONDO
does not have a fixed datamodel, but is open for implementation of new models. It has
following three major components which can be used together or independently:
1. The kernel, which offers query processing over a set of implemented algebras, each
offering some type constructors and operators.
2. The optimizer,which implements the essential part of an SQL-like language.
3. The graphical user interface which is extensible by viewers for new data types and
which provides a sophisticated viewer for spatial and spatio-temporal (moving)
objects.
Many algebras have been implemented in SECONDO, such as relations, spatial data
types, R-trees, or midi objects (music files), each with suitable operations. Each com-
ponent of SECONDO is extensible, e.g. the kernel can be extended by algebras, the
optimizer by optimization rules and cost functions, and the GUI by viewers and dis-
play functions. SECONDO is of importance to us because it implements Guting's
algebra, which can be used within its SQL-like interface.
2.3 Spatio-temporal Indexes [1]
There are three kinds of spatio-temporal indexes, distinguished by their approach.
1. The first kind extends a multidimensional index such as the R-tree with a temporal
dimension, e.g. the 3D R-tree [17] or the STR-tree [17].
2. The second kind builds a separate R-tree for each timestamp and shares
intersecting parts between consecutive R-trees. Examples of this
type are the MR-tree [18], HR-tree [19], HR+-tree [20], and MV3R-tree [20].
3. The third kind divides the spatial space into grids and builds a temporal index
for each grid. This category includes SETI [21] and the MTSB-tree [22].
(a) Objects & Minimum Bounding Boxes (b) R-Tree
Figure 2.6: Two views of R-Tree [1]
2.3.1 Multidimensional Indexes
2.3.1.1 R-Tree
The R-Tree is one of the most widely used spatial indexes and is employed by many
spatial databases. It forms the basis of many varieties of spatial as well as spatio-
temporal indexes. Understanding the R-Tree is necessary to grasp the 3D R-Tree, which
can also index the temporal dimension in addition to the spatial dimensions. The R-Tree
is a height-balanced data structure. Each node represents a region which is the minimum
bounding box (MBB) of all its child nodes. Each node can have many children and
contains an entry for every child; this entry represents the MBB of the referenced child
node. Whenever a point or a region needs to be searched, the search descends the tree
using the MBBs of the nodes as keys for finding the right nodes. R-trees can also be used
for nearest neighbor queries using either depth-first search or best-first search [23].
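The descent described above can be sketched as follows. This is a minimal, illustrative search over a hand-built two-level tree; the node layout and names are assumptions for the sketch, not a production R-tree implementation:

```python
# Minimal R-tree search sketch: nodes store the MBBs of their children,
# and a query descends only into children whose MBB intersects the query box.

def intersects(a, b):
    """MBBs as (xmin, ymin, xmax, ymax); True if the boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def search(node, query, results):
    """Collect all leaf entries whose MBB intersects the query box."""
    for mbb, child in node["entries"]:
        if intersects(mbb, query):
            if node["leaf"]:
                results.append(child)          # child is a data object id
            else:
                search(child, query, results)  # child is a subtree
    return results

# A hand-built two-level tree: each root MBB covers its leaf's entries.
leaf1 = {"leaf": True, "entries": [((0, 0, 2, 2), "A"), ((3, 1, 5, 4), "B")]}
leaf2 = {"leaf": True, "entries": [((8, 8, 9, 9), "C")]}
root = {"leaf": False, "entries": [((0, 0, 5, 4), leaf1), ((8, 8, 9, 9), leaf2)]}

print(search(root, (1, 1, 4, 3), []))  # → ['A', 'B'] (leaf2 is pruned)
```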
2.3.1.2 3D R-Tree
The 3D R-Tree is an extension of the R-Tree which also takes the time dimension into
account when calculating MBBs. Instead of storing 2D MBBs, it stores a 3D MBB for
each segment, which enlarges the bounding box even when the segment itself is small.
This reduces the discrimination capability of the index. The temporal aspect of a query
can be based on either a time instant or a time period. The insertion and deletion of
data are the same as for the R-Tree.
Figure 2.7: TB-Tree [17]
2.3.1.3 STR-Tree
The STR-tree (Spatio-Temporal R-Tree) is an extension of the 3D R-tree which supports
efficient querying of trajectories. It differs from the 3D R-tree in its insertion and split
strategy: the STR-Tree improves on the R-Tree by keeping segments belonging to the
same trajectory together.
2.3.1.4 TB-Tree
The TB-tree (Trajectory-Bundle tree) [24] is an extension of the R-Tree which bundles
the segments of the same trajectory into the same leaf node. A TB-Tree consists of a
set of leaf nodes, each containing a partial trajectory, organized in a tree hierarchy. In
simple words, a trajectory spans a set of disconnected leaf nodes. Figure 2.7 shows a
trajectory, symbolized by the gray band, fragmented across six nodes c1, c3, etc. The
shown part of the TB-tree structure illustrates how this trajectory is stored. The leaf
nodes representing the trajectory are connected through a linked list.
Figure 2.8: HR-Tree
2.3.2 Multi-Version R-Trees
A different solution for spatio-temporal indexing than adding a temporal dimen-
sion to an R-Tree is to construct an R-Tree for each timestamp and then index the
R-Trees by time. Any temporal index can be used, but B-Trees serve the purpose well:
a B-Tree can be used to locate the R-Trees for a time instant or a time period, and
these R-Trees are then drilled down to locate the objects of interest. Constructing an
R-Tree for each timestamp is space consuming. To optimize this approach, only
the part of the R-Tree which differs from the previous timestamp is created for the
new timestamp; thus consecutive R-Trees share branches. The indexing structures
that use this strategy are the MR-tree [25] (Multiversion R-tree), HR-tree [19] (Historical
R-tree) and HR+-tree [20].
2.3.2.1 HR-Tree
Figure 2.8 shows an example of an HR-tree which stores spatial objects for timestamps 0
and 1. Since none of the spatial objects in A0 move, the entire branch is shared by both
trees R0 and R1. In this case, it is not necessary to recreate the entire branch in R1;
instead, a pointer is created to point to branch A0 in R0.
2.3.2.2 HR+-Tree
The HR+-Tree is an improvement of the HR-Tree. It allows entries belonging to different
timestamps to be stored in the same node. In simple words, if there is a small change in
the movement of an object, the node can be shared between different R-trees. For this
reason, the HR+-Tree consumes approximately 20% less space than an HR-Tree yet is
several times faster [20]. For a single-timestamp query, the querying time is the same.
Figure 2.9: MV3R-Tree
2.3.2.3 MV3R-Tree
Although the HR-Tree and HR+-Tree save a lot of space by not storing a complete R-Tree
for each timestamp, they still suffer from a lot of duplication, which costs space.
They are good at timestamp queries but perform poorly on interval queries. The MV3R-
Tree [26] uses a combination of multiversion B-Trees and 3D R-Trees to overcome these
disadvantages. An MV3R-tree consists of two structures: a multiversion R-tree (MVR-
tree) and a small auxiliary 3D R-tree built on the leaves of the MVR-tree in order to
process interval queries. Figure 2.9 shows an overview of the MV3R-tree.
2.3.3 Grid Based Index
The spatial and temporal dimensions differ in that the spatial dimension has a fixed
domain while the temporal domain is continuously growing; the rate of change is also
higher in the temporal domain. The 3D R-Tree handles both dimensions equally, which
results in a lot of overlapping bounding boxes. This leads to poor performance when
the data size grows. Grid-based indexes handle this by partitioning the data spatially;
within each partition, the data is indexed on the temporal dimension. The SETI index
(Scalable and Efficient Trajectory Index) [21] was the first grid-based index.
2.3.3.1 SETI
SETI partitions the data into spatial cells. This partitioning can be either fixed
or dynamic. Each cell contains the trajectories present within its boundaries. If a
trajectory spans multiple cells, it is split into multiple pieces in such a way
that each cell only stores the piece lying inside its boundaries. Each cell is represented
by a data page and each trajectory piece is stored in the file as a tuple. The number of files
may increase with time. The lifetime of each page file is indexed using an R*-tree.
Hence SETI has a sparse temporal index and is lightweight.
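The cell-wise splitting can be sketched with a toy example, assuming a fixed uniform grid and point-sampled trajectories. For simplicity, segments crossing a cell boundary are not geometrically clipped here, although SETI splits them exactly at the boundary:

```python
from collections import defaultdict

def cell_of(x, y, cell_size):
    """Map a point to the (column, row) index of its fixed grid cell."""
    return (int(x // cell_size), int(y // cell_size))

def partition_trajectory(points, cell_size):
    """Split one point-sampled trajectory (a list of (x, y, t) samples)
    into per-cell pieces: a new piece starts whenever the trajectory
    enters another cell. Boundary segments are not clipped here."""
    pieces = defaultdict(list)      # cell -> list of trajectory pieces
    current, current_cell = [], None
    for (x, y, t) in points:
        c = cell_of(x, y, cell_size)
        if c != current_cell and current:
            pieces[current_cell].append(current)
            current = []
        current_cell = c
        current.append((x, y, t))
    if current:
        pieces[current_cell].append(current)
    return pieces

pts = [(1, 1, 0), (2, 1, 1), (6, 1, 2), (7, 2, 3)]
print(dict(partition_trajectory(pts, cell_size=5)))
# → {(0, 0): [[(1, 1, 0), (2, 1, 1)]], (1, 0): [[(6, 1, 2), (7, 2, 3)]]}
```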
2.3.3.2 MTSB-Tree
MTSB-Tree is a variant of SETI which differs in temporal indexing strategy. Unlike
SETI, it uses the TSB-tree (Time Split B-tree) [27] to index the time dimension within
each cell. Compared to R*-tree that is used by SETI, the advantage of using TSB-tree
is that it provides results sorted by time. So it is better for those queries returning
trajectories close in spatial as well as temporal dimensions.
2.3.3.3 CSE-Tree
Another variant of SETI is CSE-tree (Compressed Start-End tree) [28] which uses
different temporal indexes for each cell. If the cell is frequently updated, it uses B+-
Tree whereas, for rarely updated data, it uses sorted dynamic array .
2.4 Space-Filling Curves
Indexes like the R-Tree work efficiently on moderate data sizes, but their performance
degrades as the data grows. Multi-dimensional queries may lead to a multitude of disk
seeks, which can kill performance. Insertion is another problem: updating R-Trees or
their variants can become a bottleneck. Space-filling curves convert multi-dimensional
data into a single dimension, which can then be indexed with a B-Tree. As B-Trees
perform extremely well on 1-D data, query performance increases significantly. The
resulting 1-D data can also be sorted, which makes range queries perform far better.
The performance of range queries also depends on the type of space-filling curve being
used: a curve which keeps nearby data close together after the transformation performs
better. Figure 2.10 shows a polygon inside an area mapped by a Hilbert Curve. As
each grid cell has been assigned a number using the Hilbert Curve, the polygon can
easily be retrieved using the following SQL query.
SELECT *
FROM regiontable
WHERE hilbert_value=3 OR (hilbert_value>=7
      AND hilbert_value<=12 AND hilbert_value<>11)

Figure 2.10: Spatial Query using Hilbert Curve [29]
Some well-known space-filling curves are briefly explained in the following.
2.4.1 Z-Order Curve
Z-order, also known as Morton order, maps multidimensional data to one dimension
while preserving the locality of the data points. The z-value of a multidimensional point
is calculated by interleaving the binary representations of its coordinate values. Fig-
ure 2.11 explains the calculation of the z-value for each grid cell, whereas Figure 2.12
shows the first three orders of the Z-curve. Once the data are sorted into this ordering,
any one-dimensional data structure can be used, such as binary search trees, B-trees,
skip lists or (with low significant bits truncated) hash tables. The resulting ordering can
equivalently be described as the order one would get from a depth-first traversal of a
quadtree; because of this close connection with quadtrees, the Z-ordering can be used to
efficiently construct quadtrees and related higher-dimensional data structures. [30]
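The bit interleaving described above can be sketched as follows; this is a minimal illustration, and the function name and fixed bit width are assumptions for the example:

```python
def interleave(x, y, bits=16):
    """Compute the z-value of (x, y) by interleaving the bits of the two
    coordinates: x occupies the even bit positions, y the odd ones."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)        # i-th bit of x -> bit 2i
        z |= ((y >> i) & 1) << (2 * i + 1)    # i-th bit of y -> bit 2i+1
    return z

# For a 2x2 grid this yields the familiar Z shape: 0, 1, 2, 3.
print([interleave(x, y, 2) for (x, y) in [(0, 0), (1, 0), (0, 1), (1, 1)]])
# → [0, 1, 2, 3]

# Sorting by z-value keeps spatially close points near each other.
points = [(7, 7), (1, 1), (0, 0), (6, 7), (1, 0)]
print(sorted(points, key=lambda p: interleave(*p)))
# → [(0, 0), (1, 0), (1, 1), (6, 7), (7, 7)]
```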
2.4.2 Hilbert Curve
The Hilbert curve is similar to the z-order curve but follows its own ordering. This
ordering also preserves the proximity of points in most cases. However, as with the z-order curve,
Figure 2.11: Z-Order Calculation
Figure 2.12: 1st, 2nd and 3rd order Z-Curves [31]
Figure 2.13: Approximations of Hilbert Curve [31]
there are edge cases where multiple range queries might be required to retrieve a region.
Figure 2.13 shows different approximations of Hilbert Curve.
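The Hilbert index of a grid cell can be computed with the standard iterative algorithm, shown here as an illustrative sketch for an n×n grid where n is a power of two:

```python
def xy2d(n, x, y):
    """Hilbert index of cell (x, y) in an n x n grid (n a power of two).
    At each level the chosen quadrant contributes s*s cells to the index,
    and the coordinates are rotated/flipped so the same traversal pattern
    repeats at the next, finer level."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                  # rotate/flip the quadrant
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

# The first-order 2x2 curve visits (0,0), (0,1), (1,1), (1,0) in order.
print([xy2d(2, x, y) for (x, y) in [(0, 0), (0, 1), (1, 1), (1, 0)]])
# → [0, 1, 2, 3]
```

Unlike the z-order curve, consecutive Hilbert indices always correspond to cells that are direct neighbors in the grid.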
2.4.3 GeoHash
2.4.3.1 Introduction
Geo-Hash is a hierarchical spatial data structure which divides space into grid-
shaped buckets. One of the benefits of Geo-Hash is that it supports multiple precisions
simply by removing or adding characters at the end of the hash value: the fewer
characters used, the lower the precision. This allows data that is not of interest to be
filtered out at a much lower precision, using cheaper operations. Another property of
the Geo-Hash index is the co-location of neighboring coordinates: nearby places are
usually, but not always, represented by similar prefixes, and the longer a shared prefix
is, the closer the two places are.
A Geo-Hash is a kind of space-filling curve that turns multidimensional values into
a single-dimensional one. This requires each of the dimensions to have a fixed domain.
For spatial values, we want to convert latitude and longitude to a single value. The
longitude dimension has the range [-180.0, 180.0], and the latitude dimension has the
range [-90.0, 90.0]. Although the Geo-Hash preserves spatial locality, there are edge
cases where this does not hold true. A precision is designated when calculating a Geo-
Hash. The highest precision that an 8-byte long can hold is 12 characters. We can increase
Figure 2.14: Geo-Hash Precision Coverage [32]
the precision further, but then the hash no longer fits in a long. When end characters
are removed from a Geo-Hash, its precision decreases and it represents a larger
area of the map. Twelve characters, i.e. full precision, represent a point; any Geo-Hash
with fewer characters represents an area on the map, i.e. a bounding box around an
area. Figure 2.14 illustrates the variation in the size of the represented area when a
Geo-Hash is truncated.
In HBase, we can use a Geo-Hash as a prefix for querying. All points within
the space represented by the Geo-Hash match the common prefix. This means that
we can use HBase's prefix scan on the rowkeys to retrieve points that are relevant
to the query. As rowkeys are sorted, close points are stored together on
disk. But as figure 2.14 shows, if we choose a lower precision, we might retrieve a
lot of unwanted data. Let's look at some real points. Consider these three locations:
LaGuardia Airport (40.77 N, 73.87 W), JFK International Airport (40.64 N, 73.78
W), and Central Park (40.78 N, 73.97 W). Their coordinates Geo-Hash to the values
dr5rzjcw2nze, dr5x1n711mhd, and dr5ruzb8wnfr respectively. We can look at those
points on the map in figure 2.15 and see that Central Park is closer to LaGuardia
Figure 2.15: Relative Distances of Points [32]
than JFK. In absolute terms, Central Park to LaGuardia is about 5 miles, whereas
Central Park to JFK is about 14 miles. Because they’re closer to each other spatially,
we expect Central Park and LaGuardia to share more common prefix characters than
Central Park and JFK. [32]
sort <(echo "dr5rzjcw2nze"; echo "dr5x1n711mhd"; echo "dr5ruzb8wnfr")
dr5ruzb8wnfr
dr5rzjcw2nze
dr5x1n711mhd
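HBase's prefix scan over sorted rowkeys can be simulated on a sorted Python list. The rowkeys below reuse the three Geo-Hashes from the text; the helper itself is an illustrative sketch, not the HBase client API:

```python
from bisect import bisect_left

def prefix_scan(sorted_rowkeys, prefix):
    """Return all rowkeys starting with `prefix`, touching only one
    contiguous range of the sorted keys, as an HBase prefix scan would."""
    start = bisect_left(sorted_rowkeys, prefix)
    out = []
    for key in sorted_rowkeys[start:]:
        if not key.startswith(prefix):
            break  # sorted order: no later key can match the prefix
        out.append(key)
    return out

rowkeys = sorted(["dr5rzjcw2nze",   # LaGuardia
                  "dr5x1n711mhd",   # JFK
                  "dr5ruzb8wnfr"])  # Central Park

print(prefix_scan(rowkeys, "dr5r"))  # → ['dr5ruzb8wnfr', 'dr5rzjcw2nze']
print(prefix_scan(rowkeys, "dr5"))   # the coarser prefix matches all three
```

Note how the shorter prefix "dr5" pulls in JFK as well: exactly the precision-versus-unwanted-results tradeoff discussed above.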
2.4.3.2 How to Calculate Geo-Hash
Although we represent Geo-Hashes as Base32-encoded character strings, in
reality a Geo-Hash is a sequence of bits representing an increasingly granular sub-
partition of longitude and latitude. For example, 40.78 N is a latitude. It falls in the
upper half of the [-90.0, 90.0] range, so its first Geo-Hash bit is 1. 40.78 is in the lower
half of the range [0.0, 90.0], so its second bit is 0. The third bit is 1 because 40.78 falls
in the upper half of the third range [0.0, 45.0], and so on. We represent this binary value as a
Figure 2.16: Geo-Hash Calculation [32]
sequence of ASCII characters using Base32 encoding. So if the coordinate is ≥ the
midpoint, the bit is 1; otherwise, it is 0. This process is repeated, again cutting the
range in half and selecting a 1 or 0 depending on which half contains the target point.
This is done for both the longitude and latitude values. Then the bits of the two
dimensions are interleaved to create the hash.
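The bit-partitioning and Base32 encoding just described can be sketched as follows; this is an illustrative implementation of the standard Geo-Hash algorithm, not the code used later in this thesis:

```python
# Geo-Hash Base32 alphabet: digits plus letters, excluding a, i, l, o.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=12):
    """Encode (lat, lon) as a Geo-Hash of `precision` characters by
    halving the ranges, interleaving lon/lat bits (lon first), and
    packing every 5 bits into one Base32 character."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits = []
    even = True  # the first bit comes from the longitude
    while len(bits) < precision * 5:
        rng = lon_range if even else lat_range
        val = lon if even else lat
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:          # upper half -> 1-bit, keep [mid, hi]
            bits.append(1)
            rng[0] = mid
        else:                   # lower half -> 0-bit, keep [lo, mid]
            bits.append(0)
            rng[1] = mid
        even = not even
    chars = []
    for i in range(0, len(bits), 5):
        idx = 0
        for b in bits[i:i + 5]:
            idx = (idx << 1) | b
        chars.append(BASE32[idx])
    return "".join(chars)

print(geohash_encode(42.605, -5.603, 5))  # → 'ezs42'
```

Truncation simply drops trailing bits, so a lower-precision hash is always a prefix of the higher-precision one computed for the same point.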
2.4.3.3 Objectives to Achieve in Geo-Hash Indexing
Although the Geo-Hash is an efficient way of converting multiple dimensions into a
single one, it comes with some complications. If the data is indexed using Geo-Hashes,
it has to be queried using Geo-Hash strings. As the size of the grid cells represented
by Geo-Hashes is fixed, the exact representation of an area is a challenge. With this in
mind, we define the following objectives for our index and query design.
1. Fewer Results: Representing an area with a single Geo-Hash might result in a
grid cell far larger than the actual area. If this grid cell is used to index the area,
we might get a lot of unwanted results which have to be filtered at the client.
We need to figure out a way to reduce the number of unwanted
results.
2. Fewer Scans: We can reduce the number of unwanted results by representing an
area with high-precision Geo-Hashes, but this means a lot of hashes are required to
cover a single area. For instance, if we have indexed a column with hashes of length
11 and we want to query it with a comparatively large area that requires 1000
hashes of length 11 to be represented, we would have to send 1000 different scan
requests, which is highly suboptimal. We need to figure out a way to solve this
problem.
3
Distributed Platforms for
Querying
This chapter provides an overview of existing distributed querying platforms. We cat-
egorize the platforms into three types. Section 3.1 describes well-known distributed
spatial platforms. In section 3.2, we briefly discuss existing distributed platforms for
online querying. Parallel Secondo, being the only distributed Moving Object Database,
is described in detail in section 3.3. As the thesis deals with the implementation of
Guting's algebra over HBase, section 3.4 discusses HBase in detail. We conclude this
chapter by describing some of the criteria we considered before selecting HBase for our
implementation.
3.1 Distributed Spatial Data Processing Platforms
3.1.1 Spatial-Hadoop
SpatialHadoop [33] is a MapReduce extension to Apache Hadoop developed at the
University of Minnesota. It has been designed specifically to work with spatial data and
can be used to analyze huge spatial datasets on a cluster of machines. It enables efficient
processing of spatial data by providing spatial data types to be used in MapReduce
jobs, including point, rectangle and polygon [34]. Spatial indexes such as the grid file,
R-tree and R+-tree can be built in HDFS. To efficiently read these indexes in MapReduce
jobs, InputFormats and RecordReaders are provided. Spatial operations are implemented
as MapReduce jobs which access these spatial indexes. It also allows developers to
implement custom spatial operations which can benefit from the spatial indexes. Spatial-
Hadoop comes bundled with Pigeon [35], a spatial extension to Pig which makes
querying easier and more intuitive. All operations in Pigeon are introduced as user-defined
functions (UDFs), which decouples it from users' existing deployments of Pig. The
spatial functionality of Pigeon is based on the ESRI Geometry API, a native Java open-
source library for spatial functionality licensed under the Apache Public License. Pigeon
uses the same function names as PostGIS to ease its use for existing PostGIS
users. To give a feel for the language, the following example computes the
union of all ZIP codes in each city:
zip_codes = LOAD 'zips' AS (zip, city, geom);
zip_by_city = GROUP zip_codes BY city;
zip_union = FOREACH zip_by_city
            GENERATE group AS city, ST_Union(geom);
3.1.2 Hadoop-GIS
Hadoop-GIS [36] is a scalable and high-performance spatial data warehousing system
for running large-scale spatial queries on Hadoop. Hadoop-GIS is based on MapReduce
and improves spatial query performance by using spatial partitioning. It has a cus-
tomizable spatial query engine called RESQUE, which implements operators and
measurement functions to provide geometric computations and implicitly parallel spatial
query execution on MapReduce. It also implements effective methods for amending
query results by handling boundary objects. Hadoop-GIS constructs a global
index based on its partitioning strategy and a customizable on-demand local spatial
index to achieve efficient query processing performance for local operations. To sup-
port declarative spatial queries, an integration with Hive has also been developed. The
architecture of Hadoop-GIS is shown in figure 3.1.
3.2 Distributed Online Querying Platforms
3.2.1 Cassandra
Apache Cassandra is a distributed key-value store developed at Facebook [37]. It
can handle very large amounts of data spread out across many commodity servers.
Figure 3.1: Hadoop-GIS Architecture [36]
Cassandra provides high availability without a single point of failure by replicating
data to servers across multiple data centers. It also provides the option of choosing
between synchronous and asynchronous replication for each update. Its elasticity
allows read and write throughput to increase linearly as new machines are added,
with no downtime or interruption to applications. Its architecture is a mixture of
Google's BigTable [11] and Amazon's Dynamo [38]. As in Amazon's Dynamo, every
node in the cluster has the same role, so there is no single point of failure, unlike HBase.
It resembles HBase in that the data model provides a structured key-value store
where columns are added only to specified keys, which means that different keys can
have different numbers of columns in any given family. Cassandra is a write-oriented
system, whereas HBase was designed for high performance on read-intensive workloads.
Cassandra can be queried using an SQL-style language called CQL (Cassandra Query
Language).
3.2.2 Stinger
Stinger is an improvement of Hive [39] (originally developed at Facebook). Hive provides
an SQL-like language that performs reasonably well for running data-warehouse-style
analytical queries on huge amounts of data; however, it is not suitable for online queries.
Stinger is a community-wide initiative to build interactive querying support into Hive
and claims performance improvements of up to 100x. Significant performance improve-
ments over the original Hive include the introduction of the ORCFile format, a new
query optimizer for complex query operations and a vectorized query engine.
3.2.3 HBase
HBase [40] is an open-source, distributed, column-oriented database system based on
Google's BigTable [11]. It runs on top of Apache Hadoop [42] and Apache ZooKeeper [41]
and uses the Hadoop Distributed Filesystem (HDFS) [42]. HDFS is an open-source
implementation of Google's file system GFS [43] which provides fault tolerance and
replication for the data stored on it. HBase is written in Java. It provides linear
and modular scalability, strictly consistent row-based data access, and automatic,
configurable sharding of data. HBase has limited support for full-fledged schema creation
but supports tables, columns and column families. As it is a key-value store, access is
based on the key only. HBase can be accessed either through its API for real-
time access or by using MapReduce jobs in Hadoop for analytical batch-processing
use cases. Column families can contain many columns. Each row may have a different
set of columns. The table cells are versioned and are stored as an uninterpreted array
of bytes.
3.2.4 Phoenix
Apache Phoenix [44] is a SQL layer over HBase. It is available as a client-embedded
JDBC driver which enables low-latency queries over HBase data in an SQL-like language.
It takes a SQL query, compiles it into a series of HBase scans, and orchestrates the
results to deliver regular JDBC result sets to the client. It stores the table metadata
in an HBase table in a versioned format; versioning enables snapshot queries over
prior versions to run using the correct schema automatically. Under the hood, it uses
the HBase API and, for efficiency, implements most of its functionality using
coprocessors and custom filters. This results in performance on the order of seconds
for big datasets.
3.3 Parallel Secondo
Parallel SECONDO [45] scales up the capability of processing extensible data models
in the SECONDO database system to a cluster of computers. It makes almost all the
operators of SECONDO available to be run in parallel on individual SECONDO nodes
by using the MapReduce framework. The drawback is that the queries have to be
written in the SECONDO executable language, which is non-intuitive and more complex
than SQL. On the other hand, using the executable language, the user can write parallel
queries without learning too many details about the underlying Hadoop platform.
3.3.1 Architecture
Parallel SECONDO couples the Hadoop framework with discrete SECONDO databases
deployed on the nodes of a cluster of computers, as shown in Figure 3.3. Its deployment
is flexible: it can be deployed either on a single computer or on a cluster. Both Hadoop
and Parallel SECONDO are deployed on the same cluster but can be used independently.
Hadoop uses HDFS (Hadoop Distributed File System) for data exchange, whereas
Parallel SECONDO uses a distributed file system called PSFS (Parallel SECONDO
File System), specially prepared for Parallel SECONDO. Each individually deployable
component in Hadoop is called a node; in Parallel SECONDO it is called a Data Server.
A Data Server contains a compact version of SECONDO called Mini-SECONDO and
its database, together with a PSFS node. A single cluster machine can contain many
Data Servers. This increases performance on machines with multiple hard disks: a
Data Server can be deployed on each disk for higher throughput. When parallel queries
are processed, the MapReduce framework is used, which means HDFS is used to assign
tasks to Data Servers; however, most intermediate data is exchanged through PSFS.
Parallel SECONDO contains a master Data Server and many slave Data Servers. The
entry point to the system is only through the master. Parallel SECONDO comes with
a number of PQC (Parallel Query Converter) operators, also called Hadoop operators,
which convert a parallel query into a Hadoop job with various tasks to be executed at
the slave Data Servers. Data Servers process these tasks in parallel inside the Hadoop
framework. The master database stores the metadata of the whole system.
3.3.2 Parallel Query Execution
To understand the system better, let's see how parallel queries are written in the
SECONDO executable language. The SECONDO executable language is more complex
than SQL but allows easier querying than writing MapReduce jobs.
Figure 3.2: PS-Matrix [45]
3.3.2.1 PS-Matrix
Parallel SECONDO uses the concept of a PS-Matrix to distribute data over the cluster,
as shown in Figure 3.2. A SECONDO object is partitioned using two functions, d(x)
and d(y). d(x) divides the data into R rows. These rows are distributed over the
cluster; a single Data Server can contain more than one row. After distribution, d(y)
is used to divide each row into C columns. Hence a PS-Matrix is composed of R×C
pieces.
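The partitioning can be illustrated with a toy sketch; the function ps_matrix and the modulo mapping of d(x) and d(y) onto row and column indices are assumptions made for this example, not Parallel SECONDO's actual implementation:

```python
def ps_matrix(tuples, R, C, dx, dy):
    """Partition tuples into an R x C PS-Matrix: d(x) picks the row,
    d(y) picks the column (both taken modulo the matrix dimensions)."""
    matrix = [[[] for _ in range(C)] for _ in range(R)]
    for t in tuples:
        matrix[dx(t) % R][dy(t) % C].append(t)
    return matrix

# Partition 0..5 into a 2x2 PS-Matrix: rows by value, columns by value // 2.
m = ps_matrix(range(6), 2, 2, lambda t: t, lambda t: t // 2)
print(m)
# → [[[0, 4], [2]], [[1, 5], [3]]]
```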
3.3.2.2 Distributed Data Types
To represent the PS-Matrix, Parallel SECONDO provides a data type flist. It wraps
existing SECONDO objects and makes them distributable by Parallel SECONDO.
After division of an object into a PS-Matrix, the piece data belonging to the flist is
distributed and kept in the slave Data Servers, but the partition scheme is kept in the
master SECONDO as an flist object.
An flist object can be distributed to slave Data Servers in two ways: either the piece
data is stored in the Mini-SECONDO databases belonging to the Data Servers as objects,
or it is stored as files in the PSFS node of the Data Servers. Hence, there are two kinds
of flist objects in Parallel SECONDO.
1. Distributed Local Objects (DLO): In a DLO, large SECONDO objects
are divided into an N×1 PS-Matrix. Each row of this matrix is saved in a slave Mini-
SECONDO database as a SECONDO object, called a sub-object. The sub-objects
that belong to the same flist have the same name in different slave databases.
Theoretically, a DLO flist can wrap all available SECONDO data types.
2. Distributed Local Files (DLF): In a DLF, the data is also divided into an
R×C PS-Matrix, but the difference with DLO is that each piece is saved as a
PSFS file, called a sub-file. During parallel operations, sub-files can be exchanged
between Data Servers. At present, only relations can be saved as sub-files.
An flist object can wrap any kind of SECONDO object, but this is not always optimal.
For example, objects of very small size should not be stored as an flist; rather, they
should be duplicated to the various nodes. The process of duplication is good for small
objects but is very heavy for larger objects, because the objects contain data
Figure 3.3: Parallel Secondo Infrastructure [46]
Figure 3.4: Some examples of flist data type
in nested-list format. If a relation has millions of tuples, there is a lot of overhead
involved in transforming it. Therefore, Parallel SECONDO introduces a new data kind
called DELIVERABLE. Data types belonging to this kind can be duplicated to the
slaves at runtime and can hence be used in parallel queries.
3.3.2.3 Distributed Operators
There are three types of operators specifically designed for Parallel SECONDO: flow
operators, assist operators and Hadoop operators. Flow operators are responsible for
the distribution and collection of objects to and from the slave nodes; in other words,
they connect sequential queries with parallel queries. Two operators of this type, spread
and collect, are explained in the following sections. Assist operators are designed to
work with Hadoop operators: they enable flist objects to be used with normal operators.
Hadoop operators are based on either the Map or the Reduce phase of a Hadoop job.
They allow running sequential SECONDO queries at the slave nodes of Parallel
SECONDO. Explaining each of the operators is beyond the scope of this thesis, but a
few important distributed operators deserve an informal explanation, which will help
to understand Parallel SECONDO better.
1. Spread Operator: The spread operator partitions a SECONDO relation into
a PS-Matrix, distributes the pieces over the cluster, and returns a DLF flist. It
divides a relation into the rows of the PS-Matrix according to a partition attribute
AI. Each row can be further partitioned into columns if another partition attribute
AJ is provided. As it returns a DLF object, each piece of the relation is exported
as a sub-file. The following is an example of the use of the spread operator. The
ffeed operator ("file feed") reads the file 'QueryPoints' from the /home/fmorakzai/data
directory and passes it on to the spread operator. The spread operator partitions
the data into 4 files, each stored on one of the 4 slave SECONDO nodes. When
the data has been stored in the sub-files on each node, a reference to them is
returned to the master node, which stores the metadata as a DLF object.
let QueryPoints_p = "QueryPoints" ffeed['/home/fmorakzai/data';;]
spread[; Id, 4, TRUE;];
33
3. DISTRIBUTED PLATFORMS FOR QUERYING
2. Collect Operator: The collect operator performs the opposite function
to that of the spread operator. It takes as input a DLF-kind flist object,
collects the constituent sub-files distributed over the cluster, and returns a stream
of tuples from the sub-files. The following example explains its operation. The
collect operator reads the QueryPoints_p flist object that was created in the
previous example from the slave nodes and combines the sub-files at the master
into a single tuple stream. This stream is then passed to the count operator,
which counts the number of tuples and returns the result.
query QueryPoints_p collect[] count;
3. Para Operator: An flist can wrap all available SECONDO data types and
work with various SECONDO operators. However, not all operators can recognize
this new data type or know how to process it. To solve this, Parallel
SECONDO implements the para operator, which unwraps flist objects and returns
their embedded data types. After this, the object can pass the type checking of
existing operators.
4. HadoopMap Operator: hadoopMap creates an flist object of either DLO or
DLF kind after the provided sequential operators have been processed by the slaves
in parallel during the map step of the template Hadoop job. The operators provided
as an argument are not evaluated on the master node, but delivered to and processed
on the slaves. Let's take the creation of a distributed B-Tree as an example. The
original sequential query for the creation of a B-Tree index looks like:
let singleTable_btree = singleTable createbtree[Licence];
The singleTable exists at the master and the index is also created at the master,
just like a normal SECONDO B-Tree index. To create a distributed index, we first
need to distribute/partition the singleTable to the slave nodes and store it in the
slave SECONDO databases as a table. The following query does that: the spread
operator distributes the data to the slave nodes as sub-files based on the ID attribute
of the single table, and the hadoopMap operator accepts the DLF object created by
the spread operator and runs the consume operator at each node. The consume
operator stores the sub-files as a table in the slave SECONDO databases, thus
creating a DLO.
let distributedTable = singleTable feed
spread[;ID, 10, TRUE;]
hadoopMap[; . consume];
After running this query, singleTable is distributed over the slave SECONDO nodes
and is represented by the DLO distributedTable. We can create a local B-Tree
for this table at each slave node by passing the table name and the B-Tree creation
function to the hadoopMap operator as follows.
let distributedTable_btree = distributedTable
hadoopMap[; . createbtree[ID]];
This will create local B-Tree indexes and return a reference distributedTable_btree
as a DLO.
3.4 HBase
HBase has been described briefly in section 3.2.3. This section discusses the architecture
of HBase in more detail. HBase is based on the concept of Log-Structured Merge-
Trees (LSM-trees) [47], just like Google's BigTable [11]. To understand the functioning
of HBase, it is necessary to first understand how LSM-trees work.
3.4.1 Log-Structured Merge-Trees
Log-structured merge-trees [47], also known as LSM-trees, store the incoming data
in a logfile first, completely sequentially. Once the log has the modification saved, it
then updates an in-memory store that holds the most recent updates for fast lookup.
When the system has accumulated enough updates and starts to fill up the in-memory
store, it flushes the sorted list of key→record pairs to disk, creating a new store file.
At this point, the updates to the log can be thrown away, as all modifications have
been persisted. The store files are arranged similar to B-trees, but are optimized for
sequential disk access where all nodes are completely filled and stored as either single-
page or multipage blocks. Updating the store files is done in a rolling merge fashion,
that is, the system packs existing on-disk multipage blocks together with the flushed
in-memory data until the block reaches its full capacity, at which point a new one is
started.
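As an illustration, the write path just described can be sketched in a few lines of Python. This is a minimal model for exposition only; the class and attribute names are illustrative and are not HBase's actual classes.

```python
# Minimal sketch of the LSM-tree write path: a modification is appended to a
# sequential log first, then applied to a sorted in-memory store; once the
# in-memory store fills up, it is flushed to an immutable, sorted store file.

class LSMStore:
    def __init__(self, memstore_limit=4):
        self.log = []            # write-ahead log: purely sequential appends
        self.memstore = {}       # most recent updates, sorted on flush
        self.store_files = []    # "on-disk" files: sorted (key, value) lists
        self.memstore_limit = memstore_limit

    def put(self, key, value):
        self.log.append((key, value))   # 1. persist the modification in the log
        self.memstore[key] = value      # 2. update the in-memory store
        if len(self.memstore) >= self.memstore_limit:
            self.flush()

    def flush(self):
        # Write the sorted key->record pairs as a new store file; the log
        # entries covered by this flush can now be thrown away.
        self.store_files.append(sorted(self.memstore.items()))
        self.memstore.clear()
        self.log.clear()
```

Note how the flush converts many random writes into one sequential write of a fully sorted file, which is the core of the LSM-tree's write performance.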
Figure 3.5: Multipage blocks iteratively merged across LSM-trees
Figure 3.5 shows how a multipage block is merged from the in-memory tree into the
next on-disk tree. Merging writes out a new block with the combined result. Eventually,
the trees are merged into the larger blocks. As more flushes are taking place over time,
creating many store files, a background process aggregates the files into larger ones so
that disk seeks are limited to only a few store files. The on-disk tree can also be split
into separate trees to spread updates across multiple store files. All of the stores are
always sorted by key, so no reordering is required to fit new keys in between existing
ones. Lookups are done in a merging fashion in which the in-memory store is searched
first, and then the on-disk store files are searched next. That way, all the stored data,
no matter where it currently resides, forms a consistent view from a client's perspective.
Deletes are a special case of update wherein a delete marker is stored and is used
during the lookup to skip deleted keys. When the pages are rewritten asynchronously,
the delete markers and the key they mask are eventually dropped. An additional
feature of the background processing for housekeeping is the ability to support predicate
deletions. These are triggered by setting a time-to-live (TTL) value that retires entries,
for example, after 20 days. The merge processes will check the predicate and, if true,
drop the record from the rewritten blocks.
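The merging lookup, delete markers, and TTL-based predicate deletes described above can be sketched as follows. This is a simplified model, not HBase's implementation; entries carry an explicit write time here so the TTL predicate can be checked.

```python
import time

TOMBSTONE = object()  # delete marker stored in place of a normal value

def merged_get(key, memstore, store_files, ttl=None, now=None):
    """Search the in-memory store first, then store files newest-first.
    Each entry is (value, write_time); TOMBSTONE markers mask deleted keys,
    and TTL-expired entries are skipped as if dropped by compaction."""
    now = time.time() if now is None else now
    sources = [memstore] + list(reversed(store_files))
    for source in sources:
        if key in source:
            value, written = source[key]
            if value is TOMBSTONE:
                return None                  # masked by a delete marker
            if ttl is not None and now - written > ttl:
                return None                  # retired by the TTL predicate
            return value
    return None
```

Because the sources are searched from newest to oldest, the first match always reflects the most recent modification, giving the consistent client view described above.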
LSM-trees work at disk transfer rates and scale much better to handle large amounts
of data. They also guarantee a very consistent insert rate, as they transform random
writes into sequential writes using the logfile plus in-memory store. The reads are
independent from the writes, so we get no contention between these two operations.
The stored data is always in an optimized layout. So, we have a predictable and
consistent boundary on the number of disk seeks to access a key, and reading any
number of records following that key doesn't incur any extra seeks. In general, what
could be emphasized about an LSM-tree-based system is cost transparency: we know
Figure 3.6: HBase Architecture
that if we have five storage files, access will take a maximum of five disk seeks, whereas
we have no way to determine the number of disk seeks an RDBMS query will take,
even if it is indexed.
The next sections will explain the storage architecture.
3.4.2 Architecture
Figure 3.6 shows an overview of how HBase and HDFS are combined to store data.
The figure shows that HBase handles basically two kinds of file types: one is used for
the write-ahead log and the other for the actual data storage. The files are primarily
handled by the HRegionServers. In certain cases, the HMaster will also have to perform
low-level file operations. The actual files are divided into blocks when stored within
HDFS.
HBase consists of one master and many slave nodes. The slave nodes are called
Region Servers because they contain regions of tables. A table is partitioned based
on its key, and each partition (called a region) is assigned to a Region Server,
which ensures efficient read and write access to the regions it is responsible for.
A region of a table is called an HRegion. A Region Server can host more than one
HRegion belonging to one or more tables. To store an HRegion, the Region Server
uses an LSM-tree as explained before. The actual storage file on disk is
called an HFile, whereas the storage in memory is known as the MemStore. When the
MemStore gets full, it is flushed to disk as an HFile. Thus, a single HRegion can be
stored in multiple HFiles on disk. To improve disk access, these files are later merged
into a single file through a process called compaction. Every Region Server maintains
a Write-Ahead Log (WAL) known as the HLog. Updates to any of the regions it
maintains are first written to the WAL and then stored in the corresponding MemStore.
The WAL (HLog) as well as all HFiles are stored on HDFS for the purpose of
replication and fault-tolerance. Each Region Server has a DFS Client which communi-
cates with HDFS. HDFS DataNode and the Region Server can both exist on the same
machine thus preventing writes or reads over the network.
3.4.3 Write Process
When a client issues a write request, it is routed to the HRegionServer, which hands
the details to the matching HRegion instance. The first step is to write the data to
the write ahead log (the WAL), represented by the HLog class. The WAL is a standard
Hadoop SequenceFile and it stores HLogKey instances. These keys contain a sequential
number as well as the actual data and are used to replay not-yet-persisted data after
a server crash. Once the data is written to the WAL, it is placed in the MemStore. At
the same time, it is checked to see if the MemStore is full and, if so, a flush to disk is
requested. The request is served by a separate thread in the HRegionServer, which
writes the data to a new HFile located in HDFS. It also saves the last written sequence
number so that the system knows what was persisted so far.
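The role of the sequence numbers in crash recovery can be sketched as follows. The function name and entry layout are illustrative, not HBase's real API; the point is that only edits newer than the last flushed sequence number need to be replayed.

```python
# Sketch of crash recovery from the WAL: each log entry carries a sequence
# number (as HLogKey does). After a crash, only entries whose sequence number
# is greater than the last persisted one are replayed into the MemStore.

def replay_wal(wal_entries, last_flushed_seq):
    """wal_entries: list of (seq, key, value) in log order.
    Returns the rebuilt MemStore containing only not-yet-persisted edits."""
    memstore = {}
    for seq, key, value in wal_entries:
        if seq > last_flushed_seq:
            memstore[key] = value   # later entries overwrite earlier ones
    return memstore
```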
3.4.4 Read Process
When a client issues a read request, it is routed to the HRegionServer, which hands
the details to the matching HRegion instance. As the request is always based on the
key, the key is searched in the MemStore. If found, the KeyValue instance is returned.
If the key cannot be found in the MemStore, it is searched for in the existing HFiles,
as more than one HFile may exist. To optimize the search, each HFile contains a
Bloom Filter which can tell with 100% confidence if the file does not contain the key.
If the Bloom Filter suggests that the file contains the key, there is high probability of
its existence. Based on the suggestion of Bloom Filter, the key is searched in the file.
Figure 3.7: An HBase table with two column families [48]
HFile is stored as B-Tree which greatly enhances the search efficiency. If the key is
found, a KeyValue instance is returned.
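The "100% confidence on absence" property is what makes the Bloom filter safe as a pre-check. The following toy sketch illustrates it; HBase's actual implementation uses bit-packed arrays and different hash functions, so treat this purely as an illustration of the principle.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: a 'no' answer is definitive (no false negatives),
    while a 'yes' answer only means the key is probably present."""

    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = [False] * size

    def _positions(self, key):
        # Derive several bit positions from salted hashes of the key.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # If any bit is unset, the key was definitely never added.
        return all(self.bits[pos] for pos in self._positions(key))
```

A read path can therefore skip an HFile whenever `might_contain` returns False, and only pay the B-Tree search when it returns True.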
3.4.5 Data Model
HBase's data model is very different from what you have likely worked with or know of
in relational databases. As described in the original Bigtable paper [11], it is a sparse,
distributed, persistent multidimensional sorted map, which is indexed by a row key,
column key, and a timestamp. The easiest and most naive way to describe HBase's
data model is in the form of tables consisting of rows and columns. The concepts of
rows and columns are slightly different from those in RDBMSs. We define some of the
concepts first:
1. Table: HBase organizes data into tables. Table names are Strings and are composed
of characters that are safe for use in a file system path.
2. Row: Within a table, data is stored according to its row. Rows are identified
uniquely by their row key. Row keys do not have a data type and are always
treated as a byte[] (byte array).
3. Column Family: Data within a row is grouped by column family. Column
families also impact the physical arrangement of data stored in HBase. For this
reason, they must be defined up front and are not easily modified. Every row in a
table has the same column families, although a row need not store data in all its
families. Column families are Strings and composed of characters that are safe
for use in a file system path.
4. Column Qualifier: Data within a column family is addressed via its column
qualifier, or simply, column. Column qualifiers need not be specified in advance.
Column qualifiers need not be consistent between rows. Like row keys, column
qualifiers do not have a data type and are always treated as a byte[].
5. Cell: A combination of row key, column family, and column qualifier uniquely
identifies a cell. The data stored in a cell is referred to as that cell's value. Values
also do not have a data type and are always treated as a byte[].
6. Timestamp: Values within a cell are versioned. Versions are identified by their
version number, which by default is the timestamp of when the cell was written.
If a timestamp is not specified during a write, the current timestamp is used. If
the timestamp is not specified for a read, the latest one is returned. The number
of cell value versions retained by HBase is configured for each column family. The
default number of cell versions is three.
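The six concepts above can be summarized as a nested sorted map. The following minimal Python model is illustrative only (it is not an HBase client API), but it captures the cell addressing and the default versioning behaviour described here.

```python
# Sketch of the Bigtable/HBase data model:
# (row key, column family, column qualifier) -> {timestamp: value}.
# A read without an explicit timestamp returns the latest version.

class SparseTable:
    def __init__(self):
        self.cells = {}   # (row, family, qualifier) -> {timestamp: value}

    def put(self, row, family, qualifier, value, ts):
        self.cells.setdefault((row, family, qualifier), {})[ts] = value

    def get(self, row, family, qualifier, ts=None):
        versions = self.cells.get((row, family, qualifier), {})
        if not versions:
            return None
        if ts is None:                 # no timestamp given: latest version wins
            ts = max(versions)
        return versions.get(ts)
```

The map is sparse in exactly the sense described: a row stores nothing for families or qualifiers it does not use.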
3.5 Choice of Platform for Guting’s Algebra
A number of NoSQL databases are available in the market each with a particular set
of features. Deciding about the platform of choice is not an easy task. Before going for
HBase, we outlined some criteria which we considered important for the platform for
implementation of the algebra. These criteria are described in the following.
3.5.1 Schema Design
One of the features we were looking for, was the ability to design flexible schemas.
Although almost all of the online NoSQL platforms are Key-Value stores, some of them
provide an abstraction over physical storage which allows users to design schemas. This
greatly enhances the usability of the system and makes it intuitive. It also makes it
easier for people from RDBMS community to migrate to the system. All of the systems
under discussion hold schema design capabilities.
3.5.2 Indexing
As the objective of the thesis is to have online querying support for trajectories or
moving objects, indexing is an important criterion. Moving Objects Database queries
are normally multi-dimensional, and a distributed system with indexing support performs
better for such queries. The system should not only provide indexing support but should
also allow the development of custom indexes. This is important because different types of
applications require different kinds of indexes that perform efficiently for that particular
use-case. All of the systems under discussion possess indexing capabilities. Indexing in
Cassandra is limited as it only helps if the column to be indexed has low cardinality.
We also require the platform to have the support of multi-dimensional indexing. By
this we mean that it should allow indexes on different columns of the same table.
3.5.3 Partitioning Control
Trajectory data is difficult to process when the data grows bigger especially when it
is distributed over a cluster of computers. Imagine a trajectory X stored at node
A. If a new point that is part of trajectory X, arrives and is stored at node B, the
trajectory becomes distributed which makes it difficult to process. A single operation
on trajectory X will require other parts of this trajectory to be retrieved over the
network which is very costly. Some queries require data belonging to a certain time
period to be processed together. In that case the data should be partitioned by time.
Keeping this in view, the system should give us the control to partition the data that
suits us best for the implementation of Guting’s algebra.
3.5.4 Co-location
Partitioning of data allows us to have similar data on a single node, but this is not
enough: it protects us from network latencies but not from disk seeks. Keeping in
view the nature of queries based on Guting's algebra, we believe that co-location of
data is crucial. If we have an area query asking for all the points inside a region
fulfilling a certain criterion, we want all those points to be located together on disk.
If the data is co-located, we can access all those points with a single seek and then
operate at disk transfer rate. We can get co-location of data in HBase by carefully
designing its key; in Cassandra, however, it is difficult to achieve, as Cassandra hashes
the key and does not guarantee co-location on disk.
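The effect of key design on co-location can be illustrated with a small sketch. The key layout below (a fixed-width trajectory id followed by a timestamp) is a hypothetical example, not the schema used later in the thesis; it only demonstrates why order-preserving keys keep related rows adjacent while hashed keys scatter them.

```python
import hashlib

def composite_key(trajectory_id, timestamp):
    """Order-preserving key: all points of one trajectory sort together,
    ordered by time, so a single seek + sequential scan retrieves them."""
    return f"{trajectory_id:08d}{timestamp:013d}"

def hashed_key(trajectory_id, timestamp):
    """Cassandra-style hashed key: lexicographic order no longer follows
    trajectory or time, so a trajectory's points scatter over the keyspace."""
    return hashlib.md5(f"{trajectory_id}:{timestamp}".encode()).hexdigest()

points = [(7, 1000), (7, 2000), (8, 1500), (7, 3000)]
colocated = sorted(composite_key(t, ts) for t, ts in points)
# All three keys of trajectory 7 are adjacent in the sorted key order:
assert [k[:8] for k in colocated] == ["00000007"] * 3 + ["00000008"]
```

Since HBase stores rows sorted by key, the `composite_key` layout maps directly onto physically contiguous storage within a region.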
3.5.5 Scan Performance
Most of the queries that we can think of using Guting's algebra are range queries or
require a scan, e.g. "Give me the license details of all taxis within 3 km of Berlin Hbf
right now". The platform should give reasonable scan performance. Apache Cassandra
is good at point queries; Apache HBase, on the other hand, gives reasonable scan
performance because of the way it sorts and stores the data co-located based on its key.
3.5.6 Transactions
As our aim is to have the algebra implemented for on-line querying, the system is
expected to be deployed in production and to accept a lot of live input. This means
that there will be a lot of inserts and updates. We mention updates because of the way
Guting's mpoint data type is stored using the object-based approach, i.e. in a single
row: the row has to be updated each time a new point arrives that belongs to the
trajectory. This motivates the support of transactions. HBase provides row-level
consistency and transactional support, whereas Apache Cassandra does not.
Apache Stinger, or in other words Apache Hive, does not support insertions at all.
3.5.7 Latency
Our objective is to have Guting's algebra in an on-line environment, which means that
whatever platform we choose for our implementation should have low query processing
latency. Apache HBase and Cassandra are both good at this. Although Project Stinger
improved the performance of Apache Hive many times over, it is still a platform for
offline analytical queries.
3.6 Summary
In this chapter we discussed various open-source distributed platforms for query pro-
cessing that we could use for the implementation of Guting's algebra and that could give
comparable performance to Parallel SECONDO, which to our knowledge is the only
parallel moving object database. We explained the criteria for our choice of the right
Ser. Feature Stinger Cassandra HBase
1. Index partial partial yes
2. Partitioning Control no no yes
3. Co-location no no yes
4. Scan Performance yes no yes
5. Transactions no no yes
6. Low Latency no yes yes
Table 3.1: Platform Selection Criteria
platform. Table 3.1 summarizes the discussed platforms’ capabilities according to our
shortlisted criteria. After this comparison, we chose Apache HBase as our platform.
4 Algebra Implementation
This chapter presents the implementation of Guting's algebra over HBase. In section 4.1
we discuss the motivation behind the use of Apache Phoenix instead of raw
HBase. Section 4.2 discusses the choice of data structure for implementation of data-
types. Section 4.3 presents the data structures of various types implemented in Apache
Phoenix. As the operators have already been explained in chapter 2, we conclude this
chapter by summarizing some points regarding the implementation of operators.
4.1 Motivation behind the use of Apache Phoenix
Initial work on the thesis was done on raw HBase. Although this approach could lead
to better performance, the following drawbacks led to the choice of Apache Phoenix.
1. The schema design could be highly optimized for a particular dataset (BerlinMOD
for this thesis), but there was no generalized way of describing it, which means
that the same effort would have to be repeated for every new dataset.
2. Querying support for the moving object data was provided in the form of a Java
API. Operators like "feed" and "filter" were provided, but the user had to
know the storage structure and the schema design intricacies to write an optimal
query.
3. No SQL interface was available, which means no automatic optimization was
possible.
4. For optimization purposes, co-processors were used, which run in the HBase Region
Server process space. If the code crashed, it would take down the region server
with it. Using such an algebra implementation requires a lot of trust from the user.
4.2 Implementation Approaches
Keeping in view the above drawbacks, Apache Phoenix was chosen to implement the
algebra. At the time of writing of this thesis, Phoenix did not support the definition of
custom data types. To overcome this, the following approaches were considered.
4.2.1 Use of Struct
Most commercial databases support a STRUCT data type, which allows users to define
structures of their own and store custom objects. Support for the STRUCT data type is
work in progress in Phoenix, which led to the consideration of the following approaches.
4.2.2 Binary Objects
Use of binary objects is another way of defining custom data types. Phoenix supports
a binary data type in which any custom object/data can be stored. This approach was
avoided because data in binary form is hard to handle, especially during debugging
of operators, loading scripts, and the data types themselves. Also, decoding the binary
data each time an operation has to be performed is costly, especially with a huge
number of rows.
4.2.3 Data Type Flattening
Another approach for the implementation of data types is flattening custom data types
to the natively supported data types of the system. This means that all custom data
types are flattened to arrays of native types. This approach has two benefits. First,
it does not require a decoding process, and operations can be performed on the native
data types. Second, it makes the development process simple, as the data is easily
readable while debugging.
4.3 Data Structures
4.3.1 Spatial Data Types
For the implementation of Guting’s data-types, we used the flattening approach. In
the following, the representation of various implemented data types is shown.
4.3.1.1 Point
float[3]-->{1,x,y}
A point is represented by an array of 3 float numbers. The first element, "1", denotes
the type code, and the next two numbers represent the x and y coordinates of the point.
The type code is used by the operators to identify the type of object passed to them.
4.3.1.2 Points
float[2n+6]-->{2,bb{xmin,ymin,xmax,ymax},numPoints,
x1,y1,x2,y2,...xn,yn}
The Points data type is represented by an array of type float. The first element of the
array represents the type code. The next 4 elements represent the bounding
box covering all points. The bounding box helps in indexing the data type; it also
prevents the need to parse all objects when applying an operator, since only those
objects are parsed whose bounding box intersects with the area of interest.
The next element represents the number of points n contained in the array.
The points themselves are stored starting at index 6 up to index 6 + 2n - 1 (0-based).
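A sketch of this flattening for Point and Points, following the layouts given above. The helper names are illustrative (they are not Phoenix functions), and 0-based array indices are assumed.

```python
# Flattened representations: Point -> {1, x, y};
# Points -> {2, bbox(xmin, ymin, xmax, ymax), numPoints, x1, y1, ..., xn, yn}.

def encode_point(x, y):
    return [1.0, x, y]                       # leading 1.0 is the type code

def encode_points(pts):
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    bbox = [min(xs), min(ys), max(xs), max(ys)]
    flat = [c for p in pts for c in p]       # x1, y1, x2, y2, ...
    return [2.0] + bbox + [float(len(pts))] + flat

def decode_points(arr):
    assert arr[0] == 2.0, "type check via the leading type code"
    n = int(arr[5])                          # numPoints sits after the bbox
    coords = arr[6:6 + 2 * n]
    return [(coords[i], coords[i + 1]) for i in range(0, 2 * n, 2)]
```

Because the bounding box occupies fixed slots near the front of the array, an operator can inspect `arr[1:5]` and discard irrelevant objects without ever running the full decoder.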
4.3.1.3 Line
float[n]-->{3,bb{xmin,ymin,xmax,ymax},numPoints,length,
x1,y1,x2,y2,...xn,yn}
Line data type is represented by a float array. The first element of the array represents
the type code. Next four elements represent the bounding box for the line. The
bounding box is stored to make spatial queries faster as the lines can be filtered just by
comparing the bounding box with query parameter instead of comparing all the points
inside the line with the query parameter. The next element represents the number of
points contained in the line. The next element is the length of the line. This parameter
is put here to improve the performance of queries based on line length. After that the
points of the line are stored in the array. If there are n points in the line, the size of
the array will be 2n+ 7.
4.3.1.4 DLine
float[n]-->{4,bb{xmin,ymin,xmax,ymax},numLines,length,
numPts,x1,y1,x2,y2,...xn,yn,
numPts,x1,y1,x2,y2,...xn,yn,
numPts,...}
The DLine data type represents a line with disconnections or breaks. In Guting's
data types, Line can represent a disconnected line, but for the sake of compatibility
with other spatial libraries, Line has been used here to represent a connected line,
whereas DLine is used to represent a line with breaks. DLine is represented as a float
array. The first element represents the type code. The next four elements represent
the bounding box of the DLine. The next element represents the number of connected
lines in the data type, after which the total length of the DLine is stored. From here
on, the data of the individual lines is stored. Each connected line starts with an
element giving the number of points in the line. This tells the decoder how many
elements of the array to scan next: after twice this number of elements, the next line
begins.
4.3.1.5 Region
float[n]-->{5,bb{xmin,ymin,xmax,ymax},numFaces,
numPts,x1,y1,x2,y2,...xn,yn,
numPts,x1,y1,x2,y2,...xn,yn,...}
The Region data type represents an area/region. The difference between this and the
conventional Region type used in the geo-spatial domain is that this data type can
represent a disconnected region. A disconnected region consists of more than one
region, which may or may not share common area; in simple words, it is a collection
of regions. This data type is represented by a float array whose first element is the
type code. The next four floats represent the bounding box of the area covered by all
regions combined. This helps in faster query processing, as the operator does not need
to parse the whole
Ser. Data Type Array Type   Array Structure
1    UInt      long[3]      {11,int,time}
2    UBool     long[2]      {12,+-time}
3    UReal     double[3]    {13,double,time}
4    UString   varchar[n+3] {14,size,varchar,time}
Table 4.1: Unit Data Types
Ser. Data Type Array Type Array Structure
1    UPoint    float[4]   {21,x,y,time}
2    UPoints   float[n]   {22,Points,time}
3    ULine     float[n]   {23,Line,time}
4    URegion   float[n]   {24,Region,time}
Table 4.2: Spatial Unit Data Types
data type to check whether it lies in the area of interest. The next float represents the
total number of disconnected regions the object contains. From here on, the individual
regions are represented. Each region starts with numPts, which is the number of points
on this region's boundary. numPts and numFaces are required for parsing: using this
information, the decoder of an operator knows where a region starts and ends in
the float array.
4.3.2 Basic Unit Data Types
As explained before, a data type can be converted to a unit type by adding a time
attribute to it. Table 4.1 shows the data structures for implemented basic unit data
types. All of the following types are implemented as arrays.
4.3.3 Spatial Unit Data Types
The data structures of the spatial data types have been explained before. Spatial unit
data types add a time attribute to them. Table 4.2 shows the data structure of the
implemented spatial unit data types. The spatial data types embedded in spatial unit
data types do not contain their type code.
Ser. Data Type Array Type Array Structure
1    RINT      int[n]     {31,numComponents,min,max,s,e,s,e...}
2    RBOOL     int[n]     {32,numComponents,min,max,s,e,s,e...}
3    RREAL     double[n]  {33,numComponents,min,max,s,e,s,e...}
Table 4.3: Basic Range Data Types
4.3.4 Basic Range Data Types
Basic range data types represent ranges of basic types, e.g. int, bool and real. Table 4.3
shows the structure of these types. Each data type starts with the type code, followed
by the number of ranges it contains. min and max denote the bounds of all the ranges
stored in the object of the data type. Each range is represented by a start value s and
an end value e.
4.3.5 Temporal Range Data Types
Periods is the temporal range data type. It is represented by an array of long with
following data structure:
long[n]-->{36,numComponents,min,max,s,e,lc,rc,s,e,lc,rc...}
Just like the other data types, Periods starts with the type code. The next number,
numComponents, denotes the number of periods the object contains. min and max denote
the minimum and maximum timestamps that bound all of the periods.
4.3.6 Basic Temporal Data Types
By basic temporal data types we mean the moving versions of the basic types, e.g.
MInt and MBool. The following is the flattened representation of the basic moving int
type. For MBool and MReal, int is replaced by the corresponding basic data type.
float[n]-->{5*,min,max,no-components,periods(deftime),
{numPoints,n1,int,t1,int,t2.....int,tn1},
{numPoints,n2,int,t1,int,t2....,int,tn2}...}
The first element of the array denotes the data type code. This is used by the operators
for type checking and for selecting the corresponding decoder. min and
Ser. Data Type Array Type Type Code
1 MInt float[n] 51
2 MBool float[n] 52
3 MReal float[n] 53
4 MString varchar[n] 54
Table 4.4: Basic Temporal Data Types
max represent the bounds of all the objects contained in the type. The no-components
variable shows the number of connected components the data type object holds. The
periods element contains the temporal boundaries of each object it holds. After this
element, all the individual connected basic moving objects are represented. Table 4.4
shows the array type and type code of each basic moving type.
4.3.7 Spatio-Temporal Data Types
Spatio-temporal data types are also called moving spatial types and include MPoint,
MPoints, MLine and MRegion. During the course of the thesis, only MPoint has been
implemented, as this is the type with the most real-world use-cases. This also means
that the implemented operators only support MPoint in their operations. It is not
difficult to extend the operators to also support the other moving spatial types. The
following section explains the flattened representation of MPoint.
4.3.7.1 MPoint
float[n]-->{61,bbox,no-components,periods(deftime),
{n1,x1,y1,t1,x2,y2,t2.....xn1,yn1,tn1},
{n2,x1,y1,t1....,xn2,yn2,tn2}...}
The MPoint data type is represented by a float array. The first element is the type
code. The next four elements represent the bounding box covering the whole MPoint.
The following element represents the number of components in the MPoint: as an
MPoint can have breaks, this number is the number of connected MPoints. The next
few elements represent the Periods during which the MPoint is defined. After the
Periods, the connected MPoints are stored. The first element of a connected MPoint
denotes its number of points, and the next elements represent the points themselves.
Each point is represented by a set of 3 numbers {x, y, t}, where t is the time instant
and x and y are the coordinates of the object at that instant. This is followed by the
next connected MPoint, and so on.
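As an example of how such a representation is used, the following sketch evaluates an MPoint at a time instant. It assumes the flat array has already been decoded into its connected components (lists of (x, y, t) samples) and that the position moves linearly between consecutive samples; the linear interpolation is an assumption made for this illustration, and the function name is not part of the implemented operator set.

```python
# Evaluate a decoded MPoint at an instant t. components: list of connected
# components, each a time-ordered list of (x, y, t) samples.

def at_instant(components, t):
    """Return the interpolated (x, y) at instant t, or None if t falls in a
    gap between connected components (where the MPoint is undefined)."""
    for comp in components:
        for (x1, y1, t1), (x2, y2, t2) in zip(comp, comp[1:]):
            if t1 <= t <= t2:
                f = 0.0 if t2 == t1 else (t - t1) / (t2 - t1)
                return (x1 + f * (x2 - x1), y1 + f * (y2 - y1))
    return None
```

The Periods element of the flattened layout serves exactly this purpose at the storage level: it lets an operator decide whether t falls into a defined interval before decoding any points.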
4.4 Operators
All the operators in figure 2.5 were implemented as custom functions in Apache Phoenix,
except for concat and circle. Guting's algebra contains many other operators, but
these are the ones used in the BerlinMOD benchmark. As Apache Phoenix
does not support custom data types, it cannot distinguish between the data types we
implemented, which means it cannot do type checking either. Each operator therefore
does type checking inside the function using the type codes. For pure spatial functions
we use the JTS Topology Suite [49], which conforms to the Simple Features Specification for
SQL published by the Open GIS Consortium [50]. The operators have to parse the data
type objects before JTS can be used. For efficiency, instead of parsing the
complete object and applying the operator, only the metadata of the data-type object
is read to see whether the object is relevant. This filters out irrelevant objects before an
operation is applied. For example, if an intersection is required between a Region column
and an MPoint column, the operator first checks whether the bounding box of the MPoint
stored in its metadata overlaps with the region. If it does not, a null is returned instead
of parsing the object and applying the intersection to each contained connected MPoint.
All the operators operate on array types instead of casting them to objects, for performance
reasons, although this makes the code less readable and less intuitive.
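The bounding-box pre-filtering described above can be sketched as follows. The helper names are illustrative (not the actual Phoenix functions), and the bounding box is assumed to sit in array slots 1 to 4, as in the layouts of this chapter.

```python
# Pre-filter flattened objects by bounding box before full decoding:
# only objects whose bbox overlaps the query region are worth parsing.

def bboxes_overlap(a, b):
    """a, b: (xmin, ymin, xmax, ymax) rectangles."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def prefilter(flat_objects, query_bbox):
    """Yield only the flattened arrays whose metadata bbox (slots 1..4)
    intersects the query region; everything else is skipped unparsed."""
    for arr in flat_objects:
        if bboxes_overlap(tuple(arr[1:5]), query_bbox):
            yield arr
```

For skewed data this check discards most rows with four float comparisons each, which is why the bounding box is stored in the metadata of every flattened type.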
5 Indexing Strategy & Querying Framework
This chapter presents our index design for Guting's algebra in HBase. We start by
presenting different index deployment strategies. We motivate our use of an SFC (Space
Filling Curve) based index, present three logical and two physical index design
strategies, and discuss their querying methodology. We then present our querying
framework, which is capable of benefiting from our index implementation, describe its
components in detail with examples, and discuss the techniques we used to optimize
index access. We conclude the chapter by describing future work.
5.1 Indexing in HBase
Any data type that can be sorted by HBase and is used as a table key can be
considered indexed, because HBase optimizes its row access based on keys
and uses techniques like Bloom filters for faster query processing. In Guting's spatio-
temporal algebra, time and space are first-class citizens and most of the operators work
on these dimensions. As time is represented as a long, it can be sorted and handled
by HBase without any extra effort; a spatial data type like Point or Region, however,
requires a different approach to indexing, and an MPoint involves both the spatial and
temporal dimensions, which is even more complex to index. In the following sections,
our approach to spatial indexing for composite/complex data types is explained. We
also present arguments to support our choice of an indexing strategy based on Space Filling
Curves.
5.2 Indexing Strategies
A lot of work has been published on spatial, temporal and spatio-temporal indexes.
Some surpass others on certain performance criteria. Before even considering their
performance, it is important to consider how they can be deployed in a distributed
environment. Indexes like the R-Tree, KD-Tree or R+-Tree can either be deployed
as a global index serving the whole cluster or as a local index on each slave node.
In addition, there exist indexes like SETI which are especially designed for
distributed systems. The following sections discuss the pros and cons of each
deployment approach.
5.2.1 Maintaining a Global Index
Maintaining a global index is the easiest approach, as it stores all the indexing
information in a single place, which in most cases is the master node. This allows the
master node to determine which rows to access before sending the query to the slave
nodes, and to send the query only to those slave nodes where the data is actually
present. The advantages of maintaining a global index are the following:
1. Easy maintenance, as the index is stored at a single location.
2. Index-based joins are processed as local joins, which avoids distributed join
operations and greatly improves query performance.
3. Almost all non-distributed indexes can be customized to be used as a global index
for a distributed system.
These advantages seem tempting, but when the global indexing approach is considered
in light of scalability, fault-tolerance and throughput, the following problems can be
identified:
1. When the data size grows, the size of its global index also grows and at some point
exceeds the capacity of a single node.
2. Due to the limited memory capacity of a single node and the large size of a global
index, it is not possible to cache the whole index.
3. The performance of most non-distributed indexes degrades as they grow. As a
global index is responsible for indexing the data of all slave nodes, its size grows
dramatically, which leads to quicker performance degradation.
4. When the number of queries grows, the master node becomes a bottleneck because
every query is routed through it for index lookups. Some strategies can partially
mitigate this problem, but they add further complexity and sacrifice simplicity.
5. HBase is a scalable and fault-tolerant system but does not support integration
with custom indexes; a custom index can only be implemented outside of its
indexing framework. A custom index (e.g., an R-Tree) therefore does not inherit
fault-tolerance and scalability from HBase, and a lot of complex work is needed
for intelligent replication to make the index fault-tolerant.
5.2.2 Maintaining Local Indexes
In this approach each slave node maintains its local index. Parallel SECONDO (the
system with which we compare our implementation) uses this approach and builds local
spatial and B-Tree indexes at the slave nodes. This approach has following benefits.
1. This approach is more scalable than global indexing approach. Each index only
handles the data stored in the node where it is deployed.
2. The index can be cached in local nodes because of less size of the data.
There are a few disadvantages of using local indexes as well:
1. Although local index joins can be performed, a global join is still required.
2. Local indexes can not be used unless a complete framework is developed for
support of custom local indexes. All queries will have to be intercepted and
re-written in the HBase co-processors to use local indexes.
3. When the size of a region grows, HBase splits the region into two and moves
the second part to another node in the cluster. As local indexes would be
implemented without support for custom indexes in HBase, features like index
splitting, replication and fault-tolerance would have to be implemented from
scratch.
5.2.3 Maintaining Distributed Indexes
There are a few distributed spatio-temporal indexes, such as SETI. They are scalable,
but it is hard to integrate them into HBase: the index would have to be built outside
HBase and queries would have to be rewritten before being sent to HBase. This
amounts to processing two queries on two different systems instead of writing one
query for HBase. Another drawback is that these indexes do not handle scalability
and fault-tolerance automatically; a lot of effort is needed to add these qualities to
them.
5.2.4 SFC based Indexing for HBase
Space Filling Curves (SFCs) have been discussed in chapter 2.4. As discussed before,
the indexing strategy of HBase is based on sorting the keys and storing the data in
HFiles like a B-Tree to optimize disk access. In other words, HBase can only index a
single dimension. As we intend to index trajectory data, which is multi-dimensional,
SFCs can be used to convert multiple dimensions into a single dimension. The spatial
dimensions of a data type can be reduced to a single dimension, which can either be a
number or a base-32 encoded string; this value can then be sorted and indexed by
HBase. The benefit of this approach is that it is scalable, and we do not need to take
care of fault-tolerance, replication etc. ourselves. It may not be the most optimal
approach, but the benefits of scalability, fault-tolerance and simplicity are a strong
motivation for us to use it. Things get trickier when we want to index a Region or
mpoint data type; in the coming sections, we discuss how to tackle these problems.
5.3 Spatial Index Design for LSMT
Log Structured Merge Trees (LSMTs) have been discussed before. HBase is an
LSMT-based key-value store and, as mentioned before, indexes data based on the key.
The design of the key heavily affects querying performance: data is sorted according
to the key, and HBase optimizes its disk access to retrieve data based on it. HBase
allows us to design a composite key to optimize the performance of our queries for
our use-case. Before presenting our approach to indexing spatial data, we outline
three crucial objectives for our design with an example. Let's say we have the following
query to process: "Find all cars within 200 meters of Theodor-Heuss Platz". Keeping
this query in mind, the following three objectives should be taken care of while
designing a spatial index:
5.3.1 Co-location
All points located close to each other in space should also be stored close to each
other on disk. Considering the above query, if all cars within 200 meters of
Theodor-Heuss Platz are co-located on disk, retrieving them requires a single seek and
the result is returned at disk transfer speed. If the cars are spread over the disk,
multiple seeks are required and the results are returned at disk seek speed, which is
many times slower than the transfer speed. Our choice of an SFC-based index already
achieves this, as SFC hashes are similar for nearby points (except for a few edge
cases) and HBase stores the hashes in sorted order.
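As an illustration of this co-location property, the following minimal GeoHash encoder (a sketch written for this text, not the library implementation used in the thesis) shows that two nearby points produce keys sharing a long common prefix, so HBase's key-sorted storage places their rows next to each other:

```python
# Minimal GeoHash encoder, for illustration only. Nearby points share key
# prefixes, so HBase's lexicographically sorted keys co-locate them on disk.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision=11):
    """Encode a lat/lon pair into a base-32 GeoHash string."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    bits, even = [], True  # GeoHash interleaves bits, longitude first
    while len(bits) < precision * 5:
        rng, val = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits.append(1)
            rng[0] = mid
        else:
            bits.append(0)
            rng[1] = mid
        even = not even
    # Every 5 bits select one base-32 character
    return "".join(BASE32[int("".join(map(str, bits[i:i + 5])), 2)]
                   for i in range(0, precision * 5, 5))

# Two cars a few hundred meters apart near Theodor-Heuss Platz (coordinates
# are illustrative) share a long prefix, so their rows sort next to each other.
car_a = geohash(52.5096, 13.2730)
car_b = geohash(52.5101, 13.2745)
```

By construction, a shorter GeoHash is always a prefix of the longer GeoHash of the same point, which is what makes the prefix-based querying discussed later possible.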
5.3.2 Lesser Size of Unwanted Data Scan
The amount of unwanted data returned should be small. Unwanted data is data which
does not fulfill the query criteria; in the above example query, the cars beyond 200
meters of Theodor-Heuss Platz are unwanted data. GeoHash has been explained in
detail in section 2.4.3. The grid sizes of GeoHash are fixed and depend on the level of
detail chosen. If GeoHash or any other SFC is used for indexing, a region is
represented by the grid cell it lies in. In the example query, we want to find all cars in
the circle of radius 200 meters centered at Theodor-Heuss Platz. To process this
query, the grid cell completely containing the circle is found. This grid cell covers an
area greater than that of the circle, so retrieving all cars in the cell also brings some
extra cars which are not inside the circle; these extra cars are then filtered out. If the
cell is too big, more unwanted data is retrieved, which is costly. This happens when
the grid cell of one level is slightly smaller than the circle and the grid cell of the next
coarser level has to be used instead, because the difference in area between grid cells
of two consecutive levels is significantly large. To prevent this, instead of finding a
single grid cell which covers the query circle, we can find multiple smaller grid cells
that together cover it.
5.3.3 Lesser Scans
When we represent a circle with smaller grid cells, the number of grid cells required
to cover the circle can become very large. For example, the circle within which we
want to find the cars may be very big and require 1000 grid cells of level 11;
processing this query would then require 1000 scans in HBase. The number of scans
should therefore be kept small.
5.4 Our Approach
We achieve the objectives of section 5.3 at two levels, i.e., indexing and querying. For
indexing, we present three different strategies for indexing a region and argue which
of them can be used for primary or secondary indexing. At the querying level, we
present a querying framework which optimizes queries using data statistics and
schema meta-data to achieve the above-mentioned objectives.
5.4.1 Preliminary Choices
5.4.2 Choice of SFC Index
HBase sorts every table on its key attribute. It optimizes disk access by using Bloom
filters and by storing the HFiles like a B-Tree, which allows it to quickly reach the
required key. Range queries are also fast because the data is stored in sorted order.
This relieves HBase of the complexity of maintaining a separate index for each Region
in each Region Server and of creating a new index after a region split.
The decision to use the built-in index support for table keys was based on the fact
that it lets us inherit all the properties of HBase, like scalability, replication and
fault-tolerance. The problem is that it only provides one-dimensional indexing and
querying support, while we want to index multi-dimensional data like points, lines,
regions and mpoints. To solve this problem, we use Space Filling Curves such as the
Z-order curve, the Hilbert curve or GeoHash. An SFC converts multi-dimensional
data into a single dimension, which allows us to query HBase for this attribute by
using it as a table key. A (two-dimensional) point is converted into a one-dimensional
hash value, which can be sorted and stored by HBase and queried efficiently.
5.4.2.1 Choice of Geo-Hash
SFCs like the Z-order curve, the Hilbert curve or GeoHash all pursue the same
objective: they convert multiple dimensions into a single dimension while trying to
maintain data locality, and each of them has edge cases where it fails to do so. We
could not find any study comparing their performance, so the choice was hard to
make. We chose GeoHash for our indexing approach because of its acceptance in the
open-source and GIS communities; Amazon, for example, uses it to index spatial data
in DynamoDB. From now on, we will use grid cell and hash interchangeably, as a grid
cell is represented by a hash; likewise, grid level and hash-length represent the same
concept. GeoHash divides the dimensions into hierarchical grids, where the deepest
level is 12, which represents a point. Unlike in an R-Tree, the size of the grid cells is
fixed. This means that GeoHash cannot exactly represent data types covering an
area, e.g., a Region, a Line or an mpoint, with a single hash value; if a single hash is
demanded, it usually represents an area far greater than the actual area covered.
Guting's data types can have standard, spatial or temporal dimensions. The standard
dimension can be handled by HBase without hassle. In this thesis, we focused on the
spatial dimension and left the temporal dimension as future work. The spatial
dimension can be of type point or region. Indexing a point is simple and requires a
12-character GeoHash to be used as a key in an HBase table. In the following
sections, we present our approach to indexing data types that represent an area, such
as a region or an mpoint.
5.4.3 Indexing a Region
Indexing a point with an SFC is easy, but data types of higher dimensionality than a
point are more complex. The reason is as follows. A point can be exactly represented
by a length-12 GeoHash, which makes exact matches possible when querying point
data. A region, in contrast, usually cannot be exactly represented by a GeoHash grid
cell: a region can have any shape or size, whereas a grid cell in GeoHash is always
fixed, and the probability that a region has exactly the same shape and size as a grid
cell is very low. Indexing a region is therefore always an approximation, because the
index cannot deliver exact matches. In other words, we consider a region index a
'filter': we use it to retrieve candidate results and then perform exact spatial
operations to obtain the true results. The same approach is used in an R-Tree, which
can only index objects in terms of their bounding boxes. An object whose bounding
box matches the query is retrieved, but then the actual object is checked to see if it
satisfies the query criteria; if it does, it is retained, otherwise it is discarded. The
drawback of the SFC-based approach is that objects are indexed based on the grid
cells in which they lie, which are fixed in advance and often cover far more area than
the bounding box of the actual object. This means that after querying the index,
more rows remain for the actual operator to be applied on than with an R-Tree. In
the following sections we present three different logical approaches for indexing a
region, i.e., SLSH, SLMH & MLMH, and two approaches for physically maintaining
the index, i.e., single-index & multi-index.
5.4.3.1 Single-Level Single-Hash (SLSH)
In this approach a single region is represented by a single hash. A single hash has a
unique level, which is why we call this approach SLSH (Single-Level Single-Hash).
Using a single hash to index a region is the simplest solution, but it can also be
highly inefficient. Consider a small region r which can easily be covered by a grid cell
a11 of hash-length 11. If r lies on the border between two grid cells of length 11, i.e.,
a11 & b11, it can no longer be indexed using either a11 or b11; we have to choose a
bigger grid cell c10, i.e., a grid cell of length 10. As discussed in section 2.4.3, the
difference in area between grid cells of two consecutive levels is very big, so c10 is
many times bigger than either a11 or b11. If we choose c10 for indexing r, which is
even smaller than a11, c10 indexes r very inaccurately, which can lead to low query
performance. Let's assume that the area of r is 5% of the area of c10, and that we
query the index to find regions intersecting with a point stream whose points are
equally distributed over the space. As r has been indexed by c10, 95% of whose area
falsely represents r, the region r would be wrongly returned in the results 95% of the
time. Generalized over the complete indexed dataset, this example tells us that the
number of false results will be large. These results are then transferred to the client
for filtering out the extra ones, which is heavy on the network. This goes against our
'lesser unwanted data' objective.
There is a benefit to this approach as well. As the regions are indexed with bigger
grid cells, more and more regions are represented by a single cell, which means all of
these regions are stored close to each other. This satisfies our co-location objective
with fewer chances of edge cases. Also, only one hash per region is stored in the
index, which keeps the index small: the data can ideally be accessed using a single
disk seek. The approach is thus light on disk but heavy on the network because of the
large number of results transferred to the client.
Although this approach is not optimal, we implemented it because it is the only
approach we can use to build a primary index. In HBase, a table has only one
primary index, which is the key, and there can only be one key per row, so we can
only use a single hash value as the key of a row. SLSH gives us exactly one hash per
region, which we can use in a rowkey. For a single region column, we can have one
primary and many secondary indexes. For secondary indexes, we propose two
approaches based on multiple hashes in the following.
Table 5.1 shows, for a particular table, the number of points grouped by the
hash-length of their bounding box. To index these hashes, we have the two
approaches discussed in the following.
5.4.3.2 Multiple Hashes per Region
To prevent the retrieval of a large set of irrelevant results, we can use multiple hashes
to index a single region. We present two approaches for doing so: one uses multiple
hashes of the same level, the other uses multiple hashes of different levels. We call
these approaches SLMH & MLMH respectively, and present them below.
1. Single-Level Multi-Hash (SLMH): We can index a region more accurately
if we use multiple grid cells of a granular level. The more granular the level, the
more accurate the index, satisfying the 'lesser unwanted data' objective. One
approach is to use the most granular level, i.e., level 11, for all regions, but then
we might need to insert hundreds of hashes into the index for a big region. This
increases the size of the index and takes more time in disk access, but greatly
enhances query performance by reducing the number of unwanted tuples
transmitted to the client. If too coarse a grid level is chosen, the index becomes
more generic but smaller, which means faster disk access. The choice of a
wrong level can thus lead to poor performance, and it is hard for a common
database user to understand which grid level is the right one, because it all
depends on the data. To address this, we present a simple algorithm which
chooses the level of a region automatically. The algorithm takes an input
variable MAX_HASHES_PER_REGION, which tells it the maximum number
of hashes it can use to index a single object; this indirectly bounds the size of
the index. If we have one million rows to be indexed and this parameter is set
to 100, we can be sure that the index won't grow beyond 100 million rows.
Algorithm 1 shows the algorithm. It takes as parameters the region for which
hashes are to be found and the max-hashes parameter. We start with the grid
cell which completely covers the region and then drill deeper into more granular
levels; the deeper we go, the more hashes are required to cover the region. If,
by going deeper, the total number of hashes returned becomes greater than the
limit MAX_HASHES_PER_REGION, we return the previous list of hashes,
which contains fewer hashes than the maximum allowed. Using this algorithm,
regions of different sizes are represented by hashes of a level proportionate to
their size, ensuring that the level is neither too fine for the region nor too
coarse.
Data: region, MAX_HASHES_PER_REGION
Result: list of hashes
hashLength = findSingleCoveringHash(region);
listOfHashes = findHashes(region, hashLength);
while hashLength < MAX_HASH_LENGTH do
    newHashes = findHashes(region, ++hashLength);
    if newHashes.size() > MAX_HASHES_PER_REGION then
        return listOfHashes;
    else
        listOfHashes = newHashes;
    end
end
return listOfHashes;
Algorithm 1: Algorithm to determine the hashes for a region
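Algorithm 1 can be sketched as runnable code. The sketch below is illustrative only: GeoHash cells are replaced by a binary subdivision of the unit square into 2^L x 2^L cells per level, and all names are our own stand-ins for the thesis implementation.

```python
def cells_covering(rect, level):
    """All cells of the given level (unit square split into 2^level x 2^level
    cells) that intersect the query rectangle (x1, y1, x2, y2)."""
    n = 2 ** level
    x1, y1, x2, y2 = rect
    return [(level, i, j)
            for i in range(int(x1 * n), min(int(x2 * n), n - 1) + 1)
            for j in range(int(y1 * n), min(int(y2 * n), n - 1) + 1)]

def slmh_hashes(region, max_hashes_per_region, max_level=12):
    """Drill down level by level (Algorithm 1): return the finest covering
    whose cell count still respects MAX_HASHES_PER_REGION."""
    chosen = cells_covering(region, 0)  # level 0: a single cell covers all
    for level in range(1, max_level + 1):
        candidate = cells_covering(region, level)
        if len(candidate) > max_hashes_per_region:
            return chosen  # previous level was the finest admissible one
        chosen = candidate
    return chosen
```

A large region thus settles on a coarse level with few cells, while a tiny region is pushed down to the most granular level, mirroring the "level proportionate to size" behaviour described above.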
This is an improvement over SLSH, as it satisfies the 'lesser unwanted data' as well
as the 'lesser scans' objective because the levels are of appropriate granularity for
each region. The limitation of this approach is that it should
only be used for building secondary indexes in HBase. Secondary indexes are
maintained in a separate table from the main table; they only contain the indexed
hash bundled with the primary key of the main table. If a single row is indexed
using, for example, 50 hashes, each hash also carries the rowid, which helps in
joining the results with the main table. As the rowid is generally small compared
to the whole row, this duplication does not produce much overhead. If SLMH
were used as a primary index, each row would have to be stored as many times
as it has hashes, which wastes a lot of space and greatly reduces query
performance, because a range scan then takes longer due to the duplicated data.
As part of this thesis, we implemented this approach for maintaining secondary
indexes.
2. Multi-Level Multi-Hash (MLMH): This approach uses multiple hashes
belonging to different grid levels for indexing. The use of multiple hashes again
suggests that it should be used as a secondary index. This approach represents
a region more accurately, and the index size is smaller as well. When a region
is to be indexed, a coverage algorithm is used to find the hashes/grid cells best
covering the region. If a region is big, then unlike the SLMH approach, which
might require hundreds of hashes to cover it, this approach uses a few hashes
of bigger size to cover most of the region and, for accuracy, uses granular
hashes to cover the small areas of the region not covered by the bigger hashes.
Imagine a big region which could be represented by a single grid cell of length
10 but has a very small part extending into a neighboring grid cell. An SLMH
index would dig deeper and represent it more accurately using grid cells of
level 11, increasing the accuracy but also significantly increasing the number of
hashes required. An MLMH index will use one grid cell of level 10 and a few of
the more granular level-11 cells to index the region. This approach is thus both
more accurate and faster.
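The multi-level idea can be sketched as a classic quadtree cover (again over simplified unit-square cells rather than real GeoHash cells; the function names are our own): a cell fully inside the region is kept at its coarse level, a partially overlapping cell is split, and at the maximum level any intersecting cell is kept.

```python
def intersects(r, c):
    """Open-interval overlap test between rectangles (x1, y1, x2, y2)."""
    return r[0] < c[2] and c[0] < r[2] and r[1] < c[3] and c[1] < r[3]

def contains(r, c):
    """True if rectangle r fully contains cell c."""
    return r[0] <= c[0] and r[1] <= c[1] and c[2] <= r[2] and c[3] <= r[3]

def mlmh_cover(region, level=0, x=0.0, y=0.0, max_level=6):
    """Multi-level cover of `region` over the unit square. Cells are returned
    as (level, x, y): fully contained cells stay coarse, partially overlapping
    cells are split into four children, and at max_level any intersecting
    cell is kept."""
    size = 0.5 ** level
    cell = (x, y, x + size, y + size)
    if not intersects(region, cell):
        return []
    if contains(region, cell) or level == max_level:
        return [(level, x, y)]
    half = size / 2
    cover = []
    for dx in (0.0, half):
        for dy in (0.0, half):
            cover += mlmh_cover(region, level + 1, x + dx, y + dy, max_level)
    return cover
```

For a region that fills one coarse cell plus a small sliver next to it, the cover mixes the single coarse cell with a handful of fine cells, instead of refining the whole region as a single-level cover would.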
Hash-Length Count
11 150000
10 0
9 600
8 100
7 2000
6 0
5 0
4 0
3 0
2 0
1 0
Table 5.1: Number of points in each grid level
5.4.4 Physical Approaches for Building the Index
The above types of indexes can be implemented either using a single index or a collection
of indexes. These approaches are discussed below.
5.4.4.1 Single-Index Approach
In this approach, all hashes are stored in a single index, which makes the indexing
process easy to handle. Consider an index with the statistics shown in table 5.1. As
we have hash values of lengths 7, 8, 9 and 11, we have to send one get/scan request
per hash-length, i.e., four requests, and combine the results at the client. We used
this approach for all our experiments because it is easier, cleaner and more intuitive
for the query optimizer to optimize queries on. Section 5.4.6 explains in detail, with
examples, why we need multiple get/scan requests.
5.4.4.2 Multi-Index Approach
Another approach is to build four different indexes, one for each of the hash-lengths
7, 8, 9 and 11. To process a spatial query, we formulate four different queries, one per
index, and combine the results at the end. The difference from the single-index
approach is that in the single-index approach multiple scan requests are sent to one
index, whereas in the multi-index approach the same number of requests is sent to
different indexes. In the scenario under discussion, both approaches require the
execution of four scan requests; which approach is faster is a question that needs to
be looked into further.
5.4.5 Optimization
As the number of lines covered by hash-lengths 9 and 10 is very small, they can be
merged with the hashes of level 8. We can thus build only two indexes, with
hash-lengths 8 and 11; all hashes of lengths 9 and 10 are trimmed to length 8. Let us
see what happens when a line with hash-length 8 and hash value 'dr65h8p6' is the
input to a query finding intersections. Here we assume, for simplification purposes,
that the line has been translated by the query optimizer into a single hash but the
operator has not been rewritten yet.
As we have two indexes, one of length 8 and the other of length 11, the input
bounding box is queried against both as follows:
WHERE Index8.Line = 'dr65h8p6';
and:
WHERE Index11.Line LIKE 'dr65h8p6%';
The results are then merged at the client.
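The merging decision can be sketched as a small heuristic (illustrative only; it differs in detail from the hand-chosen merge above, and all names are our own): lengths whose entry count falls below a threshold are trimmed into the closest shorter length that is kept.

```python
def merge_sparse_levels(stats, min_count):
    """Map each hash length present in the stats to the index it should live
    in: a length with fewer than `min_count` entries is merged into the
    closest shorter length already kept. Returns {length: target_length}."""
    mapping, kept = {}, None
    for length in sorted(l for l, c in stats.items() if c > 0):
        if stats[length] >= min_count or kept is None:
            kept = length  # this length gets its own index
        mapping[length] = kept
    return mapping

def trim(hash_value, mapping):
    """Trim a hash to the length of the index it was merged into."""
    return hash_value[:mapping[len(hash_value)]]
```

With statistics like those of table 5.1, the sparse middle lengths collapse into the shortest well-populated one, leaving only a couple of indexes to query.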
5.4.6 Index Implementation
We implemented an SLSH index for primary indexing and an SLMH index for
secondary indexing using the single-index approach. We use our coverage algorithm
to find the hashes that effectively cover a region. These hashes are stored in a
separate HBase table as keys, and the key of the original data-holding table is stored
as the value. For every spatial query, the index is queried first to get potential
matches, and using those matches, the table containing the actual data is queried.
Building an index for a region is an easy task, but querying it requires a few tweaks.
To understand how such an index can be queried, let's take an example query:
SELECT M.name
FROM Movement M
WHERE intersects(M.Route, 'dr65h8p6d542');
Here we want to find all the people who crossed a specific point. Route is a column
containing lines that are indexed using a single hash. Although the index used in this
example is SLSH, the querying process we are about to explain holds for SLMH as
well. The lengths of the hashes vary from 7 to 11, as shown in table 5.1. Let's say we
have the 8 hash entries shown in table 5.2. As we can see, the query should return
the rows belonging to Faisal, Alex, David and Bob. As HBase only supports get and
scan operations over the key, this SQL is translated by Phoenix into HBase get and
scan operations. Let's discuss how we can retrieve the correct results using a partial
scan or, in simple words, using a LIKE or = operator. The reason we discuss this
example in terms of the LIKE and = operators will become clear in later sections,
where we explain how we translate Guting's operators into LIKE or = operators for
better performance.
Let's see what happens when we use a hash of length 10 to query, e.g.:
WHERE M.Route = 'dr65h8p6d5';
This returns only Faisal, but we also want Alex, David and Bob to be returned.
Let's try trimming the query point to 7 characters so that we can retrieve the other
relevant results as well:
WHERE M.Route LIKE 'dr65h8p%';
This query returns the whole table because it is too generic. From this we
understand that we have to issue five queries, as there are five different lengths of
hashes stored in the table. Our modified query becomes:
WHERE M.Route = 'dr65h8p'
OR M.Route = 'dr65h8p6'
OR M.Route = 'dr65h8p6d'
OR M.Route = 'dr65h8p6d5'
OR M.Route = 'dr65h8p6d54';
This query retrieves only the correct results. For this process we need to know which
grid levels the index contains; if we don't, we have to issue a query with 11 conditions
in the WHERE clause, one for each hash-length, which is highly suboptimal because
the chance of an index containing all levels is small. To make the querying process
efficient, we use data statistics, which we maintain in our stats-store explained in
section 5.5.5.
Hash Name Hash-Length
dr65h8p6d5 Faisal 10
dr65h8p6d Alex 9
dr65h8p6 David 8
dr65h8p Bob 7
dr65h8p6d1 Terry 11
dr65h8p6d2 Olena 11
dr65h8p6d3 Charlotte 11
dr65h8p6d4 Ivan 11
Table 5.2: Index for Movement Table along with hash-length
The above example explains how to query the index when a point is the input. The
method also works if a region is the input, but the region should be represented by a
hash that is greater than or equal in length to the most granular hash in the index.
Let's take the same query, but this time we want to find all the people who ever
crossed my field. Assume I have a big field represented by a hash of length 8. The
query to find all those people is:
SELECT M.name
FROM Movement M
WHERE intersects(M.Route, 'dr65h8p6');
In the previous example we had a point as input, with a hash of length 12, which we
trimmed to make it suitable for matching hashes of smaller length. In this example,
however, the input hash is smaller than some hashes in the index. In this case, we use
the LIKE operator for all hash lengths greater than or equal to 8. The modified
WHERE clause looks like this:
WHERE M.Route = 'dr65h8p'
OR M.Route LIKE 'dr65h8p6%'
For matching the input with smaller hashes in the index, we trim it and use the =
operator, whereas for hashes that are greater than or equal in length to the input
hash, we use the LIKE operator. The more conditions in the WHERE clause, the
more get/scan requests. In our experiments we had three different lengths of hashes
in our SLMH index, and we saw that sending 3 get/scan requests did not incur much
overhead.
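These trimming rules can be collected into one routine (a sketch; the function name and the list of index lengths, which in our framework would come from the stats-store, are illustrative):

```python
def build_route_predicates(query_hash, index_lengths):
    """One predicate per hash length present in the index: lengths shorter
    than the query hash use '=' on the trimmed hash; for all longer-or-equal
    lengths a single LIKE prefix match suffices."""
    predicates = []
    for length in sorted(set(index_lengths)):
        if length < len(query_hash):
            predicates.append("M.Route = '%s'" % query_hash[:length])
        else:
            predicates.append("M.Route LIKE '%s%%'" % query_hash)
            break  # the LIKE already covers every longer length
    return " OR ".join(predicates)
```

For the length-8 field hash and an index with lengths 7 through 11, this reproduces the two-condition WHERE clause above; for a length-12 point hash it reproduces the all-equality form.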
5.4.6.1 Schema Design for GET Requests
As can be seen, we use get and scan interchangeably when we talk about requests;
which kind of request is used depends entirely on the schema design. For example, if
the key of the index contains only the hash, the ids of all regions having that hash in
their coverage are stored against the same key. This does not mean that the ids are
appended to the same key; HBase stores this as follows. Assume our index has two
columns, hash and id, where id is the identifier of the region. We index a region with
hash=dr65h8p and id=1, which is stored in HBase as a row. Now we index another
region with hash=dr65h8p and id=2. If we insert this into the index, HBase does not
overwrite the previous row; instead it versions it: it attaches a timestamp to the new
row and stores it right next to the previous one. All other rows with the same hash
are stored in the same way. If we now query the index for hash=dr65h8p, we get a
row with id=2, because by default HBase returns the most recent version it stored,
which in our case was the one with id=2. To use this kind of schema design, we
increase the number of versions HBase stores per row to an appropriate number and
tell it to return all versions of a row whenever a get request is sent. We use this
design approach because it makes our query translation and optimization more
intuitive: it allows us to use the = operator to retrieve all the regions belonging to a
particular hash. It would be interesting to see which schema design approach
performs better. With this schema design, the WHERE clause of our SQL query
looks like this:
WHERE hash = 'dr65h8p'
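The versioning behaviour described above can be mimicked with a small in-memory model (a toy stand-in for HBase, purely to make the semantics concrete; it is not the HBase API):

```python
import itertools

class VersionedIndex:
    """Toy model of the GET-oriented schema: the row key is the hash alone
    and every region id is stored as one version of that row. As in HBase,
    get() returns only the newest version unless more are requested."""
    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.rows = {}                 # hash -> [(seq, region_id)], newest first
        self.seq = itertools.count()   # stands in for HBase's timestamps

    def put(self, hash_value, region_id):
        versions = self.rows.setdefault(hash_value, [])
        versions.insert(0, (next(self.seq), region_id))
        del versions[self.max_versions:]  # old versions beyond the limit are pruned

    def get(self, hash_value, versions=1):
        return [rid for _, rid in self.rows.get(hash_value, [])[:versions]]

index = VersionedIndex(max_versions=100)
index.put("dr65h8p", 1)
index.put("dr65h8p", 2)
```

With the default single-version get, only id=2 comes back; this is exactly why the schema must be configured with enough versions and queried with all of them.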
5.4.6.2 Schema Design for SCAN Requests
This schema design approach uses a composite key in HBase. Consider again an
index with two columns, hash and id, where id is the identifier of the region. Instead
of having two columns, we merge them into a composite key column of the format
hash-id. Say we have two regions, both with hash=dr65h8p, one with id=1 and the
other with id=2. As composite keys, they become dr65h8p-1 and dr65h8p-2
respectively. If we want to retrieve all the regions belonging to the hash dr65h8p, we
have to issue a SCAN request in HBase, because now there are two rows for the same
hash. Using Phoenix (a SQL layer on top of HBase), we use the LIKE operator
instead of the = operator, and our WHERE clause looks like:
WHERE hash LIKE 'dr65h8p%'
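How such a LIKE turns into a single range scan over the sorted composite keys can be seen in a few lines (again a toy model of HBase's sorted key space, not the real client API):

```python
import bisect

def prefix_scan(sorted_keys, prefix):
    """Simulate an HBase SCAN: seek to the first key >= prefix, then read
    sequentially until a key no longer starts with the prefix."""
    matches = []
    for key in sorted_keys[bisect.bisect_left(sorted_keys, prefix):]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches

keys = sorted(["dr65h8n-3", "dr65h8p-1", "dr65h8p-2", "dr65h8q-7"])
```

Because the keys are stored lexicographically sorted, all rows of one hash are contiguous, so the scan touches only the matching range rather than the whole table.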
5.5 The Querying Framework
In the previous section we presented the design of our spatial index. To let the user
query trajectory data transparently, a querying framework is required that hides the
complexities of our approach and optimizes the query. Figure 5.1 shows the block
diagram of our implementation. The following sections describe, one by one, the
components which as a whole form our querying framework.
5.5.1 Guting’s Algebra
This module contains our implementation of Guting’s algebra (data-types and opera-
tors) within the framework of Apache Phoenix. The operators are designed in a way
that they can take geometric information in the lat-long or SFC hash format. All inter-
nal operations are performed in lat-long format as the JTS library works on WGS-84
geometry. Therefore the types in SFC hash format are converted to lat-long format
before applying the operation. The operators themselves have no knowledge of the SFC encoding; the conversion is performed through the SFC plugins supplied as a package with the algebra.
5.5.2 SFC Plugins
The algebra has no knowledge about matters related to SFCs. During the course of
the thesis, we used GeoHash for all purposes but any other kind of SFC can also
be integrated by implementation of a simple interface. The algebra or the query
translator, only require encode and decode methods but for rest of the modules re-
quire some advanced methods like findNorthNeighbour(), findAllNeighbours() and
findChildCells() etc. to be implemented.
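Such a plugin contract might look as follows (a Python sketch; the actual implementation is a Java interface, and the `precision` parameter is our assumption — only the method names quoted above come from the text):

```python
from abc import ABC, abstractmethod

class SFCPlugin(ABC):
    """Contract an SFC implementation must fulfil. encode/decode suffice
    for the algebra and the query translator; the other modules also need
    neighbourhood and refinement methods."""

    @abstractmethod
    def encode(self, lat: float, lon: float, precision: int) -> str: ...

    @abstractmethod
    def decode(self, sfc_hash: str) -> tuple: ...  # (lat, lon) of cell centre

    @abstractmethod
    def findNorthNeighbour(self, sfc_hash: str) -> str: ...

    @abstractmethod
    def findAllNeighbours(self, sfc_hash: str) -> list: ...

    @abstractmethod
    def findChildCells(self, sfc_hash: str) -> list: ...
```

A GeoHash plugin would implement this interface; plugging in a different SFC means providing another implementation without touching the algebra.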
5. INDEXING STRATEGY & QUERYING FRAMEWORK
Constant Spatial Entity   Basic Data Type    Format                                n
Point                     float[1+2n]        {TypeID,lat,long}                     1
Bounding Box              float[1+2n]        {TypeID,lat1,long1,...,lat4,long4}    4
Line                      float[5+2(n-2)]    {TypeID,lat1,long1,...,latn,longn}    ≥ 2
Region                    float[7+2(n-3)]    {TypeID,lat1,long1,...,latn,longn}    ≥ 3

Table 5.3: Constant Spatial Entities
We distribute SFC plug-ins and Guting’s algebra in the same package so that the
operators can use the encoding and decoding functions of the respective SFC imple-
mentation.
5.5.3 Query Translator
To make spatio-temporal querying transparent, a naive query translator has been implemented. The translator adds support for a space-filling-curve based index; a variety of space filling curves can be plugged in by implementing a simple interface. As part of this thesis, GeoHash indexing support was built, so let’s take GeoHash as an example to explain the functioning of this translator. It receives a SQL query as input and parses it for the existence of spatial entities. For the purpose of translation, a spatial entity can either be a constant entity or a meta entity. The types of constant spatial entities are described in Table 5.3, where n = number of points in the spatial entity.
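The float-array layouts of Table 5.3 can be sketched as follows (a Python illustration; the concrete TypeID values below are assumptions, not the ones used in the thesis):

```python
# Serialize constant spatial entities as flat float arrays, as in Table 5.3:
# the first element is a TypeID, followed by 2n coordinates.
POINT_ID, BBOX_ID, LINE_ID, REGION_ID = 1.0, 2.0, 3.0, 4.0  # assumed IDs

def encode_point(lat, lon):
    return [POINT_ID, lat, lon]                 # 3 floats

def encode_bbox(corners):
    """Exactly 4 (lat, lon) corners -> 9 floats, cf. the double[9] input
    mentioned later for bounding boxes."""
    assert len(corners) == 4
    return [BBOX_ID] + [c for pt in corners for c in pt]

def encode_region(points):
    """At least 3 (lat, lon) points -> 1 + 2n floats."""
    assert len(points) >= 3
    return [REGION_ID] + [c for pt in points for c in pt]

print(len(encode_bbox([(0, 0), (0, 1), (1, 1), (1, 0)])))  # 9
```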
As a second step, meta spatial entities are detected. A meta spatial entity is a
table or a column in the table storing a spatial attribute. For the translator, only
those spatial columns are important which store the spatial attribute using space filling
curves. This information is stored in the meta-data store. All spatial constant entities
are translated to their respective space filling curve representation by the translator.
Let’s take spatial query 4 from the BerlinMOD Benchmark [16], presented here as Query-1.
Query-1
SELECT PP.Pos AS Pos, C.Licence AS Licence
FROM dataScar C, QueryPoints PP
WHERE C.Trip passes PP.Pos;
Figure 5.1: Implementation Block Diagram
This query finds out which licence plate numbers belong to vehicles that have passed the points from QueryPoints. As Phoenix only supports equi-joins for now, this query needs to be divided into two queries, which we do manually at the client; this process is not part of the Query Translator. We use this example only to show how a query with joins based on Guting’s operators can be handled, although this is not part of the scope of this thesis. The first query looks like this:
Query-2
SELECT PP.Pos AS Pos
FROM QueryPoints PP;
This gets all the points in the QueryPoints table. As this table is very small, i.e. 100 points, these points are added to the WHERE clause of the second query as constants.
Query-3
SELECT C.Licence AS Licence
FROM dataScar C, QueryPoints PP
WHERE passes(C.Trip , [1,52.40092d,13.52795d])
OR passes(C.Trip , [1,52.46290d,13.55138d])
OR ...;
The query translator queries the meta-store to determine whether the C.Trip attribute is space-filling-curve enabled and, if so, of what type. It then translates all the points to
the respective representation using the corresponding space filling curve plugin. In our
case, it will use the GeoHash plugin to translate the points and the query would look
something like this:
Query-4
SELECT C.Licence AS Licence
FROM dataScar C, QueryPoints PP
WHERE passes(C.Trip , ’u33d5g6c2r9f’)
OR passes(C.Trip ,’u33dkqefd1fw’)
OR ...;
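The translation from a lat-long pair to such a 12-character base-32 hash can be sketched in a few lines (a standard textbook GeoHash encoder written here for illustration; the thesis uses its own GeoHash plugin):

```python
def geohash_encode(lat, lon, precision=12):
    """Encode a lat-long pair as a base-32 GeoHash of `precision` characters
    by repeatedly bisecting the longitude and latitude ranges."""
    base32 = "0123456789bcdefghjkmnpqrstuvwxyz"
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    bits, even = [], True                 # bits alternate: lon, lat, lon, ...
    while len(bits) < precision * 5:      # 5 bits per base-32 character
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits.append(1)
            rng[0] = mid
        else:
            bits.append(0)
            rng[1] = mid
        even = not even
    return "".join(base32[int("".join(map(str, bits[i:i + 5])), 2)]
                   for i in range(0, precision * 5, 5))

print(geohash_encode(57.64911, 10.40744, 11))  # u4pruydqqvj (well-known example)
```

Note the prefix property that the indexing strategy relies on: truncating a hash yields the hash of the enclosing coarser cell.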
The points in our implementation of GeoHash are base-32 encoded and are represented by 12 characters. Although support for GeoHash has been added to the implemented Guting’s Algebra operators, translating in the query translator is far better than letting the operators perform the transformation themselves. Imagine a relation with 2 billion rows and the following query:
Query-5
SELECT M.Loc
FROM MOVEMENT M
WHERE intersects(M.Loc, [1,52.40092d,13.52795d]);
If M.Loc is a Space Filling Curve (SFC) enabled column, the operator intersects will be called by Phoenix/HBase for each row, which means that the lat-long point would be translated to the corresponding SFC hash 2 billion times. If the query translator is used, it is translated only once, which gives a huge performance boost. The translated query which HBase has to execute is therefore:
Query-6
SELECT M.Loc
FROM MOVEMENT M
WHERE intersects(M.Loc, ’u33d5g6c2r9f’);
The same procedure is applied to the other data types mentioned in Table 5.3. Let’s consider an example where all locations which intersect with a bounding box are to be retrieved. In that case the intersects function is passed a double[9] array representing a bounding box. The query translator converts this double array into an array of type varchar[5], where the first element of the array is the type identifier of the SFC based bounding box. The above query will then look like the following:
Query-7
SELECT M.Loc
FROM MOVEMENT M
WHERE intersects(M.Loc, [’51’, ’u33d5g6c2r9f’, ’u33d5g6c2r9g’,
’u33d5g6c2r9h’, ’u33d5g6c2r9i’]);
The second performance benefit of using this translator is that it helps in converting a scan query into a point query, which is done by the Query Optimizer. After translating the query into an SFC based query, the Query Translator passes it on to the Query Optimizer which, as the name suggests, optimizes the query. The Query Optimizer is explained in the following.
5.5.4 Query Optimizer
The query optimizer takes an SFC based translated query as input and transforms a scan query into a point query using information from the meta-store. Before that, the query has to fulfill some criteria: if both parameters of the operator are of type point, the optimizer converts the Guting operator into a basic equality operator. The Query Optimizer gets this information from the type-id of the input constant spatial entity, and from the meta-store for meta spatial entities. Let’s take Query-6 above as an example. The signature of intersects involves two SFC based point parameters, so the query can be transformed into the following:
Query-8
SELECT M.Loc
FROM MOVEMENT M
WHERE M.Loc = ’u33d5g6c2r9f’;
This increases query performance manifold, as the scan query has now turned into a point query: HBase can optimally locate the rows and filter them based on equality.
This approach is also applied to operators involving parameters of types other than point. To understand how the query optimizer optimizes a query involving parameters that represent an area or a line, let’s take Query-7 as an example. When this query is passed to the optimizer, it identifies the spatial operators and the datatypes of their parameters; in the current example, it will find out that intersects is a spatial operator and that the parameters provided to it are a point and a bounding box. This query, if sent to HBase, would result in a full table scan because Phoenix does not understand the semantics of the operator. The optimizer converts this full-table-scan query into a range query by replacing the intersects operator with a LIKE operator. This is done by taking the longest common prefix of all 4 SFC points and appending a literal ’%’. Here is how the query looks after this transformation:
Query-9
SELECT M.Loc
FROM MOVEMENT M
WHERE M.Loc LIKE ’u33d5g6c2r9%’;
This greatly enhances the performance of the query and prevents a full table scan. The only problem with this query is that it returns more results than required, because some information was lost while converting the bounding box to a single SFC hash: in this example, 12-character hashes were converted to a single 11-character hash. The results are retrieved very fast but need to be filtered at the client, which is done by the implemented Client Filter.
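The prefix computation behind this rewrite can be sketched as follows (a Python illustration using os.path.commonprefix; the function name `bbox_to_like_pattern` is ours):

```python
from os.path import commonprefix

def bbox_to_like_pattern(corner_hashes):
    """Replace intersects(col, bbox) by  col LIKE '<common prefix>%' :
    take the longest common prefix of the corner hashes and append '%'."""
    return commonprefix(corner_hashes) + "%"

corners = ["u33d5g6c2r9f", "u33d5g6c2r9g", "u33d5g6c2r9h", "u33d5g6c2r9i"]
print(bbox_to_like_pattern(corners))  # u33d5g6c2r9%
```

Note that corner hashes sharing only a short prefix make the pattern very unselective; that is exactly the grid-edge case discussed below.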
This does not always lead to an optimization; in the edge cases discussed previously in the GeoHash explanation, the area to be queried lies at the edge of SFC grids. Let’s take the example shown in Figure 5.2: the small box in the GeoHash grid represents the bounding box we want to use in our example query. After translation the query would look something like this:
Query-10
SELECT M.Loc
FROM MOVEMENT M
WHERE intersects(M.Loc, [’51’, ’dr72h8p6c2r9f’,
’dr72hb0u33d5g’, ’dr5ruzb5cr7m6’, ’dr5ruxz2gty6n’]);
If the longest common prefix is taken for all 4 points, we get only dr, a 2-character geohash which represents a very large area. If the query optimizer sends a query with a WHERE clause such as WHERE M.Loc LIKE ’dr%’, it will probably bring the whole database to the client and fail. For such cases we need a mechanism that makes HBase do fewer disk accesses and shifts most of the filtering to the server, while the client filter is used only for fine trimming of the result.
The optimizer solves this problem with a user parameter which specifies the maximum number of hashes, i.e. the maximum number of WHERE conditions, it may generate. If this parameter is set to 2, Query-10 above becomes:
Query-11
SELECT M.Loc
FROM MOVEMENT M
WHERE M.Loc LIKE ’dr72h%’ OR M.Loc LIKE ’dr5ru%’;
In this case, a single intersects operator is transformed into two LIKE operators, each handling a 5-character hash. This greatly reduces the number of results returned to the client as well as the number of disk hits. This was a simple case where each two of the four hashes had 5 characters in common, which does not happen often. Although this significantly
reduces the size of the result returned to the client, the grids being searched are still very large. If the maximum hash split parameter allows for more splits, more granular grids can be chosen. If the parameter is increased too much, the number of results will be significantly reduced, but the query can take more time to execute because each LIKE operator is turned into an HBase scan. If the data is sparse, many scans can return an empty set and waste resources, thus increasing the overall execution time. This parameter should therefore be chosen very carefully. The coverage calculation is optimized based on the statistics of the stored data: the coverage of the whole stored data is divided into grids, and for each grid the total number of points it contains is stored. The optimizer uses these statistics to find the grid hashes that better cover the desired area. The objective of this coverage algorithm is to find hashes which reduce the number of total returned results. The algorithm is described in a separate section.
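The split shown in Query-11 can be sketched as follows (simplified: prefixes are shortened uniformly and statistics are ignored; the function name `split_hashes` is ours):

```python
def split_hashes(hashes, max_hashes):
    """Find the longest prefix length at which the distinct prefixes of the
    input hashes number at most max_hashes; one LIKE condition per prefix."""
    for length in range(min(map(len, hashes)), 0, -1):
        prefixes = sorted({h[:length] for h in hashes})
        if len(prefixes) <= max_hashes:
            return prefixes
    return []

corners = ["dr72h8p6c2r9f", "dr72hb0u33d5g", "dr5ruzb5cr7m6", "dr5ruxz2gty6n"]
print(split_hashes(corners, 2))  # ['dr5ru', 'dr72h'] -> two LIKE conditions
```

With max_hashes = 2 the four corner hashes collapse to the two 5-character prefixes of Query-11; a larger budget keeps more, finer prefixes.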
The job of the optimizer becomes tricky when the column to be queried is of an area based data-type, e.g. a bounding box or a region. Let us consider the conditional clause of an input query to the optimizer:
Query-12
WHERE intersects(M.Loc, [’51’, ’dr72h8p6c2r9f’, ’dr72hb0u33d5g’,
’dr5ruzb5cr7m6’, ’dr5ruxz2gty6n’]);
Let’s assume that M.Loc is of type SFC based bounding box. The optimizer gets the information about its hash length from the stats-store and uses it to decide the length of the input hashes: if the column hash length is less than the smallest input hash, all input hashes are trimmed to the column hash length. The length parameter is passed to the coverage algorithm, which then determines hashes of only that length. Suppose the length of M.Loc is 5; the coverage algorithm will return hashes of length 5 only, and Query-12 above will be rewritten as:
Query-13
WHERE M.Loc = ’dr72h’ OR M.Loc = ’dr5ru’;
Figure 5.2: GeoHash Edge Case [32]

The above-mentioned approach is used if an index has a fixed length, which is the case for the multi-index approach mentioned in section 5.4.4.2. If a single-index approach is used, we might have hashes of different lengths in the same index; for this, we ask the coverage algorithm to provide hashes of those levels only. During the
course of our thesis, we used the BerlinMOD Benchmark[16] data and constructed
the SLSH approach for primary indexes and the SLMH approach for secondary indexes. The optimizer checks the meta-store to find out what kind of index the column holds, and the stats-store to find out how many hash lengths it contains, and generates queries automatically.
The query can either be on the index table or on the main table. In either case, if the table or the index contains all the columns requested in the query, the Query Optimizer can generate queries automatically. If the query is on the main table, the optimizer can read the meta information to find out whether the column has a secondary spatial index and query the secondary index if it contains the requested columns. But if the index does not contain all the requested columns, the query needs to be split into two queries, one for the index and the other for the main table. This scenario is currently not handled automatically by the Query Optimizer; it is merely an implementation challenge which requires more work-hours. Currently we do this process manually and merge the results of the two queries programmatically; we consider the automation future work.
As a last step, the Query Optimizer appends the original WHERE clause at the end of the generated query with an AND logical operator. The optimizer does this to push the operator to the Region Servers. Apache Phoenix evaluates the conditions in order; in our case, the conditions at the beginning of the WHERE clause are the ones comparing hashes and are evaluated first. As hashes are just approximations of the actually covered area, these conditions filter out most of the unwanted data quickly. After evaluation of all hash conditions, the actual operator is applied on the filtered results to obtain the true results. Pushing the operator to the Region Servers gives us two benefits. First, the operators are applied by many machines at the same time, so the intermediate results are filtered much faster than a single client doing all the filtering. Second, the unwanted results are not sent to the client, which improves network performance.
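This final rewrite step can be sketched as plain string assembly (a toy illustration; the helper name is ours, and the real optimizer works on the query rather than on raw strings):

```python
def append_original_predicate(hash_conditions, original_predicate):
    """Combine the hash approximations with the exact operator: the hash
    conditions come first and prune most rows, then the pushed-down
    operator computes the true result on the Region Servers."""
    approx = " OR ".join(hash_conditions)
    return f"WHERE ({approx}) AND {original_predicate}"

where = append_original_predicate(
    ["M.Loc LIKE 'dr72h%'", "M.Loc LIKE 'dr5ru%'"],
    "intersects(M.Loc, bbox)")
print(where)
```

The parentheses matter: the OR-ed hash conditions must be grouped before the AND so that the exact operator applies to every pre-filtered row.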
5.5.5 Stats-Store
The statistics-store contains various statistics about the data which help the Query Optimizer and the Coverage Algorithm optimize their operations. The whole data is divided into grids of different granularity levels; the levels to calculate statistics for are provided in a properties file. Currently, statistics can only be stored for the GeoHash SFC. The granularity levels vary from 1 to 12, where level 1 represents the biggest area a base-32 encoded GeoHash can represent. The provided dataset may not cover such a big area and can belong to a country or a city, which requires much more granular GeoHashes. To cater for this, the user has to provide the bounds of the dataset, which are stored as part of the statistics. Based on this information, the parameter MIN_HASH_LEVEL is determined. For example, if this parameter is calculated to be 6, the user can tell the statistics module to calculate and store the statistics for levels 6 and above. As this thesis only deals with SELECTs rather than INSERTs, the statistics module is not integrated with the system; instead the statistics are calculated offline in batch mode. Keeping in view the size of the datasets involved, a MapReduce job has been written for this purpose. The statistics-store is a simple XML file which can be accessed by the split/max-hash algorithm. It stores the following information:
1. Maximum Spatial Bounds: The spatial bounds of the data are calculated offline using a MapReduce job. This information is used to calculate the highest grid level, which is used by the query optimizer in the transformation of queries. The spatial information in the BerlinMOD dataset is in a grid format rather than lat-long; we use this information to transform the spatial attributes to lat-long format.
2. Maximum Temporal Bounds: This information can be used for designing
an index for temporal periods. We have left this for future work.
3. Total count of points for each grid cell: The stats-store maintains a list of grid cells for the top 11 levels and the number of points each cell holds. This information is used by the coverage algorithm to decide which cell to descend into.
4. Total count of objects for each level: The stats-store contains the total count of objects for each level. This information is used by the index builder to minimize the number of different-length hashes in an index, which helps in reducing the number of HBase scan/get operations for processing a query.
5. Hash-length Frequency for Indexes: The stats-store contains the frequency of hashes of each length for all SFC indexes. This information is used by the Query Optimizer to formulate WHERE clause conditions only for those lengths that exist in the index.
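As an illustration, a stats-store along these lines could be stored and read like this (the XML element and attribute names below are our own guess for the sketch; the thesis does not specify the exact layout):

```python
import xml.etree.ElementTree as ET

# Hypothetical stats-store fragment: dataset bounds plus per-cell counts.
STATS_XML = """
<StatsStore>
  <SpatialBounds minLat="52.3" minLon="13.0" maxLat="52.7" maxLon="13.8"/>
  <GridCells level="6">
    <Cell hash="u33d5g" points="1842"/>
    <Cell hash="u33d5h" points="97"/>
  </GridCells>
</StatsStore>
"""

def load_cell_counts(xml_text):
    """Return {geohash: point count} for every grid cell in the store."""
    root = ET.fromstring(xml_text)
    return {c.get("hash"): int(c.get("points")) for c in root.iter("Cell")}

counts = load_cell_counts(STATS_XML)
print(counts["u33d5g"])  # 1842
```

Loading the file once into such a dictionary matches the description of the store being read into memory before queries run.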
5.5.6 Hash Coverage Algorithm
The hash-coverage algorithm takes as input a list of hashes, determines the coverage area, and returns the hashes which cover that area. It accepts a parameter containing the list of levels for which coverage hashes are to be generated. The algorithm uses the statistics of the data from the statistics store to find the best hashes in a greedy manner. The pseudocode is presented in Algorithm 2.
The algorithm starts with the highest level and finds the hashes representing it. If the number of hashes does not exceed the specified maximum, the hashes for the next level are found. These hashes are stored in a list where each element holds the hashes of a single level. A hash which is completely covered by the hashes of the more granular level represents the area accurately; we keep such hashes and remove all the hashes they cover from the granular level. We continue this process until we get the best coverage in the list of levels provided. If the total number of hashes exceeds the specified maximum, we trim the hashes, starting from the most granular level. Those hashes are removed first which can be best covered by adding a hash at the immediately higher level. If the lowest level becomes empty, we move to the next higher level and repeat the process.
Data: listOfLevels, region, maxHashes
Result: listOfHashes

sortAscending(listOfLevels);
foreach level in listOfLevels do
    listOfHashes[level] = findHashes(region, level);
    if listOfHashes[level-1] exists then
        foreach hash in listOfHashes[level-1] do
            if hash is completely covered by listOfHashes[level] then
                remove the hashes covered by hash from listOfHashes[level];
            else
                remove hash from listOfHashes[level-1];
            end
        end
    end
    if size(listOfHashes) > maxHashes then
        break;
    end
end
while size(listOfHashes) > maxHashes do
    level = LOWEST_LEVEL;
    commonPrefix = the (level-1)-character prefix shared by most of listOfHashes[level];
    add commonPrefix to listOfHashes[level-1];
    remove the hashes with commonPrefix from listOfHashes[level];
    if size(listOfHashes[level]) == 0 then
        level = level + 1;
    end
end

Algorithm 2: Coverage Algorithm
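The trimming phase of Algorithm 2 can be rendered in Python as follows (a simplified, statistics-free sketch: the "most common prefix" is found by counting, level numbers double as hash lengths, and the function name is ours):

```python
from collections import Counter

def trim_to_max(hashes_by_level, max_hashes):
    """While the total number of hashes exceeds max_hashes, replace the
    hashes at the most granular level that share the most frequent
    (level-1)-character prefix by that single coarser hash."""
    total = lambda: sum(len(v) for v in hashes_by_level.values())
    levels = sorted(hashes_by_level)
    i = len(levels) - 1                      # start at the lowest level
    while total() > max_hashes and i > 0:
        level = levels[i]
        cells = hashes_by_level[level]
        if not cells:
            i -= 1                           # level empty: move one level up
            continue
        prefix, cnt = Counter(h[:level - 1] for h in cells).most_common(1)[0]
        if cnt < 2:                          # merging one hash gains nothing
            i -= 1
            continue
        hashes_by_level.setdefault(level - 1, []).append(prefix)
        hashes_by_level[level] = [h for h in cells if not h.startswith(prefix)]
    return hashes_by_level

cover = {4: [], 5: ["dr72h", "dr72j", "dr5ru", "dr5rv"]}
print(trim_to_max(cover, 3))  # {4: ['dr72'], 5: ['dr5ru', 'dr5rv']}
```

In the example, two of the four 5-character hashes share the 4-character prefix dr72 and are merged into it, bringing the total down to the budget of 3.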
5.5.7 Client-side filter
When the queries are rewritten by the optimizer, there can be two scenarios: either the rewritten part only involves points, or it also involves area datatypes. If the rewritten part only involves points, the results are exact and do not need to be filtered at the client end. If the user wrote a query which covers more results than required, he needs to filter the results himself. Normally, the Query Optimizer pushes the actual operator to the Region Servers so that no filtering is required at the client, but there are scenarios where some data needs to be filtered. For example, a secondary index may contain many object IDs against a single hash, and a single object ID may be present against many hashes. When such a result is returned to the client, it needs to be filtered. The client filter receives the results in the form of a JDBC ResultSet and does the filtering. The user can also use the client filter to filter his own results. Currently we handle such situations manually; the design of a generic client-side filter is recommended as a future task.
5.5.8 Meta-Store
The meta-store contains information about the schema design useful for the Query Translator and Query Optimizer. In our implementation the meta-store is kept at the client end for easier experimentation and testing. It is strongly recommended to keep this information in HDFS so that other clients connecting to HBase can use it to optimize spatio-temporal queries. The meta-store is created in XML format for easy parsing and is loaded into memory before running queries. It contains meta information about the schema and indexes as well as the algebra, and has two parts which are discussed in the following.
5.5.8.1 Schema Meta-Data
Schema meta-data contains information about the schema that is useful for the query translator and query optimizer. The general schema information is maintained by Phoenix, which can also be queried for some of it, but to keep things simple and robust, we keep all the meta-data in an XML file at the client. The root node of the meta-store is <MetaStore>, which contains a list of <Table> nodes. It is important to include all tables which contain at least one spatial column; there can be many spatial columns in a table. As a column can store spatial information either in lat-long format or SFC format, this information is stored as an attribute of each spatial column node. If a column is spatial but is not SFC based, the Query Translator won’t translate input arguments to SFC hashes, and the Query Optimizer will skip the conditional clauses based on such columns.
Each spatial column can be indexed using SFC hashes, each of which is represented as a column of a table. For now, we only support SFC based indexing, which means that we have to store the SFC indexing attributes in the <column> node, which is a child of the <IndexColumns> node. There can be more than one index on a column, depending on the number of levels chosen for indexing. The higher the level,
the more results will be returned, which means that the work to be done by the client-side filter will be significantly greater. Each index has an attribute of type IndexType which tells what kind of index it contains. There are three possible options, which have been discussed before and are summarized in Table 5.4:

Index Type                 Abbreviation
Single-Level Single-Hash   SLSH
Single-Level Multi-Hash    SLMH
Multi-Level Multi-Hash     MLMH

Table 5.4: Index Types

The Query Optimizer can rewrite the
query so that, instead of querying the original table, the index is queried. The requirement in this case is that the secondary index contains all the columns requested in the query and that the names of the columns are the same. If this is not the case, the query needs to be split into two queries: one for the index and, based on its results, another for the main table. Currently query splitting is not supported by the Query Optimizer but is planned as future work.
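For illustration, a schema meta-store along these lines could be parsed as follows (the element and attribute names here are our own invention for the sketch; the actual layout is shown in Figure 5.3):

```python
import xml.etree.ElementTree as ET

# Hypothetical meta-store fragment: a table with one SFC-based column.
META_XML = """
<MetaStore>
  <Table name="MOVEMENT">
    <Column name="LOC" spatial="true" sfc="true" sfcType="GeoHash"/>
    <Column name="LICENCE" spatial="false" sfc="false"/>
  </Table>
</MetaStore>
"""

def is_sfc_column(xml_text, table, column):
    """True if the column stores its spatial attribute as SFC hashes,
    i.e. the Query Translator must translate constants used against it."""
    root = ET.fromstring(xml_text)
    for t in root.iter("Table"):
        if t.get("name") == table:
            for c in t.iter("Column"):
                if c.get("name") == column:
                    return c.get("sfc") == "true"
    return False

print(is_sfc_column(META_XML, "MOVEMENT", "LOC"))      # True
print(is_sfc_column(META_XML, "MOVEMENT", "LICENCE"))  # False
```

This is the lookup the translator performs before deciding whether to rewrite a predicate on a given column.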
Figure 5.3 shows the meta-data of two tables. The first table represents the movement of cars and has the columns ID, LICENCE, LOC and MPOINT. LOC denotes the current location of the car; it is stored as a key in the HBase table, which means that it is also indexed. LOC is a point and is represented by 12 characters; all queries involving the LOC attribute will be translated to SFC queries. Another spatial column in this table is MPOINT, which represents the motion of the car over time. As a table can have only one index, it is not possible to index this column in the same table; the <IndexColumn> node contains the information about its secondary indexes. In the current example, MPOINT has two secondary indexes, MOVE_INDEX_1 of length 10 and MOVE_INDEX_2 of length 8. The IndexType attribute tells that their type is SLSH. The SFCLength attribute denotes the length of the stored hashes; if this attribute is set to 0, the hashes are of variable length, as in the case of the ROUTE column of the ROAD table. Each secondary index also stores information about its sister columns present in the index. The secondary indexes of the MPOINT column are bundled with the ID and LICENCE attributes, which means that if the query is diverted to the index instead of the main table, these two columns can also be retrieved from the index. This optimizes the query in that a second query to fetch these columns from the main table does not need to be sent.

Figure 5.3: Meta-Store
The second table, ROADS, stores the roads of an area. It has two columns, ID and ROUTE. ROUTE is of type line, so it cannot be stored as an HBase key; instead, its bounding box is indexed as the key. The <IndexColumns> node shows two indexes for this column: one in the ROADS table itself, which serves as the key, and the other in the MOV_INDEX_12 table. Both indexes are of type MLMH. This information is mentioned to document the indexing type and is not needed by the querying framework; the querying process is the same for all types of indexes. The information that is helpful is the length of the index. If it is fixed and greater than 0, it represents a kind of multi-index approach where a single index can only hold hashes of the same length; this information is used by the Query Optimizer to generate queries best fitting this length. If the mentioned length is 0, the index contains hashes of different lengths; in this case, the Query Optimizer uses the information from the stats-store to know
what variety of hash-lengths are available and formulates the query accordingly.
5.5.8.2 Algebra Meta-Data
This store contains the meta-information about the implemented Guting’s algebra operators. The Query Translator uses this information to identify Guting’s operators in a query. It could also be used for type checking and query validation; no work has been done by us in this direction, and we consider it part of our future work.
5.6 Future Work
When the user types a query in Phoenix, it is parsed and type checked by Phoenix. This means that if the user has written a wrong query, e.g. comparing a Varchar with a Long, the parser will throw an error informing the user about the actual problem. In our implementation, the query is directly handed over to the Query Translator, which hands it over to the Query Optimizer after translation. The Query Optimizer transforms the query and sends it to Phoenix. By then the query is different from what the user wrote, and Phoenix can only perform checks on the transformed query; if an exception is thrown, it is hard to debug what went wrong. We suggest that a mechanism be developed by which Phoenix does the type checking first, before the query is handed over to our Query Translator.
The Query Optimizer is an important part of our implementation and improves the performance of queries many times over. It can convert whole-table scans into point queries and choose the best coverage for input geometries. In its current state, it cannot automatically handle secondary indexes which do not contain the other required columns, and some manual intervention is required. We propose to improve the optimizer so that it can automatically split such queries and join the results for the user.
Query performance can be further enhanced if we also index time periods in a manner similar to the spatial domain. We believe that a custom SFC based index can be devised for time periods as well, but this is left for future work.
The client-side filter can filter out results based on equality. In some scenarios, Guting’s operators are needed to filter the result set; currently these scenarios are handled manually. The design of a generic client-side filter is recommended.
We currently index the regions using the SLMH approach, which gave us good performance during our experiments. We consider the MLMH approach to be more optimal and expect it to perform better than SLMH for huge datasets. Our coverage algorithm can be modified slightly to find the best multi-level hashes for such an index; it would be interesting to compare its performance with SLMH.
We implemented Guting’s data-types using float arrays. Encoding and decoding the types is CPU heavy, although we try to skip it where possible (by only reading the meta-data of the data-type), and the implementation of the operators is complex and non-intuitive because, for performance reasons, most operations are performed directly on float arrays. Parsing the data-types and constructing objects is also an extra overhead, which we managed to reduce significantly thanks to our indexing approach: it lets us apply the actual operators to filtered data which is significantly smaller than the table. If the query cannot be transformed by the Query Optimizer for some reason, e.g. because no index is available, the operators will make the query perform very poorly. The Struct data-type in Phoenix is under implementation; if it is released in the next version, we propose to port the algebra to the Struct data-type, which will essentially involve rewriting most of the algebra.
6
Benchmark & Results
This chapter presents the results from our experiments conducted using BerlinMOD
Benchmark data. The experiments focus on a comparison between Parallel Secondo,
raw HBase/Phoenix and our implementation of Guting’s Algebra over HBase/Phoenix.
We first present our experimental setup and describe the BerlinMOD Benchmark
datasets. We explain our choice of queries and provide a query-wise analysis of ex-
perimental results.
6.1 Experimental Setup
We conducted our experiments on a cluster of four machines. Each machine was equipped with 2 Intel Xeon E5530 CPUs (4 cores, 8 hardware contexts) and 48GB RAM. The machines’ disk arrays read 500 MB/sec, according to hdparm; the cluster consequently has 32 cores and 64 hardware threads. We used HBase version 0.94.18, Hadoop version 1.2.1, Apache Phoenix version 3.0 and Parallel SECONDO version 3.3.0, all in their default configuration. We used OpenJDK version 7 for Parallel Secondo and Oracle Java version 7 for the rest of the platforms. For all index building purposes, the MAX_HASHES_PER_REGION parameter was set to 100. The points and regions used as input for our test queries were sampled from the QueryPoints and QueryRegions tables that are part of the BerlinMOD Benchmark.
ScaleFactor   Days   Vehicles   Trips     Size
0.005         2      141        1,797     64.5MB
0.05          6      447        15,045    561MB
0.2           13     894        62,510    2.2GB
1.0           28     2,000      292,940   11GB

Table 6.1: BerlinMOD Datasets
6.2 The Dataset
For our experiments we used the data of the BerlinMOD [16] benchmark. BerlinMOD has been designed at the University of Hagen under Dr. Ralf Guting, who originally proposed the algebra we implemented as part of this thesis. The benchmark data has been primarily designed to measure the performance of queries on moving object data, specifically mpoints. The data is sampled from the moving point data of a set of cars whose driving is simulated on the street network of Berlin. The simulation models the behavior of typical workers who commute between their homes and work places on all working days and make some trips to other places in their free time. The available sampled data is mapped to the street network. The dataset also comes with a generator which allows us to generate data ourselves based on various parameters; although the data is mapped to the street network, disturbed data can also be generated. The benchmark data is generated based on real-world spatial data imported from a tool called bbbike (http://bbbike.de), which contains real-world spatial data of the streets of Berlin. Currently, data at four scale factors is available, and we used all four in our experiments to check scalability. The datasets we used and their characteristics are given in Table 6.1.
6.3 Query Selection
We selected five different queries to test our implementation against other platforms.
These queries have been selected based on the indexed data-type and the provided
input. As our index implementation as well as our query optimization revolves around
handling various index and input combinations of the area and point data-types, we
tried to cover all possible scenarios and see how our implementation performs when
compared to the other platforms. Table 6.2 shows the selected queries by their index
and input types.

Query  Index   Input
1.     Point   Point
2.     Point   Region
3.     Region  Point
4.     Region  Region
5.     Region  Region

Table 6.2: Benchmark queries by index and input types
6.4 Results
In the following, we present the query-wise results of our experiments.
6.4.1 Query-1
SELECT C.Licence AS Licence
FROM dataScar C
WHERE equals(C.loc,Point);

This query retrieves the licenses of all cars which are currently present at a particular
point. This is a simple point comparison. As seen in Figure 6.1, raw Phoenix performs
very poorly and does not scale. The reason is that Phoenix/HBase does not support a
point data-type. We tried to use the latitude component of the point as a key, but
HBase still has to perform a scan to match the longitude component. The performance
is very bad for scale-1.0 because of the increase in the amount of data to scan. Parallel
SECONDO gives a constant processing time; the overhead comes from running a Hadoop
job to obtain the results. For our implementation, this is a simple HBase get/scan
request, because the point as a whole is indexed as part of the key, which allows HBase
to retrieve the results optimally.
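To make the key design concrete, the following sketch (a simplification: the encoder and the in-memory dictionary stand in for our row-key layout and the HBase table) shows why a point-equality predicate reduces to an exact key match once the full GeoHash is part of the key.

```python
# Minimal GeoHash encoder: interleave longitude/latitude bits, 5 bits per
# base32 character. Because the encoding is deterministic, equals(C.loc, Point)
# becomes an exact row-key lookup rather than a scan.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision=12):
    """Encode a (lat, lon) pair as a GeoHash string by bit interleaving."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    bits, even, ch, result = 0, True, 0, []
    while len(result) < precision:
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        ch = (ch << 1) | (1 if val >= mid else 0)
        if val >= mid:
            rng[0] = mid
        else:
            rng[1] = mid
        even = not even
        bits += 1
        if bits == 5:
            result.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(result)

# A dict stands in for the HBase table: row key -> licence.
table = {geohash(52.5200, 13.4050): "B-AB 123",
         geohash(52.4500, 13.3000): "B-CD 456"}

# The equality predicate is answered with a single get on the composite key.
print(table.get(geohash(52.5200, 13.4050)))  # -> B-AB 123
```

Because the hash is deterministic, the same coordinates always produce the same key, so HBase can answer the query with one get instead of scanning for a matching longitude.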
Figure 6.1: Query-1 Results
6.4.2 Query-2
SELECT C.Licence AS Licence
FROM dataScar C
WHERE inside(C.loc,Region);

This query is similar to Query-1 but takes a region as input. Raw Phoenix could not
return results for scale-0.2 and scale-1.0, because it has to retrieve all the points at
the client to perform the inside operation. In our implementation, Phoenix issues a
simple get/scan request on the key, which contains the GeoHash of the point.
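The corresponding scan can be sketched as follows; the keys, licence values and the prefix_scan helper are illustrative stand-ins for an HBase scan with start and stop row keys:

```python
# Because GeoHash keys sort lexicographically, every point whose hash starts
# with the region's covering prefix lies in one contiguous key range, so
# inside() becomes a single range scan instead of a full table read.
import bisect

rows = sorted([
    ("u336xp000000", "B-AA 1"),
    ("u33dc0cpt000", "B-BB 2"),
    ("u33dc6zzzzzz", "B-CC 3"),
    ("u33e11111111", "B-DD 4"),
])
keys = [k for k, _ in rows]

def prefix_scan(prefix):
    """Return all rows whose key starts with `prefix` (one HBase scan)."""
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_left(keys, prefix + "\x7f")  # just past the prefix range
    return rows[lo:hi]

# A region covered by the GeoHash cell "u33dc" needs exactly one scan:
print(prefix_scan("u33dc"))  # the two candidate cars inside the cell
```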
Figure 6.2: Query-2 Results

6.4.3 Query-3

SELECT C.Licence AS Licence
FROM dataScar C
WHERE passes(C.Trip,Point);

This query retrieves the licenses of all vehicles which have been at a particular point
at any time in their history. C.Trip is of type mpoint. Raw Phoenix fails to return
results for scale-0.2 and scale-1.0, because it has to transfer all the data to the client
for processing. It retrieves all the points from HBase, accumulates them into a line
for each car, and then uses the JTS library to find the intersection between the input
point and this line. Our implementation performs better for two reasons. First, we
use a secondary index to retrieve the licenses that are potential candidates and query
the main table for only those licenses. Second, we push the operation to the HBase
Region Servers, which means that the operator is applied to the potential candidates
on each slave node and only true results are returned. Our mpoint data-type
additionally carries its exact bounding box in its header, which helps us further filter
out irrelevant results and improve performance.
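The candidate filtering can be sketched as follows. This is a simplification of our mpoint layout: the class and method names are illustrative, and while a real mpoint interpolates between units, this sketch only tests the sampled positions.

```python
# Sketch of the bounding-box filter kept in the mpoint header: before testing
# whether a trajectory passes a point, the region server checks the precomputed
# bounding box and rejects the candidate without touching the sample list when
# the point falls outside it.
class MPoint:
    def __init__(self, samples):           # samples: [(t, x, y), ...]
        self.samples = samples
        xs = [x for _, x, _ in samples]
        ys = [y for _, _, y in samples]
        self.bbox = (min(xs), min(ys), max(xs), max(ys))  # header field

    def passes(self, x, y, eps=1e-9):
        x1, y1, x2, y2 = self.bbox
        if not (x1 - eps <= x <= x2 + eps and y1 - eps <= y <= y2 + eps):
            return False                   # cheap reject: no sample scan needed
        # Simplified: only sampled positions are tested here; the real operator
        # also checks the interpolated movement between samples.
        return any(abs(sx - x) < eps and abs(sy - y) < eps
                   for _, sx, sy in self.samples)

trip = MPoint([(0, 13.30, 52.45), (1, 13.35, 52.48), (2, 13.40, 52.52)])
print(trip.passes(13.35, 52.48))  # True: inside bbox and on a sample
print(trip.passes(14.00, 52.50))  # False: rejected by the bbox alone
```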
6.4.4 Query-4
SELECT C.Licence AS Licence
FROM dataScar C
WHERE intersects(C.Trip,Region);

This query finds all cars which have been in an area at any time in their history. Both
parameters of the operator are of type area. The Query Optimizer formulates the
optimal hash for the input region and sends the query to the secondary index.
Operation-wise, this query is similar to Query-3: the secondary index contains hashes
of lengths 5, 6 and 7, while the input region can be covered by a hash of length 8. This
means that the hash has to be trimmed to satisfy the WHERE clause conditions. The
same happens in Query-3, where a point represented by a hash of length 12 is trimmed
to match the index.

Figure 6.3: Query-3 Performance

Figure 6.4 shows the performance comparison for this query. Raw Phoenix could only
process the query on scale-0.005 and crashed for all other scales, as it could not handle
the amount of data flowing in.
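The trimming step can be sketched as follows; the trim_to_index helper and the tuple of indexed lengths are illustrative assumptions rather than the actual optimizer code:

```python
# The secondary index only materializes hashes of lengths 5-7, so a finer hash
# covering the input (length 8 for a small region, length 12 for a point) is
# cut back to the longest length the index actually contains before it is used
# as a lookup key.
INDEX_LENGTHS = (5, 6, 7)        # hash lengths materialized in the index

def trim_to_index(hash_str, index_lengths=INDEX_LENGTHS):
    """Trim a GeoHash to the longest indexed length not exceeding it."""
    usable = [n for n in index_lengths if n <= len(hash_str)]
    if not usable:
        raise ValueError("hash shorter than any indexed length")
    return hash_str[:max(usable)]

print(trim_to_index("u33dc0cp"))      # length-8 region hash -> 'u33dc0c'
print(trim_to_index("u33dc0cpt2k8"))  # length-12 point hash -> 'u33dc0c'
```

Trimming widens the lookup cell, so the results are a superset of the true answer and the exact WHERE clause predicate is still evaluated on the candidates afterwards.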
6.4.5 Query-5
SELECT C.Licence AS Licence
FROM dataScar C
WHERE length(trajectory(at(C.Trip,Region)))>10;

This query is operationally similar to Query-4, as the index performance is the same.
It additionally performs more spatial operations on the filtered data, which makes it
more CPU-heavy than the previous query. Figure 6.5 shows that raw Phoenix could
only process this query on the scale-0.005 data. Our implementation performs better
than Parallel SECONDO and scales well.
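The extra per-candidate work can be sketched as follows, with two simplifications: a rectangle stands in for the general region, and Euclidean distance over sampled positions stands in for the real length computation:

```python
# Restrict the trip to the region with at(), project it to a polyline with
# trajectory_length(), and sum the segment lengths before applying the > 10
# predicate -- this is the CPU-heavy part that runs on each filtered candidate.
from math import hypot

def at(samples, rect):
    """Keep only the samples inside rect = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    return [(x, y) for x, y in samples if x1 <= x <= x2 and y1 <= y <= y2]

def trajectory_length(points):
    """Length of the polyline through consecutive points."""
    return sum(hypot(bx - ax, by - ay)
               for (ax, ay), (bx, by) in zip(points, points[1:]))

trip = [(0, 0), (3, 4), (6, 8), (20, 20)]
inside = at(trip, (0, 0, 10, 10))      # drops the last sample
print(trajectory_length(inside) > 10)  # 5 + 5 = 10, not > 10 -> False
```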
Figure 6.4: Query-4 Performance
Figure 6.5: Query-5 Performance
7
Conclusion
This master's thesis presented the implementation of Guting's Algebra in HBase,
different index designs and a querying framework, which together enable the execution
of Guting's Algebra queries in HBase. We conclude this work with a summary of the
topics covered in each chapter and an outlook on some of our ideas for future
development.
7.1 Summary
Chapter 2 provided the background for this thesis. It covered different algebras that
could be used to process trajectory data and explained Guting’s Algebra in detail. We
discussed different kinds of indexing techniques with a focus on GeoHash.
Chapter 3 presented different distributed platforms for spatio-temporal data
processing. We discussed some of the open-source platforms for online querying and
explained Parallel SECONDO as well as HBase in detail, as they form the basis of our
thesis. We wrapped up the chapter by presenting the various criteria we considered
before choosing HBase as the platform for our implementation.
Chapter 4 presented our implementation of Guting’s data-types. We discussed
the internal representation of each data-type and explained important data-types like
mpoint in more detail.
Chapter 5 started with an explanation of indexing in HBase. We discussed different
index deployment strategies and motivated the suitability of SFC-based indexes for
HBase. We presented three different index designs for indexing regions and discussed
their querying process. We presented a querying framework which enables the use of
our index implementation for Guting's Algebra queries and explained its various
components, such as the Query Translator and the Query Optimizer, in detail. Finally,
we presented a hash-coverage algorithm for determining the hashes to be sent as input
to a query on our index.
Finally, Chapter 6 reported the results of our experiments on five selected queries.
The experiments compared the scalability and the execution performance of Parallel
SECONDO, raw HBase and our implementation. We analyzed the experiment results
and offered possible explanations for the observed performance differences between
these systems.
7.2 Outlook
Our work in this thesis has delivered promising results and can be extended in many
ways to enhance performance and usability. One of the major areas to work on is the
indexing of temporal periods. Unlike the spatial dimension, which has fixed bounds,
the temporal dimension is always growing. In the future, we would like to find out how
SFCs can be used to index temporal periods effectively. We also intend to discover
ways to combine SFC-based spatial and temporal indexes such that a single key can
be used to index both dimensions.
We further want to improve the querying framework in two ways. First, we want to
add support for splitting queries automatically. Second, we plan to integrate it with
the Query Optimizer of Phoenix, so that the other features of Phoenix, e.g. type
checking and meta-data storage, can be reused instead of maintaining separate
meta-stores. We also plan to integrate our index building process with Phoenix in such
a way that a user can use SQL to create an index by specifying the corresponding type,
i.e. SLSH, SLMH, etc.
We intend to compare the different indexing strategies, i.e. SLSH, SLMH and MLMH,
in terms of performance. We also want to study the effect of the max-hash parameter
in the SLMH and MLMH indexes and find out how an optimal value can be determined.
We plan to improve our implementation of the data-types by using the Struct data-type
of Phoenix whenever it is released. Finally, we intend to contribute this work to Apache
Phoenix.
References
[1] Yu Zheng and Xiaofang Zhou. Computing with spatial
trajectories. 2011. iii, ix, 12, 13, 15
[2] Ralf Hartmut Guting, Michael H Bohlen, Martin Erwig,
Christian S Jensen, Nikos A Lorentzos, Markus Schneider,
and Michalis Vazirgiannis. A foundation for repre-
senting and querying moving objects. ACM Trans-
actions on Database Systems (TODS), 25(1):1–42, 2000.
1, 8, 9
[3] Luca Forlizzi, Ralf Hartmut Guting, Enrico Nardelli, and
Markus Schneider. A data model and data structures for
moving objects databases, 29. ACM, 2000. 1
[4] Shashi Shekhar and Hui Xiong. Moving Object
Databases. In Encyclopedia of GIS, pages 732–732.
Springer, 2008. 1
[5] Hartmut Guting, Teixeira de Almeida, and Zhiming Ding.
Modeling and querying moving objects in networks.
The VLDB Journal, 15(2):165–190, 2006. 1
[6] Ralf Hartmut Guting, Thomas Behr, Victor Almeida,
Zhiming Ding, Frank Hoffmann, Markus Spiekermann, and
LG Datenbanksysteme fur neue Anwendungen. SEC-
ONDO: An extensible DBMS architecture and prototype.
FernUniversitat, Fachbereich Informatik, 2004. 1, 12
[7] Ralf Hartmut Guting, Victor Almeida, Dirk Ansorge,
Thomas Behr, Zhiming Ding, Thomas Hose, Frank Hoff-
mann, Markus Spiekermann, and Ulrich Telle. Secondo:
An extensible DBMS platform for research pro-
totyping and teaching. In Data Engineering, 2005.
ICDE 2005. Proceedings. 21st International Conference
on, pages 1115–1116. IEEE, 2005. 1
[8] Victor Teixeira de Almeida, Ralf Hartmut Guting, and
Thomas Behr. Querying Moving Objects in SEC-
ONDO. In MDM, 6, page 47, 2006. 1
[9] Ralf Hartmut Guting, Thomas Behr, and Christian Dunt-
gen. Secondo: A platform for moving objects database
research and for publishing and integrating research im-
plementations. Fernuniv., Fak. fur Mathematik u. Infor-
matik, 2010. 1
[10] Nikos Pelekis, Elias Frentzos, Nikos Giatrakos, and Yan-
nis Theodoridis. HERMES: aggregative LBS via a
trajectory DB engine. In Proceedings of the 2008
ACM SIGMOD international conference on Management
of data, pages 1255–1258. ACM, 2008. 1
[11] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C
Hsieh, Deborah A Wallach, Mike Burrows, Tushar Chan-
dra, Andrew Fikes, and Robert E Gruber. Bigtable:
A distributed storage system for structured
data. ACM Transactions on Computer Systems (TOCS),
26(2):4, 2008. 2, 27, 28, 35, 39
[12] James F Allen. An Interval-Based Representation of
Temporal Knowledge. In IJCAI, 81, pages 221–226,
1981. 5
[13] Krishna Kulkarni and Jan-Eike Michels. Temporal fea-
tures in SQL: 2011. ACM SIGMOD Record, 41(3):34–
43, 2012. 6
[14] Nikos Pelekis, E Frentzos, N Giatrakos, and Y Theodor-
idis. HERMES: A trajectory DB engine for
mobility-centric applications. International Journal
of Knowledge-based Organizations, 2011. 6
[15] Martin Erwig, Ralf Hartmut Gu, Markus Schneider,
Michalis Vazirgiannis, et al. Spatio-temporal data
types: An approach to modeling and query-
ing moving objects in databases. GeoInformatica,
3(3):269–296, 1999. 8
[16] Christian Duntgen, Thomas Behr, and Ralf Hartmut
Guting. BerlinMOD: a benchmark for moving ob-
ject databases. The VLDB Journal, 18(6):1335–1368,
2009. 11, 70, 78, 88
[17] Dieter Pfoser, Christian S Jensen, Yannis Theodoridis,
et al. Novel approaches to the indexing of mov-
ing object trajectories. In Proceedings of VLDB, pages
395–406. Citeseer, 2000. 12, 14
[18] Kyung-Chang Kim and Suk-Woo Yun. MR-Tree: a
cache-conscious main memory spatial index struc-
ture for mobile GIS. In Web and Wireless Geographical
Information Systems, pages 167–180. Springer, 2005. 12
[19] Mario A Nascimento and Jefferson RO Silva. Towards
historical R-trees. In Proceedings of the 1998 ACM
symposium on Applied Computing, pages 235–240. ACM,
1998. 12, 15
[20] Yufei Tao and Dimitris Papadias. Efficient historical R-
trees. In Scientific and Statistical Database Management,
2001. SSDBM 2001. Proceedings. Thirteenth International
Conference on, pages 223–232. IEEE, 2001. 12, 15
[21] V Prasad Chakka, Adam C Everspaugh, and Jignesh M Pa-
tel. Indexing large trajectory data sets with SETI.
Ann Arbor, 1001:48109–2122, 2003. 12, 16
[22] Panfeng Zhou, Donghui Zhang, Betty Salzberg, Gene
Cooperman, and George Kollios. Close pair queries
in moving object databases. In Proceedings of the
13th annual ACM international workshop on Geographic
information systems, pages 2–11. ACM, 2005. 12
[23] Gísli R Hjaltason and Hanan Samet. Distance browsing
in spatial databases. ACM Transactions on Database
Systems (TODS), 24(2):265–318, 1999. 13
[24] Xiaolei Li, Jiawei Han, Jae-Gil Lee, and Hector Gonza-
lez. Traffic density-based discovery of hot routes
in road networks. In Advances in Spatial and Temporal
Databases, pages 441–459. Springer, 2007. 14
[25] Xiaomei Xu, Jiawei Han, and Wei Lu. RT-tree: an im-
proved R-tree index structure for spatiotemporal
databases. In Proceedings of the 4th international sympo-
sium on spatial data handling, 2, pages 1040–1049. IGU
Commission on GIS, 1990. 15
[26] Yufei Tao and Dimitris Papadias. The mv3r-tree: A
spatio-temporal access method for timestamp and
interval queries. 2001. 16
[27] David Lomet and Betty Salzberg. The performance
of a multiversion access method. In ACM SIGMOD
Record, 19, pages 353–363. ACM, 1990. 17
[28] Longhao Wang, Yu Zheng, Xing Xie, and Wei-Ying Ma.
A flexible spatio-temporal indexing scheme for
large-scale GPS track retrieval. In Mobile Data Man-
agement, 2008. MDM’08. 9th International Conference on,
pages 1–8. IEEE, 2008. 17
[29] Ed Katibah, Milan Stojic, Michael Rys, and Nicholas Dritsas.
Tuning Spatial Point Data Queries in SQL Server
2012. In http://social.technet.microsoft.com/, pages 441–
459. Microsoft, 2013. 18
[30] Marshall Bern, David Eppstein, and Shang-Hua Teng.
Parallel construction of quadtrees and quality tri-
angulations. International Journal of Computational
Geometry & Applications, 9(06):517–532, 1999. 18
[31] Jonathan K Lawder and Peter JH King. Using space-
filling curves for multi-dimensional indexing. In
Advances in Databases, pages 20–35. Springer, 2000. 19,
20
[32] Nick Dimiduk, Amandeep Khurana, Mark Henry Ryan, and
Michael Stack. HBase in action. Manning Shelter Island,
2013. 21, 22, 23, 77
[33] Ahmed Eldawy and Mohamed F Mokbel. A demon-
stration of spatialhadoop: an efficient mapreduce
framework for spatial data. Proceedings of the VLDB
Endowment, 6(12):1230–1233, 2013. 25
[34] Ahmed Eldawy, Yuan Li, Mohamed F Mokbel, and Ravi
Janardan. CG Hadoop: computational geome-
try in MapReduce. In Proceedings of the 21st ACM
SIGSPATIAL International Conference on Advances in
Geographic Information Systems, pages 284–293. ACM,
2013. 25
[35] Ahmed Eldawy and Mohamed F Mokbel. Pigeon: A
spatial MapReduce language. In Data Engineering
(ICDE), 2014 IEEE 30th International Conference on,
pages 1242–1245. IEEE, 2014. 26
[36] Ablimit Aji, Fusheng Wang, Hoang Vo, Rubao Lee, Qiaol-
ing Liu, Xiaodong Zhang, and Joel Saltz. Hadoop GIS:
a high performance spatial data warehousing sys-
tem over mapreduce. Proceedings of the VLDB En-
dowment, 6(11):1009–1020, 2013. 26, 27
[37] Avinash Lakshman and Prashant Malik. Cassandra: a
decentralized structured storage system. ACM
SIGOPS Operating Systems Review, 44(2):35–40, 2010.
26
[38] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gu-
navardhan Kakulapati, Avinash Lakshman, Alex Pilchin,
Swaminathan Sivasubramanian, Peter Vosshall, and
Werner Vogels. Dynamo: amazon’s highly available
key-value store. In ACM SIGOPS Operating Systems
Review, 41, pages 205–220. ACM, 2007. 27
[39] Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng
Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete
Wyckoff, and Raghotham Murthy. Hive: a warehous-
ing solution over a map-reduce framework. Pro-
ceedings of the VLDB Endowment, 2(2):1626–1629, 2009.
27
[40] Apache HBase. 28
[41] Patrick Hunt, Mahadev Konar, Flavio Paiva Junqueira,
and Benjamin Reed. ZooKeeper: Wait-free Coordina-
tion for Internet-scale Systems. In USENIX Annual
Technical Conference, 8, page 9, 2010. 28
[42] Apache Hadoop. 28
[43] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung.
The Google file system. In ACM SIGOPS Operating
Systems Review, 37, pages 29–43. ACM, 2003. 28
[44] Apache Phoenix. 28
[45] Jiamin Lu and Ralf Hartmut Guting. Parallel SEC-
ONDO: A practical system for large-scale process-
ing of moving objects. In Data Engineering (ICDE),
2014 IEEE 30th International Conference on, pages 1190–
1193. IEEE, 2014. 28, 30
[46] Jiamin Lu and Ralf Hartmut Guting. Simple and ef-
ficient coupling of Hadoop with a database en-
gine. In Proceedings of the 4th annual Symposium on
Cloud Computing, page 32. ACM, 2013. 32
[47] Patrick O'Neil, Edward Cheng, Dieter Gawlick, and Elizabeth
O'Neil. The log-structured merge-tree (LSM-
tree). Acta Informatica, 33(4):351–385, 1996. 35
[48] Amandeep Khurana. Introduction to HBase Schema
Design. Usenix;login, 37(5):1626–1629, 2012. 39
[49] JTS Topology Suite. 52
[50] Open GIS Consortium. 52
Declaration
I herewith declare that I have produced this paper without the prohibited
assistance of third parties and without making use of aids other than those
specified; notions taken over directly or indirectly from other sources have
been identified as such. This paper has not previously been presented in
identical or similar form to any other German or foreign examination board.
The thesis work was conducted from 1st April to 10th August 2014 under
the supervision of Alexander S. Alexandrov at the DIMA research group
at TU Berlin.
Berlin, 10th August 2014