Budapest University of Technology and Economics Department of Measurement and Information Distributed Incremental Model Queries over the Cloud: Engineering and Deployment Challenges Dániel Varró Budapest University of Technology and Economics Fault Tolerant Systems Research Group

IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineering and Deployment Challenges


DESCRIPTION

In model-driven software engineering (MDE), model queries are a core technology behind many tool- and transformation-specific tasks such as design rule validation, model synchronization, view maintenance, and simulation. As software models rapidly grow in size and complexity, traditional MDE tools frequently face scalability issues that decrease the productivity of engineers and increase development costs. Incremental graph queries offer a graph-pattern-based language for capturing queries. Furthermore, the result set of a query is cached and incrementally maintained upon model changes to provide instantaneous query response time. In this talk, a brief overview is first given of the EMF-IncQuery framework (an official Eclipse subproject). We then discuss how to run incremental queries over a distributed cloud infrastructure (to scale up from a single-node tool to a cluster of nodes) deployed over popular database back-ends (such as Cassandra, 4store, Neo4j, etc.). We present our first benchmarking experiments with IncQuery-D to highlight that distributed incremental model queries can perform significantly better than the native query technologies of the underlying database back-end, especially for complex queries.


Page 1: IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineering and Deployment Challenges

Budapest University of Technology and Economics
Department of Measurement and Information Systems

Distributed Incremental Model Queries over the Cloud: Engineering and Deployment Challenges

Dániel Varró
Budapest University of Technology and Economics
Fault Tolerant Systems Research Group

Outline of the Talk

Motivation & Background:
• Validation of design rules
• Graph pattern matching

Incremental Model Queries: The EMF-IncQuery framework
• Language - Execution

Distributed Incremental Model Queries (IncQuery-D)
• Architecture -

Performance Benchmarks
• Distributed model load
• Incremental query evaluation

Main Contributors
o István Ráth (lead)
o Ákos Horváth
o Gábor Bergmann
o Ábel Hegedüs
o Zoltán Ujhelyi
o Benedek Izsó
o Gábor Szárnyas
o Csaba Debreceni
o Dénes Harmath
o József Makai
o Dániel Stein


SCALABLE MODEL DRIVEN ENGINEERING

Scalable MDE: The MONDO Project

Models and Languages
• Large and heterogeneous
• Construction
• Visualization

Queries and Transformations
• Executed over large models
• Incremental
• Lazy
• Parallel

Collaboration
• Offline (SVN)
• Online (Gdocs)
• Many collaborators
• Secure access

Persistent Storage
• Efficient
• Secure
• Interoperability

Case studies:
• validate solutions through real case studies
• guided by industrial advisory board

Prototype tools:
• open source software
• open benchmarks

Academic Partners:
• Univ. York (UK), Univ. Autónoma Madrid (ES), ARMINES (FR), BME (HU)

Industrial Partners:
• The Open Group (UK), Uninova (PT), Softeam (FR), Soft-Maint (FR), IKERLAN (ES)


MOTIVATION FOR INCREMENTAL MODEL QUERIES

Motivation: Early validation of design rules

SystemSignalGroup design rule (from AUTOSAR)
o A SystemSignal and its group must be in the same IPdu
o Challenge: find violations quickly in large models
o New difficulties
  • reverse navigation
  • complex manual solution

AUTOSAR:
• standardized SW architecture of the automotive industry
• now supported by modern modeling tools

Design Rule/Well-formedness constraint:
• must be respected by each valid car architecture
• designers are immediately notified if violated

Challenge:
• >500 design rules in AUTOSAR tools
• >1 million elements in AUTOSAR models
• models are constantly evolved by designers

Domain-Specific Modeling Languages

[Figure labels: Abstract, Meta-model, Model, «type»]

Validation of Well-formedness Constraints

pattern switchWOSignal(sw) {
  Switch(sw);
  neg find switchHasSignal(sw);
}

pattern switchHasSignal(sw) {
  Switch(sw);
  Signal(sig);
  Signal.mountedTo(sig, sw);
}

[Figure labels: Meta-model, Model, Query, Modify, User, Result, Domain-specific modeling languages]
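The two patterns on this slide can be read as set operations. The following Python sketch is an illustration only, not the EMF-IncQuery implementation: the toy model (`switches`, `signals`, `mounted_to`) and both function names are hypothetical. It evaluates `switchHasSignal` first and then takes the complement over all switches, which is what `neg find` expresses.

```python
# Hypothetical toy model: three switches, two signals, and Signal.mountedTo edges.
switches = {"sw1", "sw2", "sw3"}
signals = {"sig1", "sig2"}
mounted_to = {("sig1", "sw1"), ("sig2", "sw3")}  # (signal, switch) pairs


def switch_has_signal(all_switches, all_signals, edges):
    """Matches of switchHasSignal(sw): switches with at least one mounted signal."""
    return {sw for (sig, sw) in edges if sig in all_signals and sw in all_switches}


def switch_wo_signal(all_switches, all_signals, edges):
    """Matches of switchWOSignal(sw): switches for which switchHasSignal has no match."""
    return all_switches - switch_has_signal(all_switches, all_signals, edges)


violations = switch_wo_signal(switches, signals, mounted_to)
print(sorted(violations))  # ['sw2']
```

Recomputing both sets on every call corresponds to batch evaluation; the incremental approach discussed later avoids this full recomputation.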

Model sizes in practice

Models with 10M+ elements are common:
o Car industry
o Avionics
o Source code analysis

Models evolve and change continuously

Application          Model size
System models        10^8
Sensor data          10^9
Geospatial models    10^12

Source: Markus Scheidgen, How Big are Models – An Estimation, 2012.

Validation can take hours


MODEL QUERIES AND GRAPH PATTERN MATCHING

What is a model query?

For a programmer:
o A piece of code that searches for parts of the model

For the scientist:
o Query = set of constraints that have to be satisfied by (parts of) the (graph) model
o Result = set of model element tuples that satisfy the constraints of the query
o Match = bind constraint variables to model elements

A query engine:
o Supports the definition & execution of model queries

Query(A,B) ≜ ∧ cond_i(A_i, B_i)
• all tuples of model elements a, b
• satisfying the query condition
• along the match A=a and B=b
• parameters A, B can be input / output

Graph Pattern Matching for Queries

Match:
o m: L → G (graph morphism)
o CSP:
  • Variables: Nodes of L
  • Constraints: Edges of L
  • Domain values: G
o Complexity: |G|^|L|

[Figure: pattern graph L and model graph G (switch positions labelled "straight" and "left"). Pattern nodes: route: Route, sp: SwitchPosition, switch: Switch, sensor: Sensor. Pattern edges: switchPosition, switch, sensor, routeDefinition]

All sensors with a switch that belongs to a route must directly be linked to the same route.
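To make the CSP reading and the |G|^|L| bound concrete, here is a brute-force matcher in Python. It is a sketch under assumptions: the tiny host graph, the node names (`r1`, `sp1`, ...) and the function name are hypothetical, and the pattern is the route/sensor rule stated above with the `routeDefinition` edge treated as a negative condition. Every pattern variable ranges over every graph node, so the number of candidate assignments is exactly |G|^|L|.

```python
from itertools import product

# Hypothetical host graph G: typed nodes and labelled edges.
node_types = {"r1": "Route", "sp1": "SwitchPosition", "sw1": "Switch",
              "s1": "Sensor", "s2": "Sensor"}
edges = {("r1", "switchPosition", "sp1"),
         ("sp1", "switch", "sw1"),
         ("sw1", "sensor", "s1"),
         ("r1", "routeDefinition", "s2")}

# Pattern L: typed variables, positive edge constraints, one negative edge.
pattern_vars = {"route": "Route", "sp": "SwitchPosition",
                "switch": "Switch", "sensor": "Sensor"}
positive_edges = [("route", "switchPosition", "sp"),
                  ("sp", "switch", "switch"),
                  ("switch", "sensor", "sensor")]
negative_edges = [("route", "routeDefinition", "sensor")]


def brute_force_matches():
    """Try every assignment of pattern variables to graph nodes."""
    names = list(pattern_vars)
    tried, found = 0, []
    for values in product(node_types, repeat=len(names)):
        tried += 1
        m = dict(zip(names, values))
        if any(node_types[m[v]] != t for v, t in pattern_vars.items()):
            continue  # type constraint violated
        if not all((m[a], lbl, m[b]) in edges for a, lbl, b in positive_edges):
            continue  # a required edge is missing
        if any((m[a], lbl, m[b]) in edges for a, lbl, b in negative_edges):
            continue  # the forbidden edge is present, so the rule is respected
        found.append(m)
    return found, tried


matches, tried = brute_force_matches()
print(tried)    # 625 candidate assignments: 5 nodes, 4 variables
print(matches)  # the single violating binding (sensor s1 is not linked to r1)
```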

Graph Pattern Matching (Local Search)

Search Plan:
o Select the first node to be matched
o Define an ordering on graph pattern edges

Search is restarted from scratch each time

[Figure: the route / sp / switch / sensor pattern with search-plan steps numbered 0 to 4, matched against a model with "straight" and "left" switch positions]
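A search plan can be sketched as a start variable plus an ordered list of pattern edges that are navigated one after another. The Python below is a minimal illustration, assuming a hypothetical toy graph and a plan that starts from `route`; it is not the planner of any particular tool. Because nothing is cached between calls, each invocation starts from scratch, as the slide notes.

```python
# Hypothetical host graph: typed nodes and labelled edges.
node_types = {"r1": "Route", "r2": "Route", "sp1": "SwitchPosition",
              "sw1": "Switch", "s1": "Sensor"}
edges = {("r1", "switchPosition", "sp1"),
         ("sp1", "switch", "sw1"),
         ("sw1", "sensor", "s1")}


def local_search(start_var, start_type, plan):
    """Extend partial matches by navigating the plan's edges in order."""
    outgoing = {}
    for src, label, dst in edges:
        outgoing.setdefault((src, label), []).append(dst)

    # Step 0: bind the first pattern node to every graph node of its type.
    partial = [{start_var: n} for n, t in node_types.items() if t == start_type]

    # Following steps: one pattern edge per step, in the order given by the plan.
    for src_var, label, dst_var in plan:
        extended = []
        for match in partial:
            for target in outgoing.get((match[src_var], label), []):
                # Bind dst_var, or check consistency if it is already bound.
                if match.get(dst_var, target) == target:
                    extended.append({**match, dst_var: target})
        partial = extended
    return partial


plan = [("route", "switchPosition", "sp"),
        ("sp", "switch", "switch"),
        ("switch", "sensor", "sensor")]
matches = local_search("route", "Route", plan)
print(matches)
```

Compared with trying every assignment, the plan only visits nodes reachable along the pattern edges; here `r2` is discarded at the first step because it has no `switchPosition` edge.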

Incremental Graph Pattern Matching

Main idea: More space to less time
o Cache matches of patterns
o Instantly retrieve match (if valid)
o Update caches upon model changes
o Notify about relevant changes

Approaches:
o TREAT, LEAPS, RETE, …
o Tools: VIATRA, GROOVE, MoTE, TCore

[Figure: the route / sp / switch / sensor pattern next to a match cache table with columns route, sp, switch, sensor and a first row r1, sp1, sw1]
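The four points of the main idea (cache, instant retrieval, update on change, notification) can be sketched for a very small query. The Python class below is an assumption-laden illustration, not EMF-IncQuery code: it reuses the `switchWOSignal` pattern from the validation slide because its cache is easy to maintain, and the class and method names are hypothetical. It assumes that switches are added before signals are mounted on them and that each mount is later unmounted at most once.

```python
class IncrementalSwitchWithoutSignal:
    """Keeps the match set of 'switch without signal' up to date on every change."""

    def __init__(self):
        self.signal_count = {}  # switch -> number of signals mounted on it
        self.matches = set()    # cached result set, readable at any time
        self.listeners = []     # callbacks invoked as callback(kind, switch)

    def _notify(self, kind, switch):
        for callback in self.listeners:
            callback(kind, switch)

    def add_switch(self, switch):
        if switch in self.signal_count:
            return
        self.signal_count[switch] = 0
        self.matches.add(switch)            # a new switch has no signal yet
        self._notify("appeared", switch)

    def mount_signal(self, switch):
        self.signal_count[switch] += 1
        if self.signal_count[switch] == 1:  # first signal: match disappears
            self.matches.discard(switch)
            self._notify("disappeared", switch)

    def unmount_signal(self, switch):
        self.signal_count[switch] -= 1
        if self.signal_count[switch] == 0:  # last signal removed: match reappears
            self.matches.add(switch)
            self._notify("appeared", switch)


events = []
query = IncrementalSwitchWithoutSignal()
query.listeners.append(lambda kind, sw: events.append((kind, sw)))
query.add_switch("sw1")
query.add_switch("sw2")
query.mount_signal("sw1")
print(sorted(query.matches))  # ['sw2']
print(events)
```

Each update touches only the affected switch, so the work is proportional to the size of the change and not to the size of the model; the price is the extra memory for `signal_count` and `matches`.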

Batch vs. Live Query Scenarios

Batch query (pull / request-driven):
1. Designer selects a query
2. One/All matches are calculated
3. Rule is applied on one/all matches
4. All Steps 1-3 are redone if model changes
Query results obtained upon designer demand

Live query (push / event-driven):
1. Model is loaded
2. Rule system is loaded
3. Calculate full match set
4. Model is changed (rules fired or designer updates)
5. Iterate Steps 3 and 4 until rule system is stopped
Query results are always available for designer


INCREMENTAL MODEL QUERIES: THE EMF-INCQUERY PROJECT

EMF-IncQuery: An Open Source Eclipse Project

Definition
• Declarative graph query language
• Transitive closure, Negative cond., etc.
• Compositional, reusable

Execution
• Incremental evaluation
• Cache result set
• Maintain incrementally upon model change

Features
• Derived features
• On-the-fly validation
• View generation, Notifications, Soft links, Databinding

http://eclipse.org/incquery

The IncQuery (IQ) Graph Query Language

IQ: declarative query language
o Attribute constraints
o Local + global queries
o Compositionality + Reusability
o Recursion, Negation
o Transitive Closure
o Syntax: DATALOG style

pattern routeSensor(sensor: Sensor) = {
  TrackElement.sensor(switch, sensor);
  Switch(switch);
  SwitchPosition.switch(sp, switch);
  SwitchPosition(sp);
  Route.switchPosition(route, sp);
  Route(route);
  neg find head(route, sensor);
}

pattern head(R, Sen) = {
  Route.routeDefinition(R, Sen);
}

ModelQuery(A,B):
• tuples of model elements A, B
• satisfying the query condition
• enumerate 1 / all instances
• A, B can be input or output

[Figure: pattern graph with nodes route: Route, sp: SwitchPosition, switch: Switch, sensor: Sensor and edges switchPosition, switch, sensor, routeDefinition]

TOOL DEMO: INCQUERY Development Tools

Query Explorer

Pattern Editor

Queries are applied and updated on-the-fly
• Works with most EMF editors out-of-the-box
• Reveals matches as selection

Incremental Query Evaluation by RETE

AUTOSAR well-formedness validation rule

[Figure: instance model with the element kinds Communication channel, Logical signal, Mapping, Physical signal; one valid and one invalid model fragment are marked]

Incremental Query Evaluation by RETE

• Fill the input nodes
• Fill the worker nodes
• Read the result set
• Modify the model
• Propagate the changes
• Read the changes in the result set (deltas)

[Figure: Rete network. Input nodes: Communication channel, Logical signal, Mapping, Physical signal. Worker nodes: join, join, antijoin. Output: Result set]
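The steps above can be traced on a miniature Rete-style network. The Python sketch below is illustrative only and is not the EMF-IncQuery engine: all class names are hypothetical, tuples are sets of (variable, value) pairs, and it assumes that the same tuple is never inserted twice. Since the slide does not spell out the AUTOSAR rule, the network is wired for the `routeSensor` pattern shown earlier, which has the same join, join, antijoin shape.

```python
class Node:
    def __init__(self):
        self.children = []  # (child node, side) pairs; side is "L" or "R"

    def emit(self, sign, tup):
        for child, side in self.children:
            child.receive(side, sign, tup)


class Input(Node):
    """Input node: turns one kind of model edge into variable bindings."""
    def __init__(self, src_var, dst_var):
        super().__init__()
        self.vars = (src_var, dst_var)

    def update(self, sign, src, dst):  # sign: +1 insert, -1 delete
        self.emit(sign, frozenset(zip(self.vars, (src, dst))))


class Join(Node):
    """Worker node: join (or antijoin if anti=True) on the key variables."""
    def __init__(self, keys, anti=False):
        super().__init__()
        self.keys, self.anti = keys, anti
        self.mem = {"L": {}, "R": {}}  # side -> key -> set of stored tuples

    def receive(self, side, sign, tup):
        values = dict(tup)
        key = tuple(values[k] for k in self.keys)
        bucket = self.mem[side].setdefault(key, set())
        other = self.mem["R" if side == "L" else "L"].get(key, set())
        was_nonempty = bool(bucket)
        (bucket.add if sign > 0 else bucket.discard)(tup)
        if not self.anti:
            for partner in other:          # pair the delta with stored partners
                self.emit(sign, tup | partner)
        elif side == "L":
            if not other:                  # no blocking tuple on the right
                self.emit(sign, tup)
        elif was_nonempty != bool(bucket):  # right side became empty or non-empty
            for left in other:
                self.emit(-sign, left)


class Production:
    """Stores the result set and records the deltas that reach it."""
    def __init__(self):
        self.results, self.deltas = set(), []

    def receive(self, side, sign, tup):
        (self.results.add if sign > 0 else self.results.discard)(tup)
        self.deltas.append((sign, dict(tup)))


# Wire the network: two joins followed by an antijoin.
switch_position = Input("route", "sp")      # Route.switchPosition
sp_switch = Input("sp", "switch")           # SwitchPosition.switch
tr_sensor = Input("switch", "sensor")       # TrackElement.sensor
route_def = Input("route", "sensor")        # Route.routeDefinition
join1, join2 = Join(["sp"]), Join(["switch"])
antijoin, production = Join(["route", "sensor"], anti=True), Production()
switch_position.children.append((join1, "L"))
sp_switch.children.append((join1, "R"))
join1.children.append((join2, "L"))
tr_sensor.children.append((join2, "R"))
join2.children.append((antijoin, "L"))
route_def.children.append((antijoin, "R"))
antijoin.children.append((production, "L"))

# Fill the input nodes; the worker nodes fill up as the tuples propagate.
switch_position.update(+1, "r1", "sp1")
sp_switch.update(+1, "sp1", "sw1")
tr_sensor.update(+1, "sw1", "s1")
after_fill = len(production.results)        # one violation

# Modify the model and read the deltas.
route_def.update(+1, "r1", "s1")            # repair: link the sensor to the route
after_repair = len(production.results)
route_def.update(-1, "r1", "s1")            # undo the repair
after_undo = len(production.results)
print(after_fill, after_repair, after_undo)
print([sign for sign, _ in production.deltas])
```

Each model change travels only through the nodes whose memories it affects, which is why the update cost follows the size of the change. The memories in `Join.mem` are the partial results that the later memory-consumption slide refers to.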

Performance of EMF-INCQUERY

Incremental graph queries based on Rete
Built for the Eclipse Modeling Framework

[Figure: runtime vs. model size for batch queries and incremental queries]

Runtime is proportional to the size of the modification.

Performance of EMF-INCQUERY

[Figure: memory consumption vs. model size for incremental queries and batch queries, with the memory limit marked]

Storing partial results

Selected Applications (EMF-IncQuery)

Toolchain for IMA configs
• Complex traceability
• Query driven views
• Abstract models by derived objects

MATLAB-EMF Bridge
• Connect to Matlab Simulink model
• Export: Matlab2EMF
• Change model in EMF
• Re-import: EMF2Matlab

Gesture recognition
• Live models (refreshed 25 frame/s)
• Complex event processing

Detection of bad code smells
• Experiments on open source Java projects
• Local search vs. Incremental vs. Native Java code

Design Space Exploration
• Rules for operations
• Complex structural constraints (as GP)
• Hints and guidance
• Potentially infinite state space

Known Users
• Itemis (developer)
• Embraer
• Thales
• ThyssenKrupp
• CERN


INCQUERY-D: DISTRIBUTED INCREMENTAL MODEL QUERIES

Goals of INCQUERY-D

Objectives
o Distributed incremental pattern matching
o Adaptation of EMF-INCQUERY’s tooling to graph DBs
o Executed over cloud infrastructure (COTS hardware)

Achieve scalability by avoiding memory bottleneck
o Sharding separately
  • Data
  • Indexers
  • Query network
o In memory:
  • Index + Query

Assumptions
• All Rete nodes fit on a server node
• Indexers can be filled efficiently
• Modification size ≪ model size
• The application requires the complete result set of the query (as opposed to just one match)

Dimensions of Scalability

Infrastructure
o Number of machines
o Available memory / CPU
o Network performance
o Number of concurrent users

Model
o Model size
o Model characteristics

Queries
o Number of queries
o Query complexity

Metrics

From EMF-INCQUERY to INCQUERY-D

[Figure: EMF-INCQUERY stack. Labels: Transaction, In-memory EMF model (In-memory storage), Indexer layer (Indexing), Rete net]

Production network
• Stores intermediate query results
• Propagates changes

INCQUERY-D Architecture

[Figure: Servers 0 to 3, each with its own database shard (0 to 3); INCQUERY-D spans the servers with the layers Transaction, Indexer layer and Rete net]

Layer annotations:
• Distributed persistent storage
• Model access adapter
• Distributed indexer: distributed indexing, notification
• Distributed query evaluation network

Distributed production network
• Each intermediate node can be allocated to a different host
• Remote internode communication

INCQUERY-D Architecture

[Figure: Servers 0 to 3 with database shards 0 to 3; four Indexer nodes in the indexer layer; Rete nodes Join, Join and Antijoin; labels: Transaction, In-memory EMF model, Akka]

Storage back-ends: Triple store (4store), Document DB (Mongo), RDF over Column family (Cumulus)

Termination Protocol in INCQUERY-D

[Figure: Servers 0 to 3 with database shards 0 to 3; four Indexer nodes; Rete nodes Join, Join and Antijoin; a Transaction enters the network]

When a production node is reached, an ACK message is sent back
Stack added to each update msg
• Registers the Rete nodes the message passes through

User retrieves query result
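One possible reading of this protocol is sketched below in Python. The slide only states that each update message carries a stack of the Rete nodes it passed through and that an ACK is sent back once a production node is reached; everything else here is an assumption made for illustration, including the `PendingTracker` helper, the synchronous delivery, and the node names in the example path.

```python
class UpdateMessage:
    def __init__(self, payload):
        self.payload = payload
        self.stack = []  # names of the Rete nodes this message has passed through


class PendingTracker:
    """Hypothetical helper: counts update messages that are not yet acknowledged."""

    def __init__(self):
        self.pending = 0
        self.ack_routes = []

    def sent(self):
        self.pending += 1

    def acked(self, route):
        self.pending -= 1
        self.ack_routes.append(route)

    def terminated(self):
        # Results are safe to read only when no update is still in flight.
        return self.pending == 0


def propagate(message, path, tracker):
    """Deliver the message along `path`; the last entry is the production node."""
    tracker.sent()
    for node in path:
        message.stack.append(node)  # each node registers itself on the message
    # Production node reached: the ACK retraces the recorded stack backwards.
    ack_route = []
    while message.stack:
        ack_route.append(message.stack.pop())
    tracker.acked(ack_route)


tracker = PendingTracker()
path = ["Indexer", "Join1", "Join2", "Antijoin", "Production"]
propagate(UpdateMessage(("+", "r1", "s1")), path, tracker)
print(tracker.terminated())   # True
print(tracker.ack_routes[0])  # the path in reverse order
```

In a real deployment the nodes run on different servers and messages are delivered asynchronously, so the count can stay above zero for a while; the user's result request would then wait until it drops back to zero.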

IncQuery-D Architectural Layers

High-Level Query Lang: Gremlin, Cypher; SPARQL; IQPL (IncQuery)
• Global queries
• Complex navigations

Low-Level Query Lang: Distributed Indexers (MONDIX); SPARQL
• Efficient element access by indices
• Local queries

Distributed Graph DB: Cayley; Titan; 4store
• Can be transparent (via indexers)
• Integrates popular graph storages

Native Storage: MongoDB; Cassandra; 4store
• Efficient NoSQL storages
• Triple stores

Storage Format: RDF; XMI / Ecore; Property Graphs
• Standardized data formats
• Popular interchange formats

Summary: Key Components of IncQuery-D

Distributed Model Storage
• Adaptable to different back-end storages
• Agnostic to graph representation
• Triple stores (RDF), EMF, property graph

Model Access Adapter
• Surrogate key to identify distributed elements
• Graph manipulation API
• Change notifications

Distributed Indexer
• Type-instance indices, etc.
• Stored on multiple servers
• Protects against exceeding memory limits

Distributed Query Evaluator
• Distributed RETE network
• Distributed termination protocol
• Constructed and deployed by coordinator node

Decouple and separately distribute Storage, Indexer and Query layers


USAGE PHASES

(1) Loading a Query

[Figure: usage workflow over the cloud infrastructure. Labels: Load Query, Construct RETE, Allocate RETE, Deploy RETE, RETE Network, Load Model, Update Model, Request Result]

Construct RETE
• From EMF-IncQuery specs
• Should incorporate infrastructure constraints

Deploy RETE
• Managed by a coordinator node
• Intelligent sharding of RETE nodes

(2) Loading a Model

[Figure: usage workflow over the cloud infrastructure. Labels: Load Query, Construct RETE, Allocate RETE, Deploy RETE, RETE Network, Load Model, Model shards, Model Access Adapter, Maintain Result Set, Update Model, Request Result]

Load model
• Model traversal
• Init indexers
• Network communication

(3) Updating a Model

[Figure: usage workflow over the cloud infrastructure. Labels: Load Query, Construct RETE, Allocate RETE, Deploy RETE, RETE Network, Load Model, Model shards, Model Access Adapter, Maintain Result Set, Update Model, Request Result]

Model manipulation
• Update messages
• Create / Delete

(4) Requesting Query Result

[Figure: usage workflow over the cloud infrastructure. Labels: Load Query, Construct RETE, Allocate RETE, Deploy RETE, RETE Network, Load Model, Model shards, Model Access Adapter, Evaluate Query, Maintain Result Set, Update Model, Request Result]

Evaluate query
• Process incoming messages
• Propagate along RETE network

Retrieve results
• instantly

(5) Monitoring and Reconfiguration

[Figure: usage workflow over the cloud infrastructure. Labels: Load Query, Construct RETE, Allocate RETE, Deploy RETE, RETE Network, Load Model, Model shards, Model Access Adapter, Evaluate Query, Maintain Result Set, Monitor & Manage, Update Model, Request Result]

Visualized on a web-based dashboard
• OS metrics
• JVM metrics
• Rete metrics
• Akka metrics


DEPLOYMENT PROCESS FOR DISTRIBUTED RETE

RETE Deployment Process

Process stages: Query Language, Query Predicates, RETE Structure, Platform Description, Allocation / Mapping, Deployment Descriptor

pattern routeSensor(sensor: Sensor) = {
  TrackElement.sensor(switch, sensor);
  Switch(switch);
  SwitchPosition.switch(sp, switch);
  SwitchPosition(sp);
  Route.switchPosition(route, sp);
  Route(route);
  neg find head(route, sensor);
}

pattern head(R, Sen) = {
  Route.routeDefinition(R, Sen);
}

[Figure: pattern graph with nodes route: Route, sp: SwitchPosition, switch: Switch, sensor: Sensor and edges switchPosition, switch, sensor, routeDefinition]

Tooling: RDF Pattern Language

RDF-IncQuery syntax (check expression in Javascript)

Vocabulary <railway.rdf>
base <http://www.semanticweb.org/ontologies/2011/1/TrainRequirementOntology.owl#>

pattern posLength(Segment, SegmentLength) {
  Segment(Segment);
  Segment_length(Segment, SegmentLength);
  check("SegmentLength <= 0");
}

EMF-IncQuery syntax (check expression in Xbase, compiles to Java)

import "http://www.semanticweb.org/ontologies/2011/1/TrainRequirementOntology.owl"

pattern posLength(Segment, SegmentLength) {
  Segment(Segment);
  Segment.Segment_length(Segment, SegmentLength);
  check(SegmentLength <= 0);
}

[Figure: pattern node segment: Segment with the constraint segment.length ≤ 0]

RETE Deployment Process

Construct language-independent constraints
Resolution of
o syntactic sugar
o type information

Variables: route, sp, switch
Parameter: sensor

Constraints
• Edge: SwitchPosition.switch
• Edge: TrackElement.sensor
• Edge: Route.switchPosition
• Negation: head

RETE Deployment Process

Construct RETE structure (platform independently)
Optimizations:
o Model statistics
o Expected usage profile

[Figure: RETE structure with three join nodes]

RETE Deployment Process

Architecture model (Cloud infrastructure)
o Virtual Machines
  • Memory limits
  • CPU speed
  • Storage capacity
o Communication Channels
  • Bandwidth

Specified by a textual DSL (Xtext)

[Figure: four machines, numbered 1 to 4]

RETE Deployment Process

Machine   Allocated Nodes
1         In1, In2, Join2
2         In3
3         In4
4         Join1, Join3

[Figure: Rete nodes In1 to In4 and Join1 to Join3 placed on machines 1 to 4]
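An allocation such as the one in the table determines which Rete edges cross machine boundaries and therefore need remote communication. The Python sketch below takes the machine-to-node table from the slide as given; the Rete topology in `rete_edges` is an assumption made for illustration (the slide's figure does not survive in this text), so the resulting local/remote split is only an example.

```python
# Allocation table from the slide: machine number -> Rete nodes hosted there.
allocation = {
    1: ["In1", "In2", "Join2"],
    2: ["In3"],
    3: ["In4"],
    4: ["Join1", "Join3"],
}

# ASSUMED topology (not given in the text): a chain of three joins.
rete_edges = [
    ("In1", "Join2"), ("In2", "Join2"),
    ("Join2", "Join1"), ("In3", "Join1"),
    ("Join1", "Join3"), ("In4", "Join3"),
]

# Invert the table so every Rete node can be mapped to its host machine.
host_of = {node: machine for machine, nodes in allocation.items() for node in nodes}

local_edges = [(s, t) for s, t in rete_edges if host_of[s] == host_of[t]]
remote_edges = [(s, t) for s, t in rete_edges if host_of[s] != host_of[t]]

print("local: ", local_edges)
print("remote:", remote_edges)
```

An allocator could use the number of remote edges, weighted by expected message volume and channel bandwidth from the platform description, as one of its cost terms alongside the per-machine memory limits.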

RETE Deployment Process

Configuration scripts for
o Deployment
o Communication middleware

Derived by automated code generation
o Using Eclipse technology: EMF-IncQuery + Xtend


DISTRIBUTED PERFORMANCE BENCHMARKS

The Train Benchmark

Model validation workload:
o User edits the model
o Instant validation of well-formedness constraints
o Model is repaired accordingly

Scenario:
o Load
o Check
o Edit
o Re-Check

Models:
o Randomly generated
o Close to real world instances
o Following different metrics
o Customized distributions
o Low number of violations

Queries:
o Two simple queries (<2 objects, attributes)
o Two complex queries (4-7 joins, negation, etc.)
o Validated match sets

[Figure: benchmark phases on the instance model: Read, Check, Edit, ReCheck, with a 100x repetition marker; batch validation vs. incremental validation]
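The Load, Check, Edit, Re-Check scenario can be sketched as a small timing harness. The Python below is not the Train Benchmark itself: the generated model, the 1% violation rate, the function names and the use of the `posLength` constraint (segment length must not be `<= 0`) as the only query are all assumptions chosen to keep the example short. It re-checks once in batch style and once incrementally so the two results can be compared.

```python
import random
import time


def load(n, seed=0):
    """Generate a hypothetical model: segment name -> length, with few violations."""
    rng = random.Random(seed)
    return {f"seg{i}": (-1 if rng.random() < 0.01 else rng.randint(1, 1000))
            for i in range(n)}


def check(model):
    """Batch validation: scan the whole model for segments with length <= 0."""
    return {seg for seg, length in model.items() if length <= 0}


def run_scenario(n=10_000, repairs=10):
    timings = {}

    start = time.perf_counter()
    model = load(n)
    timings["load"] = time.perf_counter() - start

    start = time.perf_counter()
    violations = check(model)                      # first validation
    timings["check"] = time.perf_counter() - start

    start = time.perf_counter()
    edited = sorted(violations)[:repairs]
    for seg in edited:
        model[seg] = 1                             # repair the violation
    timings["edit"] = time.perf_counter() - start

    start = time.perf_counter()
    batch = check(model)                           # batch re-check: full rescan
    timings["recheck_batch"] = time.perf_counter() - start

    start = time.perf_counter()
    fixed = {seg for seg in edited if model[seg] > 0}
    incremental = violations - fixed               # only edited elements are examined
    timings["recheck_incremental"] = time.perf_counter() - start

    return batch, incremental, timings


batch, incremental, timings = run_scenario()
print(batch == incremental)
for phase, seconds in timings.items():
    print(f"{phase}: {seconds:.6f} s")
```

The batch re-check cost grows with the model size, while the incremental re-check only looks at the edited elements; measured times will vary by machine, so no figures are quoted here.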

Evaluation of distributed scalability

Extensions to previous work (single workstation)
o Generation of large instance models
o Distributed, parallel loading of models
o Distributed transformation and validation

                                Benchmark             Distributed benchmark
Model size                      1K – 13M              1K – 88M
Load method                     Batch                 Distributed, parallel
Transformation and validation   Single workstation    Multiple servers

Batch graph scenario
• Load and first validation: load the graph to the databases and execute the query
• Transformation: query the graph and delete some elements
• Revalidation: execute the query

Incremental scenario – IncQuery-D
• Load and first validation: load the graph to the databases and initialize the Rete net and retrieve the results
• Transformation: incrementally query the graph and delete some elements, propagate the changes
• Revalidation: retrieve the results from the Rete net

[Figure: a GraphML model is loaded into the DB shards; the Rete net and the result set are shown for the phases Load and first validation, Transformation and Revalidation]

Benchmark environment

Private cloud
Different DBMSs
Query
o The DBMS’s own query language
o IncQuery-D

SPARQL, Gremlin

Load and first validation

[Figure: runtime [s] (log scale, 1 to 10000) vs. model size [million elements] (0.03 to 55.75, doubling at each step); legend text: 4store, IncQuery-D, Titan, IncQuery-D, 4store]

55M model: approx. 15 minutes

Rete network’s initialization overhead pays off

Model modification

[Figure: runtime [s] (log scale, 1 to 1000) vs. model size [million elements] (0.03 to 55.75, doubling at each step); legend text: 4store, IncQuery-D, Titan, IncQuery-D, 4store]

1. Elementary model query
2. Model modification
– Query from the Rete network’s indexer
– Propagation of modifications is fast

2 orders of magnitude

Revalidation

[Figure: runtime [s] (log scale, up to 1000) vs. model size [million elements] (0.03 to 55.75, doubling at each step), with the memory limit marked; legend text: 4store, IncQuery-D, Titan, IncQuery-D, 4store]

Sub-second response time for models with 88M elements

Different characteristics

Benchmarking Conclusions

Memory consumption
o Single workstation: 13M model, 4 GB
o Cloud of four servers: 55M model, <4×8 GB

Runtime
o Same order of magnitude and similar characteristics to the single workstation tool

INCQUERY-D is scalable and significantly more efficient for query evaluation than the native query engines in 4store, Titan and Neo4j


Conclusions