Constraint Databases and Temporal Reasoning

Peter Revesz

University of Nebraska-Lincoln

Constraint Database

A generalization of relational database
A set of constraint tables/relations
Each constraint table consists of a set of constraint tuples
Each constraint tuple is a conjunction of allowed types of constraints over a set of variables and parameters

Example Constraint Databases

Linear Constraint Databases (any conjunction of linear constraints over the attribute variables)
Parametric Rectangles
Periodic Parametric Rectangles
Parametric 2-Spaghetti
Geometric Transformations
etc.

Parametric Rectangles (P□)

Scheme: (X, Y, T)
Variables: x, y, t
Constraints:

x >= a1 t + b1
x <= a2 t + b2
y >= c1 t + d1
y <= c2 t + d2
from ≤ t ≤ to

Example: Parametric Rectangle

X Y T

[2t, 3t+5] [2t, 2t+5] [0,3]
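A minimal Python sketch of how the example tuple could be represented and evaluated at the logical level. The class name `ParametricRectangle`, its field names, and its methods are illustrative assumptions, not part of any system named in these slides; each bound is stored as a (slope, intercept) pair of a linear function of t.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ParametricRectangle:
    """Hypothetical container: every bound is (slope, intercept) in t."""
    x_lo: tuple
    x_hi: tuple
    y_lo: tuple
    y_hi: tuple
    t_from: float
    t_to: float

    def snapshot(self, t):
        """Return ((x_min, x_max), (y_min, y_max)) at time t, or None outside [t_from, t_to]."""
        if not (self.t_from <= t <= self.t_to):
            return None
        at = lambda f: f[0] * t + f[1]
        return (at(self.x_lo), at(self.x_hi)), (at(self.y_lo), at(self.y_hi))

    def contains(self, x, y, t):
        """True if the point (x, y) lies inside the rectangle at time t."""
        snap = self.snapshot(t)
        if snap is None:
            return False
        (x0, x1), (y0, y1) = snap
        return x0 <= x <= x1 and y0 <= y <= y1

# The example tuple: X = [2t, 3t+5], Y = [2t, 2t+5], T = [0, 3]
example = ParametricRectangle((2, 0), (3, 5), (2, 0), (2, 5), 0, 3)
```

Calling `example.snapshot(2)` gives the rectangle occupied at t = 2, and `example.contains(x, y, t)` answers membership of a single (x, y, t) point in the infinite logical-level relation.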

Periodic Parametric Rectangle

A PP□ is a parametric rectangle that repeats its motion every period (p) until t = end (e)

The time interval of a PP□ is written [from,to]_{p,end}. A PP□ is equivalent to a P□ when p = -1.

Example: [80,82]_{10,111} means the union of

[80,82], [90,92], [100,102], [110,111]
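The expansion of a periodic interval into its individual occurrences can be sketched as follows. The function name `expand_periodic` is an assumption made for illustration; the treatment of p = -1 follows the slide's remark that such a rectangle is not periodic.

```python
def expand_periodic(t_from, t_to, period, end):
    """List the intervals denoted by [t_from, t_to] repeated every `period` until `end`."""
    if period == -1:
        # Non-periodic case: a plain parametric rectangle interval.
        return [(t_from, t_to)]
    length = t_to - t_from
    intervals = []
    start = t_from
    while start <= end:
        # The last occurrence is cut off at `end`.
        intervals.append((start, min(start + length, end)))
        start += period
    return intervals

print(expand_periodic(80, 82, 10, 111))
```

For the slide's example this lists the four intervals [80,82], [90,92], [100,102], and the truncated [110,111].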

Example: Periodic Parametric Rectangle

X Y T

[2t’, 3t’+5] [2t’, 2t’+5] [0,3]_{5,498}

t’ = t mod 5

N-dimensional PP□

[Figure: a PP□ drawn in (x, y, z) space]

Parametric 2-Spaghetti

Constraint Level:
Scheme 1: (Ax, Ay, Bx, By, Cx, Cy, From, To); parameter: t
Scheme 2: (X, Y, T); variables: x, y, t; conjunction of constraints over x, y, t that define a changing triangle ABC

Logical Level:
Variables: x, y, t
Tuples (x,y,t) such that (x,y) is in ABC at time t

Ax  Ay   Bx     By     Cx   Cy  From  To
3   3-t  4+.5t  4-.5t  5+t  3   0     10
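The link between the Scheme 1 row and the logical level can be sketched in Python: instantiate the three vertices at time t, then test whether (x, y) falls inside the triangle. The dictionary layout `ROW` and the function names are assumptions for illustration; the membership test uses the standard sign-of-cross-product method.

```python
# Each coordinate is (slope, intercept) of a linear function of the parameter t.
ROW = {
    "A": ((0, 3), (-1, 3)),      # Ax = 3,      Ay = 3 - t
    "B": ((0.5, 4), (-0.5, 4)),  # Bx = 4 + .5t, By = 4 - .5t
    "C": ((1, 5), (0, 3)),       # Cx = 5 + t,  Cy = 3
    "from": 0,
    "to": 10,
}

def vertices_at(row, t):
    """Vertices A, B, C of the moving triangle at time t."""
    point = lambda fx, fy: (fx[0] * t + fx[1], fy[0] * t + fy[1])
    return tuple(point(*row[name]) for name in ("A", "B", "C"))

def _cross(o, u, v):
    return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])

def contains(row, x, y, t):
    """True if (x, y, t) belongs to the logical-level relation of the row."""
    if not (row["from"] <= t <= row["to"]):
        return False
    a, b, c = vertices_at(row, t)
    signs = [_cross(a, b, (x, y)), _cross(b, c, (x, y)), _cross(c, a, (x, y))]
    # Inside (or on the border) when the three signs do not disagree.
    return not (any(s < 0 for s in signs) and any(s > 0 for s in signs))
```

At t = 0 the triangle has vertices (3,3), (4,4), (5,3), so `contains(ROW, 4, 3.5, 0)` holds while `contains(ROW, 4, 5, 0)` does not.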

Expressive Power

[Figure: comparison of the expressive power of LCDB, PP□, and P∆]

Geometric Transformations [Chomicki & Revesz Time’99]

A triple (V, I, f):
V is a d-dimensional representative spatial object
I is the time domain
f : V x t → V’ is a parametric geometric transformation function that maps V at any time t in I to some d-dimensional spatial object V’

Example: Geometric Transformation

Whale(V, I, f) where
V = [20, 30] x [20, 30]
I = [0, 20]

$f(x, y, t) = \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} -t \\ 0 \end{pmatrix}$

Any (x,y) at time t is mapped to (x-t, y).

The whale moves west at a speed of 1 unit per time unit.
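A small Python sketch of the Whale triple, assuming the transformation is the translation (x, y) to (x - t, y) stated above. The function names `f` and `whale_extent` are illustrative; the extent is obtained by mapping two opposite corners of V.

```python
V = ((20, 30), (20, 30))   # representative object: [20, 30] x [20, 30]
I = (0, 20)                # time domain

def f(x, y, t):
    """Parametric transformation: translate west by t units."""
    return (x - t, y)

def whale_extent(t):
    """Bounding intervals of the whale at time t, or None outside I."""
    if not (I[0] <= t <= I[1]):
        return None
    (x0, x1), (y0, y1) = V
    (nx0, ny0), (nx1, ny1) = f(x0, y0, t), f(x1, y1, t)
    return (nx0, nx1), (ny0, ny1)
```

At t = 20 the whale occupies [0, 10] x [20, 30], that is, it has moved 20 units to the west.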

Interoperability of Data Models

Scheme=Vars                 Extreme Points          Geo. Transform
inequality                  Rectangles              identity, rectangle reference
x, y linear; t inequality   Worboys                 identity, polygon reference
xi linear in t inequality   parametric rectangles   scaling + translation, rectangle reference
x,y linear for each t       parametric 2-spaghetti  affine motion, polygon reference

Raster Parametric Rectangle

Given the initial and final snapshots at times ti and tf:

first, rectangulate the smaller snapshot;

next, rectangulate the bigger snapshot, ensuring that both have the same number of rows of rectangles.

Queries use the logical level.

The constraint level is for storage.

Since the logical level is the same*** as for relational databases, relational queries are possible.

***except that the logical level can be an infinite set of points for a constraint database

Constraint Query Languages

Relational algebra
SQL
Max op in LCDB (linear programming)
Datalog
Spatial operators: Area, Buffer, etc.
Spatio-temporal operators:
Animation (visualization by video)
Block (find movement when …)
Collide (find movement bouncing …)
Deflect (find movement in gravitational field …)
etc.

A Query for Parametric Rectangles

Region(X, Y, T, temperature)
Clouds(X, Y, T, humidity)

Where and when will it snow?

Relational algebra:

PROJECT x,y,t (SELECT(temperature < 32) Region)
∩
PROJECT x,y,t (SELECT(humidity > 80) Clouds)

SQL:

SELECT x, y, t FROM Region WHERE temperature < 32
INTERSECT
SELECT x, y, t FROM Clouds WHERE humidity > 80

Datalog:

Snow(x, y, t) :- Region(x, y, t, temperature), Clouds(x, y, t, humidity), temperature < 32, humidity > 80.

A Query for Parametric 2-Spaghetti

Region(Ax, Ay, Bx, By, Cx, Cy, F, T, temperature)
Clouds(Ax, Ay, Bx, By, Cx, Cy, F, T, humidity)

Where and when will it snow?

Same as before

Query Language Issues

High-level, user-friendly
Expressiveness
Termination
Closed-Form
Complexity
Data complexity: fixed queries (as in a library/menu) and variable size input relations

Theorem: Parametric Rectangles are Closed under Relational Algebra Queries

R1 = ( [t+6, t+20], [5, 15], [0, 10] )
R2 = ( [2t, 3t+10], [0, 10], [0, 10] )

Proof: Intersection

[Figure: relative positions of r1 and r2 during the intervals 0 ≤ t < 5, 5 ≤ t < 6, and 6 ≤ t ≤ 10]
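A hedged Python sketch of the intersection step for one spatial dimension, applied to the x-intervals of R1 and R2. It splits the common time span wherever two bounds cross and, on each piece, keeps the larger lower bound and the smaller upper bound, so every piece is again a parametric interval. The function names and the (slope, intercept) encoding are assumptions for illustration; in the y-dimension the bounds are constant, so the intersection is simply [5, 10].

```python
from fractions import Fraction

def crossing(f, g):
    """Time at which linear functions f = (a, b) and g = (c, d) meet, or None if parallel."""
    (a, b), (c, d) = f, g
    if a == c:
        return None
    return Fraction(d - b, a - c)

def value(f, t):
    return f[0] * t + f[1]

def intersect_dim(lo1, hi1, lo2, hi2, t_from, t_to):
    """Intersect [lo1(t), hi1(t)] with [lo2(t), hi2(t)] over [t_from, t_to].

    Returns a list of (start, end, lower, upper) pieces with linear bounds.
    """
    bounds = (lo1, hi1, lo2, hi2)
    cuts = {Fraction(t_from), Fraction(t_to)}
    for f in bounds:
        for g in bounds:
            t = crossing(f, g)
            if t is not None and t_from < t < t_to:
                cuts.add(t)
    cuts = sorted(cuts)
    pieces = []
    for start, end in zip(cuts, cuts[1:]):
        mid = (start + end) / 2
        # No bounds cross inside a piece, so the midpoint decides the order.
        lower = max((lo1, lo2), key=lambda f: value(f, mid))
        upper = min((hi1, hi2), key=lambda f: value(f, mid))
        if value(lower, mid) <= value(upper, mid):
            pieces.append((start, end, lower, upper))
    return pieces

# x-dimension of R1 = [t+6, t+20] and R2 = [2t, 3t+10] on T = [0, 10]
pieces = intersect_dim((1, 6), (1, 20), (2, 0), (3, 10), 0, 10)
```

The three pieces returned are [t+6, 3t+10] on [0, 5], [t+6, t+20] on [5, 6], and [2t, t+20] on [6, 10], which matches the three time intervals listed on the slide.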

Complement of a Parametric Rectangle

Theorem: Relational algebra can be evaluated in PTIME data complexity for Parametric Rectangle Databases.

Similar theorems for other combinations of constraint databases and query languages.

Spatial Operators

Area Operator

computes the area of a relation R at a time ti, denoted by area(R, ti)
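A minimal sketch of area(R, ti) in Python, under the simplifying assumption that the parametric rectangles of R do not overlap at time ti (overlapping rectangles would need a union-area computation, which is not shown). The tuple layout and the function name are illustrative, not taken from PReSTO.

```python
def area(relation, t):
    """Sum of rectangle areas at time t.

    Each tuple is (x_lo, x_hi, y_lo, y_hi, t_from, t_to) with linear bounds
    given as (slope, intercept). Assumes the rectangles are disjoint at time t.
    """
    at = lambda f: f[0] * t + f[1]
    total = 0
    for x_lo, x_hi, y_lo, y_hi, t_from, t_to in relation:
        if t_from <= t <= t_to:
            width = max(at(x_hi) - at(x_lo), 0)
            height = max(at(y_hi) - at(y_lo), 0)
            total += width * height
    return total

# The parametric rectangle X = [2t, 3t+5], Y = [2t, 2t+5], T = [0, 3]
R = [((2, 0), (3, 5), (2, 0), (2, 5), 0, 3)]
```

For this relation, at t = 2 the rectangle is [4, 11] x [4, 9], so the area is 7 x 5 = 35.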

Animation

Look at the examples.

Other Spatiotemporal Operators

Block Operator

computes the change in a relation R2, as a result of being “blocked” by R1, until the time tk
denoted by block(R1, R2, tk)
if we view a parametric rectangle as a set of moving points, it can be defined as the set of moving points in R2 that did not intersect with R1 before tk

Other Spatiotemporal Operators

Block Operator (contd.)

the second figure shows the result of difference operation and the third shows the result of block operation at t = 15

Other Spatio-temporal Operators

Block Operator (contd.) evaluation

instantiate R1, R2 at tk

successive partitioning maximum depth

Other Spatio-temporal Operators

Collide Operator

computes the result of a collision between two (moving) parametric rectangles (which do not grow or shrink)
denoted by collide(r1, r2)
assume that the collision is elastic
view the parametric rectangles as spherical bodies, with given fixed masses

Other Spatio-temporal Operators

Collide Operator (contd.)

determine the line joining centers (LC) at the time of collision
resolve velocities along LC

Other Spatio-temporal Operators

Collide Operator (contd.)

initial velocity (Vi)
LC velocity before collision (ViLC)
LC velocity after solving “head-on” collision (VfLC)
final velocity (Vf) = vector sum of [VfLC - ViLC, Vi]
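The velocity update described in the collide slides can be sketched in Python as below. The one-dimensional elastic collision formulas used for the "head-on" step are the standard ones from mechanics and are an assumption here, since the slides do not spell them out; the function name and argument layout are also illustrative.

```python
import math

def collide_velocities(c1, v1, m1, c2, v2, m2):
    """Final velocities of two bodies after an elastic collision.

    c1, c2: centers at the time of collision; v1, v2: initial velocities (Vi);
    m1, m2: masses. All vectors are (x, y) pairs.
    """
    # Unit vector along the line joining the centers (LC).
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    dist = math.hypot(dx, dy)
    nx, ny = dx / dist, dy / dist
    # Velocity components along LC before the collision (ViLC).
    u1 = v1[0] * nx + v1[1] * ny
    u2 = v2[0] * nx + v2[1] * ny
    # Head-on (1-D) elastic collision along LC (VfLC).
    w1 = ((m1 - m2) * u1 + 2 * m2 * u2) / (m1 + m2)
    w2 = ((m2 - m1) * u2 + 2 * m1 * u1) / (m1 + m2)
    # Vf = Vi + (VfLC - ViLC), applied along LC.
    vf1 = (v1[0] + (w1 - u1) * nx, v1[1] + (w1 - u1) * ny)
    vf2 = (v2[0] + (w2 - u2) * nx, v2[1] + (w2 - u2) * ny)
    return vf1, vf2
```

With equal masses and a head-on approach along the x-axis, for example `collide_velocities((0, 0), (1, 0), 1, (1, 0), (-1, 0), 1)`, the two bodies simply exchange velocities.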

PReSTO System Simulation

PReSTO System

PReSTO System

Parametric Rectangles as SpatioTemporal Objects
Visual C++ implementation
GUI
Queries: operator icons provided on the interface

Other systems: CCUBE, DEDALE, MLPQ, etc.

Uses of Constraint Databases

Decision Support Systems example: fire control

Approximate Reasoning example: approximation of Time Series (t1,y1), (t2,y2), …., (tn,yn)

Timed Automata Verification

Animation

Forest Fire Control Problem

Problem Description

a fire starts in some portion of a forest
fire-fighters have several options to control the fire spread
the option which minimizes damage must be chosen

Forest Fire Control Problem

Problem Description (contd.)

assume that we can predict the fire spread from environmental conditions
two possible strategies of dropping foam could be:

Forest Fire Control Problem

Problem Solution (contd.)

after applying the block and area operators in PReSTO, we get the following results:

Strategy  Burnt Area
None      16056
1         12426
2         7315

Piecewise Linear Approximation

Relational database, relation Temperature:

t  Temp
0  75
1  77
2  86
3  87
4  90

Constraint database, relation Temperature, obtained by PLA (parameter = 3):

t  Temp
t  temp  temp=2t+75, t>=0, t<=1.
t  temp  temp=9t+68, t>1, t<=2.
t  temp  temp=2t+82, t>2, t<=4.

US Precipitation (6,726 Stations, 96 Months)

(inch) 0.1 0.2 0.4 0.8 0.16 0.32 0.64

Average # of pieces 89.03 84.29 76.09 63.22 45.72 25.30 7.84

Correlation coefficient 0.9999 0.9999 0.9993 0.9756 0.9748 0.8775 0.6424

Querying Piecewise Linear Approximations

For the relation Temperature(Temp, t), find the temperatures at time 1.5 :

select Temp from Temperature where t = 1.5;

For the relation Temperature(Temp, t), find the lowest temperature:

select min(Temp) from Temperature;
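A small Python sketch of how these two queries could be answered directly on the three constraint tuples temp = 2t+75 (0 <= t <= 1), temp = 9t+68 (1 < t <= 2), and temp = 2t+82 (2 < t <= 4). The list layout and function names are assumptions for illustration. For the minimum, open left endpoints are treated as closed, which gives the greatest lower bound of each piece.

```python
# (slope, intercept, t_start, t_end, start_is_closed)
TEMPERATURE = [
    (2, 75, 0, 1, True),
    (9, 68, 1, 2, False),
    (2, 82, 2, 4, False),
]

def temp_at(t):
    """Temperature at time t according to the piecewise linear approximation."""
    for a, b, t0, t1, closed in TEMPERATURE:
        after_start = (t0 <= t) if closed else (t0 < t)
        if after_start and t <= t1:
            return a * t + b
    return None  # t lies outside every piece

def min_temp():
    """Lowest temperature: a linear piece is extremal at an endpoint."""
    return min(min(a * t0 + b, a * t1 + b) for a, b, t0, t1, _ in TEMPERATURE)
```

Here `temp_at(1.5)` evaluates the second piece, 9(1.5) + 68 = 81.5, a time point that has no row in the original relational table.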

Update of Piecewise Linear Approximation

Timed Automata

[Figure: automaton with a single state TOURIST and two transitions:
d >= 100 ? e := e + 198, d := d - 100
y >= 30,000 ? e := e + 198, y := y - 30000]

Timed Automata Datalog

T(d, e’, y’) :- T(d, e, y), y >= 30000, y’ = y - 30000, e’ = e + 198.
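A hedged sketch of bottom-up evaluation of this single rule in Python. It works on ground facts only, which is a simplification: a constraint database would apply the rule to constraint tuples instead. The function name `derive` and the starting fact are assumptions for illustration.

```python
def derive(facts):
    """Apply T(d, e', y') :- T(d, e, y), y >= 30000, y' = y - 30000, e' = e + 198
    repeatedly until no new fact is produced (a fixpoint)."""
    known = set(facts)
    frontier = list(known)
    while frontier:
        d, e, y = frontier.pop()
        if y >= 30000:
            new_fact = (d, e + 198, y - 30000)
            if new_fact not in known:
                known.add(new_fact)
                frontier.append(new_fact)
    return known

# Hypothetical starting fact: d = 0, e = 0, y = 70000
reachable = derive({(0, 0, 70000)})
```

Evaluation terminates on such input because y strictly decreases with every rule application and the rule stops firing once y drops below 30000.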

Value-by-Area Cartogram

A value-by-area cartogram is a map in which each display area is proportional to some geographically distributed “value”.

Contiguous cartograms Noncontiguous cartograms

Example: 1990 USA Population Distribution

Some Disadvantages of Current Animation Algorithms

Current cartogram animation algorithms run very slowly.

Current cartogram animation algorithms require pre-computing and saving the snapshots. This greatly limits the number of snapshots that can be displayed in animation.

Cartogram Animation

Two snapshots of value-by-area cartogram animation for U.S. population in 1970 and 1990.

Volume Animation

Two snapshots of daily mean temperature in the U.S. during a winter day and a summer day.

Constraint-Based Animation Methods

Avoid pre-computation

Support animation with large number of snapshots

Run fast

Methods for Animation

Parallel Method Serial Method Hybrid Method

Parallel Method

Serial Method

Hybrid Method

Comparison of the three methods

Definitions

Cell value (Vi)

Actual cell area (ACi)

Average cell density (DAVG)

Desired cell area (ADi=Vi/ DAVG)

Cell distortion (Δi=| ADi - ACi |/ ADi )
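The definitions can be tied together in a short Python sketch. It assumes that the average cell density DAVG is the total value divided by the total actual area, which the slide does not state explicitly; the function name is also illustrative.

```python
def cell_distortions(values, actual_areas):
    """Return the distortion Δi = |ADi - ACi| / ADi for every cell.

    values: cell values Vi (assumed positive); actual_areas: actual areas ACi.
    Assumption: DAVG = sum(Vi) / sum(ACi).
    """
    d_avg = sum(values) / sum(actual_areas)
    distortions = []
    for v, ac in zip(values, actual_areas):
        ad = v / d_avg                  # desired cell area ADi
        distortions.append(abs(ad - ac) / ad)
    return distortions

# Two equally sized cells holding values 30 and 10
print(cell_distortions([30, 10], [20, 20]))
```

In this two-cell example the desired areas are 30 and 10, so the first cell must grow by a third of its desired area and the second must shrink to half of its current area.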

A Naïve Cartogram Algorithm

For each cell, calculate ACi and ADi

Repeat

For each cell Ci, begin

compute its ACi

update cell Ci’s corner vertices

update all other corner vertices

End for.

Until, for every cell i, |ACi-ADi|/ADi <= ε

Effective Range

A Cell Ci’s effective range is the area of the map that contains all corner vertices that could change while inflating/deflating cell Ci.

Move of the Corner Vertices

When enlarging/shrinking a cell, a corner vertex is moved along the line connecting the cell center and this vertex.

Move of the Corner Vertices

the new coordinate for cell (x,y) is:

Finding the Center of a Cell

For a polygonal cell with corner vertices (x1,y1), (x2,y2), …, (xn,yn), the center (x,y) is calculated as follows:

$x = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad y = \frac{1}{n} \sum_{i=1}^{n} y_i$

Shape Distortion for the Originally Square Cell Division

A New Cartogram Algorithm

For each cell, calculate ACi and ADi

Repeat

Sort the cells by distortion Δi

For each cell Ci (in sorted order), begin

compute its ACi

update cell Ci’s corner vertices

update other corner vertices only if they are inside cell Ci’s effective range

End for.

Until for all cell i, |ACi-ADi|/ADi <= ε

Query Optimization Issues

Indexing: new data structures and algorithms
Algebraic Optimization: algebraic rewriting rules
Recursive query processing: pushing down recursively constraints; termination control: subsumption testing

Indexing: Parametric R-trees

Assume that airplanes, clouds, etc. are (approximated by) Parametric Rectangles

“Find all the airplanes that are currently traveling within some cloud”

Parametric R-trees

Like an R-tree, a PR-tree is a height-balanced tree in which each node has between M/2 and M children, where M is a constant that depends on the page size.

Def: Let S be a set of d-dimensional parametric rectangles. Then we call a parametric rectangle r the minimum bounding parametric rectangle (MBPR) of S if and only if

1. r contains all parametric rectangles in S

2. The area of the projection of r onto the (xi,t) space is minimized for each i = 1, …, d.

Parametric Rectangles: r4-r11; MBPRs: r1-r3

ID   x[      x]      y[     y]      t[  t]
r1   30+7t   90+7t   50+5t  100+6t  0   10
r2   0       50      10+5t  55+5t   0   10
r3   60-6t   100-5t  0      40+t    0   10
r4   30+7t   50+7t   80+6t  100+6t  0   10
r5   30+12t  40+12t  50+5t  65+5t   0   10
r6   75+6t   90+7t   70+8t  80+8t   0   10
r7   0       15      40+5t  55+5t   0   9
r8   0       12      20+4t  40+4t   1   10
r9   30      50      10+7t  20+7t   0   10
r10  80-5t   100-5t  2t     20+3t   0   10
r11  60-6t   70-6t   30-3t  40-2t   0   10

[Figure: snapshots at time t=0 and at time t=10; legend: MBPRs, parametric rectangles]

Compute MBPR

Let R be the MBPR of S.

Let tmin and tmax denote the start and the end time of R. Then

$t_{min} = \min_{r \in S} (r.t^{[}), \qquad t_{max} = \max_{r \in S} (r.t^{]})$

The projection of each parametric rectangle in S onto the (xi,t) space corresponds to a trapezium with four extreme points, as shown in the figure, so the projection of S onto the (xi,t) space corresponds to a set Si of 4|S| extreme points in the (xi,t) space.

Let Hi be the convex hull of Si. The lower and the upper bounds of R for the xi dimension are extensions of some edges of the convex hull Hi, which can be computed efficiently.

CONVEX HULL in the (xi,t) space

MBPR in the (xi,t) space

Theorem: The minimum bounding parametric rectangle of M d-dimensional parametric rectangles can be computed in O(dM log M) time.
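A hedged Python sketch of the MBPR computation for a single dimension. It relies on an observation that is an assumption of this sketch rather than a statement from the slides: the area between two lines over [tmin, tmax] equals the width of the interval times their gap at the midpoint time, so the area-minimizing lower bound is the extension of the lower-hull edge that spans the midpoint, and symmetrically for the upper bound. The hull is built with Andrew's monotone chain, whose sorting step dominates the running time. All names are illustrative, and inputs are assumed to be integers or Fractions.

```python
from fractions import Fraction

def _cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def _lower_hull(points):
    """Lower convex hull of (t, x) points, left to right (monotone chain)."""
    hull = []
    for p in sorted(set(points)):
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull

def _supporting_line(points, t_mid):
    """(slope, intercept) of the lower-hull edge whose time span contains t_mid."""
    hull = _lower_hull(points)
    for (t0, x0), (t1, x1) in zip(hull, hull[1:]):
        if t0 < t1 and t0 <= t_mid <= t1:
            slope = Fraction(x1 - x0, t1 - t0)
            return slope, x0 - slope * t0
    raise ValueError("need extreme points at two distinct times")

def mbpr_1d(rects):
    """MBPR in one dimension.

    rects: list of (lo, hi, t_from, t_to), with lo and hi as (slope, intercept).
    Returns (lower, upper, t_min, t_max).
    """
    t_min = min(r[2] for r in rects)
    t_max = max(r[3] for r in rects)
    t_mid = Fraction(t_min + t_max, 2)
    lows, highs = [], []
    for lo, hi, t0, t1 in rects:
        for t in (t0, t1):
            lows.append((t, lo[0] * t + lo[1]))
            # Negate upper bounds so the upper hull becomes a lower hull.
            highs.append((t, -(hi[0] * t + hi[1])))
    lower = _supporting_line(lows, t_mid)
    neg = _supporting_line(highs, t_mid)
    return lower, (-neg[0], -neg[1]), t_min, t_max

# x-dimension of r4, r5, r6 from the table
x_rects = [((7, 30), (7, 50), 0, 10), ((12, 30), (12, 40), 0, 10), ((6, 75), (7, 90), 0, 10)]
# y-dimension of r7, r8, r9 from the table
y_rects = [((5, 40), (5, 55), 0, 9), ((4, 20), (4, 40), 1, 10), ((7, 10), (7, 20), 0, 10)]
```

On the table's data, `mbpr_1d(x_rects)` yields the bounds 30+7t and 90+7t listed for r1, and `mbpr_1d(y_rects)` yields 10+5t and 55+5t, the y-bounds listed for r2.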

Searching

The Search algorithm of a PR-tree is based on the algorithm to check whether two parametric rectangles intersect at any time instance.

Given two parametric rectangles r1 and r2, let Πi,t(r1) and Πi,t(r2) denote the projections of r1 and r2 onto the (xi,t) space. Then r1 and r2 intersect if and only if there is at least one time instance t1 such that, for each i = 1, …, d, Πi,t(r1) and Πi,t(r2) intersect at t1.

Whether two d-dimensional parametric rectangles intersect can be checked in O(d) time.
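One way to obtain such an O(d) check, sketched in Python under the assumption that each rectangle is non-empty throughout its own time interval: in every dimension, overlap at time t means lo1(t) <= hi2(t) and lo2(t) <= hi1(t). Each condition is a linear inequality in t, so it narrows the candidate time window in constant time. The function names and the data layout are illustrative.

```python
from fractions import Fraction

def _restrict(f, g, lo, hi):
    """Shrink [lo, hi] to the times where f(t) <= g(t); f, g are (slope, intercept)."""
    a, b = f[0] - g[0], f[1] - g[1]      # condition: a*t + b <= 0
    if a == 0:
        return (lo, hi) if b <= 0 else None
    root = Fraction(-b, a)
    if a > 0:
        hi = min(hi, root)               # holds for t <= root
    else:
        lo = max(lo, root)               # holds for t >= root
    return (lo, hi) if lo <= hi else None

def intersects(r1, r2):
    """r = (dims, (t_from, t_to)), dims = list of (lo, hi) linear bounds per dimension."""
    dims1, (s1, e1) = r1
    dims2, (s2, e2) = r2
    window = (max(s1, s2), min(e1, e2))
    if window[0] > window[1]:
        return False
    for (lo1, hi1), (lo2, hi2) in zip(dims1, dims2):
        for f, g in ((lo1, hi2), (lo2, hi1)):
            window = _restrict(f, g, *window)
            if window is None:
                return False
    return True

R1 = ([((1, 6), (1, 20)), ((0, 5), (0, 15))], (0, 10))
R2 = ([((2, 0), (3, 10)), ((0, 0), (0, 10))], (0, 10))
r7 = ([((0, 0), (0, 15)), ((5, 40), (5, 55))], (0, 9))
r9 = ([((0, 30), (0, 50)), ((7, 10), (7, 20))], (0, 10))
```

With these tuples, `intersects(R1, R2)` is true, while `intersects(r7, r9)` is false because the x-intervals [0, 15] and [30, 50] never overlap.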

INSERTION

The insertion algorithm of PR-trees is an extension of the insertion algorithm of R-trees.

Go down the tree to find an appropriate leaf node in which to insert the new index record, split the nodes that overflow, then propagate the box upward.

Appropriate subtree: at each level we choose the child whose bounding parametric rectangle needs the least volume enlargement to include the new tuple.

Let r = (x1[, x1], …, xd[, xd], t[, t]) be a d-dimensional parametric rectangle and P be the corresponding polyhedron in the (x1,…,xd,t) space. The volume of r is the integral of the area function as follows:

Cont…

Let the children of a non-leaf node be E1, …, Ep and rj be the bounding parametric rectangle of Ej. Suppose we would like to insert a new tuple T into one child of the node.

Let MBPR(rj,T) denote the minimum bounding parametric rectangle of rj and T.

We will choose the child that needs the minimum enlargement to include T, that is

minj enlarge(rj,T)

where enlarge(rj,T) = vol(MBPR(rj,T)) - vol(rj).

Node Splitting

When a new entry is added to a full node with M entries, it is necessary to divide the collection of M+1 entries between two nodes.

The PR-tree node splitting algorithm is an extension of the quadratic split algorithm of the R-tree.

The idea is to choose, as the first elements of the two new groups, the two of the M+1 entries whose minimum bounding parametric rectangle has the largest volume increase, where the volume increase is the volume of their MBPR minus their own volume. Each of the remaining parametric rectangles is inserted into the group that needs less volume enlargement to include it.

Insertion of r12 when 3 is the maximum number of children per node.

This causes the splitting of the root and the height of the tree increases by 1 as shown in figure.

Split r1

Split the root

The process of splitting the root is similar to the splitting of r1. Insertion into a PR-tree can be done in O(log_M N) time, where M is the page size and N is the number of moving objects.

Performance for Fixed Size Moving Objects

Performance For Growing or Shrinking Moving Objects

Conclusions and Future Work

More Systems: experiments on large-scale problems (GIS, bioinformatics); user feedback
Timed Automata Verification: evaluation of Datalog queries (complexity, termination, tuple recognition)
Approximate Evaluation: applied for Datalog queries?
