Constraint Databases and Temporal Reasoning
Peter Revesz
University of Nebraska-Lincoln
Constraint Database
A generalization of the relational database
A set of constraint tables/relations
Each constraint table consists of a set of constraint tuples
Each constraint tuple is a conjunction of allowed types of constraints over a set of variables and parameters
Example Constraint Databases
Linear Constraint Databases (any conjunction of linear constraints over the attribute variables)
Parametric Rectangles
Periodic Parametric Rectangles
Parametric 2-Spaghetti
Geometric Transformations
etc.
Parametric Rectangles (P□)
Scheme: (X, Y, T)
Variables: x, y, t
Constraints:
x >= a1 t + b1
x <= a2 t + b2
y >= c1 t + d1
y <= c2 t + d2
from <= t <= to
Example: Parametric Rectangle
X Y T
[2t, 3t+5] [2t, 2t+5] [0,3]
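The example tuple can be evaluated at a given time to obtain an ordinary rectangle. The following is a minimal sketch, not part of the original slides; the names `Linear`, `ParametricRectangle`, `snapshot`, and `contains` are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Linear(NamedTuple):
    """A bound of the form a*t + b."""
    a: float
    b: float

    def at(self, t: float) -> float:
        return self.a * t + self.b


class ParametricRectangle(NamedTuple):
    x_lo: Linear
    x_hi: Linear
    y_lo: Linear
    y_hi: Linear
    t_from: float
    t_to: float

    def snapshot(self, t: float) -> Optional[Tuple[float, float, float, float]]:
        """Return (x_lo, x_hi, y_lo, y_hi) at time t, or None outside [from, to]."""
        if not (self.t_from <= t <= self.t_to):
            return None
        return (self.x_lo.at(t), self.x_hi.at(t), self.y_lo.at(t), self.y_hi.at(t))

    def contains(self, x: float, y: float, t: float) -> bool:
        """Logical-level membership test for the point (x, y, t)."""
        snap = self.snapshot(t)
        if snap is None:
            return False
        x0, x1, y0, y1 = snap
        return x0 <= x <= x1 and y0 <= y <= y1


# The tuple from the slide: X = [2t, 3t+5], Y = [2t, 2t+5], T = [0, 3].
example = ParametricRectangle(Linear(2, 0), Linear(3, 5), Linear(2, 0), Linear(2, 5), 0, 3)
```

At t = 1 the snapshot is the rectangle [2, 8] x [2, 7]; outside [0, 3] the tuple contributes no points.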
Periodic Parametric Rectangle
A PP□ is a parametric rectangle that repeats its motion every period (p) until t = end (e)
A PP□ is equivalent to a P□ when p = -1
Notation for the time interval: [from,to]p,end
Example: [80,82]10,111 means the union of
[80,82], [90,92], [100,102], [110,111]
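The periodic interval notation can be unfolded mechanically. The sketch below is illustrative only; the function name `expand_periodic` is hypothetical, and treating p = -1 as "no repetition" follows the remark that a PP□ with p = -1 is an ordinary P□.

```python
def expand_periodic(t_from, t_to, period, end):
    """Unfold [from,to]p,end into the list of plain intervals it denotes."""
    if period == -1:
        # Non-periodic case: the interval occurs once.
        return [(t_from, t_to)]
    if period <= 0:
        raise ValueError("period must be positive or -1")
    intervals = []
    k = 0
    while t_from + k * period <= end:
        start = t_from + k * period
        stop = min(t_to + k * period, end)  # the last repetition is cut at 'end'
        intervals.append((start, stop))
        k += 1
    return intervals
```

For the example on the slide, `expand_periodic(80, 82, 10, 111)` yields the four intervals listed, with the last one truncated at 111.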
Example: Periodic Parametric Rectangle
X Y T
[2t', 3t'+5] [2t', 2t'+5] [0,3]5,498
t' = t mod 5
N-dimensional PP□
[Figure: a PP□ in three dimensions, with axes x, y, z]
Parametric 2-Spaghetti
Constraint Level:
Scheme 1: (Ax, Ay, Bx, By, Cx, Cy, From, To), Parameter: t
Scheme 2: (X, Y, T), Variables: x, y, t. Conjunction of constraints over x, y, t that define a changing triangle ABC
Logical Level:
Variables: x, y, t. Tuples (x,y,t) such that (x,y) is in ABC at time t

Ax   Ay    Bx      By      Cx    Cy   From   To
3    3-t   4+.5t   4-.5t   5+t   3    0      10
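The relationship between the constraint level (Scheme 1) and the logical level can be sketched as follows. This is an illustrative sketch, not the system's implementation; each coordinate is stored as a hypothetical pair (a, b) meaning a*t + b, and the names `vertices_at` and `contains` are assumptions.

```python
# The row from the slide: A = (3, 3-t), B = (4+.5t, 4-.5t), C = (5+t, 3), t in [0, 10].
ROW = ((0, 3), (-1, 3), (0.5, 4), (-0.5, 4), (1, 5), (0, 3), 0, 10)


def vertices_at(row, t):
    """Return the triangle ((Ax,Ay),(Bx,By),(Cx,Cy)) at time t, or None outside [From, To]."""
    ax, ay, bx, by, cx, cy, t_from, t_to = row
    if not t_from <= t <= t_to:
        return None
    ev = lambda c: c[0] * t + c[1]
    return ((ev(ax), ev(ay)), (ev(bx), ev(by)), (ev(cx), ev(cy)))


def side(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def contains(row, x, y, t):
    """Logical-level test: is (x, y) inside triangle ABC at time t?"""
    tri = vertices_at(row, t)
    if tri is None:
        return False
    a, b, c = tri
    p = (x, y)
    d = (side(a, b, p), side(b, c, p), side(c, a, p))
    # Inside (or on the border) when the three signs do not disagree.
    return not (min(d) < 0 and max(d) > 0)
```

At t = 2 the triangle has vertices (3, 1), (5, 3), (7, 3), so the point (5, 2.5) is in the logical relation at that time.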
Expressive Power
[Figure: relative expressive power of LCDB, PP□, and P∆]
Geometric Transformations [Chomicki & Revesz Time’99]
A triple (V, I, f), where
V is a d-dimensional representative spatial object,
I is the time domain, and
f : V × t → V' is a parametric geometric transformation function that maps V at any time t in I to some d-dimensional spatial object V'.
Example: Geometric Transformation
Whale(V, I, f) where
V = [20, 30] x [20, 30]
I = [0, 20]
$f(x, y, t) = \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} -t \\ 0 \end{pmatrix}$
Any (x,y) at time t is mapped to (x-t, y).
The whale moves west with 1 unit speed.
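The whale example can be written out directly. A minimal sketch, with hypothetical function names, that applies the translation to the corners of the reference rectangle V:

```python
def whale_transform(x, y, t):
    """The transformation f from the slide: (x, y) at time t maps to (x - t, y)."""
    return (x - t, y)


def whale_extent(t, v=((20, 30), (20, 30)), interval=(0, 20)):
    """Extent of the whale at time t, or None if t is outside the time domain I."""
    if not interval[0] <= t <= interval[1]:
        return None
    (x0, x1), (y0, y1) = v
    lo = whale_transform(x0, y0, t)
    hi = whale_transform(x1, y1, t)
    return ((lo[0], hi[0]), (lo[1], hi[1]))
```

After 5 time units the whale occupies [15, 25] x [20, 30], i.e. it has moved 5 units west.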
Interoperability of Data Models
Scheme=Vars                   Extreme Points           Geo. Transform
inequality                    Rectangles               identity; rectangle reference
x, y linear; t inequality     Worboys                  identity; polygon reference
xi linear in t; inequality    parametric rectangles    scaling + translation; rectangle reference
x, y linear for each t        parametric 2-spaghetti   affine motion; polygon reference
Raster to Parametric Rectangle
Given the initial and final snapshots at times ti and tf:
First, rectangulate the smaller snapshot.
Next, rectangulate the bigger snapshot, ensuring that both have the same number of rows of rectangles.
Queries use the logical level
Constraint level is for storage.
Since the logical level is the same*** as for relational databases, relational queries are possible.
*** except that the logical level of a constraint database can be an infinite set of points
Constraint Query Languages
Relational algebra
SQL
Max op in LCDB (linear programming)
Datalog
Spatial operators: Area, Buffer, etc.
Spatio-temporal operators:
Animation (visualization by video)
Block (find movement when …)
Collide (find movement bouncing …)
Deflect (find movement in gravitational field …)
etc.
A Query for Parametric Rectangles
Region(X, Y, T, temperature) Clouds(X, Y, T, humidity)
Where and when will it snow?
Relational algebra:
PROJECT x,y,t (SELECT(temperature < 32) Region)
INTERSECT
PROJECT x,y,t (SELECT(humidity > 80) Clouds)

SQL:
SELECT x, y, t
FROM Region
WHERE temperature < 32
INTERSECT
SELECT x, y, t
FROM Clouds
WHERE humidity > 80

Datalog:
Snow(x, y, t) :- Region(x, y, t, temperature),
                 Clouds(x, y, t, humidity),
                 temperature < 32, humidity > 80.
A Query for Parametric 2-Spaghetti
Region(Ax, Ay, Bx, By, Cx, Cy, F, T, temperature)
Clouds(Ax, Ay, Bx, By, Cx, Cy, F, T, humidity)
Where and when will it snow?
Same as before
Query Language Issues
High-level, user-friendly
Expressiveness
Termination
Closed-Form
Complexity
Data complexity: fixed queries (as in a library/menu) and variable-size input relations
Theorem: Parametric Rectangles are Closed under Relational Algebra Queries
R1 = ( [t+6, t+20], [5, 15], [0, 10] )
R2 = ( [2t, 3t+10], [0, 10], [0, 10] )
[Figure: r1 and r2 during the time intervals 0 ≤ t < 5, 5 ≤ t < 6, and 6 ≤ t ≤ 10]
Proof: Intersection
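The intersection of R1 and R2 can be computed by cutting the common time interval wherever two bounds cross and keeping, on each piece, the larger lower bound and the smaller upper bound. The following is a sketch of that idea under stated assumptions, not the proof's exact construction: each bound is a hypothetical pair (a, b) meaning a*t + b, intersections lasting a single instant are ignored, and the result may contain more pieces than strictly necessary.

```python
def crossing(f, g):
    """Time at which lines f and g (each (a, b) = a*t + b) cross, or None if parallel."""
    (a1, b1), (a2, b2) = f, g
    if a1 == a2:
        return None
    return (b2 - b1) / (a1 - a2)


def value(f, t):
    return f[0] * t + f[1]


def intersect(r1, r2):
    """Each rectangle is (dims, (t_from, t_to)); dims lists (lower, upper) per dimension."""
    dims1, (f1, e1) = r1
    dims2, (f2, e2) = r2
    start, end = max(f1, f2), min(e1, e2)
    if start > end:
        return []
    cuts = {start, end}
    for (lo1, hi1), (lo2, hi2) in zip(dims1, dims2):
        for f, g in ((lo1, lo2), (hi1, hi2), (lo1, hi2), (lo2, hi1)):
            t = crossing(f, g)
            if t is not None and start < t < end:
                cuts.add(t)
    times = sorted(cuts)
    pieces = []
    for a, b in zip(times, times[1:]):
        mid = (a + b) / 2  # the ordering of the bounds is fixed inside a piece
        dims = []
        for (lo1, hi1), (lo2, hi2) in zip(dims1, dims2):
            lo = max(lo1, lo2, key=lambda f: value(f, mid))
            hi = min(hi1, hi2, key=lambda f: value(f, mid))
            if value(lo, mid) > value(hi, mid):
                break  # empty in this dimension, so the piece is empty
            dims.append((lo, hi))
        else:
            pieces.append((dims, (a, b)))
    return pieces


# R1 and R2 from the slide.
R1 = ([((1, 6), (1, 20)), ((0, 5), (0, 15))], (0, 10))
R2 = ([((2, 0), (3, 10)), ((0, 0), (0, 10))], (0, 10))
```

For R1 and R2 this produces three parametric rectangles whose time intervals break at t = 5 and t = 6, matching the three cases in the figure. Each piece is again a parametric rectangle, which is the point of the closure theorem for intersection.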
Complement of a Parametric Rectangle
Theorem: Relational algebra can be evaluated in PTIME data complexity for Parametric Rectangle Databases.
Similar theorems hold for other combinations of constraint databases and query languages.
Spatial Operators
Area Operator
computes the area of a relation R at a time ti
denoted by area (R, ti)
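One way to evaluate area (R, ti) is to instantiate every parametric rectangle at ti and then measure the union of the resulting rectangles. The sketch below uses simple coordinate compression for the union; it is an illustration under that assumption, not necessarily how PReSTO computes the area, and the tuple layout is hypothetical.

```python
def snapshot(rect, t):
    """rect = (x_lo, x_hi, y_lo, y_hi, t_from, t_to); bounds are (a, b) = a*t + b."""
    x_lo, x_hi, y_lo, y_hi, t_from, t_to = rect
    if not t_from <= t <= t_to:
        return None
    ev = lambda f: f[0] * t + f[1]
    return (ev(x_lo), ev(x_hi), ev(y_lo), ev(y_hi))


def area(relation, t):
    """Area of the union of all rectangles of the relation at time t."""
    boxes = [b for b in (snapshot(r, t) for r in relation) if b is not None]
    xs = sorted({v for b in boxes for v in b[:2]})
    ys = sorted({v for b in boxes for v in b[2:]})
    total = 0
    # Every grid cell is either fully inside some box or outside all of them.
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            if any(b[0] <= x0 and x1 <= b[1] and b[2] <= y0 and y1 <= b[3] for b in boxes):
                total += (x1 - x0) * (y1 - y0)
    return total


# Two overlapping parametric rectangles, reusing R1 and R2 from the closure example.
R1 = ((1, 6), (1, 20), (0, 5), (0, 15), 0, 10)
R2 = ((2, 0), (3, 10), (0, 0), (0, 10), 0, 10)
```

At t = 0 the two snapshots have areas 140 and 100 and overlap in a 4 x 5 region, so the union has area 220.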
Animation
Look at the examples.
Other Spatiotemporal Operators
Block Operator
computes the change in a relation R2, as a result of being "blocked" by R1, until the time tk
denoted by block (R1, R2, tk)
if we view a parametric rectangle as a set of moving points, it can be defined as the set of moving points in R2 that did not intersect with R1 before tk
Other Spatiotemporal Operators
Block Operator (contd.)
The second figure shows the result of the difference operation, and the third shows the result of the block operation at t = 15.
Other Spatio-temporal Operators
Block Operator (contd.): evaluation
instantiate R1, R2 at tk
successive partitioning
maximum depth
Other Spatio-temporal Operators
Collide Operator
computes the result of a collision between two (moving) parametric rectangles (which do not grow or shrink)
denoted by collide (r1, r2)
assume that the collision is elastic
view the parametric rectangles as spherical bodies, with given fixed masses
Other Spatio-temporal Operators
Collide Operator (contd.)
determine the line joining the centers (LC) at the time of collision
resolve velocities along LC

Collide Operator (contd.)
initial velocity (Vi)
LC velocity before collision (ViLC)
LC velocity after solving the "head-on" collision (VfLC)
final velocity (Vf) = vector sum of [VfLC - ViLC, Vi]
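The velocity computation can be sketched with the standard formulas for a one-dimensional elastic collision applied along LC. This is an illustrative sketch assuming two point masses with known centers at the time of collision; the function name and argument layout are hypothetical.

```python
import math


def collide(c1, v1, m1, c2, v2, m2):
    """Return the final velocities (Vf1, Vf2) of two bodies after an elastic collision.

    c1, c2: centers at the time of collision; v1, v2: initial velocities (Vi);
    m1, m2: fixed masses.
    """
    # Unit vector along the line joining the centers (LC).
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    dist = math.hypot(dx, dy)
    nx, ny = dx / dist, dy / dist
    # Components of the initial velocities along LC (ViLC).
    u1 = v1[0] * nx + v1[1] * ny
    u2 = v2[0] * nx + v2[1] * ny
    # Head-on elastic collision along LC (VfLC).
    w1 = ((m1 - m2) * u1 + 2 * m2 * u2) / (m1 + m2)
    w2 = ((m2 - m1) * u2 + 2 * m1 * u1) / (m1 + m2)
    # Vf = Vi + (VfLC - ViLC), applied along LC; the tangential part is unchanged.
    vf1 = (v1[0] + (w1 - u1) * nx, v1[1] + (w1 - u1) * ny)
    vf2 = (v2[0] + (w2 - u2) * nx, v2[1] + (w2 - u2) * ny)
    return vf1, vf2
```

With equal masses and a head-on approach the two bodies simply exchange their velocity components along LC.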
PReSTO System Simulation
PReSTO System
PReSTO: Parametric Rectangles as SpatioTemporal Objects
Visual C++ implementation
GUI
Queries: operator icons provided on the interface
Other systems: CCUBE, DEDALE, MLPQ, etc.
Uses of Constraint Databases
Decision Support Systems example: fire control
Approximate Reasoning example: approximation of Time Series (t1,y1), (t2,y2), …., (tn,yn)
Timed Automata Verification
Animation
Forest Fire Control Problem
Problem Description
a fire starts in some portion of a forest
fire-fighters have several options to control the fire spread
the option which minimizes damage must be chosen
Forest Fire Control Problem
Problem Description (contd.)
assume that we can predict the fire spread from environmental conditions
two possible strategies of dropping foam could be:
Forest Fire Control Problem
Problem Solution (contd.)
after applying the block and area operators in PReSTO, we get the following results:

Strategy   Burnt Area
None       16056
1          12426
2          7315
Relational Database: Temperature
t   Temp
0   75
1   77
2   86
3   87
4   90

PLA (error threshold = 3)

Constraint Database: Temperature
t   Temp
t   temp   temp = 2t + 75, t >= 0, t <= 1.
t   temp   temp = 9t + 68, t > 1, t <= 2.
t   temp   temp = 2t + 82, t > 2, t <= 4.
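A simple greedy procedure reproduces the three pieces above from the five data points when the maximum allowed error is 3. It is only a sketch of one possible piecewise linear approximation, assuming each piece interpolates its first and last data point; the slides do not state which algorithm was actually used, and the function names are hypothetical.

```python
def fits(points, i, j, threshold):
    """True if the line through points i and j is within threshold of all points between."""
    (t0, y0), (t1, y1) = points[i], points[j]
    slope = (y1 - y0) / (t1 - t0)
    return all(abs(y0 + slope * (t - t0) - y) <= threshold for t, y in points[i + 1:j])


def piecewise_linear(points, threshold):
    """Greedy approximation: extend each piece as far as the threshold allows.

    Returns a list of (slope, intercept, t_from, t_to).
    """
    pieces = []
    i, n = 0, len(points)
    while i < n - 1:
        j = i + 1
        while j + 1 < n and fits(points, i, j + 1, threshold):
            j += 1
        (t0, y0), (t1, y1) = points[i], points[j]
        slope = (y1 - y0) / (t1 - t0)
        pieces.append((slope, y0 - slope * t0, t0, t1))
        i = j
    return pieces


TEMPERATURE = [(0, 75), (1, 77), (2, 86), (3, 87), (4, 90)]
```

For example, the first piece cannot be extended from t = 1 to t = 2 because the line through (0, 75) and (2, 86) would miss (1, 77) by 3.5, which exceeds the threshold.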
Piecewise Linear Approximation
US Precipitation (6,726 Stations, 96 Months)

Threshold (inch)          0.1     0.2     0.4     0.8     0.16    0.32    0.64
Average # of pieces       89.03   84.29   76.09   63.22   45.72   25.30   7.84
Correlation coefficient   0.9999  0.9999  0.9993  0.9756  0.9748  0.8775  0.6424
Querying Piecewise Linear Approximations
For the relation Temperature(Temp, t), find the temperatures at time 1.5 :
select Temp from Temperature where t = 1.5;
For the relation Temperature(Temp, t), find the lowest temperature:
select min(Temp) from Temperature;
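On the constraint representation both queries reduce to simple operations on the linear pieces. The sketch below assumes the three-piece Temperature relation shown earlier, stored as hypothetical (slope, intercept, t_from, t_to) tuples with closed intervals, which is harmless here because adjacent pieces agree at their shared endpoints.

```python
# temp = 2t + 75 on [0, 1], temp = 9t + 68 on [1, 2], temp = 2t + 82 on [2, 4]
PIECES = [(2, 75, 0, 1), (9, 68, 1, 2), (2, 82, 2, 4)]


def temp_at(pieces, t):
    """select Temp from Temperature where t = <t>"""
    for slope, intercept, t0, t1 in pieces:
        if t0 <= t <= t1:
            return slope * t + intercept
    return None


def min_temp(pieces):
    """select min(Temp) from Temperature

    A linear function on an interval attains its minimum at an endpoint,
    so it suffices to look at the endpoints of every piece.
    """
    return min(slope * t + intercept
               for slope, intercept, t0, t1 in pieces
               for t in (t0, t1))
```

The first query returns 81.5 at t = 1.5, a value that is not stored in the original five-row relation, and the second returns 75.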
Update of Piecewise Linear Approximation
Timed Automata
[Figure: automaton TOURIST with two transitions]
Transition 1: guard d >= 100; updates e := e + 198, d := d - 100
Transition 2: guard y >= 30,000; updates e := e + 198, y := y - 30000
Timed Automata Datalog
T(d, e', y') :- T(d, e, y), y >= 30000, y' = y - 30000, e' = e + 198.
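The Datalog rule can be evaluated bottom-up by repeatedly applying the transitions until no new T facts appear. The sketch below does this for concrete states; the initial state is a made-up example, and the second transition is encoded by analogy with the rule shown, so treat both as assumptions. A constraint database system would instead work with constraint tuples.

```python
def reachable(initial):
    """Least fixpoint of the two TOURIST transitions, starting from one (d, e, y) state."""
    seen = {initial}
    frontier = [initial]
    while frontier:
        d, e, y = frontier.pop()
        successors = []
        if d >= 100:
            # T(d', e', y) :- T(d, e, y), d >= 100, d' = d - 100, e' = e + 198.
            successors.append((d - 100, e + 198, y))
        if y >= 30000:
            # T(d, e', y') :- T(d, e, y), y >= 30000, y' = y - 30000, e' = e + 198.
            successors.append((d, e + 198, y - 30000))
        for state in successors:
            if state not in seen:
                seen.add(state)
                frontier.append(state)
    return seen
```

Starting from the hypothetical state (250, 0, 60000) the evaluation terminates because d and y strictly decrease, which illustrates the termination question raised for Datalog queries.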
Value-by-Area Cartogram
A value-by-area cartogram is a map in which each display area is proportional to some geographically distributed "value".
Contiguous cartograms
Noncontiguous cartograms
Example: 1990 USA Population Distribution
Some Disadvantages of Current Animation Algorithms
Current cartogram animation algorithms run very slowly.
Current cartogram animation algorithms require pre-computing and saving the snapshots. This greatly limits the number of snapshots that can be displayed in animation.
Cartogram Animation
Two snapshots of a value-by-area cartogram animation for the U.S. population in 1970 and 1990.
Volume Animation
Two snapshots of daily mean temperature in the U.S. during a winter day and a summer day.
Constraint-Based Animation Methods
Avoid pre-computation
Support animation with large number of snapshots
Run fast
Methods for Animation
Parallel Method
Serial Method
Hybrid Method
Parallel Method
Serial Method
Hybrid Method
Comparison of the three methods
Definitions
Cell value (Vi)
Actual cell area (ACi)
Average cell density (DAVG)
Desired cell area (ADi=Vi/ DAVG)
Cell distortion (Δi=| ADi - ACi |/ ADi )
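The definitions can be turned into a short computation. In this sketch the average cell density DAVG is taken to be the total value divided by the total actual area, which is an assumption since the slide does not spell it out; the function name is hypothetical and all values are assumed positive.

```python
def distortions(values, areas):
    """Return a list of (ADi, distortion) pairs, one per cell.

    values: cell values Vi; areas: actual cell areas ACi.
    """
    d_avg = sum(values) / sum(areas)  # assumed definition of DAVG
    result = []
    for v, ac in zip(values, areas):
        ad = v / d_avg                # desired cell area ADi = Vi / DAVG
        result.append((ad, abs(ad - ac) / ad))
    return result
```

With this choice of DAVG the desired areas add up to the total map area, so cells only trade area with each other.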
A Naïve Cartogram Algorithm
For each cell, calculate ACi and ADi
Repeat
    For each cell Ci, begin
        compute its ACi
        update cell Ci's corner vertices
        update all other corner vertices
    End for.
Until for all cells i, |ACi-ADi|/ADi <= ε
Effective Range
A Cell Ci’s effective range is the area of the map that contains all corner vertices that could change while inflating/deflating cell Ci.
Move of the Corner Vertices
When enlarging/shrinking a cell, a corner vertex is moved along the line connecting the cell center and this vertex.
Move of the Corner Vertices
the new coordinate for cell (x,y) is:
Finding the Center of a Cell
For a polygonal cell with corner vertices (x1,y1), (x2,y2), …, (xn,yn), the center (x,y) is calculated as follows:
$x = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad y = \frac{1}{n} \sum_{i=1}^{n} y_i$
Shape Distortion for the Originally Square Cell Division
A New Cartogram Algorithm
For each cell, calculate ACi and ADi
Repeat
    Sort the cells by their distortion Δi
    For each cell Ci (in sorted order), begin
        compute its ACi
        update cell Ci's corner vertices
        update other corner vertices only if they are inside cell Ci's effective range
    End for.
Until for all cells i, |ACi-ADi|/ADi <= ε
Query Optimization Issues
Indexing: new data structures and algorithms
Algebraic Optimization: algebraic rewriting rules
Recursive query processing: pushing down constraints recursively; termination control: subsumption testing
Indexing: Parametric R-trees
Assume that airplanes, clouds, etc. are (approximated by) Parametric Rectangles.
"Find all the airplanes that are currently traveling within some cloud"
Parametric R-trees
Like an R-tree, a PR-tree is a height-balanced tree and each node has between M/2 and M children, where M is a constant that depends on the page size.
Def: Let S be a set of d-dimensional parametric rectangles. Then we call a parametric rectangle r the minimum bounding parametric rectangle (MBPR) of S if and only if
1. r contains all parametric rectangles in S, and
2. the area of the projection of r onto the (xi,t) space is minimized for each i = 1, …, d.
Parametric Rectangles: r4-r11 MBPRs: r1-r3
ID    x[        x]        y[        y]        t[   t]
r1    30+7t     90+7t     50+5t     100+6t    0    10
r2    0         50        10+5t     55+5t     0    10
r3    60-6t     100-5t    0         40+t      0    10
r4    30+7t     50+7t     80+6t     100+6t    0    10
r5    30+12t    40+12t    50+5t     65+5t     0    10
r6    75+6t     90+7t     70+8t     80+8t     0    10
r7    0         15        40+5t     55+5t     0    9
r8    0         12        20+4t     40+4t     1    10
r9    30        50        10+7t     20+7t     0    10
r10   80-5t     100-5t    2t        20+3t     0    10
r11   60-6t     70-6t     30-3t     40-2t     0    10
[Figure: the MBPRs and the parametric rectangles; snapshot at time t=0 and snapshot at time t=10]
Compute MBPR
Let R be the MBPR of S.
Let tmin and tmax denote the start and the end time of R. Then
tmin = min_{r ∈ S} (r.t[), tmax = max_{r ∈ S} (r.t])
The projection of each parametric rectangle in S onto the (xi,t) space corresponds to a trapezium with four extreme points, as shown in the figure. Hence the projection of S onto the (xi,t) space corresponds to a set Si of 4|S| extreme points in the (xi,t) space.
Let Hi be the convex hull of Si. The lower and the upper bounds of R for the xi dimension are extensions of some edges of the convex hull Hi, which can be computed efficiently.
CONVEX HULL in the (xi,t) space
MBPR in the (xi,t) space
Theorem: The minimum bounding parametric rectangle of M d-dimensional parametric rectangles can be computed in O(d M log M) time.
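For a single dimension the construction can be sketched as follows. The area between two lines over [tmin, tmax] equals their gap at the midpoint times the interval length, so the tightest upper bound is the edge of the upper convex hull that spans the midpoint, and symmetrically for the lower bound. This is a sketch based on that observation, with hypothetical names and bounds stored as (a, b) meaning a*t + b; it assumes tmin < tmax.

```python
def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points):
    """Upper convex hull, left to right (monotone chain)."""
    hull = []
    for p in sorted(set(points)):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def tightest_upper_line(points, t_mid):
    """Line (a, b) through the upper-hull edge that spans t_mid."""
    hull = upper_hull(points)
    for p, q in zip(hull, hull[1:]):
        if p[0] < q[0] and p[0] <= t_mid <= q[0]:
            slope = (q[1] - p[1]) / (q[0] - p[0])
            return (slope, p[1] - slope * p[0])
    raise ValueError("need extreme points at two distinct times")


def mbpr_1d(rects):
    """rects: list of (lower, upper, t_from, t_to). Returns (lower, upper, tmin, tmax)."""
    t_min = min(r[2] for r in rects)
    t_max = max(r[3] for r in rects)
    t_mid = (t_min + t_max) / 2
    ev = lambda f, t: f[0] * t + f[1]
    uppers = [(t, ev(hi, t)) for lo, hi, t0, t1 in rects for t in (t0, t1)]
    # Negate the lower extreme points so the same hull routine can be reused.
    lowers = [(t, -ev(lo, t)) for lo, hi, t0, t1 in rects for t in (t0, t1)]
    up = tightest_upper_line(uppers, t_mid)
    neg = tightest_upper_line(lowers, t_mid)
    return ((-neg[0], -neg[1]), up, t_min, t_max)


# y-dimension of r7, r8, r9 from the table.
R7_R9_Y = [((5, 40), (5, 55), 0, 9), ((4, 20), (4, 40), 1, 10), ((7, 10), (7, 20), 0, 10)]
```

Applied to the y-dimension of r7, r8, and r9, the sketch returns the bounds 10+5t and 55+5t over [0, 10], which agrees with the entry for r2 in the table. Sorting the 4M extreme points dominates the cost, which is consistent with the O(d M log M) bound when repeated for d dimensions.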
Searching
The search algorithm of a PR-tree is based on the algorithm that checks whether two parametric rectangles intersect at any time instance.
Given two parametric rectangles r1 and r2, let Πi,t(r1) and Πi,t(r2) denote the projections of r1 and r2 onto the (xi,t) space. Then r1 and r2 intersect if and only if there is at least one time instance t1 such that for each i = 1, …, d, Πi,t(r1) and Πi,t(r2) intersect at t1.
Whether two d-dimensional parametric rectangles intersect can be checked in O(d) time.
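An O(d) check follows from the fact that, in each dimension, the times at which two linearly moving intervals overlap are given by two linear inequalities in t. The sketch below narrows a single time window with two inequalities per dimension; the representation (bounds as (a, b) pairs meaning a*t + b) and the function name are assumptions.

```python
def intersects(r1, r2):
    """Each rectangle is (dims, (t_from, t_to)); dims lists (lower, upper) per dimension."""
    dims1, (f1, e1) = r1
    dims2, (f2, e2) = r2
    t_lo, t_hi = max(f1, f2), min(e1, e2)
    for (lo1, hi1), (lo2, hi2) in zip(dims1, dims2):
        for lo, hi in ((lo1, hi2), (lo2, hi1)):
            # Require lo(t) <= hi(t), i.e. a*t + b <= 0.
            a = lo[0] - hi[0]
            b = lo[1] - hi[1]
            if a > 0:
                t_hi = min(t_hi, -b / a)
            elif a < 0:
                t_lo = max(t_lo, -b / a)
            elif b > 0:
                return False  # parallel bounds that never overlap
    return t_lo <= t_hi


# R1 and R2 from the closure example, and r4 and r5 from the PR-tree table.
R1 = ([((1, 6), (1, 20)), ((0, 5), (0, 15))], (0, 10))
R2 = ([((2, 0), (3, 10)), ((0, 0), (0, 10))], (0, 10))
r4 = ([((7, 30), (7, 50)), ((6, 80), (6, 100))], (0, 10))
r5 = ([((12, 30), (12, 40)), ((5, 50), (5, 65))], (0, 10))
```

Here R1 and R2 intersect, while r4 and r5 never do because their y-extents are disjoint throughout [0, 10].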
INSERTION
The insertion algorithm of PR-trees is an extension of the insertion algorithm of R-trees.
Go down the tree to find an appropriate leaf node to insert the new index record, split the nodes that overflow, then propagate the changes upward.
Appropriate subtree: At each level we choose the child whose bounding parametric rectangle needs the least volume enlargement to include the new tuple.
Let r = (x1[, x1], …, xd[, xd], t[, t]) be a d-dimensional parametric rectangle and P be the corresponding polyhedron in the (x1,…,xd,t) space. The volume of r is the integral of the area function over the time interval [t[, t]].
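Read this way, the volume is $\mathrm{vol}(r) = \int_{t^{[}}^{t^{]}} \prod_{i=1}^{d} \bigl(x_i^{]}(t) - x_i^{[}(t)\bigr)\,dt$, which is my reading of "the integral of the area function" rather than a formula taken from the slides. Because every bound is linear in t, the integrand is a polynomial and can be integrated exactly. A minimal sketch with hypothetical names:

```python
def volume(dims, t_from, t_to):
    """Exact volume of a parametric rectangle.

    dims: list of (lower, upper) per dimension, each bound (a, b) = a*t + b.
    """
    poly = [1.0]  # coefficients of the area polynomial, lowest degree first
    for lo, hi in dims:
        a = hi[0] - lo[0]  # the width in this dimension is a*t + b
        b = hi[1] - lo[1]
        nxt = [0.0] * (len(poly) + 1)
        for k, c in enumerate(poly):
            nxt[k] += c * b
            nxt[k + 1] += c * a
        poly = nxt
    # Integrate the polynomial term by term over [t_from, t_to].
    return sum(c / (k + 1) * (t_to ** (k + 1) - t_from ** (k + 1))
               for k, c in enumerate(poly))
```

For the earlier example X = [2t, 3t+5], Y = [2t, 2t+5], T = [0, 3], the area function is 5t + 25 and the volume is 97.5.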
Cont…
Let the children of a non-leaf node be E1, …, Ep and let rj be the bounding parametric rectangle of Ej. Suppose we would like to insert a new tuple T into one child of the node.
Let MBPR(rj,T) denote the minimum bounding parametric rectangle of rj and T.
We will choose the child that needs the minimum enlargement to include T, that is,
min_j enlarge(rj,T)
where enlarge(rj,T) = vol(MBPR(rj,T)) - vol(rj).
Node Splitting
When a new entry is added to a full node with M entries, it is necessary to divide the collection of M+1 entries between two nodes.
The PR-tree node splitting algorithm is an extension of the quadratic split algorithm of the R-tree.
The idea is to choose the two of the M+1 entries whose minimum bounding parametric rectangle has the largest volume increase as the first elements of the two new groups, where the volume increase is the volume of their MBPR minus their volumes. Each of the remaining parametric rectangles is inserted into the group that needs less volume enlargement to include it.
Insertion of r12 when 3 is the maximum number of children per node.
This causes the splitting of the root, and the height of the tree increases by 1, as shown in the figure.
Split r1
Split the root
The process of splitting the root is similar to the splitting of r1. An insertion into a PR-tree can be done in O(log_M N) time, where M is the page size and N is the number of moving objects.
Performance for Fixed Size Moving Objects
Performance For Growing or Shrinking Moving Objects
Conclusions and Future Work
More Systems: experiments on large-scale problems (GIS, bioinformatics); user feedback
Timed Automata Verification: evaluation of Datalog queries (complexity, termination, tuple recognition)
Approximate Evaluation: applied for Datalog queries?