Upload
fkou0623
View
290
Download
2
Embed Size (px)
Citation preview
Segment trees and interval treesLekcija 11
Sergio [email protected]
FMFUniverza v Ljubljani
Includes slides by Antoine Vigneron
Sergio Cabello RC – More trees
Outline
I segment trees
• stabbing queries• windowing problem• rectangle intersection• Klee’s measure problem
I interval trees
• improvement for some problems
I higher dimension
Sergio Cabello RC – More trees
Data structure for stabbing queries
I orthogonal range searching: data is points, queries are rectangles
I stabbing problem: data is rectangles, queries are points
I in one dimension
• data: a set of n intervals• query: report the k intervals that contain a query point q
I in Rd
• data: a set of n isothetic (axis-parallel) boxes• query: report the k boxes that contain a query point q
Sergio Cabello RC – More trees
Motivation
I in graphics and databases, objects are often stored according totheir bounding box
Object
Bounding box
I query: which objects does point x belong to?
I first find objects whose bounding boxes contain x
I then perform screening
Sergio Cabello RC – More trees
Data structure for windowing queries
I windowing queries
• data: a set of n disjoint segments in R2
• query: report the k segments that intersect a queryrectangle R .
I motivation: zoom in maps
Sergio Cabello RC – More trees
Rectangle intersection
I input: a set B of n isothetic boxes in R2
I output: all the intersecting pairs in B2
b4
b3
b1
b2
b5
I output: (b1, b3),(b2, b3),(b2, b4),(b3, b4)
Sergio Cabello RC – More trees
Klee’s measure problem
I input: a set B of n isothetic boxes
I output: the area/volume of the union
I well understood in R2 ⇒O(n log n) time
I the union can have complexity Θ(n2). Example?
I poorly understood in Rd for d > 2
Sergio Cabello RC – More trees
Segment tree
I a data structure to store intervals, or segments
I allows to answer stabbing queries• in R
2: report the segments that intersect a query verticalline l
• in R: report the segments that intersect a query point
reported
reported
reported
l
• query time: O(log n + k)• space usage: O(n log n)• preprocessing time: O(n log n)
Sergio Cabello RC – More trees
Notations
I let S = (s1, s2, . . . sn) be a set of segments in R
I let E be the set of the x–coordinates of the endpoints of thesegments of S
I we assume general position, that is: |E | = 2n
I first sort E in increasing order
I E = e1 < e2 < · · · < e2n
Sergio Cabello RC – More trees
Atomic intervals
I E splits R into 2n + 1 atomic intervals:
• [−∞, e1]• [ei , ei+1] for i ∈ 1, 2, . . . 2n − 1• [e2n,∞]
I these are the leaves of the segment tree
S
E
Sergio Cabello RC – More trees
Internal nodes
I the segment tree T is a balanced binary tree
I each internal node u with children v and v ′ is associated with aninterval Iu = Iv ∪ I ′v
I an elementary interval is an interval associated with a node of T(it can be an atomic interval)
Iu
Iv Iv ′
v v ′
u
Sergio Cabello RC – More trees
Example
E
S
root
Sergio Cabello RC – More trees
Partitioning a segment
I let s ∈ S be a segment whose endpoints have x–coordinates ei
and ej
I [ei , ej ] is split into several elementary intervalsI they are chosen as close as possible to the rootI s is stored in each node associated with these elementary
intervals
E
s
root
Sergio Cabello RC – More trees
Canonical subsets
I each node u is associated with a canonical subset S(u) ofsegments
I let ei < ej be the x–coordinates of the endpoints of s ∈ S
I then s is stored in S(u) iff Iu ⊂ [ei , ej ] and Iparent(u) 6⊂ [ei , ej ]
I standard segment tree: S(u) is stored as a list pointed from u
I we can also add more structure/data/pointers from u
I useful for multi-level data structures
I we will use it
Sergio Cabello RC – More trees
Example
root
Sergio Cabello RC – More trees
Answering a stabbing query
root
l
Sergio Cabello RC – More trees
Answering a stabbing query
Algorithm ReportStabbing(u, xl)Input: root u of T , x–coordinate of l
Output: segments in S that cross l
1. if u == NULL
2. then return
3. output S(u) traversing the list pointed from u
4. if xl ∈ Iu.left
5. then ReportStabbing(u.left, xl)6. if xl ∈ Iu.right
7. then ReportStabbing(u.right, xl)
I it clearly takes O(k + log n) time
Sergio Cabello RC – More trees
Inserting a segment
E
s
root
Sergio Cabello RC – More trees
Insertion in a segment tree
Algorithm Insert(u, s)Input: root u of T , segment s. Endpoints of s have x–coordinates
x− < x+
1. if Iu ⊂ [x−, x+]2. then insert s into the list storing S(u)3. else
4. if [x−, x+] ∩ Iu.left 6= ∅5. then Insert(u.left, s)6. if [x−, x+] ∩ Iu.right 6= ∅7. then Insert(u.right, s)
Sergio Cabello RC – More trees
Main Property
LemmaA segment s is stored at most twice at each level of T .
Dokaz.
I by contradiction
I if s stored at more than 2 nodes at level i
I let u be the leftmost such node, u′ be the rightmost
I let v be another node at level i containing s
u v u′
s
v .parent
I then Iv .parent ⊂ [x−, x+]
I so s cannot be stored at v
Sergio Cabello RC – More trees
Analysis
I lemma of previous slide implies
• each segment stored in O(log n) nodes• space usage: O(n log n)
I insertion in O(log n) time
• at most four nodes are visited at each level
I actually space usage is Θ(n log n) (example?)
I query time: O(k + log n)
I preprocessing
• sort endpoints: Θ(n log n) time• build empty segment tree over these endpoints: O(n) time• insert n segments into T : O(n log n) time• overall: Θ(n log n) preprocessing time
Sergio Cabello RC – More trees
Rectangle intersection
I input: a set B of n isothetic boxes in R2
I output: all the intersecting pairs in B2
I using segment trees, we give an O(n log n + k) time algorithm,where k is the number of intersecting pairs
I this is optimal. Why?
I note: faster than our line segment intersection algorithm
I space usage: Θ(n log n) due to segment trees
I space usage is suboptimal
Sergio Cabello RC – More trees
Two kinds of intersections
I overlap
I intersecting edges
I reduces to intersection reportingfor isothetic segments
I done as exercise (firsthomework)
I inclusion
I we can find them usingstabbing queries
Sergio Cabello RC – More trees
Reporting overlaps
I equivalent to reporting intersecting edges
I plane sweep approach
I sweep line status: BBST containing the horizontal line segmentsthat intersect the sweep line, by increasing y–coordinates
I each time a vertical line segment is encountered, reportintersection by range searching in the BBST
I preprocessing time: O(n log n) for sorting endpoints
I running time: O(k + n log n)
Sergio Cabello RC – More trees
Reporting inclusions
I also using plane sweep: sweep a horizontal line from top tobottom
I sweep line status: the boxes that intersect the sweep line l , in asegment tree with respect to x–coordinates
• the endpoints are the x–coordinates of the horizontal edgesof the boxes
• at a given time, only rectangles that intersect l are in thesegment tree
• we can perform insertion and deletions in a segment tree inO(log n) time
I each time a vertex of a box is encountered, perform a stabbingquery in the segment tree
Sergio Cabello RC – More trees
Remarks
I at each step a box intersection can be reported several times
I in addition there can be overlap and vertex stabbing a box at thesame time
I to obtain each intersecting pair only once, make some simplechecks. How?
Sergio Cabello RC – More trees
Stabbing queries for boxes
I in Rd , a set B of n boxes
I for a query point q find all the boxes that contain it
I we use a multi-level data structure, with a segment tree in eachlevel
I inductive definition, induction on d
I first, we store B in a segment tree T with respect tox1–coordinate
I for each node u of T , associate a (d − 1)–dimensionalmulti–level segment tree for the segments S(u), with respect to(x2, x3 . . . xd)
Sergio Cabello RC – More trees
Performing queries
I search for q in T using x1–coordinate
I for all nodes in the search path, query recursively the(d − 1)–dimensional multi–level segment tree
I there are log n such queries
I by induction on d , it follows that
• query time: O(k + logd n)• space usage: O(n logd n)• preprocessing time : O(n logd n)
I can be slightly improved. . .
Sergio Cabello RC – More trees
Windowing queries
I in Rd , a set S of n disjont segments
I for a query axis-aligned rectangle R , find all the segmentsintersecting R
I three types of segments intersect R :
• segments with one endpoint inside R• segments that intersect vertical side of R• segments that intersect horizontal side of R
I first type: range tree over the endpoints of the segments
I second type: multi-level data structure with segment tree
• store S in a segment tree T with respect to x–coordinate• for each node u of T , store the segments S(u) sorted by
their intersection with vertical line in BST
Sergio Cabello RC – More trees
Windowing queries
I for segments of the second type:
• a query visits O(log n) nodes of the main tree• the canonical subsets of those nodes are disjoint• in each node we spend O(log n) time, plus time to report
segments (1d range-tree)• each segment is reported once, because disjointness
I each segment reported at most twice: filter them
I For n disjoint segments:
• preprocessing: O(n log2 n) time• query: O(k + log2 n) time
I where did we use that the segments are disjoint?
Sergio Cabello RC – More trees
Klee’s measure problem
I in R2, a set S of n axis-parallel rectangles
I compute area of the union
I solution using O(n log n) time
I sweep a vertical line ` from left to right
• keep the length of `⋂
(⋃
S)• events: length changes when rectangles start or stop
intersecting `• relevant values: distance between consecutive events and
the length• we compute the area to the left of `, updating it at each
event
I use segment trees to maintain the length
I
http://www.cgl.uwaterloo.ca/~krmoule/courses/cs760m/klee
Sergio Cabello RC – More trees
Klee’s measure problem
I we need to maintain the length of union of intervals underinsertion and deletion of intervals
I make a segment tree (we know all endpoints in advance)
I at each node u we store
• list of S(u) (actually its cardinality is enough)• length(u): the length of Iu covered by segments stored
below u• note that length(u) only depends on subtree rooted at u• this allows quick updates
I length(root) is the real length we want
I insertion or deletion of interval takes O(log n) time
• if S(u) 6= ∅, then length(u) = length(Iu)• else, length(u) = length(u.left) + length(u.right)
Sergio Cabello RC – More trees
Klee’s measure problem
I in R3 best known algorithm in O(n3/2) time
I only lower bound: Ω(n log n)
I in R3, recent progress for unit boxes
Sergio Cabello RC – More trees
Interval trees
I interval trees allow to perform stabbing queries in one dimension
• query time: O(k + log n)• preprocessing time: O(n log n)• space: O(n)
I based on different approach
Sergio Cabello RC – More trees
Preliminary
I let xmed be the median of E
• Sl : segments of S that are completely to the left of xmed
• Smed : segments of S that contain xmed
• Sr : segments of S that are completely to the right of xmed
xmed
Smed
SrSl
Sergio Cabello RC – More trees
Data structure
I recursive data structure
I left child of the root: interval tree storing Sl
I right child of the root: interval tree storing Sr
I at the root of the interval tree, we store Smed in two lists
• ML is sorted according to the coordinate of the left endpoint(in increasing order)
• MR is sorted according to the coordinate of the rightendpoint (in decreasing order)
Sergio Cabello RC – More trees
Example
s1s2s3
s5
s7
s4
s6
Interval tree ons3 and s5
Interval tree on
Ml = (s4, s6, s1)Mr = (s1, s4, s6)
s2 and s7
Sergio Cabello RC – More trees
Stabbing queries
I query: xq, find the intervals that contain xq
I if xq < xmed then
• Scan Ml in increasing order, and report segments that arestabbed. When xq becomes smaller than the x–coordinateof the current left endpoint, stop.
• recurse on Sl
I if xq > xmed
• analogous, but on the right side
Sergio Cabello RC – More trees
Analysis
I query time
• size of the subtree divided by at least two at each level• scanning through Ml or Mr : proportional to the number of
reported intervals• conclusion: O(k + log n) time
I space usage: O(n) (each segment is stored in two lists, and thetree is balanced)
I preprocessing time: easy to do it in O(n log n) time
I pseudocode
Sergio Cabello RC – More trees