Intro To Segment Trees Victor Gao

segment trees · 2018. 4. 30.

  • OUTLINE

    ➤ Interval Notation

    ➤ Range Query & Modification

    ➤ Segment Tree - Analogy

    ➤ Building a Segment Tree

    ➤ Query & Modification

    ➤ Lazy Propagation

    ➤ Complexity Analysis

    ➤ Implementation Tricks

  • GENERAL INFORMATION

    ➤ This lecture is relatively comprehensive and covers most of the basic topics related to segment trees.

    ➤ You may be overwhelmed by the amount of information I'm giving you today, but there is no need to panic.

    ➤ You are NOT expected to memorize or fully understand the details during the lecture. Pay more attention to the high level concepts and abstractions instead.

    ➤ Once you understand these, you'll be able to figure out the details pretty easily, even by yourself.

  • INTERVAL NOTATION

    ➤ We will be using the following notation:

    [left, right)

    ➤ Mathematically, this represents an interval that is closed on the left end and open on the right end

    ➤ It includes numbers from left to right - 1

    ➤ Reasons to use this convention:

    ➤ Personal preference, you don’t necessarily have to follow it

    ➤ It simplifies certain operations

    ➤ Being able to represent an empty interval in a less awkward way, for example, [0, 0)

  • RANGE QUERY & MODIFICATION

    ➤ Consider the following problem:

    Given an array of integers, we have these two operations:

    1) calculate the sum of the numbers in interval [i, j) (Query)

    2) add x to every number in interval [i, j) (Modify)

    Example: for array [0, 1, 2, 3] (indices start from 0)

    1) Query [0, 2) gives the sum 0 + 1 = 1

    2) Add (Modify) 4 to [1, 3), the array becomes [0, 5, 6, 3]

    3) Query [0, 4) gives the sum 0 + 5 + 6 + 3 = 14

  • RANGE QUERY & MODIFICATION - PRIMITIVE APPROACH

    ➤ Obviously, there is one primitive approach:

    ➤ For each query, iterate through all elements in the interval and calculate the sum.

    ➤ For each modification, modify all elements in the interval one by one.

    Time Complexity:

    ➤ Query: O(n)

    ➤ Modification: O(n)

    SLOW

    Data Structure:

    ➤ Faster Q & M

    ➤ Trade (not much) Space for Time
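    For reference, the primitive approach can be sketched in a few lines of Python. The function names `query` and `modify` are chosen here for illustration and are not from the slides; both loops touch every element of [i, j), which is where the O(n) cost comes from.

```python
def query(arr, i, j):
    """Sum of the elements in the half-open interval [i, j)."""
    total = 0
    for k in range(i, j):
        total += arr[k]
    return total


def modify(arr, i, j, x):
    """Add x to every element in the half-open interval [i, j)."""
    for k in range(i, j):
        arr[k] += x


# The example from the problem statement.
arr = [0, 1, 2, 3]
first = query(arr, 0, 2)    # 0 + 1
modify(arr, 1, 3, 4)        # arr is now [0, 5, 6, 3]
second = query(arr, 0, 4)   # 0 + 5 + 6 + 3
```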

  • SEGMENT TREE - INTRO

    ➤ Core Idea: add nodes that store aggregate statistics.

    ➤ Analogy: number of students in a high school:

    total: 152
      grade 10: 61
        class A: 21
        class B: 22
        class C: 18
      grade 11: 44
        class A: 24
        class B: 20
      grade 12: 47
        class A: 22
        class B: 25

  • SEGMENT TREE - INTRO

    ➤ How to calculate the total number of students in the following three classes?

  • SEGMENT TREE - INTRO

    ➤ What should we do when a transfer student joins grade 10, class B?

  • SEGMENT TREE - INTRO

    ➤ A segment tree looks like this. It's a binary tree with the leaf nodes storing statistics for individual array elements. The non-leaf nodes are the additional layers we add to store statistics of the combined intervals.

    [0, 4): 6

    [0, 2): 1 [2, 4): 5

    0: 0 1: 1 2: 2 3: 3

    *In this case, we store the sum of the elements in the interval in each node

  • SEGMENT TREE - INITIALIZATION

    ➤ A segment tree can be built from an array recursively:

    ➤ To initialize a leaf, simply copy the data over.

    ➤ To initialize a non-leaf,

    ➤ Initialize its children first.

    ➤ Then set its value to the sum of the values of the two children. (Assume we care about the sum here.)
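    The recursive construction above might look like this in Python. This is a minimal sketch, not the lecture's own code: the `Node` class, its field names and the `build` function are assumptions made for illustration, and the array is assumed to be non-empty.

```python
class Node:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi        # this node covers [lo, hi)
        self.value = 0                   # sum of the elements in [lo, hi)
        self.left = self.right = None    # children (None for a leaf)


def build(arr, lo, hi):
    """Build the subtree covering arr[lo:hi] and return its root."""
    node = Node(lo, hi)
    if hi - lo == 1:
        # Leaf: simply copy the data over.
        node.value = arr[lo]
    else:
        # Non-leaf: initialize the children first, then combine them.
        mid = (lo + hi) // 2
        node.left = build(arr, lo, mid)
        node.right = build(arr, mid, hi)
        node.value = node.left.value + node.right.value
    return node


root = build([0, 1, 2, 3], 0, 4)
```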

  • SEGMENT TREE - QUERY

    ➤ Query(i, j): returns the sum of the numbers in interval [i, j)

    ➤ What are Query(0, 1), Query(2, 4) and Query(0, 4)?

    ➤ What is Query(0, 3)? (Hint: the high school example)

    [0, 4): 6

    [0, 2): 1 [2, 4): 5

    0: 0 1: 1 2: 2 3: 3

  • SEGMENT TREE - QUERY

    ➤ Basic idea: big intervals are made up of smaller ones. A query on an interval can be calculated as the sum of queries over some smaller intervals (that are present in the tree).

    QUERY(cur, [a, b)): // cur is current node
        if [a, b) contains cur.interval then return cur.value
        if cur is not a leaf then l …
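    Since the pseudocode on this slide is cut off, here is one possible way to write the query in Python. It is a sketch under assumptions: the `Node` class and its fields are hypothetical, and the "disjoint" check is one common way to handle the parts of the tree that do not overlap [a, b).

```python
class Node:
    """Hypothetical node type: builds the subtree for arr[lo:hi]."""

    def __init__(self, arr, lo, hi):
        self.lo, self.hi = lo, hi
        if hi - lo == 1:
            self.left = self.right = None
            self.value = arr[lo]
        else:
            mid = (lo + hi) // 2
            self.left = Node(arr, lo, mid)
            self.right = Node(arr, mid, hi)
            self.value = self.left.value + self.right.value


def query(cur, a, b):
    """Sum of the elements in [a, b), restricted to cur's interval."""
    if a <= cur.lo and cur.hi <= b:
        return cur.value            # [a, b) contains cur's interval
    if b <= cur.lo or cur.hi <= a:
        return 0                    # no overlap: contributes nothing
    # Partial overlap (never happens at a leaf): ask both children.
    return query(cur.left, a, b) + query(cur.right, a, b)


root = Node([0, 1, 2, 3], 0, 4)
answers = [query(root, 0, 1), query(root, 2, 4),
           query(root, 0, 4), query(root, 0, 3)]
```

    Query(0, 3) is answered by combining the stored node [0, 2) with the leaf for index 2, just as a grade total is combined with a single class in the high school example.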

  • SEGMENT TREE - MODIFY (SINGLE POINT)

    ➤ Modify(i, x): add x to the ith element in the array.

    ➤ Basic Idea: increment not only the corresponding leaf, but also re-calculate the values stored in its ancestors.

    ModifyPoint(leaf, x): // generic version
        leaf.value …
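    The slide's pseudocode starts from the leaf and is cut off. The sketch below takes a different route that reaches the same result: it descends from the root to the leaf and recalculates each ancestor on the way back up. The `Node` class and the name `modify_point` are assumptions for illustration.

```python
class Node:
    """Hypothetical node type: builds the subtree for arr[lo:hi]."""

    def __init__(self, arr, lo, hi):
        self.lo, self.hi = lo, hi
        if hi - lo == 1:
            self.left = self.right = None
            self.value = arr[lo]
        else:
            mid = (lo + hi) // 2
            self.left = Node(arr, lo, mid)
            self.right = Node(arr, mid, hi)
            self.value = self.left.value + self.right.value


def modify_point(cur, i, x):
    """Add x to element i and refresh every node on the path to it."""
    if cur.left is None:
        cur.value += x              # the leaf for index i
        return
    # Exactly one child covers index i.
    child = cur.left if i < cur.left.hi else cur.right
    modify_point(child, i, x)
    # Recalculate this ancestor from its (now up-to-date) children.
    cur.value = cur.left.value + cur.right.value


root = Node([0, 1, 2, 3], 0, 4)
modify_point(root, 1, 4)            # array becomes [0, 5, 2, 3]
```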

  • SEGMENT TREE - MODIFY AN INTERVAL?

    ➤ Modifying a single element is easy.

    ➤ What if we want to modify an interval instead?

    ➤ Primitive approach: call ModifyPoint multiple times.

    ➤ Very inefficient.

  • SEGMENT TREE - MODIFY AN INTERVAL?

    ➤ Suppose we have a segment tree with 128 leaves.

    ➤ We're doing the following operations in sequence:

    ➤ Add 1 to all numbers in interval [2, 128)

    ➤ Add 3 to all numbers in interval [4, 88)

    ➤ Add -2 to all numbers in interval [17, 55)

    ➤ Add 10 to all numbers in interval [100, 112)

    ➤ Add 5 to all numbers in interval [44, 84)

    ➤ Query [47, 48)

    ➤ What do you observe?

  • SEGMENT TREE - LAZY PROPAGATION

    ➤ Basic idea: delay the modifications until the point where you absolutely have to do them. In other words, modifications do not need to take effect immediately. We apply the changes to a node the next time the node is accessed.

    ➤ To implement this idea, we add another field flag to each node. Flag is used to store (cache) a pending modification.

    ➤ There are two things we need to do when we touch a node with a non-null flag (a pending modification):

    ➤ Update the value.

    ➤ Pass down the flag. (Why?)

  • SEGMENT TREE - LAZY PROPAGATION

    ➤ Assume we have the same segment tree that stores interval sums.

    ➤ With lazy propagation, whenever a node is accessed, we check whether its flag is null (or zero, depending on the implementation).

    ➤ If the flag is null - nothing happens.

    ➤ If the flag is not null:

    ➤ Update the value.

    ➤ Update the child nodes' flags.

    ➤ Set flag to null.

  • SEGMENT TREE - LAZY PROPAGATION EXAMPLE

    ➤ Here's the same segment tree we had.

    ➤ Modify([l, r), x): add x to elements in interval [l, r).

    ➤ Query([l, r)): calculate the sum of elements in interval [l, r).

    ➤ Each node is shown as "interval: value | flag"; the flag is omitted when it is null.

    Initial tree:

    [0, 4): 6

    [0, 2): 1 [2, 4): 5

    [0, 1): 0 [1, 2): 1 [2, 3): 2 [3, 4): 3

    Add 1 to all elements (only the root's flag is set):

    [0, 4): 6 | 1

    [0, 2): 1 [2, 4): 5

    [0, 1): 0 [1, 2): 1 [2, 3): 2 [3, 4): 3

    Query [0, 1), step 1 (the root is accessed: its value is updated and its flag is passed down):

    [0, 4): 10

    [0, 2): 1 | 1 [2, 4): 5 | 1

    [0, 1): 0 [1, 2): 1 [2, 3): 2 [3, 4): 3

    Query [0, 1), step 2 (node [0, 2) is accessed):

    [0, 4): 10

    [0, 2): 3 [2, 4): 5 | 1

    [0, 1): 0 | 1 [1, 2): 1 | 1 [2, 3): 2 [3, 4): 3

    Query [0, 1), step 3 (leaf [0, 1) is accessed; the answer is 1):

    [0, 4): 10

    [0, 2): 3 [2, 4): 5 | 1

    [0, 1): 1 [1, 2): 1 | 1 [2, 3): 2 [3, 4): 3

  • SEGMENT TREE - PROPAGATE

    Propagate(cur):
        if cur.flag != 0 then cur.val …

  • SEGMENT TREE - QUERY (LAZY)

    ➤ Query needs to be revised in order to maintain the lazy flags.

    QUERY(cur, [a, b)):
        Propagate(cur)
        if [a, b) contains cur.interval then return cur.value
        if cur is not a leaf then l …

  • SEGMENT TREE - MODIFY (LAZY)

    Modify(cur, [a, b), x):
        Propagate(cur)
        if [a, b) contains cur.interval then cur.flag …
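    The Propagate, Query and Modify pseudocode is cut off on these slides, so the following Python is only a sketch of how the three pieces could fit together. Assumptions made here: the `Node` class and field names are hypothetical, a flag of 0 means "no pending modification", a flag is a pending addition that has not yet been applied to the node's own value (as in the example), and non-overlapping nodes are skipped without being touched.

```python
class Node:
    """Hypothetical node type: builds the subtree for arr[lo:hi]."""

    def __init__(self, arr, lo, hi):
        self.lo, self.hi = lo, hi
        self.flag = 0                       # pending "add flag to every element"
        if hi - lo == 1:
            self.left = self.right = None
            self.value = arr[lo]
        else:
            mid = (lo + hi) // 2
            self.left = Node(arr, lo, mid)
            self.right = Node(arr, mid, hi)
            self.value = self.left.value + self.right.value


def propagate(cur):
    if cur.flag != 0:
        # Update the value: every element in the interval grows by flag.
        cur.value += cur.flag * (cur.hi - cur.lo)
        # Pass down the flag so the children learn about it later.
        if cur.left is not None:
            cur.left.flag += cur.flag
            cur.right.flag += cur.flag
        cur.flag = 0


def query(cur, a, b):
    if b <= cur.lo or cur.hi <= a:
        return 0                            # no overlap: node is not touched
    propagate(cur)
    if a <= cur.lo and cur.hi <= b:
        return cur.value
    return query(cur.left, a, b) + query(cur.right, a, b)


def modify(cur, a, b, x):
    if b <= cur.lo or cur.hi <= a:
        return                              # no overlap: node is not touched
    propagate(cur)
    if a <= cur.lo and cur.hi <= b:
        cur.flag = x                        # cache it; applied on next access
        return
    modify(cur.left, a, b, x)
    modify(cur.right, a, b, x)
    # Resolve the children's flags before recomputing this node's sum.
    propagate(cur.left)
    propagate(cur.right)
    cur.value = cur.left.value + cur.right.value


# Replays the example: add 1 to all elements, then Query [0, 1).
root = Node([0, 1, 2, 3], 0, 4)
modify(root, 0, 4, 1)
answer = query(root, 0, 1)
```

    After these two calls the tree is in the final state of the example: the nodes on the path to [0, 1) have been updated, while [1, 2) and [2, 4) still carry their pending flag.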

  • SEGMENT TREE - WHEN TO USE LAZY PROPAGATION

    ➤ Lazy propagation is usually only needed when you need to support both range queries and range modifications.

    ➤ What about ranged modification + single point query?

    ➤ It depends.

    ➤ Sometimes you need lazy propagation.

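    ➤ Sometimes you can turn it into a single point modification + prefix-sum query problem by converting the array into differences between adjacent elements: a range modification then changes only two entries, and a single element is recovered as a prefix sum of the differences.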

    ➤ For example, [5, 2, 7, 4] => [5, -3, 5, -3]
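    A small Python sketch of the difference trick, using plain lists to keep it short. The function names are made up for illustration, and the array is assumed to be non-empty. In practice the prefix sum would be answered by a segment tree (or a similar structure) built over the differences; here it is a plain `sum` just to show the idea.

```python
def to_differences(arr):
    """[5, 2, 7, 4] -> [5, -3, 5, -3]: first element, then adjacent differences."""
    return [arr[0]] + [arr[k] - arr[k - 1] for k in range(1, len(arr))]


def range_add(diff, i, j, x):
    """Add x to every element of the original array in [i, j)."""
    diff[i] += x            # elements from i onward go up by x ...
    if j < len(diff):
        diff[j] -= x        # ... and elements from j onward go back down


def point_query(diff, i):
    """Element i of the original array is a prefix sum of the differences."""
    return sum(diff[:i + 1])


diff = to_differences([5, 2, 7, 4])
range_add(diff, 1, 3, 10)   # the original array would now be [5, 12, 17, 4]
values = [point_query(diff, k) for k in range(4)]
```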

  • SEGMENT TREE - COMPLEXITY ANALYSIS

    ➤ For a segment tree built from an n-element array, the total number of nodes is n + n/2 + n/4 + … + 1 = 2n - 1 (assuming n is a power of 2). Therefore the space complexity is O(n)

    ➤ The time complexity of building a segment tree is O(n)

    ➤ The height of the tree is O(log(n))

    ➤ Both Query and Modify do constant work at each node, and on each level they visit only a constant number of nodes (at most two of which need to be split further). Therefore the time complexity of Q/M is O(log(n))
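    As a quick check of the node count, assuming n is a power of 2 (n = 2^k, so the tree has k + 1 levels), the sum is a geometric series:

```latex
\[
  n + \frac{n}{2} + \frac{n}{4} + \dots + 1
  \;=\; \sum_{i=0}^{k} 2^{i}
  \;=\; 2^{k+1} - 1
  \;=\; 2n - 1,
  \qquad n = 2^{k}.
\]
```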

  • SEGMENT TREE - GENERALIZATION

    ➤ Previously we've been talking about a segment tree that calculates interval sums and allows adding x to elements in an interval.

    ➤ We can easily support different kinds of aggregate statistics and operations by abstracting our pseudocode a step further.

    ➤ Notice that there are several operations we used in Propagate, Modify and Query:

    ➤ Updating a value with a flag.

    ➤ Let's make it a function merge_val_flag(val, flag, interval)

    ➤ Updating a flag with a flag passed down from above.

    ➤ Let's make it a function merge_flag_flag(flag1, flag2)

    ➤ Merging query results.

    ➤ Let's make it a function merge_val_val(val1, val2)

  • SEGMENT TREE - PROPAGATE (GENERALIZED)

    Propagate(cur):
        if cur.flag != null then cur.val …

  • SEGMENT TREE - QUERY (GENERALIZED)

    QUERY(cur, [a, b)):
        Propagate(cur)
        if [a, b) contains cur.interval then return cur.value
        if cur is not a leaf then l …

  • SEGMENT TREE - MODIFY (GENERALIZED)

    Modify(cur, [a, b), x):
        Propagate(cur)
        if [a, b) contains cur.interval then cur.flag …

  • SEGMENT TREE - GENERALIZATION

    ➤ Define the utility functions as follows:

    ➤ merge_val_flag(val, flag, interval) => flag

    ➤ merge_flag_flag(flag1, flag2) => flag2

    ➤ merge_val_val(val1, val2) => max(val1, val2)

    ➤ Now we have a segment tree that calculates the maximum element in an interval, and allows setting all elements in an interval to a specified value.

    ➤ However it is very important that your value and flag only use O(1) space.
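    To see the three utility functions in action, here is a hedged Python sketch of a "range assign, range max" tree. The three merge functions follow the definitions above; everything else (the `Node` class, using `None` as the null flag, and the way Query avoids non-overlapping children) is an assumption made for this illustration, since the generalized pseudocode is cut off.

```python
def merge_val_flag(val, flag, interval):
    return flag                  # after "set all to flag", the max is flag


def merge_flag_flag(flag1, flag2):
    return flag2                 # the newer assignment overrides the older


def merge_val_val(val1, val2):
    return max(val1, val2)


class Node:
    """Hypothetical node type: builds the subtree for arr[lo:hi]."""

    def __init__(self, arr, lo, hi):
        self.lo, self.hi, self.flag = lo, hi, None
        if hi - lo == 1:
            self.left = self.right = None
            self.value = arr[lo]
        else:
            mid = (lo + hi) // 2
            self.left, self.right = Node(arr, lo, mid), Node(arr, mid, hi)
            self.value = merge_val_val(self.left.value, self.right.value)


def propagate(cur):
    if cur.flag is not None:
        cur.value = merge_val_flag(cur.value, cur.flag, (cur.lo, cur.hi))
        for child in (cur.left, cur.right):
            if child is not None:
                child.flag = (cur.flag if child.flag is None
                              else merge_flag_flag(child.flag, cur.flag))
        cur.flag = None


def query(cur, a, b):
    """Max over [a, b); the caller must ensure [a, b) overlaps cur."""
    propagate(cur)
    if a <= cur.lo and cur.hi <= b:
        return cur.value
    mid = cur.left.hi
    if b <= mid:
        return query(cur.left, a, b)
    if a >= mid:
        return query(cur.right, a, b)
    return merge_val_val(query(cur.left, a, b), query(cur.right, a, b))


def modify(cur, a, b, x):
    """Set every element in [a, b) to x."""
    if b <= cur.lo or cur.hi <= a:
        return
    propagate(cur)
    if a <= cur.lo and cur.hi <= b:
        cur.flag = x
        return
    modify(cur.left, a, b, x)
    modify(cur.right, a, b, x)
    propagate(cur.left)
    propagate(cur.right)
    cur.value = merge_val_val(cur.left.value, cur.right.value)


root = Node([5, 2, 7, 4], 0, 4)
before = query(root, 0, 4)       # max of [5, 2, 7, 4]
modify(root, 1, 3, 1)            # array becomes [5, 1, 1, 4]
after = query(root, 0, 4)
```

    Both the value and the flag are single numbers here, which is the O(1)-space requirement mentioned above.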

  • SEGMENT TREE - IMPLEMENTATION

    ➤ How would you implement a segment tree?

    ➤ One way is to have pointers pointing to the child nodes, as you would for many other tree structures.

    [0, 2): value, flag, left child, right child

    [0, 1): value, flag, left child, right child

    [1, 2): value, flag, left child, right child

    Disadvantages?

    ➤ Cache Unfriendly

    ➤ Additional Space

  • SEGMENT TREE - IMPLEMENTATION

    ➤ We can make the data structure faster and easier to implement by requiring the length of the original array to be a power of 2.

    ➤ It has some nice properties:

    ➤ Since every node in the tree now either has 2 children or is a leaf, and all leaves sit on the same level, the structure can be seen as a full binary tree and stored in a 1-D array, similar to a binary heap.

    ➤ For the node with index i, node ⌊(i - 1)/2⌋ is its parent, node (2i + 1) is its left child and node (2i + 2) is its right child. No need to store the pointers.

    ➤ No need to store the intervals in nodes, for they can be calculated during the recursion or, less efficiently, from the index of the node.
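    A sketch of the array-based layout in Python, assuming the array length is a power of 2 and leaving out lazy flags for brevity. The function names are made up for illustration. The root is at index 0, the leaves occupy the last n slots, and intervals are computed during the recursion instead of being stored.

```python
def build(arr):
    """Return the tree as a flat list of 2n - 1 sums (n must be a power of 2)."""
    n = len(arr)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of 2"
    tree = [0] * (2 * n - 1)
    tree[n - 1:] = arr                      # leaves live at indices n-1 .. 2n-2
    for i in range(n - 2, -1, -1):          # fill parents bottom-up
        tree[i] = tree[2 * i + 1] + tree[2 * i + 2]
    return tree


def query(tree, a, b, i=0, lo=0, hi=None):
    """Sum over [a, b); node i covers [lo, hi), tracked by the recursion."""
    if hi is None:
        hi = (len(tree) + 1) // 2           # the root covers [0, n)
    if a <= lo and hi <= b:
        return tree[i]
    if b <= lo or hi <= a:
        return 0
    mid = (lo + hi) // 2
    return (query(tree, a, b, 2 * i + 1, lo, mid) +
            query(tree, a, b, 2 * i + 2, mid, hi))


def modify_point(tree, k, x):
    """Add x to element k, then walk up through the parents."""
    i = (len(tree) + 1) // 2 - 1 + k        # index of the leaf for element k
    tree[i] += x
    while i > 0:
        i = (i - 1) // 2                    # parent index
        tree[i] = tree[2 * i + 1] + tree[2 * i + 2]


tree = build([0, 1, 2, 3])                  # [6, 1, 5, 0, 1, 2, 3]
total = query(tree, 0, 3)
modify_point(tree, 1, 4)                    # array becomes [0, 5, 2, 3]
```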