Advanced Data Structures and Algorithms COSC-600 Lecture presentation-6


TREES

Trees are a special subset of graphs. A tree is a collection of nodes. It consists of
a) a distinguished node r called the root
b) zero or more non-empty subtrees T1, T2, ..., Tk, each of whose roots is connected by a directed edge from r
c) the definition is recursive: each subtree is itself a tree

[Figure: root r with subtrees T1, T2, ..., Tk]

A tree is a directed graph without a cycle.
1. Each node of a tree has exactly one parent node, except the root.
2. There is only one "path" from the root to each node, so there is no cycle.

Path: a sequence of nodes n1, n2, ..., nk such that ni is the parent of ni+1
Length of path: the number of edges on the path
Depth of ni: length of the path from the root to ni
Height of ni: length of the longest path from ni to a leaf node
Leaf node: a node that has no child, also called a terminal node

Linked list representation of a tree

Implementation: using linked lists
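One common way to realize the linked representation is to give every node two links: one to its first child and one to its next sibling. The slides do not say which layout is meant, so the sketch below is an assumption; the names `TreeNode`, `add_child` and `height` are illustrative only. It also shows how the height definition above translates into code.

```python
class TreeNode:
    """General tree node stored with two links: first child and next sibling."""

    def __init__(self, label):
        self.label = label
        self.first_child = None   # leftmost child, or None for a leaf
        self.next_sibling = None  # next child of the same parent

    def add_child(self, child):
        """Append child at the end of this node's linked list of children."""
        if self.first_child is None:
            self.first_child = child
        else:
            last = self.first_child
            while last.next_sibling is not None:
                last = last.next_sibling
            last.next_sibling = child
        return child


def height(node):
    """Length (in edges) of the longest path from node down to a leaf."""
    best = -1
    child = node.first_child
    while child is not None:
        best = max(best, height(child))
        child = child.next_sibling
    return best + 1


# Small example: root r with subtrees T1 and T2; T1 has one child x.
root = TreeNode("r")
t1 = root.add_child(TreeNode("T1"))
t2 = root.add_child(TreeNode("T2"))
x = t1.add_child(TreeNode("x"))
```

With this layout a leaf has height 0, and the example root has height 2 (the path r, T1, x).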

Tree Search (tree traverse)

1) Depth first search (DFS): uses a stack
2) Breadth first search (BFS): uses a queue
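The two traversals differ only in the container that holds the nodes still to be visited. A minimal sketch, assuming a small hypothetical tree stored as a dictionary of child lists (the tree and the names `children`, `dfs`, `bfs` are not from the slides):

```python
from collections import deque

# Hypothetical example tree: A is the root.
children = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [], "E": [], "F": [],
}


def dfs(root):
    """Depth first search: pending nodes are kept on a stack (LIFO)."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children in reverse so the leftmost child is popped first.
        stack.extend(reversed(children[node]))
    return order


def bfs(root):
    """Breadth first search: pending nodes are kept in a queue (FIFO)."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children[node])
    return order
```

On this tree `dfs("A")` visits A, B, D, E, C, F, while `bfs("A")` visits the nodes level by level: A, B, C, D, E, F.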

Binary Search Tree

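• each son of a vertex is distinguished either as a left son or as a right son
• no vertex has more than one left son nor more than one right son, i.e. it is a tree in which no node can have more than two children
• BST rule: every key in the left subtree of a node is smaller than the node's key, and every key in the right subtree is larger

Operations:
1. Find (Search)
2. Traversal
3. Insert
4. Delete
5. Build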

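Find Operation
• Time Complexity: O(height of the binary search tree), that is O(N) in the worst case.

Example:

[Figure: degenerate BST in which the nodes A, B, C, D, E form a single chain]

The height of the tree grows linearly with N (a chain of N nodes has N - 1 edges). Thus, order of growth = O(N).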
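The worst case can be made concrete by counting the nodes that a search visits. The sketch below is illustrative: `Node`, `find` and the chain are assumptions, with the chain modeled as a BST in which every node has only a right child.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def find(root, key):
    """Return (node or None, number of nodes visited) for a BST search."""
    node, visited = root, 0
    while node is not None:
        visited += 1
        if key == node.key:
            return node, visited
        # BST rule: smaller keys are to the left, larger keys to the right.
        node = node.left if key < node.key else node.right
    return None, visited


# Degenerate chain A -> B -> C -> D -> E (only right children).
chain = Node("A", right=Node("B", right=Node("C", right=Node("D", right=Node("E")))))

found, steps = find(chain, "E")
missing, steps_missing = find(chain, "Z")
```

Searching for the deepest key E, or for a key larger than every stored key, visits all N = 5 nodes of the chain.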

Find (Worst Case Example)

In the worst case the order of growth is O(N), no matter which way the degenerate tree is shaped.

[Figure: two degenerate BSTs on the nodes A, B, C, D, E with different shapes]

Find Max and Find Min

For Find Max and Find Min operations, the worst case has time complexity O(N).

• L is the smallest value in this BST.

• I is the largest value in this BST.

• For sorting in ascending order use the inorder (LVR) traversal.

• For sorting in descending order use the reverse inorder (RVL) traversal.

• Either traversal has order O(N).

[Figure: example BST with nodes labeled A through O]
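The points above can be sketched in a few lines: the smallest key is reached by following left links only, the largest by following right links only, and the two traversal orders produce the keys sorted ascending or descending. The example tree and all names are assumptions made for illustration, not the tree from the figure.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def find_min(node):
    """The leftmost node holds the smallest key."""
    while node.left is not None:
        node = node.left
    return node.key


def find_max(node):
    """The rightmost node holds the largest key."""
    while node.right is not None:
        node = node.right
    return node.key


def inorder(node, out):
    """LVR traversal: appends the keys to out in ascending order."""
    if node is not None:
        inorder(node.left, out)
        out.append(node.key)
        inorder(node.right, out)
    return out


def reverse_inorder(node, out):
    """RVL traversal: appends the keys to out in descending order."""
    if node is not None:
        reverse_inorder(node.right, out)
        out.append(node.key)
        reverse_inorder(node.left, out)
    return out


# Hypothetical BST used only for this example.
tree = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
```

Each traversal touches every node once, which is where the O(N) bound comes from.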

Traversal & Median Value

• Inorder traversal can be used to find the median value in the BST.
• It will have the order of O(N).
• We can use a balanced binary search tree to reduce the order from O(N) to O(log N).
• Traversal in a BST will have the order O(N).
• Recursion can be used for the traversal operation.
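A minimal sketch of the O(N) median approach: collect the keys with a recursive inorder traversal and pick the middle one. For an even number of keys this sketch returns the lower of the two middle keys, which is a choice made here rather than something stated in the slides; the names and the example tree are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def inorder(node, out):
    """Recursive LVR traversal; visits every node once, so it is O(N)."""
    if node is not None:
        inorder(node.left, out)
        out.append(node.key)
        inorder(node.right, out)
    return out


def median(root):
    """Median key of a non-empty BST (lower median for an even count)."""
    keys = inorder(root, [])
    return keys[(len(keys) - 1) // 2]


# Example BST holding the keys 2, 4, 5, 6, 7.
tree = Node(5, Node(4, Node(2)), Node(7, Node(6)))
```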

Insert Operation
• Always follow the BST rules while inserting a new node in the tree.

• Case 1) The new node always becomes a terminal node.

• Example (2 is the new node)

• Case 2) Finding the location for the new node by following the BST rules may require walking down the full height of the tree, so the complexity is in the order O(N) in the worst case.

• Example (6 is the new node)

Time complexity, worst case: O(height of the BST)

[Figure: BST with root 5 and children 4 and 7; the new node 2 becomes the left child of 4, and the new node 6 becomes the left child of 7]
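The insert rule can be sketched recursively: walk down as in a search and attach the new node where the search falls off the tree. The code below is a sketch, not the lecture's implementation; ignoring duplicate keys is an assumption, since the slides do not say how duplicates are handled.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(root, key):
    """Insert key into the BST rooted at root and return the (new) root."""
    if root is None:
        return Node(key)          # the new node is always a terminal node
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    # Assumption: a key that is already present is ignored.
    return root


# Rebuild the example: start with 5, 4, 7, then insert 2 and 6.
root = None
for key in (5, 4, 7):
    root = insert(root, key)
root = insert(root, 2)   # 2 < 5 and 2 < 4, so it becomes the left child of 4
root = insert(root, 6)   # 6 > 5 and 6 < 7, so it becomes the left child of 7
```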

Delete Operation

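• Case 1) The node to delete has no child ≡ terminal node (leaf) => just delete it!

• Example

[Figure: BST with root 5, children 4 and 7, and 2 as a child of 4; deleting the leaf 2 leaves root 5 with children 4 and 7]

• Case 2) The node to delete has only one child => reconnect the child to the deleted node's parent.

• Example

[Figure: the same BST; deleting 4, which has the single child 2, leaves root 5 with children 2 and 7]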

Delete Operation

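• Case 3) The node to delete has two children (two subtrees).

• A) Find the smallest node in its right subtree, copy it into the node being deleted, and then delete that smallest node from the right subtree.

• B) Find the largest node in its left subtree, copy it into the node being deleted, and then delete that largest node from the left subtree.

• Time complexity, worst case: O(height of the BST)

• Example (A)

[Figure: BST with root 8 and children 4 and 9; 4 has children 2 and 6, and 6 has the left child 5. Deleting 4 replaces it with 5, giving root 8 with children 5 and 9, where 5 has children 2 and 6]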

Build Operation

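• For the following N elements, the BST is built from the input, read left to right.

Example 5,10,21,32,7

[Figure: root 5; 10 is the right child of 5; 7 and 21 are the children of 10; 32 is the right child of 21]

Example 10,7,5,21,32

[Figure: root 10 with children 7 and 21; 5 is the left child of 7; 32 is the right child of 21]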
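Building is just repeated insertion, so the shape of the tree depends on the input order. A small sketch (names are illustrative) that builds both example sequences and compares their heights:

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def build(keys):
    """Insert the keys one by one, reading the input from left to right."""
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def height(node):
    """Height in edges; an empty tree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


first = build([5, 10, 21, 32, 7])    # longest path: 5, 10, 21, 32
second = build([10, 7, 5, 21, 32])   # longest paths: 10, 7, 5 and 10, 21, 32
```

The same five keys give a tree of height 3 in the first order and height 2 in the second.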

Average Case Analysis

• In the worst case (a degenerate tree) the total cost is T(N) = 0+1+2+3+…+(N-1) = N(N-1)/2. This is not O(N): it is O(N^2).

In the average case the time complexity of a single operation is O(log N) (this is also the best case).

The average depth over all nodes in a tree is O(log N),

on the assumption that all insertion sequences are "equally likely".

Sum of the depths of all nodes ≡ internal path length.

• Time complexity of Insert/Delete Pairs in a BST is O().

• Time complexity of N insert operations (building the tree) is O(N log N) on the average; a single insert is O(log N) on the average.

• The internal path length is O(N log N) on the average, so the average depth of a node is O(log N).

• Theorem: The expected number of comparisons needed to insert N random elements into an initially empty binary search tree is O(N log N) for N >= 1.
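The worst-case sum 0+1+2+…+(N-1) can be checked by counting key comparisons while building a tree. The sketch below is illustrative (the function name and the inputs are assumptions): sorted input produces a chain, while a well-chosen order of the same keys needs far fewer comparisons. It does not attempt to measure the average case.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def build_counting(keys):
    """Build a BST from keys and count the nodes each insert is compared to."""
    root, comparisons = None, 0
    for key in keys:
        if root is None:
            root = Node(key)
            continue
        node = root
        while True:
            comparisons += 1
            if key < node.key:
                if node.left is None:
                    node.left = Node(key)
                    break
                node = node.left
            else:
                if node.right is None:
                    node.right = Node(key)
                    break
                node = node.right
    return root, comparisons


n = 100
_, worst = build_counting(range(1, n + 1))            # sorted input: a chain
_, balanced = build_counting([4, 2, 6, 1, 3, 5, 7])   # a perfectly balanced order
_, sorted7 = build_counting([1, 2, 3, 4, 5, 6, 7])
```

For sorted input the count equals N(N-1)/2, which is also the internal path length of the resulting chain.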

Balanced Binary Search Tree ( AVL Tree )

The difference in depth between any two terminal nodes in a binary tree should be at most 1.

• An AVL tree is a BST which satisfies a balance condition. The balance condition
1. must be easy to maintain
2. must ensure that the depth of the tree is O(log N)

• How to implement it?
• Reshape the AVL tree if the depth difference is more than 1.

• Rotation
• Single Rotation
• Double Rotation
• Height of an empty AVL tree = -1 (purely mathematical assumption)

• Definition of Balance Condition
• For "every" node in an AVL tree, the heights of the left and right subtrees can differ by at most 1.

Ex:

• To maintain balance condition we require additional operations after inserting or deleting a node.

• Note: Height information is kept for each node. After each insert operation, update the height of all the nodes on the path from the newly inserted node to the root node.

• Note: The minimum number of nodes in an AVL tree of height "h" is S(h).
• S(h) = S(h-1)+S(h-2)+1
• Ex: S(0)=1; S(1)=2; S(2)=4; S(3)=7; S(4)=12; S(5)=20
• All the tree operations = O(log N)
• To maintain the AVL tree condition after each insert or delete operation we need to perform rotations.
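The recurrence for S(h) is easy to evaluate directly. A short sketch (the function name `min_nodes` is an assumption) that reproduces the example values:

```python
def min_nodes(h):
    """Minimum number of nodes S(h) in an AVL tree of height h >= 0."""
    if h == 0:
        return 1
    prev, cur = 1, 2              # S(0), S(1)
    for _ in range(2, h + 1):
        # S(h) = S(h-1) + S(h-2) + 1: root plus the two sparsest subtrees
        prev, cur = cur, cur + prev + 1
    return cur


values = [min_nodes(h) for h in range(6)]
```

Because S(h) grows like the Fibonacci numbers, i.e. exponentially in h, a tree with N nodes can only have height O(log N).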

Insertion Operation: If you insert 6 into the above AVL tree, the newly inserted node will be a leaf node.

• The above tree is not an AVL tree, as node 8 violates the balance condition.

So, we need to rotate.
• Note: After one insertion, only nodes that are on the path from the insertion point to the root might have their balance altered.

• Inserted node to root: update the balancing info.
• α: the node that must be rebalanced
• Case 1: Insertion into the left subtree of the left child of α
• Case 2: Insertion into the right subtree of the left child of α
• Case 3: Insertion into the left subtree of the right child of α
• Case 4: Insertion into the right subtree of the right child of α

Types of rotation:
• Single rotation: Case 1 and Case 4
• Double rotation: Case 2 and Case 3

Case 1:

K2 is α, so perform a single rotation.

Case 4:

After single rotation (counter-clock wise)

Case 3 – Double rotation

Case 2 – Double rotation

To understand the double rotation, consider for example a tree into which we insert 5. After this insertion a height violation occurs at node 8.

After the 1st rotation we get the following tree.

But this tree is also not balanced: we get a violation at node 4. So a single rotation does not work here, and hence we have to use a double rotation for the tree. After the double rotation we get the tree below.
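The four cases can be sketched as a recursive insert that updates heights on the way back to the root and rotates at the first unbalanced node α. This is a sketch under assumptions, not the lecture's code: the names `AvlNode`, `rotate_right` and `rotate_left` are illustrative, duplicates are ignored, and the example trees are not the ones from the missing figures. A double rotation is written as two single rotations.

```python
class AvlNode:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.height = 0           # a single node has height 0


def height(node):
    return node.height if node is not None else -1   # empty tree: -1


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(k2):
    """Single rotation for case 1: the left child k1 becomes the new root."""
    k1 = k2.left
    k2.left, k1.right = k1.right, k2
    update(k2)
    update(k1)
    return k1


def rotate_left(k1):
    """Single rotation for case 4: the right child k2 becomes the new root."""
    k2 = k1.right
    k1.right, k2.left = k2.left, k1
    update(k1)
    update(k2)
    return k2


def insert(node, key):
    if node is None:
        return AvlNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node               # assumption: duplicates are ignored
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:               # node is alpha, left side too deep
        if key > node.left.key:   # case 2: first half of the double rotation
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:              # node is alpha, right side too deep
        if key < node.right.key:  # case 3: first half of the double rotation
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


# Case 2 example: inserting 5 under 4 (the left child of 8) needs a double rotation.
small = None
for key in (8, 4, 5):
    small = insert(small, key)

# Sorted input 1..7 triggers only single rotations (case 4).
full = None
for key in range(1, 8):
    full = insert(full, key)
```

In the first example 5 ends up as the root with children 4 and 8; in the second, the sorted input still yields a perfectly balanced tree with root 4.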

SPLAY TREE

It is a special case of Binary Search Tree, and splay tree ≠ balanced Binary Search Tree.

* Relatively simple.
* Guarantees that any M consecutive tree operations starting from an empty BST take at most O(M log N).
* It does not mean O(log N) for a single operation.
* This just means O(log N) "amortized" cost per operation.

Idea: After a node is accessed, it is pushed to the root by a series of AVL tree rotations.

Why?:
• It is likely to be accessed again in the future ~ locality.
• Does not require the maintenance of height/balance information.

After 1st rotation

After 2nd rotation

After 3rd rotation

After 4th rotation

Problem:
• Another node might be as deep as the tree height.
• Ω(M·N) time for a sequence of M operations.

Method 2: Perform a series of special rotations.
• Zig-Zag.
• Zig-Zig.
• Let X be a (non-root) node on the access path at which we are rotating.

Rule:
• If the parent of X is the root of the tree, merely rotate X and the root.
• OTHERWISE X has both a parent (P) and a grandparent (G).

Case 1: Zig Zag

Case 2: Zig Zig
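The splaying rule can be sketched with a recursive function that looks two levels down at a time. In the zig-zag case X is a left child and P a right child (or the reverse), and X is rotated over P and then over G. In the zig-zig case X and P are both left children (or both right children), and P is rotated over G first, then X over P. The code below is one common recursive formulation, shown under assumptions: the slides do not give an implementation, and all names are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(g):
    p = g.left
    g.left, p.right = p.right, g
    return p


def rotate_left(g):
    p = g.right
    g.right, p.left = p.left, g
    return p


def splay(root, key):
    """Bring key (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                       # zig-zig (left, left)
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)                 # rotate P over G first
        elif key > root.left.key:                     # zig-zag (left, right)
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)    # rotate X over P
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:                          # zig-zig (right, right)
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                        # zig-zag (right, left)
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)


def inorder(node, out):
    if node is not None:
        inorder(node.left, out)
        out.append(node.key)
        inorder(node.right, out)
    return out


def height(node):
    return -1 if node is None else 1 + max(height(node.left), height(node.right))


# Left chain 5, 4, 3, 2, 1: accessing the deepest key 1 uses zig-zig steps.
chain = Node(5, Node(4, Node(3, Node(2, Node(1)))))
splayed = splay(chain, 1)

# Zig-zag: 2 is the right child of 1, which is the left child of 3.
zigzag = splay(Node(3, Node(1, None, Node(2))), 2)
```

Splaying the deepest node of the chain moves it to the root and reduces the height of the tree from 4 to 3, while the inorder sequence of keys stays unchanged.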

Delete Operation

1) Access the node to be deleted: this pushes the node up to the root.
2) Delete it (the root): this leaves two subtrees Tl and Tr.
3) Find the largest element in Tl: it is pushed up to the root of Tl, which then has no right child.
4) Tr becomes the right subtree of that root.

B-Tree (M and L)

Goal: reduce the number of disk accesses

M-ary search Tree

• M-way branches
• M-1 keys
• The M-ary search tree must be kept balanced

Properties of B-Tree

1. Data items are stored at leaves.
2. Non-leaf nodes store up to (M-1) keys. Key i represents the smallest key in subtree (i + 1).
3. The root is either a leaf or has between two and M children.
4. All non-leaf nodes (except the root) have between ⌈M/2⌉ and M children.
5. All leaves are at the same depth and have between ⌈L/2⌉ and L data items, for some L.

eg.

eg. deletion

eg.

Assume that
one block = 8192 bytes = 8K
each key = 32 bytes
link = 4 bytes

M-ary B-Tree: M-1 keys, M links

32(M-1) + 4M <= 8192
M = 228

L value?

8192 / record size (256 bytes) = 32 records/leaf

10 million records

1. Each leaf has between 16 and 32 records
2. Each internal node (except the root) has at least 114 branches

10,000,000 / 16 = 625,000 leaves (worst case)

Worst case, depth of B-Tree?

log_114 (10,000,000)
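The arithmetic of this example can be reproduced in a few lines. The constant names below are assumptions; the values are the ones given above.

```python
import math

BLOCK = 8192          # bytes per disk block
KEY = 32              # bytes per key
LINK = 4              # bytes per link (branch)
RECORD = 256          # bytes per data record
N = 10_000_000        # number of records

# 32(M-1) + 4M <= 8192  is equivalent to  36M <= 8224
M = (BLOCK + KEY) // (KEY + LINK)

# Records that fit in one leaf block
L = BLOCK // RECORD

min_branches = math.ceil(M / 2)        # internal nodes (except the root)
min_records = math.ceil(L / 2)         # records per leaf
max_leaves = N // min_records          # worst case: every leaf half full

# Worst-case depth estimate: log base 114 of 10,000,000
depth_bound = math.log(N, min_branches)
```

The logarithm evaluates to roughly 3.4, so under these assumptions about four levels are enough in the worst case.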
