Sorting
How fast can we sort?
• All the sorting algorithms we have seen so far are comparison sorts: they use only comparisons to determine the relative order of elements. E.g., insertion sort, merge sort, quicksort, heapsort.
• The best worst-case running time that we've seen for comparison sorting is O(n lg n).
  – Is O(n lg n) the best we can do?
• Decision trees can help us answer this question.
Decision-tree example
• Sort ⟨a1, a2, …, an⟩.
• Each internal node is labeled i:j for i, j ∈ {1, 2, …, n}.
  – The left subtree shows subsequent comparisons if ai ≤ aj.
  – The right subtree shows subsequent comparisons if ai > aj.
• Sort ⟨a1, a2, a3⟩ = ⟨9, 4, 6⟩:
Each leaf contains a permutation ⟨π(1), π(2), …, π(n)⟩ to indicate that the ordering aπ(1) ≤ aπ(2) ≤ … ≤ aπ(n) has been established.
Decision-tree model
• A decision tree can model the execution of any comparison sort:
  – One tree for each input size n (there is a specific tree for each value of n; the tree is not generic).
• View the algorithm as splitting whenever it compares two elements.
• The tree contains the comparisons along all possible instruction traces.
• The running time of the algorithm = the length of the path taken.
• Worst-case running time = height of tree.
Lower bound for decision-tree sorting
• Theorem. Any decision tree that can sort n elements must have height Ω(n lg n).
• Proof. The tree must contain ≥ n! leaves, since there are n! possible permutations.
• A height-h binary tree has ≤ 2^h leaves.
  – Thus, n! ≤ 2^h.
• h ≥ lg(n!)          (lg is monotonically increasing)
    ≥ lg((n/e)^n)     (Stirling's formula)
    = n(lg n − lg e)  (lg e is constant)
    = Ω(n lg n)
(A function f is monotonically increasing if for all x and y such that x ≤ y one has f(x) ≤ f(y); that is, f preserves the order.)
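As a quick numerical sanity check (not part of the proof), we can compare lg(n!) against n lg n using Python's `math.lgamma`; the ratio approaches 1 from below, consistent with lg(n!) = Θ(n lg n). The helper name `lg_factorial` is illustrative, not from the slides.

```python
import math

def lg_factorial(n):
    # lg(n!) via the log-gamma function: ln(n!) = lgamma(n + 1)
    return math.lgamma(n + 1) / math.log(2)

for n in [10, 100, 10**4, 10**6]:
    ratio = lg_factorial(n) / (n * math.log2(n))
    print(n, round(ratio, 3))  # ratio grows toward 1 as n increases
```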
Lower bound for comparison sorting
• Corollary. Heapsort and merge sort are asymptotically optimal comparison sorting algorithms.
Counting Sort
CountingSort(A, B, k)
1. for i ← 1 to k
2.    do C[i] ← 0                 ▹ initialize counts
3. for j ← 1 to n
4.    do C[A[j]] ← C[A[j]] + 1    ▹ count occurrences of each key
5. for i ← 2 to k
6.    do C[i] ← C[i] + C[i−1]     ▹ prefix sums: C[i] = number of elements ≤ i
7. for j ← n downto 1
8.    do B[C[A[j]]] ← A[j]        ▹ place A[j] at its output position
9.       C[A[j]] ← C[A[j]] − 1
What will be the running time?
• Initializing and prefix-summing C takes O(k) time.
• Counting and placing the n input elements takes O(n) time.
• Total: Θ(n + k).
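The pseudocode above can be sketched as a runnable Python function (keys assumed to be integers in 1..k; the `-1` adjustments convert the 1-based positions in C to 0-based list indices):

```python
def counting_sort(A, k):
    """Sort a list A of integers in the range 1..k; returns a new sorted list.
    A sketch following the slides' pseudocode."""
    n = len(A)
    C = [0] * (k + 1)               # C[i] counts occurrences of key i
    for j in range(n):              # O(n): count each key
        C[A[j]] += 1
    for i in range(2, k + 1):       # O(k): prefix sums -> final positions
        C[i] += C[i - 1]
    B = [None] * n                  # output array
    for j in range(n - 1, -1, -1):  # O(n): place elements right-to-left
        B[C[A[j]] - 1] = A[j]       # -1: 1-based position -> 0-based index
        C[A[j]] -= 1
    return B

print(counting_sort([4, 1, 3, 4, 3], 4))  # → [1, 3, 3, 4, 4]
```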
Analysis
Running time
• If k = O(n), then counting sort takes Θ(n) time.
• But, sorting takes Ω(n lg n) time!
Answer:
• Comparison sorting takes Ω(n lg n) time.
• Counting sort is not a comparison sort.
• In fact, not a single comparison between elements occurs!
• Why don't we always use counting sort?
  – Because it depends on the range k of the elements.
• Could we use counting sort to sort 32-bit integers? Why or why not?
  – Answer: no, k is too large (2^32 = 4,294,967,296).
Stable sorting
• Counting sort is a stable sort: it preserves the input order among equal elements.
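A small sketch illustrating the stability claim: counting sort applied to (key, payload) pairs keeps equal keys in their input order, because the final pass runs right to left (function and variable names are illustrative):

```python
def counting_sort_pairs(items, k):
    """Stable counting sort of (key, payload) pairs with keys in 1..k."""
    n = len(items)
    C = [0] * (k + 1)
    for key, _ in items:             # count each key
        C[key] += 1
    for i in range(2, k + 1):        # prefix sums -> final positions
        C[i] += C[i - 1]
    B = [None] * n
    for item in reversed(items):     # right-to-left pass preserves tie order
        key = item[0]
        B[C[key] - 1] = item
        C[key] -= 1
    return B

pairs = [(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')]
print(counting_sort_pairs(pairs, 2))
# → [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')] — ties keep input order
```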
Dynamic sets
Sets that can grow, shrink, or otherwise change over time.
• Two types of operations:
  – queries return information about the set
  – modifying operations change the set
• Common queries: Search, Minimum, Maximum, Successor, Predecessor
• Common modifying operations: Insert, Delete
Dictionary
Stores a set S of elements, each with an associated key (an integer value).
Operations
Search(S, k): return a pointer to an element x in S with key[x] = k, or NIL if such an element does not exist.
Insert(S, x): insert element x into S, that is, S ← S ⋃ {x}
Delete(S, x): remove element x from S
Implementing a dictionary

              Search      Insert    Delete
linked list   Θ(n)        Θ(1)      Θ(1)
sorted array  Θ(log n)    Θ(n)      Θ(n)
hash table    Θ(1)        Θ(1)      Θ(1)

Hash table: running times are average times and assume (simple) uniform hashing and a large enough table (for example, of size 2n).

Today: binary search trees
Binary search trees
• Binary search trees are an important data structure for dynamic sets.
• They can be used as both a dictionary and a priority queue.
• They accomplish many dynamic-set operations in O(h) time, where h = height of tree.
Binary search trees
• root of T is denoted by root[T]
• internal nodes have four fields:
  – key (and possibly other satellite data)
  – left: points to the left child
  – right: points to the right child
  – p: points to the parent; p[root[T]] = NIL

[Figure: an example binary tree T with root pointer root[T]; absent children are NIL]
Binary search trees
Keys are stored only in internal nodes! There are binary search trees which store keys only in the leaves …
Binary search trees
• a binary tree is
  – a leaf, or
  – a root node x with a binary tree as its left and/or right child
• Binary-search-tree property:
  – if y is in the left subtree of x, then key[y] ≤ key[x]
  – if y is in the right subtree of x, then key[y] ≥ key[x]
Keys don’t have to be unique …
Binary search trees
[Figure: two binary search trees containing the same keys, one with height h = 2 and one with height h = 4]
Inorder tree walk
• binary search trees are recursive structures ➨ recursive algorithms often work well
Example: print all keys in order using an inorder tree walk
InorderTreeWalk(x)
1. if x ≠ NIL
2.    then InorderTreeWalk(left[x])
3.         print key[x]
4.         InorderTreeWalk(right[x])
Correctness: follows by induction from the binary search tree property
Running time?
Intuitively, O(n) time for a tree with n nodes, since we visit and print each node once.
Inorder tree walk
Theorem. If x is the root of an n-node subtree, then the call InorderTreeWalk(x) takes Θ(n) time.
Proof:
– the walk takes a small, constant amount of time on an empty subtree
  ➨ T(0) = c for some positive constant c
– for n > 0, assume that the left subtree has k nodes and the right subtree n−k−1 nodes
  ➨ T(n) = T(k) + T(n−k−1) + d for some positive constant d
– use the substitution method to show: T(n) = (c+d)n + c ■
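The walk can be sketched in runnable Python; the minimal `Node` class below is a hypothetical stand-in for the slides' node with key/left/right fields:

```python
class Node:
    """Minimal BST node: key, left child, right child."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder_tree_walk(x, out):
    # Visit the left subtree, then the node itself, then the right subtree.
    if x is not None:
        inorder_tree_walk(x.left, out)
        out.append(x.key)
        inorder_tree_walk(x.right, out)

# A small BST: an inorder walk emits the keys in sorted order.
root = Node(5, Node(3, Node(1), Node(4)), Node(8, Node(7)))
keys = []
inorder_tree_walk(root, keys)
print(keys)  # → [1, 3, 4, 5, 7, 8]
```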
Querying a binary search tree
TreeSearch(x, k)
1. if x = NIL or k = key[x]
2.    then return x
3. if k < key[x]
4.    then return TreeSearch(left[x], k)
5.    else return TreeSearch(right[x], k)

Initial call: TreeSearch(root[T], k)
Examples:
– TreeSearch(root[T], 4)
– TreeSearch(root[T], 2)

Running time: Θ(length of search path), worst case Θ(h)
Querying a binary search tree – iteratively
IterativeTreeSearch(x, k)
1. while x ≠ NIL and k ≠ key[x]
2.    do if k < key[x]
3.          then x ← left[x]
4.          else x ← right[x]
5. return x

• the iterative version is more efficient on most computers
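A runnable Python sketch of the iterative search (the `Node` class is a hypothetical minimal node, not from the slides; `None` plays the role of NIL):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def iterative_tree_search(x, k):
    """Follow one root-to-leaf path: O(h) time. Returns the node or None."""
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right
    return x

root = Node(5, Node(3, Node(1), Node(4)), Node(8, Node(7)))
print(iterative_tree_search(root, 4).key)  # → 4
print(iterative_tree_search(root, 2))      # → None (key absent)
```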
Minimum and maximum
• the binary-search-tree property guarantees that
  – the minimum key is located in the leftmost node
  – the maximum key is located in the rightmost node

TreeMinimum(x)
1. while left[x] ≠ NIL
2.    do x ← left[x]
3. return x

TreeMaximum(x)
1. while right[x] ≠ NIL
2.    do x ← right[x]
3. return x
Running time?
Both procedures visit nodes on a downward path from the root ➨ O(h) time
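Both procedures translate directly to Python (minimal sketch; the `Node` class is illustrative):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def tree_minimum(x):
    while x.left is not None:   # keep going left
        x = x.left
    return x

def tree_maximum(x):
    while x.right is not None:  # keep going right
        x = x.right
    return x

root = Node(5, Node(3, Node(1), Node(4)), Node(8, Node(7)))
print(tree_minimum(root).key, tree_maximum(root).key)  # → 1 8
```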
Successor and predecessor
• Assume that all keys are distinct.
• Successor of a node x: the node y such that key[y] is the smallest key > key[x].
• We can find y based entirely on the tree structure; no key comparisons are necessary.
• If x has the largest key, then we say x's successor is NIL.
Successor and predecessor
Successor of a node x: the node y such that key[y] is the smallest key > key[x].
Two cases:
1. x has a non-empty right subtree ➨ x's successor is the minimum in x's right subtree
2. x has an empty right subtree ➨ x's successor y is the node of which x is the predecessor (x is the maximum in y's left subtree)
   – as long as we move up through right children (to the left up the tree), we're visiting smaller keys …
Successor and predecessor
TreeSuccessor(x)
1. if right[x] ≠ NIL
2.    then return TreeMinimum(right[x])
3. y ← p[x]
4. while y ≠ NIL and x = right[y]
5.    do x ← y
6.       y ← p[x]
7. return y
– Successor of 15?
– Successor of 6?
– Successor of 4?
[Figure: example BST with root 15; keys 2, 3, 4, 6, 7, 9, 13, 15, 17, 18, 20]
Running time?
O(h)
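TreeSuccessor can be sketched in Python with parent pointers; the `Node` class and the `attach` helper are hypothetical scaffolding, and the example tree (root 15, with 6, 18, 3, 7, 17, 20, 4) follows the figure's shape so the quiz calls above can be checked:

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.p = key, None, None, None

def attach(parent, key, side):
    """Helper (not in the slides): create a child node on the given side."""
    child = Node(key)
    child.p = parent
    setattr(parent, side, child)
    return child

def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x

def tree_successor(x):
    if x.right is not None:                # case 1: minimum of right subtree
        return tree_minimum(x.right)
    y = x.p                                # case 2: go up while x is a right child
    while y is not None and x is y.right:
        x, y = y, y.p
    return y                               # None if x holds the largest key

root = Node(15)
n6 = attach(root, 6, 'left'); n18 = attach(root, 18, 'right')
n3 = attach(n6, 3, 'left');   n7 = attach(n6, 7, 'right')
attach(n18, 17, 'left');      attach(n18, 20, 'right')
n4 = attach(n3, 4, 'right')
print(tree_successor(root).key)  # successor of 15 → 17
print(tree_successor(n6).key)    # successor of 6 → 7
print(tree_successor(n4).key)    # successor of 4 → 6
```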
Insertion
TreeInsert(T, z)
1. y ← NIL
2. x ← root[T]
3. while x ≠ NIL
4.    do y ← x
5.       if key[z] < key[x]
6.          then x ← left[x]
7.          else x ← right[x]
8. p[z] ← y
9. if y = NIL
10.   then root[T] ← z        ▹ tree was empty
11.   else if key[z] < key[y]
12.      then left[y] ← z
13.      else right[y] ← z
• to insert value v, insert node z with key[z] = v, left[z] = NIL, and right[z] = NIL
• traverse tree down to find correct position for z
Running time?
O(h)
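The insertion walk can be sketched in Python (illustrative `Node`/`Tree` classes with parent pointers; an inorder walk verifies the binary-search-tree property afterwards):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.p = key, None, None, None

class Tree:
    def __init__(self):
        self.root = None

def tree_insert(T, z):
    """Walk down to find z's position, then link z below the last node y."""
    y = None
    x = T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:            # tree was empty
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z

def inorder(x, out):
    if x is not None:
        inorder(x.left, out)
        out.append(x.key)
        inorder(x.right, out)

T = Tree()
values = [15, 6, 18, 3, 7, 17, 20, 13, 9, 4, 2, 14]
for v in values:
    tree_insert(T, Node(v))
keys = []
inorder(T.root, keys)
print(keys)  # inorder walk emits the inserted keys in sorted order
```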
Deletion
• we want to delete node z
TreeDelete has three cases:
• z has no children
  ➨ delete z by having z's parent point to NIL instead of to z
• z has one child
  ➨ delete z by having z's parent point to z's child instead of to z
• z has two children
  ➨ z's successor y has either no child or one child
  ➨ delete y from the tree and replace z's key and satellite data with y's

[Figure: the three deletion cases illustrated on the example BST with root 15]

Running time? O(h)
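A runnable Python sketch of the three cases as described above (the spliced-out node y is z itself when z has at most one child, and z's successor otherwise, whose key is then copied into z; all class and function names are illustrative):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.p = key, None, None, None

class Tree:
    def __init__(self):
        self.root = None

def tree_insert(T, z):
    y = None
    x = T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z

def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x

def tree_delete(T, z):
    if z.left is None or z.right is None:
        y = z                          # cases 1 and 2: splice out z itself
    else:
        y = tree_minimum(z.right)      # case 3: successor has no left child
    x = y.left if y.left is not None else y.right
    if x is not None:
        x.p = y.p                      # y's child (if any) adopts y's parent
    if y.p is None:
        T.root = x
    elif y is y.p.left:
        y.p.left = x
    else:
        y.p.right = x
    if y is not z:
        z.key = y.key                  # copy successor's key into z

def inorder(x, out):
    if x is not None:
        inorder(x.left, out)
        out.append(x.key)
        inorder(x.right, out)

T = Tree()
nodes = {}
for k in [15, 6, 18, 3, 7, 17, 20]:
    nodes[k] = Node(k)
    tree_insert(T, nodes[k])
tree_delete(T, nodes[6])   # 6 has two children: successor 7 is spliced out
keys = []
inorder(T.root, keys)
print(keys)  # → [3, 7, 15, 17, 18, 20]
```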
Minimizing the running time
• all operations can be executed in time proportional to the height h of the tree (instead of proportional to the number n of nodes in the tree)
• Worst case: h = Θ(n)
• Solution: guarantee small height (balance the tree) ➨ h = Θ(log n)
• Next week: several schemes to balance binary search trees
Balanced binary search trees
Advantages of balanced binary search trees:
• over linked lists: efficient search in Θ(log n) time
• over sorted arrays: efficient insertion and deletion in Θ(log n) time
• over hash tables: can find successor and predecessor efficiently in Θ(log n) time

Implementing a dictionary

                      Search      Insert      Delete
linked list           Θ(n)        Θ(1)        Θ(1)
sorted array          Θ(log n)    Θ(n)        Θ(n)
hash table            Θ(1)        Θ(1)        Θ(1)
binary search tree    Θ(height)   Θ(height)   Θ(height)

(hash-table times are average times, assuming simple uniform hashing and a large enough table)
balanced binary search trees ➨ height is Θ(log n)