shakila-mahjabin
CSC-391. Data Structure and Algorithm course material
Merge Sort and Quick Sort
CSC 391
Sorting
• Insertion sort
– Design approach: incremental
– Sorts in place: Yes
– Best case: Θ(n)
– Worst case: Θ(n²)
• Bubble sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
• Selection sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
• Merge sort
– Design approach: divide and conquer
– Sorts in place: No
– Running time: Let's see!
Divide-and-Conquer
• Divide the problem into a number of sub-problems
– Similar sub-problems of smaller size
• Conquer the sub-problems
– Solve the sub-problems recursively
– If the sub-problem size is small enough, solve the sub-problems in a straightforward manner
• Combine the solutions of the sub-problems
– Obtain the solution for the original problem
Merge Sort Approach
• To sort an array A[p . . r]:
• Divide
– Divide the n-element sequence to be sorted into two subsequences of n/2 elements each
• Conquer
– Sort the subsequences recursively using merge sort
– When the size of the sequences is 1 there is nothing more to do
• Combine
– Merge the two sorted subsequences
Merge Sort
Alg.: MERGE-SORT(A, p, r)
    if p < r                            ▹ Check for base case
        then q ← ⌊(p + r)/2⌋            ▹ Divide
             MERGE-SORT(A, p, q)        ▹ Conquer
             MERGE-SORT(A, q + 1, r)    ▹ Conquer
             MERGE(A, p, q, r)          ▹ Combine
• Initial call: MERGE-SORT(A, 1, n)
Figure: array A[1 . . 8] = 5 2 4 7 1 3 2 6, with markers p, q and r.
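The recursive structure above can be sketched in Python. This is a minimal sketch, not the course's reference implementation: it assumes 0-based inclusive indices (so the initial call is `merge_sort(a, 0, len(a) - 1)`), and the `merge` helper here is a simple stand-in written for this example.

```python
def merge(a, p, q, r):
    """Stand-in for MERGE: merge sorted a[p..q] and a[q+1..r] (inclusive)."""
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from left when right is exhausted or left's head is not larger.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p, r):
    """Sort a[p..r] in place, mirroring MERGE-SORT(A, p, r)."""
    if p < r:                      # base case: one element is already sorted
        q = (p + r) // 2           # divide
        merge_sort(a, p, q)        # conquer the left half
        merge_sort(a, q + 1, r)    # conquer the right half
        merge(a, p, q, r)          # combine


data = [5, 2, 4, 7, 1, 3, 2, 6]    # an eight-element sample array
merge_sort(data, 0, len(data) - 1)
print(data)
```

Using `<=` in the comparison keeps equal keys in their original order, so this sketch is stable.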
Example – n Power of 2 (Divide)
A[1 . . 8]:       5 2 4 7 1 3 2 6        (q = 4)
Halves:           5 2 4 7 | 1 3 2 6
Pairs:            5 2 | 4 7 | 1 3 | 2 6
Single elements:  5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
Example – n Power of 2 (Conquer and Merge)
Single elements:  5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
Sorted pairs:     2 5 | 4 7 | 1 3 | 2 6
Sorted halves:    2 4 5 7 | 1 2 3 6
A[1 . . 8]:       1 2 2 3 4 5 6 7
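Example – n Not a Power of 2 (Divide)
A[1 . . 11]:      4 7 2 6 1 4 7 3 5 2 6                      (q = 6)
Level 1:          A[1 . . 6] = 4 7 2 6 1 4 (q = 3) | A[7 . . 11] = 7 3 5 2 6 (q = 9)
Level 2:          A[1 . . 3] = 4 7 2 | A[4 . . 6] = 6 1 4 | A[7 . . 9] = 7 3 5 | A[10 . . 11] = 2 6
Level 3:          4 7 | 2 | 6 1 | 4 | 7 3 | 5 | 2 | 6
Single elements:  4 | 7 | 2 | 6 | 1 | 4 | 7 | 3 | 5 | 2 | 6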
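Example – n Not a Power of 2 (Conquer and Merge)
Single elements:  4 | 7 | 2 | 6 | 1 | 4 | 7 | 3 | 5 | 2 | 6
Level 3:          4 7 | 2 | 1 6 | 4 | 3 7 | 5 | 2 | 6
Level 2:          A[1 . . 3] = 2 4 7 | A[4 . . 6] = 1 4 6 | A[7 . . 9] = 3 5 7 | A[10 . . 11] = 2 6
Level 1:          A[1 . . 6] = 1 2 4 4 6 7 | A[7 . . 11] = 2 3 5 6 7
A[1 . . 11]:      1 2 2 3 4 4 5 6 6 7 7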
Merge - Pseudocode
Alg.: MERGE(A, p, q, r)
1. Compute n1 and n2
2. Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]
3. L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
4. i ← 1; j ← 1
5. for k ← p to r
6.     do if L[i] ≤ R[j]
7.         then A[k] ← L[i]
8.              i ← i + 1
9.         else A[k] ← R[j]
10.             j ← j + 1
Figure: A[1 . . 8] = 2 4 5 7 1 2 3 6 with p = 1, q = 4, r = 8. L holds A[p . . q] = 2 4 5 7 (n1 elements) and R holds A[q + 1 . . r] = 1 2 3 6 (n2 elements).
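The MERGE steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: indices are 0-based and inclusive rather than 1-based, and `math.inf` plays the role of the sentinel appended to each copy, which assumes the keys are numbers.

```python
import math


def merge(A, p, q, r):
    """Merge sorted A[p..q] and A[q+1..r] (0-based, inclusive) in place."""
    n1 = q - p + 1                          # length of the left run
    n2 = r - q                              # length of the right run
    L = A[p:p + n1] + [math.inf]            # copy plus sentinel
    R = A[q + 1:q + 1 + n2] + [math.inf]    # copy plus sentinel
    i = j = 0
    for k in range(p, r + 1):
        # The sentinels guarantee neither index runs off its list.
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1


A = [2, 4, 5, 7, 1, 2, 3, 6]
merge(A, 0, 3, 7)
print(A)  # [1, 2, 2, 3, 4, 5, 6, 7]
```

The loop body runs exactly n1 + n2 times with constant work per iteration, which is why merging is linear in the number of elements.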
Complexity of Merge Sort
Algorithm:
    void mergesort(int[] a, int left, int right) {
        if (right > left) {
            int middle = left + (right - left) / 2;
            mergesort(a, left, middle);
            mergesort(a, middle + 1, right);
            merge(a, left, middle, right);
        }
    }
Assumption: N is a power of two.
For N = 1: time is a constant (denoted by 1).
Otherwise, time to mergesort N elements = 2 × (time to mergesort N/2 elements) + time to merge two arrays of N/2 elements each.
The time to merge two arrays of N/2 elements each is linear, i.e. N.
Thus we have:
(1) T(1) = 1
(2) T(N) = 2T(N/2) + N
Next we solve this recurrence relation. First we divide (2) by N:
(3) T(N) / N = T(N/2) / (N/2) + 1
N is a power of two, so we can write
(4) T(N/2) / (N/2) = T(N/4) / (N/4) + 1
(5) T(N/4) / (N/4) = T(N/8) / (N/8) + 1
(6) T(N/8) / (N/8) = T(N/16) / (N/16) + 1
(7) ……
(8) T(2) / 2 = T(1) / 1 + 1
Now we add equations (3) through (8). The sum of their left-hand sides equals the sum of their right-hand sides:
T(N) / N + T(N/2) / (N/2) + T(N/4) / (N/4) + … + T(2) / 2 = T(N/2) / (N/2) + T(N/4) / (N/4) + … + T(2) / 2 + T(1) / 1 + log N
(log N, taken base 2, is the sum of the 1s on the right-hand sides: there is one equation per halving of N.)
After cancelling the terms that appear on both sides, we get
(9) T(N) / N = T(1) / 1 + log N
T(1) is 1, hence we obtain
(10) T(N) = N + N log N = O(N log N)
Hence the complexity of the merge sort algorithm is O(N log N).
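As a sanity check on the closed form, the recurrence can be evaluated directly and compared with N + N log N. This is a small sketch: the function name `T` and the tested range of powers of two are choices made for this example.

```python
def T(n):
    """Evaluate T(1) = 1, T(N) = 2*T(N/2) + N for N a power of two."""
    return 1 if n == 1 else 2 * T(n // 2) + n


for k in range(11):                  # N = 1, 2, 4, ..., 1024
    n = 2 ** k
    closed_form = n + n * k          # N + N*log2(N), since log2(N) = k
    assert T(n) == closed_form

print(T(8))  # 32 = 8 + 8*3
```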
Quicksort
• Sort an array A[p…r]
• Divide
– Partition the array A into 2 subarrays A[p..q] and A[q+1..r], such that each element of A[p..q] is smaller than or equal to each element in A[q+1..r]
– Need to find index q to partition the array
Figure: A[p…q] ≤ A[q+1…r]
• Conquer
– Recursively sort A[p..q] and A[q+1..r] using Quicksort
• Combine
– Trivial: the arrays are sorted in place
– No additional work is required to combine them
– The entire array is now sorted
Recurrence
Alg.: QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q)
             QUICKSORT(A, q + 1, r)
Initially: p = 1, r = n
Recurrence: T(n) = T(q) + T(n – q) + n
Partitioning the Array
Alg.: PARTITION(A, p, r)
1. x ← A[p]
2. i ← p – 1
3. j ← r + 1
4. while TRUE
5.     do repeat j ← j – 1
6.            until A[j] ≤ x
7.        repeat i ← i + 1
8.            until A[i] ≥ x
9.        if i < j
10.           then exchange A[i] ↔ A[j]
11.           else return j
Running time: Θ(n), where n = r – p + 1. Each element is visited once!
Figure: A = 5 3 2 6 4 1 3 7. Index i scans right from p and index j scans left from r; when they cross, j = q and A[p…q] ≤ A[q+1…r].
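PARTITION and QUICKSORT together can be sketched in Python like this. It is a sketch under assumptions rather than a definitive implementation: indices are 0-based and inclusive, and each `repeat … until` loop is written as one unconditional step followed by a `while` loop.

```python
def partition(A, p, r):
    """Partition A[p..r] around x = A[p]; return q with A[p..q] <= A[q+1..r]."""
    x = A[p]
    i = p - 1
    j = r + 1
    while True:
        j -= 1                      # repeat j <- j - 1 until A[j] <= x
        while A[j] > x:
            j -= 1
        i += 1                      # repeat i <- i + 1 until A[i] >= x
        while A[i] < x:
            i += 1
        if i < j:
            A[i], A[j] = A[j], A[i]
        else:
            return j


def quicksort(A, p, r):
    """Sort A[p..r] in place, mirroring QUICKSORT(A, p, r)."""
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q)          # note: q, not q - 1, with this partition
        quicksort(A, q + 1, r)


A = [5, 3, 2, 6, 4, 1, 3, 7]
q = partition(A, 0, len(A) - 1)
print(q, A)
quicksort(A, 0, len(A) - 1)
print(A)
```

Because the pivot is the first element and the returned index is always smaller than r, both recursive calls work on strictly smaller ranges, so the recursion terminates.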
Analysis of quicksort
Partition can be done in O(n) time, where n is the size of the array. Let T(n) be the number of comparisons required by Quicksort.
If the pivot ends up at position k, then we have
T(n) = T(n – k) + T(k – 1) + n
To determine best-, worst-, and average-case complexity we need to determine the values of k that correspond to these cases.
Best-Case Complexity
The best case is clearly when the pivot always partitions the array equally. Intuitively, this would lead to a recursive depth of at most lg n calls. We can actually prove this. In this case:
T(n) = T(n/2) + T(n/2) + n = Θ(n lg n)
Best Case Partitioning
• Best-case partitioning
– Partitioning produces two regions of size n/2
• Recurrence: q = n/2
T(n) = 2T(n/2) + Θ(n)
T(n) = Θ(n lg n) (Master theorem)
Average-Case Complexity
The average case is rather complex, but it is where the algorithm earns its name. The bottom line is: T(n) = Θ(n lg n)
Worst Case Partitioning
The worst-case behavior for quicksort occurs when the partitioning routine produces one region with n – 1 elements and one with only 1 element.
Let us assume that this unbalanced partitioning arises at every step of the algorithm.
Since partitioning costs Θ(n) time and T(1) = Θ(1), the recurrence for the running time is
T(n) = T(n – 1) + Θ(n).
To evaluate this recurrence, we observe that T(1) = Θ(1) and then iterate:
T(n) = Θ(n) + Θ(n – 1) + … + Θ(2) + Θ(1) = Θ(n²)
Figure: recursion tree for the worst case. The subproblem sizes run n, n – 1, n – 2, n – 3, …, 2, 1, with a region of size 1 split off at every level. The per-level costs n, n, n – 1, n – 2, …, 3, 2 sum to Θ(n²).
Best case (split in the middle): Θ(n log n)
Worst case (sorted array!): Θ(n²)
Average case (random arrays): Θ(n log n)
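To make the gap between the best and worst cases concrete, the two recurrences can be evaluated side by side. This is a rough sketch: it assumes the Θ(n) partitioning cost is exactly n and that T(1) = 1, and the function names are made up for this example.

```python
def best_case(n):
    """T(n) = 2*T(n/2) + n with T(1) = 1, for n a power of two."""
    return 1 if n == 1 else 2 * best_case(n // 2) + n


def worst_case(n):
    """Iterate T(n) = T(n - 1) + n with T(1) = 1."""
    t = 1
    for m in range(2, n + 1):
        t += m
    return t


for n in (8, 64, 1024):
    print(n, best_case(n), worst_case(n))
```

Under these assumptions the worst case sums to n(n + 1)/2, which grows quadratically, while the best case is n + n lg n.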