Lecture 2: Divide and Conquer I: Merge-Sort and Master Theorem

Lecture 2:Divide and Conquer I:Merge-Sort and Master Theorem

Shang-Hua Teng

Example Problem: Sorting

• Input: Array A[1...n], of elements in arbitrary order; array size nOutput: Array A[1...n] of the same elements, but in the non-decreasing order

Algorithm Design Paradigm I• Solve smaller problems, and use solutions to the

smaller problems to solve larger ones– Incremental (e.g., insertion and selection sort)– Divide and Conquer (Today’s topic)– Dynamic Programming (Later)

• Correctness: mathematical induction• Running Time Analysis: recurrences• Incremental is often a special case of Divide and

Conquer

Divide and Conquer• Divide the problem into a number of

sub-problems (similar to the original problem but smaller);

• Conquer the sub-problems by solving them recursively (if a sub-problem is small enough, just solve it in a straightforward manner.

• Combine the solutions to the sub-problems into the solution for the original problem

Sorting

• Input: Array A[1...n], of elements in arbitrary order; array size nOutput: Array A[1...n] of the same elements, but in the non-decreasing order

Merge Sort• Divide the n-element sequence to be sorted into

two subsequences of n/2 element each• Conquer: Sort the two subsequences

recursively using merge sort• Combine: merge the two sorted subsequences

to produce the sorted answer• Note: during the recursion, if the subsequence

has only one element, then do nothing.

Merge-Sort(A,p,r)A procedure sorts the elements in the sub-array

A[p..r] using divide and conquer

• Merge-Sort(A,p,r)– if p >= r, do nothing– if p< r then

• Merge-Sort(A,p,q)• Merge-Sort(A,q+1,r)• Merge(A,p,q,r)

• Starting by calling Merge-Sort(A,1,n)

2/)( rpq

A = MergeArray(L,R)Assume L[1:s] and R[1:t] are two sorted arrays of elements: Merge-Array(L,R) forms a single

sorted array A[1:s+t] of all elements in L and R.

• A = MergeArray(L,R)– – – for k 1 to s + t

• do if– then– else

1];[][ iiiLkA1];[][ jjjRkA

]1[;]1[ tRsL

][][ jRiL

1;1 ji

Correctness of MergeArray

• Loop-invariant– At the start of each iteration of the for loop, the

subarray A[1:k-1] contains the k-1 smallest elements of L[1:s+1] and R[1:t+1] in sorted order. Moreover, L[i] and R[j] are the smallest elements of their arrays that have not been copied back to A

Inductive Proof of Correctness

• Initialization: (the invariant is true at beginning)

prior to the first iteration of the loop, we have k = 1, so that A[1,k-1] is empty. This empty subarray contains k-1 = 0 smallest elements of L and R and since i = j = 1, L[i] and R[j] are the smallest element of their arrays that have not been copied back to A.


• Maintenance: (the invariant is true after each iteration)

WLOG: assume L[i] <= R[j], the L[i] is the smallest element not yet copied back to A. Hence after copy L[i] to A[k], the subarray A[1..k] contains the k smallest elements. Increasing k and i by 1 reestablishes the loop invariant for the next iteration.


• Termination: (loop invariant implies correctness)

At termination we have k = s+t + 1, by the loop invariant, we have A contains the k-1 (s+t) smallest elements of L and R in sorted order.

Complexity of MergeArray

• At each iteration, we perform 1 comparison, 1 assignment (copy one element to A) and 2 increments (to k and i or j )

• So number of operations per iteration is 4.• Thus, Merge-Array takes at most 4(s+t) time.• Linear in the size of the input.

Merge (A,p,q,r)Assume A[p..q] and A[q+1..r] are two sorted

Merge(A,p,q,r) forms a single sorted array A[p..r].

• Merge (A,p,q,r)– – – –

]1[;]1[ tRsL

;;1 qrtpqs

],1[];..[ rqARqpAL

),(]..[ RLMergeArrayrpA

Merge-Sort(A,p,r)A procedure sorts the elements in the sub-array

A[p..r] using divide and conquer

• Merge-Sort(A,p,r)– if p >= r, do nothing– if p< r then

• Merge-Sort(A,p,q)• Merge-Sort(A,q+1,r)• Merge(A,p,q,r)

2/)( rpq

Running Time of Merge-Sort• Running time as a function of the input size,

that is the number of elements in the array A.• The Divide-and-Conquer scheme yields a

clean recurrences.• Assume T(n) be the running time of merge-

sort for sorting an array of n elements.• For simplicity assume n is a power of 2, that

is, there exists k such that n = 2k .

Recurrence of T(n)

• T(1) = 1• for n > 1, we have

nnTnT 4)2/(2)(

nnTnT

4)2/(21

)(if n = 1

if n > 1

Solution of Recurrence of T(n)

T(n) = 4 nlog n + n

• Picture Proof by Recursion Tree

The substitution method

Asymptotic Notation• As input size grow, how fast the running time grow.

– T1(n) = 100 n

– T2(n) = n2

• Which algorithms is better?• When n < 100 is small then T1 is smaller

• As n becomes larger, T2 grows much faster• To solve ambitious, large-scale problem, algorithm1

is preferred.

Asymptotic Notation(Removing the constant factor)

• TheNotation

(g(n)) = { f(n): there exist positive c1 and c2 and

n0 such that

for all n > n0}

• For example T(n) = 4nlog n + n = (nlog n)

)()()(0 21 ngcnfngc

Asymptotic Notation(Removing the constant factor)

• TheBigNotation

O(g(n)) = { f(n): there exist positive c and

n0 such that

for all n > n0}

• For example T(n) = 4nlog n + n = (nlog n)• But also T(n) = 4nlog n + n = (n2)

)()(0 ncgnf

General Recurrence for Divide-and-Conquer

• If a divide and conquer scheme divides a problem of size n into a sub-problems of size at most n/b. Suppose the time for Divide is D(n) and time for Combination is C(n), then

• How do we bound T(n)?

)()()/(

)1()(

nCnDbnaTnT

if n < c

if n > 1

The Master Theorem

• Consider

• where a >= 1 and b>= 1• we will ignore ceilings and floors (all

absorbed in the O or notation)

)()/(

)1()(

nfbnaTnT

if n < c

if n > 1

The Master Theorem

• If for some constant > 0 then• If then • If for some constant > 0 and

if a f(n/b) <= c f(n) for some constant c < 1 and all sufficiently large n, then T(n) =(f (n))

)()( log abnOnf)()( log abnnT

)()( log abnnf )log()( log nnnT ab)()( log abnOnf

• If T(n) = T(2n/3) + 1 then T(n) = (log n)• If T(n) = 9T(n/3) + n, then T(n) = (n2)• If T(n) = 3T(n/4) + n log n then T(n) =(n log n)

• If T(n) = 2T(n/2) + n log n then T(n) = ??? (not polynomially large!!!) but we can show that T(n) =(n log2 n)

Example:The Master Theorem

Documents

Lecture 2: Divide and Conquer I: Merge-Sort and Master Theorem