Theory of Algorithms: Divide and Conquer James Gain and Edwin Blake {jgain | edwin} @cs.uct.ac.za Department of Computer Science University of Cape Town

Theory of Algorithms:Theory of Algorithms:Divide and ConquerDivide and Conquer

Theory of Algorithms:Theory of Algorithms:Divide and ConquerDivide and Conquer

James Gain and Edwin Blake{jgain | edwin} @cs.uct.ac.za

Department of Computer Science

University of Cape Town

August - October 2004

ObjectivesObjectives

To introduce the divide-and-conquer mind set

To show a variety of divide-and-conquer solutions: Merge Sort

Quick Sort

Multiplication of Large Integers

Strassen’s Matrix Multiplication

Closest Pair and Convex Hull by Divide-and-Conquer

To discuss the strengths and weaknesses of a divide-and-conquer strategy

Divide-and-ConquerDivide-and-Conquer

Best known algorithm design strategy:1. Divide instance of problem into two or more smaller

instances

2. Solve smaller instances recursively

3. Obtain solution to original (larger) instance by combining these solutions

The Master Theorem applies Recurrences are of the form T(n) = aT(n/b) + f (n), a

1, b 2

Silly Example (Addition): a0 + …. + an-1 = (a0 + … + an/2-1) + (an/2 + … an-1)

Divide-and-Conquer IllustratedDivide-and-Conquer Illustrated

SUBPROBLEM 2 OF SIZE n/2

SUBPROBLEM 1 OF SIZE n/2

A SOLUTION TO SUBPROBLEM 1

A SOLUTION TO THEORIGINAL PROBLEM

A SOLUTION TO SUBPROBLEM 2

A PROBLEM OF SIZE n

MergesortMergesort

Algorithm:1. Split A[1..n] in half and copy of each half into arrays B[1.. n/2 ]

and C[1.. n/2 ]

2. Recursively MergeSort arrays B and C

3. Merge sorted arrays B and C into array A

Merging: REPEAT until no elements remain in one of B or C

1. Compare 1st element in the rest of B and C

2. Copy smaller into A, incrementing index of corresponding array

3. Once all elements in one of B or C is processed, copy the remaining unprocessed elements from the other array into A

Mergesort ExampleMergesort Example

7 2 1 6 4

7 2 1 6 4

7 2 1 6

4

7

2

2 7

1 2 7

4 6

1 2 4 6 7

Efficiency of MergesortEfficiency of Mergesort

Recurrence: C(n) = 2 C(n/2) + Cmerge(n) for n > 1, C(1) = 0

Cmerge(n) = n - 1 in the worst case

All cases have same efficiency: ( n log n)

Number of comparisons is close to theoretical minimum for comparison-based sorting: log n! ≈ n lg n - 1.44 n

Space requirement: ( n ) (NOT in-place)

Can be implemented without recursion (bottom-up)

QuicksortQuicksort

Select a pivot (partitioning element)

Rearrange the list into two sublists: All elements positioned before the pivot are ≤ the pivot

Those positioned after the pivot are > the pivot

Requires a pivoting algorithm

Exchange the pivot with the last element in the first sublist The pivot is now in its final position

QuickSort the two sublists

p

A[i]≤p A[i]>p

The Partition AlgorithmThe Partition Algorithm

Quicksort ExampleQuicksort Example

5 3 1 9 8 2 7

5 3 1 9 8 2 7

i j

i j

5 3 1 2 8 9 7 j i

2 3 1 5 8 9 7

2 3 1 8 9 7 i j

2 1 3

5 3 1 2 8 9 7 i j

i j

2 1 3 j i

1 2 3

Recursive Call Quicksort (2 3 1) & Quicksort (8 9 7)

Recursive Call Quicksort (1) & Quicksort (3)

i j

8 7 9 i j

8 7 9 j i

7 8 9

Recursive Call Quicksort (7) & Quicksort (9)

Worst Case Efficiency of QuicksortWorst Case Efficiency of Quicksort

In the worst case all splits are completely skewed

For instance, an already sorted list! One subarray is empty, other reduced by only one:

Make n+1 comparisons Exchange pivot with itself Quicksort left = Ø, right = A[1..n-1] Cworst = (n+1) + n + … + 3 = (n+1)(n+2)/2 - 3 = (n2)

p

A[i]≤p A[i]>p

General Efficiency of QuicksortGeneral Efficiency of Quicksort

Efficiency Cases: Best: split in the middle — ( n log n) Worst: sorted array! — ( n2) Average: random arrays — ( n log n)

Improvements (in combination 20-25% faster): Better pivot selection: median of three partitioning

avoids worst case in sorted files Switch to Insertion sort on small subfiles Elimination of recursion

Considered the method of choice for internal sorting for large files (n ≥ 10000)

QuickHull AlgorithmQuickHull Algorithm

Inspired by Quicksort

1. Sort points by increasing x-coordinate values

2. Identify leftmost and rightmost extreme points P1 and P2 (part of hull)

3. Compute upper hull: Find point Pmax that is farthest away from line P1P2

Quickhull the points to the left of line P1Pmax

Quickhull the points to the left of line PmaxP2

4. Similarly compute lower hull

P1

Pmax P2

L

L

Finding the Furthest PointFinding the Furthest Point

Given three points in the plane p1 , p2 , p3

Area of Triangle = p1 p2 p3 = 1/2 D

D =

D = x1 y1 + x3 y1 + x2 y3 - x3 y2 - x2 y1 - x1 y3

Properties of D : Positive iff p3 is to the left of p1p2

Correlates with distance of p3 from p1p2

x1 y1 1

x2 y2 1

x3 y3 1

Efficiency of QuickhullEfficiency of Quickhull

Finding point farthest away from line P1P2 is linear in the number of points

This gives same efficiency as quicksort: Worst case: (n2)

Average case: (n log n)

If an initial sort is required, this can be accomplished in ( n log n) — no increase in asymptotic efficiency class

Alternative Divide-and-Conquer Convex Hull: Graham’s scan and DCHull

Also ( n log n) but with lower coefficients

Strassen’s Matrix MultiplicationStrassen’s Matrix Multiplication

Strassen observed [1969] that the product of two matrices can be computed as follows:

Op Count: Each of M1, …, M7 rrequires 1 mult and 1 or 2 add/sub Total = 7 mul and 18 add/sub Compared with brute force which requires 8 mults and 4 add/sub

n/2 n/2 submatrices are computed recursively by the same method

C00 C01

C10 C11

A00 A01

A10 A11

B00 B01

B10 B11

M1 + M4 -M5+M7 M3+ M5

M2 + M4 M1 + M3 - M2 + M6

=

= *

Efficiency of Strassen’s AlgorithmEfficiency of Strassen’s Algorithm

If n is not a power of 2, matrices can be padded with zeros

Number of multiplications: M(n) = 7 M(n/2) for n > 1, M(1) = 1

Set n = 2k, apply backward substitution

M(2k) = 7M(2k-1) = 7 [7 M(2k-2)] = 72 M(2k-2) = … = 7k

M(n) = 7log n = nlog 7 ≈ n2.807

Number of additions: A(n) (nlog 7)

Other algorithms closer to the lower limit of n2 multiplications, but are even more complicated

Strengths and Weaknesses of Strengths and Weaknesses of Divide-and-ConquerDivide-and-Conquer

Strengths: Generally improves on Brute Force by one base

efficiency class

Easy to analyse using the Master Theorem

Weaknesses: Often requires recursion, which introduces overheads

Can be inapplicable and inferior to simpler algorithmic solutions

Documents

Theory of Algorithms: Divide and Conquer James Gain and Edwin Blake {jgain | edwin} @cs.uct.ac.za Department of Computer Science University of Cape Town