47
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.1 Introduction to Algorithms 6.046J/18.401J LECTURE 4 Quicksort Divide and conquer Partitioning Worst-case analysis Intuition Randomized quicksort Analysis Prof. Charles E. Leiserson

Cormen Algo-lec4

Embed Size (px)

Citation preview

Page 1: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.1

Introduction to Algorithms6.046J/18.401J

LECTURE 4Quicksort• Divide and conquer• Partitioning• Worst-case analysis• Intuition • Randomized quicksort• Analysis

Prof. Charles E. Leiserson

Page 2: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.2

Quicksort

• Proposed by C.A.R. Hoare in 1962.• Divide-and-conquer algorithm.• Sorts “in place” (like insertion sort, but not

like merge sort).• Very practical (with tuning).

Page 3: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.3

Divide and conquerQuicksort an n-element array:1. Divide: Partition the array into two subarrays

around a pivot x such that elements in lower subarray ≤ x ≤ elements in upper subarray.

2. Conquer: Recursively sort the two subarrays.3. Combine: Trivial.

≤ x≤ x xx ≥ x≥ x

Key: Linear-time partitioning subroutine.

Page 4: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.4

Partitioning subroutine

Running time= O(n) for nelements.

Running time= O(n) for nelements.

PARTITION(A, p, q) ⊳ A[p . . q] x ← A[p] ⊳ pivot = A[p]i ← pfor j ← p + 1 to q

do if A[ j] ≤ xthen i ← i + 1

exchange A[i] ↔ A[ j]exchange A[p] ↔ A[i]return i

Invariant: xx ≤ x≤ x ≥ x≥ x ??qjp i

Page 5: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.5

Example of partitioning

i j66 1010 1313 55 88 33 22 1111

Page 6: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.6

Example of partitioning

i j66 1010 1313 55 88 33 22 1111

Page 7: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.7

Example of partitioning

i j66 1010 1313 55 88 33 22 1111

Page 8: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.8

Example of partitioning

66 1010 1313 55 88 33 22 1111

66i j55 1313 1010 88 33 22 1111

Page 9: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.9

Example of partitioning

66 1010 1313 55 88 33 22 1111

66i j55 1313 1010 88 33 22 1111

Page 10: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.10

Example of partitioning

66 1010 1313 55 88 33 22 1111

66i j55 1313 1010 88 33 22 1111

Page 11: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.11

Example of partitioning

66 1010 1313 55 88 33 22 1111

66 55 1313 1010 88 33 22 1111

66 55i j33 1010 88 1313 22 1111

Page 12: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.12

Example of partitioning

66 1010 1313 55 88 33 22 1111

66 55 1313 1010 88 33 22 1111

66 55i j33 1010 88 1313 22 1111

Page 13: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.13

Example of partitioning

66 1010 1313 55 88 33 22 1111

66 55 1313 1010 88 33 22 1111

66 55 33 1010 88 1313 22 1111

66 55 33i j22 88 1313 1010 1111

Page 14: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.14

Example of partitioning

66 1010 1313 55 88 33 22 1111

66 55 1313 1010 88 33 22 1111

66 55 33 1010 88 1313 22 1111

66 55 33i j22 88 1313 1010 1111

Page 15: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.15

Example of partitioning

66 1010 1313 55 88 33 22 1111

66 55 1313 1010 88 33 22 1111

66 55 33 1010 88 1313 22 1111

66 55 33i j22 88 1313 1010 1111

Page 16: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.16

Example of partitioning

66 1010 1313 55 88 33 22 1111

66 55 1313 1010 88 33 22 1111

66 55 33 1010 88 1313 22 1111

66 55 33 22 88 1313 1010 1111

22 55 33i66 88 1313 1010 1111

Page 17: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.17

Pseudocode for quicksortQUICKSORT(A, p, r)

if p < rthen q ← PARTITION(A, p, r)

QUICKSORT(A, p, q–1)QUICKSORT(A, q+1, r)

Initial call: QUICKSORT(A, 1, n)

Page 18: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.18

Analysis of quicksort

• Assume all input elements are distinct.• In practice, there are better partitioning

algorithms for when duplicate input elements may exist.

• Let T(n) = worst-case running time on an array of n elements.

Page 19: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.19

Worst-case of quicksort

• Input sorted or reverse sorted.• Partition around min or max element.• One side of partition always has no elements.

)()()1(

)()1()1()()1()0()(

2nnnT

nnTnnTTnT

Θ=

Θ+−=Θ+−+Θ=Θ+−+=

(arithmetic series)

Page 20: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.20

Worst-case recursion treeT(n) = T(0) + T(n–1) + cn

Page 21: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.21

Worst-case recursion treeT(n) = T(0) + T(n–1) + cn

T )(n

Page 22: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.22

Worst-case recursion treeT(n) = T(0) + T(n–1) + cn

cnT(0) T(n–1)

Page 23: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.23

Worst-case recursion treeT(n) = T(0) + T(n–1) + cn

cnT(0) c(n–1)

T(0) T(n–2)

Page 24: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.24

Worst-case recursion treeT(n) = T(0) + T(n–1) + cn

cnT(0) c(n–1)

T(0) c(n–2)

T(0)

Θ(1)

O

Page 25: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.25

Worst-case recursion tree

cnT(0) c(n–1)

T(n) = T(0) + T(n–1) + cn

T(0) c(n–2)

T(0)

Θ(1)

O

( )2

1nk

n

kΘ=⎟⎟

⎞⎜⎜⎝

⎛Θ ∑

=

Page 26: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.26

Worst-case recursion tree

cnΘ(1) c(n–1)

T(n) = T(0) + T(n–1) + cn

Θ(1) c(n–2)

Θ(1)

Θ(1)

O

( )2

1nk

n

kΘ=⎟⎟

⎞⎜⎜⎝

⎛Θ ∑

=

T(n) = Θ(n) + Θ(n2)= Θ(n2)

h = n

Page 27: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.27

Best-case analysis(For intuition only!)

If we’re lucky, PARTITION splits the array evenly:T(n) = 2T(n/2) + Θ(n)

= Θ(n lg n) (same as merge sort)

What if the split is always 109

101 : ?

( ) ( ) )()( 109

101 nnTnTnT Θ++=

What is the solution to this recurrence?

Page 28: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.28

Analysis of “almost-best” case)(nT

Page 29: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.29

Analysis of “almost-best” casecn

( )nT 101 ( )nT 10

9

Page 30: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.30

Analysis of “almost-best” casecn

cn101 cn10

9

( )nT 1001 ( )nT 100

9 ( )nT 1009 ( )nT 100

81

Page 31: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.31

Analysis of “almost-best” casecn

cn101 cn10

9

cn1001 cn100

9 cn1009 cn100

81

Θ(1)

Θ(1)

… …log10/9n

cn

cn

cn

…O(n) leavesO(n) leaves

Page 32: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.32

Analysis of “almost-best” case

log10n

cn

cn101 cn10

9

cn1001 cn100

9 cn1009 cn100

81

Θ(1)

Θ(1)

… …log10/9n

cn

cn

cn

T(n) ≤ cn log10/9n + Ο(n)cn log10n ≤

O(n) leavesO(n) leaves

Θ(n lg n)Lucky!

Page 33: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.33

More intuitionSuppose we alternate lucky, unlucky, lucky, unlucky, lucky, ….

L(n) = 2U(n/2) + Θ(n) luckyU(n) = L(n – 1) + Θ(n) unlucky

Solving:L(n) = 2(L(n/2 – 1) + Θ(n/2)) + Θ(n)

= 2L(n/2 – 1) + Θ(n)= Θ(n lg n)

How can we make sure we are usually lucky?Lucky!

Page 34: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.34

Randomized quicksortIDEA: Partition around a random element.• Running time is independent of the input

order.• No assumptions need to be made about

the input distribution.• No specific input elicits the worst-case

behavior.• The worst case is determined only by the

output of a random-number generator.

Page 35: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.35

Randomized quicksort analysis

Let T(n) = the random variable for the running time of randomized quicksort on an input of size n, assuming random numbers are independent.For k = 0, 1, …, n–1, define the indicator random variable

Xk =1 if PARTITION generates a k : n–k–1 split,0 otherwise.

E[Xk] = Pr{Xk = 1} = 1/n, since all splits are equally likely, assuming elements are distinct.

Page 36: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.36

Analysis (continued)

T(0) + T(n–1) + Θ(n) if 0 : n–1 split,T(1) + T(n–2) + Θ(n) if 1 : n–2 split,

MT(n–1) + T(0) + Θ(n) if n–1 : 0 split,

T(n) =

( )∑−

=

Θ+−−+=1

0)()1()(

n

kk nknTkTX

Page 37: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.37

Calculating expectation( )⎥

⎤⎢⎣

⎡Θ+−−+= ∑

=

1

0)()1()()]([

n

kk nknTkTXEnTE

Take expectations of both sides.

Page 38: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.38

Calculating expectation( )

( )[ ]∑

∑−

=

=

Θ+−−+=

⎥⎦

⎤⎢⎣

⎡Θ+−−+=

1

0

1

0

)()1()(

)()1()()]([

n

kk

n

kk

nknTkTXE

nknTkTXEnTE

Linearity of expectation.

Page 39: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.39

Calculating expectation( )

( )[ ]

[ ] [ ]∑

=

=

=

Θ+−−+⋅=

Θ+−−+=

⎥⎦

⎤⎢⎣

⎡Θ+−−+=

1

0

1

0

1

0

)()1()(

)()1()(

)()1()()]([

n

kk

n

kk

n

kk

nknTkTEXE

nknTkTXE

nknTkTXEnTE

Independence of Xk from other random choices.

Page 40: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.40

Calculating expectation( )

( )[ ]

[ ] [ ]

[ ] [ ] ∑∑∑

=

=

=

=

=

=

Θ+−−+=

Θ+−−+⋅=

Θ+−−+=

⎥⎦

⎤⎢⎣

⎡Θ+−−+=

1

0

1

0

1

0

1

0

1

0

1

0

)(1)1(1)(1

)()1()(

)()1()(

)()1()()]([

n

k

n

k

n

k

n

kk

n

kk

n

kk

nn

knTEn

kTEn

nknTkTEXE

nknTkTXE

nknTkTXEnTE

Linearity of expectation; E[Xk] = 1/n .

Page 41: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.41

Calculating expectation( )

( )[ ]

[ ] [ ]

[ ] [ ]

[ ] )()(2

)(1)1(1)(1

)()1()(

)()1()(

)()1()()]([

1

1

1

0

1

0

1

0

1

0

1

0

1

0

nkTEn

nn

knTEn

kTEn

nknTkTEXE

nknTkTXE

nknTkTXEnTE

n

k

n

k

n

k

n

k

n

kk

n

kk

n

kk

Θ+=

Θ+−−+=

Θ+−−+⋅=

Θ+−−+=

⎥⎦

⎤⎢⎣

⎡Θ+−−+=

∑∑∑

=

=

=

=

=

=

=

Summations have identical terms.

Page 42: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.42

Hairy recurrence

[ ] )()(2)]([1

2nkTE

nnTE

n

kΘ+= ∑

=

(The k = 0, 1 terms can be absorbed in the Θ(n).)

Prove: E[T(n)] ≤ an lg n for constant a > 0 .• Choose a large enough so that an lg n

dominates E[T(n)] for sufficiently small n ≥ 2.

Use fact: 21

2812

21 lglg nnnkk

n

k∑−

=−≤ (exercise).

Page 43: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.43

Substitution method

[ ] )(lg2)(1

2nkak

nnTE

n

kΘ+≤ ∑

=

Substitute inductive hypothesis.

Page 44: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.44

Substitution method

[ ]

)(81lg

212

)(lg2)(

22

1

2

nnnnna

nkakn

nTEn

k

Θ+⎟⎠⎞⎜

⎝⎛ −≤

Θ+≤ ∑−

=

Use fact.

Page 45: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.45

Substitution method

[ ]

⎟⎠⎞⎜

⎝⎛ Θ−−=

Θ+⎟⎠⎞⎜

⎝⎛ −≤

Θ+≤ ∑−

=

)(4

lg

)(81lg

212

)(lg2)(

22

1

2

nannan

nnnnna

nkakn

nTEn

k

Express as desired – residual.

Page 46: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.46

Substitution method

[ ]

nan

nannan

nnnnna

nkakn

nTEn

k

lg

)(4

lg

)(81lg

212

)(lg2)(

22

1

2

⎟⎠⎞⎜

⎝⎛ Θ−−=

Θ+⎟⎠⎞⎜

⎝⎛ −=

Θ+≤ ∑−

=

,if a is chosen large enough so that an/4 dominates the Θ(n).

Page 47: Cormen Algo-lec4

September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.47

Quicksort in practice

• Quicksort is a great general-purpose sorting algorithm.

• Quicksort is typically over twice as fast as merge sort.

• Quicksort can benefit substantially from code tuning.

• Quicksort behaves well even with caching and virtual memory.