
Randomized algorithms
Inge Li Gørtz

Randomized algorithms

• Today
  • What are randomized algorithms?
  • Properties of randomized algorithms
  • Three examples:
    • Median/Select
    • Quicksort
    • Closest pair of points

Randomized Algorithms

Randomized Algorithms

• So far we dealt with deterministic algorithms:
  • Giving the same input to the algorithm repeatedly results in:
    • The same running time.
    • The same output.
• Randomized algorithm:
  • Can make random choices (using a random number generator).


Random number generator

• A randomized algorithm has access to a random number generator rand(a, b), where a, b are integers, a < b.
• rand(a, b) returns a number x ∈ {a, a + 1, …, b − 1, b}.
• Each of the b − a + 1 numbers is equally likely.
• Calling rand(a, b) again results in a new, possibly different number.
• rand(a, b) “has no memory”: the current number does not depend on the results of previous calls.
• Statistically speaking: rand(a, b) generates numbers independently according to the uniform distribution.
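The generator described above can be sketched in Python. This is only an illustration: the name `rand` mirrors the slides, and it is assumed here that Python's pseudo-random `random.randint`, which includes both endpoints, is an acceptable stand-in for a true uniform source.

```python
import random

def rand(a: int, b: int) -> int:
    """Return an integer drawn uniformly from {a, a+1, ..., b} (both ends included)."""
    if a >= b:
        raise ValueError("rand(a, b) requires a < b")
    # random.randint includes both endpoints, matching the slide's definition.
    return random.randint(a, b)

# Successive calls are independent draws, so repeated values are possible.
print([rand(1, 3) for _ in range(10)])
```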

Randomized algorithms

• The algorithm can make random decisions using rand.

Running
  while (rand(0,1) = 0) do
    print(“running”);
  end while
  print(“stopped”);

Julefrokost
  while (rand(1,4) = rand(1,4)) do
    have another Christmas beer;
  end while
  go home;

Properties of randomized algorithms

• A randomized algorithm might never stop.
• Example: Find the position of 7 in a three-element array containing the numbers 4, 3 and 7.

Find7(A[1…3])
  goon := true;
  while (goon) do
    i := rand(1,3);
    if (A[i] = 7) then
      print(Bingo: 7 found at position i);
      goon := false;
    end if
  end while

Properties of randomized algorithms

• A randomized algorithm might find the wrong solution.
• Example: Find the minimum element in a three-element array.
• If A = [5,6,4] and i = 2 and j = 1 then the algorithm will wrongly return 5.

MinOfThree(A[1…3])
  i := rand(1,3);
  j := rand(1,3);
  return(min(A[i],A[j]));


Properties of randomized algorithms

• Neither example is as bad as it looks.
• It is highly unlikely that Find7 runs for a long time.
• It is more likely that MinOfThree returns a correct result than a wrong one.

Analysis of Randomized Algorithms

Analysis of MinOfThree

• Running time: O(1).
• What is the chance that the answer of MinOfThree is correct?
  • 9 possible pairs (i, j), all with probability 1/9:
    • (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)
  • Assume wlog that A[1] is the minimum:
    • 5 of the 9 pairs contain the index 1 and give the correct minimum.
    • P[correct] = 5/9 > 1/2.
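The counting argument can be checked by brute force. The sketch below enumerates all index pairs and reports the exact success probability as a fraction; the function name is illustrative and indices are 0-based here, unlike the 1-based slides.

```python
from fractions import Fraction
from itertools import product

def min_of_three_success_probability(A):
    """Exact probability that min(A[i], A[j]) equals min(A) for independent uniform i, j."""
    true_min = min(A)
    pairs = list(product(range(len(A)), repeat=2))  # all 9 index pairs for len(A) == 3
    good = sum(1 for i, j in pairs if min(A[i], A[j]) == true_min)
    return Fraction(good, len(pairs))

print(min_of_three_success_probability([5, 6, 4]))  # 5/9
```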

Analysis of MinOfThree

• Running MinOfThree once gives the correct answer with probability 5/9 and the incorrect one with probability 4/9.
• Idea: Run MinOfThree many times and pick the smallest value returned.
• The probability that MinOfThree fails every time in k independent runs is $(4/9)^k$.
• To be 99% sure to find the minimum, choose k such that $(4/9)^k \le 0.01$, i.e. $k \ge \ln(0.01)/\ln(4/9) \approx 5.68$, so k = 6 runs suffice.
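A minimal Python sketch of the repetition idea follows. The helper names (`repeated_min_of_three`, `runs_needed`) are made up for illustration, and the per-run failure probability 4/9 is taken from the analysis of a three-element array with distinct entries.

```python
import math
import random

def min_of_three(A):
    """One run of MinOfThree on a 3-element list (0-based indices)."""
    i = random.randint(0, 2)
    j = random.randint(0, 2)
    return min(A[i], A[j])

def repeated_min_of_three(A, k):
    """Run MinOfThree k times independently and keep the smallest value seen."""
    return min(min_of_three(A) for _ in range(k))

def runs_needed(confidence):
    """Smallest k with (4/9)**k <= 1 - confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(4 / 9))

print(runs_needed(0.99))                    # 6
print(repeated_min_of_three([5, 6, 4], 6))  # 4 with probability above 99%
```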


Analysis of Find7

• Running time is variable. What is the expected running time?
• Assume wlog that A[1] = 7.
• If rand(1, 3) = 1 then the while-loop is terminated.
  • P[rand(1,3) = 1] = 1/3
• If rand(1, 3) ≠ 1 then the while-loop is not terminated.
  • P[rand(1,3) ≠ 1] = 2/3
• The while-loop is terminated after one iteration with probability P[rand(1,3) = 1] = 1/3.

Analysis of Find7

• The while-loop stops after exactly two iterations if
  • rand(1, 3) ≠ 1 in the first iteration.
  • rand(1, 3) = 1 in the second iteration.
• This happens with probability 2/3 and 1/3, resp.
• The probability for both to happen is, by independence, $\frac{2}{3} \cdot \frac{1}{3} = \frac{2}{9}$.

Analysis of Find7

• The while-loop stops after exactly three iterations if
  • rand(1, 3) ≠ 1 in the first two iterations.
  • rand(1, 3) = 1 in the third iteration.
• This happens with probability 2/3, 2/3 and 1/3, resp.
• The probability for all three to happen is, by independence, $\left(\frac{2}{3}\right)^2 \cdot \frac{1}{3} = \frac{4}{27}$.
• The while-loop stops after exactly k iterations if
  • rand(1, 3) ≠ 1 in the first k−1 iterations.
  • rand(1, 3) = 1 in the kth iteration.
• This happens with probability 2/3 for each of the first k−1 iterations and 1/3 for the kth.
• The probability for all k events to happen is, by independence, $\left(\frac{2}{3}\right)^{k-1} \cdot \frac{1}{3}$.

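Expected running time

• We have
  $\sum_{k=0}^{\infty} k \cdot x^k = \frac{x}{(1-x)^2}$ for $|x| < 1$.
• The expected running time of Find7 is
  $\sum_{k=1}^{\infty} k \cdot \left(\frac{2}{3}\right)^{k-1} \cdot \frac{1}{3} = \frac{1}{2} \sum_{k=1}^{\infty} k \cdot \left(\frac{2}{3}\right)^{k} = \frac{1}{2} \cdot \frac{2/3}{(1/3)^2} = 3$ iterations.
• Idea: Stop a Las Vegas algorithm if it runs “much longer” than the expected time and restart it.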
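The number of iterations of Find7 is geometrically distributed with success probability 1/3, so its mean is 3. A small simulation can illustrate this; the sketch below is one possible Python rendering of Find7, with the iteration counter and the `average_iterations` helper added purely for the experiment.

```python
import random

def find7(A):
    """Las Vegas search: probe random positions until 7 is found.

    Returns (position, iterations); the position is 1-based as on the slides.
    Loops forever if A contains no 7.
    """
    iterations = 0
    while True:
        iterations += 1
        i = random.randint(1, len(A))
        if A[i - 1] == 7:
            return i, iterations

def average_iterations(trials=100_000):
    """Empirical mean number of loop iterations over many independent runs."""
    total = sum(find7([4, 3, 7])[1] for _ in range(trials))
    return total / trials

print(average_iterations())  # close to 3
```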


Types of randomized algorithms

• We have seen two kinds of algorithms:
  • Monte Carlo algorithms: stop after a fixed (polynomial) time and give the correct answer with probability greater than 50%.
  • Las Vegas algorithms: have variable running time but always give the correct answer.

Randomized algorithms

• Analyse the expected number of times “running” is printed by the algorithm Running.
• Analyse the expected number of beers you get if you follow the algorithm Julefrokost.
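A hand-derived answer to either exercise can be sanity-checked by simulation. The following Python sketch is one way to do that; the function names are illustrative, and `random.randint` is assumed as the stand-in for rand.

```python
import random

def running_prints():
    """Number of times Running prints "running" in one execution."""
    count = 0
    while random.randint(0, 1) == 0:
        count += 1
    return count

def julefrokost_beers():
    """Number of beers consumed in one execution of Julefrokost."""
    beers = 0
    while random.randint(1, 4) == random.randint(1, 4):
        beers += 1
    return beers

def estimate(experiment, trials=100_000):
    """Average outcome of an experiment over many independent trials."""
    return sum(experiment() for _ in range(trials)) / trials

print(estimate(running_prints))
print(estimate(julefrokost_beers))
```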

Summations

$\sum_{k=0}^{\infty} k \cdot x^k = \frac{x}{(1-x)^2}$ for $|x| < 1$.

$\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$ for $|x| < 1$.

$\sum_{k=0}^{\infty} k \cdot x^{k-1} = \frac{1}{(1-x)^2}$ for $|x| < 1$.
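For reference, the two identities involving k follow from the geometric series by differentiating term by term, which is valid for $|x| < 1$. A short derivation:

```latex
\frac{d}{dx}\sum_{k=0}^{\infty} x^{k}
  = \sum_{k=0}^{\infty} k\,x^{k-1}
  = \frac{d}{dx}\,\frac{1}{1-x}
  = \frac{1}{(1-x)^{2}},
\qquad
\sum_{k=0}^{\infty} k\,x^{k}
  = x \sum_{k=0}^{\infty} k\,x^{k-1}
  = \frac{x}{(1-x)^{2}}.
```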

Median/Select

Select

• Given n numbers S = {a1, a2, …, an}.
• Median: number that is in the middle position if in sorted order.
• Select(S,k): Return the kth smallest number in S.
  • Min(S) = Select(S,1), Max(S) = Select(S,n), Median = Select(S,n/2).
• Assume the numbers are distinct.

Select(S, k) {
  Choose a pivot s ∈ S uniformly at random.
  For each element e in S
    if e < s put e in S’
    if e > s put e in S’’
  if |S’| = k-1 then return s
  if |S’| ≥ k then call Select(S’, k)
  if |S’| < k then call Select(S’’, k - |S’| - 1)
}
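The pseudocode translates almost line by line into Python. The sketch below assumes the input is a list of distinct numbers and that k is 1-based; the variable names `smaller` and `larger` stand for S’ and S’’, and the bounds check is an addition not present on the slide.

```python
import random

def select(S, k):
    """Return the k-th smallest element (k is 1-based) of a list of distinct numbers."""
    if not 1 <= k <= len(S):
        raise ValueError("k must satisfy 1 <= k <= len(S)")
    s = random.choice(S)                 # pivot chosen uniformly at random
    smaller = [e for e in S if e < s]    # S'
    larger = [e for e in S if e > s]     # S''
    if len(smaller) == k - 1:
        return s
    if len(smaller) >= k:
        return select(smaller, k)
    return select(larger, k - len(smaller) - 1)

data = [9, 2, 7, 4, 1, 8, 3]
print(select(data, 1), select(data, 4), select(data, 7))  # 1 4 9
```

Building new lists in each call keeps the code close to the slide; an in-place partition would use less memory but obscures the correspondence.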

Select

• Worst case running time:
  $T(n) = cn + c(n-1) + c(n-2) + \cdots = \Theta(n^2)$.
• If there is at least an $\varepsilon$ fraction of elements both larger and smaller than s:
  $T(n) = cn + (1-\varepsilon)cn + (1-\varepsilon)^2 cn + \cdots = \left(1 + (1-\varepsilon) + (1-\varepsilon)^2 + \cdots\right) cn \le cn/\varepsilon$.
• Limit number of bad pivots.
• Intuition: A fairly large fraction of elements are “well-centered” => random pivot likely to be good.

Select

• Phase j: Size of set at most $n(3/4)^j$ and more than $n(3/4)^{j+1}$.
• Element is central if at least a quarter of the elements in the current call are smaller and at least a quarter are larger.
• At least half the elements are central.
• Pivot central => size of set shrinks by at least a factor 3/4 => current phase ends.
• Pr[s is central] ≥ 1/2.
• Expected number of iterations before a central pivot is found is at most 2 => expected number of iterations in phase j at most 2.
• X: random variable equal to number of steps taken by algorithm.
• Xj: number of steps taken in phase j.
• X = X0 + X1 + X2 + …
• Number of steps in one iteration in phase j is at most $cn(3/4)^j$.
• $E[X_j] \le 2cn(3/4)^j$.
• Expected running time:
  $E[X] = \sum_j E[X_j] \le \sum_j 2cn\left(\frac{3}{4}\right)^j = 2cn \sum_j \left(\frac{3}{4}\right)^j \le 8cn.$

Quicksort

Quicksort

• Given n numbers S = {a1, a2, …, an} return the sorted list.
• Assume the numbers are distinct.

Quicksort(S) {
  if |S| ≤ 1 return S
  else
    Choose a pivot s ∈ S uniformly at random.
    For each element e in S
      if e < s put e in S’
      if e > s put e in S’’
    L = Quicksort(S’)
    R = Quicksort(S’’)
    Return the sorted list L◦s◦R.
}
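A direct Python rendering of this pseudocode might look as follows. It is a sketch that assumes distinct numbers in a list and returns a new sorted list rather than sorting in place.

```python
import random

def quicksort(S):
    """Randomized quicksort on a list of distinct numbers; returns a new sorted list."""
    if len(S) <= 1:
        return list(S)
    s = random.choice(S)                 # pivot chosen uniformly at random
    smaller = [e for e in S if e < s]    # S'
    larger = [e for e in S if e > s]     # S''
    return quicksort(smaller) + [s] + quicksort(larger)

print(quicksort([9, 2, 7, 4, 1, 8, 3]))  # [1, 2, 3, 4, 7, 8, 9]
```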

Quicksort: Analysis

• Worst case Quicksort requires Ω(n²) comparisons: if pivot is the smallest element in the list in each recursive call.
• If pivot always is the median then T(n) = O(n log n).
• For i < j: random variable
  $X_{ij} = \begin{cases} 1 & \text{if } a_i \text{ and } a_j \text{ are compared by the algorithm} \\ 0 & \text{otherwise} \end{cases}$
• X total number of comparisons:
  $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$
• Expected number of comparisons (by linearity of expectation):
  $E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}]$

Quicksort: Analysis

• Expected number of comparisons:
  $E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}]$
• Since Xij only takes values 0 and 1:
  $E[X_{ij}] = \Pr[X_{ij} = 1]$
• ai and aj compared iff ai or aj is the first pivot chosen from Zij = {ai,…,aj} (with the elements indexed in sorted order, a1 < a2 < … < an).
• Pivot chosen independently uniformly at random => all elements from Zij equally likely to be chosen as first pivot from this set.
• We have
  $\Pr[X_{ij} = 1] = 2/(j - i + 1)$
• Thus
  $E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr[X_{ij} = 1] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j - i + 1} = \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k} < \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = \sum_{i=1}^{n-1} O(\log n) = O(n \log n)$
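The double sum can be evaluated exactly and compared against the comparison count observed in actual runs. In the sketch below, a "comparison" is counted once per non-pivot element per partition step, matching the analysis; both function names are illustrative.

```python
import random

def expected_comparisons(n):
    """Exact value of the sum over i < j of 2/(j - i + 1) for n distinct elements."""
    # A gap d = j - i occurs for (n - d) pairs, each contributing 2/(d + 1).
    return sum((n - d) * 2 / (d + 1) for d in range(1, n))

def count_comparisons(S):
    """Element-versus-pivot comparisons made by one run of randomized quicksort."""
    if len(S) <= 1:
        return 0
    s = random.choice(S)
    smaller = [e for e in S if e < s]
    larger = [e for e in S if e > s]
    # Every non-pivot element is compared with the pivot once in this step.
    return (len(S) - 1) + count_comparisons(smaller) + count_comparisons(larger)

n = 200
runs = 300
average = sum(count_comparisons(list(range(n))) for _ in range(runs)) / runs
print(expected_comparisons(n), average)  # the two values should be close
```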

Closest pair of points
A randomized algorithm

Closest Pair of Points

• Closest pair of points. Given n points in the plane, find a pair with smallest Euclidean distance between them.
• Fundamental geometric primitive.
  • Graphics, computer vision, geographic information systems, molecular modeling, air traffic control.
  • Special case of nearest neighbor, Euclidean MST, Voronoi diagrams.
• Brute force. Compare all pairs => O(n²) time.
• 1-D version. Sort and scan => O(n log n) time.
• Simplifying assumption. No two points coincide (for a simpler presentation).

Thank you to Kevin Wayne for inspiration for the slides.

Randomized algorithm

• Assume wlog that points are in the unit square.
• Order the points uniformly at random: p1, p2, …, pn.
• Let δ = d(p1,p2). Check for each point pi (in order) if there exists a point pj, j<i, such that d(pi,pj) < δ.
• If such a point is found, update δ.

[Figure: numbered points in the unit square, with the successive distances δ1 and δ2 marked.]

How to check a point

[Figure: numbered points in the unit square, overlaid with a grid of subsquares of side length δ/2; δ marked.]

• δ current smallest distance. Divide unit square into subsquares with side lengths δ/2.

• If two points i and j are in the same subsquare then d(i,j) < δ (the diagonal of a subsquare has length δ/√2 < δ).

• If d(i,j) < δ then j is in the 5x5 grid of subsquares around i.

Closest Pair of Points: Randomized algorithm

• Use hashtable to store which square a point is in. Only store points already looked at (red points).

[Figure: numbered points in the unit square, with the distances δ1 and δ2 marked.]

• When starting a new round (δ has changed): rehash all points from 1…i.

[Figure: numbered points in the unit square, with the distances δ2 and δ3 marked.]



Closest Pair of Points: Analysis

• Number of lookup operations: at most 25 per point => O(n) total.
• Number of distance calculations: at most 25 per point => O(n) total.
• Number of MakeDictionary operations: O(n).
• Number of insertions:
  • Random variable X = number of insertions.
  • Random variable
    $X_i = \begin{cases} 1 & \text{if point } i \text{ causes } \delta \text{ to change} \\ 0 & \text{otherwise} \end{cases}$
  • Pr[Xi = 1] ≤ 2/i
  • Expected number of insertions:
    $E[X] = n + \sum_{i=1}^{n} i \cdot E[X_i] = n + \sum_{i=1}^{n} i \cdot \Pr[X_i = 1] \le n + \sum_{i=1}^{n} i \cdot \frac{2}{i} = n + 2n = 3n$
• Use hashtable as dictionary: Expected O(n) time in total.
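The whole procedure can be sketched in Python as below. This is a minimal illustration rather than a tuned implementation, under several assumptions: points are distinct `(x, y)` tuples, a built-in `dict` plays the role of the hashtable, `math.dist` requires Python 3.8 or later, and the helper names `cell` and `make_dictionary` are invented here. Cells are computed by floor division, so the unit-square assumption is not strictly needed.

```python
import math
import random

def closest_pair(points):
    """Return (distance, p, q) for a closest pair among distinct 2-D points."""
    pts = list(points)
    if len(pts) < 2:
        raise ValueError("need at least two points")
    random.shuffle(pts)                        # process points in random order

    def cell(p, delta):
        side = delta / 2                       # subsquares have side delta/2
        return (int(p[0] // side), int(p[1] // side))

    def make_dictionary(prefix, delta):
        # All pairs in prefix are at least delta apart, so each cell holds one point.
        return {cell(p, delta): p for p in prefix}

    best = (math.dist(pts[0], pts[1]), pts[0], pts[1])
    grid = make_dictionary(pts[:2], best[0])

    for i in range(2, len(pts)):
        p = pts[i]
        delta = best[0]
        cx, cy = cell(p, delta)
        found = best
        for dx in range(-2, 3):                # 5x5 block of subsquares around p
            for dy in range(-2, 3):
                q = grid.get((cx + dx, cy + dy))
                if q is not None:
                    d = math.dist(p, q)
                    if d < found[0]:
                        found = (d, p, q)
        if found[0] < delta:
            best = found                       # delta shrinks: start a new round
            grid = make_dictionary(pts[:i + 1], best[0])
        else:
            grid[cell(p, delta)] = p           # delta unchanged: a single insertion
    return best

random.seed(1)
sample = [(random.random(), random.random()) for _ in range(1000)]
print(closest_pair(sample)[0])
```

Rebuilding the dictionary from scratch whenever δ changes corresponds to the MakeDictionary operation in the analysis; the expected total number of insertions stays linear because point i triggers a rebuild with probability at most 2/i.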