Expected Running Times and Randomized Algorithms

Instructor: Neelima Gupta, ngupta@cs.du.ac.in

Expected Running Time of Insertion Sort

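$x_1, x_2, \ldots, x_{i-1}, x_i, \ldots, x_n$

For i = 2 to n

Insert the ith element $x_i$ into the partially sorted list $x_1, x_2, \ldots, x_{i-1}$

(at the rth position)

• Let $X_i$ be the random variable that represents the number of comparisons required to insert the ith element of the input array into the sorted subarray of the first i-1 elements

• $X_i$ takes the values $x_{i1}, x_{i2}, \ldots, x_{ii}$, where $x_{ij}$ is the number of comparisons needed when $x_i$ is inserted at the jth position

$E(X_i) = \sum_{j=1}^{i} x_{ij}\, p(x_{ij})$

where $E(X_i)$ is the expected value of $X_i$

and $p(x_{ij})$ is the probability of inserting $x_i$ at the jth position, $1 \le j \le i$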

$x_1, x_2, \ldots, x_{i-1}, x_i, \ldots, x_n$

How many comparisons does it take to insert the ith element at the jth position?

Position    # of comparisons
i           1
i-1         2
i-2         3
…           …
2           i-1
1           i-1

Note: Positions 2 and 1 both need i-1 comparisons. Why? To insert the element at position 2 we must compare it with the element previously in first place, and after that comparison we already know which of the two comes first and which comes second.

Thus $E(X_i) = \frac{1}{i}\left(\sum_{k=1}^{i-1} k + (i-1)\right)$

where 1/i is the probability of inserting at the jth position among the i possible positions.

For n elements, the expected time taken is

$T = E(X_1 + X_2 + \cdots + X_n) = \sum_{i=2}^{n} E(X_i)$ (the first element needs no comparisons, so $X_1 = 0$)

$= \sum_{i=2}^{n} \frac{1}{i}\left(\sum_{k=1}^{i-1} k + (i-1)\right) = \sum_{i=2}^{n} \left(\frac{i-1}{2} + \frac{i-1}{i}\right) = \frac{n(n-1)}{4} + n - H_n$

where $H_n = \sum_{i=1}^{n} 1/i$ is the nth harmonic number, so that $T \le \frac{(n-1)(n+4)}{4}$.

Therefore the average case of insertion sort takes $\Theta(n^2)$.
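As a sanity check on the analysis above, the following Python sketch counts the comparisons made by a standard insertion sort and averages them over every permutation of a small input. Summing the per-insertion expectation $E(X_i) = \frac{1}{i}(\sum_{k=1}^{i-1} k + (i-1))$ over i = 2, …, n gives $n(n-1)/4 + n - H_n$; the function names and the brute-force enumeration are choices made for this illustration, not part of the slides.

```python
from fractions import Fraction
from itertools import permutations


def insertion_sort_comparisons(items):
    """Sort a copy of items by insertion and return the number of key comparisons."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Scan the sorted prefix from right to left.
        while j >= 0:
            comparisons += 1
            if a[j] <= key:
                break            # found the insertion point
            a[j + 1] = a[j]      # shift the larger element right
            j -= 1
        a[j + 1] = key
    return comparisons


def average_comparisons(n):
    """Exact average over all n! input orders (only feasible for small n)."""
    perms = list(permutations(range(n)))
    total = sum(insertion_sort_comparisons(p) for p in perms)
    return Fraction(total, len(perms))


def closed_form(n):
    """n(n-1)/4 + n - H_n, where H_n is the nth harmonic number."""
    harmonic = sum(Fraction(1, i) for i in range(1, n + 1))
    return Fraction(n * (n - 1), 4) + n - harmonic
```

For small n the two functions agree exactly, which is consistent with the $\Theta(n^2)$ average case.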

Quick-Sort

• Pick the first item from the array and call it the pivot

• Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot

• Use recursion to sort the two partitions

[Figure: partition 1 (items ≤ pivot) | pivot | partition 2 (items > pivot)]
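The three steps above can be sketched in Python as follows. This is a minimal illustration that builds new lists instead of partitioning the array in place (an assumption of this sketch; the slides do not say how the partition is implemented), and the name `quicksort` is chosen here for convenience.

```python
def quicksort(items):
    """Return a sorted copy of items, always using the first item as the pivot."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    left = [x for x in rest if x <= pivot]    # partition 1: items <= pivot
    right = [x for x in rest if x > pivot]    # partition 2: items > pivot
    # Recursively sort the two partitions and put the pivot between them.
    return quicksort(left) + [pivot] + quicksort(right)
```

With the first item as pivot, an already sorted input always produces the most unbalanced split (0 : n-1), which is the worst case for this deterministic version.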

Quicksort: Expected number of comparisons

• Partition may generate the splits (0 : n-1, 1 : n-2, 2 : n-3, …, n-2 : 1, n-1 : 0), each with probability 1/n

• If T(n) is the expected running time, then

$T(n) = \frac{1}{n} \sum_{k=0}^{n-1} \left[ T(k) + T(n-1-k) \right] + \Theta(n)$
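To get a feel for this recurrence, the sketch below evaluates it exactly for small n. It assumes the $\Theta(n)$ term is exactly n - 1 comparisons (one per non-pivot element), which is a modelling choice made here rather than something the slide states. Under that assumption the recurrence has the known closed form $2(n+1)H_n - 4n$, which grows as $O(n \log n)$; the function names are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def expected_comparisons(n):
    """C(n) = (n - 1) + (1/n) * sum_{k=0}^{n-1} [C(k) + C(n-1-k)], with C(0) = C(1) = 0."""
    if n <= 1:
        return Fraction(0)
    total = sum(expected_comparisons(k) + expected_comparisons(n - 1 - k)
                for k in range(n))
    return (n - 1) + total / n


def quicksort_closed_form(n):
    """2(n+1)H_n - 4n, where H_n is the nth harmonic number."""
    harmonic = sum(Fraction(1, i) for i in range(1, n + 1))
    return 2 * (n + 1) * harmonic - 4 * n
```

For example, `expected_comparisons(3)` is 8/3: two comparisons for the partition, plus one more comparison whenever the pivot is the smallest or the largest of the three elements.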

Randomized Quick-Sort

• Pick an element from the array at random and call it the pivot

• Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot

• Use recursion to sort the two partitions

[Figure: partition 1 (items ≤ pivot) | pivot | partition 2 (items > pivot)]

Remarks

• Not much different from Q-sort, except that earlier the algorithm was deterministic and the bounds were probabilistic

• Here the algorithm itself is also randomized: we pick an element to be the pivot at random. Notice that there isn't any difference in how the algorithm behaves from there onwards

• In the earlier case we can identify the worst-case input. Here no input is the worst case
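The remarks above can be made concrete with a short in-place sketch. The only line that differs from a deterministic quicksort is the random pivot choice; the Lomuto-style partition, the function name and the optional `rng` parameter are assumptions of this illustration.

```python
import random


def randomized_quicksort(a, lo=0, hi=None, rng=random):
    """Sort the list a in place, choosing every pivot uniformly at random."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # The only randomized step: pick the pivot position at random
    # and move the pivot to the end of the current range.
    p = rng.randint(lo, hi)
    a[p], a[hi] = a[hi], a[p]
    pivot = a[hi]
    # Lomuto partition: a[lo..i-1] <= pivot and a[i..j-1] > pivot.
    i = lo
    for j in range(lo, hi):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]          # put the pivot in its final place
    randomized_quicksort(a, lo, i - 1, rng)
    randomized_quicksort(a, i + 1, hi, rng)
```

Because the pivot no longer depends on the input order, a sorted array is handled like any other input; bad splits can still occur, but only through unlucky coin tosses.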

Randomized Select

$T(n) \le \frac{1}{n} \sum_{k=0}^{n-1} \max\{T(k),\, T(n-k-1)\} + \Theta(n)$
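A minimal sketch of randomized selection, assuming the usual formulation: pick a random pivot, partition, and continue only in the side that contains the kth smallest element. That single-sided continuation is why the recurrence for Randomized Select charges only one subproblem per step. The function name, the 1-based `k` and the three-way split are choices made for this illustration.

```python
import random


def randomized_select(items, k, rng=random):
    """Return the kth smallest item of a non-empty list (k = 1 gives the minimum)."""
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(items)")
    a = list(items)
    while True:
        pivot = rng.choice(a)
        smaller = [x for x in a if x < pivot]
        equal = [x for x in a if x == pivot]
        if k <= len(smaller):
            a = smaller                        # answer lies left of the pivot
        elif k <= len(smaller) + len(equal):
            return pivot                       # the pivot is the answer
        else:
            k -= len(smaller) + len(equal)     # answer lies right of the pivot
            a = [x for x in a if x > pivot]
```

Unlike quicksort, only one of the two partitions is ever examined further, so the expected running time is linear rather than $O(n \log n)$.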

Randomized Algorithms

• A randomized algorithm performs coin tosses (i.e., uses random bits) to control its execution

• b ← random()
  if b = 0
      do A …
  else { i.e., b = 1 }
      do B …

• Its running time depends on the outcomes of the coin tosses

Assumptions

• the coins are unbiased, and

• the coin tosses are independent

• The worst-case running time of a randomized algorithm may be large but occurs with very low probability (e.g., it occurs when all the coin tosses give "heads")

Monte Carlo Algorithms

• Running times are guaranteed, but the output may not be completely correct

• Probability of error is low

Las Vegas Algorithms

• Output is guaranteed to be correct

• Bounds on running times hold with high probability

• What type of algorithm is Randomized Qsort?
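The contrast between the two classes can be illustrated with a standard textbook toy problem, which is not taken from these slides: find the index of an "a" in an array where about half of the entries are "a". Both function names are hypothetical, and the Las Vegas version assumes the array contains at least one "a" (otherwise it would loop forever).

```python
import random


def find_a_las_vegas(arr, rng=random):
    """Las Vegas: the answer is always correct; the number of probes is random.

    Assumes arr contains at least one "a".
    """
    while True:
        i = rng.randrange(len(arr))
        if arr[i] == "a":
            return i


def find_a_monte_carlo(arr, trials, rng=random):
    """Monte Carlo: at most `trials` probes; may return None although an "a" exists."""
    for _ in range(trials):
        i = rng.randrange(len(arr))
        if arr[i] == "a":
            return i
    return None
```

If half of the entries are "a", the Monte Carlo version fails with probability $(1/2)^{\text{trials}}$, while the Las Vegas version needs two probes in expectation but has no fixed upper bound on its running time.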

Why expected running times?

• Markov's inequality, for a non-negative random variable X (such as a running time) and k > 0:

$P(X > k\,E(X)) < \frac{1}{k}$

i.e., the probability that the algorithm will take more than 2 E(X) time is less than 1/2,

or the probability that the algorithm will take more than 10 E(X) time is less than 1/10.

This is the reason why Qsort does well in practice.
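For completeness, here is the standard one-line argument behind Markov's inequality, added as a sketch. It assumes X is non-negative with $0 < E(X) < \infty$ and writes $\mathbf{1}[\cdot]$ for an indicator variable. If $P(X > kE(X)) = 0$ there is nothing to prove; otherwise:

```latex
E(X) \;\ge\; E\bigl(X \cdot \mathbf{1}[X > kE(X)]\bigr)
     \;>\; kE(X) \cdot P\bigl(X > kE(X)\bigr)
\quad\Longrightarrow\quad
P\bigl(X > kE(X)\bigr) \;<\; \frac{1}{k}
```

The first step drops the non-negative contribution of the outcomes with $X \le kE(X)$; the second uses that X exceeds $kE(X)$ on the remaining outcomes; the conclusion follows by dividing both sides by $kE(X)$.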

Markov's Bound

$P(X > k\mu) < \frac{1}{k}$, where $\mu = E(X)$ and k is a constant

Chernoff's Bound

$P(X > 2\mu) < \frac{1}{2}$

A Stronger Result

$P(X > k\mu) < \frac{1}{n^k}$, where k is a constant

A binary search tree can be built randomly

[Figure: a randomly selected key x, with Rank(x) = i, becomes the root (pivot element = root); keys < x go to its left and keys > x go to its right.]

RANDOMLY BUILT BST

• $X_i$: the height of the tree rooted at a node with rank = i

• $Y_i$: the exponential height of the tree, $Y_i = 2^{X_i}$

• $H = \max\{H_1, H_2\} + 1$

where $H_1$ = height of the left subtree,

$H_2$ = height of the right subtree,

$H$ = height of the tree rooted at x

HEIGHT OF THE TREE

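• $Y = 2^H = 2 \cdot \max\{2^{H_1}, 2^{H_2}\}$

• Expected value of the exponential height of the tree with n nodes, writing EH for the exponential height and T(n) for a randomly built tree on n nodes:

$E(EH(T(n))) = \frac{2}{n} \sum_{k=0}^{n-1} E\bigl(\max\{EH(T(k)),\, EH(T(n-1-k))\}\bigr) = O(n^3)$

Since $2^{E(H)} \le E(2^H)$ by Jensen's inequality, this gives $E(H(T(n))) = O(\log n)$.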
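The logarithmic expected height can be observed experimentally. The sketch below inserts the keys 0, …, n-1 into an unbalanced binary search tree in random order and reports the height; storing child links in dictionaries and measuring height in edges are choices made for this illustration, and the function name is hypothetical.

```python
import math
import random


def random_bst_height(n, rng=random):
    """Height (in edges) of a BST built by inserting 0..n-1 in a random order."""
    keys = list(range(n))
    rng.shuffle(keys)
    left, right = {}, {}           # child links, indexed by the parent key
    root, height = None, -1        # an empty tree has height -1 by convention
    for key in keys:
        if root is None:
            root, height = key, 0
            continue
        node, depth = root, 0
        while True:
            children = left if key < node else right
            depth += 1             # depth of the position we are about to inspect
            if node not in children:
                children[node] = key
                break
            node = children[node]
        height = max(height, depth)
    return height
```

Any binary tree on n nodes has height at least $\lceil \log_2(n+1) \rceil - 1$, and a sorted insertion order would give n - 1; random orders typically land within a small constant factor of the lower bound.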

A skip list is a data structure that can be used to maintain a dictionary.

Given n keys, we insert these n keys into a linked list that has -∞ as its first node and +∞ as its last node.

Initial list S0:

-∞  5  9  25  30  35  38  40  +∞

Then we flip a coin for each element of Si; if a tail occurs, we insert the element into the next list Si+1. We repeat this for S1, S2, and so on, until only one element is left in Si.

Skip List Dictionary as ADT

[Figure: a skip list with lists S3 (top) to S0 (bottom), each running from the head (-∞) to the tail (+∞). S0 holds 5, 9, 25, 30, 35, 38, 40; the higher lists hold 9, 30, 38, then 30, 38, then 30.]

Operations that can be performed on a skip list

Each node has two pointers: right and down

1. Drop down

• This operation is performed when after(p) > key

• In this operation, pointer p moves down to the immediately lower-level list

[Figure: a node with its "right" and "down" pointers; p drops from the list S1 (-∞, 30, +∞) to the list S0 (-∞, 9, 30, 38, +∞).]

2. Scan forward

• This operation is performed when after(p) < key

• Here the pointer p moves to the next element in the list

• e.g., here key = 28 and p is at 9; after(9) < 28, so scan forward

[Figure: S0 = -∞, 9, 25, 30, +∞, with p scanning forward.]

Searching for a key k

Keep a pointer p to the first node of the highest list Sh; Scur is the current skip list, initially Sh

repeat
    if (after(p) == k)
        return after(p)
    if (after(p) < k)
        scan forward, i.e., update p ← after(p)
    else { after(p) > k }
        if (Scur == S0)
            then "key k not found"; exit
        drop down to the next lower skip list
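The search procedure can be sketched in Python as follows. The `Node` class, the helper `build_skip_list` and its coin-flip construction are assumptions made so that the example is self-contained; only the `search` loop corresponds directly to the drop-down and scan-forward steps described above.

```python
import math
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.right = None    # after(p) in the same list
        self.down = None     # the same key in the next lower list


def build_skip_list(keys, rng=random):
    """Build S0, S1, ... by coin flips and return the head (-inf node) of the top list."""
    level = sorted(set(keys))
    below = {}               # key -> node in the list just below the current one
    head = None
    while True:
        new_head = Node(-math.inf)
        new_head.down = head
        prev, current = new_head, {}
        for key in level + [math.inf]:
            node = Node(key)
            node.down = below.get(key)
            prev.right = node
            prev = node
            current[key] = node
        head, below = new_head, current
        if len(level) <= 1:
            return head
        # Promote each element to the next list with probability 1/2.
        level = [k for k in level if rng.random() < 0.5]


def search(head, k):
    """Return the node holding the finite key k, or None if k is absent."""
    p = head
    while True:
        if p.right.key == k:
            return p.right
        if p.right.key < k:
            p = p.right              # scan forward
        elif p.down is not None:
            p = p.down               # drop down
        else:
            return None              # already in S0: key not found
```

Whatever the coin flips turn out to be, the search result is always correct; the randomness affects only how many steps the search takes.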

Searching for a key 25

[Figure: the skip list with lists S3 to S0 (S0 holds 5, 9, 25, 30, 35, 38, 40), marking the successive positions of p as it drops down and scans forward. Key found.]

Searching for a key 28

[Figure: the skip list with lists S3 to S0 (S0 holds 5, 9, 25, 30, 35, 38, 40), marking the successive positions of p as it drops down and scans forward. Key not found.]

Deletion of a key

e.g., delete 30

[Figure: the skip list with lists S3 to S0 (S0 holds 5, 9, 25, 30, 35, 38, 40), marking the successive positions of p while 30 is deleted.]

Analysis

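1. An element $x_k$ is in $S_i$ with probability $1/2^i$ (true for all elements)

$E(|S_i|) = E\left(\sum_{k=1}^{n} X_{ki}\right) = \sum_{k=1}^{n} \frac{1}{2^i} = \frac{n}{2^i}$

where $X_{ki} = 1$ if $x_k$ is in $S_i$, and 0 otherwise

$E(\text{total size}) = E\left(\sum_{i} |S_i|\right) = \sum_{i=0}^{\infty} \frac{n}{2^i} \le 2n$

2. Expected height of a skip list: $h \approx \log n$

$n/2^h = 1 \;\Rightarrow\; h \simeq \log n$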
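These two estimates can be checked with a small simulation. The sketch below tracks only the list sizes |S0|, |S1|, … of one randomly built skip list, promoting each element to the next list with probability 1/2 and stopping once at most one element is left; the function name and the stopping rule are assumptions of this illustration.

```python
import random


def sample_level_sizes(n, rng=random):
    """Return [|S0|, |S1|, ...] for one random skip list on n keys."""
    sizes = [n]
    while sizes[-1] > 1:
        # Every element of the current list is promoted independently with probability 1/2.
        promoted = sum(1 for _ in range(sizes[-1]) if rng.random() < 0.5)
        sizes.append(promoted)
    return sizes
```

For n around a thousand, the total `sum(sizes)` typically comes out close to 2n and the number of lists close to $\log_2 n$, in line with the analysis above.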

Analysis (contd.)

3. Drop down: O(log n)

Since pointer p can drop at most h times (i.e., the height of the skip list) until S0 is reached, and h = log n

4. Scan forward: O(log n)

# of elements to scan at each level    Total no. of levels    Total cost
O(1)                                   O(log n)               O(log n)

The number of elements scanned at the ith level is, in expectation, no more than 2, because:

The key lies between p and after(p) on the (i+1)th level (that's why we came down to the ith level). Each element of Si is also in Si+1 with probability 1/2, so on average only about one element of Si lies between p and after(p) of the (i+1)th level: the element pointed to by after(p) in Si.

Thus we expect to scan about two elements at Si: the element pointed to by p (when we came down) and after(p) in Si.

Hashing

• Motivation: symbol tables

  – A compiler uses a symbol table to relate symbols to associated data

    • Symbols: variable names, procedure names, etc.

    • Associated data: memory location, call graph, etc.

  – For a symbol table (also called a dictionary), we care about search, insertion, and deletion

  – We typically don't care about sorted order

Hash Tables

• More formally:

  – Given a table T and a record x, with key (= symbol) and satellite data, we need to support:

    • Insert(T, x)

    • Delete(T, x)

    • Search(T, x)

  – We want these to be fast, but don't care about sorting the records

• The structure we will use is a hash table

  – Supports all the above in O(1) expected time

Hash Functions

• Next problem: collision

[Figure: a hash function h maps the actual keys K = {k1, …, k5}, a subset of the universe of keys U, to slots 0 to m - 1 of the table T; h(k2) = h(k5) is a collision.]

Resolving Collisions

• How can we solve the problem of collisions?

• One solution is chaining

• Other solutions: open addressing

Chaining

• Chaining puts elements that hash to the same slot in a linked list

[Figure: each slot of the table T is either empty or points to a linked list of the actual keys (k1 to k8, drawn from the universe of keys U) that hash to it; for example, k1 and k4 share a chain, as do k5 and k2, and k8 and k6.]

• How do we insert an element?

• How do we delete an element?

• How do we search for an element with a given key?
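One possible answer to these three questions is sketched below. It is an illustration under stated assumptions rather than the slides' own implementation: Python lists stand in for the linked lists, the built-in `hash` reduced modulo the table size plays the role of h, and the class and method names are hypothetical.

```python
class ChainedHashTable:
    def __init__(self, m=8):
        self.slots = [[] for _ in range(m)]    # one chain per slot

    def _chain(self, key):
        """The chain for the slot that key hashes to."""
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:                       # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        """Return the value stored under key, or None if it is absent."""
        for k, v in self._chain(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        """Remove key if present; report whether anything was removed."""
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False
```

All three operations first locate the chain for the key and then walk it, so their cost is governed by the length of that chain.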

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot

• Given n keys and m slots in the table, the load factor α = n/m = average keys per slot

• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)

• What will be the average cost of a successful search? A: O((1 + α)/2) = O(1 + α)

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)

• If the number of keys n is proportional to the number of slots in the table, what is α?

• A: α = O(1)

  – In other words, we can make the expected cost of searching constant if we make α constant
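For a concrete table, these costs can be computed directly from the chain lengths. The sketch below charges 1 for computing the hash plus one unit per chain element examined; it assumes that an unsuccessful search lands in a uniformly chosen slot and that a successful search looks for a uniformly chosen stored key. The function names and the cost model are assumptions of this illustration.

```python
from fractions import Fraction


def unsuccessful_cost(chain_lengths):
    """1 for the hash, plus the full chain of a uniformly chosen slot: 1 + n/m."""
    m = len(chain_lengths)
    return 1 + Fraction(sum(chain_lengths), m)


def successful_cost(chain_lengths):
    """1 for the hash, plus the position in its chain of a uniformly chosen stored key."""
    n = sum(chain_lengths)
    # A chain of length L contributes positions 1 + 2 + ... + L.
    probes = sum(length * (length + 1) // 2 for length in chain_lengths)
    return 1 + Fraction(probes, n)
```

With chain lengths `[1, 2, 1, 2]` (n = 6, m = 4, load factor 3/2), the unsuccessful cost is 5/2, exactly 1 plus the load factor, and the successful cost is 7/3.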

A Final Word About Randomized Algorithms

If we could prove this:

P(failure) < 1/k (we are sort of happy)

P(failure) < 1/n^k (most of the time this is true and we're happy)

P(failure) < 1/2^n (this is difficult, but still we want this)

Acknowledgements

• Kunal Verma

• Nidhi Aggarwal

• And other students of the MSc (CS) batch of 2009

END

Page 3: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

bull Let Xi be the random variable which represents the number of comparisons required to insert ith element of the input array in the sorted sub array of first i-1 elements

bull Xi xi1xi2hellipxii

E(Xi) = Σj xijp(xij )

where E(Xi) is the expected value Xi

And p(xij) is the probability of inserting xi in the jth position 1lejlei

Expected Running Time of Insertion Sort

x1x2 xi-1xihellipxn

How many comparisons it makes to insert ith element in jth position

(at jth position)

Expected Running Time of Insertion Sort

bull Position of Comparisionsi 1i-1 2i-2 3

2 i-11 i-1

Note Here both position 2 and 1 have of Comparisions equal to i-1 Why Because to insert element at position 2 we have to compare with previously

first element and after that comparison we know which of them come first and which at second

Thus E(Xi) = (1i) i-1Σk=1k + (i-1)

where 1i is the probability to insert at jth position in the i possible positions

For n elements

E(X1 + X2 + +Xn)

= nΣi=2 E(Xi)

= nΣi=2 (1i) i-1Σk=1k + (i-1) = (n-1)(n-4)4

Therefore average case of insertion sort takes Θ(n2)

For n number of elements expected time taken is

T = nΣi=2 (1i) i-1Σk=1k + (i-1)

where 1i is the probability to insert at rth position in the i possible positions

E(X1 + X2 + +Xn) = nΣi=1 E(Xi)WhereXi is expected value of inserting Xi element

T = (n-1)(n-4)4Therefore average case of insertion sort takes

Θ(n2)

Quick-Sort

bull Pick the first item from the array--call it the pivotbull Partition the items in the array around the pivot so all

elements to the left are to the pivot and all elements to the right are greater than the pivot

bull Use recursion to sort the two partitions

pivotpartition items gt pivotpartition 1 items pivot

Quicksort Expected number of comparisons

bull Partition may generate splits (0n-1 1n-2 2n-3 hellip n-21 n-

10) each with probability 1n

bull If T(n) is the expected running time

euro

T n( ) =1

nT k( ) + T n minus1minus k( )[ ] + Θ n( )

k= 0

nminus1

sum

Randomized Quick-Sort

bull Pick an element from the array--call it the pivotbull Partition the items in the array around the pivot so all

elements to the left are to the pivot and all elements to the right are greater than the pivot

bull Use recursion to sort the two partitions

pivotpartition items gt pivotpartition 1 items pivot

Remarksbull Not much different from the Q-sort except

that earlier the algorithm was deterministic and the bounds were probabilistic

bull Here the algorithm is also randomized We pick an element to be a pivot randomly Notice that there isnrsquot any difference as to how does the algorithm behave there onwards

bull In the earlier case we can identify the worst case input Here no input is worst case

Randomized Select

1

0

1max1 n

k

nknTkTn

nT

Randomized Algorithms

bull A randomized algorithm performs coin tosses (ieuses random bits) to control its execution

bull b larr random()if b = 0do A hellipelse ie b = 1do B hellip

bull Its running time depends on the outcomes of the coin tosses

Assumptions

bull 1048708 the coins are unbiased andbull 1048708 the coin tosses are independent

bull The worst-case running time of a randomized algorithm may be large but occurs with very low probability (eg it occurs when all the coin tosses give ldquoheadsrdquo)

Monte Carlo Algorithms

bull Running times are guaranteed but the output may not be completely correct

bull Probability of error is low

Las Vegas Algorithms

bull Output is guaranteed to be correct

bull Bounds on running times hold with high probability

bull What type of algorithm is Randomized Qsort

Why expected running times

bull Markovrsquos inequality

P( X gt k E(X)) lt 1k

ie the probability that the algorithm will take more than O(2 E(X)) time is less than 12

Or the probability that the algorithm will take more than O(10 E(X)) time is less than 110

This is the reason why Qsort does well in practice

Markovrsquos Bound

P(XltkM)lt 1k where k is a constant

Chernouffrsquos Bound

P(Xgt2μ)lt frac12

A More Stronger Result

P(Xgtk μ )lt 1nk where k is a constant

Binary search tree can be built randomly

Rank(x)=i

Randomly selected key becomes the root

Pivot element=root

x

gtlt

RANDOMLY BUILT BST

bull Xi the height of the tree rooted at a node with rank=i

bull Yi exponential height of the tree=2^Xi

bull H=maxH1H2 + 1

where H1 ht of left subtree

H2 htof right subtree

H ht of the tree rooted at x

HEIGHT OF THE TREE

bull Y=2^H

=2max2^H12^H2

bull Expected value of exponential ht of the tree with lsquonrsquo nodes

=E(EH(T(X)))

=2n sum maxEH(T(k))EH(T(n-1-k))

=O(n^3)=E(H(T(n)))=O(log n)

Skip list is a data structure that can be used to maintain dictionary

Given n keys we insert these n keys in a linked list that has -infin as first node and infin as last node

Initial list S0

Then we flip coin a coin for each element until only one is left in Si if a tail occurswe insert it into next list Si+1 and so on

-infin infin5 9 25 30 35 38 40

Skip List Dictionary as ADT

-infin

-infin

-infin

S3

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

9 30 38

30

38 30

head Tail

Operations that can be performed on skip list

Each node has two pointers right and down

1 Drop down

bull This operation is performed when after(p)gtkey

bull In this operation pointer p moves down to immediate lower level list

(after drop down)

right

down

-infin

-infin infin

infin 30

309 38

p

S1

S0

2Scan forward

bull This operation is performed when after(p)ltkey

bull Here the pointer p moves to the next element in the list

bull eg here key=28 amp p is at 9 after(9)lt28 so scan forward

-infin 9 infin 25 30

p p pS0

Searching a key kKeep a ptr p to the first node in the highest list Sh

while (after(p)gtk)

if (Scur==S0) Scur is the current skip list

then ldquokey k not foundrdquo

exit

if (after(p)gtk)

drop down to next skip list

If (after(p)ltk)

scan forward ie update pafter(p)

if (after(p)==k)

return after(p)

-infin

-infin

-infin

Searching for a key 25

S3

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

30

9 30 38

p

p

p

Key found

p

p

p

-infin

-infin

-infin

Searching for a key 28

S3

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

30

9 30 38

p

p

p

Key not found

p

p

p

p

-infin

-infin

-infin

Deletion of a key

S3 eg delete 30

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

30

9 30 38

p

p

p

p

p

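Analysis

1. An element $x_k$ is in $S_i$ with probability $1/2^i$; this is true for all elements.

$E(|S_i|) = \sum_{k=1}^{n} E(X_{ki}) = \sum_{k=1}^{n} \frac{1}{2^i} = \frac{n}{2^i}$, where $X_{ki} = 1$ if $x_k$ is in $S_i$ and $0$ otherwise.

$E(\text{total size}) = E\left(\sum_{i \ge 0} |S_i|\right) = \sum_{i \ge 0} \frac{n}{2^i} \le 2n$

2. Expected height of a skip list: h ≃ log n

The top list is reached when the expected size of a list drops to one element: $n/2^h = 1$, so $h \simeq \log_2 n$.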
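To make the size and height estimates concrete, the following small sketch evaluates the expected list sizes n/2^i numerically. The function name `expected_level_sizes` is hypothetical, and taking h = ⌈log2 n⌉ as the number of lists above S0 is an assumption that follows the estimate h ≃ log n.

```python
import math


def expected_level_sizes(n):
    """Expected |S_i| = n / 2**i for i = 0..h, with h = ceil(log2(n))."""
    h = math.ceil(math.log2(n))
    return [n / 2 ** i for i in range(h + 1)]


if __name__ == "__main__":
    sizes = expected_level_sizes(1024)
    print("number of lists:", len(sizes))
    print("expected total size:", sum(sizes), "bound 2n:", 2 * 1024)
```

The geometric series keeps the expected total size below 2n, so a skip list uses only linear expected space.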

Analysis (contd.)

3. Drop down: O(log n)

The pointer p can drop at most h times (the height of the skip list) before S0 is reached, and h ≃ log n.

4. Scan forward: O(log n)

Number of elements to scan at each level: O(1)
Total number of levels: O(log n)
Total cost: O(1) × O(log n) = O(log n)

The expected number of elements scanned at the ith level is no more than 2, because:

The key lies between p and after(p) on the (i+1)th level (that's why we came down to the ith level). On average there is only one element of Si between p and after(p) of the (i+1)th level: the element pointed to by after(p) in Si.

Thus we scan at most two elements in Si: the element pointed to by p (when we came down) and after(p) in Si.

Hashing

• Motivation: symbol tables
  – A compiler uses a symbol table to relate symbols to associated data
    • Symbols: variable names, procedure names, etc.
    • Associated data: memory location, call graph, etc.
  – For a symbol table (also called a dictionary), we care about search, insertion, and deletion
  – We typically don't care about sorted order

Hash Tables

• More formally:
  – Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
    • Insert(T, x)
    • Delete(T, x)
    • Search(T, x)
  – We want these to be fast, but don't care about sorting the records

• The structure we will use is a hash table
  – Supports all the above in O(1) expected time

Hash Functions

• Next problem: collision

[Figure: a hash function h maps the actual keys K = {k1, ..., k5}, a subset of the universe of keys U, to slots 0 to m - 1 of table T; k2 and k5 collide because h(k2) = h(k5)]

Resolving Collisions

• How can we solve the problem of collisions?

• One solution is chaining

• Another solution is open addressing

Chaining

• Chaining puts elements that hash to the same slot in a linked list

[Figure: table T, whose slots point to linked lists of the keys from K (a subset of the universe U) that hash to them; for example k1 and k4 share a chain, as do k5 and k2, and k8 and k6, while empty slots hold a null pointer]

• How do we insert an element?

• How do we delete an element?

• How do we search for an element with a given key?
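One way to answer the three questions above is the following minimal sketch of a chained hash table. It is an illustration under stated assumptions, not the slides' implementation: the class name `ChainedHashTable` is hypothetical, Python lists stand in for the linked lists, and Python's built-in `hash` reduced modulo the number of slots plays the role of the hash function h.

```python
class ChainedHashTable:
    def __init__(self, m=8):
        self.slots = [[] for _ in range(m)]  # one chain per slot

    def _chain(self, key):
        # h(key): built-in hash reduced to a slot index (an assumption).
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # key already present: overwrite
                return
        chain.append((key, value))

    def search(self, key):
        for k, v in self._chain(key):
            if k == key:
                return v
        return None                      # unsuccessful search

    def delete(self, key):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False


if __name__ == "__main__":
    table = ChainedHashTable(m=4)
    table.insert(1, "k1")
    table.insert(5, "k5")   # 1 and 5 land in the same slot when m = 4
    print(table.search(5))
    table.delete(1)
    print(table.search(1))
```

Each operation hashes the key once and then walks a single chain, so its cost is proportional to the length of that chain.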

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot

• Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot

• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)

• What will be the average cost of a successful search? A: O((1 + α)/2) = O(1 + α)
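The load factor can be checked on a small example. In the sketch below, `load_factor` and `chain_lengths` are hypothetical helper names, and the built-in `hash` modulo m is again only a stand-in for the hash function. The average chain length over all slots is n/m for any hash function; simple uniform hashing is what makes this the expected length of the particular chain a search has to walk.

```python
def load_factor(n, m):
    """Load factor: n keys spread over m slots."""
    return n / m


def chain_lengths(keys, m):
    """Number of keys landing in each of the m slots."""
    lengths = [0] * m
    for key in keys:
        lengths[hash(key) % m] += 1
    return lengths


if __name__ == "__main__":
    keys = range(100)
    m = 10
    lengths = chain_lengths(keys, m)
    print("chain lengths:", lengths)
    print("average:", sum(lengths) / m, "load factor:", load_factor(100, m))
```

An unsuccessful search pays one hash computation plus a walk over a whole chain, which is where the 1 + (load factor) cost comes from.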

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)

• If the number of keys n is proportional to the number of slots in the table, what is α?

• A: α = O(1)
  – In other words, we can make the expected cost of searching constant if we make α constant

A Final Word About Randomized Algorithms

If we could prove this:

P(failure) < 1/k (we are sort of happy)

P(failure) < 1/n^k (most of the time this is true, and we're happy)

P(failure) < 1/2^n (this is difficult, but still we want this)

Acknowledgements

• Kunal Verma

• Nidhi Aggarwal

• And other students of MSc(CS) batch 2009

END

Page 6: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Thus E(Xi) = (1i) i-1Σk=1k + (i-1)

where 1i is the probability to insert at jth position in the i possible positions

For n elements

E(X1 + X2 + +Xn)

= nΣi=2 E(Xi)

= nΣi=2 (1i) i-1Σk=1k + (i-1) = (n-1)(n-4)4

Therefore average case of insertion sort takes Θ(n2)

For n number of elements expected time taken is

T = nΣi=2 (1i) i-1Σk=1k + (i-1)

where 1i is the probability to insert at rth position in the i possible positions

E(X1 + X2 + +Xn) = nΣi=1 E(Xi)WhereXi is expected value of inserting Xi element

T = (n-1)(n-4)4Therefore average case of insertion sort takes

Θ(n2)

Quick-Sort

bull Pick the first item from the array--call it the pivotbull Partition the items in the array around the pivot so all

elements to the left are to the pivot and all elements to the right are greater than the pivot

bull Use recursion to sort the two partitions

pivotpartition items gt pivotpartition 1 items pivot

Quicksort Expected number of comparisons

bull Partition may generate splits (0n-1 1n-2 2n-3 hellip n-21 n-

10) each with probability 1n

bull If T(n) is the expected running time

euro

T n( ) =1

nT k( ) + T n minus1minus k( )[ ] + Θ n( )

k= 0

nminus1

sum

Randomized Quick-Sort

bull Pick an element from the array--call it the pivotbull Partition the items in the array around the pivot so all

elements to the left are to the pivot and all elements to the right are greater than the pivot

bull Use recursion to sort the two partitions

pivotpartition items gt pivotpartition 1 items pivot

Remarksbull Not much different from the Q-sort except

that earlier the algorithm was deterministic and the bounds were probabilistic

bull Here the algorithm is also randomized We pick an element to be a pivot randomly Notice that there isnrsquot any difference as to how does the algorithm behave there onwards

bull In the earlier case we can identify the worst case input Here no input is worst case

Randomized Select

1

0

1max1 n

k

nknTkTn

nT

Randomized Algorithms

bull A randomized algorithm performs coin tosses (ieuses random bits) to control its execution

bull b larr random()if b = 0do A hellipelse ie b = 1do B hellip

bull Its running time depends on the outcomes of the coin tosses

Assumptions

• The coins are unbiased, and

• the coin tosses are independent

• The worst-case running time of a randomized algorithm may be large but occurs with very low probability (e.g., it occurs when all the coin tosses give "heads")

Monte Carlo Algorithms

• Running times are guaranteed, but the output may not be completely correct

• Probability of error is low

Las Vegas Algorithms

• Output is guaranteed to be correct

• Bounds on running times hold with high probability

• What type of algorithm is Randomized Qsort?

Why expected running times?

• Markov's inequality: for a nonnegative random variable X with E(X) > 0 and any k > 0,

P(X > k E(X)) < 1/k

i.e., the probability that the algorithm will take more than 2 E(X) time is less than 1/2.

Or, the probability that the algorithm will take more than 10 E(X) time is less than 1/10.

This is the reason why Qsort does well in practice.
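For completeness, the standard one-line argument behind Markov's inequality is sketched below, stated in the common "at least" form for a nonnegative random variable X and a threshold a = k E(X) > 0. The indicator notation is introduced only for this derivation.

```latex
\begin{align*}
E(X) &= E\bigl(X \cdot \mathbf{1}[X < a]\bigr) + E\bigl(X \cdot \mathbf{1}[X \ge a]\bigr) \\
     &\ge 0 + a \cdot P(X \ge a), \\
\text{so}\quad P\bigl(X > k\,E(X)\bigr) &\le P\bigl(X \ge k\,E(X)\bigr) \le \frac{E(X)}{k\,E(X)} = \frac{1}{k}.
\end{align*}
```

The bound uses nothing about the algorithm except its expected running time, which is why it is weak but always available.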

Markov's Bound

P(X > kμ) < 1/k, where k is a constant and μ = E(X)

Chernoff's Bound

P(X > 2μ) < 1/2

A Stronger Result

P(X > kμ) < 1/n^k, where k is a constant

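A binary search tree can be built randomly

• A randomly selected key x, with Rank(x) = i, becomes the root (pivot element = root)

[Figure: root x, with the keys < x in the left subtree and the keys > x in the right subtree]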

RANDOMLY BUILT BST

• X_i: the height of the tree rooted at a node with rank = i

• Y_i: the exponential height of the tree = 2^{X_i}

• H = max{H1, H2} + 1

where H1: height of the left subtree

H2: height of the right subtree

H: height of the tree rooted at x

HEIGHT OF THE TREE

• Y = 2^H = 2 · max{2^{H1}, 2^{H2}}

• Expected value of the exponential height EH of the tree with n nodes:

$$E(EH(T(n))) = \frac{2}{n}\sum_{k=0}^{n-1} E\bigl(\max\{EH(T(k)),\, EH(T(n-1-k))\}\bigr) = O(n^3)$$

Since 2^{E(H)} ≤ E(2^H) (Jensen's inequality), it follows that E(H(T(n))) = O(log n).
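The logarithmic expected height can be checked empirically. The following Python sketch builds a BST by inserting distinct keys in a random order and measures its height (counted in edges, so a single node has height 0); the function names, the trial count and the seed are assumptions made for this illustration.

```python
import math
import random


def bst_height(keys):
    """Height (in edges) of the BST obtained by inserting distinct keys in order."""
    if not keys:
        return -1
    root = keys[0]
    children = {root: [None, None]}   # key -> [left child, right child]
    depth = {root: 0}
    for key in keys[1:]:
        node = root
        while True:
            side = 0 if key < node else 1
            nxt = children[node][side]
            if nxt is None:
                children[node][side] = key
                children[key] = [None, None]
                depth[key] = depth[node] + 1
                break
            node = nxt
    return max(depth.values())


def average_random_height(n, trials=200, seed=0):
    """Average height over `trials` uniformly random insertion orders of n keys."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        keys = list(range(n))
        rng.shuffle(keys)
        total += bst_height(keys)
    return total / trials


for n in (100, 1000):
    # Second column: average height; third column: log2(n) for comparison.
    print(n, average_random_height(n, trials=50), math.log2(n))
```

Inserting the keys in sorted order gives the worst case (height n − 1), while random orders stay within a constant factor of log n on average.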

A skip list is a data structure that can be used to maintain a dictionary.

Given n keys, we insert these n keys in a linked list that has -∞ as its first node and ∞ as its last node. This is the initial list S0.

Then we flip a coin for each element of S_i; if a tail occurs, we insert the element into the next list S_{i+1}, and so on, until only one element is left in S_i.

S0: -∞, 5, 9, 25, 30, 35, 38, 40, ∞
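The coin-flipping construction can be sketched in Python as below. Each level is represented as a plain sorted list with sentinels, which is enough to show which keys get promoted; the function name, the use of `float("inf")` for ∞ and the convention that a random value below 0.5 counts as a tail are all assumptions of this example.

```python
import random


def build_skip_list_levels(keys, rng=random):
    """Return [S0, S1, ...]; every level keeps the -inf and +inf sentinels."""
    neg_inf, pos_inf = float("-inf"), float("inf")
    current = sorted(keys)
    levels = [[neg_inf] + current + [pos_inf]]
    while len(current) > 1:
        # One fair coin flip per element of S_i; a tail promotes it to S_{i+1}.
        current = [key for key in current if rng.random() < 0.5]
        levels.append([neg_inf] + current + [pos_inf])
    return levels


levels = build_skip_list_levels([5, 9, 25, 30, 35, 38, 40], random.Random(1))
for i, level in enumerate(levels):
    print(f"S{i}:", level)
```

The loop stops once at most one key remains, so the top list may contain a single key or only the two sentinels.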

Skip List: Dictionary as ADT

[Figure: a skip list with lists S3, S2, S1 and S0 stacked between the head (-∞) and the tail (∞). S0 holds 5, 9, 25, 30, 35, 38, 40; the higher lists hold the promoted keys, drawn from 9, 30 and 38.]

Operations that can be performed on a skip list

Each node has two pointers: right and down.

1. Drop down

• This operation is performed when after(p) > key

• In this operation, pointer p moves down to the immediately lower-level list

[Figure: lists S1 and S0 with the right and down pointers of a node; p is shown in S0 after the drop down]

2. Scan forward

• This operation is performed when after(p) < key

• Here the pointer p moves to the next element in the list

• E.g., here key = 28 and p is at 9; after(9) < 28, so scan forward

[Figure: list S0 (-∞, 9, 25, 30, ∞) with the successive positions of p]

Searching for a key k

Keep a pointer p to the first node (-∞) in the highest list S_h.

repeat

    if (after(p) == k)

        return after(p)

    if (after(p) < k)

        scan forward, i.e., update p ← after(p)

    else (i.e., after(p) > k)

        if (Scur == S0), where Scur is the current skip list

            report "key k not found" and exit

        drop down to the next lower skip list
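The search procedure can be sketched in Python with explicit right and down pointers. The `Node` class, the helper `build_from_levels` and the example level contents are hypothetical and chosen only for illustration; they are not taken from the slides' figures.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.right = None   # next node in the same list
        self.down = None    # same key in the list one level below


def build_from_levels(levels):
    """levels[0] is S0; returns the first node (-inf) of the highest list."""
    below = {}   # key -> node in the list one level down
    head = None
    for level in levels:   # link the lists bottom-up
        nodes = [Node(key) for key in level]
        for left, right in zip(nodes, nodes[1:]):
            left.right = right
        for node in nodes:
            node.down = below.get(node.key)
        below = {node.key: node for node in nodes}
        head = nodes[0]
    return head


def search(head, k):
    """Return the node holding k, or None if k is not in the skip list."""
    p = head
    while True:
        nxt = p.right.key        # after(p)
        if nxt == k:
            return p.right
        if nxt < k:
            p = p.right          # scan forward
        elif p.down is None:     # already in S0: the key is absent
            return None
        else:
            p = p.down           # drop down


INF = float("inf")
head = build_from_levels([
    [-INF, 5, 9, 25, 30, 35, 38, 40, INF],   # S0
    [-INF, 9, 30, 38, INF],                  # S1
    [-INF, 30, INF],                         # S2
    [-INF, INF],                             # S3
])
print(search(head, 25) is not None, search(head, 28) is not None)
```

Because p only moves right when after(p) < k, it never steps onto the ∞ sentinel, so `p.right` is always defined for finite keys.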

Searching for a key 25

[Figure: the skip list with lists S3, S2, S1 and S0 (S0: -∞, 5, 9, 25, 30, 35, 38, 40, ∞) and the successive positions of pointer p as it drops down and scans forward. Key found.]

Searching for a key 28

[Figure: the same skip list with the successive positions of pointer p; the search ends in S0 between 25 and 30. Key not found.]

Deletion of a key

E.g., delete 30

[Figure: the same skip list with the positions of pointer p marked while deleting 30]

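Analysis

1. An element x_k is in S_i with probability 1/2^i (true for all elements).

$$E(|S_i|) = E\left(\sum_{k=1}^{n} X_{ki}\right) = \sum_{k=1}^{n} \frac{1}{2^i} = \frac{n}{2^i}$$

where X_ki = 1 if x_k is in S_i, and 0 otherwise.

$$E(\text{total size}) = E\left(\sum_{i} |S_i|\right) = \sum_{i=0}^{\infty} \frac{n}{2^i} \le 2n$$

2. Expected height of a skip list: h = log n

n/2^h = 1, so h ≃ log n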

Analysis (contd.)

3. Drop down: O(log n)

Since pointer p can drop at most h times (i.e., the height of the skip list) until S0 is reached, and h = log n.

4. Scan forward: O(log n)

    No. of elements to scan at each level | Total no. of levels | Total cost
    O(1)                                   | O(log n)            | O(log n)

The expected number of elements scanned at the i-th level is no more than 2, because:

The key lies between p and after(p) on the (i+1)-th level (that's why we came down to the i-th level). Each element of S_i is promoted to S_{i+1} with probability 1/2, so in expectation there is only one element of S_i between p and after(p) of the (i+1)-th level: the element pointed to by after(p) in S_i.

Thus we expect to scan at most two elements at S_i: the element pointed to by p (when we came down) and after(p) in S_i.

Hashing

• Motivation: symbol tables

– A compiler uses a symbol table to relate symbols to associated data

    • Symbols: variable names, procedure names, etc.

    • Associated data: memory location, call graph, etc.

– For a symbol table (also called a dictionary), we care about search, insertion, and deletion

– We typically don't care about sorted order

Hash Tables

• More formally:

– Given a table T and a record x, with key (= symbol) and satellite data, we need to support:

    • Insert(T, x)

    • Delete(T, x)

    • Search(T, x)

– We want these to be fast, but don't care about sorting the records

• The structure we will use is a hash table

– Supports all the above in O(1) expected time

Hash Functions

• Next problem: collision

[Figure: the actual keys K = {k1, …, k5}, a subset of the universe of keys U, are mapped by h into the slots 0 … m - 1 of table T; k2 and k5 collide because h(k2) = h(k5)]

Resolving Collisions

• How can we solve the problem of collisions?

• One solution is chaining

• Another solution: open addressing

Chaining

• Chaining puts elements that hash to the same slot in a linked list

[Figure: table T whose slots point to linked lists of the keys k1 … k8 (from K, a subset of U) that hash to them, e.g. k1 → k4 and k8 → k6; empty slots hold nil]

Chaining

• How do we insert an element?

[Figure: chained hash table T with the keys k1 … k8]

Chaining

• How do we delete an element?

[Figure: chained hash table T with the keys k1 … k8]

Chaining

• How do we search for an element with a given key?

[Figure: chained hash table T with the keys k1 … k8]
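The three questions above (insert, delete, search) can be answered with a short sketch. The class below is a hypothetical illustration, not the lecture's implementation: each chain is a Python list rather than a linked list, and Python's built-in `hash` stands in for the hash function h.

```python
class ChainedHashTable:
    def __init__(self, m=8):
        self.slots = [[] for _ in range(m)]   # one chain per slot

    def _chain(self, key):
        # h(key) = hash(key) mod m selects the slot.
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:                      # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        for k, v in self._chain(key):         # walk only the chain of one slot
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False


table = ChainedHashTable(m=4)
for key in (1, 5, 9, 2):      # 1, 5 and 9 collide: all are 1 mod 4
    table.insert(key, f"data-{key}")
print(table.search(5), table.delete(5), table.search(5))
```

All three operations touch a single chain, so their cost is governed by the chain length rather than by the total number of keys.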

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot

• Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot

• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)

• What will be the average cost of a successful search? A: O(1 + α/2) = O(1 + α)

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)

• If the number of keys n is proportional to the number of slots in the table, what is α?

• A: α = O(1)

– In other words, we can make the expected cost of searching constant if we make α constant
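The role of the load factor can be illustrated with a small simulation. The sketch below assumes simple uniform hashing by drawing a uniformly random slot for every key, and it counts only the chain elements examined by an unsuccessful search (the constant cost of computing the hash is left out); the function name, trial count and seed are hypothetical.

```python
import random


def average_unsuccessful_probes(n, m, trials=2000, seed=0):
    """Average number of chain elements examined when searching for an absent key."""
    rng = random.Random(seed)
    chain_lengths = [0] * m
    for _ in range(n):
        # Simple uniform hashing: each key lands in a uniformly random slot.
        chain_lengths[rng.randrange(m)] += 1
    # An absent key also hashes to a random slot and walks that whole chain.
    probes = sum(chain_lengths[rng.randrange(m)] for _ in range(trials))
    return probes / trials


for n, m in ((100, 100), (1000, 100), (1000, 1000)):
    # Second column: load factor n/m; third column: measured average chain walk.
    print((n, m), n / m, average_unsuccessful_probes(n, m))
```

The measured average tracks n/m, so keeping n proportional to m keeps the expected search cost constant.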

A Final Word About Randomized Algorithms

If we could prove this:

P(failure) < 1/k (we are sort of happy)

P(failure) < 1/n^k (most of the time this is true and we're happy)

P(failure) < 1/2^n (this is difficult, but still we want this)

Acknowledgements

• Kunal Verma

• Nidhi Aggarwal

• And other students of MSc (CS) batch 2009

END

Page 9: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Quicksort Expected number of comparisons

bull Partition may generate splits (0n-1 1n-2 2n-3 hellip n-21 n-

10) each with probability 1n

bull If T(n) is the expected running time

euro

T n( ) =1

nT k( ) + T n minus1minus k( )[ ] + Θ n( )

k= 0

nminus1

sum

Randomized Quick-Sort

• Pick an element from the array; call it the pivot

• Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot

• Use recursion to sort the two partitions

[ partition 1: items ≤ pivot | pivot | partition 2: items > pivot ]
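The three steps can be sketched as follows. This is a minimal, non-in-place version written for clarity; the function name and the optional `rng` parameter are assumptions of this example, and the pivot is assumed to be chosen uniformly at random.

```python
import random


def randomized_quicksort(items, rng=random):
    """Return a new sorted list; the pivot is picked uniformly at random."""
    if len(items) <= 1:
        return list(items)
    pivot_index = rng.randrange(len(items))
    pivot = items[pivot_index]
    rest = items[:pivot_index] + items[pivot_index + 1:]
    left = [x for x in rest if x <= pivot]   # partition 1: items <= pivot
    right = [x for x in rest if x > pivot]   # partition 2: items > pivot
    return randomized_quicksort(left, rng) + [pivot] + randomized_quicksort(right, rng)


if __name__ == "__main__":
    print(randomized_quicksort([38, 5, 30, 9, 40, 25, 35]))
```

An in-place partition would avoid the extra lists; the list-based form is used here only because it mirrors the slide's picture directly.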

Remarks

• Not much different from the earlier quicksort, except that earlier the algorithm was deterministic and the bounds were probabilistic

• Here the algorithm itself is also randomized: we pick the pivot at random. Notice that there is no difference in how the algorithm behaves from there onwards

• In the earlier case we can identify the worst-case input. Here no input is the worst case

Randomized Select

$$T(n) \le \frac{1}{n} \sum_{k=0}^{n-1} T\bigl(\max(k,\, n-k-1)\bigr) + \Theta(n)$$
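The recurrence describes a selection routine that recurses only into the side of the partition containing the wanted rank. A small sketch under that reading is given below; the function name, the 0-based rank convention, and the three-way split for duplicate keys are assumptions of this example, not something the slide specifies.

```python
import random


def randomized_select(items, i, rng=random):
    """Return the i-th smallest element of items (i = 0 gives the minimum)."""
    if not 0 <= i < len(items):
        raise IndexError("rank out of range")
    candidates = list(items)
    while True:
        pivot = candidates[rng.randrange(len(candidates))]
        smaller = [x for x in candidates if x < pivot]
        equal = [x for x in candidates if x == pivot]
        if i < len(smaller):
            candidates = smaller                 # answer lies left of the pivot
        elif i < len(smaller) + len(equal):
            return pivot                         # the pivot itself has rank i
        else:
            i -= len(smaller) + len(equal)       # answer lies right of the pivot
            candidates = [x for x in candidates if x > pivot]


if __name__ == "__main__":
    data = [38, 5, 30, 9, 40, 25, 35]
    print(randomized_select(data, 3))  # median of the seven keys
```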

Randomized Algorithms

• A randomized algorithm performs coin tosses (i.e., uses random bits) to control its execution

•   b ← random()
    if b = 0
        do A …
    else   (i.e., b = 1)
        do B …

• Its running time depends on the outcomes of the coin tosses

Assumptions

• The coins are unbiased, and

• the coin tosses are independent

• The worst-case running time of a randomized algorithm may be large, but it occurs with very low probability (e.g., it occurs when all the coin tosses give "heads")

Monte Carlo Algorithms

• Running times are guaranteed, but the output may not be completely correct

• Probability of error is low

Las Vegas Algorithms

• Output is guaranteed to be correct

• Bounds on running times hold with high probability

• What type of algorithm is randomized quicksort?

Why expected running times?

• Markov's inequality: for a non-negative random variable X with E(X) > 0 and any k > 0,

$P(X > k\,E(X)) < 1/k$

i.e., the probability that the algorithm takes more than 2 E(X) time is less than 1/2

Or, the probability that the algorithm takes more than 10 E(X) time is less than 1/10

This is the reason why quicksort does well in practice
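The bound can be checked empirically on randomized quicksort. The sketch below is an illustration only: it assumes n distinct keys, so that the rank of a random pivot is uniform, and it therefore simulates subproblem sizes instead of sorting real data. Both function names are made up for this example.

```python
import random


def quicksort_comparisons(n, rng):
    """Count comparisons of randomized quicksort on n distinct keys."""
    total = 0
    stack = [n]
    while stack:
        size = stack.pop()
        if size <= 1:
            continue
        total += size - 1            # compare every other key with the pivot
        k = rng.randrange(size)      # rank of the pivot: left part has k keys
        stack.append(k)
        stack.append(size - 1 - k)
    return total


def tail_fraction(samples, k):
    """Fraction of samples that exceed k times the sample mean."""
    mean = sum(samples) / len(samples)
    return sum(1 for x in samples if x > k * mean) / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [quicksort_comparisons(200, rng) for _ in range(300)]
    for k in (2, 10):
        print(k, tail_fraction(samples, k), "must be below", 1 / k)
```

In practice the observed fractions are typically far below 1/k, which is the gap that the stronger bounds on the next slide try to capture.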

Markov's Bound

$P(X > k\mu) < 1/k$, where k is a constant

Chernoff's Bound

$P(X > 2\mu) < 1/2$

A Stronger Result

$P(X > k\mu) < 1/n^k$, where k is a constant

A binary search tree can be built randomly

• A randomly selected key x becomes the root (pivot element = root)

• Rank(x) = i

[Figure: root x with its two subtrees, labelled < and >]

RANDOMLY BUILT BST

• X_i: the height of the tree rooted at a node with rank = i

• Y_i: the exponential height of the tree, Y_i = 2^{X_i}

• H = max(H_1, H_2) + 1

  where H_1: height of the left subtree

        H_2: height of the right subtree

        H: height of the tree rooted at x

HEIGHT OF THE TREE

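• $Y = 2^H = 2 \cdot \max(2^{H_1}, 2^{H_2})$

• Expected value of the exponential height EH of the tree T(n) with n nodes:

$$E(EH(T(n))) = \frac{2}{n} \sum_{k=0}^{n-1} E\bigl(\max(EH(T(k)),\, EH(T(n-1-k)))\bigr) = O(n^3)$$

• Since $2^{E(H)} \le E(2^H)$ (Jensen's inequality), this gives $E(H(T(n))) = O(\log n)$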
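The O(log n) expected height can be observed with a short simulation. The sketch below assumes that "randomly built" means inserting n distinct keys in a uniformly random order, which gives each key the same chance of becoming the root; the function name and the dictionary-based tree are choices made for this example.

```python
import random
from math import log2


def random_bst_height(n, rng):
    """Height in edges of a BST built from n >= 1 distinct keys in random order."""
    keys = list(range(n))
    rng.shuffle(keys)
    left, right = {}, {}          # child maps: parent key -> child key
    root = keys[0]
    height = 0
    for key in keys[1:]:
        node, depth = root, 0
        while True:
            depth += 1
            child = left if key < node else right
            if node not in child:
                child[node] = key  # attach the new key as a leaf
                break
            node = child[node]
        height = max(height, depth)
    return height


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (100, 1_000, 10_000):
        heights = [random_bst_height(n, rng) for _ in range(20)]
        print(n, sum(heights) / len(heights), round(log2(n), 1))
```

The printed averages grow roughly in step with log n, well below the worst case of n - 1.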

A skip list is a data structure that can be used to maintain a dictionary.

Given n keys, we insert these n keys into a linked list that has -∞ as its first node and +∞ as its last node.

This is the initial list S0.

Then we flip a coin for each element of Si: if a tail occurs, we insert the element into the next list Si+1, and so on, until only one element is left in Si.

S0:  -∞  5  9  25  30  35  38  40  +∞
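The construction can be sketched as repeated coin flipping over the previous list. In this illustration the ±∞ sentinels are left out, a random number below 0.5 plays the role of "tail", and the name `build_levels` is invented for the example.

```python
import random


def build_levels(keys, rng=random):
    """Return [S0, S1, ...]; each key of S_i enters S_{i+1} with probability 1/2."""
    levels = [sorted(keys)]
    # Keep promoting until the top list has at most one key left.
    while len(levels[-1]) > 1:
        promoted = [k for k in levels[-1] if rng.random() < 0.5]
        levels.append(promoted)
    return levels


if __name__ == "__main__":
    rng = random.Random(3)
    for i, level in enumerate(build_levels([5, 9, 25, 30, 35, 38, 40], rng)):
        print(f"S{i}:", level)
```

Because promotion only ever picks keys from the list below, every list is a subset of the one beneath it, which is what makes "drop down" well defined.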

Skip List Dictionary as ADT

S3:  -∞  30  +∞
S2:  -∞  30  38  +∞
S1:  -∞  9  30  38  +∞
S0:  -∞  5  9  25  30  35  38  40  +∞

(head: the -∞ nodes; tail: the +∞ nodes)

Operations that can be performed on a skip list

Each node has two pointers: right and down.

1. Drop down

• This operation is performed when after(p) > key

• In this operation the pointer p moves down to the immediately lower-level list

[Figure: p dropping from S1 (-∞ 30 +∞) to S0 (-∞ 9 30 38 +∞), with the right and down pointers marked]

2. Scan forward

• This operation is performed when after(p) < key

• Here the pointer p moves to the next element in the list

• E.g., here key = 28 and p is at 9; after(9) < 28, so scan forward

S0:  -∞  9  25  30  +∞   (successive positions of p marked along S0)

Searching for a key k

Keep a pointer p to the first node in the highest list Sh.

repeat
    if (after(p) == k)
        return after(p)
    if (after(p) < k)
        scan forward, i.e., update p ← after(p)
    else   (after(p) > k)
        if (Scur == S0)   (Scur is the current skip list)
            report "key k not found" and exit
        drop down to the next skip list
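The search loop can be written out with explicit right and down pointers. The sketch below is one possible rendering: the `Node` class, `build_from_levels`, and the example lists in the usage section are assumptions made for illustration (the lists are one plausible reading of the figures, not a confirmed transcription).

```python
import math


class Node:
    def __init__(self, key):
        self.key = key
        self.right = None
        self.down = None


def build_from_levels(levels):
    """Link lists given bottom-first (S0, S1, ...) and return the top-left node."""
    below = {}   # key -> node holding that key in the list just below
    head = None
    for keys in levels:
        nodes = [Node(k) for k in [-math.inf, *keys, math.inf]]
        for a, b in zip(nodes, nodes[1:]):
            a.right = b
        for node in nodes:
            node.down = below.get(node.key)   # None for every node of S0
        below = {node.key: node for node in nodes}
        head = nodes[0]
    return head


def search(head, k):
    """Return the node holding k, or None if k is not in the skip list."""
    p = head
    while True:
        nxt = p.right              # after(p)
        if nxt.key == k:
            return nxt
        if nxt.key < k:
            p = nxt                # scan forward
        elif p.down is None:
            return None            # already in S0: key not found
        else:
            p = p.down             # drop down


if __name__ == "__main__":
    head = build_from_levels([[5, 9, 25, 30, 35, 38, 40], [9, 30, 38], [30], []])
    print(search(head, 25) is not None)   # key 25 is present
    print(search(head, 28) is not None)   # key 28 is absent
```

The +∞ sentinel guarantees that after(p) always exists, so the loop needs no separate end-of-list check.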

Searching for a key 25

[Figure: the skip list S0 to S3 over the keys 5, 9, 25, 30, 35, 38, 40, with the successive positions of p marked as it drops down and scans forward. Result: key found]

Searching for a key 28

[Figure: the skip list S0 to S3 over the keys 5, 9, 25, 30, 35, 38, 40, with the successive positions of p marked as it drops down and scans forward. Result: key not found]

Deletion of a key

E.g., delete 30

[Figure: the skip list S0 to S3 over the keys 5, 9, 25, 30, 35, 38, 40, with the successive positions of p marked]

Analysis

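1. An element x_k is in S_i with probability 1/2^i; this is true for all elements.

Let X_{ki} = 1 if x_k is in S_i, and 0 otherwise. Then

$$E(|S_i|) = E\Bigl(\sum_{k=1}^{n} X_{ki}\Bigr) = \sum_{k=1}^{n} \frac{1}{2^i} = \frac{n}{2^i}$$

$$E(\text{total size}) = E\Bigl(\sum_{i} |S_i|\Bigr) = \sum_{i=0}^{\infty} \frac{n}{2^i} \le 2n$$

2. Expected height of a skip list: h ≈ log n

The top list is expected to hold a single element, so

$$\frac{n}{2^h} = 1 \quad\Rightarrow\quad h \simeq \log n$$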
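Both estimates can be checked with a small simulation. The sketch assumes independent fair coin flips and no cap on the number of lists; `tower_height` and `simulate` are names introduced only for this example.

```python
import random
from math import log2


def tower_height(rng):
    """Number of lists S0, S1, ... that contain one element (always >= 1)."""
    h = 1
    while rng.random() < 0.5:   # promoted to the next list with probability 1/2
        h += 1
    return h


def simulate(n, rng):
    """Return (total size of all lists, number of non-empty lists) for n keys."""
    heights = [tower_height(rng) for _ in range(n)]
    return sum(heights), max(heights)


if __name__ == "__main__":
    rng = random.Random(7)
    n = 1_000
    runs = [simulate(n, rng) for _ in range(50)]
    print("average total size:", sum(s for s, _ in runs) / len(runs), "vs 2n =", 2 * n)
    print("average height:", sum(h for _, h in runs) / len(runs), "vs log2(n) =", round(log2(n), 1))
```

The average total size comes out close to 2n, and the number of non-empty lists stays within a small additive term of log n.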

Analysis(contd)

3. Drop down: O(log n)

Since the pointer p can drop at most h times (i.e., the height of the skip list) until S0 is reached,

and h ≈ log n

4. Scan forward: O(log n)

No. of elements to scan at each level    Total no. of levels    Total cost
O(1)                                     O(log n)               O(log n)

The expected number of elements scanned at the ith level is no more than 2, because:

The key lies between p and after(p) on the (i+1)th level (that's why we came down to the ith level). In expectation, there is only one element of Si between p and after(p) of the (i+1)th level: the element pointed to by after(p) in Si.

Thus, on average, we scan at most two elements at Si: the element pointed to by p (when we came down) and after(p) in Si.

Hashing

• Motivation: symbol tables
  – A compiler uses a symbol table to relate symbols to associated data
    • Symbols: variable names, procedure names, etc.
    • Associated data: memory location, call graph, etc.
  – For a symbol table (also called a dictionary), we care about search, insertion, and deletion
  – We typically don't care about sorted order

Hash Tables

• More formally:
  – Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
    • Insert(T, x)
    • Delete(T, x)
    • Search(T, x)
  – We want these to be fast, but don't care about sorting the records

• The structure we will use is a hash table
  – Supports all the above in O(1) expected time

Hash Functions

• Next problem: collision

[Figure: a hash function h maps the actual keys K = {k1, …, k5}, drawn from the universe of keys U, to slots 0 … m - 1 of the table T; h(k2) = h(k5) is a collision]

Resolving Collisions

• How can we solve the problem of collisions?

• One solution is chaining

• Other solutions: open addressing

Chaining

• Chaining puts elements that hash to the same slot in a linked list

[Figure: the actual keys K = {k1, …, k8}, drawn from the universe of keys U, hashed into the table T; colliding keys share one chain (e.g., k1 → k4 and k8 → k6) and empty slots hold a null pointer]

Chaining

• How do we insert an element?

• How do we delete an element?

• How do we search for an element with a given key?

[Figure: the chained hash table T holding the keys k1 … k8, repeated for each question]
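One way to answer the three questions is sketched below. It is an illustration under stated assumptions: Python lists stand in for the linked lists, Python's built-in `hash` reduced modulo m plays the role of h, and the class and method names are invented for this example.

```python
class ChainedHashTable:
    def __init__(self, m=8):
        self.slots = [[] for _ in range(m)]   # one chain per slot of T

    def _chain(self, key):
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        """Add (key, value) to the chain of its slot, replacing an old value."""
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        """Walk the chain of the key's slot; return the value or None."""
        for k, v in self._chain(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        """Remove the key from its chain; return True if it was present."""
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False


if __name__ == "__main__":
    table = ChainedHashTable(m=4)
    table.insert(1, "k1")
    table.insert(5, "k5")        # 1 and 5 land in the same slot when m = 4
    print(table.search(5), table.delete(1), table.search(1))
```

All three operations cost time proportional to the length of one chain, which is why the analysis that follows focuses on the average chain length.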

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot

• Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot

• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)

• What will be the average cost of a successful search? A: O(1 + α/2) = O(1 + α)
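Both answers can be checked by simulation. The sketch below assumes simple uniform hashing (each key goes to a uniformly random slot) and that new keys are appended at the end of their chain; the function name `average_probes` is made up for this example, and the constant cost of computing the hash is not counted.

```python
import random


def average_probes(n, m, rng):
    """Return (unsuccessful, successful) average numbers of chain elements examined."""
    chain_len = [0] * m
    successful_total = 0
    for _ in range(n):
        slot = rng.randrange(m)                # simple uniform hashing
        chain_len[slot] += 1
        successful_total += chain_len[slot]    # position of this key in its chain
    # A search for an absent key walks one whole chain; averaged over the
    # slots this is exactly n / m, the load factor.
    unsuccessful = sum(chain_len) / m
    # A search for a stored key stops at that key's position in its chain.
    successful = successful_total / n
    return unsuccessful, successful


if __name__ == "__main__":
    rng = random.Random(11)
    for n, m in ((5_000, 10_000), (20_000, 10_000), (80_000, 10_000)):
        miss, hit = average_probes(n, m, rng)
        print(f"alpha={n / m:.1f}  unsuccessful={miss:.2f}  successful={hit:.2f}")
```

The successful-search column tracks 1 + α/2 closely, and the unsuccessful column equals α.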

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)

• If the number of keys n is proportional to the number of slots in the table, what is α?

• A: α = O(1)
  – In other words, we can make the expected cost of searching constant if we make α constant

A Final Word About Randomized Algorithms

If we could prove this:

P(failure) < 1/k (we are sort of happy)

P(failure) < 1/n^k (most of the time this is true and we're happy)

P(failure) < 1/2^n (this is difficult, but we still want this)

Acknowledgements

• Kunal Verma

• Nidhi Aggarwal

• And other students of the MSc (CS) batch 2009

END

Page 12: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Randomized Select

1

0

1max1 n

k

nknTkTn

nT

Randomized Algorithms

bull A randomized algorithm performs coin tosses (ieuses random bits) to control its execution

bull b larr random()if b = 0do A hellipelse ie b = 1do B hellip

bull Its running time depends on the outcomes of the coin tosses

Assumptions

bull 1048708 the coins are unbiased andbull 1048708 the coin tosses are independent

bull The worst-case running time of a randomized algorithm may be large but occurs with very low probability (eg it occurs when all the coin tosses give ldquoheadsrdquo)

Monte Carlo Algorithms

bull Running times are guaranteed but the output may not be completely correct

bull Probability of error is low

Las Vegas Algorithms

bull Output is guaranteed to be correct

bull Bounds on running times hold with high probability

bull What type of algorithm is Randomized Qsort

Why expected running times

bull Markovrsquos inequality

P( X gt k E(X)) lt 1k

ie the probability that the algorithm will take more than O(2 E(X)) time is less than 12

Or the probability that the algorithm will take more than O(10 E(X)) time is less than 110

This is the reason why Qsort does well in practice

Markovrsquos Bound

P(XltkM)lt 1k where k is a constant

Chernouffrsquos Bound

P(Xgt2μ)lt frac12

A More Stronger Result

P(Xgtk μ )lt 1nk where k is a constant

A binary search tree can be built randomly

• Randomly selected key becomes the root

• Pivot element = root

[Figure: root x with Rank(x) = i; keys < x go to the left subtree, keys > x go to the right subtree]

RANDOMLY BUILT BST

• $X_i$: the height of the tree rooted at a node with rank = i

• $Y_i$: exponential height of the tree = $2^{X_i}$

• $H = \max\{H_1, H_2\} + 1$

where $H_1$: height of the left subtree

$H_2$: height of the right subtree

$H$: height of the tree rooted at x

HEIGHT OF THE TREE

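• $Y = 2^H = 2 \cdot \max\{2^{H_1}, 2^{H_2}\}$

• Expected value of the exponential height of the tree with n nodes:

$E(EH(T(n))) = \frac{2}{n} \sum_{k=0}^{n-1} E\big(\max\{EH(T(k)),\, EH(T(n-1-k))\}\big) = O(n^3)$

Since $2^{E(H)} \le E(2^H)$ (Jensen's inequality), this gives $E(H(T(n))) = O(\log n)$.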
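The recurrence H = max{H1, H2} + 1 can be simulated directly: pick a random key as the root, split the remaining keys around it, and recurse. This is only an illustrative sketch; the function name, the convention that an empty tree has height -1, and the sizes n = 500 and 20 trials are assumptions.

```python
import math
import random

def random_bst_height(keys):
    """Height (in edges) of a randomly built BST on distinct keys; -1 if empty."""
    if not keys:
        return -1
    root = random.choice(keys)                 # a random key becomes the root
    left = [x for x in keys if x < root]       # keys of smaller rank
    right = [x for x in keys if x > root]      # keys of larger rank
    return 1 + max(random_bst_height(left), random_bst_height(right))

n = 500
heights = [random_bst_height(list(range(n))) for _ in range(20)]
average_height = sum(heights) / len(heights)
log2_n = math.log2(n)                          # reference value for comparison
```

Comparing `average_height` with `log2_n` for a few values of n shows the height growing like a constant multiple of log n rather than like n.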

A skip list is a data structure that can be used to maintain a dictionary.

Given n keys, we insert these n keys in a linked list that has -∞ as its first node and ∞ as its last node.

Initial list S0

Then we flip a coin for each element of Si: if a tail occurs, we insert the element into the next list Si+1. We repeat this level by level until only one element is left in Si.

S0: -∞ 5 9 25 30 35 38 40 ∞

Skip List Dictionary as ADT

S3: -∞ 30 ∞
S2: -∞ 30 38 ∞
S1: -∞ 9 30 38 ∞
S0: -∞ 5 9 25 30 35 38 40 ∞

(the -∞ column is the head, the ∞ column is the tail)

Operations that can be performed on a skip list

Each node has two pointers: right and down

1. Drop down

• This operation is performed when after(p) > key

• In this operation, pointer p moves down to the immediately lower level list

[Figure: drop down. S1: -∞ 30 ∞; S0: -∞ 9 30 38 ∞; p moves from its node in S1 to the same node in S0 along the down pointer]

2. Scan forward

• This operation is performed when after(p) < key

• Here the pointer p moves to the next element in the list

• e.g., here key = 28 and p is at 9; after(9) < 28, so scan forward

[Figure: scan forward in S0: -∞ 9 25 30 ∞, with p advancing along the list]

Searching for a key k

Keep a pointer p to the first node in the highest list Sh

while (true)
    if (after(p) == k)
        return after(p)
    if (after(p) < k)
        scan forward, i.e., update p ← after(p)
    else (after(p) > k)
        if (Scur == S0)          (Scur is the current skip list)
            then "key k not found"; exit
        drop down to the next lower skip list
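The search procedure can be sketched with explicit right and down pointers. Everything named here (`Node`, `build_skip_list`, `search`) is hypothetical, and the builder makes one simplifying assumption: it keeps adding levels until a level holds only the two sentinels, rather than stopping when one element is left.

```python
import random

NEG_INF, POS_INF = float("-inf"), float("inf")

class Node:
    def __init__(self, key):
        self.key = key
        self.right = None     # next node in the same list
        self.down = None      # same key in the list one level lower

def build_skip_list(keys):
    """Build S0, S1, ... bottom-up; return the -inf node of the top list."""
    level_keys = sorted(set(keys))
    below = {}                                  # key -> node one level lower
    while True:
        nodes = [Node(k) for k in [NEG_INF] + level_keys + [POS_INF]]
        for a, b in zip(nodes, nodes[1:]):
            a.right = b
        for node in nodes:
            node.down = below.get(node.key)     # None on the bottom list S0
        below = {node.key: node for node in nodes}
        if not level_keys:
            return nodes[0]                     # top list: sentinels only
        # coin flip per element: promote to the next list with probability 1/2
        level_keys = [k for k in level_keys if random.random() < 0.5]

def search(head, k):
    """Return a node holding k, or None, using drop down and scan forward."""
    p = head
    while True:
        nxt = p.right                           # after(p)
        if nxt.key == k:
            return nxt
        if nxt.key < k:
            p = nxt                             # scan forward
        elif p.down is None:
            return None                         # already in S0: key not found
        else:
            p = p.down                          # drop down

head = build_skip_list([5, 9, 25, 30, 35, 38, 40])
found, missing = search(head, 25), search(head, 28)
```

The result of a search does not depend on the coin flips; they only change how many nodes the pointer p visits on the way.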

Searching for a key 25

[Figure: skip list with S3: -∞ ∞; S2: -∞ 30 ∞; S1: -∞ 9 30 38 ∞; S0: -∞ 5 9 25 30 35 38 40 ∞. The successive positions of p are marked along the search path. Key found.]

Searching for a key 28

[Figure: skip list with S3: -∞ ∞; S2: -∞ 30 ∞; S1: -∞ 9 30 38 ∞; S0: -∞ 5 9 25 30 35 38 40 ∞. The successive positions of p are marked along the search path. Key not found.]

Deletion of a key

e.g., delete 30

[Figure: skip list with S3: -∞ ∞; S2: -∞ 30 ∞; S1: -∞ 9 30 38 ∞; S0: -∞ 5 9 25 30 35 38 40 ∞. The successive positions of p are marked along the path to key 30.]

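Analysis

1. An element $x_k$ is in $S_i$ with probability $1/2^i$ (true for all elements).

$E(|S_i|) = \sum_{k=1}^{n} E(X_{ki}) = \sum_{k=1}^{n} \frac{1}{2^i} = \frac{n}{2^i}$, where $X_{ki} = 1$ if $x_k$ is in $S_i$ and $0$ otherwise.

$E(\text{total size}) = E\big(\sum_{i} |S_i|\big) = \sum_{i=0}^{\infty} \frac{n}{2^i} \le 2n$

2. Expected height of a skip list: $h = \log n$

$n/2^h = 1$

$h \simeq \log n$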
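The two estimates (total size about 2n, height about log n) can be compared with a quick simulation that tracks only the list sizes. The function name `level_sizes` and the choices n = 1024 and 50 trials are assumptions for illustration.

```python
import random

def level_sizes(n):
    """Sizes |S0|, |S1|, ... when each element is promoted with probability 1/2."""
    sizes = [n]
    while sizes[-1] > 0:
        promoted = sum(1 for _ in range(sizes[-1]) if random.random() < 0.5)
        sizes.append(promoted)
    return sizes          # the last entry is 0: a list with no keys left

n = 1024
trials = [level_sizes(n) for _ in range(50)]
avg_total = sum(sum(s) for s in trials) / len(trials)         # compare with 2n
avg_height = sum(len(s) - 1 for s in trials) / len(trials)    # compare with log2(n) = 10
```

Averaged over the trials, the total size should come out close to 2n and the number of non-empty levels close to log2(n) plus a small constant.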

Analysis (contd.)

3. Drop down: O(log n)

Since pointer p can drop at most h times (i.e., the height of the skip list) until S0 is reached, and h = log n.

4. Scan forward: O(log n)

No. of elements to scan at each level    Total no. of levels    Total cost
O(1)                                     O(log n)               O(log n)

The number of elements scanned at the ith level is no more than 2, because:

The key lies between p and after(p) on the (i+1)th level (that's why we came down to the ith level). And, on average, there is only one element of Si between p and after(p) of the (i+1)th level: the element pointed to by after(p) in Si.

Thus we scan at most two elements at Si: the element pointed to by p (when we came down) and after(p) in Si.

Hashing

• Motivation: symbol tables
  – A compiler uses a symbol table to relate symbols to associated data
    • Symbols: variable names, procedure names, etc.
    • Associated data: memory location, call graph, etc.
  – For a symbol table (also called a dictionary), we care about search, insertion, and deletion
  – We typically don't care about sorted order

Hash Tables

• More formally:
  – Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
    • Insert(T, x)
    • Delete(T, x)
    • Search(T, x)
  – We want these to be fast, but don't care about sorting the records

• The structure we will use is a hash table
  – Supports all the above in O(1) expected time

Hash Functions

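• Next problem: collision

[Figure: a hash function h maps the keys k1, ..., k5 of K (actual keys), a subset of U (universe of keys), to slots 0 .. m - 1 of the table T: h(k1), h(k4), h(k2) = h(k5), h(k3). The keys k2 and k5 collide.]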

Resolving Collisions

• How can we solve the problem of collisions?

• One solution is chaining

• Other solutions: open addressing

Chaining

• Chaining puts elements that hash to the same slot in a linked list

[Figure: table T in which each slot points to the linked list of the keys from K (actual keys, a subset of the universe of keys U) that hash to it: k1 → k4; k5 → k2 → k3; k8 → k6; k7. Empty slots hold a nil pointer.]

• How do we insert an element?

• How do we delete an element?

• How do we search for an element with a given key?
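One possible answer to the three questions about chaining (insert, delete, search) is sketched below. The class name `ChainedHashTable` and its methods are hypothetical, Python's built-in `hash` stands in for the hash function h, and each chain is a Python list rather than a hand-built linked list.

```python
class ChainedHashTable:
    def __init__(self, m=8):
        # one chain of (key, value) pairs per slot
        self.slots = [[] for _ in range(m)]

    def _chain(self, key):
        # h(key): index of the slot whose chain may hold the key
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)    # key already present: overwrite
                return
        chain.append((key, value))

    def search(self, key):
        for k, v in self._chain(key):      # walk only the one chain
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False

table = ChainedHashTable(m=4)
for i in range(10):
    table.insert(i, i * i)
```

All three operations touch a single chain, so their cost is governed by the chain length rather than by the total number of keys.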

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot

• Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot

• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)

• What will be the average cost of a successful search? A: O((1 + α)/2) = O(1 + α)

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)

• If the number of keys n is proportional to the number of slots in the table, what is α?

• A: α = O(1)
  – In other words, we can make the expected cost of searching constant if we make α constant
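The role of the load factor can be illustrated by throwing n keys into m slots uniformly at random, which is the simple uniform hashing assumption. The function name `chain_stats` and the sizes used are assumptions for this sketch.

```python
import random

def chain_stats(n, m):
    """Hash n keys into m slots uniformly at random.

    Returns (mean chain length, longest chain length).
    """
    chain_lengths = [0] * m
    for _ in range(n):
        chain_lengths[random.randrange(m)] += 1   # each slot equally likely
    return sum(chain_lengths) / m, max(chain_lengths)

alpha, longest = chain_stats(n=10_000, m=5_000)
```

The mean chain length always equals n/m, so an unsuccessful search, which scans one whole chain, costs α on average plus the time to compute the hash. The longest chain is usually noticeably larger than α, which is why the O(1 + α) bound is stated for the average and not for the worst case.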

A Final Word About Randomized Algorithms

If we could prove this:

$P(\text{failure}) < 1/k$ (we are sort of happy)

$P(\text{failure}) < 1/n^k$ (most of the time this is true, and we're happy)

$P(\text{failure}) < 1/2^n$ (this is difficult, but still we want this)

Acknowledgements

• Kunal Verma

• Nidhi Aggarwal

• And other students of the M.Sc. (CS) batch of 2009

END

  • Slide 1
  • Expected Running Time of Insertion Sort
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Quick-Sort
  • Quicksort Expected number of comparisons
  • Randomized Quick-Sort
  • Remarks
  • Randomized Select
  • Randomized Algorithms
  • Assumptions
  • Monte Carlo Algorithms
  • Las Vegas Algorithms
  • Why expected running times
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Skip List Dictionary as ADT
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Hashing
  • Hash Tables
  • Hash Functions
  • Resolving Collisions
  • Chaining
  • Slide 38
  • Slide 39
  • Slide 40
  • Analysis of Chaining
  • Slide 42
  • Slide 43
  • Slide 44
  • Analysis of Chaining Continued
  • A Final Word About Randomized Algorithms
  • Acknowledgements
  • Slide 48
Page 15: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Monte Carlo Algorithms

bull Running times are guaranteed but the output may not be completely correct

bull Probability of error is low

Las Vegas Algorithms

bull Output is guaranteed to be correct

bull Bounds on running times hold with high probability

bull What type of algorithm is Randomized Qsort

Why expected running times

bull Markovrsquos inequality

P( X gt k E(X)) lt 1k

ie the probability that the algorithm will take more than O(2 E(X)) time is less than 12

Or the probability that the algorithm will take more than O(10 E(X)) time is less than 110

This is the reason why Qsort does well in practice

Markovrsquos Bound

P(XltkM)lt 1k where k is a constant

Chernouffrsquos Bound

P(Xgt2μ)lt frac12

A More Stronger Result

P(Xgtk μ )lt 1nk where k is a constant

Binary search tree can be built randomly

Rank(x)=i

Randomly selected key becomes the root

Pivot element=root

x

gtlt

RANDOMLY BUILT BST

bull Xi the height of the tree rooted at a node with rank=i

bull Yi exponential height of the tree=2^Xi

bull H=maxH1H2 + 1

where H1 ht of left subtree

H2 htof right subtree

H ht of the tree rooted at x

HEIGHT OF THE TREE

bull Y=2^H

=2max2^H12^H2

bull Expected value of exponential ht of the tree with lsquonrsquo nodes

=E(EH(T(X)))

=2n sum maxEH(T(k))EH(T(n-1-k))

=O(n^3)=E(H(T(n)))=O(log n)

Skip list is a data structure that can be used to maintain dictionary

Given n keys we insert these n keys in a linked list that has -infin as first node and infin as last node

Initial list S0

Then we flip coin a coin for each element until only one is left in Si if a tail occurswe insert it into next list Si+1 and so on

-infin infin5 9 25 30 35 38 40

Skip List Dictionary as ADT

-infin

-infin

-infin

S3

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

9 30 38

30

38 30

head Tail

Operations that can be performed on skip list

Each node has two pointers right and down

1 Drop down

bull This operation is performed when after(p)gtkey

bull In this operation pointer p moves down to immediate lower level list

(after drop down)

right

down

-infin

-infin infin

infin 30

309 38

p

S1

S0

2Scan forward

bull This operation is performed when after(p)ltkey

bull Here the pointer p moves to the next element in the list

bull eg here key=28 amp p is at 9 after(9)lt28 so scan forward

-infin 9 infin 25 30

p p pS0

Searching a key kKeep a ptr p to the first node in the highest list Sh

while (after(p)gtk)

if (Scur==S0) Scur is the current skip list

then ldquokey k not foundrdquo

exit

if (after(p)gtk)

drop down to next skip list

If (after(p)ltk)

scan forward ie update pafter(p)

if (after(p)==k)

return after(p)

-infin

-infin

-infin

Searching for a key 25

S3

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

30

9 30 38

p

p

p

Key found

p

p

p

-infin

-infin

-infin

Searching for a key 28

S3

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

30

9 30 38

p

p

p

Key not found

p

p

p

p

-infin

-infin

-infin

Deletion of a key

S3 eg delete 30

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

30

9 30 38

p

p

p

p

p

Analysis

1 An element xk is in Si with probability 12i true forall elements

E(Si ) = sum 12i Xki where Xki = 1 if xk is in Si

0 otherwise

= n2i

E(total size) = E(sum ISi I)

= sum n2i le 2n

2 Expected height of a skip listh = log n

n2h =1

h ≃ log n

n

k=1

k=1

infin

Analysis(contd)

3 Drop down O(log n)

Since pointer p can drop atmost h times

ieheight of the skip list until S0 is reached

and h = logn

4 Scan forward O(log n)

of elements Total no of levels Total Cost

to scan at each level

O(1) O(log n ) O(log n )

The number of elements scanned at ith level is no more than 2 because

The key lies between p and after(p) on the (i+1)th level (thatrsquos why we came down to ith level) And there is only one element between p and after(p) of

(i+1)th level in Si the element pointed to by after(p) in Si

Thus we scan at most two elements at Si the element pointed to by p (when we came down) and after(p) in Si

Hashing

bull Motivation symbol tablesndash A compiler uses a symbol table to relate

symbols to associated databull Symbols variable names procedure names etcbull Associated data memory location call graph etc

ndash For a symbol table (also called a dictionary) we care about search insertion and deletion

ndash We typically donrsquot care about sorted order

Hash Tables

bull More formallyndash Given a table T and a record x with key (= symbol)

and satellite data we need to supportbull Insert (T x)bull Delete (T x)bull Search(T x)

ndash We want these to be fast but donrsquot care about sorting the records

bull The structure we will use is a hash tablendash Supports all the above in O(1) expected time

Hash Functions

[Figure: a hash function h maps the actual keys K = {k1, ..., k5}, drawn from the universe of keys U, to slots 0 to m - 1 of the table T. Keys k2 and k5 land in the same slot: h(k2) = h(k5).]

• Next problem: collision

Resolving Collisions

• How can we solve the problem of collisions?
• One solution is chaining
• Other solutions: open addressing

Chaining

• Chaining puts elements that hash to the same slot in a linked list

[Figure: keys k1 to k8 from K (actual keys), a subset of U (universe of keys), are hashed into the table T. Keys that collide share a chain, for example k1 and k4, k5 and k2, k8 and k6; empty slots hold a null pointer.]

• How do we insert an element?
• How do we delete an element?
• How do we search for an element with a given key?
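The three questions above can be answered with a short sketch. This is a minimal illustration of chaining in Python, not the implementation from the lecture: the class name `ChainedHashTable`, the use of Python lists in place of linked lists, and the hash function h(k) = hash(k) mod m are all assumptions made for the example.

```python
class ChainedHashTable:
    """Hash table with chaining: slot j holds a list of (key, value) pairs."""

    def __init__(self, m=8):
        self.slots = [[] for _ in range(m)]  # m empty chains

    def _chain(self, key):
        # Assumed hash function: h(k) = hash(k) mod m.
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))

    def search(self, key):
        # Walk only the chain of the slot that the key hashes to.
        for k, v in self._chain(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False
```

Every operation touches a single chain, so its cost is proportional to the length of that chain plus the time to compute the hash.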

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot
• Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot
• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)
• What will be the average cost of a successful search? A: O(1 + α/2) = O(1 + α)
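Both answers can be illustrated with a small simulation. The sketch below assumes simple uniform hashing is modelled by picking a slot uniformly at random for every key; the function names and the sizes n = 20,000 and m = 10,000 are illustrative assumptions. It counts only the elements examined in the chain, so the extra "1" for computing the hash is left out of the unsuccessful-search figure.

```python
import random

def chain_lengths(n, m, rng):
    """Throw n keys into m slots uniformly at random; return each slot's chain length."""
    lengths = [0] * m
    for _ in range(n):
        lengths[rng.randrange(m)] += 1
    return lengths

def unsuccessful_cost(lengths):
    """Average number of elements examined when a random slot's whole chain is scanned."""
    return sum(lengths) / len(lengths)

def successful_cost(lengths):
    """Average 1-based position of a stored key within its chain."""
    n = sum(lengths)
    return sum(length * (length + 1) // 2 for length in lengths) / n

rng = random.Random(3)
n, m = 20_000, 10_000
lengths = chain_lengths(n, m, rng)
alpha = n / m
print("alpha:", alpha)
print("unsuccessful search, elements examined:", unsuccessful_cost(lengths))
print("successful search, elements examined:", successful_cost(lengths))
```

The unsuccessful figure equals α exactly, since the chain lengths always sum to n, while the successful figure should come out close to 1 + α/2.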

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)
• If the number of keys n is proportional to the number of slots in the table, what is α?
• A: α = O(1)
  - In other words, we can make the expected cost of searching constant if we make α constant

A Final Word About Randomized Algorithms

If we could prove this:

P(failure) < 1/k (we are sort of happy)

P(failure) < 1/n^k (most of the time this is true, and we're happy)

P(failure) < 1/2^n (this is difficult, but still we want this)

Acknowledgements

• Kunal Verma
• Nidhi Aggarwal
• And other students of MSc (CS), batch 2009

END

Page 16: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Las Vegas Algorithms

• Output is guaranteed to be correct
• Bounds on running times hold with high probability
• What type of algorithm is Randomized Qsort?
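Randomized quicksort is a Las Vegas algorithm: the random choices affect only its running time, never its output. A minimal sketch follows; the function name `randomized_quicksort` and the list-based partitioning are assumptions made for illustration rather than the lecture's in-place version.

```python
import random

def randomized_quicksort(items, rng=random):
    """Return a sorted copy of items; the pivot is chosen uniformly at random."""
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Whatever pivot was drawn, the concatenation below is sorted.
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)
```

Running it with different seeds changes the recursion pattern and therefore the running time, but the returned list is the same on every run.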

Why expected running times?

• Markov's inequality

P(X > k E(X)) < 1/k

i.e. the probability that the algorithm will take more than 2 E(X) time is less than 1/2.

Or the probability that the algorithm will take more than 10 E(X) time is less than 1/10.

This is the reason why Qsort does well in practice.
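To see how loose this guarantee is in practice, the sketch below counts the comparisons made by randomized quicksort over repeated runs and measures how often a run exceeds k times the average. It is an illustration under stated assumptions: keys are distinct, only subproblem sizes are tracked (which is enough to count comparisons), and the names `quicksort_comparisons` and `tail_fraction` are made up for this example.

```python
import random

def quicksort_comparisons(n, rng):
    """Count element-vs-pivot comparisons of randomized quicksort on n distinct keys."""
    total, stack = 0, [n]
    while stack:
        size = stack.pop()
        if size <= 1:
            continue
        total += size - 1              # every other element is compared with the pivot
        rank = rng.randrange(size)     # the pivot's rank is uniform in 0..size-1
        stack.append(rank)             # subproblem of keys smaller than the pivot
        stack.append(size - 1 - rank)  # subproblem of keys larger than the pivot
    return total

def tail_fraction(samples, k):
    """Fraction of samples strictly greater than k times the sample mean."""
    mean = sum(samples) / len(samples)
    return sum(1 for x in samples if x > k * mean) / len(samples)

rng = random.Random(0)
runs = [quicksort_comparisons(1000, rng) for _ in range(200)]
print("fraction above 2 x mean:", tail_fraction(runs, 2))
print("fraction above 10 x mean:", tail_fraction(runs, 10))
```

Markov's inequality only promises fractions below 1/2 and 1/10; the observed fractions are typically far smaller, which matches the remark that Qsort does well in practice.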

Markov's Bound

P(X > kμ) < 1/k, where k is a constant

Chernoff's Bound

P(X > 2μ) < 1/2

A Stronger Result

P(X > kμ) < 1/n^k, where k is a constant

Binary search tree can be built randomly

• Randomly selected key becomes the root
• Pivot element = root

[Figure: root x with Rank(x) = i; its two subtrees are labelled < and >.]

RANDOMLY BUILT BST

• Xi: the height of the tree rooted at a node with rank = i
• Yi: exponential height of the tree = 2^Xi
• H = max(H1, H2) + 1

where H1: height of the left subtree
      H2: height of the right subtree
      H: height of the tree rooted at x

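HEIGHT OF THE TREE

• Y = 2^H = 2 · max(2^H1, 2^H2)

• Expected value of the exponential height of the tree with n nodes, where EH(T(n)) denotes the exponential height of a random tree T(n) on n nodes and each rank is equally likely to be the root:

$E(EH(T(n))) = \frac{2}{n} \sum_{k=0}^{n-1} E\big(\max(EH(T(k)),\, EH(T(n-1-k)))\big) = O(n^3)$

Since $2^{E(H)} \le E(2^H)$ by Jensen's inequality, it follows that $E(H(T(n))) = O(\log n)$.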
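The O(log n) bound can be observed empirically. The following sketch, which is an illustration and not the lecture's code, inserts a random permutation into an unbalanced BST and reports the height in nodes; the function name `random_bst_height`, the dictionary-based child links, and n = 2000 are assumptions.

```python
import math
import random

def random_bst_height(n, rng):
    """Insert a random permutation of 0..n-1 into an unbalanced BST; return its height in nodes."""
    keys = list(range(n))
    rng.shuffle(keys)
    left, right = {}, {}  # child links, keyed by the parent's key
    root, height = None, 0
    for key in keys:
        if root is None:
            root, height = key, 1
            continue
        node, depth = root, 1
        while True:
            children = left if key < node else right
            depth += 1  # depth of the position one step below node
            if node not in children:
                children[node] = key  # free position found: attach the new key here
                break
            node = children[node]
        height = max(height, depth)
    return height

rng = random.Random(1)
n = 2000
heights = [random_bst_height(n, rng) for _ in range(20)]
print("average height:", sum(heights) / len(heights))
print("log2(n):", round(math.log2(n), 2))
```

The observed heights stay within a small constant factor of log2(n), far from the worst case of n.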

Skip list is a data structure that can be used to maintain a dictionary.

Given n keys, we insert these n keys in a linked list that has -∞ as first node and ∞ as last node.

Initial list S0:

-∞ 5 9 25 30 35 38 40 ∞

Then we flip a coin for each element of Si: if a tail occurs, we insert the element into the next list Si+1, and so on, until only one element is left in Si.
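The construction can be sketched in a few lines of Python. This is a hedged illustration: the function name `build_levels` is an assumption, a tail is modelled as `rng.random() < 0.5`, and, unlike the slide, the sketch keeps flipping until a list comes out empty, so the top list contains only the two sentinels.

```python
import random

def build_levels(keys, rng):
    """Return [S0, S1, ...]; each key of S_i is copied to S_(i+1) with probability 1/2."""
    levels = [sorted(keys)]
    while levels[-1]:  # stop at the first empty list
        promoted = [k for k in levels[-1] if rng.random() < 0.5]  # one coin flip per key
        levels.append(promoted)
    return levels

levels = build_levels([5, 9, 25, 30, 35, 38, 40], random.Random(7))
for i, level in enumerate(levels):
    print(f"S{i}:", ["-inf"] + level + ["+inf"])
```

Each list is a sorted subset of the one below it, which is the property the search procedure relies on.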

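Skip List Dictionary as ADT

[Figure: skip list with lists S0 to S3. Every list starts at the head sentinel -∞ and ends at the tail sentinel ∞. S0 holds all keys: -∞ 5 9 25 30 35 38 40 ∞. S1 holds 9 30 38, and the higher lists hold progressively fewer keys.]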

Operations that can be performed on skip list

Each node has two pointers: right and down.

1. Drop down

• This operation is performed when after(p) > key
• In this operation, pointer p moves down to the immediate lower-level list

[Figure: lists S1 and S0 with the right and down pointers of a node labelled; the position of p after the drop down is marked.]

2. Scan forward

• This operation is performed when after(p) < key
• Here the pointer p moves to the next element in the list
• e.g. here key = 28 and p is at 9; after(9) < 28, so scan forward

[Figure: part of S0 (-∞ 9 25 30 ∞) with the successive positions of p marked.]

Searching for a key k

Keep a pointer p to the first node (-∞) of the highest list Sh.

loop
    if (after(p) == k)
        return after(p)
    if (after(p) < k)
        scan forward, i.e. update p ← after(p)
    else, i.e. after(p) > k
        if (Scur == S0), where Scur is the current list
            report "key k not found" and exit
        drop down to the next lower list
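The search procedure can be turned into runnable code along the following lines. This is a sketch under assumptions: the names `Node`, `build` and `search` are hypothetical, the levels are supplied by hand rather than by coin flips, and `math.inf` stands in for the sentinels -∞ and ∞.

```python
import math

class Node:
    """A skip-list node with the two pointers named on the slides."""
    def __init__(self, key):
        self.key = key
        self.right = None
        self.down = None

def build(levels):
    """Link sorted lists S0, S1, ... (each a subset of the one below); return the top head."""
    below = {}  # key -> node in the list one level down
    head = None
    for keys in levels:
        nodes = [Node(k) for k in [-math.inf, *keys, math.inf]]
        for a, b in zip(nodes, nodes[1:]):
            a.right = b
        for node in nodes:
            node.down = below.get(node.key)  # None on S0
        below = {node.key: node for node in nodes}
        head = nodes[0]
    return head

def search(head, k):
    """Return the node holding the finite key k, or None if k is absent."""
    p = head
    while True:
        nxt = p.right.key  # after(p)
        if nxt == k:
            return p.right
        if nxt < k:
            p = p.right  # scan forward
        elif p.down is None:
            return None  # already on S0: key not found
        else:
            p = p.down  # drop down

head = build([[5, 9, 25, 30, 35, 38, 40], [9, 30, 38], [30], []])
print(search(head, 25) is not None)
print(search(head, 28) is not None)
```

With the example lists used in the slides, the search for 25 succeeds and the search for 28 ends on S0 without finding the key.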

Page 19: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Binary search tree can be built randomly

Rank(x)=i

Randomly selected key becomes the root

Pivot element=root

x

gtlt

RANDOMLY BUILT BST

bull Xi the height of the tree rooted at a node with rank=i

bull Yi exponential height of the tree=2^Xi

bull H=maxH1H2 + 1

where H1 ht of left subtree

H2 htof right subtree

H ht of the tree rooted at x

HEIGHT OF THE TREE

bull Y=2^H

=2max2^H12^H2

bull Expected value of exponential ht of the tree with lsquonrsquo nodes

=E(EH(T(X)))

=2n sum maxEH(T(k))EH(T(n-1-k))

=O(n^3)=E(H(T(n)))=O(log n)

Skip list is a data structure that can be used to maintain dictionary

Given n keys we insert these n keys in a linked list that has -infin as first node and infin as last node

Initial list S0

Then we flip coin a coin for each element until only one is left in Si if a tail occurswe insert it into next list Si+1 and so on

-infin infin5 9 25 30 35 38 40

Skip List Dictionary as ADT

-infin

-infin

-infin

S3

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

9 30 38

30

38 30

head Tail

Operations that can be performed on a skip list

Each node has two pointers: right and down.

1. Drop down

• This operation is performed when after(p) > key

• In this operation the pointer p moves down to the immediately lower list

[Figure: p at -∞ in S1, with after(p) = 30; after the drop down, p is at -∞ in S0, which holds 9, 30, 38.]

2. Scan forward

• This operation is performed when after(p) < key

• Here the pointer p moves to the next element in the list

• E.g. here key = 28 and p is at 9; after(9) = 25 < 28, so scan forward

[Figure: S0 with -∞, 9, 25, 30, +∞; p moves forward along the list.]

Searching for a key k

Keep a pointer p to the first node in the highest list Sh.

repeat
    if (after(p) == k)
        return after(p)
    if (after(p) < k)
        scan forward, i.e. update p ← after(p)
    else if (Scur == S0)        // Scur is the current list
        report "key k not found" and exit
    else
        drop down to the next lower list
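The search procedure can be made concrete with a small runnable sketch. Everything below is an assumption for illustration: the `Node` class, the helper `link_levels`, and the contents of the upper lists (S1 = 9, 30, 38; S2 = 30; S3 empty) are hypothetical, chosen to resemble the figures in these slides.

```python
import math


class Node:
    def __init__(self, key):
        self.key = key
        self.right = None   # next node in the same list
        self.down = None    # node with the same key one list below


def link_levels(levels):
    """Link sorted key lists [S0, S1, ...] into a skip list and
    return the head (-inf node) of the highest list."""
    below = {}              # key -> node in the list one level below
    head = None
    for keys in levels:
        nodes = [Node(k) for k in [-math.inf, *keys, math.inf]]
        for a, b in zip(nodes, nodes[1:]):
            a.right = b
        for node in nodes:
            node.down = below.get(node.key)
        below = {node.key: node for node in nodes}
        head = nodes[0]
    return head


def search(head, k):
    """Return a node holding key k, or None if k is not in the skip list."""
    p = head
    while True:
        nxt = p.right               # after(p)
        if nxt.key == k:
            return nxt
        if nxt.key < k:
            p = nxt                 # scan forward
        elif p.down is None:
            return None             # already in S0: key not found
        else:
            p = p.down              # drop down


# Hypothetical levels S0..S3 for the running example
head = link_levels([[5, 9, 25, 30, 35, 38, 40], [9, 30, 38], [30], []])
```

With these levels, searching for 25 drops from S3 to S1, scans forward to 9, drops to S0 and finds 25 as after(9); searching for 28 ends in S0 between 25 and 30 and reports that the key is absent.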

Searching for a key 25

[Figure: the skip list S0 to S3 with the successive positions of p during the search; the search ends in S0 with "Key found".]

Searching for a key 28

[Figure: the skip list S0 to S3 with the successive positions of p during the search; the search ends in S0 with "Key not found".]

Deletion of a key

E.g. delete 30

[Figure: the skip list S0 to S3 with the successive positions of p; the key 30 is located in each list that contains it and removed from that list.]
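Deletion follows the same drop-down and scan-forward moves as searching, unlinking the key from every list in which it appears. The sketch below is one possible way to do this and is not taken from the slides: `Node`, `link_levels`, `delete`, and `level_keys` are hypothetical names, and the upper lists (S1 = 9, 30, 38; S2 = 30; S3 empty) are assumed for illustration.

```python
import math


class Node:
    def __init__(self, key):
        self.key = key
        self.right = None   # next node in the same list
        self.down = None    # node with the same key one list below


def link_levels(levels):
    """Link sorted key lists [S0, S1, ...] and return the head of the top list."""
    below, head = {}, None
    for keys in levels:
        nodes = [Node(k) for k in [-math.inf, *keys, math.inf]]
        for a, b in zip(nodes, nodes[1:]):
            a.right = b
        for node in nodes:
            node.down = below.get(node.key)
        below = {node.key: node for node in nodes}
        head = nodes[0]
    return head


def delete(head, k):
    """Unlink k from every list that contains it; return True if k was present."""
    p, found = head, False
    while p is not None:
        while p.right.key < k:
            p = p.right                 # scan forward
        if p.right.key == k:
            p.right = p.right.right     # unlink k from the current list
            found = True
        p = p.down                      # drop down (None below S0)
    return found


def level_keys(head):
    """Keys of each list from the top list down to S0, sentinels omitted."""
    result = []
    while head is not None:
        keys, node = [], head.right
        while node.right is not None:   # stop at the +inf sentinel
            keys.append(node.key)
            node = node.right
        result.append(keys)
        head = head.down
    return result


head = link_levels([[5, 9, 25, 30, 35, 38, 40], [9, 30, 38], [30], []])
```

After `delete(head, 30)`, the key 30 no longer appears in S0, S1 or S2, while all other keys keep their positions.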

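Analysis

1. An element $x_k$ is in $S_i$ with probability $1/2^i$; this is true for all elements. Let $X_{ki} = 1$ if $x_k$ is in $S_i$, and $0$ otherwise. Then

$E(|S_i|) = E\left(\sum_{k=1}^{n} X_{ki}\right) = \sum_{k=1}^{n} \frac{1}{2^i} = \frac{n}{2^i}$

$E(\text{total size}) = E\left(\sum_{i} |S_i|\right) = \sum_{i \ge 0} \frac{n}{2^i} \le 2n$

2. Expected height of a skip list: $h = \log n$. At the top list about one element is left, so $n/2^h = 1$ and $h \simeq \log n$.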
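Both estimates can be checked with a short simulation. This is a sketch under a simplifying assumption: each key is promoted independently until its first head, which matches the probability $1/2^i$ used in the analysis but ignores the exact stopping rule of the construction. The name `simulate` is hypothetical.

```python
import random


def simulate(n, rng):
    """Return (total size of all lists, height) for one random skip list on n keys."""
    total, height = 0, 0
    for _ in range(n):
        level = 0                       # every key is in S0
        while rng.random() < 0.5:       # tail: promote the key one list higher
            level += 1
        total += level + 1              # the key appears in S0 .. S_level
        height = max(height, level)
    return total, height
```

Averaging `simulate(n, rng)` over many runs should give a total size close to 2n and a height close to log n (base 2), in line with the two results above.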

Analysis (contd.)

3. Drop down: O(log n)

The pointer p can drop at most h times, i.e. the height of the skip list, until S0 is reached, and h = log n.

4. Scan forward: O(log n)

# of elements to scan at each level    Total no. of levels    Total cost
O(1)                                   O(log n)               O(log n)

The number of elements scanned at the ith level is, in expectation, no more than 2. The key lies between p and after(p) on the (i+1)th level (that's why we came down to the ith level), and in expectation there is only about one element of Si between p and after(p) of the (i+1)th level: the element pointed to by after(p) in Si.

Thus we scan at most two elements at Si: the element pointed to by p (when we came down) and after(p) in Si.

Hashing

• Motivation: symbol tables
  – A compiler uses a symbol table to relate symbols to associated data
    • Symbols: variable names, procedure names, etc.
    • Associated data: memory location, call graph, etc.
  – For a symbol table (also called a dictionary), we care about search, insertion, and deletion
  – We typically don't care about sorted order

Hash Tables

• More formally:
  – Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
    • Insert(T, x)
    • Delete(T, x)
    • Search(T, x)
  – We want these to be fast, but don't care about sorting the records

• The structure we will use is a hash table
  – Supports all the above in O(1) expected time

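Hash Functions

• Next problem: collision

[Figure: a hash function h maps the actual keys K = {k1, ..., k5}, a subset of the universe of keys U, to slots 0 to m - 1 of the table T. Keys k1, k3 and k4 land in separate slots, while h(k2) = h(k5), a collision.]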

Resolving Collisions

• How can we solve the problem of collisions?

• One solution is chaining

• Another solution is open addressing

Chaining

• Chaining puts elements that hash to the same slot in a linked list

[Figure: the actual keys K = {k1, ..., k8}, a subset of the universe of keys U, hashed into the table T. Each slot of T points to a linked list of the keys that hash to it, for example k1 and k4 in one chain, k5 and k2 in another, and k8 and k6 in a third; empty slots and chain ends are null.]

• How do we insert an element?

• How do we delete an element?

• How do we search for an element with a given key?
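One way to answer the three questions above is sketched below. This is a minimal illustration, not the lecture's implementation: the class name `ChainedHashTable` is hypothetical, each chain is a Python list rather than a hand-built linked list, and Python's built-in `hash` reduced modulo the table size stands in for the hash function h.

```python
class ChainedHashTable:
    def __init__(self, m=8):
        # slot j holds the chain of (key, value) pairs whose keys hash to j
        self.slots = [[] for _ in range(m)]

    def _chain(self, key):
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        """Add the pair to its chain, replacing the value if the key exists."""
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        """Walk the chain for the key's slot; return its value or None."""
        for k, v in self._chain(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        """Remove the key from its chain; return True if it was present."""
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False
```

All three operations first compute the slot and then touch only that slot's chain, so their cost is proportional to the length of one chain rather than to the number of keys in the table.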

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot

• Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot

• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)

• What will be the average cost of a successful search? A: O((1 + α)/2) = O(1 + α)
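These averages can be illustrated with a small simulation. The sketch below assumes simple uniform hashing by drawing a uniformly random slot for every key, and it assumes that new keys are appended to the end of their chain; `chain_statistics` is a hypothetical name. Under these assumptions an unsuccessful search scans a whole chain (α keys on average), while a successful search scans about 1 + α/2 keys.

```python
import random


def chain_statistics(n, m, rng):
    """Hash n keys into m slots uniformly at random.

    Returns (alpha, average number of keys examined by a successful search).
    """
    chain_length = [0] * m
    examined = 0
    for _ in range(n):
        slot = rng.randrange(m)          # simple uniform hashing (assumed)
        chain_length[slot] += 1
        examined += chain_length[slot]   # position of the new key in its chain
    alpha = sum(chain_length) / m        # average keys per slot, equals n / m
    return alpha, examined / n
```

With n = 2m the load factor is exactly 2, and the measured successful-search cost stays close to 2 however large the table is, which is the sense in which a constant α gives constant expected search cost.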

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)

• If the number of keys n is proportional to the number of slots in the table, what is α?

• A: α = O(1)
  – In other words, we can make the expected cost of searching constant if we make α constant

A Final Word About Randomized Algorithms

If we could prove this:

P(failure) < 1/k (we are sort of happy)

P(failure) < 1/n^k (most of the time this is true and we're happy)

P(failure) < 1/2^n (this is difficult, but still we want this)

Acknowledgements

• Kunal Verma

• Nidhi Aggarwal

• And other students of MSc (CS) batch 2009

END




  • Slide 1
  • Expected Running Time of Insertion Sort
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Quick-Sort
  • Quicksort Expected number of comparisons
  • Randomized Quick-Sort
  • Remarks
  • Randomized Select
  • Randomized Algorithms
  • Assumptions
  • Monte Carlo Algorithms
  • Las Vegas Algorithms
  • Why expected running times
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Skip List Dictionary as ADT
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Hashing
  • Hash Tables
  • Hash Functions
  • Resolving Collisions
  • Chaining
  • Slide 38
  • Slide 39
  • Slide 40
  • Analysis of Chaining
  • Slide 42
  • Slide 43
  • Slide 44
  • Analysis of Chaining Continued
  • A Final Word About Randomized Algorithms
  • Acknowledgements
  • Slide 48
Page 23: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

-infin

-infin

-infin

S3

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

9 30 38

30

38 30

head Tail

Operations that can be performed on skip list

Each node has two pointers right and down

1 Drop down

bull This operation is performed when after(p)gtkey

bull In this operation pointer p moves down to immediate lower level list

(after drop down)

right

down

-infin

-infin infin

infin 30

309 38

p

S1

S0

2Scan forward

bull This operation is performed when after(p)ltkey

bull Here the pointer p moves to the next element in the list

bull eg here key=28 amp p is at 9 after(9)lt28 so scan forward

-infin 9 infin 25 30

p p pS0

Searching a key kKeep a ptr p to the first node in the highest list Sh

while (after(p)gtk)

if (Scur==S0) Scur is the current skip list

then ldquokey k not foundrdquo

exit

if (after(p)gtk)

drop down to next skip list

If (after(p)ltk)

scan forward ie update pafter(p)

if (after(p)==k)

return after(p)

Searching for a key 25

[Figure: the example skip list (levels S3 to S0) with the successive positions of p during a search for 25. Key found.]

Searching for a key 28

[Figure: the example skip list (levels S3 to S0) with the successive positions of p during a search for 28. Key not found.]

Deletion of a key

[Figure: e.g. delete 30 from the example skip list (levels S3 to S0); the successive positions of p are marked.]
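Deletion can reuse the same two moves as search. The sketch below is an assumption about how the slide's "delete 30" example would be carried out, not the lecture's own code: at each level it scans forward to the predecessor of the key, unlinks the key if it is present on that level, and then drops down. The names `Node`, `build_skip_list`, `delete` and `levels_as_lists` are hypothetical, and empty upper levels are left in place for simplicity.

```python
import math


class Node:
    def __init__(self, key):
        self.key = key
        self.right = None
        self.down = None


def build_skip_list(levels):
    """Build a skip list from sorted key lists, lowest level (S0) first."""
    below, head = {}, None
    for keys in levels:
        nodes = [Node(k) for k in [-math.inf, *keys, math.inf]]
        for left, right in zip(nodes, nodes[1:]):
            left.right = right
        for node in nodes:
            node.down = below.get(node.key)
        below = {node.key: node for node in nodes}
        head = nodes[0]
    return head


def delete(head, k):
    """Unlink k from every level that contains it. Returns True if k was present."""
    removed = False
    p = head
    while p is not None:
        while p.right.key < k:        # scan forward to the predecessor of k
            p = p.right
        if p.right.key == k:          # k is on this level: bypass its node
            p.right = p.right.right
            removed = True
        p = p.down                    # drop down (None once S0 is done)
    return removed


def levels_as_lists(head):
    """Return the keys of each level, highest level first, without sentinels."""
    out = []
    row = head
    while row is not None:
        keys, node = [], row.right
        while node.right is not None:  # stop at the +inf sentinel
            keys.append(node.key)
            node = node.right
        out.append(keys)
        row = row.down
    return out


demo = build_skip_list([[5, 9, 25, 30, 35, 38, 40], [9, 30, 38], [30], []])
delete(demo, 30)
print(levels_as_lists(demo))  # [[], [], [9, 38], [5, 9, 25, 35, 38, 40]]
```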

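Analysis

1. An element $x_k$ is in $S_i$ with probability $1/2^i$; this is true for all elements.

$E(|S_i|) = E\left(\sum_{k=1}^{n} X_{ki}\right) = \sum_{k=1}^{n} \frac{1}{2^i} = \frac{n}{2^i}$, where $X_{ki} = 1$ if $x_k$ is in $S_i$, and $0$ otherwise.

$E(\text{total size}) = E\left(\sum_{i} |S_i|\right) = \sum_{i=0}^{\infty} \frac{n}{2^i} \le 2n$

2. Expected height of a skip list: $h = \log n$

The top level is expected to hold about one element, so $n/2^h = 1$, which gives $h \simeq \log n$ (logarithm to base 2).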
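The two estimates above (total size at most about 2n, height about log n) can be checked empirically. The sketch below assumes the usual construction in which each key is promoted one level per fair coin flip that comes up heads, which matches the probability $1/2^i$ stated above; the function names and the chosen n are illustrative only.

```python
import math
import random


def random_level(rng):
    """Flip a fair coin until tails; the number of heads is the key's top level.

    A key therefore reaches level i with probability 1/2**i.
    """
    level = 0
    while rng.random() < 0.5:
        level += 1
    return level


def simulate(n, rng):
    """Return (total number of nodes over all levels, height) for n keys."""
    tops = [random_level(rng) for _ in range(n)]
    total_size = sum(t + 1 for t in tops)  # a key with top level t sits in S0..St
    height = max(tops)
    return total_size, height


rng = random.Random(2009)  # fixed seed so the run is repeatable
n = 10_000
total_size, height = simulate(n, rng)
print(f"n = {n}, total size / n = {total_size / n:.3f} (analysis: about 2)")
print(f"height = {height}, log2(n) = {math.log2(n):.1f}")
```

The measured height is typically a little above log2(n), since the height is the maximum of n random levels rather than the level at which exactly one element is expected.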

Analysis (contd.)

3. Drop down: O(log n)

The pointer p can drop at most h times (h being the height of the skip list) before S0 is reached, and h = log n.

4. Scan forward: O(log n)

No. of elements to scan at each level: O(1)
Total no. of levels: O(log n)
Total cost: O(log n)

The number of elements scanned at the ith level is, in expectation, no more than 2, because:

The key lies between p and after(p) on the (i+1)th level (that's why we came down to the ith level). And in expectation there is only one element of Si between p and after(p) of the (i+1)th level: the element pointed to by after(p) in Si.

Thus we expect to scan at most two elements at Si: the element pointed to by p (when we came down) and after(p) in Si.
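Items 3 and 4 combine into a bound on the whole search. The following worked bound is a sketch: it assumes the levels are $S_0, \dots, S_h$ with $h \approx \log_2 n$, and the value $n = 1024$ is only an illustrative choice.

```latex
\begin{align*}
E[\text{search cost}]
  &\le \underbrace{h}_{\text{drop downs}}
     + \underbrace{2\,(h+1)}_{\text{scans over } S_0,\dots,S_h}
   = 3h + 2 = O(\log n), \\
n = 1024 &\;\Rightarrow\; h \approx \log_2 1024 = 10,
   \quad 3h + 2 = 32 \text{ steps.}
\end{align*}
```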

Hashing

• Motivation: symbol tables
  – A compiler uses a symbol table to relate symbols to associated data
      • Symbols: variable names, procedure names, etc.
      • Associated data: memory location, call graph, etc.
  – For a symbol table (also called a dictionary), we care about search, insertion, and deletion
  – We typically don't care about sorted order

Hash Tables

• More formally:
  – Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
      • Insert(T, x)
      • Delete(T, x)
      • Search(T, x)
  – We want these to be fast, but don't care about sorting the records

• The structure we will use is a hash table
  – Supports all the above in O(1) expected time

Hash Functions

[Figure: a hash function h maps the actual keys K, a subset of the universe of keys U, to slots 0 to m − 1 of the table T. Keys k1, k3 and k4 land in distinct slots, while h(k2) = h(k5).]

• Next problem: collision

Resolving Collisions

• How can we solve the problem of collisions?

• One solution is chaining

• Another solution: open addressing

Chaining

• Chaining puts elements that hash to the same slot in a linked list

[Figure: the table T, with each non-empty slot pointing to a linked list of the keys from K (a subset of the universe U) that hash to it; for example, k1 and k4 share one chain and k8 and k6 share another.]

• How do we insert an element?

• How do we delete an element?

• How do we search for an element with a given key?
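One possible answer to the three questions about chaining is sketched below. It is a minimal illustration, not the lecture's implementation: the class name `ChainedHashTable` is hypothetical, Python lists stand in for the linked lists, and the hash function is simply Python's built-in `hash` reduced modulo the number of slots m.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a chain of (key, value) pairs."""

    def __init__(self, m=8):
        self.m = m
        # One chain per slot. Python lists stand in for linked lists here.
        self.slots = [[] for _ in range(m)]

    def _h(self, key):
        """Hash function: map a key to a slot index in 0..m-1."""
        return hash(key) % self.m

    def insert(self, key, value):
        """Add the record to the chain of its slot, replacing an equal key."""
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        """Walk the chain of the key's slot; return the value or None."""
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        """Remove the key from its chain. Returns True if it was present."""
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False


table = ChainedHashTable(m=8)
table.insert(3, "x")    # small non-negative ints hash to themselves,
table.insert(11, "y")   # so 3 and 11 collide in slot 3 when m = 8
print(table.search(11), table.slots[3])
```

Every operation only touches the one chain selected by the hash function, which is why the cost analysis that follows depends on the chain length.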

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot

• Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot

• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)

• What will be the average cost of a successful search? A: O((1 + α)/2) = O(1 + α)

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)

• If the number of keys n is proportional to the number of slots in the table, what is α?

• A: α = O(1)
  – In other words, we can make the expected cost of searching constant if we make α constant
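The role of the load factor can be observed with a small experiment. The sketch below simulates simple uniform hashing by placing each key in a uniformly random slot, then averages the chain length met by searches for absent keys; under that assumption the average should be close to α = n/m. The function name and the chosen values of n and m are illustrative only.

```python
import random


def average_unsuccessful_chain(n, m, trials, rng):
    """Average chain length examined by an unsuccessful search.

    n keys are placed in m slots uniformly at random (simulated simple
    uniform hashing); each absent key is likewise sent to a uniform slot
    and its whole chain has to be scanned.
    """
    chain_len = [0] * m
    for _ in range(n):
        chain_len[rng.randrange(m)] += 1
    return sum(chain_len[rng.randrange(m)] for _ in range(trials)) / trials


rng = random.Random(48)  # fixed seed so the run is repeatable
m = 10_000
for n in (5_000, 10_000, 20_000, 40_000):
    avg = average_unsuccessful_chain(n, m, trials=20_000, rng=rng)
    print(f"alpha = {n / m:.1f}  average chain scanned = {avg:.2f}")
```

If n grows while m stays fixed, the measured average grows with α; keeping n proportional to m keeps it constant, which is the point of the last bullet above.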

A Final Word About Randomized Algorithms

If we could prove this:

P(failure) < 1/k (we are sort of happy)

P(failure) < 1/n^k (most of the time this is true, and we're happy)

P(failure) < 1/2^n (this is difficult, but still we want this)

Acknowledgements

• Kunal Verma

• Nidhi Aggarwal

• And other students of MSc (CS), batch 2009

END


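Searching for a key: 28

[Figure: a skip list with levels S3 (top) to S0 (bottom), each bounded by −∞ and +∞. S0 holds 5, 9, 25, 30, 35, 38, 40; the upper levels hold 9, 30, 38 and 30. The pointer p drops down and scans forward from S3 to S0; the search for 28 ends with "Key not found".]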

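Deletion of a key (e.g. delete 30)

[Figure: the same skip list with levels S3 to S0, each bounded by −∞ and +∞. S0 holds 5, 9, 25, 30, 35, 38, 40; the upper levels hold 9, 30, 38 and 30. The pointer p marks the positions visited on each level while locating 30.]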
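The search and deletion walks pictured above can be sketched in code. This is a minimal illustration, not the slides' own implementation: the class and method names, the `max_height` cap, and the `insert` routine (needed only to build a list to search) are assumptions made for the example.

```python
import random

NEG_INF, POS_INF = float("-inf"), float("inf")

class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height      # next[i] plays the role of after(p) on level S_i

class SkipList:
    def __init__(self, max_height=16, seed=None):
        self.max_height = max_height     # assumed cap on the number of levels
        self.rng = random.Random(seed)
        self.head = Node(NEG_INF, max_height)   # the -infinity sentinel
        self.tail = Node(POS_INF, max_height)   # the +infinity sentinel
        for i in range(max_height):
            self.head.next[i] = self.tail

    def _predecessors(self, key):
        """Start at the top level; scan forward, then drop down, until S_0."""
        preds = [None] * self.max_height
        p = self.head
        for i in reversed(range(self.max_height)):
            while p.next[i].key < key:   # scan forward on level S_i
                p = p.next[i]
            preds[i] = p                 # drop down to level S_(i-1)
        return preds

    def search(self, key):
        return self._predecessors(key)[0].next[0].key == key

    def insert(self, key):
        preds = self._predecessors(key)
        if preds[0].next[0].key == key:
            return                       # key already present
        height = 1
        while height < self.max_height and self.rng.random() < 0.5:
            height += 1                  # promote with probability 1/2
        node = Node(key, height)
        for i in range(height):
            node.next[i] = preds[i].next[i]
            preds[i].next[i] = node

    def delete(self, key):
        preds = self._predecessors(key)
        node = preds[0].next[0]
        if node.key != key:
            return False                 # key not found
        for i in range(len(node.next)):  # unlink from every level holding the key
            preds[i].next[i] = node.next[i]
        return True
```

With the keys from the figure (5, 9, 25, 30, 35, 38, 40) inserted, `search(28)` reports that the key is absent, and `delete(30)` removes 30 from every level on which it appears.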

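Analysis

1. An element $x_k$ is in $S_i$ with probability $1/2^i$; this is true for all elements.

   $E(|S_i|) = E\left(\sum_{k=1}^{n} X_{ki}\right) = \sum_{k=1}^{n} \frac{1}{2^i} = \frac{n}{2^i}$, where $X_{ki} = 1$ if $x_k$ is in $S_i$ and $0$ otherwise.

   $E(\text{total size}) = E\left(\sum_{i \ge 0} |S_i|\right) = \sum_{i \ge 0} \frac{n}{2^i} \le 2n$

2. Expected height of a skip list: $h \simeq \log n$, since setting $n/2^h = 1$ gives $h \simeq \log n$.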
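The two estimates above (total size at most about 2n, height about log n) can be checked empirically. The following is a small simulation sketch, assuming each element is promoted to the next level by an independent fair coin flip; the function name `simulate_levels` and the choice n = 1024 are illustrative.

```python
import math
import random

def simulate_levels(n, seed=0):
    """Return [|S_0|, |S_1|, ...] when each element is promoted with probability 1/2."""
    rng = random.Random(seed)
    sizes = []
    alive = n                      # |S_0| = n: every element is on the bottom level
    while alive > 0:
        sizes.append(alive)
        # Each element of the current level survives to the next with probability 1/2.
        alive = sum(1 for _ in range(alive) if rng.random() < 0.5)
    return sizes

n = 1024
sizes = simulate_levels(n)
total = sum(sizes)
print("levels:", len(sizes), "log2(n):", math.log2(n))
print("total size:", total, "bound 2n:", 2 * n)
```

A single run only illustrates the expectation; the total size fluctuates around 2n, and the number of levels is typically a little above log2(n).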

Analysis (contd.)

3. Drop down: O(log n), since the pointer p can drop at most h times (h being the height of the skip list) before S0 is reached, and h = log n.

4. Scan forward: O(log n)

   (number of elements to scan at each level) × (total number of levels) = total cost: O(1) × O(log n) = O(log n)

The number of elements scanned at the ith level is, on average, no more than 2, because the key lies between p and after(p) on the (i+1)th level (that's why we came down to the ith level), and in expectation there is only one element of Si between p and after(p) of the (i+1)th level: the element pointed to by after(p) in Si.

Thus we scan about two elements at Si: the element pointed to by p (when we came down) and after(p) in Si.
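One way to make the per-level scan bound precise is a backward argument. The sketch below assumes each element of $S_i$ is promoted to $S_{i+1}$ independently with probability 1/2; the symbol $C_i$ is introduced here for illustration and does not appear in the slides.

```latex
% C_i = number of forward (scan) steps taken on level S_i during one search.
% Walking the search path backwards, each visited element of S_i was promoted
% to S_{i+1} with probability 1/2, which ends the scan on that level.
\Pr[C_i \ge j] \le 2^{-j}
\quad\Longrightarrow\quad
E[C_i] = \sum_{j \ge 1} \Pr[C_i \ge j] \le \sum_{j \ge 1} 2^{-j} = 1 .

% Treating the height as h \simeq \log n: at most h drop-down steps,
% plus the forward steps on each of the h levels.
E[\text{search cost}] \le h + \sum_{i=0}^{h-1} E[C_i] \le 2h = O(\log n).
```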

Hashing

• Motivation: symbol tables
  – A compiler uses a symbol table to relate symbols to associated data
    • Symbols: variable names, procedure names, etc.
    • Associated data: memory location, call graph, etc.
  – For a symbol table (also called a dictionary), we care about search, insertion, and deletion
  – We typically don't care about sorted order

Hash Tables

• More formally:
  – Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
    • Insert(T, x)
    • Delete(T, x)
    • Search(T, x)
  – We want these to be fast, but don't care about sorting the records

• The structure we will use is a hash table
  – Supports all the above in O(1) expected time
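As a quick illustration of the three dictionary operations, here is a sketch that uses Python's built-in `dict` (itself a hash table) as a symbol table. The symbol names and addresses are made up for the example, and the wrapper functions are named after the operations in the slide.

```python
# A symbol table sketched with Python's built-in dict.
symbol_table = {}

def insert(table, key, data):
    table[key] = data            # Insert(T, x): expected O(1)

def search(table, key):
    return table.get(key)        # Search(T, x): returns None when the key is absent

def delete(table, key):
    table.pop(key, None)         # Delete(T, x): no error if the key is absent

# Hypothetical symbols with hypothetical satellite data.
insert(symbol_table, "count", {"kind": "variable", "address": 0x1000})
insert(symbol_table, "main", {"kind": "procedure", "address": 0x2000})
delete(symbol_table, "count")
```

Note that nothing here depends on the keys being kept in sorted order, which matches the motivation above.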

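Hash Functions

• Next problem: collision

[Figure: a hash function h maps the actual keys K (k1 to k5), a subset of the universe of keys U, into a table T with slots 0 to m − 1. The keys land in slots h(k1), h(k4), h(k3), and h(k2) = h(k5), so k2 and k5 collide.]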
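A collision like h(k2) = h(k5) is easy to reproduce. The sketch below assumes a division-method hash, h(k) = k mod m, which is a common textbook choice rather than something specified in the slides; the keys and table size are illustrative.

```python
def h(key, m):
    """Division-method hash (an assumed example): map an integer key to a slot in 0..m-1."""
    return key % m

m = 7
keys = [10, 22, 31, 4, 15]

# Group the keys by the slot they hash to.
slots = {}
for k in keys:
    slots.setdefault(h(k, m), []).append(k)

# Any slot that received more than one key is a collision.
collisions = {slot: ks for slot, ks in slots.items() if len(ks) > 1}
print(collisions)
```

Here 10 and 31 both land in slot 3, and 22 and 15 both land in slot 1, so some strategy is needed to store more than one key per slot.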

Resolving Collisions

• How can we solve the problem of collisions?

• One solution is chaining

• Another solution: open addressing

Chaining

• Chaining puts elements that hash to the same slot in a linked list

[Figure: table T whose slots point to linked lists of keys drawn from K (actual keys), a subset of U (universe of keys). The chains include k1 → k4, k5 → k2, and k8 → k6, with k3 and k7 also stored; the remaining slots are empty.]

• How do we insert an element?

• How do we delete an element?

• How do we search for an element with a given key?
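The three questions about chaining can be answered with a short sketch. It is one possible implementation under stated assumptions: Python lists stand in for the linked lists, Python's built-in `hash` stands in for the hash function h, and the class and method names are illustrative.

```python
class ChainedHashTable:
    def __init__(self, m=8):
        self.m = m                                # number of slots
        self.slots = [[] for _ in range(m)]       # one chain per slot
        self.n = 0                                # number of stored keys

    def _chain(self, key):
        # Assumed hash function: Python's hash() reduced modulo m.
        return self.slots[hash(key) % self.m]

    def search(self, key):
        """Walk the chain for the key's slot; cost is proportional to the chain length."""
        for k, value in self._chain(key):
            if k == key:
                return value
        return None

    def insert(self, key, value):
        """Replace the value if the key exists, otherwise add it to the chain."""
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.n += 1

    def delete(self, key):
        """Find the key in its chain and unlink it; return whether it was present."""
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return True
        return False

    def load_factor(self):
        return self.n / self.m                    # average number of keys per slot
```

With `m=4`, the integer keys 1, 5, and 9 all hash to slot 1 and end up in the same chain, which is the situation the figure depicts.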

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot

• Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot

• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)

• What will be the average cost of a successful search? A: O((1 + α)/2) = O(1 + α)

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)

• If the number of keys n is proportional to the number of slots in the table, what is α?

• A: α = O(1)
  – In other words, we can make the expected cost of searching constant if we make α constant
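The claim that an unsuccessful search examines about n/m keys on average (the load factor, written alpha below) can be checked with a small simulation. This sketch models simple uniform hashing by choosing slots uniformly at random; the function name and parameters are illustrative.

```python
import random

def average_keys_examined(n, m, trials=10_000, seed=0):
    """Estimate the keys examined by an unsuccessful search under simple uniform hashing."""
    rng = random.Random(seed)
    chain_len = [0] * m
    for _ in range(n):
        chain_len[rng.randrange(m)] += 1      # each key is equally likely to land in any slot
    # An unsuccessful search hashes to a uniformly random slot and walks its whole chain.
    examined = sum(chain_len[rng.randrange(m)] for _ in range(trials))
    return examined / trials

alpha = 4000 / 1000
estimate = average_keys_examined(n=4000, m=1000)
print("load factor:", alpha, "estimated keys examined:", estimate)
```

Keeping n proportional to m keeps alpha, and therefore the estimate, roughly constant as the table grows.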

A Final Word About Randomized Algorithms

If we could prove this:

$P(\text{failure}) < 1/k$ (we are sort of happy)

$P(\text{failure}) < 1/n^k$ (most of the time this is true and we're happy)

$P(\text{failure}) < 1/2^n$ (this is difficult, but still we want this)

Acknowledgements

• Kunal Verma

• Nidhi Aggarwal

• And other students of MSc (CS) batch 2009

END

  • Slide 1
  • Expected Running Time of Insertion Sort
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Quick-Sort
  • Quicksort Expected number of comparisons
  • Randomized Quick-Sort
  • Remarks
  • Randomized Select
  • Randomized Algorithms
  • Assumptions
  • Monte Carlo Algorithms
  • Las Vegas Algorithms
  • Why expected running times
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Skip List Dictionary as ADT
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Hashing
  • Hash Tables
  • Hash Functions
  • Resolving Collisions
  • Chaining
  • Slide 38
  • Slide 39
  • Slide 40
  • Analysis of Chaining
  • Slide 42
  • Slide 43
  • Slide 44
  • Analysis of Chaining Continued
  • A Final Word About Randomized Algorithms
  • Acknowledgements
  • Slide 48
Page 29: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

-infin

-infin

-infin

Deletion of a key

S3 eg delete 30

S2

S1

S0

-infin 5 infin 4038 353025 9

infin

infin

infin

30

9 30 38

p

p

p

p

p

Analysis

1 An element xk is in Si with probability 12i true forall elements

E(Si ) = sum 12i Xki where Xki = 1 if xk is in Si

0 otherwise

= n2i

E(total size) = E(sum ISi I)

= sum n2i le 2n

2 Expected height of a skip listh = log n

n2h =1

h ≃ log n

n

k=1

k=1

infin

Analysis(contd)

3 Drop down O(log n)

Since pointer p can drop atmost h times

ieheight of the skip list until S0 is reached

and h = logn

4 Scan forward O(log n)

of elements Total no of levels Total Cost

to scan at each level

O(1) O(log n ) O(log n )

The number of elements scanned at ith level is no more than 2 because

The key lies between p and after(p) on the (i+1)th level (thatrsquos why we came down to ith level) And there is only one element between p and after(p) of

(i+1)th level in Si the element pointed to by after(p) in Si

Thus we scan at most two elements at Si the element pointed to by p (when we came down) and after(p) in Si

Hashing

bull Motivation symbol tablesndash A compiler uses a symbol table to relate

symbols to associated databull Symbols variable names procedure names etcbull Associated data memory location call graph etc

ndash For a symbol table (also called a dictionary) we care about search insertion and deletion

ndash We typically donrsquot care about sorted order

Hash Tables

bull More formallyndash Given a table T and a record x with key (= symbol)

and satellite data we need to supportbull Insert (T x)bull Delete (T x)bull Search(T x)

ndash We want these to be fast but donrsquot care about sorting the records

bull The structure we will use is a hash tablendash Supports all the above in O(1) expected time

Hash Functions

bull Next problem collision T

0

m - 1

h(k1)

h(k4)

h(k2) = h(k5)

h(k3)

k4

k2 k3

k1

k5

U(universe of keys)

K(actualkeys)

Resolving Collisions

bull How can we solve the problem of collisions

bull One of the solution is chaining

bull Other solutions open addressing

Chaining

bull Chaining puts elements that hash to the same slot in a linked list

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

bull How do we insert an element

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

bull How do we delete an element

Chaining

bull How do we search for a element with a given key

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search A O((1 + )2) = O(1 + )

Analysis of Chaining Continued

bull So the cost of searching = O(1 + )

bull If the number of keys n is proportional to the number of slots in the table what is

bull A = O(1)ndash In other words we can make the expected

cost of searching constant if we make constant

If we could prove this

P(failure)lt1k (we are sort of happy)

P(failure)lt1nk (most of times this is true and wersquore

happy )

P(failure)lt12n (this is difficult but still we want this)

A Final Word About Randomized Algorithms

Acknowledgements

bull Kunal Verma

bull Nidhi Aggarwal

bull And other students of MSc(CS) batch 2009

END

  • Slide 1
  • Expected Running Time of Insertion Sort
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Quick-Sort
  • Quicksort Expected number of comparisons
  • Randomized Quick-Sort
  • Remarks
  • Randomized Select
  • Randomized Algorithms
  • Assumptions
  • Monte Carlo Algorithms
  • Las Vegas Algorithms
  • Why expected running times
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Skip List Dictionary as ADT
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Hashing
  • Hash Tables
  • Hash Functions
  • Resolving Collisions
  • Chaining
  • Slide 38
  • Slide 39
  • Slide 40
  • Analysis of Chaining
  • Slide 42
  • Slide 43
  • Slide 44
  • Analysis of Chaining Continued
  • A Final Word About Randomized Algorithms
  • Acknowledgements
  • Slide 48
Page 30: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Analysis

1 An element xk is in Si with probability 12i true forall elements

E(Si ) = sum 12i Xki where Xki = 1 if xk is in Si

0 otherwise

= n2i

E(total size) = E(sum ISi I)

= sum n2i le 2n

2 Expected height of a skip listh = log n

n2h =1

h ≃ log n

n

k=1

k=1

infin

Analysis(contd)

3 Drop down O(log n)

Since pointer p can drop atmost h times

ieheight of the skip list until S0 is reached

and h = logn

4 Scan forward O(log n)

of elements Total no of levels Total Cost

to scan at each level

O(1) O(log n ) O(log n )

The number of elements scanned at ith level is no more than 2 because

The key lies between p and after(p) on the (i+1)th level (thatrsquos why we came down to ith level) And there is only one element between p and after(p) of

(i+1)th level in Si the element pointed to by after(p) in Si

Thus we scan at most two elements at Si the element pointed to by p (when we came down) and after(p) in Si

Hashing

bull Motivation symbol tablesndash A compiler uses a symbol table to relate

symbols to associated databull Symbols variable names procedure names etcbull Associated data memory location call graph etc

ndash For a symbol table (also called a dictionary) we care about search insertion and deletion

ndash We typically donrsquot care about sorted order

Hash Tables

bull More formallyndash Given a table T and a record x with key (= symbol)

and satellite data we need to supportbull Insert (T x)bull Delete (T x)bull Search(T x)

ndash We want these to be fast but donrsquot care about sorting the records

bull The structure we will use is a hash tablendash Supports all the above in O(1) expected time

Hash Functions

bull Next problem collision T

0

m - 1

h(k1)

h(k4)

h(k2) = h(k5)

h(k3)

k4

k2 k3

k1

k5

U(universe of keys)

K(actualkeys)

Resolving Collisions

bull How can we solve the problem of collisions

bull One of the solution is chaining

bull Other solutions open addressing

Chaining

bull Chaining puts elements that hash to the same slot in a linked list

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

bull How do we insert an element

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

bull How do we delete an element

Chaining

bull How do we search for a element with a given key

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search A O((1 + )2) = O(1 + )

Analysis of Chaining Continued

bull So the cost of searching = O(1 + )

bull If the number of keys n is proportional to the number of slots in the table what is

bull A = O(1)ndash In other words we can make the expected

cost of searching constant if we make constant

If we could prove this

P(failure)lt1k (we are sort of happy)

P(failure)lt1nk (most of times this is true and wersquore

happy )

P(failure)lt12n (this is difficult but still we want this)

A Final Word About Randomized Algorithms

Acknowledgements

bull Kunal Verma

bull Nidhi Aggarwal

bull And other students of MSc(CS) batch 2009

END

  • Slide 1
  • Expected Running Time of Insertion Sort
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Quick-Sort
  • Quicksort Expected number of comparisons
  • Randomized Quick-Sort
  • Remarks
  • Randomized Select
  • Randomized Algorithms
  • Assumptions
  • Monte Carlo Algorithms
  • Las Vegas Algorithms
  • Why expected running times
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Skip List Dictionary as ADT
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Hashing
  • Hash Tables
  • Hash Functions
  • Resolving Collisions
  • Chaining
  • Slide 38
  • Slide 39
  • Slide 40
  • Analysis of Chaining
  • Slide 42
  • Slide 43
  • Slide 44
  • Analysis of Chaining Continued
  • A Final Word About Randomized Algorithms
  • Acknowledgements
  • Slide 48
Page 31: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Analysis(contd)

3 Drop down O(log n)

Since pointer p can drop atmost h times

ieheight of the skip list until S0 is reached

and h = logn

4 Scan forward O(log n)

of elements Total no of levels Total Cost

to scan at each level

O(1) O(log n ) O(log n )

The number of elements scanned at ith level is no more than 2 because

The key lies between p and after(p) on the (i+1)th level (thatrsquos why we came down to ith level) And there is only one element between p and after(p) of

(i+1)th level in Si the element pointed to by after(p) in Si

Thus we scan at most two elements at Si the element pointed to by p (when we came down) and after(p) in Si

Hashing

bull Motivation symbol tablesndash A compiler uses a symbol table to relate

symbols to associated databull Symbols variable names procedure names etcbull Associated data memory location call graph etc

ndash For a symbol table (also called a dictionary) we care about search insertion and deletion

ndash We typically donrsquot care about sorted order

Hash Tables

bull More formallyndash Given a table T and a record x with key (= symbol)

and satellite data we need to supportbull Insert (T x)bull Delete (T x)bull Search(T x)

ndash We want these to be fast but donrsquot care about sorting the records

bull The structure we will use is a hash tablendash Supports all the above in O(1) expected time

Hash Functions

bull Next problem collision T

0

m - 1

h(k1)

h(k4)

h(k2) = h(k5)

h(k3)

k4

k2 k3

k1

k5

U(universe of keys)

K(actualkeys)

Resolving Collisions

bull How can we solve the problem of collisions

bull One of the solution is chaining

bull Other solutions open addressing

Chaining

bull Chaining puts elements that hash to the same slot in a linked list

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

bull How do we insert an element

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

bull How do we delete an element

Chaining

bull How do we search for a element with a given key

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search A O((1 + )2) = O(1 + )

Analysis of Chaining Continued

bull So the cost of searching = O(1 + )

bull If the number of keys n is proportional to the number of slots in the table what is

bull A = O(1)ndash In other words we can make the expected

cost of searching constant if we make constant

If we could prove this

P(failure)lt1k (we are sort of happy)

P(failure)lt1nk (most of times this is true and wersquore

happy )

P(failure)lt12n (this is difficult but still we want this)

A Final Word About Randomized Algorithms

Acknowledgements

bull Kunal Verma

bull Nidhi Aggarwal

bull And other students of MSc(CS) batch 2009

END

  • Slide 1
  • Expected Running Time of Insertion Sort
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Quick-Sort
  • Quicksort Expected number of comparisons
  • Randomized Quick-Sort
  • Remarks
  • Randomized Select
  • Randomized Algorithms
  • Assumptions
  • Monte Carlo Algorithms
  • Las Vegas Algorithms
  • Why expected running times
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Skip List Dictionary as ADT
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Hashing
  • Hash Tables
  • Hash Functions
  • Resolving Collisions
  • Chaining
  • Slide 38
  • Slide 39
  • Slide 40
  • Analysis of Chaining
  • Slide 42
  • Slide 43
  • Slide 44
  • Analysis of Chaining Continued
  • A Final Word About Randomized Algorithms
  • Acknowledgements
  • Slide 48
Page 32: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

The number of elements scanned at ith level is no more than 2 because

The key lies between p and after(p) on the (i+1)th level (thatrsquos why we came down to ith level) And there is only one element between p and after(p) of

(i+1)th level in Si the element pointed to by after(p) in Si

Thus we scan at most two elements at Si the element pointed to by p (when we came down) and after(p) in Si

Hashing

bull Motivation symbol tablesndash A compiler uses a symbol table to relate

symbols to associated databull Symbols variable names procedure names etcbull Associated data memory location call graph etc

ndash For a symbol table (also called a dictionary) we care about search insertion and deletion

ndash We typically donrsquot care about sorted order

Hash Tables

bull More formallyndash Given a table T and a record x with key (= symbol)

and satellite data we need to supportbull Insert (T x)bull Delete (T x)bull Search(T x)

ndash We want these to be fast but donrsquot care about sorting the records

bull The structure we will use is a hash tablendash Supports all the above in O(1) expected time

Hash Functions

bull Next problem collision T

0

m - 1

h(k1)

h(k4)

h(k2) = h(k5)

h(k3)

k4

k2 k3

k1

k5

U(universe of keys)

K(actualkeys)

Resolving Collisions

bull How can we solve the problem of collisions

bull One of the solution is chaining

bull Other solutions open addressing

Chaining

bull Chaining puts elements that hash to the same slot in a linked list

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

bull How do we insert an element

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

bull How do we delete an element

Chaining

bull How do we search for a element with a given key

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search A O((1 + )2) = O(1 + )

Analysis of Chaining Continued

bull So the cost of searching = O(1 + )

bull If the number of keys n is proportional to the number of slots in the table what is

bull A = O(1)ndash In other words we can make the expected

cost of searching constant if we make constant

If we could prove this

P(failure)lt1k (we are sort of happy)

P(failure)lt1nk (most of times this is true and wersquore

happy )

P(failure)lt12n (this is difficult but still we want this)

A Final Word About Randomized Algorithms

Acknowledgements

bull Kunal Verma

bull Nidhi Aggarwal

bull And other students of MSc(CS) batch 2009

END

  • Slide 1
  • Expected Running Time of Insertion Sort
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Quick-Sort
  • Quicksort Expected number of comparisons
  • Randomized Quick-Sort
  • Remarks
  • Randomized Select
  • Randomized Algorithms
  • Assumptions
  • Monte Carlo Algorithms
  • Las Vegas Algorithms
  • Why expected running times
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Skip List Dictionary as ADT
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Hashing
  • Hash Tables
  • Hash Functions
  • Resolving Collisions
  • Chaining
  • Slide 38
  • Slide 39
  • Slide 40
  • Analysis of Chaining
  • Slide 42
  • Slide 43
  • Slide 44
  • Analysis of Chaining Continued
  • A Final Word About Randomized Algorithms
  • Acknowledgements
  • Slide 48
Page 33: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Hashing

bull Motivation symbol tablesndash A compiler uses a symbol table to relate

symbols to associated databull Symbols variable names procedure names etcbull Associated data memory location call graph etc

ndash For a symbol table (also called a dictionary) we care about search insertion and deletion

ndash We typically donrsquot care about sorted order

Hash Tables

bull More formallyndash Given a table T and a record x with key (= symbol)

and satellite data we need to supportbull Insert (T x)bull Delete (T x)bull Search(T x)

ndash We want these to be fast but donrsquot care about sorting the records

bull The structure we will use is a hash tablendash Supports all the above in O(1) expected time

Hash Functions

bull Next problem collision T

0

m - 1

h(k1)

h(k4)

h(k2) = h(k5)

h(k3)

k4

k2 k3

k1

k5

U(universe of keys)

K(actualkeys)

Resolving Collisions

bull How can we solve the problem of collisions

bull One of the solution is chaining

bull Other solutions open addressing

Chaining

bull Chaining puts elements that hash to the same slot in a linked list

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

bull How do we insert an element

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

mdashmdash

T

k4

k2k3

k1

k5

U(universe of keys)

K(actualkeys)

k6

k8

k7

k1 k4 mdashmdash

k5 k2

k3

k8 k6 mdashmdash

mdashmdash

k7 mdashmdash

Chaining

• How do we delete an element?

Chaining

• How do we search for an element with a given key?
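One possible answer to the three questions above, as a minimal sketch rather than a definitive implementation. The class name `ChainedHashTable`, its method names, the use of Python's built-in `hash`, and the use of a Python list in place of a linked list for each chain are all assumptions made for the example.

```python
class ChainedHashTable:
    """Hash table with chaining (illustrative sketch).

    Each slot holds a Python list of (key, value) pairs that plays
    the role of the linked list on the slide.
    """

    def __init__(self, m: int = 8) -> None:
        self.m = m                               # number of slots
        self.slots = [[] for _ in range(m)]      # one chain per slot

    def _h(self, key) -> int:
        # Assumed hash function: built-in hash reduced modulo m.
        return hash(key) % self.m

    def insert(self, key, value) -> None:
        """Add the record to its chain, overwriting an existing key."""
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        """Scan only the chain at slot h(key); return None if absent."""
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key) -> bool:
        """Remove the key from its chain; report whether it was found."""
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False
```

All three operations touch a single chain, so their cost is governed by the length of that chain. This version scans the chain on insert to avoid duplicate keys; if duplicates cannot occur, inserting at the head of a linked list takes O(1) time in the worst case.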

Analysis of Chaining

• Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot
• Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot
• What will be the average cost of an unsuccessful search for a key? A: O(1 + α)
• What will be the average cost of a successful search? A: O((1 + α)/2) = O(1 + α)
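A sketch of where these bounds come from, under simple uniform hashing. The symbol n_j (the length of the chain in slot j) is introduced here for the derivation. The successful-search line additionally assumes that new keys are inserted at the head of a chain and that the key searched for is equally likely to be any of the n stored keys; this is the standard textbook argument, not something stated on the slides.

```latex
% Each of the n keys lands in slot j with probability 1/m, so
\mathbb{E}[n_j] = \frac{n}{m} = \alpha .

% Unsuccessful search: compute h(k), then scan the whole chain.
\mathbb{E}[T_{\text{unsuccessful}}] = 1 + \mathbb{E}[n_{h(k)}] = 1 + \alpha = \Theta(1 + \alpha) .

% Successful search for the i-th inserted key: only the keys inserted
% after it (each in the same slot with probability 1/m) precede it.
\mathbb{E}[T_{\text{successful}}]
  = \frac{1}{n} \sum_{i=1}^{n} \Bigl( 1 + \sum_{j=i+1}^{n} \frac{1}{m} \Bigr)
  = 1 + \frac{n-1}{2m}
  = 1 + \frac{\alpha}{2} - \frac{\alpha}{2n}
  = O(1 + \alpha) .
```

Both costs are therefore O(1 + α): one unit for computing the hash and reaching the slot, plus a term proportional to the load factor.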

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)
• If the number of keys n is proportional to the number of slots in the table, what is α?
• A: α = O(1)
  – In other words, we can make the expected cost of searching constant if we make α constant
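A quick numerical check of this point, as a sketch. The function name `chain_stats`, the fixed seed, and the choice n = 2m are assumptions for the example; the random slot choice stands in for simple uniform hashing.

```python
import random


def chain_stats(n: int, m: int, seed: int = 0) -> tuple[float, int]:
    """Throw n keys into m slots uniformly at random.

    Returns (average chain length, longest chain length).
    """
    rng = random.Random(seed)
    chain_len = [0] * m
    for _ in range(n):
        chain_len[rng.randrange(m)] += 1
    return sum(chain_len) / m, max(chain_len)


# Keep n proportional to m (here n = 2m) while the table grows.
for m in (10, 100, 1000):
    avg, longest = chain_stats(2 * m, m)
    print(f"m={m:5d}  n={2 * m:5d}  average chain={avg}  longest chain={longest}")
```

The average chain length is always exactly n/m, here 2, however large the table gets, which is why the expected search cost stays constant. The longest chain can be larger than the average; the analysis above bounds the expected cost, not the worst case.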

A Final Word About Randomized Algorithms

If we could prove this:

P(failure) < 1/k (we are sort of happy)

P(failure) < 1/n^k (most of the time this is true and we're happy)

P(failure) < 1/2^n (this is difficult, but still we want this)
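To get a feel for how far apart these three guarantees are, the sketch below evaluates them for one concrete choice. The values n = 100 and k = 3, the reading of n as the input size and k as a constant, and the function name `failure_bounds` are assumptions for illustration.

```python
from fractions import Fraction


def failure_bounds(n: int, k: int) -> dict:
    """Exact values of the three failure-probability bounds."""
    return {
        "1/k": Fraction(1, k),
        "1/n^k": Fraction(1, n ** k),
        "1/2^n": Fraction(1, 2 ** n),
    }


bounds = failure_bounds(n=100, k=3)
for name, value in bounds.items():
    print(f"P(failure) < {name} = {float(value):.3e}")
```

For these values the constant bound is about one in three, the inverse-polynomial bound is one in a million, and the inverse-exponential bound is smaller still by many orders of magnitude.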

Acknowledgements

• Kunal Verma
• Nidhi Aggarwal
• And other students of the MSc (CS) batch of 2009

END

  • Slide 1
  • Expected Running Time of Insertion Sort
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Quick-Sort
  • Quicksort Expected number of comparisons
  • Randomized Quick-Sort
  • Remarks
  • Randomized Select
  • Randomized Algorithms
  • Assumptions
  • Monte Carlo Algorithms
  • Las Vegas Algorithms
  • Why expected running times
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Skip List Dictionary as ADT
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Hashing
  • Hash Tables
  • Hash Functions
  • Resolving Collisions
  • Chaining
  • Slide 38
  • Slide 39
  • Slide 40
  • Analysis of Chaining
  • Slide 42
  • Slide 43
  • Slide 44
  • Analysis of Chaining Continued
  • A Final Word About Randomized Algorithms
  • Acknowledgements
  • Slide 48
Page 41: Expected Running Times and Randomized Algorithms Instructor Neelima Gupta ngupta@cs.du.ac.in

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search

Analysis of Chaining

bull Assume simple uniform hashing each key in table is equally likely to be hashed to any slot

bull Given n keys and m slots in the table the load factor = nm = average keys per slot

bull What will be the average cost of an unsuccessful search for a key A O(1+)

bull What will be the average cost of a successful search A O((1 + )2) = O(1 + )

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)

• If the number of keys n is proportional to the number of slots in the table, what is α?

• A: α = O(1)
  – In other words, we can make the expected cost of searching constant if we make α constant
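One common way to keep the load factor constant is to grow the table as keys arrive. The sketch below is an illustration under assumptions, not the method of the slides: the class name `ResizingChainedSet`, the doubling policy, and the threshold `max_load` are all hypothetical choices.

```python
class ResizingChainedSet:
    """Chained hash set that doubles its slot count to keep n/m bounded."""

    def __init__(self, initial_slots=8, max_load=1.0):
        self.slots = [[] for _ in range(initial_slots)]
        self.n = 0                # number of stored keys
        self.max_load = max_load  # assumed upper bound on the load factor

    def load_factor(self):
        return self.n / len(self.slots)

    def add(self, key):
        chain = self.slots[hash(key) % len(self.slots)]
        if key in chain:
            return                # already present, nothing to do
        chain.append(key)
        self.n += 1
        if self.load_factor() > self.max_load:
            self._resize(2 * len(self.slots))

    def _resize(self, new_m):
        # Rehash every key into a table with new_m slots.
        old_slots = self.slots
        self.slots = [[] for _ in range(new_m)]
        for chain in old_slots:
            for key in chain:
                self.slots[hash(key) % new_m].append(key)

    def __contains__(self, key):
        return key in self.slots[hash(key) % len(self.slots)]
```

Because m is doubled whenever n exceeds it, the load factor stays at most `max_load` no matter how many keys are added, so the expected search cost O(1 + α) stays constant under simple uniform hashing.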

A Final Word About Randomized Algorithms

If we could prove this:

P(failure) < 1/k (we are sort of happy)

P(failure) < 1/n^k (most of the time this is true and we're happy)

P(failure) < 1/2^n (this is difficult, but still we want this)
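To see how a weak bound on P(failure) can be turned into a stronger one, here is a small worked sketch. It rests on assumptions that are not stated in the slides: the runs are independent, and the overall algorithm fails only if every run fails (for example, when a correct answer can be recognized). Under those assumptions, t runs with single-run failure probability p fail together with probability p^t. The function name `repetitions_needed` is hypothetical.

```python
def repetitions_needed(p_single, target):
    """Smallest t such that p_single**t <= target.

    Assumes independent runs, with overall failure only if all runs fail.
    """
    if not 0 < p_single < 1:
        raise ValueError("p_single must be strictly between 0 and 1")
    if not 0 < target <= 1:
        raise ValueError("target must be in (0, 1]")
    t = 0
    failure = 1.0  # probability that all runs so far have failed
    while failure > target:
        failure *= p_single
        t += 1
    return t
```

For instance, with p = 1/2, n = 1000 and k = 2, the target 1/n^k = 10^-6 is reached after 20 repetitions, since (1/2)^20 is just below 10^-6 while (1/2)^19 is not.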

Acknowledgements

• Kunal Verma

• Nidhi Aggarwal

• And other students of the MSc (CS) batch of 2009

END
