Algorithm Design Slides

Analysis of Algorithms

[Figure: Input → Algorithm → Output]

An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.

Running Time (1.1)

- Most algorithms transform input objects into output objects.
- The running time of an algorithm typically grows with the input size.
- Average-case time is often difficult to determine.
- We focus on the worst-case running time.
  - Easier to analyze
  - Crucial to applications such as games, finance and robotics

[Figure: running time versus input size (1000 to 4000), with curves for best case, average case and worst case]


Experimental Studies (1.6)

- Write a program implementing the algorithm
- Run the program with inputs of varying size and composition
- Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time
- Plot the results

[Figure: plot of time (ms) versus input size]
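The experimental procedure above can be sketched in Java roughly as follows. This is only an illustration under assumptions: the class name TimingSketch, the helper names timeOnce and measure, the fixed random seed and the choice of arrayMax as the algorithm under test are not taken from the slides.

```java
import java.util.Random;

class TimingSketch {
    // Algorithm being measured: scan the first n entries for the largest one.
    static int arrayMax(int[] a, int n) {
        int currentMax = a[0];
        for (int i = 1; i < n; i++) {
            if (a[i] > currentMax) currentMax = a[i];
        }
        return currentMax;
    }

    // Times one run on a pseudo-random input of size n (n >= 1), in milliseconds.
    static long timeOnce(int n, long seed) {
        int[] input = new Random(seed).ints(n, 0, 1_000_000).toArray();
        long start = System.currentTimeMillis();
        int result = arrayMax(input, n);
        long elapsed = System.currentTimeMillis() - start;
        // Use the result so the measured call cannot be treated as dead code.
        if (result < 0) throw new IllegalStateException("unexpected result");
        return elapsed;
    }

    // Collects one measurement per input size, ready to be plotted.
    static long[] measure(int[] sizes) {
        long[] times = new long[sizes.length];
        for (int k = 0; k < sizes.length; k++) {
            times[k] = timeOnce(sizes[k], 42L);
        }
        return times;
    }
}
```

A call such as TimingSketch.measure(new int[]{1000, 2000, 4000}) returns one timing per size. For inputs this small the millisecond clock will often report 0, which is one practical reason single measurements can be misleading.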


Limitations of Experiments

- It is necessary to implement the algorithm, which may be difficult
- Results may not be indicative of the running time on other inputs not included in the experiment
- In order to compare two algorithms, the same hardware and software environments must be used


Theoretical Analysis

- Uses a high-level description of the algorithm instead of an implementation
- Characterizes running time as a function of the input size, n
- Takes into account all possible inputs
- Allows us to evaluate the speed of an algorithm independent of the hardware/software environment


Pseudocode (1.1)

- High-level description of an algorithm
- More structured than English prose
- Less detailed than a program
- Preferred notation for describing algorithms
- Hides program design issues

Example: find the maximum element of an array

    Algorithm arrayMax(A, n)
        Input array A of n integers
        Output maximum element of A
        currentMax ← A[0]
        for i ← 1 to n - 1 do
            if A[i] > currentMax then
                currentMax ← A[i]
        return currentMax
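For comparison with the pseudocode, here is a minimal Java sketch of the same algorithm. The class name ArrayMaxSketch is an assumption made for illustration, and the sketch assumes 1 <= n <= a.length.

```java
class ArrayMaxSketch {
    // Returns the maximum of the first n entries of a (assumes 1 <= n <= a.length).
    static int arrayMax(int[] a, int n) {
        int currentMax = a[0];
        for (int i = 1; i <= n - 1; i++) {
            if (a[i] > currentMax) {
                currentMax = a[i];
            }
        }
        return currentMax;
    }
}
```

The loop bounds mirror "for i ← 1 to n - 1" directly; the Java version has to commit to details the pseudocode hides, such as the element type.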


Pseudocode Details

- Control flow
  - if … then … [else …]
  - while … do …
  - repeat … until …
  - for … do …
  - Indentation replaces braces
- Method declaration
      Algorithm method(arg [, arg…])
          Input …
          Output …
- Method call
      var.method(arg [, arg…])
- Return value
      return expression
- Expressions
  - ← Assignment (like = in Java)
  - = Equality testing (like == in Java)
  - n² Superscripts and other mathematical formatting allowed


The Random Access Machine (RAM) Model

- A CPU
- A potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character
- Memory cells are numbered, and accessing any cell in memory takes unit time

[Figure: a CPU connected to a bank of memory cells numbered 0, 1, 2, …]


Primitive Operations

- Basic computations performed by an algorithm
- Identifiable in pseudocode
- Largely independent from the programming language
- Exact definition not important (we will see why later)
- Assumed to take a constant amount of time in the RAM model
- Examples:
  - Evaluating an expression
  - Assigning a value to a variable
  - Indexing into an array
  - Calling a method
  - Returning from a method


Counting Primitive Operations (1.1)

- By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size

    Algorithm arrayMax(A, n)              # operations
        currentMax ← A[0]                 2
        for i ← 1 to n - 1 do             2 + n
            if A[i] > currentMax then     2(n - 1)
                currentMax ← A[i]         2(n - 1)
            { increment counter i }       2(n - 1)
        return currentMax                 1
                                   Total  7n - 1
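As a check on the total, the per-line counts can be added up explicitly. This is just the arithmetic behind the table, written out:

```latex
\begin{aligned}
2 + (2 + n) + 2(n-1) + 2(n-1) + 2(n-1) + 1
  &= (n + 4) + 6(n - 1) + 1 \\
  &= n + 4 + 6n - 6 + 1 \\
  &= 7n - 1 .
\end{aligned}
```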


Estimating Running Time

- Algorithm arrayMax executes 7n - 1 primitive operations in the worst case. Define:
  - a = time taken by the fastest primitive operation
  - b = time taken by the slowest primitive operation
- Let T(n) be the worst-case time of arrayMax. Then
      $a(7n - 1) \le T(n) \le b(7n - 1)$
- Hence, the running time T(n) is bounded by two linear functions


Growth Rate of Running Time

- Changing the hardware/software environment
  - Affects T(n) by a constant factor, but
  - Does not alter the growth rate of T(n)
- The linear growth rate of the running time T(n) is an intrinsic property of algorithm arrayMax


Growth Rates

- Growth rates of functions:
  - Linear ≈ $n$
  - Quadratic ≈ $n^2$
  - Cubic ≈ $n^3$
- In a log-log chart, the slope of the line corresponds to the growth rate of the function

[Figure: log-log chart of T(n) versus n (1E+0 to 1E+10) with lines for cubic, quadratic and linear growth]


Constant Factors

- The growth rate is not affected by
  - constant factors or
  - lower-order terms
- Examples
  - $10^2 n + 10^5$ is a linear function
  - $10^5 n^2 + 10^8 n$ is a quadratic function

[Figure: log-log chart of T(n) versus n (1E+0 to 1E+10) showing two quadratic and two linear functions]


Big-Oh Notation (1.2)

- Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and $n_0$ such that
      $f(n) \le c\,g(n)$ for $n \ge n_0$
- Example: 2n + 10 is O(n)
  - $2n + 10 \le cn$
  - $(c - 2)\,n \ge 10$
  - $n \ge 10/(c - 2)$
  - Pick c = 3 and $n_0 = 10$

[Figure: log-log plot of 3n, 2n + 10 and n for n from 1 to 1,000]


Big-Oh Example

- Example: the function $n^2$ is not O(n)
  - $n^2 \le cn$
  - $n \le c$
  - The above inequality cannot be satisfied since c must be a constant

[Figure: log-log plot of $n^2$, 100n, 10n and n for n from 1 to 1,000]


More Big-Oh Examples

- 7n - 2
  - 7n - 2 is O(n)
  - need c > 0 and $n_0 \ge 1$ such that $7n - 2 \le cn$ for $n \ge n_0$
  - this is true for c = 7 and $n_0 = 1$
- $3n^3 + 20n^2 + 5$
  - $3n^3 + 20n^2 + 5$ is $O(n^3)$
  - need c > 0 and $n_0 \ge 1$ such that $3n^3 + 20n^2 + 5 \le cn^3$ for $n \ge n_0$
  - this is true for c = 4 and $n_0 = 21$
- 3 log n + log log n
  - 3 log n + log log n is O(log n)
  - need c > 0 and $n_0 \ge 1$ such that $3 \log n + \log\log n \le c \log n$ for $n \ge n_0$
  - this is true for c = 4 and $n_0 = 2$


Big-Oh and Growth Rate

- The big-Oh notation gives an upper bound on the growth rate of a function
- The statement "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the growth rate of g(n)
- We can use the big-Oh notation to rank functions according to their growth rate

                       f(n) is O(g(n))    g(n) is O(f(n))
    g(n) grows more    Yes                No
    f(n) grows more    No                 Yes
    Same growth        Yes                Yes


Big-Oh Rules

- If f(n) is a polynomial of degree d, then f(n) is $O(n^d)$, i.e.,
  - Drop lower-order terms
  - Drop constant factors
- Use the smallest possible class of functions
  - Say "2n is O(n)" instead of "2n is $O(n^2)$"
- Use the simplest expression of the class
  - Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"


Asymptotic Algorithm Analysis

- The asymptotic analysis of an algorithm determines the running time in big-Oh notation
- To perform the asymptotic analysis
  - We find the worst-case number of primitive operations executed as a function of the input size
  - We express this function with big-Oh notation
- Example:
  - We determine that algorithm arrayMax executes at most 7n - 1 primitive operations
  - We say that algorithm arrayMax "runs in O(n) time"
- Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations


Computing Prefix Averages

- We further illustrate asymptotic analysis with two algorithms for prefix averages
- The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
      $A[i] = (X[0] + X[1] + \cdots + X[i]) / (i + 1)$
- Computing the array A of prefix averages of another array X has applications to financial analysis

[Figure: bar chart of an array X and its prefix averages A over positions 1 to 7]


Prefix Averages (Quadratic)

- The following algorithm computes prefix averages in quadratic time by applying the definition

    Algorithm prefixAverages1(X, n)
        Input array X of n integers
        Output array A of prefix averages of X      # operations
        A ← new array of n integers                 n
        for i ← 0 to n - 1 do                       n
            s ← X[0]                                n
            for j ← 1 to i do                       1 + 2 + … + (n - 1)
                s ← s + X[j]                        1 + 2 + … + (n - 1)
            A[i] ← s / (i + 1)                      n
        return A                                    1


Arithmetic Progression

- The running time of prefixAverages1 is O(1 + 2 + … + n)
- The sum of the first n integers is n(n + 1) / 2
  - There is a simple visual proof of this fact
- Thus, algorithm prefixAverages1 runs in $O(n^2)$ time

[Figure: bar chart illustrating the visual proof of the sum formula]


Prefix Averages (Linear)

- The following algorithm computes prefix averages in linear time by keeping a running sum

    Algorithm prefixAverages2(X, n)
        Input array X of n integers
        Output array A of prefix averages of X      # operations
        A ← new array of n integers                 n
        s ← 0                                       1
        for i ← 0 to n - 1 do                       n
            s ← s + X[i]                            n
            A[i] ← s / (i + 1)                      n
        return A                                    1

- Algorithm prefixAverages2 runs in O(n) time
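The two prefix-average algorithms can be put side by side in Java. This is a sketch under assumptions: the class name PrefixAveragesSketch is invented here, and the result is stored in a double[] rather than the integer array of the pseudocode so that averages are not truncated.

```java
class PrefixAveragesSketch {
    // Quadratic version: recomputes each prefix sum from scratch.
    static double[] prefixAverages1(int[] x, int n) {
        double[] a = new double[n];
        for (int i = 0; i <= n - 1; i++) {
            double s = x[0];
            for (int j = 1; j <= i; j++) {
                s = s + x[j];
            }
            a[i] = s / (i + 1);
        }
        return a;
    }

    // Linear version: keeps a running sum across iterations.
    static double[] prefixAverages2(int[] x, int n) {
        double[] a = new double[n];
        double s = 0;
        for (int i = 0; i <= n - 1; i++) {
            s = s + x[i];
            a[i] = s / (i + 1);
        }
        return a;
    }
}
```

Both methods produce the same array; only the amount of repeated work differs, which is where the gap between quadratic and linear time comes from.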


Math You Need to Review

- Summations (Sec. 1.3.1)
- Logarithms and Exponents (Sec. 1.3.2)
  - properties of logarithms:
      $\log_b(xy) = \log_b x + \log_b y$
      $\log_b(x/y) = \log_b x - \log_b y$
      $\log_b x^a = a \log_b x$
      $\log_b a = \log_x a / \log_x b$
  - properties of exponentials:
      $a^{b+c} = a^b a^c$
      $a^{bc} = (a^b)^c$
      $a^b / a^c = a^{b-c}$
      $b = a^{\log_a b}$
      $b^c = a^{c \log_a b}$
- Proof techniques (Sec. 1.3.3)
- Basic probability (Sec. 1.3.4)


Relatives of Big-Oh

- big-Omega
  - f(n) is $\Omega(g(n))$ if there is a constant c > 0 and an integer constant $n_0 \ge 1$ such that $f(n) \ge c\,g(n)$ for $n \ge n_0$
- big-Theta
  - f(n) is $\Theta(g(n))$ if there are constants c' > 0 and c'' > 0 and an integer constant $n_0 \ge 1$ such that $c'\,g(n) \le f(n) \le c''\,g(n)$ for $n \ge n_0$
- little-oh
  - f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant $n_0 \ge 0$ such that $f(n) \le c\,g(n)$ for $n \ge n_0$
- little-omega
  - f(n) is $\omega(g(n))$ if, for any constant c > 0, there is an integer constant $n_0 \ge 0$ such that $f(n) \ge c\,g(n)$ for $n \ge n_0$


Intuition for Asymptotic Notation

- Big-Oh
  - f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
- big-Omega
  - f(n) is $\Omega(g(n))$ if f(n) is asymptotically greater than or equal to g(n)
- big-Theta
  - f(n) is $\Theta(g(n))$ if f(n) is asymptotically equal to g(n)
- little-oh
  - f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
- little-omega
  - f(n) is $\omega(g(n))$ if f(n) is asymptotically strictly greater than g(n)


Example Uses of the Relatives of Big-Oh

- $5n^2$ is $\omega(n)$
  - f(n) is $\omega(g(n))$ if, for any constant c > 0, there is an integer constant $n_0 \ge 0$ such that $f(n) \ge c\,g(n)$ for $n \ge n_0$
  - need $5n_0^2 \ge c\,n_0$; given c, the $n_0$ that satisfies this is $n_0 \ge c/5 \ge 0$
- $5n^2$ is $\Omega(n)$
  - f(n) is $\Omega(g(n))$ if there is a constant c > 0 and an integer constant $n_0 \ge 1$ such that $f(n) \ge c\,g(n)$ for $n \ge n_0$
  - let c = 1 and $n_0 = 1$
- $5n^2$ is $\Omega(n^2)$
  - f(n) is $\Omega(g(n))$ if there is a constant c > 0 and an integer constant $n_0 \ge 1$ such that $f(n) \ge c\,g(n)$ for $n \ge n_0$
  - let c = 5 and $n_0 = 1$

Elementary Data Structures

- Stacks, Queues, & Lists
- Amortized analysis
- Trees

The Stack ADT (2.1.1)

- The Stack ADT stores arbitrary objects
- Insertions and deletions follow the last-in first-out scheme
- Think of a spring-loaded plate dispenser
- Main stack operations:
  - push(object): inserts an element
  - object pop(): removes and returns the last inserted element
- Auxiliary stack operations:
  - object top(): returns the last inserted element without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored


Applications of Stacks

- Direct applications
  - Page-visited history in a Web browser
  - Undo sequence in a text editor
  - Chain of method calls in the Java Virtual Machine or C++ runtime environment
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures


Array-based Stack (2.1.1)

- A simple way of implementing the Stack ADT uses an array
- We add elements from left to right
- A variable t keeps track of the index of the top element (size is t + 1)

[Figure: array S with cells 0, 1, 2, …, t]

    Algorithm pop():
        if isEmpty() then
            throw EmptyStackException
        else
            t ← t - 1
            return S[t + 1]

    Algorithm push(o)
        if t = S.length - 1 then
            throw FullStackException
        else
            t ← t + 1
            S[t] ← o


Growable Array-based Stack (1.5)

- In a push operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
- How large should the new array be?
  - incremental strategy: increase the size by a constant c
  - doubling strategy: double the size

    Algorithm push(o)
        if t = S.length - 1 then
            A ← new array of size …
            for i ← 0 to t do
                A[i] ← S[i]
            S ← A
        t ← t + 1
        S[t] ← o
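A growable stack using the doubling strategy might look like the following in Java. This is a sketch, not the slides' implementation: the class name GrowableArrayStackSketch and the capacity() accessor are assumptions added for illustration, and java.util.EmptyStackException stands in for the slides' own EmptyStackException class.

```java
import java.util.EmptyStackException;

class GrowableArrayStackSketch {
    private Object[] s = new Object[1]; // start with capacity 1, as in the analysis
    private int t = -1;                 // index of the top element

    int size() { return t + 1; }

    boolean isEmpty() { return t < 0; }

    // Exposed only so the growth behaviour can be observed.
    int capacity() { return s.length; }

    void push(Object o) {
        if (t == s.length - 1) {
            // Array is full: copy everything into an array twice as large.
            Object[] a = new Object[2 * s.length];
            for (int i = 0; i <= t; i++) {
                a[i] = s[i];
            }
            s = a;
        }
        t = t + 1;
        s[t] = o;
    }

    Object pop() {
        if (isEmpty()) throw new EmptyStackException();
        Object top = s[t];
        s[t] = null; // drop the reference so it can be garbage collected
        t = t - 1;
        return top;
    }
}
```

Replacing 2 * s.length with s.length + c would give the incremental strategy instead.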


Comparison of the Strategies

- We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of n push operations
- We assume that we start with an empty stack represented by an array of size 1
- We call amortized time of a push operation the average time taken by a push over the series of operations, i.e., T(n)/n


Analysis of the Incremental Strategy

- We replace the array k = n/c times
- The total time T(n) of a series of n push operations is proportional to
      $n + c + 2c + 3c + 4c + \cdots + kc$
      $= n + c(1 + 2 + 3 + \cdots + k)$
      $= n + ck(k + 1)/2$
- Since c is a constant, T(n) is $O(n + k^2)$, i.e., $O(n^2)$
- The amortized time of a push operation is O(n)


Direct Analysis of the Doubling Strategy

- We replace the array $k = \log_2 n$ times
- The total time T(n) of a series of n push operations is proportional to
      $n + 1 + 2 + 4 + 8 + \cdots + 2^k = n + 2^{k+1} - 1 = 3n - 1$
- T(n) is O(n)
- The amortized time of a push operation is O(1)

[Figure: geometric series 1, 2, 4, 8, …]
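The difference between the two strategies can also be seen by simply counting work. The sketch below is illustrative only: the class and method names are assumptions, and "cost" here means one unit per pushed element plus one unit per element copied during a resize, starting from an array of size 1.

```java
class GrowthStrategySketch {
    // Cost of n pushes when a full array grows by a constant c.
    static long costIncremental(int n, int c) {
        long cost = 0;
        int capacity = 1;
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                cost += size;   // copy every stored element
                capacity += c;
            }
            cost++;             // store the new element
            size++;
        }
        return cost;
    }

    // Cost of n pushes when a full array doubles in size.
    static long costDoubling(int n) {
        long cost = 0;
        int capacity = 1;
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                cost += size;   // copy every stored element
                capacity *= 2;
            }
            cost++;             // store the new element
            size++;
        }
        return cost;
    }
}
```

Under this cost model the doubling count never exceeds 3n, while the incremental count grows quadratically in n for any fixed c.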


Accounting Method Analysis of the Doubling Strategy

- The accounting method determines the amortized running time with a system of credits and debits
- We view a computer as a coin-operated device requiring 1 cyber-dollar for a constant amount of computing
- We set up a scheme for charging operations. This is known as an amortization scheme.
- The scheme must always give us enough money to pay for the actual cost of the operation.
- The total cost of the series of operations is no more than the total amount charged.
- (amortized time) ≤ (total $ charged) / (# operations)

Amortization Scheme for the Doubling Strategy

- Consider again the k phases, where each phase consists of twice as many pushes as the one before.
- At the end of a phase we must have saved enough to pay for the array-growing push of the next phase.
- At the end of phase i we want to have saved i cyber-dollars, to pay for the array growth for the beginning of the next phase.

[Figure: an array with cells 0 to 7 whose second half holds the saved cyber-dollars, and the doubled array with cells 0 to 15]

- We charge $3 for a push. The $2 saved for a regular push are "stored" in the second half of the array. Thus, we will have 2(i/2) = i cyber-dollars saved at the end of phase i.
- Therefore, each push runs in O(1) amortized time; n pushes run in O(n) time.


The Queue ADT (2.1.2)

- The Queue ADT stores arbitrary objects
- Insertions and deletions follow the first-in first-out scheme
- Insertions are at the rear of the queue and removals are at the front of the queue
- Main queue operations:
  - enqueue(object): inserts an element at the end of the queue
  - object dequeue(): removes and returns the element at the front of the queue
- Auxiliary queue operations:
  - object front(): returns the element at the front without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored
- Exceptions
  - Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException


Applications of Queues

- Direct applications
  - Waiting lines
  - Access to shared resources (e.g., printer)
  - Multiprogramming
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures


Singly Linked List

- A singly linked list is a concrete data structure consisting of a sequence of nodes
- Each node stores
  - element
  - link to the next node

[Figure: a node with elem and next fields; a list of four nodes holding A, B, C, D]


Queue with a Singly Linked List

- We can implement a queue with a singly linked list
  - The front element is stored at the first node
  - The rear element is stored at the last node
- The space used is O(n) and each operation of the Queue ADT takes O(1) time

[Figure: linked nodes holding the elements, with f referencing the first node and r the last node]
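A queue on a singly linked list can be sketched in Java as follows. The names LinkedQueueSketch and Node are assumptions made for this example, and java.util.NoSuchElementException is used in place of the slides' EmptyQueueException so that the block is self-contained.

```java
import java.util.NoSuchElementException;

class LinkedQueueSketch<E> {
    private static final class Node<T> {
        final T elem;
        Node<T> next;
        Node(T elem) { this.elem = elem; }
    }

    private Node<E> f; // first node: front of the queue
    private Node<E> r; // last node: rear of the queue
    private int n;     // number of stored elements

    int size() { return n; }

    boolean isEmpty() { return n == 0; }

    // Inserts at the rear by linking a new node after the last one.
    void enqueue(E e) {
        Node<E> node = new Node<>(e);
        if (r == null) {
            f = node;
        } else {
            r.next = node;
        }
        r = node;
        n++;
    }

    E front() {
        if (f == null) throw new NoSuchElementException("queue is empty");
        return f.elem;
    }

    // Removes at the front by advancing the reference to the first node.
    E dequeue() {
        if (f == null) throw new NoSuchElementException("queue is empty");
        E e = f.elem;
        f = f.next;
        if (f == null) r = null; // queue became empty
        n--;
        return e;
    }
}
```

Every operation touches only the first or the last node, which is why each runs in O(1) time.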


List ADT (2.2.2)

- The List ADT models a sequence of positions storing arbitrary objects
- It allows for insertion and removal in the "middle"
- Query methods:
  - isFirst(p), isLast(p)
- Accessor methods:
  - first(), last()
  - before(p), after(p)
- Update methods:
  - replaceElement(p, o), swapElements(p, q)
  - insertBefore(p, o), insertAfter(p, o)
  - insertFirst(o), insertLast(o)
  - remove(p)


Doubly Linked List

- A doubly linked list provides a natural implementation of the List ADT
- Nodes implement Position and store:
  - element
  - link to the previous node
  - link to the next node
- Special trailer and header nodes

[Figure: a node with prev, elem and next fields; a list of nodes/positions holding the elements between the header and trailer nodes]


Trees (2.3)

- In computer science, a tree is an abstract model of a hierarchical structure
- A tree consists of nodes with a parent-child relation
- Applications:
  - Organization charts
  - File systems
  - Programming environments

[Figure: organization chart rooted at ComputersRUs, with nodes Sales, Manufacturing, R&D, US, International, Laptops, Desktops, Europe, Asia and Canada]


Tree ADT (2.3.1)

- We use positions to abstract nodes
- Generic methods:
  - integer size()
  - boolean isEmpty()
  - objectIterator elements()
  - positionIterator positions()
- Accessor methods:
  - position root()
  - position parent(p)
  - positionIterator children(p)
- Query methods:
  - boolean isInternal(p)
  - boolean isExternal(p)
  - boolean isRoot(p)
- Update methods:
  - swapElements(p, q)
  - object replaceElement(p, o)
- Additional update methods may be defined by data structures implementing the Tree ADT


Preorder Traversal (2.3.2)

- A traversal visits the nodes of a tree in a systematic manner
- In a preorder traversal, a node is visited before its descendants
- Application: print a structured document

    Algorithm preOrder(v)
        visit(v)
        for each child w of v
            preOrder(w)

[Figure: document tree "Make Money Fast!" with children 1. Motivations (1.1 Greed, 1.2 Avidity), 2. Methods (2.1 Stock Fraud, 2.2 Ponzi Scheme, 2.3 Bank Robbery) and References; the nodes are numbered 1 to 9 in preorder]


Postorder Traversal (2.3.2)

- In a postorder traversal, a node is visited after its descendants
- Application: compute space used by files in a directory and its subdirectories

    Algorithm postOrder(v)
        for each child w of v
            postOrder(w)
        visit(v)

[Figure: directory tree cs16/ containing homeworks/ (h1c.doc 3K, h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K, Robot.java 20K) and todo.txt 1K; the nodes are numbered 1 to 9 in postorder]
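Both traversals can be sketched in Java on a small general tree. Everything named here (TreeTraversalSketch, Node, sampleTree, diskUsage) is an assumption for illustration; the sample data follows the cs16/ figure, with the additional assumption that directories themselves occupy 0K.

```java
import java.util.ArrayList;
import java.util.List;

class TreeTraversalSketch {
    static final class Node {
        final String name;
        final int sizeKb;
        final List<Node> children = new ArrayList<>();

        Node(String name, int sizeKb) {
            this.name = name;
            this.sizeKb = sizeKb;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Preorder: visit v (record its name) before any of its descendants.
    static void preOrder(Node v, List<String> out) {
        out.add(v.name);
        for (Node w : v.children) {
            preOrder(w, out);
        }
    }

    // Postorder: the total for v is computed only after all children are done.
    static int diskUsage(Node v) {
        int total = 0;
        for (Node w : v.children) {
            total += diskUsage(w);
        }
        return total + v.sizeKb;
    }

    // Directory tree from the figure; directories are assumed to take 0K.
    static Node sampleTree() {
        Node homeworks = new Node("homeworks/", 0)
                .add(new Node("h1c.doc", 3))
                .add(new Node("h1nc.doc", 2));
        Node programs = new Node("programs/", 0)
                .add(new Node("DDR.java", 10))
                .add(new Node("Stocks.java", 25))
                .add(new Node("Robot.java", 20));
        return new Node("cs16/", 0)
                .add(homeworks)
                .add(programs)
                .add(new Node("todo.txt", 1));
    }
}
```

The only difference between the two methods is whether the work for v happens before or after the loop over its children.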


Amortized Analysis of Tree Traversal

- Time taken in preorder or postorder traversal of an n-node tree is proportional to the sum, taken over each node v in the tree, of the time needed for the recursive call for v.
  - The call for v costs $(c_v + 1), where c_v is the number of children of v
  - For the call for v, charge one cyber-dollar to v and charge one cyber-dollar to each child of v.
  - Each node (except the root) gets charged twice: once for its own call and once for its parent's call.
  - Therefore, traversal time is O(n).


Binary Trees (2.3.3)

- A binary tree is a tree with the following properties:
  - Each internal node has two children
  - The children of a node are an ordered pair
- We call the children of an internal node left child and right child
- Alternative recursive definition: a binary tree is either
  - a tree consisting of a single node, or
  - a tree whose root has an ordered pair of children, each of which is a binary tree
- Applications:
  - arithmetic expressions
  - decision processes
  - searching

[Figure: binary tree with root A and further nodes B, C, D, E, F, G, H, I]


Arithmetic Expression Tree

- Binary tree associated with an arithmetic expression
  - internal nodes: operators
  - external nodes: operands
- Example: arithmetic expression tree for the expression (2 × (a − 1) + (3 × b))

[Figure: expression tree with root +, left subtree 2 × (a − 1) and right subtree 3 × b]


Decision Tree

- Binary tree associated with a decision process
  - internal nodes: questions with yes/no answer
  - external nodes: decisions
- Example: dining decision

[Figure: decision tree. Want a fast meal? Yes: How about coffee? (Yes: Starbucks, No: In 'N Out). No: On expense account? (Yes: Antoine's, No: Denny's)]


Properties of Binary Trees

- Notation
  - n number of nodes
  - e number of external nodes
  - i number of internal nodes
  - h height
- Properties:
  - $e = i + 1$
  - $n = 2e - 1$
  - $h \le i$
  - $h \le (n - 1)/2$
  - $e \le 2^h$
  - $h \ge \log_2 e$
  - $h \ge \log_2 (n + 1) - 1$


Inorder Traversal

- In an inorder traversal a node is visited after its left subtree and before its right subtree
- Application: draw a binary tree
  - x(v) = inorder rank of v
  - y(v) = depth of v

    Algorithm inOrder(v)
        if isInternal(v)
            inOrder(leftChild(v))
        visit(v)
        if isInternal(v)
            inOrder(rightChild(v))

[Figure: binary tree whose nodes are numbered 1 to 9 by inorder rank]


Euler Tour Traversal

- Generic traversal of a binary tree
- Includes as special cases the preorder, postorder and inorder traversals
- Walk around the tree and visit each node three times:
  - on the left (preorder)
  - from below (inorder)
  - on the right (postorder)

[Figure: Euler tour around an arithmetic expression tree, with the left (L), below (B) and right (R) visits of a node marked]


Printing Arithmetic Expressions

- Specialization of an inorder traversal
  - print operand or operator when visiting node
  - print "(" before traversing left subtree
  - print ")" after traversing right subtree

    Algorithm printExpression(v)
        if isInternal(v)
            print("(")
            printExpression(leftChild(v))
        print(v.element())
        if isInternal(v)
            printExpression(rightChild(v))
            print(")")

[Figure: expression tree with root +, left subtree 2 × (a − 1) and right subtree 3 × b; printed result ((2 × (a − 1)) + (3 × b))]
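The parenthesized printing can be sketched in Java as below. The class ExpressionPrinterSketch, its Node type and the helpers leaf, node and toInfix are assumptions for this example; the recursion calls itself on both subtrees so that nested subexpressions also receive their parentheses, and ASCII * and - stand in for × and −.

```java
class ExpressionPrinterSketch {
    static final class Node {
        final String element;
        final Node left;
        final Node right;

        Node(String element, Node left, Node right) {
            this.element = element;
            this.left = left;
            this.right = right;
        }

        // In a proper binary tree an internal node has both children.
        boolean isInternal() { return left != null; }
    }

    static Node leaf(String operand) { return new Node(operand, null, null); }

    static Node node(String operator, Node l, Node r) { return new Node(operator, l, r); }

    // Inorder traversal that brackets every internal node's subtree.
    static void printExpression(Node v, StringBuilder out) {
        if (v.isInternal()) {
            out.append('(');
            printExpression(v.left, out);
        }
        out.append(v.element);
        if (v.isInternal()) {
            printExpression(v.right, out);
            out.append(')');
        }
    }

    static String toInfix(Node root) {
        StringBuilder out = new StringBuilder();
        printExpression(root, out);
        return out.toString();
    }
}
```

For the tree of the example, built as node("+", node("*", leaf("2"), node("-", leaf("a"), leaf("1"))), node("*", leaf("3"), leaf("b"))), toInfix returns ((2*(a-1))+(3*b)).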


Linked Data Structure for Representing Trees (2.3.4)

- A node is represented by an object storing
  - Element
  - Parent node
  - Sequence of children nodes
- Node objects implement the Position ADT

[Figure: a tree with nodes A to F and its linked representation]


Linked Data Structure for Binary Trees

- A node is represented by an object storing
  - Element
  - Parent node
  - Left child node
  - Right child node
- Node objects implement the Position ADT

[Figure: a binary tree with nodes A to E and its linked representation]


Array-Based Representation of Binary Trees

- nodes are stored in an array
- let rank(node) be defined as follows:
  - rank(root) = 1
  - if node is the left child of parent(node), rank(node) = 2*rank(parent(node))
  - if node is the right child of parent(node), rank(node) = 2*rank(parent(node)) + 1

[Figure: a nine-node binary tree (A, B, C, D, E, F, G, H, J) whose nodes occupy ranks 1 to 7, 10 and 11]
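The rank rules translate directly into index arithmetic. The helper names below (ArrayBinaryTreeSketch, leftRank, rightRank, parentRank) are assumptions for illustration; the array cell at index 0 is simply left unused so that the root sits at rank 1.

```java
class ArrayBinaryTreeSketch {
    // Rank of the left child of the node at rank r.
    static int leftRank(int r) { return 2 * r; }

    // Rank of the right child of the node at rank r.
    static int rightRank(int r) { return 2 * r + 1; }

    // Rank of the parent of the node at rank r (r > 1); integer division drops the +1.
    static int parentRank(int r) { return r / 2; }

    // Stores element at the given rank; index 0 of the array is unused.
    static void place(Object[] nodes, int rank, Object element) {
        nodes[rank] = element;
    }
}
```

In the figure, the nodes at ranks 10 and 11 are the left and right children of the node at rank 5, since 2 * 5 = 10 and 2 * 5 + 1 = 11.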

Stacks

Outline and Reading

- The Stack ADT (2.1.1)
- Applications of Stacks (2.1.1)
- Array-based implementation (2.1.1)
- Growable array-based stack (1.5)

Abstract Data Types (ADTs)

- An abstract data type (ADT) is an abstraction of a data structure
- An ADT specifies:
  - Data stored
  - Operations on the data
  - Error conditions associated with operations
- Example: ADT modeling a simple stock trading system
  - The data stored are buy/sell orders
  - The operations supported are
    - order buy(stock, shares, price)
    - order sell(stock, shares, price)
    - void cancel(order)
  - Error conditions:
    - Buy/sell a nonexistent stock
    - Cancel a nonexistent order

The Stack ADT

- The Stack ADT stores arbitrary objects
- Insertions and deletions follow the last-in first-out scheme
- Think of a spring-loaded plate dispenser
- Main stack operations:
  - push(object): inserts an element
  - object pop(): removes and returns the last inserted element
- Auxiliary stack operations:
  - object top(): returns the last inserted element without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored

Exceptions

- Attempting the execution of an operation of an ADT may sometimes cause an error condition, called an exception
- Exceptions are said to be "thrown" by an operation that cannot be executed
- In the Stack ADT, operations pop and top cannot be performed if the stack is empty
- Attempting the execution of pop or top on an empty stack throws an EmptyStackException

Applications of Stacks

- Direct applications
  - Page-visited history in a Web browser
  - Undo sequence in a text editor
  - Chain of method calls in the Java Virtual Machine
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures

Method Stack in the JVM

- The Java Virtual Machine (JVM) keeps track of the chain of active methods with a stack
- When a method is called, the JVM pushes on the stack a frame containing
  - Local variables and return value
  - Program counter, keeping track of the statement being executed
- When a method ends, its frame is popped from the stack and control is passed to the method on top of the stack

    main() {
        int i = 5;
        foo(i);
    }

    foo(int j) {
        int k;
        k = j+1;
        bar(k);
    }

    bar(int m) {
    }

[Figure: method stack with frames, from top to bottom: bar (PC = 1, m = 6), foo (PC = 3, j = 5, k = 6), main (PC = 2, i = 5)]

Array-based Stack

- A simple way of implementing the Stack ADT uses an array
- We add elements from left to right
- A variable t keeps track of the index of the top element

[Figure: array S with cells 0, 1, 2, …, t]

    Algorithm size()
        return t + 1

    Algorithm pop()
        if isEmpty() then
            throw EmptyStackException
        else
            t ← t - 1
            return S[t + 1]

Array-based Stack (cont.)

- The array storing the stack elements may become full
- A push operation will then throw a FullStackException
  - Limitation of the array-based implementation
  - Not intrinsic to the Stack ADT

[Figure: array S with cells 0, 1, 2, …, t]

    Algorithm push(o)
        if t = S.length - 1 then
            throw FullStackException
        else
            t ← t + 1
            S[t] ← o

Performance and Limitations

- Performance
  - Let n be the number of elements in the stack
  - The space used is O(n)
  - Each operation runs in time O(1)
- Limitations
  - The maximum size of the stack must be defined a priori and cannot be changed
  - Trying to push a new element into a full stack causes an implementation-specific exception

Computing Spans

- We show how to use a stack as an auxiliary data structure in an algorithm
- Given an array X, the span S[i] of X[i] is the maximum number of consecutive elements X[j] immediately preceding X[i] and such that $X[j] \le X[i]$
- Spans have applications to financial analysis
  - E.g., stock at 52-week high

    X    6  3  4  5  2
    S    1  1  2  3  1

[Figure: bar chart of X (values 0 to 7) over indices 0 to 4]

Quadratic Algorithm

    Algorithm spans1(X, n)
        Input array X of n integers
        Output array S of spans of X               #
        S ← new array of n integers                n
        for i ← 0 to n - 1 do                      n
            s ← 1                                  n
            while s ≤ i ∧ X[i - s] ≤ X[i]          1 + 2 + … + (n - 1)
                s ← s + 1                          1 + 2 + … + (n - 1)
            S[i] ← s                               n
        return S                                   1

- Algorithm spans1 runs in $O(n^2)$ time

Computing Spans with a Stack

- We keep in a stack the indices of the elements visible when "looking back"
- We scan the array from left to right
  - Let i be the current index
  - We pop indices from the stack until we find index j such that X[i] < X[j]
  - We set S[i] ← i - j
  - We push i onto the stack

[Figure: bar chart of an array (values 0 to 7) over indices 0 to 7, showing the elements visible when looking back]

Linear Algorithm

    Algorithm spans2(X, n)                                   #
        S ← new array of n integers                          n
        A ← new empty stack                                  1
        for i ← 0 to n - 1 do                                n
            while (¬A.isEmpty() ∧ X[A.top()] ≤ X[i]) do      n
                A.pop()                                      n
            if A.isEmpty() then                              n
                S[i] ← i + 1                                 n
            else
                S[i] ← i - A.top()                           n
            A.push(i)                                        n
        return S                                             1

- Each index of the array
  - Is pushed into the stack exactly once
  - Is popped from the stack at most once
- The statements in the while-loop are executed at most n times
- Algorithm spans2 runs in O(n) time
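The stack-based span computation can be sketched in Java as follows. The class name SpansSketch is an assumption, and java.util.ArrayDeque is used as the stack of indices. After the popping loop, the index left on top of the stack is the nearest earlier element that is strictly larger, so the span is measured from that index.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class SpansSketch {
    // Returns the span of every element of x in O(n) total time.
    static int[] spans(int[] x) {
        int n = x.length;
        int[] s = new int[n];
        Deque<Integer> stack = new ArrayDeque<>(); // indices still visible when looking back
        for (int i = 0; i < n; i++) {
            // Discard indices whose values are no larger than x[i].
            while (!stack.isEmpty() && x[stack.peek()] <= x[i]) {
                stack.pop();
            }
            // Empty stack: everything up to i is no larger than x[i].
            s[i] = stack.isEmpty() ? i + 1 : i - stack.peek();
            stack.push(i);
        }
        return s;
    }
}
```

For X = 6, 3, 4, 5, 2 this yields the spans 1, 1, 2, 3, 1.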

Growable Array-based Stack

- In a push operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
- How large should the new array be?
  - incremental strategy: increase the size by a constant c
  - doubling strategy: double the size

    Algorithm push(o)
        if t = S.length - 1 then
            A ← new array of size …
            for i ← 0 to t do
                A[i] ← S[i]
            S ← A
        t ← t + 1
        S[t] ← o

Comparison of the Strategies

- We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of n push operations
- We assume that we start with an empty stack represented by an array of size 1
- We call amortized time of a push operation the average time taken by a push over the series of operations, i.e., T(n)/n

Incremental Strategy Analysis

- We replace the array k = n/c times
- The total time T(n) of a series of n push operations is proportional to
      $n + c + 2c + 3c + 4c + \cdots + kc$
      $= n + c(1 + 2 + 3 + \cdots + k)$
      $= n + ck(k + 1)/2$
- Since c is a constant, T(n) is $O(n + k^2)$, i.e., $O(n^2)$
- The amortized time of a push operation is O(n)

Doubling Strategy Analysis

- We replace the array $k = \log_2 n$ times
- The total time T(n) of a series of n push operations is proportional to
      $n + 1 + 2 + 4 + 8 + \cdots + 2^k = n + 2^{k+1} - 1 = 3n - 1$
- T(n) is O(n)
- The amortized time of a push operation is O(1)

[Figure: geometric series 1, 2, 4, 8, …]

Stack Interface in Java

- Java interface corresponding to our Stack ADT
- Requires the definition of class EmptyStackException
- Different from the built-in Java class java.util.Stack

    public interface Stack {

        public int size();

        public boolean isEmpty();

        public Object top() throws EmptyStackException;

        public void push(Object o);

        public Object pop() throws EmptyStackException;

    }

Array-based Stack in Java

    public class ArrayStack implements Stack {

        // holds the stack elements
        private Object S[];

        // index to top element
        private int top = -1;

        // constructor
        public ArrayStack(int capacity) {
            S = new Object[capacity];
        }

        public Object pop() throws EmptyStackException {
            if (isEmpty())
                throw new EmptyStackException("Empty stack: cannot pop");
            Object temp = S[top];
            // facilitates garbage collection
            S[top] = null;
            top = top - 1;
            return temp;
        }

        // the remaining Stack methods are not shown on this slide
    }

Vectors
6/8/2002 2:14 PM

Outline and Reading

- The Vector ADT (2.2.1)
- Array-based implementation (2.2.1)

The Vector ADT

- The Vector ADT extends the notion of array by storing a sequence of arbitrary objects
- An element can be accessed, inserted or removed by specifying its rank (number of elements preceding it)
- An exception is thrown if an incorrect rank is specified (e.g., a negative rank)
- Main vector operations:
  - object elemAtRank(integer r): returns the element at rank r without removing it
  - object replaceAtRank(integer r, object o): replaces the element at rank r with o and returns the old element
  - insertAtRank(integer r, object o): inserts a new element o to have rank r
  - object removeAtRank(integer r): removes and returns the element at rank r
- Additional operations size() and isEmpty()

Applications of Vectors

- Direct applications
  - Sorted collection of objects (elementary database)


Array-based Vector

- Use an array V of size N
- A variable n keeps track of the size of the vector (number of elements stored)
- Operation elemAtRank(r) is implemented in O(1) time by returning V[r]

[Figure: array V with cells 0, 1, 2, …, with rank r and size n marked]

Insertion

- In operation insertAtRank(r, o), we need to make room for the new element by shifting forward the n - r elements V[r], …, V[n - 1]
- In the worst case (r = 0), this takes O(n) time

[Figure: array V before the insertion, after shifting the elements from rank r onward, and after storing o at rank r]

Deletion

- In operation removeAtRank(r), we need to fill the hole left by the removed element by shifting backward the n - r - 1 elements V[r + 1], …, V[n - 1]
- In the worst case (r = 0), this takes O(n) time

[Figure: array V before the removal, after removing o from rank r, and after shifting the later elements backward]
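The shifting done by insertion and deletion can be sketched in Java as follows. The class name ArrayVectorSketch and the choice of IndexOutOfBoundsException and IllegalStateException for the error cases are assumptions; the slides only say that an exception is thrown.

```java
class ArrayVectorSketch {
    private final Object[] v; // array of fixed size N
    private int n;            // number of elements stored

    ArrayVectorSketch(int capacity) {
        v = new Object[capacity];
    }

    int size() { return n; }

    Object elemAtRank(int r) {
        if (r < 0 || r >= n) throw new IndexOutOfBoundsException("rank " + r);
        return v[r];
    }

    // Shifts V[r], ..., V[n - 1] one cell forward, then stores o at rank r.
    void insertAtRank(int r, Object o) {
        if (r < 0 || r > n) throw new IndexOutOfBoundsException("rank " + r);
        if (n == v.length) throw new IllegalStateException("vector is full");
        for (int i = n - 1; i >= r; i--) {
            v[i + 1] = v[i];
        }
        v[r] = o;
        n++;
    }

    // Removes the element at rank r and shifts V[r + 1], ..., V[n - 1] one cell backward.
    Object removeAtRank(int r) {
        if (r < 0 || r >= n) throw new IndexOutOfBoundsException("rank " + r);
        Object old = v[r];
        for (int i = r; i < n - 1; i++) {
            v[i] = v[i + 1];
        }
        v[n - 1] = null;
        n--;
        return old;
    }
}
```

The insertion loop runs from the last element down to rank r so that no element is overwritten before it has been moved.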

Performance

- In the array based implementation of a Vector
  - The space used by the data structure is O(n)
  - size, isEmpty, elemAtRank and replaceAtRank run in O(1) time
  - insertAtRank and removeAtRank run in O(n) time
- If we use the array in a circular fashion, insertAtRank(0) and removeAtRank(0) run in O(1) time
- In an insertAtRank operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one

Queues
6/8/2002 2:16 PM

Outline and Reading

- The Queue ADT (2.1.2)
- Implementation with a circular array (2.1.2)
- Growable array-based queue
- Queue interface in Java

The Queue ADT

- The Queue ADT stores arbitrary objects
- Insertions and deletions follow the first-in first-out scheme
- Insertions are at the rear of the queue and removals are at the front of the queue
- Main queue operations:
  - enqueue(object): inserts an element at the end of the queue
  - object dequeue(): removes and returns the element at the front of the queue
- Auxiliary queue operations:
  - object front(): returns the element at the front without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored
- Exceptions
  - Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException

Applications of Queues

- Direct applications
  - Waiting lists, bureaucracy
  - Access to shared resources (e.g., printer)
  - Multiprogramming


Array-based Queue

- Use an array of size N in a circular fashion
- Two variables keep track of the front and rear
  - f index of the front element
  - r index immediately past the rear element
- Array location r is kept empty

[Figure: array Q in the normal configuration (f before r) and in the wrapped-around configuration (r before f)]

Queue Operations

- We use the modulo operator (remainder of division)

    Algorithm size()
        return (N - f + r) mod N

    Algorithm isEmpty()
        return (f = r)

[Figure: array Q in the normal and wrapped-around configurations]

Queue Operations (cont.)

- Operation enqueue throws an exception if the array is full
- This exception is implementation-dependent

    Algorithm enqueue(o)
        if size() = N - 1 then
            throw FullQueueException
        else
            Q[r] ← o
            r ← (r + 1) mod N

[Figure: array Q in the normal and wrapped-around configurations]

Queue Operations (cont.)

Operation dequeue throws an exception if the queue is empty
This exception is specified in the queue ADT

Algorithm dequeue()
  if isEmpty() then
    throw EmptyQueueException
  else
    o ← Q[f]
    f ← (f + 1) mod N
    return o

Growable Array-based Queue

In an enqueue operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
Similar to what we did for an array-based stack
The enqueue operation has amortized running time
  O(n) with the incremental strategy
  O(1) with the doubling strategy
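The circular-array operations and the doubling strategy can be sketched together in Java. This is a minimal illustration, not the course's implementation: the class name ArrayQueue, the helper grow(), and the use of java.util.NoSuchElementException in place of EmptyQueueException are assumptions made to keep the example self-contained.

```java
import java.util.NoSuchElementException;

/** Circular-array queue that doubles its capacity when full (illustrative sketch). */
class ArrayQueue {
    private Object[] q;   // storage array Q of capacity N
    private int f = 0;    // index of the front element
    private int r = 0;    // index immediately past the rear element

    ArrayQueue(int capacity) {
        q = new Object[Math.max(2, capacity)];
    }

    int size() { return (q.length - f + r) % q.length; }

    boolean isEmpty() { return f == r; }

    void enqueue(Object o) {
        if (size() == q.length - 1) grow();   // one cell is always kept empty
        q[r] = o;
        r = (r + 1) % q.length;
    }

    Object dequeue() {
        if (isEmpty()) throw new NoSuchElementException("queue is empty");
        Object o = q[f];
        q[f] = null;                          // drop the stale reference
        f = (f + 1) % q.length;
        return o;
    }

    /** Doubling strategy: copy the elements, front first, into an array twice as large. */
    private void grow() {
        int n = size();
        Object[] bigger = new Object[2 * q.length];
        for (int i = 0; i < n; i++) bigger[i] = q[(f + i) % q.length];
        q = bigger;
        f = 0;
        r = n;
    }
}
```

Copying front-first during grow() resets the queue to the normal configuration, so the modular index arithmetic stays valid for the new capacity.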

Queue Interface in Java

Java interface corresponding to our Queue ADT
Requires the definition of class EmptyQueueException
No corresponding built-in Java class at the time of these slides (java.util.Queue was added later, in Java 5)

public interface Queue {
  public int size();
  public boolean isEmpty();
  public Object front() throws EmptyQueueException;
  public void enqueue(Object o);
  public Object dequeue() throws EmptyQueueException;
}

Sequences 6/8/2002 2:15 PM


Lists and Sequences

Outline and Reading

Singly linked list
Position ADT and List ADT (2.2.2)
Doubly linked list (2.2.2)
Sequence ADT (2.2.3)
Implementations of the sequence ADT (2.2.3)
Iterators (2.2.3)

Singly Linked List

A singly linked list is a concrete data structure consisting of a sequence of nodes
Each node stores:
  element
  link to the next node

[Figure: a node with its elem and next fields; a singly linked list storing A, B, C, D]

Stack with a Singly Linked List

We can implement a stack with a singly linked list
The top element is stored at the first node of the list
The space used is O(n) and each operation of the Stack ADT takes O(1) time

[Figure: linked nodes holding the stack elements, with top reference t]

Queue with a Singly Linked List

We can implement a queue with a singly linked list
  The front element is stored at the first node
  The rear element is stored at the last node
The space used is O(n) and each operation of the Queue ADT takes O(1) time

[Figure: linked nodes holding the queue elements, with front reference f and rear reference r]

Position ADT

The Position ADT models the notion of place within a data structure where a single object is stored
It gives a unified view of diverse ways of storing data, such as
  a cell of an array
  a node of a linked list
Just one method:
  object element(): returns the element stored at the position

List ADT

The List ADT models a sequence of positions storing arbitrary objects
It establishes a before/after relation between positions
Generic methods:
  size(), isEmpty()
Query methods:
  isFirst(p), isLast(p)
Accessor methods:
  first(), last()
  before(p), after(p)
Update methods:
  replaceElement(p, o), swapElements(p, q)
  insertBefore(p, o), insertAfter(p, o)
  insertFirst(o), insertLast(o)
  remove(p)

Doubly Linked List

A doubly linked list provides a natural implementation of the List ADT
Nodes implement Position and store:
  element
  link to the previous node
  link to the next node
Special trailer and header nodes

[Figure: a node with its prev, next, and elem fields; a doubly linked list of nodes/positions between the header and trailer nodes]

Insertion

We visualize operation insertAfter(p, X), which returns position q

[Figure: three stages of insertAfter(p, X): the list A, B, C with position p; the new node q storing X; the resulting list A, B, X, C]

Deletion

We visualize remove(p), where p = last()

[Figure: three stages of remove(p): the list A, B, C, D with position p at D; node D unlinked; the resulting list A, B, C]

Performance

In the implementation of the List ADT by means of a doubly linked list
  The space used by a list with n elements is O(n)
  The space used by each position of the list is O(1)
  All the operations of the List ADT run in O(1) time
  Operation element() of the Position ADT runs in O(1) time
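To make the O(1) bounds concrete, here is a minimal Java sketch of a doubly linked list with header and trailer sentinels. The names NodeList and Node are illustrative assumptions, and only a few List ADT methods are shown; each update touches a constant number of links.

```java
/** Doubly linked list with header and trailer sentinels (illustrative sketch). */
class NodeList {
    /** A node doubles as a position: it stores an element and two links. */
    static class Node {
        Object elem;
        Node prev, next;
        Node(Object elem, Node prev, Node next) {
            this.elem = elem;
            this.prev = prev;
            this.next = next;
        }
        Object element() { return elem; }
    }

    private final Node header = new Node(null, null, null);
    private final Node trailer = new Node(null, header, null);
    private int size = 0;

    NodeList() { header.next = trailer; }

    int size() { return size; }
    boolean isEmpty() { return size == 0; }
    Node first() { return header.next; }   // callers should check isEmpty() first
    Node last() { return trailer.prev; }

    /** Links a new node q right after p by updating four references: O(1) time. */
    Node insertAfter(Node p, Object o) {
        Node q = new Node(o, p, p.next);
        p.next.prev = q;
        p.next = q;
        size++;
        return q;
    }

    Node insertFirst(Object o) { return insertAfter(header, o); }
    Node insertLast(Object o) { return insertAfter(trailer.prev, o); }

    /** Unlinks p from its neighbours and returns its element: O(1) time. */
    Object remove(Node p) {
        p.prev.next = p.next;
        p.next.prev = p.prev;
        p.prev = p.next = null;            // invalidate the removed position
        size--;
        return p.elem;
    }
}
```

The sentinels remove the special cases for inserting at either end: insertFirst and insertLast are both expressed through insertAfter.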

Sequence ADT

The Sequence ADT is the union of the Vector and List ADTs
Elements accessed by
  Rank, or
  Position
Generic methods:
  size(), isEmpty()
Vector-based methods:
  elemAtRank(r), replaceAtRank(r, o), insertAtRank(r, o), removeAtRank(r)
List-based methods:
  first(), last(), before(p), after(p), replaceElement(p, o), swapElements(p, q), insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o), remove(p)
Bridge methods:
  atRank(r), rankOf(p)

Applications of Sequences

The Sequence ADT is a basic, general-purpose data structure for storing an ordered collection of elements
Direct applications:
  Generic replacement for stack, queue, vector, or list
  Small database (e.g., address book)
Indirect applications:
  Building block of more complex data structures

Array-based Implementation

We use a circular array storing positions
A position object stores:
  Element
  Rank
Indices f and l keep track of first and last positions

[Figure: circular array S whose cells reference position objects with ranks 0 to 3, each referencing an element; indices f and l mark the first and last positions]

Sequence Implementations

Running times, in O(·) notation, for the array-based and list-based implementations:

Operation                      Array   List
size, isEmpty                  1       1
atRank, rankOf, elemAtRank     1       n
first, last, before, after     1       1
replaceElement, swapElements   1       1
replaceAtRank                  1       n
insertAtRank, removeAtRank     n       n
insertFirst, insertLast        1       1
insertAfter, insertBefore      n       1
remove                         n       1

Iterators

An iterator abstracts the process of scanning through a collection of elements
Methods of the ObjectIterator ADT:
  object object()
  boolean hasNext()
  object nextObject()
  reset()
Extends the concept of Position by adding a traversal capability
Implementation with an array or singly linked list
An iterator is typically associated with another data structure
We can augment the Stack, Queue, Vector, List and Sequence ADTs with method:
  ObjectIterator elements()
Two notions of iterator:
  snapshot: freezes the contents of the data structure at a given time
  dynamic: follows changes to the data structure

Trees 6/8/2002 2:15 PM


Trees

[Figure: tree with root "Make Money Fast!" and nodes StockFraud, PonziScheme, BankRobbery]

Outline and Reading

Tree ADT (2.3.1)
Preorder and postorder traversals (2.3.2)
BinaryTree ADT (2.3.3)
Inorder traversal (2.3.3)
Euler Tour traversal (2.3.3)
Template method pattern
Data structures for trees (2.3.4)
Java implementation (http://jdsl.org)

What is a Tree

In computer science, a tree is an abstract model of a hierarchical structure
A tree consists of nodes with a parent-child relation
Applications:
  Organization charts
  File systems
  Programming environments

[Figure: organization chart rooted at ComputersRUs, with nodes Sales, R&D, Manufacturing, Laptops, Desktops, US, International, Europe, Asia, Canada]

Tree Terminology

Root: node without parent (A)
Internal node: node with at least one child (A, B, C, F)
External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D)
Ancestors of a node: parent, grandparent, great-grandparent, etc.
Depth of a node: number of ancestors
Height of a tree: maximum depth of any node (3)
Descendant of a node: child, grandchild, great-grandchild, etc.
Subtree: tree consisting of a node and its descendants

[Figure: example tree with nodes A to K rooted at A, with one subtree highlighted]

Tree ADT

We use positions to abstract nodes
Generic methods:
  integer size()
  boolean isEmpty()
  objectIterator elements()
  positionIterator positions()
Accessor methods:
  position root()
  position parent(p)
  positionIterator children(p)
Query methods:
  boolean isInternal(p)
  boolean isExternal(p)
  boolean isRoot(p)
Update methods:
  swapElements(p, q)
  object replaceElement(p, o)
Additional update methods may be defined by data structures implementing the Tree ADT

Preorder Traversal

A traversal visits the nodes of a tree in a systematic manner
In a preorder traversal, a node is visited before its descendants
Application: print a structured document

Algorithm preOrder(v)
  visit(v)
  for each child w of v
    preOrder(w)

[Figure: document tree "Make Money Fast!" with sections 1. Motivations (1.1 Greed, 1.2 Avidity), 2. Methods (2.1 StockFraud, 2.2 PonziScheme, 2.3 BankRobbery), and References; nodes numbered 1 to 9 in preorder]

Postorder Traversal

In a postorder traversal, a node is visited after its descendants
Application: compute space used by files in a directory and its subdirectories

Algorithm postOrder(v)
  for each child w of v
    postOrder(w)
  visit(v)

[Figure: directory tree cs16/ containing homeworks/ (h1c.doc 3K, h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K, Robot.java 20K), and todo.txt 1K; nodes numbered 1 to 9 in postorder]
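The two traversals can be sketched on a small general tree in Java. The class TreeNode and the method names preOrder and diskUsage are hypothetical names for this illustration; the disk-usage method is the postorder application described above, using the file sizes from the cs16/ example.

```java
import java.util.ArrayList;
import java.util.List;

/** General tree node used to illustrate preorder and postorder traversals (sketch). */
class TreeNode {
    final String name;
    final int size;                                   // size in KB; 0 for directories here
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name, int size, TreeNode... kids) {
        this.name = name;
        this.size = size;
        for (TreeNode k : kids) children.add(k);
    }

    /** Preorder: visit the node first, then each child subtree. */
    static void preOrder(TreeNode v, List<String> out) {
        out.add(v.name);
        for (TreeNode w : v.children) preOrder(w, out);
    }

    /** Postorder: total the children first, then add the node itself. */
    static int diskUsage(TreeNode v) {
        int total = 0;
        for (TreeNode w : v.children) total += diskUsage(w);
        return total + v.size;
    }
}
```

A postorder traversal fits the disk-usage computation because a directory's total is only known once all of its subdirectories have been totalled.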

Binary Tree

A binary tree is a tree with the following properties:
  Each internal node has two children
  The children of a node are an ordered pair
We call the children of an internal node left child and right child
Alternative recursive definition: a binary tree is either
  a tree consisting of a single node, or
  a tree whose root has an ordered pair of children, each of which is a binary tree
Applications:
  arithmetic expressions
  decision processes
  searching

[Figure: binary tree with nodes A to I rooted at A]

Arithmetic Expression Tree

Binary tree associated with an arithmetic expression
  internal nodes: operators
  external nodes: operands
Example: arithmetic expression tree for the expression (2 × (a − 1) + (3 × b))

[Figure: expression tree with root +, whose subtrees represent 2 × (a − 1) and 3 × b]

Decision Tree

Binary tree associated with a decision process
  internal nodes: questions with yes/no answer
  external nodes: decisions
Example: dining decision

[Figure: decision tree with root "Want a fast meal?"; Yes leads to "How about coffee?" (Yes: Starbucks, No: Spike's); No leads to "On expense account?" (Yes: Al Forno, No: Café Paragon)]

Properties of Binary Trees

Notation
  n: number of nodes
  e: number of external nodes
  i: number of internal nodes
  h: height
Properties:
  e = i + 1
  n = 2e − 1
  h ≤ i
  h ≤ (n − 1)/2
  e ≤ 2^h
  h ≥ log_2 e
  h ≥ log_2 (n + 1) − 1

BinaryTree ADT

The BinaryTree ADT extends the Tree ADT, i.e., it inherits all the methods of the Tree ADT
Additional methods:
  position leftChild(p)
  position rightChild(p)
  position sibling(p)
Update methods may be defined by data structures implementing the BinaryTree ADT

Inorder Traversal

In an inorder traversal a node is visited after its left subtree and before its right subtree
Application: draw a binary tree
  x(v) = inorder rank of v
  y(v) = depth of v

Algorithm inOrder(v)
  if isInternal(v)
    inOrder(leftChild(v))
  visit(v)
  if isInternal(v)
    inOrder(rightChild(v))

[Figure: binary tree with nodes numbered 1 to 9 in inorder]

Print Arithmetic Expressions

Specialization of an inorder traversal
  print operand or operator when visiting node
  print "(" before traversing left subtree
  print ")" after traversing right subtree

Algorithm printExpression(v)
  if isInternal(v)
    print("(")
    printExpression(leftChild(v))
  print(v.element())
  if isInternal(v)
    printExpression(rightChild(v))
    print(")")

[Figure: expression tree printed as ((2 × (a − 1)) + (3 × b))]

Evaluate Arithmetic Expressions

Specialization of a postorder traversal
  recursive method returning the value of a subtree
  when visiting an internal node, combine the values of the subtrees

Algorithm evalExpr(v)
  if isExternal(v)
    return v.element()
  else
    x ← evalExpr(leftChild(v))
    y ← evalExpr(rightChild(v))
    ◊ ← operator stored at v
    return x ◊ y

[Figure: expression tree with root + and operands 2, 5, 1, 3, 2]
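Printing and evaluating an expression tree can be sketched in Java as follows. The class ExprNode, its fields, and the restriction to the operators +, − and * on int operands are assumptions made for this illustration only.

```java
/** Binary expression tree: internal nodes hold an operator, external nodes an integer (sketch). */
class ExprNode {
    final char op;               // '+', '-' or '*' for internal nodes; unused for leaves
    final int value;             // operand for external nodes
    final ExprNode left, right;

    ExprNode(int value) {
        this.op = 0;
        this.value = value;
        this.left = null;
        this.right = null;
    }

    ExprNode(char op, ExprNode left, ExprNode right) {
        this.op = op;
        this.value = 0;
        this.left = left;
        this.right = right;
    }

    boolean isInternal() { return left != null; }

    /** Inorder specialization: parenthesize every internal node. */
    static String print(ExprNode v) {
        if (!v.isInternal()) return Integer.toString(v.value);
        return "(" + print(v.left) + " " + v.op + " " + print(v.right) + ")";
    }

    /** Postorder specialization: evaluate both subtrees, then combine them. */
    static int eval(ExprNode v) {
        if (!v.isInternal()) return v.value;
        int x = eval(v.left);
        int y = eval(v.right);
        switch (v.op) {
            case '+': return x + y;
            case '-': return x - y;
            case '*': return x * y;
            default: throw new IllegalArgumentException("unknown operator " + v.op);
        }
    }
}
```

Both methods follow the same recursion over the tree; they differ only in when the work for a node happens relative to its subtrees.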

Euler Tour Traversal

Generic traversal of a binary tree
Includes as special cases the preorder, postorder and inorder traversals
Walk around the tree and visit each node three times:
  on the left (preorder)
  from below (inorder)
  on the right (postorder)

[Figure: Euler tour around an expression tree, with the left (L), below (B), and right (R) visits of a node marked]

Template Method Pattern

Generic algorithm that can be specialized by redefining certain steps
Implemented by means of an abstract Java class
Visit methods that can be redefined by subclasses
Template method eulerTour
  Recursively called on the left and right children
  A Result object with fields leftResult, rightResult and finalResult keeps track of the output of the recursive calls to eulerTour

public abstract class EulerTour {
  protected BinaryTree tree;
  protected void visitExternal(Position p, Result r) { }
  protected void visitLeft(Position p, Result r) { }
  protected void visitBelow(Position p, Result r) { }
  protected void visitRight(Position p, Result r) { }
  protected Object eulerTour(Position p) {
    Result r = new Result();
    if (tree.isExternal(p)) { visitExternal(p, r); }
    else {
      visitLeft(p, r);
      r.leftResult = eulerTour(tree.leftChild(p));
      visitBelow(p, r);
      r.rightResult = eulerTour(tree.rightChild(p));
      visitRight(p, r);
    }
    // returned for external and internal nodes alike
    return r.finalResult;
  }
}

Specializations of EulerTour

We show how to specialize class EulerTour to evaluate an arithmetic expression
Assumptions
  External nodes store Integer objects
  Internal nodes store Operator objects supporting method operation(Integer, Integer)

public class EvaluateExpression extends EulerTour {
  protected void visitExternal(Position p, Result r) {
    r.finalResult = (Integer) p.element();
  }
  protected void visitRight(Position p, Result r) {
    Operator op = (Operator) p.element();
    r.finalResult = op.operation((Integer) r.leftResult, (Integer) r.rightResult);
  }
}

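Data Structure for Trees

A node is represented by an object storing
  Element
  Parent node
  Sequence of children nodes
Node objects implement the Position ADT

[Figure: a tree with nodes A to F and its linked representation, in which each node references its parent and a sequence of children]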

Data Structure for Binary Trees

A node is represented by an object storing
  Element
  Parent node
  Left child node
  Right child node
Node objects implement the Position ADT

[Figure: a binary tree with nodes A to E and its linked representation, in which each node references its parent, left child, and right child]

Java Implementation

Tree interface
BinaryTree interface extending Tree
Classes implementing Tree and BinaryTree and providing
  Constructors
  Update methods
  Print methods
Examples of updates for binary trees
  expandExternal(v)
  removeAboveExternal(w)

[Figure: expandExternal(v) turns external node v into an internal node with two new external children; removeAboveExternal(w) removes external node w together with its parent]

Trees in JDSL

JDSL is the Library of Data Structures in Java
Tree interfaces in JDSL
  InspectableBinaryTree
  InspectableTree
  BinaryTree
  Tree
Inspectable versions of the interfaces do not have update methods
Tree classes in JDSL
  NodeBinaryTree
  NodeTree
JDSL was developed at Brown's Center for Geometric Computing
See the JDSL documentation and tutorials at http://jdsl.org

[Figure: hierarchy of the interfaces InspectableTree, InspectableBinaryTree, Tree, and BinaryTree]

Heaps 4/5/2002 14:4

Heaps and Priority Queues

[Figure: heap storing keys 2, 5, 6, 9, 7]


Priority Queue ADT (2.4.1)

A priority queue stores a collection of items
An item is a pair (key, element)
Main methods of the Priority Queue ADT
  insertItem(k, o): inserts an item with key k and element o
  removeMin(): removes the item with smallest key and returns its element
Additional methods
  minKey(): returns, but does not remove, the smallest key of an item
  minElement(): returns, but does not remove, the element of an item with smallest key
  size(), isEmpty()
Applications:
  Standby flyers
  Auctions
  Stock market


Total Order Relation

Keys in a priority queue can be arbitrary objects on which an order is defined
Two distinct items in a priority queue can have the same key
Mathematical concept of total order relation ≤
  Reflexive property: x ≤ x
  Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
  Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z


Comparator ADT (2.4.1)

A comparator encapsulates the action of comparing two objects according to a given total order relation
A generic priority queue uses an auxiliary comparator
The comparator is external to the keys being compared
When the priority queue needs to compare two keys, it uses its comparator
Methods of the Comparator ADT, all with Boolean return type
  isLessThan(x, y)
  isLessThanOrEqualTo(x, y)
  isEqualTo(x, y)
  isGreaterThan(x, y)
  isGreaterThanOrEqualTo(x, y)
  isComparable(x)


Sorting with a Priority Queue (2.4.2)

We can use a priority queue to sort a set of comparable elements
  Insert the elements one by one with a series of insertItem(e, e) operations
  Remove the elements in sorted order with a series of removeMin() operations
The running time of this sorting method depends on the priority queue implementation

Algorithm PQ-Sort(S, C)
  Input sequence S, comparator C for the elements of S
  Output sequence S sorted in increasing order according to C
  P ← priority queue with comparator C
  while ¬S.isEmpty()
    e ← S.remove(S.first())
    P.insertItem(e, e)
  while ¬P.isEmpty()
    e ← P.removeMin()
    S.insertLast(e)
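PQ-Sort maps directly onto the Java standard library. The sketch below assumes java.util.PriorityQueue (a heap-based priority queue) stands in for P and a java.util.Deque stands in for the sequence S; the class name PQSort is hypothetical.

```java
import java.util.Comparator;
import java.util.Deque;
import java.util.PriorityQueue;

class PQSort {
    /** Sorts s in place: phase 1 moves every element into p, phase 2 moves them back in order. */
    static <E> void pqSort(Deque<E> s, Comparator<? super E> c) {
        PriorityQueue<E> p = new PriorityQueue<>(c);   // heap-based priority queue
        while (!s.isEmpty()) p.add(s.removeFirst());   // e <- S.remove(S.first()); P.insertItem(e, e)
        while (!p.isEmpty()) s.addLast(p.poll());      // e <- P.removeMin(); S.insertLast(e)
    }
}
```

Both loops run while the source container is non-empty, which is the condition the pseudocode expresses with the negated isEmpty() test.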


Sequence-based Priority Queue

Implementation with an unsorted list
  Performance:
    insertItem takes O(1) time since we can insert the item at the beginning or end of the sequence
    removeMin, minKey and minElement take O(n) time since we have to traverse the entire sequence to find the smallest key
Implementation with a sorted list
  Performance:
    insertItem takes O(n) time since we have to find the place where to insert the item
    removeMin, minKey and minElement take O(1) time since the smallest key is at the beginning of the sequence

[Figure: unsorted list 4, 5, 2, 3, 1 and sorted list 1, 2, 3, 4, 5]


Selection-Sort

Selection-sort is the variation of PQ-sort where the priority queue is implemented with an unsorted sequence
Running time of Selection-sort:
  Inserting the elements into the priority queue with n insertItem operations takes O(n) time
  Removing the elements in sorted order from the priority queue with n removeMin operations takes time proportional to 1 + 2 + … + n
Selection-sort runs in O(n^2) time

[Figure: unsorted sequence 4, 5, 2, 3, 1]


Insertion-Sort

Insertion-sort is the variation of PQ-sort where the priority queue is implemented with a sorted sequence
Running time of Insertion-sort:
  Inserting the elements into the priority queue with n insertItem operations takes time proportional to 1 + 2 + … + n
  Removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time
Insertion-sort runs in O(n^2) time

[Figure: sorted sequence 1, 2, 3, 4, 5]


What is a heap (2.4.3)

A heap is a binary tree storing keys at its internal nodes and satisfying the following properties:
  Heap-Order: for every internal node v other than the root, key(v) ≥ key(parent(v))
  Complete Binary Tree: let h be the height of the heap
    for i = 0, …, h − 1, there are 2^i nodes of depth i
    at depth h − 1, the internal nodes are to the left of the external nodes
The last node of a heap is the rightmost internal node of depth h − 1

[Figure: heap storing keys 2, 5, 6, 9, 7, with the last node marked]


Height of a Heap (2.4.3)

Theorem: A heap storing n keys has height O(log n)
Proof: (we apply the complete binary tree property)
  Let h be the height of a heap storing n keys
  Since there are 2^i keys at depth i = 0, …, h − 2 and at least one key at depth h − 1, we have n ≥ 1 + 2 + 4 + … + 2^(h−2) + 1
  Thus, n ≥ 2^(h−1), i.e., h ≤ log n + 1

[Figure: keys per depth: 1 key at depth 0, 2 keys at depth 1, …, 2^(h−2) keys at depth h − 2, at least 1 key at depth h − 1]


Heaps and Priority Queues

We can use a heap to implement a priority queue
We store a (key, element) item at each internal node
We keep track of the position of the last node
For simplicity, we show only the keys in the pictures

[Figure: heap storing the items (2, Sue), (5, Pat), (6, Mark), (9, Jeff), (7, Anna)]


Insertion into a Heap (2.4.3)

Method insertItem of the priority queue ADT corresponds to the insertion of a key k to the heap
The insertion algorithm consists of three steps
  Find the insertion node z (the new last node)
  Store k at z and expand z into an internal node
  Restore the heap-order property (discussed next)

[Figure: heap storing keys 2, 5, 6, 9, 7 with the insertion node z marked; the same heap after key 1 is stored at z]


Upheap

After the insertion of a new key k, the heap-order property may be violated
Algorithm upheap restores the heap-order property by swapping k along an upward path from the insertion node
Upheap terminates when the key k reaches the root or a node whose parent has a key smaller than or equal to k
Since a heap has height O(log n), upheap runs in O(log n) time

[Figure: key 1 moves up from the insertion node z, swapping first with 6 and then with the root key 2]


Removal from a Heap (2.4.3)

Method removeMin of the priority queue ADT corresponds to the removal of the root key from the heap
The removal algorithm consists of three steps
  Replace the root key with the key of the last node w
  Compress w and its children into a leaf
  Restore the heap-order property (discussed next)

[Figure: heap storing keys 2, 5, 6, 9, 7 with last node w; after the replacement the root stores 7 and w has become a leaf]


Downheap

After replacing the root key with the key k of the last node, the heap-order property may be violated
Algorithm downheap restores the heap-order property by swapping key k along a downward path from the root
Downheap terminates when key k reaches a leaf or a node whose children have keys greater than or equal to k
Since a heap has height O(log n), downheap runs in O(log n) time

[Figure: key 7 at the root is swapped with its smaller child 5]


Updating the Last Node

The insertion node can be found by traversing a path of O(log n) nodes
  Go up until a left child or the root is reached
  If a left child is reached, go to the right child
  Go down left until a leaf is reached
Similar algorithm for updating the last node after a removal


Heap-Sort (2.4.4)

Consider a priority queue with n items implemented by means of a heap
  the space used is O(n)
  methods insertItem and removeMin take O(log n) time
  methods size, isEmpty, minKey, and minElement take O(1) time
Using a heap-based priority queue, we can sort a sequence of n elements in O(n log n) time
The resulting algorithm is called heap-sort
Heap-sort is much faster than quadratic sorting algorithms, such as insertion-sort and selection-sort


Vector-based Heap Implementation (2.4.3)

We can represent a heap with n keys by means of a vector of length n + 1
For the node at rank i
  the left child is at rank 2i
  the right child is at rank 2i + 1
Links between nodes are not explicitly stored
The leaves are not represented
The cell at rank 0 is not used
Operation insertItem corresponds to inserting at rank n + 1
Operation removeMin corresponds to removing at rank n
Yields in-place heap-sort

[Figure: heap storing keys 2, 5, 6, 9, 7 and its vector representation, with the keys at ranks 1 to 5 and rank 0 unused]
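A vector-based heap with upheap and downheap can be sketched as below. The class name VectorHeap is hypothetical, and the sketch stores int keys only (no elements) in a java.util.ArrayList whose rank 0 is left unused, as described above.

```java
import java.util.ArrayList;
import java.util.NoSuchElementException;

/** Min-heap of int keys stored in a vector; rank 0 is unused (illustrative sketch). */
class VectorHeap {
    private final ArrayList<Integer> a = new ArrayList<>();

    VectorHeap() { a.add(null); }                      // placeholder so the root sits at rank 1

    int size() { return a.size() - 1; }

    void insertItem(int k) {
        a.add(k);                                      // new last node at rank n + 1
        int i = size();
        while (i > 1 && a.get(i / 2) > a.get(i)) {     // upheap: swap with the parent at rank i / 2
            swap(i, i / 2);
            i /= 2;
        }
    }

    int removeMin() {
        if (size() == 0) throw new NoSuchElementException("heap is empty");
        int min = a.get(1);
        int last = a.remove(size());                   // detach the last node (rank n)
        if (size() > 0) {
            a.set(1, last);                            // move its key to the root
            downheap(1);
        }
        return min;
    }

    private void downheap(int i) {
        int n = size();
        while (2 * i <= n) {
            int c = 2 * i;                                     // left child
            if (c + 1 <= n && a.get(c + 1) < a.get(c)) c++;    // pick the smaller child
            if (a.get(i) <= a.get(c)) break;                   // heap-order restored
            swap(i, c);
            i = c;
        }
    }

    private void swap(int i, int j) {
        Integer t = a.get(i);
        a.set(i, a.get(j));
        a.set(j, t);
    }
}
```

Because the last node is always at rank n, no search for the insertion node or the last node is needed in this representation.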



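Merging Two Heaps

We are given two heaps and a key k
We create a new heap with the root node storing k and with the two heaps as subtrees
We perform downheap to restore the heap-order property

[Figure: heaps with keys 3, 8, 5 and 2, 6, 4 are joined under a new root storing 7; after downheap the root stores 2 and key 7 has moved to the bottom level]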


Bottom-up Heap Construction (2.4.3)

We can construct a heap storing n given keys using a bottom-up construction with log n phases
In phase i, pairs of heaps with 2^i − 1 keys are merged into heaps with 2^(i+1) − 1 keys

[Figure: two heaps with 2^i − 1 keys each are merged under a new root into a heap with 2^(i+1) − 1 keys]


Example

[Figure: four slides tracing the bottom-up construction of a 15-key heap: pairs of heaps are joined under new root keys, each merged heap is repaired with downheap, and the final heap has key 4 at the root]


Analysis

We visualize the worst-case time of a downheap with a proxy path that goes first right and then repeatedly goes left until the bottom of the heap (this path may differ from the actual downheap path)
Since each node is traversed by at most two proxy paths, the total number of nodes of the proxy paths is O(n)
Thus, bottom-up heap construction runs in O(n) time
Bottom-up heap construction is faster than n successive insertions and speeds up the first phase of heap-sort
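Bottom-up construction is commonly written over an array, as in the sketch below. It assumes a 0-based int array, so the children of index i are at 2i + 1 and 2i + 2 rather than at ranks 2i and 2i + 1; the class name BottomUpHeap and the checker isHeap are hypothetical.

```java
/** Bottom-up construction of a min-heap in a 0-based int array (illustrative sketch). */
class BottomUpHeap {
    static void build(int[] a) {
        // Indices from n/2 onward are leaves, i.e. one-key heaps; process the rest bottom-up.
        for (int i = a.length / 2 - 1; i >= 0; i--) downheap(a, i, a.length);
    }

    /** Sinks a[i] until neither child (at 2i + 1 and 2i + 2) holds a smaller key. */
    static void downheap(int[] a, int i, int n) {
        while (2 * i + 1 < n) {
            int c = 2 * i + 1;                         // left child
            if (c + 1 < n && a[c + 1] < a[c]) c++;     // pick the smaller child
            if (a[i] <= a[c]) return;                  // heap-order holds here
            int t = a[i];
            a[i] = a[c];
            a[c] = t;
            i = c;
        }
    }

    /** Checks the heap-order property for every non-root index. */
    static boolean isHeap(int[] a) {
        for (int i = 1; i < a.length; i++)
            if (a[(i - 1) / 2] > a[i]) return false;
        return true;
    }
}
```

Each call to downheap merges the two heaps below index i with the key stored at i, which is the merge step applied phase by phase from the bottom level upward.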

Priority Queues 6/8/2002 2:00 PM


Priority Queues

[Figure: IBM order book with buy orders (500 shares at $119, 400 shares at $118) and sell orders (300 shares at $120, 100 shares at $122)]

Outline and Reading

PriorityQueue ADT (2.4.1)
Total order relation (2.4.1)
Comparator ADT (2.4.1)
Sorting with a priority queue (2.4.2)
Selection-sort (2.4.2)
Insertion-sort (2.4.2)


Priority Queue ADT

A priority queue stores a collection of items
An item is a pair (key, element)
Main methods of the Priority Queue ADT
  insertItem(k, o): inserts an item with key k and element o
  removeMin(): removes the item with smallest key and returns its element
Additional methods
  minKey(): returns, but does not remove, the smallest key of an item
  minElement(): returns, but does not remove, the element of an item with smallest key
  size(), isEmpty()
Applications:
  Standby flyers
  Auctions
  Stock market


Total Order Relation

Keys in a priority queue can be arbitrary objects on which an order is defined
Two distinct items in a priority queue can have the same key
Mathematical concept of total order relation ≤
  Reflexive property: x ≤ x
  Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
  Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z


Comparator ADT

A comparator encapsulates the action of comparing two objects according to a given total order relation
A generic priority queue uses an auxiliary comparator
The comparator is external to the keys being compared
When the priority queue needs to compare two keys, it uses its comparator
Methods of the Comparator ADT, all with Boolean return type
  isLessThan(x, y)
  isLessThanOrEqualTo(x, y)
  isEqualTo(x, y)
  isGreaterThan(x, y)
  isGreaterThanOrEqualTo(x, y)
  isComparable(x)


Sorting with a Priority Queue

We can use a priority queue to sort a set of comparable elements
  1. Insert the elements one by one with a series of insertItem(e, e) operations
  2. Remove the elements in sorted order with a series of removeMin() operations
The running time of this sorting method depends on the priority queue implementation

Algorithm PQ-Sort(S, C)
  Input sequence S, comparator C for the elements of S
  Output sequence S sorted in increasing order according to C
  P ← priority queue with comparator C
  while ¬S.isEmpty()
    e ← S.remove(S.first())
    P.insertItem(e, e)
  while ¬P.isEmpty()
    e ← P.removeMin()
    S.insertLast(e)


Sequence-based Priority Queue

Implementation with an unsorted sequence
  Store the items of the priority queue in a list-based sequence, in arbitrary order
  Performance:
    insertItem takes O(1) time since we can insert the item at the beginning or end of the sequence
    removeMin, minKey and minElement take O(n) time since we have to traverse the entire sequence to find the smallest key
Implementation with a sorted sequence
  Store the items of the priority queue in a sequence, sorted by key
  Performance:
    insertItem takes O(n) time since we have to find the place where to insert the item
    removeMin, minKey and minElement take O(1) time since the smallest key is at the beginning of the sequence


Selection-Sort

Selection-sort is the variation of PQ-sort where the priority queue is implemented with an unsorted sequence
Running time of Selection-sort:
  1. Inserting the elements into the priority queue with n insertItem operations takes O(n) time
  2. Removing the elements in sorted order from the priority queue with n removeMin operations takes time proportional to 1 + 2 + … + n
Selection-sort runs in O(n^2) time


Insertion-Sort

Insertion-sort is the variation of PQ-sort where the priority queue is implemented with a sorted sequence
Running time of Insertion-sort:
  1. Inserting the elements into the priority queue with n insertItem operations takes time proportional to 1 + 2 + … + n
  2. Removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time
Insertion-sort runs in O(n^2) time


In-place Insertion-sort

Instead of using an external data structure, we can implement selection-sort and insertion-sort in-place
A portion of the input sequence itself serves as the priority queue
For in-place insertion-sort
  We keep sorted the initial portion of the sequence
  We can use swapElements instead of modifying the sequence

[Figure: successive states of the sequence: 5 4 2 3 1, 5 4 2 3 1, 4 5 2 3 1, 2 4 5 3 1, 2 3 4 5 1, 1 2 3 4 5, 1 2 3 4 5]
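In-place insertion-sort can be sketched on an int array, where the sorted prefix plays the role of the priority queue and adjacent swaps stand in for swapElements. The class name InPlaceInsertionSort is hypothetical.

```java
class InPlaceInsertionSort {
    /** Sorts a in place; a[0..i-1] is the sorted prefix acting as the priority queue. */
    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            // Swap a[i] leftwards until the prefix a[0..i] is sorted again.
            for (int j = i; j > 0 && a[j - 1] > a[j]; j--) {
                int t = a[j];
                a[j] = a[j - 1];
                a[j - 1] = t;
            }
        }
    }
}
```

On the input 5 4 2 3 1 the array passes through the states 4 5 2 3 1, 2 4 5 3 1, 2 3 4 5 1 and 1 2 3 4 5, one per outer iteration.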

Dictionaries 4/5/2002 15:1

Dictionaries and Hash Tables

[Figure: hash table with cells 0 to 4 storing 025-612-0001, 981-101-0002, and 451-229-0004]


Dictionary ADT (2.5.1)

The dictionary ADT models a searchable collection of key-element items
The main operations of a dictionary are searching, inserting, and deleting items
Multiple items with the same key are allowed
Applications:
  address book
  credit card authorization
  mapping host names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)
Dictionary ADT methods:
  findElement(k): if the dictionary has an item with key k, returns its element, else returns the special element NO_SUCH_KEY
  insertItem(k, o): inserts item (k, o) into the dictionary
  removeElement(k): if the dictionary has an item with key k, removes it from the dictionary and returns its element, else returns the special element NO_SUCH_KEY
  size(), isEmpty()
  keys(), elements()


Log File (2.5.1)

A log file is a dictionary implemented by means of an unsorted sequence
  We store the items of the dictionary in a sequence (based on a doubly-linked list or a circular array), in arbitrary order
Performance:
  insertItem takes O(1) time since we can insert the new item at the beginning or at the end of the sequence
  findElement and removeElement take O(n) time since in the worst case (the item is not found) we traverse the entire sequence to look for an item with the given key
The log file is effective only for dictionaries of small size or for dictionaries on which insertions are the most common operations, while searches and removals are rarely performed (e.g., historical record of logins to a workstation)


Hash Functions and Hash Tables (2.5.2)

A hash function h maps keys of a given type to integers in a fixed interval [0, N − 1]
Example: h(x) = x mod N is a hash function for integer keys
The integer h(x) is called the hash value of key x
A hash table for a given key type consists of
  Hash function h
  Array (called table) of size N
When implementing a dictionary with a hash table, the goal is to store item (k, o) at index i = h(k)


Example

We design a hash table for a dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer
Our hash table uses an array of size N = 10,000 and the hash function h(x) = last four digits of x

[Figure: table with cells 0 to 9999; cell 1 stores 025-612-0001, cell 2 stores 981-101-0002, cell 4 stores 451-229-0004, and cell 9998 stores 200-751-9998]


Hash Functions (2.5.3)

A hash function is usually specified as the composition of two functions:
  Hash code map: h1: keys → integers
  Compression map: h2: integers → [0, N − 1]
The hash code map is applied first, and the compression map is applied next on the result, i.e., h(x) = h2(h1(x))
The goal of the hash function is to disperse the keys in an apparently random way



Hash Code Maps (2.5.3)

Memory address:
  We reinterpret the memory address of the key object as an integer (default hash code of all Java objects)
  Good in general, except for numeric and string keys
Integer cast:
  We reinterpret the bits of the key as an integer
  Suitable for keys of length less than or equal to the number of bits of the integer type (e.g., byte, short, int and float in Java)
Component sum:
  We partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and we sum the components (ignoring overflows)
  Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (e.g., long and double in Java)


Hash Code Maps (cont.)

Polynomial accumulation:
  We partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits): a_0 a_1 … a_(n−1)
  We evaluate the polynomial p(z) = a_0 + a_1 z + a_2 z^2 + … + a_(n−1) z^(n−1) at a fixed value z, ignoring overflows
  Especially suitable for strings (e.g., the choice z = 33 gives at most 6 collisions on a set of 50,000 English words)
Polynomial p(z) can be evaluated in O(n) time using Horner's rule:
  The following polynomials are successively computed, each from the previous one in O(1) time
    p_0(z) = a_(n−1)
    p_i(z) = a_(n−i−1) + z p_(i−1)(z)  (i = 1, 2, …, n − 1)
  We have p(z) = p_(n−1)(z)
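Horner's rule for a string key can be sketched as follows, treating each char of the string as one component a_i (an assumption of this example) and letting Java's int arithmetic wrap on overflow. The class and method names are hypothetical.

```java
class PolyHash {
    /** Evaluates a_0 + a_1 z + ... + a_(n-1) z^(n-1) by Horner's rule, ignoring int overflow. */
    static int hashCode(String s, int z) {
        int p = 0;
        // Start from the last component a_(n-1) and fold in one earlier component per step.
        for (int i = s.length() - 1; i >= 0; i--) p = s.charAt(i) + z * p;
        return p;
    }
}
```

Java's own String.hashCode() is a polynomial of the same kind with z = 31, except that the first character receives the highest power, so it equals PolyHash.hashCode applied to the reversed string with z = 31.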


Compression Maps (2.5.4)

Division:
  h2(y) = y mod N
  The size N of the hash table is usually chosen to be a prime
  The reason has to do with number theory and is beyond the scope of this course
Multiply, Add and Divide (MAD):
  h2(y) = (ay + b) mod N
  a and b are nonnegative integers such that a mod N ≠ 0
  Otherwise, every integer would map to the same value b


Collision Handling (2.5.5)

Collisions occur when different elements are mapped to the same cell
Chaining: let each cell in the table point to a linked list of elements that map there
Chaining is simple, but requires additional memory outside the table

[Figure: table with cells 0 to 4; cell 1 points to 025-612-0001 and cell 4 points to a list holding 451-229-0004 and 981-101-0004]


Linear Probing (2.5.5)

- Open addressing: the colliding item is placed in a different cell of the table
- Linear probing handles collisions by placing the colliding item in the next (circularly) available table cell
- Each table cell inspected is referred to as a probe
- Colliding items lump together, causing future collisions to cause a longer sequence of probes
- Example:
  - h(x) = x mod 13
  - Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

Cell:   0   1   2   3   4   5   6   7   8   9  10  11  12
Key:           41          18  44  59  32  22  31  73
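The example can be replayed with a short sketch. The function name `linear_probe_insert` and the use of `None` for an empty cell are assumptions; the hash function and key sequence are the ones from the slide.

```python
def linear_probe_insert(table, key):
    """Insert key using h(x) = x mod N and linear probing; return the cell used."""
    n = len(table)
    i = key % n
    for _ in range(n):          # inspect at most N cells
        if table[i] is None:    # first (circularly) available cell
            table[i] = key
            return i
        i = (i + 1) % n
    raise OverflowError("hash table is full")


table = [None] * 13
for k in [18, 41, 22, 44, 59, 32, 31, 73]:
    linear_probe_insert(table, k)

print(table)
# [None, None, 41, None, None, 18, 44, 59, 32, 22, 31, 73, None]
```

Keys 44, 32, 31 and 73 all collide with earlier items, which is why cells 5 through 11 end up as one contiguous cluster.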


Search with Linear Probing

- Consider a hash table A that uses linear probing
- findElement(k)
  - We start at cell h(k)
  - We probe consecutive locations until one of the following occurs
    - An item with key k is found, or
    - An empty cell is found, or
    - N cells have been unsuccessfully probed

Algorithm findElement(k)
  i ← h(k)
  p ← 0
  repeat
    c ← A[i]
    if c = ∅
      return NO_SUCH_KEY
    else if c.key() = k
      return c.element()
    else
      i ← (i + 1) mod N
      p ← p + 1
  until p = N
  return NO_SUCH_KEY
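A direct transcription of the pseudocode is sketched below. Representing items as `(key, element)` tuples, empty cells as `None`, and passing the hash function `h` as a parameter are assumptions made for the example.

```python
NO_SUCH_KEY = object()  # sentinel standing in for the slides' NO_SUCH_KEY


def find_element(table, k, h):
    """Search a linear-probing table for key k, starting at cell h(k)."""
    n = len(table)
    i = h(k)
    for _ in range(n):              # p counts probes from 0 up to N
        c = table[i]
        if c is None:               # empty cell: k cannot be further along
            return NO_SUCH_KEY
        if c[0] == k:               # item with key k found
            return c[1]
        i = (i + 1) % n             # probe the next cell, wrapping around
    return NO_SUCH_KEY              # N cells probed without success


def h(k):
    return k % 13


table = [None] * 13
table[5] = (18, "a")
table[6] = (44, "b")   # 44 collided with 18 and was placed in the next cell

print(find_element(table, 44, h))                  # b
print(find_element(table, 31, h) is NO_SUCH_KEY)   # True
```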



Updates with Linear Probing

- To handle insertions and deletions, we introduce a special object, called AVAILABLE, which replaces deleted elements
- removeElement(k)
  - We search for an item with key k
  - If such an item (k, o) is found, we replace it with the special item AVAILABLE and we return element o
  - Else, we return NO_SUCH_KEY
- insertItem(k, o)
  - We throw an exception if the table is full
  - We start at cell h(k)
  - We probe consecutive cells until one of the following occurs
    - A cell i is found that is either empty or stores AVAILABLE, or
    - N cells have been unsuccessfully probed
  - We store item (k, o) in cell i
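The role of AVAILABLE is easiest to see in code. The class below is a sketch under assumptions: the class and method names are hypothetical, items are `(key, element)` tuples, and the hash function is `hash(k) % N`.

```python
NO_SUCH_KEY = object()
AVAILABLE = object()   # marker left behind by a removal


class LinearProbingTable:
    def __init__(self, capacity=13):
        self.cells = [None] * capacity
        self.size = 0

    def _h(self, k):
        return hash(k) % len(self.cells)

    def _find_index(self, k):
        """Return the cell holding key k, or -1; AVAILABLE cells are skipped."""
        n = len(self.cells)
        i = self._h(k)
        for _ in range(n):
            c = self.cells[i]
            if c is None:
                return -1
            if c is not AVAILABLE and c[0] == k:
                return i
            i = (i + 1) % n
        return -1

    def find_element(self, k):
        i = self._find_index(k)
        return NO_SUCH_KEY if i < 0 else self.cells[i][1]

    def remove_element(self, k):
        i = self._find_index(k)
        if i < 0:
            return NO_SUCH_KEY
        o = self.cells[i][1]
        self.cells[i] = AVAILABLE   # keep the probe sequence intact
        self.size -= 1
        return o

    def insert_item(self, k, o):
        n = len(self.cells)
        if self.size == n:
            raise OverflowError("hash table is full")
        i = self._h(k)
        for _ in range(n):
            if self.cells[i] is None or self.cells[i] is AVAILABLE:
                self.cells[i] = (k, o)
                self.size += 1
                return
            i = (i + 1) % n


t = LinearProbingTable()
for key, elem in [(18, "a"), (44, "b"), (31, "c")]:   # all hash to cell 5
    t.insert_item(key, elem)
removed = t.remove_element(44)      # cell 6 now stores AVAILABLE
still_found = t.find_element(31)    # the search passes over cell 6
t.insert_item(57, "d")              # 57 mod 13 == 5; reuses cell 6
```

If the removal had simply emptied cell 6, the later search for key 31 would have stopped there and wrongly reported NO_SUCH_KEY.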


Double Hashing

- Double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series
  (i + j d(k)) mod N  for j = 0, 1, ..., N − 1
- The secondary hash function d(k) cannot have zero values
- The table size N must be a prime to allow probing of all the cells
- Common choice of compression map for the secondary hash function:
  d2(k) = q − k mod q
  where
  - q < N
  - q is a prime
- The possible values for d2(k) are 1, 2, ..., q


Example of Double Hashing

- Consider a hash table storing integer keys that handles collisions with double hashing
  - N = 13
  - h(k) = k mod 13
  - d(k) = 7 − k mod 7
- Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

 k   h(k)  d(k)  Probes
18    5     3    5
41    2     1    2
22    9     6    9
44    5     5    5, 10
59    7     4    7
32    6     3    6
31    5     4    5, 9, 0
73    8     4    8

Cell:   0   1   2   3   4   5   6   7   8   9  10  11  12
Key:   31      41          18  32  59  73  22  44
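The probe sequences in the table can be reproduced with a short sketch. The function name `double_hash_insert` and the `None` convention for empty cells are assumptions; N, h(k) and d(k) are taken from the example.

```python
def double_hash_insert(table, key, q=7):
    """Insert key with h(k) = k mod N and d(k) = q - k mod q; return the probes."""
    n = len(table)
    i = key % n
    d = q - key % q                 # always in 1..q, never zero
    probes = []
    for j in range(n):
        cell = (i + j * d) % n
        probes.append(cell)
        if table[cell] is None:
            table[cell] = key
            return probes
    raise OverflowError("no available cell found")


table = [None] * 13
probes = {k: double_hash_insert(table, k) for k in [18, 41, 22, 44, 59, 32, 31, 73]}

print(probes[44])   # [5, 10]
print(probes[31])   # [5, 9, 0]
print(table)
# [31, None, 41, None, None, 18, 32, 59, 73, 22, 44, None, None]
```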


Performance of Hashing

- In the worst case, searches, insertions and removals on a hash table take O(n) time
- The worst case occurs when all the keys inserted into the dictionary collide
- The load factor α = n/N affects the performance of a hash table
- Assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is
  1 / (1 − α)
- The expected running time of all the dictionary ADT operations in a hash table is O(1)
- In practice, hashing is very fast provided the load factor is not close to 100%
- Applications of hash tables:
  - small databases
  - compilers
  - browser caches
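To get a feel for the formula 1 / (1 − α), the sketch below evaluates it at a few load factors. The function name and the sample table sizes are illustrative assumptions.

```python
def expected_insert_probes(n: int, capacity: int) -> float:
    """Expected probes for an insertion with open addressing: 1 / (1 - alpha)."""
    alpha = n / capacity            # load factor
    if not 0 <= alpha < 1:
        raise ValueError("load factor must be in [0, 1)")
    return 1 / (1 - alpha)


for n in (10, 50, 90, 99):
    print(n, round(expected_insert_probes(n, 100), 2))
```

The count stays small at moderate load factors and grows sharply as α approaches 1, which matches the advice to keep the load factor away from 100%.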


Universal Hashing (2.5.6)

A family of hash functions is universal if, for any 0



Proof of Universality (Part 2)

- If f causes no collisions, only g can make h cause collisions.
- Fix a number x. Of the p integers y = f(k), different from x, the number such that g(y) = g(x) is at most
  $\lceil p/N \rceil - 1$
- Since there are p choices for x, the number of functions h that will cause a collision between j and k is at most
  $p\,(\lceil p/N \rceil - 1) \le \frac{p(p-1)}{N}$
- There are p(p − 1) functions h. So the probability of collision is at most
  $\frac{p(p-1)/N}{p(p-1)} = \frac{1}{N}$
- Therefore, the set of possible h functions is universal.

Dictionaries 6/8/2002 2:01 PM


Dictionaries

[Figure: binary search tree storing keys 1, 2, 4, 6, 8, 9, with 6 at the root]


Outline and Reading

- Dictionary ADT (2.5.1)
- Log file (2.5.1)
- Binary search (3.1.1)
- Lookup table (3.1.1)
- Binary search tree (3.1.2)
  - Search (3.1.3)
  - Insertion (3.1.4)
  - Deletion (3.1.5)
  - Performance (3.1.6)


Dictionary ADT

- The dictionary ADT models a searchable collection of key-element items
- The main operations of a dictionary are searching, inserting, and deleting items
- Multiple items with the same key are allowed
- Applications:
  - address book
  - credit card authorization
  - mapping host names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)
- Dictionary ADT methods:
  - findElement(k): if the dictionary has an item with key k, returns its element, else, returns the special element NO_SUCH_KEY
  - insertItem(k, o): inserts item (k, o) into the dictionary
  - removeElement(k): if the dictionary has an item with key k, removes it from the dictionary and returns its element, else returns the special element NO_SUCH_KEY
  - size(), isEmpty()
  - keys(), elements()


Log File

- A log file is a dictionary implemented by means of an unsorted sequence
  - We store the items of the dictionary in a sequence (based on a doubly-linked list or a circular array), in arbitrary order
- Performance:
  - insertItem takes O(1) time since we can insert the new item at the beginning or at the end of the sequence
  - findElement and removeElement take O(n) time since in the worst case (the item is not found) we traverse the entire sequence to look for an item with the given key
- The log file is effective only for dictionaries of small size or for dictionaries on which insertions are the most common operations, while searches and removals are rarely performed (e.g., historical record of logins to a workstation)


Binary Search

- Binary search performs operation findElement(k) on a dictionary implemented by means of an array-based sequence, sorted by key
  - similar to the high-low game
  - at each step, the number of candidate items is halved
  - terminates after a logarithmic number of steps
- Example: findElement(7)

[Figure: four successive steps of binary search for key 7 in the sorted array 1 3 4 5 7 8 9 11 14 16 18 19, with markers l, m and h narrowing the range until l = m = h]
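The halving step can be sketched as follows. The variable names l, m and h follow the figure's markers; the function name, the returned index, and the recorded trace are assumptions added for illustration.

```python
def binary_search(keys, k):
    """Return (index of k or -1, list of (l, m, h) triples visited)."""
    l, h = 0, len(keys) - 1
    trace = []
    while l <= h:
        m = (l + h) // 2
        trace.append((l, m, h))
        if k == keys[m]:
            return m, trace
        if k < keys[m]:
            h = m - 1       # discard the upper half
        else:
            l = m + 1       # discard the lower half
    return -1, trace


keys = [1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19]
index, trace = binary_search(keys, 7)
print(index)   # 4
print(trace)   # [(0, 5, 11), (0, 2, 4), (3, 3, 4), (4, 4, 4)]
```

The search for 7 takes four steps on twelve keys and ends with l = m = h, in line with the four rows of the figure.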

Lookup Table

- A lookup table is a dictionary implemented by means of a sorted sequence
  - We store the items of the dictionary in an array-based sequence, sorted by key
  - We use an external comparator for the keys
- Performance:
  - findElement takes O(log n) time, using binary search
  - insertItem takes O(n) time since in the worst case we have to shift n/2 items to make room for the new item
  - removeElement takes O(n) time since in the worst case we have to shift n/2 items to compact the items after the removal
- The lookup table is effective only for dictionaries of small size or for dictionaries on which searches are the most common operations, while insertions and removals are rarely performed (e.g., credit card authorizations)



Binary Search Tree

- A binary search tree is a binary tree storing keys (or key-element pairs) at its internal nodes and satisfying the following property:
  - Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have key(u) ≤ key(v) ≤ key(w)
- External nodes do not store items
- An inorder traversal of a binary search tree visits the keys in increasing order

[Figure: binary search tree storing keys 1, 2, 4, 6, 8, 9, with 6 at the root]


Search

- To search for a key k, we trace a downward path starting at the root
- The next node visited depends on the outcome of the comparison of k with the key of the current node
- If we reach a leaf, the key is not found and we return NO_SUCH_KEY
- Example: findElement(4)

Algorithm findElement(k, v)
  if T.isExternal(v)
    return NO_SUCH_KEY
  if k < key(v)
    return findElement(k, T.leftChild(v))
  else if k = key(v)
    return element(v)
  else { k > key(v) }
    return findElement(k, T.rightChild(v))

[Figure: binary search tree with keys 1, 2, 4, 6, 8, 9, showing the search path for key 4]
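The recursive search translates almost line for line. In this sketch external nodes are represented by `None` rather than by explicit leaf objects, and the `Node` class and the exact shape of the sample tree are assumptions.

```python
NO_SUCH_KEY = object()


class Node:
    def __init__(self, key, element, left=None, right=None):
        self.key, self.element = key, element
        self.left, self.right = left, right


def find_element(k, v):
    """Search the subtree rooted at v for key k."""
    if v is None:                       # external node reached: key not found
        return NO_SUCH_KEY
    if k < v.key:
        return find_element(k, v.left)
    if k == v.key:
        return v.element
    return find_element(k, v.right)     # k > v.key


# Assumed shape: 6 at the root, 2 and 9 below it, then 1, 4 and 8.
root = Node(6, "six",
            Node(2, "two", Node(1, "one"), Node(4, "four")),
            Node(9, "nine", Node(8, "eight")))

print(find_element(4, root))                   # four
print(find_element(5, root) is NO_SUCH_KEY)    # True
```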


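Insertion

- To perform operation insertItem(k, o), we search for key k
- Assume k is not already in the tree, and let w be the leaf reached by the search
- We insert k at node w and expand w into an internal node
- Example: insert 5

[Figure: binary search tree with keys 1, 2, 4, 6, 8, 9 before and after inserting 5; the search ends at leaf w, which is expanded into an internal node storing 5]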
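Insertion can be sketched on the same representation. Here `None` plays the role of the leaf w, so "expanding w into an internal node" becomes replacing `None` with a new node; the `Node` class, function name and sample insertion order are assumptions.

```python
class Node:
    def __init__(self, key, element, left=None, right=None):
        self.key, self.element = key, element
        self.left, self.right = left, right


def insert_item(v, k, o):
    """Insert (k, o) into the subtree rooted at v and return the subtree root."""
    if v is None:                   # leaf w reached by the search
        return Node(k, o)           # expand it into an internal node
    if k < v.key:
        v.left = insert_item(v.left, k, o)
    else:
        v.right = insert_item(v.right, k, o)
    return v


root = None
for key in [6, 2, 9, 1, 4, 8]:      # builds a tree with 6 at the root
    root = insert_item(root, key, str(key))

root = insert_item(root, 5, "5")    # 5 < 6, 5 > 2, 5 > 4: lands right of 4
print(root.left.right.key, root.left.right.right.key)   # 4 5
```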


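Deletion

- To perform operation removeElement(k), we search for key k
- Assume key k is in the tree, and let v be the node storing k
- If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w)
- Example: remove 4

[Figure: binary search tree with keys 1, 2, 4, 5, 6, 8, 9 before and after removing 4; v is the node storing 4 and w is its leaf child, and after the removal 5 takes the place of 4]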


Deletion (cont.)

- We consider the case where the key k to be removed is stored at a node v whose children are both internal
  - we find the internal node w that follows v in an inorder traversal
  - we copy key(w) into node v
  - we remove node w and its left child z (which must be a leaf) by means of operation removeAboveExternal(z)
- Example: remove 3

[Figure: a tree storing keys 1, 2, 3, 5, 6, 8, 9 before and after removing 3; v is the node storing 3, w is its inorder successor storing 5, and z is the left leaf child of w; after the removal, v stores 5]
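Both deletion cases can be sketched together. As before, `None` stands for an external node, so removeAboveExternal becomes splicing a node out in favor of its other child; the `Node` class, the function names and the exact shape of the sample tree are assumptions.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def remove(v, k):
    """Remove key k (assumed present) from the subtree at v; return its new root."""
    if v is None:
        raise KeyError(k)
    if k < v.key:
        v.left = remove(v.left, k)
    elif k > v.key:
        v.right = remove(v.right, k)
    elif v.left is None:            # v has a leaf child: splice v out
        return v.right
    elif v.right is None:
        return v.left
    else:                           # both children are internal
        w = v.right                 # find the inorder successor w of v
        while w.left is not None:
            w = w.left
        v.key = w.key               # copy key(w) into v
        v.right = remove(v.right, w.key)   # w has no left child, so it is spliced out
    return v


def inorder(v):
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)


# Assumed shape: 1 at the root, 3 as its right child, 5 as the successor of 3.
root = Node(1, None, Node(3, Node(2), Node(8, Node(6, Node(5)), Node(9))))
root = remove(root, 3)
print(root.right.key)   # 5
print(inorder(root))    # [1, 2, 5, 6, 8, 9]
```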


Performance

- Consider a dictionary with n items implemented by means of a binary search tree of height h
  - the space used is O(n)
  - methods findElement, insertItem and removeElement take O(h) time
- The height h is O(n) in the worst case and O(log n) in the best case


Binary Search Trees

[Figure: binary search tree storing keys 1, 2, 4, 6, 8, 9, with 6 at the root]


Ordered Dictionaries

- Keys are assumed to come from a total order.
- New operations:
  - closestKeyBefore(k)
  - closestElemBefore(k)
  - closestKeyAfter(k)
  - closestElemAfter(k)
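The slide only names these operations, so the sketch below rests on an assumption about their meaning: closestKeyBefore(k) is taken to return the largest key less than or equal to k, and closestKeyAfter(k) the smallest key greater than or equal to k. It uses a sorted Python list and the standard `bisect` module; the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

NO_SUCH_KEY = object()


def closest_key_before(keys, k):
    """Largest key <= k in the sorted list keys (assumed semantics)."""
    i = bisect_right(keys, k)       # number of keys that are <= k
    return keys[i - 1] if i > 0 else NO_SUCH_KEY


def closest_key_after(keys, k):
    """Smallest key >= k in the sorted list keys (assumed semantics)."""
    i = bisect_left(keys, k)        # index of the first key that is >= k
    return keys[i] if i < len(keys) else NO_SUCH_KEY


keys = [1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19]
print(closest_key_before(keys, 6))    # 5
print(closest_key_after(keys, 10))    # 11
```

The element-returning variants would follow the same pattern with a parallel list of elements.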


Binary Search (3.1.1)

- Binary search performs operation findElement(k) on a dictionary implemented by means of an array-based sequence, sorted by key
  - similar to the high-low game
  - at each step, the number of candidate items is halved
  - terminates after O(log n) steps
- Example: findElement(7)

[Figure: four successive steps of binary search for key 7 in the sorted array 1 3 4 5 7 8 9 11 14 16 18 19, with markers l, m and h narrowing the range until l = m = h]

Lookup Table (3.1.1)

- A lookup table is a dictionary implemented by means of a sorted sequence
  - We store the items of the dictionary in an array-based sequence, sorted by key
  - We use an external comparator for the keys
- Performance:
  - findElement takes O(log n) time, using binary search
  - insertItem takes O(n) time since in the worst case we have to shift n/2 items to make room for the new item
  - removeElement takes O(n) time since in the worst case we have to shift n/2 items to compact the items after the removal
- The lookup table is effective only for dictionaries of small size or for dictionaries on which searches are the most common operations, while insertions and removals are rarely performed (e.g., credit card authorizations)


Binary Search Tree (3.1.2)

- A binary search tree is a binary tree storing keys (or key-element pairs) at its internal nodes and satisfying the following property:
  - Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have key(u) ≤ key(v) ≤ key(w)
- External nodes do not store items
- An inorder traversal of a binary search tree visits the keys in increasing order

[Figure: binary search tree storing keys 1, 2, 4, 6, 8, 9, with 6 at the root]
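The inorder property is easy to check on a small tree. This is a sketch under assumptions: the `Node` class, the generator `inorder`, and the exact shape of the tree (6 at the root, 2 and 9 below it, then 1, 4 and 8) are illustrative, and `None` stands for an external node.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def inorder(v):
    """Yield the keys of the subtree at v: left subtree, then v, then right subtree."""
    if v is None:           # external node: nothing stored here
        return
    yield from inorder(v.left)
    yield v.key
    yield from inorder(v.right)


root = Node(6, Node(2, Node(1), Node(4)), Node(9, Node(8)))
keys = list(inorder(root))
print(keys)   # [1, 2, 4, 6, 8, 9]
```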


Search (3.1.3)

- To search for a key k, we trace a downward path starting at the root
- The next node visited depends on the outcome of the comparison of k with the key of the current node
- If we reach a leaf, the key is not found and we return NO_SUCH_KEY
- Example: findElement(4)

Algorithm findElement(k, v)
  if T.isExternal(v)
    return NO_SUCH_KEY
  if k < key(v)
    return findElement(k, T.leftChild(v))
  else if k = key(v)
    return element(v)
  else { k > key(v) }
    return findElement(k, T.rightChild(v))

[Figure: binary search tree with keys 1, 2, 4, 6, 8, 9, showing the search path for key 4]



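Insertion (3.1.4)

- To perform operation insertItem(k, o), we search for key k
- Assume k is not already in the tree, and let w be the leaf reached by the search
- We insert k at node w and expand w into an internal node
- Example: insert 5

[Figure: binary search tree with keys 1, 2, 4, 6, 8, 9 before and after inserting 5; the search ends at leaf w, which is expanded into an internal node storing 5]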


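Deletion (3.1.5)

- To perform operation removeElement(k), we search for key k
- Assume key k is in the tree, and let v be the node storing k
- If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w)
- Example: remove 4

[Figure: binary search tree with keys 1, 2, 4, 5, 6, 8, 9 before and after removing 4; v is the node storing 4 and w is its leaf child, and after the removal 5 takes the place of 4]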


Deletion (cont.)

- We consider the case where the key k to be removed is stored at a node v whose children are both internal
  - we find the internal node w

that fol