Analysis of Algorithms
Input → Algorithm → Output
An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.
Analysis of Algorithms 2
Running Time (1.1)
- Most algorithms transform input objects into output objects.
- The running time of an algorithm typically grows with the input size.
- Average-case time is often difficult to determine.
- We focus on the worst-case running time.
  - Easier to analyze
  - Crucial to applications such as games, finance and robotics
[Chart: best-case, average-case and worst-case running time versus input size]
Analysis of Algorithms 3
Experimental Studies (1.6)
- Write a program implementing the algorithm
- Run the program with inputs of varying size and composition
- Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time
- Plot the results
[Chart: measured running time in milliseconds versus input size]
Analysis of Algorithms 4
Limitations of Experiments
- It is necessary to implement the algorithm, which may be difficult
- Results may not be indicative of the running time on other inputs not included in the experiment
- In order to compare two algorithms, the same hardware and software environments must be used
Analysis of Algorithms 5
Theoretical Analysis
- Uses a high-level description of the algorithm instead of an implementation
- Characterizes running time as a function of the input size, n
- Takes into account all possible inputs
- Allows us to evaluate the speed of an algorithm independent of the hardware/software environment
Analysis of Algorithms 6
Pseudocode (1.1)
- High-level description of an algorithm
- More structured than English prose
- Less detailed than a program
- Preferred notation for describing algorithms
- Hides program design issues
Example: find the max element of an array

Algorithm arrayMax(A, n)
  Input: array A of n integers
  Output: maximum element of A
  currentMax ← A[0]
  for i ← 1 to n − 1 do
    if A[i] > currentMax then
      currentMax ← A[i]
  return currentMax
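A minimal Java sketch of arrayMax, assuming the input is a plain int[] (the signature is illustrative, not from the slides):

/** Returns the maximum element of a non-empty array, scanning left to right. */
public static int arrayMax(int[] a) {
  int currentMax = a[0];              // best candidate seen so far
  for (int i = 1; i < a.length; i++) {
    if (a[i] > currentMax)
      currentMax = a[i];              // found a larger element
  }
  return currentMax;
}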
Analysis of Algorithms 7
Pseudocode Details
- Control flow
  - if … then … [else …]
  - while … do …
  - repeat … until …
  - for … do …
  - Indentation replaces braces
- Method declaration
  Algorithm method(arg [, arg…])
    Input …
    Output …
- Method call
  var.method(arg [, arg…])
- Return value
  return expression
- Expressions
  ← assignment (like = in Java)
  = equality testing (like == in Java)
  n² superscripts and other mathematical formatting allowed
Analysis of Algorithms 8
The Random Access Machine (RAM) Model
- A CPU
- A potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character
- Memory cells are numbered, and accessing any cell in memory takes unit time
Analysis of Algorithms 9
Primitive Operations
- Basic computations performed by an algorithm
- Identifiable in pseudocode
- Largely independent from the programming language
- Exact definition not important (we will see why later)
- Assumed to take a constant amount of time in the RAM model
- Examples:
  - Evaluating an expression
  - Assigning a value to a variable
  - Indexing into an array
  - Calling a method
  - Returning from a method
Analysis of Algorithms 10
Counting Primitive Operations (1.1)
By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size.

Algorithm arrayMax(A, n)              # operations
  currentMax ← A[0]                    2
  for i ← 1 to n − 1 do                2 + n
    if A[i] > currentMax then          2(n − 1)
      currentMax ← A[i]                2(n − 1)
    { increment counter i }            2(n − 1)
  return currentMax                    1
                              Total    7n − 1
Analysis of Algorithms 11
Estimating Running Time
- Algorithm arrayMax executes 7n − 1 primitive operations in the worst case. Define:
  a = time taken by the fastest primitive operation
  b = time taken by the slowest primitive operation
- Let T(n) be the worst-case time of arrayMax. Then
  a(7n − 1) ≤ T(n) ≤ b(7n − 1)
- Hence, the running time T(n) is bounded by two linear functions
Analysis of Algorithms 12
Growth Rate of Running Time
- Changing the hardware/software environment
  - affects T(n) by a constant factor, but
  - does not alter the growth rate of T(n)
- The linear growth rate of the running time T(n) is an intrinsic property of algorithm arrayMax
Analysis of Algorithms 13
Growth Rates
- Growth rates of functions:
  - Linear ≈ n
  - Quadratic ≈ n²
  - Cubic ≈ n³
- In a log-log chart, the slope of the line corresponds to the growth rate of the function
[Log-log chart of T(n) versus n for linear, quadratic and cubic growth]
Analysis of Algorithms 14
Constant Factors
- The growth rate is not affected by
  - constant factors or
  - lower-order terms
- Examples
  - 10²n + 10⁵ is a linear function
  - 10⁵n² + 10⁸n is a quadratic function
[Log-log chart comparing the linear and quadratic examples]
Analysis of Algorithms 15
Big-Oh Notation (1.2)
- Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that
  f(n) ≤ c·g(n) for n ≥ n0
- Example: 2n + 10 is O(n)
  - 2n + 10 ≤ cn
  - (c − 2)n ≥ 10
  - n ≥ 10/(c − 2)
  - Pick c = 3 and n0 = 10
[Log-log chart of n, 2n + 10 and 3n]
Analysis of Algorithms 16
Big-Oh Example
- Example: the function n² is not O(n)
  - n² ≤ cn
  - n ≤ c
  - The above inequality cannot be satisfied, since c must be a constant
[Log-log chart of n, 10n, 100n and n²]
Analysis of Algorithms 17
More Big-Oh Examples
- 7n − 2
  7n − 2 is O(n)
  need c > 0 and n0 ≥ 1 such that 7n − 2 ≤ cn for n ≥ n0
  this is true for c = 7 and n0 = 1
- 3n³ + 20n² + 5
  3n³ + 20n² + 5 is O(n³)
  need c > 0 and n0 ≥ 1 such that 3n³ + 20n² + 5 ≤ cn³ for n ≥ n0
  this is true for c = 4 and n0 = 21
- 3 log n + log log n
  3 log n + log log n is O(log n)
  need c > 0 and n0 ≥ 1 such that 3 log n + log log n ≤ c·log n for n ≥ n0
  this is true for c = 4 and n0 = 2
Analysis of Algorithms 18
Big-Oh and Growth Rate
- The big-Oh notation gives an upper bound on the growth rate of a function
- The statement "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the growth rate of g(n)
- We can use the big-Oh notation to rank functions according to their growth rate

                    f(n) is O(g(n))    g(n) is O(f(n))
g(n) grows more     Yes                No
f(n) grows more     No                 Yes
Same growth         Yes                Yes
Analysis of Algorithms 19
Big-Oh Rules
- If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e.,
  - drop lower-order terms
  - drop constant factors
- Use the smallest possible class of functions
  - say "2n is O(n)" instead of "2n is O(n²)"
- Use the simplest expression of the class
  - say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"
Analysis of Algorithms 20
Asymptotic Algorithm Analysis
- The asymptotic analysis of an algorithm determines the running time in big-Oh notation
- To perform the asymptotic analysis
  - We find the worst-case number of primitive operations executed as a function of the input size
  - We express this function with big-Oh notation
- Example:
  - We determine that algorithm arrayMax executes at most 7n − 1 primitive operations
  - We say that algorithm arrayMax "runs in O(n) time"
- Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations
Analysis of Algorithms 21
Computing Prefix Averages
- We further illustrate asymptotic analysis with two algorithms for prefix averages
- The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
  A[i] = (X[0] + X[1] + … + X[i]) / (i + 1)
- Computing the array A of prefix averages of another array X has applications to financial analysis
[Chart: example arrays X and A plotted side by side]
Analysis of Algorithms 22
Prefix Averages (Quadratic)
The following algorithm computes prefix averages in quadratic time by applying the definition.

Algorithm prefixAverages1(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X      # operations
  A ← new array of n integers                   n
  for i ← 0 to n − 1 do                         n
    s ← X[0]                                    n
    for j ← 1 to i do                           1 + 2 + … + (n − 1)
      s ← s + X[j]                              1 + 2 + … + (n − 1)
    A[i] ← s / (i + 1)                          n
  return A                                      1
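A hedged Java sketch of prefixAverages1, assuming an int[] input and double[] output (names are illustrative); it mirrors the quadratic pseudocode by recomputing each prefix sum from scratch:

/** Quadratic-time prefix averages: the inner loop rebuilds the sum of x[0..i] for every i. */
public static double[] prefixAverages1(int[] x) {
  int n = x.length;
  double[] a = new double[n];
  for (int i = 0; i < n; i++) {
    double s = 0;                     // sum of x[0..i], rebuilt from scratch
    for (int j = 0; j <= i; j++)
      s += x[j];
    a[i] = s / (i + 1);
  }
  return a;
}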
Analysis of Algorithms 23
Arithmetic Progression
- The running time of prefixAverages1 is O(1 + 2 + … + n)
- The sum of the first n integers is n(n + 1)/2
  - There is a simple visual proof of this fact
- Thus, algorithm prefixAverages1 runs in O(n²) time
[Chart: visual proof that 1 + 2 + … + n = n(n + 1)/2]
Analysis of Algorithms 24
Prefix Averages (Linear)
The following algorithm computes prefix averages in linear time by keeping a running sum.

Algorithm prefixAverages2(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X      # operations
  A ← new array of n integers                   n
  s ← 0                                         1
  for i ← 0 to n − 1 do                         n
    s ← s + X[i]                                n
    A[i] ← s / (i + 1)                          n
  return A                                      1

Algorithm prefixAverages2 runs in O(n) time
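The corresponding linear-time sketch in Java, under the same assumptions; the running sum s is carried across iterations instead of being recomputed:

/** Linear-time prefix averages: maintain a running sum across iterations. */
public static double[] prefixAverages2(int[] x) {
  int n = x.length;
  double[] a = new double[n];
  double s = 0;                       // running sum of x[0..i]
  for (int i = 0; i < n; i++) {
    s += x[i];
    a[i] = s / (i + 1);
  }
  return a;
}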
Analysis of Algorithms 25
Math you need to Review
- Summations (Sec. 1.3.1)
- Logarithms and Exponents (Sec. 1.3.2)
  properties of logarithms:
    log_b(xy) = log_b x + log_b y
    log_b(x/y) = log_b x − log_b y
    log_b(x^a) = a·log_b x
    log_b a = log_x a / log_x b
  properties of exponentials:
    a^(b+c) = a^b · a^c
    a^(bc) = (a^b)^c
    a^b / a^c = a^(b−c)
    b = a^(log_a b)
    b^c = a^(c·log_a b)
- Proof techniques (Sec. 1.3.3)
- Basic probability (Sec. 1.3.4)
Analysis of Algorithms 26
Relatives of Big-Oh
- big-Omega
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
- big-Theta
  f(n) is Θ(g(n)) if there are constants c′ > 0 and c″ > 0 and an integer constant n0 ≥ 1 such that c′·g(n) ≤ f(n) ≤ c″·g(n) for n ≥ n0
- little-oh
  f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≤ c·g(n) for n ≥ n0
- little-omega
  f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n0
Analysis of Algorithms 27
Intuition for Asymptotic Notation
- Big-Oh: f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
- big-Omega: f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
- big-Theta: f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
- little-oh: f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
- little-omega: f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n)
Analysis of Algorithms 28
Example Uses of the Relatives of Big-Oh
- 5n² is Ω(n²)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  let c = 5 and n0 = 1
- 5n² is Ω(n)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  let c = 1 and n0 = 1
- 5n² is ω(n)
  f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n0
  need 5n0² ≥ c·n0; given c, the n0 that satisfies this is n0 ≥ c/5 ≥ 0
Elementary Data Structures
Stacks, Queues, & Lists
Amortized analysis
Trees
Elementary Data Structures 2
The Stack ADT (2.1.1)
- The Stack ADT stores arbitrary objects
- Insertions and deletions follow the last-in first-out scheme
- Think of a spring-loaded plate dispenser
- Main stack operations:
  - push(object): inserts an element
  - object pop(): removes and returns the last inserted element
- Auxiliary stack operations:
  - object top(): returns the last inserted element without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored
Elementary Data Structures 3
Applications of Stacks
- Direct applications
  - Page-visited history in a Web browser
  - Undo sequence in a text editor
  - Chain of method calls in the Java Virtual Machine or C++ runtime environment
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures
Elementary Data Structures 4
Array-based Stack (2.1.1)
- A simple way of implementing the Stack ADT uses an array
- We add elements from left to right
- A variable t keeps track of the index of the top element (size is t + 1)
[Figure: array S with elements in cells 0 … t]

Algorithm pop():
  if isEmpty() then
    throw EmptyStackException
  else
    t ← t − 1
    return S[t + 1]

Algorithm push(o)
  if t = S.length − 1 then
    throw FullStackException
  else
    t ← t + 1
    S[t] ← o
Elementary Data Structures 5
Growable Array-based Stack (1.5)
- In a push operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
- How large should the new array be?
  - incremental strategy: increase the size by a constant c
  - doubling strategy: double the size

Algorithm push(o)
  if t = S.length − 1 then
    A ← new array of size …
    for i ← 0 to t do
      A[i] ← S[i]
    S ← A
  t ← t + 1
  S[t] ← o
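A small Java sketch of the doubling strategy for push, assuming an Object[] array S and top index t as in the slides (class and field names are illustrative):

/** Growable array-based stack using the doubling strategy. */
public class GrowableArrayStack {
  private Object[] S = new Object[1];   // start with capacity 1
  private int t = -1;                   // index of the top element

  public void push(Object o) {
    if (t == S.length - 1) {            // array full: double it instead of throwing
      Object[] A = new Object[2 * S.length];
      for (int i = 0; i <= t; i++)
        A[i] = S[i];
      S = A;
    }
    S[++t] = o;
  }
}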
Elementary Data Structures 6
Comparison of the Strategies
- We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of n push operations
- We assume that we start with an empty stack represented by an array of size 1
- We call amortized time of a push operation the average time taken by a push over the series of operations, i.e., T(n)/n
Elementary Data Structures 7
Analysis of the Incremental Strategy
- We replace the array k = n/c times
- The total time T(n) of a series of n push operations is proportional to
  n + c + 2c + 3c + 4c + … + kc = n + c(1 + 2 + 3 + … + k) = n + ck(k + 1)/2
- Since c is a constant, T(n) is O(n + k²), i.e., O(n²)
- The amortized time of a push operation is O(n)
Elementary Data Structures 8
Direct Analysis of the Doubling Strategy
- We replace the array k = log2 n times
- The total time T(n) of a series of n push operations is proportional to
  n + 1 + 2 + 4 + 8 + … + 2^k = n + 2^(k+1) − 1 = 3n − 1  (a geometric series)
- T(n) is O(n)
- The amortized time of a push operation is O(1)
[Figure: geometric series 1, 2, 4, 8, …]
Elementary Data Structures 9
Accounting Method Analysis of the Doubling Strategy
- The accounting method determines the amortized running time with a system of credits and debits
- We view a computer as a coin-operated device requiring 1 cyber-dollar for a constant amount of computing
- We set up a scheme for charging operations. This is known as an amortization scheme
- The scheme must always give us enough money to pay for the actual cost of the operation
- The total cost of the series of operations is no more than the total amount charged
- (amortized time) ≤ (total $ charged) / (# operations)
Elementary Data Structures 10
Amortization Scheme for the Doubling Strategy
- Consider again the k phases, where each phase consists of twice as many pushes as the one before
- At the end of a phase we must have saved enough to pay for the array-growing push of the next phase
- At the end of phase i we want to have saved i cyber-dollars, to pay for the array growth at the beginning of the next phase
- We charge $3 for a push. The $2 saved for a regular push are stored in the second half of the array. Thus, we will have 2(i/2) = i cyber-dollars saved at the end of phase i
- Therefore, each push runs in O(1) amortized time; n pushes run in O(n) time
[Figure: cyber-dollars stored in the second half of the array across phases]
Elementary Data Structures 11
The Queue ADT (2.1.2)
- The Queue ADT stores arbitrary objects
- Insertions and deletions follow the first-in first-out scheme
- Insertions are at the rear of the queue and removals are at the front of the queue
- Main queue operations:
  - enqueue(object): inserts an element at the end of the queue
  - object dequeue(): removes and returns the element at the front of the queue
- Auxiliary queue operations:
  - object front(): returns the element at the front without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored
- Exceptions
  - Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException
Elementary Data Structures 12
Applications of Queues
- Direct applications
  - Waiting lines
  - Access to shared resources (e.g., printer)
  - Multiprogramming
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures
Elementary Data Structures 13
Singly Linked List
- A singly linked list is a concrete data structure consisting of a sequence of nodes
- Each node stores
  - element
  - link to the next node
[Figure: nodes A → B → C → D, each with an element and a next link]
Elementary Data Structures 14
Queue with a Singly Linked List
- We can implement a queue with a singly linked list
  - The front element is stored at the first node
  - The rear element is stored at the last node
- The space used is O(n) and each operation of the Queue ADT takes O(1) time
[Figure: linked-list queue with front pointer f and rear pointer r]
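A hedged Java sketch of a queue on a singly linked list, with front (f) and rear (r) node references as in the figure; the node class and exception choice are illustrative:

/** Queue backed by a singly linked list: enqueue at the rear node, dequeue at the front node. */
public class LinkedQueue<E> {
  private static class Node<E> { E elem; Node<E> next; }
  private Node<E> f, r;                 // front and rear nodes
  private int size;

  public void enqueue(E e) {            // O(1): append after the rear node
    Node<E> node = new Node<E>();
    node.elem = e;
    if (r == null) f = node; else r.next = node;
    r = node;
    size++;
  }

  public E dequeue() {                  // O(1): unlink the front node
    if (f == null) throw new IllegalStateException("empty queue");
    E e = f.elem;
    f = f.next;
    if (f == null) r = null;
    size--;
    return e;
  }

  public int size() { return size; }
}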
Elementary Data Structures 15
List ADT (2.2.2)
- The List ADT models a sequence of positions storing arbitrary objects
- It allows for insertion and removal in the middle
- Query methods: isFirst(p), isLast(p)
- Accessor methods: first(), last(), before(p), after(p)
- Update methods: replaceElement(p, o), swapElements(p, q), insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o), remove(p)
Elementary Data Structures 16
Doubly Linked List
- A doubly linked list provides a natural implementation of the List ADT
- Nodes implement Position and store:
  - element
  - link to the previous node
  - link to the next node
- Special trailer and header nodes
[Figure: header and trailer sentinel nodes with element-bearing nodes/positions in between, linked by prev and next pointers]
Elementary Data Structures 17
Trees (2.3)
- In computer science, a tree is an abstract model of a hierarchical structure
- A tree consists of nodes with a parent-child relation
- Applications:
  - Organization charts
  - File systems
  - Programming environments
[Figure: organization chart rooted at ComputersRUs with children Sales, Manufacturing and R&D, and further subdivisions (US, International, Europe, Asia, Canada, Laptops, Desktops)]
Elementary Data Structures 18
Tree ADT (2.3.1)
- We use positions to abstract nodes
- Generic methods: integer size(), boolean isEmpty(), objectIterator elements(), positionIterator positions()
- Accessor methods: position root(), position parent(p), positionIterator children(p)
- Query methods: boolean isInternal(p), boolean isExternal(p), boolean isRoot(p)
- Update methods: swapElements(p, q), object replaceElement(p, o)
- Additional update methods may be defined by data structures implementing the Tree ADT
Elementary Data Structures 19
Preorder Traversal (2.3.2)
- A traversal visits the nodes of a tree in a systematic manner
- In a preorder traversal, a node is visited before its descendants
- Application: print a structured document
[Figure: document outline "Make Money Fast!" with sections 1. Motivations (1.1 Greed, 1.2 Avidity), 2. Methods (2.1 Stock Fraud, 2.2 Ponzi Scheme, 2.3 Bank Robbery) and References, labeled with preorder visit numbers 1–9]

Algorithm preOrder(v)
  visit(v)
  for each child w of v
    preOrder(w)
Elementary Data Structures 20
Postorder Traversal (2.3.2)
- In a postorder traversal, a node is visited after its descendants
- Application: compute space used by files in a directory and its subdirectories

Algorithm postOrder(v)
  for each child w of v
    postOrder(w)
  visit(v)

[Figure: directory cs16/ containing homeworks/ (h1c.doc 3K, h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K, Robot.java 20K) and todo.txt 1K, labeled with postorder visit numbers 1–9]
Elementary Data Structures 21
Amortized Analysis of Tree Traversal
- Time taken in preorder or postorder traversal of an n-node tree is proportional to the sum, taken over each node v in the tree, of the time needed for the recursive call for v
  - The call for v costs $(c_v + 1), where c_v is the number of children of v
- For the call for v, charge one cyber-dollar to v and charge one cyber-dollar to each child of v
- Each node (except the root) gets charged twice: once for its own call and once for its parent's call
- Therefore, traversal time is O(n)
Elementary Data Structures 22
Binary Trees (2.3.3)
- A binary tree is a tree with the following properties:
  - Each internal node has two children
  - The children of a node are an ordered pair
- We call the children of an internal node left child and right child
- Alternative recursive definition: a binary tree is either
  - a tree consisting of a single node, or
  - a tree whose root has an ordered pair of children, each of which is a binary tree
- Applications:
  - arithmetic expressions
  - decision processes
  - searching
[Figure: example binary tree with root A and descendants B, C, D, E, F, G, H, I]
Elementary Data Structures 23
Arithmetic Expression Tree
- Binary tree associated with an arithmetic expression
  - internal nodes: operators
  - external nodes: operands
- Example: arithmetic expression tree for the expression (2 × (a − 1) + (3 × b))
[Figure: expression tree with + at the root and leaves 2, a, 1, 3, b]
Elementary Data Structures 24
Decision Tree
- Binary tree associated with a decision process
  - internal nodes: questions with yes/no answer
  - external nodes: decisions
- Example: dining decision
[Figure: root "Want a fast meal?"; Yes leads to "How about coffee?" (Yes: Starbucks, No: In N Out), No leads to "On expense account?" (Yes: Antoine's, No: Denny's)]
Elementary Data Structures 25
Properties of Binary Trees
- Notation
  n  number of nodes
  e  number of external nodes
  i  number of internal nodes
  h  height
- Properties:
  e = i + 1
  n = 2e − 1
  h ≤ i
  h ≤ (n − 1)/2
  e ≤ 2^h
  h ≥ log2 e
  h ≥ log2 (n + 1) − 1
Elementary Data Structures 26
Inorder Traversal
- In an inorder traversal a node is visited after its left subtree and before its right subtree
- Application: draw a binary tree
  - x(v) = inorder rank of v
  - y(v) = depth of v

Algorithm inOrder(v)
  if isInternal(v)
    inOrder(leftChild(v))
  visit(v)
  if isInternal(v)
    inOrder(rightChild(v))

[Figure: binary tree with nodes labeled 1–9 by inorder rank]
Elementary Data Structures 27
Euler Tour Traversal
- Generic traversal of a binary tree
- Includes as special cases the preorder, postorder and inorder traversals
- Walk around the tree and visit each node three times:
  - on the left (preorder)
  - from below (inorder)
  - on the right (postorder)
[Figure: Euler tour walk around an arithmetic expression tree, with L, B and R visit markers]
Elementary Data Structures 28
Printing Arithmetic Expressions
- Specialization of an inorder traversal
  - print operand or operator when visiting node
  - print "(" before traversing left subtree
  - print ")" after traversing right subtree

Algorithm printExpression(v)
  if isInternal(v)
    print("(")
    inOrder(leftChild(v))
  print(v.element())
  if isInternal(v)
    inOrder(rightChild(v))
    print(")")

Example output: ((2 × (a − 1)) + (3 × b))
Elementary Data Structures 29
Linked Data Structure for Representing Trees (2.3.4)
- A node is represented by an object storing
  - Element
  - Parent node
  - Sequence of children nodes
- Node objects implement the Position ADT
[Figure: linked representation of a tree with root B and children A, D, F]
Elementary Data Structures 30
Linked Data Structure for Binary Trees
- A node is represented by an object storing
  - Element
  - Parent node
  - Left child node
  - Right child node
- Node objects implement the Position ADT
[Figure: linked representation of a binary tree with root B and children A, D]
Elementary Data Structures 31
Array-Based Representation of Binary Trees
- Nodes are stored in an array
- Let rank(node) be defined as follows:
  - rank(root) = 1
  - if node is the left child of parent(node), rank(node) = 2·rank(parent(node))
  - if node is the right child of parent(node), rank(node) = 2·rank(parent(node)) + 1
[Figure: example binary tree and its array layout, with node ranks 1–11]
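The rank rules above translate directly into index arithmetic; a small Java sketch (names illustrative):

/** Rank arithmetic for an array-based binary tree with rank(root) = 1. */
public final class TreeRanks {
  static int leftChild(int rank)  { return 2 * rank; }       // left child of a node
  static int rightChild(int rank) { return 2 * rank + 1; }   // right child of a node
  static int parent(int rank)     { return rank / 2; }       // integer division recovers the parent
}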
Stacks
Stacks 2
Outline and Reading
The Stack ADT (2.1.1)
Applications of Stacks (2.1.1)
Array-based implementation (2.1.1)
Growable array-based stack (1.5)
Stacks 3
Abstract Data Types (ADTs)
- An abstract data type (ADT) is an abstraction of a data structure
- An ADT specifies:
  - Data stored
  - Operations on the data
  - Error conditions associated with operations
- Example: ADT modeling a simple stock trading system
  - The data stored are buy/sell orders
  - The operations supported are
    order buy(stock, shares, price)
    order sell(stock, shares, price)
    void cancel(order)
  - Error conditions:
    - Buy/sell a nonexistent stock
    - Cancel a nonexistent order
Stacks 4
The Stack ADTThe Stack ADT stores arbitrary objectsInsertions and deletions follow the last-in first-out schemeThink of a spring-loaded plate dispenserMain stack operations: push(object): inserts an
element object pop(): removes and
returns the last inserted element
Auxiliary stack operations: object top(): returns the
last inserted element without removing it
integer size(): returns the number of elements stored
boolean isEmpty(): indicates whether no elements are stored
Stacks 5
ExceptionsAttempting the execution of an operation of ADT may sometimes cause an error condition, called an exceptionExceptions are said to be thrown by an operation that cannot be executed
In the Stack ADT, operations pop and top cannot be performed if the stack is emptyAttempting the execution of pop or top on an empty stack throws an EmptyStackException
Stacks 6
Applications of Stacks
Direct applications Page-visited history in a Web browser Undo sequence in a text editor Chain of method calls in the Java Virtual
MachineIndirect applications Auxiliary data structure for algorithms Component of other data structures
Stacks 7
Method Stack in the JVM
- The Java Virtual Machine (JVM) keeps track of the chain of active methods with a stack
- When a method is called, the JVM pushes on the stack a frame containing
  - Local variables and return value
  - Program counter, keeping track of the statement being executed
- When a method ends, its frame is popped from the stack and control is passed to the method on top of the stack

main() {
  int i = 5;
  foo(i);
}

foo(int j) {
  int k;
  k = j + 1;
  bar(k);
}

bar(int m) {
}

[Figure: method stack with frames bar (PC = 1, m = 6), foo (PC = 3, j = 5, k = 6), main (PC = 2, i = 5)]
Stacks 8
Array-based Stack
- A simple way of implementing the Stack ADT uses an array
- We add elements from left to right
- A variable keeps track of the index of the top element
[Figure: array S with elements in cells 0 … t]

Algorithm size()
  return t + 1

Algorithm pop()
  if isEmpty() then
    throw EmptyStackException
  else
    t ← t − 1
    return S[t + 1]
Stacks 9
Array-based Stack (cont.)
- The array storing the stack elements may become full
- A push operation will then throw a FullStackException
  - Limitation of the array-based implementation
  - Not intrinsic to the Stack ADT
[Figure: array S with elements in cells 0 … t]

Algorithm push(o)
  if t = S.length − 1 then
    throw FullStackException
  else
    t ← t + 1
    S[t] ← o
Stacks 10
Performance and Limitations
- Performance
  - Let n be the number of elements in the stack
  - The space used is O(n)
  - Each operation runs in time O(1)
- Limitations
  - The maximum size of the stack must be defined a priori and cannot be changed
  - Trying to push a new element into a full stack causes an implementation-specific exception
Stacks 11
Computing Spans
- We show how to use a stack as an auxiliary data structure in an algorithm
- Given an array X, the span S[i] of X[i] is the maximum number of consecutive elements X[j] immediately preceding X[i] and such that X[j] ≤ X[i]
- Spans have applications to financial analysis
  - E.g., stock at 52-week high
[Figure: example arrays X = 6, 3, 4, 5, 2 and S = 1, 1, 2, 3, 1]
Stacks 12
Quadratic Algorithm

Algorithm spans1(X, n)
  Input: array X of n integers
  Output: array S of spans of X                #
  S ← new array of n integers                   n
  for i ← 0 to n − 1 do                         n
    s ← 1                                       n
    while s ≤ i ∧ X[i − s] ≤ X[i]               1 + 2 + … + (n − 1)
      s ← s + 1                                 1 + 2 + … + (n − 1)
    S[i] ← s                                    n
  return S                                      1

Algorithm spans1 runs in O(n²) time
Stacks 13
Computing Spans with a Stack
- We keep in a stack the indices of the elements visible when looking back
- We scan the array from left to right
  - Let i be the current index
  - We pop indices from the stack until we find index j such that X[i] < X[j]
  - We set S[i] ← i − j
  - We push x onto the stack
[Figure: indices 0–7 of the array and the stack of visible indices]
Stacks 14
Linear Algorithm

Algorithm spans2(X, n)                            #
  S ← new array of n integers                     n
  A ← new empty stack                              1
  for i ← 0 to n − 1 do                            n
    while (¬A.isEmpty() ∧ X[A.top()] ≤ X[i]) do    n
      j ← A.pop()                                  n
    if A.isEmpty() then                            n
      S[i] ← i + 1                                 n
    else
      S[i] ← i − j                                 n
    A.push(i)                                      n
  return S                                         1

- Each index of the array
  - Is pushed into the stack exactly once
  - Is popped from the stack at most once
- The statements in the while-loop are executed at most n times
- Algorithm spans2 runs in O(n) time
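A hedged Java sketch of the linear-time spans algorithm, using java.util.ArrayDeque as the stack of indices (the slide's own pseudocode is the reference; this is just one way to code it):

import java.util.ArrayDeque;
import java.util.Deque;

public class Spans {
  /** Linear-time spans: the stack holds indices of elements still visible when looking back. */
  public static int[] spans2(int[] x) {
    int n = x.length;
    int[] s = new int[n];
    Deque<Integer> stack = new ArrayDeque<Integer>();   // stores indices into x
    for (int i = 0; i < n; i++) {
      while (!stack.isEmpty() && x[stack.peek()] <= x[i])
        stack.pop();                                     // pop indices dominated by x[i]
      s[i] = stack.isEmpty() ? i + 1 : i - stack.peek();
      stack.push(i);                                     // i is now the most recent visible index
    }
    return s;
  }
}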
Stacks 15
Growable Array-based StackIn a push operation, when the array is full, instead of throwing an exception, we can replace the array with a larger oneHow large should the new array be? incremental strategy:
increase the size by a constant c
doubling strategy: double the size
Algorithm push(o)if t = S.length 1 then
A new array ofsize
for i 0 to t doA[i] S[i]S A
t t + 1S[t] o
Stacks 16
Comparison of the Strategies
We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of npush operationsWe assume that we start with an empty stack represented by an array of size 1We call amortized time of a push operation the average time taken by a push over the series of operations, i.e., T(n)/n
Stacks 17
Incremental Strategy Analysis
We replace the array k = n/c timesThe total time T(n) of a series of n push operations is proportional to
n + c + 2c + 3c + 4c + + kc =n + c(1 + 2 + 3 + + k) =
n + ck(k + 1)/2Since c is a constant, T(n) is O(n + k2), i.e., O(n2)The amortized time of a push operation is O(n)
Stacks 18
Doubling Strategy Analysis
- We replace the array k = log2 n times
- The total time T(n) of a series of n push operations is proportional to
  n + 1 + 2 + 4 + 8 + … + 2^k = n + 2^(k+1) − 1 = 3n − 1
- T(n) is O(n)
- The amortized time of a push operation is O(1)
[Figure: geometric series 1, 2, 4, 8, …]
Stacks 19
Stack Interface in Java
- Java interface corresponding to our Stack ADT
- Requires the definition of class EmptyStackException
- Different from the built-in Java class java.util.Stack
public interface Stack {
public int size();
public boolean isEmpty();
public Object top()throws EmptyStackException;
public void push(Object o);
public Object pop()throws EmptyStackException;
}
Stacks 20
Array-based Stack in Java

public class ArrayStack implements Stack {

  // holds the stack elements
  private Object S[ ];

  // index to top element
  private int top = -1;

  // constructor
  public ArrayStack(int capacity) {
    S = new Object[capacity];
  }

  public Object pop() throws EmptyStackException {
    if (isEmpty())
      throw new EmptyStackException("Empty stack: cannot pop");
    Object temp = S[top];
    // facilitates garbage collection
    S[top] = null;
    top = top - 1;
    return temp;
  }
6/8/2002 2:14 PM Vectors 1
Vectors
6/8/2002 2:14 PM Vectors 2
Outline and Reading
The Vector ADT (2.2.1)
Array-based implementation (2.2.1)
6/8/2002 2:14 PM Vectors 3
The Vector ADT
- The Vector ADT extends the notion of array by storing a sequence of arbitrary objects
- An element can be accessed, inserted or removed by specifying its rank (number of elements preceding it)
- An exception is thrown if an incorrect rank is specified (e.g., a negative rank)
- Main vector operations:
  - object elemAtRank(integer r): returns the element at rank r without removing it
  - object replaceAtRank(integer r, object o): replaces the element at rank r with o and returns the old element
  - insertAtRank(integer r, object o): inserts a new element o to have rank r
  - object removeAtRank(integer r): removes and returns the element at rank r
- Additional operations size() and isEmpty()
6/8/2002 2:14 PM Vectors 4
Applications of Vectors
Direct applications Sorted collection of objects (elementary
database)
Indirect applications Auxiliary data structure for algorithms Component of other data structures
6/8/2002 2:14 PM Vectors 5
Array-based Vector
- Use an array V of size N
- A variable n keeps track of the size of the vector (number of elements stored)
- Operation elemAtRank(r) is implemented in O(1) time by returning V[r]
[Figure: array V with n elements; rank r marked]
6/8/2002 2:14 PM Vectors 6
Insertion
- In operation insertAtRank(r, o), we need to make room for the new element by shifting forward the n − r elements V[r], …, V[n − 1]
- In the worst case (r = 0), this takes O(n) time
[Figure: array V before the shift, after the shift, and after storing o at rank r]
6/8/2002 2:14 PM Vectors 7
Deletion
- In operation removeAtRank(r), we need to fill the hole left by the removed element by shifting backward the n − r − 1 elements V[r + 1], …, V[n − 1]
- In the worst case (r = 0), this takes O(n) time
[Figure: array V before the shift, during the shift, and after the hole at rank r is filled]
6/8/2002 2:14 PM Vectors 8
Performance
- In the array-based implementation of a Vector
  - The space used by the data structure is O(n)
  - size, isEmpty, elemAtRank and replaceAtRank run in O(1) time
  - insertAtRank and removeAtRank run in O(n) time
- If we use the array in a circular fashion, insertAtRank(0) and removeAtRank(0) run in O(1) time
- In an insertAtRank operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
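A minimal Java sketch of the shifting done by insertAtRank, assuming the fields V (the array) and n (the current size) from the slide; growing the array when full is omitted here:

/** Insertion into an array-based vector: shift V[r..n-1] forward to open a hole at rank r. */
public void insertAtRank(int r, Object o) {
  if (n == V.length)
    throw new IllegalStateException("vector is full");  // or replace V with a larger array
  for (int i = n - 1; i >= r; i--)
    V[i + 1] = V[i];                                    // shift one slot to the right, back to front
  V[r] = o;
  n++;
}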
6/8/2002 2:16 PM Queues 1
Queues
6/8/2002 2:16 PM Queues 2
Outline and Reading
The Queue ADT (2.1.2)
Implementation with a circular array (2.1.2)
Growable array-based queue
Queue interface in Java
6/8/2002 2:16 PM Queues 3
The Queue ADTThe Queue ADT stores arbitrary objectsInsertions and deletions follow the first-in first-out schemeInsertions are at the rear of the queue and removals are at the front of the queueMain queue operations: enqueue(object): inserts an
element at the end of the queue
object dequeue(): removes and returns the element at the front of the queue
Auxiliary queue operations: object front(): returns the
element at the front without removing it
integer size(): returns the number of elements stored
boolean isEmpty(): indicates whether no elements are stored
Exceptions Attempting the execution of
dequeue or front on an empty queue throws an EmptyQueueException
6/8/2002 2:16 PM Queues 4
Applications of Queues
Direct applications Waiting lists, bureaucracy Access to shared resources (e.g., printer) Multiprogramming
Indirect applications Auxiliary data structure for algorithms Component of other data structures
6/8/2002 2:16 PM Queues 5
Array-based Queue
- Use an array of size N in a circular fashion
- Two variables keep track of the front and rear
  - f: index of the front element
  - r: index immediately past the rear element
- Array location r is kept empty
[Figure: circular array Q in the normal configuration (f ≤ r) and the wrapped-around configuration (r < f)]
6/8/2002 2:16 PM Queues 6
Queue Operations
- We use the modulo operator (remainder of division)

Algorithm size()
  return (N − f + r) mod N

Algorithm isEmpty()
  return (f = r)

[Figure: circular array Q in normal and wrapped-around configurations]
Queues 6/8/2002 2:16 PM
2
6/8/2002 2:16 PM Queues 7
Queue Operations (cont.)
- Operation enqueue throws an exception if the array is full
- This exception is implementation-dependent

Algorithm enqueue(o)
  if size() = N − 1 then
    throw FullQueueException
  else
    Q[r] ← o
    r ← (r + 1) mod N

[Figure: circular array Q in normal and wrapped-around configurations]
6/8/2002 2:16 PM Queues 8
Queue Operations (cont.)
- Operation dequeue throws an exception if the queue is empty
- This exception is specified in the queue ADT

Algorithm dequeue()
  if isEmpty() then
    throw EmptyQueueException
  else
    o ← Q[f]
    f ← (f + 1) mod N
    return o

[Figure: circular array Q in normal and wrapped-around configurations]
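A hedged Java sketch of the circular-array queue, with f and r used exactly as above; the exception types are illustrative stand-ins for FullQueueException and EmptyQueueException:

/** Circular array-based queue; f = front index, r = index just past the rear element. */
public class ArrayQueue {
  private Object[] Q;
  private int f = 0, r = 0;             // location r is kept empty

  public ArrayQueue(int N) { Q = new Object[N]; }

  public int size()        { return (Q.length - f + r) % Q.length; }
  public boolean isEmpty() { return f == r; }

  public void enqueue(Object o) {
    if (size() == Q.length - 1)
      throw new IllegalStateException("queue is full");
    Q[r] = o;
    r = (r + 1) % Q.length;
  }

  public Object dequeue() {
    if (isEmpty())
      throw new IllegalStateException("queue is empty");
    Object o = Q[f];
    Q[f] = null;                        // aid garbage collection
    f = (f + 1) % Q.length;
    return o;
  }
}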
6/8/2002 2:16 PM Queues 9
Growable Array-based Queue
- In an enqueue operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
- Similar to what we did for an array-based stack
- The enqueue operation has amortized running time
  - O(n) with the incremental strategy
  - O(1) with the doubling strategy
6/8/2002 2:16 PM Queues 10
Queue Interface in Java
- Java interface corresponding to our Queue ADT
- Requires the definition of class EmptyQueueException
- No corresponding built-in Java class
public interface Queue {
public int size();
public boolean isEmpty();
public Object front()throws EmptyQueueException;
public void enqueue(Object o);
public Object dequeue()throws EmptyQueueException;
}
6/8/2002 2:15 PM Sequences 1
Lists and Sequences
6/8/2002 2:15 PM Sequences 2
Outline and Reading
Singly linked list
Position ADT and List ADT (2.2.2)
Doubly linked list (2.2.2)
Sequence ADT (2.2.3)
Implementations of the sequence ADT (2.2.3)
Iterators (2.2.3)
6/8/2002 2:15 PM Sequences 3
Singly Linked List
- A singly linked list is a concrete data structure consisting of a sequence of nodes
- Each node stores
  - element
  - link to the next node
[Figure: nodes A → B → C → D, each with an element and a next link]
6/8/2002 2:15 PM Sequences 4
Stack with a Singly Linked List
- We can implement a stack with a singly linked list
- The top element is stored at the first node of the list
- The space used is O(n) and each operation of the Stack ADT takes O(1) time
[Figure: linked-list stack with top pointer t]
6/8/2002 2:15 PM Sequences 5
Queue with a Singly Linked List
- We can implement a queue with a singly linked list
  - The front element is stored at the first node
  - The rear element is stored at the last node
- The space used is O(n) and each operation of the Queue ADT takes O(1) time
[Figure: linked-list queue with front pointer f and rear pointer r]
6/8/2002 2:15 PM Sequences 6
Position ADTThe Position ADT models the notion of place within a data structure where a single object is storedIt gives a unified view of diverse ways of storing data, such as a cell of an array a node of a linked list
Just one method: object element(): returns the element
stored at the position
6/8/2002 2:15 PM Sequences 7
List ADT
The List ADT models a sequence of positions storing arbitrary objectsIt establishes a before/after relation between positionsGeneric methods: size(), isEmpty()
Query methods: isFirst(p), isLast(p)
Accessor methods: first(), last() before(p), after(p)
Update methods: replaceElement(p, o),
swapElements(p, q) insertBefore(p, o),
insertAfter(p, o), insertFirst(o),
insertLast(o) remove(p)
6/8/2002 2:15 PM Sequences 8
Doubly Linked List
- A doubly linked list provides a natural implementation of the List ADT
- Nodes implement Position and store:
  - element
  - link to the previous node
  - link to the next node
- Special trailer and header nodes
[Figure: header and trailer sentinel nodes with element-bearing nodes/positions in between, linked by prev and next pointers]
6/8/2002 2:15 PM Sequences 9
Insertion
- We visualize operation insertAfter(p, X), which returns position q
[Figure: list A, B, C; a new node X is linked in after position p, yielding A, B, X, C with q the position of X]
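A small Java sketch of insertAfter on a doubly linked list node, assuming a node type with elem/prev/next fields as drawn above and that p is not the trailer (so p.next exists):

class DNode<E> { E elem; DNode<E> prev, next; }

/** Splices a new node q holding e between p and p.next, and returns q. */
static <E> DNode<E> insertAfter(DNode<E> p, E e) {
  DNode<E> q = new DNode<E>();
  q.elem = e;
  q.prev = p;                // q points back to p
  q.next = p.next;           // q points forward to p's old successor
  p.next.prev = q;           // old successor points back to q
  p.next = q;                // p points forward to q
  return q;                  // q is the position of the new element
}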
6/8/2002 2:15 PM Sequences 10
Deletion
- We visualize remove(p), where p = last()
[Figure: list A, B, C, D; the node at position p (holding D) is unlinked, yielding A, B, C]
6/8/2002 2:15 PM Sequences 11
Performance
- In the implementation of the List ADT by means of a doubly linked list
  - The space used by a list with n elements is O(n)
  - The space used by each position of the list is O(1)
  - All the operations of the List ADT run in O(1) time
  - Operation element() of the Position ADT runs in O(1) time
6/8/2002 2:15 PM Sequences 12
Sequence ADTThe Sequence ADT is the union of the Vector and List ADTsElements accessed by Rank, or Position
Generic methods: size(), isEmpty()
Vector-based methods: elemAtRank(r),
replaceAtRank(r, o), insertAtRank(r, o), removeAtRank(r)
List-based methods: first(), last(),
before(p), after(p), replaceElement(p, o), swapElements(p, q), insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o), remove(p)
Bridge methods: atRank(r), rankOf(p)
6/8/2002 2:15 PM Sequences 13
Applications of SequencesThe Sequence ADT is a basic, general-purpose, data structure for storing an ordered collection of elementsDirect applications: Generic replacement for stack, queue, vector, or
list small database (e.g., address book)
Indirect applications: Building block of more complex data structures
6/8/2002 2:15 PM Sequences 14
Array-based Implementation
- We use a circular array storing positions
- A position object stores:
  - Element
  - Rank
- Indices f and l keep track of first and last positions
[Figure: circular array S of position objects, with indices f and l marking the first and last positions]
6/8/2002 2:15 PM Sequences 15
Sequence Implementations
Operation                          Array   List
size, isEmpty                      1       1
atRank, rankOf, elemAtRank         1       n
first, last, before, after         1       1
replaceElement, swapElements       1       1
replaceAtRank                      1       n
insertAtRank, removeAtRank         n       n
insertFirst, insertLast            1       1
insertAfter, insertBefore          n       1
remove                             n       1
6/8/2002 2:15 PM Sequences 16
Iterators
- An iterator abstracts the process of scanning through a collection of elements
- Methods of the ObjectIterator ADT:
  - object object()
  - boolean hasNext()
  - object nextObject()
  - reset()
- Extends the concept of Position by adding a traversal capability
- Implementation with an array or singly linked list
- An iterator is typically associated with another data structure
- We can augment the Stack, Queue, Vector, List and Sequence ADTs with method:
  - ObjectIterator elements()
- Two notions of iterator:
  - snapshot: freezes the contents of the data structure at a given time
  - dynamic: follows changes to the data structure
6/8/2002 2:15 PM Trees 1
Trees
6/8/2002 2:15 PM Trees 2
Outline and Reading
Tree ADT (2.3.1)
Preorder and postorder traversals (2.3.2)
BinaryTree ADT (2.3.3)
Inorder traversal (2.3.3)
Euler Tour traversal (2.3.3)
Template method pattern
Data structures for trees (2.3.4)
Java implementation (http://jdsl.org)
6/8/2002 2:15 PM Trees 3
What is a Tree
- In computer science, a tree is an abstract model of a hierarchical structure
- A tree consists of nodes with a parent-child relation
- Applications:
  - Organization charts
  - File systems
  - Programming environments
[Figure: organization chart rooted at ComputersRUs with children Sales, Manufacturing and R&D, and further subdivisions (US, International, Europe, Asia, Canada, Laptops, Desktops)]
6/8/2002 2:15 PM Trees 4
Tree Terminology
- Root: node without parent (A)
- Internal node: node with at least one child (A, B, C, F)
- External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D)
- Ancestors of a node: parent, grandparent, grand-grandparent, etc.
- Depth of a node: number of ancestors
- Height of a tree: maximum depth of any node (3)
- Descendant of a node: child, grandchild, grand-grandchild, etc.
- Subtree: tree consisting of a node and its descendants
[Figure: example tree with root A; the subtree rooted at C is highlighted]
6/8/2002 2:15 PM Trees 5
Tree ADTWe use positions to abstract nodesGeneric methods: integer size() boolean isEmpty() objectIterator elements() positionIterator positions()
Accessor methods: position root() position parent(p) positionIterator children(p)
Query methods: boolean isInternal(p) boolean isExternal(p) boolean isRoot(p)
Update methods: swapElements(p, q) object replaceElement(p, o)
Additional update methods may be defined by data structures implementing the Tree ADT
6/8/2002 2:15 PM Trees 6
Preorder Traversal
- A traversal visits the nodes of a tree in a systematic manner
- In a preorder traversal, a node is visited before its descendants
- Application: print a structured document
[Figure: "Make Money Fast!" document outline labeled with preorder visit numbers 1–9]

Algorithm preOrder(v)
  visit(v)
  for each child w of v
    preOrder(w)
6/8/2002 2:15 PM Trees 7
Postorder Traversal
- In a postorder traversal, a node is visited after its descendants
- Application: compute space used by files in a directory and its subdirectories

Algorithm postOrder(v)
  for each child w of v
    postOrder(w)
  visit(v)

[Figure: cs16/ directory tree labeled with postorder visit numbers 1–9]
6/8/2002 2:15 PM Trees 8
Binary Tree
- A binary tree is a tree with the following properties:
  - Each internal node has two children
  - The children of a node are an ordered pair
- We call the children of an internal node left child and right child
- Alternative recursive definition: a binary tree is either
  - a tree consisting of a single node, or
  - a tree whose root has an ordered pair of children, each of which is a binary tree
- Applications:
  - arithmetic expressions
  - decision processes
  - searching
[Figure: example binary tree rooted at A]
6/8/2002 2:15 PM Trees 9
Arithmetic Expression Tree
- Binary tree associated with an arithmetic expression
  - internal nodes: operators
  - external nodes: operands
- Example: arithmetic expression tree for the expression (2 × (a − 1) + (3 × b))
[Figure: expression tree with + at the root and leaves 2, a, 1, 3, b]
6/8/2002 2:15 PM Trees 10
Decision Tree
- Binary tree associated with a decision process
  - internal nodes: questions with yes/no answer
  - external nodes: decisions
- Example: dining decision
[Figure: root "Want a fast meal?"; Yes leads to "How about coffee?" (Yes: Starbucks, No: Spike's), No leads to "On expense account?" (Yes: Al Forno, No: Café Paragon)]
6/8/2002 2:15 PM Trees 11
Properties of Binary Trees
- Notation
  n  number of nodes
  e  number of external nodes
  i  number of internal nodes
  h  height
- Properties:
  e = i + 1
  n = 2e − 1
  h ≤ i
  h ≤ (n − 1)/2
  e ≤ 2^h
  h ≥ log2 e
  h ≥ log2 (n + 1) − 1
6/8/2002 2:15 PM Trees 12
BinaryTree ADT
The BinaryTree ADT extends the Tree ADT, i.e., it inherits all the methods of the Tree ADTAdditional methods: position leftChild(p) position rightChild(p) position sibling(p)
Update methods may be defined by data structures implementing the BinaryTree ADT
6/8/2002 2:15 PM Trees 13
Inorder Traversal
- In an inorder traversal a node is visited after its left subtree and before its right subtree
- Application: draw a binary tree
  - x(v) = inorder rank of v
  - y(v) = depth of v

Algorithm inOrder(v)
  if isInternal(v)
    inOrder(leftChild(v))
  visit(v)
  if isInternal(v)
    inOrder(rightChild(v))

[Figure: binary tree with nodes labeled 1–9 by inorder rank]
6/8/2002 2:15 PM Trees 14
Print Arithmetic Expressions
- Specialization of an inorder traversal
  - print operand or operator when visiting node
  - print "(" before traversing left subtree
  - print ")" after traversing right subtree

Algorithm printExpression(v)
  if isInternal(v)
    print("(")
    inOrder(leftChild(v))
  print(v.element())
  if isInternal(v)
    inOrder(rightChild(v))
    print(")")

Example output: ((2 × (a − 1)) + (3 × b))
6/8/2002 2:15 PM Trees 15
Evaluate Arithmetic Expressions
- Specialization of a postorder traversal
  - recursive method returning the value of a subtree
  - when visiting an internal node, combine the values of the subtrees

Algorithm evalExpr(v)
  if isExternal(v)
    return v.element()
  else
    x ← evalExpr(leftChild(v))
    y ← evalExpr(rightChild(v))
    ◊ ← operator stored at v
    return x ◊ y

[Figure: expression tree with + at the root and leaves 2, 5, 1, 3, 2]
6/8/2002 2:15 PM Trees 16
Euler Tour Traversal
- Generic traversal of a binary tree
- Includes as special cases the preorder, postorder and inorder traversals
- Walk around the tree and visit each node three times:
  - on the left (preorder)
  - from below (inorder)
  - on the right (postorder)
[Figure: Euler tour walk around an expression tree, with L, B and R visit markers]
6/8/2002 2:15 PM Trees 17
Template Method Pattern
- Generic algorithm that can be specialized by redefining certain steps
- Implemented by means of an abstract Java class
- Visit methods that can be redefined by subclasses
- Template method eulerTour
  - Recursively called on the left and right children
  - A Result object with fields leftResult, rightResult and finalResult keeps track of the output of the recursive calls to eulerTour

public abstract class EulerTour {
  protected BinaryTree tree;
  protected void visitExternal(Position p, Result r) { }
  protected void visitLeft(Position p, Result r) { }
  protected void visitBelow(Position p, Result r) { }
  protected void visitRight(Position p, Result r) { }
  protected Object eulerTour(Position p) {
    Result r = new Result();
    if (tree.isExternal(p)) { visitExternal(p, r); }
    else {
      visitLeft(p, r);
      r.leftResult = eulerTour(tree.leftChild(p));
      visitBelow(p, r);
      r.rightResult = eulerTour(tree.rightChild(p));
      visitRight(p, r);
    }
    return r.finalResult;
  }
}
6/8/2002 2:15 PM Trees 18
Specializations of EulerTour
- We show how to specialize class EulerTour to evaluate an arithmetic expression
- Assumptions
  - External nodes store Integer objects
  - Internal nodes store Operator objects supporting method operation(Integer, Integer)

public class EvaluateExpression extends EulerTour {
  protected void visitExternal(Position p, Result r) {
    r.finalResult = (Integer) p.element();
  }
  protected void visitRight(Position p, Result r) {
    Operator op = (Operator) p.element();
    r.finalResult = op.operation(
      (Integer) r.leftResult,
      (Integer) r.rightResult);
  }
}
6/8/2002 2:15 PM Trees 19
Data Structure for Trees
- A node is represented by an object storing
  - Element
  - Parent node
  - Sequence of children nodes
- Node objects implement the Position ADT
[Figure: linked representation of a tree with root B and children A, D, F]
6/8/2002 2:15 PM Trees 20
Data Structure for Binary Trees
- A node is represented by an object storing
  - Element
  - Parent node
  - Left child node
  - Right child node
- Node objects implement the Position ADT
[Figure: linked representation of a binary tree with root B and children A, D]
6/8/2002 2:15 PM Trees 21
Java Implementation
- Tree interface
- BinaryTree interface extending Tree
- Classes implementing Tree and BinaryTree and providing
  - Constructors
  - Update methods
  - Print methods
- Examples of updates for binary trees
  - expandExternal(v)
  - removeAboveExternal(w)
[Figure: expandExternal(v) turns external node v into an internal node with two new external children; removeAboveExternal(w) removes w and its parent]
6/8/2002 2:15 PM Trees 22
Trees in JDSL
- JDSL is the Library of Data Structures in Java
- Tree interfaces in JDSL
  - InspectableBinaryTree
  - InspectableTree
  - BinaryTree
  - Tree
- Inspectable versions of the interfaces do not have update methods
- Tree classes in JDSL
  - NodeBinaryTree
  - NodeTree
- JDSL was developed at Brown's Center for Geometric Computing
- See the JDSL documentation and tutorials at http://jdsl.org
[Diagram: interface hierarchy with InspectableTree at the top, extended by InspectableBinaryTree and Tree, and BinaryTree extending both]
Heaps and Priority Queues 1
Heaps and Priority Queues
Heaps and Priority Queues 2
Priority QueueADT ( 2.4.1) A priority queue stores a
collection of items An item is a pair
(key, element) Main methods of the Priority
Queue ADT insertItem(k, o)
inserts an item with key kand element o
removeMin()removes the item withsmallest key and returns itselement
Additional methods minKey(k, o)
returns, but does notremove, the smallest key ofan item
minElement()returns, but does notremove, the element of anitem with smallest key
size(), isEmpty() Applications:
Standby flyers Auctions Stock market
Heaps and Priority Queues 3
Total Order Relation
- Keys in a priority queue can be arbitrary objects on which an order is defined
- Two distinct items in a priority queue can have the same key
- Mathematical concept of total order relation ≤
  - Reflexive property: x ≤ x
  - Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
  - Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z
Heaps and Priority Queues 4
Comparator ADT ( 2.4.1) A comparator encapsulates
the action of comparing twoobjects according to a giventotal order relation
A generic priority queueuses an auxiliarycomparator
The comparator is externalto the keys being compared
When the priority queueneeds to compare two keys,it uses its comparator
Methods of the ComparatorADT, all with Booleanreturn type isLessThan(x, y) isLessThanOrEqualTo(x,y) isEqualTo(x,y) isGreaterThan(x, y) isGreaterThanOrEqualTo(x,y) isComparable(x)
Heaps and Priority Queues 5
Sorting with a Priority Queue ( 2.4.2) We can use a priority
queue to sort a set ofcomparable elements Insert the elements one
by one with a series ofinsertItem(e, e)operations
Remove the elements insorted order with a seriesof removeMin()operations
The running time of thissorting method depends onthe priority queueimplementation
Algorithm PQ-Sort(S, C)
  Input: sequence S, comparator C for the elements of S
  Output: sequence S sorted in increasing order according to C
  P ← priority queue with comparator C
  while ¬S.isEmpty()
    e ← S.remove(S.first())
    P.insertItem(e, e)
  while ¬P.isEmpty()
    e ← P.removeMin()
    S.insertLast(e)
Heaps and Priority Queues 6
Sequence-based Priority Queue Implementation with an
unsorted list
Performance: insertItem takes O(1) time
since we can insert the itemat the beginning or end ofthe sequence
removeMin, minKey andminElement take O(n) timesince we have to traversethe entire sequence to findthe smallest key
Implementation with asorted list
Performance: insertItem takes O(n) time
since we have to find theplace where to insert theitem
removeMin, minKey andminElement take O(1) timesince the smallest key is atthe beginning of thesequence
4 5 2 3 1 1 2 3 4 5
Heaps and Priority Queues 7
Selection-Sort
Selection-sort is the variation of PQ-sort where thepriority queue is implemented with an unsortedsequence
Running time of Selection-sort: Inserting the elements into the priority queue with n
insertItem operations takes O(n) time Removing the elements in sorted order from the priority
queue with n removeMin operations takes timeproportional to
1 + 2 + + n Selection-sort runs in O(n2) time
4 5 2 3 1
Heaps and Priority Queues 8
Insertion-Sort
Insertion-sort is the variation of PQ-sort where thepriority queue is implemented with a sortedsequence
Running time of Insertion-sort: Inserting the elements into the priority queue with n
insertItem operations takes time proportional to 1 + 2 + + n
Removing the elements in sorted order from the priorityqueue with a series of n removeMin operations takes O(n)time
Insertion-sort runs in O(n2) time
1 2 3 4 5
Heaps and Priority Queues 9
What is a heap (2.4.3)
- A heap is a binary tree storing keys at its internal nodes and satisfying the following properties:
  - Heap-Order: for every internal node v other than the root, key(v) ≥ key(parent(v))
  - Complete Binary Tree: let h be the height of the heap
    - for i = 0, …, h − 1, there are 2^i nodes of depth i
    - at depth h − 1, the internal nodes are to the left of the external nodes
- The last node of a heap is the rightmost internal node of depth h − 1
[Figure: example heap with keys 2, 5, 6, 9, 7; the last node is marked]
Heaps and Priority Queues 10
Height of a Heap (2.4.3)
- Theorem: A heap storing n keys has height O(log n)
- Proof: (we apply the complete binary tree property)
  - Let h be the height of a heap storing n keys
  - Since there are 2^i keys at depth i = 0, …, h − 2 and at least one key at depth h − 1, we have n ≥ 1 + 2 + 4 + … + 2^(h−2) + 1
  - Thus, n ≥ 2^(h−1), i.e., h ≤ log n + 1
[Figure: heap levels with 1, 2, …, 2^(h−2), 1 keys at depths 0, 1, …, h − 2, h − 1]
Heaps and Priority Queues 11
Heaps and Priority Queues
- We can use a heap to implement a priority queue
- We store a (key, element) item at each internal node
- We keep track of the position of the last node
- For simplicity, we show only the keys in the pictures
[Figure: heap storing (2, Sue), (5, Pat), (6, Mark), (9, Jeff), (7, Anna)]
Heaps and Priority Queues 12
Insertion into a Heap (2.4.3)
- Method insertItem of the priority queue ADT corresponds to the insertion of a key k to the heap
- The insertion algorithm consists of three steps
  - Find the insertion node z (the new last node)
  - Store k at z and expand z into an internal node
  - Restore the heap-order property (discussed next)
[Figure: heap with keys 2, 5, 6, 9, 7; the new key 1 is stored at the insertion node z]
Heaps and Priority Queues 13
Upheap
- After the insertion of a new key k, the heap-order property may be violated
- Algorithm upheap restores the heap-order property by swapping k along an upward path from the insertion node
- Upheap terminates when the key k reaches the root or a node whose parent has a key smaller than or equal to k
- Since a heap has height O(log n), upheap runs in O(log n) time
[Figure: upheap of the new key 1 until it reaches the root]
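A hedged Java sketch of upheap on an array-based min-heap (1-indexed ranks, so the parent of rank i is i/2); the layout matches the vector-based representation described later:

/** Upheap: swap the key at rank i with its parent while it is smaller than the parent. */
static void upheap(int[] heap, int i) {
  while (i > 1 && heap[i] < heap[i / 2]) {   // heap-order violated at rank i
    int tmp = heap[i];
    heap[i] = heap[i / 2];
    heap[i / 2] = tmp;                       // swap with the parent
    i = i / 2;                               // continue from the parent's rank
  }
}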
Heaps and Priority Queues 14
Removal from a Heap (2.4.3)
- Method removeMin of the priority queue ADT corresponds to the removal of the root key from the heap
- The removal algorithm consists of three steps
  - Replace the root key with the key of the last node w
  - Compress w and its children into a leaf
  - Restore the heap-order property (discussed next)
[Figure: the root key 2 is removed; the last node's key 7 replaces it]
Heaps and Priority Queues 15
Downheap
- After replacing the root key with the key k of the last node, the heap-order property may be violated
- Algorithm downheap restores the heap-order property by swapping key k along a downward path from the root
- Downheap terminates when key k reaches a leaf or a node whose children have keys greater than or equal to k
- Since a heap has height O(log n), downheap runs in O(log n) time
[Figure: key 7 is swapped downward with its smaller child]
Heaps and Priority Queues 16
Updating the Last Node
- The insertion node can be found by traversing a path of O(log n) nodes
  - Go up until a left child or the root is reached
  - If a left child is reached, go to the right child
  - Go down left until a leaf is reached
- Similar algorithm for updating the last node after a removal
Heaps and Priority Queues 17
Heap-Sort (2.4.4)
- Consider a priority queue with n items implemented by means of a heap
  - the space used is O(n)
  - methods insertItem and removeMin take O(log n) time
  - methods size, isEmpty, minKey, and minElement take O(1) time
- Using a heap-based priority queue, we can sort a sequence of n elements in O(n log n) time
- The resulting algorithm is called heap-sort
- Heap-sort is much faster than quadratic sorting algorithms, such as insertion-sort and selection-sort
Heaps and Priority Queues 18
Vector-based Heap Implementation (2.4.3)
- We can represent a heap with n keys by means of a vector of length n + 1
- For the node at rank i
  - the left child is at rank 2i
  - the right child is at rank 2i + 1
- Links between nodes are not explicitly stored
- The leaves are not represented
- The cell at rank 0 is not used
- Operation insertItem corresponds to inserting at rank n + 1
- Operation removeMin corresponds to removing at rank n
- Yields in-place heap-sort
[Figure: heap with keys 2, 5, 6, 9, 7 and its vector representation 2, 5, 6, 9, 7 at ranks 1–5]
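A hedged Java sketch of removeMin on this vector-based layout, using an ArrayList with cell 0 unused; it assumes a non-empty heap and performs the downheap described earlier:

import java.util.ArrayList;

public class VectorHeap {
  private final ArrayList<Integer> h = new ArrayList<Integer>();

  public VectorHeap() { h.add(null); }          // rank 0 is not used
  private int n()     { return h.size() - 1; }  // number of keys stored

  /** removeMin: move the last key to the root, shrink, then downheap from rank 1. */
  public int removeMin() {
    int min = h.get(1);
    h.set(1, h.get(n()));
    h.remove(n());
    int i = 1;
    while (2 * i <= n()) {                      // while a left child exists
      int c = 2 * i;                            // pick the smaller child of i
      if (c + 1 <= n() && h.get(c + 1) < h.get(c)) c = c + 1;
      if (h.get(i) <= h.get(c)) break;          // heap order restored
      int tmp = h.get(i); h.set(i, h.get(c)); h.set(c, tmp);
      i = c;
    }
    return min;
  }
}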
Heaps and Priority Queues 19
Merging Two Heaps
- We are given two heaps and a key k
- We create a new heap with the root node storing k and with the two heaps as subtrees
- We perform downheap to restore the heap-order property
[Figure: two heaps merged under a new root storing key 7; downheap then moves 7 down, leaving the smaller key at the root]
Heaps and Priority Queues 20
Bottom-up Heap Construction (2.4.3)
- We can construct a heap storing n given keys using a bottom-up construction with log n phases
- In phase i, pairs of heaps with 2^i − 1 keys are merged into heaps with 2^(i+1) − 1 keys
[Figure: two heaps of 2^i − 1 keys merged into one heap of 2^(i+1) − 1 keys]
Heaps and Priority Queues 21
Example
[Figure: bottom-up heap construction example, first phase]
Heaps and Priority Queues 22
Example (contd.)
[Figure: bottom-up heap construction example, second phase]
Heaps and Priority Queues 23
Example (contd.)
[Figure: bottom-up heap construction example, third phase]
Heaps and Priority Queues 24
Example (end)
[Figure: bottom-up heap construction example, final phase and resulting heap]
Heaps and Priority Queues 25
Analysis
- We visualize the worst-case time of a downheap with a proxy path that goes first right and then repeatedly goes left until the bottom of the heap (this path may differ from the actual downheap path)
- Since each node is traversed by at most two proxy paths, the total number of nodes of the proxy paths is O(n)
- Thus, bottom-up heap construction runs in O(n) time
- Bottom-up heap construction is faster than n successive insertions and speeds up the first phase of heap-sort
6/8/2002 2:00 PM Priority Queues 1
Priority Queues
[Title-slide figure: stock-market buy/sell orders for IBM at various prices]
6/8/2002 2:00 PM Priority Queues 2
Outline and Reading
PriorityQueue ADT (2.4.1)
Total order relation (2.4.1)
Comparator ADT (2.4.1)
Sorting with a priority queue (2.4.2)
Selection-sort (2.4.2)
Insertion-sort (2.4.2)
Priority Queues 3
Priority Queue ADT
A priority queue stores a collection of items
An item is a pair (key, element)
Main methods of the Priority Queue ADT
 insertItem(k, o): inserts an item with key k and element o
 removeMin(): removes the item with smallest key and returns its element
Additional methods
 minKey(): returns, but does not remove, the smallest key of an item
 minElement(): returns, but does not remove, the element of an item with smallest key
 size(), isEmpty()
Applications: standby flyers, auctions, stock market
Priority Queues 4
Total Order Relation
Keys in a priority queue can be arbitrary objects on which an order is defined
Two distinct items in a priority queue can have the same key
Mathematical concept of total order relation ≤
 Reflexive property: x ≤ x
 Antisymmetric property: x ≤ y and y ≤ x imply x = y
 Transitive property: x ≤ y and y ≤ z imply x ≤ z
Priority Queues 5
Comparator ADT
A comparator encapsulates the action of comparing two objects according to a given total order relation
A generic priority queue uses an auxiliary comparator
The comparator is external to the keys being compared
When the priority queue needs to compare two keys, it uses its comparator
Methods of the Comparator ADT, all with Boolean return type
 isLessThan(x, y)
 isLessThanOrEqualTo(x, y)
 isEqualTo(x, y)
 isGreaterThan(x, y)
 isGreaterThanOrEqualTo(x, y)
 isComparable(x)
Priority Queues 6
Sorting with a Priority Queue
We can use a priority queue to sort a set of comparable elements
1. Insert the elements one by one with a series of insertItem(e, e) operations
2. Remove the elements in sorted order with a series of removeMin() operations
The running time of this sorting method depends on the priority queue implementation

Algorithm PQ-Sort(S, C)
  Input sequence S, comparator C for the elements of S
  Output sequence S sorted in increasing order according to C
  P ← priority queue with comparator C
  while ¬S.isEmpty()
    e ← S.remove(S.first())
    P.insertItem(e, e)
  while ¬P.isEmpty()
    e ← P.removeMin()
    S.insertLast(e)
Priority Queues 7
Sequence-based Priority Queue
Implementation with an unsorted sequence
 Store the items of the priority queue in a list-based sequence, in arbitrary order
 Performance:
  insertItem takes O(1) time since we can insert the item at the beginning or end of the sequence
  removeMin, minKey and minElement take O(n) time since we have to traverse the entire sequence to find the smallest key
Implementation with a sorted sequence
 Store the items of the priority queue in a sequence, sorted by key
 Performance:
  insertItem takes O(n) time since we have to find the position where the item should be inserted
  removeMin, minKey and minElement take O(1) time since the smallest key is at the beginning of the sequence
Priority Queues 8
Selection-Sort
Selection-sort is the variation of PQ-sort where the priority queue is implemented with an unsorted sequence
Running time of Selection-sort:
1. Inserting the elements into the priority queue with n insertItem operations takes O(n) time
2. Removing the elements in sorted order from the priority queue with n removeMin operations takes time proportional to 1 + 2 + ... + n
Selection-sort runs in O(n^2) time
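A minimal in-place selection-sort sketch in Java: the unsorted suffix of the array plays the role of the unsorted-sequence priority queue, and each pass performs one removeMin by scanning that suffix for its smallest key.

    // In-place selection-sort: O(n^2) time, since each of the n passes
    // scans the remaining unsorted suffix to find the minimum.
    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++)
                if (a[j] < a[min]) min = j;              // smallest key in a[i..n-1]
            int t = a[i]; a[i] = a[min]; a[min] = t;     // "removeMin" into position i
        }
    }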
Priority Queues 9
Insertion-Sort
Insertion-sort is the variation of PQ-sort where the priority queue is implemented with a sorted sequence
Running time of Insertion-sort:
1. Inserting the elements into the priority queue with n insertItem operations takes time proportional to 1 + 2 + ... + n
2. Removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time
Insertion-sort runs in O(n^2) time
Priority Queues 10
In-place Insertion-sort
Instead of using an external data structure, we can implement selection-sort and insertion-sort in-place
A portion of the input sequence itself serves as the priority queue
For in-place insertion-sort
 We keep the initial portion of the sequence sorted
 We can use swapElements instead of modifying the sequence (a code sketch follows the trace below)
[Example trace: 5 4 2 3 1 → 4 5 | 2 3 1 → 2 4 5 | 3 1 → 2 3 4 5 | 1 → 1 2 3 4 5, where the bar separates the sorted prefix from the unsorted suffix]
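A minimal sketch of in-place insertion-sort in Java, mirroring the trace above: the sorted prefix a[0..i-1] plays the role of the sorted-sequence priority queue, and swaps (in the spirit of swapElements) move each new key into place.

    // In-place insertion-sort: O(n^2) time in the worst case, since
    // inserting the i-th key may require up to i swaps.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++)
            for (int j = i; j > 0 && a[j] < a[j - 1]; j--) {
                int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t; // swap key into the sorted prefix
            }
    }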
Dictionaries and Hash Tables 1
Dictionaries and Hash Tables
[Figure: a hash table with cells 0-4 storing phone numbers such as 451-229-0004, 981-101-0002 and 025-612-0001]
Dictionaries and Hash Tables 2
Dictionary ADT (2.5.1)
The dictionary ADT models a searchable collection of key-element items
The main operations of a dictionary are searching, inserting, and deleting items
Multiple items with the same key are allowed
Applications: address book, credit card authorization, mapping host names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)
Dictionary ADT methods:
 findElement(k): if the dictionary has an item with key k, returns its element, else returns the special element NO_SUCH_KEY
 insertItem(k, o): inserts item (k, o) into the dictionary
 removeElement(k): if the dictionary has an item with key k, removes it from the dictionary and returns its element, else returns the special element NO_SUCH_KEY
 size(), isEmpty()
 keys(), elements()
Dictionaries and Hash Tables 3
Log File (2.5.1)
A log file is a dictionary implemented by means of an unsorted sequence
 We store the items of the dictionary in a sequence (based on a doubly-linked list or a circular array), in arbitrary order
Performance:
 insertItem takes O(1) time since we can insert the new item at the beginning or at the end of the sequence
 findElement and removeElement take O(n) time since in the worst case (the item is not found) we traverse the entire sequence to look for an item with the given key
The log file is effective only for dictionaries of small size or for dictionaries on which insertions are the most common operations, while searches and removals are rarely performed (e.g., historical record of logins to a workstation)
Dictionaries and Hash Tables 4
Hash Functions and Hash Tables (2.5.2)
A hash function h maps keys of a given type to integers in a fixed interval [0, N - 1]
Example: h(x) = x mod N is a hash function for integer keys
The integer h(x) is called the hash value of key x
A hash table for a given key type consists of
 Hash function h
 Array (called table) of size N
When implementing a dictionary with a hash table, the goal is to store item (k, o) at index i = h(k)
Dictionaries and Hash Tables 5
Example
We design a hash table for a dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer
Our hash table uses an array of size N = 10,000 and the hash function h(x) = last four digits of x
[Figure: 025-612-0001 is stored in cell 1, 981-101-0002 in cell 2, 451-229-0004 in cell 4, and 200-751-9998 in cell 9998]
Dictionaries and Hash Tables 6
Hash Functions (2.5.3)
A hash function is usually specified as the composition of two functions:
 Hash code map: h1: keys → integers
 Compression map: h2: integers → [0, N - 1]
The hash code map is applied first, and the compression map is applied next on the result, i.e., h(x) = h2(h1(x))
The goal of the hash function is to disperse the keys in an apparently random way
Dictionaries and Hash Tables 7
Hash Code Maps (2.5.3)
Memory address:
 We reinterpret the memory address of the key object as an integer (default hash code of all Java objects)
 Good in general, except for numeric and string keys
Integer cast:
 We reinterpret the bits of the key as an integer
 Suitable for keys of length less than or equal to the number of bits of the integer type (e.g., byte, short, int and float in Java)
Component sum:
 We partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and we sum the components (ignoring overflows)
 Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (e.g., long and double in Java)
Dictionaries and Hash Tables 8
Hash Code Maps (cont.)
Polynomial accumulation:
 We partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits): a0 a1 ... a(n-1)
 We evaluate the polynomial
  p(z) = a0 + a1 z + a2 z^2 + ... + a(n-1) z^(n-1)
 at a fixed value z, ignoring overflows
 Especially suitable for strings (e.g., the choice z = 33 gives at most 6 collisions on a set of 50,000 English words)
Polynomial p(z) can be evaluated in O(n) time using Horner's rule:
 The following polynomials are successively computed, each from the previous one in O(1) time
  p0(z) = a(n-1)
  pi(z) = a(n-i-1) + z pi-1(z)   (i = 1, 2, ..., n - 1)
 We have p(z) = p(n-1)(z)
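A minimal Java sketch of a polynomial hash code for strings, evaluated with Horner's rule; z = 33 is used as suggested above, the characters of the string serve as the components, and overflow simply wraps around in int arithmetic.

    // Polynomial hash code via Horner's rule: O(n) time for a string of length n.
    static int polyHashCode(String s) {
        final int z = 33;               // fixed value of z
        int h = 0;
        for (int i = 0; i < s.length(); i++)
            h = z * h + s.charAt(i);    // one Horner step per component, O(1) each
        return h;
    }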
Dictionaries and Hash Tables 9
Compression Maps (2.5.4)
Division:
 h2(y) = y mod N
 The size N of the hash table is usually chosen to be a prime
 The reason has to do with number theory and is beyond the scope of this course
Multiply, Add and Divide (MAD):
 h2(y) = (ay + b) mod N
 a and b are nonnegative integers such that a mod N ≠ 0
 Otherwise, every integer would map to the same value b
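Minimal Java sketches of the two compression maps; the table size N and the MAD parameters a and b below are illustrative choices, not prescribed by the text.

    static final int N = 101;            // table size, chosen to be prime
    static final int A = 33, B = 7;      // MAD parameters with A mod N != 0

    static int divisionCompress(int hashCode) {
        return Math.floorMod(hashCode, N);          // h2(y) = y mod N, kept non-negative
    }

    static int madCompress(int hashCode) {
        return Math.floorMod(A * hashCode + B, N);  // h2(y) = (ay + b) mod N
    }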
Dictionaries and Hash Tables 10
Collision Handling (2.5.5)
Collisions occur when different elements are mapped to the same cell
Chaining: let each cell in the table point to a linked list of elements that map there
Chaining is simple, but requires additional memory outside the table
[Figure: chaining - one cell points to a list containing 451-229-0004 and 981-101-0004, another cell stores 025-612-0001]
Dictionaries and Hash Tables 11
Linear Probing (2.5.5)
Open addressing: the colliding item is placed in a different cell of the table
Linear probing handles collisions by placing the colliding item in the next (circularly) available table cell
Each table cell inspected is referred to as a probe
Colliding items lump together, so future collisions produce longer and longer sequences of probes
Example:
 h(x) = x mod 13
 Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order
[Figure: resulting table with cells 0-12; cell 2: 41, cell 5: 18, cell 6: 44, cell 7: 59, cell 8: 32, cell 9: 22, cell 10: 31, cell 11: 73]
Dictionaries and Hash Tables 12
Search with Linear Probing
Consider a hash table A that uses linear probing
findElement(k)
 We start at cell h(k)
 We probe consecutive locations until one of the following occurs
  An item with key k is found, or
  An empty cell is found, or
  N cells have been unsuccessfully probed
Algorithm findElement(k)
  i ← h(k)
  p ← 0
  repeat
    c ← A[i]
    if c = ∅
      return NO_SUCH_KEY
    else if c.key() = k
      return c.element()
    else
      i ← (i + 1) mod N
      p ← p + 1
  until p = N
  return NO_SUCH_KEY
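A Java rendering of the findElement pseudocode above, as a small self-contained sketch: the Entry class, the table size and the field names are illustrative, a null cell plays the role of the empty cell, and NO_SUCH_KEY is a sentinel object.

public class LinearProbingTable {
    static final Object NO_SUCH_KEY = new Object();   // special "not found" value

    static class Entry {
        int key; Object element;
        Entry(int k, Object e) { key = k; element = e; }
    }

    final int N = 13;                  // table size
    final Entry[] A = new Entry[N];    // null means an empty cell

    int h(int k) { return Math.floorMod(k, N); }      // h(k) = k mod N

    public Object findElement(int k) {
        int i = h(k);
        for (int p = 0; p < N; p++) {                  // at most N probes
            Entry c = A[i];
            if (c == null) return NO_SUCH_KEY;         // empty cell: key is not in the table
            if (c.key == k) return c.element;          // item with key k found
            i = (i + 1) % N;                           // next cell, circularly
        }
        return NO_SUCH_KEY;                            // N cells probed unsuccessfully
    }
}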
Dictionaries and Hash Tables 13
Updates with Linear Probing
To handle insertions and deletions, we introduce a special object, called AVAILABLE, which replaces deleted elements
removeElement(k)
 We search for an item with key k
 If such an item (k, o) is found, we replace it with the special item AVAILABLE and we return element o
 Else, we return NO_SUCH_KEY
insertItem(k, o)
 We throw an exception if the table is full
 We start at cell h(k)
 We probe consecutive cells until one of the following occurs
  A cell i is found that is either empty or stores AVAILABLE, or
  N cells have been unsuccessfully probed
 We store item (k, o) in cell i
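Continuing the LinearProbingTable sketch above, the methods below add updates with an AVAILABLE sentinel (again illustrative names, not the text's interface). Once deletions occur, findElement should also skip cells holding AVAILABLE rather than compare their key.

    static final Entry AVAILABLE = new Entry(0, null);    // marks a deleted cell

    public Object removeElement(int k) {
        int i = h(k);
        for (int p = 0; p < N; p++) {
            Entry c = A[i];
            if (c == null) return NO_SUCH_KEY;             // key is not in the table
            if (c != AVAILABLE && c.key == k) {
                Object o = c.element;
                A[i] = AVAILABLE;                          // keep the probe chain intact
                return o;
            }
            i = (i + 1) % N;
        }
        return NO_SUCH_KEY;
    }

    public void insertItem(int k, Object o) {
        int i = h(k);
        for (int p = 0; p < N; p++) {
            if (A[i] == null || A[i] == AVAILABLE) {       // first empty or recyclable cell
                A[i] = new Entry(k, o);
                return;
            }
            i = (i + 1) % N;
        }
        throw new RuntimeException("table is full");       // N cells probed: table full
    }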
Dictionaries and Hash Tables 14
Double Hashing
Double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series
 (i + j d(k)) mod N   for j = 0, 1, ..., N - 1
where i = h(k) is the cell given by the primary hash function
The secondary hash function d(k) cannot have zero values
The table size N must be a prime to allow probing of all the cells
Common choice of compression map for the secondary hash function:
 d2(k) = q - (k mod q)
 where
  q < N
  q is a prime
The possible values for d2(k) are 1, 2, ..., q
Dictionaries and Hash Tables 15
Example of Double Hashing
Consider a hash table storing integer keys that handles collisions with double hashing
 N = 13
 h(k) = k mod 13
 d(k) = 7 - (k mod 7)
Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order
k    h(k)  d(k)  Probes
18    5     3     5
41    2     1     2
22    9     6     9
44    5     5     5 10
59    7     4     7
32    6     3     6
31    5     4     5 9 0
73    8     4     8

[Figure: resulting table with cells 0-12; cell 0: 31, cell 2: 41, cell 5: 18, cell 6: 32, cell 7: 59, cell 8: 73, cell 9: 22, cell 10: 44]
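A small Java sketch of the probe sequence used by double hashing, with the parameters of this example (N = 13, q = 7); the helper only reports which cells would be probed for a key, matching the table above.

    static final int N = 13, Q = 7;                          // table size and secondary prime q < N

    static int h(int k) { return Math.floorMod(k, N); }      // primary hash function
    static int d(int k) { return Q - Math.floorMod(k, Q); }  // secondary: values in 1..q, never 0

    // Cells (h(k) + j*d(k)) mod N for j = 0, 1, ..., N-1, in the order
    // an insertion would probe them until a free cell is found.
    static int[] probeSequence(int k) {
        int[] probes = new int[N];
        for (int j = 0; j < N; j++)
            probes[j] = Math.floorMod(h(k) + j * d(k), N);
        return probes;
    }

For instance, probeSequence(31) starts 5, 9, 0, matching the row for key 31 in the table.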
Dictionaries and Hash Tables 16
Performance of Hashing
In the worst case, searches, insertions and removals on a hash table take O(n) time
The worst case occurs when all the keys inserted into the dictionary collide
The load factor α = n/N affects the performance of a hash table
Assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is 1 / (1 - α); for example, at load factor α = 0.75 this is 4 probes
The expected running time of all the dictionary ADT operations in a hash table is O(1)
In practice, hashing is very fast provided the load factor is not close to 100%
Applications of hash tables: small databases, compilers, browser caches
Dictionaries and Hash Tables 17
Universal Hashing (2.5.6)
A family of hash functions is universal if, for any two distinct keys i and j with 0 ≤ i, j ≤ M - 1, the probability that h(i) = h(j) is at most 1/N when h is chosen uniformly at random from the family
Dictionaries and Hash Tables 19
Proof of Universality (Part 2)
If f causes no collisions, only g can make h cause collisions.
Fix a number x. Of the p integers y = f(k), different from x, the number such that g(y) = g(x) is at most ⌈p/N⌉ - 1
Since there are p choices for x, the number of h's that will cause a collision between j and k is at most
 p(⌈p/N⌉ - 1) ≤ p(p - 1)/N
There are p(p - 1) functions h. So the probability of collision is at most
 (p(p - 1)/N) / (p(p - 1)) = 1/N
Therefore, the set of possible h functions is universal.
Dictionaries 1
Dictionaries
[Figure: a binary search tree with root 6, children 2 and 9, and further nodes 1, 4, 8]
Dictionaries 2
Outline and Reading
Dictionary ADT (2.5.1)
Log file (2.5.1)
Binary search (3.1.1)
Lookup table (3.1.1)
Binary search tree (3.1.2)
 Search (3.1.3)
 Insertion (3.1.4)
 Deletion (3.1.5)
 Performance (3.1.6)
Dictionaries 3
Dictionary ADT
The dictionary ADT models a searchable collection of key-element items
The main operations of a dictionary are searching, inserting, and deleting items
Multiple items with the same key are allowed
Applications: address book, credit card authorization, mapping host names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)
Dictionary ADT methods:
 findElement(k): if the dictionary has an item with key k, returns its element, else returns the special element NO_SUCH_KEY
 insertItem(k, o): inserts item (k, o) into the dictionary
 removeElement(k): if the dictionary has an item with key k, removes it from the dictionary and returns its element, else returns the special element NO_SUCH_KEY
 size(), isEmpty()
 keys(), elements()
Dictionaries 4
Log File
A log file is a dictionary implemented by means of an unsorted sequence
 We store the items of the dictionary in a sequence (based on a doubly-linked list or a circular array), in arbitrary order
Performance:
 insertItem takes O(1) time since we can insert the new item at the beginning or at the end of the sequence
 findElement and removeElement take O(n) time since in the worst case (the item is not found) we traverse the entire sequence to look for an item with the given key
The log file is effective only for dictionaries of small size or for dictionaries on which insertions are the most common operations, while searches and removals are rarely performed (e.g., historical record of logins to a workstation)
Dictionaries 5
Binary Search
Binary search performs operation findElement(k) on a dictionary implemented by means of an array-based sequence, sorted by key
 similar to the high-low game
 at each step, the number of candidate items is halved
 terminates after a logarithmic number of steps
Example: findElement(7)
[Figure: binary search for key 7 in the sorted sequence 1 3 4 5 7 8 9 11 14 16 18 19; the indices l, m, h mark the low, middle and high candidates and converge until l = m = h at the cell storing 7]
Dictionaries 6
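A minimal iterative binary-search sketch in Java over a sorted int array, following the l, m, h narrowing shown in the example; it returns the index of the key, or -1 in place of NO_SUCH_KEY when the key is absent.

    // Binary search: O(log n) time, since the candidate range [l, h] halves at each step.
    static int findElement(int[] a, int k) {
        int l = 0, h = a.length - 1;
        while (l <= h) {
            int m = (l + h) / 2;            // middle candidate
            if (a[m] == k) return m;        // key found
            else if (a[m] < k) l = m + 1;   // discard the left half
            else h = m - 1;                 // discard the right half
        }
        return -1;                          // key not present
    }

For example, searching for 7 in the sequence above returns index 4, the cell storing 7.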
Lookup Table
A lookup table is a dictionary implemented by means of a sorted sequence
 We store the items of the dictionary in an array-based sequence, sorted by key
 We use an external comparator for the keys
Performance:
 findElement takes O(log n) time, using binary search
 insertItem takes O(n) time since in the worst case we have to shift n/2 items to make room for the new item
 removeElement takes O(n) time since in the worst case we have to shift n/2 items to compact the items after the removal
The lookup table is effective only for dictionaries of small size or for dictionaries on which searches are the most common operations, while insertions and removals are rarely performed (e.g., credit card authorizations)
Dictionaries 7
Binary Search Tree
A binary search tree is a binary tree storing keys (or key-element pairs) at its internal nodes and satisfying the following property:
 Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have key(u) ≤ key(v) ≤ key(w)
External nodes do not store items
An inorder traversal of a binary search tree visits the keys in increasing order
[Figure: a binary search tree with root 6, children 2 and 9, and further nodes 1, 4, 8]
Dictionaries 8
Search
To search for a key k, we trace a downward path starting at the root
The next node visited depends on the outcome of the comparison of k with the key of the current node
If we reach a leaf, the key is not found and we return NO_SUCH_KEY
Example: findElement(4)

Algorithm findElement(k, v)
  if T.isExternal(v)
    return NO_SUCH_KEY
  if k < key(v)
    return findElement(k, T.leftChild(v))
  else if k = key(v)
    return element(v)
  else { k > key(v) }
    return findElement(k, T.rightChild(v))
[Figure: the search for 4 follows the path 6 → 2 → 4, comparing k with the key at each node]
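A Java sketch of findElement on a plain node-based tree, where a null child plays the role of an external node and null is returned in place of NO_SUCH_KEY; the Node class and its fields are illustrative, not the text's interface.

    static class Node {
        int key; Object element;
        Node left, right;
        Node(int k, Object e) { key = k; element = e; }
    }

    // Search: follows one downward path from the root, so it runs in O(h) time.
    static Object findElement(Node v, int k) {
        if (v == null) return null;                    // reached an "external node": not found
        if (k < v.key) return findElement(v.left, k);  // continue in the left subtree
        else if (k == v.key) return v.element;         // key found at v
        else return findElement(v.right, k);           // continue in the right subtree
    }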
Dictionaries 9
Insertion
To perform operation insertItem(k, o), we search for key k
Assume k is not already in the tree, and let w be the leaf reached by the search
We insert k at node w and expand w into an internal node
Example: insert 5
[Figure: the unsuccessful search for 5 ends at external node w below the node storing 4; w is expanded into an internal node storing 5]
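Continuing the node-based sketch above, insertion searches for k and attaches a new node where the unsuccessful search ends; the recursive formulation below returns the (possibly new) subtree root, and is a sketch rather than the text's external-node version.

    // Insertion: O(h) time; the new node takes the place of the null link
    // reached by the unsuccessful search for k.
    static Node insertItem(Node v, int k, Object o) {
        if (v == null) return new Node(k, o);          // expand the external position
        if (k < v.key) v.left = insertItem(v.left, k, o);
        else v.right = insertItem(v.right, k, o);      // keys >= v.key go to the right subtree
        return v;
    }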
Dictionaries 10
Deletion
To perform operation removeElement(k), we search for key k
Assume key k is in the tree, and let v be the node storing k
If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w)
Example: remove 4
[Figure: node v stores 4 and has leaf child w; after removeAboveExternal(w), the key 5 takes the place of 4]
Dictionaries 11
Deletion (cont.)
We consider the case where the key k to be removed is stored at a node v whose children are both internal
 we find the internal node w that follows v in an inorder traversal
 we copy key(w) into node v
 we remove node w and its left child z (which must be a leaf) by means of operation removeAboveExternal(z)
Example: remove 3
[Figure: key 3 at node v is replaced by the key 5 of its inorder successor w; w and its external left child z are removed]
Dictionaries 12
Performance
Consider a dictionary with n items implemented by means of a binary search tree of height h
 the space used is O(n)
 methods findElement, insertItem and removeElement take O(h) time
The height h is O(n) in the worst case and O(log n) in the best case
Binary Search Trees 1
Binary Search Trees
[Figure: a binary search tree with root 6, children 2 and 9, and further nodes 1, 4, 8]
Binary Search Trees 2
Ordered Dictionaries
Keys are assumed to come from a total order. New operations:
 closestKeyBefore(k)
 closestElemBefore(k)
 closestKeyAfter(k)
 closestElemAfter(k)
Binary Search Trees 3
Binary Search (3.1.1)
Binary search performs operation findElement(k) on a dictionary implemented by means of an array-based sequence, sorted by key
 similar to the high-low game
 at each step, the number of candidate items is halved
 terminates after O(log n) steps
Example: findElement(7)
[Figure: binary search for key 7 in the sorted sequence 1 3 4 5 7 8 9 11 14 16 18 19; the indices l, m, h converge until l = m = h at the cell storing 7]
Binary Search Trees 4
Lookup Table (3.1.1)
A lookup table is a dictionary implemented by means of a sorted sequence
 We store the items of the dictionary in an array-based sequence, sorted by key
 We use an external comparator for the keys
Performance:
 findElement takes O(log n) time, using binary search
 insertItem takes O(n) time since in the worst case we have to shift n/2 items to make room for the new item
 removeElement takes O(n) time since in the worst case we have to shift n/2 items to compact the items after the removal
The lookup table is effective only for dictionaries of small size or for dictionaries on which searches are the most common operations, while insertions and removals are rarely performed (e.g., credit card authorizations)
Binary Search Trees 5
Binary Search Tree (3.1.2)
A binary search tree is a binary tree storing keys (or key-element pairs) at its internal nodes and satisfying the following property:
 Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have key(u) ≤ key(v) ≤ key(w)
External nodes do not store items
An inorder traversal of a binary search tree visits the keys in increasing order
[Figure: a binary search tree with root 6, children 2 and 9, and further nodes 1, 4, 8]
Binary Search Trees 6
Search (3.1.3)
To search for a key k, we trace a downward path starting at the root
The next node visited depends on the outcome of the comparison of k with the key of the current node
If we reach a leaf, the key is not found and we return NO_SUCH_KEY
Example: findElement(4)

Algorithm findElement(k, v)
  if T.isExternal(v)
    return NO_SUCH_KEY
  if k < key(v)
    return findElement(k, T.leftChild(v))
  else if k = key(v)
    return element(v)
  else { k > key(v) }
    return findElement(k, T.rightChild(v))
[Figure: the search for 4 follows the path 6 → 2 → 4, comparing k with the key at each node]
Binary Search Trees 7
Insertion (3.1.4)
To perform operation insertItem(k, o), we search for key k
Assume k is not already in the tree, and let w be the leaf reached by the search
We insert k at node w and expand w into an internal node
Example: insert 5
[Figure: the unsuccessful search for 5 ends at external node w below the node storing 4; w is expanded into an internal node storing 5]
Binary Search Trees 8
Deletion (3.1.5)
To perform operation removeElement(k), we search for key k
Assume key k is in the tree, and let v be the node storing k
If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w)
Example: remove 4
[Figure: node v stores 4 and has leaf child w; after removeAboveExternal(w), the key 5 takes the place of 4]
Binary Search Trees 9
Deletion (cont.)
We consider the case where the key k to be removed is stored at a node v whose children are both internal
 we find the internal node w that fol