Algorithm Design Slides

Transcript
  • Analysis of Algorithms

    [Figure: Input → Algorithm → Output]

    An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.

    Analysis of Algorithms 2

    Running Time (1.1)

    - Most algorithms transform input objects into output objects.
    - The running time of an algorithm typically grows with the input size.
    - Average-case time is often difficult to determine.
    - We focus on the worst-case running time.
      - Easier to analyze
      - Crucial to applications such as games, finance and robotics

    [Figure: running time versus input size (1000 to 4000), with best-case, average-case, and worst-case curves]

    Analysis of Algorithms 3

    Experimental Studies (1.6)

    - Write a program implementing the algorithm
    - Run the program with inputs of varying size and composition
    - Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time
    - Plot the results

    [Figure: plot of running time (ms) against input size]
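    The experimental procedure above can be sketched in Java roughly as follows. This is a minimal illustration, not a benchmarking harness: the class name TimingExperiment, the helper timeMillis, and the summation workload are assumptions made for the example, and a single run per input size ignores JIT warm-up and timer resolution.

```java
class TimingExperiment {
    // Elapsed wall-clock milliseconds for one run of the task.
    static long timeMillis(Runnable task) {
        long start = System.currentTimeMillis();
        task.run();
        return System.currentTimeMillis() - start;
    }

    // Hypothetical workload being measured: touches every element once.
    static long sum(int[] a) {
        long total = 0;
        for (int value : a) total += value;
        return total;
    }

    public static void main(String[] args) {
        // Run the program with inputs of varying size and print one row per size.
        for (int n = 1_000_000; n <= 16_000_000; n *= 4) {
            int[] input = new int[n];
            for (int i = 0; i < n; i++) input[i] = i;
            long[] result = new long[1];
            long ms = timeMillis(() -> result[0] = sum(input));
            System.out.println(n + "\t" + ms + " ms\t(sum " + result[0] + ")");
        }
    }
}
```

    The printed rows can then be plotted as time against input size.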

    Analysis of Algorithms 4

    Limitations of Experiments

    - It is necessary to implement the algorithm, which may be difficult
    - Results may not be indicative of the running time on other inputs not included in the experiment
    - In order to compare two algorithms, the same hardware and software environments must be used

    Analysis of Algorithms 5

    Theoretical Analysis

    - Uses a high-level description of the algorithm instead of an implementation
    - Characterizes running time as a function of the input size, n
    - Takes into account all possible inputs
    - Allows us to evaluate the speed of an algorithm independent of the hardware/software environment

    Analysis of Algorithms 6

    Pseudocode (1.1)

    - High-level description of an algorithm
    - More structured than English prose
    - Less detailed than a program
    - Preferred notation for describing algorithms
    - Hides program design issues

    Example: find the maximum element of an array

    Algorithm arrayMax(A, n)
      Input array A of n integers
      Output maximum element of A

      currentMax ← A[0]
      for i ← 1 to n - 1 do
        if A[i] > currentMax then
          currentMax ← A[i]
      return currentMax

  • Analysis of Algorithms 7

    Pseudocode Details

    - Control flow
      - if … then … [else …]
      - while … do …
      - repeat … until …
      - for … do …
      - Indentation replaces braces
    - Method declaration
        Algorithm method(arg [, arg…])
          Input …
          Output …
    - Method call
        var.method(arg [, arg…])
    - Return value
        return expression
    - Expressions
      - ← Assignment (like = in Java)
      - = Equality testing (like == in Java)
      - n^2 Superscripts and other mathematical formatting allowed

    Analysis of Algorithms 8

    The Random Access Machine (RAM) Model

    - A CPU
    - A potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character
    - Memory cells are numbered, and accessing any cell in memory takes unit time

    [Figure: a CPU connected to memory cells numbered 0, 1, 2, …]

    Analysis of Algorithms 9

    Primitive Operations

    - Basic computations performed by an algorithm
    - Identifiable in pseudocode
    - Largely independent from the programming language
    - Exact definition not important (we will see why later)
    - Assumed to take a constant amount of time in the RAM model
    - Examples:
      - Evaluating an expression
      - Assigning a value to a variable
      - Indexing into an array
      - Calling a method
      - Returning from a method

    Analysis of Algorithms 10

    Counting Primitive Operations (1.1)

    - By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size

    Algorithm arrayMax(A, n)              # operations
      currentMax ← A[0]                   2
      for i ← 1 to n - 1 do               2 + n
        if A[i] > currentMax then         2(n - 1)
          currentMax ← A[i]               2(n - 1)
        { increment counter i }           2(n - 1)
      return currentMax                   1
                                  Total   7n - 1
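    The count in the table can be reproduced by instrumenting a Java version of arrayMax. The sketch below is an illustration only: the class name ArrayMaxOps and the static tally field ops are assumptions, and the per-line costs are simply the ones the slide assigns (the loop header is booked as a lump of 2 + n).

```java
class ArrayMaxOps {
    static long ops; // primitive-operation tally for the most recent call

    static int arrayMax(int[] a, int n) {
        ops = 2;                       // index A[0] and assign currentMax
        int currentMax = a[0];
        ops += 2 + n;                  // loop header, counted as 2 + n on the slide
        for (int i = 1; i <= n - 1; i++) {
            ops += 2;                  // index A[i] and compare
            if (a[i] > currentMax) {
                ops += 2;              // index A[i] and assign
                currentMax = a[i];
            }
            ops += 2;                  // increment counter i
        }
        ops += 1;                      // return
        return currentMax;
    }
}
```

    A strictly increasing array triggers the assignment on every iteration, so the tally reaches the worst case 7n - 1; an array whose first element is the maximum skips every assignment and tallies 5n + 1.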

    Analysis of Algorithms 11

    Estimating Running Time

    - Algorithm arrayMax executes 7n - 1 primitive operations in the worst case. Define:
        a = Time taken by the fastest primitive operation
        b = Time taken by the slowest primitive operation
    - Let T(n) be the worst-case time of arrayMax. Then
        a(7n - 1) ≤ T(n) ≤ b(7n - 1)
    - Hence, the running time T(n) is bounded by two linear functions

    Analysis of Algorithms 12

    Growth Rate of Running Time

    - Changing the hardware/software environment
      - Affects T(n) by a constant factor, but
      - Does not alter the growth rate of T(n)
    - The linear growth rate of the running time T(n) is an intrinsic property of algorithm arrayMax

  • Analysis of Algorithms 13

    Growth Rates

    - Growth rates of functions:
      - Linear ≈ n
      - Quadratic ≈ n^2
      - Cubic ≈ n^3
    - In a log-log chart, the slope of the line corresponds to the growth rate of the function

    [Figure: log-log chart of T(n) against n (1E+0 to 1E+10) with Cubic, Quadratic, and Linear lines]

    Analysis of Algorithms 14

    Constant Factors

    - The growth rate is not affected by
      - constant factors or
      - lower-order terms
    - Examples
      - 10^2 n + 10^5 is a linear function
      - 10^5 n^2 + 10^8 n is a quadratic function

    [Figure: log-log chart of T(n) against n (1E+0 to 1E+10) with two Quadratic lines and two Linear lines]

    Analysis of Algorithms 15

    Big-Oh Notation (1.2)

    - Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that
        f(n) ≤ c g(n) for n ≥ n0
    - Example: 2n + 10 is O(n)
      - 2n + 10 ≤ cn
      - (c - 2) n ≥ 10
      - n ≥ 10/(c - 2)
      - Pick c = 3 and n0 = 10

    [Figure: log-log chart of 3n, 2n+10, and n for n from 1 to 1,000]

    Analysis of Algorithms 16

    Big-Oh Example

    - Example: the function n^2 is not O(n)
      - n^2 ≤ cn
      - n ≤ c
      - The above inequality cannot be satisfied since c must be a constant

    [Figure: log-log chart of n^2, 100n, 10n, and n for n from 1 to 1,000]

    Analysis of Algorithms 17

    More Big-Oh Examples

    - 7n - 2
        7n - 2 is O(n)
        need c > 0 and n0 ≥ 1 such that 7n - 2 ≤ c n for n ≥ n0
        this is true for c = 7 and n0 = 1
    - 3n^3 + 20n^2 + 5
        3n^3 + 20n^2 + 5 is O(n^3)
        need c > 0 and n0 ≥ 1 such that 3n^3 + 20n^2 + 5 ≤ c n^3 for n ≥ n0
        this is true for c = 4 and n0 = 21
    - 3 log n + log log n
        3 log n + log log n is O(log n)
        need c > 0 and n0 ≥ 1 such that 3 log n + log log n ≤ c log n for n ≥ n0
        this is true for c = 4 and n0 = 2
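    The witnesses c and n0 chosen above can be spot-checked numerically. The following sketch tests the inequality f(n) ≤ c g(n) over a finite range only, so it can refute a bad choice of constants but does not replace the algebraic argument; the class name BigOhCheck and the method holds are hypothetical.

```java
import java.util.function.LongUnaryOperator;

class BigOhCheck {
    // True if f(n) <= c * g(n) for every integer n with n0 <= n <= upTo.
    static boolean holds(LongUnaryOperator f, LongUnaryOperator g,
                         long c, long n0, long upTo) {
        for (long n = n0; n <= upTo; n++) {
            if (f.applyAsLong(n) > c * g.applyAsLong(n)) {
                return false;
            }
        }
        return true;
    }
}
```

    For the cubic example, the check passes with c = 4 from n0 = 21 but fails if the range starts at 20, since 3(20)^3 + 20(20)^2 + 5 = 32005 exceeds 4(20)^3 = 32000.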

    Analysis of Algorithms 18

    Big-Oh and Growth Rate

    - The big-Oh notation gives an upper bound on the growth rate of a function
    - The statement "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the growth rate of g(n)
    - We can use the big-Oh notation to rank functions according to their growth rate

                         f(n) is O(g(n))    g(n) is O(f(n))
    g(n) grows more      Yes                No
    f(n) grows more      No                 Yes
    Same growth          Yes                Yes

  • Analysis of Algorithms 19

    Big-Oh Rules

    - If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e.,
      - Drop lower-order terms
      - Drop constant factors
    - Use the smallest possible class of functions
      - Say "2n is O(n)" instead of "2n is O(n^2)"
    - Use the simplest expression of the class
      - Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"

    Analysis of Algorithms 20

    Asymptotic Algorithm Analysis

    - The asymptotic analysis of an algorithm determines the running time in big-Oh notation
    - To perform the asymptotic analysis
      - We find the worst-case number of primitive operations executed as a function of the input size
      - We express this function with big-Oh notation
    - Example:
      - We determine that algorithm arrayMax executes at most 7n - 1 primitive operations
      - We say that algorithm arrayMax "runs in O(n) time"
    - Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations

    Analysis of Algorithms 21

    Computing Prefix Averages

    - We further illustrate asymptotic analysis with two algorithms for prefix averages
    - The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
        A[i] = (X[0] + X[1] + … + X[i]) / (i + 1)
    - Computing the array A of prefix averages of another array X has applications to financial analysis

    [Figure: bar chart of an array X and its prefix averages A for positions 1 to 7]

    Analysis of Algorithms 22

    Prefix Averages (Quadratic)

    - The following algorithm computes prefix averages in quadratic time by applying the definition

    Algorithm prefixAverages1(X, n)
      Input array X of n integers
      Output array A of prefix averages of X      # operations
      A ← new array of n integers                 n
      for i ← 0 to n - 1 do                       n
        s ← X[0]                                  n
        for j ← 1 to i do                         1 + 2 + … + (n - 1)
          s ← s + X[j]                            1 + 2 + … + (n - 1)
        A[i] ← s / (i + 1)                        n
      return A                                    1

    Analysis of Algorithms 23

    Arithmetic Progression

    - The running time of prefixAverages1 is O(1 + 2 + … + n)
    - The sum of the first n integers is n(n + 1) / 2
      - There is a simple visual proof of this fact
    - Thus, algorithm prefixAverages1 runs in O(n^2) time

    [Figure: staircase of bars of heights 1 to 6 illustrating the visual proof]

    Analysis of Algorithms 24

    Prefix Averages (Linear)

    - The following algorithm computes prefix averages in linear time by keeping a running sum

    Algorithm prefixAverages2(X, n)
      Input array X of n integers
      Output array A of prefix averages of X      # operations
      A ← new array of n integers                 n
      s ← 0                                       1
      for i ← 0 to n - 1 do                       n
        s ← s + X[i]                              n
        A[i] ← s / (i + 1)                        n
      return A                                    1

    - Algorithm prefixAverages2 runs in O(n) time
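    Both prefix-average algorithms translate to Java almost line for line. The sketch below is one possible rendering, with two assumptions that go beyond the slides: the class name PrefixAverages, and a double[] result (the slides allocate an integer array) so that the averages are not truncated by integer division.

```java
class PrefixAverages {
    // Quadratic version: recomputes every prefix sum from scratch.
    static double[] prefixAverages1(int[] x) {
        int n = x.length;
        double[] a = new double[n];
        for (int i = 0; i < n; i++) {
            long s = x[0];
            for (int j = 1; j <= i; j++) {
                s += x[j];
            }
            a[i] = (double) s / (i + 1);
        }
        return a;
    }

    // Linear version: keeps a running sum across iterations.
    static double[] prefixAverages2(int[] x) {
        int n = x.length;
        double[] a = new double[n];
        long s = 0;
        for (int i = 0; i < n; i++) {
            s += x[i];
            a[i] = (double) s / (i + 1);
        }
        return a;
    }
}
```

    The two methods return identical arrays; only the amount of repeated summation differs.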

  • Analysis of Algorithms 25

    Math you need to Review

    - Summations (Sec. 1.3.1)
    - Logarithms and Exponents (Sec. 1.3.2)
    - Proof techniques (Sec. 1.3.3)
    - Basic probability (Sec. 1.3.4)

    - properties of logarithms:
        log_b(xy) = log_b x + log_b y
        log_b(x/y) = log_b x - log_b y
        log_b(x^a) = a log_b x
        log_b a = log_x a / log_x b
    - properties of exponentials:
        a^(b+c) = a^b a^c
        a^(bc) = (a^b)^c
        a^b / a^c = a^(b-c)
        b = a^(log_a b)
        b^c = a^(c log_a b)

    Analysis of Algorithms 26

    Relatives of Big-Oh

    - big-Omega
      - f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c g(n) for n ≥ n0
    - big-Theta
      - f(n) is Θ(g(n)) if there are constants c' > 0 and c'' > 0 and an integer constant n0 ≥ 1 such that c' g(n) ≤ f(n) ≤ c'' g(n) for n ≥ n0
    - little-oh
      - f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≤ c g(n) for n ≥ n0
    - little-omega
      - f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≥ c g(n) for n ≥ n0

    Analysis of Algorithms 27

    Intuition for Asymptotic Notation

    - Big-Oh
      - f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
    - big-Omega
      - f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
    - big-Theta
      - f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
    - little-oh
      - f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
    - little-omega
      - f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n)

    Analysis of Algorithms 28

    Example Uses of the Relatives of Big-Oh

    - 5n^2 is Ω(n^2)
        f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c g(n) for n ≥ n0
        let c = 5 and n0 = 1
    - 5n^2 is Ω(n)
        f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c g(n) for n ≥ n0
        let c = 1 and n0 = 1
    - 5n^2 is ω(n)
        f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≥ c g(n) for n ≥ n0
        need 5 n0^2 ≥ c n0; given c, the n0 that satisfies this is n0 ≥ c/5 ≥ 0

  • Elementary Data Structures

    Stacks, Queues, & Lists
    Amortized analysis
    Trees

    Elementary Data Structures 2

    The Stack ADT (2.1.1)

    - The Stack ADT stores arbitrary objects
    - Insertions and deletions follow the last-in first-out scheme
    - Think of a spring-loaded plate dispenser
    - Main stack operations:
      - push(object): inserts an element
      - object pop(): removes and returns the last inserted element
    - Auxiliary stack operations:
      - object top(): returns the last inserted element without removing it
      - integer size(): returns the number of elements stored
      - boolean isEmpty(): indicates whether no elements are stored

    Elementary Data Structures 3

    Applications of Stacks

    - Direct applications
      - Page-visited history in a Web browser
      - Undo sequence in a text editor
      - Chain of method calls in the Java Virtual Machine or C++ runtime environment
    - Indirect applications
      - Auxiliary data structure for algorithms
      - Component of other data structures

    Elementary Data Structures 4

    Array-based Stack (2.1.1)

    - A simple way of implementing the Stack ADT uses an array
    - We add elements from left to right
    - A variable t keeps track of the index of the top element (size is t + 1)

    [Figure: array S with cells 0, 1, 2, …, t]

    Algorithm pop():
      if isEmpty() then
        throw EmptyStackException
      else
        t ← t - 1
        return S[t + 1]

    Algorithm push(o)
      if t = S.length - 1 then
        throw FullStackException
      else
        t ← t + 1
        S[t] ← o

    Elementary Data Structures 5

    Growable Array-based Stack (1.5)

    - In a push operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
    - How large should the new array be?
      - incremental strategy: increase the size by a constant c
      - doubling strategy: double the size

    Algorithm push(o)
      if t = S.length - 1 then
        A ← new array of size …
        for i ← 0 to t do
          A[i] ← S[i]
        S ← A
      t ← t + 1
      S[t] ← o
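    As an illustration of the growable push, here is a small Java sketch that fills in the new array size with the doubling strategy. The class name GrowableArrayStack and the capacity() accessor are assumptions for the example, and java.util.EmptyStackException stands in for the exception class the slides define themselves.

```java
import java.util.EmptyStackException;

class GrowableArrayStack {
    private Object[] s = new Object[1]; // start with an array of size 1
    private int t = -1;                 // index of the top element

    int size() { return t + 1; }

    boolean isEmpty() { return t < 0; }

    int capacity() { return s.length; }

    void push(Object o) {
        if (t == s.length - 1) {
            // Array is full: copy into an array twice as large.
            Object[] a = new Object[2 * s.length];
            for (int i = 0; i <= t; i++) {
                a[i] = s[i];
            }
            s = a;
        }
        t = t + 1;
        s[t] = o;
    }

    Object pop() {
        if (isEmpty()) {
            throw new EmptyStackException();
        }
        Object top = s[t];
        s[t] = null; // drop the reference so it can be garbage collected
        t = t - 1;
        return top;
    }
}
```

    Switching to the incremental strategy would only change the allocation line to `new Object[s.length + c]` for some constant c.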

    Elementary Data Structures 6

    Comparison of the Strategies

    - We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of n push operations
    - We assume that we start with an empty stack represented by an array of size 1
    - We call amortized time of a push operation the average time taken by a push over the series of operations, i.e., T(n)/n

  • Elementary Data Structures 7

    Analysis of the Incremental Strategy

    - We replace the array k = n/c times
    - The total time T(n) of a series of n push operations is proportional to
        n + c + 2c + 3c + 4c + … + kc
        = n + c(1 + 2 + 3 + … + k)
        = n + ck(k + 1)/2
    - Since c is a constant, T(n) is O(n + k^2), i.e., O(n^2)
    - The amortized time of a push operation is O(n)

    Elementary Data Structures 8

    Direct Analysis of the Doubling Strategy

    - We replace the array k = log2 n times
    - The total time T(n) of a series of n push operations is proportional to
        n + 1 + 2 + 4 + 8 + … + 2^k
        = n + 2^(k+1) - 1
        = 3n - 1        (since 2^k = n)
    - T(n) is O(n)
    - The amortized time of a push operation is O(1)

    [Figure: geometric series 1 + 2 + 4 + 8 drawn as segments that each double the previous one]
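    The difference between the two strategies can also be observed by counting element copies in a simulation. The sketch below assumes, as the slides do, an initial array of size 1; the class and method names are hypothetical, and only copies are counted (not the n stores of the pushes themselves).

```java
class GrowthStrategies {
    // Copies made while pushing n items when a full array grows by a constant c.
    static long copiesIncremental(int n, int c) {
        long copies = 0;
        int capacity = 1;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {   // array is full before this push
                copies += size;       // every stored element is copied
                capacity += c;
            }
        }
        return copies;
    }

    // Copies made while pushing n items when a full array doubles in size.
    static long copiesDoubling(int n) {
        long copies = 0;
        int capacity = 1;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;
                capacity *= 2;
            }
        }
        return copies;
    }
}
```

    For 8 pushes, doubling copies 1 + 2 + 4 = 7 elements, while growing by c = 1 copies 1 + 2 + … + 7 = 28; the gap widens quadratically as n grows.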

    Elementary Data Structures 9

    Accounting Method Analysis of the Doubling Strategy

    - The accounting method determines the amortized running time with a system of credits and debits
    - We view a computer as a coin-operated device requiring 1 cyber-dollar for a constant amount of computing
    - We set up a scheme for charging operations. This is known as an amortization scheme.
    - The scheme must always give us enough money to pay for the actual cost of the operation
    - The total cost of the series of operations is no more than the total amount charged
    - (amortized time) ≤ (total $ charged) / (# operations)

    Elementary Data Structures 10

    Amortization Scheme for the Doubling Strategy

    - Consider again the k phases, where each phase consists of twice as many pushes as the one before
    - At the end of a phase we must have saved enough to pay for the array-growing push of the next phase
    - At the end of phase i we want to have saved i cyber-dollars, to pay for the array growth at the beginning of the next phase

    [Figure: an array of 8 cells (0 to 7) with cyber-dollars stored in its second half, and the array of 16 cells (0 to 15) that replaces it]

    - We charge $3 for a push. The $2 saved for a regular push are "stored" in the second half of the array. Thus, we will have 2(i/2) = i cyber-dollars saved at the end of phase i.
    - Therefore, each push runs in O(1) amortized time; n pushes run in O(n) time.

    Elementary Data Structures 11

    The Queue ADT (2.1.2)

    - The Queue ADT stores arbitrary objects
    - Insertions and deletions follow the first-in first-out scheme
    - Insertions are at the rear of the queue and removals are at the front of the queue
    - Main queue operations:
      - enqueue(object): inserts an element at the end of the queue
      - object dequeue(): removes and returns the element at the front of the queue
    - Auxiliary queue operations:
      - object front(): returns the element at the front without removing it
      - integer size(): returns the number of elements stored
      - boolean isEmpty(): indicates whether no elements are stored
    - Exceptions
      - Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException

    Elementary Data Structures 12

    Applications of Queues

    - Direct applications
      - Waiting lines
      - Access to shared resources (e.g., printer)
      - Multiprogramming
    - Indirect applications
      - Auxiliary data structure for algorithms
      - Component of other data structures

  • Elementary Data Structures 13

    Singly Linked List

    - A singly linked list is a concrete data structure consisting of a sequence of nodes
    - Each node stores
      - element
      - link to the next node

    [Figure: a node with fields elem and next, and a list of four nodes holding A, B, C, D]

    Elementary Data Structures 14

    Queue with a Singly Linked List

    - We can implement a queue with a singly linked list
      - The front element is stored at the first node
      - The rear element is stored at the last node
    - The space used is O(n) and each operation of the Queue ADT takes O(1) time

    [Figure: linked nodes and their elements, with reference f to the front node and r to the rear node]
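    A queue backed by a singly linked list might look like the following in Java. This is a sketch under a few assumptions: the class name LinkedQueue and the field names head and tail (the f and r of the figure) are chosen for the example, and java.util.NoSuchElementException stands in for the EmptyQueueException of the Queue ADT.

```java
import java.util.NoSuchElementException;

class LinkedQueue<E> {
    private static final class Node<E> {
        final E elem;
        Node<E> next;
        Node(E elem) { this.elem = elem; }
    }

    private Node<E> head; // first node: front of the queue
    private Node<E> tail; // last node: rear of the queue
    private int size;

    int size() { return size; }

    boolean isEmpty() { return size == 0; }

    // O(1): link a new node after the current rear.
    void enqueue(E e) {
        Node<E> node = new Node<>(e);
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    // O(1): unlink the first node.
    E dequeue() {
        if (head == null) throw new NoSuchElementException("empty queue");
        E e = head.elem;
        head = head.next;
        if (head == null) tail = null;
        size--;
        return e;
    }

    E front() {
        if (head == null) throw new NoSuchElementException("empty queue");
        return head.elem;
    }
}
```

    Keeping a reference to the last node is what makes enqueue constant time; without it, each insertion would have to walk the whole list.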

    Elementary Data Structures 15

    List ADT (2.2.2)

    - The List ADT models a sequence of positions storing arbitrary objects
    - It allows for insertion and removal in the "middle"
    - Query methods:
      - isFirst(p), isLast(p)
    - Accessor methods:
      - first(), last()
      - before(p), after(p)
    - Update methods:
      - replaceElement(p, o), swapElements(p, q)
      - insertBefore(p, o), insertAfter(p, o)
      - insertFirst(o), insertLast(o)
      - remove(p)

    Elementary Data Structures 16

    Doubly Linked List

    - A doubly linked list provides a natural implementation of the List ADT
    - Nodes implement Position and store:
      - element
      - link to the previous node
      - link to the next node
    - Special trailer and header nodes

    [Figure: a node with fields prev, elem, and next, and a list of nodes/positions and their elements between the header and trailer nodes]
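    The header and trailer sentinels make insertion and removal uniform, which the following Java sketch tries to show. It covers only a subset of the List ADT; the class name DoublyLinkedList, the use of Node directly as the position type, and the toList() helper are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class DoublyLinkedList<E> {
    static final class Node<E> {
        E elem;
        Node<E> prev, next;
        Node(E elem, Node<E> prev, Node<E> next) {
            this.elem = elem;
            this.prev = prev;
            this.next = next;
        }
    }

    private final Node<E> header = new Node<>(null, null, null);
    private final Node<E> trailer = new Node<>(null, header, null);
    private int size;

    DoublyLinkedList() { header.next = trailer; }

    int size() { return size; }

    // Splice a new node between p and its successor; no special cases needed.
    Node<E> insertAfter(Node<E> p, E e) {
        Node<E> node = new Node<>(e, p, p.next);
        p.next.prev = node;
        p.next = node;
        size++;
        return node;
    }

    Node<E> insertFirst(E e) { return insertAfter(header, e); }

    Node<E> insertLast(E e) { return insertAfter(trailer.prev, e); }

    // Unlink p by joining its neighbours.
    E remove(Node<E> p) {
        p.prev.next = p.next;
        p.next.prev = p.prev;
        p.prev = null;
        p.next = null;
        size--;
        return p.elem;
    }

    // Elements from first to last, for inspection.
    List<E> toList() {
        List<E> out = new ArrayList<>();
        for (Node<E> n = header.next; n != trailer; n = n.next) out.add(n.elem);
        return out;
    }
}
```

    Because every real node always has a predecessor and a successor, each update touches a constant number of links.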

    Elementary Data Structures 17

    Trees (2.3)

    - In computer science, a tree is an abstract model of a hierarchical structure
    - A tree consists of nodes with a parent-child relation
    - Applications:
      - Organization charts
      - File systems
      - Programming environments

    [Figure: organization chart rooted at Computers"R"Us, with nodes Sales, R&D, Manufacturing, Laptops, Desktops, US, International, Europe, Asia, Canada]

    Elementary Data Structures 18

    Tree ADT (2.3.1)

    - We use positions to abstract nodes
    - Generic methods:
      - integer size()
      - boolean isEmpty()
      - objectIterator elements()
      - positionIterator positions()
    - Accessor methods:
      - position root()
      - position parent(p)
      - positionIterator children(p)
    - Query methods:
      - boolean isInternal(p)
      - boolean isExternal(p)
      - boolean isRoot(p)
    - Update methods:
      - swapElements(p, q)
      - object replaceElement(p, o)
    - Additional update methods may be defined by data structures implementing the Tree ADT

  • Elementary Data Structures 19

    Preorder Traversal (2.3.2)

    - A traversal visits the nodes of a tree in a systematic manner
    - In a preorder traversal, a node is visited before its descendants
    - Application: print a structured document

    Algorithm preOrder(v)
      visit(v)
      for each child w of v
        preOrder(w)

    [Figure: document tree "Make Money Fast!" with children 1. Motivations (1.1 Greed, 1.2 Avidity), 2. Methods (2.1 Stock Fraud, 2.2 Ponzi Scheme, 2.3 Bank Robbery), and References; nodes are numbered 1 to 9 in preorder]
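    A preorder traversal that collects section titles in document order can be sketched in Java as below. The class PreorderDemo, its nested Node type with an ordered list of children, and the choice to record visits in a list are all assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

class PreorderDemo {
    static final class Node {
        final String title;
        final List<Node> children = new ArrayList<>();

        Node(String title) { this.title = title; }

        // Appends a child and returns this node so calls can be chained.
        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Visits v first, then each subtree from left to right.
    static void preOrder(Node v, List<String> out) {
        out.add(v.title);                 // visit(v)
        for (Node w : v.children) {
            preOrder(w, out);
        }
    }
}
```

    Applied to a document tree, the collected titles come out in the same order as a table of contents.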

    Elementary Data Structures 20

    Postorder Traversal (2.3.2)

    - In a postorder traversal, a node is visited after its descendants
    - Application: compute space used by files in a directory and its subdirectories

    Algorithm postOrder(v)
      for each child w of v
        postOrder(w)
      visit(v)

    [Figure: directory tree cs16/ containing homeworks/ (h1c.doc 3K, h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K, Robot.java 20K), and todo.txt 1K; nodes are numbered 1 to 9 in postorder]
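    The disk-usage application is a postorder computation: a directory's total is known only after its children have been processed. The Java sketch below assumes a hypothetical Node type carrying a name and its own size in kilobytes; counting directories themselves as 0K is an assumption of the example, not something stated on the slide.

```java
import java.util.ArrayList;
import java.util.List;

class DiskUsage {
    static final class Node {
        final String name;
        final int sizeKb; // size of this entry alone, excluding children
        final List<Node> children = new ArrayList<>();

        Node(String name, int sizeKb) {
            this.name = name;
            this.sizeKb = sizeKb;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Postorder: total every subtree first, then account for v itself.
    static int totalKb(Node v) {
        int total = 0;
        for (Node w : v.children) {
            total += totalKb(w);
        }
        return total + v.sizeKb; // "visit" v after its descendants
    }
}
```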

    Elementary Data Structures 21

    Amortized Analysis of Tree Traversal

    - Time taken in preorder or postorder traversal of an n-node tree is proportional to the sum, taken over each node v in the tree, of the time needed for the recursive call for v
      - The call for v costs $(c_v + 1), where c_v is the number of children of v
      - For the call for v, charge one cyber-dollar to v and charge one cyber-dollar to each child of v
      - Each node (except the root) gets charged twice: once for its own call and once for its parent's call
      - Therefore, traversal time is O(n)

    Elementary Data Structures 22

    Binary Trees (2.3.3)

    - A binary tree is a tree with the following properties:
      - Each internal node has two children
      - The children of a node are an ordered pair
    - We call the children of an internal node left child and right child
    - Alternative recursive definition: a binary tree is either
      - a tree consisting of a single node, or
      - a tree whose root has an ordered pair of children, each of which is a binary tree
    - Applications:
      - arithmetic expressions
      - decision processes
      - searching

    [Figure: binary tree with nodes A, B, C, D, E, F, G, H, I]

    Elementary Data Structures 23

    Arithmetic Expression Tree

    - Binary tree associated with an arithmetic expression
      - internal nodes: operators
      - external nodes: operands
    - Example: arithmetic expression tree for the expression (2 × (a − 1) + (3 × b))

    [Figure: expression tree with root +, left subtree 2 × (a − 1), and right subtree 3 × b]

    Elementary Data Structures 24

    Decision Tree

    - Binary tree associated with a decision process
      - internal nodes: questions with yes/no answer
      - external nodes: decisions
    - Example: dining decision

    [Figure: decision tree rooted at "Want a fast meal?"; Yes leads to "How about coffee?" and No leads to "On expense account?"; the decisions at the leaves, from left to right, are Starbucks, In 'N Out, Antoine's, Denny's]

  • Elementary Data Structures 25

    Properties of Binary Trees

    - Notation
        n  number of nodes
        e  number of external nodes
        i  number of internal nodes
        h  height
    - Properties:
      - e = i + 1
      - n = 2e − 1
      - h ≤ i
      - h ≤ (n − 1)/2
      - e ≤ 2^h
      - h ≥ log2 e
      - h ≥ log2 (n + 1) − 1

    Elementary Data Structures 26

    Inorder Traversal

    - In an inorder traversal a node is visited after its left subtree and before its right subtree
    - Application: draw a binary tree
      - x(v) = inorder rank of v
      - y(v) = depth of v

    Algorithm inOrder(v)
      if isInternal(v)
        inOrder(leftChild(v))
      visit(v)
      if isInternal(v)
        inOrder(rightChild(v))

    [Figure: binary tree whose nodes are numbered 1 to 9 by inorder rank]

    Elementary Data Structures 27

    Euler Tour Traversal

    - Generic traversal of a binary tree
    - Includes as special cases the preorder, postorder and inorder traversals
    - Walk around the tree and visit each node three times:
      - on the left (preorder)
      - from below (inorder)
      - on the right (postorder)

    [Figure: Euler tour around an expression tree, with the three visits to a node marked L (left), B (below), and R (right)]

    Elementary Data Structures 28

    Printing Arithmetic Expressions

    - Specialization of an inorder traversal
      - print operand or operator when visiting node
      - print "(" before traversing left subtree
      - print ")" after traversing right subtree

    Algorithm printExpression(v)
      if isInternal(v)
        print("(")
        printExpression(leftChild(v))
      print(v.element())
      if isInternal(v)
        printExpression(rightChild(v))
        print(")")

    [Figure: expression tree with root +, left subtree 2 × (a − 1), and right subtree 3 × b, printed as ((2 × (a − 1)) + (3 × b))]
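    The printing algorithm can be tried out with a small Java sketch. The class ExpressionTree, its Node type (a node is internal when it has children), and the decision to build a String instead of printing directly are assumptions for the example; ASCII * and - are used for the operators.

```java
class ExpressionTree {
    static final class Node {
        final String elem;
        final Node left, right;

        Node(String elem) { this(elem, null, null); }              // operand
        Node(String elem, Node left, Node right) {                 // operator
            this.elem = elem;
            this.left = left;
            this.right = right;
        }

        boolean isInternal() { return left != null; }
    }

    static String printExpression(Node v) {
        StringBuilder sb = new StringBuilder();
        build(v, sb);
        return sb.toString();
    }

    // Inorder walk that wraps every operator's subtree in parentheses.
    private static void build(Node v, StringBuilder sb) {
        if (v.isInternal()) {
            sb.append('(');
            build(v.left, sb);
            sb.append(' ').append(v.elem).append(' ');
            build(v.right, sb);
            sb.append(')');
        } else {
            sb.append(v.elem);
        }
    }
}
```

    Every internal node contributes exactly one pair of parentheses, so the output is fully parenthesized regardless of operator precedence.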

    Elementary Data Structures 29

    Linked Data Structure for Representing Trees (2.3.4)

    - A node is represented by an object storing
      - Element
      - Parent node
      - Sequence of children nodes
    - Node objects implement the Position ADT

    [Figure: a tree with nodes A to F and the linked node objects that represent it]

    Elementary Data Structures 30

    Linked Data Structure for Binary Trees

    - A node is represented by an object storing
      - Element
      - Parent node
      - Left child node
      - Right child node
    - Node objects implement the Position ADT

    [Figure: a binary tree with nodes A to E and the linked node objects that represent it]

  • Elementary Data Structures 31

    Array-Based Representation of Binary Trees

    - Nodes are stored in an array
    - Let rank(node) be defined as follows:
      - rank(root) = 1
      - if node is the left child of parent(node), rank(node) = 2*rank(parent(node))
      - if node is the right child of parent(node), rank(node) = 2*rank(parent(node)) + 1

    [Figure: binary tree with nodes A to J stored at ranks 1, 2, 3, 4, 5, 6, 7, 10, 11]
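    The rank rules reduce tree navigation to integer arithmetic on array indices. A minimal Java sketch, assuming a hypothetical helper class ArrayBinaryTree and an array whose cell 0 is left unused so that the root sits at rank 1:

```java
class ArrayBinaryTree {
    static final int ROOT = 1;

    static int leftChild(int rank) { return 2 * rank; }

    static int rightChild(int rank) { return 2 * rank + 1; }

    // Integer division undoes both child rules: (2r)/2 = r and (2r+1)/2 = r.
    static int parent(int rank) { return rank / 2; }

    // A node exists if its rank is inside the array and its cell is occupied.
    static boolean exists(Object[] nodes, int rank) {
        return rank >= ROOT && rank < nodes.length && nodes[rank] != null;
    }
}
```

    Ranks that correspond to missing nodes stay empty in the array, which is why the figure skips ranks 8 and 9.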

  • Stacks

    Stacks 2

    Outline and Reading

    - The Stack ADT (2.1.1)
    - Applications of Stacks (2.1.1)
    - Array-based implementation (2.1.1)
    - Growable array-based stack (1.5)

    Stacks 3

    Abstract Data Types (ADTs)

    - An abstract data type (ADT) is an abstraction of a data structure
    - An ADT specifies:
      - Data stored
      - Operations on the data
      - Error conditions associated with operations
    - Example: ADT modeling a simple stock trading system
      - The data stored are buy/sell orders
      - The operations supported are
        - order buy(stock, shares, price)
        - order sell(stock, shares, price)
        - void cancel(order)
      - Error conditions:
        - Buy/sell a nonexistent stock
        - Cancel a nonexistent order

    Stacks 4

    The Stack ADT

    - The Stack ADT stores arbitrary objects
    - Insertions and deletions follow the last-in first-out scheme
    - Think of a spring-loaded plate dispenser
    - Main stack operations:
      - push(object): inserts an element
      - object pop(): removes and returns the last inserted element
    - Auxiliary stack operations:
      - object top(): returns the last inserted element without removing it
      - integer size(): returns the number of elements stored
      - boolean isEmpty(): indicates whether no elements are stored

    Stacks 5

    Exceptions

    - Attempting the execution of an operation of an ADT may sometimes cause an error condition, called an exception
    - Exceptions are said to be "thrown" by an operation that cannot be executed
    - In the Stack ADT, operations pop and top cannot be performed if the stack is empty
    - Attempting the execution of pop or top on an empty stack throws an EmptyStackException

    Stacks 6

    Applications of Stacks

    - Direct applications
      - Page-visited history in a Web browser
      - Undo sequence in a text editor
      - Chain of method calls in the Java Virtual Machine
    - Indirect applications
      - Auxiliary data structure for algorithms
      - Component of other data structures

  • Stacks 7

    Method Stack in the JVM

    - The Java Virtual Machine (JVM) keeps track of the chain of active methods with a stack
    - When a method is called, the JVM pushes on the stack a frame containing
      - Local variables and return value
      - Program counter, keeping track of the statement being executed
    - When a method ends, its frame is popped from the stack and control is passed to the method on top of the stack

    main() {
      int i = 5;
      foo(i);
    }

    foo(int j) {
      int k;
      k = j+1;
      bar(k);
    }

    bar(int m) {
    }

    [Figure: method stack, from top to bottom: bar (PC = 1, m = 6), foo (PC = 3, j = 5, k = 6), main (PC = 2, i = 5)]

    Stacks 8

    Array-based Stack

    - A simple way of implementing the Stack ADT uses an array
    - We add elements from left to right
    - A variable t keeps track of the index of the top element

    [Figure: array S with cells 0, 1, 2, …, t]

    Algorithm size()
      return t + 1

    Algorithm pop()
      if isEmpty() then
        throw EmptyStackException
      else
        t ← t − 1
        return S[t + 1]

    Stacks 9

    Array-based Stack (cont.)

    - The array storing the stack elements may become full
    - A push operation will then throw a FullStackException
      - Limitation of the array-based implementation
      - Not intrinsic to the Stack ADT

    [Figure: array S with cells 0, 1, 2, …, t]

    Algorithm push(o)
      if t = S.length − 1 then
        throw FullStackException
      else
        t ← t + 1
        S[t] ← o

    Stacks 10

    Performance and Limitations

    - Performance
      - Let n be the number of elements in the stack
      - The space used is O(n)
      - Each operation runs in time O(1)
    - Limitations
      - The maximum size of the stack must be defined a priori and cannot be changed
      - Trying to push a new element into a full stack causes an implementation-specific exception

    Stacks 11

    Computing Spans

    - We show how to use a stack as an auxiliary data structure in an algorithm
    - Given an array X, the span S[i] of X[i] is the maximum number of consecutive elements X[j] immediately preceding X[i] and such that X[j] ≤ X[i]
    - Spans have applications to financial analysis
      - E.g., stock at 52-week high

    X    6  3  4  5  2
    S    1  1  2  3  1

    [Figure: bar chart of X (heights 0 to 7) for indices 0 to 4]

    Stacks 12

    Quadratic Algorithm

    Algorithm spans1(X, n)
      Input array X of n integers
      Output array S of spans of X             #
      S ← new array of n integers              n
      for i ← 0 to n − 1 do                    n
        s ← 1                                  n
        while s ≤ i ∧ X[i − s] ≤ X[i]          1 + 2 + … + (n − 1)
          s ← s + 1                            1 + 2 + … + (n − 1)
        S[i] ← s                               n
      return S                                 1

    - Algorithm spans1 runs in O(n^2) time

  • Stacks 13

    Computing Spans with a Stack

    - We keep in a stack the indices of the elements visible when "looking back"
    - We scan the array from left to right
      - Let i be the current index
      - We pop indices from the stack until we find index j such that X[i] < X[j]
      - We set S[i] ← i − j
      - We push i onto the stack

    [Figure: bar chart of X (heights 0 to 7) for indices 0 to 7]

    Stacks 14

    Linear Algorithm

    Algorithm spans2(X, n)                            #
      S ← new array of n integers                     n
      A ← new empty stack                             1
      for i ← 0 to n − 1 do                           n
        while (¬A.isEmpty() ∧ X[A.top()] ≤ X[i]) do   n
          A.pop()                                     n
        if A.isEmpty() then                           n
          S[i] ← i + 1                                n
        else
          S[i] ← i − A.top()                          n
        A.push(i)                                     n
      return S                                        1

    - Each index of the array
      - Is pushed into the stack exactly once
      - Is popped from the stack at most once
    - The statements in the while-loop are executed at most n times
    - Algorithm spans2 runs in O(n) time
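    Both span algorithms can be written in Java as sketched below. The class name Spans is an assumption, and java.util.ArrayDeque is used as the stack (its push, pop, and peek all operate on the same end). In the linear version the span is measured from the index that remains on top of the stack after popping, which is the nearest earlier element larger than X[i].

```java
import java.util.ArrayDeque;
import java.util.Deque;

class Spans {
    // Quadratic: walk backwards from each position.
    static int[] spans1(int[] x) {
        int n = x.length;
        int[] s = new int[n];
        for (int i = 0; i < n; i++) {
            int k = 1;
            while (k <= i && x[i - k] <= x[i]) {
                k++;
            }
            s[i] = k;
        }
        return s;
    }

    // Linear: the stack holds indices of elements still "visible" looking back.
    static int[] spans2(int[] x) {
        int n = x.length;
        int[] s = new int[n];
        Deque<Integer> a = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            while (!a.isEmpty() && x[a.peek()] <= x[i]) {
                a.pop();
            }
            s[i] = a.isEmpty() ? i + 1 : i - a.peek();
            a.push(i);
        }
        return s;
    }
}
```

    For X = {6, 3, 4, 5, 2} both methods produce the spans {1, 1, 2, 3, 1}.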

    Stacks 15

    Growable Array-based Stack

    - In a push operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
    - How large should the new array be?
      - incremental strategy: increase the size by a constant c
      - doubling strategy: double the size

    Algorithm push(o)
      if t = S.length − 1 then
        A ← new array of size …
        for i ← 0 to t do
          A[i] ← S[i]
        S ← A
      t ← t + 1
      S[t] ← o

    Stacks 16

    Comparison of the Strategies

    - We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of n push operations
    - We assume that we start with an empty stack represented by an array of size 1
    - We call amortized time of a push operation the average time taken by a push over the series of operations, i.e., T(n)/n

    Stacks 17

    Incremental Strategy Analysis

    - We replace the array k = n/c times
    - The total time T(n) of a series of n push operations is proportional to
        n + c + 2c + 3c + 4c + … + kc
        = n + c(1 + 2 + 3 + … + k)
        = n + ck(k + 1)/2
    - Since c is a constant, T(n) is O(n + k^2), i.e., O(n^2)
    - The amortized time of a push operation is O(n)

    Stacks 18

    Doubling Strategy Analysis

    - We replace the array k = log2 n times
    - The total time T(n) of a series of n push operations is proportional to
        n + 1 + 2 + 4 + 8 + … + 2^k
        = n + 2^(k+1) − 1
        = 3n − 1        (since 2^k = n)
    - T(n) is O(n)
    - The amortized time of a push operation is O(1)

    [Figure: geometric series 1 + 2 + 4 + 8 drawn as segments that each double the previous one]

  • Stacks 19

    Stack Interface in Java

    - Java interface corresponding to our Stack ADT
    - Requires the definition of class EmptyStackException
    - Different from the built-in Java class java.util.Stack

    public interface Stack {

      public int size();

      public boolean isEmpty();

      public Object top()
          throws EmptyStackException;

      public void push(Object o);

      public Object pop()
          throws EmptyStackException;

    }

    Stacks 20

    Array-based Stack in Java

    public class ArrayStack
        implements Stack {

      // holds the stack elements
      private Object S[];

      // index to top element
      private int top = -1;

      // constructor
      public ArrayStack(int capacity) {
        S = new Object[capacity];
      }

      public Object pop()
          throws EmptyStackException {
        if (isEmpty())
          throw new EmptyStackException("Empty stack: cannot pop");
        Object temp = S[top];
        // facilitates garbage collection
        S[top] = null;
        top = top - 1;
        return temp;
      }

  • Vectors 6/8/2002 2:14 PM

    1

    6/8/2002 2:14 PM Vectors 1

    Vectors

    6/8/2002 2:14 PM Vectors 2

    Outline and Reading

    The Vector ADT (2.2.1)
    Array-based implementation (2.2.1)

    6/8/2002 2:14 PM Vectors 3

    The Vector ADT

    The Vector ADT extends the notion of array by storing a sequence of arbitrary objects.
    An element can be accessed, inserted or removed by specifying its rank (number of elements preceding it).
    An exception is thrown if an incorrect rank is specified (e.g., a negative rank).

    Main vector operations:
      object elemAtRank(integer r): returns the element at rank r without removing it
      object replaceAtRank(integer r, object o): replaces the element at rank r with o and returns the old element
      insertAtRank(integer r, object o): inserts a new element o to have rank r
      object removeAtRank(integer r): removes and returns the element at rank r
    Additional operations: size() and isEmpty()

    6/8/2002 2:14 PM Vectors 4

    Applications of Vectors

    Direct applications Sorted collection of objects (elementary

    database)

    Indirect applications Auxiliary data structure for algorithms Component of other data structures

    6/8/2002 2:14 PM Vectors 5

    Array-based Vector

    Use an array V of size N.
    A variable n keeps track of the size of the vector (number of elements stored).
    Operation elemAtRank(r) is implemented in O(1) time by returning V[r].

    [Figure: array V with cells 0, 1, 2, …; rank r and size n are marked]

    6/8/2002 2:14 PM Vectors 6

    Insertion

    In operation insertAtRank(r, o), we need to make room for the new element by shifting forward the n - r elements V[r], …, V[n - 1].
    In the worst case (r = 0), this takes O(n) time.

    [Figure: array V before the shift, after the shift, and with o stored at rank r]

  • Vectors 6/8/2002 2:14 PM

    2

    6/8/2002 2:14 PM Vectors 7

    Deletion

    In operation removeAtRank(r), we need to fill the hole left by the removed element by shifting backward the n - r - 1 elements V[r + 1], …, V[n - 1].
    In the worst case (r = 0), this takes O(n) time.

    [Figure: array V with element o at rank r, with the hole at rank r, and after the shift]

    6/8/2002 2:14 PM Vectors 8

    Performance

    In the array-based implementation of a Vector:
      The space used by the data structure is O(n)
      size, isEmpty, elemAtRank and replaceAtRank run in O(1) time
      insertAtRank and removeAtRank run in O(n) time
    If we use the array in a circular fashion, insertAtRank(0) and removeAtRank(0) run in O(1) time.
    In an insertAtRank operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one.
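The shifting behind insertAtRank and removeAtRank can be sketched in a few lines. This is an illustrative sketch only: the class name ArrayVector, the initial capacity of 4, and the use of IndexOutOfBoundsException for an incorrect rank are assumptions, and the array is not used circularly here.

```java
import java.util.Arrays;

// Array-based vector: O(1) access by rank, O(n) insertion and removal due to shifting.
class ArrayVector {
    private Object[] v = new Object[4]; // array V of size N (initial N = 4 is arbitrary)
    private int n = 0;                  // number of elements stored

    int size() { return n; }
    boolean isEmpty() { return n == 0; }

    Object elemAtRank(int r) {
        checkRank(r, n);
        return v[r];
    }

    void insertAtRank(int r, Object o) {
        checkRank(r, n + 1);                                    // rank n (append) is allowed
        if (n == v.length) v = Arrays.copyOf(v, 2 * v.length);  // full: grow by doubling
        for (int i = n - 1; i >= r; i--) v[i + 1] = v[i];       // shift V[r..n-1] forward
        v[r] = o;
        n++;
    }

    Object removeAtRank(int r) {
        checkRank(r, n);
        Object o = v[r];
        for (int i = r; i < n - 1; i++) v[i] = v[i + 1];        // shift V[r+1..n-1] backward
        v[--n] = null;                                          // clear the vacated cell
        return o;
    }

    private static void checkRank(int r, int bound) {
        if (r < 0 || r >= bound) throw new IndexOutOfBoundsException("rank " + r);
    }
}
```

Inserting at rank 0 moves all n elements, which is the worst case noted on the Insertion slide; inserting at rank n moves none.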

  • Queues 6/8/2002 2:16 PM

    1

    6/8/2002 2:16 PM Queues 1

    Queues

    6/8/2002 2:16 PM Queues 2

    Outline and Reading

    The Queue ADT (2.1.2)
    Implementation with a circular array (2.1.2)
    Growable array-based queue
    Queue interface in Java

    6/8/2002 2:16 PM Queues 3

    The Queue ADT

    The Queue ADT stores arbitrary objects.
    Insertions and deletions follow the first-in first-out scheme.
    Insertions are at the rear of the queue and removals are at the front of the queue.

    Main queue operations:
      enqueue(object): inserts an element at the end of the queue
      object dequeue(): removes and returns the element at the front of the queue
    Auxiliary queue operations:
      object front(): returns the element at the front without removing it
      integer size(): returns the number of elements stored
      boolean isEmpty(): indicates whether no elements are stored
    Exceptions:
      Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException

    6/8/2002 2:16 PM Queues 4

    Applications of Queues

    Direct applications Waiting lists, bureaucracy Access to shared resources (e.g., printer) Multiprogramming

    Indirect applications Auxiliary data structure for algorithms Component of other data structures

    6/8/2002 2:16 PM Queues 5

    Array-based Queue

    Use an array of size N in a circular fashion.
    Two variables keep track of the front and rear:
      f: index of the front element
      r: index immediately past the rear element
    Array location r is kept empty.

    [Figure: array Q in the normal configuration (f before r) and in the wrapped-around configuration (r before f)]

    6/8/2002 2:16 PM Queues 6

    Queue Operations

    We use the modulo operator (remainder of division).

    Algorithm size()
      return (N - f + r) mod N

    Algorithm isEmpty()
      return (f = r)

    [Figure: array Q in the normal and wrapped-around configurations]

  • Queues 6/8/2002 2:16 PM

    2

    6/8/2002 2:16 PM Queues 7

    Queue Operations (cont.)

    Algorithm enqueue(o)
      if size() = N - 1 then
        throw FullQueueException
      else
        Q[r] ← o
        r ← (r + 1) mod N

    Operation enqueue throws an exception if the array is full.
    This exception is implementation-dependent.

    [Figure: array Q in the normal and wrapped-around configurations]

    6/8/2002 2:16 PM Queues 8

    Queue Operations (cont.)

    Operation dequeue throws an exception if the queue is empty.
    This exception is specified in the queue ADT.

    Algorithm dequeue()
      if isEmpty() then
        throw EmptyQueueException
      else
        o ← Q[f]
        f ← (f + 1) mod N
        return o

    [Figure: array Q in the normal and wrapped-around configurations]
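The size, isEmpty, enqueue and dequeue algorithms above translate almost line for line into Java. The sketch below is illustrative: the class name ArrayQueue is an assumption, and IllegalStateException stands in for the FullQueueException and EmptyQueueException classes, which the slides name but do not define.

```java
// Circular array-based queue; one cell is always kept empty, so capacity N holds N - 1 elements.
class ArrayQueue {
    private final Object[] q;
    private int f = 0;  // index of the front element
    private int r = 0;  // index immediately past the rear element

    ArrayQueue(int n) { q = new Object[n]; }

    int size() { return (q.length - f + r) % q.length; }

    boolean isEmpty() { return f == r; }

    void enqueue(Object o) {
        if (size() == q.length - 1) throw new IllegalStateException("Queue is full");
        q[r] = o;
        r = (r + 1) % q.length;   // wrap around at the end of the array
    }

    Object dequeue() {
        if (isEmpty()) throw new IllegalStateException("Queue is empty");
        Object o = q[f];
        q[f] = null;              // facilitates garbage collection
        f = (f + 1) % q.length;
        return o;
    }
}
```

Keeping location r empty is what lets f = r mean "empty" unambiguously; without the spare cell, a full queue and an empty queue would look the same.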

    6/8/2002 2:16 PM Queues 9

    Growable Array-based Queue

    In an enqueue operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one.
    Similar to what we did for an array-based stack.
    The enqueue operation has amortized running time
      O(n) with the incremental strategy
      O(1) with the doubling strategy

    6/8/2002 2:16 PM Queues 10

    Queue Interface in Java

    Java interface corresponding to our Queue ADT
    Requires the definition of class EmptyQueueException
    No corresponding built-in Java class at the time of these slides (java.util.Queue was added later, in Java 5)

    public interface Queue {
      public int size();
      public boolean isEmpty();
      public Object front() throws EmptyQueueException;
      public void enqueue(Object o);
      public Object dequeue() throws EmptyQueueException;
    }

  • Sequences 6/8/2002 2:15 PM

    1

    6/8/2002 2:15 PM Sequences 1

    Lists and Sequences

    6/8/2002 2:15 PM Sequences 2

    Outline and Reading

    Singly linked list
    Position ADT and List ADT (2.2.2)
    Doubly linked list (2.2.2)
    Sequence ADT (2.2.3)
    Implementations of the sequence ADT (2.2.3)
    Iterators (2.2.3)

    6/8/2002 2:15 PM Sequences 3

    Singly Linked List

    A singly linked list is a concrete data structure consisting of a sequence of nodes.
    Each node stores
      element
      link to the next node

    [Figure: a node with fields elem and next; a list of four nodes storing A, B, C, D]

    6/8/2002 2:15 PM Sequences 4

    Stack with a Singly Linked List

    We can implement a stack with a singly linked list.
    The top element is stored at the first node of the list.
    The space used is O(n) and each operation of the Stack ADT takes O(1) time.

    [Figure: linked nodes and their elements, with t referencing the first node]

    6/8/2002 2:15 PM Sequences 5

    Queue with a Singly Linked List

    We can implement a queue with a singly linked list.
      The front element is stored at the first node
      The rear element is stored at the last node
    The space used is O(n) and each operation of the Queue ADT takes O(1) time.

    [Figure: linked nodes and their elements, with f referencing the first node and r the last]

    6/8/2002 2:15 PM Sequences 6

    Position ADT

    The Position ADT models the notion of place within a data structure where a single object is stored.
    It gives a unified view of diverse ways of storing data, such as
      a cell of an array
      a node of a linked list
    Just one method:
      object element(): returns the element stored at the position

  • Sequences 6/8/2002 2:15 PM

    2

    6/8/2002 2:15 PM Sequences 7

    List ADT

    The List ADT models a sequence of positions storing arbitrary objects.
    It establishes a before/after relation between positions.
    Generic methods: size(), isEmpty()
    Query methods: isFirst(p), isLast(p)
    Accessor methods: first(), last(), before(p), after(p)
    Update methods: replaceElement(p, o), swapElements(p, q), insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o), remove(p)

    6/8/2002 2:15 PM Sequences 8

    Doubly Linked List

    A doubly linked list provides a natural implementation of the List ADT.
    Nodes implement Position and store:
      element
      link to the previous node
      link to the next node
    Special trailer and header nodes

    [Figure: a node with fields prev, elem and next; a list of nodes/positions and their elements between the header and trailer nodes]

    6/8/2002 2:15 PM Sequences 9

    Insertion

    We visualize operation insertAfter(p, X), which returns position q.

    [Figure: list A, B, C with position p at B; a new node q storing X is linked in after p, giving A, B, X, C]

    6/8/2002 2:15 PM Sequences 10

    Deletion

    We visualize remove(p), where p = last().

    [Figure: list A, B, C, D with position p at D; the node at p is unlinked, leaving A, B, C]

    6/8/2002 2:15 PM Sequences 11

    Performance

    In the implementation of the List ADT by means of a doubly linked list:
      The space used by a list with n elements is O(n)
      The space used by each position of the list is O(1)
      All the operations of the List ADT run in O(1) time
      Operation element() of the Position ADT runs in O(1) time
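The O(1) bounds follow from the fact that insertAfter and remove only rewire a constant number of links. The following is a minimal sketch under assumed names (DNode, DList); it covers only the two operations visualized on the Insertion and Deletion slides and omits the validity checks a full List ADT implementation would need.

```java
// Node of a doubly linked list; it doubles as a position.
class DNode {
    Object elem;
    DNode prev, next;
    DNode(Object elem) { this.elem = elem; }
    Object element() { return elem; }
}

// Doubly linked list with header and trailer sentinel nodes.
class DList {
    final DNode header = new DNode(null);
    final DNode trailer = new DNode(null);
    int size = 0;

    DList() {
        header.next = trailer;
        trailer.prev = header;
    }

    // Links a new node storing x right after p and returns its position q.
    DNode insertAfter(DNode p, Object x) {
        DNode q = new DNode(x);
        q.prev = p;
        q.next = p.next;
        p.next.prev = q;
        p.next = q;
        size++;
        return q;
    }

    // Unlinks the node at position p and returns its element.
    Object remove(DNode p) {
        p.prev.next = p.next;
        p.next.prev = p.prev;
        p.prev = null;      // invalidate the removed position
        p.next = null;
        size--;
        return p.elem;
    }
}
```

Thanks to the sentinels, inserting at the front is just insertAfter(header, x), and no operation needs a special case for an empty list.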

    6/8/2002 2:15 PM Sequences 12

    Sequence ADT

    The Sequence ADT is the union of the Vector and List ADTs.
    Elements accessed by
      Rank, or
      Position
    Generic methods: size(), isEmpty()
    Vector-based methods: elemAtRank(r), replaceAtRank(r, o), insertAtRank(r, o), removeAtRank(r)
    List-based methods: first(), last(), before(p), after(p), replaceElement(p, o), swapElements(p, q), insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o), remove(p)
    Bridge methods: atRank(r), rankOf(p)

  • Sequences 6/8/2002 2:15 PM

    3

    6/8/2002 2:15 PM Sequences 13

    Applications of Sequences

    The Sequence ADT is a basic, general-purpose data structure for storing an ordered collection of elements.
    Direct applications:
      Generic replacement for stack, queue, vector, or list
      small database (e.g., address book)
    Indirect applications:
      Building block of more complex data structures

    6/8/2002 2:15 PM Sequences 14

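    Array-based Implementation

    We use a circular array storing positions.
    A position object stores:
      Element
      Rank
    Indices f and l keep track of first and last positions.

    [Figure: circular array S whose cells reference position objects at ranks 0, 1, 2, 3, each holding an element; f and l mark the first and last positions]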

    6/8/2002 2:15 PM Sequences 15

    Sequence Implementations

    (entries are asymptotic running times: 1 means O(1), n means O(n))

    Operation                       Array   List
    size, isEmpty                   1       1
    atRank, rankOf, elemAtRank      1       n
    first, last, before, after      1       1
    replaceElement, swapElements    1       1
    replaceAtRank                   1       n
    insertAtRank, removeAtRank      n       n
    insertFirst, insertLast         1       1
    insertAfter, insertBefore       n       1
    remove                          n       1

    6/8/2002 2:15 PM Sequences 16

    Iterators

    An iterator abstracts the process of scanning through a collection of elements.
    Methods of the ObjectIterator ADT:
      object object()
      boolean hasNext()
      object nextObject()
      reset()
    Extends the concept of Position by adding a traversal capability.
    Implementation with an array or singly linked list.
    An iterator is typically associated with another data structure.
    We can augment the Stack, Queue, Vector, List and Sequence ADTs with method:
      ObjectIterator elements()
    Two notions of iterator:
      snapshot: freezes the contents of the data structure at a given time
      dynamic: follows changes to the data structure

  • Trees 6/8/2002 2:15 PM

    1

    6/8/2002 2:15 PM Trees 1

    Trees

    [Figure: tree with root "Make Money Fast!" and children "Stock Fraud", "Ponzi Scheme" and "Bank Robbery"]

    6/8/2002 2:15 PM Trees 2

    Outline and Reading

    Tree ADT (2.3.1)
    Preorder and postorder traversals (2.3.2)
    BinaryTree ADT (2.3.3)
    Inorder traversal (2.3.3)
    Euler Tour traversal (2.3.3)
    Template method pattern
    Data structures for trees (2.3.4)
    Java implementation (http://jdsl.org)

    6/8/2002 2:15 PM Trees 3

    What is a Tree

    In computer science, a tree is an abstract model of a hierarchical structure.
    A tree consists of nodes with a parent-child relation.
    Applications:
      Organization charts
      File systems
      Programming environments

    [Figure: organization chart of ComputersRUs with departments Sales, Manufacturing and R&D; Sales splits into US and International (Europe, Asia, Canada), Manufacturing into Laptops and Desktops]

    6/8/2002 2:15 PM Trees 4

    Tree Terminology

    Root: node without parent (A)
    Internal node: node with at least one child (A, B, C, F)
    External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D)
    Ancestors of a node: parent, grandparent, great-grandparent, etc.
    Depth of a node: number of ancestors
    Height of a tree: maximum depth of any node (3)
    Descendant of a node: child, grandchild, great-grandchild, etc.
    Subtree: tree consisting of a node and its descendants

    [Figure: tree with root A; children of A: B, C, D; children of B: E, F; children of C: G, H; children of F: I, J, K; one subtree is highlighted]

    6/8/2002 2:15 PM Trees 5

    Tree ADT

    We use positions to abstract nodes.
    Generic methods:
      integer size()
      boolean isEmpty()
      objectIterator elements()
      positionIterator positions()
    Accessor methods:
      position root()
      position parent(p)
      positionIterator children(p)
    Query methods:
      boolean isInternal(p)
      boolean isExternal(p)
      boolean isRoot(p)
    Update methods:
      swapElements(p, q)
      object replaceElement(p, o)
    Additional update methods may be defined by data structures implementing the Tree ADT.

    6/8/2002 2:15 PM Trees 6

    Preorder Traversal

    A traversal visits the nodes of a tree in a systematic manner.
    In a preorder traversal, a node is visited before its descendants.
    Application: print a structured document

    Algorithm preOrder(v)
      visit(v)
      for each child w of v
        preOrder(w)

    [Figure: document tree "Make Money Fast!" with children "1. Motivations" (1.1 Greed, 1.2 Avidity), "2. Methods" (2.1 Stock Fraud, 2.2 Ponzi Scheme, 2.3 Bank Robbery) and "References"; nodes are numbered 1 to 9 in preorder]

  • Trees 6/8/2002 2:15 PM

    2

    6/8/2002 2:15 PM Trees 7

    Postorder Traversal

    In a postorder traversal, a node is visited after its descendants.
    Application: compute space used by files in a directory and its subdirectories

    Algorithm postOrder(v)
      for each child w of v
        postOrder(w)
      visit(v)

    [Figure: directory tree cs16/ with children homeworks/ (h1c.doc 3K, h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K, Robot.java 20K) and todo.txt 1K; nodes are numbered 1 to 9 in postorder]
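Both traversals are short recursive methods on a general tree. The sketch below is illustrative rather than the slides' implementation: TreeNode, Traversals and diskUsage are assumed names, and directories are given size 0 so that only files contribute to the total.

```java
import java.util.ArrayList;
import java.util.List;

// General tree node with a name, a size in KB, and an ordered list of children.
class TreeNode {
    final String name;
    final int size;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name, int size) { this.name = name; this.size = size; }

    TreeNode add(TreeNode child) { children.add(child); return this; }
}

class Traversals {
    // Preorder: visit v before its descendants.
    static void preOrder(TreeNode v, List<String> out) {
        out.add(v.name);
        for (TreeNode w : v.children) preOrder(w, out);
    }

    // Postorder: visit v after its descendants.
    static void postOrder(TreeNode v, List<String> out) {
        for (TreeNode w : v.children) postOrder(w, out);
        out.add(v.name);
    }

    // Postorder application: space used by v and everything below it.
    static int diskUsage(TreeNode v) {
        int total = 0;
        for (TreeNode w : v.children) total += diskUsage(w);
        return total + v.size;  // v is "visited" only after all of its children
    }
}
```

On the cs16/ directory from the slide, diskUsage adds up 3 + 2 + 10 + 25 + 20 + 1 = 61 KB.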

    6/8/2002 2:15 PM Trees 8

    Binary Tree

    A binary tree is a tree with the following properties:
      Each internal node has two children
      The children of a node are an ordered pair
    We call the children of an internal node left child and right child.
    Alternative recursive definition: a binary tree is either
      a tree consisting of a single node, or
      a tree whose root has an ordered pair of children, each of which is a binary tree
    Applications:
      arithmetic expressions
      decision processes
      searching

    [Figure: binary tree with nodes A through I, rooted at A]

    6/8/2002 2:15 PM Trees 9

    Arithmetic Expression Tree

    Binary tree associated with an arithmetic expression
      internal nodes: operators
      external nodes: operands
    Example: arithmetic expression tree for the expression (2 × (a - 1) + (3 × b))

    [Figure: expression tree with root +, left subtree 2 × (a - 1) and right subtree 3 × b]

    6/8/2002 2:15 PM Trees 10

    Decision Tree

    Binary tree associated with a decision process
      internal nodes: questions with yes/no answer
      external nodes: decisions
    Example: dining decision

    [Figure: decision tree. Root: "Want a fast meal?" Yes leads to "How about coffee?" (Yes: Starbucks, No: Spike's); No leads to "On expense account?" (Yes: Al Forno, No: Café Paragon)]

    6/8/2002 2:15 PM Trees 11

    Properties of Binary Trees

    Notation
      n  number of nodes
      e  number of external nodes
      i  number of internal nodes
      h  height

    Properties:
      e = i + 1
      n = 2e - 1
      h ≤ i
      h ≤ (n - 1)/2
      e ≤ 2^h
      h ≥ log_2 e
      h ≥ log_2 (n + 1) - 1

    6/8/2002 2:15 PM Trees 12

    BinaryTree ADT

    The BinaryTree ADT extends the Tree ADT, i.e., it inherits all the methods of the Tree ADT.
    Additional methods:
      position leftChild(p)
      position rightChild(p)
      position sibling(p)
    Update methods may be defined by data structures implementing the BinaryTree ADT.

  • Trees 6/8/2002 2:15 PM

    3

    6/8/2002 2:15 PM Trees 13

    Inorder Traversal

    In an inorder traversal a node is visited after its left subtree and before its right subtree.
    Application: draw a binary tree
      x(v) = inorder rank of v
      y(v) = depth of v

    Algorithm inOrder(v)
      if isInternal(v)
        inOrder(leftChild(v))
      visit(v)
      if isInternal(v)
        inOrder(rightChild(v))

    [Figure: binary tree with nodes numbered 1 to 9 in inorder]

    6/8/2002 2:15 PM Trees 14

    Print Arithmetic Expressions

    Specialization of an inorder traversal
      print operand or operator when visiting node
      print "(" before traversing left subtree
      print ")" after traversing right subtree

    Algorithm printExpression(v)
      if isInternal(v)
        print("(")
        printExpression(leftChild(v))
      print(v.element())
      if isInternal(v)
        printExpression(rightChild(v))
        print(")")

    [Figure: expression tree with root +, left subtree 2 × (a - 1) and right subtree 3 × b; the printed output is ((2 × (a - 1)) + (3 × b))]

    6/8/2002 2:15 PM Trees 15

    Evaluate Arithmetic Expressions

    Specialization of a postorder traversal
      recursive method returning the value of a subtree
      when visiting an internal node, combine the values of the subtrees

    Algorithm evalExpr(v)
      if isExternal(v)
        return v.element()
      else
        x ← evalExpr(leftChild(v))
        y ← evalExpr(rightChild(v))
        ◊ ← operator stored at v
        return x ◊ y

    [Figure: expression tree for (2 × (5 - 1)) + (3 × 2)]
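The two expression algorithms can be tried out on a small concrete tree. This is a sketch under assumptions: ExprNode and ExprTrees are hypothetical names, operands are plain ints, operators are the characters +, -, * and /, and printExpression returns a string instead of printing.

```java
// Node of an arithmetic expression tree: leaves hold operands, internal nodes hold operators.
class ExprNode {
    final char op;            // operator, meaningful only at internal nodes
    final int value;          // operand, meaningful only at external nodes
    final ExprNode left, right;

    ExprNode(int value) { this.op = 0; this.value = value; this.left = null; this.right = null; }

    ExprNode(char op, ExprNode left, ExprNode right) {
        this.op = op; this.value = 0; this.left = left; this.right = right;
    }

    boolean isExternal() { return left == null; }
}

class ExprTrees {
    // Postorder specialization: evaluate both subtrees, then combine at v.
    static int evalExpr(ExprNode v) {
        if (v.isExternal()) return v.value;
        int x = evalExpr(v.left);
        int y = evalExpr(v.right);
        switch (v.op) {
            case '+': return x + y;
            case '-': return x - y;
            case '*': return x * y;
            case '/': return x / y;
            default: throw new IllegalArgumentException("unknown operator " + v.op);
        }
    }

    // Inorder specialization: parenthesize every internal node.
    static String printExpression(ExprNode v) {
        if (v.isExternal()) return Integer.toString(v.value);
        return "(" + printExpression(v.left) + v.op + printExpression(v.right) + ")";
    }
}
```

For the tree of (2 × (5 - 1)) + (3 × 2), evalExpr returns 14 and printExpression returns "((2*(5-1))+(3*2))".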

    6/8/2002 2:15 PM Trees 16

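    Euler Tour Traversal

    Generic traversal of a binary tree.
    Includes as special cases the preorder, postorder and inorder traversals.
    Walk around the tree and visit each node three times:
      on the left (preorder)
      from below (inorder)
      on the right (postorder)

    [Figure: Euler tour around the expression tree for (2 × (5 - 1)) + (3 × 2); the visits to a node are labeled L (left), B (below) and R (right)]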

    6/8/2002 2:15 PM Trees 17

    Template Method Pattern

    Generic algorithm that can be specialized by redefining certain steps.
    Implemented by means of an abstract Java class.
    Visit methods that can be redefined by subclasses.
    Template method eulerTour
      Recursively called on the left and right children
      A Result object with fields leftResult, rightResult and finalResult keeps track of the output of the recursive calls to eulerTour

    public abstract class EulerTour {
      protected BinaryTree tree;
      protected void visitExternal(Position p, Result r) { }
      protected void visitLeft(Position p, Result r) { }
      protected void visitBelow(Position p, Result r) { }
      protected void visitRight(Position p, Result r) { }
      protected Object eulerTour(Position p) {
        Result r = new Result();
        if (tree.isExternal(p)) { visitExternal(p, r); }
        else {
          visitLeft(p, r);
          r.leftResult = eulerTour(tree.leftChild(p));
          visitBelow(p, r);
          r.rightResult = eulerTour(tree.rightChild(p));
          visitRight(p, r);
        }
        // returned for external and internal nodes alike
        return r.finalResult;
      }
    }

    6/8/2002 2:15 PM Trees 18

    Specializations of EulerTour

    We show how to specialize class EulerTour to evaluate an arithmetic expression.
    Assumptions
      External nodes store Integer objects
      Internal nodes store Operator objects supporting method operation(Integer, Integer)

    public class EvaluateExpression extends EulerTour {
      protected void visitExternal(Position p, Result r) {
        r.finalResult = (Integer) p.element();
      }
      protected void visitRight(Position p, Result r) {
        Operator op = (Operator) p.element();
        r.finalResult = op.operation(
            (Integer) r.leftResult,
            (Integer) r.rightResult);
      }
    }

  • Trees 6/8/2002 2:15 PM

    4

    6/8/2002 2:15 PM Trees 19

    Data Structure for Trees

    A node is represented by an object storing
      Element
      Parent node
      Sequence of children nodes
    Node objects implement the Position ADT.

    [Figure: a tree with nodes A through F rooted at B, shown next to its linked representation]

    6/8/2002 2:15 PM Trees 20

    Data Structure for Binary Trees

    A node is represented by an object storing
      Element
      Parent node
      Left child node
      Right child node
    Node objects implement the Position ADT.

    [Figure: a binary tree with nodes A through E rooted at B, shown next to its linked representation]

    6/8/2002 2:15 PM Trees 21

    Java Implementation

    Tree interface
    BinaryTree interface extending Tree
    Classes implementing Tree and BinaryTree and providing
      Constructors
      Update methods
      Print methods
    Examples of updates for binary trees
      expandExternal(v)
      removeAboveExternal(w)

    [Figure: expandExternal(v) turns external node v into an internal node with two new external children; removeAboveExternal(w) removes external node w together with its parent v]

    6/8/2002 2:15 PM Trees 22

    Trees in JDSL

    JDSL is the Library of Data Structures in Java.
    Tree interfaces in JDSL
      InspectableBinaryTree
      InspectableTree
      BinaryTree
      Tree
    Inspectable versions of the interfaces do not have update methods.
    Tree classes in JDSL
      NodeBinaryTree
      NodeTree
    JDSL was developed at Brown's Center for Geometric Computing.
    See the JDSL documentation and tutorials at http://jdsl.org

    [Figure: inheritance diagram relating InspectableTree, InspectableBinaryTree, Tree and BinaryTree]

  • Heaps 4/5/2002 14:4

    Heaps and Priority Queues 1

    Heaps and Priority Queues

    [Figure: heap with root 2, children 5 and 6, and 9 and 7 below 5]

    Heaps and Priority Queues 2

    Priority Queue ADT (2.4.1)

    A priority queue stores a collection of items.
    An item is a pair (key, element).
    Main methods of the Priority Queue ADT
      insertItem(k, o): inserts an item with key k and element o
      removeMin(): removes the item with smallest key and returns its element
    Additional methods
      minKey(): returns, but does not remove, the smallest key of an item
      minElement(): returns, but does not remove, the element of an item with smallest key
      size(), isEmpty()
    Applications:
      Standby flyers
      Auctions
      Stock market

    Heaps and Priority Queues 3

    Total Order Relation

    Keys in a priority queue can be arbitrary objects on which an order is defined.
    Two distinct items in a priority queue can have the same key.

    Mathematical concept of total order relation ≤
      Reflexive property: x ≤ x
      Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
      Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z

    Heaps and Priority Queues 4

    Comparator ADT (2.4.1)

    A comparator encapsulates the action of comparing two objects according to a given total order relation.
    A generic priority queue uses an auxiliary comparator.
    The comparator is external to the keys being compared.
    When the priority queue needs to compare two keys, it uses its comparator.

    Methods of the Comparator ADT, all with Boolean return type
      isLessThan(x, y)
      isLessThanOrEqualTo(x, y)
      isEqualTo(x, y)
      isGreaterThan(x, y)
      isGreaterThanOrEqualTo(x, y)
      isComparable(x)

    Heaps and Priority Queues 5

    Sorting with a Priority Queue (2.4.2)

    We can use a priority queue to sort a set of comparable elements
      Insert the elements one by one with a series of insertItem(e, e) operations
      Remove the elements in sorted order with a series of removeMin() operations
    The running time of this sorting method depends on the priority queue implementation.

    Algorithm PQ-Sort(S, C)
      Input: sequence S, comparator C for the elements of S
      Output: sequence S sorted in increasing order according to C
      P ← priority queue with comparator C
      while ¬S.isEmpty()
        e ← S.remove(S.first())
        P.insertItem(e, e)
      while ¬P.isEmpty()
        e ← P.removeMin()
        S.insertLast(e)
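PQ-Sort maps directly onto the Java standard library. In this sketch java.util.PriorityQueue (a binary heap) plays the role of P, a Deque plays the role of the sequence S, and java.util.Comparator plays the role of C; the class name PQSort is an assumption. Because each element serves as its own key, add(e) stands in for insertItem(e, e).

```java
import java.util.Comparator;
import java.util.Deque;
import java.util.PriorityQueue;

class PQSort {
    // Sorts s in place, in increasing order according to c.
    static <E> void pqSort(Deque<E> s, Comparator<? super E> c) {
        PriorityQueue<E> p = new PriorityQueue<>(c);
        while (!s.isEmpty()) {
            p.add(s.removeFirst());   // phase 1: move every element of S into P
        }
        while (!p.isEmpty()) {
            s.addLast(p.remove());    // phase 2: removeMin repeatedly, appending to S
        }
    }
}
```

With a heap-based priority queue such as this one, both phases take O(n log n) time; swapping in an unsorted or sorted sequence as P gives selection-sort or insertion-sort instead.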

    Heaps and Priority Queues 6

    Sequence-based Priority Queue

    Implementation with an unsorted list
      [Figure: unsorted list 4, 5, 2, 3, 1]
      Performance:
        insertItem takes O(1) time since we can insert the item at the beginning or end of the sequence
        removeMin, minKey and minElement take O(n) time since we have to traverse the entire sequence to find the smallest key

    Implementation with a sorted list
      [Figure: sorted list 1, 2, 3, 4, 5]
      Performance:
        insertItem takes O(n) time since we have to find the place where to insert the item
        removeMin, minKey and minElement take O(1) time since the smallest key is at the beginning of the sequence

  • Heaps 4/5/2002 14:4

    Heaps and Priority Queues 7

    Selection-Sort

    Selection-sort is the variation of PQ-sort where the priority queue is implemented with an unsorted sequence.

    [Figure: unsorted sequence 4, 5, 2, 3, 1]

    Running time of Selection-sort:
      Inserting the elements into the priority queue with n insertItem operations takes O(n) time
      Removing the elements in sorted order from the priority queue with n removeMin operations takes time proportional to 1 + 2 + … + n
    Selection-sort runs in O(n^2) time.

    Heaps and Priority Queues 8

    Insertion-Sort

    Insertion-sort is the variation of PQ-sort where the priority queue is implemented with a sorted sequence.

    [Figure: sorted sequence 1, 2, 3, 4, 5]

    Running time of Insertion-sort:
      Inserting the elements into the priority queue with n insertItem operations takes time proportional to 1 + 2 + … + n
      Removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time
    Insertion-sort runs in O(n^2) time.

    Heaps and Priority Queues 9

    What is a heap (2.4.3)

    A heap is a binary tree storing keys at its internal nodes and satisfying the following properties:
      Heap-Order: for every internal node v other than the root, key(v) ≥ key(parent(v))
      Complete Binary Tree: let h be the height of the heap
        for i = 0, …, h - 1, there are 2^i nodes of depth i
        at depth h - 1, the internal nodes are to the left of the external nodes
    The last node of a heap is the rightmost internal node of depth h - 1.

    [Figure: heap with root 2, children 5 and 6, and 9 and 7 below 5; the node storing 7 is marked as the last node]

    Heaps and Priority Queues 10

    Height of a Heap (2.4.3)

    Theorem: A heap storing n keys has height O(log n)
    Proof: (we apply the complete binary tree property)
      Let h be the height of a heap storing n keys
      Since there are 2^i keys at depth i = 0, …, h - 2 and at least one key at depth h - 1, we have n ≥ 1 + 2 + 4 + … + 2^(h-2) + 1
      Thus, n ≥ 2^(h-1), i.e., h ≤ log n + 1

    [Figure: keys per depth: 1 key at depth 0, 2 keys at depth 1, …, 2^(h-2) keys at depth h - 2, and at least 1 key at depth h - 1]

    Heaps and Priority Queues 11

    Heaps and Priority Queues

    We can use a heap to implement a priority queue.
    We store a (key, element) item at each internal node.
    We keep track of the position of the last node.
    For simplicity, we show only the keys in the pictures.

    [Figure: heap with root (2, Sue), children (5, Pat) and (6, Mark), and (9, Jeff) and (7, Anna) below (5, Pat)]

    Heaps and Priority Queues 12

    Insertion into a Heap (2.4.3)

    Method insertItem of the priority queue ADT corresponds to the insertion of a key k to the heap.
    The insertion algorithm consists of three steps
      Find the insertion node z (the new last node)
      Store k at z and expand z into an internal node
      Restore the heap-order property (discussed next)

    [Figure: heap 2, 5, 6, 9, 7 with the insertion node z marked, and the same heap after key 1 is stored at z]

  • Heaps 4/5/2002 14:4

    Heaps and Priority Queues 13

    Upheap

    After the insertion of a new key k, the heap-order property may be violated.
    Algorithm upheap restores the heap-order property by swapping k along an upward path from the insertion node.
    Upheap terminates when the key k reaches the root or a node whose parent has a key smaller than or equal to k.
    Since a heap has height O(log n), upheap runs in O(log n) time.

    [Figure: key 1 moves up from the insertion node z, swapping first with 6 and then with 2, and ends at the root]

    Heaps and Priority Queues 14

    Removal from a Heap (2.4.3)

    Method removeMin of the priority queue ADT corresponds to the removal of the root key from the heap.
    The removal algorithm consists of three steps
      Replace the root key with the key of the last node w
      Compress w and its children into a leaf
      Restore the heap-order property (discussed next)

    [Figure: heap 2, 5, 6, 9, 7 with last node w storing 7, and the heap after 7 replaces the root key and w is compressed]

    Heaps and Priority Queues 15

    Downheap

    After replacing the root key with the key k of the last node, the heap-order property may be violated.
    Algorithm downheap restores the heap-order property by swapping key k along a downward path from the root.
    Downheap terminates when key k reaches a leaf or a node whose children have keys greater than or equal to k.
    Since a heap has height O(log n), downheap runs in O(log n) time.

    [Figure: key 7 at the root is swapped with its smaller child 5, giving a heap with root 5, children 7 and 6, and 9 below 7]

    Heaps and Priority Queues 16

    Updating the Last Node

    The insertion node can be found by traversing a path of O(log n) nodes
      Go up until a left child or the root is reached
      If a left child is reached, go to the right child
      Go down left until a leaf is reached
    Similar algorithm for updating the last node after a removal

    Heaps and Priority Queues 17

    Heap-Sort (2.4.4)

    Consider a priority queue with n items implemented by means of a heap
      the space used is O(n)
      methods insertItem and removeMin take O(log n) time
      methods size, isEmpty, minKey, and minElement take O(1) time
    Using a heap-based priority queue, we can sort a sequence of n elements in O(n log n) time.
    The resulting algorithm is called heap-sort.
    Heap-sort is much faster than quadratic sorting algorithms, such as insertion-sort and selection-sort.

    Heaps and Priority Queues 18

    Vector-based Heap Implementation (2.4.3)

    We can represent a heap with n keys by means of a vector of length n + 1.
    For the node at rank i
      the left child is at rank 2i
      the right child is at rank 2i + 1
    Links between nodes are not explicitly stored.
    The leaves are not represented.
    The cell at rank 0 is not used.
    Operation insertItem corresponds to inserting at rank n + 1.
    Operation removeMin corresponds to removing at rank n.
    Yields in-place heap-sort.

    [Figure: heap with root 2, children 5 and 6, and 9 and 7 below 5, stored as the vector 2, 5, 6, 9, 7 at ranks 1 to 5; rank 0 is unused]
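The rank arithmetic makes upheap and downheap simple index loops. The following sketch stores int keys only; the class name ArrayHeap, the initial array length of 16, the doubling growth, and the use of IllegalStateException are assumptions for illustration.

```java
import java.util.Arrays;

// Vector-based min-heap of int keys; the cell at rank 0 is not used.
class ArrayHeap {
    private int[] h = new int[16];
    private int n = 0;                       // number of keys stored at ranks 1..n

    int size() { return n; }
    boolean isEmpty() { return n == 0; }

    void insertItem(int k) {
        if (n + 1 == h.length) h = Arrays.copyOf(h, 2 * h.length);
        h[++n] = k;                          // insert at rank n + 1
        int i = n;
        while (i > 1 && h[i / 2] > h[i]) {   // upheap: swap with parent while out of order
            swap(i, i / 2);
            i /= 2;
        }
    }

    int removeMin() {
        if (n == 0) throw new IllegalStateException("Heap is empty");
        int min = h[1];
        h[1] = h[n--];                       // move the last key to the root
        int i = 1;
        while (2 * i <= n) {                 // downheap: swap with the smaller child
            int c = 2 * i;
            if (c + 1 <= n && h[c + 1] < h[c]) c++;
            if (h[i] <= h[c]) break;
            swap(i, c);
            i = c;
        }
        return min;
    }

    private void swap(int a, int b) {
        int tmp = h[a]; h[a] = h[b]; h[b] = tmp;
    }
}
```

Inserting 5, 9, 2, 7, 6 and then calling removeMin five times yields 2, 5, 6, 7, 9, which is heap-sort in miniature.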

  • Heaps 4/5/2002 14:4

    Heaps and Priority Queues 19

    Merging Two Heaps

    We are given two heaps and a key k.
    We create a new heap with the root node storing k and with the two heaps as subtrees.
    We perform downheap to restore the heap-order property.

    [Figure: two heaps with roots 3 and 2 are joined under a new root storing key 7; downheap then moves 7 down and brings 2 to the root]

    Heaps and Priority Queues 20

    Bottom-up Heap Construction (2.4.3)

    We can construct a heap storing n given keys using a bottom-up construction with log n phases.
    In phase i, pairs of heaps with 2^i - 1 keys are merged into heaps with 2^(i+1) - 1 keys.

    [Figure: two heaps with 2^i - 1 keys each are merged under a new root into a heap with 2^(i+1) - 1 keys]

    Heaps and Priority Queues 21

    Example

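    [Figure: bottom-up construction example: the first keys form the bottom level in pairs, and four more keys are placed as roots above the pairs, forming four three-key trees]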

    Heaps and Priority Queues 22

    Example (contd.)

    [Figure: example continued: downheap restores the heap-order property in each of the four three-key trees]

    Heaps and Priority Queues 23

    Example (contd.)

    [Figure: example continued: keys 7 and 8 become roots over pairs of three-key heaps, and downheap restores the heap-order property in the two seven-key trees]

    Heaps and Priority Queues 24

    Example (end)

    [Figure: example concluded: key 10 becomes the root over the two seven-key heaps, and downheap restores the heap-order property, leaving the smallest key 4 at the root]

  • Heaps 4/5/2002 14:4

    Heaps and Priority Queues 25

    Analysis

    We visualize the worst-case time of a downheap with a proxy path that goes first right and then repeatedly goes left until the bottom of the heap (this path may differ from the actual downheap path).
    Since each node is traversed by at most two proxy paths, the total number of nodes of the proxy paths is O(n).
    Thus, bottom-up heap construction runs in O(n) time.
    Bottom-up heap construction is faster than n successive insertions and speeds up the first phase of heap-sort.
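On an array, the same bottom-up idea is usually written as a loop that runs downheap on every internal node, from the last one back to the root. The sketch below is this array variant rather than the recursive merge shown on the slides; it uses 0-based indices (children of i at 2i + 1 and 2i + 2), and BottomUpHeap, buildHeap and isHeap are assumed names.

```java
class BottomUpHeap {
    // Rearranges a into a min-heap in O(n) time.
    static void buildHeap(int[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            downheap(a, i, a.length);   // subtrees below i are already heaps
        }
    }

    // Moves a[i] down until both children (if any) are greater than or equal to it.
    static void downheap(int[] a, int i, int n) {
        while (2 * i + 1 < n) {
            int c = 2 * i + 1;                        // left child
            if (c + 1 < n && a[c + 1] < a[c]) c++;    // pick the smaller child
            if (a[i] <= a[c]) break;
            int tmp = a[i]; a[i] = a[c]; a[c] = tmp;
            i = c;
        }
    }

    // Checks the heap-order property for every non-root cell.
    static boolean isHeap(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[(i - 1) / 2] > a[i]) return false;
        }
        return true;
    }
}
```

Each downheap call corresponds to one merge step: the cell at i plays the role of the new root key k placed above two heaps that earlier iterations already built.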

  • Priority Queues 6/8/2002 2:00 PM

    1

    6/8/2002 2:00 PM Priority Queues 1

    Priority Queues

    [Figure: stock order book for IBM with buy orders (400 shares at $118, 500 shares at $119) and sell orders (300 shares at $120, 100 shares at $122)]

    6/8/2002 2:00 PM Priority Queues 2

    Outline and Reading

    PriorityQueue ADT (2.4.1)
    Total order relation (2.4.1)
    Comparator ADT (2.4.1)
    Sorting with a priority queue (2.4.2)
    Selection-sort (2.4.2)
    Insertion-sort (2.4.2)

    6/8/2002 2:00 PM Priority Queues 3

    Priority Queue ADT

    A priority queue stores a collection of items.
    An item is a pair (key, element).
    Main methods of the Priority Queue ADT
      insertItem(k, o): inserts an item with key k and element o
      removeMin(): removes the item with smallest key and returns its element
    Additional methods
      minKey(): returns, but does not remove, the smallest key of an item
      minElement(): returns, but does not remove, the element of an item with smallest key
      size(), isEmpty()
    Applications:
      Standby flyers
      Auctions
      Stock market

    6/8/2002 2:00 PM Priority Queues 4

    Total Order Relation

    Keys in a priority queue can be arbitrary objects on which an order is defined
    Two distinct items in a priority queue can have the same key
    Mathematical concept of total order relation ≤
      Reflexive property: x ≤ x
      Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
      Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z

    6/8/2002 2:00 PM Priority Queues 5

    Comparator ADT

    A comparator encapsulates the action of comparing two objects according to a given total order relation
    A generic priority queue uses an auxiliary comparator
    The comparator is external to the keys being compared
    When the priority queue needs to compare two keys, it uses its comparator
    Methods of the Comparator ADT, all with Boolean return type
      isLessThan(x, y)
      isLessThanOrEqualTo(x, y)
      isEqualTo(x, y)
      isGreaterThan(x, y)
      isGreaterThanOrEqualTo(x, y)
      isComparable(x)
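A comparator can be sketched as a small class that is separate from the keys it compares. The example below is only an illustration: the class name `LexicographicComparator` and the choice of 2D points ordered by first coordinate, then second, are assumptions, and the method names are Python-style renderings of the ADT methods listed above.

```python
class LexicographicComparator:
    """Compares 2D points (x, y): first by x, then by y."""

    def is_comparable(self, p):
        # Only pairs of numbers belong to this total order.
        return (isinstance(p, tuple) and len(p) == 2
                and all(isinstance(c, (int, float)) for c in p))

    def is_less_than(self, a, b):
        return a < b  # Python compares tuples lexicographically

    def is_less_than_or_equal_to(self, a, b):
        return a <= b

    def is_equal_to(self, a, b):
        return a == b

    def is_greater_than(self, a, b):
        return a > b

    def is_greater_than_or_equal_to(self, a, b):
        return a >= b


comparator = LexicographicComparator()
```

Because the order lives in the comparator rather than in the keys, the same points could be sorted under a different order by swapping in another comparator object.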

    6/8/2002 2:00 PM Priority Queues 6

    Sorting with a Priority Queue

    We can use a priority queue to sort a set of comparable elements
      1. Insert the elements one by one with a series of insertItem(e, e) operations
      2. Remove the elements in sorted order with a series of removeMin() operations
    The running time of this sorting method depends on the priority queue implementation

    Algorithm PQ-Sort(S, C)
      Input sequence S, comparator C for the elements of S
      Output sequence S sorted in increasing order according to C
      P ← priority queue with comparator C
      while ¬S.isEmpty()
        e ← S.remove(S.first())
        P.insertItem(e, e)
      while ¬P.isEmpty()
        e ← P.removeMin()
        S.insertLast(e)
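The two phases of PQ-Sort can be sketched in Python. This is a minimal sketch under two assumptions: the standard-library `heapq` module stands in for the priority queue P, and Python's natural `<` ordering stands in for the comparator C. The function name `pq_sort` is illustrative.

```python
import heapq


def pq_sort(seq):
    """Sort the list seq in place using the two phases of PQ-Sort."""
    pq = []  # heapq keeps this list arranged as a binary min-heap
    # Phase 1: remove each element from the sequence and insert it into P.
    while seq:
        e = seq.pop(0)
        heapq.heappush(pq, e)
    # Phase 2: repeatedly remove the minimum and append it to the sequence.
    while pq:
        seq.append(heapq.heappop(pq))
    return seq


result = pq_sort([5, 4, 2, 3, 1])
```

Swapping the heap for a different priority queue changes the running time but not the structure of the algorithm, which is the point made on the slide.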


    6/8/2002 2:00 PM Priority Queues 7

    Sequence-based Priority Queue

    Implementation with an unsorted sequence
      Store the items of the priority queue in a list-based sequence, in arbitrary order
      Performance:
        insertItem takes O(1) time since we can insert the item at the beginning or end of the sequence
        removeMin, minKey and minElement take O(n) time since we have to traverse the entire sequence to find the smallest key
    Implementation with a sorted sequence
      Store the items of the priority queue in a sequence, sorted by key
      Performance:
        insertItem takes O(n) time since we have to find the place where to insert the item
        removeMin, minKey and minElement take O(1) time since the smallest key is at the beginning of the sequence
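The two sequence-based implementations can be compared side by side. The sketch below uses Python lists and illustrative class names. One deliberate deviation, labeled here as an assumption: the sorted version keeps keys in decreasing order so that the smallest key is at the end of the list, where a Python list can remove in O(1); the slide places it at the beginning of a list-based sequence.

```python
class UnsortedListPQ:
    """Items kept in arbitrary order: O(1) insert, O(n) remove_min."""

    def __init__(self):
        self._items = []

    def insert_item(self, key, element):
        self._items.append((key, element))

    def remove_min(self):
        # Scan the whole list for the position of the smallest key.
        i = min(range(len(self._items)), key=lambda j: self._items[j][0])
        return self._items.pop(i)[1]


class SortedListPQ:
    """Items kept sorted (decreasing keys): O(n) insert, O(1) remove_min."""

    def __init__(self):
        self._items = []

    def insert_item(self, key, element):
        # Walk from the small-key end until the correct slot is found.
        i = len(self._items)
        while i > 0 and self._items[i - 1][0] < key:
            i -= 1
        self._items.insert(i, (key, element))

    def remove_min(self):
        return self._items.pop()[1]  # smallest key is last


def sort_with(pq, keys):
    """PQ-sort the keys with the given priority queue."""
    for k in keys:
        pq.insert_item(k, k)
    return [pq.remove_min() for _ in keys]
```

Running `sort_with` with the first class behaves like selection-sort and with the second like insertion-sort.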

    6/8/2002 2:00 PM Priority Queues 8

    Selection-Sort

    Selection-sort is the variation of PQ-sort where the priority queue is implemented with an unsorted sequence
    Running time of Selection-sort:
      1. Inserting the elements into the priority queue with n insertItem operations takes O(n) time
      2. Removing the elements in sorted order from the priority queue with n removeMin operations takes time proportional to 1 + 2 + … + n
    Selection-sort runs in O(n^2) time

    6/8/2002 2:00 PM Priority Queues 9

    Insertion-Sort

    Insertion-sort is the variation of PQ-sort where the priority queue is implemented with a sorted sequence
    Running time of Insertion-sort:
      1. Inserting the elements into the priority queue with n insertItem operations takes time proportional to 1 + 2 + … + n
      2. Removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time
    Insertion-sort runs in O(n^2) time

    6/8/2002 2:00 PM Priority Queues 10

    In-place Insertion-sort

    Instead of using an external data structure, we can implement selection-sort and insertion-sort in-place
    A portion of the input sequence itself serves as the priority queue
    For in-place insertion-sort
      We keep sorted the initial portion of the sequence
      We can use swapElements instead of modifying the sequence

    5 4 2 3 1

    5 4 2 3 1

    4 5 2 3 1

    2 4 5 3 1

    2 3 4 5 1

    1 2 3 4 5

    1 2 3 4 5
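The in-place variant can be sketched as follows. The function name and the idea of recording a snapshot after each pass are illustrative additions; the swap of adjacent elements plays the role of swapElements.

```python
def insertion_sort_in_place(a):
    """Sort list a in place; return a snapshot of a after each insertion."""
    snapshots = [list(a)]
    for i in range(1, len(a)):
        j = i
        # Swap a[i] leftwards until the prefix a[0..i] is sorted.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
        snapshots.append(list(a))
    return snapshots


trace = insertion_sort_in_place([5, 4, 2, 3, 1])
for row in trace:
    print(*row)
```

On the input 5 4 2 3 1, the snapshots match the distinct rows of the example above.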

  • Dictionaries 4/5/2002 15:1

    Dictionaries and Hash Tables 1

    Dictionaries and Hash Tables

    [Figure: hash table cells 0 to 4, with 025-612-0001 in cell 1, 981-101-0002 in cell 2, and 451-229-0004 in cell 4]

    Dictionaries and Hash Tables 2

    Dictionary ADT (2.5.1)

    The dictionary ADT models a searchable collection of key-element items
    The main operations of a dictionary are searching, inserting, and deleting items
    Multiple items with the same key are allowed
    Applications:
      address book
      credit card authorization
      mapping host names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)
    Dictionary ADT methods:
      findElement(k): if the dictionary has an item with key k, returns its element, else returns the special element NO_SUCH_KEY
      insertItem(k, o): inserts item (k, o) into the dictionary
      removeElement(k): if the dictionary has an item with key k, removes it from the dictionary and returns its element, else returns the special element NO_SUCH_KEY
      size(), isEmpty()
      keys(), elements()

    Dictionaries and Hash Tables 3

    Log File (2.5.1)

    A log file is a dictionary implemented by means of an unsorted sequence
      We store the items of the dictionary in a sequence (based on a doubly-linked list or a circular array), in arbitrary order
    Performance:
      insertItem takes O(1) time since we can insert the new item at the beginning or at the end of the sequence
      findElement and removeElement take O(n) time since in the worst case (the item is not found) we traverse the entire sequence to look for an item with the given key
    The log file is effective only for dictionaries of small size or for dictionaries on which insertions are the most common operations, while searches and removals are rarely performed (e.g., historical record of logins to a workstation)

    Dictionaries and Hash Tables 4

    Hash Functions and Hash Tables (2.5.2)

    A hash function h maps keys of a given type to integers in a fixed interval [0, N − 1]
    Example: h(x) = x mod N is a hash function for integer keys
    The integer h(x) is called the hash value of key x
    A hash table for a given key type consists of
      Hash function h
      Array (called table) of size N
    When implementing a dictionary with a hash table, the goal is to store item (k, o) at index i = h(k)

    Dictionaries and Hash Tables 5

    Example

    We design a hash table for a dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer
    Our hash table uses an array of size N = 10,000 and the hash function h(x) = last four digits of x

    [Figure: the table array]
      0     (empty)
      1     025-612-0001
      2     981-101-0002
      3     (empty)
      4     451-229-0004
      …
      9997  (empty)
      9998  200-751-9998
      9999  (empty)

    Dictionaries and Hash Tables 6

    Hash Functions (2.5.3)

    A hash function is usually specified as the composition of two functions:
      Hash code map: h1: keys → integers
      Compression map: h2: integers → [0, N − 1]
    The hash code map is applied first, and the compression map is applied next on the result, i.e., h(x) = h2(h1(x))
    The goal of the hash function is to disperse the keys in an apparently random way


    Dictionaries and Hash Tables 7

    Hash Code Maps (2.5.3)

    Memory address:
      We reinterpret the memory address of the key object as an integer (default hash code of all Java objects)
      Good in general, except for numeric and string keys
    Integer cast:
      We reinterpret the bits of the key as an integer
      Suitable for keys of length less than or equal to the number of bits of the integer type (e.g., byte, short, int and float in Java)
    Component sum:
      We partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and we sum the components (ignoring overflows)
      Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (e.g., long and double in Java)

    Dictionaries and Hash Tables 8

    Hash Code Maps (cont.)

    Polynomial accumulation:
      We partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits): a_0 a_1 … a_{n−1}
      We evaluate the polynomial p(z) = a_0 + a_1 z + a_2 z^2 + … + a_{n−1} z^{n−1} at a fixed value z, ignoring overflows
      Especially suitable for strings (e.g., the choice z = 33 gives at most 6 collisions on a set of 50,000 English words)
    Polynomial p(z) can be evaluated in O(n) time using Horner's rule:
      The following polynomials are successively computed, each from the previous one in O(1) time
        p_0(z) = a_{n−1}
        p_i(z) = a_{n−i−1} + z p_{i−1}(z)   (i = 1, 2, …, n − 1)
      We have p(z) = p_{n−1}(z)
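Horner's rule translates directly into a loop. The sketch below makes two assumptions that are not in the slides: the components a_i are the character codes of a string, and "ignoring overflows" is modeled by keeping only the low 32 bits. The names `polynomial_hash` and `naive_polynomial_hash` are illustrative.

```python
MASK = 0xFFFFFFFF  # keep 32 bits, an assumed model of "ignoring overflows"


def polynomial_hash(s, z=33):
    """Evaluate a_0 + a_1*z + ... + a_{n-1}*z^(n-1) with Horner's rule."""
    components = [ord(ch) for ch in s]  # a_0, a_1, ..., a_{n-1}
    p = 0
    for a in reversed(components):      # first pass yields p_0 = a_{n-1}
        p = (a + z * p) & MASK          # p_i = a_{n-i-1} + z * p_{i-1}
    return p


def naive_polynomial_hash(s, z=33):
    """Same polynomial, evaluated term by term for comparison."""
    return sum(ord(ch) * z ** i for i, ch in enumerate(s)) & MASK
```

Each loop iteration does one multiplication and one addition, so the whole evaluation takes O(n) time.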

    Dictionaries and Hash Tables 9

    Compression Maps (2.5.4)

    Division:
      h2(y) = y mod N
      The size N of the hash table is usually chosen to be a prime
      The reason has to do with number theory and is beyond the scope of this course
    Multiply, Add and Divide (MAD):
      h2(y) = (ay + b) mod N
      a and b are nonnegative integers such that a mod N ≠ 0
      Otherwise, every integer would map to the same value b
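Both compression maps are one-liners. The sketch below uses illustrative function names, and the sample values of a and b are arbitrary choices rather than values from the slides.

```python
def division_map(y, N):
    """Division compression map: h2(y) = y mod N."""
    return y % N


def mad_map(y, N, a, b):
    """Multiply, Add and Divide: h2(y) = (a*y + b) mod N."""
    if a % N == 0:
        # With a mod N = 0 the term a*y vanishes and every y maps to b mod N.
        raise ValueError("a mod N must be nonzero")
    return (a * y + b) % N


N = 13
print(division_map(18, N))          # 18 mod 13
print(mad_map(18, N, a=3, b=7))     # (3*18 + 7) mod 13
```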

    Dictionaries and Hash Tables 10

    Collision Handling (2.5.5)

    Collisions occur when different elements are mapped to the same cell
    Chaining: let each cell in the table point to a linked list of elements that map there
    Chaining is simple, but requires additional memory outside the table

    [Figure: table cells 0 to 4; cell 1 points to 025-612-0001, and cell 4 points to a list holding 451-229-0004 and 981-101-0004]
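Chaining can be sketched with one bucket per cell. This is a minimal illustration with assumed names: Python lists stand in for the linked lists, Python's built-in `hash` serves as the hash code map, and division by the table size is the compression map.

```python
NO_SUCH_KEY = object()  # sentinel returned when a key is absent


class ChainedHashTable:
    def __init__(self, N=13):
        self._buckets = [[] for _ in range(N)]  # one chain per table cell

    def _bucket(self, k):
        return self._buckets[hash(k) % len(self._buckets)]

    def insert_item(self, k, o):
        self._bucket(k).append((k, o))

    def find_element(self, k):
        for key, element in self._bucket(k):
            if key == k:
                return element
        return NO_SUCH_KEY

    def remove_element(self, k):
        bucket = self._bucket(k)
        for i, (key, element) in enumerate(bucket):
            if key == k:
                del bucket[i]
                return element
        return NO_SUCH_KEY


table = ChainedHashTable(13)
table.insert_item(18, "first")   # 18 mod 13 = 5
table.insert_item(44, "second")  # 44 mod 13 = 5, so it joins the same chain
```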

    Dictionaries and Hash Tables 11

    Linear Probing (2.5.5)

    Open addressing: the colliding item is placed in a different cell of the table
    Linear probing handles collisions by placing the colliding item in the next (circularly) available table cell
    Each table cell inspected is referred to as a probe
    Colliding items lump together, causing future collisions to cause a longer sequence of probes
    Example:
      h(x) = x mod 13
      Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

    Cell:  0   1   2   3   4   5   6   7   8   9   10  11  12
    Key:           41          18  44  59  32  22  31  73

    Dictionaries and Hash Tables 12

    Search with Linear Probing

    Consider a hash table A that uses linear probing
    findElement(k)
      We start at cell h(k)
      We probe consecutive locations until one of the following occurs
        An item with key k is found, or
        An empty cell is found, or
        N cells have been unsuccessfully probed

    Algorithm findElement(k)
      i ← h(k)
      p ← 0
      repeat
        c ← A[i]
        if c = ∅
          return NO_SUCH_KEY
        else if c.key() = k
          return c.element()
        else
          i ← (i + 1) mod N
          p ← p + 1
      until p = N
      return NO_SUCH_KEY


    Dictionaries and Hash Tables 13

    Updates with Linear Probing

    To handle insertions and deletions, we introduce a special object, called AVAILABLE, which replaces deleted elements
    removeElement(k)
      We search for an item with key k
      If such an item (k, o) is found, we replace it with the special item AVAILABLE and we return element o
      Else, we return NO_SUCH_KEY
    insertItem(k, o)
      We throw an exception if the table is full
      We start at cell h(k)
      We probe consecutive cells until one of the following occurs
        A cell i is found that is either empty or stores AVAILABLE, or
        N cells have been unsuccessfully probed
      We store item (k, o) in cell i
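Search, insertion and removal with linear probing can be sketched together. The sketch assumes integer keys with h(k) = k mod N, uses `None` for an empty cell, and uses illustrative Python names; the AVAILABLE marker follows the description above.

```python
NO_SUCH_KEY = object()
AVAILABLE = object()  # marks a cell whose item was removed


class LinearProbingTable:
    def __init__(self, N=13):
        self.N = N
        self.cells = [None] * N  # None plays the role of an empty cell

    def _h(self, k):
        return k % self.N

    def find_element(self, k):
        i = self._h(k)
        for _ in range(self.N):          # at most N probes
            c = self.cells[i]
            if c is None:                # empty cell: k cannot be further on
                return NO_SUCH_KEY
            if c is not AVAILABLE and c[0] == k:
                return c[1]
            i = (i + 1) % self.N
        return NO_SUCH_KEY

    def insert_item(self, k, o):
        i = self._h(k)
        for _ in range(self.N):
            c = self.cells[i]
            if c is None or c is AVAILABLE:
                self.cells[i] = (k, o)
                return
            i = (i + 1) % self.N
        raise OverflowError("table is full")

    def remove_element(self, k):
        i = self._h(k)
        for _ in range(self.N):
            c = self.cells[i]
            if c is None:
                return NO_SUCH_KEY
            if c is not AVAILABLE and c[0] == k:
                self.cells[i] = AVAILABLE   # keep later probe chains intact
                return c[1]
            i = (i + 1) % self.N
        return NO_SUCH_KEY

    def layout(self):
        """Keys by cell, with None for empty or AVAILABLE cells."""
        return [c[0] if isinstance(c, tuple) else None for c in self.cells]


table = LinearProbingTable(13)
for key in [18, 41, 22, 44, 59, 32, 31, 73]:
    table.insert_item(key, str(key))
```

Replacing a removed item with AVAILABLE rather than emptying the cell matters: a search for a key stored further along the probe sequence would otherwise stop early at the gap.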

    Dictionaries and Hash Tables 14

    Double Hashing

    Double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series (i + j d(k)) mod N for j = 0, 1, …, N − 1
    The secondary hash function d(k) cannot have zero values
    The table size N must be a prime to allow probing of all the cells
    Common choice of compression map for the secondary hash function: d2(k) = q − k mod q, where
      q < N
      q is a prime
    The possible values for d2(k) are 1, 2, …, q

    Dictionaries and Hash Tables 15

    Example of Double Hashing

    Consider a hash table storing integer keys that handles collision with double hashing
      N = 13
      h(k) = k mod 13
      d(k) = 7 − k mod 7
    Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

    k    h(k)  d(k)  Probes
    18   5     3     5
    41   2     1     2
    22   9     6     9
    44   5     5     5 10
    59   7     4     7
    32   6     3     6
    31   5     4     5 9 0
    73   8     4     8

    Cell:  0   1   2   3   4   5   6   7   8   9   10  11  12
    Key:   31      41          18  32  59  73  22  44
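The example can be replayed in a few lines. The sketch assumes a bare list of keys as the table and an illustrative function name; it uses h(k) = k mod N and d(k) = q − k mod q with q = 7, as in the example.

```python
def double_hash_insert(table, k, q=7):
    """Insert key k with double hashing; return the list of cells probed."""
    N = len(table)
    i = k % N            # primary hash h(k)
    d = q - k % q        # secondary hash d(k), always in 1..q
    probes = []
    for j in range(N):
        cell = (i + j * d) % N
        probes.append(cell)
        if table[cell] is None:
            table[cell] = k
            return probes
    raise OverflowError("table is full")


table = [None] * 13
probe_log = {k: double_hash_insert(table, k)
             for k in [18, 41, 22, 44, 59, 32, 31, 73]}
for k, probes in probe_log.items():
    print(k, k % 13, 7 - k % 7, probes)
```

The printed rows correspond to the columns k, h(k), d(k) and Probes of the table above.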

    Dictionaries and Hash Tables 16

    Performance of Hashing

    In the worst case, searches, insertions and removals on a hash table take O(n) time
    The worst case occurs when all the keys inserted into the dictionary collide
    The load factor α = n/N affects the performance of a hash table
    Assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is 1 / (1 − α)
    The expected running time of all the dictionary ADT operations in a hash table is O(1)
    In practice, hashing is very fast provided the load factor is not close to 100%
    Applications of hash tables:
      small databases
      compilers
      browser caches
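A quick calculation shows how sharply the expected probe count 1 / (1 − α) grows as the table fills. The function name and the sample table size are illustrative.

```python
def expected_probes(n, N):
    """Expected probes for an insertion with open addressing: 1 / (1 - alpha)."""
    alpha = n / N  # load factor
    if alpha >= 1:
        raise ValueError("the formula needs a load factor below 1")
    return 1 / (1 - alpha)


N = 1000
for n in (250, 500, 750, 900, 990):
    print(f"load factor {n / N:.2f}: about {expected_probes(n, N):.1f} probes")
```

At a load factor of one half the expectation is 2 probes, while at 90% it is already 10, which is why the load factor should stay well below 100%.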

    Dictionaries and Hash Tables 17

    Universal Hashing (2.5.6)

    A family of hash functions is universal if, for any 0


    Dictionaries and Hash Tables 19

    Proof of Universality (Part 2)

    If f causes no collisions, only g can make h cause collisions.
    Fix a number x. Of the p integers y = f(k), different from x, the number such that g(y) = g(x) is at most ⌈p/N⌉ − 1
    Since there are p choices for x, the number of h's that will cause a collision between j and k is at most p(⌈p/N⌉ − 1) ≤ p(p − 1)/N
    There are p(p − 1) functions h. So probability of collision is at most (p(p − 1)/N) / (p(p − 1)) = 1/N
    Therefore, the set of possible h functions is universal.

  • Dictionaries 6/8/2002 2:01 PM


    6/8/2002 2:01 PM Dictionaries 1

    Dictionaries

    [Figure: binary search tree with root 6 and keys 1, 2, 4, 6, 8, 9]

    6/8/2002 2:01 PM Dictionaries 2

    Outline and Reading

    Dictionary ADT (2.5.1)
    Log file (2.5.1)
    Binary search (3.1.1)
    Lookup table (3.1.1)
    Binary search tree (3.1.2)
      Search (3.1.3)
      Insertion (3.1.4)
      Deletion (3.1.5)
      Performance (3.1.6)


    6/8/2002 2:01 PM Dictionaries 5

    Binary Search

    Binary search performs operation findElement(k) on a dictionary implemented by means of an array-based sequence, sorted by key
      similar to the high-low game
      at each step, the number of candidate items is halved
      terminates after a logarithmic number of steps
    Example: findElement(7)

    [Figure: four snapshots of the array 1 3 4 5 7 8 9 11 14 16 18 19, with markers l, m and h narrowing the search range until l = m = h at key 7]

    6/8/2002 2:01 PM Dictionaries 6

    Lookup Table

    A lookup table is a dictionary implemented by means of a sorted sequence
      We store the items of the dictionary in an array-based sequence, sorted by key
      We use an external comparator for the keys
    Performance:
      findElement takes O(log n) time, using binary search
      insertItem takes O(n) time since in the worst case we have to shift n/2 items to make room for the new item
      removeElement takes O(n) time since in the worst case we have to shift n/2 items to compact the items after the removal
    The lookup table is effective only for dictionaries of small size or for dictionaries on which searches are the most common operations, while insertions and removals are rarely performed (e.g., credit card authorizations)
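A lookup table with binary search can be sketched on top of two parallel Python lists. The class and method names are illustrative, Python's natural key ordering stands in for the external comparator, and the standard-library `bisect` module is used to locate insertion and removal positions.

```python
import bisect

NO_SUCH_KEY = object()


class LookupTable:
    def __init__(self):
        self._keys = []      # sorted keys
        self._elements = []  # elements, parallel to _keys

    def find_element(self, k):
        """Binary search: halve the candidate range [low, high] each step."""
        low, high = 0, len(self._keys) - 1
        while low <= high:
            mid = (low + high) // 2
            if k == self._keys[mid]:
                return self._elements[mid]
            if k < self._keys[mid]:
                high = mid - 1
            else:
                low = mid + 1
        return NO_SUCH_KEY

    def insert_item(self, k, o):
        i = bisect.bisect_right(self._keys, k)
        self._keys.insert(i, k)        # shifts later items: O(n)
        self._elements.insert(i, o)

    def remove_element(self, k):
        i = bisect.bisect_left(self._keys, k)
        if i == len(self._keys) or self._keys[i] != k:
            return NO_SUCH_KEY
        self._keys.pop(i)              # compacts later items: O(n)
        return self._elements.pop(i)


table = LookupTable()
for key in [1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19]:
    table.insert_item(key, f"item-{key}")
```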


    6/8/2002 2:01 PM Dictionaries 7

    Binary Search Tree

    A binary search tree is a binary tree storing keys (or key-element pairs) at its internal nodes and satisfying the following property:
      Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have key(u) ≤ key(v) ≤ key(w)
    External nodes do not store items
    An inorder traversal of a binary search tree visits the keys in increasing order

    [Figure: binary search tree with root 6, whose children are 2 and 9; 2 has children 1 and 4, and 9 has left child 8]

    6/8/2002 2:01 PM Dictionaries 8

    Search

    To search for a key k, we trace a downward path starting at the root
    The next node visited depends on the outcome of the comparison of k with the key of the current node
    If we reach a leaf, the key is not found and we return NO_SUCH_KEY
    Example: findElement(4)

    Algorithm findElement(k, v)
      if T.isExternal(v)
        return NO_SUCH_KEY
      if k < key(v)
        return findElement(k, T.leftChild(v))
      else if k = key(v)
        return element(v)
      else { k > key(v) }
        return findElement(k, T.rightChild(v))

    [Figure: the search path for key 4 in the tree with root 6, going from 6 to 2 to 4]

    6/8/2002 2:01 PM Dictionaries 9

    Insertion

    To perform operation insertItem(k, o), we search for key k
    Assume k is not already in the tree, and let w be the leaf reached by the search
    We insert k at node w and expand w into an internal node
    Example: insert 5

    [Figure: the tree before and after inserting 5; the search ends at leaf w, the right child of 4, which becomes the internal node storing 5]
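Search and insertion can be sketched with a small node class. One simplification is assumed here: `None` stands in for an external (leaf) node, so "expanding w into an internal node" becomes creating a new `Node` where the search fell off the tree. All names are illustrative.

```python
NO_SUCH_KEY = object()


class Node:
    def __init__(self, key, element):
        self.key, self.element = key, element
        self.left = None   # None plays the role of an external node
        self.right = None


def find_element(k, v):
    """Trace a downward path from v, comparing k with each key."""
    if v is None:
        return NO_SUCH_KEY
    if k < v.key:
        return find_element(k, v.left)
    elif k == v.key:
        return v.element
    else:
        return find_element(k, v.right)


def insert_item(k, o, v):
    """Insert (k, o) into the subtree rooted at v; return the subtree root."""
    if v is None:
        return Node(k, o)  # the leaf reached by the search becomes internal
    if k < v.key:
        v.left = insert_item(k, o, v.left)
    else:
        v.right = insert_item(k, o, v.right)
    return v


root = None
for key in [6, 2, 9, 1, 4, 8]:       # builds the example tree with root 6
    root = insert_item(key, f"elem{key}", root)
root = insert_item(5, "elem5", root)  # the "insert 5" example
```

Both operations follow a single root-to-leaf path, so their cost is proportional to the height of the tree.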

    6/8/2002 2:01 PM Dictionaries 10

    Deletion

    To perform operation removeElement(k), we search for key k
    Assume key k is in the tree, and let v be the node storing k
    If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w)
    Example: remove 4

    [Figure: the tree before and after removing 4; v is the node storing 4 and w is its leaf child, and afterwards 5 takes the place of 4]

    6/8/2002 2:01 PM Dictionaries 11

    Deletion (cont.)

    We consider the case where the key k to be removed is stored at a node v whose children are both internal
      we find the internal node w that follows v in an inorder traversal
      we copy key(w) into node v
      we remove node w and its left child z (which must be a leaf) by means of operation removeAboveExternal(z)
    Example: remove 3

    [Figure: the tree before and after removing 3; v is the node storing 3, w is the node storing 5 that follows v in an inorder traversal, and z is the left (leaf) child of w; afterwards v stores 5]
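Both deletion cases can be sketched in one recursive function. As a simplification, `None` stands in for an external node, so splicing a node out replaces removeAboveExternal. The names are illustrative, and the shape of the sample tree is an assumed reading of the "remove 3" figure, not something the transcript states.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None   # None plays the role of an external node
        self.right = None


def insert(v, k):
    """Plain BST insertion, used only to build the sample tree."""
    if v is None:
        return Node(k)
    if k < v.key:
        v.left = insert(v.left, k)
    else:
        v.right = insert(v.right, k)
    return v


def remove(v, k):
    """Remove key k from the subtree rooted at v; return the new subtree root."""
    if v is None:
        return None
    if k < v.key:
        v.left = remove(v.left, k)
    elif k > v.key:
        v.right = remove(v.right, k)
    elif v.left is None:            # v has a leaf child: splice v out
        return v.right
    elif v.right is None:
        return v.left
    else:                           # both children of v are internal
        w = v.right
        while w.left is not None:   # w follows v in an inorder traversal
            w = w.left
        v.key = w.key               # copy key(w) into node v
        v.right = remove(v.right, w.key)  # w has a leaf left child
    return v


def inorder(v):
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)


root = None
for key in [1, 3, 2, 8, 6, 9, 5]:   # assumed shape of the example tree
    root = insert(root, key)
root = remove(root, 3)
```

After the removal, the node that held 3 holds 5, and an inorder traversal still visits the keys in increasing order.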

    6/8/2002 2:01 PM Dictionaries 12

    Performance

    Consider a dictionary with n items implemented by means of a binary search tree of height h
      the space used is O(n)
      methods findElement, insertItem and removeElement take O(h) time
    The height h is O(n) in the worst case and O(log n) in the best case

  • Dictionaries 4/5/2002 15:1

    Binary Search Trees 1

    Binary Search Trees


    Binary Search Trees 2

    Ordered Dictionaries

    Keys are assumed to come from a total order.
    New operations:
      closestKeyBefore(k)
      closestElemBefore(k)
      closestKeyAfter(k)
      closestElemAfter(k)
