Upload
neal-curtis
View
223
Download
0
Embed Size (px)
Citation preview
The Set ADTList and Tree Implementations
CSCI 2720
Spring 2007
Eileen Kraemer
The Set ADT
We assume: Members drawn from a single universe Sets are finite; universe may be infinite No duplicate members
But some representations can be used to represent multisets (which may contain duplicate members)
Note: ADT = abstract data typeInteresting because “is x in S?” is asked in
many applications.
The Set ADT
Member(x,S) If x in S return true else return false
Union(S,T) return S T, the set consisting of every x in S or T or both
Intersection(S,T) return S T, the set consisting of every x in both S and T
Difference (S,T) return S – T, the set of all x in S that are not in T
MakeEmptySet() Return the empty set
The Set ADT
IsEmptySet(S) return true if S is the empty set, else return false
Size(S) return |S| (“cardinality of S”), the number of elements in the set S
Insert(x,S) set S to S {x} ; no effect if x is already in S
Delete(x,S) set S to S – {x}; no effect if x not in S
Equal(S,T) Return true if S == T (if S and T have the same members)
Iterate(S,F) perform operation F on each member of set S (in no particular
order)
Additional set operations ….
Apply only if members are linearly ordered Min(S) – return smallest member of set S Max(S) – return largest member of set S
Applies only if members are of form <K,I> Where K is a key drawn from some key space and I is
some associated info Lookup(K,S)
Given a key value K, return an Info I such that <K,I> is a member of S; if no such member exists return
If a <K,I> is found : successful search If is returned : unsuccessful search
The Dictionary ADT
A set ADT with the operations: MakeEmptySet IsEmptySet Insert Delete Lookup
Unordered Lists
Simplest representation of setsMay use contiguous memory, singly
linked, doubly linkedApplies to sets drawn from any universeKeys may be ordered or unordered
Unordered Lists
Lookup Simple sequential search;
At each step, compare two keys for equality If dictionary contains n elements, then Lookup
is Θ(n) in the worst case if the lists are unordered
Insert = Lookup + insertion cost Delete = Lookup + deletion cost
Time to perform Lookup
Count the number of comparisons: How many (K == K’) operations are performed?
Linked list: Worst case for successful search:
• K is last key in list: n comparisons Worst case for unsuccessful search:
• Compare to all keys in list: n comparisons Seems “okay” only if list is short
Expected-cost analysis
Here, we consider the frequency and cost of each lookup and compute a weighted sum: the expected cost
Assume:Uniform distribution of access to keys
Expected cost for linked lists: = average number of comparisons to lookup a key
= (1+2+3+…+n)/n = (n+1)/2
Thus: Expected cost for a lookup in linked list is Θ(n)
Assume:non-uniform access to keys Closer to reality: relatively few keys account for
most of the LookUps Let the keys in the dictionary be K1, …, Kn,
ordered by frequency of access (K1 accessed most frequently)
Let pi be frequency of access to key Ki p1 + p2 + … + pn = 1.0
For successful searches: Expected cost = (p1 *1)+ (p2 *2)+ … + (pn *n)
For unsuccessful searches: Cost is always θ(n)
What is the optimal ordering?
A list kept in frequency order: p1 >= p2 >= … >= pn
If the frequencies are sufficiently skewed, this can be much less than (n+1)/2
Example: p1 = ½, p2 = ¼, p3= ⅛, …etc. Expected cost is :
(1* ½ + 2* ¼ + 3* ⅛+ …..n* 2-n+1)/n less than 2
If p=4, then cost = (1* ½ + 2* ¼ + 3* ⅛+4*⅛) = 15/32
But you usually don’t know the frequency of access in advance …
However, you can adjust the order of the list as the Lookups occur: Move-to-front Heuristic: Transpose Heuristic:
Move-to-front Heuristic:
After each successful search, move the item that was sought to the front of the list
Cost is no more than twice the optimum (proof in text)
Example: {A, B, C, D}Search on C: cost=3, adjust to {C, A, B, D}Search on D: cost=4, adjust to {D, C, A, B}Search on C: cost=2, adjust to {C, D, A, B}Search on D: cost=1, adjust to {D, C, A, B}
Transpose Heuristic:
If the item sought is not the first in the list, move it one position forward by exchanging it with the item just before it
Performance (after steady state reached) is even better than Move-to-Front (but harder to prove)
Example: {A, B, C, D} Search on C: cost=3, adjust to {A, C, B, D} Search on D: cost=4, adjust to {A, C, D, B} Search on C: cost=2, adjust to {C, A, D, B} Search on A, cost=2, adjust to {A, C, D, B}
Ordered Lists
If the keys have some linear order, can maintain list in order by key value
Benefits: Unsuccessful search can stop when key of
greater value is encountered: expected number of comparisons reduced to n/2
Ordered Lists
If representation is tabular (contigous memory, “array-based”), can employ binary search
Binary search (worst case) between 1 and floor(lg n) + 1 comparisons in a
successful search of a table of size n, either floor(lg n) or floor(lg n) + 1 comparisons
in an unsuccessful search Floor(lg n) + 1 exactly, if n is one less than a
power of 2
BinarySearchLookup
Function BinarySearchLookUp(key K, table T[0..n-1]):info{return information stored with key K in T, or if K is not in T}Left = 0Right = n-1Repeat forever
if Right < Left thenreturn
elseMiddle = floor((Left+Right)/2)if (K == Key(T[Middle])) then return Info(T[Middle])else if K < Key(T[Middle])then Right <= Middle -1else Left = Middle + 1
Binary Search Example
Choose an array of 16 values (have class provide values)
Work through example on board.
Interpolation Search
function InterpolationSearchLookup(key K, table T[0..n-1]): info//return info stored with K in ordered table T, or if K not present//table positions T[-1] and T[n] are assumed to be available to the algKey(T[-1]) = -1Key(T[n]) = NLeft = 0Right = n-1repeat forever
if Right < Left return else
p = (K – Key(T[Left-1])) / (Key(T[Right+1]) – Key(T[Left-1]))middle = floor(p * (right-left+1)) + leftif K == Key(T[middle]) return Info(T[middle])else if K < Key(T[Middle]) then Right = Middle -1else Left = Middle + 1