
The Disjoint Set ADT

CS146

Chapter 8

Yan Qing Lei


Issues:

• The equivalence problem

• The first algorithm

• Smart union algorithms

• Union and Find


Equivalence Relations

• A relation R is defined on a set S if, for every pair of elements (a, b) with a, b ∈ S, a R b is either true or false. If a R b is true, we say that a is related to b.

• An equivalence relation is a relation R that satisfies three properties:
– (reflexive) a R a, for all a ∈ S.
– (symmetric) a R b if and only if b R a.
– (transitive) a R b and b R c implies that a R c.
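
The three properties can be checked mechanically on a small finite set. The following is a minimal sketch, assuming the relation is given as a boolean matrix over S = {0, ..., n-1}; the names `Relation` and `isEquivalence` are illustrative and do not come from the slides.

```cpp
#include <cstddef>
#include <vector>

// rel[a][b] == true means "a R b" on the set S = {0, ..., n-1}.
using Relation = std::vector<std::vector<bool>>;

// Returns true when rel is reflexive, symmetric, and transitive.
bool isEquivalence(const Relation& rel)
{
    const std::size_t n = rel.size();
    for (std::size_t a = 0; a < n; ++a) {
        if (!rel[a][a])
            return false;                                  // not reflexive
        for (std::size_t b = 0; b < n; ++b) {
            if (rel[a][b] != rel[b][a])
                return false;                              // not symmetric
            for (std::size_t c = 0; c < n; ++c)
                if (rel[a][b] && rel[b][c] && !rel[a][c])
                    return false;                          // not transitive
        }
    }
    return true;
}
```

For example, "has the same parity as" on {0, 1, 2} passes the check, while "less than or equal to" fails it because it is not symmetric.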


The Dynamic Equivalence Problem

• The equivalence class of an element a ∈ S is the subset of S that contains all the elements that are related to a.

• Equivalence classes form a partition of S: every member of S appears in exactly one equivalence class.

• To decide whether two members are related, we only need to check whether they are in the same equivalence class.


Disjoint Sets

• Make the input a collection of N sets, each with one element.
– All relations (except reflexive ones) are false.
– Each set has a different element, so S_i ∩ S_j = ∅ for i ≠ j, which makes the sets disjoint.

• Find operation: returns the name of the set containing a given element.

• Add operation (e.g., add relation a ~ b):
– Check whether a and b are already related, i.e., whether they are in the same equivalence class.
– If not, apply "union": merge the two equivalence classes containing a and b into a new equivalence class.


Example:

On a set of subsets, the three operations amount to:

• Create a set of n disjoint subsets with every node in its own subset.

• Test whether A and B are in the same subset.

• If A and B are in the same subset, do nothing; otherwise unify the subsets to which A and B belong.


• The most efficient way to implement these operations is as follows:

• For every A, it is possible to ask for the index of the subset to which A belongs.

• This is denoted find(A), and we have A ~ B <--> find(A) == find(B). The operation of adding a relation between A and B is denoted union(A, B) (even though nothing needs to happen when A and B are already in the same subset). Thus, for union(A, B) one performs:


Coding

// "union" is a reserved word in C++, so the function is named unionSets here.
void unionSets(Node A, Node B)
{
    if (find(A) != find(B))   // only merge when A and B are in different subsets
        unify(A, B);
}


The first algorithm

• All the elements in S are numbered sequentially from 0 to N-1 (the numbering can be determined by hashing), so we have S_i = {i} for i = 0 through N-1.

• Maintain, in an array, the name of the equivalence class for each element.

• "find" is just a simple O(1) lookup.

• "union(a,b)": suppose that a is in equivalence class i and b is in equivalence class j. Scan through the array, changing each i to j. This takes O(N) time for one union and O(N^2) time for a sequence of N unions.
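
The array-based scheme above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the slides: the class name `QuickFind` and the member `name` are hypothetical.

```cpp
#include <vector>

// Quick-find sketch: name[x] holds the name of x's equivalence class.
class QuickFind
{
  public:
    explicit QuickFind(int numElements) : name(numElements)
    {
        for (int i = 0; i < numElements; ++i)
            name[i] = i;                    // initially S_i = {i}
    }

    // O(1): a single array lookup.
    int find(int x) const { return name[x]; }

    // O(N): scan the whole array, renaming class i to class j.
    void unionSets(int a, int b)
    {
        const int i = name[a];
        const int j = name[b];
        if (i == j)
            return;                         // already in the same class
        for (int& entry : name)
            if (entry == i)
                entry = j;
    }

  private:
    std::vector<int> name;
};
```

Every union touches all N entries even when the class being renamed is tiny, which is the cost the optimizations on the next slide address.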


Optimizations

• Keep all the elements that are in the same equivalence class in a linked list.

• By tracking the size of each equivalence class, when performing a "union" we change the name of the smaller equivalence class to that of the larger. Each element then has its class name changed at most log N times, because its class at least doubles in size with every change, so the total time spent on N unions is O(N log N). Using this strategy, any sequence of M finds and up to N-1 unions takes at most O(M + N log N) time.
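
A possible sketch of this optimization keeps one `std::list` of members per class name and always relabels the shorter list. The class name `WeightedQuickFind` and its members are assumptions made for illustration only.

```cpp
#include <list>
#include <utility>
#include <vector>

// Quick-find with per-class member lists; unions relabel the smaller class.
class WeightedQuickFind
{
  public:
    explicit WeightedQuickFind(int numElements)
        : name(numElements), members(numElements)
    {
        for (int i = 0; i < numElements; ++i) {
            name[i] = i;
            members[i].push_back(i);        // class i starts as {i}
        }
    }

    int find(int x) const { return name[x]; }   // still an O(1) lookup

    void unionSets(int a, int b)
    {
        int smaller = name[a];
        int larger = name[b];
        if (smaller == larger)
            return;
        if (members[smaller].size() > members[larger].size())
            std::swap(smaller, larger);
        for (int x : members[smaller])
            name[x] = larger;               // only the smaller class is renamed
        // Move the relabelled elements onto the larger list without copying.
        members[larger].splice(members[larger].end(), members[smaller]);
    }

  private:
    std::vector<int> name;                  // name[x] = class of x
    std::vector<std::list<int>> members;    // members[c] = elements of class c
};
```

The work per union is proportional to the size of the smaller class rather than to N.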


The O(M+N) algorithm

#include <vector>

class DisjSets
{
  public:
    explicit DisjSets(int numElements);
    int find(int x) const;                 // plain find
    int find(int x);                       // find with path compression
    void unionSets(int root1, int root2);

  private:
    std::vector<int> s;                    // s[x] < 0 marks a root, otherwise s[x] is x's parent
};


The O(M+N) algorithm

// Initially every element is the root of its own one-element tree.
DisjSets::DisjSets(int numElements) : s(numElements)
{
    for (int j = 0; j < s.size(); j++)
        s[j] = -1;
}

// Union two disjoint sets. root1 and root2 must be distinct roots.
void DisjSets::unionSets(int root1, int root2)
{
    s[root2] = root1;   // root1 becomes the parent of root2
}


The O(M+N) algorithm

// Follow parent links until a root (negative entry) is reached.
int DisjSets::find(int x) const
{
    if (s[x] < 0)
        return x;
    else
        return find(s[x]);
}
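
To see why an arbitrary union can be costly, the sketch below packs the declarations above into one self-contained class and builds a chain. The name `BasicDisjSets` and the helper `depth` are hypothetical additions for this illustration; `find` and `unionSets` follow the slides.

```cpp
#include <vector>

class BasicDisjSets
{
  public:
    explicit BasicDisjSets(int numElements) : s(numElements, -1) {}

    int find(int x) const
    {
        if (s[x] < 0)
            return x;
        return find(s[x]);
    }

    // root1 and root2 must be distinct roots.
    void unionSets(int root1, int root2) { s[root2] = root1; }

    // Hypothetical helper: number of parent links from x up to its root.
    int depth(int x) const
    {
        int links = 0;
        while (s[x] >= 0) {
            x = s[x];
            ++links;
        }
        return links;
    }

  private:
    std::vector<int> s;
};

// Builds the chain 0 -> 1 -> ... -> n-1 by always hanging the old root
// under a fresh single-element tree.
BasicDisjSets buildChain(int n)
{
    BasicDisjSets ds(n);
    for (int i = 1; i < n; ++i)
        ds.unionSets(i, i - 1);
    return ds;
}
```

With `buildChain(5)`, element 0 ends up four links below its root, so a single find can take time proportional to N. This is the behaviour the smart union rules are meant to prevent.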


Smart Union Algorithms

• Union-by-size: make the smaller tree a subtree of the larger.

• With union-by-size, the depth of any node is never more than log N, so a find operation is O(log N) and a sequence of M operations takes O(M log N). The worst-case trees are binomial trees.
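
The depth bound can be observed on a small case. The sketch below assumes roots store the negated tree size (the same convention the later Union-Find slides use); the names `SizedDisjSets`, `depth`, `size`, and `buildBinomialWorstCase` are illustrative and not from the slides.

```cpp
#include <vector>

class SizedDisjSets
{
  public:
    explicit SizedDisjSets(int numElements) : s(numElements, -1) {}

    int find(int x) const
    {
        while (s[x] >= 0)
            x = s[x];
        return x;
    }

    // Union-by-size; root1 and root2 must be distinct roots.
    void unionSets(int root1, int root2)
    {
        if (s[root2] < s[root1]) {          // root2's tree is larger
            s[root2] += s[root1];
            s[root1] = root2;
        } else {                            // root1's tree is larger or equal
            s[root1] += s[root2];
            s[root2] = root1;
        }
    }

    int size(int x) const { return -s[find(x)]; }

    // Number of parent links from x up to its root.
    int depth(int x) const
    {
        int links = 0;
        while (s[x] >= 0) {
            x = s[x];
            ++links;
        }
        return links;
    }

  private:
    std::vector<int> s;                     // roots hold -(tree size)
};

// Repeatedly merges equal-sized trees, which produces a binomial tree.
// n is assumed to be a power of two.
SizedDisjSets buildBinomialWorstCase(int n)
{
    SizedDisjSets ds(n);
    for (int step = 1; step < n; step *= 2)
        for (int r = 0; r + step < n; r += 2 * step)
            ds.unionSets(r, r + step);
    return ds;
}
```

For n = 16 the deepest node sits exactly log 16 = 4 links below the root, matching the bound stated above.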


Path Compression for faster find

• Path compression for find(x): every node on the path from x to the root has its parent changed to the root, which makes the tree shallower.


Smart Union Algorithms

/* Union-by-height: make the shallower tree a subtree of the deeper. */
/* Roots store the negated height minus one, so a more negative entry means a deeper tree. */
void DisjSets::unionSets(int root1, int root2)
{
    if (s[root2] < s[root1])          // root2 is deeper
        s[root1] = root2;             // make root2 the new root
    else
    {
        if (s[root1] == s[root2])
            s[root1]--;               // update height if same
        s[root2] = root1;             // make root1 the new root
    }
}


// Find with path compression: every node on the path is re-pointed at the root.
int DisjSets::find(int x)
{
    if (s[x] < 0)
        return x;
    else
        return s[x] = find(s[x]);
}
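
Putting the two routines together gives a structure that answers the dynamic equivalence problem directly. The following is a sketch under stated assumptions: `RankedDisjSets`, `addRelation`, and `related` are hypothetical names, while `find` and `unionSets` follow the slides. Note that once paths are compressed the stored heights are only upper bounds, which is why they are often called ranks.

```cpp
#include <vector>

class RankedDisjSets
{
  public:
    explicit RankedDisjSets(int numElements) : s(numElements, -1) {}

    // Find with path compression.
    int find(int x)
    {
        if (s[x] < 0)
            return x;
        return s[x] = find(s[x]);
    }

    // Union-by-height (rank); root1 and root2 must be distinct roots.
    void unionSets(int root1, int root2)
    {
        if (s[root2] < s[root1]) {
            s[root1] = root2;               // root2 is deeper
        } else {
            if (s[root1] == s[root2])
                s[root1]--;                 // equal heights: result grows by one
            s[root2] = root1;
        }
    }

    // Hypothetical wrapper: add the relation a ~ b.
    // Returns false when a and b were already related.
    bool addRelation(int a, int b)
    {
        const int rootA = find(a);
        const int rootB = find(b);
        if (rootA == rootB)
            return false;
        unionSets(rootA, rootB);
        return true;
    }

    // Hypothetical wrapper: are a and b in the same equivalence class?
    bool related(int a, int b) { return find(a) == find(b); }

  private:
    std::vector<int> s;
};
```

The wrapper calls `find` on both arguments first because `unionSets` expects roots, not arbitrary elements.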


Union and Find

The Union-Find problem starts with n elements (numbered 1 to n), each one representing a singleton set. We allow two operations to be performed:


• Find (i) returns the ID number of the set that i currently is in. We assume the ID number of each set is the number of one of the elements in it.

• Union (i, j) combines the elements in the sets with ID numbers i and j into a single set. The ID number of the new set will be either i or j. This is a destructive operation in the sense that the original sets i and j are lost when they are combined to form the new set.


• We can implement Union-Find by maintaining an array A[1..n] of integers. If i is the ID number of a set, then A[i] will be the negative of the number of elements in this set. Otherwise, A[i] will be the number of another element in this set.

• We begin with each A[i] equal to -1. The Union operation is easily implemented as follows:


void Union (int i, int j)
{
    if (A [i] < A [j])      // set i is larger (sizes are stored negated)
    {
        A [i] += A [j];
        A [j] = i;
    }
    else
    {
        A [j] += A [i];
        A [i] = j;
    }
    return;
}

This simply makes the "leader" of the smaller set point to the leader of the larger set, and adjusts the size stored at the leader. The time complexity is O(1).


To implement Find (i) we just follow the pointers to the leader:

int Find (int i)
{
    while (A [i] > 0)
        i = A [i];
    return i;
}

This yields O(lg n) time. We can do better, though, with path compression. After we find the leader, we make all the nodes we've passed through point directly to it:


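int Find (int i)
{
    int j = i, k;
    if (A [i] < 0)
        return i;           // i is already a leader
    while (A [j] > 0)       // first pass: locate the leader j
        j = A [j];
    while (A [i] != j)      // second pass: point every node on the path at j
    {
        k = A [i];
        A [i] = j;
        i = k;
    }
    return j;
}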
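
A short trace helps show what the array holds after a few operations. The sketch below wraps the two routines above in a struct; the wrapper name `UnionFind` and the choice to leave A[0] unused are assumptions made so that elements can be numbered 1 to n as in the slides.

```cpp
#include <vector>

struct UnionFind
{
    std::vector<int> A;     // A[0] is unused; elements are numbered 1..n

    explicit UnionFind(int n) : A(n + 1, -1) {}

    // i and j must be the ID numbers (leaders) of two different sets.
    void Union(int i, int j)
    {
        if (A[i] < A[j]) {  // set i is larger
            A[i] += A[j];
            A[j] = i;
        } else {
            A[j] += A[i];
            A[i] = j;
        }
    }

    // Find with path compression, written iteratively as in the slides.
    int Find(int i)
    {
        int j = i, k;
        if (A[i] < 0)
            return i;
        while (A[j] > 0)
            j = A[j];
        while (A[i] != j) {
            k = A[i];
            A[i] = j;
            i = k;
        }
        return j;
    }
};
```

Starting from four singletons, Union(1, 2), Union(3, 4), and Union(2, 4) leave A[1..4] = 2, 4, 4, -4. A following Find(1) returns 4 and rewrites A[1] to 4, so the next Find(1) follows a single pointer.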


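Using Union by size (rank) and Find with path compression, we get the following result:

Starting with n singleton sets, any sequence of M Union and/or Find operations takes O(M log n) time. This bound is not tight: with both heuristics combined, the total running time is known to be nearly linear, O(M α(M, n)), where α is the very slowly growing inverse Ackermann function.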