
Disjoint Sets - computing.southern.educomputing.southern.edu/halterman/Courses/Winter2012/318/Slides/p… · Disjoint Set Operations makeSet(x)—creates a new set whose only member


Disjoint Sets

Chapter 21

CPTR 430 Algorithms Disjoint Sets 1


Disjoint Sets

A disjoint-set data structure maintains a collection $\mathcal{S} = \{S_1, S_2, \ldots, S_k\}$ of disjoint dynamic sets

Each set has a designated representative, which is an element of the set

  For some applications, the representative may be arbitrary

  For others, the “smallest” element is the representative (if the elements can be ordered)


Disjoint Set Operations

makeSet(x): creates a new set whose only member is x

union(x,y): combines the sets that contain elements x and y

  If $x \in S_x$ and $y \in S_y$, then union(x,y) returns a new set equal to $S_x \cup S_y$

  $S_x$ and $S_y$ are disjoint before the union() operation

  The representative of the resulting set can be any element in $S_x \cup S_y$, but usually we choose either the representative of $S_x$ or $S_y$

  The original sets, $S_x$ and $S_y$, are removed from $\mathcal{S}$

findSet(x): returns a reference to the representative of the set containing x


Skeleton Implementation

public class DisjointSet {
    public static void makeSet(DSElement element) {
        /* To be determined */
    }

    public static void union(DSElement x, DSElement y) {
        /* To be determined */
    }

    public static DSElement findSet(DSElement element) {
        /* To be determined */
    }
}


Disjoint Set Analysis

n: the number of makeSet() operations

m: the total number of makeSet(), union(), and findSet() operations

The sets in $\mathcal{S}$ are disjoint, so

  Each union() operation reduces $|\mathcal{S}|$ by 1

  After $n - 1$ union() operations, $|\mathcal{S}| = 1$

  The number of union() operations is at most $n - 1$

$m \ge n$

For the purposes of analysis, assume the first n operations are makeSet() operations
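
The counting argument on this slide can be written out as a short derivation. This is only a sketch: the symbols $u$ (number of union() operations on distinct sets) and $f$ (number of findSet() operations) are introduced here for illustration, and the `align*` environment assumes the amsmath package.

```latex
\begin{align*}
|\mathcal{S}| &= n - u \ge 1
  && \text{each union() merges two sets into one} \\
\Rightarrow\quad u &\le n - 1 \\
m &= n + u + f \ge n
  && \text{since } u \ge 0 \text{ and } f \ge 0
\end{align*}
```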


Applications of Disjoint Sets

Determining the connected components of an undirected graph

Kruskal’s minimum spanning tree algorithm

In FORTRAN, handling the EQUIVALENCE(X,Y) statement

Type unification by compilers and interpreters of dynamically typed programming languages

Image processing—blob coloring

Colorizing old movies


Sample Application: Detecting the Connected Components of an Undirected Graph

This undirected graph has four connected components:

[Figure: an undirected graph on the ten vertices a through j]


Connectivity Algorithm

public class ConnectedGraph {
    public static void connectedComponents(Graph g) {
        Vertex[] vertices = g.getVertices();
        Edge[] edges = g.getEdges();
        // Start with every vertex in its own singleton set
        for ( int i = 0; i < vertices.length; i++ ) {
            DisjointSet.makeSet(vertices[i]);
        }
        // Merge the sets of the two endpoints of every edge
        for ( int i = 0; i < edges.length; i++ ) {
            DSElement fromSetRep = DisjointSet.findSet(edges[i].from),
                      toSetRep = DisjointSet.findSet(edges[i].to);
            if ( fromSetRep != toSetRep ) {
                DisjointSet.union(fromSetRep, toSetRep);
            }
        }
    }

    public static boolean sameComponent(Vertex v1, Vertex v2) {
        return DisjointSet.findSet(v1) == DisjointSet.findSet(v2);
    }
}


Connectivity Algorithm

connectedComponents() initially places each vertex into its own set

Next, all edges are examined; an edge connecting two vertices implies that the two vertices are to be unioned into one set

After all edges have been examined, two vertices are within the same connected component if sameComponent() returns true

For things to work, a vertex object must reference an associated disjoint set object and vice versa
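
The procedure described above can be tried end to end with a small self-contained sketch. It does not use the Graph, Vertex, or DSElement classes from the slides; instead it assumes vertices are numbered 0 to n-1 and keeps one representative per vertex in an array. The class name ComponentsSketch and the edge list in main() are made up for illustration.

```java
// Hypothetical stand-in for the slides' DisjointSet and ConnectedGraph classes.
class ComponentsSketch {
    private final int[] rep;   // rep[v] = representative of the set containing v

    ComponentsSketch(int vertexCount) {
        rep = new int[vertexCount];
        for (int v = 0; v < vertexCount; v++) {
            rep[v] = v;        // makeSet: every vertex starts in its own set
        }
    }

    int findSet(int v) {
        return rep[v];
    }

    void union(int a, int b) {
        int ra = rep[a], rb = rep[b];
        if (ra == rb) {
            return;            // already in the same set
        }
        for (int v = 0; v < rep.length; v++) {
            if (rep[v] == rb) {
                rep[v] = ra;   // relabel every member of b's set
            }
        }
    }

    boolean sameComponent(int a, int b) {
        return findSet(a) == findSet(b);
    }

    int countComponents() {
        int count = 0;
        for (int v = 0; v < rep.length; v++) {
            if (rep[v] == v) {
                count++;       // exactly one representative per set
            }
        }
        return count;
    }

    // Mirrors connectedComponents(): one makeSet per vertex, one union per edge.
    static ComponentsSketch connectedComponents(int vertexCount, int[][] edges) {
        ComponentsSketch sets = new ComponentsSketch(vertexCount);
        for (int[] edge : edges) {
            sets.union(edge[0], edge[1]);
        }
        return sets;
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1}, {1, 2}, {3, 4} };   // made-up graph on 6 vertices
        ComponentsSketch sets = connectedComponents(6, edges);
        System.out.println("components: " + sets.countComponents());
        System.out.println("0 and 2 connected: " + sets.sameComponent(0, 2));
        System.out.println("2 and 3 connected: " + sets.sameComponent(2, 3));
    }
}
```

The union() here relabels by scanning the whole array, which is simple but slow; the implementations later in the deck exist precisely to do better.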


Linked List Implementation

The linked list implementation is simple

The first element in the list is the set’s representative

Each element in the list contains:

  a data object (data)

  a pointer to the next element in the list (next)

  a pointer to the representative (rep)


Linked List Implementation (cont.)

Pointers head and tail refer, respectively, to the first and last elements in the list

  head points to the representative

  tail points to the position where a new element can be added and another set can be unioned

[Figure: a four-node linked list holding a, b, c, and d; each node has data, next, and rep fields, and head and tail point into the list]


Efficiency of the Linked Implementation

makeSet()

  Create a new list with one element: $O(1)$

findSet()

  Return the pointer to the representative stored in each node: $O(1)$

union()

  Attach one list to the end of the other

  The end can be found quickly via the tail pointer

  Updating the representative pointers in every node in the attached list takes time proportional to the length of the attached list: $O(n)$
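
A minimal sketch of the linked list representation may make these costs concrete. It is not the course's implementation: the class name ListSetSketch is hypothetical, elements are plain strings, and the tail pointer is stored in the representative node rather than in a separate set object. union() returns the number of rep pointers it rewrote so the linear cost is visible.

```java
// Hypothetical sketch of the linked-list representation with rep pointers.
class ListSetSketch {
    static final class Node {
        final String data;
        Node next;   // next element in the list, or null at the end
        Node rep;    // representative (first node) of this node's list
        Node tail;   // last node of the list; only meaningful on the representative

        Node(String data) {
            this.data = data;
            this.rep = this;
            this.tail = this;
        }
    }

    // O(1): a new one-element list whose only node is its own representative.
    static Node makeSet(String data) {
        return new Node(data);
    }

    // O(1): every node stores a direct pointer to its representative.
    static Node findSet(Node x) {
        return x.rep;
    }

    // Appends y's list to the end of x's list and returns how many
    // rep pointers had to be rewritten (the length of the attached list).
    static int union(Node x, Node y) {
        Node head = x.rep;
        Node attached = y.rep;
        if (head == attached) {
            return 0;                       // already the same set
        }
        head.tail.next = attached;          // splice via the tail pointer
        head.tail = attached.tail;
        int updated = 0;
        for (Node n = attached; n != null; n = n.next) {
            n.rep = head;                   // the linear-time part
            updated++;
        }
        return updated;
    }

    public static void main(String[] args) {
        Node a = makeSet("a"), b = makeSet("b"), c = makeSet("c"), d = makeSet("d");
        System.out.println("union(a, b) rewrote " + union(a, b) + " pointer(s)");
        System.out.println("union(c, d) rewrote " + union(c, d) + " pointer(s)");
        System.out.println("union(a, c) rewrote " + union(a, c) + " pointer(s)");
        System.out.println("representative of d: " + findSet(d).data);
    }
}
```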


The Amortized Analysis

In the worst case, a sequence of m operations requires $O(n^2)$ time

Take objects $x_1, x_2, \ldots, x_n$ and perform the operations

Operation              Number of Objects Updated
makeSet(x_1)           1
makeSet(x_2)           1
...                    ...
makeSet(x_n)           1
union(x_1, x_2)        1
union(x_2, x_3)        2
union(x_3, x_4)        3
...                    ...
union(x_{n-1}, x_n)    n - 1


The Amortized Analysis (cont.)

The operation sequence is n makeSet()s followed by $n - 1$ union()s such that the longer list is always appended to the shorter list

The n makeSet() operations take $\Theta(n)$ time

The ith union() operation updates i objects

Total number of objects updated by all $n - 1$ union() operations is

$$\sum_{i=1}^{n-1} i = \Theta(n^2)$$

The total number of operations is $2n - 1$

Each operation on average requires $\Theta(n)$ time

By aggregate analysis, then, the amortized cost of each operation is $\Theta(n)$
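
The quadratic total can be checked by simulating the worst-case sequence and counting representative updates. This is an illustrative sketch, not part of the course code: the class name WorstCaseUpdateCount is hypothetical, and the lists are modeled only by an array of representative indices.

```java
// Hypothetical simulation of the worst-case sequence from the analysis above.
class WorstCaseUpdateCount {
    // Performs n makeSet()s, then n - 1 union()s in which the longer list
    // is always the one appended, and returns the number of rep updates.
    static long countUpdates(int n) {
        int[] rep = new int[n + 1];          // rep[i] = representative of x_i (1-based)
        for (int i = 1; i <= n; i++) {
            rep[i] = i;                      // makeSet(x_i)
        }
        long updates = 0;
        // The i-th union appends the list {x_1, ..., x_i} to the singleton
        // {x_(i+1)}, so all i elements of the longer list get a new rep.
        for (int i = 1; i < n; i++) {
            for (int j = 1; j <= i; j++) {
                rep[j] = i + 1;
                updates++;
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        for (int n : new int[] {4, 8, 16}) {
            long closedForm = (long) n * (n - 1) / 2;
            System.out.println("n = " + n + ": " + countUpdates(n)
                    + " updates, n(n-1)/2 = " + closedForm);
        }
    }
}
```

The count matches the closed form $n(n-1)/2$ of the sum, which is where the $\Theta(n^2)$ bound comes from.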


Weighted-union Heuristic

Ensure that the shorter list is always appended to the longer list

Fewer representative pointers to update

Maintain the length of each list (easy—add an extra integer field)

A union can still require $\Omega(n)$ if both lists have $\Omega(n)$ elements

Helps a little?


It Does Better than $\Theta(n^2)$

Given: linked list representation with the weighted-union heuristic

Any sequence of m makeSet(), findSet(), and union() operations, n of which are makeSet() operations, takes $O(m + n \lg n)$ time (Theorem 21.1)

Why?


Proof of Theorem 21.1

Consider each object in a set of size n

For a given object, x, how many times has its representative been updated?

The first time it was updated, x had to have been an element in the smaller set, since the weighted-union heuristic always appends the smaller list to the larger one

After x’s representative was updated the first time, x’s resulting set must have had at least two elements (Why?)

The next time x’s representative is updated, x’s set must have at least four elements (Again, why?)

For all $k \le n$, the resulting set has at least k elements after x’s representative has been updated $\lceil \lg k \rceil$ times


Proof of Theorem 21.1 (cont.)

For all $k \le n$, the resulting set has at least k elements after x’s representative has been updated $\lceil \lg k \rceil$ times

The largest set has at most n elements (Why?)

Each element in that largest set has been updated at most $\lceil \lg n \rceil$ times

The time to update the n elements is $O(n \lg n)$

The time to adjust the head and tail pointers, as well as the length field, is constant


Proof of Theorem 21.1 (cont.)

The makeSet() and findSet() operations take $O(1)$ time

There are $O(m)$ makeSet() and findSet() operations

The time for the entire sequence of m operations is

$$O(m + n \lg n)$$
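
The bound can be observed with a small sketch of the weighted-union heuristic that counts representative updates. The assumptions are loud ones: the class name WeightedUnionSketch is hypothetical, elements are the integers 0 to n-1, and each list is modeled by a java.util.ArrayList whose size() plays the role of the length field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: weighted union over list-based sets, counting rep updates.
class WeightedUnionSketch {
    private final int[] rep;                     // rep[x] = representative of x
    private final List<List<Integer>> members;   // members.get(r) = list headed by r
    long updates = 0;                            // total rep pointers rewritten so far

    WeightedUnionSketch(int n) {
        rep = new int[n];
        members = new ArrayList<>();
        for (int x = 0; x < n; x++) {
            rep[x] = x;                          // makeSet(x)
            List<Integer> list = new ArrayList<>();
            list.add(x);
            members.add(list);
        }
    }

    int findSet(int x) {
        return rep[x];
    }

    void union(int x, int y) {
        int longer = rep[x], shorter = rep[y];
        if (longer == shorter) {
            return;
        }
        if (members.get(longer).size() < members.get(shorter).size()) {
            int tmp = longer;                    // make sure the shorter list is appended
            longer = shorter;
            shorter = tmp;
        }
        for (int e : members.get(shorter)) {
            rep[e] = longer;                     // only the shorter list is relabeled
            updates++;
        }
        members.get(longer).addAll(members.get(shorter));
        members.get(shorter).clear();
    }

    public static void main(String[] args) {
        WeightedUnionSketch sets = new WeightedUnionSketch(8);
        int[][] unions = { {0, 1}, {2, 3}, {4, 5}, {6, 7}, {0, 2}, {4, 6}, {0, 4} };
        for (int[] u : unions) {
            sets.union(u[0], u[1]);
        }
        System.out.println("updates = " + sets.updates + ", n lg n = " + 8 * 3);
    }
}
```

Merging equal-sized lists, as main() does, is the expensive case for this heuristic, and even then the total stays within $n \lg n$.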


Disjoint-set Forests

A set is represented by a rooted tree

The root is the set’s representative

Each node points to its parent (the root points to itself)

So, unlike trees we are used to seeing, the pointers point “up” instead of “down”

[Figure: two trees over the elements b through h, shown before and after union(e, g)]


Operation Implementations

makeSet(): create a tree containing one node

findSet(): follow parent pointers until the root is found

  The path is called the find path

union(): redirect the parent pointer of one of the roots to point to the other root
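
The three operations above, without any heuristics, fit in a few lines. This is a sketch under stated assumptions: the class name NaiveForestSketch and the helper findPathLength() are hypothetical, elements are plain strings, and union() always makes the second argument's root point to the first argument's root.

```java
// Hypothetical sketch of a disjoint-set forest with no heuristics.
class NaiveForestSketch {
    static final class Node {
        final String data;
        Node parent;

        Node(String data) {
            this.data = data;
            this.parent = this;    // a root points to itself
        }
    }

    // Creates a tree containing one node.
    static Node makeSet(String data) {
        return new Node(data);
    }

    // Follows parent pointers until the root is found.
    static Node findSet(Node x) {
        while (x.parent != x) {
            x = x.parent;
        }
        return x;
    }

    // Redirects the parent pointer of y's root to x's root.
    static void union(Node x, Node y) {
        Node rx = findSet(x), ry = findSet(y);
        if (rx != ry) {
            ry.parent = rx;
        }
    }

    // Number of parent pointers on the find path from x to its root.
    static int findPathLength(Node x) {
        int edges = 0;
        while (x.parent != x) {
            x = x.parent;
            edges++;
        }
        return edges;
    }

    public static void main(String[] args) {
        Node[] nodes = new Node[5];
        for (int i = 0; i < nodes.length; i++) {
            nodes[i] = makeSet("x" + i);
        }
        // Each union puts the existing tree under a brand-new root,
        // so the tree degenerates into a chain.
        for (int i = 1; i < nodes.length; i++) {
            union(nodes[i], nodes[i - 1]);
        }
        System.out.println("root of x0: " + findSet(nodes[0]).data);
        System.out.println("find path length from x0: " + findPathLength(nodes[0]));
    }
}
```

The unions in main() are chosen to be unlucky on purpose: with no rule for which root becomes the parent, the find path can grow by one with every union.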

[Figure: two trees over the elements b through h, shown before and after union(e, g)]


Efficiency of Disjoint-set Forests

The straightforward approach is no better than the linked list version

  A sequence of $n - 1$ union() operations can create a tree that is just a linear chain of n nodes

A couple of heuristics can tweak the implementation into the asymptotically fastest disjoint-set data structure known

  Union by rank

  Path compression


Union by Rank

Same idea as weighted-union for linked lists

The root of the tree with fewer nodes points to the root of the tree with more nodes

We could have each node keep track of the number of nodes in its subtree

Instead, each node maintains a rank that is an upper bound on its height

union() then redirects the pointer of the root of the tree with smaller rank to the root of the tree with larger rank
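
Union by rank on its own can be sketched with two integer arrays. The names here (RankUnionSketch, depth()) are hypothetical and elements are the integers 0 to n-1; path compression is deliberately left out so that the effect of the rank rule alone is visible.

```java
// Hypothetical sketch: disjoint-set forest with union by rank only.
class RankUnionSketch {
    final int[] parent;   // parent[x] == x marks a root
    final int[] rank;     // upper bound on the height of the tree rooted at x

    RankUnionSketch(int n) {
        parent = new int[n];
        rank = new int[n];
        for (int x = 0; x < n; x++) {
            parent[x] = x;            // makeSet(x); rank starts at 0
        }
    }

    int findSet(int x) {
        while (parent[x] != x) {
            x = parent[x];
        }
        return x;
    }

    void union(int x, int y) {
        int rx = findSet(x), ry = findSet(y);
        if (rx == ry) {
            return;
        }
        if (rank[rx] > rank[ry]) {
            parent[ry] = rx;          // smaller rank goes under larger rank
        } else {
            parent[rx] = ry;
            if (rank[rx] == rank[ry]) {
                rank[ry]++;           // equal ranks: the new root grows by one
            }
        }
    }

    // Number of parent pointers followed from x to its root.
    int depth(int x) {
        int edges = 0;
        while (parent[x] != x) {
            x = parent[x];
            edges++;
        }
        return edges;
    }

    public static void main(String[] args) {
        RankUnionSketch chain = new RankUnionSketch(8);
        for (int i = 0; i < 7; i++) {
            chain.union(i, i + 1);    // the order that built a long chain without ranks
        }
        System.out.println("depth of 0 after chained unions: " + chain.depth(0));
    }
}
```

A rank only increases when two roots of equal rank are linked, which is why a tree of rank r needs at least $2^r$ nodes and find paths stay logarithmic.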


Path Compression

Simple in concept and to implement but very effective

Alter the findSet() operation so that each node on the find path points directly to the root instead of its immediate parent

Path compression does not affect any ranks (Why?)

[Figure: a tree over the elements b through h, shown before and after findSet(g) with path compression]


Disjoint-set Forest Implementation

The node definition is slightly different:

public class Node {
    public DSElement data;   // Element to store
    public Node parent;      // Pointer to the parent node
    public int rank;         // Upper bound on the height of this node

    public Node(DSElement d) {
        data = d;
        parent = this;       // No parent node yet, so the node points to itself
        rank = 0;            // No subtree for this new node
        data.setNode(this);  // Let the element find its node later
    }
}


Disjoint-set Forest Implementation (cont.)

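public class DisjointSet {
    public static void makeSet(DSElement element) {
        new Node(element);  // The Node constructor registers itself with the element
    }

    public static void union(DSElement x, DSElement y) {
        link(findSet(x), findSet(y));
    }

    // x and y must already be set representatives (roots)
    private static void link(DSElement x, DSElement y) {
        Node nx = x.getNode(),
             ny = y.getNode();
        if ( nx == ny ) {
            return;  // Same set: nothing to link
        }
        if ( nx.rank > ny.rank ) {
            ny.parent = nx;
        } else {
            nx.parent = ny;
            if ( nx.rank == ny.rank ) {
                ny.rank++;  // Only the new root's rank grows
            }
        }
    }

    . . .
}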


Disjoint-set Forest Implementation (cont.)

public class DisjointSet {
    . . .

    public static DSElement findSet(DSElement element) {
        Node node = element.getNode();
        if ( node.parent != node ) {
            // Recurse to the root, then point this node directly at it
            node.parent = findSet(node.parent.data).getNode();
        }
        return node.parent.data;
    }
}

Note the recursion in findSet()

findSet() uses two passes:

  The first pass is up the tree to find the root (representative)

  The second pass is down the tree to update the parents of all the nodes in the find path (to point directly to the root)

  The recursive calls make up the first pass

  The returns from the recursive calls make up the second pass
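
The two passes can also be written as two explicit loops, which avoids deep recursion on long find paths. This is an alternative sketch, not the version on the slide: the class name PathCompressionSketch is hypothetical, elements are integers, and the second loop walks up the find path again rather than unwinding a call stack.

```java
// Hypothetical sketch: iterative two-pass findSet() with path compression.
class PathCompressionSketch {
    final int[] parent;   // parent[x] == x marks a root

    // Copies a given parent array so callers can set up any starting forest.
    PathCompressionSketch(int[] initialParents) {
        parent = initialParents.clone();
    }

    int findSet(int x) {
        // Pass 1: follow parent pointers to locate the root.
        int root = x;
        while (parent[root] != root) {
            root = parent[root];
        }
        // Pass 2: walk the same find path again, pointing every node at the root.
        while (parent[x] != root) {
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    public static void main(String[] args) {
        // A chain 4 -> 3 -> 2 -> 1 -> 0, with 0 as the root.
        PathCompressionSketch sets = new PathCompressionSketch(new int[] {0, 0, 1, 2, 3});
        System.out.println("root of 4: " + sets.findSet(4));
        System.out.println("parent of 4 afterwards: " + sets.parent[4]);
        System.out.println("parent of 3 afterwards: " + sets.parent[3]);
    }
}
```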


Do the Heuristics Help?

Union by rank, by itself, yields a running time of

$$O(m \lg n)$$

Path compression, by itself, yields a running time of

$$\Theta\left(n + f \cdot \left(1 + \log_{2 + f/n} n\right)\right)$$

where

  n is the number of makeSet() operations (which means at most $n - 1$ union() operations)

  f is the number of findSet() operations


The Combined Effect

Together, they yield a running time of

$$O(m \, \alpha(n))$$

where $\alpha(n)$ is a very slowly growing function

For all practical applications of disjoint sets, $\alpha(n) \le 4$

Thus, the running time is linear in m, for all practical purposes
