UNIT III: Graphs

Unit Rationale

Unit III: (9 Hrs)
Basic Concepts, Storage representation, Adjacency matrix, adjacency list, adjacency multi list, inverse adjacency list. Traversals - depth first and breadth first. Introduction to Greedy Strategy, Minimum Spanning Tree, Greedy algorithms for computing minimum spanning tree - Prim's and Kruskal's algorithms, Dijkstra's single source shortest path, Topological ordering. Case study - Data structures used in the web graph and Google Maps.
When the students have successfully completed this course, they will be able to:
Implement Graph data structures.
Demonstrate applications of graph.
Apply Prim's and Kruskal's algorithms to find an MST.
Implement Dijkstra's algorithm for finding the shortest path.
Understand the concepts of transitive closure and connected components.
Session Objectives and Outcome
Unit Objective: To learn concept of graph data structure and its applications.
Unit Outcome: To apply the concepts of graph data structure in real life.
Session | Contents | Objective | Outcome

1. Basic Concepts, Storage representation of graphs
   Objective: To study the concept of a graph
   Outcome: To apply the concept of a graph
2. Adjacency matrix, adjacency list, adjacency multi list, inverse adjacency list
   Objective: To study representations of a graph using different data structures
   Outcome: To demonstrate how to represent a graph using different data structures
3. Traversals - depth first and breadth first
   Objective: To study traversal techniques of a graph
   Outcome: To demonstrate how to perform DFS and BFS on a graph
4. Introduction to Greedy Strategy, Minimum Spanning Tree
   Objective: To learn implementation of MST
   Outcome: To explain MST
5. Greedy algorithms for computing minimum spanning tree - Prim's and Kruskal's algorithms
   Objective: To learn implementation of MST
   Outcome: To explain Prim's and Kruskal's algorithms
6. Dijkstra's single source shortest path algorithm
   Objective: To gain knowledge of how to find the shortest path
   Outcome: To explain Dijkstra's algorithm
7. Topological ordering
   Objective: To study topological ordering
   Outcome: To learn topological ordering
8. Case study - Data structure used in the web graph
   Objective: To study the use of the web graph
   Outcome: To study the use of the web graph
9. Case study - Data structure used in Google Maps
   Objective: To study the use of Google Maps
   Outcome: To study the use of Google Maps
Session 1: Basic Concepts, Storage representation of graphs
Session Objective:
o To study the concept of Graphs and Storage representation of graphs
At the end of this session, the learner will be able to:
o To apply the concept of graph data structure
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

20 | Definition of graph and related terminologies | Brain Storming, Discussion | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal
15 | Properties of graph and graph operations | Brain Storming, Presentation | Facilitates, Explains | Participate, Discuss | Knowledge
15 | Storage representation of graph | Discussion, Presentation | Explains | Listen | Knowledge
10 | Summary | Discussion | Summarizes | Participate, Discuss | Knowledge, Comprehension
Upon completion, students will be able to:
Describe the concepts of a graph
Explain operations on the graph data structure
Explain storage structures
Teaching Learning Material:
PPT, Notes
References
o http://www.studytonight.com/data-structures/
o http://www.wikipedia.com
Notes
Graph Definition:
A graph data structure consists of a finite (and possibly mutable) set of nodes or vertices, together with a set of ordered pairs of these nodes (or, in some cases, a set of unordered pairs). These pairs are known as edges or arcs.
A graph data structure may also associate with each edge some edge value, such as a symbolic label or a numeric attribute (cost, capacity, length, etc.). Fig. 1 below shows a graph with 6 vertices and 7 edges.
Fig1: Graph
Types of Graph
1) Undirected Graph:
An undirected graph is one in which edges have no orientation. The edge (u, v) is
identical to the edge (v, u); i.e., the edges are not ordered pairs but sets {u, v} (or 2-multisets)
of vertices. The maximum number of edges in an undirected graph without a self-loop is
n(n - 1)/2.
2) Directed Graph:
A directed graph or digraph is an ordered pair D = (V, A) with
V a set whose elements are called vertices or nodes, and
A a set of ordered pairs of vertices, called arcs, directed edges, or arrows.
An arc a = (x, y) is considered to be directed from x to y; y is called the head and x is called the
tail of the arc.
3) Weighted Graph:
A graph is a weighted graph if a number (weight) is assigned to each edge. Such weights
might represent, for example, costs, lengths, or capacities, depending on the problem
at hand.
Terminologies:
1) End vertices: End-vertices of an edge are the endpoints of the edge.
2) Adjacent vertices: Two vertices are adjacent if they are endpoints of the same edge.
3) Edge: An edge is incident on a vertex if the vertex is an endpoint of the edge.
4) Outgoing edges: the outgoing edges of a vertex are the directed edges for which that vertex is the origin.
5) Incoming edges: the incoming edges of a vertex are the directed edges for which that vertex is the destination.
6) Degree of a vertex, v, denoted deg(v) is the number of incident edges.
Degree of vertex A=3
Degree of vertex B=3
Degree of vertex C=3
Degree of vertex D=3
Degree of vertex E=2
7) Out-degree: outdeg(v), is the number of outgoing edges.
8) In-degree: indeg(v), is the number of incoming edges.
Vertex | In-degree | Out-degree
1 | 1 | 2
2 | 0 | 2
3 | 1 | 1
4 | 2 | 1
5 | 2 | 0
9) Parallel edges or multiple edges: edges that share the same end-vertices (and, in a digraph, the same direction).
10) Self-loop: an edge whose two end vertices are the same vertex.
11) Simple graphs: graphs with no parallel edges or self-loops.
12) Path: a sequence of alternating vertices and edges such that each successive vertex is
connected by an edge. Frequently only the vertices are listed, especially if there are no
parallel edges.
13) Cycle: a path that starts and ends at the same vertex. A simple path is a path with distinct
vertices.
14) Directed path is a path of only directed edges. Directed cycle is a cycle of only directed
edges.
15) Sub-graph is a subset of vertices and edges. Spanning sub-graph contains all the vertices.
16) Complete Graph: An n vertex graph with exactly n(n-1)/2 edges is said to be complete
graph.
17) Connected graph has all pairs of vertices connected by at least one path.
18) Connected component is the maximal connected sub-graph of an unconnected graph.
19) Forest is a graph without cycles.
20) Tree is a connected forest (the trees discussed previously are rooted trees; these are free
trees).
21) Spanning tree is a spanning subgraph that is also a tree.
Properties:
1. If a graph G has m edges, then Σ_{v∈G} deg(v) = 2m.
2. If a digraph G has m edges, then Σ_{v∈G} indeg(v) = m = Σ_{v∈G} outdeg(v).
3. If a simple graph G has m edges and n vertices:
   If G is directed, then m ≤ n(n-1).
   If G is undirected, then m ≤ n(n-1)/2.
   So a simple graph with n vertices has O(n²) edges at most.
4. If G is an undirected graph with n vertices and m edges:
   If G is connected, then m ≥ n - 1.
   If G is a tree, then m = n - 1.
   If G is a forest, then m ≤ n - 1.
Basic Operations:
The basic operations provided by a graph data structure G usually include:
adjacent(G, x, y): tests whether there is an edge from node x to node y.
neighbors(G, x): lists all nodes y such that there is an edge from x to y.
add(G, x, y): adds to G the edge from x to y, if it is not there.
delete(G, x, y): removes the edge from x to y, if it is there.
get_node_value(G, x): returns the value associated with the node x.
set_node_value(G, x, a): sets the value associated with the node x to a.
Structures that associate values to the edges usually also provide:
get_edge_value(G, x, y): returns the value associated to the edge (x,y).
set_edge_value(G, x, y, v): sets the value associated to the edge (x,y) to v.
Representation of graphs
Different data structures for the representation of graphs are used in practice:
1) Adjacency list
2) Adjacency matrix
3) Incidence matrix
Questions
Exam Theory Questions: (Minimum 10)
(Question should be per session)
Question No. Question
1 Define a graph with examples. Explain the various types of graphs.
2 What are the in-degree, out-degree, and degree of a node?
3 Explain complete graph, simple graph, connected graph, and strongly connected graph with examples.
4 Explain various operations on a graph.
5 What are the different properties of a graph?
6 What are the graph representation techniques? Explain in detail.
Session 2: Adjacency matrix, adjacency list, adjacency multi list, inverse adjacency list
Session Objective:
o To implement graph representations using different data structures
At the end of this session, the learner will be able to:
o To apply different data structures to store the graph
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

5 | Revision | Brain Storming, Discussion | Summarizes | Participate, Discuss | Knowledge, Intrapersonal
15 | Adjacency matrix representation of graph | Brain Storming, Presentation | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal, Comprehension
15 | Adjacency list representation of graph | Discussion, Presentation | Explains | Listen | Knowledge, Comprehension
10 | Adjacency multi list representation of graph | Discussion, Presentation | Explains | Listen | Knowledge, Comprehension
5 | Inverse adjacency list representation of graph | Discussion, Presentation | Explains | Listen | Knowledge, Comprehension
10 | Summary | Discussion | Explains | Listen | Knowledge, Intrapersonal
Upon completion, students will be able to:
Understand representation of a graph using different data structures
Teaching Learning Material:
PPT, Notes
References
1. Data Structures: A Pseudocode Approach with C, by Richard F. Gilberg and Behrouz A. Forouzan
Notes
Representation of graphs
Different data structures for the representation of graphs are used in practice:
Adjacency matrix:
A two-dimensional matrix, in which the rows represent source vertices and columns
represent destination vertices. Data on edges and vertices must be stored externally. Only
the cost for one edge can be stored between each pair of vertices.
Pros: The representation is easy to implement and follow. Removing an edge takes O(1)
time. Queries such as whether there is an edge from vertex u to vertex v are efficient and
can be answered in O(1).
Cons: Consumes more space, O(V^2). Even if the graph is sparse (contains fewer
edges), it consumes the same space. Adding a vertex takes O(V^2) time.
Adjacency list :
Vertices are stored as records or objects, and every vertex stores a list of adjacent
vertices. This data structure allows the storage of additional data on the vertices.
Additional data can be stored if edges are also stored as objects, in which case each
vertex stores its incident edges and each edge stores its incident vertices.
Pros: Saves space, O(|V| + |E|). In the worst case there can be C(V, 2) edges in
a graph, thus consuming O(V^2) space. Adding a vertex is easier.
Cons: Queries such as whether there is an edge from vertex u to vertex v are not efficient;
they can take O(V) time.
Adjacency multi-list:
An edge in an undirected graph is represented by two nodes in the adjacency list representation.
Adjacency multilists are lists in which edge nodes may be shared among several lists
(each edge node is shared by the lists of its two end vertices).
Example of adjacency multilists:
Lists: vertex 0: M1->M2->M3, vertex 1: M1->M4->M5
vertex 2: M2->M4->M6, vertex 3: M3->M5->M6
typedef struct edge *edge_pointer;
typedef struct edge {
    short int marked;             /* visited flag */
    int vertex1, vertex2;         /* the edge's two end vertices */
    edge_pointer path1, path2;    /* next edge node in vertex1's and vertex2's lists */
} edge;
edge_pointer graph[MAX_VERTICES]; /* head of each vertex's edge list */
Inverse Adjacency List:
An inverse adjacency list stores, for each vertex, the list of vertices from which an edge points to it (i.e., its predecessors). The inverse adjacency list for the given graph is shown below.
Questions
In sem Exam Theory Questions: (Minimum 10)
(Question should be per session)
Question No. Question
1. Explain the matrix representation of a graph. What are the pros and cons of this representation?
2. Explain the adjacency list representation of a graph. What are the pros and cons of this representation?
3. Explain the multi list representation of a graph. What are the pros and cons of this representation?
4. Explain the inverse adjacency list representation of a graph. What are the pros and cons of this representation?
Session 3: Traversals - depth first and breadth first
Session Objective:
o To implement the graph traversal techniques
At the end of this session, the learner will be able to:
o To apply the concept of traversals like DFS and BFS on graphs
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

10 | Revision | Brain Storming, Discussion | Summarizes | Participate, Discuss | Knowledge, Intrapersonal
20 | Graph Traversal: DFS | Brain Storming, Presentation | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal, Comprehension
20 | Breadth First Search with its applications | Discussion, Presentation | Explains | Listen | Knowledge, Comprehension
10 | Summary | Discussion | Explains | Listen | Knowledge, Intrapersonal
Upon completion, students will be able to:
Explain BFS traversal of a graph
Explain DFS traversal of a graph
Teaching Learning Material:
PPT, Notes
References
https://www.tutorialspoint.com/data_structures_algorithms/breadth_first_traversal.htm
Notes
Depth-First Search
A depth-first search (DFS) is an algorithm for traversing a finite graph. DFS visits the child
nodes before visiting the sibling nodes; that is, it traverses the depth of any particular path before
exploring its breadth. A stack (often the program's call stack via recursion) is generally used
when implementing the algorithm.
The algorithm begins with a chosen "root" node; it then iteratively transitions from the current
node to an adjacent, unvisited node, until it can no longer find an unexplored node to transition
to from its current location. The algorithm then backtracks along previously visited nodes, until it
finds a node connected to yet more uncharted territory. It will then proceed down the new path as
it had before, backtracking as it encounters dead-ends, and ending only when the algorithm has
backtracked past the original "root" node from the very first step.
The Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack
to remember where to resume the search when a dead end occurs in any iteration.
As in the example given above, the DFS algorithm traverses from A to B to C to D first, then to E,
then to F, and lastly to G. It employs the following rules.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it onto a
stack.
Rule 2 − If no adjacent unvisited vertex is found, pop a vertex from the stack. (This will pop
all the vertices from the stack that have no unvisited adjacent vertices.)
Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.
DFS is the basis for many graph-related algorithms, including topological sorts and planarity
testing.
Pseudocode
Input: A graph G and a vertex v of G
Output: A labeling of the edges in the connected component of v as discovery edges and back
edges
procedure DFS(G, v):
    label v as explored
    for all edges e in G.incidentEdges(v) do
        if edge e is unexplored then
            w ← G.adjacentVertex(v, e)
            if vertex w is unexplored then
                label e as a discovery edge
                recursively call DFS(G, w)
            else
                label e as a back edge
Breadth-first search
A breadth-first search (BFS) is another technique for traversing a finite graph. BFS visits the
parent nodes before visiting the child nodes, and a queue is used in the search process. This
algorithm is often used to find the shortest path from one node to another.
The Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a
queue to remember where to resume the search when a dead end occurs in any
iteration.
As in the example given above, the BFS algorithm traverses from A to B to E to F first, then to C
and G, and lastly to D. It employs the following rules.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a
queue.
Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.
Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.
Pseudocode
Input: A graph G and a root v of G
Output: The node closest to v in G satisfying some conditions, or null if no such a node exists in
G
procedure BFS(G, v):
    create a queue Q
    enqueue v onto Q
    mark v
    while Q is not empty:
        t ← Q.dequeue()
        if t is what we are looking for:
            return t
        for all edges e in G.adjacentEdges(t) do
            o ← G.adjacentVertex(t, e)
            if o is not marked:
                mark o
                enqueue o onto Q
    return null
Applications
Breadth-first search can be used to solve many problems in graph theory, for example:
Finding all nodes within one connected component
Copying garbage collection (Cheney's algorithm)
Finding the shortest path between two nodes u and v (with path length measured by
number of edges)
Testing a graph for bipartiteness
Ford–Fulkerson method for computing the maximum flow in a flow network
Serialization/deserialization of a binary tree (versus serialization in sorted order), which
allows the tree to be reconstructed efficiently.
The flood fill algorithm for marking contiguous regions of a two dimensional image or n-
dimensional array
The analysis of networks and relationships
Questions
In sem Exam Theory Questions: (Minimum 10)
(Question should be per session)
Question No. Question
1. Explain graph traversal techniques in detail.
2. Explain DFS with example.
3. Explain BFS with example.
Session 4: Introduction to Greedy Strategy, Minimum Spanning Tree
Session Objective:
o To learn the concept of MST
At the end of this session, the learner will be able to:
o To find the MST using Prim’s algorithm
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

5 | Revision | Brain Storming, Discussion | Explains | Participate, Discuss | Knowledge, Intrapersonal
10 | Greedy strategy | Brain Storming, Presentation | Facilitates, Explains | Listen | Knowledge, Intrapersonal, Comprehension
20 | Spanning tree & minimum spanning tree using Prim's algorithm | Discussion, Blackboard | Facilitates, Explains | Listen | Knowledge, Intrapersonal, Comprehension
10 | Summary | Discussion | Summarizes | Listen, Participate | Knowledge, Intrapersonal
Upon completion, students will be able to:
Describe spanning tree and minimum spanning tree
Apply Prim's algorithm to find a minimum spanning tree
Teaching Learning Material:
PPT, Notes
References
o https://www.tutorialspoint.com/data_structures_algorithms/greedy_algorithms.htm
o http://www.wikipedia.com
o http://proceedings.esri.com/library/userconf/proc01/professional/papers/pap565/p565.htm
Notes
An algorithm is designed to achieve an optimum solution for a given problem. In the greedy
approach, decisions are made from the given solution domain: being greedy, the choice
that seems closest to an optimum solution is taken at each step.
Greedy algorithms try to find a localized optimum solution, which may eventually lead to a
globally optimized solution. In general, however, greedy algorithms do not guarantee globally
optimized solutions.
Examples
Most networking algorithms use the greedy approach. Here is a list of a few of them −
Travelling Salesman Problem
Prim's Minimal Spanning Tree Algorithm
Kruskal's Minimal Spanning Tree Algorithm
Dijkstra's Shortest Path Algorithm
Graph - Map Coloring
Spanning Tree
Given a connected, undirected graph, a spanning tree of that graph is a subgraph that is a tree and
connects all the vertices together. A single graph can have many different spanning trees. We can
also assign a weight to each edge, which is a number representing how unfavorable it is, and use
this to assign a weight to a spanning tree by computing the sum of the weights of the edges in
that spanning tree. Consider the following weighted graph.
Multiple spanning trees can be formed, as follows:
Minimum Spanning Tree
A minimum spanning tree (MST), or minimum weight spanning tree, is a spanning tree with
weight less than or equal to the weight of every other spanning tree. In the above case it is the
following tree, with weight 15.
Application of Minimum Spanning Tree:
One example would be a telecommunications company laying cable to a new neighborhood. If it
is constrained to bury the cable only along certain paths (e.g. along roads), then there would be a
graph representing which points are connected by those paths. Some of those paths might be
more expensive, because they are longer, or require the cable to be buried deeper; these paths
would be represented by edges with larger weights. Currency is an acceptable unit for edge
weight — there is no requirement for edge lengths to obey normal rules of geometry such as the
triangle inequality. A spanning tree for that graph would be a subset of those paths that has no
cycles but still connects to every house; there might be several spanning trees possible. A
minimum spanning tree would be one with the lowest total cost, thus would represent the least
expensive path for laying the cable.
There are two algorithms for finding minimum spanning tree:
1) Prim’s Algorithm 2) Kruskal’s Algorithm
Prim's algorithm
It is a greedy algorithm that finds a minimum spanning tree for a weighted undirected graph.
This means it finds a subset of the edges that forms a tree that includes every vertex, where the
total weight of all the edges in the tree is minimized. The algorithm operates by building this tree
one vertex at a time, from an arbitrary starting vertex, at each step adding the cheapest possible
connection from the tree to another vertex.
In more detail, it may be implemented following the pseudo code below.
1. Associate with each vertex v of the graph a number C[v] (the cheapest cost of a
connection to v) and an edge E[v] (the edge providing that cheapest connection). To
initialize these values, set all values of C[v] to +∞ (or to any number larger than the
maximum edge weight) and set each E[v] to a special flag value indicating that there is no
edge connecting v to earlier vertices.
2. Initialize an empty forest F and a set Q of vertices that have not yet been included in F
(initially, all vertices).
3. Repeat the following steps until Q is empty:
a. Find and remove a vertex v from Q having the minimum possible value of C[v]
b. Add v to F and, if E[v] is not the special flag value, also add E[v] to F
c. Loop over the edges vw connecting v to other vertices w. For each such edge, if w
still belongs to Q and vw has smaller weight than C[w], perform the following
steps:
i. Set C[w] to the cost of edge vw
ii. Set E[w] to point to edge vw.
4. Return F
As described above, the starting vertex for the algorithm will be chosen arbitrarily, because the
first iteration of the main loop of the algorithm will have a set of vertices in Q that all have equal
weights, and the algorithm will automatically start a new tree in F when it completes a spanning
tree of each connected component of the input graph.
Analysis of Algorithm:
The time complexity of Prim's algorithm depends on the data structures used for the graph and
for ordering the edges by weight, which can be done using a priority queue. The following table
shows the typical choices.
Minimum edge weight data structure | Time complexity (total)
adjacency matrix, searching | O(|V|²)
binary heap and adjacency list | O((|V| + |E|) log |V|) = O(|E| log |V|)
Fibonacci heap and adjacency list | O(|E| + |V| log |V|)
Question No. Question
1. List various applications of graph.
2. Define spanning tree and minimum spanning tree with example.
3. Explain Prim’s Algorithm to find MST in detail.
Session 5: Greedy algorithms for computing minimum spanning tree - Kruskal's Algorithm
Session Objective:
o To study the concept of Kruskal's algorithm
At the end of this session, the learner will be able to:
o To apply Kruskal's algorithm on a graph
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

20 | Finding MST using Kruskal's algorithm | Brain Storming, Discussion | Facilitates, Explains | Listen, Participate | Knowledge, Intrapersonal
30 | Solving examples | Discussion, Presentation | Explains | Participate, Discuss | Knowledge, Comprehension
10 | Summary | Discussion | Summarizes | Listen, Participate | Knowledge, Intrapersonal
Upon completion, students will be able to:
Apply Kruskal's algorithm to find the MST
Explain connected components with examples
Teaching Learning Material:
PPT, Notes
References
1. http://www.wikipedia.com
Notes
Kruskal's algorithm
It is a minimum-spanning-tree algorithm that, at each step, finds an edge of the least possible
weight connecting any two trees in the forest; it is a greedy algorithm in graph theory. It finds
a subset of the edges that forms a tree including every vertex, such that the total weight of all the
edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning
forest (a minimum spanning tree for each connected component).
Consider the above graph; the steps to find a minimum spanning tree using Kruskal's algorithm
are as follows:
Description
create a forest F (a set of trees), where each vertex in the graph is a separate tree
create a set S containing all the edges in the graph
while S is nonempty and F is not yet spanning
o remove an edge with minimum weight from S
o if the removed edge connects two different trees then add it to the forest F,
combining two trees into a single tree
At the termination of the algorithm, the forest forms a minimum spanning forest of the graph. If
the graph is connected, the forest has a single component and forms a minimum spanning tree.
Pseudocode
The following code is implemented with disjoint-set data structure:
KRUSKAL(G):
    A = ∅
    foreach v ∈ G.V:
        MAKE-SET(v)
    foreach (u, v) ordered by weight(u, v), increasing:
        if FIND-SET(u) ≠ FIND-SET(v):
            A = A ∪ {(u, v)}
            UNION(u, v)
    return A
Analysis of Algorithm:
Where E is the number of edges in the graph and V is the number of vertices, Kruskal's algorithm
can be shown to run in O(E log E) time, or equivalently, O(E log V) time, all with simple data
structures. These running times are equivalent because:
E is at most V², so log E is O(log V).
Each isolated vertex is a separate component of the minimum spanning forest. If we
ignore isolated vertices we obtain V ≤ 2E, so log V is O(log E).
Connected Components:
In graph theory, a connected component (or just component) of an undirected graph is a
subgraph in which any two vertices are connected to each other by paths, and which is connected
to no additional vertices in the super graph. Following fig shows 3 connected components.
Algorithm
1. It is straightforward to compute the connected components of a graph in linear time (in
terms of the numbers of the vertices and edges of the graph) using either breadth-first
search or depth-first search.
2. In either case, a search that begins at some particular vertex v will find the entire
connected component containing v (and no more) before returning.
3. To find all the connected components of a graph, loop through its vertices, starting a new
breadth first or depth first search whenever the loop reaches a vertex that has not already
been included in a previously found connected component.
There are also efficient algorithms to dynamically track the connected components of a graph
as vertices and edges are added, as a straightforward application of disjoint-set data
structures. These algorithms require amortized O(α(n)) time per operation, where adding
vertices and edges and determining the connected component in which a vertex falls are both
supported operations (α is the inverse Ackermann function).
Question No. Question
1. Explain Kruskal's algorithm in detail.
2. What is a connected component? Explain the algorithm for finding connected components.
3. Differentiate between Prim's and Kruskal's algorithms.
Session 6: Dijkstra's single source shortest path algorithm
Session Objective:
o To understand the shortest path problem
At the end of this session, the learner will be able to:
o To apply Dijkstra’s Algorithm to find shortest path between source and destination
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

25 | Shortest path algorithm with example | Brain Storming, Discussion | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal
25 | Algorithm | Brain Storming, Presentation | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal, Comprehension
10 | Summary | Discussion | Summarizes | Listen, Participate | Knowledge, Intrapersonal
Upon completion, students will be able to:
Define the shortest path problem
Apply Dijkstra's algorithm to find the shortest path
Teaching Learning Material:
PPT, Notes
References
http://lcm.csa.iisc.ernet.in/dsa/node162.html
Notes
Shortest path problem:
In graph theory, the shortest path problem is the problem of finding a path between two
vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is
minimized.
The problem is also sometimes called the single-pair shortest path problem.
Shortest path (A, C, E, D, F) between vertices A and F in the weighted directed graph
The most important algorithms for solving this problem are:
Dijkstra's algorithm solves the single-source shortest path problem.
Bellman–Ford algorithm solves the single-source problem if edge weights may be
negative.
Floyd–Warshall algorithm solves all pairs shortest paths.
Dijkstra’s Algorithm for Shortest Path:
Greedy algorithm
It works by maintaining a set S of "special" vertices whose shortest distance from the
source is already known. At each step, a "non-special" vertex is absorbed into S.
The absorption of an element of V - S into S is done by a greedy strategy.
The following provides the steps of the algorithm.
Let V = {1, 2, ..., n} and source = 1.
C[i, j] = cost of the edge (i, j), taken as ∞ if there is no edge from i to j.

S = {1};
for (i = 2; i <= n; i++)
    D[i] = C[1, i];
for (i = 1; i <= n - 1; i++)
{
    choose a vertex w ∈ V - S such that D[w] is a minimum;
    S = S ∪ {w};
    for each vertex v ∈ V - S
        D[v] = min(D[v], D[w] + C[w, v]);
}
The above algorithm gives the costs of the shortest paths from source vertex to every
other vertex.
The actual shortest paths can also be constructed by modifying the above algorithm.
Example: Consider the digraph in Figure.
A digraph example for Dijkstra's algorithm
Initially:
S = {1}, D[2] = 10, D[3] = ∞, D[4] = 30, D[5] = 100

Iteration 1: select w = 2, so that S = {1, 2}
D[3] = min(∞, D[2] + C[2, 3]) = 60
D[4] = min(30, D[2] + C[2, 4]) = 30
D[5] = min(100, D[2] + C[2, 5]) = 100

Iteration 2: select w = 4, so that S = {1, 2, 4}
D[3] = min(60, D[4] + C[4, 3]) = 50
D[5] = min(100, D[4] + C[4, 5]) = 90

Iteration 3: select w = 3, so that S = {1, 2, 4, 3}
D[5] = min(90, D[3] + C[3, 5]) = 60

Iteration 4: select w = 5, so that S = {1, 2, 4, 3, 5}
Final costs: D[2] = 10, D[3] = 50, D[4] = 30, D[5] = 60
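The iterations above can be checked with a direct translation of the pseudocode into Python. This is a minimal sketch, not the notes' official code; the cost matrix below reconstructs the example's edge costs from the iteration values.

```python
# Array-based Dijkstra, following the pseudocode above.
# Vertices are 1..n; C[i][j] is the edge cost, INF if there is no edge.
INF = float('inf')

def dijkstra(C, source=1):
    n = len(C) - 1                      # C is 1-indexed; row/column 0 unused
    S = {source}
    D = {v: C[source][v] for v in range(1, n + 1)}
    D[source] = 0
    for _ in range(n - 1):
        # choose w in V - S with minimum D[w]
        w = min((v for v in range(1, n + 1) if v not in S), key=lambda v: D[v])
        S.add(w)
        for v in range(1, n + 1):       # relax edges leaving w
            if v not in S:
                D[v] = min(D[v], D[w] + C[w][v])
    return D

# The digraph from the worked example (edge costs inferred from the iterations)
C = [[INF] * 6 for _ in range(6)]
C[1][2], C[1][4], C[1][5] = 10, 30, 100
C[2][3] = 50
C[3][5] = 10
C[4][3], C[4][5] = 20, 60
D = dijkstra(C)
print(D)   # D[2] = 10, D[3] = 50, D[4] = 30, D[5] = 60
```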
Complexity of Dijkstra's Algorithm
With an adjacency matrix representation, the running time is O(n²). By using an adjacency
list representation and a partially ordered tree (heap) for organizing the set V - S,
the complexity can be shown to be O(e log n), where e is the number of edges and n is the
number of vertices in the digraph.
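As a sketch of the O(e log n) variant, the partially ordered tree can be replaced by a binary heap (Python's heapq) over an adjacency list; the graph below reuses the worked example, and stale heap entries are skipped rather than decreased in place.

```python
import heapq

# Heap-based Dijkstra on an adjacency list: O(e log n).
def dijkstra_heap(adj, source):
    # adj: {u: [(v, cost), ...]}
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                    # stale entry, already improved
        for v, c in adj.get(u, []):
            nd = d + c
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Same digraph as the worked example above
adj = {1: [(2, 10), (4, 30), (5, 100)], 2: [(3, 50)],
       3: [(5, 10)], 4: [(3, 20), (5, 60)]}
D = dijkstra_heap(adj, 1)
print(D)   # distances: 1 -> 0, 2 -> 10, 3 -> 50, 4 -> 30, 5 -> 60
```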
Questions
In sem Exam Theory Questions: (Minimum 10)
(Question should be per session)
Question No. Question
1. What is shortest path problem? Explain with example.
2 Explain Dijkstra’s algorithm in detail.
3 What are the various algorithms to find the shortest path?
Session 7
Session 7: Topological ordering
Session Objective:
To study the concept of Topological ordering
At the end of this session, the learner will be able to:
Apply the concept of topological ordering.
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid / Methodology | Faculty Approach | Typical Student Activity | Skill / Competency Developed | CO PO
20 | Topological ordering | Brainstorming, discussion | Facilitates, explains | Participate, discuss | Knowledge, intrapersonal |
30 | Solving examples | Brainstorming, discussion | Facilitates, explains | Participate, discuss | Knowledge, intrapersonal |
10 | Summary | Discussion | Summarizes | Listen, participate | Knowledge, intrapersonal |
Upon completion, students will be able to:
Define topological ordering.
Teaching Learning Material:-
PPT, Notes
References
1. https://courses.cs.washington.edu/courses/cse326/03wi/lectures/RaoLect20.pdf
Notes
Graph Algorithm #1: Topological Sort
Problem: Find an order in which all these courses can be taken.
Example: a prerequisite graph over courses 142, 143, 378, 370, 321, 341, 322, 326, 421, 401 (figure omitted).
Topological Sort Definition: given a digraph G = (V, E), find a linear ordering of the vertices such that, for every edge (v, w) in E, v precedes w in the ordering.
Step 1: Identify a vertex that has no incoming edges (its "in-degree" is zero) and select it. If no such vertex exists, the graph is cyclic and has no topological ordering.
Step 2: Delete this vertex of in-degree 0 and all its outgoing edges from the graph, and place it in the output.
Repeat Steps 1 and 2 until the graph is empty.
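Steps 1 and 2 together are Kahn's algorithm. A minimal sketch follows; the example vertices are course numbers from the slide, but the prerequisite edges among them are assumptions for illustration.

```python
from collections import deque

# Kahn's algorithm: repeatedly remove a vertex of in-degree 0.
def topological_sort(vertices, edges):
    indeg = {v: 0 for v in vertices}
    succ = {v: [] for v in vertices}
    for v, w in edges:                  # edge (v, w): v must precede w
        succ[v].append(w)
        indeg[w] += 1
    queue = deque(v for v in vertices if indeg[v] == 0)   # Step 1
    order = []
    while queue:
        v = queue.popleft()             # Step 2: delete v, place it in the output
        order.append(v)
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    if len(order) != len(vertices):     # leftover vertices imply a cycle
        raise ValueError("graph has a cycle; no topological ordering")
    return order

# Example: a small prerequisite graph (edges assumed for illustration)
order = topological_sort([142, 143, 378, 370],
                         [(142, 143), (143, 378), (143, 370)])
print(order)   # [142, 143, 378, 370]
```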
Questions
Exam Theory Questions:
(Question should be per session)
Question No. Question
1. What is topological sorting? Explain.
Session 8
Session 8: Case study - Data structure used in Webgraph and Google Maps
Session Objective:
To study the concept of webgraph
At the end of this session, the learner will be able to:
Apply the concept of graphs in web graphs.
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid / Methodology | Faculty Approach | Typical Student Activity | Skill / Competency Developed | CO PO
10 | Revision | Brainstorming, discussion | Facilitates, explains | Participate, discuss | Knowledge, intrapersonal |
40 | Case study: data structures used in the Webgraph and Google Maps | Brainstorming, discussion | Facilitates, explains | Participate, discuss | Knowledge, intrapersonal |
10 | Summary | Discussion | Summarizes | Listen, participate | Knowledge, intrapersonal |
Upon completion, students will be able to:
Explain the data structures used in the Webgraph and Google Maps.
Teaching Learning Material:-
PPT, Notes
References
1) http://jgaa.info/accepted/2006/Donato+2006.10.2.pdf
2) S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine.
Computer Networks and ISDN Systems, 30(1–7):107–117, 1998.
Notes
The Webgraph is the graph whose nodes are (static) web pages and whose edges are the (directed)
hyperlinks among them. The Web graph has been the subject of large interest in the scientific
community, primarily because of search engine technology. Remarkable examples are algorithms
for ranking pages, such as PageRank [2].
The Web graph relative to a certain set of URLs is a directed graph having those URLs as nodes,
and with an arc from x to y whenever page x contains a hyperlink toward page y. When trying
to devise a compression mechanism to store a Web graph efficiently, we can exploit some
empirical observations about the structure of hyperlinks in a typical subset of the Web. The
features of the links of a Web graph that are usually quoted are locality and similarity, which
were originally exploited by the Connectivity Server [2] and by the LINK database [1]:

1. Locality. Most links contained in a page have a navigational nature: they lead the user to some
other page within the same host ("home", "next", "previous", "up", etc.). If we compare the
source and target URLs of these links, we observe that they share a long common prefix; said
otherwise, if URLs are sorted lexicographically, the indices of the source and target are close to
each other.

2. Similarity. Pages that occur close to each other (in lexicographic order) tend to have
many common successors; this is because many navigational links are the same within the same
local cluster of pages, and even non-navigational links are often copied from one page to another
within the same host.

These features suggest using techniques borrowed from full-text indexing
for storing increasing sequences of integers with small gaps, and they moreover inspired the reference
compression techniques discussed in [1]. Since several successor lists are similar, one can
specify the successor list of a node by copying part of a previous list and adding whatever
remains. This is achieved using a list of bits, one for each successor in the referenced list, which
tells whether the successor should be copied or not, or using other techniques (such as explicit
deletion lists [1]). The empirical analysis at the base of WebGraph's compression techniques
evidenced two additional facts:

1. Similarity is much more concentrated than was previously
thought. Either two lists have almost nothing in common, or they share large segments of their
successor lists. This implies that the one-bit-per-link scheme used in reference compression may
be refined to a copy-block list scheme, in which the links to be copied are specified by means of
interval lengths (essentially a run-length encoding of the reference bits).

2. Consecutivity is common. Many links within a page are consecutive (with
respect to the lexicographic order); this is due to two distinct phenomena. First of all, most pages
contain sets of navigational links which point to a fixed level of the hierarchy. Since the
hierarchical nature of a site is usually reflected in the hierarchical nature of its URLs, links in pages
at the bottom of the hierarchy tend to be adjacent in lexicographic order. Second, in the
transposed Web graph, pages that are high in the site hierarchy (e.g., the home page) are pointed
to by most pages of the site. This, of course, also gives rise to large intervals.

3. Consecutivity is the dual of distance-one similarity. If a graph is easily compressible using similarity at distance
one (i.e., exploiting similarity with the successor list of the previous node in lexicographic
order), its transpose must sport large intervals of consecutive links, and vice versa, as a node
that is common among two or more consecutive successor lists at distance one is reflected by a
corresponding interval of length two or more in the transposed graph.
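The locality observation is what makes gap encoding pay off: a sorted successor list with small gaps can be stored as differences, which are small integers. A toy sketch of the idea follows (a deliberate simplification of WebGraph's actual instantaneous codes):

```python
# Gap-encode a sorted successor list: store the first index, then differences.
def gap_encode(successors):
    gaps = [successors[0]]
    for prev, cur in zip(successors, successors[1:]):
        gaps.append(cur - prev)         # small numbers when locality holds
    return gaps

def gap_decode(gaps):
    out, total = [], 0
    for g in gaps:
        total += g
        out.append(total)
    return out

succ = [1000, 1002, 1003, 1007, 1012]   # neighbours close in lexicographic order
print(gap_encode(succ))   # [1000, 2, 1, 4, 5]
```

The small gap values would then be stored with a variable-length integer code, so that frequent small values cost only a few bits each.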
Google Maps:
1. What kind of data structure is used to store the map information?
2. What kind of algorithm is used to "navigate" from source to destination?
3. How are Google/Bing able to "stream in" the data, so that, for example, you can zoom in
from miles up to ground level seamlessly, all the while maintaining the coordinate system?
I will attempt to address each question in order. Do note that I do not work for the Google
Maps or Bing teams, so quite obviously this information might not be completely accurate. I
am basing this on knowledge gained from a good CS course about data structures and
algorithms.
The answer to all of the questions above is that the map is stored in an edge-weighted directed graph.
Locations on the map are the vertices, and the paths from one location to another (from one vertex to
another) are the edges.
Quite obviously, since there can be millions of vertices and an order of magnitude more edges,
the really interesting thing would be the representation of this Edge Weighted Digraph.
I would say that this would be represented by some kind of Adjacency List and the reason I say
so is because, if you imagine a map, it is essentially a sparse graph. There are only a few ways to
get from one location to another. Think about your house! How many roads (edges in our case)
lead to it? Adjacency Lists are good for representing sparse graphs, and adjacency matrix is good
for representing dense graphs.
Of course, even though we are able to efficiently represent sparse graphs in memory, given the
sheer number of Vertices and Edges, it would be impossible to store everything in memory at
once. Hence, I would imagine some kind of a streaming library underneath.
To create an analogy for this, if you have ever played an open-world game like World of
Warcraft / Skyrim / GTA, you will observe that, for the most part, there is no loading screen. But
quite obviously, it is impossible to fit everything into memory at once. Thus using a combination
of quad-trees and frustum culling algorithms, these games are able to dynamically load resources
(terrain, sprites, meshes etc).
We can imagine something similar, but for Graphs. I have not put a lot of thought into this
particular aspect, but to cook up a very basic system, one can imagine an in memory database,
which they query and add/remove vertices and edges from the graph at run-time as needed. This
brings us to another interesting point. Since vertices and edges need to be removed and added at
run-time, the classic implementation of Adjacency List will not cut it.
In a classic implementation, we simply store a list (a Vector in Java) in each element of an
array Adj[]. I would imagine a linked list in place of the Adj[] array, and a binary search tree in
place of each edge list. The binary search tree would facilitate O(log N) insertion and removal of
nodes. This is extremely desirable since, in the list implementation, while addition is O(1),
removal is O(N); when you are dealing with millions of edges, this is prohibitive.
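In a language with built-in hashing, a dictionary of dictionaries achieves the same goal with average O(1) edge insertion and removal; the class and method names below are illustrative, not from any real mapping system.

```python
# Dynamic adjacency structure: vertices and weighted edges can be
# added and removed at run-time in average O(1) per operation.
class DynamicDigraph:
    def __init__(self):
        self.adj = {}                   # vertex -> {neighbour: weight}

    def add_edge(self, u, v, w):
        self.adj.setdefault(u, {})[v] = w
        self.adj.setdefault(v, {})      # ensure v exists as a vertex

    def remove_edge(self, u, v):
        self.adj.get(u, {}).pop(v, None)

    def neighbours(self, u):
        return self.adj.get(u, {}).items()

g = DynamicDigraph()
g.add_edge("A", "B", 4)
g.add_edge("A", "C", 2)
g.remove_edge("A", "B")
print(dict(g.neighbours("A")))   # {'C': 2}
```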
A final point to note here is that until you actually start the navigation, there is "no" graph. Since
there can be millions of users, it doesn't make sense to maintain one giant graph for everybody
(this would be impossible due to the memory requirement alone). I would imagine that as you
start the navigation process, a graph is created for you. Quite obviously, since you start from
location A and go to location B (and possibly other locations after that), the graph created just for
you should not take up a very large amount of memory (provided the streaming architecture is in
place).
The most basic algorithm for solving this problem is Dijkstra's path-finding algorithm.
Faster variations, such as A*, exist. I would imagine Dijkstra to be fast enough if it could work
properly with the streaming architecture discussed above. Dijkstra uses space proportional to V
and time proportional to E lg V, which are very good figures, especially for sparse graphs. Do
keep in mind that if the streaming architecture has not been nailed down, V and E will explode, and
the space and run-time requirements of Dijkstra will make it prohibitive.
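A* differs from Dijkstra only in that the priority of a vertex is its tentative distance plus an admissible heuristic estimate of the remaining distance. A sketch on a toy grid with a Manhattan-distance heuristic (the grid is illustrative, not real map data):

```python
import heapq

# A* on a 4-connected grid; h = Manhattan distance, which is admissible here.
def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    dist = {start: 0}
    heap = [(h(start), start)]
    while heap:
        _, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return dist[(r, c)]         # first pop of the goal is optimal
        for nr, nc in ((r+1, c), (r-1, c), (r, c+1), (r, c-1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = dist[(r, c)] + 1
                if nd < dist.get((nr, nc), float('inf')):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd + h((nr, nc)), (nr, nc)))
    return None                         # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],                      # 1 = blocked cell
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))   # 6 (path must go around the wall)
```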
Exam Theory Questions:
(Question should be per session)
Question No. Question
1. Write short notes on Data structure used in Webgraph and Google map.