UNIT III: Graphs

Unit Rationale

Unit III: (9 Hrs)
Basic Concepts, Storage representation, Adjacency matrix, adjacency list, adjacency multi list, inverse adjacency list. Traversals - depth first and breadth first. Introduction to Greedy Strategy, Minimum Spanning Tree, Greedy algorithms for computing minimum spanning tree - Prim's and Kruskal's algorithms, Dijkstra's single source shortest path, Topological ordering. Case study - Data structures used in the web graph and Google Maps.
When the students have successfully completed this course, they will be able to:
Implement Graph data structures.
Demonstrate applications of graph.
Apply Prim's and Kruskal's algorithms to find an MST.
Implement Dijkstra's algorithm for finding the shortest path.
Understand the concepts of transitive closure and connected components.
Session Objectives and Outcome
Unit Objective: To learn concept of graph data structure and its applications.
Unit Outcome: To apply the concepts of graph data structure in real life.
Session | Contents | Objective | Outcome

1. Basic Concepts, Storage representation of graphs
   Objective: To study the concept of a graph
   Outcome: To apply the concept of a graph
2. Adjacency matrix, adjacency list, adjacency multi list, inverse adjacency list
   Objective: To study representations of a graph using different data structures
   Outcome: To demonstrate how to represent a graph using different data structures
3. Traversals - depth first and breadth first
   Objective: To study traversal techniques of a graph
   Outcome: To demonstrate how to perform DFS and BFS on a graph
4. Introduction to Greedy Strategy, Minimum Spanning Tree
   Objective: To learn implementation of MST
   Outcome: To explain MST
5. Greedy algorithms for computing minimum spanning tree - Prim's and Kruskal's algorithms
   Objective: To learn implementation of MST
   Outcome: To explain Prim's and Kruskal's algorithms
6. Dijkstra's single source shortest path algorithm
   Objective: To gain knowledge of how to find the shortest path
   Outcome: To explain Dijkstra's algorithm
7. Topological ordering
   Objective: To study topological ordering
   Outcome: To learn topological ordering
8. Case study - Data structure used in the web graph
   Objective: To study the use of the web graph
   Outcome: To study the use of the web graph
9. Case study - Data structure used in Google Maps
   Objective: To study the use of Google Maps
   Outcome: To study the use of Google Maps
Session 1: Basic Concepts, Storage representation of graphs
Session Objective:
o To study the concept of Graphs and Storage representation of graphs
At the end of this session, the learner will be able to:
o To apply the concept of graph data structure
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

20 | Definition of graph and related terminologies | Brain Storming, Discussion | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal
15 | Properties of graph and graph operations | Brain Storming, Presentation | Facilitates, Explains | Participate, Discuss | Knowledge
15 | Storage representation of graph | Discussion, Presentation | Explains | Listen | Knowledge
10 | Summary | Discussion | Summarizes | Participate, Discuss | Knowledge, Comprehension
Upon completion, students will be able to:
Describe the concepts of a graph
Explain operations on the graph data structure
Explain storage structures
Teaching Learning Material:
PPT, Notes
References
o http://www.studytonight.com/data-structures/
o http://www.wikipedia.com
Notes
Graph Definition:
A graph data structure consists of a finite (and possibly mutable) set of nodes or vertices, together with a set of ordered pairs of these nodes (or, in some cases, a set of unordered pairs). These pairs are known as edges or arcs.
A graph data structure may also associate with each edge some edge value, such as a symbolic label or a numeric attribute (cost, capacity, length, etc.). Fig. 1 below shows a graph with 6 vertices and 7 edges.
Fig1: Graph
Types of Graph
1) Undirected Graph:
An undirected graph is one in which edges have no orientation. The edge (u, v) is
identical to the edge (v, u); i.e., the edges are not ordered pairs but sets {u, v} (or 2-multisets)
of vertices. The maximum number of edges in an undirected graph without a self-loop is
n(n - 1)/2.
2) Directed Graph:
A directed graph or digraph is an ordered pair D = (V, A) with
V a set whose elements are called vertices or nodes, and
A a set of ordered pairs of vertices, called arcs, directed edges, or arrows.
An arc a = (x, y) is considered to be directed from x to y; y is called the head and x is called the
tail of the arc.
3) Weighted Graph:
A graph is a weighted graph if a number (weight) is assigned to each edge. Such weights
might represent, for example, costs, lengths, or capacities, depending on the problem
at hand.
Terminologies:
1) End vertices: End-vertices of an edge are the endpoints of the edge.
2) Adjacent vertices: Two vertices are adjacent if they are endpoints of the same edge.
3) Edge: An edge is incident on a vertex if the vertex is an endpoint of the edge.
4) Outgoing edges: the outgoing edges of a vertex are the directed edges for which that vertex is the origin.
5) Incoming edges: the incoming edges of a vertex are the directed edges for which that vertex is the destination.
6) Degree of a vertex, v, denoted deg(v) is the number of incident edges.
Degree of vertex A=3
Degree of vertex B=3
Degree of vertex C=3
Degree of vertex D=3
Degree of vertex E=2
7) Out-degree: outdeg(v), is the number of outgoing edges.
8) In-degree: indeg(v), is the number of incoming edges.
Vertex | In-degree | Out-degree
1 | 1 | 2
2 | 0 | 2
3 | 1 | 1
4 | 2 | 1
5 | 2 | 0
9) Parallel edges or multiple edges: edges that share the same end-vertices (and, in a digraph, the same direction).
10) Self-loop: an edge whose two end vertices are the same vertex.
11) Simple graphs: graphs with no parallel edges or self-loops.
12) Path: a sequence of alternating vertices and edges such that each successive vertex is
connected by an edge. Frequently only the vertices are listed, especially if there are no
parallel edges.
13) Cycle: a path that starts and ends at the same vertex. A simple path is a path with distinct
vertices.
14) Directed path is a path of only directed edges. Directed cycle is a cycle of only directed
edges.
15) Sub-graph is a subset of vertices and edges. Spanning sub-graph contains all the vertices.
16) Complete Graph: An n vertex graph with exactly n(n-1)/2 edges is said to be complete
graph.
17) Connected graph has all pairs of vertices connected by at least one path.
18) Connected component is the maximal connected sub-graph of an unconnected graph.
19) Forest is a graph without cycles.
20) Tree is a connected forest (the trees discussed previously are rooted trees; these are free
trees).
21) Spanning tree is a spanning subgraph that is also a tree.
Properties:
1. If a graph G has m edges, then Σ_{v∈G} deg(v) = 2m.
2. If a digraph G has m edges, then Σ_{v∈G} indeg(v) = m = Σ_{v∈G} outdeg(v).
3. If a simple graph G has m edges and n vertices:
   If G is directed, then m ≤ n(n-1).
   If G is undirected, then m ≤ n(n-1)/2.
   So a simple graph with n vertices has O(n²) edges at most.
4. If G is an undirected graph with n vertices and m edges:
   If G is connected, then m ≥ n - 1.
   If G is a tree, then m = n - 1.
   If G is a forest, then m ≤ n - 1.
Basic Operations:
The basic operations provided by a graph data structure G usually include:
adjacent(G, x, y): tests whether there is an edge from node x to node y.
neighbors(G, x): lists all nodes y such that there is an edge from x to y.
add(G, x, y): adds to G the edge from x to y, if it is not there.
delete(G, x, y): removes the edge from x to y, if it is there.
get_node_value(G, x): returns the value associated with the node x.
set_node_value(G, x, a): sets the value associated with the node x to a.
Structures that associate values to the edges usually also provide:
get_edge_value(G, x, y): returns the value associated to the edge (x,y).
set_edge_value(G, x, y, v): sets the value associated to the edge (x,y) to v.
Representation of graphs
Different data structures for the representation of graphs are used in practice:
1) Adjacency list
2) Adjacency matrix
3) Incidence matrix
Questions
Exam Theory Questions: (Minimum 10)
(Question should be per session)
Question No. Question
1 Define a graph with examples. Explain the various types of graphs.
2 What are the in-degree, out-degree, and degree of a node?
3 Explain complete graph, simple graph, connected graph, and strongly connected graph with examples.
4 Explain various operations on a graph.
5 What are the different properties of a graph?
6 What are the graph representation techniques? Explain in detail.
Session 2: Adjacency matrix, adjacency list, adjacency multi list, inverse adjacency list
Session Objective:
o To implement graph representations using different data structures
At the end of this session, the learner will be able to:
o To apply different data structures to store the graph
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

5 | Revision | Brain Storming, Discussion | Summarizes | Participate, Discuss | Knowledge, Intrapersonal
15 | Adjacency matrix representation of graph | Brain Storming, Presentation | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal, Comprehension
15 | Adjacency list representation of graph | Discussion, Presentation | Explains | Listen | Knowledge, Comprehension
10 | Adjacency multi list representation of graph | Discussion, Presentation | Explains | Listen | Knowledge, Comprehension
5 | Inverse adjacency list representation of graph | Discussion, Presentation | Explains | Listen | Knowledge, Comprehension
10 | Summary | Discussion | Explains | Listen | Knowledge, Intrapersonal
Upon completion, students will be able to:
Understand representation of a graph using different data structures
Teaching Learning Material:
PPT, Notes
References
1. Data Structures: A Pseudocode Approach with C, by Richard F. Gilberg and Behrouz A. Forouzan
Notes
Representation of graphs
Different data structures for the representation of graphs are used in practice:
Adjacency matrix:
A two-dimensional matrix, in which the rows represent source vertices and columns
represent destination vertices. Data on edges and vertices must be stored externally. Only
the cost for one edge can be stored between each pair of vertices.
Pros: The representation is easy to implement and follow. Removing an edge takes O(1)
time. Queries such as whether there is an edge from vertex u to vertex v are efficient and
can be answered in O(1).
Cons: Consumes more space, O(V^2). Even if the graph is sparse (contains fewer
edges), it consumes the same space. Adding a vertex takes O(V^2) time.
Adjacency list :
Vertices are stored as records or objects, and every vertex stores a list of adjacent
vertices. This data structure allows the storage of additional data on the vertices.
Additional data can be stored if edges are also stored as objects, in which case each
vertex stores its incident edges and each edge stores its incident vertices.
Pros: Saves space, O(|V| + |E|). In the worst case there can be C(V, 2) edges in
a graph, thus consuming O(V^2) space. Adding a vertex is easier.
Cons: Queries such as whether there is an edge from vertex u to vertex v are not efficient;
they can take O(V) time.
Adjacency multi-list:
An edge in an undirected graph is represented by two nodes in the adjacency list representation.
Adjacency multilists are lists in which edge nodes may be shared among several lists
(each edge node is shared by the lists of its two end vertices).
Example of adjacency multilists:
Lists: vertex 0: M1->M2->M3, vertex 1: M1->M4->M5
vertex 2: M2->M4->M6, vertex 3: M3->M5->M6
typedef struct edge *edge_pointer;
typedef struct edge {
    short int marked;             /* visited flag */
    int vertex1, vertex2;         /* the edge's two end vertices */
    edge_pointer path1, path2;    /* next edge node in vertex1's and vertex2's lists */
} edge;
edge_pointer graph[MAX_VERTICES]; /* head of each vertex's edge list */
Inverse Adjacency List:
An inverse adjacency list stores, for each vertex, the list of vertices from which an edge points to it (i.e., its predecessors). The inverse adjacency list for the given graph is shown below.
Questions
In sem Exam Theory Questions: (Minimum 10)
(Question should be per session)
Question No. Question
1. Explain the matrix representation of a graph. What are the pros and cons of this representation?
2. Explain the adjacency list representation of a graph. What are the pros and cons of this representation?
3. Explain the multi list representation of a graph. What are the pros and cons of this representation?
4. Explain the inverse adjacency list representation of a graph. What are the pros and cons of this representation?
Session 3: Traversals - depth first and breadth first
Session Objective:
o To implement the graph traversal techniques
At the end of this session, the learner will be able to:
o To apply the concept of traversals like DFS and BFS on graphs
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

10 | Revision | Brain Storming, Discussion | Summarizes | Participate, Discuss | Knowledge, Intrapersonal
20 | Graph Traversal: DFS | Brain Storming, Presentation | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal, Comprehension
20 | Breadth First Search with its applications | Discussion, Presentation | Explains | Listen | Knowledge, Comprehension
10 | Summary | Discussion | Explains | Listen | Knowledge, Intrapersonal
Upon completion, students will be able to:
Explain BFS traversal of a graph
Explain DFS traversal of a graph
Teaching Learning Material:
PPT, Notes
References
https://www.tutorialspoint.com/data_structures_algorithms/breadth_first_traversal.htm
Notes
Depth-First Search
A depth-first search (DFS) is an algorithm for traversing a finite graph. DFS visits the child
nodes before visiting the sibling nodes; that is, it traverses the depth of any particular path before
exploring its breadth. A stack (often the program's call stack via recursion) is generally used
when implementing the algorithm.
The algorithm begins with a chosen "root" node; it then iteratively transitions from the current
node to an adjacent, unvisited node, until it can no longer find an unexplored node to transition
to from its current location. The algorithm then backtracks along previously visited nodes, until it
finds a node connected to yet more uncharted territory. It will then proceed down the new path as
it had before, backtracking as it encounters dead-ends, and ending only when the algorithm has
backtracked past the original "root" node from the very first step.
The Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack
to remember where to resume the search when a dead end occurs in any iteration.
As in the example given above, the DFS algorithm traverses from A to B to C to D first, then to E,
then to F, and lastly to G. It employs the following rules.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it onto a
stack.
Rule 2 − If no adjacent unvisited vertex is found, pop a vertex from the stack. (This will pop
all the vertices from the stack that have no unvisited adjacent vertices.)
Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.
DFS is the basis for many graph-related algorithms, including topological sorts and planarity
testing.
Pseudocode
Input: A graph G and a vertex v of G
Output: A labeling of the edges in the connected component of v as discovery edges and back
edges
procedure DFS(G, v):
    label v as explored
    for all edges e in G.incidentEdges(v) do
        if edge e is unexplored then
            w ← G.adjacentVertex(v, e)
            if vertex w is unexplored then
                label e as a discovery edge
                recursively call DFS(G, w)
            else
                label e as a back edge
Breadth-first search
A breadth-first search (BFS) is another technique for traversing a finite graph. BFS visits the
parent nodes before visiting the child nodes, and a queue is used in the search process. This
algorithm is often used to find the shortest path from one node to another.
The Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a
queue to remember where to resume the search when a dead end occurs in any
iteration.
As in the example given above, the BFS algorithm traverses from A to B to E to F first, then to C
and G, and lastly to D. It employs the following rules.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a
queue.
Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.
Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.
Pseudocode
Input: A graph G and a root v of G
Output: The node closest to v in G satisfying some conditions, or null if no such a node exists in
G
procedure BFS(G, v):
    create a queue Q
    enqueue v onto Q
    mark v
    while Q is not empty:
        t ← Q.dequeue()
        if t is what we are looking for:
            return t
        for all edges e in G.adjacentEdges(t) do
            o ← G.adjacentVertex(t, e)
            if o is not marked:
                mark o
                enqueue o onto Q
    return null
Applications
Breadth-first search can be used to solve many problems in graph theory, for example:
Finding all nodes within one connected component
Copying garbage collection (Cheney's algorithm)
Finding the shortest path between two nodes u and v (with path length measured by
number of edges)
Testing a graph for bipartiteness
Ford–Fulkerson method for computing the maximum flow in a flow network
Serialization/deserialization of a binary tree (versus serialization in sorted order), which
allows the tree to be reconstructed efficiently.
The flood fill algorithm for marking contiguous regions of a two dimensional image or n-
dimensional array
The analysis of networks and relationships
Questions
In sem Exam Theory Questions: (Minimum 10)
(Question should be per session)
Question No. Question
1. Explain graph traversal techniques in detail.
2. Explain DFS with example.
3. Explain BFS with example.
Session 4: Introduction to Greedy Strategy, Minimum Spanning Tree
Session Objective:
o To learn the concept of MST
At the end of this session, the learner will be able to:
o To find the MST using Prim’s algorithm
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

5 | Revision | Brain Storming, Discussion | Explains | Participate, Discuss | Knowledge, Intrapersonal
10 | Greedy strategy | Brain Storming, Presentation | Facilitates, Explains | Listen | Knowledge, Intrapersonal, Comprehension
20 | Spanning tree & minimum spanning tree using Prim's algorithm | Discussion, Blackboard | Facilitates, Explains | Listen | Knowledge, Intrapersonal, Comprehension
10 | Summary | Discussion | Summarizes | Listen, Participate | Knowledge, Intrapersonal
Upon completion, students will be able to:
Describe spanning tree and minimum spanning tree
Apply Prim's algorithm to find a minimum spanning tree
Teaching Learning Material:
PPT, Notes
References
o https://www.tutorialspoint.com/data_structures_algorithms/greedy_algorithms.htm
o http://www.wikipedia.com
o http://proceedings.esri.com/library/userconf/proc01/professional/papers/pap565/p565.htm
Notes
An algorithm is designed to achieve an optimum solution for a given problem. In the greedy
approach, decisions are made from the given solution domain: being greedy, the choice
that seems closest to an optimum solution is taken at each step.
Greedy algorithms try to find a localized optimum solution, which may eventually lead to a
globally optimized solution. In general, however, greedy algorithms do not guarantee globally
optimized solutions.
Examples
Most networking algorithms use the greedy approach. Here is a list of a few of them −
Travelling Salesman Problem
Prim's Minimal Spanning Tree Algorithm
Kruskal's Minimal Spanning Tree Algorithm
Dijkstra's Shortest Path Algorithm
Graph - Map Coloring
Spanning Tree
Given a connected, undirected graph, a spanning tree of that graph is a subgraph that is a tree and
connects all the vertices together. A single graph can have many different spanning trees. We can
also assign a weight to each edge, which is a number representing how unfavorable it is, and use
this to assign a weight to a spanning tree by computing the sum of the weights of the edges in
that spanning tree. Consider the following weighted graph.
Multiple spanning trees can be formed, as follows:
Minimum Spanning Tree
A minimum spanning tree (MST), or minimum weight spanning tree, is a spanning tree with
weight less than or equal to the weight of every other spanning tree. In the above case it is the
following tree, with weight 15.
Application of Minimum Spanning Tree:
One example would be a telecommunications company laying cable to a new neighborhood. If it
is constrained to bury the cable only along certain paths (e.g. along roads), then there would be a
graph representing which points are connected by those paths. Some of those paths might be
more expensive, because they are longer, or require the cable to be buried deeper; these paths
would be represented by edges with larger weights. Currency is an acceptable unit for edge
weight — there is no requirement for edge lengths to obey normal rules of geometry such as the
triangle inequality. A spanning tree for that graph would be a subset of those paths that has no
cycles but still connects to every house; there might be several spanning trees possible. A
minimum spanning tree would be one with the lowest total cost, thus would represent the least
expensive path for laying the cable.
There are two algorithms for finding minimum spanning tree:
1) Prim’s Algorithm 2) Kruskal’s Algorithm
Prim's algorithm
It is a greedy algorithm that finds a minimum spanning tree for a weighted undirected graph.
This means it finds a subset of the edges that forms a tree that includes every vertex, where the
total weight of all the edges in the tree is minimized. The algorithm operates by building this tree
one vertex at a time, from an arbitrary starting vertex, at each step adding the cheapest possible
connection from the tree to another vertex.
In more detail, it may be implemented following the pseudo code below.
1. Associate with each vertex v of the graph a number C[v] (the cheapest cost of a
connection to v) and an edge E[v] (the edge providing that cheapest connection). To
initialize these values, set all values of C[v] to +∞ (or to any number larger than the
maximum edge weight) and set each E[v] to a special flag value indicating that there is no
edge connecting v to earlier vertices.
2. Initialize an empty forest F and a set Q of vertices that have not yet been included in F
(initially, all vertices).
3. Repeat the following steps until Q is empty:
a. Find and remove a vertex v from Q having the minimum possible value of C[v]
b. Add v to F and, if E[v] is not the special flag value, also add E[v] to F
c. Loop over the edges vw connecting v to other vertices w. For each such edge, if w
still belongs to Q and vw has smaller weight than C[w], perform the following
steps:
i. Set C[w] to the cost of edge vw
ii. Set E[w] to point to edge vw.
4. Return F
As described above, the starting vertex for the algorithm will be chosen arbitrarily, because the
first iteration of the main loop of the algorithm will have a set of vertices in Q that all have equal
weights, and the algorithm will automatically start a new tree in F when it completes a spanning
tree of each connected component of the input graph.
Analysis of Algorithm:
The time complexity of Prim's algorithm depends on the data structures used for the graph and
for ordering the edges by weight, which can be done using a priority queue. The following table
shows the typical choices.
Minimum edge weight data structure | Time complexity (total)
adjacency matrix, searching | O(|V|²)
binary heap and adjacency list | O((|V| + |E|) log |V|) = O(|E| log |V|)
Fibonacci heap and adjacency list | O(|E| + |V| log |V|)
Question No. Question
1. List various applications of graph.
2. Define spanning tree and minimum spanning tree with example.
3. Explain Prim’s Algorithm to find MST in detail.
Session 5: Greedy algorithms for computing minimum spanning tree - Kruskal's Algorithm
Session Objective:
o To study the concept of Kruskal's algorithm
At the end of this session, the learner will be able to:
o To apply Kruskal's algorithm on a graph
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

20 | Finding MST using Kruskal's algorithm | Brain Storming, Discussion | Facilitates, Explains | Listen, Participate | Knowledge, Intrapersonal
30 | Solving examples | Discussion, Presentation | Explains | Participate, Discuss | Knowledge, Comprehension
10 | Summary | Discussion | Summarizes | Listen, Participate | Knowledge, Intrapersonal
Upon completion, students will be able to:
Apply Kruskal's algorithm to find the MST
Explain connected components with examples
Teaching Learning Material:
PPT, Notes
References
1. http://www.wikipedia.com
Notes
Kruskal's algorithm
It is a minimum-spanning-tree algorithm that, at each step, finds an edge of the least possible
weight connecting any two trees in the forest; it is a greedy algorithm in graph theory. It finds
a subset of the edges that forms a tree including every vertex, such that the total weight of all the
edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning
forest (a minimum spanning tree for each connected component).
Consider the above graph; the steps to find a minimum spanning tree using Kruskal's algorithm
are as follows:
Description
create a forest F (a set of trees), where each vertex in the graph is a separate tree
create a set S containing all the edges in the graph
while S is nonempty and F is not yet spanning
o remove an edge with minimum weight from S
o if the removed edge connects two different trees then add it to the forest F,
combining two trees into a single tree
At the termination of the algorithm, the forest forms a minimum spanning forest of the graph. If
the graph is connected, the forest has a single component and forms a minimum spanning tree.
Pseudocode
The following code is implemented with disjoint-set data structure:
KRUSKAL(G):
    A = ∅
    foreach v ∈ G.V:
        MAKE-SET(v)
    foreach (u, v) ordered by weight(u, v), increasing:
        if FIND-SET(u) ≠ FIND-SET(v):
            A = A ∪ {(u, v)}
            UNION(u, v)
    return A
Analysis of Algorithm:
Where E is the number of edges in the graph and V is the number of vertices, Kruskal's algorithm
can be shown to run in O(E log E) time, or equivalently, O(E log V) time, all with simple data
structures. These running times are equivalent because:
E is at most V², so log E is O(log V).
Each isolated vertex is a separate component of the minimum spanning forest. If we
ignore isolated vertices we obtain V ≤ 2E, so log V is O(log E).
Connected Components:
In graph theory, a connected component (or just component) of an undirected graph is a
subgraph in which any two vertices are connected to each other by paths, and which is connected
to no additional vertices in the super graph. Following fig shows 3 connected components.
Algorithm
1. It is straightforward to compute the connected components of a graph in linear time (in
terms of the numbers of the vertices and edges of the graph) using either breadth-first
search or depth-first search.
2. In either case, a search that begins at some particular vertex v will find the entire
connected component containing v (and no more) before returning.
3. To find all the connected components of a graph, loop through its vertices, starting a new
breadth first or depth first search whenever the loop reaches a vertex that has not already
been included in a previously found connected component.
There are also efficient algorithms to dynamically track the connected components of a graph
as vertices and edges are added, as a straightforward application of disjoint-set data
structures. These algorithms require amortized O(α(n)) time per operation, where adding
vertices and edges and determining the connected component in which a vertex falls are both
supported operations (α is the inverse Ackermann function).
Question No. Question
1. Explain Kruskal's algorithm in detail.
2. What is a connected component? Explain the algorithm for finding connected components.
3. Differentiate between Prim's and Kruskal's algorithms.
Session 6: Dijkstra's single source shortest path algorithm
Session Objective:
o To understand the shortest path problem
At the end of this session, the learner will be able to:
o To apply Dijkstra’s Algorithm to find shortest path between source and destination
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid/Methodology | Faculty Approach | Typical Student Activity | Skill/Competency Developed | CO PO

25 | Shortest path algorithm with example | Brain Storming, Discussion | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal
25 | Algorithm | Brain Storming, Presentation | Facilitates, Explains | Participate, Discuss | Knowledge, Intrapersonal, Comprehension
10 | Summary | Discussion | Summarizes | Listen, Participate | Knowledge, Intrapersonal
Upon completion, students will be able to:
Define the shortest path problem
Apply Dijkstra's algorithm to find the shortest path
Teaching Learning Material:
PPT, Notes
References
http://lcm.csa.iisc.ernet.in/dsa/node162.html
Notes
Shortest path problem:
In graph theory, the shortest path problem is the problem of finding a path between two
vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is
minimized.
The problem is also sometimes called the single-pair shortest path problem.
Shortest path (A, C, E, D, F) between vertices A and F in the weighted directed graph
The most important algorithms for solving this problem are:
Dijkstra's algorithm solves the single-source shortest path problem.
Bellman–Ford algorithm solves the single-source problem if edge weights may be
negative.
Floyd–Warshall algorithm solves all pairs shortest paths.
Dijkstra’s Algorithm for Shortest Path:
Greedy algorithm
It works by maintaining a set S of "special" vertices whose shortest distance from the
source is already known. At each step, a "non-special" vertex is absorbed into S.
The absorption of an element of V - S into S is done by a greedy strategy.
The following provides the steps of the algorithm.
Let V = {1, 2, ..., n} and source = 1.
C[i, j] = cost of the edge (i, j), taken as ∞ if there is no edge from i to j.

S = {1};
for (i = 2; i <= n; i++)
    D[i] = C[1, i];
for (i = 1; i <= n - 1; i++)
{
    choose a vertex w ∈ V - S such that D[w] is a minimum;
    S = S ∪ {w};
    for each vertex v ∈ V - S
        D[v] = min(D[v], D[w] + C[w, v]);
}
The above algorithm gives the costs of the shortest paths from source vertex to every
other vertex.
The actual shortest paths can also be constructed by modifying the above algorithm.
Example: Consider the digraph in Figure.
A digraph example for Dijkstra's algorithm
Initially:
S = {1}, D[2] = 10, D[3] = ∞, D[4] = 30, D[5] = 100

Iteration 1: select w = 2, so that S = {1, 2}
D[3] = min(∞, D[2] + C[2, 3]) = 60
D[4] = min(30, D[2] + C[2, 4]) = 30
D[5] = min(100, D[2] + C[2, 5]) = 100

Iteration 2: select w = 4, so that S = {1, 2, 4}
D[3] = min(60, D[4] + C[4, 3]) = 50
D[5] = min(100, D[4] + C[4, 5]) = 90

Iteration 3: select w = 3, so that S = {1, 2, 4, 3}
D[5] = min(90, D[3] + C[3, 5]) = 60

Iteration 4: select w = 5, so that S = {1, 2, 4, 3, 5}
Final costs: D[2] = 10, D[3] = 50, D[4] = 30, D[5] = 60
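The iterations above can be checked with a direct translation of the pseudocode into Python. This is a minimal sketch, not the notes' official code; the cost matrix below reconstructs the example's edge costs from the iteration values.

```python
# Array-based Dijkstra, following the pseudocode above.
# Vertices are 1..n; C[i][j] is the edge cost, INF if there is no edge.
INF = float('inf')

def dijkstra(C, source=1):
    n = len(C) - 1                      # C is 1-indexed; row/column 0 unused
    S = {source}
    D = {v: C[source][v] for v in range(1, n + 1)}
    D[source] = 0
    for _ in range(n - 1):
        # choose w in V - S with minimum D[w]
        w = min((v for v in range(1, n + 1) if v not in S), key=lambda v: D[v])
        S.add(w)
        for v in range(1, n + 1):       # relax edges leaving w
            if v not in S:
                D[v] = min(D[v], D[w] + C[w][v])
    return D

# The digraph from the worked example (edge costs inferred from the iterations)
C = [[INF] * 6 for _ in range(6)]
C[1][2], C[1][4], C[1][5] = 10, 30, 100
C[2][3] = 50
C[3][5] = 10
C[4][3], C[4][5] = 20, 60
D = dijkstra(C)
print(D)   # D[2] = 10, D[3] = 50, D[4] = 30, D[5] = 60
```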
Complexity of Dijkstra's Algorithm
With an adjacency matrix representation, the running time is O(n²). By using an adjacency
list representation and a partially ordered tree (heap) for organizing the set V - S,
the complexity can be shown to be O(e log n), where e is the number of edges and n is the
number of vertices in the digraph.
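As a sketch of the O(e log n) variant, the partially ordered tree can be replaced by a binary heap (Python's heapq) over an adjacency list; the graph below reuses the worked example, and stale heap entries are skipped rather than decreased in place.

```python
import heapq

# Heap-based Dijkstra on an adjacency list: O(e log n).
def dijkstra_heap(adj, source):
    # adj: {u: [(v, cost), ...]}
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                    # stale entry, already improved
        for v, c in adj.get(u, []):
            nd = d + c
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Same digraph as the worked example above
adj = {1: [(2, 10), (4, 30), (5, 100)], 2: [(3, 50)],
       3: [(5, 10)], 4: [(3, 20), (5, 60)]}
D = dijkstra_heap(adj, 1)
print(D)   # distances: 1 -> 0, 2 -> 10, 3 -> 50, 4 -> 30, 5 -> 60
```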
Questions
In sem Exam Theory Questions: (Minimum 10)
(Question should be per session)
Question No. Question
1. What is shortest path problem? Explain with example.
2 Explain Dijkstra’s algorithm in detail.
3 What are the various algorithms to find the shortest path?
Session 7
Session 7: Topological ordering
Session Objective:
To study the concept of Topological ordering
At the end of this session, the learner will be able to:
Apply the concept of topological ordering.
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid / Methodology | Faculty Approach | Typical Student Activity | Skill / Competency Developed | CO PO
20 | Topological ordering | Brainstorming, discussion | Facilitates, explains | Participate, discuss | Knowledge, intrapersonal |
30 | Solving examples | Brainstorming, discussion | Facilitates, explains | Participate, discuss | Knowledge, intrapersonal |
10 | Summary | Discussion | Summarizes | Listen, participate | Knowledge, intrapersonal |
Upon completion, students will be able to:
Define topological ordering.
Teaching Learning Material:-
PPT, Notes
References
1. https://courses.cs.washington.edu/courses/cse326/03wi/lectures/RaoLect20.pdf
Notes
Graph Algorithm #1: Topological Sort
Problem: Find an order in which all these courses can be taken.
Example: a prerequisite graph over courses 142, 143, 378, 370, 321, 341, 322, 326, 421, 401 (figure omitted).
Topological Sort Definition: given a digraph G = (V, E), find a linear ordering of the vertices such that, for every edge (v, w) in E, v precedes w in the ordering.
Step 1: Identify a vertex that has no incoming edges (its "in-degree" is zero) and select it. If no such vertex exists, the graph is cyclic and has no topological ordering.
Step 2: Delete this vertex of in-degree 0 and all its outgoing edges from the graph, and place it in the output.
Repeat Steps 1 and 2 until the graph is empty.
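Steps 1 and 2 together are Kahn's algorithm. A minimal sketch follows; the example vertices are course numbers from the slide, but the prerequisite edges among them are assumptions for illustration.

```python
from collections import deque

# Kahn's algorithm: repeatedly remove a vertex of in-degree 0.
def topological_sort(vertices, edges):
    indeg = {v: 0 for v in vertices}
    succ = {v: [] for v in vertices}
    for v, w in edges:                  # edge (v, w): v must precede w
        succ[v].append(w)
        indeg[w] += 1
    queue = deque(v for v in vertices if indeg[v] == 0)   # Step 1
    order = []
    while queue:
        v = queue.popleft()             # Step 2: delete v, place it in the output
        order.append(v)
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    if len(order) != len(vertices):     # leftover vertices imply a cycle
        raise ValueError("graph has a cycle; no topological ordering")
    return order

# Example: a small prerequisite graph (edges assumed for illustration)
order = topological_sort([142, 143, 378, 370],
                         [(142, 143), (143, 378), (143, 370)])
print(order)   # [142, 143, 378, 370]
```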
Questions
Exam Theory Questions:
(Question should be per session)
Question No. Question
1. What is topological sorting? Explain.
Session 8
Session 8: Case study - Data structure used in Webgraph and Google Maps
Session Objective:
To study the concept of webgraph
At the end of this session, the learner will be able to:
Apply the concept of graphs in web graphs.
Teaching Learning Material:
o Black board
o PPT
Session Plan
Time (in min) | Content | Learning Aid / Methodology | Faculty Approach | Typical Student Activity | Skill / Competency Developed | CO PO
10 | Revision | Brainstorming, discussion | Facilitates, explains | Participate, discuss | Knowledge, intrapersonal |
40 | Case study: data structures used in the Webgraph and Google Maps | Brainstorming, discussion | Facilitates, explains | Participate, discuss | Knowledge, intrapersonal |
10 | Summary | Discussion | Summarizes | Listen, participate | Knowledge, intrapersonal |
Upon completion, students will be able to:
Explain the data structures used in the Webgraph and Google Maps.
Teaching Learning Material:-
PPT, Notes
References
1) http://jgaa.info/accepted/2006/Donato+2006.10.2.pdf
2) S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine.
Computer Networks and ISDN Systems, 30(1–7):107–117, 1998.
Notes
The Webgraph is the graph whose nodes are (static) web pages and whose edges are the (directed)
hyperlinks among them. The Web graph has been the subject of large interest in the scientific
community, primarily because of search engine technology. Remarkable examples are algorithms
for ranking pages, such as PageRank [2].
The Web graph relative to a certain set of URLs is a directed graph having those URLs as nodes,
and with an arc from x to y whenever page x contains a hyperlink toward page y. When trying
to devise a compression mechanism to store a Web graph efficiently, we can exploit some
empirical observations about the structure of hyperlinks in a typical subset of the Web. The
features of the links of a Web graph that are usually quoted are locality and similarity, which
were originally exploited by the Connectivity Server [2] and by the LINK database [1]:

1. Locality. Most links contained in a page have a navigational nature: they lead the user to some
other page within the same host ("home", "next", "previous", "up", etc.). If we compare the
source and target URLs of these links, we observe that they share a long common prefix; said
otherwise, if URLs are sorted lexicographically, the indices of the source and target are close to
each other.

2. Similarity. Pages that occur close to each other (in lexicographic order) tend to have
many common successors; this is because many navigational links are the same within the same
local cluster of pages, and even non-navigational links are often copied from one page to another
within the same host.

These features suggest using techniques borrowed from full-text indexing
for storing increasing sequences of integers with small gaps, and they moreover inspired the reference
compression techniques discussed in [1]. Since several successor lists are similar, one can
specify the successor list of a node by copying part of a previous list and adding whatever
remains. This is achieved using a list of bits, one for each successor in the referenced list, which
tells whether the successor should be copied or not, or using other techniques (such as explicit
deletion lists [1]). The empirical analysis at the base of WebGraph's compression techniques
evidenced two additional facts:

1. Similarity is much more concentrated than was previously
thought. Either two lists have almost nothing in common, or they share large segments of their
successor lists. This implies that the one-bit-per-link scheme used in reference compression may
be refined to a copy-block list scheme, in which the links to be copied are specified by means of
interval lengths (essentially a run-length encoding of the reference bits).

2. Consecutivity is common. Many links within a page are consecutive (with
respect to the lexicographic order); this is due to two distinct phenomena. First of all, most pages
contain sets of navigational links which point to a fixed level of the hierarchy. Since the
hierarchical nature of a site is usually reflected in the hierarchical nature of its URLs, links in pages
at the bottom of the hierarchy tend to be adjacent in lexicographic order. Second, in the
transposed Web graph, pages that are high in the site hierarchy (e.g., the home page) are pointed
to by most pages of the site. This, of course, also gives rise to large intervals.

3. Consecutivity is the dual of distance-one similarity. If a graph is easily compressible using similarity at distance
one (i.e., exploiting similarity with the successor list of the previous node in lexicographic
order), its transpose must sport large intervals of consecutive links, and vice versa, as a node
that is common among two or more consecutive successor lists at distance one is reflected by a
corresponding interval of length two or more in the transposed graph.
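The locality observation is what makes gap encoding pay off: a sorted successor list with small gaps can be stored as differences, which are small integers. A toy sketch of the idea follows (a deliberate simplification of WebGraph's actual instantaneous codes):

```python
# Gap-encode a sorted successor list: store the first index, then differences.
def gap_encode(successors):
    gaps = [successors[0]]
    for prev, cur in zip(successors, successors[1:]):
        gaps.append(cur - prev)         # small numbers when locality holds
    return gaps

def gap_decode(gaps):
    out, total = [], 0
    for g in gaps:
        total += g
        out.append(total)
    return out

succ = [1000, 1002, 1003, 1007, 1012]   # neighbours close in lexicographic order
print(gap_encode(succ))   # [1000, 2, 1, 4, 5]
```

The small gap values would then be stored with a variable-length integer code, so that frequent small values cost only a few bits each.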
Google Maps:
1. What kind of data structure is used to store the map information?
2. What kind of algorithm is used to "navigate" from source to destination?
3. How are Google/Bing able to "stream in" the data, so that, for example, you can zoom in
from miles up to ground level seamlessly, all the while maintaining the coordinate system?
I will attempt to address each question in order. Do note that I do not work for the Google
Maps or Bing teams, so quite obviously this information might not be completely accurate. I
am basing this on knowledge gained from a good CS course about data structures and
algorithms.
The answer to all of the questions above is that the map is stored in an edge-weighted directed graph.
Locations on the map are the vertices, and the paths from one location to another (from one vertex to
another) are the edges.
Quite obviously, since there can be millions of vertices and an order of magnitude more edges,
the really interesting thing would be the representation of this Edge Weighted Digraph.
I would say that this would be represented by some kind of Adjacency List and the reason I say
so is because, if you imagine a map, it is essentially a sparse graph. There are only a few ways to
get from one location to another. Think about your house! How many roads (edges in our case)
lead to it? Adjacency Lists are good for representing sparse graphs, and adjacency matrix is good
for representing dense graphs.
Of course, even though we are able to efficiently represent sparse graphs in memory, given the
sheer number of Vertices and Edges, it would be impossible to store everything in memory at
once. Hence, I would imagine some kind of a streaming library underneath.
To create an analogy for this, if you have ever played an open-world game like World of
Warcraft / Skyrim / GTA, you will observe that, for the most part, there is no loading screen. But
quite obviously, it is impossible to fit everything into memory at once. Thus using a combination
of quad-trees and frustum culling algorithms, these games are able to dynamically load resources
(terrain, sprites, meshes etc).
We can imagine something similar, but for Graphs. I have not put a lot of thought into this
particular aspect, but to cook up a very basic system, one can imagine an in memory database,
which they query and add/remove vertices and edges from the graph at run-time as needed. This
brings us to another interesting point. Since vertices and edges need to be removed and added at
run-time, the classic implementation of Adjacency List will not cut it.
In a classic implementation, we simply store a list (a Vector in Java) in each element of an
array Adj[]. I would imagine a linked list in place of the Adj[] array, and a binary search tree in
place of each edge list. The binary search tree would facilitate O(log N) insertion and removal of
nodes. This is extremely desirable since, in the list implementation, while addition is O(1),
removal is O(N); when you are dealing with millions of edges, this is prohibitive.
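In a language with built-in hashing, a dictionary of dictionaries achieves the same goal with average O(1) edge insertion and removal; the class and method names below are illustrative, not from any real mapping system.

```python
# Dynamic adjacency structure: vertices and weighted edges can be
# added and removed at run-time in average O(1) per operation.
class DynamicDigraph:
    def __init__(self):
        self.adj = {}                   # vertex -> {neighbour: weight}

    def add_edge(self, u, v, w):
        self.adj.setdefault(u, {})[v] = w
        self.adj.setdefault(v, {})      # ensure v exists as a vertex

    def remove_edge(self, u, v):
        self.adj.get(u, {}).pop(v, None)

    def neighbours(self, u):
        return self.adj.get(u, {}).items()

g = DynamicDigraph()
g.add_edge("A", "B", 4)
g.add_edge("A", "C", 2)
g.remove_edge("A", "B")
print(dict(g.neighbours("A")))   # {'C': 2}
```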
A final point to note here is that until you actually start the navigation, there is "no" graph. Since
there can be millions of users, it doesn't make sense to maintain one giant graph for everybody
(this would be impossible due to the memory requirement alone). I would imagine that as you
start the navigation process, a graph is created for you. Quite obviously, since you start from
location A and go to location B (and possibly other locations after that), the graph created just for
you should not take up a very large amount of memory (provided the streaming architecture is in
place).
The most basic algorithm for solving this problem is Dijkstra's path-finding algorithm.
Faster variations, such as A*, exist. I would imagine Dijkstra to be fast enough if it could work
properly with the streaming architecture discussed above. Dijkstra uses space proportional to V
and time proportional to E lg V, which are very good figures, especially for sparse graphs. Do
keep in mind that if the streaming architecture has not been nailed down, V and E will explode, and
the space and run-time requirements of Dijkstra will make it prohibitive.
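A* differs from Dijkstra only in that the priority of a vertex is its tentative distance plus an admissible heuristic estimate of the remaining distance. A sketch on a toy grid with a Manhattan-distance heuristic (the grid is illustrative, not real map data):

```python
import heapq

# A* on a 4-connected grid; h = Manhattan distance, which is admissible here.
def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    dist = {start: 0}
    heap = [(h(start), start)]
    while heap:
        _, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return dist[(r, c)]         # first pop of the goal is optimal
        for nr, nc in ((r+1, c), (r-1, c), (r, c+1), (r, c-1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = dist[(r, c)] + 1
                if nd < dist.get((nr, nc), float('inf')):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd + h((nr, nc)), (nr, nc)))
    return None                         # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],                      # 1 = blocked cell
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))   # 6 (path must go around the wall)
```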
Exam Theory Questions:
(Question should be per session)
Question No. Question
1. Write short notes on Data structure used in Webgraph and Google map.