72
Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links link cost: delay, $ cost, or congestion level Goal: determine “good” path (sequence of routers) thru network from source to dest. Routing protocol A E D C B F 2 2 1 3 1 1 2 5 3 5 “good” path: typically means minimum cost path other def’s possible

Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Embed Size (px)

Citation preview

Page 1: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Routing

Graph abstraction for routing algorithms:

• graph nodes are routers• graph edges are

physical links– link cost: delay, $ cost, or

congestion level

Goal: determine “good” path

(sequence of routers) thru network from source to

dest.

Routing protocol

A

ED

CB

F

2

2

13

1

1

2

53

5

• “good” path:– typically means

minimum cost path– other def’s possible

Page 2: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Routing Algorithm classificationGlobal or decentralized

information?Global:

• all routers have complete topology, link cost info

• “link state” algorithms

Decentralized:

• router knows physically-connected neighbors, link costs to neighbors

• iterative process of computation, exchange of info with neighbors

• “distance vector” algorithms

Static or dynamic?Static: • routes change slowly

over time

Dynamic: • routes change more

quickly– periodic update– in response to link

cost changes

Page 3: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

A Link-State Routing Algorithm

Dijkstra’s algorithm• net topology, link costs known

to all nodes– accomplished via “link state

broadcast” – all nodes have same info

• computes least cost paths from one node (‘source”) to all other nodes– gives routing table for that

node• iterative: after k iterations, know

least cost path to k dest.’s

Notation:• c(i,j): link cost from node i to j.

cost infinite if not direct neighbors

• D(v): current value of cost of path from source to dest. V

• p(v): predecessor node along path from source to v

• N: set of nodes whose least cost path definitively known

Page 4: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Dijsktra’s Algorithm

1 Initialization: 2 N = {A} 3 for all nodes v 4 if v adjacent to A 5 then D(v) = c(A,v) 6 else D(v) = infinity 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N: 12 D(v) = min( D(v), D(w) + c(w,v) ) 13 /* new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v */ 15 until all nodes in N

Page 5: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Dijkstra’s algorithm: example

Step012345

start NA

ADADE

ADEBADEBC

ADEBCF

D(B),p(B)2,A2,A2,A

D(C),p(C)5,A4,D3,E3,E

D(D),p(D)1,A

D(E),p(E)infinity

2,D

D(F),p(F)infinityinfinity

4,E4,E4,E

A

ED

CB

F

2

2

13

1

1

2

53

5

Page 6: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Dijkstra’s algorithm, discussionAlgorithm complexity: n nodes• each iteration: need to check all nodes, w, not in N• n*(n+1)/2 comparisons: O(n**2)• more efficient implementations possible: O(nlogn)

Oscillations possible:• e.g., link cost = amount of carried traffic

A

D

C

B1 1+e

e0

e

1 1

0 0

A

D

C

B2+e 0

001+e1

A

D

C

B0 2+e

1+e10 0

A

D

C

B2+e 0

e01+e1

initially… recompute

routing… recompute … recompute

Page 7: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Vector Routing Algorithm

iterative:• continues until no nodes

exchange info.• self-terminating: no

“signal” to stop

asynchronous:• nodes need not

exchange info/iterate in lock step!

distributed:• each node

communicates only with directly-attached neighbors

Distance Table data structure • each node has its own• row for each possible destination• column for each directly-attached

neighbor to node• example: in node X, for dest. Y via

neighbor Z:

D (Y,Z)X

distance from X toY, via Z as next hop

c(X,Z) + min {D (Y,w)}Z

w

=

=

Page 8: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Table: example

A

E D

CB7

8

1

2

1

2

D ()

A

B

C

D

A

1

7

6

4

B

14

8

9

11

D

5

5

4

2

Ecost to destination via

dest

inat

ion

D (C,D)E

c(E,D) + min {D (C,w)}D

w== 2+2 = 4

D (A,D)E

c(E,D) + min {D (A,w)}D

w== 2+3 = 5

D (A,B)E

c(E,B) + min {D (A,w)}B

w== 8+6 = 14

loop!

loop!

Page 9: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance table gives routing table

D ()

A

B

C

D

A

1

7

6

4

B

14

8

9

11

D

5

5

4

2

Ecost to destination via

dest

inat

ion

A

B

C

D

A,1

D,5

D,4

D,4

Outgoing link to use, cost

dest

inat

ion

Distance table Routing table

Page 10: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Vector Routing: overviewIterative, asynchronous:

each local iteration caused by:

• local link cost change • message from neighbor: its

least cost path change from neighbor

Distributed:• each node notifies neighbors

only when its least cost path to any destination changes– neighbors then notify their

neighbors if necessary

wait for (change in local link cost of msg from neighbor)

recompute distance table

if least cost path to any dest

has changed, notify neighbors

Each node:

Page 11: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Vector Algorithm:

1 Initialization: 2 for all adjacent nodes v: 3 D (*,v) = infinity /* the * operator means "for all rows" */ 4 D (v,v) = c(X,v) 5 for all destinations, y 6 send min D (y,w) to each neighbor /* w over all X's neighbors */

XX

Xw

At all nodes, X:

Page 12: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Vector Algorithm (cont.):8 loop 9 wait (until I see a link cost change to neighbor V 10 or until I receive update from neighbor V) 11 12 if (c(X,V) changes by d) 13 /* change cost to all dest's via neighbor v by d */ 14 /* note: d could be positive or negative */ 15 for all destinations y: D (y,V) = D (y,V) + d 16 17 else if (update received from V wrt destination Y) 18 /* shortest path from V to some Y has changed */ 19 /* V has sent a new value for its min DV(Y,w) */ 20 /* call this received new value is "newval" */ 21 for the single destination y: D (Y,V) = c(X,V) + newval 22 23 if we have a new min D (Y,w)for any destination Y 24 send new value of min D (Y,w) to all neighbors 25 26 forever

w

XX

XX

X

w

w

Page 13: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Vector Algorithm: example

X Z12

7

Y

Page 14: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Vector Algorithm: example

X Z12

7

Y

D (Y,Z)X

c(X,Z) + min {D (Y,w)}w=

= 7+1 = 8

Z

D (Z,Y)X

c(X,Y) + min {D (Z,w)}w=

= 2+1 = 3

Y

Page 15: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Vector: link cost changes

Link cost changes:• node detects local link cost

change • updates distance table (line 15)• if cost change in least cost path,

notify neighbors (lines 23,24)

X Z14

50

Y1

algorithmterminates“good

news travelsfast”

Page 16: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Vector: link cost changes

Link cost changes:• good news travels fast • bad news travels slow -

“count to infinity” problem!X Z

14

50

Y60

algorithmcontinues

on!

Page 17: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distance Vector: poisoned reverseIf Z routes through Y to get to X :• Z tells Y its (Z’s) distance to X is infinite (so Y

won’t route to X via Z)• will this completely solve count to infinity problem? X Z

14

50

Y60

algorithmterminates

Page 18: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Comparison of LS and DV algorithms

Message complexity

• LS: with n nodes, E links, O(nE) msgs sent each

• DV: exchange between neighbors only

– convergence time varies

Speed of Convergence• LS: O(n2) algorithm requires

O(nE) msgs

– may have oscillations

• DV: convergence time varies

– may be routing loops

– count-to-infinity problem

Robustness: what happens if router malfunctions?

LS: – node can advertise incorrect

link cost– each node computes only

its own table

DV:– DV node can advertise

incorrect path cost– each node’s table used by

others • error propagate thru

network

Page 19: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Roadmap• Routing in the Internet

– Routing algorithms– Routing Protocols

• Intra-AS routing: RIP and OSPF• Inter-AS routing: BGP

Page 20: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Routing in the Internet• The Global Internet consists of Autonomous

Systems (AS) interconnected with each other:– Stub AS: small corporation: one connection to other

AS’s– Multihomed AS: large corporation (no transit): multiple

connections to other AS’s– Transit AS: provider, hooking many AS’s together

• Two-level routing: – Intra-AS: administrator responsible for choice of

routing algorithm within network– Inter-AS: unique standard for inter-AS routing: BGP

Page 21: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Internet AS HierarchyInter-AS border (exterior gateway) routers

Intra-AS interior routers

Page 22: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Intra-AS Routing

• Also known as Interior Gateway Protocols (IGP)• Most common Intra-AS routing protocols:

– RIP: Routing Information Protocol

– OSPF: Open Shortest Path First

– IGRP: Interior Gateway Routing Protocol (Cisco proprietary)

Page 23: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

RIP ( Routing Information Protocol)

• Distance vector algorithm• Included in BSD-UNIX Distribution in 1982• Distance metric: # of hops (max = 15 hops)• Distance vectors: exchanged among neighbors

every 30 sec via Response Message (also called advertisement)

• Each advertisement: list of up to 25 destination nets within AS

Page 24: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

RIP: Example

Destination Network Next Router Num. of hops to dest. w A 2

y B 2 z B 7

x -- 1…. …. ....

w x y

z

A

C

D B

Routing table in D

Dest Next hops w - - x - - z C 4 …. … ...

Advertisementfrom A to D

Page 25: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

RIP: Example

Destination Network Next Router Num. of hops to dest. w A 2

y B 2 z B A 7 5

x -- 1…. …. ....Routing table in D

w x y

z

A

C

D B

Dest Next hops w - - x - - z C 4 …. … ...

Advertisementfrom A to D

Page 26: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

RIP: Link Failure and Recovery If no advertisement heard after 180 sec -->

neighbor/link declared dead– routes via neighbor invalidated– new advertisements sent to neighbors– neighbors in turn send out new

advertisements (if tables changed)– link failure info quickly propagates to entire

net– poison reverse used to prevent ping-pong

loops (infinite distance = 16 hops)

Page 27: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

RIP Table processing

• RIP routing tables managed by application-level process called route-d (daemon)

• advertisements sent in UDP packets, periodically repeated

physical

link

network forwarding (IP) table

Transprt (UDP)

routed

physical

link

network (IP)

Transprt (UDP)

routed

forwardingtable

Page 28: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

RIP Table example (continued)Router: giroflee.eurocom.fr

• Three attached networks (LANs)

• Router only knows routes to attached LANs• Default router used to “go up”• Route multicast address: 224.0.0.0• Loopback interface (for debugging)

Destination Gateway Flags Ref Use Interface -------------------- -------------------- ----- ----- ------ --------- 127.0.0.1 127.0.0.1 UH 0 26492 lo0 192.168.2. 192.168.2.5 U 2 13 fa0 193.55.114. 193.55.114.6 U 3 58503 le0 192.168.3. 192.168.3.5 U 2 25 qaa0 224.0.0.0 193.55.114.6 U 3 0 le0 default 193.55.114.129 UG 0 143454

Page 29: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

OSPF (Open Shortest Path First)

• “open”: publicly available• Uses Link State algorithm

– LS packet dissemination– Topology map at each node– Route computation using Dijkstra’s algorithm

• OSPF advertisement carries one entry per neighbor router

• Advertisements disseminated to entire AS (via flooding)– Carried in OSPF messages directly over IP (rather than TCP or

UDP

Page 30: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

OSPF “advanced” features (not in RIP)

• Security: all OSPF messages authenticated (to prevent malicious intrusion)

• Multiple same-cost paths allowed (only one path in RIP)• For each link, multiple cost metrics for different TOS

(e.g., satellite link cost set “low” for best effort; high for real time)

• Integrated uni- and multicast support: – Multicast OSPF (MOSPF) uses same topology data

base as OSPF• Hierarchical OSPF in large domains.

Page 31: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Hierarchical OSPF

Page 32: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Hierarchical OSPF

• Two-level hierarchy: local area, backbone.– Link-state advertisements only in area – each nodes has detailed area topology; only know

direction (shortest path) to nets in other areas.• Area border routers: “summarize” distances to nets in

own area, advertise to other Area Border routers.• Backbone routers: run OSPF routing limited to

backbone.• Boundary routers: connect to other AS’s.

Page 33: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Chapter 4: Network Layer

• 4. 1 Introduction• 4.2 Virtual circuit and

datagram networks• 4.3 What’s inside a router• 4.4 IP: Internet Protocol

– Datagram format– IPv4 addressing– ICMP– IPv6

• 4.5 Routing algorithms– Link state– Distance Vector– Hierarchical routing

• 4.6 Routing in the Internet– RIP– OSPF– BGP

• 4.7 Broadcast and multicast routing

Page 34: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Internet inter-AS routing: BGP

• BGP (Border Gateway Protocol): the de facto standard

• BGP provides each AS a means to:1. Obtain subnet reachability information from

neighboring ASs.2. Propagate the reachability information to all routers

internal to the AS.3. Determine “good” routes to subnets based on

reachability information and policy.

• Allows a subnet to advertise its existence to rest of the Internet: “I am here”

Page 35: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

BGP basics• Pairs of routers (BGP peers) exchange routing info over semi-

permanent TCP connections: BGP sessions• Note that BGP sessions do not correspond to physical links.• When AS2 advertises a prefix to AS1, AS2 is promising it will

forward any datagrams destined to that prefix towards the prefix.– AS2 can aggregate prefixes in its advertisement

3b

1d

3a

1c2aAS3

AS1

AS21a

2c

2b

1b

3c

eBGP session

iBGP session

Page 36: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Distributing reachability info• With eBGP session between 3a and 1c, AS3 sends prefix

reachability info to AS1.• 1c can then use iBGP do distribute this new prefix reach info to

all routers in AS1• 1b can then re-advertise the new reach info to AS2 over the 1b-

to-2a eBGP session• When router learns about a new prefix, it creates an entry for

the prefix in its forwarding table.

3b

1d

3a

1c2aAS3

AS1

AS21a

2c

2b

1b

3c

eBGP session

iBGP session

externalinternal

Page 37: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Path attributes & BGP routes

• When advertising a prefix, advert includes BGP attributes. – prefix + attributes = “route”

• Two important attributes:– AS-PATH: contains the ASs through which the advert

for the prefix passed: AS 67 AS 17

– NEXT-HOP: Indicates the specific internal-AS router to next-hop AS. (There may be multiple links from current AS to next-hop-AS.)

• When gateway router receives route advert, uses import policy to accept/decline.

Page 38: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

BGP route selection

• Router may learn about more than 1 route to some prefix. Router must select route.

• Elimination rules:1. Local preference value attribute: policy

decision

2. Shortest AS-PATH

3. Closest NEXT-HOP router: hot potato routing

4. Additional criteria

Page 39: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

BGP messages

• BGP messages exchanged using TCP.• BGP messages:

– OPEN: opens TCP connection to peer and authenticates sender

– UPDATE: advertises new path (or withdraws old)– KEEPALIVE keeps connection alive in absence of

UPDATES; also ACKs OPEN request– NOTIFICATION: reports errors in previous msg; also

used to close connection

Page 40: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

BGP routing policy

Figure 4.5-BGPnew: a simple BGP scenario

A

B

C

W X

Y

legend:

customer network:

provider network

• A,B,C are provider networks• X,W,Y are customer (of provider networks)• X is dual-homed: attached to two networks

– X does not want to route from B via X to C– .. so X will not advertise to B a route to C

Page 41: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

BGP routing policy (2)

Figure 4.5-BGPnew: a simple BGP scenario

A

B

C

W X

Y

legend:

customer network:

provider network

• A advertises to B the path AW • B advertises to X the path BAW • Should B advertise to C the path BAW?

– No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers

– B wants to force C to route to w via A– B wants to route only to/from its customers!

Page 42: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Why different Intra- and Inter-AS routing ? Policy: • Inter-AS: admin wants control over how its traffic routed,

who routes through its net. • Intra-AS: single admin, so no policy decisions needed

Scale:• hierarchical routing saves table size, reduced update

traffic

Performance: • Intra-AS: can focus on performance• Inter-AS: policy may dominate over performance

Page 43: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Chapter 4: Network Layer

• 4. 1 Introduction• 4.2 Virtual circuit and

datagram networks• 4.3 What’s inside a router• 4.4 IP: Internet Protocol

– Datagram format– IPv4 addressing– ICMP– IPv6

• 4.5 Routing algorithms– Link state– Distance Vector– Hierarchical routing

• 4.6 Routing in the Internet– RIP– OSPF– BGP

• 4.7 Broadcast and multicast routing

Page 44: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

R1

R2

R3 R4

sourceduplication

R1

R2

R3 R4

in-networkduplication

duplicatecreation/transmissionduplicate

duplicate

Broadcast Routing• Deliver packets from source to all other nodes• Source duplication is inefficient:

• Source duplication: how does source determine recipient addresses?

Page 45: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

In-network duplication

• Flooding: when node receives brdcst pckt, sends copy to all neighbors– Problems: cycles & broadcast storm

• Controlled flooding: node only brdcsts pkt if it hasn’t brdcst same packet before– Node keeps track of pckt ids already brdcsted– Or reverse path forwarding (RPF): only forward pckt if

it arrived on shortest path between node and source

• Spanning tree– No redundant packets received by any node

Page 46: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

A

B

G

DE

c

F

A

B

G

DE

c

F

(a) Broadcast initiated at A (b) Broadcast initiated at D

Spanning Tree

• First construct a spanning tree

• Nodes forward copies only along spanning tree

Page 47: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

A

B

G

DE

c

F1

2

3

4

5

(a) Stepwise construction of spanning tree

A

B

G

DE

c

F

(b) Constructed spanning tree

Spanning Tree: Creation• Center node• Each node sends unicast join message to center

node– Message forwarded until it arrives at a node already

belonging to spanning tree

Page 48: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Multicast Routing: Problem Statement

• Goal: find a tree (or trees) connecting routers having local mcast group members – tree: not all paths between routers used– source-based: different tree from each sender to rcvrs– shared-tree: same tree used by all group members

Shared tree Source-based trees

Page 49: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Multicast: one sender to many

receivers • Multicast: act of sending datagram to multiple receivers

with single “transmit” operation– analogy: one teacher to many students

• Question: how to achieve multicast

Multicast via unicast• source sends N unicast

datagrams, one addressed to each of N receivers

multicast receiver (red)

not a multicast receiver (red)

routersforward unicastdatagrams

Page 50: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Multicast: one sender to many

receivers • Multicast: act of sending datagram to multiple receivers

with single “transmit” operation– analogy: one teacher to many students

• Question: how to achieve multicast

Network multicast• Router actively participate in

multicast, making copies of packets as needed and forwarding towards multicast receivers

Multicastrouters (red) duplicate and forward multicast datagrams

Page 51: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Multicast: one sender to many

receivers • Multicast: act of sending datagram to multiple

receivers with single “transmit” operation– analogy: one teacher to many students

• Question: how to achieve multicast

Application-layer multicast• end systems involved in

multicast copy and forward unicast datagrams among themselves

Page 52: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Internet Multicast Service Model

multicast group concept: use of indirection– hosts addresses IP datagram to multicast group– routers forward multicast datagrams to hosts that have

“joined” that multicast group

128.119.40.186

128.59.16.12

128.34.108.63

128.34.108.60

multicast group

226.17.30.197

Page 53: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Multicast groups class D Internet addresses reserved for multicast:

host group semantics:o anyone can “join” (receive) multicast groupo anyone can send to multicast groupo no network-layer identification to hosts of members

needed: infrastructure to deliver mcast-addressed datagrams to all hosts that have joined that multicast group

Page 54: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Joining a mcast group: two-step process

• local: host informs local mcast router of desire to join group: IGMP (Internet Group Management Protocol)

• wide area: local router interacts with other routers to receive mcast datagram flow– many protocols (e.g., DVMRP, MOSPF, PIM)

IGMPIGMP

IGMP

wide-areamulticast

routing

Page 55: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

IGMP: Internet Group Management Protocol

• host: sends IGMP report when application joins mcast group– IP_ADD_MEMBERSHIP socket option– host need not explicitly “unjoin” group when

leaving • router: sends IGMP query at regular intervals

– host belonging to a mcast group must reply to query

query report

Page 56: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

IGMPIGMP version 1• router: Host Membership

Query msg broadcast on LAN to all hosts

• host: Host Membership Report msg to indicate group membership– randomized delay before

responding– implicit leave via no reply

to Query

• RFC 1112

IGMP v2: additions include• group-specific Query• Leave Group msg

– last host replying to Query can send explicit Leave Group msg

– router performs group-specific query to see if any hosts left in group

– RFC 2236

IGMP v3: under development as Internet draft

Page 57: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Approaches for building mcast trees

Approaches:• source-based tree: one tree per source

– shortest path trees– reverse path forwarding

• group-shared tree: group uses one tree– minimal spanning (Steiner) – center-based trees

…we first look at basic approaches, then specific protocols adopting these approaches

Page 58: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Shortest Path Tree

• mcast forwarding tree: tree of shortest path routes from source to all receivers– Dijkstra’s algorithm

R1

R2

R3

R4

R5

R6 R7

21

6

3 4

5

i

router with attachedgroup member

router with no attachedgroup member

link used for forwarding,i indicates order linkadded by algorithm

LEGENDS: source

Page 59: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Reverse Path Forwarding

if (mcast datagram received on incoming link on shortest path back to center)

then flood datagram onto all outgoing links

else ignore datagram

rely on router’s knowledge of unicast shortest path from it to sender

each router has simple forwarding behavior:

Page 60: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Reverse Path Forwarding: example

• result is a source-specific reverse SPT– may be a bad choice with asymmetric links

R1

R2

R3

R4

R5

R6 R7

router with attachedgroup member

router with no attachedgroup member

datagram will be forwarded

LEGENDS: source

datagram will not be forwarded

Page 61: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Reverse Path Forwarding: pruning

• forwarding tree contains subtrees with no mcast group members– no need to forward datagrams down subtree– “prune” msgs sent upstream by router with

no downstream group members

R1

R2

R3

R4

R5

R6 R7

router with attachedgroup member

router with no attachedgroup member

prune message

LEGENDS: source

links with multicastforwarding

P

P

P

Page 62: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Shared-Tree: Steiner Tree

• Steiner Tree: minimum cost tree connecting all routers with attached group members

• problem is NP-complete• excellent heuristics exists• not used in practice:

– computational complexity– information about entire network needed– monolithic: rerun whenever a router needs to

join/leave

Page 63: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Center-based trees

• single delivery tree shared by all• one router identified as “center” of tree• to join:

– edge router sends unicast join-msg addressed to center router

– join-msg “processed” by intermediate routers and forwarded towards center

– join-msg either hits existing tree branch for this center, or arrives at center

– path taken by join-msg becomes new branch of tree for this router

Page 64: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Center-based trees: an example

Suppose R6 chosen as center:

R1

R2

R3

R4

R5

R6 R7

router with attachedgroup member

router with no attachedgroup member

path order in which join messages generated

LEGEND

21

3

1

Page 65: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Internet Multicasting Routing: DVMRP

• DVMRP: distance vector multicast routing protocol, RFC1075

• flood and prune: reverse path forwarding, source-based tree– RPF tree based on DVMRP’s own routing tables

constructed by communicating DVMRP routers – no assumptions about underlying unicast– initial datagram to mcast group flooded everywhere via

RPF– routers not wanting group: send upstream prune msgs

Page 66: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

DVMRP: continued…• soft state: DVMRP router periodically (2 hours)

“forgets” branches are pruned: – mcast data again flows down unpruned branch– downstream router: reprune or else continue to

receive data

• routers can quickly regraft to tree – following IGMP join at leaf

• odds and ends– commonly implemented in commercial routers– Mbone routing done using DVMRP

Page 67: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Tunneling

Q: How to connect “islands” of multicast routers in a “sea” of unicast routers?

mcast datagram encapsulated inside “normal” (non-multicast-addressed) datagram

normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast router

receiving mcast router unencapsulates to get mcast datagram

physical topology logical topology

Page 68: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

PIM: Protocol Independent Multicast

• not dependent on any specific underlying unicast routing algorithm (works with all)

• two different multicast distribution scenarios :Dense: group members

densely packed, in “close” proximity.

bandwidth more plentiful

Sparse: # networks with group

members small wrt # interconnected networks

group members “widely dispersed”

bandwidth not plentiful

Page 69: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

Consequences of Sparse-Dense

Dichotomy: Dense• group membership by

routers assumed until routers explicitly prune

• data-driven construction on mcast tree (e.g., RPF)

• bandwidth and non-group-router processing profligate

Sparse:• no membership until

routers explicitly join• receiver- driven

construction of mcast tree (e.g., center-based)

• bandwidth and non-group-router processing conservative

Page 70: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

PIM- Dense Mode

flood-and-prune RPF, similar to DVMRP but

underlying unicast protocol provides RPF info for incoming datagram

less complicated (less efficient) downstream flood than DVMRP reduces reliance on underlying routing algorithm

has protocol mechanism for router to detect it is a leaf-node router

Page 71: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

PIM - Sparse Mode

• center-based approach• router sends join msg to

rendezvous point (RP)– intermediate routers

update state and forward join

• after joining via RP, router can switch to source-specific tree– increased performance:

less concentration, shorter paths

R1

R2

R3

R4

R5

R6R7

join

join

join

all data multicastfrom rendezvouspoint

rendezvouspoint

Page 72: Routing Graph abstraction for routing algorithms: graph nodes are routers graph edges are physical links –link cost: delay, $ cost, or congestion level

PIM - Sparse Mode

sender(s):• unicast data to RP,

which distributes down RP-rooted tree

• RP can extend mcast tree upstream to source

• RP can send stop msg if no attached receivers– “no one is listening!”

R1

R2

R3

R4

R5

R6R7

join

join

join

all data multicastfrom rendezvouspoint

rendezvouspoint