Chapter 3 Internet Protocol Layer Part II: 3.3 Ren-Hung Hwang

Page 1: Chapter 3 Internet Protocol Layer Part II: 3

Chapter 3 Internet Protocol Layer

Part II: 3.3

Ren-Hung Hwang

Page 2: Chapter 3 Internet Protocol Layer Part II: 3

Control Plane Mechanisms

- Address resolution
- Address configuration
- Error reporting
- Intra-domain routing
- Inter-domain routing
- Multicast routing

Page 3: Chapter 3 Internet Protocol Layer Part II: 3

Address Resolution

What is address resolution?
- Translating addresses between different layers
- For example, host name to IP address, or IP address to Ethernet address

Why address resolution?
- MAC address vs. IP address

Page 4: Chapter 3 Internet Protocol Layer Part II: 3

Address Resolution Protocol

Protocol operation
- The source node broadcasts an ARP request packet on the IP subnet
- All nodes on the subnet receive the ARP request, but only the target node (or a designated server) answers, with a unicast ARP reply packet
- The source node receives the reply and learns the MAC address of the target node
- A cache with a timeout timer is used to speed up resolution

Page 5: Chapter 3 Internet Protocol Layer Part II: 3

ARP Packet Format

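Bit offsets: 0, 8, 16, 24, 31 (each row is 32 bits wide)

| Hardware Address Type (bits 0-15) | Protocol Address Type (bits 16-31) |
| H. Addr Len (bits 0-7) | P. Addr Len (bits 8-15) | Operation Code (bits 16-31) |
| Sender Hardware Address (bytes 0-3) |
| Sender Hardware Addr (bytes 4-5) | Sender Protocol Addr (bytes 0-1) |
| Sender Protocol Addr (bytes 2-3) | Target Hardware Addr (bytes 0-1) |
| Target Hardware Address (bytes 2-5) |
| Target Protocol Address (bytes 0-3) |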

Page 6: Chapter 3 Internet Protocol Layer Part II: 3

ARP Packet Format

- HARDWARE ADDRESS TYPE: link type; Ethernet = 0x0001
- PROTOCOL ADDRESS TYPE: upper-layer protocol identifier; IP = 0x0800
- HADDR LEN: length of the link-layer address; Ethernet = 6
- PADDR LEN: length of the network-layer address; IP = 4

Page 7: Chapter 3 Internet Protocol Layer Part II: 3

ARP Packet Format

- OPERATION: operation code; ARP request = 1, ARP reply = 2, RARP request = 3, RARP reply = 4
- SENDER HADDR: sender link-layer address
- SENDER PADDR: sender network-layer address
- TARGET HADDR: target link-layer address; filled with zeros if unknown
- TARGET PADDR: target network-layer address
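The field layout above can be made concrete with a short sketch that packs an ARP request for Ethernet and IPv4. This is an illustration only: the function name `build_arp_request` and the example MAC and IP addresses are assumptions, not part of the slides.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Pack a 28-byte ARP request for Ethernet/IPv4 (hypothetical helper)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        0x0001,                       # HARDWARE ADDRESS TYPE: Ethernet
        0x0800,                       # PROTOCOL ADDRESS TYPE: IP
        6,                            # HADDR LEN
        4,                            # PADDR LEN
        1,                            # OPERATION: ARP request
        sender_mac,                   # SENDER HADDR
        socket.inet_aton(sender_ip),  # SENDER PADDR
        b"\x00" * 6,                  # TARGET HADDR: zeros, still unknown
        socket.inet_aton(target_ip),  # TARGET PADDR
    )


# Illustrative addresses only.
packet = build_arp_request(bytes.fromhex("020000000001"), "192.168.2.5", "192.168.2.1")
```

The `!` prefix selects network byte order without padding, so the packed fields follow the 32-bit rows of the format exactly.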

Page 8: Chapter 3 Internet Protocol Layer Part II: 3

Encapsulate ARP Packet into MAC Frame

- Protocol id: 0x0806
- Destination address of an ARP request packet: 0xFFFFFFFFFFFF

Page 9: Chapter 3 Internet Protocol Layer Part II: 3

Reverse ARP (RARP)

- Allows a diskless workstation to discover its IP address
- Needs a RARP server on each network
- BOOTP:
  - Uses UDP messages, which are forwarded over routers, to find the file server that holds the mapping

Page 10: Chapter 3 Internet Protocol Layer Part II: 3

Open Source Implementation: ARP

Data structure
- Hash table: arp_table
- Hash parameters: a primary key and the device interface index

Functions
- arp_send(): sets up the ARP header and then transmits the packet
- arp_rcv(): handles only request and reply operations
  - Request: calls ip_route_input(); if the packet routes to the local host, calls arp_send() to send an ARP reply. Otherwise, if the host is an ARP proxy, it also sends an ARP reply.
  - Reply: updates the ARP table
- __neigh_lookup(): calls neigh_lookup() to search the ARP hash table; if no entry is found, creates one
- eth_rebuild_header() (old) or arp_solicit() calls arp_send()

Page 11: Chapter 3 Internet Protocol Layer Part II: 3

Error Control Protocol

What is an error control protocol?
- A protocol for reporting errors or the status of TCP/IP at a remote site (router or host)

Why an error control protocol?
- For monitoring the status of TCP/IP at each host or router
- For reporting errors between hosts or routers

Page 12: Chapter 3 Internet Protocol Layer Part II: 3

ICMP

ICMP runs over IP

[Figure: an ICMP message (ICMP header + ICMP data) is carried as the IP data of an IP packet (IP header + IP data)]

Page 13: Chapter 3 Internet Protocol Layer Part II: 3

ICMPv4 Packet Format

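- Type and Code together identify an error event
- Data carries the IP header and the first 8 bytes of the payload of the packet that caused the error

Bit offsets: 0, 8, 16, 24, 31

| Type (bits 0-7) | Code (bits 8-15) | Checksum (bits 16-31) |
| Data |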
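As an illustration of the Type/Code/Checksum layout, here is a minimal sketch that builds an echo request and fills in the Checksum field. The slides do not define the checksum; the sketch assumes the standard 16-bit one's-complement Internet checksum, and it assumes that the Data part of an echo message starts with a 16-bit identifier and a 16-bit sequence number. The helper names are hypothetical.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Type=8, Code=0; the checksum is computed with its own field set to zero."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


message = build_echo_request(ident=1, seq=1, payload=b"ping")
```

A receiver can validate the message by running the same checksum over all of it: a correct message yields zero.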

Page 14: Chapter 3 Internet Protocol Layer Part II: 3

Type and Code

| Type | Code | Description |
| 0 | 0 | echo reply (ping) |
| 3 | 0 | destination network unreachable |
| 3 | 1 | destination host unreachable |
| 3 | 2 | destination protocol unreachable |
| 3 | 3 | destination port unreachable |
| 3 | 6 | destination network unknown |
| 3 | 7 | destination host unknown |
| 4 | 0 | source quench (congestion control) |
| 8 | 0 | echo request (ping) |
| 9 | 0 | router advertisement |
| 10 | 0 | router solicitation (router discovery) |
| 11 | 0 | TTL expired |
| 12 | 0 | bad IP header |

Page 15: Chapter 3 Internet Protocol Layer Part II: 3

ICMPv4 Examples

Echo Request/Reply
- The source sends an echo request to a destination
- The destination responds with an echo reply
- The (type, code) of Echo Request and Echo Reply are (8, 0) and (0, 0), respectively
- ping uses echo request and reply

Destination Unreachable (type=3)
- Possible errors: network unreachable (code=0), host unreachable (code=1), protocol unreachable (code=2), port unreachable (code=3), source route failed (code=5), destination network unknown (code=6), destination host unknown (code=7)

Page 16: Chapter 3 Internet Protocol Layer Part II: 3

ICMPv4 Examples

If the don't-fragment bit in the IP header is set to 1 and the packet is longer than the MTU of the output interface, the router discards the packet and sends a fragmentation required (type=3, code=4) ICMP message to the source.

Source Quench
- When the buffer of a router overflows, the router sends a source quench (type=4) to the source.

Routing redirect
- If a host forwards a packet to a router and the router finds that the packet should be forwarded by another router on the same physical network, it forwards the packet to that router and sends a redirect (type=5, code=0 or 1 for network or host) ICMP message to the host.

Page 17: Chapter 3 Internet Protocol Layer Part II: 3

ICMPv4 Example

Time Exceeded:
- After decrementing the TTL by one, a router that finds the TTL is less than or equal to zero sends a Time Exceeded (type=11) ICMP message to the source
- traceroute uses this type of ICMP message
  - traceroute sends an ICMP echo request with TTL=1 to the target machine
  - When the first router receives the message, it responds with a time exceeded message
  - traceroute then sends another echo request with TTL=2
  - The message passes the first router but is discarded by the second router, which returns a time exceeded message
  - traceroute keeps sending echo requests with increasing TTL until it receives an echo reply from the target machine
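The TTL mechanism that traceroute relies on can be sketched as a small simulation. No packets are sent; the path is a plain list of hypothetical router names, and `send_probe` and `traceroute` are illustrative helpers rather than any real tool's API.

```python
def send_probe(routers, target, ttl):
    """Simulate one echo request travelling along the path with the given TTL."""
    for router in routers:
        ttl -= 1                      # each router decrements the TTL by one
        if ttl <= 0:
            return ("time exceeded", router)
    return ("echo reply", target)     # the probe survived every router


def traceroute(routers, target, max_ttl=30):
    """Probe with TTL=1, 2, ... and collect the node that answers each probe."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        kind, node = send_probe(routers, target, ttl)
        hops.append(node)
        if kind == "echo reply":
            break
    return hops


path = traceroute(["R1", "R2"], "H")
```

With two routers on the path, the first two probes draw time exceeded messages from R1 and R2, and the third reaches the target.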

Page 18: Chapter 3 Internet Protocol Layer Part II: 3

ICMPv4 Example

IP header error:
- An invalid IP header, such as a wrong option field (type=12)

Page 19: Chapter 3 Internet Protocol Layer Part II: 3

ICMPv6

New types and codes
- Type 0..127: error reports
  - 1: Destination unreachable
  - 2: Packet too big
  - 3: Time exceeded
  - 4: Parameter problem
- Type 128..255: informational
  - 128, 129: Echo request and reply
  - 130, 131, 132: Multicast group membership management
  - 133, 134: Router solicitation and advertisement
  - 135, 136: Neighbor solicitation and advertisement
  - 137: Redirect

Page 20: Chapter 3 Internet Protocol Layer Part II: 3

ICMPv6

| Type | Code | Description |
| 1 | 0 | No route to destination |
| 1 | 1 | Communication with destination administratively prohibited |
| 1 | 3 | Address unreachable |
| 1 | 4 | Port unreachable |
| 2 | 0 | Packet too big |
| 3 | 0 | Hop limit exceeded in transit |
| 3 | 1 | Fragment reassembly time exceeded |
| 4 | 0 | Erroneous header field encountered |
| 4 | 1 | Unrecognized Next Header type |
| 4 | 2 | Unrecognized IPv6 option encountered |
| 128 | 0 | Echo request |
| 129 | 0 | Echo reply |
| 130 | 0 | Multicast Listener Query |
| 131 | 0 | Multicast Listener Report |
| 132 | 0 | Multicast Listener Done |
| 133 | 0 | Router Solicitation |
| 134 | 0 | Router Advertisement |
| 135 | 0 | Neighbor Solicitation |
| 136 | 0 | Neighbor Advertisement |
| 137 | 0 | Redirect |

Page 21: Chapter 3 Internet Protocol Layer Part II: 3

Routing

Task of routing
- Select a path from the source to the destination

Goals of routing
- Stable
- Robust
- Efficient (low delay, high throughput, …)

Page 22: Chapter 3 Internet Protocol Layer Part II: 3

Optimality of IP Routing

- IP uses hop-by-hop routing (forwarding)
- Each router determines its own routing table
- Why are packets still delivered to their destinations along the optimal path?
  - If k is an intermediate node on the optimal path from source node s to destination d, then the portion of that path from s to k is itself an optimal path from s to k (and likewise the portion from k to d)
  - Hence a shortest-path tree can be constructed from a source to the rest of the graph

Page 23: Chapter 3 Internet Protocol Layer Part II: 3

Routing Algorithm Classification

Global or decentralized information?
- Global: link-state routing, which uses the Dijkstra algorithm
- Decentralized: distance-vector routing, which uses the distributed Bellman-Ford algorithm

Static or dynamic (adaptive)?
- Static: fixed routing table, set up manually
- Dynamic: routing table adapts to network status

Page 24: Chapter 3 Internet Protocol Layer Part II: 3

The Shortest Path Algorithm

View a network as a graph
- Nodes are routers
- Edges are physical links
  - Each link has an associated cost: delay, congestion level, …

Find the least-cost path from a sending node to the destination node
- How it is computed depends on the information available

Page 25: Chapter 3 Internet Protocol Layer Part II: 3

Link-State Routing

Routing information
- Global information is made available by reliable broadcasting
- Dynamic: information is exchanged when the topology changes, or periodically

Path calculation
- Dijkstra algorithm

Page 26: Chapter 3 Internet Protocol Layer Part II: 3

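Dijkstra Algorithm

/* C(v): current least cost from s to v; P(v): predecessor of v on that path;
   lc(w,v): cost of link (w,v), infinite if w and v are not adjacent */
For each v in V-{s} {
    If v is adjacent to s { C(v) = lc(s,v); P(v) = s }
    else C(v) = ∞
}
T = {s}
While (T ≠ V) {
    find w not in T s.t. C(w) is the minimum over all nodes in (V-T)
    T = T ∪ {w}
    For each v in V-T {
        If (C(w) + lc(w,v) < C(v)) {   /* only record w when it improves the path */
            C(v) = C(w) + lc(w,v)
            P(v) = w
        }
    }
}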

Page 27: Chapter 3 Internet Protocol Layer Part II: 3

Dijkstra Algorithm Example

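[Figure: example network with nodes A, B, C, D, E and link costs 1, 1, 1, 3, 2, 4, 1]

| Iteration | T | C(B),p(B) | C(C),p(C) | C(D),p(D) | C(E),p(E) |
| 0 | A | 4,A | 1,A | ∞ | ∞ |
| 1 | AC | 3,C | | 4,C | 2,C |
| 2 | ACE | 3,C | | 3,E | |
| 3 | ACEB | | | 3,E | |
| 4 | ACEBD | | | | |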

Page 28: Chapter 3 Internet Protocol Layer Part II: 3

Routing Table at Node A

| Destination | Cost | NextHop |
| B | 3 | C |
| C | 1 | C |
| D | 3 | C |
| E | 2 | C |
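The Dijkstra example can be reproduced with a short sketch. The link costs in `GRAPH` are an assumption: they are the six links that can be inferred from the iteration table (A-B=4, A-C=1, B-C=2, C-D=3, C-E=1, D-E=1). The seventh link of the figure is not recoverable from the text and is left out. The sketch uses a priority queue instead of the linear minimum search of the pseudocode, which is a substitution for brevity.

```python
import heapq

# Assumed topology, inferred from the iteration table (see note above).
GRAPH = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2},
    "C": {"A": 1, "B": 2, "D": 3, "E": 1},
    "D": {"C": 3, "E": 1},
    "E": {"C": 1, "D": 1},
}


def dijkstra(graph, source):
    """Return (cost, pred): least cost to each node and its predecessor."""
    cost = {source: 0}
    pred = {}
    done = set()                      # the set T of finished nodes
    heap = [(0, source)]
    while heap:
        c, w = heapq.heappop(heap)
        if w in done:
            continue                  # stale queue entry
        done.add(w)
        for v, lc in graph[w].items():
            if v not in done and c + lc < cost.get(v, float("inf")):
                cost[v] = c + lc
                pred[v] = w
                heapq.heappush(heap, (cost[v], v))
    return cost, pred


def routing_table(graph, source):
    """Map each destination to (cost, next hop) as seen from source."""
    cost, pred = dijkstra(graph, source)
    table = {}
    for dest in sorted(cost):
        if dest == source:
            continue
        hop = dest
        while pred[hop] != source:    # walk back to the first hop after source
            hop = pred[hop]
        table[dest] = (cost[dest], hop)
    return table


table_at_a = routing_table(GRAPH, "A")
```

Under these assumed costs, `table_at_a` matches the routing table at node A: every destination is reached through C.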

Page 29: Chapter 3 Internet Protocol Layer Part II: 3

Distance Vector Algorithm

Routing information
- Only local information is known
  - A node knows the status of its adjacent links and the routing information of its adjacent nodes
- Dynamic: information is exchanged when a link cost or a shortest path changes

Path calculation
- Bellman-Ford

Page 30: Chapter 3 Internet Protocol Layer Part II: 3

Bellman-Ford Algorithm

While (1) {
    If x received route update message from y {
        For each (Dest, Distance) pair in y's report {
            If (Dest is new) { /* Dest not in routing table */
                Add a new entry for destination Dest
                rt(Dest).distance = Distance + lc(x,y)
                rt(Dest).NextHop = y
            } else if ((Distance + lc(x,y)) < rt(Dest).distance) {
                /* y reports a shorter distance to Dest */
                rt(Dest).distance = Distance + lc(x,y)
                rt(Dest).NextHop = y
            }
        }
    }
    Send update messages to all neighbors if route changes
    Also send update messages to all neighbors periodically
}

Page 31: Chapter 3 Internet Protocol Layer Part II: 3

Bellman-Ford Algorithm Example

Initial Routing Table at A

| Destination | Distance | NextHop |
| B | 4 | B |
| C | 1 | C |

Final Distance Table at A

| Destination | Distance | NextHop |
| B | 3 | C |
| C | 1 | C |
| D | 3 | C |
| E | 2 | C |
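The distance-vector update rule can be sketched as a round-based simulation in which every node advertises its current vector to its neighbors until no table changes. The topology in `LINKS` is an assumption (the six link costs inferred from the Dijkstra example: A-B=4, A-C=1, B-C=2, C-D=3, C-E=1, D-E=1). Periodic updates and link failures are left out, and `distance_vector` is a hypothetical helper.

```python
# Assumed topology (see note above).
LINKS = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2},
    "C": {"A": 1, "B": 2, "D": 3, "E": 1},
    "D": {"C": 3, "E": 1},
    "E": {"C": 1, "D": 1},
}


def distance_vector(links):
    """Return rt[x][dest] = (distance, next_hop) after the tables converge."""
    # Initially each node knows itself and its directly attached neighbors.
    rt = {x: {x: (0, x)} for x in links}
    for x, neighbors in links.items():
        for y, lc in neighbors.items():
            rt[x][y] = (lc, y)

    changed = True
    while changed:
        changed = False
        # Snapshot of the (Dest, Distance) pairs each node reports this round.
        reports = {y: {d: dist for d, (dist, _) in rt[y].items()} for y in links}
        for x, neighbors in links.items():
            for y, lc in neighbors.items():
                for dest, distance in reports[y].items():
                    new_distance = distance + lc
                    if dest not in rt[x] or new_distance < rt[x][dest][0]:
                        rt[x][dest] = (new_distance, y)
                        changed = True
    return rt


tables = distance_vector(LINKS)
table_at_a = {d: entry for d, entry in tables["A"].items() if d != "A"}
```

Node A starts with the two-entry initial table and converges to the final table after it hears C's improved distance to D (which C learned from E).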

Page 32: Chapter 3 Internet Protocol Layer Part II: 3

Hierarchical Routing

- Not a flat network: too many routing entries
- Define an AS
  - Routers within an AS are under the same administrative control
- Routing within an AS and between ASs
  - Intradomain routing
  - Interdomain routing

Page 33: Chapter 3 Internet Protocol Layer Part II: 3

AS

The Internet consists of Autonomous Systems (AS) interconnected with each other:
- Stub AS: small corporation
- Multihomed AS: large corporation (no transit)
- Transit AS: provider

Two-level routing:
- Intra-AS: routing within an AS
- Inter-AS: routing between ASs

Page 34: Chapter 3 Internet Protocol Layer Part II: 3

An example of Hierarchical Routing

[Figure: three interconnected domains: Domain A (routers A.1, A.2, A.3), Domain B (routers B.1, B.2, B.3, B.4), and Domain C (routers C.1, C.2, C.3). Legend: intra-domain routers (interior gateways) and inter-domain routers (exterior gateways).]

Page 35: Chapter 3 Internet Protocol Layer Part II: 3

Example of Internet Routing Protocols

Intradomain routing
- RIP
- OSPF

Interdomain routing
- BGP-4

Page 36: Chapter 3 Internet Protocol Layer Part II: 3

Intra-domain Routing

What is intra-domain routing?
- Routing within a domain (AS)
- The administrator decides the routing protocol
- The administrator has total control of all routers

Why intra-domain routing?
- To maintain connectivity within a domain

Page 37: Chapter 3 Internet Protocol Layer Part II: 3

Intra-domain Routing

- Runs Interior Gateway Protocols (IGP)
- Most common IGPs:
  - RIP: Routing Information Protocol
  - OSPF: Open Shortest Path First

Page 38: Chapter 3 Internet Protocol Layer Part II: 3

RIP

- Originally designed for the Xerox PARC Universal Protocol (used in XNS)
- Adopted by UNIX and TCP/IP in 1982 (e.g., routed in BSD)
- RIP: RFC 1058 [1988]
- RIPv2: RFC 1388 [1993]

Page 39: Chapter 3 Internet Protocol Layer Part II: 3

RIP

Distance-vector routing
- Uses hop count as the cost metric (up to 15)
- Restricts the diameter of the network to 15 hops
- Exchanges routing messages (advertisements) every 30 seconds
  - Each advertisement carries up to 25 routes (destination networks)

Page 40: Chapter 3 Internet Protocol Layer Part II: 3

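RIPv2 Packet Format

Bit offsets: 0, 8, 16, 24, 31 (each row is 32 bits wide)

| Command (bits 0-7) | Version (bits 8-15) | Must be zero (bits 16-31) |
| Family of net 1 (bits 0-15) | Route Tag for net 1 (bits 16-31) |
| Address of net 1 |
| Subnet Mask for net 1 |
| Next Hop for net 1 |
| Distance to net 1 |
| Family of net 2 (bits 0-15) | Route Tag for net 2 (bits 16-31) |
| Address of net 2 |
| Subnet Mask for net 2 |
| Next Hop for net 2 |
| Distance to net 2 |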
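The packet layout above can be sketched by packing a RIPv2 response with two route entries. The helper names and the example networks are illustrative assumptions; the sketch also assumes command 2 for a response and address family 2 for IP.

```python
import socket
import struct

MAX_ROUTES = 25  # an advertisement carries at most 25 routes


def rip_entry(address, mask, next_hop, distance, route_tag=0, family=2):
    """Pack one 20-byte route entry (family 2 is assumed to mean IP)."""
    return struct.pack(
        "!HH4s4s4sI",
        family,
        route_tag,
        socket.inet_aton(address),
        socket.inet_aton(mask),
        socket.inet_aton(next_hop),
        distance,
    )


def rip_response(entries):
    """Prefix the entries with Command=2 (response), Version=2, and a zero field."""
    if len(entries) > MAX_ROUTES:
        raise ValueError("too many routes for one RIP message")
    return struct.pack("!BBH", 2, 2, 0) + b"".join(entries)


# Illustrative networks only.
packet = rip_response([
    rip_entry("192.168.2.0", "255.255.255.0", "0.0.0.0", 1),
    rip_entry("192.168.3.0", "255.255.255.0", "0.0.0.0", 2),
])
```

Each entry occupies five 32-bit rows, so a full message of 25 routes is 4 + 25 × 20 = 504 bytes.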

Page 41: Chapter 3 Internet Protocol Layer Part II: 3

RIP Packet Format and Stability

RIP packet format
- Command: request or reply; version number
- Up to 25 destination addresses

Stability
- Hop count limit: the largest usable metric is 15; a metric of 16 means infinity (unreachable)
- Stabilization timer:
  - Allows RIP to learn all routes from its neighbors before sending full updates
- Split horizon
  - No update on the backward route (an update sent to a neighbor omits routes learned from that neighbor)
- Poison reverse
  - An update sent to a neighbor includes the routes learned from that neighbor, but with the route metric set to infinity
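The difference between split horizon and poison reverse can be sketched as a function that builds the advertisement for one neighbor. The routing-table representation (destination mapped to a metric and the neighbor the route was learned from) and the name `advertisement` are assumptions for illustration.

```python
INFINITY = 16  # RIP metric that marks a destination as unreachable


def advertisement(routes, neighbor, poison_reverse=False):
    """Build the {destination: metric} update to send to one neighbor.

    routes maps destination -> (metric, learned_from).
    """
    update = {}
    for dest, (metric, learned_from) in routes.items():
        if learned_from == neighbor:
            if poison_reverse:
                update[dest] = INFINITY   # advertise the route back as unreachable
            # plain split horizon: omit the route entirely
        else:
            update[dest] = min(metric, INFINITY)
    return update


# Hypothetical table: N1 is directly attached, N2 was learned from B, N3 from C.
routes = {"N1": (1, "direct"), "N2": (2, "B"), "N3": (3, "C")}
to_b_split = advertisement(routes, "B")
to_b_poison = advertisement(routes, "B", poison_reverse=True)
```

Both variants keep B from routing traffic for N2 back through this router; poison reverse states it explicitly at the cost of a larger update.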

Page 42: Chapter 3 Internet Protocol Layer Part II: 3

Routing Table of RIP

Taken from a Cisco router at cs.ccu.edu.tw

| Destination | Gateway | Flags | Ref | Use | Interface |
| 127.0.0.1 | 127.0.0.1 | UH | 0 | 26492 | lo0 |
| 192.168.2. | 192.168.2.5 | U | 2 | 13 | fa0 |
| 193.55.114. | 193.55.114.6 | U | 3 | 58503 | le0 |
| 192.168.3. | 192.168.3.5 | U | 2 | 25 | qaa0 |
| 224.0.0.0 | 193.55.114.6 | U | 3 | 0 | le0 |
| default | 193.55.114.129 | UG | 0 | 143454 | |

Page 43: Chapter 3 Internet Protocol Layer Part II: 3

Open Source Implementation

GNU Zebra Project
- Supports many routing protocols
  - RIP, OSPF, BGP
- Runs the routing daemon as a user process
- Communicates with the kernel via netlink

Page 44: Chapter 3 Internet Protocol Layer Part II: 3

Routing Daemon and Kernel

[Figure: packets from the NICs enter kernel space, which holds the kernel routing table. Data packets stay in the kernel; control packets go to the routing manager (Zebra, routed, gated, …) in user space, which handles protocol-specific packets.]

Page 45: Chapter 3 Internet Protocol Layer Part II: 3

Overview of Zebra Routing Protocols

[Figure: RIPd, OSPFd, BGPd, and RIPngd exchange routing information (via socket interface) with the Zebra daemon, which reaches the kernel routing table through ioctl, sysctl, proc fs, and netlink/rtnetlink.]

Page 46: Chapter 3 Internet Protocol Layer Part II: 3

Zebra and Netlink/Rtnetlink

[Figure: layering from top to bottom: routing protocols, Zebra protocol, Zebra daemon, netlink / rtnetlink, kernel.]

Page 47: Chapter 3 Internet Protocol Layer Part II: 3

Client Server Interaction in Zebra Protocol

[Figure: the Zebra server makes the zebra server socket and exposes the Zebra server APIs. The client calls zclient_init(), installs its callback functions, and uses the Zebra client APIs (zclient_connect) to set up the Zebra connection.]

Page 48: Chapter 3 Internet Protocol Layer Part II: 3

Zebra Client/Server Protocol

Zebra IPv4 route message API

Zebra server
- zsend_interface_{add,delete}
- zsend_interface_{up,down}
- zsend_interface_address_{add,delete}
- zsend_ipv4_{add,delete}
- zsend_ipv4_{add,delete}_multipath

Zebra client
- zebra_interface_add_read
- zebra_interface_state_read
- zebra_interface_address_{add,delete}_read
- zapi_ipv4_{add, delete}

/* Structure for the zebra client. */
struct zclient
{
  /* Other data structures here */

  /* Pointer to the callback functions. */
  int (*interface_add) (…);
  int (*interface_delete) (…);
  int (*interface_up) (…);
  int (*interface_down) (…);
  int (*interface_address_add) (…);
  int (*interface_address_delete) (…);
  int (*ipv4_route_add) (…);
  int (*ipv4_route_delete) (…);
};

Page 49: Chapter 3 Internet Protocol Layer Part II: 3

RIP Daemon (ripd)

[Figure: building blocks of ripd]
- Interface: rip_network, rip_neighbor, rip_passive_interface, ip_rip_version, ip_rip_authentication, rip_split_horizon
- Initialization
- Scheduling
- routemap, offset
- RIP Peer: rip_peer_timeout, rip_peer_update, rip_peer_display
- RIP core: rip_version, rip_default_metric, rip_timers, rip_route, rip_distance
- Zebra client, which talks to the Zebra daemon

Page 50: Chapter 3 Internet Protocol Layer Part II: 3

OSPF

Features
- Link-state routing protocol
- Runs internal to a single Autonomous System
- A shortest-path tree is constructed to build the routing table
  - Dijkstra algorithm
- Support for equal-cost multipath routing
- Support for TOS-based routing
- Support for variable-length subnets
  - Each distributed route has a destination and a mask
- Integrated unicast and multicast support:
  - Multicast OSPF (MOSPF) uses the same topology database as OSPF

Page 51: Chapter 3 Internet Protocol Layer Part II: 3

OSPF

Features
- Two levels of hierarchy: areas within an AS
  - OSPF allows a collection of contiguous networks and hosts to be grouped together, called an area
  - The topology of an area is invisible from outside the area
  - Routing in the AS takes place on two levels
    - intra-area routing, inter-area routing

Page 52: Chapter 3 Internet Protocol Layer Part II: 3

[Figure: Areas A, B, and C attached to the Backbone. Router roles shown: internal routers inside each area, area boundary routers between an area and the backbone, a backbone router, and an AS boundary router.]

Page 53: Chapter 3 Internet Protocol Layer Part II: 3

OSPF

Features
- Externally derived routing data (via EGP) is advertised throughout the AS
  - Flooded without modification
  - Two types of cost
    - type 1: comparable with costs within the area; the cost to an external network is the sum of the internal cost and the external cost
    - type 2: an order of magnitude larger; the cost to an external network is determined solely by the external cost

Page 54: Chapter 3 Internet Protocol Layer Part II: 3

OSPF

Features
- Supports stub areas to reduce broadcasting
  - An area can be configured as a stub when there is a single exit point from the area
  - Virtual links cannot be configured through stub areas
  - AS boundary routers cannot be placed inside stub areas
  - No AS external advertisements are flooded into or through stub areas

Page 55: Chapter 3 Internet Protocol Layer Part II: 3

[Figure: sample OSPF autonomous system with networks N1 to N15, host H1, routers RT1 to RT12, and interfaces Ia and Ib, divided into Area 1, Area 2, and Area 3 (one marked as a stub). Links are labeled with their costs. Legend: internal router, area border router, AS boundary router.]

Page 56: Chapter 3 Internet Protocol Layer Part II: 3

OSPF Hierarchy

- Area border routers: “summarize” distances to networks in their own area and advertise them to other area border routers
- Backbone routers: run OSPF routing limited to the backbone
- Boundary routers: connect to other ASs

Page 57: Chapter 3 Internet Protocol Layer Part II: 3

OSPF

Database of area 1

|           | RT1 | RT2 | RT3 | RT4 | RT5 | RT7 |
| RT1       |     |     |     |     |     |     |
| RT2       |     |     |     |     |     |     |
| RT3       |     |     |     |     |     |     |
| RT4       |     |     |     |     |     |     |
| RT5       |     |     | 14  | 8   |     |     |
| RT7       |     |     | 20  | 14  |     |     |
| N1        | 3   |     |     |     |     |     |
| N2        |     | 3   |     |     |     |     |
| N3        | 1   | 1   | 1   | 1   |     |     |
| N4        |     |     | 2   |     |     |     |
| Ia,Ib     |     |     | 20  | 27  |     |     |
| N6        |     |     | 16  | 15  |     |     |
| N7        |     |     | 20  | 19  |     |     |
| N8        |     |     | 18  | 18  |     |     |
| N9-N11,H1 |     |     | 29  | 36  |     |     |
| N12       |     |     |     |     | 8   | 2   |
| N13       |     |     |     |     | 8   |     |
| N14       |     |     |     |     | 8   |     |
| N15       |     |     |     |     |     | 9   |

Page 58: Chapter 3 Internet Protocol Layer Part II: 3

OSPF

Summarized area information advertised by RT3 and RT4 to backbone.

| Network | RT3 adv. | RT4 adv. |
| N1 | 4 | 4 |
| N2 | 4 | 4 |
| N3 | 1 | 1 |
| N4 | 2 | 3 |

Page 59: Chapter 3 Internet Protocol Layer Part II: 3

OSPF

Backbone information advertised into area 1 by RT3 and RT4.

| Destination | RT3 adv. | RT4 adv. |
| Ia, Ib | 20 | 27 |
| N6 | 16 | 15 |
| N7 | 20 | 19 |
| N8 | 18 | 18 |
| N9-11, H1 | 29 | 36 |
| RT5 | 14 | 8 |
| RT7 | 20 | 14 |

Page 60: Chapter 3 Internet Protocol Layer Part II: 3

OSPF Daemon of Zebra

[Figure: building blocks of ospfd]
- Interface: ip_ospf_interface, ip_ospf_neighbor, ospf_router_id, network_area, show_ip_ospf_cmd
- Initialization
- Scheduling
- Route
- OSPF core
- OSPF SPF calculation
- LSDB
- OSPF Flooding
- Route Map: route_map_update, route_map_event
- LSA: Link State Advertisement
- ASE: AS external route calculation
- Network
- zclient, which talks to the Zebra daemon

Page 61: Chapter 3 Internet Protocol Layer Part II: 3

Inter-domain Routing

Called Exterior Gateway Protocols (EGP)

Most common EGP:

BGP: Border Gateway Protocol

Page 62: Chapter 3 Internet Protocol Layer Part II: 3

BGP

Features
- RFC 1771 (BGP-4)
- “Path vector” routing
  - Loop-free interdomain routing between ASs
- Can be used within and between ASs
  - Multiple border routers (BGP speakers) within an AS
- IBGP: Interior BGP
  - Runs between routers in the same AS
  - All BGP speakers within the AS must be fully meshed (through an IGP)
- EBGP: Exterior BGP
  - Runs between routers belonging to two different ASs

Page 63: Chapter 3 Internet Protocol Layer Part II: 3

BGP

- Runs over TCP, port 179
- The routing table keeps all feasible paths to a destination network, but only the optimal path is advertised to neighbors
- Supports information aggregation
  - CIDR
- Confederation
  - Can also be used to allow multiple ASs within an AS
- Policy routing at the AS
  - access-list permit or deny (route or path filtering)
- Metric: a combination of different metrics with a degree of preference (weight, loc pref, med, …)

Page 64: Chapter 3 Internet Protocol Layer Part II: 3

BGP

Messages
- Open: the first message sent after the TCP connection is established; it is followed by a keepalive message
- Keepalive: sent often enough to keep the hold-time timer from expiring
- Update: no periodic refresh of the entire table; only changes in the tables are exchanged
  - Advertises a single feasible route to a peer
  - Withdraws multiple routes previously advertised
  - The message contains path attributes (origin, as_path, next_hop, multi_exit_disc, local_pref, aggregator, …) and NLRI (network layer reachability information)
- Notification: sent when an error is detected; also used to close the connection

Page 65: Chapter 3 Internet Protocol Layer Part II: 3

BGP

Path vector routing
- Different ASs may have different link cost metrics
- Loop freedom is very important
- Policy routing is preferred (different priorities, prohibit lists, …)
- AS_PATH path attribute
  - A list of ASs to the destination
  - A loop is found if the current AS is already in the AS_PATH
- Next_Hop path attribute: indicates the next router (which need not be a BGP speaker) to the destination
- NLRI
  - A list of subnets that can be reached by the AS_PATH
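The AS_PATH loop check can be sketched in a few lines. The function names and the AS numbers are illustrative assumptions; the sketch treats AS_PATH as an ordered tuple to which each AS prepends its own number before advertising the route onward.

```python
def advertise(local_as, as_path):
    """Prepend the local AS number before passing the route to another AS."""
    return (local_as,) + tuple(as_path)


def accept_update(local_as, as_path):
    """Reject the update if the local AS is already in the AS_PATH (a loop)."""
    return local_as not in as_path


# A route originated in AS 65001 and passed on by AS 65002 (illustrative numbers).
path = advertise(65002, advertise(65001, ()))
```

If this update were sent back to AS 65001, the check would find 65001 already in the path and drop it, which is what keeps path-vector routing loop free.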

Page 66: Chapter 3 Internet Protocol Layer Part II: 3

BGP

Path selection
- (1) If Next_Hop is inaccessible, drop the update
- (2) Prefer the largest LOCAL_PREF
- (3) Prefer the shorter AS_PATH
- (4) Prefer the lower origin code (igp < egp < incomplete)
- (5) Prefer the lower MED (MULTI_EXIT_DISC)
- (6) Prefer an external path over an internal path
- (7) Prefer the closer IGP neighbor
- (8) Prefer the BGP router with the lower IP address

Advertise the path with the highest degree of preference for each destination to neighboring BGP speakers
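The selection rules can be read as a lexicographic comparison, sketched below. The `Path` record and its field names are assumptions made for this illustration. The sketch also simplifies rule (5) by comparing MED across all candidates, whereas real implementations usually restrict that comparison to paths from the same neighboring AS.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    next_hop_reachable: bool
    local_pref: int
    as_path: tuple
    origin: str        # "igp", "egp", or "incomplete"
    med: int
    external: bool     # learned over EBGP
    igp_cost: int      # IGP distance to the next hop
    router_ip: str     # address of the advertising BGP router


def preference_key(p):
    """Smaller key means more preferred; fields follow rules (2) to (8)."""
    return (
        -p.local_pref,                  # (2) largest LOCAL_PREF
        len(p.as_path),                 # (3) shorter AS_PATH
        ORIGIN_RANK[p.origin],          # (4) lower origin code
        p.med,                          # (5) lower MED
        0 if p.external else 1,         # (6) external over internal
        p.igp_cost,                     # (7) closer IGP neighbor
        int(IPv4Address(p.router_ip)),  # (8) lower IP address
    )


def best_path(paths):
    """Apply rule (1), then pick the most preferred remaining path."""
    candidates = [p for p in paths if p.next_hop_reachable]
    return min(candidates, key=preference_key) if candidates else None


a = Path(True, 100, (65001, 65002), "igp", 0, True, 10, "10.0.0.2")
b = Path(True, 200, (65001, 65002, 65003), "igp", 0, True, 10, "10.0.0.3")
c = Path(False, 300, (65001,), "igp", 0, True, 10, "10.0.0.4")
chosen = best_path([a, b, c])
```

Here `c` is dropped because its next hop is unreachable, and `b` wins over `a` on LOCAL_PREF even though its AS_PATH is longer, since rule (2) is evaluated before rule (3).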

Page 67: Chapter 3 Internet Protocol Layer Part II: 3

BGP PATH Attributes

- Origin
  - Defines the origin of the path information
  - IGP (i), EGP (e), Incomplete (?) (unknown, e.g., static route)
- AS_PATH
  - An ordered list or a set
- Next_Hop
  - IP address of the next hop to the destination
  - On a multiaccess network, the next hop could be a router other than the BGP speaker
- LOCAL_PREF
  - Indicates the preferred exit router within an AS

Page 68: Chapter 3 Internet Protocol Layer Part II: 3

BGP PATH Attributes

Multi_Exit_Disc (MED)
- When a router has multiple external links to the same AS, the link to the router with the lower MED is preferred

Page 69: Chapter 3 Internet Protocol Layer Part II: 3

Open Source Implementation

Page 70: Chapter 3 Internet Protocol Layer Part II: 3

Multicast

- What is multicast?
- Protocols
  - Internet Group Management Protocol V2
  - Distance Vector Multicast Routing Protocol
  - Protocol-Independent Multicast (PIM) – Sparse Mode (SM)
- Open Source Implementation
  - Trace of IGMP
  - Trace of DVMRP

Page 71: Chapter 3 Internet Protocol Layer Part II: 3

Multicast

Communication among more than two parties
- Multi-party video conferencing
- Distance learning

Issues
- Maintaining group member information
- Constructing a multicast tree for transmitting packets
- Many-to-many communication

Page 72: Chapter 3 Internet Protocol Layer Part II: 3

MBONE

- A virtual network on top of the Internet
- Provides multicast and real-time transmission techniques
- Characteristic of the MBONE
  - Bandwidth usage does not increase proportionally when group membership increases
- Goal of the MBONE
  - Provide a testbed for multicast applications while mrouters are not yet ubiquitous in the Internet

Page 73: Chapter 3 Internet Protocol Layer Part II: 3

MBONE Structure

Three components of the MBONE:
- Island
- Mrouter
- Tunnel

Islands
- Networks with IP multicast capability
- Hosts in the same island can multicast directly, without going through routers

Page 74: Chapter 3 Internet Protocol Layer Part II: 3

MBONE Structure

[Figure: Islands A, B, C, and D connected by tunnels. Legend: member, mrouter, router without multicast capability, tunnel.]

Page 75: Chapter 3 Internet Protocol Layer Part II: 3

MBONE Structure

Mrouter
- Solves the problem that some routers do not support multicast routing
- Runs mrouted (multicast routing daemon)
  - Determines the routing path
  - Transmits multicast packets

Page 76: Chapter 3 Internet Protocol Layer Part II: 3

MBONE Structure

Tunnel
- Constructs a virtual point-to-point link between the local mrouter and a remote mrouter
- Allows multicast traffic to pass through routers that are not multicast capable
- Encapsulation

[Figure: the original multicast packet (multicast header + multicast data) is encapsulated behind a new IP header that carries the tunnel source and destination]

Page 77: Chapter 3 Internet Protocol Layer Part II: 3

MBONE Address

Multicast address
- Assigned to a multicast group
- Senders use it as the destination IP address

Class D address (224.0.0.0 ~ 239.255.255.255)
- The high-order four bits are 1110
- 28-bit multicast group ID
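The Class D bit layout can be checked with a short sketch. The helper names `is_class_d` and `group_id` are hypothetical; the code only tests the high-order four bits and masks out the remaining 28.

```python
import ipaddress


def is_class_d(addr: str) -> bool:
    """True if the high-order four bits of the IPv4 address are 1110."""
    value = int(ipaddress.IPv4Address(addr))
    return value >> 28 == 0b1110


def group_id(addr: str) -> int:
    """Return the low-order 28 bits, the multicast group ID."""
    return int(ipaddress.IPv4Address(addr)) & 0x0FFFFFFF


lowest = is_class_d("224.0.0.0")
highest = is_class_d("239.255.255.255")
```

Both ends of the range pass the test, while the addresses just outside it (223.255.255.255 and 240.0.0.0) start with 1101 and 1111 and fail.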

Page 78: Chapter 3 Internet Protocol Layer Part II: 3

MBONE Communication Protocol

Multicast routing protocols
- DVMRP, PIM-DM, PIM-SM, CBT, MOSPF, ...

IGMP
- A communication protocol between the mrouter and the hosts in a subnet

Page 79: Chapter 3 Internet Protocol Layer Part II: 3

MBONE Application

Debug tools
- mtrace
- map-mbone

Basic software
- SDR (Session Directory)
- Wb (Whiteboard)
- VAT (Visual Audio Tool)
- VIC (Video Conference)

Page 80: Chapter 3 Internet Protocol Layer Part II: 3

Internet Group Management Protocol (IGMPv2)

- RFC 2236
- Used by IP hosts to report multicast group memberships to routers
- Enhances IGMPv1
  - Querier election mechanism
  - IGMPv2 Leave Group message
  - Group-Specific Query message

Page 81: Chapter 3 Internet Protocol Layer Part II: 3

Protocol Overview

- A multicast router plays one of two roles: Querier or Non-Querier
  - The Querier is responsible for maintaining the membership information of the attached physical network
  - The router with the smallest IP address becomes the Querier
    - Routers hear each other's Query messages and decide from them who the Querier is
- The Querier periodically sends a General Query to solicit membership information
  - A General Query is sent to 224.0.0.1 (ALL-SYSTEMS multicast group)

Page 82: Chapter 3 Internet Protocol Layer Part II: 3

Protocol Overview

When a host receives a General Query
- It starts a timer with a random delay chosen from the range [0..Max Response Time]
  - Max Resp. Time is given in the Query message
- It sends the report with TTL=1 when the timer expires
- If the host receives another host's report before the timer expires, it stops the timer and does not send its report (report suppression)

A host that receives a Group-Specific Query behaves similarly

Page 83: Chapter 3 Internet Protocol Layer Part II: 3

Protocol Overview

When a router receives a report
- It adds the reported group to its list of multicast groups
- It sets the timer for the membership to [Group Membership Interval] and deletes the membership if no report is received before this timer expires (Queries are sent periodically)

When a host joins a multicast group
- It sends an unsolicited report immediately

Page 84: Chapter 3 Internet Protocol Layer Part II: 3

Protocol Overview

When a host leaves a multicast group
- If it was the last host to reply to a Query, it should send a Leave Group message to the all-routers multicast address (224.0.0.2)

When a router receives a Leave Group message
- It sends Group-Specific Queries to the group being left every [Last Member Query Interval], for [Last Member Query Count] times
- If no report is received before [Last Member Query Interval] expires, it assumes there are no local members

Page 85: Chapter 3 Internet Protocol Layer Part II: 3

IGMPv2 message format

Type
- 0x11 = Membership Query
  - General Query
  - Group-Specific Query
- 0x16 = Version 2 Membership Report
- 0x17 = Leave Group
- 0x12 = Version 1 Membership Report

Bit offsets: 0, 8, 16, 24, 31

| Type (bits 0-7) | Max. Resp. Time (bits 8-15) | Checksum (bits 16-31) |
| Multicast group Address |

Page 86: Chapter 3 Internet Protocol Layer Part II: 3

IGMPv2 message format

Max Response Time
- Only meaningful in Membership Query messages
- Set to zero in other messages

Checksum
- 16-bit one's complement

Group address
- Zero when sending a General Query
- The group address when sending a Group-Specific Query
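The 8-byte IGMPv2 message can be sketched by packing the four fields and filling in the checksum. The helper names and the group 224.1.2.3 are illustrative assumptions. The sketch also assumes that the checksum is the one's-complement sum over the whole message with the checksum field zeroed, and that Max Response Time is expressed in tenths of a second.

```python
import socket
import struct


def checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over the data (even length assumed)."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def igmp_message(msg_type: int, max_resp_time: int, group: str) -> bytes:
    """Pack Type, Max Resp. Time, Checksum, and Group Address."""
    addr = socket.inet_aton(group)
    unsummed = struct.pack("!BBH4s", msg_type, max_resp_time, 0, addr)
    return struct.pack("!BBH4s", msg_type, max_resp_time, checksum(unsummed), addr)


general_query = igmp_message(0x11, 100, "0.0.0.0")   # group address is zero
group_query = igmp_message(0x11, 10, "224.1.2.3")    # Group-Specific Query
report = igmp_message(0x16, 0, "224.1.2.3")          # Max Resp. Time is zero
leave = igmp_message(0x17, 0, "224.1.2.3")
```

A General Query and a Group-Specific Query share type 0x11 and differ only in whether the group address field is zero.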

Page 87: Chapter 3 Internet Protocol Layer Part II: 3

IGMPv3

- IETF draft-ietf-idmr-igmp-v3-05.txt
- Adds support for “source filtering”
  - A receiver may request to receive packets only from specific source addresses
  - Source addresses are selected by INCLUDE or EXCLUDE
- IPMulticastListen(socket, interface, multicast-address, filter-mode, source-list)
  - filter-mode: INCLUDE or EXCLUDE

Page 88: Chapter 3 Internet Protocol Layer Part II: 3

Multicast Routing Protocols

Two types of multicast tree
- Source-based tree
- Core-based tree (shared tree)

Multicast protocols
- DVMRP
- PIM
  - Sparse mode
  - Dense mode
- CBT
- MOSPF
- BGMP

What's the difference: a per-(S,G) tree or a per-(*,G) tree

Page 89: Chapter 3 Internet Protocol Layer Part II: 3

Distance Vector Multicast Routing Protocol (DVMRP)

- RFC 1075
- Derived from RIP
  - Relies on RIP for unicast routing
- Widely used on the MBONE
- Enables incremental deployment of IP multicast, since it supports tunnels
- Constructs a source-based tree per source
- Provides a shortest path between the source and the receivers

Page 90: Chapter 3 Internet Protocol Layer Part II: 3

DVMRP

Major difference between DVMRP and RIP
- RIP: concerned with calculating the next hop to a destination
- DVMRP: concerned with calculating the previous hop back to a source

Reverse Path Forwarding (RPF) algorithm
- Reverse Path Broadcast (RPB)
- Prune to a Reverse Path Multicast (RPM) tree
- Forwards data uni-directionally

Page 91: Chapter 3 Internet Protocol Layer Part II: 3

RPF Algorithm

Broadcast on the Reverse Path

When a multicast packet is received
- Forward the packet on all outgoing links only if
  - The packet arrived on the interface that lies on the shortest path back to the sender
  - The packet is not a duplicate
- Otherwise, discard the packet
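The forwarding decision above can be sketched as a single function. The data structures are assumptions for illustration: `route_iface` maps a source to the interface on the shortest path back to it (as the unicast routing table would), and `seen` remembers packet identifiers to detect duplicates.

```python
def rpf_forward(source, packet_id, arrival_iface, route_iface, interfaces, seen):
    """Return the interfaces to forward on; an empty list means discard."""
    if route_iface.get(source) != arrival_iface:
        return []                     # not on the reverse shortest path
    if packet_id in seen:
        return []                     # duplicate packet
    seen.add(packet_id)
    # Forward on every outgoing link, i.e. all interfaces except the arrival one.
    return [iface for iface in interfaces if iface != arrival_iface]


# Hypothetical router with three interfaces; source S is reached via eth0.
interfaces = ["eth0", "eth1", "eth2"]
route_iface = {"S": "eth0"}
seen = set()
first = rpf_forward("S", 1, "eth0", route_iface, interfaces, seen)
duplicate = rpf_forward("S", 1, "eth0", route_iface, interfaces, seen)
wrong_iface = rpf_forward("S", 2, "eth1", route_iface, interfaces, seen)
```

Only the first copy arriving on eth0 is forwarded; the duplicate and the copy arriving on eth1 are discarded.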

Page 92: Chapter 3 Internet Protocol Layer Part II: 3

Reverse Path Broadcasting (RPB)

[Figure: reverse path broadcasting from the source. Legend: member, mrouter, router without members; arrows marked Forward or Discard.]

Page 93: Chapter 3 Internet Protocol Layer Part II: 3

RPF Algorithm

Prune
- Routers that do not lead to any members send prune messages to upstream routers
- Routers learn membership information via IGMP

Page 94: Chapter 3 Internet Protocol Layer Part II: 3

Prune RPB Tree

[Figure: pruning the RPB tree. Legend: member, mrouter, router without members; arrows marked Forward or Prune.]

Page 95: Chapter 3 Internet Protocol Layer Part II: 3

Example of a RPM tree

[Figure: resulting RPM tree rooted at the source. Legend: member, router with members, router without members; arrows marked Forward.]

Page 96: Chapter 3 Internet Protocol Layer Part II: 3

RPF Drawbacks And Benefits

Drawbacks:
- The first packet still has to be flooded
- Prune state must be refreshed periodically in order to adapt to network topology changes
- Routers must keep routing state per (source, group) pair

Benefits:
- Guarantees efficient delivery
- Easy to implement

Page 97: Chapter 3 Internet Protocol Layer Part II: 3

DVMRP’s Problem

- Works well only for groups that are densely represented within a subnet
  - Periodic broadcast causes performance problems
- Amount of state information stored in mrouters
  - Information for forwarding multicast messages
  - Prune-state information
- Does not scale to support sparsely distributed multicast groups

Page 98: Chapter 3 Internet Protocol Layer Part II: 3

PIM-SM

Protocol Overview
Special Features
Packet Formats

Page 99: Chapter 3 Internet Protocol Layer Part II: 3

Protocol Overview

Documents
RFC 2362
IETF draft: draft-ietf-pim-sm-v2-new-01.txt

Terminology
DR: Designated Router
RP: Rendezvous Point
RPT: RP-based Tree

PIM-SM routes packets in three phases
Phase one: RP tree
Phase two: Register Stop
Phase three: Shortest-Path Tree (optional)

Page 100: Chapter 3 Internet Protocol Layer Part II: 3

Phase One: RP Tree

Receiver
Sends a join message to its DR using IGMP
DR sends a (*,G) PIM Join message toward the RP
The Join reaches the RP or converges on a router that is already on the RPT
Join messages are sent periodically (otherwise, the state times out)

Sender
Sender sends a packet with the multicast address as its destination to its DR
DR unicasts the encapsulated packet to the RP (PIM Register packets)
RP decapsulates it and forwards it onto the RPT
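
The receiver side of phase one can be sketched as a hop-by-hop walk from the DR toward the RP that stops as soon as it meets a router already on the RPT. This is an illustrative model only: the function `send_star_g_join`, the `next_hop_to_rp` table, and the router names are hypothetical, and per-interface state and the periodic refresh are omitted.

```python
def send_star_g_join(dr, group, rp, next_hop_to_rp, state):
    """Install (*,G) state from the DR toward the RP.

    next_hop_to_rp: dict mapping each router to its next hop toward the RP
                    (assumed to come from unicast routing).
    state:          dict mapping a router to its set of multicast entries;
                    updated in place.
    Returns the routers that created new (*,G) state.
    """
    created = []
    router = dr
    while True:
        entries = state.setdefault(router, set())
        if ("*", group) in entries:
            break  # converged on a router already on the RPT
        entries.add(("*", group))
        created.append(router)
        if router == rp:
            break  # the Join reached the RP
        router = next_hop_to_rp[router]
    return created


hops = {"DR1": "R1", "DR2": "R1", "R1": "RP"}
state = {}
print(send_star_g_join("DR1", "G", "RP", hops, state))  # ['DR1', 'R1', 'RP']
print(send_star_g_join("DR2", "G", "RP", hops, state))  # ['DR2']
```

The second Join stops at R1 because R1 already holds (*,G) state from the first one, which is the "converge on a router on the RPT" case.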

Page 101: Chapter 3 Internet Protocol Layer Part II: 3

Phase Two: Register Stop

Motivation
Encapsulation and decapsulation are too expensive

Steps
RP initiates an (S,G) source-specific Join toward S.
All routers on the path record the (S,G) multicast state.
Packets start to flow along the (S,G) tree to the RP.
If a packet reaches a router that has (*,G) state, it takes a shortcut to the receivers.
RP may now receive duplicate packets: native and encapsulated. RP discards the encapsulated packets.
RP sends a Register-Stop message to the DR of the source.
RP forwards native packets onto the RPT.

Page 102: Chapter 3 Internet Protocol Layer Part II: 3

Phase Three: Shortest-Path Tree

Motivation
The path from the source to the RP and then to the receivers may be too long.

Steps
A receiver’s DR may optionally initiate a switch from the RPT to a source-specific shortest-path tree (SPT).
It issues an (S,G) Join toward S. The Join may reach the source or converge at some router.
It starts to receive two copies of each packet and drops the one from the RPT.
It then sends an (S,G) Prune message toward the RP: the (S,G,rpt) Prune.
The Prune message reaches the RP or converges at some router.

Page 103: Chapter 3 Internet Protocol Layer Part II: 3

Special Issues

Source-specific Joins
Multi-access Transit LANs
RP Discovery

Page 104: Chapter 3 Internet Protocol Layer Part II: 3

Source-specific Joins

If a receiver sends a source-specific join using IGMPv3

DR may omit performing a (*,G) Join.
Instead, DR issues a source-specific (S,G) Join.

Multicast addresses for source-specific multicast

232.0.0.0 to 232.255.255.255
Only source-specific joins are accepted for groups in this range.
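
The DR's choice between a (*,G) and an (S,G) Join can be sketched as follows. The function `join_type` and its return strings are hypothetical labels chosen for illustration; only the 232.0.0.0 to 232.255.255.255 range (232.0.0.0/8) comes from the slide.

```python
import ipaddress

# Source-specific multicast range from the slide: 232.0.0.0 - 232.255.255.255
SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")


def join_type(group, source=None):
    """Decide which PIM Join a DR would issue for a receiver's request.

    group:  group address as a dotted-quad string.
    source: source address if the receiver asked for a specific source
            (for example via IGMPv3), otherwise None.
    """
    if source is not None:
        return "(S,G)"      # source-specific join, (*,G) may be skipped
    if ipaddress.ip_address(group) in SSM_RANGE:
        return "rejected"   # any-source join is not accepted in this range
    return "(*,G)"


print(join_type("224.2.2.2"))              # (*,G)
print(join_type("232.1.1.1", "10.0.0.1"))  # (S,G)
print(join_type("232.1.1.1"))              # rejected
```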

Page 105: Chapter 3 Internet Protocol Layer Part II: 3

Multi-access Transit LANs

Problems on a LAN with more than one router

Two or more routers issue (*,G) Joins
Two or more routers issue (S,G) Joins
One router issues a (*,G) Join while another router issues an (S,G) Join

Routers will observe duplicate join messages
Use PIM Assert messages to elect a single forwarder for the LAN

Prefer the router that sent the (S,G) Join
Otherwise, choose the router with the best metric to the RP or to the source

Page 106: Chapter 3 Internet Protocol Layer Part II: 3

RP Discovery

PIM-SM routers need to know how to map a group to an RP

Use the bootstrap mechanism
In each PIM domain, a router is elected as the Bootstrap Router (BSR).
Candidate RPs of the domain unicast their candidacy to the BSR.
BSR decides on an RP-set and periodically announces it in a bootstrap message to all routers.
A router (DR) uses an order-preserving hash function to map a group address to an RP in the RP-set.

Page 107: Chapter 3 Internet Protocol Layer Part II: 3

DR Election

PIM-Hello messages are sent periodically on each PIM-enabled interface

Hello messages are used to learn neighboring routers and elect a DR.
Hello messages are sent to address 224.0.0.13.
Hello messages contain DR election priority and Generation Identifier fields.

The router with the largest DR election priority becomes the DR. Ties are broken by IP address (larger is preferred).
The Generation Identifier (GenID) is randomly generated. A new GenID causes old Hello information to be updated and may trigger a new DR election.
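
The DR election rule above (largest priority, then largest IP address) can be sketched as a single comparison. The function `elect_dr` and the neighbor tuples are hypothetical; a real router would build this list from received Hello messages and would also react to GenID changes, which this sketch ignores.

```python
import ipaddress


def elect_dr(neighbors):
    """Pick the DR among routers on a link.

    neighbors: list of (ip_string, dr_priority) pairs, including this
               router itself.
    Returns the IP address string of the elected DR.
    """
    def key(entry):
        ip, priority = entry
        # Compare by priority first, then by numeric IP address so that
        # 10.0.0.10 correctly beats 10.0.0.9.
        return (priority, int(ipaddress.ip_address(ip)))

    return max(neighbors, key=key)[0]


print(elect_dr([("10.0.0.1", 1), ("10.0.0.2", 1), ("10.0.0.3", 0)]))  # 10.0.0.2
print(elect_dr([("10.0.0.9", 1), ("10.0.0.2", 5)]))                   # 10.0.0.2
```

Converting the address to an integer avoids the string-comparison trap where "10.0.0.9" would sort above "10.0.0.10".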

Page 108: Chapter 3 Internet Protocol Layer Part II: 3

BSR Election

A set of routers are configured as candidate bootstrap routers (C-BSRs)

Bootstrap messages are used for BSR election and RP-set distribution
The C-BSR with the largest BSR priority is elected as the BSR. Ties are broken by IP address.

Page 109: Chapter 3 Internet Protocol Layer Part II: 3

RP-set

A set of routers are configured as candidate RPs (C-RPs)

Typically same as C-BSRs

Candidate RPs periodically unicast Candidate-RP-Advertisement messages (C-RP-Advs) to the BSR, which include

C-RP address
Group address and a mask indicating the set of groups for which it is willing to be the RP

BSR forms the RP-set (for each group prefix)

Page 110: Chapter 3 Internet Protocol Layer Part II: 3

Hash Function

A router maintains an up-to-date RP-set
Choose an RP for a group G as follows

Select the RPs from the RP-set whose group prefix is the longest one that covers G
Compute a value for each candidate, where M is the hash mask and C(i) is the address of candidate RP i:

Value(G,M,C(i)) = (1103515245 * ((1103515245 * (G&M) + 12345) XOR C(i)) + 12345) mod 2^31

Choose the RP with the highest priority and, among those, the highest value
Tie break by IP address
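
The hash formula above translates directly into Python. This is a sketch under stated assumptions: the function names are hypothetical, the candidate list is assumed to be already filtered to the longest matching group prefix and the best priority, the hash mask length of 30 is an assumed default rather than something the slide specifies, and ties are assumed to go to the larger IP address.

```python
import ipaddress


def rp_hash_value(group, mask, candidate):
    """Value(G, M, C(i)) from the slide, with all arguments as integers."""
    inner = (1103515245 * (group & mask) + 12345) ^ candidate
    return (1103515245 * inner + 12345) % 2**31


def choose_rp(group_ip, candidate_ips, hash_mask_len=30):
    """Pick the RP with the highest hash value for a group.

    candidate_ips: RP addresses that already share the longest matching
                   group prefix and the best priority (assumption).
    hash_mask_len: number of leading group-address bits fed to the hash
                   (30 is an assumed default).
    """
    g = int(ipaddress.ip_address(group_ip))
    m = (0xFFFFFFFF << (32 - hash_mask_len)) & 0xFFFFFFFF

    def key(candidate_ip):
        c = int(ipaddress.ip_address(candidate_ip))
        # Highest hash value wins; the address breaks ties.
        return (rp_hash_value(g, m, c), c)

    return max(candidate_ips, key=key)


rps = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# Groups that agree on the masked bits always map to the same RP.
print(choose_rp("224.1.1.0", rps) == choose_rp("224.1.1.1", rps))  # True
```

Because only G&M enters the hash, consecutive groups that share the masked bits land on the same RP, while different blocks of groups are spread across the candidates.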

Page 111: Chapter 3 Internet Protocol Layer Part II: 3

Summary

Source-rooted tree:
- Advantage: creates an optimal path between sources and receivers
- Disadvantage: routers must maintain path information for each (S,G) pair

Shared tree:
- Advantage: requires a minimum amount of state in each router
- Disadvantage: the path between sources and receivers may not be optimal