Internet and Intranet Protocols and Applications
IP Multicast
March, 2004
Prof. Arthur Goldberg
Computer Science Department
New York University
Slides 3-9,11-14,31-55 Copyright Kurose and Ross, others Joe Conron
Multicast
• Peer to peer applications
• Multicast protocols
  – Types
  – Addressing
  – Reliable multicast
P2P file sharing
Example
• Alice runs P2P client application on her notebook computer
• Intermittently connects to Internet; gets new IP address for each connection
• Asks for “Hey Jude”
• Application displays other peers that have copy of Hey Jude.
• Alice chooses one of the peers, Bob.
• File is copied from Bob’s PC to Alice’s notebook: HTTP
• While Alice downloads, other users uploading from Alice.
• Alice’s peer is both a Web client and a transient Web server.
All peers are servers = highly scalable!
P2P: centralized directory
Original Napster design
1) when peer connects, it informs central server:
  – IP address
  – content
2) Alice queries for “Hey Jude”
3) Alice requests file from Bob
[Figure: centralized directory server and peers, including Alice and Bob; arrows labeled 1, 2, 3 correspond to the steps above]
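The three steps above can be sketched as a small in-memory index. This is only an illustration, not Napster's actual protocol: the class name `CentralDirectory`, its methods, and the peer address are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of a central directory server.
final class CentralDirectory {
    // content title -> addresses of peers that registered it
    private final Map<String, Set<String>> index = new HashMap<>();

    // Step 1: a connecting peer reports its IP address and its content.
    void register(String peerAddress, Collection<String> titles) {
        for (String title : titles) {
            index.computeIfAbsent(title, k -> new TreeSet<>()).add(peerAddress);
        }
    }

    // A departing peer is dropped so that stale addresses are not returned.
    void unregister(String peerAddress) {
        for (Set<String> peers : index.values()) {
            peers.remove(peerAddress);
        }
    }

    // Step 2: a query returns the peers holding the title.
    // Step 3 (the file transfer) happens directly between peers, not here.
    Set<String> query(String title) {
        Set<String> peers = index.getOrDefault(title, Collections.emptySet());
        return Collections.unmodifiableSet(peers);
    }

    public static void main(String[] args) {
        CentralDirectory dir = new CentralDirectory();
        dir.register("10.0.0.7", Arrays.asList("Hey Jude")); // Bob, made-up address
        System.out.println(dir.query("Hey Jude"));
    }
}
```

Every lookup goes through this one object, which is exactly the single point of failure and bottleneck discussed next.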
P2P: problems with centralized directory
• Single point of failure
• Performance bottleneck
• Copyright infringement
File transfer is decentralized, but locating content is highly centralized
P2P: decentralized directory
• Each peer is either a group leader or assigned to a group leader.
• Group leader tracks the content in all its children.
• Peer queries group leader; group leader may query other group leaders.
[Figure legend: ordinary peer; group-leader peer; neighboring relationships in overlay network]
More about decentralized directory
• overlay network
  – peers are nodes
  – edges between peers and their group leaders
  – edges between some pairs of group leaders
  – virtual neighbors
• bootstrap node
  – connecting peer is either assigned to a group leader or designated as leader
• advantages of approach
  – no centralized directory server
  – location service distributed over peers
  – more difficult to shut down
• disadvantages of approach
  – bootstrap node needed
  – group leaders can get overloaded
P2P: Query flooding
• Gnutella
• no hierarchy
• use bootstrap node to learn about others
• join message
• Send query to neighbors
• Neighbors forward query
• If queried peer has object, it sends message back to querying peer
P2P: more on query flooding
Pros
• peers have similar responsibilities: no group leaders
• highly decentralized
• no peer maintains directory info
Cons
• excessive query traffic
• query radius: a query may fail to find content even when it is present in the network
• bootstrap node
• maintenance of overlay network
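A minimal sketch of query flooding with a limited radius, assuming a hypothetical overlay given as adjacency lists. The class `QueryFlood`, the `maxHops` parameter, and the peer names are illustrative and not part of Gnutella's wire protocol.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch of flooding a query through an overlay, hop by hop.
final class QueryFlood {
    // Returns the peers within maxHops of origin that hold the object.
    static Set<String> flood(Map<String, List<String>> neighbors,
                             Map<String, Set<String>> content,
                             String origin, String object, int maxHops) {
        Set<String> hits = new TreeSet<>();
        Set<String> seen = new HashSet<>();
        seen.add(origin);
        List<String> frontier = Collections.singletonList(origin);
        for (int hop = 0; hop < maxHops && !frontier.isEmpty(); hop++) {
            List<String> next = new ArrayList<>();
            for (String peer : frontier) {
                for (String n : neighbors.getOrDefault(peer, Collections.emptyList())) {
                    if (!seen.add(n)) continue; // peer already handled this query
                    if (content.getOrDefault(n, Collections.emptySet()).contains(object)) {
                        hits.add(n); // this peer would reply to the querying peer
                    }
                    next.add(n);     // and forward the query to its own neighbors
                }
            }
            frontier = next;
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up chain overlay A - B - C - D; only D holds the object.
        Map<String, List<String>> nb = new HashMap<>();
        nb.put("A", Arrays.asList("B"));
        nb.put("B", Arrays.asList("A", "C"));
        nb.put("C", Arrays.asList("B", "D"));
        nb.put("D", Arrays.asList("C"));
        Map<String, Set<String>> ct = new HashMap<>();
        ct.put("D", Collections.singleton("Hey Jude"));
        System.out.println(flood(nb, ct, "A", "Hey Jude", 2));
        System.out.println(flood(nb, ct, "A", "Hey Jude", 3));
    }
}
```

With a radius of 2 the query from A never reaches D, which illustrates the "query radius" drawback: the content exists but is not found.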
Multicast
• Peer to peer applications
• Multicast protocols
  – Types
  – Addressing
  – Reliable multicast
Multicast: one sender to many receivers
• Multicast: act of sending datagram to multiple receivers with single “transmit” operation
  – analogy: one teacher to many students
• Question: how to achieve multicast
Unicast implementation
• source sends N unicast datagrams, one addressed to each of N receivers
[Figure legend: multicast receiver (red); not a multicast receiver; routers forward unicast datagrams]
Multicast: with network support
Routers actively participate in multicast, making copies of packets as needed and forwarding towards multicast receivers
Multicast routers (red) duplicate and forward multicast datagrams
Multicast: Application-layer
• end systems involved in multicast copy and forward unicast datagrams among themselves
Internet Multicast Service Model
multicast group concept: use of indirection
  – host addresses IP datagram to multicast group
  – routers forward multicast datagrams to hosts that have “joined” that multicast group
[Figure: hosts 128.119.40.186, 128.59.16.12, 128.34.108.63, 128.34.108.60; multicast group 226.17.30.197]
IP Multicast - Concepts
• Message sent to multicast “group” of receivers
  – Senders need not be group members
  – Each group has a “group address”
  – Groups can have any size
  – End-stations (receivers) can join/leave at will
  – Data Packets are UDP (uh oh!)
IP Multicast Benefits
• Distribution tree for delivery/distribution of packets (i.e., scope extends beyond LAN)
  – Tree is built by multicast routing protocols. Current multicast tree over the internet is called MBONE
• No more than one copy of packet appears on any sub-net.
• Packets delivered only to “interested” receivers => multicast delivery tree changes dynamically
• Non-member nodes even on a single sub-net do not receive packets (unlike sub-net-specific broadcast)
Multicast
• Peer to peer applications
• Multicast protocols
  – Types
  – Addressing
  – Using an example Java program
  – Message distribution
  – Reliable multicast
*cast Definitions
• Unicast - send to one destination
• General Broadcast - send to EVERY local node (255.255.255.255)
• Directed Broadcast - send to all nodes on one specified subnet (198.122.15.255)
• Multicast - send to every member of a Group of “interested” nodes (perhaps a Class D address).
• RFCs 1700, 1112
Multicast addresses
• Class D addresses: 224.0.0.0 - 239.255.255.255
• Each multicast address represents a group of hosts of arbitrary size, called a “host group”
• Addresses 224.0.0.x and 224.0.1.x are reserved. See assigned numbers RFC 1700
  – Eg: 224.0.0.2 = all routers on this sub-net
• Addresses 239.0.0.0 thru 239.255.255.255 are reserved for private network (or intranet) use
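The Class D range corresponds to a first octet whose top four bits are 1110. A small sketch of that check follows; the class `ClassD` and its method names are made up for illustration, while `InetAddress.getByName` and `getAddress` are the standard `java.net` calls.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative helpers for classifying IPv4 multicast addresses.
final class ClassD {
    // Parses a dotted-quad literal; no DNS lookup happens for literals.
    static byte[] parse(String dotted) {
        try {
            return InetAddress.getByName(dotted).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("bad address: " + dotted, e);
        }
    }

    // True for 224.0.0.0 - 239.255.255.255 (top four bits are 1110).
    static boolean isClassD(String dotted) {
        byte[] a = parse(dotted);
        return a.length == 4 && (a[0] & 0xF0) == 0xE0;
    }

    // True for 239.0.0.0 - 239.255.255.255, the private/intranet range.
    static boolean isPrivateScope(String dotted) {
        byte[] a = parse(dotted);
        return a.length == 4 && (a[0] & 0xFF) == 239;
    }

    public static void main(String[] args) {
        System.out.println(isClassD("224.0.0.2"));       // all routers on this sub-net
        System.out.println(isClassD("128.119.40.186"));  // an ordinary unicast address
        System.out.println(isPrivateScope("239.1.2.3"));
    }
}
```

In practice `InetAddress.isMulticastAddress()` performs the same range test; the bit mask is spelled out here only to show where the range comes from.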
Link-Layer Multicast Addresses
Ethernet and other LANs using 802 addresses:
[Figure: the IP multicast address is the Class D prefix 1110 followed by 28 bits; its lower 23 bits are copied into the LAN multicast address, which begins with the fixed prefix 0x01005e (group bit set) followed by a 0 bit]
Lower 23 bits of Class D address are inserted into the lower 23 bits of MAC address (see RFC 1112)
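The 23-bit mapping can be sketched in a few lines. The class `McastMac` and its method are hypothetical names; the prefix 01:00:5e and the masking of the second octet follow the rule stated above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch of the Class D to Ethernet multicast address mapping.
final class McastMac {
    static String toMac(String group) {
        byte[] a;
        try {
            a = InetAddress.getByName(group).getAddress(); // literal, no DNS lookup
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("bad address: " + group, e);
        }
        if (a.length != 4 || (a[0] & 0xF0) != 0xE0) {
            throw new IllegalArgumentException("not a Class D address: " + group);
        }
        // Keep the low 7 bits of octet 2 and all of octets 3 and 4 (23 bits total).
        return String.format("01:00:5e:%02x:%02x:%02x",
                a[1] & 0x7F, a[2] & 0xFF, a[3] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(toMac("227.1.2.3"));
        System.out.println(toMac("239.129.2.3")); // same MAC as the line above
    }
}
```

Because 28 group bits are squeezed into 23, 32 different IP groups share each MAC address, so the IP stack still has to filter on the full group address after the NIC accepts a frame.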
Multicast groups
• class D Internet addresses reserved for multicast
• host group semantics:
  – anyone can “join” (receive from) a multicast group
  – anyone can send to a multicast group
  – no network-layer identification to hosts of members
• needed: infrastructure to deliver multicast-addressed datagrams to all hosts that have joined that multicast group
NIC, IP Stack, and Apps Cooperate!
• Idea - NIC does not accept packets unless some app on this node wants it (avoid non-productive work).
• How does it know which packets to accept?
  – App “joins” group with Class D address
  – IP stack gives class D info to NIC
  – NIC builds “filter” to match MAC addresses
  – Latch on to:
    • Own address, MAC broadcast, or “Group” address
Multicast
• Peer to peer applications
• Multicast protocols
  – Types
  – Addressing
  – Using an example Java program
  – Group management
  – Message distribution
  – Reliable multicast
IP Multicast - Sending
• Use normal IP-Send operation, with multicast address specified as destination
• Must provide sending application a way to:
  – Specify outgoing network interface, if more than 1 is available
  – Specify IP time-to-live (TTL) on outgoing packet
  – Enable/disable loop-back if the sending host is/isn’t a member of the destination group on the outgoing interface
IP Multicast - Receiving
• Two new operations
  – Join-IP-Multicast-Group( group-address, interface )
  – Leave-IP-Multicast-Group( group-address, interface )
• Receive multicast packets for joined groups via normal Receive operation
IP Multicast in Java
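• Java has a MulticastSocket class
• Use it to “join” a multicast group.

import java.io.IOException;
import java.net.*;

MulticastSocket s;
InetAddress group;
try {
    group = InetAddress.getByName("227.1.2.3");
    s = new MulticastSocket(5555);   // bind to the UDP port used by the group
    s.joinGroup(group);              // deprecated since Java 14 in favor of joinGroup(SocketAddress, NetworkInterface)
} catch (UnknownHostException e) {
    // group address could not be resolved
} catch (IOException e) {
    // socket could not be created or the join failed
}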
IP Multicast in Java
• Receive DatagramPackets on a MulticastSocket
byte[] buf = new byte[1024];   // receive buffer; size chosen for illustration
DatagramPacket recv = new DatagramPacket(buf, buf.length);
try {
    s.receive(recv);           // blocks until a datagram arrives
} catch (IOException e) {
    System.out.println("mcastReceive: " + e.toString());
    return;
}
// get message
String msg = new String(recv.getData(), recv.getOffset(), recv.getLength());
IP Multicast in Java
• To send, just send a DatagramPacket to the multicast address, port (no need to use a MulticastSocket, although you could)
// buf holds the bytes of the message to send
InetAddress group = InetAddress.getByName("227.1.2.3");
DatagramSocket s = new DatagramSocket();
DatagramPacket snd = new DatagramPacket(buf, buf.length, group, 5555);
try {
    s.send(snd);
} catch (IOException e) {
    System.out.println("mcastSend: " + e.toString());
    return;
}
Multicast Scope
• Scope: How far do transmissions propagate?
• Implicit scope:
  – Reserved Mcast addresses => don’t leave subnet.
• TTL-based scope:
  – Each multicast router has a configured TTL threshold
  – It does not forward multicast datagram if TTL
Multicast
• Peer to peer applications
• Multicast protocols
  – Types
  – Addressing
  – Using an example Java program
  – Group management
  – Message distribution
  – Reliable multicast
Joining a multicast group: two-step process
• local: host informs local mcast router of desire to join group: IGMP (Internet Group Management Protocol)
• wide area: local router interacts with other routers to receive mcast datagram flow
  – many protocols (e.g., DVMRP, MOSPF, PIM)
[Figure: IGMP runs between hosts and their local routers; wide-area multicast routing runs among the routers]
IGMP: Internet Group Management Protocol
• host: sends IGMP report when application joins mcast group
  – IP_ADD_MEMBERSHIP socket option
  – host need not explicitly “unjoin” group when leaving
• router: sends IGMP query at regular intervals
  – host belonging to a mcast group must reply to query
[Figure: router sends query; host sends report]
IGMP
IGMP version 1
• router: Host Membership Query msg broadcast on LAN to all hosts
• host: Host Membership Report msg to indicate group membership
  – randomized delay before responding
  – implicit leave via no reply to Query
• RFC 1112
IGMP v2: additions include
• group-specific Query
• Leave Group msg
  – last host replying to Query can send explicit Leave Group msg
  – router performs group-specific query to see if any hosts left in group
  – RFC 2236
IGMP v3: standardized as RFC 3376 (adds source filtering)
Multicast
• Peer to peer applications
• Multicast protocols
  – Types
  – Addressing
  – Using an example Java program
  – Group management
  – Message distribution
  – Reliable multicast
Multicast Routing: Problem Statement
• Goal: find a tree (or trees) connecting routers having local mcast group members
  – tree: not all paths between routers used
  – source-based: different tree from each sender to receivers
  – shared-tree: same tree used by all group members
[Figure: shared tree; source-based trees]
Approaches for building mcast trees
Approaches:
• source-based tree: one tree per source
  – shortest path trees
  – reverse path forwarding
• group-shared tree: group uses one tree
  – minimal spanning (Steiner)
  – center-based trees
…we first look at basic approaches, then specific protocols adopting these approaches
Shortest Path Tree
• mcast forwarding tree: tree of shortest path routes from source to all receivers
• Can be computed with Dijkstra’s algorithm
[Figure: source S and routers R1-R7; forwarding links are numbered 1-6. Legend: router with attached group member; router with no attached group member; link used for forwarding, i indicates order link added by algorithm]
An Aside: A Link-State Routing Algorithm
Dijkstra’s algorithm
• net topology, link costs known to all nodes
  – accomplished via “link state broadcast”
  – all nodes have same info
• computes least cost paths from one node (“source”) to all other nodes
  – gives routing table for that node
• iterative: after k iterations, know least cost path to k dest.’s
Notation:
• c(i,j): link cost from node i to j. cost infinite if not direct neighbors
• D(v): current value of cost of path from source to dest. v
• p(v): predecessor node along path from source to v, i.e., the node just before v on that path
• N: set of nodes whose least cost path definitively known
Dijkstra’s Algorithm
Initialization:
  N = {A}
  for all nodes v
    if v adjacent to A
      then D(v) = c(A,v)
      else D(v) = infinity

Loop
  find w not in N such that D(w) is a minimum
  add w to N
  update D(v) for all v adjacent to w and not in N:
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N
Dijkstra’s algorithm: example
Step  N       D(B),p(B)  D(C),p(C)  D(D),p(D)  D(E),p(E)  D(F),p(F)
0     A       2,A        5,A        1,A        infinity   infinity
1     AD      2,A        4,D                   2,D        infinity
2     ADE     2,A        3,E                              4,E
3     ADEB               3,E                              4,E
4     ADEBC                                               4,E
5     ADEBCF
[Figure: graph with nodes A, B, C, D, E, F and weighted links]
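The loop can be sketched in Java as below. The link costs in `exampleGraph` are an assumption: the figure's edge labels did not survive extraction, so the costs were chosen to agree with the table of D(v), p(v) values. The class and method names are illustrative.

```java
import java.util.Arrays;

// Array-based sketch of the link-state computation (O(n^2) version).
final class LinkState {
    static final int INF = Integer.MAX_VALUE;

    // cost[i][j] is c(i,j), INF when i and j are not direct neighbors.
    // Returns {D, p}: D[v] is the least cost from source, p[v] the predecessor (-1 if none).
    static int[][] dijkstra(int[][] cost, int source) {
        int n = cost.length;
        int[] D = new int[n];
        int[] p = new int[n];
        boolean[] inN = new boolean[n];
        Arrays.fill(p, -1);
        for (int v = 0; v < n; v++) {          // initialization
            D[v] = cost[source][v];
            if (D[v] != INF) p[v] = source;
        }
        D[source] = 0;
        p[source] = -1;
        inN[source] = true;
        for (int k = 1; k < n; k++) {
            int w = -1;                        // find w not in N with minimum D(w)
            for (int v = 0; v < n; v++) {
                if (!inN[v] && (w == -1 || D[v] < D[w])) w = v;
            }
            if (D[w] == INF) break;            // remaining nodes are unreachable
            inN[w] = true;
            for (int v = 0; v < n; v++) {      // update neighbors of w not in N
                if (!inN[v] && cost[w][v] != INF && D[w] + cost[w][v] < D[v]) {
                    D[v] = D[w] + cost[w][v];
                    p[v] = w;
                }
            }
        }
        return new int[][] { D, p };
    }

    // Nodes A..F are indices 0..5. Costs are assumed, see the note above.
    static int[][] exampleGraph() {
        int[][] c = new int[6][6];
        for (int[] row : c) Arrays.fill(row, INF);
        int[][] links = { {0, 1, 2}, {0, 2, 5}, {0, 3, 1}, {1, 2, 3}, {1, 3, 2},
                          {2, 3, 3}, {2, 4, 1}, {2, 5, 5}, {3, 4, 1}, {4, 5, 2} };
        for (int[] l : links) {
            c[l[0]][l[1]] = l[2];
            c[l[1]][l[0]] = l[2];
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] r = dijkstra(exampleGraph(), 0);
        System.out.println("D = " + Arrays.toString(r[0]));
        System.out.println("p = " + Arrays.toString(r[1]));
    }
}
```

When two nodes tie for the minimum, this sketch picks the lower index, so nodes may enter N in a different order than in the table; the final costs and predecessors are the same.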
Dijkstra’s algorithm, discussion—end aside
Algorithm complexity: n nodes
• each iteration: need to check all nodes, w, not in N
• n*(n+1)/2 comparisons: O(n**2)
• more efficient implementations possible: O(nlogn)
Reverse Path Forwarding
• rely on router’s knowledge of unicast shortest path from it to sender
• each router has simple forwarding behavior:
  if (mcast datagram received on incoming link on shortest path back to sender)
    then flood datagram onto all outgoing links
    else ignore datagram
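The forwarding rule can be sketched as a single function. The link names and the `linkTowardSource` argument (which a real router would look up in its unicast routing table) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch of the reverse path forwarding decision at one router.
final class ReversePathForwarding {
    // Returns the links on which the datagram is sent out again.
    static List<String> forward(String arrivalLink, String linkTowardSource,
                                List<String> allLinks) {
        if (!arrivalLink.equals(linkTowardSource)) {
            return Collections.emptyList();   // not on the reverse path: ignore
        }
        List<String> out = new ArrayList<>(allLinks);
        out.remove(arrivalLink);              // flood on every other link
        return out;
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList("if0", "if1", "if2"); // made-up names
        System.out.println(forward("if0", "if0", links));
        System.out.println(forward("if1", "if0", links));
    }
}
```

A copy arriving on any link other than the one toward the source is dropped, which is what stops flooded datagrams from looping.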
Reverse Path Forwarding: example
• result is a source-specific reverse SPT
• Problems
  – A bad choice with asymmetric links
  – Substantial extra traffic
[Figure: source S and routers R1-R7. Legend: router with attached group member; router with no attached group member; datagram will be forwarded; datagram will not be forwarded]
Reverse Path Forwarding: pruning
• forwarding tree contains subtrees with no mcast group members
  – “prune” msgs sent upstream by router with no downstream group members
[Figure: source S and routers R1-R7, with prune messages (P) sent upstream. Legend: router with attached group member; router with no attached group member; prune message; links with multicast forwarding]
Shared-Tree: Steiner Tree
• minimum cost tree connecting all routers with attached group members
• problem is NP-complete
• excellent heuristics exist
• not used in practice:
  – computational complexity
  – information about entire network needed
  – monolithic: rerun whenever a router needs to join/leave
Center-based trees
• single delivery tree shared by all
• one router identified as “center” of tree
• to join:
– edge router sends unicast join-msg addressed to center router
– join-msg “processed” by intermediate routers and forwarded towards center
– join-msg either hits existing tree branch for this center, or arrives at center
– path taken by join-msg becomes new branch of tree for this router
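The join procedure can be sketched as a walk along the unicast path toward the center that stops at the first router already on the tree. The class `CenterBasedTree`, the next-hop table, and the router names are hypothetical; a real protocol exchanges messages rather than reading a shared map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of growing a center-based tree from join messages.
final class CenterBasedTree {
    private final Map<String, String> nextHopToCenter; // router -> next hop toward center
    private final Set<String> onTree = new HashSet<>();

    CenterBasedTree(String center, Map<String, String> nextHopToCenter) {
        this.nextHopToCenter = nextHopToCenter;
        onTree.add(center);
    }

    // Returns the routers that the join message adds to the tree, in order.
    List<String> join(String edgeRouter) {
        List<String> branch = new ArrayList<>();
        String r = edgeRouter;
        while (!onTree.contains(r)) {          // stop at an existing branch or the center
            if (branch.contains(r)) throw new IllegalStateException("routing loop at " + r);
            branch.add(r);
            r = nextHopToCenter.get(r);
            if (r == null) throw new IllegalStateException("no route to center");
        }
        onTree.addAll(branch);                 // path taken becomes a new branch
        return branch;
    }

    public static void main(String[] args) {
        Map<String, String> hop = new HashMap<>();
        hop.put("E1", "M");   // made-up topology: E1 and E2 reach center C via M
        hop.put("E2", "M");
        hop.put("M", "C");
        CenterBasedTree tree = new CenterBasedTree("C", hop);
        System.out.println(tree.join("E1"));
        System.out.println(tree.join("E2"));
    }
}
```

The second join stops at M because M is already on the tree, so only the new edge router is added.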
Center-based trees: an example
Suppose R6 chosen as center:
[Figure: routers R1-R7 with R6 as center; numbers 1, 2, 3 on the paths. Legend: router with attached group member; router with no attached group member; path order in which join messages generated]
Internet Multicasting Routing: DVMRP
• DVMRP: distance vector multicast routing protocol, RFC1075
• flood and prune: reverse path forwarding, source-based tree
  – RPF tree based on DVMRP’s own routing tables constructed by communicating DVMRP routers
  – no assumptions about underlying unicast
  – initial datagram to mcast group flooded everywhere via RPF
  – routers not wanting group: send upstream prune msgs
DVMRP: continued…
• soft state: DVMRP router periodically (1 min.) “forgets” that branches are pruned:
  – mcast data again flows down unpruned branch
  – downstream router: reprune or else continue to receive data
• routers can quickly regraft to tree
  – following IGMP join at leaf
• odds and ends
  – commonly implemented in commercial routers
  – Mbone routing done using DVMRP
Tunneling
Q: How to connect “islands” of multicast routers in a “sea” of unicast routers?
• mcast datagram encapsulated inside “normal” (non-multicast-addressed) datagram
• normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast router
• receiving mcast router unencapsulates to get mcast datagram
[Figure: physical topology; logical topology]
PIM: Protocol Independent Multicast
• not dependent on any specific underlying unicast routing algorithm (works with all)
• two different multicast distribution scenarios:
Dense:
• group members densely packed, in “close” proximity.
• bandwidth more plentiful
Sparse:
• # networks with group members small wrt # interconnected networks
• group members “widely dispersed”
• bandwidth not plentiful
Consequences of Sparse-Dense Dichotomy:
Dense:
• group membership by routers assumed until routers explicitly prune
• data-driven construction on mcast tree (e.g., RPF)
• bandwidth and non-group-router processing profligate
Sparse:
• no membership until routers explicitly join
• receiver-driven construction of mcast tree (e.g., center-based)
• bandwidth and non-group-router processing conservative
PIM- Dense Mode
flood-and-prune RPF, similar to DVMRP but
• underlying unicast protocol provides RPF info for incoming datagram
• less complicated (less efficient) downstream flood than DVMRP; reduces reliance on underlying routing algorithm
• has protocol mechanism for router to detect it is a leaf-node router
PIM - Sparse Mode
• center-based approach
• router sends join msg to rendezvous point (RP)
  – intermediate routers update state and forward join
• after joining via RP, router can switch to source-specific tree
  – increased performance: less concentration, shorter paths
[Figure: routers R1-R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]
PIM - Sparse Mode
sender(s):
• unicast data to RP, which distributes down RP-rooted tree
• RP can extend mcast tree upstream to source
• RP can send stop msg if no attached receivers
  – “no one is listening!”
[Figure: routers R1-R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]
Multicast
• Peer to peer applications
• Multicast protocols
  – Types
  – Addressing
  – Using an example Java program
  – Group management
  – Message distribution
  – Reliable multicast
Reliable Multicast
• Remember: IP Multicast uses IP - it’s unreliable!
• We need a “reliable” multicast
• Let’s review what we mean by “reliable”
  – message gets to ALL receivers
  – sending order is preserved
  – no duplicates
  – no corruption
Reliable Multicast
• Problems:
  – Retransmission can make reliable multicast as inefficient as replicated unicast
    • Ack-implosion if all destinations ack at once
    • Nak-implosion if all destinations nak at once
  – Source does not know # of destinations. Or even if all destinations are still “up”
  – one bad link affects entire group
  – Heterogeneity: receivers, links, group sizes
Reliable Multicast
• RM protocols have to deal with:
  – Scalability.
  – Heterogeneity.
  – Adaptive flow/congestion control.
  – Reliability.
• Not all multicast applications need reliability of the type provided by TCP. Some can tolerate reordering, delay, etc
• Let’s look at two very different approaches
  – PGM (Pragmatic General Multicast)
  – RMP (Reliable Multicast Protocol)