CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Page 1: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

CPEG 419

Review of Lecture 1 and continuation of chapter 1

Introduction to Data Networking

Page 2: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Announcements

• Homework 1 due next week

• Project 1 due next week

Page 3: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Today

• Review and complete Chapter 1

• Start Chapter 2

Page 4: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Packet Switching Case

What is the probability of more than 100 users being active?

It is the probability of 101 users being active, plus 102 users being active, plus …, plus 200 users being active. This is the binomial complementary cumulative distribution, here with each user active with probability 0.2:

$$\sum_{k=101}^{200} \binom{200}{k}\, 0.2^{k}\, (1-0.2)^{200-k} < 10^{-14}$$

We conclude that if there are 200 users, then "pretty much always" things will work fine.

Suppose that there are 300 users:

$$\sum_{k=101}^{300} \binom{300}{k}\, 0.2^{k}\, (1-0.2)^{300-k} \approx 10^{-8}$$

Still pretty good.

Suppose that there are 400 users:

$$\sum_{k=101}^{400} \binom{400}{k}\, 0.2^{k}\, (1-0.2)^{400-k} \approx 0.004$$

Might be acceptable performance.

Therefore: circuit switching could support 100 users, while packet switching can support 400 users. A factor of 4 more!
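
The tail sums on this slide can be checked numerically. The sketch below assumes each user is active independently with probability 0.2 (the value that appears in the slide's formulas); the function name `prob_more_than` is made up for this example.

```python
from math import comb


def prob_more_than(threshold: int, n_users: int, p_active: float) -> float:
    """P(more than `threshold` of `n_users` users are active at the same time).

    Binomial complementary cumulative distribution: sum of the binomial
    probabilities for k = threshold + 1, ..., n_users.
    """
    q = 1.0 - p_active
    return sum(
        comb(n_users, k) * p_active**k * q ** (n_users - k)
        for k in range(threshold + 1, n_users + 1)
    )


if __name__ == "__main__":
    # 100 is the number of users that circuit switching could support.
    for n in (200, 300, 400):
        print(n, "users:", prob_more_than(100, n, 0.2))
```

Running it for 200, 300, and 400 users shows how quickly the overload probability grows as more users share the link.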

Page 5: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Losses and delay in packet switched networks

• Losses
  – Transmission losses
    • In fiber links, the bit-error rate is 10^-12 or better (i.e., lower).
      – What is the probability of packet error when there are 1400 bytes in a packet?
    • In wireless links, the bit-error rate can be very high.
  – Congestion losses
    • If too many packets arrive at the same time, the buffers fill up and packets are lost.
    • Increasing the link speeds or reducing the number of users can reduce the probability of loss.
    • Increasing the size of the buffer reduces losses, but also increases delay.
• Delay
  – Queuing delay
  – Transmission delay
  – Propagation delay
  – Processing delay

[Figure: packets on the path from A to B. The packet being transmitted experiences transmission delay; packets waiting in the buffer experience queueing delay; arriving packets are dropped (loss) if there are no free (available) buffers.]
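
The slide's question about a 1400-byte packet can be worked out directly. This is a minimal sketch that assumes bit errors are independent, so a packet is in error whenever at least one of its bits is; the function name is hypothetical.

```python
import math


def packet_error_probability(bit_error_rate: float, packet_bytes: int) -> float:
    """Probability that at least one bit of the packet is corrupted.

    Equals 1 - (1 - ber)^bits. log1p/expm1 keep the result accurate
    when the bit-error rate is tiny (e.g., 1e-12).
    """
    bits = 8 * packet_bytes
    return -math.expm1(bits * math.log1p(-bit_error_rate))


if __name__ == "__main__":
    # Fiber link from the slide: bit-error rate 1e-12, 1400-byte packet.
    print(packet_error_probability(1e-12, 1400))
```

For such small error rates the result is very close to (number of bits) × (bit-error rate), i.e., 11200 × 10^-12.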

Page 6: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

In the news

News sources

www.lightreading.com (general networks)

www.unstrung.com (wireless and mobile)

www.darkreading.com (network security)

www.alleyinsider.com (general tech business news)

arstechnica.com (general tech news)

Page 7: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

The Protocol Stack

• The application layer includes network applications and network application protocols
  – Examples of applications: web, IM, email
  – Examples of application protocols: OSCAR, HTTP, SMTP, FTP, DNS
• Applications provide a service to a user or to another application.
• Applications require service from the lower layers, but typically interact only with the transport layer.

[Figure: the protocol stack, top to bottom: application, transport, network, link, physical]

Page 8: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

The Protocol Stack

• The transport layer (typically) transports messages from and to applications
• Different transport layer protocols provide different types of services.
• Types of services MAY include
  – Reliability: the sender application can be assured that the data is correctly received, or else it receives an error message.
  – Congestion and flow control: attempt to send data quickly, but not so quickly as to cause congestion in the network or at the receiving host
  – Error detection / correction
  – In-order delivery
  – Breaking long messages into small chunks suitable for transmission over the network
  – Multiplexing, so that multiple transport layer connections can occur simultaneously
• Note that when a transport protocol provides these services, the application does not have to.
  – This makes implementation of applications easier.
  – This allows careful design of transport protocols, following the divide and conquer approach
• The transport layer uses the network layer to deliver packets, but does not require any type of service guarantees from the network layer
  – In practice, the transport layer hopes for in-order delivery.

Page 9: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Transport layer protocols: TCP and UDP

• TCP and UDP are the most widely used transport protocols.
• Other protocols include SCTP (UD and Cisco are active in developing SCTP) and RTP (for multimedia such as VoIP)
• TCP and UDP will be covered in great detail later. But for now:
• TCP provides many services
  – Congestion control
  – Flow control
  – Reliability
  – Multiplexing
  – Error detection
• UDP provides few services
  – Error detection
  – Multiplexing
  – The application must implement any other services that it requires.
• TCP requires a connection to be established; UDP does not

Page 10: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Transport Multiplexing

• Transport layers use ports to provide multiplexing
  – Two hosts can have multiple simultaneous connections by using different ports.
  – Well-known ports can be used to specify a particular application
    • E.g., web servers accept TCP connections on port 80
    • A host can have two connections with a web server by using different ports

[Figure: a host and a web server, each with separate TCP and UDP port ranges running from 0 to 2^16-1. The host uses TCP ports 4567 and 4568; the web server listens on TCP port 80.]

Page 11: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Sockets – gateway between the app layer and the transport layer

• A process sends/receives messages to/from its socket
• A socket is analogous to a door
  – The sending process shoves the message out the door
  – The sending process relies on the transport infrastructure on the other side of the door, which brings the message to the socket at the receiving process

[Figure: two hosts (or servers) connected through the Internet. On each, a process talks through a socket to TCP with its buffers and variables. The process is controlled by the app developer; TCP is controlled by the OS.]

Page 12: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

TCP Sockets

• An application accesses TCP and UDP through sockets.
• TCP is connection based, so one host must be listening and the other must be connecting (calling)
• The basic steps for a TCP listener
  – Define the socket variable as a TCP socket
  – Bind the socket to a port (the bind function)
    • If some other application is listening on this port, or was listening recently (within about 120 sec), this function will fail.
    • The application must check that this command succeeds.
  – Listen on this port (the listen function), then wait for a caller (the accept function)
  – When the other host connects, the accept function completes and data can be sent or received.
  – Close the socket
• Basic steps for a TCP caller
  – Define the socket variable as a TCP socket
    • No port is given; the OS assigns whichever port is available, so the application normally has no control over the port
  – Connect
  – Send data
  – Close the socket
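
The listener and caller steps can be sketched with Python's standard `socket` module. This is an illustration under assumptions, not the Project 1 code: both ends run on the loopback address in one program, the listener binds to port 0 so the OS picks a free port, and the names `run_listener` and `demo` are made up for the example.

```python
import socket
import threading


def run_listener(listener: socket.socket, received: list) -> None:
    """Accept one caller, read until it closes the connection, store the bytes."""
    conn, _addr = listener.accept()  # blocks until a caller connects
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the caller closed its socket
                break
            chunks.append(data)
    received.append(b"".join(chunks))


def demo(message: bytes = b"hello over TCP") -> bytes:
    received = []
    # Listener: define a TCP socket, bind, listen.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # raises OSError if the bind fails
        listener.listen(1)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=run_listener, args=(listener, received))
        worker.start()
        # Caller: define a TCP socket, connect, send, close.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as caller:
            caller.connect(("127.0.0.1", port))  # the OS picks the caller's port
            caller.sendall(message)
        worker.join(timeout=5)
    return received[0]


if __name__ == "__main__":
    print(demo())
```

In Python a failed bind shows up as an `OSError`, which is how the "check that this command succeeds" step looks in this language.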

Page 13: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

UDP Sockets

• UDP is connectionless.
  – A host sends a packet when it wants.
  – There is no concept of one host connecting to another.
  – There is only the concept of one host sending a packet and the other host receiving the packet. Either host can send or receive.
• Steps to send and then receive a UDP message
  – Define the socket as a UDP socket
  – Bind the socket to a port
    • If this port is in use, bind will fail
  – Send the message
  – Wait for a message
    • There are two ways to wait for messages: blocking or non-blocking
    • A blocking function will wait for a message to arrive. It might wait forever.
    • A non-blocking function will return immediately, but if no message was waiting in the transport layer, then no message is returned
    • The select function allows a timeout to be set, so the function will wait until a message arrives or the timeout elapses.
  – Close the socket
• Steps to receive a UDP message and then respond
  – Define the socket as a UDP socket
  – Bind the socket to a port
    • If this port is in use, bind will fail
  – Wait for a message
  – Send the response
  – Close the socket
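
The "wait with a timeout" option can be sketched with `select` from Python's standard library. The example is a sketch under assumptions: both sockets live in one program on the loopback address, port 0 lets the OS choose free ports, and the helper names are hypothetical.

```python
import select
import socket


def wait_for_message(sock: socket.socket, timeout_s: float):
    """Return (data, sender), or None if nothing arrives within timeout_s seconds."""
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None  # timeout elapsed with no message waiting
    return sock.recvfrom(2048)


def demo():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as a, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as b:
        a.bind(("127.0.0.1", 0))  # raises OSError if the port is in use
        b.bind(("127.0.0.1", 0))
        nothing = wait_for_message(b, 0.1)  # nobody has sent yet: times out
        a.sendto(b"hello over UDP", b.getsockname())  # no connection needed
        got = wait_for_message(b, 2.0)
        return nothing, (got[0] if got else None)


if __name__ == "__main__":
    print(demo())
```

The first wait returns nothing because no packet has been sent; the second returns the datagram, without either host ever "connecting" to the other.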

Page 14: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Project 1

• In this project, messages will be sent over TCP and UDP.
• The project description is currently at
  – http://www.eecis.udel.edu/~bohacek/Classes/CPEG419_2005/Proj1/project1_part1.htm
• All the required information should be online.
• This project can be completed by cutting and pasting from the web site. But try to understand the steps.
• Let me know if there are typos.

Due 9/16

Page 15: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

The Protocol Stack

• The network layer routes packets (datagrams) through the network

• The network layer gets packets from the transport layer or from the link layer.

• Depending on the destination address, the network layer will give the packet to the transport protocol or to a specific link layer to send on a specific link

• The network layer also provides fragmenting of a large packet into chunks suitable for the link layer


Page 16: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

The Protocol Stack

• The link layer moves packets (frames) between two hosts
• However, the link layer may provide a wide range of services including
  – Media access control
  – Error detection / correction
  – Routing over layer 2 networks
  – Reliability (where the network layer is informed if the transmission fails)

Page 17: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

The Protocol Stack

• The physical layer moves packets (frames) between two connected hosts
• This requires putting the bits onto a physical medium and decoding them from the medium.
• In this course we mostly neglect the physical layer and assume that it works correctly (each layer always assumes that the other layers work correctly)
• But the performance of a protocol at one layer often depends on the other layers.
  – One approach to this is cross-layer design

Page 18: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

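Encapsulation

[Figure: a message travels from the source, through a switch and a router, to the destination.
 – Source: the application creates message M; the transport layer adds header Ht (segment: Ht M); the network layer adds Hn (datagram: Hn Ht M); the link layer adds Hl (frame: Hl Hn Ht M); the physical layer sends the bits.
 – Switch: implements the link and physical layers; it handles the frame Hl Hn Ht M.
 – Router: implements the network, link, and physical layers; it removes the link header, processes the datagram Hn Ht M, and adds a new link header for the next link.
 – Destination: each layer removes its header in turn (Hl Hn Ht M, then Hn Ht M, then Ht M) until the application receives M.]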
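
Encapsulation can be mimicked in a few lines. This is only a toy sketch: the header contents are made-up placeholder bytes, real headers have structured fields, and the function names are hypothetical.

```python
def encapsulate(message: bytes, ht: bytes, hn: bytes, hl: bytes) -> bytes:
    """Wrap an application message the way the source host does."""
    segment = ht + message   # transport layer prepends Ht
    datagram = hn + segment  # network layer prepends Hn
    frame = hl + datagram    # link layer prepends Hl
    return frame


def decapsulate(frame: bytes, ht_len: int, hn_len: int, hl_len: int) -> bytes:
    """Strip the headers in the reverse order, as the destination host does."""
    datagram = frame[hl_len:]    # link layer removes Hl
    segment = datagram[hn_len:]  # network layer removes Hn
    return segment[ht_len:]      # transport layer removes Ht


if __name__ == "__main__":
    frame = encapsulate(b"M", ht=b"[Ht]", hn=b"[Hn]", hl=b"[Hl]")
    print(frame)                        # b'[Hl][Hn][Ht]M'
    print(decapsulate(frame, 4, 4, 4))  # b'M'
```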

Page 19: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Chapter 2

The Application Layer

Page 20: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Goals of this Chapter

• To understand how common application protocols work
  – Web (HTTP)
  – Email (SMTP)
  – FTP
  – DNS
  – P2P
  – IM
• To understand the design alternatives for applications
  – A network application runs on many hosts; it is a distributed application
  – This chapter discusses several designs of distributed applications

Page 21: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Road Map

• Application basics
• Web
• Email
• FTP
• DNS
• P2P
  – Graph theory
  – State diagrams
  – P2P design
• IM

Page 23: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Creating a network app

Write programs that
  – run on (different) end systems
  – communicate over the network
  – e.g., web server software communicates with browser software

No need to write software for network-core devices
  – Network-core devices do not run user applications
  – Keeping applications on end systems allows for rapid app development and propagation

[Figure: end systems each run the full stack: application, transport, network, data link, physical]

Page 24: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

An App-layer protocol defines

• Types of messages exchanged
  – e.g., request, response
• Message syntax:
  – what fields are in messages and how the fields are delineated
• Message semantics
  – meaning of the information in the fields
• Rules for when and how processes send and respond to messages

Public-domain protocols:
• defined in RFCs
• allow for interoperability
• e.g., HTTP, SMTP

Proprietary protocols:
• e.g., Skype

Page 25: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Ports

• An application is identified by the host's IP address, the transport protocol, and the port
  – E.g., a web server has a particular IP address and listens with TCP on port 80.
  – A web browser on a host will connect to the web server and request a file. The browser is identified by the host's IP address and a TCP port.

[Figure: a host and a web server, each with separate TCP and UDP port ranges running from 0 to 2^16-1. The host uses TCP ports 4567 and 4568; the web server listens on TCP port 80.]

Page 26: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

What transport service does an app need?

Data reliability
• some apps (e.g., audio) can tolerate some loss
• other apps (e.g., file transfer, telnet) require 100% reliable data transfer

Timing
• some apps (e.g., Internet telephony, interactive games) require low delay to be "effective"

Throughput
• some apps (e.g., multimedia) require a minimum amount of throughput to be "useful" (i.e., in order for the user to gain utility)
• other apps ("elastic apps") make use of whatever throughput they get

Security
• Encryption, data integrity, …

Page 27: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Transport service requirements of common apps

Application            | Data loss     | Throughput                                  | Time sensitive
file transfer          | no loss       | elastic                                     | no
e-mail                 | no loss       | elastic                                     | no
Web documents          | no loss       | somewhat elastic                            | not really
real-time audio/video  | loss-tolerant | audio: 5 kbps-1 Mbps; video: 10 kbps-5 Mbps | yes, 100's msec
stored audio/video     | loss-tolerant | same as above                               | yes, few secs
interactive games      | loss-tolerant | few kbps up                                 | yes, 100's msec
instant messaging      | no loss       | elastic                                     | yes and no

Page 28: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Internet transport protocols services

TCP service:
• connection-oriented: setup required between client and server processes
• reliable transport between sending and receiving process
• flow control: sender won't overwhelm receiver
• congestion control: throttle sender when network is overloaded
• does not provide: timing, minimum throughput guarantees, security

UDP service:
• unreliable data transfer between sending and receiving process
• does not provide: reliability, flow control, congestion control, timing, throughput guarantee, or security
• does not require connection set-up
• packets can be sent at any rate desired (but this might cause considerable congestion)

Page 29: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Internet apps: application, transport protocols

Application            | Application layer protocol            | Underlying transport protocol
e-mail                 | SMTP [RFC 2821]                       | TCP
remote terminal access | Telnet [RFC 854]                      | TCP
Web                    | HTTP [RFC 2616]                       | TCP
file transfer          | FTP [RFC 959]                         | TCP
streaming multimedia   | HTTP (e.g., YouTube), RTP [RFC 1889]  | TCP or UDP
Internet telephony     | SIP, RTP, proprietary (e.g., Skype)   | typically UDP

Page 30: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Road Map

• Application basics
• Web
• Email
• FTP
• DNS
• P2P
  – Graph theory
  – State diagrams
  – P2P design
• IM

Page 31: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Web and HTTP

• A Web page consists of objects
• An object can be an HTML file, JPEG image, Java applet, audio file, …
• A Web page consists of a base HTML file which includes several referenced objects
• The browser first requests the base file
• The base file specifies text and the URLs of objects
• The browser requests these objects, wherever they are (not always on the same server)
• HTTP is used to request the base file and all the other files
• Note that HTTP can be used for other applications besides the web
• Each object is addressable by a URL
• Example URL: www.someschool.edu/someDept/pic.gif
  – host name: www.someschool.edu
  – path name: /someDept/pic.gif

Page 32: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

HTTP overview

HTTP: hypertext transfer protocol

• The Web's application layer protocol
• client/server model
  – client: browser that requests, receives, and "displays" Web objects
  – server: Web server sends objects in response to requests

[Figure: a PC running Explorer and a Mac running Navigator each send HTTP requests to a server running the Apache Web server, which returns HTTP responses.]

Page 33: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

HTTP overview (continued)

Uses TCP:
• client initiates a TCP connection (creates a socket) to the server, port 80
• server accepts the TCP connection from the client
• HTTP messages (application-layer protocol messages) are exchanged between the browser (HTTP client) and the Web server (HTTP server)
• TCP connection is closed

HTTP is "stateless"
• server maintains no information about past client requests

Aside: protocols that maintain "state" are complex!
• past history (state) must be maintained
• if the server/client crashes, their views of "state" may be inconsistent and must be reconciled

Page 34: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

HTTP connections

Nonpersistent HTTP
• At most one object is sent over a TCP connection.

Persistent HTTP
• Multiple objects can be sent over a single TCP connection between client and server.

Page 35: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Nonpersistent HTTP

Suppose the user enters the URL www.someSchool.edu/someDepartment/home.index (contains text and references to 10 jpeg images). In time order:

1a. HTTP client initiates a TCP connection to the HTTP server (process) at www.someSchool.edu on port 80

1b. HTTP server at host www.someSchool.edu, waiting for a TCP connection at port 80, "accepts" the connection, notifying the client

2. HTTP client sends an HTTP request message (containing the URL) into the TCP connection socket. The message indicates that the client wants object someDepartment/home.index

3. HTTP server receives the request message, forms a response message containing the requested object, and sends the message into its socket

4. HTTP server closes the TCP connection.

5. HTTP client receives the response message containing the html file and displays the html. Parsing the html file, it finds 10 referenced jpeg objects

6. Steps 1-5 are repeated for each of the 10 jpeg objects

Page 36: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Non-Persistent HTTP: Response time

Definition of RTT: time for a small packet to travel from client to server and back.

Response time:
• one RTT to initiate the TCP connection
• one RTT for the HTTP request and the first few bytes of the HTTP response to return
• file transmission time

total = 2 RTT + transmit time

[Figure: client and server timelines. The client initiates the TCP connection (one RTT), requests the file (one RTT), and the file is received after the time to transmit the file.]

Page 37: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Persistent HTTP

• Nonpersistent HTTP issues:
  – requires 2 RTTs per object
  – OS overhead for each TCP connection
  – browsers often open parallel TCP connections to fetch referenced objects
• Persistent HTTP
  – server leaves the connection open after sending the response
  – subsequent HTTP messages between the same client/server are sent over the open connection
  – client sends requests as soon as it encounters a referenced object
  – as little as one RTT for all the referenced objects
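
The response-time bookkeeping for the two modes can be compared with a small calculation. The sketch below takes total = 2 RTT + transmit time per object for serial nonpersistent HTTP, and the best case of one extra RTT for all referenced objects for persistent HTTP. The RTT and transmit times in the usage section are made-up numbers, and the function names are hypothetical.

```python
def nonpersistent_time(rtt: float, transmit_times: list) -> float:
    """Serial nonpersistent HTTP: every object costs 2 RTTs plus its transmit time."""
    return sum(2 * rtt + t for t in transmit_times)


def persistent_best_case_time(rtt: float, base_transmit: float,
                              ref_transmit_times: list) -> float:
    """Persistent HTTP, best case: 2 RTTs plus the base file, then a single RTT
    covers the requests for all referenced objects, plus their transmit times."""
    return (2 * rtt + base_transmit) + (rtt + sum(ref_transmit_times))


if __name__ == "__main__":
    rtt = 0.1                 # seconds (assumed)
    base = 0.01               # transmit time of the base HTML file (assumed)
    jpegs = [0.02] * 10       # ten referenced images (assumed transmit times)
    print(nonpersistent_time(rtt, [base] + jpegs))
    print(persistent_best_case_time(rtt, base, jpegs))
```

With these assumed numbers the RTT terms dominate, which is why removing the per-object connection setup helps so much.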

Page 38: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

HTTP request message

• Two types of HTTP messages: request, response
• HTTP request message:
  – ASCII (human-readable format)

    GET /somedir/page.html HTTP/1.1        <- request line (GET, POST, HEAD commands)
    Host: www.someschool.edu               <- header lines
    User-agent: Mozilla/4.0
    Connection: close
    Accept-language: fr
    (extra carriage return, line feed)     <- carriage return, line feed indicates end of message
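
The request format can be made concrete by assembling the bytes by hand. This is a minimal sketch; `build_get_request` is a hypothetical helper, and the host and path are the ones from the slide.

```python
from typing import Optional


def build_get_request(host: str, path: str,
                      extra_headers: Optional[dict] = None) -> bytes:
    """Assemble an HTTP/1.1 GET request as raw bytes."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]  # request line, then headers
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CR LF; one extra CR LF marks the end of the message.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    request = build_get_request(
        "www.someschool.edu", "/somedir/page.html", {"Connection": "close"}
    )
    print(request.decode("ascii"))
```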

Page 39: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

HTTP request message: general format

Page 40: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

HTTP response message

    HTTP/1.1 200 OK                                <- status line (protocol, status code, status phrase)
    Connection: close                              <- header lines
    Date: Thu, 06 Aug 1998 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998 …...
    Content-Length: 6821
    Content-Type: text/html

    data data data data data ...                   <- data, e.g., requested HTML file

Page 41: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

HTTP response status codes

The status code appears in the first line of the server->client response message.

A few sample codes:

200 OK
  – request succeeded, requested object later in this message

301 Moved Permanently
  – requested object moved, new location specified later in this message (Location:)

400 Bad Request
  – request message not understood by server

404 Not Found
  – requested document not found on this server

505 HTTP Version Not Supported

Page 42: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Trying out HTTP (client side) for yourself

1. Telnet to your favorite Web server:

    telnet cis.poly.edu 80

   This opens a TCP connection to port 80 (the default HTTP server port) at cis.poly.edu. Anything typed in is sent to port 80 at cis.poly.edu.

2. Type in a GET HTTP request:

    GET /~ross/ HTTP/1.1
    Host: cis.poly.edu

   By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to the HTTP server.

3. Look at the response message sent by the HTTP server!
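
The same experiment can be scripted with a raw TCP socket instead of telnet. This is a sketch under assumptions: `http_get` is a hypothetical helper, and it adds a `Connection: close` header (not part of the slide's minimal request) so that the server closes the connection once the response is complete.

```python
import socket


def http_get(host: str, path: str = "/", port: int = 80,
             timeout_s: float = 5.0) -> bytes:
    """Send a minimal GET request over TCP and return the raw response bytes."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"  # assumption: ask the server to close when done
        "\r\n"                   # blank line ends the request
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Example (needs network access; the server from the slide may no longer exist):
#   print(http_get("cis.poly.edu", "/~ross/").decode("latin-1")[:300])
```

The first line of whatever comes back is the status line.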

Page 43: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Wireshark (ethereal)

• Wireshark captures all packets that pass through the host's interface
• To run Wireshark, libpcap (Linux) or WinPcap (Windows) must be installed. It comes with the Wireshark package
• Then, run Wireshark
• Select Capture
• Find the active interface
  – E.g., not the generic dialup, VPN, or packet scheduler entries, but the wireless one (the one with an IP address)
  – Then select Prepare
  – Let's watch TCP packets on port 80
    • Next to Capture Filter, enter: tcp port 80
  – Select "update in real time" and "autoscroll"
  – Might need to enable or disable "capture in promiscuous mode"
  – Press Start
  – Press Close
• Load the www.eecis.udel.edu page in a browser
• Press Stop in Wireshark
• Find the HTTP request to 128.4.40.10
  – Right click and select Follow TCP Stream

Page 44: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Web caches (proxy server)

Goal: reduce network utilization by satisfying the client request without involving the origin server

• user sets browser: Web accesses via cache
• browser sends all HTTP requests to the cache
  – object in cache: cache returns the object
  – else: cache requests the object from the origin server, then returns the object to the client

[Figure: two clients send HTTP requests to a proxy server and receive HTTP responses from it; the proxy server in turn exchanges HTTP requests and responses with two origin servers.]

Page 45: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

More about Web caching

• cache acts as both client and server
• typically the cache is installed by an ISP (university, company, residential ISP)

Why Web caching?
• reduce response time for client requests
• reduce traffic on an institution's access link.
• Internet dense with caches: enables "poor" content providers to effectively deliver content (but so does P2P file sharing)

Page 46: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Caching example

Assumptions
• average object size = 100,000 bits
• avg. request rate from the institution's browsers to origin servers = 15/sec
• delay from the institutional router to any origin server and back to the router = 2 sec

Consequences
• utilization on LAN = 15%
• utilization on access link = 100%
• total delay = Internet delay + access delay + LAN delay
  = 2 sec + minutes + milliseconds

[Figure: origin servers on the public Internet; a 1.5 Mbps access link connects to the institutional network, which has a 10 Mbps LAN and an institutional cache.]

Page 47: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Caching example (cont)

Possible solution
• increase the bandwidth of the access link to, say, 10 Mbps

Consequence
• utilization on LAN = 15%
• utilization on access link = 15%
• total delay = Internet delay + access delay + LAN delay
  = 2 sec + msecs + msecs
• often a costly upgrade

[Figure: origin servers on the public Internet; a 10 Mbps access link connects to the institutional network, which has a 10 Mbps LAN and an institutional cache.]

Page 48: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Caching example (cont)

Possible solution: install a cache
• suppose the hit rate is 0.4

Consequence
• 40% of requests will be satisfied almost immediately
• 60% of requests are satisfied by the origin server
• utilization of the access link is reduced to 60%, resulting in negligible delays (say 10 msec)
• total avg delay = Internet delay + access delay + LAN delay
  = .6*(2.01) secs + .4*milliseconds < 1.4 secs

[Figure: origin servers on the public Internet; a 1.5 Mbps access link connects to the institutional network, which has a 10 Mbps LAN and an institutional cache.]
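
The arithmetic in the caching example can be reproduced directly from the stated assumptions. The sketch uses the slides' numbers (100,000-bit objects, 15 requests/sec, 10 Mbps LAN, 1.5 Mbps access link, hit rate 0.4, 2.01 sec per miss); the 10 msec delay for a cache hit is an assumed stand-in for the slide's "milliseconds", and the function names are hypothetical.

```python
def utilization(requests_per_s: float, object_bits: float, link_bps: float) -> float:
    """Fraction of the link capacity used by the offered traffic."""
    return requests_per_s * object_bits / link_bps


def average_delay(hit_rate: float, miss_delay_s: float, hit_delay_s: float) -> float:
    """Average delay when a fraction `hit_rate` of requests is served by the cache."""
    return (1 - hit_rate) * miss_delay_s + hit_rate * hit_delay_s


if __name__ == "__main__":
    rate, size = 15, 100_000
    print("LAN:", utilization(rate, size, 10e6))
    print("access link, no cache:", utilization(rate, size, 1.5e6))
    # Only the misses (60% of requests) cross the access link.
    print("access link, with cache:", utilization(0.6 * rate, size, 1.5e6))
    print("avg delay with cache:", average_delay(0.4, 2.01, 0.01))
```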

Page 49: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Conditional GET

• Goal: don't send the object if the cache has an up-to-date cached version
• cache: specify the date of the cached copy in the HTTP request
    If-modified-since: <date>
• server: the response contains no object if the cached copy is up-to-date:
    HTTP/1.0 304 Not Modified

[Figure: message exchange between cache and server.
 – Object not modified: the cache sends an HTTP request with If-modified-since: <date>; the server replies HTTP/1.0 304 Not Modified.
 – Object modified: the cache sends an HTTP request with If-modified-since: <date>; the server replies HTTP/1.0 200 OK followed by <data>.]
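
The server's side of a conditional GET comes down to a date comparison. The following is a simplified sketch, not how any particular server implements it: `conditional_get_status` is a hypothetical function, and the header value is assumed to be an HTTP date in GMT.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def conditional_get_status(last_modified: datetime,
                           if_modified_since: Optional[str]) -> int:
    """Return 304 if the cached copy is still up to date, otherwise 200."""
    if if_modified_since is None:
        return 200  # ordinary GET: always send the object
    cached = parsedate_to_datetime(if_modified_since)
    if cached.tzinfo is None:  # assumption: treat a zone-less date as GMT
        cached = cached.replace(tzinfo=timezone.utc)
    return 304 if last_modified <= cached else 200


if __name__ == "__main__":
    modified = datetime(1998, 6, 22, 9, 23, 24, tzinfo=timezone.utc)
    header = format_datetime(modified, usegmt=True)
    print(header, "->", conditional_get_status(modified, header))
```

A 304 reply carries no object, so only the headers cross the network when the cached copy is still good.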

Page 50: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Road Map

• Application basics
• Web
• FTP
• Email
• DNS
• P2P
  – Graph theory
  – State diagrams
  – P2P design
• IM

Page 51: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

FTP: the file transfer protocol

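• transfer file to/from remote host
• client/server model
  – client: side that initiates the transfer (either to or from the remote host)
  – server: remote host
• ftp: RFC 959
• ftp server: listens on port 21

[Figure: the user at a host works through the FTP user interface and FTP client, which has access to the local file system; files are transferred between the FTP client and the FTP server, which has access to the remote file system.]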

Page 52: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

FTP is weird: separate control and data connections

• FTP client contacts the FTP server at port 21; TCP is the transport protocol
• client is authorized over the control connection
  – This is done in "clear text" (i.e., unencrypted)
  – So if someone is sniffing packets, your password might be learned.
  – Sniffing packets is difficult on ethernet, encrypted wifi, and DSL, but is possible on cable modems
• client browses the remote directory by sending commands over the control connection.
• Data is transferred over different connections. Two approaches:
  – Active
  – Passive
• Active
  – The client opens a TCP socket on some port (port number > 1024)
  – The client sends the server the port
  – The server connects to the client's port, where the server's source port is 20
• Active mode is a problem for firewalls
  – If my desktop is not a server, it should not receive any requests for connections.
  – But FTP servers will make such requests

[Figure: FTP client and FTP server joined by a TCP control connection (server port 21) and a TCP data connection (server port 20).]

Page 53: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

FTP Passive mode

• When a file is to be transferred, the server opens a port (number > 1024 and not 20)
• The server sends this port number over the command connection
• The client connects to the server on this port.
• Drawback of passive mode
  – Some enterprises (companies) like to control which applications are used
    • E.g., web browsing is ok, but Skype is not
  – One way to do this is to block outgoing connections based on the port.
  – However, this will cause FTP to fail, unless the device that blocks connections is smart

[Figure: FTP client and FTP server joined by a TCP control connection (server port 21) and a TCP data connection (high port on the server).]

Page 54: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Road Map

• Application basics
• Web
• FTP
• Email
• DNS
• P2P
  – Graph theory
  – State diagrams
  – P2P design
• IM

Page 55: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Email Protocol Design

• Basic assumption: weak user agents and strong mail servers
  – The user wants to send the mail and leave
  – The user wants to get the mail
  – The user may come and go whenever (e.g., roaming laptop)
  – It should be possible to send mail to a user even if the two users are never online at the same time.
  – We conclude that there must be a middle man / mail server.
• Servers are not that strong: the protocol must be as robust as possible to servers being offline
  – No single server. Why?
    • Single point of failure
    • The server would have to be too big (congestion)
  – We conclude that there should be many mail servers
• Two types of hosts
  – Users
  – Mail servers
• Each user has a mail box in its mail server
  – Users retrieve mail from their mail server at their convenience
• Users give mail to their mail servers to deliver the mail
• Mail servers communicate with
  – The users that have mail boxes in the server
  – Other mail servers

[Figure: user agent, mail server, mail server, user agent, connected in a chain]

Page 57: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Email Protocol Design

• Two types of hosts
  – Users
  – Mail servers
• Each user has a mail box in its mail server
  – Users retrieve mail from their mail server at their convenience
• Users give mail to their mail servers to deliver the mail
• Mail servers communicate with
  – The users that have mail boxes in the server
  – Other mail servers

[Figure: user agent, mail server, mail server, user agent, connected in a chain, with the following steps]

1. The user composes mail and sends it to its mail server (or a mail server that will send mail for it). Protocol: SMTP
2. The mail server finds the destination mail server and attempts to send the mail. Protocol: SMTP
3. The destination user requests emails from its mailbox. Protocols: POP3, IMAP, …
4. The destination server gives the mails to the user. Protocols: POP3, IMAP, …

Page 58: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Electronic Mail: Details

Three major components:
• user agents
• mail servers
• simple mail transfer protocol: SMTP

User Agent
• a.k.a. “mail reader”
• composing, editing, reading mail messages
• e.g., Eudora, Outlook, elm, Mozilla Thunderbird
• Puts outgoing messages on the server (with SMTP)
• Gets incoming messages from the server

[Diagram: user agents connect to mail servers; each mail server holds user mailboxes and an outgoing message queue; mail servers exchange mail with each other using SMTP]

Page 59: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Electronic Mail: mail servers

Mail Servers
• mailbox contains incoming messages for user
• message queue of outgoing (to be sent) mail messages
• SMTP protocol between mail servers to send email messages
  – client: sending mail server
  – “server”: receiving mail server
• Reliable: makes several delivery attempts and provides notification if delivery fails

[Diagram: user agents, mail servers, and the SMTP connections between them]

Page 60: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Electronic Mail: SMTP [RFC 2821]

• uses TCP to reliably transfer email messages from client to server, port 25
• direct transfer: sending server to receiving server
• Emails are pushed to servers (but users pull messages from servers)
• three phases of transfer
  – handshaking (greeting)
  – transfer of messages
  – closure
• command/response interaction
  – commands: ASCII text
  – response: status code and phrase
• messages must be in 7-bit ASCII
  – Makes it difficult to send attachments

Page 61: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Scenario: Alice sends message to Bob

1) Alice uses UA to compose message and “to” [email protected]

2) Alice’s UA sends message to her mail server; message placed in message queue

3) Client side of SMTP opens TCP connection with Bob’s mail server

4) SMTP client sends Alice’s message over the TCP connection

5) Bob’s mail server places the message in Bob’s mailbox

6) Bob invokes his user agent to read message

[Diagram: Alice's user agent → Alice's mail server → Bob's mail server → Bob's user agent, with arrows labeled by steps 1 to 6 above]

Page 62: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Sample SMTP interaction

S: 220 hamburger.edu
C: HELO crepes.fr
S: 250 Hello crepes.fr, pleased to meet you
C: MAIL FROM: <[email protected]>
S: 250 [email protected]... Sender ok
C: RCPT TO: <[email protected]>
S: 250 [email protected] ... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: Do you like ketchup?
C: How about pickles?
C: .
S: 250 Message accepted for delivery
C: QUIT
S: 221 hamburger.edu closing connection

Client connects to server
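The command/response exchange above can be sketched in a few lines of Python. This is a minimal illustration, not a real SMTP client: the helper names `smtp_client_commands`, `parse_reply`, and `is_positive` are hypothetical, the example.org addresses are placeholders, and nothing is sent over the network.

```python
def smtp_client_commands(client_host, sender, recipient, body_lines):
    """Return the client-side (C:) lines of a minimal SMTP session."""
    commands = [
        f"HELO {client_host}",          # handshaking (greeting)
        f"MAIL FROM: <{sender}>",       # start of message transfer
        f"RCPT TO: <{recipient}>",
        "DATA",
    ]
    commands.extend(body_lines)         # message text, one line at a time
    commands.append(".")                # a line holding only "." ends the message
    commands.append("QUIT")             # closure
    return commands


def parse_reply(line):
    """Split a server reply such as '250 Sender ok' into (status code, phrase)."""
    code, _, phrase = line.partition(" ")
    return int(code), phrase


def is_positive(code):
    """2xx replies mean success; 3xx means the server is waiting for more input."""
    return 200 <= code < 400


session = smtp_client_commands(
    "crepes.fr", "alice@example.org", "bob@example.org",
    ["Do you like ketchup?", "How about pickles?"],
)
for line in session:
    print("C:", line)
```

A real client would wait for a positive reply after each command before sending the next one; Python's standard `smtplib` module handles that exchange.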

Page 63: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Try SMTP interaction for yourself:

• telnet mail.eecis.udel.edu 25
• see 220 reply from server
• enter HELO, MAIL FROM, RCPT TO, DATA, QUIT commands

The above lets you send email without using an email client (reader)

Page 64: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

SMTP: final words

• SMTP uses persistent connections

• SMTP requires message (header & body) to be in 7-bit ASCII

• SMTP server uses CRLF.CRLF to determine end of message

Comparison with HTTP:

• HTTP: pull
• SMTP: push

• both have ASCII command/response interaction, status codes

• HTTP: each object encapsulated in its own response msg

• SMTP: multiple objects sent in multipart msg

Page 65: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Mail access

• POP3 and IMAP are two protocols for accessing mail on a mail server

• Web-based mail works differently: the web mail server and the mail server can be integrated, so that there is no user agent.

Page 66: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Mail access protocols

• SMTP: delivery/storage to receiver’s server

• Mail access protocol: retrieval from server

– POP: Post Office Protocol [RFC 1939]

• authorization (agent <-->server) and download

– IMAP: Internet Message Access Protocol [RFC 1730]

• more features (more complex)

• manipulation of stored msgs on server

– HTTP: gmail, Hotmail, Yahoo! Mail, etc.

[Diagram: sender's user agent → (SMTP) → sender's mail server → (SMTP) → receiver's mail server → (access protocol) → receiver's user agent]

Page 67: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Road Map

• Application basics
• Web
• FTP
• Email
• DNS
• P2P
  – Graph theory
  – State diagrams
  – P2P design
• IM

Page 68: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS – domain name system

• Translates names, like www.yahoo.com, into IP addresses.
• Services provided by DNS
  – Name to address translation
  – Host aliasing
    • A host relay1.west-coast.yahoo.com could have two aliases, yahoo.com and www.yahoo.com.
    • In this case, the canonical hostname is relay1.west-coast.yahoo.com.
    • DNS can provide canonical host names
  – Mail server aliasing
    • When a mail server wants to send a mail to [email protected], it does not send it to www.udel.edu, but to mail.udel.edu. Or maybe udmail.udel.edu. DNS can translate udel.edu to mail.udel.edu
  – (Cheap) Load distribution
    • Cnn.com has several servers.
    • DNS will respond with all addresses,
    • but it will reorder the addresses every time.
    • If each client uses the first address listed, then different clients will use different servers.
    • Content distribution networks (CDN) are better ways of load balancing
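The cheap load distribution described above can be sketched as follows. This is only an illustration under stated assumptions: `make_rotating_responder` is a hypothetical name, the addresses come from the 192.0.2.0/24 documentation range, and a simple rotation stands in for whatever reordering a real DNS server applies.

```python
from collections import Counter, deque


def make_rotating_responder(addresses):
    """Return a function that answers with all addresses, rotated on every call."""
    ring = deque(addresses)

    def respond():
        answer = list(ring)   # the full list is always returned
        ring.rotate(-1)       # the next reply starts with the next address
        return answer

    return respond


respond = make_rotating_responder(["192.0.2.1", "192.0.2.2", "192.0.2.3"])

# Nine clients each take the first address in the reply they receive.
first_choices = Counter(respond()[0] for _ in range(9))
print(first_choices)  # each of the three servers is chosen three times
```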

Page 69: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS - structure

• Centralized DNS?
  – Pros: somewhat easy to maintain (there is only one system). But it must always be online
  – Cons
    • Single point of failure (the system crashes -> no web)
    • Congestion
    • Server would be far from some hosts (delay)
    • Database would be too big
    • Registering bohacek-pc1.pc.udel.edu would require interacting with the big server

• Instead, a distributed hierarchical database is used.

Page 70: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Domain Hierarchy

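[Tree diagram of the domain hierarchy]
Top-level domains: edu, com, gov, mil, org, net, uk, in
Second-level domains: UD, upenn, yahoo, cisco, whitehouse, nasa, navy, arpa, acm
Under UD: eecis, art
Hosts: bohacek_pc1, bohacek_pc10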

Page 71: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Administrative Zones in the Domain Hierarchy

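[Tree diagram of the domain hierarchy with administrative zones marked]
Root: root
Top-level domains: edu, com, gov, mil, org, net, uk, in
Second-level domains: UD, upenn, yahoo, cisco, whitehouse, nasa, navy, arpa, acm
Under UD: eecis, art
Hosts: bohacek_pc1, bohacek_pc10

• It is possible that .edu and .gov are administered together
• Note that UD administers art but not eecis
• Sometimes a single service provider will administer the domains for a large number of .coms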

Page 72: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Root servers

• Each layer in the hierarchy knows about the domain names below it
• The highest level is the root.
  – There are 13 root “servers”
  – Each of these servers is actually several servers, and some of the machines that make up a server are distributed geographically.

13 root name servers worldwide
a Verisign, Dulles, VA
b USC-ISI, Marina del Rey, CA
c Cogent, Herndon, VA (also LA)
d U Maryland, College Park, MD
e NASA, Mt View, CA
f Internet Software C., Palo Alto, CA (and 36 other locations)
g US DoD, Vienna, VA
h ARL, Aberdeen, MD
i Autonomica, Stockholm (plus 28 other locations)
j Verisign (21 locations)
k RIPE, London (also 16 other locations)
l ICANN, Los Angeles, CA
m WIDE, Tokyo (also Seoul, Paris, SF)

Page 73: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

overview

• Top-level domain (TLD) servers
  – There are around 200 top-level domains
  – These include com, edu, mil, info, in, uk, cn, …
  – Currently,
    • Network Solutions maintains the TLD servers for com
    • Educause maintains the TLD servers for edu
  – The root servers know the addresses and names of all top-level servers
• Organizations have a hierarchy of DNS servers

Page 74: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS queries

• Suppose a host needs the IP address of bohacek-pc1.eecis.udel.edu
• If this IP address is not in cache, the host asks its local DNS server.
• If the DNS server does not have it in cache, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
• If not, it checks if it has the IP address of the DNS server of udel.edu in cache
• If not, it checks if it has the IP address of the top-level domain server of edu in cache
• If not, it asks the root server for the IP address of the edu TLD server
  – The DNS server always has the IP address of the root servers
• The local DNS server asks the edu TLD server for the address of bohacek-pc1.eecis.udel.edu.
• The TLD server does not know that IP address, but instead gives the IP address of the DNS server for UD
• The local DNS server asks the UD DNS server for the address of bohacek-pc1.eecis.udel.edu.
• The UD DNS server does not know the address, but instead returns the address of the eecis DNS server.
• The local DNS server asks the eecis DNS server for the address of bohacek-pc1.eecis.udel.edu
• The eecis DNS server replies with the address.
• This address is returned to the host that originally asked the question.
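The lookup sequence above can be simulated with a small Python sketch. Everything here is an assumption made for illustration: the server labels, the `SERVERS` table, the `LocalResolver` class, and the 192.0.2.x addresses are hypothetical, and TTL expiry of cached entries is ignored.

```python
# Hypothetical zone data: each server maps a name suffix to a referral or an answer.
SERVERS = {
    "root": {"edu": ("referral", "edu-tld")},
    "edu-tld": {"udel.edu": ("referral", "ud-dns")},
    "ud-dns": {"eecis.udel.edu": ("referral", "eecis-dns")},
    "eecis-dns": {
        "bohacek-pc1.eecis.udel.edu": ("answer", "192.0.2.17"),
        "www.eecis.udel.edu": ("answer", "192.0.2.80"),
    },
}


def suffixes(name):
    """Yield name suffixes from most specific to least specific."""
    labels = name.split(".")
    for i in range(len(labels)):
        yield ".".join(labels[i:])


def ask(server, name):
    """Return (suffix, entry) for the most specific suffix the server knows."""
    for suffix in suffixes(name):
        if suffix in SERVERS[server]:
            return suffix, SERVERS[server][suffix]
    raise LookupError(f"{server} knows nothing about {name}")


class LocalResolver:
    def __init__(self):
        self.answers = {}   # cache: host name -> IP address
        self.servers = {}   # cache: domain suffix -> DNS server for that domain

    def resolve(self, name):
        """Return (ip, list of servers queried)."""
        if name in self.answers:
            return self.answers[name], []
        # Start from the most specific cached DNS server, else from the root.
        server = next(
            (self.servers[s] for s in suffixes(name) if s in self.servers), "root"
        )
        asked = []
        while True:
            asked.append(server)
            suffix, (kind, value) = ask(server, name)
            if kind == "answer":
                self.answers[name] = value
                return value, asked
            self.servers[suffix] = value   # remember the referral
            server = value


resolver = LocalResolver()
print(resolver.resolve("bohacek-pc1.eecis.udel.edu"))  # walks root, TLD, UD, eecis
print(resolver.resolve("www.eecis.udel.edu"))          # only the eecis server is asked
```

The second lookup shows why the root servers are rarely needed: the referral to the eecis DNS server is already in the local server's cache.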

Page 75: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS Queries

Browser wants to show www.eecis.udel.edu

Browser needs the IP address of www.eecis.udel.edu

Host asks local DNS server for IP address of www.eecis.udel.edu

• Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
• If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
• If not, it checks if it has the IP address of the DNS server of udel.edu in cache
• If not, it checks if it has the IP address of the top-level domain server of edu in cache
• If not, …

Local DNS server to root server (IP addresses are always known): What is the IP address of www.eecis.udel.edu?
Root server does not know. Instead, it responds with a DNS server that might, specifically, the TLD server for .edu

Local DNS server to TLD server for .edu: What is the IP address of www.eecis.udel.edu?
TLD server does not know. Instead, it replies with the name and IP address of the UD DNS server

Local DNS server to UD DNS server: What is the IP address of www.eecis.udel.edu?
UD DNS server does not know. Instead, it replies with the name and IP address of the eecis DNS server.

Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
eecis DNS server: It is 128.4.1.2

Local DNS server to host: It is 128.4.1.2

Page 76: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS Queries

Browser wants to show www.eecis.udel.edu

Browser needs the IP address of www.eecis.udel.edu

Host asks local DNS server for IP address of www.eecis.udel.edu

1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If not, it checks if it has the IP address of the DNS server of udel.edu in cache
4. If not, it checks if it has the IP address of the top-level domain server of edu in cache
5. If not, …

Local DNS server to root server (IP addresses are always known): What is the IP address of www.eecis.udel.edu?
Root server does not know. Instead, it responds with the name and address of a server that might, specifically, the TLD server for .edu

Local DNS server to TLD server for .edu: What is the IP address of www.eecis.udel.edu?
TLD server does not know. Instead, it replies with the name and IP address of the UD DNS server

Local DNS server to UD DNS server: What is the IP address of www.eecis.udel.edu?
UD DNS server does not know. Instead, it replies with the name and IP address of the eecis DNS server.

Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
eecis DNS server: It is 128.4.1.2

Local DNS server to host: It is 128.4.1.2

Page 77: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS Queries

Browser wants to show www.eecis.udel.edu

Browser needs the IP address of www.eecis.udel.edu

Host asks local DNS server for IP address of www.eecis.udel.edu

1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If yes, then it returns it

Local DNS server to host: It is 128.4.1.2

Page 78: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS Queries

Browser wants to show www.eecis.udel.edu

Browser needs the IP address of www.eecis.udel.edu

Host asks local DNS server for IP address of www.eecis.udel.edu

1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If yes, it queries that server…

Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
eecis DNS server: It is 128.4.1.2

Local DNS server to host: It is 128.4.1.2

Page 79: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS Queries

Browser wants to show www.eecis.udel.edu

Browser needs the IP address of www.eecis.udel.edu

Host asks local DNS server for IP address of www.eecis.udel.edu

1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If not, it checks if it has the IP address of the DNS server of udel.edu in cache
4. If not, it checks if it has the IP address of the top-level domain server of edu in cache
5. If so, it queries that server…

Local DNS server to TLD server for .edu: What is the IP address of www.eecis.udel.edu?
TLD server does not know. Instead, it replies with the name and IP address of the UD DNS server

Local DNS server to UD DNS server: What is the IP address of www.eecis.udel.edu?
UD DNS server does not know. Instead, it replies with the name and IP address of the eecis DNS server.

Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
eecis DNS server: It is 128.4.1.2

Local DNS server to host: It is 128.4.1.2

Page 80: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Attack on DNS

• Hackers have tried to bring down DNS by performing a DoS on the root servers
  – DoS – denial of service. The attacker sends more packets or requests for service than the server can accommodate, resulting in poor service for normal users.
• This failed because
  – There are many very strong root servers, and they have firewalls/filters
    • The attacks used ICMP ping packets
    • DNS requests would have been more effective
  – It is rare that a root server is needed
    • Usually only the TLD server is needed
    • Or only a domain server.

Page 81: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS Message Details

• DNS Record
  – (Name, Value, Type, Class, TTL)
  – If Type = A
    • Name is the host name
    • Value is the IP address of the host
  – If Type = NS
    • Name is a domain name
    • Value is the name of the DNS server for the domain
    • E.g., (udel.edu, dns.udel.edu, NS, …, …)
  – If Type = MX
    • Name is the domain name
    • Value is the name of the mail server for the domain
    • E.g., (udel.edu, mail.udel.edu, MX, …, …)
  – If Type = CNAME
    • Name is a host name
    • Value is the canonical name of the host
    • E.g., (www.yahoo.com, relay-east.yahoo.com, CNAME, …, …)
  – TTL is the time to live, so DNS caches can be timed out
  – Class is no longer used; it is set to IN
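The record types above can be illustrated with a small Python sketch. The names `Record`, `RECORDS`, `lookup`, `mail_server_address`, and `resolve_address` are hypothetical, and the IP addresses and TTL values are made-up placeholders (192.0.2.x is a documentation range); only the (Name, Value, Type, Class, TTL) layout comes from the slide.

```python
from collections import namedtuple

# (Name, Value, Type, Class, TTL), as on the slide.
Record = namedtuple("Record", ["name", "value", "type", "cls", "ttl"])

RECORDS = [
    Record("udel.edu", "dns.udel.edu", "NS", "IN", 3600),
    Record("udel.edu", "mail.udel.edu", "MX", "IN", 3600),
    Record("mail.udel.edu", "192.0.2.25", "A", "IN", 3600),
    Record("www.yahoo.com", "relay-east.yahoo.com", "CNAME", "IN", 3600),
    Record("relay-east.yahoo.com", "192.0.2.80", "A", "IN", 3600),
]


def lookup(name, rtype):
    """Return the values of all records matching a (Name, Type) query."""
    name = name.lower()   # DNS names are case-insensitive
    return [r.value for r in RECORDS if r.name == name and r.type == rtype]


def mail_server_address(domain):
    """Mail server aliasing: MX gives the server name, A gives its address."""
    for host in lookup(domain, "MX"):
        addresses = lookup(host, "A")
        if addresses:
            return host, addresses[0]
    return None


def resolve_address(name):
    """Host aliasing: follow a CNAME (if any) to the canonical name, then use A."""
    canonical = lookup(name, "CNAME")
    if canonical:
        name = canonical[0]
    return lookup(name, "A")


print(mail_server_address("UDel.edu"))
print(resolve_address("www.yahoo.com"))
```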

Page 82: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS query

• (Name, Type, Class)

• (UDel.edu, MX, IN)
  – Please provide the name of UD's mail server

• (mail.UDel.edu, A, IN)
  – Please provide the IP address for mail.udel.edu

Page 83: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS message format

DNS protocol : query and reply messages, both with same message format

msg header
• identification: 16-bit # for query; reply to query uses same #
• flags:
  – query or reply
  – recursion desired
  – recursion available
  – reply is authoritative

Page 84: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS message format

• Name, type fields for a query
• RRs in response to query
• records for authoritative servers
• additional “helpful” info that may be used

Page 85: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS Queries

Browser wants to show www.eecis.udel.edu

Browser needs the IP address of www.eecis.udel.edu

1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If not, it checks if it has the IP address of the DNS server of udel.edu in cache
4. If not, it checks if it has the IP address of the top-level domain server of edu in cache
5. If not, …

Root server (IP addresses are always known)

TLD server for .edu

UD DNS server

eecis DNS server

1 00 0

(www.eecis.udel.edu, A,IN)

1 00 0

(www.eecis.udel.edu, A,IN)

0 00 4

(edu, edu-serverA.net, NS, IN)

(edu-serverA.net, 124.5.1.1, A, IN)

(edu, edu-serverB.net, NS, IN)

(edu-serverB.net, 124.5.1.2, A, IN)

1 00 0

(www.eecis.udel.edu, A,IN)

0 00 4

(udel.edu, dns2.udel.edu, NS, IN)

(dns2.udel.edu, 128.178.2.2, A, IN)

(udel.edu, dns1.udel.edu, NS, IN)

(dns1.udel.edu, 128.173.2.1, A, IN)

1 00 0

(www.eecis.udel.edu, A,IN)

0 00 4

(eecis.udel.edu, dns1.eecis.udel.edu, NS, IN)

(dns1.eecis.udel.edu, 128.4.1.10, A, IN)

(eecis.udel.edu, dns2.udel.edu, NS, IN)

(dns2.udel.edu, 128.4.1.11, A, IN)

1 00 0

(www.eecis.udel.edu, A,IN)

0 10 0

(www.eecis.udel.edu, 128.4.1.1, A, IN)

0 10 0

(www.eecis.udel.edu, 128.4.1.1, A, IN)

Page 86: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS Flags

• The DNS header has a query ID
  – The query has this ID and the server copies this ID into the response
• Flag indicating query or answer
• Flag indicating whether the server is the authoritative server for the answer (as opposed to a cached answer)
• A recursion-desired flag indicating that the host/server would like the server to perform the recursive DNS lookup
• A recursion-available flag indicating whether the server is available to do the recursive lookup
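As a concrete illustration of the ID and flags above, the following Python sketch packs and unpacks the 12-byte DNS header. The bit positions follow the standard header layout in RFC 1035, which goes beyond what the slide states; the function names `build_header` and `parse_header` are hypothetical.

```python
import struct

# Flag bits within the 16-bit flags field (layout from RFC 1035).
QR = 0x8000  # 0 = query, 1 = reply
AA = 0x0400  # reply is authoritative
RD = 0x0100  # recursion desired
RA = 0x0080  # recursion available


def build_header(query_id, flags, qdcount=1, ancount=0, nscount=0, arcount=0):
    """Pack the ID, flags, and the four section counts in network byte order."""
    return struct.pack("!HHHHHH", query_id, flags, qdcount, ancount, nscount, arcount)


def parse_header(data):
    """Unpack the first 12 bytes of a DNS message into a dictionary."""
    query_id, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", data[:12])
    return {
        "id": query_id,
        "is_reply": bool(flags & QR),
        "authoritative": bool(flags & AA),
        "recursion_desired": bool(flags & RD),
        "recursion_available": bool(flags & RA),
        "counts": (qd, an, ns, ar),
    }


query = build_header(0x1234, RD)                          # a query asking for recursion
reply = build_header(0x1234, QR | AA | RD | RA, 1, 1)     # an authoritative reply, same ID
print(parse_header(query))
print(parse_header(reply))
```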

Page 87: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

DNS

• Which transport protocol should DNS use?

• Why?

Page 88: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Peer-to-peer file sharing

• About P2P
  – 30% or more of the bytes transferred on the Internet are from P2P users
  – Skype is a very successful P2P VoIP app
    • Written in 3-4 months
• Topics covered
  – Scalability
  – P2P querying
  – Case study
    • BitTorrent
    • Skype

Page 89: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Pure P2P architecture
• Review: What is the difference between peer-to-peer and client/server?
  – Each host acts as both a server and a client.
• no always-on server
• arbitrary end systems directly communicate
• peers are intermittently connected and may change IP addresses
• Pure P2P has significant drawbacks.
• P2P-like systems with some central servers are more common.
• But in all cases, the file transfer is between peers, not from servers.

[Diagram: end systems communicating peer-to-peer]

Page 90: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

File Distribution: Server-Client vs P2P

Question : How much time to distribute file from one server to N peers?

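[Diagram: a server with upload bandwidth us and N peers, peer i having upload bandwidth ui and download bandwidth di, connected by a network with abundant bandwidth; the file has size F]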

us: server upload bandwidth

ui: peer i upload bandwidth

di: peer i download bandwidth

Page 91: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

File distribution time: server-client

[Diagram: a server with upload bandwidth us and N clients, client i having upload bandwidth ui and download bandwidth di, connected by a network with abundant bandwidth; the file has size F]

• Time for the server to send a copy to a single client
  – F/us
• Time for the server to send N copies:
  – NF/us
• Client i takes F/di time to download

Time to distribute F to N clients using the client/server approach:

$d_{cs} = \max\{\, NF/u_s,\ F/\min_i d_i \,\}$

This increases linearly in N (for large N).

Page 92: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

File distribution time: P2P

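[Diagram: a server with upload bandwidth us and N peers, peer i having upload bandwidth ui and download bandwidth di, connected by a network with abundant bandwidth; the file has size F]

• server must send one copy:
  – F/us time
• client i download time
  – F/di
• Total data to be downloaded
  – NF
• fastest possible transfer rate: us + Σ ui (the sum is over all peers i)

$d_{P2P} = \max\{\, F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \,\}$

Can you make a schedule for the download that takes this amount of time?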

Page 93: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Server-client vs. P2P: example

[Plot: minimum distribution time (0 to 3.5) versus N (0 to 35), with one curve for P2P and one for client-server]

Client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us

Conclusion: P2P systems are scalable. But the load is distributed to all users, so P2P users have more load than clients in the client-server model.
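The comparison can be reproduced numerically with a short Python sketch. It assumes the two distribution-time formulas from the preceding slides, with the P2P bound using the server rate plus the sum of all peer upload rates; the function names are hypothetical, and units are chosen so that u = 1 and F = 1 (so F/u = 1 hour), us = 10u, and dmin = us.

```python
def client_server_time(n, file_size, server_up, d_min):
    """d_cs = max(N*F/us, F/min(di))."""
    return max(n * file_size / server_up, file_size / d_min)


def p2p_time(n, file_size, server_up, d_min, peer_uploads):
    """d_P2P = max(F/us, F/min(di), N*F/(us + sum of ui))."""
    total_upload = server_up + sum(peer_uploads)
    return max(file_size / server_up, file_size / d_min, n * file_size / total_upload)


u = 1.0          # client upload rate
F = 1.0          # file size, chosen so that F/u = 1 hour
us = 10 * u      # server upload rate
d_min = us       # slowest download rate (the slide only requires dmin >= us)

for n in (5, 10, 20, 30):
    cs = client_server_time(n, F, us, d_min)
    p2p = p2p_time(n, F, us, d_min, [u] * n)
    print(f"N={n:2d}  client-server={cs:.2f} h  P2P={p2p:.2f} h")
# For N = 30 this gives 3.00 hours for client-server and 0.75 hours for P2P.
```

The client-server time grows linearly with N, while the P2P time stays below F/u = 1 hour because every added peer also adds upload capacity.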

Page 94: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Peer-to-peer Querying

• The file is transferred from a peer, but how do we find the file?
• Options
  – Centralized directory
    • Napster
    • Single point of failure
    • Performance bottleneck
    • Target for the RIAA
    • Always up
    • Easy to find
    • Easy protocol
  – Query flooding
    • Gnutella
    • Hosts find other hosts and form a network of neighbors (overlay network)
    • Search for a file (covered next)
    • How to set up the network (bootstrap)?
      – Have a central list of peers
      – Have distributed lists of peers
      – Search out a peer by scanning, like in the project
    • Once the file is found,
      – the host could respond directly to the searcher,
      – or it could send the response along the reverse path.
      – In the latter case, the peers along the way would learn where the file is located (cache) and could answer more quickly the next time the search is performed. But then we must worry about stale information.

Page 95: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

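Query Flooding State Diagram

• User request for file
  – Set AttemptCounter = 0
  – Send out a request for file to all neighbors
  – Set Timer = 0 and go to the wait state
• In wait, when Timer > TO
  – AttemptCounter++
  – If AttemptCounter > MaxAttempts: inform user that query failed
  – else: send out the request to all neighbors again and set Timer = 0
• In wait, when a reply arrives from a peer
  – Inform user of file location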

Page 96: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Listening Peer (state diagram)

• wait
• Request arrives
  – Get request ID
  – If the peer has seen the request before: return to wait
  – Otherwise, check for file in directory
    • File is in local dir: send response to peer that requested file
    • Otherwise: send request to all neighbors
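The two state diagrams can be exercised with a small synchronous simulation in Python. This is a sketch under simplifying assumptions: the overlay graph and file placement are invented, `flood_query` is a hypothetical name, one shared `seen` set stands in for each peer remembering request IDs, and the search stops at the first hit, whereas real flooding keeps propagating.

```python
def flood_query(graph, files, start, wanted):
    """Flood a request hop by hop; return (peer with file, path, messages sent).

    graph: peer -> list of neighbor peers (the overlay network)
    files: peer -> set of file names in that peer's local directory
    """
    seen = {start}                    # peers that have already seen this request ID
    frontier = [(start, [start])]     # peers that forward the request in this round
    messages = 0
    while frontier:
        next_frontier = []
        for peer, path in frontier:
            for neighbor in graph[peer]:
                messages += 1         # one request message per overlay link
                if neighbor in seen:
                    continue          # seen the request before: drop it
                seen.add(neighbor)
                if wanted in files.get(neighbor, set()):
                    # The response would travel back along the reversed path.
                    return neighbor, path + [neighbor], messages
                next_frontier.append((neighbor, path + [neighbor]))
        frontier = next_frontier
    return None, [], messages


overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
local_files = {"E": {"song.mp3"}}

print(flood_query(overlay, local_files, "A", "song.mp3"))
```

Counting messages makes the cost of flooding visible: even in this five-peer overlay, nine request messages are sent before the file is found three hops away.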

Page 97: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Expanding ring

Page 98: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Hierarchical peer-to-peer network
• KaZaA
• Not all peers are equal – super peers (?)
  – Super peers (group leaders) have higher bit-rate connections, are more stable, etc.
• Peers connect to group leaders
• The group leaders keep a list of the files shared by all their child peers.
• Group leaders connect to a small number of other group leaders
• A child host will ask its group leader for a file. If the group leader does not know where it is, it will flood the network of group leaders. The response from other group leaders follows the reverse path to the asking group leader (so other leaders can cache the response)
• A file is identified with an ID computed by a hash function (e.g., MD5) that takes a string (the file) and produces an ID that is unique in practice. A small change in the file causes a large change in the ID. It should be infeasible to construct two files that have the same ID. The ID is a fingerprint.
• Since files are ID-ed, multiple copies of the same file can be found and these copies can be downloaded from multiple hosts in parallel.
• Note that if you are downloading while others are uploading, the uploading slows down the downloading, but only a little bit.
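The fingerprint property can be seen with Python's standard `hashlib` module. This is a minimal sketch: `file_id` and `differing_hex_digits` are hypothetical helper names, and MD5 is used only because the slide names it. Practical collisions for MD5 are known, so it should not be relied on where forgery matters.

```python
import hashlib


def file_id(data: bytes) -> str:
    """Return the MD5 digest of the file contents as 32 hex digits."""
    return hashlib.md5(data).hexdigest()


def differing_hex_digits(a: str, b: str) -> int:
    """Count the positions at which two equal-length IDs differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


original = file_id(b"hello world")
changed = file_id(b"hello worle")    # only the last byte of the file differs

print(original)
print(changed)
print(differing_hex_digits(original, changed), "of 32 hex digits differ")
```

Two peers holding byte-identical copies compute the same ID, which is what allows one file to be fetched from several hosts in parallel.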

Page 99: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

BitTorrent

• Centralized P2P
  – A centralized server, or tracker, tracks the clients involved in the P2P transfer
  – This is similar to Napster
  – Companies that host these sites get sued and are attacked by DDoS
• Components of BitTorrent System
  – Torrent Files
  – Trackers
  – Seeders
  – Peers

Page 100: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Torrent File

• Required to download
• Can be found on web sites or sent by email
• Contains information about the file and the tracker
  – Announce: the URL of the tracker
  – Creation date
  – Info
    • Length of file
    • Name of file
    • Length of each piece (except for the last)
    • Pieces – the 20B SHA-1 value of each piece
      – Note, the number of pieces can be determined by counting the number of bytes in the pieces field and dividing by 20
• If the download contains multiple files, then a single torrent file will contain information about all files.
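The pieces field can be illustrated with a short Python sketch. It only models the concatenated 20-byte SHA-1 values described above, not the full torrent file encoding; the function names and the tiny 1024-byte piece length are hypothetical choices for the example.

```python
import hashlib

SHA1_LEN = 20  # bytes per SHA-1 digest


def build_pieces_field(data: bytes, piece_length: int) -> bytes:
    """Concatenate the SHA-1 digest of every piece (the last piece may be shorter)."""
    digests = []
    for offset in range(0, len(data), piece_length):
        digests.append(hashlib.sha1(data[offset:offset + piece_length]).digest())
    return b"".join(digests)


def number_of_pieces(pieces_field: bytes) -> int:
    """Number of pieces = bytes in the pieces field divided by 20."""
    return len(pieces_field) // SHA1_LEN


def piece_is_valid(index: int, piece: bytes, pieces_field: bytes) -> bool:
    """Compare a downloaded piece against the digest listed in the torrent file."""
    expected = pieces_field[index * SHA1_LEN:(index + 1) * SHA1_LEN]
    return hashlib.sha1(piece).digest() == expected


data = bytes(range(256)) * 10                  # a 2560-byte stand-in for the file
pieces = build_pieces_field(data, 1024)        # two full pieces and one short piece

print(number_of_pieces(pieces))
print(piece_is_valid(2, data[2048:], pieces))
print(piece_is_valid(0, b"fake chunk", pieces))
```

A piece whose digest does not match is discarded after download, which is how a client detects corrupted or fake pieces.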

Page 101: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

Tracker

• The client makes an HTTP GET request to the tracker specifying the SHA-1 hash of the file to be downloaded
  – The request also includes the number of bytes downloaded and the number uploaded
  – If the client does not upload enough, the tracker might not provide a reply
• The reply contains
  – The time when the tracker information should be refreshed (usually 30 minutes)
  – A list of the peers
    • IP address and port (usually 6881)
    • Peer ID

Page 102: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

File distribution with BitTorrent

tracker: tracks peers participating in torrent

[Diagram: a peer obtains the list of peers from the tracker, then trades chunks with other peers]

Page 103: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

BitTorrent (1)

• file divided into 256KB chunks.
• peer joining torrent:
  – has no chunks, but will accumulate them over time
  – registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
• while downloading, peer uploads chunks to other peers.
• peers may come and go
• once peer has entire file, it may (selfishly) leave or (altruistically) remain

Page 104: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

BitTorrent (2)

Pulling Chunks
• at any given time, different peers have different subsets of file chunks
• periodically, a peer (Alice) asks each neighbor for the list of chunks that they have.
• Alice sends requests for her missing chunks
  – rarest first
  – So rarest chunks are spread, and chunks are uniformly common

Sending Chunks: tit-for-tat
• Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
  – re-evaluate top 4 every 10 secs
• every 30 secs: randomly select another peer, start sending chunks
  – newly chosen peer may join top 4
  – “optimistically unchoke”
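Both policies can be sketched in a few lines of Python. This is an illustration only: the function names, peer names, chunk sets, and rates are hypothetical, ties in rarity are broken by chunk number for determinism, and the 10-second and 30-second timers are left out.

```python
import random
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks Alice is missing so that the rarest come first.

    neighbor_chunks: neighbor name -> set of chunk numbers that neighbor holds
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)
    missing = [c for c in counts if c not in my_chunks]
    return sorted(missing, key=lambda c: (counts[c], c))


def choose_unchoked(download_rates, top_n=4, rng=random):
    """Return (top senders to reciprocate with, one optimistically unchoked peer).

    download_rates: neighbor name -> rate at which that neighbor sends to Alice
    """
    top = sorted(download_rates, key=download_rates.get, reverse=True)[:top_n]
    others = [p for p in download_rates if p not in top]
    optimistic = rng.choice(others) if others else None
    return top, optimistic


neighbors = {"bob": {0, 1, 2}, "carol": {1, 2}, "dave": {2, 3}}
print(rarest_first({2}, neighbors))   # chunks 0 and 3 (one copy each) before chunk 1

rates = {"p1": 50, "p2": 10, "p3": 40, "p4": 30, "p5": 20, "p6": 5}
print(choose_unchoked(rates))
```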

Page 105: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

BitTorrent: Tit-for-tat

(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers

With higher upload rate, can find better trading partners & get file faster!

Page 106: CPEG 419 Review of Lecture 1 and continuation of chapter 1 Introduction to Data Networking

BitTorrent Pros/Cons

• Centralized server
• Slow to get the transfer started
  – Web transfers start much faster and will achieve a sustained rate
• Peers must upload
  – Some peers might not be in a position to upload (e.g., mobile phone)
• Chunks can be corrupted
  – HBO distributed fake chunks
  – Since the SHA-1 hash does not match what is given in the Torrent File, the chunk is dropped after it is downloaded
    • This wastes bandwidth and can greatly increase download time