SPL/2010
Protocols
Application Level Protocol Design
● atomic units used by protocol: "messages"
● encoding
● reusable, protocol-independent TCP server
● LinePrinting protocol implementation
Protocol Definition
● a set of rules governing the communication details between two parties (processes)
● protocols come in different forms and levels:
● protocols for exchanging bits across a wire
● protocols governing the administration of supercomputers
● application-level protocols: define the interaction between computer applications
Protocol Communication Rules
● syntax: how we phrase the information we exchange.
● semantics: which actions/responses follow from the information received.
● synchronization: whose turn it is to speak (given the semantics defined above).
Protocols Skeleton
● all protocols follow a simple skeleton.
● exchange information using messages, which define the syntax.
● difference between protocols: syntax used for messages, and semantics of protocol.
Protocol Initialization (hand-shake)
● communication begins when one party sends an initiation message to the other party.
● synchronization - each party sends one message in a round robin fashion.
TCP 3-Way Handshake
● establishes a TCP socket connection (tear-down uses a separate FIN/ACK exchange)
● lets two computers attempting to communicate negotiate a TCP socket connection
● both ends can initiate and negotiate separate TCP socket connections at the same time
TCP 3-Way Handshake (SYN,SYN-ACK,ACK)
● A sends a SYNchronize packet to B
● B receives A's SYN
● B sends a SYNchronize-ACKnowledgement
● A receives B's SYN-ACK
● A sends ACKnowledge
● B receives ACK.
● TCP socket connection is ESTABLISHED.
HTTP (Hyper Text Transfer Protocol)
● exchanging special text files over the network.
● brief (not complete) protocol description:
● synchronization: client initiates the connection, sends a single request, and receives a reply from the server.
● syntax: text based, see RFC 2616.
● semantics: server either sends to the client the page asked for, or returns an error.
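To make the text-based syntax concrete, here is a minimal sketch of what a request and the first line of a reply look like on the wire. The class and method names (HttpSyntaxSketch, buildGetRequest, parseStatusCode) are illustrative assumptions, and only a tiny part of HTTP/1.1 is covered.

```java
public class HttpSyntaxSketch {
    // Builds the text of a minimal HTTP/1.1 GET request. Lines end with CRLF,
    // and an empty line terminates the header section.
    static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
    }

    // Extracts the numeric status code from a status line such as
    // "HTTP/1.1 404 Not Found".
    static int parseStatusCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        System.out.print(buildGetRequest("example.com", "/index.html"));
        System.out.println(parseStatusCode("HTTP/1.1 200 OK"));        // page found
        System.out.println(parseStatusCode("HTTP/1.1 404 Not Found")); // error reply
    }
}
```

The status code carries the semantics: a 2xx code accompanies the requested page, while 4xx and 5xx codes report an error.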
What next?
● syntax and semantics aspects of protocols.
● assume: synchronization works in round robin, i.e., each party sends one message at a time.
Message Format
● Protocol syntax: message is the atomic unit of data exchanged throughout the protocol.
● message = letter
● concentrate on the delivery mechanism.
Framing
● streaming protocols - TCP
● separate between different messages
● all messages are sent on the same stream, one after the other,
● receiver should distinguish between different messages.
● Solution: message framing - taking the content of the message and encapsulating it in a frame (letter - envelope).
Framing – what is it good for?
● sender and receiver agree on the framing method beforehand
● framing is part of message format/protocol
● enable receiver to discover in a stream of bytes where message starts/ends
Framing – how?
● Simple framing protocol for strings:
● special FRAMING character (e.g., a line break).
● each message is framed by two FRAMING characters at beginning and end.
● message will not contain a FRAMING character
● framing protocol by adding a special tag at start and end.
● message can be framed using <begin> / <end> strings.
● avoid having <begin> / <end> in message body.
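The single-character scheme above can be sketched as follows. This is only an illustration under stated assumptions: the class name CharFraming, the choice of a line break as FRAMING, and the policy of rejecting messages that contain the framing character are not taken from the slides' code.

```java
import java.util.ArrayList;
import java.util.List;

public class CharFraming {
    static final char FRAMING = '\n'; // assumed framing character

    // Wraps a message in a FRAMING character at the beginning and at the end.
    static String frame(String message) {
        if (message.indexOf(FRAMING) >= 0) {
            throw new IllegalArgumentException("message must not contain the framing character");
        }
        return FRAMING + message + FRAMING;
    }

    // Splits a stream made of complete frames back into messages.
    static List<String> unframe(String stream) {
        List<String> messages = new ArrayList<>();
        int i = 0;
        while (i < stream.length()) {
            if (stream.charAt(i) != FRAMING) {
                throw new IllegalArgumentException("expected frame start at index " + i);
            }
            int end = stream.indexOf(FRAMING, i + 1);
            if (end < 0) {
                break; // incomplete frame: a real receiver would wait for more data
            }
            messages.add(stream.substring(i + 1, end));
            i = end + 1;
        }
        return messages;
    }

    public static void main(String[] args) {
        String stream = frame("hello") + frame("world");
        System.out.println(unframe(stream)); // [hello, world]
    }
}
```

The same structure applies to tag-based framing: replace the single character with the begin and end strings and search for them instead.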
Framing – how?
● framing protocol by employing a variable length message format
● special tag to mark start of a frame
● message contains information on message's length
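A minimal sketch of such a variable-length format, assuming a frame layout of one start-tag byte, a 4-byte big-endian length, and then the payload. The tag value 0x7E, the class name, and the method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LengthPrefixedFraming {
    static final byte START_TAG = 0x7E; // hypothetical tag marking the start of a frame

    // Frame layout: tag (1 byte), payload length (4 bytes, big-endian), payload.
    static byte[] frame(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + payload.length);
        buf.put(START_TAG).putInt(payload.length).put(payload);
        return buf.array();
    }

    // Splits a buffer that holds only complete frames into payloads.
    static List<byte[]> unframe(byte[] stream) {
        List<byte[]> messages = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(stream);
        while (buf.hasRemaining()) {
            if (buf.get() != START_TAG) {
                throw new IllegalArgumentException("missing start tag");
            }
            byte[] payload = new byte[buf.getInt()];
            buf.get(payload); // the length tells us exactly how many bytes to take
            messages.add(payload);
        }
        return messages;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        byte[] first = frame("hello".getBytes(StandardCharsets.UTF_8));
        byte[] second = frame(new byte[] {START_TAG, 0, 10}); // payload may hold any bytes
        stream.write(first, 0, first.length);
        stream.write(second, 0, second.length);
        for (byte[] payload : unframe(stream.toByteArray())) {
            System.out.println(payload.length + " bytes");
        }
    }
}
```

Because the receiver counts bytes rather than searching for a delimiter, the payload is free to contain any byte value, including the tag itself.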
Textual data
● Many protocols exchange data in textual form
● strings of characters in an agreed character encoding (e.g., UTF-8)
● very easy to document/debug - print messages
● Limitation: difficult to send non-textual data.
– how do we send a picture? video? audio file?
Binary Data
● non-textual data is called binary data.
● all data is eventually encoded in "binary" format, as a sequence of bits
● "binary data" = data that cannot be encoded as a readable string of characters?
Binary Data
● Sending binary data in raw binary format in a stream protocol is dangerous.
● may contain any byte sequence, may corrupt framing protocol.
● one remedy: devise a variable-length message format.
Base64 Encoding Binary Data
encode binary data using encoding algorithm
● Base64 encoding - encodes binary data into a string
● Converts every 3-byte sequence of the binary data into 4 ASCII characters.
● used by many "standard" protocols (email to encode file attachments of any type of data).
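As a quick illustration, the sketch below uses the JDK's built-in java.util.Base64 class (available since Java 8) rather than the libraries named in these slides. The class name Base64Sketch and the helper names are assumptions. It prints the input and output sizes so the expansion can be checked directly.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Sketch {
    // Encodes arbitrary bytes into a string of ASCII characters.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Recovers the original bytes from the Base64 text.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Bytes that would be unsafe to send raw in a delimiter-framed text protocol.
        byte[] raw = {0, (byte) 0xFF, 10, 13, 0x7F, (byte) 0x80};
        String text = encode(raw);
        System.out.println(text + " (" + raw.length + " bytes -> " + text.length() + " chars)");
        System.out.println(encode("Man".getBytes(StandardCharsets.US_ASCII))); // TWFu
    }
}
```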
Encoding using Poco
● In C++, Poco library includes module for encoding/decoding byte arrays into/from Base64 encoded ASCII data.
● functionality is modeled as a stream "filter"
● performs encode/decode on all data flowing through the stream
● classes Base64Encoder / Base64Decoder.
Encoding in Java
● iharder library.
● modeled as stream filters (wrappers around Input/Output Java streams).
Encoding binary data
● advantage: any stream of bytes can be "framed" as ASCII data regardless of character encoding used by protocol.
● disadvantage - size of the message increases by about 33% (4 characters for every 3 bytes).
● (we will use UTF-8 encoding scheme)
Protocol and Server Separation
code reuse is one of our design goals!
● generic implementation of server, which handles all communication details
● generic protocol interface:
● handles incoming messages
● implements protocol's semantics
● generates the reply messages.
Protocol-Server Separation: protocol object
● protocol object is in charge of implementing expected behavior of our server:
● What actions should be performed upon the arrival of a request.
● requests may be correlated one to another, meaning protocol should save an appropriate state per client.
Example: authenticated session
● some protocols require user authentication (login),
● only authorized users can perform certain actions.
● protocol is stateful - a client's requests can be served in at least 2 distinct states:
1. authenticated (user has already logged in)
2. non-authenticated (user has not provided login).
● behavior of the protocol object differs according to its state
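A minimal sketch of such a state-dependent protocol object. The message format ("LOGIN <password>"), the hard-coded password, the reply strings, and the class name are all hypothetical, chosen only to show behavior switching on state.

```java
public class AuthenticatedSessionSketch {
    // Per-client state: one protocol object exists per connected client.
    private boolean authenticated = false;

    public String processMessage(String msg) {
        if (msg.startsWith("LOGIN ")) {
            // Hypothetical credential check; a real server would consult a user store.
            authenticated = msg.substring("LOGIN ".length()).equals("secret");
            return authenticated ? "OK logged in" : "ERROR bad credentials";
        }
        if (!authenticated) {
            return "ERROR login required"; // same request, different state, different reply
        }
        return "OK " + msg;
    }

    public static void main(String[] args) {
        AuthenticatedSessionSketch session = new AuthenticatedSessionSketch();
        System.out.println(session.processMessage("PRINT report")); // ERROR login required
        System.out.println(session.processMessage("LOGIN secret")); // OK logged in
        System.out.println(session.processMessage("PRINT report")); // OK PRINT report
    }
}
```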
Protocol and Server Separation
separate different tasks server must perform.
● Accept new connections from new clients.
● Receive new bytes from connected clients.
● Parse incoming bytes from clients into messages ("de-serialization" / "unframing").
● Dispatch message to right method on server side to execute requested operation.
● Send back an answer to a connected client after an action has been executed.
a software architecture that separates tasks into separate interfaces
● The key participants in this architecture are:
● Tokenizer - syntax, tokenizing a stream of data into messages.
● MessagingProtocol – semantics, handling received messages and generating responses.
● implementations of interfaces:
● generic server
● MessageTokenizer
● LinePrintingProtocol
Interfaces
● implement separation between protocol and server. Define:
1. message (can be encoded in various ways: Base64, XML, text).
● Our messages encoded as plain UTF-8 text.
2. framing of messages - delimiters between messages sent in stream.
3. protocol interface which handles each individual message.
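One possible shape for these interfaces is sketched below. The interface names Tokenizer and MessagingProtocol come from the slides; the method names (hasMessage, nextMessage, processMessage) and the helpers fromIterator and serve are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ProtocolInterfacesSketch {
    /** Syntax: turns the raw stream into whole messages (framing). */
    interface Tokenizer {
        boolean hasMessage();
        String nextMessage();
    }

    /** Semantics: handles one message and produces the reply. */
    interface MessagingProtocol {
        String processMessage(String message);
    }

    /** Demo helper: a Tokenizer over messages that are already split. */
    static Tokenizer fromIterator(Iterator<String> messages) {
        return new Tokenizer() {
            @Override public boolean hasMessage() { return messages.hasNext(); }
            @Override public String nextMessage() { return messages.next(); }
        };
    }

    /** The generic part: knows nothing about message syntax or meaning. */
    static List<String> serve(Tokenizer tokenizer, MessagingProtocol protocol) {
        List<String> replies = new ArrayList<>();
        while (tokenizer.hasMessage()) {
            replies.add(protocol.processMessage(tokenizer.nextMessage()));
        }
        return replies;
    }

    public static void main(String[] args) {
        Tokenizer tokenizer = fromIterator(Arrays.asList("a", "b").iterator());
        System.out.println(serve(tokenizer, m -> "echo " + m)); // [echo a, echo b]
    }
}
```

Swapping the framing method or the protocol means supplying a different implementation of one interface, with no change to the serve loop.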
ConnectionHandler
● server accepted new connection from client.
● server creates ConnectionHandler - will handle all incoming messages from this client.
● ConnectionHandler - maintains state of connection for specific client
● Ex: user performs "login" - ConnectionHandler object remembers this in its state
ConnectionHandler - Socket
● ConnectionHandler has access to Socket connecting server to client process.
● TCP server - Socket connection is viewed as a pair of InputStream and OutputStream.
● streams of bytes – client and the server exchange a bunch of bytes.
Tokenizer - in charge of parsing a stream of bytes into a stream of messages
● Tokenizer interface: filter between Socket input stream and protocol
● Protocol accesses the input stream only through the tokenizer.
● instead of "seeing" a stream of bytes, it sees a stream of messages.
● Many libraries model such "filters" on streams as wrappers around a lower-level stream.
● OutputStreamWriter - wraps a byte stream and encodes the characters written to it into bytes using a given character encoding
● BufferedReader - adds a layer of buffering around a non-buffered input stream.
Tokenizer
● splits incoming bytes from the socket into messages.
● For simplicity, we model the Tokenizer as an iterator…
● protocol will see the input stream from the socket as an iterator over messages (instead of an iterator over bytes).
Messaging Protocol
● protocol interface
● wraps together: socket and Tokenizer
● Pass incoming messages to MessagingProtocol - execute action requested by client.
● look at the message and decide on action
● decision may depend on the state
● Once the action is performed - answer back from the MessagingProtocol.
(De)serialization vs. (De)framing
● We use String to pass data from Tokenizer to Protocol, and back from Protocol.
● Serialization/Deserialization = encode/decode parameters to/from Strings
● performed by Protocol
● Tokenizer in charge of deframing (split bytes into messages).
Implementations
Connection Handler
● holds references to:
● TCP socket connected to the client,
● Tokenizer
● an instance of the MessagingProtocol.
Connection Handler
● active object:
● handles one connection to one client for the whole period during which the client is connected
● (from the moment the connection is accepted, until one of the sides decides to close it).
● modeled as a Runnable class.
Connection Handler
● generic, works with any implementation of a messaging protocol.
● assumes data exchanged between client and server is in form of encoded strings
● encoder passed to constructor as an Encoder interface.
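A simplified sketch of such a handler follows. To stay self-contained it departs from the design described above in three labeled ways: it takes a plain InputStream/OutputStream pair (with a real Socket these would come from getInputStream() and getOutputStream()), it fixes UTF-8 and a '\0' delimiter instead of receiving an Encoder and a Tokenizer, and it uses UnaryOperator<String> as a stand-in for the MessagingProtocol. The class name is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

public class SketchConnectionHandler implements Runnable {
    private static final char DELIMITER = '\0'; // assumed framing character
    private final BufferedReader in;
    private final Writer out;
    private final UnaryOperator<String> protocol; // stand-in for MessagingProtocol

    public SketchConnectionHandler(InputStream in, OutputStream out,
                                   UnaryOperator<String> protocol) {
        this.in = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        this.out = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        this.protocol = protocol;
    }

    // Serves one client until its input stream ends.
    @Override
    public void run() {
        try {
            StringBuilder message = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                if (c != DELIMITER) {
                    message.append((char) c);
                    continue;
                }
                // A full message arrived: let the protocol decide on the reply.
                String reply = protocol.apply(message.toString());
                message.setLength(0);
                out.write(reply);
                out.write(DELIMITER);
                out.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the handler only sees streams and a protocol function, the same class can be exercised in memory with ByteArrayInputStream and ByteArrayOutputStream before it is attached to a socket.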
What’s left?
● only need to implement:
● specific framing handler (tokenizer)
● specific protocol we wish to use.
● continue our line printing example…
Message Tokenizer
● we use a framing method based on a single character delimiter.
● assume a stream of messages delimited by a FRAMING character; we will use the character '\0'
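A minimal sketch of a single-character tokenizer modeled as an iterator over messages. The class name DelimiterTokenizer and its constructor are assumptions, and dropping an unterminated fragment at end of stream is a simplification.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class DelimiterTokenizer implements Iterator<String> {
    private final Reader in;
    private final char delimiter;
    private String next;         // look-ahead message, null when nothing is buffered
    private boolean eof = false; // set once the underlying stream has ended

    public DelimiterTokenizer(Reader in, char delimiter) {
        this.in = in;
        this.delimiter = delimiter;
    }

    @Override
    public boolean hasNext() {
        if (next == null && !eof) {
            next = readMessage();
        }
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String message = next;
        next = null;
        return message;
    }

    // Reads characters up to the next delimiter; returns null at end of stream.
    private String readMessage() {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (c == delimiter) {
                    return sb.toString();
                }
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        eof = true;
        return null; // an unterminated trailing fragment is dropped
    }

    public static void main(String[] args) {
        Iterator<String> messages = new DelimiterTokenizer(new StringReader("one\0two\0"), '\0');
        while (messages.hasNext()) {
            System.out.println(messages.next());
        }
    }
}
```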
Termination & Exceptions
● important part is connection termination and exception handling at any moment
● most of the code in low-level input/output and socket manipulation relates to error handling and connection termination.
Line Printing Protocol
● implement a specific protocol on the server side.
● when it receives a message, prints it on the server-side screen, prefixed with a line number.
● the line number is the state of the protocol.
● each client has its own line number: two clients connected at the same time each see their own line count.
● after processing a message, the protocol sends back a message to the client: ": printed" + the date-time value at which the message was processed (on the server side).
● i.e., timestamped acknowledgments.
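The behavior described above can be sketched as follows. The class name, the method name processMessage, and the exact layout of the acknowledgment string (line number, then ": printed at", then the timestamp) are assumptions.

```java
import java.time.LocalDateTime;

public class LinePrintingProtocolSketch {
    // Protocol state: one instance per client, so each client has its own counter.
    private int lineNumber = 0;

    public String processMessage(String msg) {
        lineNumber++;
        System.out.println(lineNumber + ": " + msg); // printed on the server side
        // Acknowledgment carrying the server-side processing time.
        return lineNumber + ": printed at " + LocalDateTime.now();
    }

    public static void main(String[] args) {
        LinePrintingProtocolSketch clientA = new LinePrintingProtocolSketch();
        LinePrintingProtocolSketch clientB = new LinePrintingProtocolSketch();
        System.out.println(clientA.processMessage("first line from A"));
        System.out.println(clientA.processMessage("second line from A"));
        System.out.println(clientB.processMessage("first line from B")); // counter restarts at 1
    }
}
```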
A Client
● before plugging the ConnectionHandler into a running server process:
● code of a compatible TCP client for the protocol we have just described.
● no new ideas - it is similar to the TCP client reviewed in the previous section.
Concurrency Models of TCP Servers
Server quality criteria:
● Scalability: capability to serve a large number of concurrent clients.
● Low accept latency: short wait until a new connection is accepted.
● Low reply latency: short wait for a reply after a message is received.
● High efficiency: uses few resources on the server (RAM, number of threads, CPU usage).
● to model the concurrency behavior of the server,
● define an interface that controls how each connection handler is run concurrently
● Given:
● Encoder
● Tokenizer
● Protocol
● ServerConcurrencyModel
we define the MessagingServer
● To obtain good quality, a TCP server will most often use multiple threads.
● 3 simple concurrency models for servers
● 3 implementations of the ServerConcurrencyModel interface
Server Model 1: Single Thread
● 1 thread for:
● accepting a new client
● handling its requests, by calling the run method of the passive ConnectionHandler object directly.
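A minimal sketch of this model, assuming the concurrency model exposes a single method that receives the connection handler as a Runnable. The class name SingleThreadModel and the method name apply are assumptions.

```java
public class SingleThreadModel {
    // Runs the handler on the calling (accept-loop) thread. The call returns
    // only when the client has been fully served, so no other client can be
    // accepted in the meantime.
    public void apply(Runnable connectionHandler) {
        connectionHandler.run();
    }

    public static void main(String[] args) {
        SingleThreadModel model = new SingleThreadModel();
        for (int client = 1; client <= 3; client++) {
            int id = client;
            model.apply(() -> System.out.println(
                    "client " + id + " served on " + Thread.currentThread().getName()));
        }
    }
}
```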
Single Thread Model: Quality
● no scalability: at any given moment, it can serve at most one client.
● high accept latency: a second client must wait until first client disconnects
● low reply latency: all resources are concentrated on serving one client.
● Good efficiency: server uses exactly the resources needed to serve one client
When is model appropriate?
● time to process a full connection from one client is guaranteed to remain small.
● Example: server provides date and time value on the server machine.
● sends one string to the client then disconnects.
Server Model 2: Thread per Client
● assigns a new thread to each connected client, by wrapping the runnable ConnectionHandler object in a Thread and invoking its 'start' method.
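A minimal sketch of this model. The class name ThreadPerClientModel, the method name apply, and the awaitQuietly helper are assumptions; returning the Thread is a convenience for the demo, not part of the design described in the slides.

```java
public class ThreadPerClientModel {
    // Starts a dedicated thread for the handler and returns immediately,
    // so the accept loop is free to take the next client.
    public Thread apply(Runnable connectionHandler) {
        Thread thread = new Thread(connectionHandler);
        thread.start();
        return thread;
    }

    // Demo helper: waits for a handler thread to finish.
    static void awaitQuietly(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ThreadPerClientModel model = new ThreadPerClientModel();
        Thread[] handlers = new Thread[3];
        for (int client = 0; client < handlers.length; client++) {
            int id = client + 1;
            handlers[client] = model.apply(() -> System.out.println(
                    "client " + id + " served on " + Thread.currentThread().getName()));
        }
        for (Thread handler : handlers) {
            awaitQuietly(handler);
        }
    }
}
```

Note that nothing here bounds the number of threads: every new client costs one more thread and its stack.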
Model Quality: Scalability
● server can serve several concurrent clients, up to the maximum number of threads the process can run.
● RAM of the host is consumed - each thread allocates its own stack.
● as a rough figure, about 500 - 1000 threads can be active in a single process.
● process does not defend itself - it keeps creating new threads, which is dangerous for the host.
Model Quality: Latency
● Low accept latency: time from one accept to the next ~ time to create a new thread,
● which is short compared to the delay between incoming client connections.
● Reply latency: resources of the server are spread among the concurrent connections.
● with a reasonable number of active connections (~hundreds), the load on the server stays relatively low in CPU and RAM.
Model Quality: Efficiency
● Low efficiency: server creates a full thread per connection,
– connection may be bound to Input/Output operations.
– ConnectionHandler thread will be blocked waiting for IO, yet still uses the resources of the thread (RAM and Thread).
● Reactor architecture …
Server Model 3: Constant Number of Threads
● a constant number of threads (10 in this example), managed through Java's Executor interface
● each runnable ConnectionHandler object is added to the task queue of a thread pool executor
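A minimal sketch of this model built on java.util.concurrent. The class name FixedThreadPoolModel, the method name apply, and the shutdownAndWait helper are assumptions; the pool size is a constructor parameter rather than a fixed 10.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FixedThreadPoolModel {
    private final ExecutorService pool;

    public FixedThreadPoolModel(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Queues the handler; it runs as soon as one of the pool threads is free.
    public void apply(Runnable connectionHandler) {
        pool.execute(connectionHandler);
    }

    // Stops accepting new handlers and waits for the queued ones to finish.
    public boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        FixedThreadPoolModel model = new FixedThreadPoolModel(10);
        for (int client = 1; client <= 25; client++) {
            int id = client;
            model.apply(() -> System.out.println(
                    "client " + id + " served on " + Thread.currentThread().getName()));
        }
        model.shutdownAndWait(5);
    }
}
```

With 25 clients and 10 threads, the last 15 handlers wait in the executor's queue until a thread becomes available, which is where the extra latency beyond N connections comes from.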
Model Quality
● avoids server causing host crash when too many clients connect at the same time
● up to N concurrent client connections, server behaves as "thread-per-connection"
● above N, accept latency will grow
● scalability is limited to amount of concurrent connections we believe we can support.