Assignment #3
• 3A: Reliability
  – No flow control or congestion control
  – Just a sliding window
  – You can use the same window value for the receiver and sender windows
• 3B: TCP-like
  – Must implement at least slow start and congestion avoidance
  – You can add more to make your implementation more efficient or fair
• Implementation details
  – You can ignore the demux function
    • It would only be tested in 3B, and we won’t be testing it.
  – When testing 3B, the propagation delay between any two hosts will be 100 ms
Today
• RPC
  – Session management + Language
  – Semantic Challenges
  – Example RPC (Not on exam)
• Wireless
  – Collision detection
Why Should You Care?
• RPC is used to build distributed systems
• Used by … everyone
  – Google
  – Facebook
  – Twitter
  – Yahoo
RPC – Remote Procedure Call
• Procedure calls are a well understood mechanism
  – Transfer control and data on a single computer
• Idea: make distributed programming look the same
  – Have servers export interfaces that are accessible through local APIs
  – Perform the illusion behind the scenes
• 2 Major Components
  – Protocol to manage messages sent between client and server
  – Language and compiler support
    • Packing, unpacking, calling function, returning value
Problem
• Goal: make a call between two servers look like a call between two methods
• What are some key problems?
  – Semantics of the communication
    • APIs, how to cope with failure
  – Data representation
    • (think of parsing information in Assignments #2 & #3)
  – Scope: should the scheme work across
    • Architectures
    • Languages
    • Compilers…?
Stub Functions
• Local stub functions at client and server give the appearance of a local function call
• Client stub
  – marshals parameters -> sends to server -> waits
  – unmarshals results -> returns to client
• Server stub
  – creates socket/ports and accepts connections
  – receives message from client stub -> unmarshals parameters -> calls server function
  – marshals results -> sends results to client stub
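The stub roles above can be sketched in a few lines. This is only an illustration under stated assumptions: JSON stands in for the marshalling format, a direct function call stands in for the network, and the names `server_stub`, `client_stub_add`, and `PROCEDURES` are hypothetical.

```python
import json

def add(a, b):
    """The 'remote' procedure that lives on the server."""
    return a + b

# Hypothetical dispatch table: procedure name -> server function.
PROCEDURES = {"add": add}

def server_stub(request_bytes):
    # Unmarshal parameters, call the server function, marshal the result.
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

def client_stub_add(a, b, transport=server_stub):
    # Marshal parameters, "send" them, wait for the reply, unmarshal it.
    request_bytes = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply_bytes = transport(request_bytes)
    return json.loads(reply_bytes.decode("utf-8"))["result"]

# To the caller this looks like an ordinary local call.
print(client_stub_add(2, 3))
```

In a real system the `transport` argument would be a socket round trip, which is where the new failure modes come from.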
Stub Generation
• Many systems generate stub code from an independent specification: IDL
  – IDL – Interface Description Language
    • describes an interface in a language-neutral way
• Separates logical description of data from
  – Dispatching code
  – Marshalling/unmarshalling code
  – Data wire format
RPC Components
• Stub Compiler
  – Creates stub methods
  – Creates functions for marshalling and unmarshalling
• Dispatcher
  – Demultiplexes programs running on a machine
  – Calls the stub server function
• Protocol
  – At-most-once semantics (or not)
  – Reliability, replay caching, version matching
  – Fragmentation, framing (depending on underlying protocols)
Presentation Formatting
• How to represent data?
• Several questions:
  – Which data types do you want to support?
    • Base types, flat types, complex types
  – How to encode data onto the wire?
  – How to decode the data?
    • Self-describing (tags)
    • Implicit description (the ends know)
• Several answers:
  – Many frameworks do these things automatically
Which data types?
• Basic types
  – Integers, floating point, characters
  – Some issues: endianness (ntohs, htons), character encoding, IEEE 754
• Flat types
  – Strings, structures, arrays
  – Some issues: packing of structures, order, variable length
• Complex types
  – Pointers! Must flatten, or serialize data structures
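A small sketch of the endianness and variable-length issues, using Python's `struct` module. The record layout (16-bit id, 64-bit float, length-prefixed string) and the function names are made up for illustration; the `!` prefix selects network (big-endian) byte order with no padding, which is the order `htons` produces.

```python
import struct

# The same 16-bit value in network order and in little-endian order.
print(struct.pack("!H", 0x1234).hex())  # 1234
print(struct.pack("<H", 0x1234).hex())  # 3412

HEADER = "!HdH"  # hypothetical layout: id, score, name length

def pack_record(record_id, score, name):
    # Variable-length field: send an explicit length, then the bytes.
    name_bytes = name.encode("utf-8")
    return struct.pack(HEADER, record_id, score, len(name_bytes)) + name_bytes

def unpack_record(data):
    record_id, score, name_len = struct.unpack_from(HEADER, data, 0)
    offset = struct.calcsize(HEADER)
    name = data[offset:offset + name_len].decode("utf-8")
    return record_id, score, name

print(unpack_record(pack_record(7, 3.5, "Jane")))
```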
Data Schema
• How to parse the encoded data?
• Two Extremes:
  – Self-describing data: tags
    • Additional information added to message to help in decoding
    • Examples: field name, type, length
  – Implicit: the code at both ends “knows” how to decode the message
    • E.g., your code for assignments 2 & 3 knows what format packets should be in
    • Interoperability depends on a well-defined protocol specification!
    • Very difficult to change
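The two extremes can be compared on one small record. This is a sketch only: the tag numbers, the tag/length/value layout (1-byte tag, 2-byte length), and the fixed 8-byte name field are assumptions chosen for the example.

```python
import struct

TAG_NAME, TAG_CREDITS = 1, 2  # hypothetical tag numbers

def encode_tagged(fields):
    """Self-describing: each field carries its own tag and length."""
    out = b""
    for tag, value in fields:
        out += struct.pack("!BH", tag, len(value)) + value
    return out

def decode_tagged(data):
    fields, offset = {}, 0
    while offset < len(data):
        tag, length = struct.unpack_from("!BH", data, offset)
        offset += 3
        fields[tag] = data[offset:offset + length]
        offset += length
    return fields

def encode_implicit(name, credits):
    """Implicit: both ends agree on an 8-byte name then a 32-bit integer."""
    return struct.pack("!8sI", name.encode("ascii"), credits)

tagged = encode_tagged([(TAG_NAME, b"Jane"), (TAG_CREDITS, struct.pack("!I", 20))])
implicit = encode_implicit("Jane", 20)
print(len(tagged), len(implicit))  # 14 12
print(decode_tagged(tagged)[TAG_NAME])
```

A tagged decoder can skip or ignore fields it does not recognize, which is what makes such formats easier to evolve; the implicit one is smaller but breaks if either end changes the layout.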
Examples of RPC Systems
• SunRPC (now ONC RPC)
  – The first popular system
  – Used by NFS
  – Not popular for the wide area (security, convenience)
• Java RMI
  – Popular with Java
  – Only works among JVMs
• DCE
  – Used in ActiveX and DCOM, CORBA
  – Stronger semantics than SunRPC, much more complex
Can we maintain the same semantics?
• Mostly…
• Why not?
  – New failure modes: nodes, network
• Possible outcomes of failure
  – Procedure did not execute
  – Procedure executed once
  – Procedure executed multiple times
  – Procedure partially executed
• Desired: at-most-once semantics
Implementing at-most-once semantics
• Problem: request message lost
  – Client must retransmit requests when it gets no reply
• Problem: reply message lost
  – Client may retransmit previously executed request
  – OK if operation is idempotent
  – Server must keep “replay cache” to reply to already executed requests
• Problem: server takes too long executing
  – Client will retransmit request already in progress
  – Server must recognize duplicate – could reply “in progress”
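A minimal sketch of the replay cache idea, assuming each request carries a client id and a per-client request id (both hypothetical fields, as is the class name). A retransmitted request is answered from the cache instead of being executed again.

```python
class AtMostOnceServer:
    def __init__(self, handler):
        self.handler = handler
        self.replay_cache = {}  # (client_id, request_id) -> cached reply
        self.executions = 0

    def handle(self, client_id, request_id, args):
        key = (client_id, request_id)
        if key in self.replay_cache:
            # Duplicate of an executed request: resend the old reply.
            return self.replay_cache[key]
        self.executions += 1
        reply = self.handler(*args)
        self.replay_cache[key] = reply
        return reply

balance = {"alice": 0}

def deposit(account, amount):
    # Not idempotent: running it twice would deposit twice.
    balance[account] += amount
    return balance[account]

server = AtMostOnceServer(deposit)
print(server.handle("c1", 1, ("alice", 10)))  # 10
print(server.handle("c1", 1, ("alice", 10)))  # 10 again: retransmission, not re-executed
print(server.executions)                      # 1
```

The sketch keeps the cache in memory and never evicts entries; the crash cases on the following slide are exactly where that falls short.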
Server Crashes
• Problem: server crashes and reply lost
  – Can make replay cache persistent – slow
  – Can hope reboot takes long enough for all clients to fail
• Problem: server crashes during execution
  – Can log enough to restart partial execution – slow and hard
  – Can hope reboot takes long enough for all clients to fail
• Can use “cookies” to inform clients of crashes
  – Server gives client cookie, which is f(time of boot)
  – Client includes cookie with RPC
  – After server crash, server will reject invalid cookie
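The cookie scheme might look like the following sketch. A boot counter stands in for f(time of boot), and the `Server` and `StaleCookieError` names are hypothetical.

```python
import itertools

_boot_ids = itertools.count(1)  # stand-in for f(time of boot)

class StaleCookieError(Exception):
    pass

class Server:
    def __init__(self):
        # A fresh cookie is chosen every time the server (re)boots.
        self.cookie = next(_boot_ids)

    def call(self, cookie, x):
        if cookie != self.cookie:
            raise StaleCookieError("server rebooted since this cookie was issued")
        return x + 1

server = Server()
cookie = server.cookie           # client obtains the cookie
print(server.call(cookie, 41))   # 42

server = Server()                # simulated crash and reboot
try:
    server.call(cookie, 41)
except StaleCookieError as err:
    print("rejected:", err)
```

The rejection tells the client that any request it had outstanding may or may not have executed, so it can decide how to recover rather than silently retrying.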
Example: Sun XDR (RFC 4506)
• External Data Representation for SunRPC
• Types: most of C types
• No tags (except for array lengths)
  – Code needs to know structure of message
• Usage:
  – Create a program description file (.x)
  – Run rpcgen program
  – Include generated .h files, use stub functions
• Very C/C++ oriented
  – Although encoders/decoders exist for other languages
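To make the "no tags" point concrete, here is a hand-rolled sketch of two XDR encodings as specified in RFC 4506: a 32-bit integer is four big-endian bytes, and a string is a 4-byte length followed by the bytes, zero-padded to a multiple of four. The helper names are made up; real code would normally use rpcgen output instead.

```python
import struct

def xdr_int(n):
    # 32-bit signed integer, big-endian.
    return struct.pack(">i", n)

def xdr_string(s):
    # Length, then the bytes, then zero padding up to a 4-byte boundary.
    data = s.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding

print(xdr_int(20).hex())           # 00000014
print(xdr_string("hello").hex())   # 0000000568656c6c6f000000
```

Nothing in the output says which bytes are an integer and which are a string; the receiver has to know the message structure from the .x description.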
RPC Program Definition
• rpcgen generates the marshalling/unmarshalling code and stub functions; you fill in the actual code
XML
• Other extreme
• Markup language
  – Text based, semi-human readable
  – Heavily tagged (field names)
  – Depends on external schema for parsing
  – Hard to parse efficiently
<person> <name>John Doe</name> <email>[email protected]</email> </person>
Google Protocol Buffers
• Defined by Google, released to the public
  – Widely used internally and externally
  – Supports common types, service definitions
  – Natively generates C++/Java/Python code
    • Over 20 other languages supported by third parties
  – Efficient binary encoding, readable text encoding
• Performance
  – 3 to 10 times smaller than XML
  – 20 to 100 times faster to process
Protocol Buffers Example
message Student {
  required string name = 1;
  required int32 credits = 2;
}

(…compile with protoc)
Student s;
s.set_name("Jane");
s.set_credits(20);
fstream output("students.txt", ios::out | ios::binary);
s.SerializeToOstream(&output);

(…somebody else reading the file)
Student s;
fstream input("students.txt", ios::in | ios::binary);
s.ParseFromIstream(&input);
Binary Encoding
• Integers: varints
  – 7 bits out of every 8 encode the integer
  – MSB set: more bytes to come
  – Multi-byte integers: least significant group first
• Signed integers (sint32/sint64 types): zig-zag encoding, then varint
  – 0:0, -1:1, 1:2, -2:3, 2:4, …
  – Advantage: small negative numbers stay small when encoded with varint
• General:
  – Field number, field type (tag), value
• Strings:
  – Varint length, then the UTF-8 bytes
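The rules above are small enough to write out by hand. The sketch below is not the protobuf library; it is a hand-written illustration, and the helper names are made up. It encodes the Student message from the earlier example, using the wire format's key layout of (field number << 3) | wire type, where wire type 0 is a varint and 2 is length-delimited.

```python
def encode_varint(n):
    """Encode a non-negative integer, least significant 7-bit group first."""
    out = bytearray()
    while True:
        group = n & 0x7F
        n >>= 7
        if n:
            out.append(group | 0x80)  # MSB set: more bytes follow
        else:
            out.append(group)
            return bytes(out)

def decode_varint(data, offset=0):
    result, shift = 0, 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7

def zigzag(n):
    # Interleave negatives and positives: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4
    return 2 * n if n >= 0 else -2 * n - 1

def unzigzag(z):
    return (z >> 1) ^ -(z & 1)

WIRE_VARINT, WIRE_LEN = 0, 2

def field_key(field_number, wire_type):
    return encode_varint((field_number << 3) | wire_type)

def encode_student(name, credits):
    # Assumes credits >= 0; negative int32 values are encoded differently.
    name_bytes = name.encode("utf-8")
    return (field_key(1, WIRE_LEN) + encode_varint(len(name_bytes)) + name_bytes
            + field_key(2, WIRE_VARINT) + encode_varint(credits))

print(encode_varint(300).hex())                 # ac02
print([zigzag(n) for n in (0, -1, 1, -2, 2)])   # [0, 1, 2, 3, 4]
print(encode_student("Jane", 20).hex())         # 0a044a616e651014
```

The whole record takes 8 bytes, against roughly 50 characters for an equivalent XML fragment with name and credits elements.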
Apache Thrift
• Originally developed by Facebook
• Used heavily internally
• Full RPC system
  – Support for C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, and OCaml
• Many types
  – Base types, list, set, map, exceptions
• Versioning support
• Many encodings (protocols) supported
  – Efficient binary, JSON encodings
Apache Avro
• Yet another newcomer
• Likely to be used for Hadoop data representation
• Encoding:
  – Compact binary with schema included in file
  – Amortized self-descriptive
• Why not just create a new encoding for Thrift?
  – I don’t know…
Conclusions
• RPC is a good way to structure many distributed programs
  – Have to pay attention to different semantics, though!
• Data: tradeoff between self-description, portability, and efficiency
• Unless you really want to bit pack your protocol, and it won’t change much, use one of the IDLs
• Parsing code is easy to get (slightly) wrong, hard to get fast
  – Should only do this once, for all protocols
• Which one should you use?
Wireless
• Today: wireless networking is truly ubiquitous
  – 802.11, 3G, (4G), WiMAX, Bluetooth, RFID, …
  – Sensor networks, Internet of Things
  – Some new computers have no wired networking
  – 4B cellphone subscribers vs. 1B computers
• What’s behind the scenes?
Wireless is different
• Signals sent by the sender don’t always reach the receiver intact
  – Varies with space: attenuation, multipath
  – Varies with time: conditions change, interference, mobility
• Distributed: sender doesn’t know what happens at receiver
• Wireless medium is inherently shared
  – No easy way out with switches
Implications
• Different mechanisms needed
• Physical layer
  – Different knobs: antennas, transmission power, encodings
• Link Layer
  – Distributed medium access protocols
  – Topology awareness
• Network, Transport Layers
  – Routing, forwarding
• Most advances do not abstract away the physical and link layers
Physical Layer
• Specifies physical medium
  – Ethernet: Category 5 cable, 8 wires, twisted pair, RJ45 jack
  – WiFi wireless: 2.4 GHz
• Specifies the signal
  – 100BASE-TX: NRZI + MLT-3 encoding
  – 802.11b: binary and quadrature phase shift keying (BPSK/QPSK)
• Specifies the bits
  – 100BASE-TX: 4B5B encoding
  – 802.11b @ 1-2 Mbps: Barker code (1 bit -> 11 chips)
What can happen to signals?
• Attenuation
  – Signal power attenuates by a factor of ~r^2 for omnidirectional antennas in free space
  – Exponent depends on type and placement of antennas
    • < 2 for directional antennas
    • > 2 if antennas are close to the ground
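A quick worked example of what the exponent means, assuming the simple model in which received power is proportional to 1/r^exponent. The function name is made up, and real channels are considerably messier than this.

```python
import math

def extra_loss_db(d1, d2, exponent=2.0):
    """Additional attenuation in dB when moving from distance d1 to d2,
    under the assumption that received power falls off as 1 / r**exponent."""
    return 10 * exponent * math.log10(d2 / d1)

# Doubling the distance in free space costs about 6 dB.
print(round(extra_loss_db(1, 2, exponent=2.0), 2))  # 6.02
# Near the ground, with an assumed exponent of 4, it costs about 12 dB.
print(round(extra_loss_db(1, 2, exponent=4.0), 2))  # 12.04
```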
Interference
• External sources
  – E.g., 2.4 GHz unlicensed ISM band
    • 802.11
    • 802.15.4 (ZigBee), 802.15.1 (Bluetooth)
    • 2.4 GHz phones
    • Microwave ovens
• Internal sources
  – Nodes in the same network/protocol can (and do) interfere
• Multipath
  – Self-interference (destructive)
Implications of Attenuation and Interference
• Reduces the signal-to-noise ratio
  – Makes it hard to decode bits
  – Increases bit error rates
• Could make signal stronger (transmit with higher power)
  – Uses more energy
  – Increases interference to other nodes
Link Layer
• Medium Access Control
  – Should give 100% of the channel if there is one user
  – Should be efficient and fair if there are more users
• Ethernet uses CSMA/CD
  – Can we use CD here?
    • No! Collision happens at the receiver
• Protocols try to avoid collision in the first place
Carrier Sensing
• The first approach to avoiding collisions in wireless
  – Listen on the channel (medium)
    • If someone is transmitting, sleep and try again later
    • Otherwise, start sending
Hidden Terminals
• A can hear B and C
• B and C can’t hear each other
• They both interfere at A. COLLISION!
• B is a hidden terminal to C, and vice versa
• Carrier sense at sender is useless
[Figure: nodes B, A, C; A is in range of both B and C]
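The hidden terminal situation can be reproduced with a toy range model. The positions, the shared radio range, and the function names are all assumptions for illustration; real radio propagation is not a clean distance cutoff.

```python
# Hypothetical 1-D positions (arbitrary units) and a shared radio range.
POSITIONS = {"B": 0.0, "A": 1.0, "C": 2.0}
RADIO_RANGE = 1.5

def can_hear(listener, sender):
    return abs(POSITIONS[listener] - POSITIONS[sender]) <= RADIO_RANGE

def senses_busy(node, transmitting):
    """Carrier sense at `node`: is any current transmitter audible?"""
    return any(can_hear(node, tx) for tx in transmitting if tx != node)

def collides_at(receiver, transmitting):
    """A collision occurs when the receiver hears two or more senders."""
    return sum(can_hear(receiver, tx) for tx in transmitting) >= 2

# B is already transmitting to A; C senses the channel before sending.
print(senses_busy("C", {"B"}))        # False: C cannot hear B
print(collides_at("A", {"B", "C"}))   # True: both signals overlap at A
```

C's carrier sense reports an idle channel, so C transmits, and the collision is only visible at A.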
Exposed Terminals
• A transmits to B
• C hears the transmission and backs off, even though D would hear C
  – C could still be sending to D, but it won’t!
• C is an exposed terminal to A’s transmission
• Why is it still useful for C to do CS?
[Figure: nodes A, B, C, D]
Key points
• No global view of collision
  – Different receivers hear different senders
  – Different senders reach different receivers
• Collisions happen at the receiver
• Goals of a MAC protocol
  – Detect if receiver can hear sender
  – Tell senders who might interfere with receiver to shut up
Simple MAC: CSMA/CA
• Maintain a waiting counter c
• For each time slot the channel is free, c--
• Transmit when c = 0
• When a collision is inferred, retransmit with exponential backoff
  – Use lack of ACK from receiver to infer collision
  – Collisions are expensive: only full packet transmissions
• How would we get ACKs if we didn’t do carrier sense?
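The counter-and-backoff rules above can be sketched as a small state machine. The contention window bounds (16 and 1024 slots), the class name, and the method names are assumptions for this example, not values taken from the slides.

```python
import random

class CsmaCaSender:
    def __init__(self, cw_min=16, cw_max=1024, rng=None):
        self.cw_min, self.cw_max = cw_min, cw_max
        self.cw = cw_min
        self.rng = rng or random.Random()
        self.counter = self.rng.randrange(self.cw)  # waiting counter c

    def on_slot(self, channel_free):
        """Advance one time slot; return True if we transmit in it."""
        if not channel_free:
            return False              # channel busy: freeze the counter
        if self.counter > 0:
            self.counter -= 1         # c-- for each free slot
        return self.counter == 0      # transmit when c = 0

    def on_ack(self):
        # Success: reset the window and draw a counter for the next packet.
        self.cw = self.cw_min
        self.counter = self.rng.randrange(self.cw)

    def on_missing_ack(self):
        # No ACK: infer a collision and back off exponentially.
        self.cw = min(self.cw * 2, self.cw_max)
        self.counter = self.rng.randrange(self.cw)

sender = CsmaCaSender(rng=random.Random(1))
sender.counter = 3  # fixed here only to make the trace easy to follow
print([sender.on_slot(free) for free in (True, False, True, True)])
# [False, False, False, True]
```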
RTS/CTS
• Idea: transmitter can check availability of channel at receiver
• Before every transmission
  – Sender sends an RTS (Request-to-Send)
  – Contains length of data (in time units)
  – Receiver sends a CTS (Clear-to-Send)
  – Sender sends data
  – Receiver sends ACK after transmission
• If you don’t hear a CTS, assume collision
• If you hear a CTS for someone else, shut up
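The "if you hear a CTS for someone else, shut up" rule amounts to each station tracking how long the channel is reserved. A sketch under stated assumptions: the `Station` class and its fields are hypothetical, and time is in arbitrary units. (802.11 keeps a similar per-station timer, the network allocation vector.)

```python
class Station:
    def __init__(self, name):
        self.name = name
        self.silent_until = 0.0  # end of the reservation we overheard

    def overhear_cts(self, now, duration, cts_for):
        # A CTS addressed to someone else reserves the channel near the receiver.
        if cts_for != self.name:
            self.silent_until = max(self.silent_until, now + duration)

    def may_send(self, now):
        return now >= self.silent_until

# B asks A for the channel for 5 time units; A's CTS is heard by both B and C.
b, c = Station("B"), Station("C")
for station in (b, c):
    station.overhear_cts(now=0.0, duration=5.0, cts_for="B")

print(b.may_send(1.0), c.may_send(1.0), c.may_send(5.0))  # True False True
```

C stays quiet even though it never heard B's RTS, which is how the scheme addresses hidden terminals.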
Benefits of RTS/CTS
• Solves hidden terminal problem
• Also solves exposed terminal
• Does it?
  – Control frames can still collide
    • If A & C send RTS at the same time
  – In practice: reduces hidden terminal problem on data packets
RTS/CTS
[Figure: nodes B, A, C; B sends data to A]
• B sends to A
• A responds with CTS
• C knows not to send!
• Hidden terminal solved!
Issues with RTS/CTS
• A sends to B
• C hears RTS but not CTS
  – So, C can send!
• C sends to D
• Exposed terminal solved!
[Figure: nodes A, B, C, D; RTS frames sent toward B and toward D]
Benefits of RTS/CTS
• Solves hidden terminal problem
• Also solves exposed terminal
• Does it?
  – Control frames can still collide
    • If A & C send RTS at the same time
    • Also, CTS can get lost!
  – In practice: reduces hidden terminal problem on data packets
CTS Loss
• T0: B sends RTS to A
• T1: A responds with CTS while D also sends RTS to E
• T1: This CTS is lost
  – C doesn’t know about B->A
[Figure: nodes A, B, C, D with overlapping CTS and RTS transmissions]
Drawbacks of RTS/CTS
• Overhead is too large for small packets
  – 3 packets per packet: RTS/CTS/Data (4-22% for 802.11b)
• RTS still goes through CSMA: can be lost
• CTS loss causes lengthy retries
• 33% of IP packets are TCP ACKs
• In practice, WiFi doesn’t use RTS/CTS
Other MAC Strategies
• Time Division Multiplexing (TDMA)
  – Central controller allocates a time slot for each sender
  – May be inefficient when not everyone is sending
• Frequency Division
  – Multiplexing two networks on same space
  – Nodes with two radios (think graph coloring)
  – Different frequency for upload and download
Sometimes you can’t (or shouldn’t) hide that you are on wireless!
• Three examples of relaxing the layering abstraction
Examples of Breaking Abstractions
• TCP over wireless
  – Packet losses have a strong impact on TCP performance
  – Snoop TCP: hide retransmissions from TCP endpoints
  – Distinguish congestion from wireless losses