View
217
Download
1
Category
Preview:
Citation preview
OSes: 14. Dist. System Structures 1
Operating SystemsOperating Systems
ObjectivesObjectives– introduce the basic notions behind a introduce the basic notions behind a
networked/distributed system networked/distributed system
Certificate Program in Software DevelopmentCSE-TC and CSIM, AITSeptember -- November, 2003
14. Distributed System Structures
(S&G 6th ed., Ch. 15)
OSes: 14. Dist. System Structures 2
OverviewOverview
1. Background1. Background
2. Network Topologies2. Network Topologies
3. Network Types3. Network Types
4. Communication Network Issues4. Communication Network Issues
5. Partitioning the Network5. Partitioning the Network
6. Robustness6. Robustness
7. Design Strategies7. Design Strategies
8. Networking Example8. Networking Example
OSes: 14. Dist. System Structures 3
1. Background: A Distributed System1. Background: A Distributed System
Figure 15.1, p.540
OSes: 14. Dist. System Structures 4
1.1. Motivation1.1. Motivation
Resource sharingResource sharing– sharing and printing files at remote sitessharing and printing files at remote sites– processing information in a distributed databaseprocessing information in a distributed database– using remote specialized hardware devicesusing remote specialized hardware devices
Computation speedupComputation speedup– load sharingload sharing
continued
OSes: 14. Dist. System Structures 5
ReliabilityReliability– detect and recover from site failure, function detect and recover from site failure, function
transfer, reintegrate failed sitetransfer, reintegrate failed site
CommunicationCommunication– message passing (simpest form)message passing (simpest form)– higher-level capabilities: FTP, rlogin, RPChigher-level capabilities: FTP, rlogin, RPC
OSes: 14. Dist. System Structures 6
1.2. Network Operating Systems1.2. Network Operating Systems
Users are aware of multiplicity of machines. Users are aware of multiplicity of machines. Access to resources of various machines is Access to resources of various machines is done explicitly by:done explicitly by:– remote logging into the appropriate remote remote logging into the appropriate remote
machinemachine
– transferring data from remote machines to local transferring data from remote machines to local machines, via the File Transfer Protocol (FTP) machines, via the File Transfer Protocol (FTP) mechanismmechanism
OSes: 14. Dist. System Structures 7
1.3. Distributed Operating Systems1.3. Distributed Operating Systems
Users unaware of multiplicity of machinesUsers unaware of multiplicity of machines– access to remote resources similar to access to access to remote resources similar to access to
local resourceslocal resources
Data migrationData migration– transfer data by transferring entire file, or transfer data by transferring entire file, or
transferring only those portions of the file transferring only those portions of the file necessary for the immediate tasknecessary for the immediate task
continued
OSes: 14. Dist. System Structures 8
Computation migrationComputation migration– transfer the computation, rather than the data, transfer the computation, rather than the data,
across the systemacross the system e.g. Remote Procedure Calls (RPCs)e.g. Remote Procedure Calls (RPCs)
continued
OSes: 14. Dist. System Structures 9
Process migration – execute an entire Process migration – execute an entire process, or parts of it, at different sitesprocess, or parts of it, at different sites– load balancingload balancing
distribute processes across network to even the distribute processes across network to even the workloadworkload
– computation speedupcomputation speedup subprocesses can run concurrently on different sitessubprocesses can run concurrently on different sites
– explicit vs. implicitexplicit vs. implicit
continued
OSes: 14. Dist. System Structures 10
– hardware preferencehardware preference process execution may require specialized processorprocess execution may require specialized processor
– software preferencesoftware preference required software may be available at only a particular required software may be available at only a particular
sitesite
– data accessdata access run process remotely, rather than transfer all data locallyrun process remotely, rather than transfer all data locally
OSes: 14. Dist. System Structures 11
Some CriteriaSome Criteria– Basic costBasic cost. How expensive is it to link the various . How expensive is it to link the various
sites in the system?sites in the system?
– Communication costCommunication cost. How long does it take to send a . How long does it take to send a message from site message from site AA to site to site BB??
– ReliabilityReliability. If a link or a site in the system fails, can . If a link or a site in the system fails, can the remaining sites still communicate with each other?the remaining sites still communicate with each other?
1.4. Connecting Sites1.4. Connecting Sitesused in section 2
OSes: 14. Dist. System Structures 12
The various topologies are depicted as The various topologies are depicted as graphs whose nodes correspond to sites. graphs whose nodes correspond to sites.
An edge from node An edge from node AA to node to node BB corresponds to a direct connection between corresponds to a direct connection between the two sites.the two sites.
2. Network Topologies2. Network Topologies
OSes: 14. Dist. System Structures 13
2.1.2.1.Network Network TopologyTopologyDiagramsDiagrams
Figure 15.2, p.547
OSes: 14. Dist. System Structures 14
3. Network Types3. Network Types
Local-Area NetworkLocal-Area Network (LAN) – designed to (LAN) – designed to cover small geographical area.cover small geographical area.– multiaccess bus, ring, or star networkmultiaccess bus, ring, or star network– ~10 megabits/second, or higher~10 megabits/second, or higher– broadcast is fast and cheapbroadcast is fast and cheap– nodes: nodes:
usually workstations and/or personal computers usually workstations and/or personal computers a few (usually one or two) mainframesa few (usually one or two) mainframes
continued
OSes: 14. Dist. System Structures 15
A typical LAN:A typical LAN:Figure 15.3, p.550
OSes: 14. Dist. System Structures 16
Wide-Area NetworkWide-Area Network (WAN) – links (WAN) – links geographically separated sites. geographically separated sites. – point-to-point connections over long-haul lines point-to-point connections over long-haul lines
(often leased from a phone company)(often leased from a phone company)– ~100 kilobits/second.~100 kilobits/second.– broadcast usually requires multiple messagesbroadcast usually requires multiple messages– nodes: nodes:
usually a high percentage of mainframesusually a high percentage of mainframes
continued
OSes: 14. Dist. System Structures 17
A typical A typical WAN:WAN:
Figure 15.5p.551
OSes: 14. Dist. System Structures 18
4. Communication Network Issues4. Communication Network Issues
Naming and name resolutionNaming and name resolution– how do two processes locate each other to how do two processes locate each other to
communicate?communicate?
Routing strategiesRouting strategies– how are messages sent through the how are messages sent through the
network?network?
continued
More detailsin the next few slides
OSes: 14. Dist. System Structures 19
Connection strategiesConnection strategies– how do two processes send a sequence of how do two processes send a sequence of
messages?messages?
ContentionContention– the network is a shared resource, so how do we the network is a shared resource, so how do we
resolve conflicting demands for its use?resolve conflicting demands for its use?
OSes: 14. Dist. System Structures 20
4.1. Naming and Name Resolution4.1. Naming and Name Resolution
Name systems in the networkName systems in the network– fine for LANsfine for LANs
Address messages with the process IDsAddress messages with the process IDs– fine for process to process comms.fine for process to process comms.
Identify processes on remote systems Identify processes on remote systems by pairs:by pairs:
<host-name, PIDs><host-name, PIDs>
continued
OSes: 14. Dist. System Structures 21
Domain name serviceDomain name service (DNS) (DNS)– specifies the naming structure of the hosts, as specifies the naming structure of the hosts, as
well as name to address resolution (Internet)well as name to address resolution (Internet) e.g. from a hierarchical name "ratree.psu.ac.th" e.g. from a hierarchical name "ratree.psu.ac.th"
to a dotted decimal 127.50.2.7to a dotted decimal 127.50.2.7
OSes: 14. Dist. System Structures 22
4.2. Routing Strategies4.2. Routing Strategies
Fixed routingFixed routing. A path from . A path from AA to to BB is is specified in advancespecified in advance; path changes only if a ; path changes only if a hardware failure disables it. hardware failure disables it. – since the shortest path is usually chosen, since the shortest path is usually chosen,
communication costs are minimizedcommunication costs are minimized– fixed routing cannot adapt to load changesfixed routing cannot adapt to load changes– ensures that messages will be delivered in the ensures that messages will be delivered in the
order in which they were sentorder in which they were sent
continued
OSes: 14. Dist. System Structures 23
Virtual circuitVirtual circuit. A path from . A path from AA to to BB is fixed is fixed for the for the duration of one sessionduration of one session. Different . Different sessions involving messages from sessions involving messages from AA to to BB may have different paths. may have different paths. – partial remedy to adapting to load changespartial remedy to adapting to load changes– ensures that messages will be delivered in the ensures that messages will be delivered in the
order in which they were sent order in which they were sent
continued
OSes: 14. Dist. System Structures 24
Dynamic routingDynamic routing. The path used to send a . The path used to send a message form site message form site AA to site to site BB is is chosen only when chosen only when a message is senta message is sent. . – usually a site sends a message to another site on the usually a site sends a message to another site on the
link least used at that particular timelink least used at that particular time– adapts to load changes by avoiding routing messages adapts to load changes by avoiding routing messages
on heavily used pathon heavily used path– messages may arrive out of order. This problem can messages may arrive out of order. This problem can
be remedied by appending a sequence number to each be remedied by appending a sequence number to each message. message.
OSes: 14. Dist. System Structures 25
4.3. Connection Strategies4.3. Connection Strategies
Circuit switchingCircuit switching– a permanent physical link is established for the a permanent physical link is established for the
duration of the communicationduration of the communication e.g. the telephone system; TCPe.g. the telephone system; TCP
Message switchingMessage switching..– a temporary link is established for the duration of a temporary link is established for the duration of
one message transferone message transfer e.g. the post-office mailing system; UDPe.g. the post-office mailing system; UDP
continued
OSes: 14. Dist. System Structures 26
Packet switchingPacket switching– messages of variable length are divided into messages of variable length are divided into
fixed-length packets which are sent to the fixed-length packets which are sent to the destinationdestination
– each packet may take a different path through each packet may take a different path through the networkthe network
– the packets must be reassembled into messages the packets must be reassembled into messages as they arriveas they arrive
continued
messagespackets
OSes: 14. Dist. System Structures 27
Circuit switching requires setup time, but Circuit switching requires setup time, but incurs less overhead for shipping each incurs less overhead for shipping each message, and may waste network message, and may waste network bandwidthbandwidth
Message and packet switching require less Message and packet switching require less setup time, but incur more overhead per setup time, but incur more overhead per message.message.
OSes: 14. Dist. System Structures 28
4.4. Contention4.4. Contention
CSMA/CDCSMA/CD. Carrier sense with multiple access . Carrier sense with multiple access (CSMA); collision detection (CD)(CSMA); collision detection (CD)– a site determines whether another message is a site determines whether another message is
currently being transmitted over that link. If two or currently being transmitted over that link. If two or more sites begin transmitting at exactly the same more sites begin transmitting at exactly the same time, then they will register a CD and will stop time, then they will register a CD and will stop transmittingtransmitting
– When the system is very busy, many collisions may When the system is very busy, many collisions may occur, and thus performance may be degraded.occur, and thus performance may be degraded.
continued
OSes: 14. Dist. System Structures 29
CSMA/CD is used successfully in the CSMA/CD is used successfully in the Ethernet system, the most common network Ethernet system, the most common network system.system.
continued
OSes: 14. Dist. System Structures 30
Token passingToken passing– a unique message type, known as a token, a unique message type, known as a token,
continuously circulates in the systemcontinuously circulates in the system usually a ring structureusually a ring structure
– a site that wants to transmit information must a site that wants to transmit information must wait until the token arrives. When the site wait until the token arrives. When the site completes its round of message passing, it completes its round of message passing, it retransmits the tokenretransmits the token
– used by the IBM and Apollo systems used by the IBM and Apollo systems
continued
OSes: 14. Dist. System Structures 31
Message slotsMessage slots– a number of fixed-length message slots continuously a number of fixed-length message slots continuously
circulate in the systemcirculate in the system usually a ring structureusually a ring structure
– since a slot can contain only fixed-sized messages, a since a slot can contain only fixed-sized messages, a single logical message may have to be broken down single logical message may have to be broken down into a number of smaller packets, each of which is into a number of smaller packets, each of which is sent in a separate slotsent in a separate slot
– adopted in the experimental Cambridge Digital adopted in the experimental Cambridge Digital Communication RingCommunication Ring
X X
slots
OSes: 14. Dist. System Structures 32
5. Partitioning the Network5. Partitioning the Network
1. Physical layer1. Physical layer– handles the mechanical and electrical details handles the mechanical and electrical details
of the physical transmission of a bit streamof the physical transmission of a bit stream
2. Data-link layer2. Data-link layer– handles the handles the framesframes, or fixed-length parts of , or fixed-length parts of
packets, including any error detection and packets, including any error detection and recovery that occurred in the physical layerrecovery that occurred in the physical layer
continued
7 layers
OSes: 14. Dist. System Structures 33
3. Network layer3. Network layer– provides connections and routes packets in the provides connections and routes packets in the
communication networkcommunication network handling the address of outgoing packetshandling the address of outgoing packets decoding the address of incoming packetsdecoding the address of incoming packets maintaining routing info. for proper response to maintaining routing info. for proper response to
changing load levelschanging load levels
continued
OSes: 14. Dist. System Structures 34
4. Transport layer4. Transport layer– responsible for low-level network access and for responsible for low-level network access and for messagemessage
transfer between clients (hosts)transfer between clients (hosts) partitioning messages into packetspartitioning messages into packets maintaining packet order, controlling flowmaintaining packet order, controlling flow generating physical addresses.generating physical addresses.
5. Session layer5. Session layer– implements sessions, or implements sessions, or process-to-processprocess-to-process
communications protocolscommunications protocols
continued
OSes: 14. Dist. System Structures 35
6. Presentation layer6. Presentation layer– resolves the differences in resolves the differences in formatsformats among the various among the various
sites in the networksites in the network character conversionscharacter conversions half duplex/full duplex (echoing).half duplex/full duplex (echoing).
7. Application layer7. Application layer– interacts directly with the users’ deals with file transfer, interacts directly with the users’ deals with file transfer,
remote-login protocols and e-mailremote-login protocols and e-mail– schemas for distributed databases.schemas for distributed databases.
OSes: 14. Dist. System Structures 36
5.1. The ISO Network Model5.1. The ISO Network ModelFigure 15.5, p.559
OSes: 14. Dist. System Structures 37
5.2.5.2.The ISO The ISO Protocol Protocol LayerLayer
Figure 15.6p.560
Summarises the slides on the seven layers
OSes: 14. Dist. System Structures 38
5.3.5.3.The ISO The ISO Network Network MessageMessage
Figure 15.7p.561
header
OSes: 14. Dist. System Structures 39
5.4. The TCP/IP Protocol Layers5.4. The TCP/IP Protocol Layers
Figure 15.8, p.562
OSes: 14. Dist. System Structures 40
6. Robustness6. Robustness
Failure detectionFailure detection– many types of failure: host, link, routing, loss many types of failure: host, link, routing, loss
of message, excessive delays, etc.of message, excessive delays, etc.
ReconfigurationReconfiguration– main aim: to "keep going" in the face of partial main aim: to "keep going" in the face of partial
failurefailure
OSes: 14. Dist. System Structures 41
6.1. Failure Detection6.1. Failure Detection
Detecting hardware failure is difficult.Detecting hardware failure is difficult. To detect a link failure, a handshaking To detect a link failure, a handshaking
protocol can be used.protocol can be used.
continued
AA BB"I-am-up""I-am-up"
"ok""ok"
AA BB"Are you up?""Are you up?"
"yes""yes"
OSes: 14. Dist. System Structures 42
Assume Site A and Site B have established a Assume Site A and Site B have established a link. At fixed intervals, each site will exchange link. At fixed intervals, each site will exchange an an I-am-upI-am-up message indicating that they are up message indicating that they are up and running.and running.
If Site A does not receive a message within the If Site A does not receive a message within the fixed interval, it assumes eitherfixed interval, it assumes either– a) the other site is not up or a) the other site is not up or
– b) the message was lostb) the message was lost
continued
OSes: 14. Dist. System Structures 43
Site A can now send an Site A can now send an Are-you-up?Are-you-up? message to Site message to Site B.B.
If Site A does not receive a reply after a fixed If Site A does not receive a reply after a fixed interval, it can repeat the message or try an alternate interval, it can repeat the message or try an alternate route to Site B.route to Site B.
If Site A does not ultimately receive a reply from Site If Site A does not ultimately receive a reply from Site B, it concludes some type of failure has occurred.B, it concludes some type of failure has occurred.
continued
OSes: 14. Dist. System Structures 44
Types of failures:Types of failures:– site B is downsite B is down– the direct link between A and B is downthe direct link between A and B is down– the alternate link from A to B is downthe alternate link from A to B is down– the message has been lostthe message has been lost
However, Site A cannot determine exactly However, Site A cannot determine exactly whywhy the failure has occurred. the failure has occurred.
OSes: 14. Dist. System Structures 45
6.2. Reconfiguration6.2. Reconfiguration
When Site A determines a failure has occurred, When Site A determines a failure has occurred, it must reconfigure the system:it must reconfigure the system:
– 1. If the link from A to B has failed, this must be 1. If the link from A to B has failed, this must be broadcast to every site in the systembroadcast to every site in the system
– 2. If a site has failed, every other site must also be 2. If a site has failed, every other site must also be notified indicating that the services offered by the notified indicating that the services offered by the failed site are no longer availablefailed site are no longer available
continued
OSes: 14. Dist. System Structures 46
When the link or the site becomes available When the link or the site becomes available again, this information must again be again, this information must again be broadcast to all other sites.broadcast to all other sites.
OSes: 14. Dist. System Structures 47
7. Design Issues7. Design Issues
TransparencyTransparency– the distributed system should appear as a the distributed system should appear as a
conventional, centralized system to the userconventional, centralized system to the user
Fault toleranceFault tolerance– the distributed system should continue to the distributed system should continue to
function in the face of failurefunction in the face of failure
continued
OSes: 14. Dist. System Structures 48
ScalabilityScalability– as demands increase, the system should easily as demands increase, the system should easily
accept the addition of new resources to accept the addition of new resources to accommodate the increased demandaccommodate the increased demand
ClustersClusters– a collection of semi-autonomous machines that a collection of semi-autonomous machines that
acts as a single systemacts as a single system
OSes: 14. Dist. System Structures 49
8. Networking Example8. Networking Example
The transmission of a network packet The transmission of a network packet between hosts on an Ethernet network.between hosts on an Ethernet network.
Every host has a unique IP address and a Every host has a unique IP address and a corresponding Ethernet (MAC) address.corresponding Ethernet (MAC) address.
Communication requires both addresses.Communication requires both addresses.
continued
OSes: 14. Dist. System Structures 50
Domain Name Service (DNS) can be used to Domain Name Service (DNS) can be used to acquire IP addresses.acquire IP addresses.
Address Resolution Protocol (ARP) is used to map Address Resolution Protocol (ARP) is used to map MAC addresses to IP addresses.MAC addresses to IP addresses.
If the hosts are on the same network, ARP can be If the hosts are on the same network, ARP can be used. If the hosts are on different networks, the used. If the hosts are on different networks, the sending host will send the packet to a sending host will send the packet to a routerrouter which which routes the packet to the destination network.routes the packet to the destination network.
OSes: 14. Dist. System Structures 51
8.1. An Ethernet Packet8.1. An Ethernet PacketFigure 15.9
p.568
Recommended