Introduction to Distributed Systemspeople.cs.aau.dk/~bnielsen/DS-E05/material/intro.pdf · 2005-09-06 · concepts in distributed systems, knowledge about their construction, and

Introduction to Distributed Systems

Brian [email protected]

About me• Lecturer in Department of Computer Science• Research Unit: Distributed Systems and

semantics• Center of Embedded Systems• Research

– Distributed programming (Linda w. multiple tuplespaces)

– Distributed Multi-media– Operating Systems– Automated testing of embedded, real-time, distributed

systems

About You

• Dat-3• F7S• SSE-1• Guests (SSW-1)• ⇒ Different background and expectations

Teaching Assistants

• Bjørn Haagensen– PhD student in DSS– [email protected]

• Marc Kammersgaard– Dat-5 student – [email protected]

Study RegulationsPurpose: That the student obtains knowledge about concepts in distributed systems, knowledge about their construction, and an understanding of advantages and disadvantages of their use.

Contents:

•Structure of distributed systems.

•Distributed algorithms.

•Distributed and parallel programming.

•Fault tolerance.

•Examples of one or more distributed systems.

Course Form• 1 lecture = 1 mini-module = 6 hrs

– 2*45 min of lectures– 1.5 hrs of exercises with TAs in groups– 1.5 hrs of reading homework– 0.5 hrs of exam-preparation

• 1 big study assignment subject to examination

• PE: exam part of project-exam• SE: Pass/Fail verdict

Text Book• Coulouris, Dollimore and

Kindberg• Distributed Systems:

Concepts and Design• Edition 4

• www.cdk4.net

Preliminary Course Plan

Study Exercise15

Study Exercise14

Grid Computing IIB3-10413

Grid Computing IB3-10412

Web ServicesB3-10411

B3-10410

ReplicationB3-1049

Multicast communicationB3-1048

Distributed Mutual ExclusionB3-1047

Clock SynchronizationB3-1046

Peer2peer SystemsKS3 4.110 Mon 3/10 12.305

Distributed File SystemsB3-104Wed 28/9 8.154

Programming Models IIB3-104Wed14/9 8.153

Programming Models IKS3 4.112Mon12/9 12.302

Introduction to Distributed SystemsB3-104Wed 7/9 8.151

TopicRoomDate

Pre-requisites• Programming

– Practical programming in e.g. Java, C, C++ – Basic data-structures and algorithms

• Networks– OSI-model, IP-addressing, IP-routing, message enveloping,

TCP/UDP, Sliding-window, congestion and flow control, socket-programming, basic encryption technology

– Else read chapter 3• Operating Systems

– Processes, threads, concurrency, non-determinism, kernel and user level, synchronization (semaphores, monitors), address spaces, virtual memory, file-systems

– Else read chapter 6

Today’s Agenda

• Course Introduction• What is a distributed system?• Examples of distributed systems.• Challenges in the development of

distributed systems.• Models of distributed computation and

systems.

Book-Links

• http://www.cs.cornell.edu/courses/cs514/2005sp/

• http://www.cdk4.net/instructors/teaching.html

• http://www.prenhall.com/divisions/esm/app/author_tanenbaum/custom/dist_sys_1e/

Definition

• A distributed system is the one in which hardware and software components at networked computers communicate and coordinate their activity only by passing messages.

• Examples: Internet, intranet and mobile computing systems.

Consequences• Concurrent execution of processes

– Users work independently & share resources– non-determinism, race-conditions, synchronization, deadlock,

liveness, …• No global clock

– Coordination is done by message exchange– There are limits to the accuracy with which computers in a network

can synchronize their clocks• No global state

– Generally, there is no single process in the distributed system that would have a knowledge of the current global state of the system

• Units may fail independently.– Network faults can result in the isolation of computers that continue

executing– A system failure or crash might not be immediately known to other

systems

Examples of Distributed Systems• Intra-net, inter-net, WWW• DNS

– Hierarchical distributed database• Network of workstations (NOW), Cluster Computers• Cray T3E

– 2048 tightly coupled homogeneous processors– Distributed/parallel computing

• Email• Electronic banking• Airline reservation system• Peer-to-peer networks• Sensor networks• Mobile and Pervasive Computing• Etc., etc., etc.

Examples of mission-critical applications

• Banking, stock markets, stock brokerages• Heath care, hospital automation• Control of power plants, electric grid• Telecommunications infrastructure• Electronic commerce and electronic cash on the Web

(very important emerging area)• Corporate “information” base: a company’s memory of

decisions, technologies, strategy• Military command, control, intelligence systems

intranet

ISP

desktop computer:

backbone

satellite link

server:

☎

network link:

☎

☎

☎

A typical portion of the Internet

Internet– Vast interconnected collection of networks– Heterogeneous network of computers and

applications– Interaction via message passing for a

common means of communication– Users in different physical locations– The set of services is open-ended

• Email, ftp, instant messaging, IP-telephony, CSCW, WWW, …

– Communication and information exchange

Resouces

• A thing that can be shared• Physical: printers, disks, fast machines

etc.• Software defined: files, databases, data-

objects, video-stream, etc.

Web servers and web browsers

Internet

BrowsersWeb servers

www.google.com

www.cdk3.net

www.w3c.org

Protocols

Activity.html

http://www.w3c.org/Protocols/Activity.html

http://www.google.comlsearch?q=kindberg

http://www.cdk3.net/

File system ofwww.w3c.org

URL=Uniform Resource Locator Protocol:protocol-specific-identifier

http url: http://servername[:port][/pathname][?query][#fragment]

A typical intranet

the rest of

email server

Web server

Desktopcomputers

File server

router/firewall

print and other servers

other servers

print

Local areanetwork

email server

the Internet

Intranets• A portion of the Internet that

– is separately administered– usually proprietary– provides internal and external services– can be configured to enforce local security policies

• may use a firewall to prevent unauthorized messages leaving or entering

– may be connected to the internet via a router• Services:

– File, print services, backup, program-sharing, user-, system-administration, internet access

Portable and handheld devices in a Dist.Sys

Laptop

Mobile

PrinterCamera

Internet

Host intranet Home intranetWAP Wireless LAN

phone

gateway

Host site

Mobile and ubiquitous computing

• Mobile computing• continuous service is available to the internet, company

intranets• ex. Cell phone can access simple information

• Ubiquitous computing– computing devices will become so pervasive they

will not be noticeable• wearables, PDAs, digital cameras• Interaction with users physical environment

• Issues• discovery of resources, eliminating reconfiguration of

devices from movement, coping with limited connectivity, privacy and security, location awareness

Service Registry

Web Server of Company A

Web Server of Company B

Web server of Company C

Inventory

A’s App Server for generating a package for flight ticket

A: Flight ticketB: Car RentalC: Hotel Booking

B’s App Server for generating a package for car rental

• Customer makes specifications on schedule, budget ranges, and other specific requirements (e.g. non-smoking room etc.)

• System responses him with a list packages for him to choose

C’s App Server for generating a package for hotel room

Web-services

Automotive Control

•80+ ECU’s interconnected in controller area networks•Vehicle dynamics (engine, brake, gear control,…)•Instrumentation control (lights, indicators, windows,…)•System Integration•Information and entertainment systems•Drive-by-wire

Automotive Control

Sensor-networks

Distributed Computing• Speed up huge computations by using multiple

computers• NOWs (network of workstations) / cluster

computing• Dedicated computers• seti@home: project to scan data retrieved by a

radio telescope to search for radio signals from another world; and the Grid Computing: www.grid.org (using over 2.5m CPUs world-wide – source 2004)

Cray T3E

CRAY T3E• Liquid cooled• 64-2048 stk. 600 MHZ CPUs each with 2GB

local ram• 38.4 GFLOPS - 2.4 TFLOPS• 4 TB global memory• High I/O bandwidth: up to 250 GB/s• Inter-processor communication 650 MB/s• UNICOS/mk unix like micro-kernel OS

T3E ArkitekturComputing-node

computing-nodeCommunication

links

Weather forecasting

Wind-tunnel Simulation

Proces-sor22

• 2.5 mil points, 400 MB• 300 hrs on a Cray Y-MP

Shared Memory Multi-Processor

processor

cache

processor

cache

processor

cache

Main memory

System bus

i/o subsystem

NOT a

distributed system!

Domain Name Service

• Database that maps host names to IP-addresses and vice versa

• homer.cs.aau.dk =130.225.194.13

root

dk com gov mil org net uk fr

aau mit

cs ece

homer

etc.

owlnet

DNS: History• Initially all host-addess mappings were in a file

called hosts.txt (in /etc/hosts)– Changes were submitted to SRI by email– New versions of hosts.txt ftp’d periodically from SRI– An administrator could pick names at their discretion– Any name is allowed: eugenesdesktopatrice

• As the internet grew this system broke down because:– SRI couldn’t handled the load– Hard to enforce uniqueness of names– Many hosts had inaccurate copies of hosts.txt

• Domain Name System (DNS) was born

Basic DNS Features

• Hierarchical namespace– as opposed to original flat namespace

• Distributed storage architecture– as opposed to centralized storage (plus replication)

• Client--server interaction on UDP Port 53– but can use TCP if desired

Server Hierarchy• Each server has authority over a

portion of the hierarchy–A server maintains only a subset of all

names

• Each server contains all the records for the hosts or domains in its zone–might be replicated for robustness

• Every server knows the root

DNS Name Servers• Local name servers:

– Each ISP (company) has local default name server– Host DNS query first goes to local name server– Local DNS server IP address usually learned from DHCP – Frequently cache query results

• Authoritative name servers:– For a host: stores that host’s (name, IP address)– Can perform name/address translation for that host’s name

DNS: Root Name Servers

• Contacted by local name server that can not resolve name

• Root name server:– Contacts authoritative

name server if name mapping not known

– Gets mapping– Returns mapping to

local name server• ~ Dozen root name

servers worldwide

Basic Domain Name Resolution

• Every host knows a local DNS server– Through DHCP, for example– Sends all queries to a local DNS server

• Every local DNS server knows the ROOT servers– When no locally cached information exist about the

query, talk to a root server, and go down the name hierarchy from the root

– If we lookup homer.cs.aau.dk, and we have a cached entry for the .dk name server, then we can go directly to the .dk name server and bypass the root server

Example of Recursive DNS Query

Root name server:• May not know

authoritative name server

• May know intermediate name server: who to contact to find authoritative name server?

Recursive query:• Puts burden of name

resolution on contacted name server

• Heavy load? requesting hosthomer.cs.aau.dk

www.google.com

root name server

local name serverdns.cs.aau.dk

1

23

4 5

6

authoritative name serverns1.google.com

intermediate name server(com server)

7

8

Example of Iterated DNS QueryIterated query:• Contacted server

replies with name of server to contact

• “I don’t know this name, but ask this server”

This is how today’s DNS system behaves

requesting hosthomer.cs.aau.dk

www.google.com

root name server


1

2

34

67

authoritative name serverns1.google.com

intermediate name server(com server)

5

8

iterated query

DNS Security• Man-in-the-middle

• Spoofing responses– eavesdrop on requests and race the real DNS server to respond

• Name Chaining– AKA cache poisoning– Attacker waits for a query, injects possibly unrelated response

resource record (frequently NS or CNAME record)– Bogus record is cached for later use

Cache Poisoning Attack

• Until the TTL expires, queries to dns.cs.aau.dk for ebay.com’s nameserverwill return the poison entry from the cache


remote name serverns1.google.com

Query: (A, google.com)

Response: (A, google.com, 216.239.37.99)

Additional Response: (NS, ebay.com, 172.133.44.44)

Attacker172.133.44.44

Solution: DNSSEC• Cryptographically sign critical resource records

– Resolver can verify the cryptographic signature

• Two New resource record type:

• Type = KEY: – name = Zone domain name– value = Public key for the zone

• Type = SIG:– name = (type, name) tuple (i.e. a query)– value = Cryptographic signature of query result

Security 2: Denial of Service• Caching can mitigate the effect of a large-scale denial of

service attack• October 2002: root name servers subjected to massive

DoS attack– By and large, users didn’t notice– Locally cached records used

• More directed denial of service can be effective– DoS local name server cannot access DNS– DoS authoritative names server cannot access domain

• More subtle: claim a domain doesn’t exist– DNSSEC addresses this problem (resource type NXT)

Mail

Net News

Why a Distributed System?• Functional distribution

– computers have different functional capabilities yet may need to share resources

• Client / server• Data gathering / data processing

• Inherent distribution in application domain• cash register and inventory systems for supermarket chains• computer supported collaborative work

• Economics – collections of microprocessors offer a better price/

performance ratio than large mainframes

Why a Distributed System?

• Load balancing– assign tasks to processors such that the

overall system performance is optimized• Replication of processing power

– independent processors working on the same task

• Increased Reliability – Exploit independent failures property and– Redundancy

Challenges

• Heterogeneity• Openness• Security• Scalability• Fault handling• Concurrency• Transparency

Heterogeneity• Network technology• Computer hardware

– Instruction set, word-length, byte-order• Operating system

– System calls, APIs • Programming language.• Implementation by different persons• ⇒ Middleware

– can mask heterogeneity– provides a uniform computational model for use by the

programmers of servers and distributed applications• ⇒ Virtual machines (jvm, .net)

– transparency from hard-, software and programming language heterogeneity

– Mobile code

Openness

• The degree to which a DS can be extended with new services and be made available to clients– Can the system be extended or

reimplemented?– Can we add new resource sharing services?– Can the system be made available to different

client programs?

Openness• The degree to which new resource-sharing

services can be added and be made available for use by a variety of client programs– Specification and documentation of the key

software interfaces of the components must be published

– Uniform service localization and access– Extension may be at the hardware level by

introducing additional computers

Security• Confidentiality: only disclose information to

authorized users• Integrity: No alteration to documents/messages• Availability: No interference to resource access• ⇒ Cryptography• Continued challenges

– Denial of service attacks– Security of mobile code

Scalability

• Can we add more physical resources?• What is the performance penalty?• Can we run out of software resources?

Scalability• system remains effective when there is a

significant increase in the number of resources and the number of users

– controlling the cost of physical resources– adding servers to allow for increased file sharing

– controlling the cost of performance loss– increases in data storage and access

– preventing software resources from running out– trade-offs between allocating too much vs. too little

– avoiding performance bottlenecks– algorithms should be decentralized, use caching and hierarchies

Fault handling• In distributed systems, some components

fail while others continue executing• How do we:

– Detect faults?– Mask fault?– Tolerate faults?– Recover from faults?

• messages can be retransmitted• data can be written to multiple disks to minimize

the chance of corruption• Data can be recovered when computation is “rolled

back”• Redundant components or computations tolerate

failure

Concurrency– Several clients may attempt to access a

shared resource at the same time• ebay bids

– Generally multiple requests are handled concurrently rather than sequentially

– All shared resources must be responsible for ensuring that they operate correctly in concurrent environment

Transparency• the system is perceived as a whole rather

than as a collection of independent components

• Access transparency: enables local and remote resources to be accessed using identical operations.

• Location transparency: enables resources to be accessed without knowledge of their location.

• Concurrency transparency: enables several processes to operate concurrently using shared resources without interference between them.

• Replication transparency: enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers.

Transparencies (cont.)• Failure transparency: enables the concealment of

faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.

• Scaling transparency: allows the system and applications to expand in scale without change to the system structure or the application algorithms.

• Performance transparency: allows the system to be reconfigured to improve performance as loads vary.

• Mobility transparency: allows the movement of resources and clients within a system without affecting the operation of users or programs.

Models

ArchitecturalFundamental

Architectural models

• Software layers• System architecture

Service layers

Applications, services

Computer and network hardware

Platform

Operating system

Middleware

Middleware• Software layer (library of functions) that

simplifies programming– Masks heterogeneity– Provides a convenient programming model

• Objects/ processes• Communication primitives• Synchronization• Group and multicasting• Naming and Localization services• Event notification

– Corba, JavaRMI, DCOM, MPI, ISIS,…

•Session Layer: Dialog controller•Establish•Maintain•Synchronize•Terminate

Presentation layer: handles syntax and semantics•Data translation•Encryption/decryption•Compression/expansion

OSI-model•Open Systems Interconnection model (ISO standard)

Clients invoke individual servers

Server

Client

Client

invocation

result

Serverinvocation

result

Process:Key:

Computer:

A distributed application based on peer processes

Application

Application

Application

Peer 1

Peer 2

Peer 3

Peers 5 .... N

Sharableobjects

Application

Peer 4

Client Server vs P2PClient-Server• Most widely used

model• Functional

specialization• Asymmetrical• Tends to be

Centralized• Tends to scale poorly

P2P• Symmetrical,

computers have same capabilities

• Truly Distributed• Share / exploit

resources at a large number of participants

Web proxy server

Client

Proxy

Web

server

Web

server

serverClient

Web appletsa) client request results in the downloading of applet code

Web server

ClientWeb serverApplet

Applet codeClient

b) client interacts with the applet

Thin clients and compute servers

ThinClient

ApplicationProcess

Network computer or PCCompute server

network

A service provided by multiple servers

Server

Server

Server

Service

Client

Client

Design requirements• Performance

– Responsiveness– Throughput– Load balancing

• Quality of service• Caching and replication• Dependability

– Correctness– Fault tolerance– Security

Fundamental Models

Design and solutions depend on fundamental assumptions on

– Process Interaction– Failures– Security threats

Interaction Model• Distributed Algorithms:

– A definition of the steps to be taken by each of the processes of which the system is composed, including the messages transmitted between them

• Process: executing program with private state• Communication Performance is a limiting

characteristic– Latency, bandwidth, Jitter

• It is impossible to maintain a single notion of time– Computer clocks have drift– GPS: 1 micro Sec– Message Passing (eg.NTP) 100ms

Distributed Algorithms• Address problems of

– resource allocation -- deadlock detection– communication -- global snapshots– consensus -- synchronization– concurrency control -- object implementation

• Have a high degree of• uncertainty and independence of activities

– unknown # of processes & network topology– independent inputs at different locations– several programs executing at once, starting at

different times, operating at different speeds– processor non-determinism– uncertain message ordering & delivery times– processor & communication failures

Asynchronous Model

• Separate components take steps arbitrarily

• Reasonably simple to describe - with the exception of liveness guarantees

• Harder to accurately program• Allows ignorance to timing considerations• May not provide enough power to solve

problems efficiently

Asynchronous systems

No known bounds for:• The execution speed of a process• Message delay on the network• Clock drift

Synchronous systems

• Known upper and lower bound for each process step

• Known upper bound for the time it task for a message to be received

• Known upper bound for clock drift

Real-time ordering of eventssend

receive

send

receive

m1 m2

2

1

3

4X

Y

Z

Physical time

Am3

receive receive

send

receive receive receivet1 t2 t3

receive

receivem2

m1

Failure Model

• The algorithm might need to tolerate failures– processors

• might stop• degrade gracefully• exhibit Byzantine failures

– may also be failures of • communication mechanisms

Failure Model

• Processes and channelsprocess p process q

Communication channel

send

Outgoing message buffer Incoming message buffer

receivem

Omission and arbitrary failures

Class of failure Affects DescriptionFail-stop Process Process halts and remains halted. Other processes

may detect this state.Crash Process Process halts and remains halted. Other processes

may not be able to detect this state.Omission Channel A message inserted in an outgoing message buffer

never arrives at the other end’s incoming message buffer.

Send-omission Process A process completes a send but the message is not put in its outgoing message buffer.

Receive-omission

Process A message is put in a process’s incoming messagebuffer, but that process does not receive it.

Arbitrary(Byzantine)

Process orchannel

Process/channel exhibits arbitrary behaviour: it maysend/transmit arbitrary messages at arbitrary times,commit omissions; a process may stop or take anincorrect step.

Timing failures

Class of Failure Affects DescriptionClock Process Process’s local clock exceeds the bounds

on its rate of drift from real time.Performance Process Process exceeds the bounds on the interval

between two steps.Performance Channel A message’s transmission takes longer than

the stated bound.

Security model

• Protection of objects• Securing processes and their interaction

Objects and principals

Network

invocation

resultClient

Server

Principal (user) Principal (server)

ObjectAccess rights

The enemy

Communication channel

Copy of m

Process q Process qm

The enemym´

Secure channels

Principal A

Secure channelProcess p Process q

Principal B

The Web

END

DefinitionA distributed system is a set of independent computersinterconnected via a network where processes communicate and synchronize using message passing.

Partially Synchronous Model

• Some restrictions on the timing of events, but not exactly lock-step

• Most realistic model• Trade-offs must be considered when

deciding the balance of the efficiency with portability

DefinitionA distributed system is a computer system where processes are executed on computers interconnect via a network and communication and coordination are performed using message passing.

Sources of Information

• Sun's JMS Tutorialhttp://java.sun.com/products/jms/tutorial/html/jmsTOC.fm.html

• Microsoft's Message Queue(MSMQ)http://msdn.microsoft.com/library/psdk/msmq/msmq

• IBM's MQ-Serieshttp://www-4.ibm.com/software/ts/mqseries/

Computers in the Internet

Date Computers Web servers

1979, Dec. 188 01989, July 130,000 01999, July 56,218,000 5,560,866

Computers vs. Web servers

Date Computers Web servers Percentage

1993, July 1,776,000 130 0.0081995, July 6,642,000 23,500 0.41997, July 19,540,000 1,203,096 61999, July 56,218,000 6,598,697 12

Proxy servers

Distributed Scalability Service Architecture

Home server

Client

Client

Client

Client

images

images

ResultCache

ResultCache

• Improved scalability (distributed)• Proxy can run same app code as server

How to maintain cache consistency?

Customers++??1. Invest in heavy-duty server infrastructure

… OR …2. Risk inability to handle customer load

Need on-demand scalability

Home server

App serverBack-end Database

Web server

HTTP

Client

Client

App code DBMS

Today’s e-business infrastructure

P2P

”File”-”Sharing”

• Scalability, Fault-Tolerance, Dynamics

Service-oriented architecture

Exchange Trading Data Settlement RegistryIssuer Data

Pre-Trading Analytics Surveillance Broker

Middleware infrastructure

…

…Basic Services

Composite Services

Documents

Introduction to Distributed Systemspeople.cs.aau.dk/~bnielsen/DS-E05/material/intro.pdf · 2005-09-06 · concepts in distributed systems, knowledge about their construction, and