adam-howard
1. Definitions and Terminology
2. Developing distributed applications
3. Architecture
4. Design: middleware and services
5. Classes of distributed systems
6. Important architectures and platforms
7. The Java family
8. The .NET family
9. This course
Chapter 1
Introduction
1. Definitions and Terminology
   1. Introduction
   2. Examples
   3. Definitions
   4. Need for distributed systems
   5. Why are distributed systems different?
1. Definitions and terminology / 1. Introduction
Why this course?
A loose definition
distributed system =
• collection of interacting processes
• appearing as a single application to the end user
“The network is the computer”
- world wide web
- search portals for huge information stores
- grid computing
- P2P applications
- communication tools
- online gaming
- community applications
Technology = standard programming tools + sockets?
- specific, recurring problems due to distributed nature
- solve once, use many times
- middleware solves these problems in a generic way
1. Definitions and terminology / 2. Examples
The www-boom
WORLD INTERNET USAGE AND POPULATION STATISTICS
World Regions | Population (2008 Est.) | Population % of World | Internet Usage, Latest Data | % Population (Penetration) | Usage % of World | Usage Growth 2000-2008
Africa | 955,206,348 | 14.3 % | 51,022,400 | 5.3 % | 3.6 % | 1030.2 %
Asia | 3,776,181,949 | 56.6 % | 529,701,704 | 14.0 % | 37.6 % | 363.4 %
Europe | 800,401,065 | 12.0 % | 382,005,271 | 47.7 % | 27.1 % | 263.5 %
Middle East | 197,090,443 | 3.0 % | 41,939,200 | 21.3 % | 3.0 % | 1176.8 %
North America | 337,167,248 | 5.1 % | 246,402,574 | 73.1 % | 17.5 % | 127.9 %
Latin America/Caribbean | 576,091,673 | 8.6 % | 137,300,309 | 23.8 % | 9.8 % | 659.9 %
Oceania / Australia | 33,981,562 | 0.5 % | 19,353,462 | 57.0 % | 1.4 % | 154.0 %
WORLD TOTAL | 6,676,120,288 | 100.0 % | 1,407,724,920 | 21.1 % | 100.0 % | 290.0 %
Grid computing
“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”
[Kesselman & Foster, 1998]
• Geographically spread resources
• Resource heterogeneity
• No single entity of control (no big boss)
Grid computing
Large Hadron Collider (LHC) Computing Grid Project (LCG)
Requirements [2008]
• > 6000 users
• 20 PByte/year [20 × 10^6 CDs]
• > 70 000 GFLOP
File sharing
Some facts
• Origin : sharing (often illegal) content (KaZaA, BitTorrent, eMule, ...)
• Huge user base
• P2P internet traffic estimated at 70% of total traffic
New applications
• Voice over IP (Skype)
• personalized broadcast (Joost)
• sharing computational power, “desktop grids” (Folding@home)
Benefits
• resources grow with number of users
• better scaling behaviour
• better robustness (no central servers)
Mobile and ubiquitous computing
• Nomadic scenarios: access guaranteed at other locations (e.g. traveller accessing remote content, using local peripherals)
• Mobile scenarios: mobility (= moving user) support for applications (e.g. taking location into account)
• Ubiquitous computing scenarios: application fed with sensor info, steers actuators (e.g. cleaning robot)
• Ad-hoc scenarios: rapid deployment of infrastructure (often emergency/military scenarios)
1. Definitions and terminology / 3. Definitions
Distributed system =
A distributed system = a system where
• hardware and software components are located at networked computers
• components communicate and coordinate their actions ONLY by passing messages
Consequences
• no limit on spatial extent
• no global time notion
• almost always concurrent execution
• (partial) failures likely to happen
Service =
A service = part of a computer system managing a collection of related resources, presenting their functionality to users and applications
Examples
• storage service
• print service
• streaming service
• connectivity service
• transcoding service
[Figure: streaming content to mobile clients, combining a network service, streaming service, storage service, transcoding service and user management]
Client - Server
A server = running process on a networked computer
• accepting requests to perform a service
• responding appropriately
A client = running process on a networked computer sending service requests to servers
[Figure: clients send requests to a server; the server returns responses]
Remote invocation =
A remote invocation = complete interaction between client and server to process single request
[Figure: one remote interaction = request from client to server plus the matching response; a second diagram shows two clients and one server, with request 1 / response 1 forming remote interaction #1 and request 2 / response 2 forming remote interaction #2]
1. Definitions and terminology / 4. Need for distributed systems
Why distributed systems?
Many problems
• no limit on spatial extent, difficult to manage
• no global time notion
• almost always concurrent execution
• (partial) failures likely to happen
Many advantages
• resource sharing
  - information resources
  - hardware resources
• scalability: “easy” to adapt to larger user base
• fault tolerance: “easy” to cope with failures
1. Definitions and terminology / 5. Why different?
Distributed system vs. standalone system
Standalone system | Distributed system
Communication between processes possible through shared memory. | Communication between processes only through the network.
Global state possible through shared memory. | No global state.
Local operating system can be used to get a shared time value. | No global notion of time.
Local operating system offers primitives to safely share resources. | Synchronisation between processes through network communication only.
Failing processes easy to detect (communication is reliable). | (Partial) failures difficult to detect (because of unreliable communication).
Infrastructure known beforehand. | Infrastructure can vary dynamically (new servers/nodes can become available).
Components are located on the standalone machine. | Location of components can vary dynamically.
Local operating system enforces security. | Security threat due to distributed nature (possibly vulnerable communication link).
Faulty communication or faulty process ?
[Figure: Object 1 invoking Object 2 in a normal scenario and in two failure scenarios, communication failure [1] and process failure [2]; the diagram labels read “state update of Object 2” (twice) and “NO state update of Object 2” (once)]
How can Object 1 distinguish between [1] and [2]?
2. Developing distributed applications
   1. The development process
   2. System models
   3. Challenges for developing distributed applications
2. Developing distributed apps / 1. Development process
Developing standalone applications
Requirements analysis phase (WHAT? [OOA])
• required functions
• non-functional constraints
-> formalised requirements, UML diagrams capturing requirements
Architectural phase (HOW?)
-> high level architecture: subsystems and interfaces
Design phase
-> detailed design (in parallel for each subsystem)
Implementation phase
-> code + documentation (in parallel)
Integration phase
-> combine realised subsystem code
Developing distributed applications
Requirements analysis phase
• required functions
• non-functional constraints
-> formalised requirements, UML diagrams capturing requirements
Architectural phase
-> high level architecture: subsystems and interfaces
• Functional architecture
• Logical architecture: how will the subsystems interact (e.g. eventing, remote method call, database, ...)
• System architecture: client-server or P2P?
• System models: non-functional constraints relating to distributed nature
Design phase
+ Middleware services
+ Middleware platform
2. Developing distributed apps / 2. System models
System models
-> capture the non-functional requirements related to the distributed nature
-> Interaction model
   can we give time guarantees on network communication/process execution?
-> Failure model
   which failures should the application cope with?
   how will we detect failures?
   what will we do in case of failures?
-> Security model
   who can do what in the system?
Interaction model
A synchronous system is a system where
1. The time to execute each process step has a known lower and upper bound
2. Each message is received within a known bounded time
3. Each process is controlled by a local clock with a known bound on its drift rate w.r.t. real time
An asynchronous system is a system where at least one of the above conditions does not hold.
[Figure: two sequence diagrams, one with objects a:A, b:A, c:A exchanging m1() and m2(), one with objects a:B, b:B, c:B exchanging n1() and n2()]
An asynchronous system needs special algorithms to detect event order.
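One well-known family of such algorithms is logical clocks. The slides do not name a specific algorithm, so the following is only a minimal sketch of a Lamport-style logical clock; the class and method names are illustrative assumptions.

```python
class LamportClock:
    """Logical clock: orders events without synchronised physical time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event (including sending a message): advance the counter.
        self.time += 1
        return self.time

    def receive(self, sent_time):
        # On receipt, jump past the sender's timestamp so that
        # "send happens before receive" shows up in the clock values.
        self.time = max(self.time, sent_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
t_send = a.tick()            # process a sends a message stamped with its clock
b.tick()
b.tick()                     # process b performs two unrelated local events
t_recv = b.receive(t_send)   # process b receives the message
print(t_send, t_recv)
```

The receive timestamp is always larger than the send timestamp, whatever the speed of the two processes or of the network.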
Failure model
Omission failures
a component fails to perform an action
• Process omission : a process does not fulfill a request
• Channel omission : a communication link fails to deliver a message
Byzantine failures
arbitrary behavior (e.g. a process giving the wrong result)
Timing failures
a synchronous system is violating one (or more) of its timing constraints
Detection mechanisms
• heartbeat
• timeout
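The heartbeat and timeout mechanisms can be combined into a simple failure detector. The sketch below is an assumption-laden illustration, not a prescribed design: the class name, the injectable clock and the process names are all hypothetical.

```python
import time


class HeartbeatDetector:
    """Suspects a process when no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock      # injectable so the example is deterministic
        self.last_seen = {}

    def heartbeat(self, process_id):
        # Called whenever an "I am alive" message arrives.
        self.last_seen[process_id] = self.clock()

    def suspected(self):
        # A missing heartbeat only makes a process *suspected*: in an
        # asynchronous system the message may simply be late or lost.
        now = self.clock()
        return sorted(p for p, t in self.last_seen.items()
                      if now - t > self.timeout)


fake_now = [0.0]                                   # simulated time
detector = HeartbeatDetector(timeout=3.0, clock=lambda: fake_now[0])
detector.heartbeat("server-1")
detector.heartbeat("server-2")
fake_now[0] = 2.0
detector.heartbeat("server-1")                     # only server-1 keeps reporting
fake_now[0] = 4.0
print(detector.suspected())
```

Choosing the timeout is a trade-off: a short value detects failures quickly but raises false suspicions when the network is slow.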
Security model
Security threats
• Process
  - server : enemy can use intercepted credentials
  - client : enemy can pretend to be server
• Channels
  - copy, change, replay, create messages
• Privacy issues
Solution
• Authentication/Authorization infrastructure
• + Cryptographic solutions (shared secret, asymmetric keys, ...)
2. Developing distributed apps / 3. Challenges
Challenges for developing distributed apps
Challenge : Heterogeneity
• physical/L2 network
• hardware
• operating system
• programming languages
Solution : common (standard) protocols, virtualization, virtual machines, middleware
Challenge : Security
• privacy (e.g. financial/medical data)
• secure accounting
• prove identity
• viruses
• (distributed) denial of service attacks
• mobile code
Solution : cryptographic infrastructure, authorization/authentication, virus scanning, firewall, VPN, sandboxing
Challenge : Scalability
• scaling of infrastructure cost: physical cost = O(#users)
• performance scaling
• scaling of logical resources
• performance bottlenecks
Solution : clever design, automatic replication of components
Challenges for developing distributed apps
Challenge : Failures
• failure detection : polling, heart beat, checksum, ...
• failure handling
  - masking : retransmit messages, fail over
  - tolerate : inform user of (partial) failure
  - recover : roll back to ensure consistency
• high availability : redundancy and replication
Challenge : Concurrency
• resource access conflicts : lock mechanisms, synchronization primitives
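The masking strategy listed under failure handling (retransmit messages) can be sketched as follows. The lossy channel, the function names and the acknowledgement format are hypothetical and only simulate message loss.

```python
import random


def unreliable_send(message, loss_rate, rng):
    """Hypothetical channel: returns an ack, or None if the message was lost."""
    return None if rng.random() < loss_rate else f"ack:{message}"


def send_with_retransmit(message, max_attempts, loss_rate, rng):
    # Masking: the caller sees one successful send even if several
    # transmissions were lost. In a real system the ack itself can be
    # lost too, so the receiver must tolerate duplicate messages.
    for attempt in range(1, max_attempts + 1):
        ack = unreliable_send(message, loss_rate, rng)
        if ack is not None:
            return ack, attempt
    # Masking failed: fall back to "tolerate", i.e. report the failure.
    raise TimeoutError(f"no ack for {message!r} after {max_attempts} attempts")


rng = random.Random(42)
ack, attempts = send_with_retransmit("update", max_attempts=5,
                                     loss_rate=0.5, rng=rng)
print(ack, "after", attempts, "attempt(s)")
```

When every attempt fails, the sketch raises an error so that the application can inform the user or roll back.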
Challenges for developing distributed apps
Challenge : Transparency
Solution : Middleware
• Access transparency : identical operations to access local and remote resources
• Location transparency : access of resources without location knowledge
• Concurrency transparency : provide safe resource sharing amongst concurrent processes
• Replication transparency : replicated resources are used implicitly
• Failure transparency : hide hardware/software failure
• Mobility transparency : resources/clients can move without affecting system operation
• Performance transparency : system can be reconfigured according to load conditions
• Scaling transparency : system can expand without changing structure and algorithms
3. Architecture
   1. Logical architecture
   2. System architecture
3. Architecture / 1. Logical architecture
Logical architecture
= architecture capturing how subsystems will interact, also referred to as an “architectural style”
Layered architecture
• layer only interacts with neighbour
+ reduced number of interfaces, dependencies
+ easy replacement of a layer
- possible duplication of functionality
[Figure: Components 1 to 3 stacked as Layers 1 to 3, with interactions numbered 1 to 4 between neighbouring layers; a direct interaction that skips a layer is marked “Layer violation”]
Logical architecture
Interacting objects
• no predefined interaction pattern
+ highly flexible
- complex to manage and maintain
[Figure: Components 1, 2 and 3 calling each other freely, interactions numbered 1 to 10]
Logical architecture
Event based interaction
• “publish-subscribe” style
+ loose coupling of components
• related : message based interaction (also decoupling in time)
• often used to integrate legacy systems
[Figure: Components 1, 2 and 3 connected to an event system. 1: subscribe, 2: subscribe, 3: publish, 4: event delivery, 5: event delivery]
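The subscribe, publish and event delivery steps can be illustrated with a minimal in-process event system. This is a sketch only: the class name, the topic name and the callbacks are assumptions, and a real event service would deliver events across the network.

```python
from collections import defaultdict


class EventSystem:
    """Minimal in-process publish-subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        # The publisher does not know who receives the event:
        # this is the loose coupling mentioned above.
        for callback in self._subscribers[topic]:
            callback(event)


received = []
events = EventSystem()
events.subscribe("stock", lambda e: received.append(("component 1", e)))  # 1: subscribe
events.subscribe("stock", lambda e: received.append(("component 2", e)))  # 2: subscribe
events.publish("stock", "price changed")                                  # 3: publish
print(received)                                      # 4 and 5: event delivery
```

The publishing component never references the subscribers directly, so components can be added or removed without changing the publisher.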
Logical architecture
Data centric architecture
• only interaction through shared database
+ loose coupling of components
- possibly slow (central bottleneck, locking, ...)
[Figure: Component 1 and Component 2 interacting through a shared data store. 1: publish data, 2: query data]
3. Architecture / 2. System architecture
System architecture
How to map logical components to actually deployed components (replicated?, P2P, pure client-server, ...)
Client-server architectures
“simple” client-server
  1. request
  2. reply
client-multiserver (explicit server lookup), e.g. DNS based load balancing
  1. get server
  2. serverID
  3. request
  4. reply
System architecture
client-multiserver (implicit server lookup)
  1. request
  2. forward request
  3. reply
proxy server
  1. request
  2. forward request
  3. reply
  4. forward reply
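The four proxy steps can be sketched as below. The slides do not say how the proxy picks a server, so the round-robin choice is an assumption, as are the class name and the use of plain callables as stand-ins for servers.

```python
import itertools


class Proxy:
    """Hypothetical proxy that hides a pool of servers from the client."""

    def __init__(self, servers):
        # Round-robin selection is an assumption made for this sketch.
        self._next_server = itertools.cycle(servers)

    def request(self, payload):       # 1. request (client -> proxy)
        server = next(self._next_server)
        reply = server(payload)       # 2. forward request, 3. reply
        return reply                  # 4. forward reply (proxy -> client)


# Servers are modelled as plain functions for illustration only.
servers = [lambda p: f"server-1 handled {p}",
           lambda p: f"server-2 handled {p}"]
proxy = Proxy(servers)
replies = [proxy.request("a"), proxy.request("b"), proxy.request("c")]
print(replies)
```

The client only ever talks to the proxy, so servers can be added or replaced without the client noticing.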
P2P architectures
servent = process issuing requests + fulfilling requests
peering components use a logical network (overlay) to interact
Unstructured P2P :
- links have no other meaning than communication
- unstructured search (e.g. flooding) : resource consuming
Structured P2P :
- logical link has relation to service offered
- structured search much more efficient
- important data structure : Distributed Hash Table (DHT)
Why P2P?
• scalability (10000 – 100000 users not exceptional)
• gradual scaling (“organic growth”)
• robustness
P2P : file sharing
Unstructured
• Mediated P2P : Napster, Audiogalaxy
• Pure P2P : early Gnutella, FreeNet
• Hybrid P2P : KaZaA
[Figure legend: control traffic (index, lookup); data traffic (download, upload)]
P2P : file sharing
Structured
content item -> key = hash(itemName) -> nodeID = f(key)
Chord
• each node has integer nodeID
• f(key) = min {nodeID | nodeID ≥ key}
e.g. consider 5 node network {nodeID} = {3,7,10,12,15}, {key} = {0 .. 15}
  node 3  : keys {0,1,2,3}
  node 7  : keys {4,5,6,7}
  node 10 : keys {8,9,10}
  node 12 : keys {11,12}
  node 15 : keys {13,14,15}
see chapter 7 for more details
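The mapping f(key) = min {nodeID | nodeID ≥ key} can be checked with a few lines of code. This is a minimal sketch using the five-node example network; the function name is an assumption, and the wrap-around for keys above the largest nodeID goes beyond the slide's formula (it is how Chord's identifier ring behaves, and is not triggered in this example).

```python
import bisect


def responsible_node(key, node_ids):
    """f(key) = min {nodeID | nodeID >= key}, wrapping around the ring."""
    ring = sorted(node_ids)
    index = bisect.bisect_left(ring, key)   # first nodeID >= key
    # Keys larger than every nodeID wrap around to the smallest node.
    # Not needed in this example, where the largest nodeID is 15.
    return ring[index % len(ring)]


nodes = [3, 7, 10, 12, 15]
assignment = {n: [k for k in range(16) if responsible_node(k, nodes) == n]
              for n in nodes}
for node, keys in assignment.items():
    print(node, keys)
```

Applying the formula to every key in {0 .. 15} assigns key 13 to node 15, together with keys 14 and 15.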
4. Design: middleware and services
4. Design : MW and services
Role of middleware
1) Abstract hardware/software platform
• operating system
• implementation language
• ...
2) Realize transparency, with focus on location transparency
[Figure: a distributed application spread over Host A (hardware A, operating system A) and Host B (hardware B, operating system B), with middleware spanning both hosts; the logical interaction between the application parts is realised as remote invocation over sockets and NICs]
Role of middleware
3) Provide generic services
• Naming service : associate logical names to remote entities
• Discovery and registration service : easy service lookup and registration
• Transaction service : support distributed (database) transactions
• Security service : support to authenticate and authorize users
• Event service : distributed notification of events
• Data replication : offer the same data at different locations to optimize data access
• Persistence service : transparently offer data persistence (i.e. data is automatically stored in a persistent medium)
• Life cycle service : managing service components (start up of services, shut down, etc.)
Middleware layer
[Figure: layer stack with computer/network hardware, operating system, middleware, and applications and services; a bracket labelled “Platform”]
Middleware layer contents
• Remote procedure call
• Remote method invocation
• Inter-process communication
• Event notification
• Data replication
• Streaming
• Naming service
• Transaction service
• Security
• Persistent storage
• Discovery/registration service
5. Classes of distributed systems
5. Classes of distr. syst.
Taxonomy
[Figure: taxonomy of distributed systems along two axes]
Responsiveness axis : do not care, soft real time, hard real time
Functionality axis : data collection, transactional, computation, storage
Classes shown
• E-science grids/clusters
• Service grids
• Multimedia grids
• Business applications (e.g. web shop, banking, …)
• Infrastructure management (e.g. VoIP system)
• Off-line monitoring systems (e.g. national pollution measurement network)
• E-health monitoring systems (e.g. monitoring of life-critical functions in a hospital)
6. Important architectures and platforms
   1. Thin versus thick?
   2. Grid and cluster systems
   3. Transactional systems
   4. Integration
6. Architectures and Platforms / 1. Client thick/thin
How thick is the client?
Thin client
+ cheaper and many devices involved -> overall cheaper
+ cheaper to update (controlled by server), and fewer updates needed
- some functions more natural to support on client
- permanent network connection needed
Technologies
- web browser
  + ubiquitous and uniform client
  + no network blockages (firewalls) for HTTP (port 80)
  + huge user base
  + browser functionality can be extended
- remote desktop clients (cf. Athena), “screen scrapers”
6. Architectures and Platforms / 2. Grids and clusters
Clusters and grids
Cluster middleware
• job scheduling
• cluster monitoring
• user administration
• matchmaking
• job submission
Important technologies : Sun GridEngine, Mosix
Grid middleware
• meta-scheduling
• abstraction of heterogeneity
• information service
Important technologies : Globus (and descendants), gLite (used in EGEE(+))
interacting jobs : Message Passing Interface (MPI)
Grid@UGent
6. Architectures and Platforms / 3. Transactional systems
3-tier business architecture
• layered architecture
• rationale : separation of concerns
[Figure: client (user interface) and server (presentation layer, business logic layer, persistent data storage)]
Client : user interface
• Thin client (web, screenscraper)
• Thick client
Server : presentation layer and business logic layer
• Web components (UI, webservices)
• non web components
• plain old objects
Persistent data storage
• Relational DB
• Object relational mapping
• Custom solution
6. Architectures and Platforms / 4. Integration
[Figure: Application 1, Application 2 and Application 3 connected through an integration solution]
Integration solutions :
- Enterprise service bus (SOAP based)
- Message Oriented Middleware
7. The Java technology family
   1. General architecture
   2. Client tier
   3. Web tier
   4. Business tier
   5. Web services
7. The Java family / 1. General architecture
Java Enterprise Edition (JEE)
[Figure: JEE tiers]
Client
• Browser (thin client), connects over HTTP
• Application Client (thick client), connects over RMI/CORBA
Server
• Web Tier : JSP, Servlet (presentation)
• Business Tier : EJB, Entities (business logic)
• Enterprise Information System (persistent data storage)
The container concept
[Figure: a client interacting with a component]
• high level “programmer's” view on the interaction : the client calls the component directly
• mediated interaction by container to implement services (e.g. security checks, load balancing, ...)
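The idea of a container mediating every call can be sketched in a few lines. This is not the Java EE API: the class names, the `invoke` method and the user list are hypothetical and only illustrate how services such as security checks stay out of the component code.

```python
class Container:
    """Hypothetical container: wraps a component and mediates every call."""

    def __init__(self, component, allowed_users):
        self._component = component
        self._allowed_users = set(allowed_users)
        self.call_log = []

    def invoke(self, user, method, *args):
        # Service 1: security check before the component is reached.
        if user not in self._allowed_users:
            raise PermissionError(f"{user} may not call {method}")
        # Service 2: bookkeeping the component itself knows nothing about.
        self.call_log.append((user, method))
        return getattr(self._component, method)(*args)


class Greeter:
    """The component: plain business logic, no security code."""

    def greet(self, name):
        return f"Hello, {name}"


container = Container(Greeter(), allowed_users={"alice"})
print(container.invoke("alice", "greet", "world"))
```

From the programmer's point of view the client still calls `greet`; the container adds its services around that call.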
7. The Java family / 2. Client tier
The container concept
Thin client
Container = browser
• runs HTTP protocol
• services
  - install plug-ins
  - run security checks
  - sandboxing
Thick client
Container = appclient
• runs RMI/CORBA/... protocol
• middleware services
  - naming and lookup
  - security checking
  - ...
7. The Java family / 3. Web tier
Servlet
= server side plug-in
= base technology for all Java web tier components
Basic operation
1. browser sends request to URL referring to a Servlet instance (HTTP request contains parameters, HTTP headers, ...)
2. Servlet gets HTTPRequest object from web server
3. Servlet analyzes request
4. Servlet prints HTML output to HTTPResponse object
5. HTTPResponse is sent to web server
6. Web server sends HTML to client
Servlet
Container services
• Servlet life cycle management
• Servlet pooling
• Session tracking
• Providing access to other resources (e.g. a database)
• Providing access to other components (other Servlets, EJBs, …)
[Figure: the web client sends an HTTP POST to the web server; the web server INVOKEs the Servlet inside the Servlet container through init() and service(HTTPRequest, HTTPResponse); the Servlet can access a database; the web server returns the HTTP REPLY to the web client]
Java Server Pages (JSP)
Servlet disadvantage
• pages contain static + dynamic data
• static data created dynamically -> not optimal
Solution
• mix static and dynamic code in single webpage
• compile webpage to extract servlets and static portions
• configure webserver to
  - automatically invoke servlets to create dynamic content
  - merge dynamic and static content into the response to the client
[Figure: a JSP (Java Server Page) with active and passive parts is processed by the JSP compiler into a Servlet class and HTML code]
Java Server Faces (JSF)
Further optimization of UI development
• Abstracts UI-component functions from appearance
• Appearance defined in separate Renderer-object
• Easy to reuse UI for other clients (e.g. mobile)
JSF captures
• UI component state management
• Page navigation (including linking to JSPs)
• Input validation
• Event handling
JSF compiles to JSP/Servlet
(Apache) Struts
= framework to “force” web developers to follow the MVC architectural pattern
• Model : data of the application
• View : presentation of the data
• Controller : linking the view and model
In Struts
• Model : plain Java code
• View : JSP
• Controller : declarative
Basic functioning
1. ActionServlet gets HTTPRequest
2. ActionServlet finds Action-class to forward request to (from configuration file)
3. Action-object handles request
Struts compiles to JSP/Servlet
7. The Java family / 4. Business tier
Beans and entities
[Figure: Java EE container holding a web container (JSP, Servlet) and an EJB container (EJB); Entities are handled by a persistence provider backed by a relational database; external resources]
Beans and entities
EJB container services
• component life cycle management
• transaction processing
• security handling
• persistence
• remotability
• timer
• state management
• resource pooling
• messaging
Persistence provider services
• store/retrieve from DB
• optimization through caching
Beans and entities
Session Beans (synchronous)
- session related object
- always associated to one single client at most
- types : stateful (“conversational state”), stateless
Entities
- “real life” object
- mostly associated to “row in database”
- persistent
- NEW since EJB3.0 [replace (very) complex EntityBeans]
- NOT managed by EJB container
Message Beans (asynchronous)
- asynchronous message handling
- connected to JMS compliant message queue
- new since J2EE 1.3
7. The Java family / 5. Web services
Web services
What?
= “interface” technology
• specifies how to access remote objects through the web
• typically : SOAP (Simple Object Access Protocol) over HTTP
Pro and con?
+ language and platform neutral
+ often uses HTTP as underlying protocol (no network discontinuities)
- performance (large messages, complex parsing)
- only stateless invocations
JEE and web services
• POJOs and session beans can be exposed as web services
8. The .NET technology family
8. The .NET family
.NET
Very similar functions as in JEE
Web tier
• web service (SOAP) heavily used
• ASP is counterpart for JSP
• .NET framework for web tier development : ASP.NET
Business tier
• COM+ component model ported to .NET
• “.NET enterprise services”
  - resource pooling
  - eventing
  - security
  - transaction support
  - synchronization
9. This course
9. This course
Structure
Chapter 1 : Introduction
Chapter 2 : Remote invocation
Chapter 3 : Middleware Services
Chapter 4 : Global state
Chapter 5 : Coordination
Chapter 6 : Modeling distributed systems
Chapter 7 : Peer-to-peer systems
Chapter 8 : Grids and clusters
Chapter 9 : Workflow oriented systems
Chapter 10 : Case studies
[Figure labels grouping the chapters: Challenges for distributed systems; Classes of distributed systems; Transactional client-server systems]