41
Ischia, Italy - 9-21 July 2006 1 Session 2 Monday 10 th July Malcolm Atkinson

Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 1

Session 2

Monday 10 th July

Malcolm Atkinson

Page 2: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 2

Page 3: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 3

Principles of Distributed ComputingPrinciples of Distributed Computing

Issues you can’t avoid– Lack of Complete Knowledge (LOCK)– Latency– Heterogeneity– Autonomy– Unreliability– Change

A Challenging goal– balance technical feasibility– against virtual homogeneity, stability and reliability

Appropriate balance between usability and productivity

– while remaining affordable, manageable and maintainable

This is NOT easy

Page 4: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 4

Lack of Complete KnowledgeLack of Complete Knowledge

• Technical origins of LoCK– Dynamics of systems involve very large state

spaces• Can’t track or explore or all the states

– Latency prevents up-to-date knowledge being available

• By the time a notification of a state change arrives the state may have changed again

– Failures inhibit information propagation

– Unanticipated failure modes

Page 5: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 5

Lack of Complete Knowledge 2Lack of Complete Knowledge 2

• Human origins of LoCK– lack of understanding

– Incomplete simplified models– Intractable models

– Poor & incomplete descriptions– Erroneous descriptions

• Socio-Economic effects generate LoCK– Autonomous owners do not choose to reveal all

• About their services, resources and performance

– Intermediaries aggregate & simplify

Page 6: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 6

LoCKLoCK Counter StrategiesCounter Strategies

• Improve the quality of the available knowledge– Better static information

– Better information collection & dissemination

• Improve quality of the Distributed System Models– Prove invariants that algorithms can exploit

– Test axioms with real systems

• Build algorithms that behave reasonably well– When they have incomplete knowledge

Page 7: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 7

LatencyLatency

• It is always going to be there– Consequence of signal transmission times– Consequence of messages / packets in queues– Consequence of message processing time– Errors cause retries

• It gets worse– Geographic scale increases latency– System complexity increases number of queues– System scale & complexity increase processing time

• Think about– How many operations a system can do while a message it

sent reaches its destination, a reply is formed and the reply travels back

Page 8: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 8

Latency Counter StrategiesLatency Counter Strategies

• Design algorithms that require fewer round trips– This is THE complexity measure!– Batching requests and responses

• Shorten distance to get information– Caching

• But may be stale data!

– Move data to computation• But be smart about which data when

– Move computation to data• Succinct computation & volumes of data• But safety and privacy issues arise

Page 9: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 9

HeterogeneityHeterogeneity

• Hardware variation– Different computer architectures

• Big endians v little endians• Number representation• Address length• Performance

– Different Storage systems• Architectures• Technologies• Available operations

– Different Instrument systems• Accepting different control inputs• Generating different output data streams

Some of the

variation is wanted

and exploited

Page 10: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 10

Heterogeneity 2Heterogeneity 2

• Operating System variation– Different O/S architectures

• Unix families & versions• Windows families and versions• Specialised O/S,

e.g. for Instruments & Mobile devices

• Implementation system variation– Programming languages– Scripting languages– Workflow systems– Data models– Description languages– Many implementations of same functionality

Some of the

variation is just makes work

Page 11: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 11

Heterogeneity Counter MeasuresHeterogeneity Counter Measures

• Invest in virtual Homogeneity– Agree standards (formally or de facto)– Introduce intermediate code

• That hides unwanted variation• Presenting it in standard form

– But this has high cost• Developing the standard• Developing the intermediate code• Executing the intermediate code

– It may hide variations some want• Provide direct access to facilities as well• But this may inhibit optimisation & automation

May be work in progress

Page 12: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 12

Heterogeneity Counter Measures 2Heterogeneity Counter Measures 2

• Automatically manage diversity– Manual agreement and construction of virtual

homogeneity will not scale & compose– Develop abstract and higher level models– Describe each component

– Generate the adaptations as needed from these descriptions

• Not yet achievable for the general & complete systems– Relevant for specific domains

Page 13: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 13

Autonomy and ChangeAutonomy and Change

• Necessary– To persuade organisations & individuals to

engage• They need to control their own facilities• They have best knowledge to develop their services• Their business opportunity

– Because coordinated change is unachievable• Systems & workloads are busy• Service commitments must be met• Large-scale scheduling of work is very hard

– To correct errors– To plug vulnerabilities

Page 14: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 14

Autonomy and Change 2Autonomy and Change 2

• What changes – local decisions– The underlying technology delivering a service

– The operations available from a service– The semantics of the operations

– Policy changes, e.g. authorisation rules, costs, …

• What changes – corporate decisions– Some agreed standard is changed

• E.g. a new version of a protocol is introduced

Page 15: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 15

Autonomy and change Counter MeasuresAutonomy and change Counter Measures

• Users & other providers expect stability

• Agree some standards that are rarely changed– As a platform & framework– As a means of communicating change

• Introduce change-absorbing technology– Mark the protocols and services with version information– Transform between protocols when changes occur– Anneal the change out of the system

• Develop algorithms tolerant to change– Revalidate dependencies where they may change– Handle failures due to change

Page 16: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 16

UnreliabilityUnreliability

• Failures are inevitable– Equipment, software & operations errors– Network outages, Power outages, …

• Their effects must be localised– Cannot afford total system outages

• This is not easy– Each error may occur when system is in any state– The system is an unknown composition of subsystems– Errors often occur while other errors are still active– Errors often occur during error recovery actions– Errors may be caused by deliberate attack– Attackers may continue their attack

Page 17: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 17

Unreliability Counter MeasuresUnreliability Counter Measures

• Requires much R&D

• Continuous arms race as scale of Grids grow• Ideal of a continuously available stable service

– Not achievable – recognise that drops in response and local failures must be dealt with

• Design resilient architectures• Design resilient algorithms

• Improve reliability of each component• Distribute the responsibility

– For failure detection– For recovery action

Page 18: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 18

Page 19: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 19

Three ComponentsThree Components

Registries

Service Consumers

Services

Register an available serviceSend name & description

Page 20: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 20

Three ComponentsThree Components

Registries

Service Consumers

Services

Request a serviceSend a description

Page 21: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 21

Three ComponentsThree Components

Registries

Service Consumers

ServicesRequest service operation

Page 22: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 22

Three ComponentsThree Components

Registries

Service Consumers

ServicesReturn result or Error

Page 23: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 23

Composed behaviourComposed behaviour

• Services are themselves consumers– They may compose and wrap other services

• The registry is itself a consumer• A federation of registries may deal with

registry services reliability & performance• Observer services may report on quality of

services and help with diagnostics• Agreements between services may be set up

– Service-Level Agreements– Permitting sustained interaction

Page 24: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 24

Composed behaviourComposed behaviour

• Services are themselves consumers– They may compose and wrap other services

• The registry is itself a consumer• A federation of registries may deal with

registry services reliability & performance• Observer services may report on quality of

services and help with diagnostics• Agreements between services may be set up

– Service-Level Agreements– Permitting sustained interaction

Page 25: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 25

Page 26: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 26

The Open Grid Services Architecture

• An open, service-oriented architecture (SOA)− Resources as first-class entities− Dynamic service/resource creation and destruction

• Built on a Web services infrastructure

• Resource virtualization at the core

• Build grids from small number of standards-based components− Replaceable, coarse-grained− e.g. brokers

• Customizable− Support for dynamic, domain-specific content…− …within the same standardized framework

Hiro Kishimoto: Keynote GGF17

Page 27: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 27

• Logical view of capabilities

• Relatively coarse-grained functions

• Reusable and composable behaviors

• Encapsulation of complex operations

• Naturally extendable framework

• Platform-neutral− machine and OS

Why Use an SOA?

Hiro Kishimoto: Keynote GGF17

Page 28: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 28

SOA & Web Services: Key Benefits

SOA• Flexible

− Locate services on any server− Relocate as necessary− Prospective clients find services using

registries

• Scalable− Add & remove services as demand

varies

• Replaceable− Update implementations without

disruption to users

• Fault-tolerant− On failure, clients query registry for

alternate services

SOA• Flexible

− Locate services on any server− Relocate as necessary− Prospective clients find services using

registries

• Scalable− Add & remove services as demand

varies

• Replaceable− Update implementations without

disruption to users

• Fault-tolerant− On failure, clients query registry for

alternate services

Web Services• Interoperable

− Growing number of industry standards

• Strong industry support• Reduce time-to-value

− Harness robust development tools for Web services

− Decrease learning & implementation time

• Embrace and extend− Leverage effort in developing and

driving consensus on standards− Focus limited resources on

augmenting & adding standards as needed

Web Services• Interoperable

− Growing number of industry standards

• Strong industry support• Reduce time-to-value

− Harness robust development tools for Web services

− Decrease learning & implementation time

• Embrace and extend− Leverage effort in developing and

driving consensus on standards− Focus limited resources on

augmenting & adding standards as needed

Hiro Kishimoto: Keynote GGF17

Page 29: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 29

Virtualizing Resources

Resources

Webservices

AccessAccess

StorageStorage SensorsSensors ApplicationsApplications InformationInformationComputersComputers

Resource-specific InterfacesResource-specific Interfaces

Common Interfaces

Type-specific interfaces

Hiro Kishimoto: Keynote GGF17

Page 30: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 30

A Service-Oriented Grid

Virtualized resources

Grid middleware

servicesBrokering

Service

Brokering Service

Registry Service

Registry Service

DataService

DataService

CPU Resource

CPU Resource Printer

Service

Printer Service

Job-Submit Service

Job-Submit Service

ComputeService

ComputeService

Notify

Advertise

ApplicationService

ApplicationService

Hiro Kishimoto: Keynote GGF17

Page 31: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 31

A Closer Look at OGSA

Hiro Kishimoto: Keynote GGF17

Page 32: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 32

OGSA Capabilities

Security• Cross-organizational users• Trust nobody• Authorized access only

Security• Cross-organizational users• Trust nobody• Authorized access only

Information Services• Registry• Notification• Logging/auditing

Information Services• Registry• Notification• Logging/auditing

Execution Management• Job description & submission• Scheduling• Resource provisioning

Execution Management• Job description & submission• Scheduling• Resource provisioning

Data Services• Common access facilities• Efficient & reliable transport• Replication services

Data Services• Common access facilities• Efficient & reliable transport• Replication services

Self-Management• Self-configuration• Self-optimization• Self-healing

Self-Management• Self-configuration• Self-optimization• Self-healing

Resource Management• Discovery• Monitoring• Control

Resource Management• Discovery• Monitoring• Control

OGSAOGSA

OGSA “profiles”OGSA “profiles”

Web services foundationWeb services foundation

Hiro Kishimoto: Keynote GGF17

Page 33: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 33

CDLCDL

3. Select from or deployrequired resources

3. Select from or deployrequired resources

Execution Management

• The basic problem− Execute and manage jobs/services in the grid− Select from or provision required resources

• The basic problem− Execute and manage jobs/services in the grid− Select from or provision required resources

2. Submit the job2. Submit the job

1. Describe the job1. Describe the job

JSDLJSDL

Job

4. Manage the job4. Manage the job

Hiro Kishimoto: Keynote GGF17

Page 34: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 34

IssuesIssues

Find Describe

Access

DataDataDataData

Formats

ProtocolsProtocols

Use casesUse cases

DataDataDataData

DataData

DataDataMove/Copy/ReplicateMove/Copy/Replicate

MetadataMetadata

DataData

ManageManage

Common

accessCommon

access

Data Services

The basic problem

• Manage, transfer and access distributed data services and resources

The basic problem

• Manage, transfer and access distributed data services and resources

Derived dataCatalog

Sensor Data stream

Text file

Relational database

Hiro Kishimoto: Keynote GGF17

Page 35: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 35

Resource Management

• Provides a framework to integrate resource management functions− interfaces, services, information models, etc.

• Enables integrated discovery, monitoring, control, etc.

• Provides a framework to integrate resource management functions− interfaces, services, information models, etc.

• Enables integrated discovery, monitoring, control, etc.

High-levelmanagementservices(GGF)

Domain-specific capabilitiesDomain-specific capabilities

OGSA

Access tomanageability(OASIS, DMTF)

Informationmodels (DMTF,SNIA, etc.)

Resources

WSDM, WS-ManagementWSDM, WS-Management

WSRF/WSN, WS-Transfer/EventingWSRF/WSN, WS-Transfer/Eventing

Dataservices

Securityservices

ExecutionManagement

services

Application-specific

Hiro Kishimoto: Keynote GGF17

Page 36: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 36

Information Services

Executionmanagement

Resourcereservation

Problemdetermination

Accounting

Applicationmonitoring

Loadbalancing

Servicediscovery

ConsumersConsumersInformationServices

InformationServices

• Reliable• Secure• Efficient

Provide management and access facilities for informationabout applications and resources in the grid environment

Provide management and access facilities for informationabout applications and resources in the grid environment

ProducersProducers

Asynchronousnotification

Retrieval

RegistryRegistry

LoggerLogger

Hiro Kishimoto: Keynote GGF17

Page 37: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 37

Specifications Landscape: April 2006

SYSTEMSMANAGEMENT

UTILITYCOMPUTING

GRIDCOMPUTING

Core Services

Web ServicesFoundation WS-Addressing

Privacy

WS-BaseNotification

CIM/JSIM

WSRF-RAP

WSDM

WS-Security

Naming

OGSA-EMSOGSA Self Mgmt

GFD-C.16

GGF-UR

Data Model

HTTP(S)/SOAP

Discovery

SAML/XACML

WSDL

WSRF-RL

Trust

WS-DAI

VO Management

Information

Distributed query processing

ASP

Data CentreUse Cases &Applications Collaboration Multi MediaPersistent Archive

Data Transport

WSRF-RP

X.509

StandardEvolvingGapHole

Warning: Volatile data!Warning: Volatile data!

Hiro Kishimoto: Keynote GGF17

Page 38: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 38

Page 39: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 39

GridsGrids

• Many reasons motivating investment in grids– Collaboration for Global Science & Business– Resource integration & sharing

• New approach to large scale distributed systems– Large coordinated effort– Industry & Academia

• Many technical and socio-economic challenges– Work for you all

• Many new opportunities– Work for you all

Page 40: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 40

Summary: Take home messageSummary: Take home message

• E-Infrastructure is arriving– Built on Grids & Web Services

– Data and Information grow in importance

• There is a dramatic rate of change• An opportunity for everyone

Page 41: Session 2 Monday 10 July - unina.itmurli/ISSGC06/session-02/Session...Data Services The basic problem • Manage, transfer and access distributed data services and resources The basic

Ischia, Italy - 9-21 July 2006 41