Managing Metadata in Service Architectures
Mehmet S. Aktas
Advisor: Prof. Geoffrey C. Fox
Outline: Introduction, Motivation, Requirements, Research Issues, Architecture, Performance Evaluation, Conclusions, Contributions
Context as Service Metadata
- Interaction-independent context: slowly varying, quasi-static service metadata
- Interaction-dependent context: dynamically generated metadata produced as a result of service interactions; information associated with a single service, a session (service activity), or both
- Dynamic Grid/Web Service Collections: loosely assembled collections of services, put together to support a specific task; they generate metadata and have a limited lifetime
Motivating Cases
- Multimedia Collaboration Grids: Global Multimedia Collaboration System (Global-MMCS); widely distributed services; session service metadata, session metadata, stream-specific metadata; mostly read-only
- Workflow-style applications in Geographical Information System / Sensor Grids: Pattern Informatics (PI) – UC Davis; Interdependent Energy Infrastructure Simulation System (IEISS) – LANL; widely distributed services; conversation metadata; transient; multiple writers
Problems with Grid Information Services
- Standardization and unification issues: customized Grid Information Services; differences in application requirements; thick clients
- Performance and centralization issues: low performance; low fault tolerance
- Dynamic metadata management issues: point-to-point service communication approaches
Requirements for Grid Information Services
- Greater interoperability: a unified platform for communication; a shared communication protocol; thin clients
- Greater capabilities: high performance; fault tolerance
- Dynamic Grid/Web Service Collections: distributed state management; collaboration session management
Research Issues I: Unification of Grid Information Services
- How can different information services be combined? Federation of Grid Information Services
- What is a common data model and communication protocol?
- Flexibility and extensibility: accommodating a broad range of application domains (read-dominated, read/write-dominated); ability to add/support more information services
- Interoperability: compatibility with a wide range of applications
Research Issues II: Performance
- Efficient centralized metadata management strategies: high performance and persistency
- Efficient decentralized metadata management strategies: efficient request distribution; adaptation to instantaneous changes in client demand
- Fault tolerance: efficient replica-content creation strategies
- Consistency: how to provide consistency across copies of the same data?
Hybrid Grid Information Service
[Architecture diagram: a client reaches the Hybrid Grid Information Service through a Uniform Access Interface and per-specification APIs (Extended UDDI API, WS-Context API, Unified Schema API, ...). A request processor (access control, notification) feeds the Tuple Space API over a Tuple Pool (JavaSpaces) serving as memory-in storage; the Information Resource Manager connects the pool to the back-end information services (Extended UDDI, WS-Context, ...). Design goals called out on the slide: Unification, Uniform Access, Extensibility, Interoperability (Extended UDDI, WS-Context), Federation (Unified Schema), Query/Publish XML API.]
[Detailed architecture diagram: client requests enter through the specification APIs (Extended UDDI API, WS-Context API, Unified Schema API, ...) and a request processor providing access control and notification. The Tuple Space Access API (JavaSpaces) fronts the Tuple Pool, whose tuple processor implements lifetime management, persistency management, dynamic caching management, and fault-tolerance management. Mapping files (XML) and mapping rule files (XSLT) drive a filter between the Tuple Pool and the Information Resource Manager; resource handlers connect over JDBC to the back-end databases (DB1, DB2, ...). A Pub-Sub Network Manager links the hybrid GIS, as both publisher and subscriber, to the pub-sub system.]
[Diagram: UDDI, WS-Context, and unified-schema instances of the Hybrid Grid Information Service share the same Uniform Access Interface, Tuple Space Access API, and Tuple Pool (memory-in storage), with dynamic caching management and fault-tolerance management over the Information Resource Manager and the WS-Context / Extended UDDI back-ends.]
Distributed Hybrid Grid Information Services
[Deployment diagram: WSDL clients connect over HTTP(S) to Hybrid Service replica servers (Replica Server-1, Replica Server-2, ..., Replica Server-N). Each replica holds its own Extended UDDI and WS-Context databases, and the replicas communicate as publishers/subscribers over a topic-based publish-subscribe messaging system. Properties: decentralized; fault-tolerant; efficient distribution; look-ahead caching; consistency enforced.]
Support for interaction-independent metadata: Extended UDDI Service
- Supports different types of metadata: Geographical Information System metadata catalog (functional metadata); user-defined metadata ((name, value) pairs)
- Enables advanced query capabilities: geospatial queries; metadata-oriented queries; domain-independent queries
- Provides additional capabilities: up-to-date service registry information (leasing); dynamic aggregation of service capabilities (e.g., geospatial capabilities)
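The user-defined (name, value) metadata and leasing capabilities can be sketched with a small in-memory model. This is illustrative only; the names (`MiniRegistry`, `publish`, `find`) are hypothetical and do not reflect the actual Extended UDDI wire API, which is SOAP/WSDL-based.

```python
import time

class ServiceEntry:
    """A registry entry with user-defined (name, value) metadata and a lease."""
    def __init__(self, service_key, lease_seconds):
        self.service_key = service_key
        self.metadata = {}                      # user-defined (name, value) pairs
        self.expires_at = time.time() + lease_seconds

    def alive(self, now=None):
        return (now if now is not None else time.time()) < self.expires_at

class MiniRegistry:
    """Illustrative registry; expired leases drop entries from query results,
    which keeps the registry information up to date."""
    def __init__(self):
        self.entries = {}

    def publish(self, service_key, lease_seconds=60):
        self.entries[service_key] = ServiceEntry(service_key, lease_seconds)

    def add_metadata(self, service_key, name, value):
        self.entries[service_key].metadata[name] = value

    def find(self, **required):
        """Metadata-oriented query: all required (name, value) pairs must match."""
        now = time.time()
        return [e.service_key for e in self.entries.values()
                if e.alive(now)
                and all(e.metadata.get(k) == v for k, v in required.items())]

registry = MiniRegistry()
registry.publish("wms-service", lease_seconds=60)
registry.add_metadata("wms-service", "region", "california")
registry.add_metadata("wms-service", "format", "GML")
print(registry.find(region="california"))   # ['wms-service']
```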
Support for interaction-dependent metadata: WS-Context Service
- Context Manager Service: a data model and communication protocol for session-related metadata
- Supports Dynamic Web Service Collections and distributed-state-based systems: collaboration grids; workflow-style grids
- Provides various capabilities: asynchronous communication; up-to-date service registry information (leasing)
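A minimal sketch of the session-scoped context model (illustrative class and method names, not the WS-Context protocol): a context is a (key, value) pair bound to a session, and a query can retrieve all the dynamically generated metadata of one service activity.

```python
import uuid

class ContextStore:
    """Toy WS-Context-style store: contexts are scoped to sessions."""
    def __init__(self):
        self.sessions = {}          # session_id -> {context_key: value}

    def create_session(self):
        """Start a new service activity and return its identifier."""
        session_id = str(uuid.uuid4())
        self.sessions[session_id] = {}
        return session_id

    def set_context(self, session_id, key, value):
        """Publish interaction-dependent metadata into a session."""
        self.sessions[session_id][key] = value

    def get_context(self, session_id, key):
        return self.sessions[session_id].get(key)

    def session_contexts(self, session_id):
        """All metadata of one service activity."""
        return dict(self.sessions[session_id])

store = ContextStore()
sid = store.create_session()
store.set_context(sid, "workflow-state", "IEISS-running")
print(store.session_contexts(sid))
```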
Support for federated service metadata: Unified Information Service
- Federates Grid Information Services: a unified data model and communication protocol over the Extended UDDI, WS-Context, and Glue schemas
- Approach taken: union of schemas vs. separate schemas; reuse of common concepts (e.g., business, session, site => category); combination of disjoint concepts (e.g., UDDI's tModel)
- Enables hybrid query capabilities, e.g.: "Give me the list of services satisfying QoS requirements C:{a,b,c,...} and participating in sessions S:{x,y,z,...}"
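The hybrid query above can be illustrated with two toy in-memory stores standing in for the federated back-ends: one holds interaction-independent QoS capabilities (the Extended UDDI side), the other session membership (the WS-Context side). All service, session, and capability names here are made up.

```python
# Toy data standing in for the two federated back-ends (hypothetical names).
qos_registry = {                      # interaction-independent (Extended UDDI side)
    "svc-A": {"a", "b", "c"},
    "svc-B": {"a", "b"},
    "svc-C": {"a", "b", "c", "d"},
}
session_registry = {                  # interaction-dependent (WS-Context side)
    "x": {"svc-A", "svc-C"},
    "y": {"svc-A", "svc-B", "svc-C"},
    "z": {"svc-C"},
}

def hybrid_query(required_qos, required_sessions):
    """Services satisfying all QoS requirements AND participating in all sessions."""
    qos_ok = {s for s, caps in qos_registry.items() if required_qos <= caps}
    for session in required_sessions:
        qos_ok &= session_registry.get(session, set())
    return sorted(qos_ok)

# "Services with QoS {a,b,c} participating in sessions {x,y}":
print(hybrid_query({"a", "b", "c"}, {"x", "y"}))   # ['svc-A', 'svc-C']
```

The point of the unified schema is that this join happens inside one service with one query, instead of the client stitching together results from two differently-shaped registries.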
Federating Grid Information Services
[Diagram: a Hybrid Service in a Collaboration Grid (WS-Context database) and a Hybrid Service in a Sensor Grid (Extended UDDI database) federate as publishers/subscribers over the topic-based publish-subscribe messaging system.]
Features of the Distributed System
- Cache strategy: memory-in storage
- Access distribution: redirecting client requests to an appropriate replica server
- Look-ahead caching: moving/replicating metadata to where it is wanted
- Replica content placement: replicating data on an appropriate replica server
- Consistency enforcement: ensuring all replicas of a datum are the same
Tuple Spaces & Publish-Subscribe Paradigms
- Publish-Subscribe paradigm: message-based asynchronous communication; participants are decoupled both in space and in time; open-source NaradaBrokering software (a topic-based publish/subscribe messaging system)
- Tuple Spaces paradigm [Gelernter-99]: a data-centric asynchronous communication paradigm; the communication units are tuples (data structures); JavaSpaces [Sun Microsystems] is an object-oriented implementation specification
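The core tuple-space operations, associative write/read/take with template matching, can be sketched in a few lines. This is a simplification of the JavaSpaces model (which matches typed entry objects and also supports leases and notification), not the actual API:

```python
class TupleSpace:
    """Minimal associative store: None fields in a template act as wildcards."""
    def __init__(self):
        self.tuples = []

    def write(self, tup):
        self.tuples.append(tup)

    def _matches(self, template, tup):
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup))

    def read(self, template):
        """Non-destructive associative lookup (JavaSpaces read)."""
        for tup in self.tuples:
            if self._matches(template, tup):
                return tup
        return None

    def take(self, template):
        """Destructive lookup (JavaSpaces take): removes the matched tuple."""
        for i, tup in enumerate(self.tuples):
            if self._matches(template, tup):
                return self.tuples.pop(i)
        return None

space = TupleSpace()
space.write(("ws-context", "session-42", "state=running"))
print(space.read(("ws-context", None, None)))   # non-destructive match
print(space.take(("ws-context", "session-42", None)))
print(space.read(("ws-context", None, None)))   # None: the tuple was taken
```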
Caching Strategy
- A lightweight implementation of JavaSpaces: data sharing, associative lookup, and persistency
- Integrated caching capability for all types of service metadata (e.g., UDDI-type, WS-Context-type, Unified Schema-type); we assume that today's servers are capable of holding such small-size metadata in cache
- All metadata accesses happen in memory
- Persistency: all metadata is periodically backed up to the appropriate Information Service back-end
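The memory-first access pattern with periodic back-end backups can be sketched as a write-behind store. The class below is illustrative, with a plain dictionary standing in for the Information Service database back-end:

```python
import time

class MemoryFirstStore:
    """All reads/writes hit memory; dirty entries are flushed to a back-end
    every `backup_interval` seconds (write-behind persistency)."""
    def __init__(self, backend, backup_interval=10.0):
        self.cache = {}
        self.dirty = set()
        self.backend = backend          # stand-in for the database back-end
        self.backup_interval = backup_interval
        self.last_backup = time.monotonic()

    def put(self, key, value):
        self.cache[key] = value         # in-memory write only
        self.dirty.add(key)
        self._maybe_backup()

    def get(self, key):
        return self.cache.get(key)      # reads never touch the back-end

    def _maybe_backup(self):
        if time.monotonic() - self.last_backup >= self.backup_interval:
            self.flush()

    def flush(self):
        """Back up all dirty entries to the back-end."""
        for key in self.dirty:
            self.backend[key] = self.cache[key]
        self.dirty.clear()
        self.last_backup = time.monotonic()

db = {}
store = MemoryFirstStore(db, backup_interval=0.0)  # flush on every write, for demo
store.put("session-42", {"state": "running"})
print(store.get("session-42"), db)
```

The backup interval trades durability against write latency, which is exactly what the persistency investigation below measures by varying the backup-interval time.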
Persistency investigation
[Chart: round-trip time (msec) for publish/inquire operations at varying backup-interval times (10 to 100000 msec, logarithmic scale); series: average for publication, average for inquiry.]
Test 1: Echo Service. Single-threaded WSDL client; 1 user / 200 transactions.
Test 2: Publish with memory access for WS-Context, Extended UDDI, and Unified Schema standard operations. Single-threaded WSDL client (1 user / 200 transactions) against the Hybrid Service (with Ext-UDDI and WS-Context back-ends) and the Echo Service.
Test 3: Publish with database access for WS-Context, Extended UDDI, and Unified Schema standard operations. Single-threaded WSDL client (1 user / 200 transactions) against the Hybrid Service (with Ext-UDDI and WS-Context back-ends).
Performance investigation
Simulation parameters: backup frequency every 10 seconds; metadata size 1.7 Kbytes; registry size 5000 metadata entries; 200 observations.
Performance investigation
[Charts: round-trip time (msec) across five repeated test cases for (a) WS-Context service metadata publish requests, (b) Hybrid Service metadata publish requests (WS-Context metadata), (c) Extended UDDI service metadata publish requests, and (d) Hybrid Service metadata publish requests (UDDI metadata); each chart compares the database-access average, the memory-access average, and the echo-service baseline.]
Message rate scalability investigation
[Chart: average time per message (ms) versus message processing rate (100-1000 messages per second) for Hybrid Information Service WS-Context inquiry/publish operations; series: inquiry message rate, publication message rate.]
Setup: 5 clients distributed to cluster nodes 1 to 5, each running 1 to 15 threads in a thread pool, connecting over HTTP(S)/WSDL to the Hybrid Service with Ext-UDDI and WS-Context back-ends.
Message size scalability investigation
[Charts: (a) time (milliseconds) versus context payload size (10-100 KB) comparing database access, memory access, and the echo-service baseline; (b) average publish and inquiry times (milliseconds) versus context payload size (0.1-100 KB, logarithmic scale) for Hybrid Information Service WS-Context inquiry/publish operations with increasing message sizes.]
Setup: single-threaded WSDL client, 1 user / 200 transactions, against the Hybrid Service with Ext-UDDI and WS-Context back-ends.
Simulation parameters: backup frequency every 10 seconds; registry size 5000 metadata entries; 200 observations.
Access Distribution: Look-ahead Caching
- Broadcast-based request dissemination: the pub-sub system is used for message broadcast; requests are broadcast only to those servers that can answer them; no need to keep track of metadata locations
- Dynamic migration/replication [Rabinovich et al., 1999]: popular copies are moved/replicated to where they are wanted; autonomous decisions, self-awareness
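A sketch of the threshold-based replication decision in the spirit of Rabinovich et al.: each server periodically compares the per-item request rate against a deletion threshold and a replication threshold. The 0.03 and 0.18 requests/second values below are the thresholds used in the dynamic-replication experiment in this talk; the decision routine itself is illustrative.

```python
DELETION_THRESHOLD = 0.03      # requests/second (experiment parameter)
REPLICATION_THRESHOLD = 0.18   # requests/second (experiment parameter)

def replication_decision(request_count, period_seconds, replica_count):
    """Autonomous per-item decision, made once per decision period (e.g. 100 s)."""
    rate = request_count / period_seconds
    if rate >= REPLICATION_THRESHOLD:
        return "replicate"                 # popular: copy to where the demand is
    if rate < DELETION_THRESHOLD and replica_count > 1:
        return "delete"                    # unpopular extra copy: drop it
    return "keep"                          # moderate demand, or the last copy

# Decisions over one 100-second decision period:
print(replication_decision(25, 100, 1))   # 0.25 req/s -> replicate
print(replication_decision(1, 100, 3))    # 0.01 req/s -> delete
print(replication_decision(10, 100, 1))   # 0.10 req/s -> keep
```

Because each server decides from its own observed rates, no global coordinator is needed; this is the self-awareness property claimed above.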
Access Distribution Experiment: Test Methodology
[Diagram: a HybridService instance in Bloomington, IN distributes requests through NaradaBrokering (NB) nodes to HybridService instances at (1) Indianapolis, IN, (2) Tallahassee, FL, and (3) San Diego, CA; total time = T1 + T2 + T3. The two setups vary the number of intermediary NB brokers.]
Simulation parameters: backup frequency every 10 seconds; message size 2.7 Kbytes.
Distribution experiment result
- The overhead of access distribution is only a few milliseconds, and continuous access distribution does not degrade performance.
- The overhead of distribution remains the same regardless of the network distances between nodes.
[Charts: (a) overhead of distribution with one and two intermediary brokers, plus latency, for Bloomington-Indianapolis, Bloomington-Tallahassee, and Bloomington-San Diego (time in ms); (b) Bloomington-Indianapolis access distribution, sampled every 1000 observations: averages for two brokers, one broker, and latency.]
Dynamic Replication Performance: Test Methodology
[Diagram: Test 1, distribution with dynamic replication enabled; Test 2, distribution with dynamic replication disabled. HybridService instances at Bloomington, IN and Indianapolis, IN communicate through an NB node; total time = T1 + T2 + T3.]
Simulation parameters: message size / message rate 2.7 Kbytes / 10 msg/sec; replication decision frequency every 100 seconds; deletion / replication thresholds 0.03 requests/second and 0.18 requests/second; registry size 1000 metadata entries at Indianapolis.
The decrease in average latency shows that the algorithm manages to move replica copies to where they are wanted.
[Charts: dynamic replication performance for distribution between Bloomington, IN and Indianapolis, IN; latency (ms) sampled every 100 seconds; average and standard deviation with dynamic replication enabled versus plain distribution.]
Replica Content Placement and Consistency Enforcement
- Replica-content placement: each node keeps information about the other servers; replica server(s) are selected by a policy based on (a) geographical information (proximity) and (b) topical information (number of topics)
- Consistency enforcement (primary-copy approach): update distribution (updates labeled with synchronized timestamps are reflected, unicast, to the primary copy) and update propagation (the primary copy pushes, broadcast, updates only to those replica servers holding the context)
[Diagram: HybridService instances 1-4 exchanging updates through the primary copy.]
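The primary-copy scheme can be sketched as follows: updates carry timestamps, are unicast to the primary, and are then pushed only to the replicas holding that context. Class and method names are illustrative, and a simple counter stands in for the synchronized timestamps:

```python
import itertools

class PrimaryCopy:
    """Primary orders timestamped updates and pushes them to holders only."""
    def __init__(self):
        self.clock = itertools.count(1)    # stand-in for synchronized timestamps
        self.store = {}                    # context_key -> (timestamp, value)
        self.holders = {}                  # context_key -> set of replica ids
        self.replicas = {}                 # replica_id -> local store

    def register(self, replica_id, context_keys):
        """Record which contexts a replica server holds."""
        self.replicas.setdefault(replica_id, {})
        for key in context_keys:
            self.holders.setdefault(key, set()).add(replica_id)

    def update(self, context_key, value):
        """An update reflected (unicast) to the primary copy, then propagated."""
        ts = next(self.clock)
        self.store[context_key] = (ts, value)
        # Push (broadcast) only to replica servers holding this context.
        for replica_id in self.holders.get(context_key, ()):
            self.replicas[replica_id][context_key] = (ts, value)

primary = PrimaryCopy()
primary.register("indianapolis", ["session-42"])
primary.register("tallahassee", [])          # does not hold session-42
primary.update("session-42", "state=done")
print(primary.replicas["indianapolis"]["session-42"])
print("session-42" in primary.replicas["tallahassee"])   # False
```

Pushing only to holders is what keeps the enforcement overhead independent of how many replica servers exist, which matches the consistency-enforcement measurements below.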
Fault-tolerance Experiment: Testing Setup
[Diagram: a Hybrid Service instance at Bloomington, IN creates replicas on instances at Indianapolis, IN; Tallahassee, FL; and San Diego, CA, communicating through NaradaBrokering (NB) nodes; the two tests vary the broker topology between the instances. Total time = T1 + T2 + T3.]
Simulation parameters: backup frequency every 10 seconds; message size 2.7 Kbytes.
Fault-tolerance experiment result
- The overhead of replica creation is only a few milliseconds, and continuous replica creation does not degrade performance.
- The overhead of replica creation increases on the order of milliseconds as the fault-tolerance level increases.
[Charts: (a) overhead of replica creation with one and two intermediary brokers, plus end-to-end latency, for 1 replica (Indianapolis, IN), 2 replicas (Indianapolis, IN; Tallahassee, FL), and 3 replicas (Indianapolis, IN; Tallahassee, FL; San Diego, CA) (time in ms); (b) 1 replica creation at the remote location Indianapolis, IN, sampled every 1000 observations: averages for two brokers, one broker, and latency.]
Consistency Enforcement Experiment: Test Methodology
[Diagram: a HybridService instance at Bloomington, IN propagates updates through NB nodes to HybridService instances at (1) Indianapolis, IN, (2) Tallahassee, FL, and (3) San Diego, CA; total time = T1 + T2 + T3.]
Simulation parameters: backup frequency every 10 seconds; message size 2.7 Kbytes.
Consistency Enforcement Test Result
- The overhead of consistency enforcement is a few milliseconds, and continuous operation does not degrade performance.
- The cost of consistency enforcement remains the same regardless of the distribution of the network nodes.
[Charts: (a) overhead with one and two intermediary brokers, plus latency, for Bloomington-Indianapolis, Bloomington-Tallahassee, and Bloomington-San Diego (time in ms); (b) Bloomington-Indianapolis consistency enforcement, sampled every 1000 observations: averages for latency, one broker, and two brokers.]
Conclusions
- Efficient decentralized metadata strategies: TupleSpaces & Pub-Sub communication paradigms; distribution; replication for fault tolerance; replication for performance; consistency enforcement
- Efficient centralized metadata management strategies: TupleSpaces-paradigm-based memory-in storage
Contributions
- Federated Grid Information Service architecture: unified data model and communication protocol; support for both interaction-independent and conversation-based service metadata; support for greater interoperability
- Unified Grid Information Service architecture: flexible and extensible architecture; support for high performance and fault tolerance; uniform access to all kinds of service metadata
- Efficient decentralized metadata systems can be built by integrating the TupleSpaces and Publish-Subscribe paradigms; fault tolerance, distribution, and consistency can be achieved with a few milliseconds of system processing overhead; self-awareness can be achieved in decentralized metadata management
- Communication among services can be achieved with efficient mediator metadata strategies
- A metadata management approach for Dynamic Web/Grid Service Collections: collective operations such as queries on subsets of all available metadata in a service conversation
Information Service Usage Cases
- WS-Context: fast SOAP transfer in mobile computing (Sangyoon Oh's thesis)
- WS-Context & Extended UDDI: Geographical Information Service & Sensor Grids (Galip Aydin's thesis)
- WS-Context: session metadata management (Hasan Bulut's thesis)
- WS-Context: fault-tolerant registry (Harshawardhan Gadgil's thesis)
- WS-Context: VLab Project – University of Minnesota, Florida State University
- Extended UDDI: Chemical Informatics and Cyberinfrastructure Collaboratory Project
- WS-Context & Extended UDDI: Pattern Informatics – UC Davis; IEISS – LANL
Selected Publication List, focusing on (a) Metadata, (b) Information Services, and (c) Metadata Discovery
Mehmet S. Aktas, Geoffrey Fox, Marlon Pierce, Information Services for Dynamically Assembled Semantic Grids [SKG-05, 2005]
Mehmet S. Aktas, Geoffrey Fox, Marlon Pierce, Managing Dynamic Metadata as Context [ICCSE, 2005]
Mehmet S. Aktas et al., Web Service Information Systems and Applications [GGF-16, 2006]
Mehmet S. Aktas, Geoffrey C. Fox, Marlon Pierce, Fault Tolerant High Performance Information Services for Dynamic Collections of Grid and Web Services [FGCS Journal, 2006]
Mehmet S. Aktas, Sangyoon Oh, Geoffrey C. Fox, Marlon Pierce, XML Metadata Services [SKG-2006, Concurrency and Computation: Practice and Experience Journal-2007]
Mehmet S. Aktas, Marlon Pierce, and Geoffrey C. Fox, Designing Ontologies and Distributed Resource Discovery Services for an Earthquake Simulation Grid [GGF11, 2004]
Mehmet S. Aktas, M. Pierce, G. Fox, and D. Leake, A Web based Conversational Case-Based Recommender System for Ontology aided Metadata Discovery [GRID Workshop, 2004]
Sangyoon Oh, Mehmet S. Aktas, Geoffrey C. Fox, Marlon Pierce, Architecture for High-Performance Web Service Communications Using an Information Service [WSEAS Journal -2006]