1
Reliable Web Services by Fault Tolerant Techniques: Methodology, Experiment, Modeling and Evaluation
Term Presentation
Presented by Pat Chan, 3 May 2006
2
Outline
Introduction
Problem Statement
Methodologies for Web Service Reliability
New Reliable Web Service Paradigm
Road Map for Experiment
Experimental Results and Discussion
Conclusion
3
Introduction
Service-oriented computing is becoming a reality.
Web services are a promising technology on the Internet.
They offer interoperability, reusability, and adaptability.
Reliability is an important issue.
The existing Web service model needs to be extended to assure survivability and reliability.
We propose experimental settings and offer a roadmap to dependable Web services.
4
Reliability
"a measure of the success with which the system conforms to some authoritative specification"
Guaranteed delivery
Duplicate elimination
Ordering
Crash tolerance
State synchronization
5
What are Web Services?
Self-contained, modular applications built on deployed network infrastructure including XML and HTTP
Use open standards for description (WSDL), discovery (UDDI) and invocation (SOAP)
6
Web Services
[Figure: Web services communicating over the Internet; labels: UDDI, WSDL, HTTP/SOAP]
7
Web Services Architecture
[Figure: Web services architecture stack; labels: The Internet, TCP/IP, XML, HTTP/SMTP, SOAP, Description, Directory, Inspection, Building Block Modules, Inter Application Protocols, Referral, Routing, Security, License, Eventing, Transactions, Reliable Messaging, …]
8
Web Services
Benefits of WS
Service-oriented
Highly accessible
Open specification
Easy integration
Simplicity
Dynamic
Standard
Build a common infrastructure that reduces the barriers to business integration, at lower cost and higher speed.
9
Problems of Web Services
Transaction: Atomicity is not provided
Security: Insecure Internet transportation
Reliability: The Internet is inherently unreliable. No single underlying transport protocol addresses all the reliability issues.
10
Problem Statement
Fault-tolerant techniques
Replication
Diversity
Replication is one of the efficient ways of providing reliable systems, through time or space redundancy.
Increases the availability of distributed systems
Key components are re-executed or replicated
Protects against hardware malfunctions or transient system faults
Another efficient technique is design diversity.
Software systems or services are designed independently by different programming teams
This defends against permanent software design faults
We focus on the analysis of replication techniques when applied to Web services.
A generic Web service system with spatial as well as temporal replication is proposed and investigated.
11
Methodologies for Reliable Web services -- Redundancy
Spatial redundancy
Static redundancy: all replicas are active at the same time, and voting takes place to obtain a correct result.
Dynamic redundancy: one replica is engaged at a time, while the others are kept in an active or standby state.
Temporal redundancy
Redundancy in time
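The three redundancy styles on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: replicas are modelled as plain callables, a crashed replica is modelled by raising `ConnectionError`, and the names `static_redundancy`, `dynamic_redundancy`, `temporal_redundancy`, `healthy`, `faulty`, and `down` are hypothetical and do not come from the presentation.

```python
from collections import Counter


def static_redundancy(replicas, request):
    """Spatial, static: call every replica and return the majority result."""
    results = [replica(request) for replica in replicas]
    value, votes = Counter(results).most_common(1)[0]
    if votes <= len(results) // 2:
        raise RuntimeError("no majority among replica results")
    return value


def dynamic_redundancy(replicas, request):
    """Spatial, dynamic: use one replica at a time, fail over on error."""
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError:
            continue  # this replica is down; try the next standby
    raise RuntimeError("all replicas failed")


def temporal_redundancy(service, request, attempts=3):
    """Temporal: repeat the same call on the same service."""
    last_error = None
    for _ in range(attempts):
        try:
            return service(request)
        except ConnectionError as err:
            last_error = err
    raise RuntimeError("service failed on every attempt") from last_error


# Hypothetical replicas used only for illustration.
def healthy(x):
    return x * 2


def faulty(x):
    return x * 2 + 1  # returns a wrong value


def down(x):
    raise ConnectionError("replica unreachable")


print(static_redundancy([healthy, healthy, faulty], 21))
print(dynamic_redundancy([down, healthy], 21))
```

Voting masks a wrong answer from a minority of replicas, while failover and retry only help when a failure is detectable (here, as an exception).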
12
Methodologies for Reliable Web services -- Diversity
Protect redundant systems against common-mode failures
With different designs and implementations, common failure modes will probably cause different error effects.
N-version programming, recovery blocks…
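A recovery block, one of the diversity techniques named above, can be sketched as follows. This is a minimal illustration, not the presenter's implementation: independently written alternates are tried in order, and the first result that passes an acceptance test is returned. The function names and the sorting example are hypothetical.

```python
def recovery_block(alternates, acceptance_test, request):
    """Try diverse alternates in order; return the first accepted result."""
    for alternate in alternates:
        try:
            result = alternate(request)
        except Exception:
            continue  # a crash in one version is treated as a failed alternate
        if acceptance_test(request, result):
            return result
    raise RuntimeError("no alternate passed the acceptance test")


# Two hypothetical, independently designed versions of the same service.
def buggy_sort(items):
    return list(reversed(items))  # design fault: reverses instead of sorting


def reference_sort(items):
    return sorted(items)


def is_sorted_permutation(request, result):
    """Acceptance test: output is ordered and has the same elements."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and sorted(request) == sorted(result)


print(recovery_block([buggy_sort, reference_sort],
                     is_sorted_permutation, [3, 1, 2]))
```

Because the two versions do not share a design, the fault in the first one is caught by the acceptance test and the second one supplies the result.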
13
Failure Response Stages of Web Services
Fault confinement
Fault detection
Diagnosis
Fail-over
Reconfiguration
Recovery
Restart
Repair
Reintegration
14
[Figure: flowchart of the failure response stages; labels: Fault Confinement, Fault Detection, Failover, Diagnosis, Online, Offline, Reconfiguration, Recovery, Restart, Repair, Reintegration]
15
Proposed Paradigm
[Figure: system architecture; labels: Replication Manager, Web service selection algorithm, WatchDog, UDDI Registry, WSDL, three replicas (each labelled Web Service, IIS, Application, Database), Client, Port, Application, Database]
1. Create Web services
2. Select the primary Web service (PWS)
3. Register
4. Look up
5. Get WSDL
6. Invoke Web service
7. Keep checking the availability of the PWS
8. If the PWS fails, reselect the PWS
9. Update the WSDL
16
Work Flow of the Replication Manager
[Figure: flowchart with the following steps and branch labels]
RM sends message to the Web Service
Branch: Get reply / Do not get reply
Reselect a primary Web Service
Map the new address to the WSDL
Branch: All services failed
System fail
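The Replication Manager work flow can be sketched as a single polling step. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, `is_alive` stands in for the WatchDog probe, and `published_address` stands in for the endpoint address recorded in the WSDL. No real UDDI or WSDL API is used.

```python
class ReplicationManager:
    """Minimal sketch of primary selection and failover (hypothetical API)."""

    def __init__(self, replicas, is_alive):
        self.replicas = list(replicas)  # replica addresses
        self.is_alive = is_alive        # probe: address -> bool
        self.primary = self.replicas[0]
        self.published_address = self.primary  # stand-in for the WSDL entry

    def check_once(self):
        """One polling step: keep the primary if it replies, else reselect."""
        if self.is_alive(self.primary):
            return self.primary
        for candidate in self.replicas:
            if candidate != self.primary and self.is_alive(candidate):
                self.primary = candidate
                self.published_address = candidate  # map the new address
                return candidate
        raise RuntimeError("system failure: all Web services failed")


# Hypothetical replica names and health table, for illustration only.
status = {"replica-a": False, "replica-b": True, "replica-c": True}
rm = ReplicationManager(status.keys(), lambda address: status[address])
print(rm.check_once())
```

In a running system this step would be repeated at the polling interval; it is kept as one call here so that the selection logic is easy to follow.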
17
Road Map for Experiment Research
Redundancy in time
Redundancy in space
  Sequentially
  Parallel
  Majority voting using N-modular redundancy
  Diversified versions of different services
18
Experiments
A series of experiments is designed and performed to evaluate the reliability of the Web service: a single service without replication, a single service with retry or reboot, and a service with spatial replication.
We will also perform retry or failover when the Web service is down.
19
Summary of the Experiments
Configuration                 None   Retry/Reboot   Failover   Both (hybrid)
Single service, no retry      0      --             --         --
Single service with retry     --     1              --         --
Single service with reboot    --     2              --         --
Spatial replication           --     --             3          4
20
Parameters of the Experiments
Parameter                          Current setting/metric
Request frequency                  1 req/min
Polling frequency                  5 ms
Number of replicas                 5
Client timeout period for retry    10 s
Failure rate λ                     # failures/hour
Load (profile of the program)      % or load function
Reboot time                        10 min
Failover time                      1 s
21
Experimental Results
Experiments over 360 hour period (43200 reqs)
         Normal   Resource Problem   Entry Point Failure   Network Level Fault Injection
Exp 0    4928     6130               6492                  5324
Exp 1    2210     2327               2658                  2289
Exp 2    2561     3160               3323                  5211
Exp 3    1324     1711               1658                  5258
Exp 4    1089     1148               1325                  2210
Retry: 11.97% to 4.93%
Reboot: 11.97% to 6.44%
Failover: 11.97% to 3.56%
Retry and Failover: 11.97% to 2.59%
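To compare the techniques on one scale, the percentage pairs above can be turned into relative reductions. The sketch below assumes each pair is a failure percentage before and after the technique is applied; the slide does not label the pairs, so that reading is an interpretation. The variable and function names are illustrative.

```python
# Percentages copied from the slide; their meaning is an assumption.
baseline = 11.97
after = {
    "Retry": 4.93,
    "Reboot": 6.44,
    "Failover": 3.56,
    "Retry and Failover": 2.59,
}


def relative_reduction(before, after_value):
    """Fraction of the baseline removed by the technique."""
    return (before - after_value) / before


reductions = {name: relative_reduction(baseline, value)
              for name, value in after.items()}

for name, reduction in reductions.items():
    print(f"{name}: {reduction:.1%} lower than the baseline")
```

Under this reading, the hybrid of retry and failover gives the largest relative reduction and reboot the smallest.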
22
Number of Failures When the Server Is in a Normal Situation
23
Number of Failures When the Server Is Busy
24
Number of Failures When the Server Reboots Periodically
25
Network Level Fault Injection
26
Reliability of the System Over Time
$$\lambda(t) = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t} = 0.025$$
$$R(t) = e^{-\lambda(t)\, t}$$
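As a small worked example, assume the formula on this slide is the constant-rate exponential model R(t) = e^(-λt) with λ = 0.025. The time unit is taken to be hours, following the "# failures/hour" entry in the parameter table; that unit is an assumption, as is the function name `reliability`.

```python
import math

FAILURE_RATE = 0.025  # value from the slide; per-hour unit is an assumption


def reliability(t, failure_rate=FAILURE_RATE):
    """Probability of no failure up to time t under a constant failure rate."""
    return math.exp(-failure_rate * t)


for hours in (0, 10, 40, 100):
    print(f"R({hours}) = {reliability(hours):.3f}")
```

Reliability starts at 1 and decays over time; at t = 1/λ = 40 it has fallen to e^(-1).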
27
Reliability Model
[Figure: reliability model state diagrams, panels (a) and (b); state labels: S, S-1, S-2, S-j, S-j-1, S-n, P1, P2, F; transition labels: λ*, λ1, λ2, λN, μ*c2, (1-c1)μ*, μ1C2, μ2C2, (1-c1)μ1, (1-c1)μ2, (1-c2)μ1, (1-c2)μ2]
28
Reliability Model
$$\lambda^* = \lambda_1 (1 - C_2)\,\mu_1 + \lambda_2 (1 - C_2)\,\mu_2 \qquad \mu^* = \lambda_1 \mu_1 + \lambda_2 \mu_2$$
ID Description Value
λn Network failure rate 0.02
λ* Web service failure rate 0.025
λ1 Resource problem rate 0.142
λ2 Entry point failure rate 0.150
μ* Web service repair rate 0.286
μ1 Resource problem repair rate 0.979
μ2 Entry point failure repair rate 0.979
C1 Probability that the RM responds on time 0.9
C2 Probability that the server reboots successfully 0.9
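One plausible reading of the combination formula on this slide is μ* = λ1 μ1 + λ2 μ2. That reading is an assumption recovered from the slide text, not a confirmed statement of the model, but it can be checked against the tabulated values with a short calculation. The variable names are illustrative.

```python
# Values copied from the parameter table above.
lambda_1, lambda_2 = 0.142, 0.150  # resource problem / entry point failure rates
mu_1, mu_2 = 0.979, 0.979          # corresponding repair rates
mu_star_table = 0.286              # tabulated Web service repair rate

# Assumed combination rule (a reading of the slide, not confirmed).
mu_star = lambda_1 * mu_1 + lambda_2 * mu_2

print(f"computed mu* = {mu_star:.4f}, tabulated mu* = {mu_star_table}")
```

Under this assumption, the computed value agrees with the tabulated repair rate to three decimal places.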
29
SHARPE
[Figure: Reliability with different failure rate; legend (Failure rate): 0.005, 0.01, 0.02, 0.03, 0.04, 0.05]
30
Conclusion
Surveyed replication and design diversity techniques for reliable services.
Proposed a hybrid approach to improving the availability of Web services.
Carried out a series of experiments to evaluate the availability and reliability of the proposed Web service system.
Developed the Reliability Model for the proposed system.