Upload
calvin
View
35
Download
0
Embed Size (px)
DESCRIPTION
Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns. Matthias Tichy , Daniela Schilling, Holger Giese. Application Example - RailCab. Application Example - RailCab. Vision: Combine individual transport cost-effectiveness of public transport - PowerPoint PPT Presentation
Citation preview
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004
Design of Self-Managing Dependable Systems with UML and Fault Tolerance
Patterns Matthias Tichy, Daniela Schilling, Holger Giese
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 2
Application Example - RailCab
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 3
Application Example - RailCab
Vision: Combine individual transport cost-effectiveness of public transport
Autonomously operating shuttles using linear drive (maglev train)
Build contact-free convoys for energy savings Information about the exact position necessary
Computation based on the stator waves
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 4
Motivation
Example service structure:
Problem: position calculation service must be highly reliable
:PositionCalculation :ConvoyController:StatorWaveSensor
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 5
Contents
Problem: position calculation service must be highly reliable
Solution: self-healing, fault-tolerant software
1. Application of software fault tolerance patterns, which capture:
a) Abstract service structure of fault tolerance techniquesb) Deployment restrictions (explicitly taking fault
tolerance into account)2. Automatic deployment
a) Initial deploymentb) Self-healing by deployment reconfiguration during
runtime
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 6
Capture the abstract structure of a well known fault tolerance technique
Example: Triple Modular Redundancy (TMR)
Fault Tolerance Patterns
:Multiplier :Service2:Provider :User:Voter
:Service3
:Service1
Triple Modular Redundancy
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 7
Naïve, unrestricted deployment may yield unwanted results
Deployment must respect restrictions imposed by the fault tolerance technique (redundancy, heterogeneity)
Instead of manual deployment, enrich fault tolerance patterns by deployment restrictions
Fault Tolerance Patterns - Deployment
Arthur
<<Service1>>
:PositionCalculation
<<Service2>>
:PositionCalculation
<<deploy>>
<<deploy>>
<<Service3>>
:PositionCalculation
<<deploy>>
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 8
{ Node3.CPU Node4.CPU Node4.CPU Node5.CPU Node3.CPU Node 5.CPU }
Deployment restrictions
Deployment restrictions for the TMR pattern:
Avoid crash failures-> Deploy redundant services to distinct nodes
Avoid single-point-of-failure of voter / multiplier-> Deploy voter and user to same node(if the user fails, the failure of the voter is no problem)
Heterogeneous hardware platform-> require different CPU
:Multiplier
:Service2
:Provider :User:Voter
:Service3:Service1
Node1: Node2:
Node3: Node4: Node5:
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 9
Fault Tolerance Patterns
<<Multiplier>>
mult:Multiplier
<<Service2>>
pc2:PositionCalculation
<<Provider>>
sen:StatorWaveSensor
<<User>>
cc:ConvoyController
<<Voter>>
vot:Voter
<<Service3>>
pc3:PositionCalculation
<<Service1>>
pc1:PositionCalculation
:PositionCalculation :ConvoyController:StatorWaveSensor
Application of TMR pattern
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 10
Compute a correct / reliable deployment
Use a standard constraint solver
Avalon
Taliesin
Uther
Gareth
Gorlois
Arthur
<<Multiplier>>
mult:Multiplier
<<Service2>>
pc2:PositionCalculation
<<Provider>>
sen:Sensor<<User>>
cc:Convoy<<Voter>>
vot:Voter
<<Service3>>
pc3:PositionCalculation
<<Service1>>
pc1:PositionCalculation
Deployment
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 11
Compute a correct / reliable deployment
Mapping to a standard constraint solver
• Variables xi,j є{0,1}
• Constraint: (each service is deployed to exactly one node)
i Services: ∑xi,j = 1
xi,jsen mul pc1 pc2 pc3 vot cc
Avalon
Gareth
Taliesin
Gorlois 1
Uther
Arthur
Nod
es (
j)Services (i) 1: Service pc1
is deployedto node Gorlois
j
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 12
Compute a correct / reliable deployment
Restriction: Services must be executed on same node
Graphically:
Constraint:
<<Voter>>
vot:Voter
<<deploys>>
<<User>>cc:
ConvoyController
<<deploys>>
j Nodes: (xvot,j+xcc,j) =2 or (xvot,j+xcc,j) =0
xi,jsen mul pc1 pc2 pc3 vot cc ∑
Avalon 0 0 0
Gareth 0 0 0
Taliesin 1 1 2
Gorlois 0 0 0
Uther 0 0 0
Arthur 0 0 0
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 13
Compute a correct / reliable deployment
Initial deployment:
Graphically:
xi,jsen mul pc1 pc2 pc3 vot cc
Avalon 1 1 0 0 0 0 0
Gareth 0 0 0 0 0 0 0
Taliesin 0 0 0 0 0 1 1
Gorlois 0 0 1 0 0 0 0
Uther 0 0 0 1 0 0 0
Arthur 0 0 0 0 1 0 0
mul
pc2
sen ccvot
pc3pc1 Avalon Taliesin
Gorlois Uther Arthur
Gareth
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 14
Repair
Due to TMR, one crashed redundant service is tolerable
Second crash is not tolerable Therefore: Self-heal by restarting the crashed service on a
working node. But on which node? Compute it timely!
<<Service2>>
pc2:PositionCalculation
<<Service1>>
pc1:PositionCalculation
Uther
Gorlois
<<deploys>>
<<deploys>>
<<Service3>>
pc3:PositionCalculationArthur <<deploys>>
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 15
Repair
Solve restricted deployment problem
<<Service1>>
pc1:PositionCalculation
Gorlois
<<deploys>>
• Fix variables xi,j of unaffected service/node combinations
• Free variables xi,j for affected service/node combinations
xi,jsen mul pc1 pc2 pc3 vot cc
Avalon 1 1 ? 0 0 0 0
Gareth 0 0 ? 0 0 0 0
Taliesin 0 0 ? 0 0 1 1
Gorlois 0 0 1 0 0 0 0
Uther 0 0 ? 1 0 0 0
Arthur 0 0 ? 0 1 0 0
Only consider changed values
Crash Failure
Service migration should be avoided!
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 16
Repair
No solution found -> relax the problem
First round Service migration
necessary
Second round Third round
• Fixed variables • Free variables
xi,jsen mul pc1 pc2 pc3 vot cc
Avalon 1 1 ? 0 0 0 0
Gareth 0 0 ? 0 0 0 0
Taliesin 0 0 ? 0 0 1 1
Gorlois 0 0 1 0 0 0 0
Uther 0 0 ? 1 0 0 0
Arthur 0 0 ? 0 1 0 0
xi,jsen mul pc1 pc2 pc3 vot cc
Avalon 1 1 ? ? 0 0 0
Gareth 0 0 ? ? 0 0 0
Taliesin 0 0 ? ? 0 1 1
Gorlois 0 0 1 0 0 0 0
Uther 0 0 ? ? 0 0 0
Arthur 0 0 ? ? 1 0 0
xi,jsen mul pc1 pc2 pc3 vot cc
Avalon 1 1 ? ? ? 0 0
Gareth 0 0 ? ? ? 0 0
Taliesin 0 0 ? ? ? 1 1
Gorlois 0 0 1 0 0 0 0
Uther 0 0 ? ? ? 0 0
Arthur 0 0 ? ? ? 0 0
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 17
Tool Support
Current work
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 18
Conclusions / Future Work
Fault tolerance patterns capture fault tolerance techniques Contain: Structure + Deployment restrictions Graphical specification Easy application
Automatic deployment Initial deployment Self-healing by deployment reconfiguration during runtime
Future Work Heterogeneous service restrictions in pattern application Synthesize behavior of voter and multiplier services Use fault tolerance application knowledge for automatic fault tree
analysis
www. .de
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 19
www. .de
Questions?
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 20
Compute a correct / reliable deployment
Restriction: Services must be deployed on distinct nodes
Graphically:
Constraint:
<<Service2>>
pc2:PositionCalculation
<<Service1>>
pc1:PositionCalculation
<<deploys>>
<<deploys>>
j Nodes: (xpc1,j+xpc2,j) <=1
xi,jsen mul pc1 pc2 pc3 vot cc ∑
Avalon 0 0 0
Gareth 0 0 0
Taliesin 0 0 0
Gorlois 1 0 1
Uther 0 1 1
Arthur 0 0 0
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 21
Compute a correct / reliable deployment
Restriction: CPU of nodes for redundant service must be distinct
Constraint:
xs1,j2 xs2,j2 xs3,j3 j1.cpu j2.cpu j3.cpu
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 22
Multistage
Multiple applications of TMR Transform to a multiple stage arrangement
with tripled voter and multiplier
Deployment restrictions are transformed too
<<Multiplier>>
mult:Multiplier
<<Service2>><<Provider>>
pc2:PositionCalculation
<<Provider>>
sen:Sensor<<User>>
<<Service2>>
cc:Convoy
<<Voter>>
:Voter
<<Service3>><<Provider>>
pc3:PositionCalculation
<<Service1>><<Provider>>
pc1:PositionCalculation
<<User>><<Service1>>
cc:Convoy
<<Voter>>
:Voter
<<User>><<Service3>>
cc:Convoy
<<Voter>>
:Voter
<<Multiplier>>
:PCMultiplier
<<Multiplier>>
:PCMultiplier
<<Multiplier>>
:PCMultiplier
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 23
Tool Support
Already working: Graphical specification of fault tolerance patterns and
deployment restrictions Automatic mapping to ILOG solver software
Next: Runtime support
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 24
Motivation
System without fault-tolerance, node crash (1) fault tolerance pattern Introduce redundancy (TMR) but deploy two redundant
service on one node, crash of that node Better deploy redundant services to distinct nodes (2) deployment restrictions for fault tolerance pattern Example with distinct nodes, crash failure, everything
ok (3) which nodes Now self-heal the system by restarting a crashed failure
on a new node (4) which node
University of PaderbornSoftware Engineering Group
Prof. Dr. Wilhelm Schäfer
Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 25
Motivation
Goals: Easy application of fault tolerance techniques Deployment issues taken into account
Easy deployment specification Reusable deployment specification