25
University of Paderborn Software Engineering Group Prof. Dr. Wilhelm Schäfer Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 Design of Self- Managing Dependable Systems with UML and Fault Tolerance Patterns Matthias Tichy , Daniela Schilling, Holger Giese

Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

  • Upload
    calvin

  • View
    35

  • Download
    0

Embed Size (px)

DESCRIPTION

Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns. Matthias Tichy , Daniela Schilling, Holger Giese. Application Example - RailCab. Application Example - RailCab. Vision: Combine individual transport cost-effectiveness of public transport - PowerPoint PPT Presentation

Citation preview

Page 1: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004

Design of Self-Managing Dependable Systems with UML and Fault Tolerance

Patterns Matthias Tichy, Daniela Schilling, Holger Giese

Page 2: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 2

Application Example - RailCab

Page 3: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 3

Application Example - RailCab

Vision: Combine individual transport cost-effectiveness of public transport

Autonomously operating shuttles using linear drive (maglev train)

Build contact-free convoys for energy savings Information about the exact position necessary

Computation based on the stator waves

Page 4: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 4

Motivation

Example service structure:

Problem: position calculation service must be highly reliable

:PositionCalculation :ConvoyController:StatorWaveSensor

Page 5: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 5

Contents

Problem: position calculation service must be highly reliable

Solution: self-healing, fault-tolerant software

1. Application of software fault tolerance patterns, which capture:

a) Abstract service structure of fault tolerance techniquesb) Deployment restrictions (explicitly taking fault

tolerance into account)2. Automatic deployment

a) Initial deploymentb) Self-healing by deployment reconfiguration during

runtime

Page 6: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 6

Capture the abstract structure of a well known fault tolerance technique

Example: Triple Modular Redundancy (TMR)

Fault Tolerance Patterns

:Multiplier :Service2:Provider :User:Voter

:Service3

:Service1

Triple Modular Redundancy

Page 7: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 7

Naïve, unrestricted deployment may yield unwanted results

Deployment must respect restrictions imposed by the fault tolerance technique (redundancy, heterogeneity)

Instead of manual deployment, enrich fault tolerance patterns by deployment restrictions

Fault Tolerance Patterns - Deployment

Arthur

<<Service1>>

:PositionCalculation

<<Service2>>

:PositionCalculation

<<deploy>>

<<deploy>>

<<Service3>>

:PositionCalculation

<<deploy>>

Page 8: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 8

{ Node3.CPU Node4.CPU Node4.CPU Node5.CPU Node3.CPU Node 5.CPU }

Deployment restrictions

Deployment restrictions for the TMR pattern:

Avoid crash failures-> Deploy redundant services to distinct nodes

Avoid single-point-of-failure of voter / multiplier-> Deploy voter and user to same node(if the user fails, the failure of the voter is no problem)

Heterogeneous hardware platform-> require different CPU

:Multiplier

:Service2

:Provider :User:Voter

:Service3:Service1

Node1: Node2:

Node3: Node4: Node5:

Page 9: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 9

Fault Tolerance Patterns

<<Multiplier>>

mult:Multiplier

<<Service2>>

pc2:PositionCalculation

<<Provider>>

sen:StatorWaveSensor

<<User>>

cc:ConvoyController

<<Voter>>

vot:Voter

<<Service3>>

pc3:PositionCalculation

<<Service1>>

pc1:PositionCalculation

:PositionCalculation :ConvoyController:StatorWaveSensor

Application of TMR pattern

Page 10: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 10

Compute a correct / reliable deployment

Use a standard constraint solver

Avalon

Taliesin

Uther

Gareth

Gorlois

Arthur

<<Multiplier>>

mult:Multiplier

<<Service2>>

pc2:PositionCalculation

<<Provider>>

sen:Sensor<<User>>

cc:Convoy<<Voter>>

vot:Voter

<<Service3>>

pc3:PositionCalculation

<<Service1>>

pc1:PositionCalculation

Deployment

Page 11: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 11

Compute a correct / reliable deployment

Mapping to a standard constraint solver

• Variables xi,j є{0,1}

• Constraint: (each service is deployed to exactly one node)

i Services: ∑xi,j = 1

xi,jsen mul pc1 pc2 pc3 vot cc

Avalon

Gareth

Taliesin

Gorlois 1

Uther

Arthur

Nod

es (

j)Services (i) 1: Service pc1

is deployedto node Gorlois

j

Page 12: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 12

Compute a correct / reliable deployment

Restriction: Services must be executed on same node

Graphically:

Constraint:

<<Voter>>

vot:Voter

<<deploys>>

<<User>>cc:

ConvoyController

<<deploys>>

j Nodes: (xvot,j+xcc,j) =2 or (xvot,j+xcc,j) =0

xi,jsen mul pc1 pc2 pc3 vot cc ∑

Avalon 0 0 0

Gareth 0 0 0

Taliesin 1 1 2

Gorlois 0 0 0

Uther 0 0 0

Arthur 0 0 0

Page 13: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 13

Compute a correct / reliable deployment

Initial deployment:

Graphically:

xi,jsen mul pc1 pc2 pc3 vot cc

Avalon 1 1 0 0 0 0 0

Gareth 0 0 0 0 0 0 0

Taliesin 0 0 0 0 0 1 1

Gorlois 0 0 1 0 0 0 0

Uther 0 0 0 1 0 0 0

Arthur 0 0 0 0 1 0 0

mul

pc2

sen ccvot

pc3pc1 Avalon Taliesin

Gorlois Uther Arthur

Gareth

Page 14: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 14

Repair

Due to TMR, one crashed redundant service is tolerable

Second crash is not tolerable Therefore: Self-heal by restarting the crashed service on a

working node. But on which node? Compute it timely!

<<Service2>>

pc2:PositionCalculation

<<Service1>>

pc1:PositionCalculation

Uther

Gorlois

<<deploys>>

<<deploys>>

<<Service3>>

pc3:PositionCalculationArthur <<deploys>>

Page 15: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 15

Repair

Solve restricted deployment problem

<<Service1>>

pc1:PositionCalculation

Gorlois

<<deploys>>

• Fix variables xi,j of unaffected service/node combinations

• Free variables xi,j for affected service/node combinations

xi,jsen mul pc1 pc2 pc3 vot cc

Avalon 1 1 ? 0 0 0 0

Gareth 0 0 ? 0 0 0 0

Taliesin 0 0 ? 0 0 1 1

Gorlois 0 0 1 0 0 0 0

Uther 0 0 ? 1 0 0 0

Arthur 0 0 ? 0 1 0 0

Only consider changed values

Crash Failure

Service migration should be avoided!

Page 16: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 16

Repair

No solution found -> relax the problem

First round Service migration

necessary

Second round Third round

• Fixed variables • Free variables

xi,jsen mul pc1 pc2 pc3 vot cc

Avalon 1 1 ? 0 0 0 0

Gareth 0 0 ? 0 0 0 0

Taliesin 0 0 ? 0 0 1 1

Gorlois 0 0 1 0 0 0 0

Uther 0 0 ? 1 0 0 0

Arthur 0 0 ? 0 1 0 0

xi,jsen mul pc1 pc2 pc3 vot cc

Avalon 1 1 ? ? 0 0 0

Gareth 0 0 ? ? 0 0 0

Taliesin 0 0 ? ? 0 1 1

Gorlois 0 0 1 0 0 0 0

Uther 0 0 ? ? 0 0 0

Arthur 0 0 ? ? 1 0 0

xi,jsen mul pc1 pc2 pc3 vot cc

Avalon 1 1 ? ? ? 0 0

Gareth 0 0 ? ? ? 0 0

Taliesin 0 0 ? ? ? 1 1

Gorlois 0 0 1 0 0 0 0

Uther 0 0 ? ? ? 0 0

Arthur 0 0 ? ? ? 0 0

Page 17: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 17

Tool Support

Current work

Page 18: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 18

Conclusions / Future Work

Fault tolerance patterns capture fault tolerance techniques Contain: Structure + Deployment restrictions Graphical specification Easy application

Automatic deployment Initial deployment Self-healing by deployment reconfiguration during runtime

Future Work Heterogeneous service restrictions in pattern application Synthesize behavior of voter and multiplier services Use fault tolerance application knowledge for automatic fault tree

analysis

www. .de

Page 19: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 19

www. .de

Questions?

Page 20: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 20

Compute a correct / reliable deployment

Restriction: Services must be deployed on distinct nodes

Graphically:

Constraint:

<<Service2>>

pc2:PositionCalculation

<<Service1>>

pc1:PositionCalculation

<<deploys>>

<<deploys>>

j Nodes: (xpc1,j+xpc2,j) <=1

xi,jsen mul pc1 pc2 pc3 vot cc ∑

Avalon 0 0 0

Gareth 0 0 0

Taliesin 0 0 0

Gorlois 1 0 1

Uther 0 1 1

Arthur 0 0 0

Page 21: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 21

Compute a correct / reliable deployment

Restriction: CPU of nodes for redundant service must be distinct

Constraint:

xs1,j2 xs2,j2 xs3,j3 j1.cpu j2.cpu j3.cpu

Page 22: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 22

Multistage

Multiple applications of TMR Transform to a multiple stage arrangement

with tripled voter and multiplier

Deployment restrictions are transformed too

<<Multiplier>>

mult:Multiplier

<<Service2>><<Provider>>

pc2:PositionCalculation

<<Provider>>

sen:Sensor<<User>>

<<Service2>>

cc:Convoy

<<Voter>>

:Voter

<<Service3>><<Provider>>

pc3:PositionCalculation

<<Service1>><<Provider>>

pc1:PositionCalculation

<<User>><<Service1>>

cc:Convoy

<<Voter>>

:Voter

<<User>><<Service3>>

cc:Convoy

<<Voter>>

:Voter

<<Multiplier>>

:PCMultiplier

<<Multiplier>>

:PCMultiplier

<<Multiplier>>

:PCMultiplier

Page 23: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 23

Tool Support

Already working: Graphical specification of fault tolerance patterns and

deployment restrictions Automatic mapping to ILOG solver software

Next: Runtime support

Page 24: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 24

Motivation

System without fault-tolerance, node crash (1) fault tolerance pattern Introduce redundancy (TMR) but deploy two redundant

service on one node, crash of that node Better deploy redundant services to distinct nodes (2) deployment restrictions for fault tolerance pattern Example with distinct nodes, crash failure, everything

ok (3) which nodes Now self-heal the system by restarting a crashed failure

on a new node (4) which node

Page 25: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns

University of PaderbornSoftware Engineering Group

Prof. Dr. Wilhelm Schäfer

Matthias Tichy - Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns – Oct 2004 - 25

Motivation

Goals: Easy application of fault tolerance techniques Deployment issues taken into account

Easy deployment specification Reusable deployment specification