Breakpoints and Halting in Distributed Systems
Presented by
Abhishek Saxena
CS 739 Distributed Systems
Spring 2002
2
References
• Detecting Relational Global Predicates in Distributed Systems by Alexander I. Tomlinson and Vijay K. Garg, 1993
• Breakpoints and Halting in Distributed Programs by Barton P. Miller and Jong-Deok Choi, 1992
• Restoring Consistent Global States of Distributed Computations by Goldberg et al., 1991
3
Presentation Layout
• Introduction
• Motivation
• Halting in Distributed Systems
• Detecting Breakpoints for:
  – Conjunctive/Disjunctive/Linked Predicates
  – Relational Predicates
• Applications to Research
• Relevance to papers read
• Conclusions
4
Introduction
• General problems of:
  – Halting distributed programs
  – Detecting breakpoints
  – Validating resource conflicts
  – Recording, restoration and replay of program sequences
5
Motivation
• Why halt?
  – Interactive debugging
  – Issues in distributed systems:
    • No single global notion of time
    • Unpredictable communication delays
    • How to issue an instant command to all processes?
    • How can a command reach all processes simultaneously?
6
Halting
• 2 pertinent questions:
  – How to halt a distributed program?
    • Halting Algorithm
  – When to halt?
    • Breakpoint Detection
7
Halting Algorithm
• Extends Chandy & Lamport’s algorithm
• Sending rule:
  – Increment last_halt_id
  – Send a halt marker containing this value on all outgoing channels
• Receiving rule:
  – Compare the marker’s halt_id with the local last_halt_id and update
  – Send halt markers as the sender does
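The two rules above can be sketched as a small simulation. This is a minimal sketch, not the authors' implementation: the `Process` class, its fields other than `last_halt_id`, and the direct method call standing in for channel delivery are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical process model; only last_halt_id comes from the slides."""
    name: str
    outgoing: list = field(default_factory=list)  # peers reachable via outgoing channels
    last_halt_id: int = 0
    halted: bool = False

    def initiate_halt(self):
        # Sending rule: increment last_halt_id, then send a marker carrying it.
        self.last_halt_id += 1
        self.halted = True
        self._send_markers()

    def receive_marker(self, halt_id):
        # Receiving rule: compare with the local value, update, and forward.
        # A marker that is not newer is ignored, which stops loops in cyclic networks.
        if halt_id > self.last_halt_id:
            self.last_halt_id = halt_id
            self.halted = True
            self._send_markers()

    def _send_markers(self):
        # Assumption: calling the peer directly models delivery on a channel.
        for peer in self.outgoing:
            peer.receive_marker(self.last_halt_id)


# Usage: three processes connected in a ring P -> Q -> R -> P.
p, q, r = Process("P"), Process("Q"), Process("R")
p.outgoing, q.outgoing, r.outgoing = [q], [r], [p]
p.initiate_halt()
```

After `p.initiate_halt()`, every process in the ring has halted with the same `last_halt_id`.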
8
The Halting Algorithm
[Figure: halt markers passing between sending process P, receiving process Q, and processes R, S, T and U]
9
The Halting Algorithm
• Intuitive extension to Chandy & Lamport’s Algorithm [1]
• Leads to a consistent global state since:
  – Process states are the same as the recorded process states in [1]
  – Undelivered messages are the same as the recorded channel states in [1]
10
Problems with this Algorithm
• Processes that interact infrequently with other processes in the computation
• Long halting time
• Acyclic network connection
[Figure: producer P and consumer Q joined by a communication channel]
11
A Solution…
• Centralized debugger process:
[Figure: debugger process d connected to processes p and q]
12
Problems with this solution
• Communication overheads
• Possible change in execution of program
• Complex to build
13
Detecting Breakpoints
• Breakpoints & Predicates
• Predicate satisfaction = breakpoint detection
• A system of distributed processes needs:
  – Simple predicates
  – Disjunctive predicates
  – Linked predicates…interesting!
  – Conjunctive predicates…very interesting!
14
Simple Predicates
• Encapsulate single process behavior
• Detect simple events:
  – Entered procedure
  – Message sent / received
  – Channel created / destroyed
  – Process created / destroyed
15
Disjunctive predicates
• Form:
DP ::= SP [ ∪ SP ]*
• Satisfied when any SP is satisfied
• Initiate halting when DP is true
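A disjunctive predicate can be sketched as a plain "any of" over simple predicates. This is a minimal sketch under assumptions: events are modelled as dictionaries, and the helper names `entered`, `message_sent` and `disjunctive` are hypothetical, not taken from the referenced papers.

```python
def entered(procedure_name):
    """Simple predicate (hypothetical): true when the given procedure is entered."""
    return lambda event: (event.get("kind") == "enter"
                          and event.get("procedure") == procedure_name)


def message_sent(event):
    """Simple predicate (hypothetical): true when a message is sent."""
    return event.get("kind") == "send"


def disjunctive(*simple_predicates):
    """DP ::= SP [ U SP ]*: satisfied when any one SP is satisfied."""
    return lambda event: any(sp(event) for sp in simple_predicates)


# Usage: halt when 'commit' is entered or any message is sent.
dp = disjunctive(entered("commit"), message_sent)
should_halt = dp({"kind": "send", "to": "Q"})
```

Each process only needs to evaluate the simple predicates that concern its own events, which is why this class is cheap to detect.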
16
Linked Predicates
• Specify sequences of events
• Form:
LP ::= DP [ ->DP ]*
• Debugger process sends the LP {DP1->...} to processes involved in DP1
• Upon DP1, strip off DP1 & send stripped LP to processes involved in DP2
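The strip-and-forward step can be sketched as follows. This is a minimal sketch, assuming a linked predicate is held as a list of disjunctive predicates (callables on an event); the function name `advance` and the action strings are hypothetical.

```python
def advance(linked_predicate, event):
    """Process one event against LP = [DP1, DP2, ...].

    Returns (remaining_lp, action) where action is:
      'wait'    - DP1 not satisfied yet, keep the LP unchanged
      'forward' - DP1 satisfied, send the stripped LP to the processes in DP2
      'halt'    - the last DP was satisfied, start halting
    """
    head, *rest = linked_predicate
    if not head(event):
        return linked_predicate, "wait"
    if rest:
        return rest, "forward"
    return [], "halt"


# Usage: LP = (message sent) -> (procedure 'commit' entered).
dp1 = lambda e: e.get("kind") == "send"
dp2 = lambda e: e.get("kind") == "enter" and e.get("procedure") == "commit"

lp = [dp1, dp2]
lp, first = advance(lp, {"kind": "send"})
lp, second = advance(lp, {"kind": "enter", "procedure": "commit"})
```

Here `first` is `'forward'` and `second` is `'halt'`; an event that does not match the head of the list leaves the predicate untouched.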
17
Linked predicates’ implementation
[Figure: the debugger process sends DP1->DP2 to the processes involved in DP1; these send DP2 to the processes involved in DP2, which start halting. Processes shown: P, Q, R, S, T]
18
Conjunctive Predicates
• Form:
CP ::= SP [ ∩ SP ]*
• Hardest to detect!
• No single time reference across machines
• Interpretation based on virtual time:
  – Consider processes P1, P2 with virtual time axes T1, T2
  – Define
    SCP = { (t1, t2) | t1 ∈ T1, t2 ∈ T2, SP1(t1) ∩ SP2(t2) }
19
Conjunctive predicates
• Split SCP into:
  – Ordered-SCP:
    { (t1, t2) | (t1, t2) ∈ SCP, (SP1(t1) -> SP2(t2)) ∪ (SP2(t2) -> SP1(t1)) }
  – Unordered-SCP:
    { (t1, t2) | (t1, t2) ∈ SCP, (t1, t2) ∉ ordered-SCP }
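The split can be illustrated with vector timestamps standing in for virtual time. This is a sketch under a loud assumption: the slides only speak of virtual time and the "->" relation, so representing each point as a vector clock and testing "->" by component-wise comparison is a choice made here for illustration. The function names are hypothetical.

```python
def happened_before(a, b):
    """Vector-clock test for a -> b (assumed representation of virtual time)."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def split_scp(scp):
    """Partition SCP pairs (t1, t2) into ordered-SCP and unordered-SCP."""
    ordered, unordered = [], []
    for t1, t2 in scp:
        if happened_before(t1, t2) or happened_before(t2, t1):
            ordered.append((t1, t2))
        else:
            unordered.append((t1, t2))
    return ordered, unordered


# Usage: two pairs of timestamps at which SP1 and SP2 held.
scp = [
    ((1, 0), (1, 1)),  # SP1's point precedes SP2's point
    ((2, 0), (0, 1)),  # neither precedes the other
]
ordered, unordered = split_scp(scp)
```

The first pair lands in `ordered` and the second in `unordered`; the second kind is the one no single process can recognise from its own local information.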
20
Conjunctive Predicates
[Figure: two virtual time axes with points t11, t12, t13 and t21, t22, t23, marking an ordered-SCP pair and an unordered-SCP pair]
21
Conjunctive Predicates
• Detecting unordered-SCP events difficult
• Requires:
  – Global information gathering process
  – Time delay!
  – Cannot preserve meaningful process states
22
Detecting Relational Global Predicates
• Resource conflict validation problems undetectable by earlier predicate classes
• Form:
( x0 + … + xn > C )
  – xi: resource usage at Pi
  – C: total resource available
• Undecomposable into earlier classes of predicates
23
How to detect such predicates?
• 2 algorithms:
  – Decentralized: runs concurrently
  – Centralized: decoupled from the target program
24
Model & Notation
• Partial ordering on S = { S0, …, Sn } where, Si <= Sj, for 0 <= i,j <= n
• Happens-before relation: “->”
• pred.u.i: intuitively, the state just preceding u in process i
• succ.u.i: The state just succeeding u in process i
25
Concurrent States & Intervals
[Figure: timelines of processes P and Q showing local states, deterministic and non-deterministic events, a state interval and a receive interval; numeric labels 2, 3, 4 and 9, 10, 11]
26
Concurrent Intervals
[Figure: P0 with intervals (0, lo0), (0, i), (0, hi0) and P1 with intervals (1, lo1), (1, j), (1, hi1), with KEY and the pred relation marked]
27
Concurrent Intervals
• Intervals (0, i) & (1, j) are concurrent iff a KEY exists in P0 or P1 s.t.
  lo0 < i <= hi0 & lo1 < j <= hi1,
  where lo0, lo1, hi0, hi1 are as defined in the previous diagram
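The condition above translates directly into a bounds check. This is a minimal sketch, assuming each KEY is available as a tuple (lo0, hi0, lo1, hi1); how keys are gathered is not shown, and the function names are hypothetical.

```python
def concurrent_under_key(i, j, key):
    """True if intervals (0, i) and (1, j) satisfy lo0 < i <= hi0 and lo1 < j <= hi1."""
    lo0, hi0, lo1, hi1 = key
    return lo0 < i <= hi0 and lo1 < j <= hi1


def concurrent(i, j, keys):
    """Intervals are concurrent iff some KEY witnesses the condition."""
    return any(concurrent_under_key(i, j, key) for key in keys)


# Usage with made-up bounds: one key covering i in 1..3 and j in 2..4.
keys = [(0, 3, 1, 4)]
inside = concurrent(2, 3, keys)
outside = concurrent(0, 3, keys)   # i == lo0 fails the strict lower bound
```

Note the asymmetry in the bounds: the lower bound is strict and the upper bound is inclusive.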
28
Overview of algorithms
• Gather information
  – What?
  – How?
• Consider 2 processes P0 & P1
• Gather concurrent interval sequences:
  – { lo0 to hi0 } at P0 & { lo1 to hi1 } at P1
• Check resource violations at all possible pairs of states in these sequences!!
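The brute-force idea of checking every pair can be sketched as below. This is an illustrative sketch only: the dictionaries mapping interval numbers to resource usage, the function name, and the use of the bounds lo < index <= hi (borrowed from the concurrency condition) are assumptions, and the sample values are made up.

```python
def violating_pairs(x0_by_interval, x1_by_interval, key, capacity):
    """Return every concurrent pair (i, j) with x0 + x1 > C.

    x0_by_interval / x1_by_interval: hypothetical maps from interval number
    to the resource usage recorded in that interval at P0 / P1.
    key: (lo0, hi0, lo1, hi1) bounding the concurrent interval sequences.
    """
    lo0, hi0, lo1, hi1 = key
    hits = []
    for i in range(lo0 + 1, hi0 + 1):        # lo0 < i <= hi0
        for j in range(lo1 + 1, hi1 + 1):    # lo1 < j <= hi1
            if x0_by_interval[i] + x1_by_interval[j] > capacity:
                hits.append((i, j))
    return hits


# Usage with made-up numbers: total resource available C = 8.
x0 = {1: 3, 2: 5}
x1 = {1: 2, 2: 4}
hits = violating_pairs(x0, x1, key=(0, 2, 0, 2), capacity=8)
```

Only the pair (2, 2) exceeds the capacity here (5 + 4 > 8). The cost is the product of the two sequence lengths, which is the overhead the later slides try to reduce.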
29
Algorithms contd…
• Representation of
    (0, lo0)  (0, hi0)
    (1, lo1)  (1, hi1)
  as a 2x2 matrix clock
• Row i of Pi’s matrix clock = Pi’s vector clock
• Current interval at Pk = (k, Mk[ , ])
• Row k of Mk … pred() of current interval at Pk
• Row i<>k … pred.pred() of current interval at Pk
30
Maintaining Matrix Clocks
• Initialize
  – Initialize matrix to 0
  – For k = 0 or k = 1: Mk[k, k]++
• Send: tag the message with Mk[., .]; then increment Mk[k, k], for k = 0 or 1
• Receive: update the matrix clock; increment Mk[k, k]
  – Mk[k, ] = diagonal(Mk)
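The maintenance rules can be sketched in a few lines. One step is an assumption: the slide says only "update matrix clock" on receive, and the sketch takes that to mean an element-wise maximum with the message tag. With that reading, the run below yields the same matrices as those listed on the Matrix Clock Example slide. The class and method names are hypothetical.

```python
class MatrixClock:
    """Sketch of a matrix clock for process k in an n-process system."""

    def __init__(self, k, n=2):
        self.k = k
        self.m = [[0] * n for _ in range(n)]  # initialize matrix to 0
        self.m[k][k] += 1                      # Mk[k, k]++

    def send(self):
        # Tag the outgoing message with a copy of the matrix, then increment.
        tag = [row[:] for row in self.m]
        self.m[self.k][self.k] += 1
        return tag

    def receive(self, tag):
        n = len(self.m)
        # Assumption: "update matrix clock" = element-wise max with the tag.
        for i in range(n):
            for j in range(n):
                self.m[i][j] = max(self.m[i][j], tag[i][j])
        self.m[self.k][self.k] += 1
        # Mk[k, ] = diagonal(Mk)
        self.m[self.k] = [self.m[i][i] for i in range(n)]


# Usage: P1 sends to P0, then P0 sends to P1.
p0, p1 = MatrixClock(0), MatrixClock(1)
p0.receive(p1.send())
after_first = [row[:] for row in p0.m]   # [[2, 1], [0, 1]]
p1.receive(p0.send())
```

After the exchange, `p0.m` is `[[3, 1], [0, 1]]` and `p1.m` is `[[2, 1], [2, 3]]`.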
31
Matrix Clock Example
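[Figure: timelines of P0 and P1 annotated with 2x2 matrix clocks; rows are separated by ";"]
P0: [1 0; 0 0], [2 1; 0 1], [3 1; 0 1]
P1: [0 0; 0 1], [0 0; 0 2], [2 1; 2 3]
Message tags: [0 0; 0 1], [2 1; 0 1]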
32
Decentralized Algorithm
• Consider process P0
• Upon receiving a message, evaluate lo0, lo1, hi0, hi1
• Find the min value of resource (x) at P0
• Send debug message (min_x0, lo1, hi1) to P1
• P1 detects the predicate:
(min_x0 + min_x1 > C)
33
Overheads & Complexity at P0
• Message overheads:
  – Debug messages: (# of receive intervals at P0) * sizeof(3 integers)
  – Application messages: sizeof(4 integers)
• Memory:
  – # intervals at P0; min_x for each interval
• Computation:
  – (# intervals at P0) * (# debug messages sent + received)
34
Centralized Algorithm
• Checker process runs concurrently or post-mortem
• Consider the latter: processes P0 & P1
  – Processes keep trace files containing:
    • min_x for each interval
    • an array of {lo0, lo1, hi0, hi1} for each interval
  – Checker runs a check algorithm
    • Builds heaps by inserting the min_x values for all concurrent interval sequences at P0 & P1
    • Uses the heap tops to detect the predicate
35
Overheads & Complexity for P0
• Memory:
  – 4 integers for the matrix clock at each application process
• Computation:
  – Monitor local variables
  – Rest offloaded to checker
  – O(R0 + M0 log M0 + M1 log M1)
    where R0 & M0 = # of receive intervals & total intervals at P0
36
Major Practical Problems
• Reduced complexity from exponential to O(n log n), but still…
• Large overheads even for 2 processes
• Lots of messages!
• Lots of memory space!
• Lots of computation!
37
Applications to Research
• Development of distributed debugging environment
  – Recording of execution sequences
  – Rollback
  – Replay
  – Exploration of new execution scenarios
• Command of mission-control distributed systems
38
Relevance to Papers Read
• The S/Net’s Linda kernel:
  – Debugging distributed tuple space
  – Detecting race conditions, deadlocks, probe effects
• Chandy & Lamport’s paper explores the detection of stable predicates, while Garg’s paper explores the detection of unstable predicates
39
Conclusions
• Distributed debugging still challenging
• No efficient algorithm
• Hard to do away with overheads
• Need for efficient event monitoring & manipulation tools
• Message sequence chart generators
• Program flow analysis for more independent program splitting