A. Gotmanov, S. Chatterjee, M. Kishinevsky1
Proving Absence of Deadlocks in Hardware
Alexander Gotmanov, Satrajit Chatterjee and Mike Kishinevsky
Intel Corp., Strategic CAD Labs (SCL)
Outline
• Basics of hardware verification
• Communication fabrics
– Modeling
– Verification
Basics of hardware modeling and verification
Hardware vs. Software
• Software
– Thread parallelism
– “Infinite” state
– Termination
– Asynchronous
– Easy to fix
• Hardware
– Device parallelism
– Finite state
– Unbounded execution
– Synchronous
– Impossible to fix
Describing Hardware
• RTL = Register Transfer Level
– Registers
– Wires
– Finite data types
– Expressions
– Modularity
– Initial state(s)
– No final state
[Figure: circuit with state register X, logic F, input "in" and output "out"; example expressions on wires u, v, a, b: u := (v+1) * a, b := (2 <= v) & (v <= 5)]
X – Boolean state
I(X) – initial predicate
T(X,X’) – transition relation
Correctness
• When is a hardware model correct?
• Equivalence verification
– M1, M2 are models
– Is it true that fM1 = fM2?
• Property verification
– Make statements about behaviors of the model M that are expected to hold
– Is it true that they hold in every execution of M?
Linear Temporal Logic (LTL)
LTL is a formal language to reason about executions of a state machine in time
• Execution = infinite sequence of consecutive states: s1 s2 s3 s4 s5 s6 s7 …
• Predicate = statement about a state
– Example: state s42 is labeled p, ¬q, so p is true in s42, while q is false
Temporal operators / 1
• G(p): “always p”
s1 s2 s3 s4 s5 s6 s7 …
p p p p p p p …
• F(p): “eventually p”
s1 s2 s3 s4 s5 s6 s7 …
¬p ¬p ¬p p - - - …
Temporal operators / 2
• F(G(p)): “eventually G(p)”
s1 s2 s3 s4 s5 s6 s7 …
¬p ¬p p ¬p p p p …
(G(p) holds from s5 on)
• G(F(p)): “always F(p)”, p is satisfied infinitely often in the execution
s1 s2 s3 s4 s5 s6 s7 …
¬p p ¬p ¬p p ¬p p …
(F(p) holds in every state)
• Plus all propositional connectives: “·”, “+”, “→”, “¬”, …
Safety and Liveness
• Safety property (invariant)
– G(P(X))
– P(X) is always true
• Liveness property
– GF(P(X))
– P(X) is satisfied infinitely often
Guarantee and Persistence
• Guarantee property
– F(P(X))
– P(X) becomes true
• Persistence property
– FG(P(X))
– P(X) becomes “stuck” at true
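The four property classes above can be checked mechanically on an ultimately periodic execution, that is, a finite prefix followed by a loop that repeats forever. The following is a minimal sketch under that assumption; the lasso representation and all function names are illustrative and do not come from the slides.

```python
# Evaluate G, F, FG and GF on a "lasso" execution: prefix + loop repeated forever.
# Assumption: the loop is non-empty; states can be any Python values.

def always(p, prefix, loop):
    """G(p): p holds in every state of the execution."""
    return all(p(s) for s in [*prefix, *loop])

def eventually(p, prefix, loop):
    """F(p): p holds in at least one state of the execution."""
    return any(p(s) for s in [*prefix, *loop])

def persistence(p, prefix, loop):
    """FG(p): from some point on p always holds, i.e. p holds on the whole loop."""
    return all(p(s) for s in loop)

def recurrence(p, prefix, loop):
    """GF(p): p holds infinitely often, i.e. p holds somewhere on the loop."""
    return any(p(s) for s in loop)

# States are just labels here: "p" means p is true, "np" means p is false.
is_p = lambda s: s == "p"

# Trace from the F(G(p)) slide: np np p np p p p ...
fg_trace = (["np", "np", "p", "np"], ["p"])
# Trace from the G(F(p)) slide, closed into a loop: np (p np np)*
gf_trace = (["np"], ["p", "np", "np"])

print(persistence(is_p, *fg_trace), always(is_p, *fg_trace))
print(recurrence(is_p, *gf_trace), persistence(is_p, *gf_trace))
```

On a lasso, FG and GF only depend on the loop, because the prefix is visited once and then left forever.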
Property verification / 1
• Simulation (~debugging)
– Applied everywhere
– Check that the model is correct for a (small) subset of possible behaviors
– Random / semi-random / trace simulation
– Need a testbench and checkers
Property verification / 2
• Compliance monitors
– How to choose the “right” properties?
– Assuming that all properties are true, prove “correctness theorem”
– Example: proving producer-consumer ordering for PCI Express
Property verification / 3
• Small-scale formal verification
– Applied at module level, late in design cycle
– Use RTL description
– Prove properties
  – Bit-level verification
  – Bounded proofs
  – Complete proofs
• Perfecting model checking algorithms
Property verification / 4
• Large-scale formal verification
– Applied at system level, early in design cycle
– Build an abstract model (HLM)
  – TLA+
  – Murphi
  – SMV
  – xMAS
– Prove properties
  – SMT & bit-level
  – Theorem proving
  – Strengthening
  – Automatic proof
Property verification / 5
• HLM-to-RTL correlation
– Demonstrate that RTL is compliant with its High Level Model
– Simulation
  – Generate compliance checkers
– Formal verification
  – (No practical examples)
Reachability analysis / 1
• Construct step-wise reachability sets
– R_0, R_1, R_2, …
– R_0(X) = I(X)
– R_{i+1}(X’) = R_i(X’) + ∃X. R_i(X) & T(X,X’)
• Need to prove:
– G(P(X))
• For some k, R_k = R_{k+1} = R
• Check that R(X) → P(X). Q.E.D.
X – Boolean state
I(X) – initial predicate
T(X,X’) – transition relation
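The fixpoint computation can be illustrated with explicit state enumeration, the simplest of the implementation options. This is a minimal sketch, assuming states are hashable Python values and the transition relation is given as a successor function; `reachable`, `check_invariant` and the modulo-6 counter are illustrative names, not part of the presentation.

```python
def reachable(init, successors):
    """Compute R = R_k = R_{k+1} by adding images of the frontier until nothing new appears."""
    reached = set(init)       # R_0 = I
    frontier = set(init)
    while frontier:
        # Image of the frontier under T, minus states that are already known.
        new_states = {t for s in frontier for t in successors(s)} - reached
        reached |= new_states  # R_{i+1} = R_i + image(R_i)
        frontier = new_states
    return reached

def check_invariant(init, successors, prop):
    """G(prop) holds iff every reachable state satisfies prop."""
    return all(prop(s) for s in reachable(init, successors))

# Hypothetical model: a 3-bit register that counts 0, 1, ..., 5, 0, ...
counter_init = {0}
counter_step = lambda s: [(s + 1) % 6]

print(sorted(reachable(counter_init, counter_step)))
print(check_invariant(counter_init, counter_step, lambda s: s < 6))
```

Explicit enumeration only works for small state spaces; the symbolic options (BDDs, SAT, SMT) represent the sets R_i as formulas instead.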
Reachability analysis / 2
• Implementation options
– Explicit state enumeration
– Reduced Ordered BDD
– And-Inverter Graphs + SAT
– Formulas + SMT
• Bounded model checking
– Partial reachability analysis
– Unable to reach k, where R_k = R_{k+1}
– Still have “bounded” proof that P(X) is true for first n cycles of execution (n < k)
Induction / 1
• P(X) is called inductive, iff
– I(X) → P(X)
– P(X) & T(X,X’) → P(X’)
• Can we prove G(P(X)) without checking all reachable states?
• Can we prove 2n ≤ n! + 100, n ≥ 0, without checking all non-negative n?
• Theorem: if P(X) is inductive then G(P(X)) is true
– Two SAT (SMT) checks to establish inductivity
Induction / 2
• Theorem: if G(P(X)) is true, there exists inductive Q(X), such that Q(X) → P(X)
• What if P(X) is not inductive?
• “Local” properties are usually not inductive
– Example: for G(y = 1), the step y = 1 → y’ = 1 does not hold by itself
– The strengthened property G(y = 1 & x = 1) is inductive
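The two inductivity checks can be tried out by brute force on a tiny circuit. The sketch below is an assumption-laden illustration: the two-register circuit (x’ = x, y’ = x, initial state x = y = 1) is hypothetical and chosen only so that the local property y = 1 fails the step check while the strengthened property passes it; a real flow would use two SAT or SMT calls rather than enumeration.

```python
from itertools import product

def is_inductive(prop, init, successors, states):
    """Check I(X) -> P(X) and P(X) & T(X,X') -> P(X') over all states, reachable or not."""
    base = all(prop(s) for s in states if init(s))
    step = all(prop(t) for s in states if prop(s) for t in successors(s))
    return base and step

# Hypothetical circuit with state (x, y): x' = x, y' = x, initially x = y = 1.
STATES = list(product([0, 1], repeat=2))
init = lambda s: s == (1, 1)
successors = lambda s: [(s[0], s[0])]

local_prop = lambda s: s[1] == 1                     # y = 1
strengthened = lambda s: s[1] == 1 and s[0] == 1     # y = 1 & x = 1

print(is_inductive(local_prop, init, successors, STATES))
print(is_inductive(strengthened, init, successors, STATES))
```

The local property fails because the unreachable state (x = 0, y = 1) satisfies it but steps to y = 0; the strengthening excludes that state.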
Other bit-level approaches
• Interpolation [K. McMillan, CAV 2003]
– Property-oriented
– Based on Craig’s interpolation theorem
• IC3 [A. Bradley, VMCAI 2011]
– Approximate reachability
– Incremental inductive strengthening
• ABC [A. Mishchenko, R. Brayton, UC Berkeley]
– Synthesis for verification
– Parallel model checking
Equivalence verification
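• Correctness of synthesis algorithms
– Prove that optimized circuit is equivalent to the original
• Equivalence can be expressed as a property
[Figure: circuits (X1, F1) and (X2, F2) driven by the same input "in"; their outputs feed an equality comparator with output z]
– Assert(z == 1);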
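The comparator construction can be sketched for purely combinational logic, where the property z == 1 is checked for every input vector. This is a simplified illustration: the two functions `f_original` and `f_optimized` are hypothetical, and sequential circuits would additionally require reasoning over the states X1 and X2.

```python
from itertools import product

def comparator(f1, f2):
    """Build z: 1 when both circuits produce the same output on the shared input."""
    return lambda *inputs: int(f1(*inputs) == f2(*inputs))

def equivalent(f1, f2, num_inputs):
    """Assert(z == 1) for every Boolean input vector."""
    z = comparator(f1, f2)
    return all(z(*bits) == 1 for bits in product([0, 1], repeat=num_inputs))

# Hypothetical original and optimized circuits.
f_original = lambda a, b, c: (a & b) | (a & c)
f_optimized = lambda a, b, c: a & (b | c)
f_broken = lambda a, b, c: a & b

print(equivalent(f_original, f_optimized, 3))
print(equivalent(f_original, f_broken, 3))
```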
Communication fabrics: modeling
Communication Fabric = logic connecting different agents on a chip
Verification is not optional
• Tricky, distributed interaction
– deadlock
– starvation
• Modular verification doesn’t work
• Problems seen late
– during integration
– late RTL, in silicon
Our Approach: High-Level Modeling
Build early, abstract models of the microarchitecture
[Figure: agent P and agent Q connected through a fabric; queues of size k and 2, switches on x ↦ (x = req), functions x ↦ rsp, channels carrying req packets]
Goal: (Automatically) prove correctness by exploiting high-level information provided by the model
xMAS Modeling Framework
• Model is composed from a small set of primitives: source, sink, fork, join, switch, function, queue, merge
• Primitives are formally specified
• Friendly to microarchitects
• Networks where packets flow under backpressure
– naturally concurrent
– packets are conserved
[Figure: a simple example with a source of 0 (6-bit), two queues of size k, and channels x, y, z]
xMAS Semantics
• Each primitive is a synchronous (RTL) module
• Different modules are connected by channels
• Each channel carries the signals data, irdy and trdy
[Figure: two queue symbols of size k are short-hand for two FIFOs of size k connected by channels]
xMAS Semantics
• Primitives are parameterized
• Example: a function primitive f = (x ↦ x+42) between two queues of size k
[Figure: the function symbol f between two queue symbols is short-hand for a "+ 42" block between two FIFOs of size k; every channel carries data, irdy and trdy]
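To make the channel handshake concrete, the sketch below models one queue primitive as a synchronous module. It relies on two facts stated in the slides (a transfer happens when irdy and trdy are both 1, and the queue drives u.trdy = (q.num < k) and v.irdy = (q.num > 0)); the class name `Queue`, the `step` interface, and the Python encoding are assumptions made for illustration only.

```python
from collections import deque

class Queue:
    """FIFO of size k with input channel u and output channel v."""

    def __init__(self, k):
        self.k = k
        self.items = deque()

    @property
    def in_trdy(self):
        """u.trdy = (q.num < k): the queue can accept a packet."""
        return len(self.items) < self.k

    @property
    def out_irdy(self):
        """v.irdy = (q.num > 0): the queue offers a packet."""
        return len(self.items) > 0

    def step(self, in_irdy, in_data, out_trdy):
        """Advance one clock cycle; return the packet transferred on v, or None."""
        # Sample both handshakes on the current state before updating it.
        push = in_irdy and self.in_trdy
        pop = self.out_irdy and out_trdy
        sent = self.items.popleft() if pop else None
        if push:
            self.items.append(in_data)
        return sent

q = Queue(2)
q.step(True, "a", False)   # "a" enters, nothing leaves
q.step(True, "b", False)   # "b" enters, queue is now full
print(q.in_trdy)           # backpressure: input is blocked
print(q.step(True, "c", True), list(q.items))
```

In the last cycle "c" is refused because u.trdy was computed from the full queue at the start of the cycle, which is the backpressure behaviour the slides describe.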
A Simple Example
[Figure: animation over cycles 1, 2, 3, …, 6, 7, 8 of a source of 42 (8-bit) feeding queue A and queue B, both of size 2, through id functions]
A more complex example
[Figure: Agent P and Agent Q connected through a Fabric; queues of size k and 2, switches on x ↦ (x = req), functions x ↦ rsp, channels carrying req packets]
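Modeling and Verification Flow
• Model (from architects), e.g. a function (λx.x+42) between two queues of size k
• Written by hand in C++ using the xMAS API (library), provided to make this easier
• Compile into an executable
• Run the executable to produce model.v + SVA and a random testbench.v
• Simulate (RTL simulator) to produce a VCD
• Model checker (abc) to produce a proof
• Additional formal analysis
• Other backends to SystemC, TLA, Murphi, etc. possible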
Safety proofs
Channel Properties
• A channel property checks that all packets on a channel satisfy some condition
– e.g. all packets received by an agent have correct dest_id
• Example of channel property: G (z.irdy → (z.data = 0))
[Figure: source of 0 (6-bit), channel x, queue 1 (size k = 4), channel y, queue 2 (size k = 4), channel z]
• Although this property is obviously true:
– Interpolation (in ABC) takes 10 mins to prove this!
– (Usually explicit or BDD-based reachability is not an option)
Inductive Strengthening / 1
• Target channel property: G (z.irdy → (z.data = 0))
[Figure: source of 0 (6-bit), channel x, queue 1 (size k = 4), channel y, queue 2 (size k = 4), channel z]
• Use high-level structure to add invariants
– Invariant 1: G (used_j → (mem_j = 0)) (for queue 2): if location j in queue 2 is in use, it must contain 0
– Invariant 2: G (y.irdy → (y.data = 0))
– Invariant 3: G (used_j → (mem_j = 0)) (for queue 1)
• This set of invariants is inductive!
Inductive Strengthening / 2
• Similar propagation for other primitives: source, sink, fork, join, switch, function, queue, merge
• Example: propagation across function primitive
[Figure: source of 0 (6-bit), channel x, queue of size k, channel y, function v ↦ v + 7, channel z]
– Property: G (z.irdy → (z.data = 7))
– Invariant: G (y.irdy → ((y.data+7) = 7))
• The invariant is obtained syntactically (no need to invert function)
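Propagation across a function primitive amounts to composing the output predicate with the function, which is why no inverse is needed. The sketch below assumes predicates and functions are plain Python callables and that 6-bit data wraps around modulo 64; the helper name `propagate_through_function` is hypothetical.

```python
WIDTH = 6  # assumption: 6-bit data, as in the example source

def add7(v):
    """The function primitive v -> v + 7 on 6-bit data (wrap-around is assumed)."""
    return (v + 7) % (1 << WIDTH)

def propagate_through_function(prop, f):
    """Given a predicate on the output data, return the predicate on the input data.

    The result is prop(f(data)): f is substituted into prop, never inverted.
    """
    return lambda data: prop(f(data))

prop_z = lambda data: data == 7                       # z.data = 7
prop_y = propagate_through_function(prop_z, add7)     # (y.data + 7) = 7

print(prop_y(0), prop_y(1))
```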
Non-blocking property
[Figure: master M (request source, credit logic with credit queue and outstanding credits) connected to target T (ingress queue, request sink, token source); queues of size k; channel r enters the ingress queue]
• Non-blocking property on channel r: G (r.irdy → r.trdy)
• Intuition: if a packet is in r, there is room in the ingress queue
• (useful to reason about liveness)
• Hard property to prove: not inductive!
Strengthening non-blocking properties
[Figure: the same credit-based master M / target T model, annotated with the queue occupancy counts num_i, num_c and num_o]
• Non-blocking property on channel r: G (r.irdy → r.trdy)
• (Inductive) invariant: G (num_i + num_c = num_o)
How to find num_i + num_c = num_o automatically?
[Figure: the same credit-based master M / target T model, annotated with num_i, num_c and num_o]
• Introduce λ variables to count transfers on each channel
• Eliminate λs using modified Gaussian Elimination
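As a worked illustration of the idea, consider a hypothetical model that is not the credit example from the slides: a fork copies every packet from channel a into channels b and c, which feed queues q1 and q2; a join removes one packet from each queue at the same time through channels d and e. Assuming both queues start empty and writing λ_x for the number of transfers seen so far on channel x, eliminating the λ variables leaves a relation between the queue counts only:

```latex
\begin{align*}
q_1.\mathit{num} &= \lambda_b - \lambda_d, \qquad q_2.\mathit{num} = \lambda_c - \lambda_e
  && \text{(each queue: transfers in minus transfers out)} \\
\lambda_b &= \lambda_c, \qquad \lambda_d = \lambda_e
  && \text{(fork and join transfer on both sides together)} \\
q_1.\mathit{num} - q_2.\mathit{num} &= (\lambda_b - \lambda_c) - (\lambda_d - \lambda_e) = 0
  && \text{(all } \lambda \text{ eliminated)}
\end{align*}
```

The resulting invariant q1.num = q2.num mentions no λ, which is the shape of invariant the elimination step is meant to produce.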
Practical Experience
• More robust than model checking
– model checking fails on trivial examples
• More automation than theorem proving
– hundreds of invariants generated automatically to make it inductive
– invariants for non-blocking can have 20+ num terms each
Liveness proofs
Channel protocol
• Xfer: irdy = 1, trdy = 1
• Inactive: irdy = 0, trdy = 0
• FwdRetry: irdy = 1, trdy = 0 (initiator is ready, waiting for target)
• BwdRetry: irdy = 0, trdy = 1 (target is ready, waiting for initiator)
• Idle (irdy = 0): Inactive or BwdRetry
• Blocked (trdy = 0): Inactive or FwdRetry
• A retry state either persists or ends in transfer.
Deadlock in xMAS
• Deadlock in xMAS is defined for a channel
• Dead(u) = F(u.irdy · G¬u.trdy)
• Dead(u) = “eventually a packet arrives at input of u, but output of u never accepts it”
• Live(u) = ¬Dead(u) = G(u.irdy → Fu.trdy)
[Figure: token source, channel u, queue of size 2, channel v, queue of size 2, channel w, sink; each channel carries irdy, data and trdy]
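The equality between ¬Dead(u) and the G(… → F…) form follows from the standard LTL dualities ¬F φ = G ¬φ and ¬G φ = F ¬φ. A short derivation, written with ∧, ∨ in place of the slides' “·”, “+”:

```latex
\begin{align*}
\mathit{Live}(u) = \neg \mathit{Dead}(u)
  &= \neg\, \mathbf{F}\,\bigl(u.\mathit{irdy} \wedge \mathbf{G}\,\neg u.\mathit{trdy}\bigr) \\
  &= \mathbf{G}\,\neg \bigl(u.\mathit{irdy} \wedge \mathbf{G}\,\neg u.\mathit{trdy}\bigr) \\
  &= \mathbf{G}\,\bigl(\neg u.\mathit{irdy} \vee \neg \mathbf{G}\,\neg u.\mathit{trdy}\bigr) \\
  &= \mathbf{G}\,\bigl(\neg u.\mathit{irdy} \vee \mathbf{F}\, u.\mathit{trdy}\bigr) \\
  &= \mathbf{G}\,\bigl(u.\mathit{irdy} \rightarrow \mathbf{F}\, u.\mathit{trdy}\bigr)
\end{align*}
```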
Example
[Figure: token source, channel u, queue of size 2, channel v, queue of size 2, channel w, sink]
• Sink is fair: GF(w.trdy), Live(u)
• Sink is not fair: FG(¬w.trdy), Dead(u)
Fairness Constraints
• Fairness constraint for a sink with input channel x: GF(x.trdy) = ¬FG(¬x.trdy)
• Fair = conjunction of all fairness constraints
• Execution of xMAS model is fair if it satisfies all fairness constraints
State of the art
• ABC (liveness-to-safety translation)
– Reduces liveness problem to equivalent safety problem by circuit transformation
  – Doubles number of flops
– BMC can reveal “shallow” deadlocks
– Induction, interpolation, etc.
– Does not scale well to xMAS models with 10s of queues
• Example: Simple Messaging Fabric with Ordering, 75 primitives (24 queues)
– ABC: proof infeasible; BMC reaches 29 frames in 1hr
– Our method: proof in 4s
Idle and Block
[Figure: token source, channel u, queue of size 2, channel v, queue of size 2, channel w, sink]
• Idle(u) = FG(¬u.irdy): “u (eventually) stuck at idle”
• Block(u) = FG(¬u.trdy): “u (eventually) stuck at blocked”
• Dead(u) = ¬Idle(u) · Block(u): “u is in deadlock iff u stuck at blocked, but not stuck at idle”
• Fair = ¬Block(w): “sink is fair iff w not stuck at blocked”
Equations for xMAS queue
[Figure: queue q of size k with input channel u and output channel v; conditions Idle(u), Block(u), Idle(v), Block(v), Full(q), Empty(q)]
• Queue signals: u.trdy = (q.num < k), v.irdy = (q.num > 0), …
• Full(q) = FG(q.num=k): “q (eventually) stuck at full”
• Empty(q) = FG(q.num=0): “q (eventually) stuck at empty”
• Block(u) = Full(q): “u stuck at blocked iff q stuck at full”
• Full(q) → Block(v): “if q stuck at full then v stuck at blocked”
• Other equations: Idle(v) = Empty(q), Empty(q) → Idle(u), ¬Idle(u) · Block(v) → Full(q), etc.
• Equations propagate knowledge about Idle and Block conditions through a queue.
• Similar equations for all xMAS primitives.
Liveness proof
[Figure: token source, channel u, queue q1 of size 2, channel v, queue q2 of size 2, channel w, sink; annotated with Full(q1), Full(q2), Block(v), Block(w)]
• Dead(u) = ¬Idle(u) · Block(u)
• DeadEq (true for every execution):
– Block(u) = Full(q1)
– Full(q1) → Block(v)
– Block(v) = Full(q2)
– Full(q2) → Block(w)
• Fair = ¬Block(w) (true for all fair executions)
• Dead(u) · DeadEq · Fair = 0 in propositional logic
• There is no fair execution of the model, where channel u is dead. Q.E.D.
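Because Idle, Block and Full are treated as plain propositional variables, the proof above can be replayed by enumerating all assignments. The sketch below does this by brute force instead of calling a SAT solver; the variable names and the `satisfying` helper are illustrative, while the equations are the ones listed for this two-queue example.

```python
from itertools import product

NAMES = ["idle_u", "block_u", "full_q1", "block_v", "full_q2", "block_w"]

def dead_u(a):
    """Dead(u) = not Idle(u) and Block(u)."""
    return (not a["idle_u"]) and a["block_u"]

def dead_eq(a):
    """Queue equations for q1 and q2 from the two-queue example."""
    return (
        a["block_u"] == a["full_q1"]                 # Block(u) = Full(q1)
        and ((not a["full_q1"]) or a["block_v"])     # Full(q1) -> Block(v)
        and a["block_v"] == a["full_q2"]             # Block(v) = Full(q2)
        and ((not a["full_q2"]) or a["block_w"])     # Full(q2) -> Block(w)
    )

def fair(a):
    """Fair = not Block(w)."""
    return not a["block_w"]

def satisfying(*constraints):
    """Return every assignment that satisfies all the given constraints."""
    result = []
    for bits in product([False, True], repeat=len(NAMES)):
        assignment = dict(zip(NAMES, bits))
        if all(c(assignment) for c in constraints):
            result.append(assignment)
    return result

print(satisfying(dead_u, dead_eq, fair))    # no solution: u is live
print(len(satisfying(dead_u, dead_eq)))     # without fairness a deadlock candidate exists
```

Dropping the fairness constraint leaves exactly the assignment in which everything is stuck at blocked or full, matching the "sink is not fair" case.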
Method overview
• Dead(u): as ¬Idle · Block
• DeadEq (per primitive): LTL theorems, describing how Idle and Block propagate through primitives
• Fair: as ¬Block
• Solve Dead(u) · DeadEq · Fair for propositional satisfiability
– UNSAT = Liveness proof
– SAT = Counterexample
• May return unreachable executions as counterexamples
– Rule out with over-approximate reachability using automatically generated invariants.
• Can’t comprehend message-dependent deadlocks with Idle and Block only
– Capture properties of data with additional Prop conditions.
• (details in the backup)
Experimental results: proof

| Experiment | NPrims (NQueues) | Time for ABC (proof) | Num. of BMC frames in 1hr | Time for our method | Time % (DE+SAT) | Time for safety |
|---|---|---|---|---|---|---|
| Two queues | 4 (2) | <1s | >100 | <1s | - | <1s |
| SMF | 57 (20) | N/A | >100 | 2s | - | <1s |
| SMF with ordering | 75 (24) | N/A | 29 | 4s | - | 2s |
| IOF1 | 91 (19) | N/A | 15 | 2s | - | 3s |
| IOF2 | 293 (58) | N/A | 14 | 14s | 53%+47% | 1min |
| IOF3 | 480 (92) | N/A | 11 | 12min | 94%+6% | 9min |
| IOF4 (minimal) | 493 (97) | N/A | 18 | 2.5min | 85%+15% | 28s |
| IOF4 (non-minimal) | 493 (97) | N/A | 20 | 3min | 65%+35% | 3.5hr |
| IOF5 (minimal) | 680 (131) | N/A | 15 | 40min | 97%+3% | 2min |
| IOF5 (non-minimal) | 680 (131) | N/A | 20 | 44min | 94%+6% | 10hr+ |

SMF = simple messaging fabric; IOF = IO fabric.
No proofs with ABC when the number of queues is in the 10s.
Conclusion
• Verification technology to reduce liveness of xMAS model to satisfiability in propositional logic
• Can get proofs for models with 100s of queues
• Open problem: method requires extension for models with non-restricted joins
References
• “Quick Formal Modeling of Communication Fabrics to Enable Verification”, HLDVT 2010
• “Automatic Generation of Inductive Invariants from High-Level Microarchitectural Models of Communication Fabrics”, CAV 2010
• “Verifying Deadlock-Freedom of Communication Fabrics”, VMCAI 2011
Thank you
Backup
Careful with Joins
• Property: G (z.irdy → (z.data = 42))
• Hard case: join with inputs x, y, output z and function (x, y) ↦ x + y
• Easy case: join with inputs x, y, output z and function (x, y) ↦ y
– Most joins are like this in practice (used for synchronization)
Adding invariants / 1
[Figure: model with a token source, channels a, b, c, d, e, f, g and queues q1, q2, q3 of size 2; annotated with Empty(q1), Full(q2), Full(q3)]
• Dead(a), Fair = ¬Block(g)
• Dead(a) · DeadEq · Fair has a solution
• Solution is unreachable execution with q1 stuck at empty and q2, q3 stuck at full
• The execution contradicts the model flow invariant (automatically generated): q1.num = q2.num + q3.num
– (q1.num = 0) → ((q2.num = 0)·(q3.num = 0))
• Deadlock equations are about executions; flow invariants are about instantaneous state
• How to relate the two?
Adding invariants / 2
[Figure: the same model with a token source, channels a, b, c, d, e, f, g and queues q1, q2, q3 of size 2; annotated with Empty(q1), Full(q2), Full(q3)]
• Dead(a), Fair = ¬Block(g)
• Add equations connecting Idle, Block, Empty, Full, etc. with instantaneous state of the model (q.num, irdy, trdy, etc.):
– Empty(q) → (q.num = 0)
– Full(q) → (q.num = k)
– Idle(u) → (u.irdy = 0)
– …
• These equations form InfEq (eventually true for every execution)
• Add any instantaneous invariants (Inv), solve Dead(u) · DeadEq · Fair · Inv · InfEq for propositional satisfiability
– UNSAT: Live
Method revisited
• Dead(u): as ¬Idle · Block
• DeadEq (per primitive): LTL theorems, describing how Idle and Block propagate through primitives
• Fair: as ¬Block
• InfEq: connect Idle, Block, etc. with instantaneous state
• Inv: automatically generated instantaneous invariants
• Solve Dead(u) · DeadEq · Fair · InfEq · Inv
– UNSAT = Liveness proof
– SAT = Counterexample
Data dependencies / 1
[Figure: switch with input channel u, output channel v selected by a(x) and output channel w selected by b(x); conditions Idle and Block on u, v and w]
• Switch signals:
– v.irdy = u.irdy · a(u.data)
– w.irdy = u.irdy · b(u.data)
– u.trdy = u.irdy · (a(u.data) · v.trdy + b(u.data) · w.trdy)
• Idle(v) depends on what kind of data is coming from input channel u
– Idle(v) = Idle(u) + “all packets are going to the bottom branch”
• Prop_{p(x)}(u) = FG(u.irdy → p(u.data)): “u (eventually) stuck at p(x)”
• Idle(v) = Idle(u) + Prop_{¬a(x)}(u): “v stuck at idle, iff u stuck at idle or u stuck at ¬a(x)”
• Prop conditions in addition to Idle and Block, etc.
• Add equations to propagate Prop conditions through the model
• This method is a recent improvement on the one in the paper.
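The three switch equations can be evaluated directly for one clock cycle. This is a minimal sketch assuming the routing predicates a and b are Python callables on the packet data; the function name `switch_signals` and the req/rsp routing used in the example are illustrative.

```python
def switch_signals(u_irdy, u_data, v_trdy, w_trdy, a, b):
    """Combinational signals of a switch with input u and outputs v (on a) and w (on b)."""
    # Data is only inspected when a packet is actually offered on u.
    to_v = bool(u_irdy and a(u_data))
    to_w = bool(u_irdy and b(u_data))
    v_irdy = to_v                                            # v.irdy = u.irdy * a(u.data)
    w_irdy = to_w                                            # w.irdy = u.irdy * b(u.data)
    u_trdy = bool((to_v and v_trdy) or (to_w and w_trdy))    # u.trdy per the switch equation
    return v_irdy, w_irdy, u_trdy

# Hypothetical routing: requests go to v, everything else goes to w.
is_req = lambda data: data == "req"
is_not_req = lambda data: data != "req"

# v is ready, w is blocked: a request passes, a response is held back on u.
print(switch_signals(True, "req", True, False, is_req, is_not_req))
print(switch_signals(True, "rsp", True, False, is_req, is_not_req))
```

If every packet on u is eventually a response, v never sees irdy, which is the situation the Prop condition on ¬a(x) is meant to capture.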
Data dependencies / 2
[Figure: function primitive x → f(x) with input channel u and output channel v]
• Prop_{p(x)}(v) = Prop_{p(f(x))}(u): “v stuck at p(x) iff u stuck at p(f(x))”
• Number of all possible Prop conditions is (number of channels) × (average number of predicates per channel).
– Linear in the size of the model.
– May be exponential in the width of data on channels.
• Explosion is avoidable in practice
– Compute a limited subset of Prop conditions.
– Start with Prop conditions used in equations for all switches.
– Add new Prop conditions with propagation.
– Stop propagation, if Prop predicate is repeated or becomes a tautology.
Method revisited (again)
• Dead(u): as ¬Idle · Block
• DeadEq (per primitive) and Prop: LTL theorems, describing how Idle and Block propagate through primitives; find a set of Prop conditions to use
• Fair: as ¬Block
• InfEq: connect Idle, Block, etc. with instantaneous state
• Inv: automatically generated instantaneous invariants
• Solve Dead(u) · DeadEq · Fair · InfEq · Inv
– UNSAT = Liveness proof
– SAT = Counterexample
Experimental results: finding a bug
| Experiment | NPrims (NQueues) | Time for ABC (BMC) | Time for our method |
|---|---|---|---|
| 2 Master/Slave agents (no credit logic) | 16 (2) | 6s | <1s |
| SMF1 (unfair sink) | 57 (20) | 4s | 2s |
| SMF1 (broken credits logic) | 57 (20) | 8s | 2s |
| SMF2 (unfair sink) | 75 (24) | 1hr | 3s |
| SMF2 (broken ordering logic) | 75 (24) | 4hr | 3s |

• Deadlocks found with our method can be unreachable (due to over-approximation).
• In experiments, all found deadlocks were reachable.