A Template-based Approach to Complete Predicate Refinement
Tachio Terauchi (Nagoya University)
Hiroshi Unno (University of Tsukuba)
Naoki Kobayashi (University of Tokyo)
Software Model Checking
Automated Verification of Infinite State Systems
• Data: Infinite (e.g., integers)
• Control: Finite, PDS (aka CFL reachability)
– SLAM, BLAST, IMPACT, ARMC, Terminator, etc.
SMC Internals
• FOL predicate abstraction of infinite data
– E.g. “x < y” = set of states ρ where ρ(x) < ρ(y)
– Exploits advances in SAT/SMT solving
• CEGAR to automatically refine abstraction
– Inference of appropriate FOL predicates
Same design also used in “higher-order SMC”: Depcegar, MoCHi, HMC, etc.
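A minimal sketch of the predicate abstraction step described above, assuming program states are plain dictionaries and predicates are Python functions. The names `abstract` and `PREDICATES` are illustrative and do not come from any of the tools listed.

```python
# Each predicate denotes the set of states on which it evaluates to True.
PREDICATES = [
    lambda s: s["x"] == 0,        # x = 0
    lambda s: s["y"] == 0,        # y = 0
    lambda s: s["x"] == s["y"],   # x = y
]

def abstract(state, predicates=PREDICATES):
    """Map a concrete state to the tuple of truth values of the predicates."""
    return tuple(pred(state) for pred in predicates)

# Infinitely many concrete states collapse onto the same abstract state.
print(abstract({"x": 3, "y": 3}))
print(abstract({"x": 7, "y": 7}))
```

Under this abstraction every state on the diagonal other than the origin looks identical, which is the information the loop example needs.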
Example
Predicates: x = 0, y = 0 (refined to x = 0, y = 0, x = y)
x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y);
[Figure: abstract states at successive program points: ⊤, x = 0 ∧ y = 0, ⊤, ⊤, ⊤; check at the assertion: ⊤ ⇒ x = y]
Example
Predicates: x = 0, y = 0, x = y
x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y);
[Figure: abstract states at successive program points: ⊤, x = 0 ∧ y = 0, x = y, x = y, x = y]
Problem
• A refinement can be any set of predicates that refutes the c.ex.
– Not unique in general
• We got lucky by choosing x = y
– Could have chosen x = 1 instead
• And then choose x = 2, x = 3, … ad infinitum
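To make the contrast concrete, here is a small sketch that compares the two kinds of refinement on the running loop example. It checks inductiveness only over a bounded window of integer states, so it is a sanity check rather than a proof, and the helper names (`is_inductive`, `point_invariant`) are hypothetical.

```python
def step(x, y):
    """One iteration of the loop body: x++; y++."""
    return x + 1, y + 1

def is_inductive(inv, window=range(-3, 104)):
    """Check (on a bounded window) that inv holds initially and is preserved by the loop."""
    if not inv(0, 0):                      # initial state x = 0, y = 0
        return False
    for x in window:
        for y in window:
            if inv(x, y) and x < 100 and not inv(*step(x, y)):
                return False
    return True

def point_invariant(n):
    """Invariant built from the predicates x = i, y = i for i = 0..n."""
    return lambda x, y: any(x == i and y == i for i in range(n + 1))

def eq_invariant(x, y):
    """Invariant built from the single predicate x = y."""
    return x == y

print(is_inductive(eq_invariant))        # the single predicate x = y suffices
print(is_inductive(point_invariant(3)))  # x = 0..3 only covers the first iterations
```

With point predicates each refinement round covers one more loop iteration, whereas x = y covers all of them at once.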
Example failing to converge
Predicates: x = 0, y = 0 (refined to x = 0, y = 0, x = 1)
x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y);
[Figure: abstract states at successive program points: ⊤, x = 0 ∧ y = 0, ⊤, ⊤, ⊤, ⊤; check at the assertion: ⊤ ⇒ x = y]
Example failing to converge
Predicates: x = 0, y = 0, x = 1 (refined to x = 0, y = 0, x = 1, x = 2)
x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y);
[Figure: abstract states at successive program points: ⊤, x = 0 ∧ y = 0, ⊤, x = 1, x = 1, ⊤; check at the assertion: ⊤ ⇒ x = y]
Solution : Complete SMC
Def: Let X be a FOL theory (e.g., X = QF_UFLRA). SMC is said to be complete wrt. X when
(∃preds ⊆ X. P ⊨_preds safe) ⇔ SMC(P) returns “safe”
Complete SMC in CEGAR (1/2)[Jhala,McMillan TACAS’06]
• Let X be some FOL theory
– “theory”: set of (normalized) formulas
• Let L0, L1, … ⊆ X s.t.
– Each Li is finite
– For each i, Li ⊆ L_{i+1}
– ⋃_{i∈ω} Li = X
• E.g.,
– X = QF_UFLRA
– Li = {θ ∈ X | atomic terms in θ are of size ≤ i}
Complete SMC in CEGAR (2/2)[Jhala,McMillan TACAS’06]
Init L := some Li ∈ {L0, L1, …}
Repeat
Run SMC but restricting refinements to L
• If proved safe, exit with “safe”
• If fail to prove, let π = counterexample
– Find Lj s.t. L ⊆ Lj and Lj contains a refinement for π
» Exit with “unsafe” if no such Lj exists
– Set L := Lj and repeat
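The loop above can be sketched as follows. This is only an outline under simplifying assumptions: the chain L0 ⊆ L1 ⊆ … is given as a finite Python list (so "no such Lj exists" means the list is exhausted), and `run_smc` and `has_refinement` are hypothetical callbacks standing in for the model checker and the refinement search. The toy callbacks at the bottom exist purely to make the sketch runnable.

```python
def complete_cegar(program, languages, run_smc, has_refinement):
    """Outer loop: run SMC with refinements restricted to L, enlarging L on failure."""
    idx = 0                                       # L := L0
    while True:
        verdict, cex = run_smc(program, languages[idx])
        if verdict == "safe":
            return "safe"
        # Find the next Lj (with L ⊆ Lj) that contains a refinement for the c.ex.
        nxt = next((j for j in range(idx + 1, len(languages))
                    if has_refinement(languages[j], cex)), None)
        if nxt is None:
            return "unsafe"                       # no language refutes the c.ex.
        idx = nxt                                 # L := Lj and repeat

# Toy stand-ins (assumptions): SMC succeeds exactly when "x = y" is available.
def toy_smc(program, language):
    return ("safe", None) if "x = y" in language else ("unknown", "loop-cex")

def toy_has_refinement(language, cex):
    return "x = y" in language

CHAIN = [{"x = 0", "y = 0"},
         {"x = 0", "y = 0", "x = 1"},
         {"x = 0", "y = 0", "x = 1", "x = y"}]
print(complete_cegar("loop", CHAIN, toy_smc, toy_has_refinement))
```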
Challenges
1. Given L and c.ex. π, quickly find preds ⊆ L s.t. π ⊨_preds safe
2. Find Lj s.t. L ⊆ Lj ∧ ∃preds ⊆ Lj. π ⊨_preds safe
– This can be done by existing methods
Challenges
1. Given L and c.ex. π, quickly find preds ⊆ L s.t. π ⊨_preds safe
Problem is obviously decidable
– Because L is finite
– “quickly” is the issue
• Existing method [Jhala,McMillan TACAS’06] only handles limited theory (QF_UFDL)
Overview of c.ex. refinement
• Refinement reduces to inferring ψ(y) s.t. θ(x,y) ⇒ ψ(y), ψ(y) ⇒ φ(y,z), and θ(x,y), φ(y,z), ψ(y) ∈ X
θ(x,y) : “what is true about x,y at the program point”
φ(y,z) : “what must hold true about y,z after the point to refute the c.ex.”
ψ(y) : “sufficient fact about y at the point to refute the c.ex.”
So, to do complete refinement
• Just restrict ψ(y) to the current L when doing this
A Template-based Approach (QF_LRA)
Template T: QF_LRA formula with bounded coefficient variables
– E.g.
c0x + c2y + c3 ≤ 0 ∧ c4x + c5y + c6 ≤ 0 ∨ c5x + c6y + c7 < 0
• Each c is associated with bound Bc ⊆_fin ℤ
Idea: Let L = the instances of T and use “increasingly larger” T’s for L0 ⊆ L1 ⊆ …
Searching for Refinements in T (1/3)
Problem: Decide if ∃c0∈B0,…,cn∈Bn. ∀x0,…,xm. (θ ⇒ T) ∧ (T ⇒ φ)
∃c0∈B0,…,cn∈Bn. ∀x0,…,xm. Ψ(c0,…,cn,x0,…,xm)
Ψ is a non-linear arithmetic formula over rationals
– linear on x’s with coefficients on c’s
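For intuition, the ∃c ∀x problem can be explored by brute force on a toy instance. The sketch below is not the SAT-based search of the slides: it enumerates every instance of a one-atom template c0·x + c1·y + c2 ≤ 0 with coefficients in {-bound, …, bound}, and it replaces the universal quantifier by a check over a bounded window of integer points, which in general can accept spurious instances. The choices of θ (the state after one loop iteration) and φ, and all function names, are assumptions made for illustration.

```python
from itertools import product

WINDOW = range(-110, 111)   # bounded stand-in for "for all x, y" (an assumption)

def theta(x, y):
    """What is known at the program point: here, the state after one iteration."""
    return x == 1 and y == 1

def phi(x, y):
    """What must hold to refute the c.ex.: x >= 100 implies x = y."""
    return x < 100 or x == y

def template(c, x, y):
    """One instance of the template c0*x + c1*y + c2 <= 0."""
    return c[0] * x + c[1] * y + c[2] <= 0

def search(bound):
    """Return all coefficient tuples c with theta => T[c] and T[c] => phi on WINDOW."""
    found = []
    for c in product(range(-bound, bound + 1), repeat=3):
        ok = all((not theta(x, y) or template(c, x, y)) and
                 (not template(c, x, y) or phi(x, y))
                 for x in WINDOW for y in WINDOW)
        if ok:
            found.append(c)
    return found

solutions = search(1)
print((1, 0, -1) in solutions)   # the instance x - 1 <= 0, i.e. x <= 1
```

The search space grows exponentially with the number of coefficient variables, which is one reason a symbolic encoding is preferable to enumeration.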
Searching for Refinements in T (2/3)
∃c0∈B0,…,cn∈Bn. ∀x0,…,xm. Ψ(c0,…,cn,x0,…,xm)
Ψ is a non-linear arithmetic formula over rationals
– linear on x’s with coefficients on c’s
1. Convert Ψ to CNF ⋀j ψj
– ψj of the form ¬(Ax ≤ a ∧ Bx < b) s.t. a, b, A, B are over c’s
2. Apply Motzkin’s transposition theorem to each ψj
Ax ≤ a ∧ Bx < b is unsatisfiable iff ∃r≥0, p≥0. rA + pB = 0 ∧ (ra + pb < 0 ∨ (p ≠ 0 ∧ ra + pb ≤ 0))
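The right-hand side of the theorem can be read as a certificate check: given candidate multipliers r and p, unsatisfiability of Ax ≤ a ∧ Bx < b is confirmed by simple arithmetic. The sketch below checks such a certificate for concrete numeric matrices given as lists of rows; the function name is hypothetical, and finding r and p (rather than checking them) is the part that the slides delegate to a solver.

```python
def is_motzkin_certificate(A, a, B, b, r, p):
    """Check r >= 0, p >= 0, rA + pB = 0 and (ra + pb < 0 or (p != 0 and ra + pb <= 0))."""
    if any(v < 0 for v in r) or any(v < 0 for v in p):
        return False
    rows = list(A) + list(B)
    ncols = len(rows[0])
    # Column-wise linear combination rA + pB must vanish.
    for j in range(ncols):
        combo = (sum(r[i] * A[i][j] for i in range(len(A))) +
                 sum(p[i] * B[i][j] for i in range(len(B))))
        if combo != 0:
            return False
    rhs = (sum(r[i] * a[i] for i in range(len(A))) +
           sum(p[i] * b[i] for i in range(len(B))))
    p_nonzero = any(v != 0 for v in p)
    return rhs < 0 or (p_nonzero and rhs <= 0)

# x <= 0 and -x < 0 (i.e. x > 0) is unsatisfiable; r = (1), p = (1) certifies it.
print(is_motzkin_certificate([[1]], [0], [[-1]], [0], [1], [1]))
```

Passing `fractions.Fraction` values instead of integers keeps the arithmetic exact over the rationals.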
Searching for Refinements in T (3/3)
• Now, the problem is of the form
∃c0∈B0,…,cn∈Bn, r≥0, p≥0. Φ(r,p,c0,…,cn)   ①
• Existential formula (i.e., got rid of ∀x0,…,xm)
• Φ is a non-linear arithmetic formula
– linear on r and p’s with coefficients on c’s
Prop: Let φ be a satisfiable QF_LIA formula with
• n vars, m literals, and coefficients bounded by k
Then, there is a solution of φ bounded by 2log(n+2) + m(log(m) + log(k))
Bit-blast and reduce ① to SAT
QF_UFLRA
• UF
– Function symbols f1, f2, …, fk
– For each fj of arity n
∀x1…xn,y1…yn. ⋀i xi = yi ⇒ fj(x1…xn) = fj(y1…yn)
• Useful for conservatively modeling operators like ::, ×
L-restricting UF
1. Incorporate UF terms in templates as follows
c0f(c1x + c2y + c3 + c4g(c5x + c6y + c7)) + …
2. Apply Ackermann expansion
For each UF subterm f(t) ∈ θ, let x_{f(t)} be a fresh var.
Let φ = ⋀_{f(t1),f(t2)∈θ} ρ(t1) = ρ(t2) ⇒ x_{f(t1)} = x_{f(t2)}
• ρ replaces f(t) by x_{f(t)}
Prop: QF_UFLRA ⊨ θ iff QF_LRA ⊨ φ ⇒ ρ(θ)
Idea from [Beyer et al. VMCAI’07]
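A small sketch of the Ackermann expansion step, assuming terms are nested tuples such as ("f", "x") for f(x) and ("+", "y", 1) for y + 1. The term encoding, the set of interpreted operators, and the fresh-variable naming scheme are all illustrative choices.

```python
INTERPRETED = {"+", "-", "*", "=", "<=", "<"}   # assumed arithmetic/relational symbols

def rho(term, table):
    """Replace every UF application f(t) in `term` by a fresh variable x_f_i."""
    if not isinstance(term, tuple):             # variable name or numeric constant
        return term
    op, *args = term
    new_args = tuple(rho(arg, table) for arg in args)
    if op in INTERPRETED:
        return (op, *new_args)
    key = (op, new_args)                        # UF application with flattened args
    if key not in table:
        table[key] = f"x_{op}_{len(table)}"
    return table[key]

def congruence_constraints(table):
    """For each pair f(t1), f(t2): (pairs of args to equate) => (vars to equate)."""
    entries = list(table.items())
    constraints = []
    for i, ((f, args1), var1) in enumerate(entries):
        for (g, args2), var2 in entries[i + 1:]:
            if f == g and len(args1) == len(args2):
                constraints.append((tuple(zip(args1, args2)), (var1, var2)))
    return constraints

table = {}
flat = rho(("=", ("f", "x"), ("f", ("+", "y", 1))), table)
print(flat)
print(congruence_constraints(table))
```

Each returned constraint reads "if the listed argument pairs are equal, then the two fresh variables are equal", which is the conjunct contributed to φ.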
QF_AUFLRA
∀a,e,i. rd(wr(a,i,e),i) = e
∀a,e,i,j. i ≠ j ⇒ rd(wr(a,i,e),j) = rd(a,j)
∀a,b. a ≠ b ⇒ rd(a,diff(a,b)) ≠ rd(b,diff(a,b))
• Useful for modeling pointers
– QF_AUFLRA can be reduced to QF_UFLRA
• See, e.g., [Totla, Wies POPL’13]
No, it doesn’t scale
I was oversimplifying the problem
– Infer ψ(y) s.t. θ(x,y) ⇒ ψ(y) and ψ(y) ⇒ φ(y,z)
c.ex. refinement in reality:
– Infer ψ1(y1), ψ2(y2), … ψn(yn) s.t.
θ1(x1,y1) ⇒ ψ1(y1) ∧
θ2(x2,y2) ∧ ψ1(x2) ⇒ ψ2(y2) ∧
…
θn(xn,yn) ∧ ψn−1(xn) ⇒ φ(yn,z)
So, to infer L-restricted refinement
Need to restrict ψ1(y1), ψ2(y2), … ψn(yn) to L
Lots of templates! T1, T2, … Tn
– Proportional to the size of c.ex.
– Lots of non-linear terms in the constraints
Doesn’t scale even with state-of-the-art SAT solvers (or SMT solvers for non-linear real arithmetic)
Solution (Informal)
Key Observation: a counterexample in SMC (and the constraints solved to refute it) is always a repetition of a fixed set of patterns.
– Use the observation to L-restrict only a few ψ’s and still achieve complete refinement
Example (1/2)
c.ex. are of the form
θinit(x,y) ⇒ ψ1(x1,y1) ∧
ψ1(x1,y1) ∧ θloop(x1,y1,x2,y2) ⇒ ψ2(x2,y2) ∧
ψ2(x2,y2) ∧ θloop(x2,y2,x3,y3) ⇒ ψ3(x3,y3) ∧
…
ψn(xn,yn) ⇒ φ(xn,yn)
x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y);
θinit(x,y) ⇔ x = 0 ∧ y = 0
θloop(x,y,x’,y’) ⇔ x < 100 ∧ x’ = x + 1 ∧ y’ = y + 1
φ(x,y) ⇔ (x ≥ 100 ⇒ x = y)
Example (2/2)
Theorem: The following strategy is sufficient for complete predicate refinement:
1. Pick some constant k > 0
2. Infer L-restricted refinement for i×k-th ψ’s (i.e., ψ_{i×k})
3. Infer unrestricted refinement for other ψ’s (e.g., via interpolation)
• This reduces to [Jhala,McMillan TACAS’06] when k = 1
• Larger k -> less L-restriction
– Proof:
• On board
Formalization
Key Observation: Let P be a program. There exists a set of Horn-clause-like rules R s.t. for any c.ex. π of SMC(P), the set of constraints solved to refute π is an acyclic instance of R
– P1(x) ∧ … ∧ Pn(x) ∧ θ(x,y) ⇒ Q(y)
– P1(x) ∧ … ∧ Pn(x) ∧ θ(x,y) ⇒ φ(x,y)
• P, Q, … : predicate variables
Acyclic instance: copies of rules from R with fresh renaming of pred. vars s.t. there is no cycle P ⇒ … ⇒ P
Example
x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y);
R =
x = 0 ∧ y = 0 ⇒ P(x,y)
P(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q(x’,y’)
Q(x,y) ⇒ P(x,y)
P(x,y) ∧ x ≥ 100 ⇒ x = y
E.g. consts(π) =
x = 0 ∧ y = 0 ⇒ P1(x,y)
P1(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q1(x’,y’)
Q1(x,y) ⇒ P2(x,y)
P2(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q2(x’,y’)
Q2(x,y) ⇒ P3(x,y)
P3(x,y) ∧ x ≥ 100 ⇒ x = y
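The acyclic instance for a counterexample that goes around the loop n times can be generated mechanically from R. The sketch below does this for the running example only, producing the clauses as ASCII strings ("and" for conjunction, "=>" for implication); the function name `unwind` is hypothetical.

```python
def unwind(n):
    """Constraints for the c.ex. with n loop iterations, as an acyclic instance of R."""
    clauses = ["x = 0 and y = 0 => P1(x,y)"]
    for i in range(1, n + 1):
        # Fresh copies P_i, Q_i of the predicate variables, so no cycle P => ... => P.
        clauses.append(
            f"P{i}(x,y) and x < 100 and x' = x+1 and y' = y+1 => Q{i}(x',y')")
        clauses.append(f"Q{i}(x,y) => P{i + 1}(x,y)")
    clauses.append(f"P{n + 1}(x,y) and x >= 100 => x = y")
    return clauses

for clause in unwind(2):
    print(clause)
```

Calling `unwind(2)` yields six clauses over P1, Q1, P2, Q2, P3, matching the shape of consts(π) in the example.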
• The observation also holds for higher-order SMC (e.g., Depcegar, MoCHi, HMC), and SMC for concurrent programs (e.g., Threader)
• Somewhat more general than [Grebenshchikov et al. PLDI’12]
– Only says that c.ex. are instances of the rules
Bounded Patterns
Def: A set of bounded patterns A of R is a finite set of acyclic instances of R
– Can view each element of A as a “combined” rule
Def: A set of bounded patterns A of R is partitioning if for any acyclic instance G of R, there exists an instance A’ of A s.t. G and A’ are isomorphic
– E.g.,
• R is a partitioning set of bounded patterns of R
• So is any A ∪ R where A is a set of bounded patterns of R
Example
x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y);
R =
x = 0 ∧ y = 0 ⇒ P(x,y)
P(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q(x’,y’)
Q(x,y) ⇒ P(x,y)
P(x,y) ∧ x ≥ 100 ⇒ x = y
A = R ∪ {{ P0(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q0(x’,y’), Q0(x,y) ⇒ P1(x,y),
P1(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q1(x’,y’), Q1(x,y) ⇒ P2(x,y),
P2(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q2(x’,y’), Q2(x,y) ⇒ P3(x,y) }}
A’ : On board
L-restriction at Boundaries
Def: Let A’ be a partition of c.ex. G by A. The boundaries of partition A’ are the predicate variables that appear in more than one element of A’
Theorem: L-restriction at boundaries is sufficient for complete predicate refinement
– Proof: Preds at boundaries determine the preds at internal nodes. So, L-restr. at boundaries -> finite # of possible refinements for internals
How to pick bounded partitioning
A simple strategy: View G as a dag of P’s, L-restrict each i×k-th P (i.e., P_{i×k}) from a root, where k is some constant and i = 1,2,3,…
• Reduces to [Jhala,McMillan TACAS’06] when k = 1
• Larger k -> less L-restriction
Theorem: The above ensures bounded partitioning
– Proof: Because there are only a finite # of dags generated by R of path lengths bounded by k
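As a rough illustration of the strategy, the sketch below handles only the special case where G is a simple chain P1, …, Pn (a general dag would need a notion of depth from the root, which is not spelled out here). The function name is hypothetical.

```python
def restricted_indices(n, k):
    """Indices i*k (i = 1, 2, 3, ...) of the predicate variables P1..Pn to L-restrict."""
    if k <= 0:
        raise ValueError("k must be a positive constant")
    return [i for i in range(1, n + 1) if i % k == 0]

print(restricted_indices(7, 3))   # only every third predicate variable is L-restricted
print(restricted_indices(4, 1))   # k = 1: every predicate variable is L-restricted
```

The remaining predicate variables are left unrestricted and can be handled by, e.g., interpolation.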
Conclusion
• Complete predicate refinement for the theory of QF_AUFLRA
– Template-based
• Bounded coefficients allow reduction to SAT
– Extends L-restricted refinement [Jhala,McMillan TACAS’06]
• Exploits the observation that c.ex. are repetitions of some patterns
• Only L-restrict predicate variables at boundaries of bounded patterns