
A Template-based Approach to Complete Predicate Refinement

Tachio Terauchi (Nagoya University)

Hiroshi Unno (University of Tsukuba)

Naoki Kobayashi (University of Tokyo)

Software Model Checking

Automated Verification of Infinite State Systems

• Data: Infinite (e.g., integers)

• Control: Finite, PDS (aka CFL reachability)

– SLAM, BLAST, IMPACT, ARMC, Terminator, etc.

SMC Internals

• FOL predicate abstraction of infinite data

– E.g. “x < y” = set of states ρ where ρ(x) < ρ(y)

– Exploits advances in SAT/SMT solving

• CEGAR to automatically refine abstraction

– Inference of appropriate FOL predicates

Same design also used in “higher-order SMC”: Depcegar, MoCHi, HMC, etc.

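Example

x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y);

[Figure: abstraction with predicates x = 0, y = 0 (next step: x = 0, y = 0, x = y). State labels: ⊤; x = 0 ∧ y = 0; ⊤; ⊤; ⊤. Check at the assertion: ⊤ ⇒ x = y.]

Example

[Figure: abstraction with predicates x = 0, y = 0, x = y. State labels: ⊤; x = 0 ∧ y = 0; x = y; x = y; x = y.]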

Problem

• A refinement can be any set of predicates that refutes the c.ex.

– Not unique in general

• We got lucky by choosing x = y

– Could have chosen x = 1 instead

• And then chosen x = 2, x = 3, … ad infinitum
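The difference between the two choices can be tested on the example loop. The following is a minimal sketch that checks candidate predicates by brute force over a small integer range; the range and the helper names `is_inductive` and `implies_assertion` are assumptions made for illustration, and a sampled check like this is a test, not a proof (a model checker would query an SMT solver instead).

```python
# Brute-force sanity check on the example loop:
#   x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y)
# Values are sampled from a finite range (an assumption), so this is
# a test of the candidate predicates, not a proof.

RANGE = range(-5, 106)

def is_inductive(pred):
    """pred holds initially and is preserved by one loop iteration."""
    if not pred(0, 0):
        return False
    return all(pred(x + 1, y + 1)
               for x in RANGE for y in RANGE
               if pred(x, y) and x < 100)

def implies_assertion(pred):
    """At loop exit (x >= 100), pred must imply x = y."""
    return all(x == y
               for x in RANGE for y in RANGE
               if pred(x, y) and x >= 100)

def x_eq_y(x, y):
    return x == y

def pointwise(x, y):
    # Strongest loop-head fact expressible with x = 0, y = 0, x = 1.
    return (x == 0 and y == 0) or x == 1
```

On this sample range, x = y passes both checks, whereas the fact built from x = 0, y = 0, x = 1 is not preserved by the loop body, which is why the refinement loop would go on to ask for x = 2, x = 3, and so on.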

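Example failing to converge

[Figure: abstraction with predicates x = 0, y = 0 (next step: x = 0, y = 0, x = 1). State labels: ⊤; x = 0 ∧ y = 0; ⊤; ⊤; ⊤. Check at the assertion: ⊤ ⇒ x = y.]

Example failing to converge

[Figure: abstraction with predicates x = 0, y = 0, x = 1 (next step: x = 0, y = 0, x = 1, x = 2). State labels: ⊤; x = 0 ∧ y = 0; ⊤; x = 1; x = 1; ⊤. Check at the assertion: ⊤ ⇒ x = y.]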

Solution : Complete SMC

Def: Let X be a FOL theory (e.g., X = QF_UFLRA). SMC is said to be complete wrt. X when

∃preds ⊆ X. P ⊨_preds safe ⇔ SMC(P) returns “safe”

Complete SMC in CEGAR (1/2) [Jhala, McMillan TACAS’06]

• Let X be some FOL theory

– “theory”: set of (normalized) formulas

• Let L0, L1, … ⊆ X s.t.

– Each Li is finite

– For each i, Li ⊆ Li+1

– ⋃_{i∈ω} Li = X

• E.g.,

– X = QF_UFLRA

– Li = {θ ∈ X | atomic terms in θ are of size ≤ i}

Complete SMC in CEGAR (2/2) [Jhala, McMillan TACAS’06]

Init L := some Li ∈ {L0, L1, …}

Repeat: Run SMC but restricting refinements to L

• If proved safe, exit with “safe”

• If fail to prove, let π = counterexample

– Find Lj s.t. L ⊆ Lj and Lj contains a refinement for π

» Exit with “unsafe” if no such Lj exists

– Set L := Lj and repeat
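The loop above can be sketched as a small driver. This is only an illustration under stated assumptions: the chain L0 ⊆ L1 ⊆ … is cut off to a finite list, and `run_smc` and `has_refinement` are hypothetical callbacks standing in for the model checker and for the search for a refinement; neither name comes from the slides.

```python
def complete_smc(languages, run_smc, has_refinement):
    """Driver for L-restricted CEGAR (sketch).

    languages      : finite list of predicate sets L0 ⊆ L1 ⊆ ... (assumed)
    run_smc(L)     : hypothetical; returns ("safe", None) or ("cex", pi)
    has_refinement : hypothetical; has_refinement(L, pi) -> bool
    """
    idx = 0
    while True:
        verdict, cex = run_smc(languages[idx])
        if verdict == "safe":
            return "safe"
        # SMC restricted to the current L could not refute cex, so only
        # strictly larger languages are tried.
        nxt = next((j for j in range(idx + 1, len(languages))
                    if has_refinement(languages[j], cex)), None)
        if nxt is None:
            return "unsafe"
        idx = nxt


# Toy stand-ins: the program is provable exactly when "x=y" is available.
LANGS = [frozenset({"x=0", "y=0"}),
         frozenset({"x=0", "y=0", "x=1"}),
         frozenset({"x=0", "y=0", "x=1", "x=y"})]

def toy_smc(L):
    return ("safe", None) if "x=y" in L else ("cex", "pi")

def toy_refines(L, cex):
    return "x=y" in L
```

With the toy callbacks, the driver skips L1 (which contains no refinement for the counterexample) and terminates once it reaches L2.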

Challenges

1. Given L and c.ex. π, quickly find preds ⊆ L s.t. π ⊨_preds safe

2. Find Lj s.t. L ⊆ Lj ∧ ∃preds ⊆ Lj. π ⊨_preds safe

– This can be done by existing methods

Challenges

1. Given L and c.ex. π, quickly find preds ⊆ L s.t. π ⊨_preds safe

Problem is obviously decidable

– Because L is finite

– “quickly” is the issue

• Existing method [Jhala, McMillan TACAS’06] only handles a limited theory (QF_UFDL)

Overview of c.ex. refinement

• Refinement reduces to inferring ψ(y) s.t. θ(x,y) ⇒ ψ(y), ψ(y) ⇒ φ(y,z), and θ(x,y), φ(y,z), ψ(y) ∈ X

θ(x,y): “what is true about x,y at the program point”

φ(y,z): “what must hold true about y,z after the point to refute the c.ex.”

ψ(y): “sufficient fact about y at the point to refute the c.ex.”

So, to do complete refinement

• Just restrict ψ(y) to the current L when doing this

A Template-based Approach (QF_LRA)

Template T: QF_LRA formula with bounded coefficient variables

– E.g.

c0x + c1y + c2 ≤ 0 ∧ c3x + c4y + c5 ≤ 0 ∨ c6x + c7y + c8 < 0

• Each c is associated with bound Bc ⊆fin ℤ (a finite set of integers)

Idea: Let L = the instances of T and use “increasingly larger” T’s for L0 ⊆ L1 ⊆ …
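To make "L = the instances of T" concrete, here is a minimal sketch that enumerates every instance of a one-atom template T(y) = c0·y + c1 ≤ 0 and keeps those lying between an example θ and φ. The formulas `theta` and `phi`, the bound {-1, 0, 1}, and the integer sample grid are all assumptions chosen for illustration. The check is by sampling, so it only filters candidates; the slides instead decide the question exactly by a reduction to SAT.

```python
from itertools import product

BOUND = (-1, 0, 1)    # assumed bound B_c, shared by both coefficients
GRID = range(-3, 4)   # integer sample points (illustration only)

def theta(x, y):
    # Assumed example of "what is true at the program point".
    return x >= 1 and y >= x

def phi(y):
    # Assumed example of "what must hold to refute the c.ex.".
    return y > 0

def template(c0, c1, y):
    # T(y) = c0*y + c1 <= 0
    return c0 * y + c1 <= 0

def search():
    """Instances (c0, c1) with theta => T and T => phi on the grid."""
    hits = []
    for c0, c1 in product(BOUND, repeat=2):
        pre = all(template(c0, c1, y)
                  for x in GRID for y in GRID if theta(x, y))
        post = all(phi(y) for y in GRID if template(c0, c1, y))
        if pre and post:
            hits.append((c0, c1))
    return hits
```

On this grid the only surviving instance is c0 = -1, c1 = 1, that is, -y + 1 ≤ 0 (y ≥ 1). Enlarging `BOUND` corresponds to moving to a larger Li.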

Searching for Refinements in T (1/3)

Problem: Decide if ∃c0∈B0,…,cn∈Bn. ∀x0,…,xm. (θ ⇒ T) ∧ (T ⇒ φ)

This has the form ∃c0∈B0,…,cn∈Bn. ∀x0,…,xm. Ψ(c0,…,cn,x0,…,xm)

Ψ is a non-linear arithmetic formula over rationals

– linear on x’s with coefficients on c’s

1. Convert Ψ to CNF ⋀j ψj

– ψj of the form ¬(Ax ≤ a ∧ Bx < b) s.t. a, b, A, B are over c’s

2. Apply Motzkin’s transposition theorem to each ψj

Ax ≤ a ∧ Bx < b is unsatisfiable iff ∃r≥0, p≥0. rA + pB = 0 ∧ (ra + pb < 0 ∨ (p ≠ 0 ∧ ra + pb ≤ 0))
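The right-hand side of the theorem is easy to check for given multipliers. The sketch below verifies a candidate certificate (r, p) with exact rational arithmetic; the function names and the list-of-rows matrix encoding are assumptions for illustration, and the code only checks a certificate, it does not search for one.

```python
from fractions import Fraction

def dot(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def combo(w, M, ncols):
    """Row vector w times matrix M (given as a list of rows)."""
    return [sum(Fraction(w[i]) * Fraction(M[i][j]) for i in range(len(M)))
            for j in range(ncols)]

def is_motzkin_certificate(A, a, B, b, r, p):
    """Check r,p >= 0, rA + pB = 0 and
    (ra + pb < 0  or  (p != 0 and ra + pb <= 0))."""
    ncols = len((A + B)[0])
    if any(v < 0 for v in list(r) + list(p)):
        return False
    rA, pB = combo(r, A, ncols), combo(p, B, ncols)
    if any(u + v != 0 for u, v in zip(rA, pB)):
        return False
    s = dot(r, a) + dot(p, b)
    return s < 0 or (any(v != 0 for v in p) and s <= 0)
```

For example, x ≤ 0 ∧ -x < 0 is unsatisfiable, and r = (1), p = (1) certifies it: `is_motzkin_certificate([[1]], [0], [[-1]], [0], [1], [1])` holds. The same multipliers do not certify the satisfiable system x ≤ 0 ∧ -x < 1.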


• Now, the problem is of the form

∃c0∈B0,…,cn∈Bn, r≥0, p≥0. Φ(r,p,c0,…,cn)   ①

• Existential formula (i.e., got rid of ∀x0,…,xm)

• Φ is a non-linear arithmetic formula

– linear on r’s and p’s with coefficients on c’s

Prop: Let φ be a satisfiable QF_LIA formula with

• n vars, m literals, and coefficients bounded by k

Then, there is a solution of φ bounded by 2^(log(n+2) + m(log(m) + log(k)))

Bit-blast and reduce ① to SAT

This is complete for QF_LRA

• Going beyond QF_LRA

– QF_UFLRA

– QF_AUFLRA

QF_UFLRA

• UF

– Function symbols f1, f2, …, fk

– For each fj of arity n

∀x1…xn,y1…yn. ⋀i xi = yi ⇒ fj(x1…xn) = fj(y1…yn)

• Useful for conservatively modeling operators like :: ×

L-restricting UF

1. Incorporate UF terms in templates as follows

c0f(c1x+c2y+c3+c4g(c5x+c6y+c7)) + …

2. Apply Ackermann expansion

For each UF subterm f(t) ∈ θ, let x_f(t) be a fresh var.

Let φ = ⋀_{f(t1),f(t2)∈θ} ρ(t1) = ρ(t2) ⇒ x_f(t1) = x_f(t2)

• ρ replaces f(t) by x_f(t)

Prop: QF_UFLRA ⊨ θ iff QF_LRA ⊨ φ ⇒ ρ(θ)

Idea from [Beyer et al. VMCAI’07]
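A minimal sketch of the Ackermann expansion step, under an assumed term encoding: variables are strings, constants are numbers, `("app", f, t)` is a unary UF application, and any other tuple is an operator applied to subterms. The fresh-name scheme `x_f0`, `x_f1`, … is also an assumption and is not guaranteed to avoid clashes with existing variable names.

```python
from itertools import combinations

def ackermannize(term, table):
    """rho: replace every ('app', f, arg) by a fresh variable name.

    `table` maps (f, rho(arg)) -> fresh variable and is filled in place.
    """
    if isinstance(term, tuple) and term[0] == "app":
        _, f, arg = term
        key = (f, ackermannize(arg, table))
        if key not in table:
            table[key] = f"x_{f}{len(table)}"
        return table[key]
    if isinstance(term, tuple):   # other operator, e.g. ('+', t1, t2)
        return (term[0],) + tuple(ackermannize(t, table) for t in term[1:])
    return term                   # variable name or numeric constant

def congruence_constraints(table):
    """One entry ((s, t), (v, w)), read as s = t => v = w, for each
    pair of applications of the same function symbol."""
    out = []
    for ((f, s), v), ((g, t), w) in combinations(table.items(), 2):
        if f == g:
            out.append(((s, t), (v, w)))
    return out

# f(x) + f(y) becomes x_f0 + x_f1, with the side condition
# x = y => x_f0 = x_f1.
table = {}
rho_term = ackermannize(("+", ("app", "f", "x"), ("app", "f", "y")), table)
```

The returned constraints play the role of φ in the proposition above, and `rho_term` is the UF-free term ρ(θ) that the QF_LRA procedure then works on.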

QF_AUFLRA

∀a,e,i. rd(wr(a,i,e),i) = e

∀a,e,i,j. i ≠ j ⇒ rd(wr(a,i,e),j) = rd(a,j)

∀a,b. a ≠ b ⇒ rd(a,diff(a,b)) ≠ rd(b,diff(a,b))

• Useful for modeling pointers

– QF_AUFLRA can be reduced to QF_UFLRA

• See, e.g., [Totla, Wies POPL’13]

This sounds too easy…

Does it really scale?

No, it doesn’t scale

I was oversimplifying the problem

– Infer ψ(y) s.t. θ(x,y) ⇒ ψ(y) and ψ(y) ⇒ φ(y,z)

c.ex. refinement in reality:

– Infer ψ1(y1), ψ2(y2), …, ψn(yn) s.t.

θ1(x1,y1) ⇒ ψ1(y1) ∧

θ2(x2,y2) ∧ ψ1(x2) ⇒ ψ2(y2) ∧

…

θn(xn,yn) ∧ ψ1(xn) ⇒ φ(yn,z)

So, to infer L-restricted refinement

Need to restrict ψ1(y1), ψ2(y2), …, ψn(yn) to L

Lots of templates! T1, T2, … Tn

– Proportional to the size of c.ex.

– Lots of non-linear terms in the constraints

Doesn’t scale even on state-of-the-art SAT solver (or SMT solver for non-linear real arithmetic)

Solution (Informal)

Key Observation: A counterexample in SMC (and the constraints solved to refute it) is always a repetition of a fixed set of patterns.

– Use the observation to L-restrict only a few ψ’s and still achieve complete refinement

Example (1/2)

c.ex. are of the form

θinit(x1,y1) ⇒ ψ1(x1,y1) ∧

ψ1(x1,y1) ∧ θloop(x1,y1,x2,y2) ⇒ ψ2(x2,y2) ∧

ψ2(x2,y2) ∧ θloop(x2,y2,x3,y3) ⇒ ψ3(x3,y3) ∧

…

ψn(xn,yn) ⇒ φ(xn,yn)

θinit(x,y) ⇔ x = 0 ∧ y = 0

θloop(x,y,x’,y’) ⇔ x < 100 ∧ x’ = x + 1 ∧ y’ = y + 1

φ(x,y) ⇔ (x ≥ 100 ⇒ x = y)

Example (2/2)

Theorem: The following strategy is sufficient for complete predicate refinement:

1. Pick some constant k > 0

2. Infer L-restricted refinement for i×k-th ψ’s (i.e., ψ_{i×k})

3. Infer unrestricted refinement for other ψ’s (e.g., via interpolation)

• This reduces to [Jhala, McMillan TACAS’06] when k = 1

• Larger k -> less L-restriction

– Proof:

• On board

Formalization

Key Observation: Let P be a program. There exists a set of Horn-clause-like rules R s.t. for any c.ex. π of SMC(P), the set of constraints solved to refute π is an acyclic instance of R

Horn-clause-like rules:

– P1(x) ∧ … ∧ Pn(x) ∧ θ(x,y) ⇒ Q(y)

– P1(x) ∧ … ∧ Pn(x) ∧ θ(x,y) ⇒ φ(x,y)

• P, Q, …: predicate variables

Acyclic instance: copies of rules from R with fresh renaming of pred. vars s.t. there is no cycle P ⇒ … ⇒ P

Example

x := 0; y := 0; while (x < 100) { x++; y++; } assert (x = y);

R =

x = 0 ∧ y = 0 ⇒ P(x,y)

P(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q(x’,y’)

Q(x,y) ⇒ P(x,y)

P(x,y) ∧ x ≥ 100 ⇒ x = y

E.g. consts(π) =

x = 0 ∧ y = 0 ⇒ P1(x,y)

P1(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q1(x’,y’)

Q1(x,y) ⇒ P2(x,y)

P2(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q2(x’,y’)

Q2(x,y) ⇒ P3(x,y)

P3(x,y) ∧ x ≥ 100 ⇒ x = y
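The acyclic instance for a counterexample that goes around the loop n times can be generated mechanically from R. The sketch below does this for the example program only; the function name `unroll` and the use of plain strings for clauses are assumptions made for illustration.

```python
def unroll(n):
    """Constraints for a c.ex. that takes the example loop n times.

    Each copy of a rule from R gets freshly numbered predicate
    variables, so no predicate variable depends on itself.
    """
    step = "x < 100 ∧ x' = x+1 ∧ y' = y+1"
    clauses = ["x = 0 ∧ y = 0 ⇒ P1(x,y)"]
    for i in range(1, n + 1):
        clauses.append(f"P{i}(x,y) ∧ {step} ⇒ Q{i}(x',y')")
        clauses.append(f"Q{i}(x,y) ⇒ P{i+1}(x,y)")
    clauses.append(f"P{n+1}(x,y) ∧ x ≥ 100 ⇒ x = y")
    return clauses

if __name__ == "__main__":
    for clause in unroll(2):
        print(clause)
```

Calling `unroll(2)` yields the six clauses listed for consts(π); larger n repeats the same two-rule pattern, which is the repetition the key observation refers to.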

• The observation also holds for higher-order SMC (e.g., Depcegar, MoCHi, HMC), and SMC for concurrent programs (e.g., Threader)

• Somewhat more general than [Grebenshchikov et al. PLDI’12]

– Only says that c.ex. are instances of the rules

Bounded Patterns

Def: A set of bounded patterns A of R is a finite set of acyclic instances of R

– Can view each element of A as a “combined” rule

Def: A set of bounded patterns A of R is partitioning if, for any acyclic instance G of R, there exists an instance A’ of A s.t. G and A’ are isomorphic

– E.g.,

• R is a partitioning set of bounded patterns of R

– So is any A ∪ R where A is a set of bounded patterns of R



A = R ∪ {{ P0(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q0(x’,y’), Q0(x,y) ⇒ P1(x,y),

P1(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q1(x’,y’), Q1(x,y) ⇒ P2(x,y),

P2(x,y) ∧ x < 100 ∧ x’ = x+1 ∧ y’ = y+1 ⇒ Q2(x’,y’), Q2(x,y) ⇒ P3(x,y) }}

A’ : On board

L-restriction at Boundaries

Def: Let A’ be a partition of c.ex. G by A. The boundaries of partition A’ are the predicate variables that appear in more than one element of A’

Theorem: L-restriction at boundaries is sufficient for complete predicate refinement

– Proof: Preds at boundaries determine the preds at internal nodes. So, L-restr. at boundaries -> finite # of possible refinements for internals

How to pick bounded partitioning

A simple strategy: View G as a dag of P’s, and L-restrict each i×k-th P (i.e., P_{i×k}) from a root, where k is some constant and i = 1,2,3,…

• Reduces to [Jhala, McMillan TACAS’06] when k = 1

• Larger k -> less L-restriction

Theorem: The above ensures bounded partitioning

– Proof: Because there are only a finite # of dags generated by R of path lengths bounded by k
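For the special case where G is a single chain P1, P2, …, Pn (as in the loop example), the strategy amounts to picking every k-th index. The sketch below covers only that chain case, which is an assumption; the function name is hypothetical and general dags are not handled.

```python
def l_restricted_indices(n, k):
    """Indices i*k (i = 1, 2, 3, ...) among P1..Pn on a chain.

    These are the predicate variables that get L-restricted; all other
    Pj may receive unrestricted refinements (e.g., via interpolation).
    """
    if k <= 0:
        raise ValueError("k must be a positive constant")
    return list(range(k, n + 1, k))
```

With k = 1 every predicate variable is L-restricted; with n = 7 and k = 3 only P3 and P6 are, so the pieces between consecutive boundaries contain at most k steps each.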

Conclusion

• Complete predicate refinement for the theory of QF_AUFLRA

– Template-based

• Bounded coefficients allow reduction to SAT

– Extends L-restricted refinement [Jhala,McMillan TACAS’06]

• Exploits the observation that c.ex. are repetitions of some patterns

• Only L-restrict predicate variables at boundaries of bounded patterns

