All-Solution Satisfiability Modulo Theories: applications, algorithms and benchmarks

All-Solution Satisfiability Modulo Theories:applications, algorithms and benchmarks

Quoc-Sang Phan and Pasquale Malacaria

Queen Mary University of London

ARES: August 25, 2015

1 / 23

Content

Satisfiability Modulo Theories (SMT): a decision problem forlogical formulas over first-order theories

All-SMT: the problem of finding all solutions of an SMTproblem with respect to a set of Boolean variables

All-SMT solver can benefit various domains of application:Bounded Model Checking, Automated Test Generation,Reliability analysis, and Quantitative Information Flow. Weconcentrate here on Quantitative Information Flow (QIF)

we propose algorithms to design an All-SMT solver on top ofan existing SMT solver, and implement it into a prototypetool, called aZ3.

2 / 23

Information Flow

Secret input (H) Public input (L)

Program P

Public Output (O)

Non-interference

Public input (L)

Program P

Secret input (H)

Information leaked

Public Output (O) √?

3 / 23

Non-interference is unachievable

int check(int H, int L){

int O;

if (L == H)

O = ACCEPT;

else O = REJECT;

return O;

}

password check

Secret input (H) Public input (L)

Program P

Public Output (O)

Non-interference

Public input (L)

Program P

Secret input (H)

Information leaked

Public Output (O) √?

Leakage = Secrecy before observing - Secrecy after observing

∆E (XH) = E (XH)− E (XH |XO)

where XH is the secret, XO is the output and E is an ”entropy”function

4 / 23

Motivations for Quantitative Information Flow

1 information leakage is unavoidable, e.g. authenticationsystems must leak by design some information aboutpasswords

2 however, provided the leakage is small, usually that is not aproblem.

3 so measuring leakage allows for a security assessment of aprogram

4 This work provides new and fast algorithms to measureinformation leaks, “how much” a program leaks

5 / 23

Quantifying Information Leaks

Theorem: Channel Capacity for deterministic systems

∆E (XH) ≤ log2(|O|)

holds for Shannon entropy and Renyi’s min-entropy

holds for all possible distributions of XH .

is the basis of state-of-the-art techniques for QuantitativeInformation Flow analysis.

based on the above we have:

Definition

Quantitative Information Flow (QIF) is the problem of counting N,the number of possible outputs of a given program P.

6 / 23

An example

base = 8;if (H < 16 and H>=0) then

O = base + Helse

O = base

Figure: Data sanitization program

Here then O is in [8..23], so leakage ≤ log(15), possible bitsconfigurations in O are 0 . . . 01000 to 0 . . . 010111

7 / 23

From programs to formulas

First step in our approach is to understand how programs aretranslated into formulas: using Single Static Assignment (SSA) aprogram P is translated into a conjunctive formula ϕP

8 / 23

Quantifying as Counting

Adversary

tries to infer

H from L and O

H

LO

f

O is stored as a bit vector b1b2 . . . bM .

Assume we have a first-order formula ϕP such that:

ϕP contains a set of Boolean variables VI := {p1, p2, .., pM}pi = > if and only if bi is 1, and pi = ⊥ if and only if bi = 0

Counting outputs of P ≡ Counting models of ϕP w.r.t. VI

9 / 23

QIF analysis using a All-SMT solver

Program transformation

L = 8;

if (H < 16)

O = H + L;

else

O = L;

(L1 = 8) ∧(G0 = H0 < 16) ∧(O1 = H0 + L1) ∧(O2 = L1) ∧(O3 = G0?O1 : O2)

Figure: A simple program P encoded into a first-order formula ϕP

Formula instrumentation to build the set VI :

(assert (= (= #b1 (( extract 0 0) O3)) p1))

10 / 23


We introduce two algorithms for All-SMT solving.Both use APIs provided by an SMT solver.

Blocking clause

After finding a model

µ = l0 ∧ l1 ∧ · · · ∧ lm ∧ . . .

Add the clause:

block = ¬l0 ∨ ¬l1 ∨ · · · ∨ ¬lm

11 / 23

Blocking Clause all-SMT

12 / 23

Blocking Clause all-SMT

The blocking clauses method is straightforward and it is simple toimplement.However, adding a large number of blocking clauses will require alarge amount of memory.Also the large number of clauses slows down the BooleanConstraint Propagation procedure of the underlying solver.To address these inefficiencies we introduce an alternative methodwhich avoids re-discovering solutions using depth-first search(DFS).

13 / 23


Use APIs provided by an SMT solver

Depth-first search

Two components:

A DPLL like procedure to enumerate truth assignments.

Use the SMT solver to check consistency of the truthassignments.

14 / 23

Depth First Search all-SMT

15 / 23

Depth First Search all-SMT

The method choose literal chooses the next state to explorefrom VI in a DFS manner, and even if there are 2N possible statesefficient pruning avoid exponential blow-out for programs that“don’t leak too much”, i.e.

Depth-first search all-SMT algorithm is linear in |O|

16 / 23

QIF analysis using Model Checking

UNSAT

p1

p1 ∧ p2

p1 ∧ p2 ∧ p3

p1 ∧ p2 ∧ p3 ∧ p4

p1 ∧ p2 ∧ p3 ∧ p4 ∧ p5p1 ∧ p2 ∧ p3 ∧ p4 ∧ ¬p5

p1

p2

p3

p4

p5

assert !(p1 && p2 && p3 && p4 && p5);

17 / 23

Implementation

Tools selected:

Model Checking: CBMC (Ansi C)Symbolic Execution: Symbolic PathFinder (Java bytecode)Program transformation: CBMCSMT solver: z3

Benchmarks include:

Vulnerabilities in Linux kernelAnonymity protocolsA Tax program from the European project HATS (Java)

Assumptions: all programs have bounded loops, no recursion.

18 / 23

Evaluation

19 / 23

Evaluation

20 / 23

Conclusions

P

program transformationϕP

QIF All-SMT

Formal methods DPLL(T )

Two approaches:

Use formal methods to mimic DPLL(T ).

QIF analysis using Model Checking.QIF analysis using Symbolic Execution.

Generate ϕP , then using DPLL(T ).

Generate ϕP using program transformation.Extend an SMT solver for All-SMT.

21 / 23

THANK YOU FOR YOUR ATTENTION!

22 / 23

Science

All-Solution Satisfiability Modulo Theories: applications, algorithms and benchmarks