Nikolaj Bjørner Senior Researcher Microsoft Research Redmond

Nikolaj BjørnerSenior Researcher

Microsoft Research Redmond

Modern Satisfiability Modulo Theories Solvers in Program Analysis

Lectures

Wednesday 10:45–12:15An Introduction to Z3 with Applications

Thursday August 30th 15:45–17:15Introduction to SAT and SMT

Friday 10:30–10:45 Theories and Solving Algorithms

Friday 15:45–17:15 Advanced & Summary: Quantifiers, Arrays, Fixed-points

I. Introduction to Z3 – An Efficient SMT SolverI. ArchitectureII. Interaction – Input formatsIII. Theories

II. Introduction to ApplicationsI. Basic examples of SMT solvingII. A selection of applications using Z3/SMT.

Takeaways:

• Many program analysis, verification and test tools solve problems that can be reduced to logical formulas and transformations between logical formulas at their core.

• Modern SMT solvers are a often good fit.– Handle domains found in programs directly.

• The selected examples are intended to show instances where sub-tasks are reduced to SMT/Z3.

Z3 – AN EFFICIENT SMT SOLVER

http://research.microsoft.com/projects/z3

What is Z3?

TheoriesBit-Vectors

Lin-arithmetic Nonlin-arithmetic

Free (uninterpreted) functions

Arrays

Quantifiers:E-matching

CNative

SMT-LIB

Model Generation:Finite/parametric

Simplify

Floating pointsRecursive Datatypes

Quantifiers:Super-position

Proof objects

TacticsAssumption

tracking

By Leonardo de Moura, Nikolaj Bjørner, Christoph Wintersteiger

F# quote

Python

Z3 is Cutting Edge

http://smtcomp.org

Z3 is Free Software

for research

Z3: Little Engines of Proof

Congruence

Closure

SAT Solver

Simplifier

Quant.Instance

Simplex

Bit-Arith

Super-position

Arrays

Datatypes

User-Theorie

Q- Elim

Freely available from http://research.microsoft.com/projects/z3

INPUT FORMATS

Input Formats

• Text:– SMT-LIB2 - main exchange format for SMT solvers– Log - low-level log for replay– Datalog - a Datalog format for the fixed-point engine– DIMACS - a format for propositional SAT

• Programmatic:– C - API functions exposed for C– Ocaml - Ocaml wrapper around C API– .NET - .NET wrapper around C API– Python - Try it on http://rise4fun.com/z3py – Scala - by Phillipe Suter

A Primer on SMT-LIB2

Online interactive tutorial

http://rise4fun.com/z3/tutorial

THEORIES

Theories

Uninterpreted functions

Uninterpreted functionsArithmetic (linear)

Theories

Uninterpreted functionsArithmetic (linear)Bit-vectors

Theories

Uninterpreted functionsArithmetic (linear)Bit-vectorsAlgebraic data-types

Theories

Uninterpreted functionsArithmetic (linear)Bit-vectorsAlgebraic data-typesArrays

Theories

Uninterpreted functionsArithmetic (linear)Bit-vectorsAlgebraic data-typesArraysPolynomial Arithmetic

Theories

Uninterpreted functionsArithmetic (linear)Bit-vectorsAlgebraic data-typesArraysPolynomial Arithmetic

Theories

USER-INTERACTION AND GUIDANCE

Logical Formula

Sat/Model

Interaction

Logical Formula

Unsat/Proof

Interaction

Simplify

Logical Formula

Interaction

ImpliedEqualities

- x and y are equal- z + y and x + z are equal

Logical Formula

Interaction

QuantifierEliminatio

Logical Formula

Interaction

Logical Formula

Unsat. Core

APPLICATIONS OF Z3

A Decision Engine for SoftwareSome Microsoft engines:- SDV: The Static Driver Verifier- PREfix: The Static Analysis Engine for C/C++.- Pex: Program EXploration for .NET.- SAGE: Scalable Automated Guided Execution - Spec#: C# + contracts- VCC: Verifying C Compiler for the Viridian Hyper-Visor- HAVOC: Heap-Aware Verification of C-code.- SpecExplorer: Model-based testing of protocol specs.- Yogi: Dynamic symbolic execution + abstraction.- FORMULA: Model-based Design- F7: Refinement types for security protocols- M3: Model Program Modeling- VS3: Abstract interpretation and Synthesis

They all use the SMT solver Z3.

Hyper-V

Applications

Test case generation

Verifying Compilers

Predicate Abstraction

Invariant Generation

Type Checking

Model Based Testing

Theorem Provers/Satisfiability Checkers

A formula F is validIff

F is unsatisfiable

Theorem Prover/Satisfiability Checker

SatisfiableModel

FUnsatisfiableProof

Theorem Provers/Satisfiability Checkers

A formula F is validIff

F is unsatisfiable

SatisfiableModel

FUnsatisfiableProof

Timeout Memout

Verification/Analysis Tool: “Template”

Verification/AnalysisTool

Problem

Logical Formula

UnsatisfiableSatisfiable(Counter-example)

SMT EXAMPLEBasic

Is formula satisfiable modulo theory T ?

SMT solvers have specialized algorithms for

Satisfiability Modulo Theories (SMT)

ArithmeticArray TheoryUninterpreted Functions

𝑥+2=𝑦⇒ 𝑓 (𝑠𝑒𝑙𝑒𝑐𝑡 (𝑠𝑡𝑜𝑟𝑒 (𝑎 ,𝑥 ,3 ) , 𝑦−2 ) )= 𝑓 (𝑦−𝑥+1)

Satisfiability Modulo Theories (SMT)

SMT EXAMPLEJob Shop Scheduling

Job Shop Scheduling

Machines

P = NP? Laundry 𝜁 (𝑠 )=0⇒ 𝑠=12+𝑖𝑟

Constraints:Precedence: between two tasks of the same job

Resource: Machines execute at most one job at a time

[ 𝑠𝑡𝑎𝑟 𝑡2,2 ..𝑒𝑛𝑑2,2 ]∩ [ 𝑠𝑡𝑎𝑟 𝑡4,2 ..𝑒𝑛𝑑4,2 ]=∅

Job Shop Scheduling

Constraints:Encoding:

Precedence: - start time of job 2 on mach 3 - duration of job 2 on mach 3Resource:

[ 𝑠𝑡𝑎𝑟 𝑡2,2 ..𝑒𝑛𝑑2,2 ]∩ [ 𝑠𝑡𝑎𝑟 𝑡4,2 ..𝑒𝑛𝑑4,2 ]=∅

Not convex

Job Shop Scheduling

case split

Efficient solvers:- Floyd-Warshal algorithm- Ford-Fulkerson algorithm

𝑧−𝑧=5 – 2 –3 – 2=−2<0

EXAMPLE EXERCISESFollow-on to lectures by Hoare and Petersen

True or false?

SMT-LIB2 formulation:(define-fun Max ((x Int) (y Int)) Int (if (> x y) x y))(define-fun ominus ((x Int) (y Int)) Int (Max 0 (- x y)))(declare-const x Int)(declare-const a Int)(assert (not (<= (+ (ominus x a) a) x)))(check-sat)(get-model)

Basic Arithmetic

Follow on Exercises

Prove the other properties for WP

Prove Axioms for Hoare and Milner calculi

SMT APPLICATIONTest case generation

Test generation tools

Internal. For Security Fuzzing

Runs on x86 instructions

External. For Developers

Runs on .NET code

Try it on: http://pex4fun.com

Finding security bugs before the hackers

black hat

SMT APPLICATIONStatic Analysis

Integrating Z3 with PREfix static analyzer. Yannick Moy, David Selaff, B

int binary_search(int[] arr, int low, int high, int key)

while (low <= high) { // Find middle value int mid = (low + high) / 2; int val = arr[mid]; if (val == key) return mid; if (val < key) low = mid+1; else high = mid-1; } return -1;}

void itoa(int n, char* s) {

if (n < 0) { *s++ = ‘-’; n = -n; } // Add digits to s ….

-INT_MIN= INT_MIN

(INT_MAX+1)/2 +(INT_MAX+1)/2

= INT_MIN

Package: java.util.ArraysFunction: binary_search

Book: Kernighan and RitchieFunction: itoa (integer to ascii)

What is wrong here?

int init_name(char **outname, uint n){ if (n == 0) return 0; else if (n > UINT16_MAX) exit(1); else if ((*outname = malloc(n)) == NULL) { return 0xC0000095; // NT_STATUS_NO_MEM; } return 0;}

int get_name(char* dst, uint size) { char* name; int status = 0; status = init_name(&name, size); if (status != 0) { goto error; } strcpy(dst, name);error: return status;}

The PREfix Static Analysis Engine

C/C++ functions

model for function init_nameoutcome init_name_0: guards: n == 0 results: result == 0outcome init_name_1: guards: n > 0; n <= 65535 results: result == 0xC0000095outcome init_name_2: guards: n > 0|; n <= 65535 constraints: valid(outname) results: result == 0; init(*outname)

models

C/C++ functions

model for function init_nameoutcome init_name_0: guards: n == 0 results: result == 0outcome init_name_1: guards: n > 0; n <= 65535 results: result == 0xC0000095outcome init_name_2: guards: n > 0|; n <= 65535 constraints: valid(outname) results: result == 0; init(*outname)

path for function get_name guards: size == 0 constraints: facts: init(dst); init(size); status == 0

models

warnings

pre-condition for function strcpy init(dst) and valid(name)

6/26/2009 54

iElement = m_nSize;if( iElement >= m_nMaxSize ){

bool bSuccess = GrowBuffer( iElement+1 );…

}::new( m_pData+iElement ) E( element );m_nSize++;

Overflow on unsigned addition

m_nSize == m_nMaxSize == UINT_MAX

Write in unallocated

memory

iElement + 1 == 0

Code was written for

address space < 4GB

Using an overflown value as allocation size

ULONG AllocationSize;while (CurrentBuffer != NULL) { if (NumberOfBuffers > MAX_ULONG / sizeof(MYBUFFER)) { return NULL;

} NumberOfBuffers++; CurrentBuffer = CurrentBuffer->NextBuffer;

}AllocationSize = sizeof(MYBUFFER)*NumberOfBuffers;UserBuffersHead = malloc(AllocationSize);

6/26/2009 55

Overflow check

Possible overflow

Increment and exit from loop

LONG l_sub(LONG l_var1, LONG l_var2){ LONG l_diff = l_var1 - l_var2; // perform subtraction // check for overflow if ( (l_var1>0) && (l_var2<0) && (l_diff<0) ) l_diff=0x7FFFFFFF …

6/26/2009 56

Possible overflow

Forget corner case INT_MIN

Overflow on unsigned subtraction

for (uint16 uID = 0; uID < uDevCount && SUCCEEDED(hr); uID++) {…

if (SUCCEEDED(hr)) { uID = uDevCount; // Terminates the loop

6/26/2009 57

Possible overflow

Loop does not terminate

uID == UINT_MAX

Overflow on unsigned addition

Using an overflown value as allocation size

DWORD dwAlloc;dwAlloc = MyList->nElements * sizeof(MY_INFO);if(dwAlloc < MyList->nElements) … // returnMyList->pInfo = malloc(dwAlloc);

6/26/2009 58

Can overflow

Allocate less than needed

Not a proper test

Modular arithmetic

Bit-wise operations

1 0 1 0 1 1 0 1 1 0 0 1

Concatenation

1 0 1 0 1 1 [4:2] = 0 1 0

1 0 1 0 1 1

0 1 1 0 0 1

0 0 1 0 0 1

1 0 1 0 1 1

0 1 1 0 0 1+

0 0 0 1 0 0

Extraction

Bit-wise and

AdditionVector

Segments

Vector Segments

Bit-precise analysis

a0b0a0b1a0b2a0b3

a1b0a1b1a1b2

HAHAHA

out0 out1 out2 out3

O(n2) clauses

SAT solving time increases exponentially. Similar for BDDs.

Brute-force enumeration + evaluation faster for 20 bits.

Method: Encode bit-vectors bit-by-bit into Boolean Satisfiability

Multiplication [out3, out2, out1, out0] = [a3, a2, a1, a0] * [b3, b2, b1, b0]

SMT APPLICATIONVerified Software

Microsoft Verifying C Compiler

Hypervisor Verification (2007 – 2010)

Partners:• European Microsoft Innovation Center• Microsoft Research• Microsoft’s Windows Division• Universität des Saarlandes

co-funded by the German Ministry of Education and Research

http://www.verisoftxt.de

Hypervisor: Scale & Speed

• VCs have several Mega-bytes• Thousands of non ground formulas• Developers are willing to wait at most 5 min

per VC

Are you willing to wait more than 5 min for your compiler?

Verification Attempt Time vs.Satisfaction and Productivity

By Michal Moskal (VCC Designer and Software Verification Expert), Language quiz: “loose” or “lose” ?

Why did my proof attempt fail?

1. My annotations are not strong enough!weak loop invariants and/or contracts

2. My theorem prover is not strong (or fast) enough.Send “angry” email to Nikolaj and Leo.

VCC Performance Trends Nov 08 – Mar 09

Attempt to improve Boogie/Z3 interaction

Modification in invariant checking

Switch to Boogie2

Switch to Z3 v2

Z3 v2 update

The Importance of Speed

Building VerveVerifie

Safe to the Last Instruction / Jean Yang & Chris Hawbliztl PLDI 2010

C# compiler

Kernel.cs

Boogie/Z3

Translator/Assembler

TAL checker

Linker/ISO generator

Verve.iso

Source file

Compilation toolVerification tool

Nucleus.bpl (x86) Kernel.obj (x86)

9 person-months

SMT APPLICATIONFormal Language Theory and Security

Why string analysis?(motivating scenario)

Tomcat v. < 6.0.18

req = http://www.x.com/%c0%ae%c0%ae/%c0%ae%c0%ae/private/

Windows 2000 vulnerability: http://www.sans.org/security-resources/malwarefaq/wnt-unicode.php

Apache Tomcat vulnerability: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2008-2938

1) security check: req must not contain "../"

2) dir = utf8decode("%c0%ae%c0%ae/%c0%ae%c0%ae/private/") = "../../private/"

access granted to "../../private/"

Analysis question:Does utf8decode reject overlong

utf8-encodings such as "%C0%AE" for '.'?

Symbolic Word Transducers

Relativized Formal Language Theory

Classical Word Transducers(e.g. decoding automata,

rational transductions)

Classical I/O Automata(e.g. Mealy machine)

ClassicalWord Acceptors

(NFA, DFA)

Symbolic Word Acceptors

regex matching

string transformation

Classical Word Acceptors modulo Th()

Classical Word Transducers modulo Th()

Rex & Bek – Symbolic RegEx & Transducers

Margus Veanes

Symbolic Finite Transducer (SFT)

• Classical transducer modulo a rich label theory• Core Idea: represent labels with guarded

transformation functions– Separation of concerns: finite graph / theory of labels

Concrete transitions:

Symbolic transition:

‘\x80’/“\xC2\x80”

… ‘\x7FF’/“\xDF\xBF”

x. 8016 ≤ x ≤ 7FF16/[C016|x10,6, 8016|x5,0]

bitvector operations

1920transitions

SMT APPLICATIONSoftware Model Checking

Static Driver Verifier

SLAM – part of SDV (Windows 7, 8)Z3 is used for:

Predicate abstraction Counter-example refinement

Yogi – next generation SDVZ3 is used for: Refining transition abstractions

Corral – uses reachability modulo theoriesSLAyer – checks memory safety using separation logic

Z3 & Static Driver Verifier: SLAM

• All-SAT– Better (more precise) Predicate Abstraction

• Unsatisfiable cores– Why the abstract path is not feasible?– Fast Predicate Abstraction

Summary

Introduction to Z3: An Efficient SMT Solver

Introduction to Applications:

– Exemplary applications

– Tools built around Z3

Tomorrow. More technical:– Introduction to SAT solving.– Introduction to SMT solving.

Nikolaj Bjørner Senior Researcher Microsoft Research Redmond

Documents

Leonardo de Moura and Nikolaj Bjørner Microsoft Researchleodemoura.github.io/files/fmcad09-slides.pdf · Leonardo de Moura and Nikolaj Bjørner Microsoft Research. Verification/Analysis

NETWORK VERIFICATION: WHEN HOARE MEETS CERF George Varghese and Nikolaj Bjørner Microsoft Research 1 FOR PUBLIC CLOUDS, PRIVATE CLOUDS, ENTERPRISE NETWORKS,

Austin, Texas 2011 Theorem Proving Tools for Program Analysis SMT Solvers: Yices & Z3 Austin, Texas 2011 Nikolaj Bjørner 2, Bruno Dutertre 1, Leonardo

Nikolaj Bjørner Microsoft Research Lecture 3. DayTopicsLab 1Overview of SMT and applications. SAT solving, Z3 Encoding combinatorial problems with Z3

Nikolaj pp

Nikolaj Bjørner Joe Hendrix Microsoft Research & Corporation

Nikolaj Bjørner Microsoft Research Lecture 5. DayTopicsLab 1Overview of SMT and applications. SAT solving part I. Program exploration with Pex 2SAT solving

Redmond Wrap - Redmond Holiday Wrap 2013

Kunsthallen Nikolaj

invisible Kari Skallerud Bjørner Patchworks, Quilts &c

Nikolaj Bjørner Microsoft Research FSE · 2000 Barrett et.al N.O + Rewriting 2002 Zarba & Manna. “Nice” Theories 2004 Ghilardi et.al. N.O. Generalized 2007 de Moura & B. Model-based

Nikolaj Bjørner Leonardo de Moura Nikolai Tillmann Microsoft Research August 11’th 2008

Nikolaj Bjørner Senior Researcher Microsoft Research Redmond

Nikolaj Bock

Ethan Jackson, Nikolaj Bjørner and Wolfram Schulte

Leonardo de Moura and Nikolaj Bjørner Microsoft Research · 2019. 8. 30. · Leonardo de Moura and Nikolaj Bjørner Microsoft Research. Z3 is a Satisfiability Modulo Theories (SMT)

Network Verification Solvers, Symmetries, Surgeriesconferences.sigcomm.org/sigcomm/2016/files/program/netpl/...Network Verification Solvers, Symmetries, Surgeries Nikolaj Bjørner

Nikolaj Bjørner, Leonardo de Moura Microsoft Research Bruno Dutertre SRI International

Satisfiability Modulo Theories and Network Verification Nikolaj Bjørner Microsoft Research Formal Methods and Networks Summer School Ithaca, June 10-14

Advances in Automated Theorem Proving Leonardo de Moura, Nikolaj Bjørner Ken McMillan, Margus Veanes presented by Thomas Ball