Probabilistic Analysis Using a Theorem Prover Osman Hasan Sofiene Tahar Hardware Verification Group Concordia University Montreal, Canada CADE-22 Tutorial August 2, 2009


Probabilistic Analysis Using a Theorem Prover

Osman Hasan Sofiene Tahar

Hardware Verification Group Concordia University

Montreal, Canada

CADE-22 Tutorial August 2, 2009

Probabilistic Analysis using a Theorem Prover O. Hasan and S. Tahar

Objectives

• Probabilistic Theorem Proving: “A formal verification technique for systems with random or unpredictable components”
  • Why do we need it?
  • What is it?
  • How can we apply it to the analysis of real-world applications?

Outline

• Introduction and Motivation
• Probabilistic Theorem Proving
• Case Studies
  • Coupon Collector’s Problem
  • Stop-and-Wait Protocol
  • Reconfigurable Memory Arrays
• Conclusions

Why System Verification?


• Therac-25
  • Software bug in a cancer therapy machine
  • 3 deaths and 3 severe injuries between 1985 and 1987
• FDIV bug in Intel Pentium
  • Hardware error in the floating-point division unit
  • Resulted in a net loss of US $500M to the company in 1994
• Faulty systems can be disastrous

Why System Verification?

• Mars Polar Lander
  • Engine shutdown due to spurious signals that falsely indicated that the spacecraft had landed on Mars
  • Resulted in a loss of US $370M in 1999
• Mars Climate Orbiter
  • Conversion error from English units to metric units
  • Resulted in a loss of US $125M in 1999
• Faulty systems can be disastrous

Why System Verification?

• Faulty systems can be disastrous
• Unfortunately, many other examples can be found, and the list is still growing!
• System verification is the process that allows us to find and fix errors in the design phase, where it is cheaper to do so

System Verification

[Figure: a hardware/software system is abstracted into a system model; the model and the properties of interest are fed to a computer-based analysis framework, which answers whether each property is satisfied (Yes/No).]

Simulation

• State-of-the-art system verification approach
• Step 1
  • Construct a computer-based model of the system
• Step 2
  • Analyze the behavior of the system model under a number of test cases to deduce properties of interest

Simulation – Example
• 8-Bit Adder
• Model
  • VHDL/Verilog
• Test Cases

Test vectors (x, y) | System output (z) | z = x + y
(1, 1)              | 2                 | True
(4, 0)              | 4                 | True
(100, 100)          | 200               | True
(127, 127)          | 254               | True

• Deduction: The property is true, as it is found to be true for all the test vectors used

[Figure: adder block with inputs x and y and output z]

Simulation

• Easy to use
• May generate inaccurate results
  • Practically impossible to test all possible cases when dealing with large systems
    ▪ Hardware designs with over a million gates
    ▪ Windows kernel
• Example
    ▪ 64-bit floating-point division routine
    ▪ There are 2^128 input combinations
    ▪ At 1 test/µs, exhaustive testing would take about 10^25 years
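The 10^25-year figure can be reproduced with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check; it assumes two 64-bit operands (128 input bits in total) and a 365-day year, and the function name is illustrative rather than taken from the tutorial.

```python
# Rough estimate of how long exhaustive simulation of a
# 64-bit floating-point divider would take.
INPUT_BITS = 128                       # assumption: two 64-bit operands
TESTS_PER_SECOND = 1_000_000           # 1 test per microsecond
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year


def exhaustive_test_years(input_bits: int, tests_per_second: int) -> float:
    """Years needed to apply every possible input vector once."""
    combinations = 2 ** input_bits
    return combinations / tests_per_second / SECONDS_PER_YEAR


years = exhaustive_test_years(INPUT_BITS, TESTS_PER_SECOND)
print(f"{years:.2e} years")  # on the order of 10^25 years
```

Even a millionfold speed-up of the simulator would still leave roughly 10^19 years, which is why exhaustive simulation is not an option here.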

Formal Verification

• Precise and accurate system analysis approach
• Based on mathematical techniques
  • Construct a computer-based mathematical model of the system (implementation)
  • Use mathematical reasoning, in a computerized environment, to check whether the implementation satisfies the properties of interest (specifications)
• Can be difficult and time-consuming

Formal Verification Techniques

• Model Checking
• Theorem Proving

Model Checking

[Figure: the system is modeled as a finite-state machine (FSM) and its properties are expressed in temporal logic; the model checker returns True if the model satisfies the specification, and a counterexample otherwise.]

Model Checking – Example – Traffic Light Controller
• Objective is to prioritize the highway traffic
• Highway light remains green until there is a car at the farm road
• Farm light remains green for only x time units
• There is a yellow light in every transition

[Figure: state machine of the traffic light controller with four states, (Hwy = GREEN, Farm = RED), (Hwy = YELLOW, Farm = RED), (Hwy = RED, Farm = GREEN), and (Hwy = RED, Farm = YELLOW); the transitions are labeled "Car present on the farm road", "No car on the farm road", "Time x has elapsed", and "Time x has not elapsed".]

Model Checking – Example – Traffic Light Controller

[Figure: the system description and the temporal-logic properties are input to the model checker, which reports True or False (with a counterexample).]

Properties
• Both lights are not green
• Both lights are not red
• If a car arrives at the farm road, it will eventually get access
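To make the idea of exhaustive state exploration concrete, here is a minimal explicit-state sketch in Python. It is not how SMV or VIS work internally, and the transition function is an assumption reconstructed from the state labels of the traffic light example. It only checks the two safety properties, reading them as "never both green" and "never both red"; the liveness property (a waiting car eventually gets access) needs temporal-logic machinery that is not shown.

```python
from collections import deque

# A state is (highway_light, farm_light).
INITIAL = ("GREEN", "RED")


def next_state(state, car_on_farm_road, timer_elapsed):
    """Assumed transition function of the controller (hypothetical)."""
    if state == ("GREEN", "RED"):
        return ("YELLOW", "RED") if car_on_farm_road else state
    if state == ("YELLOW", "RED"):
        return ("RED", "GREEN")
    if state == ("RED", "GREEN"):
        return ("RED", "YELLOW") if timer_elapsed else state
    return ("GREEN", "RED")  # leaving ("RED", "YELLOW")


def reachable(initial):
    """Breadth-first exploration of every state reachable from `initial`."""
    seen, frontier = {initial}, deque([initial])
    while frontier:
        state = frontier.popleft()
        for car in (False, True):
            for elapsed in (False, True):
                nxt = next_state(state, car, elapsed)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen


def check_invariant(states, predicate):
    """Return (True, None), or (False, violating_state) as a counterexample."""
    for state in states:
        if not predicate(state):
            return False, state
    return True, None


states = reachable(INITIAL)
never_both_green = check_invariant(states, lambda s: s != ("GREEN", "GREEN"))
never_both_red = check_invariant(states, lambda s: s != ("RED", "RED"))
```

Because every reachable state is visited, a passing check holds for all input sequences, not just for the ones a test bench happens to apply. The price is that the set of states grows quickly with system size.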

Model Checking

• Advantages
  • Automatic (push-button analysis tools)
  • No proofs involved
  • Diagnostic counterexamples
• Disadvantages
  • Limited expressiveness
  • State-space explosion problem
• Model Checking Tools
  • SMV (Symbolic Model Verifier), Carnegie Mellon U.
  • VIS (Verification Interacting with Synthesis), U. of California, Berkeley
  • Formal Check, Cadence Labs

Theorem Proving

[Figure: the system is modeled as a logic function and its properties as logic theorems; the theorem prover produces formal proofs of the system properties.]

Logic

• Formal language
  • Modeling systems
  • Modeling system properties
• Types of Logic
  • Propositional logic
    ▪ Boolean algebra, variables ∈ {T, F}
  • First-order logic (predicate logic)
    ▪ Quantification over variables (∀: for all, ∃: there exists)
  • Higher-order logic
    ▪ Quantification over sets and functions
• Decidability: there is an algorithm for deciding the truth of a formula (theorem)

[Figure: spectrum from Propositional Logic through First-Order Logic to Higher-Order Logic; the propositional end is less expressive (-) but decidable (+), the higher-order end is very expressive (+) but undecidable (-).]

Theorem Prover

• A theorem prover consists of
  • A notation (syntax) to express logic
  • A small set of fundamental axioms (facts)
    ▪ A Boolean variable can be True or False: ∀ a. (a = T) ∨ (a = F)
  • A small set of inference (deduction) rules
    ▪ Equality is transitive: ∀ a b c. (a = b) ∧ (b = c) ⇒ (a = c)
• Soundness is assured, as every new theorem must be derived from
  • The basic axioms and primitive inference rules
  • Any other already proved theorems or inference rules
• Theory (collection of verified theorems in a file)
  • Facilitates the reuse of pre-verified results

Theorem Proving – Example
• Check if y > x for the given system, y = (x+1)^2 (x is a natural number)

[Figure: block with input x and output y = (x+1)^2]

Step | Formula                   | Justification
1    | y > x                     | Problem statement
2    | (x+1)^2 > x               | Implementation
3    | (x+1).(x+1) > x           | Definition of square
4    | (x+1).x + (x+1).1 > x     | Distributivity
5    | x.x + 1.x + x.1 + 1.1 > x | Distributivity
6    | x.x + x + x + 1 > x       | Multiplicative identity
7    | x.x + x + 1 + x > x       | Additive commutativity
8    | x.x + x + 1 > 0           | Addition cancellation
9    | True                      | Natural numbers ≥ 0, so x.x + x + 1 ≥ 1

Theorem Proving

• Advantages
  • High expressiveness
    ▪ Can be used to analyze essentially any system that can be expressed mathematically
  • Less risk of mistakes (human errors)
  • Some parts of the proofs can be automated
• Disadvantages
  • Detailed and explicit human guidance required
  • The state-of-the-art is limited
• Theorem Proving Tools
  • Boyer-Moore (first-order logic), U. of Texas, Austin
  • PVS (higher-order logic), Stanford Research Institute
  • HOL (higher-order logic), U. of Cambridge, UK

Some Formal Verification Myths

• Formal verification can only be used by mathematicians
  • The tools are primarily based on mathematical concepts that are usually transparent to the user
• The reasoning process is itself prone to errors, so why bother?
  • We aim to reduce design bugs, not eliminate them
• Using formal verification tends to slow the design process
  • The early detection of design bugs allows us to speed up the overall design process

Formal Verification Challenge

[Figure: sources of randomness that challenge formal verification: environmental conditions, aging phenomena, probabilistic choice, unpredictable inputs, and noise.]

Probabilistic System Verification

[Figure: a hardware/software system with random components is abstracted into a system model with random variables; the model and the probabilistic and statistical properties of interest are fed to a computer-based analysis framework, which answers whether each property is satisfied.]

Probabilistic Analysis Basics – Random Variables

• Discrete Random Variables
  • Attain a countable number of values
  • Example
    ▪ Dice [1, 6]
• Continuous Random Variables
  • Attain an uncountable (infinite) number of values
  • Example
    ▪ Uniform (all real numbers in an interval [a, b])

Probabilistic Analysis Basics – Probabilistic Properties

Property                               | Description
Probability Mass Function (PMF)        | Probability that the random variable is equal to some number n
Cumulative Distribution Function (CDF) | Probability that the random variable is less than or equal to some number n
Probability Density Function (PDF)     | Slope of the CDF for continuous random variables

[Figure: the Examples column shows discrete and continuous example plots for each property.]

Probabilistic Analysis Basics – Statistical Properties

Property                 | Description
Expectation              | Long-run average value of a random variable
Variance                 | Measure of dispersion of a random variable
Tail Distribution Bounds | Upper limits of the probability that the random variable acquires values far from its expectation (Markov’s and Chebyshev’s inequalities)

[Figure: the Illustration column shows a plot for each property.]

Probabilistic Analysis Approaches

                  | Simulation                            | Formal Methods: Model Checking | Formal Methods: Theorem Proving
Random Components | Approximate random variable functions | Probabilistic state machine    | Precise random variable functions
Analysis          | Observing some test cases             | Exhaustive verification        | Mathematical reasoning
Accuracy          | ✗                                     | ✓                              | ✓
Expressiveness    | ✓                                     | ✗                              | ✓
No CPU Time Issue | ✗                                     | ✗                              | ✓
Automation        | ✓                                     | ✓                              | ✗

Outline

✓ Introduction and Motivation
• Probabilistic Theorem Proving
• Case Studies
• Conclusions

HOL Theorem Prover

• Higher-order logic theorem prover
  • University of Cambridge, UK
• Notation: ML
• 5 axioms
• 8 primitive inference rules
• Numerous proof assistants are available
• Built-in mathematical theories of Booleans, lists, sets, integers, real analysis, and probability

Probabilistic Theorem Proving

[Figure: probabilistic theorem proving framework. The system description is turned into a system model whose random components are formalized discrete and continuous random variables. The system properties are expressed as probabilistic properties (PMF and CDF for discrete, CDF and PDF for continuous random variables) and statistical properties (expectation and variance). The theorem prover takes the system model, the properties, and the probabilistic analysis theorems, and produces formal proofs of the properties.]

Probabilities in HOL

• Formal Verification of Probabilistic Algorithms in HOL, PhD thesis, U. of Cambridge, UK [Hurd, 2002]
• A probabilistic algorithm that
  • Accepts: α
  • Returns: β
  can be modeled in HOL as a deterministic function

    f : α → B^∞ → (β × B^∞)

  that passes around the infinite Bernoulli sequence (B^∞), which provides the source of randomness

Probabilistic Algorithms in HOL – Example 1
• Coin Flip (Head, Tail)

    B^∞ → (flip_outcome × B^∞)

Definition: Coin Flip
⊢ flip s = (if (top element of s) then Head else Tail, remaining portion of s)

Probabilistic Algorithms in HOL – Example 2
• An n-bit Discrete Standard Uniform Random Variable

    num → B^∞ → (real × B^∞)

• Algorithm:

    $U_n = \sum_{i=1}^{n} \left(\frac{1}{2}\right)^{i} B_i$

  • where B_i = 1 if the i-th element in the Bernoulli sequence is T, else 0

    {H_1, T_2, H_3, ..., H_n} → (1/2^1 + 0/2^2 + 1/2^3 + … + 1/2^n) = (0.101...1)_2

[Figure: PMF plots of U_n. Each of the 2^n values 0, 1/2^n, ..., (2^n − 1)/2^n occurs with probability 1/2^n; for example, the values 0, 1/4, 2/4, 3/4 each have probability 1/4.]

Probabilistic Algorithms in HOL – Example 2 (Formalization)

• s = (B0, B1, B2, B3, …)
• std_unif_disc 0 s = (0, (B0, B1, B2, B3, …))
• std_unif_disc 1 s = (if B0 then ((1/2) + 0) else 0, (B1, B2, B3, …))
• std_unif_disc 2 s = (if B1 then ((1/4) + fst (std_unif_disc 1 s)) else (fst (std_unif_disc 1 s)), (B2, B3, …))

Definition: Discrete Standard Uniform Random Variable
⊢ std_unif_disc 0 s = (0, s) ∧
  ∀ n. std_unif_disc (n + 1) s =
    (if (shd (snd (std_unif_disc n s)))
     then ((1/2)^(n+1) + fst (std_unif_disc n s))
     else (fst (std_unif_disc n s)),
     stl (snd (std_unif_disc n s)))
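The recursive definition can be mirrored almost line by line in an ordinary programming language. The Python sketch below is only an executable illustration, not the HOL formalization. As an assumption, the infinite Boolean sequence is replaced by a finite tuple that is long enough for the requested n, and `pmf` is a hypothetical helper that enumerates all equally likely n-bit prefixes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def std_unif_disc(n, s):
    """Return (value, unused bits), following the recursive definition.

    `s` is a tuple of booleans standing in for the infinite sequence.
    """
    if n == 0:
        return Fraction(0), s
    value, rest = std_unif_disc(n - 1, s)
    head, tail = rest[0], rest[1:]    # shd and stl of the remaining sequence
    if head:
        value += Fraction(1, 2) ** n  # the n-th bit contributes (1/2)^n
    return value, tail


def pmf(n):
    """PMF of the n-bit variable, by enumerating all 2^n bit prefixes."""
    counts = Counter(
        std_unif_disc(n, bits)[0] for bits in product([False, True], repeat=n)
    )
    return {value: Fraction(count, 2 ** n) for value, count in counts.items()}


# H, T, H corresponds to the binary fraction 0.101 = 5/8.
value, remaining = std_unif_disc(3, (True, False, True))
```

Enumeration only confirms the PMF for one concrete n at a time; a proof in the theorem prover would cover every n at once.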

Probability Theory in HOL

• Probability Space (Ω, ∑, P)
  • Ω: Sample Space
    ▪ Set of Boolean sequences
  • ∑: Events
    ▪ Sigma algebra on Ω; a set of subsets of Ω that is closed under complements and countable unions
  • P: Probability
    ▪ Function that maps the elements of ∑ to the real interval [0, 1]

Probability Theory in HOL

Theorems: Basic Probability Axioms

Probability of the Sample Space         | ⊢ P (Ω) = 1
Probability Bounds                      | ⊢ ∀ A. 0 ≤ P (A) ≤ 1
Probability is Monotonically Increasing | ⊢ ∀ A B. A ⊆ B ⇒ P (A) ≤ P (B)
Probability is Additive                 | ⊢ ∀ A B. A ∩ B = ∅ ⇒ P (A ∪ B) = P (A) + P (B)
Probability of a Complement Set         | ⊢ ∀ A. P (¬A) = 1 - P (A)

Theorem: Probability of an Element of the Boolean Sequence
⊢ ∀ b. P {s | shd s = b} = ½

Probabilistic Algorithms in HOL – Example 1 (Verification)
• Coin Flip (Head, Tail)

Definition: Coin Flip
⊢ flip s = (if (top element of s) then Head else Tail, remaining portion of s)

Theorem: PMF of Coin Flip
⊢ P {s | FST (flip s) = Head} = ½

Probabilistic Algorithms in HOL – Example 2 (Verification)

Theorem: PMF
⊢ ∀ m n x. P {s | fst (std_unif_disc n s) = x} =
    if (x < 0) then 0 else (if (x ≥ 1) then 0 else
      (if (x = m/2^n) then (1/2)^n else 0))

Theorem: CDF
⊢ ∀ m n x. P {s | fst (std_unif_disc n s) ≤ x} =
    if (x < 0) then 0 else (if (x ≥ 1) then 1 else
      (if (x = m/2^n) then ((m+1)/2^n) else 0))

[Figure: PMF plot with probability 1/2^n at each of the points 0, 1/2^n, 2/2^n, ..., (2^n − 1)/2^n, and the corresponding staircase CDF plot rising in steps of 1/2^n up to 1.]

Probabilistic Algorithms in HOL

• The approach described so far is limited to probabilistic algorithms that
  • Can acquire a finite number of values (2^n)
  • Return each value with probability 1/2^n, where n is the number of elements of the Boolean sequence
• Not all algorithms satisfy these conditions
  • Example: Geometric Random Variable
    ▪ Returns the index of the first success in an infinite number of coin flips or Bernoulli trials
• Probabilistic While Loop
  • The probability of loop termination is 1

Discrete Random Variables in HOL

Theorems: Some Formalized Discrete Random Variables

Random variable | PMF
Uniform(m)      | ⊢ ∀ m x. x < m ⇒ P {s | fst (prob_unif m s) = x} = 1/m
Bernoulli(p)    | ⊢ ∀ p. 0 ≤ p ∧ p < 1 ⇒ P {s | fst (prob_bern p s) = x} = p
Geometric(p)    | ⊢ ∀ n p. 0 < p ∧ p ≤ 1 ⇒ P {s | fst (prob_geom p s) = (n + 1)} = p (1 - p)^n
Binomial(m,p)   | ⊢ ∀ m n p. 0 < p ∧ p ≤ 1 ⇒ P {s | fst (prob_bino m p s) = n} = (binomial m n) p^n (1 – p)^(m - n)

Break!

Probabilistic Theorem Proving

[Figure: the probabilistic theorem proving framework shown earlier (system model built from discrete and continuous random variables, probabilistic and statistical properties, theorem prover producing formal proofs), repeated as a roadmap.]

Continuous Random Variables in HOL
• Sampling algorithms are non-terminating
  • Tedious formalization and verification
• Inverse Transform Method
  • Extensively used non-uniform random number generation method
  • Formalization of Continuous Probability Distributions, Automated Deduction [Hasan and Tahar, 2007]

[Figure: Standard Uniform random number generator [0, 1] → Inverse Transform Method → random numbers from continuous distributions (Closed CDF)]

Standard Uniform Random Variable

• Continuous Uniform random variable on [0, 1]
• Algorithm

    $U = \sum_{i=1}^{\infty} \left(\frac{1}{2}\right)^{i} X_i$, where $X_i = 1$ if the $i$-th coin flip is a head, else $0$

    ▪ {H, T, H, H, ...} → (1/2^1 + 0/2^2 + 1/2^3 + 1/2^4 + …) = (0.1011...)_2

Formalization of Standard Uniform Random Variable
• Step 1
  • Discrete Standard Uniform Random Variable
    ▪ std_unif_disc
• Algorithm:

    $U_n = \sum_{i=1}^{n} \left(\frac{1}{2}\right)^{i} B_i$

  • where B_i = 1 if the i-th element in the Bernoulli sequence is T, else 0

    {H_1, T_2, H_3, ..., H_n} → (1/2^1 + 0/2^2 + 1/2^3 + … + 1/2^n) = (0.101...1)_2

[Figure: PMF plots of U_n. Each of the 2^n values 0, 1/2^n, ..., (2^n − 1)/2^n occurs with probability 1/2^n; for example, the values 0, 1/4, 2/4, 3/4 each have probability 1/4.]

Formalization of Standard Uniform Random Variable
• Step 1
  • Discrete Standard Uniform Random Variable
    ▪ std_unif_disc

    $U_n = \sum_{i=1}^{n} \left(\frac{1}{2}\right)^{i} B_i$

Definition: Discrete Standard Uniform Random Variable
⊢ std_unif_disc 0 s = (0, s) ∧
  ∀ n. std_unif_disc (n + 1) s =
    (if (shd (snd (std_unif_disc n s)))
     then ((1/2)^(n+1) + fst (std_unif_disc n s))
     else (fst (std_unif_disc n s)),
     stl (snd (std_unif_disc n s)))

Formalization of Standard Uniform Random Variable
• Step 1
  • Discrete Standard Uniform Random Variable
    ▪ std_unif_disc

    $U_n = \sum_{i=1}^{n} \left(\frac{1}{2}\right)^{i} B_i$

• Step 2
  • As n tends to infinity

    $U = \lim_{n \to \infty} U_n$

Definition: Standard Uniform Random Variable
⊢ ∀ s. std_unif_cont s = lim (λn. std_unif_disc n s)

Verification of Standard Uniform Random Variable

Theorem: CDF of Standard Uniform Random Variable
⊢ ∀ x. P {s | std_unif_cont s ≤ x} =
    if (x < 0) then 0 else (if (x < 1) then x else 1)

• Proof Sketch:
  • Verify the CDF of the discrete Uniform random variable
  • Take the limit as n approaches infinity

Cumulative Distribution Function

• Completely characterizes both discrete and continuous random variables

    $F_R(x) = \Pr(R \le x)$

Definition: Cumulative Distribution Function (CDF)
⊢ ∀ R x. cdf R x = P {s | R s ≤ x}

Verification of CDF Properties

Theorems: CDF Properties

Bounds                    | ⊢ ∀ R x. 0 ≤ cdf R x ≤ 1
Monotonic                 | ⊢ ∀ R a b. (a < b) ⇒ (cdf R a ≤ cdf R b)
Interval Probability      | ⊢ ∀ R a b. (a < b) ⇒ (P {s | (a < R s) ∧ (R s ≤ b)} = cdf R b – cdf R a)
Negative Infinity         | ⊢ ∀ R. lim (λn. cdf R (-n)) = 0
Positive Infinity         | ⊢ ∀ R. lim (λn. cdf R n) = 1
Continuous from the Right | ⊢ ∀ R a. lim (λn. cdf R (a + 1/(n+1))) = cdf R a
Limit from the Left       | ⊢ ∀ R a. lim (λn. cdf R (a – 1/(n+1))) = P {s | R s < a}

Inverse Transform Method

• Random variable X with well-defined CDF F
• U: Standard Uniform random variable
• F^{-1}: inverse function of F

    $X = F^{-1}(U)$

• Proof utilizes the CDF of the Standard Uniform random variable and the CDF properties

Theorem: Inverse Transform Method
⊢ ∀ f f_inv x. (is_cont_cdf_fn f) ∧ (inv_cdf_fn f_inv f) ⇒
    P {s | f_inv (std_unif_cont s) ≤ x} = f x
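The theorem states that pushing a Standard Uniform sample through the inverse CDF yields a variable whose CDF is F. The Python sketch below checks this numerically for the exponential CDF 1 − exp(−ax). It is a simulation, so it gives evidence rather than a proof. The function names are illustrative, and `random.Random` stands in for the formalized Standard Uniform random variable as an assumption.

```python
import math
import random


def exp_cdf(a, x):
    """CDF of the exponential distribution with rate a."""
    return 0.0 if x <= 0 else 1.0 - math.exp(-a * x)


def exp_inverse_cdf(a, u):
    """Inverse of exp_cdf on [0, 1): -(1/a) ln(1 - u)."""
    return -(1.0 / a) * math.log(1.0 - u)


def sample_exponential(a, rng):
    """Inverse transform method: apply the inverse CDF to a uniform sample."""
    return exp_inverse_cdf(a, rng.random())  # rng.random() lies in [0, 1)


def empirical_cdf(samples, x):
    """Fraction of samples that are less than or equal to x."""
    return sum(1 for v in samples if v <= x) / len(samples)


rng = random.Random(0)  # fixed seed so the experiment is repeatable
rate = 1.5
samples = [sample_exponential(rate, rng) for _ in range(20_000)]
gap = abs(empirical_cdf(samples, 1.0) - exp_cdf(rate, 1.0))
```

The empirical and analytical CDF values agree only up to sampling noise, which is the gap that the formally proved theorem closes.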


Example – Exponential Random Variable

• CDF:

    $F(x) = \begin{cases} 0, & x \le 0 \\ 1 - e^{-ax}, & 0 < x \end{cases}$

    (λx. if x ≤ 0 then 0 else (1 – exp (–ax)))

• Inverse CDF: (λx. –(1/a) ln (1 – x))

Theorem: Valid CDF Function
⊢ ∀a. is_cont_cdf_fn (λx. if x ≤ 0 then 0 else (1 – exp (–ax)))

Theorem: Valid CDF Inverse Function
⊢ ∀a. inv_cdf_fn (λx. if x ≤ 0 then 0 else (1 – exp (–ax))) (λx. –(1/a) ln (1 – x))

[Figure: exponential CDF plots for a = 0.5, a = 1.0, and a = 1.5]

Example – Exponential Random Variable

Definition: Exponential Random Variable
⊢ ∀ a s. exp_rv a s = (λx. –(1/a) ln (1 – x)) (std_unif_cont s)

Theorem: CDF of Exponential Random Variable
⊢ ∀ a x. (0 < a) ⇒ cdf (λs. exp_rv a s) x =
    if x ≤ 0 then 0 else (1 – exp (–ax))

• Proof
  • Inverse Transform Method theorem
  • Real analysis

Continuous Random Variables in HOL

Theorems: Continuous Random Variables

Random variable | HOL function | CDF
Exponential(a)  | ⊢ ∀ a s. exp_rv a s = (λx. –(1/a) ln (1 – x)) (std_unif_cont s) | 0 for x ≤ 0; 1 – exp (–ax) for 0 < x
Uniform(a,b)    | ⊢ ∀ a b s. uniform_rv a b s = (λx. (b – a) x + a) (std_unif_cont s) | 0 for x ≤ a; (x – a)/(b – a) for a < x ≤ b; 1 for b < x
Rayleigh(l)     | ⊢ ∀ l s. rayleigh_rv l s = (λx. l sqrt (–2 ln (1 – x))) (std_unif_cont s) | 0 for x ≤ 0; 1 – exp (–x^2/(2 l^2)) for 0 < x
Triangular(a)   | ⊢ ∀ a s. triangular_rv a s = (λx. a (1 – sqrt (1 – x))) (std_unif_cont s) | 0 for x ≤ 0; (2/a)(x – x^2/(2a)) for 0 < x ≤ a; 1 for a < x

Probabilistic Theorem Proving

[Figure: the probabilistic theorem proving framework shown earlier (system model built from discrete and continuous random variables, probabilistic and statistical properties, theorem prover producing formal proofs), repeated as a roadmap.]

Statistical Properties in HOL


Formalization of Expectation

• Using Theorem Proving to Verify Expectation and Variance for Discrete Random Variables, JAR [Hasan and Tahar, 2008]
• Summarizes the distribution characteristics of a random variable in a single number

    $Ex[f(R)] = \sum_{n=0}^{\infty} f(n) \Pr(R = n)$

Definition: Expectation of a Function of Random Variable
⊢ ∀ f R. expec_fn f R = ∑_{n=0}^{∞} (f n) P {s | fst (R s) = n}

Definition: Expectation of a Random Variable
⊢ ∀ R. expec R = expec_fn (λn. n) R

Verification of Expectation Properties

• Linearity of Expectation

    $Ex\left[\sum_{i=1}^{n} R_i\right] = \sum_{i=1}^{n} Ex[R_i]$

• Helpful in reasoning about systems involving multiple random variables
• Proof Sketch
    ▪ 2 random variables (real analysis + probability theory)
    ▪ General case (induction)

Theorem: Linearity of Expectation
⊢ ∀ L. (∀ R. R ∈ L ⇒ (∃x. ∑_{i=0}^{∞} i P {s | fst (R s) = i} = x)) ⇒
    expec (sum_rv_lst L) = ∑_{n=0}^{length L} expec (el (length L – (n + 1)) L)

Verification of Expectation Properties

• Expectation of a random variable added to and multiplied by constants

    $Ex[a + bR] = a + b\,Ex[R]$

• Proof
    ▪ Linearity of Expectation property
    ▪ (real analysis + probability theory)

Theorem: Random Variable Added and Multiplied by Constants
⊢ ∀ R a b. (∃x. ∑_{i=0}^{∞} i P {s | fst (R s) = i} = x) ⇒
    expec (bind R (λm. unit (a + b m))) = a + b (expec R)


Formalization of Variance

• Measure of dispersion of a random variable

    $Var[R] = Ex\left[(R - Ex[R])^2\right]$

Definition: Variance of a Random Variable
⊢ ∀ R. variance R = expec_fn (λn. (n – expec R)^2) R

Verification of Variance Properties

• Variance in terms of Moments

    $Var[R] = Ex[R^2] - (Ex[R])^2$

• Proof
    ▪ Definitions of Variance and Expectation
    ▪ (real analysis + probability theory)

Theorem: Variance in Terms of Moments
⊢ ∀ R. (∃ x. ∑_{i=0}^{∞} i P {s | fst (R s) = i} = x) ∧
       (∃ x. ∑_{i=0}^{∞} i^2 P {s | fst (R s) = i} = x) ⇒
    variance R = expec_fn (λn. n^2) R – (expec R)^2

Verification of Variance Properties

• Linearity of Variance

    $Var\left[\sum_{i=1}^{n} R_i\right] = \sum_{i=1}^{n} Var[R_i]$

• Proof Sketch
    ▪ 2 random variables (real analysis + probability theory + Linearity of Expectation property)
    ▪ General case (induction)

Theorem: Linearity of Variance
⊢ ∀ L. (∀ R. R ∈ L ⇒ (∃ x. ∑_{i=0}^{∞} i P {s | fst (R s) = i} = x) ∧
                      (∃ x. ∑_{i=0}^{∞} i^2 P {s | fst (R s) = i} = x)) ⇒
    variance (sum_rv_lst L) = ∑_{n=0}^{length L} variance (el (length L – (n + 1)) L)

Tail Distribution Bounds

• Upper limits on the probability that the random variable acquires values far from its expectation
• Useful in estimating failure probabilities
• Markov’s Inequality
• Chebyshev’s Inequality

Markov’s Inequality

• Obtains a weak tail bound in terms of the expectation of a random variable

    $\Pr(R \ge a) \le \frac{Ex[R]}{a}$

• Proof
    ▪ Definition of expectation and its properties
    ▪ (real analysis + probability theory)

Theorem: Markov’s Inequality
⊢ ∀ R a. (∃ x. ∑_{n=0}^{∞} (n P {s | fst (R s) = n}) = x) ∧ (0 < a) ⇒
    P {s | fst (R s) ≥ a} ≤ (expec R) / a

Chebyshev’s Inequality

• Relatively stronger tail bound in terms of the expectation and variance of a random variable

    $\Pr(|R - Ex[R]| \ge a) \le \frac{Var[R]}{a^2}$

• Proof
    ▪ Definitions of expectation and variance and their properties
    ▪ (real analysis + probability theory)

Theorem: Chebyshev’s Inequality
⊢ ∀ R a. (0 < a) ∧
         (∃ x. ∑_{i=0}^{∞} i P {s | fst (R s) = i} = x) ∧
         (∃ x. ∑_{i=0}^{∞} i^2 P {s | fst (R s) = i} = x) ⇒
    P {s | abs (fst (R s) – expec R) ≥ a} ≤ (variance R) / a^2
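Both tail bounds are easy to sanity-check on a small example. The Python sketch below uses a fair six-sided die with exact rational arithmetic; the die and all helper names are illustrative assumptions, not part of the HOL development. Checking one distribution only illustrates the inequalities, whereas the theorems above hold for every discrete random variable that satisfies their summability conditions.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face 1..6 has probability 1/6.
DIE_PMF = {face: Fraction(1, 6) for face in range(1, 7)}


def expectation(pmf):
    """Sum of value * probability over the support."""
    return sum(value * prob for value, prob in pmf.items())


def variance(pmf):
    """Expected squared deviation from the mean."""
    mean = expectation(pmf)
    return sum((value - mean) ** 2 * prob for value, prob in pmf.items())


def probability(pmf, event):
    """Probability that the outcome satisfies the predicate `event`."""
    return sum(prob for value, prob in pmf.items() if event(value))


def markov_holds(pmf, a):
    """Check Pr(R >= a) <= Ex[R] / a for a > 0."""
    return probability(pmf, lambda v: v >= a) <= expectation(pmf) / a


def chebyshev_holds(pmf, a):
    """Check Pr(|R - Ex[R]| >= a) <= Var[R] / a^2 for a > 0."""
    mean = expectation(pmf)
    tail = probability(pmf, lambda v: abs(v - mean) >= a)
    return tail <= variance(pmf) / a ** 2
```

For instance, with a = 5 the Markov bound is (7/2)/5 = 7/10, while the actual tail probability Pr(R ≥ 5) is 1/3, which shows why the slide calls the bound weak.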


Example – Geometric Random Variable

Theorem: Expectation of Geometric Random Variable
⊢ ∀ p. (0 < p) ∧ (p ≤ 1) ⇒ expec (λs. prob_geom p s) = 1/p

Theorem: Variance of Geometric Random Variable
⊢ ∀ p. (0 < p) ∧ (p ≤ 1) ⇒ variance (λs. prob_geom p s) = (1 - p)/p^2

• Proof
    ▪ Definitions of expectation and variance and their properties
    ▪ (real analysis + probability theory)
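The two closed forms can be checked numerically by truncating the infinite sums. The Python sketch below assumes the Geometric PMF Pr(R = k) = p(1 − p)^(k−1) for k = 1, 2, ... (the index of the first success) and computes the variance through the moment relation Var[R] = Ex[R^2] − (Ex[R])^2. The truncation point `TERMS` is an arbitrary choice, made large enough that the neglected tail is negligible for the p used here.

```python
TERMS = 2000  # assumption: truncation point for the infinite sums


def geometric_pmf(p, k):
    """Pr(R = k) for k = 1, 2, ..., the index of the first success."""
    return p * (1.0 - p) ** (k - 1)


def moment(p, order):
    """Truncated sum of k^order * Pr(R = k)."""
    return sum(k ** order * geometric_pmf(p, k) for k in range(1, TERMS + 1))


def expectation(p):
    return moment(p, 1)


def variance(p):
    """Variance in terms of moments: Ex[R^2] - (Ex[R])^2."""
    return moment(p, 2) - moment(p, 1) ** 2


p = 0.25
mean_error = abs(expectation(p) - 1.0 / p)           # closed form: 1/p
var_error = abs(variance(p) - (1.0 - p) / p ** 2)    # closed form: (1-p)/p^2
```

A numerical check like this covers a single value of p; the HOL theorems cover every p in (0, 1].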

Verification of Expectation and Variance Relations

Theorems: Discrete Random Variables

Random variable | HOL function | Expectation | Variance
Uniform(m)      | prob_unif    | m/2         | ((m + 1)^2 - 1)/12
Bernoulli(p)    | prob_bern    | p           | p (1 - p)
Geometric(p)    | prob_geom    | 1/p         | (1 - p)/p^2
Binomial(m,p)   | prob_bino    | mp          | mp (1 - p)

Probabilistic Theorem Proving

[Figure: the probabilistic theorem proving framework shown earlier (system model built from discrete and continuous random variables, probabilistic and statistical properties, theorem prover producing formal proofs), repeated as a roadmap.]

Expectation of Continuous Random Variables
• Riemann Integral

    $Ex[X] = \int_{-\infty}^{+\infty} x f_X(x)\,dx$

  • f_X: Probability Density Function (PDF) of random variable X
  • Well known, which facilitates the reasoning process
  • Limited to random variables with well-defined PDFs
  • Requires extended real numbers R̄ = R ∪ {-∞, +∞}
• Lebesgue Integral

    $Ex[X] = \int_{\Omega} X\,dP$

  • Ω: sample space and P: probability function
  • Most general definition of expectation
    ▪ Caters for both discrete and continuous random variables
  • Analytically complex to handle

• Formalization of Lebesgue Integral in HOL, U. of Cambridge, UK [Aaron, 2009]
• Formal Reasoning about Expectation Properties for Continuous Random Variables, Formal Methods [Hasan, Abbasi, Akbarpour and Tahar, 2009]

Definition: Expectation of a Continuous Random Variable
⊢ ∀ X. expec_cont (Ω, ∑, P) X = ∫_Ω X dP

[Figure: the Lebesgue integral definition $Ex[X] = \int_{\Omega} X\,dP$ is turned into simplified expressions that involve commonly used arithmetic operations.]

Expectation of Continuous Random Variables q Simplified Expression 1: Bounded Random Variables

⎥⎦

⎤⎢⎣

⎭⎬⎫

⎩⎨⎧ −

++<≤−+−+= ∑

=∞→

12

0)(

21)(

2))(

2(lim][

n

innnn

abiaXabiaPabiaXE

O. Hasan and S. Tahar

83

⊢ ∀ a b X. (0 ≤ a) ∧ (a < b) ∧ (∀s. a ≤ X s ≤ b) ⇒ expec_cont (Ω, ∑, P) X =

Theorem: Expectation of Bounded Random Variables

⎥⎦

⎤⎢⎣

⎭⎬⎫

⎩⎨⎧ −

++<≤−+−+∑

=∞→

12

0innnn

n

a)(b21iasXa)(b

2ia|sa))P(b

2i(alim
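As a numerical sanity check of the bounded-variable expression (it is not part of the HOL development), the following minimal Python sketch evaluates the finite sum for a fixed n, computing each interval probability from a CDF. The names `bounded_expectation` and `uniform_cdf`, and the choice of a Uniform(a, b) test case, are illustrative assumptions.

```python
from typing import Callable


def bounded_expectation(cdf: Callable[[float], float], a: float, b: float, n: int) -> float:
    """Evaluate the sum over 2**n subintervals of [a, b] for a fixed n."""
    pieces = 2 ** n
    width = (b - a) / pieces
    total = 0.0
    for i in range(pieces):
        lo = a + i * width
        hi = lo + width
        # For a continuous X, P{lo <= X < hi} = F(hi) - F(lo).
        total += lo * (cdf(hi) - cdf(lo))
    return total


def uniform_cdf(a: float, b: float) -> Callable[[float], float]:
    """CDF of a Uniform(a, b) random variable (hypothetical test distribution)."""
    return lambda x: min(1.0, max(0.0, (x - a) / (b - a)))


if __name__ == "__main__":
    a, b = 1.0, 3.0
    for n in (2, 6, 12):
        print(n, bounded_expectation(uniform_cdf(a, b), a, b, n))
    print("closed form (a + b) / 2 =", (a + b) / 2)
```

The sum approaches (a + b)/2 from below as n grows, which agrees with the Uniform(a, b) entry in the expectation table later in the tutorial.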

Expectation of Continuous Random Variables
q Simplified Expression 2: Unbounded Random Variables

$$E[X] = \lim_{n\to\infty}\left[\sum_{i=0}^{n2^n-1}\frac{i}{2^n}\, P\left\{\frac{i}{2^n} \le X < \frac{i+1}{2^n}\right\} + n\,P(X \ge n)\right]$$

Theorem: Expectation of Unbounded Random Variables
⊢ ∀ a b X. (∀s. 0 ≤ X s) ⇒
expec_cont (Ω, ∑, P) X =
$$\lim_{n\to\infty}\left[\sum_{i=0}^{n2^n-1}\frac{i}{2^n}\, P\left\{s \mid \frac{i}{2^n} \le X\,s < \frac{i+1}{2^n}\right\} + n\,P\{s \mid X\,s \ge n\}\right]$$

Example: Exponential Random Variable

q Proof
  q Evaluating the two probability terms using the CDF of the Exponential random variable
  q Evaluating the infinite summation

$$E[X] = \lim_{n\to\infty}\left[\sum_{i=0}^{n2^n-1}\frac{i}{2^n}\, P\left\{\frac{i}{2^n} \le X < \frac{i+1}{2^n}\right\} + n\,P(X \ge n)\right]$$

⊢ ∀ a. (0 < a) ⇒ expec_cont (Ω, ∑, P) (λs. exp_rv a s) = 1/a

Theorem: Expectation of Exponential Random Variable
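The two proof steps can be mimicked numerically. The sketch below is an illustration only, assuming the standard Exponential CDF F(x) = 1 - exp(-a x); the function names are hypothetical. It evaluates the unbounded-variable expression for a fixed n and compares it with 1/a.

```python
import math


def exp_cdf(rate: float, x: float) -> float:
    """CDF of an Exponential(rate) random variable."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def unbounded_expectation_exp(rate: float, n: int) -> float:
    """Evaluate the unbounded-variable expression for a fixed n."""
    scale = 2 ** n
    total = 0.0
    for i in range(n * scale):
        lo = i / scale
        hi = (i + 1) / scale
        # First probability term: P{i/2^n <= X < (i+1)/2^n} via the CDF.
        total += lo * (exp_cdf(rate, hi) - exp_cdf(rate, lo))
    # Second probability term: n * P(X >= n) = n * exp(-rate * n).
    total += n * math.exp(-rate * n)
    return total


if __name__ == "__main__":
    rate = 2.0
    for n in (4, 8, 12):
        print(n, unbounded_expectation_exp(rate, n))
    print("closed form 1 / a =", 1.0 / rate)
```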

Verification of Expectation Relations

Theorems: Continuous Random Variables

| Random variable | HOL Function | Expectation |
| Uniform(a,b) | uniform_rv | $\frac{a+b}{2}$ |
| Triangular(0,b) | triangular_rv | $\frac{b}{3}$ |
| Exponential(a) | exp_rv | $\frac{1}{a}$ |
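The three expectations in the table can be cross-checked against the Riemann-integral definition of expectation with a simple midpoint rule. This is a sketch under stated assumptions: the Triangular(0, b) density is taken to be f(x) = 2(b - x)/b² on [0, b] (the form consistent with a mean of b/3), the Exponential integral is truncated at a finite upper limit, and all function names are hypothetical.

```python
import math
from typing import Callable


def riemann_expectation(pdf: Callable[[float], float], lo: float, hi: float,
                        steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of x * pdf(x) over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width
        total += x * pdf(x) * width
    return total


def uniform_pdf(a: float, b: float) -> Callable[[float], float]:
    return lambda x: 1.0 / (b - a)


def triangular_pdf(b: float) -> Callable[[float], float]:
    # Assumed density for Triangular(0, b): decreasing from 2/b at 0 to 0 at b.
    return lambda x: 2.0 * (b - x) / (b * b)


def exponential_pdf(a: float) -> Callable[[float], float]:
    return lambda x: a * math.exp(-a * x)


if __name__ == "__main__":
    print(riemann_expectation(uniform_pdf(1.0, 3.0), 1.0, 3.0))   # compare with (a + b) / 2
    print(riemann_expectation(triangular_pdf(3.0), 0.0, 3.0))     # compare with b / 3
    print(riemann_expectation(exponential_pdf(2.0), 0.0, 40.0))   # compare with 1 / a
```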

[Figure: Probabilistic Theorem Proving framework. Labels: System Model (System, Random Components); System Performance (Discrete Random Variables); System Performance (Continuous Random Variables); Probabilistic Analysis Theorems; Discrete Random Variables (Probabilistic Properties: PMF, CDF; Statistical Properties: Expectation, Variance); Continuous Random Variables (Probabilistic Properties: CDF, PDF; Statistical Properties: Expectation, Variance); Theorem Prover; Formal Proofs of Properties.]

Effort Statistics

| Formalization | Approx. Lines of HOL code |
| Measure and Probability Theories | 17,000 |
| Discrete Random Variables: Formalization and Probabilistic Properties | 1,500 |
| Discrete Random Variables: Statistical Properties | 7,500 |
| Continuous Random Variables: Formalization and Probabilistic Properties | 7,000 |
| Continuous Random Variables: Statistical Properties | 8,500 |

Outline

ü Introduction and Motivation

ü Probabilistic Theorem Proving

q Case Studies
  q Coupon Collector’s Problem
  q Stop-and-Wait Protocol
  q Reconfigurable Memory Arrays

q Conclusions


Coupon Collector’s Problem

q Collect all n coupons and win!

q A collection of coupons with n distinct entries

q Each distinct coupon is uniformly distributed in the collection

q Coupons are drawn randomly and independently

q How many trials are required to acquire all n coupons?

Coupon Collector’s Problem

q How many message transmissions do we need, on average, to get all router IDs in the path?

q Tail Distribution Bounds:
  q Pr (number of transmissions required to know all router IDs in the path > some threshold value)

[Figure: network of routers labeled with the IDs 5, 9, 12, 8, 11, 23, 18, 15, followed by the ID sequence 8, 23, 8, 5.]

Coupon Collector’s Problem

q The process of acquiring a new coupon
  q Geometric random variable
    § Number of trials to achieve the first success

q Coupon Collector’s Problem
  q A sum of n Geometric random variables: $X = \sum_{i=1}^{n} X_i$
  q Where each $X_i$ denotes the Geometric random variable to acquire the ith new coupon

Formalization of Coupon Collector’s Problem in HOL

q Coupons are identified by unique non-negative integers
q Accepts: Number of acquired coupons
q Returns: The corresponding coupon collector list
q Example:
  q Input: 1 → [0]
  q Input: 2 → [1, 0]
  q And so on …

Definition: Coupon Collection List
⊢ (coupon_lst 0 = [ ]) ∧
  ∀ n. (coupon_lst (n + 1) = n :: (coupon_lst n))

Formalization of Coupon Collector’s Problem in HOL

q Accepts
  q Coupon Collector’s list (already acquired coupons)
  q Total number of distinct coupons

q Returns
  q List of Geometric random variables corresponding to the number of trials for all acquired coupons and the next coupon

q Success probability is modeled using the Uniform random variable
  q Probability that a new coupon number is uniformly generated

Definition: Geometric Random Variable List
⊢ ∀ n. (geom_rv_lst [ ] n = [prob_geom 1]) ∧
  ∀ h t n. (geom_rv_lst (h::t) n =
    (prob_geom P{s | ¬(mem (fst (prob_unif n s)) (h::t))} ) :: (geom_rv_lst t n))

Formalization of Coupon Collector’s Problem in HOL

q Example
  q Total number of distinct coupons: 5
    § [0]: [prob_geom P{s | ¬(mem (fst (prob_unif 5 s)) [0])} , prob_geom 1] = [prob_geom 4/5, prob_geom 1]
    § [1, 0]: [prob_geom 3/5, prob_geom 4/5, prob_geom 1]

Definition: Geometric Random Variable List
⊢ ∀ n. (geom_rv_lst [ ] n = [prob_geom 1]) ∧
  ∀ h t n. (geom_rv_lst (h::t) n =
    (prob_geom P{s | ¬(mem (fst (prob_unif n s)) (h::t))} ) :: (geom_rv_lst t n))

Formalization of Coupon Collector’s Problem in HOL

q Accepts
  q Total number of coupons (n+1)

q Returns
  q Sum of (n+1) Geometric random variables: $X = \sum_{i=1}^{n} X_i$
    § Each Geometric random variable models the number of trials required to acquire a distinct coupon in the coupon collector’s problem

Definition: Coupon Collector’s Problem
⊢ ∀ n. coupon_collector (n + 1) =
  sum_rv_lst (geom_rv_lst (coupon_lst n) (n + 1))
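To make the list-based definitions concrete, here is a minimal Python sketch that mirrors `coupon_lst` and the success probabilities generated by `geom_rv_lst`. It is an illustration rather than a translation of the HOL code: it tracks only the probability passed to each `prob_geom`, assumes the acquired coupons are distinct and smaller than the total, and the name `geom_success_probs` is hypothetical.

```python
from fractions import Fraction
from typing import List


def coupon_lst(n: int) -> List[int]:
    """Mirror of coupon_lst: coupon_lst(n + 1) = n :: coupon_lst(n)."""
    return list(range(n - 1, -1, -1))


def geom_success_probs(acquired: List[int], total: int) -> List[Fraction]:
    """Success probability handed to each prob_geom by geom_rv_lst."""
    probs = []
    remaining = list(acquired)
    while remaining:
        # Probability that a uniform draw from 0..total-1 is not in `remaining`
        # (assumes distinct coupons, all smaller than `total`).
        probs.append(Fraction(total - len(set(remaining)), total))
        remaining = remaining[1:]
    probs.append(Fraction(1))  # empty list: the first coupon is always new
    return probs


if __name__ == "__main__":
    print(coupon_lst(2))
    print(geom_success_probs([0], 5))
    print(geom_success_probs(coupon_lst(2), 5))
```

With 5 distinct coupons this reproduces the probabilities 4/5, 1 and 3/5, 4/5, 1 from the example slide.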

Verification of Coupon Collector’s Problem

Theorem: Probability of Acquiring a New Coupon
⊢ ∀ L n. (dist_lst L) ∧ (∀ a. mem a L ⇒ (a < (n + 1)))
  ⇒ (P {s | ¬(mem (fst (prob_unif (n + 1) s)) L)} = $1 - \frac{\text{length } L}{n+1}$)

q Proof
  q PMF of Uniform random variable
  q Set theory and Real Analysis

Verification of Coupon Collector’s Problem

Theorem: Average of Coupon Collector’s Problem
⊢ ∀ n. expec (coupon_collector (n + 1)) = $(n+1)\sum_{i=0}^{n}\frac{1}{i+1}$

q Proof
  q Expectation of Geometric random variable
  q Linearity of Expectation property
  q Real Analysis

Theorem: Variance Upper Bound of Coupon Collector’s Problem
⊢ ∀ n. variance (coupon_collector (n + 1)) ≤ $(n+1)^2\sum_{i=0}^{n}\frac{1}{(i+1)^2}$

q Proof
  q Variance of Geometric random variable
  q Linearity of Variance property
  q Real Analysis
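Both results can be checked with exact rational arithmetic. The sketch below assumes the standard Geometric moments (mean 1/p, variance (1 - p)/p²) and independence of the n + 1 Geometric variables, whose success probabilities are k/(n + 1) for k = 1, …, n + 1; the function names are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def expectation_formula(n: int) -> Fraction:
    """(n + 1) * sum_{i=0}^{n} 1 / (i + 1)."""
    return (n + 1) * sum(Fraction(1, i + 1) for i in range(n + 1))


def variance_bound(n: int) -> Fraction:
    """(n + 1)^2 * sum_{i=0}^{n} 1 / (i + 1)^2."""
    return (n + 1) ** 2 * sum(Fraction(1, (i + 1) ** 2) for i in range(n + 1))


def geometric_sum_moments(n: int) -> Tuple[Fraction, Fraction]:
    """Mean and variance of the sum of independent Geometric(k / (n + 1)) variables."""
    total = n + 1
    mean = Fraction(0)
    var = Fraction(0)
    for k in range(1, total + 1):
        p = Fraction(k, total)
        mean += 1 / p               # mean of Geometric(p)
        var += (1 - p) / p ** 2     # variance of Geometric(p)
    return mean, var


if __name__ == "__main__":
    mean, var = geometric_sum_moments(4)   # 5 distinct coupons
    print(mean, expectation_formula(4))
    print(var, variance_bound(4))
```

The mean matches the closed form exactly, while the variance stays below the bound because each term (1 - p)/p² is at most 1/p².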

Verification of Coupon Collector’s Problem

Theorem: Tail Distribution Bound (Markov’s Inequality)
⊢ ∀ n a. (0 < a) ⇒
  P {s | fst (coupon_collector (n + 1) s) ≥ a} ≤ $\frac{n+1}{a}\sum_{i=0}^{n}\frac{1}{i+1}$

Theorem: Tail Distribution Bound (Chebyshev’s Inequality)
⊢ ∀ n a. (0 < a) ⇒
  P {s | abs (fst (coupon_collector (n + 1) s) - expec (coupon_collector (n + 1))) ≥ a} ≤ $\frac{(n+1)^2}{a^2}\sum_{i=0}^{n}\frac{1}{(i+1)^2}$

q Proof
  q Markov’s and Chebyshev’s inequalities
  q Expectation and variance for Coupon Collector’s problem
  q Real Analysis
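The two tail bounds are straightforward to evaluate for concrete values. The following sketch computes them exactly; the function names are hypothetical, and the bounds are taken in the form (n + 1)/a · Σ 1/(i + 1) and (n + 1)²/a² · Σ 1/(i + 1)², with i ranging over 0, …, n.

```python
from fractions import Fraction


def markov_bound(n: int, a) -> Fraction:
    """Upper bound on P{X >= a} for the coupon collector with n + 1 coupons."""
    return Fraction(n + 1) / a * sum(Fraction(1, i + 1) for i in range(n + 1))


def chebyshev_bound(n: int, a) -> Fraction:
    """Upper bound on P{|X - E[X]| >= a} for the coupon collector with n + 1 coupons."""
    return Fraction((n + 1) ** 2) / (a * a) * sum(
        Fraction(1, (i + 1) ** 2) for i in range(n + 1)
    )


if __name__ == "__main__":
    # 5 coupons: probability of needing at least twice the expected number of trials.
    print(markov_bound(4, Fraction(137, 6)))
    # 5 coupons: probability of deviating from the mean by at least 10 trials.
    print(chebyshev_bound(4, 10))
```

A bound greater than 1 is vacuous; for small thresholds a, both expressions can exceed 1.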

Coupon Collector’s Problem - Summary

q Results exactly match the paper-and-pencil based analysis methods
  q 100% precise

q Analysis was based on the pre-existing formalization and verification of Geometric and Uniform random variables, linearity of expectation and variance properties, and Chebyshev’s and Markov’s inequalities
  q ~1000 lines of HOL code
  q ~100 man-hours

Outline

ü Introduction and Motivation

ü Probabilistic Theorem Proving

q Case Studies
  ü Coupon Collector’s Problem
  q Stop-and-Wait Protocol
  q Reconfigurable Memory Arrays

q Conclusions

Why Stop-and-Wait Protocol?

q Stop-and-Wait Protocol
  q Classical example of a real-time system

q Real-Time Systems
  q Involve a subtle interaction of a number of distributed components
  q Performance analysis is not straightforward

Both simulation and state-based formal techniques fail to produce reasonable results

Stop-and-Wait Protocol

q Message Delay for a single packet
  q Unpredictable Characteristic
  q Depends on the channel noise

q Channel Error probability: p
q Average Message Delay: ?

Formalization of the Stop-and-Wait Protocol

q source
  q data messages list

q dataS, dataR, ackS, ackR
  q (time → data message)

q sink, rem
  q (time → data message list)

[Figure: block diagram with three columns. Sender: DATA-TRANS, ACK_RECV. Channel: DATA-CHAN, ACK-CHAN (annotated "Bernoulli Random Variable"). Receiver: DATA-RECV, ACK-TRANS. Signals: source, rem t, sink, dataS, dataR, ackS, ackR.]

Formalization of the Stop-and-Wait Protocol

⊢ ∀ in out del p bseqt. DATA_CHAN in out del p bseqt =
  ∀ t. (if t < del then (* If current time is less than the channel delay *)
      (out t = set_non_packet) ∧ (* No output *)
      (bseqt (t + 1) = bseqt t) (* Boolean seq. retains its value *)
    else (if good_packet (in (t - del)) then (* If a good packet arrives *)
      (if ¬fst (prob_bern p (bseqt t)) then (* If no noise effect *)
        (out t = in (t - del)) ∧ (* Packet reaches output *)
        (bseqt (t + 1) = snd (prob_bern p (bseqt t))) (* Update Boolean seq. *)
      else (out t = set_non_packet) ∧ (* No output *)
        (bseqt (t + 1) = snd (prob_bern p (bseqt t)))) (* Update Boolean seq. *)
    else (out t = set_non_packet) ∧ (* No output *)
      (bseqt (t + 1) = bseqt t))) (* Boolean seq. retains its value *)

Definition: Data Channel
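The behaviour described by the data-channel definition can be illustrated with a small executable sketch. This is a simulation under simplifying assumptions, not the HOL model: `None` stands in for `set_non_packet`, a pseudo-random generator replaces the infinite Boolean sequence `bseqt`, and the function name `data_chan` and its list-based signature are hypothetical.

```python
import random
from typing import List, Optional


def data_chan(inp: List[Optional[str]], delay: int, p: float, steps: int,
              rng: random.Random) -> List[Optional[str]]:
    """Simulate a noisy channel: each good packet is lost with probability p."""
    out: List[Optional[str]] = []
    for t in range(steps):
        # Packet that entered the channel `delay` time units ago, if any.
        arrived = inp[t - delay] if 0 <= t - delay < len(inp) else None
        if t < delay or arrived is None:
            out.append(None)       # no output; no random bit is consumed
        elif rng.random() < p:
            out.append(None)       # Bernoulli(p) trial is True: noise destroys the packet
        else:
            out.append(arrived)    # packet reaches the output
    return out


if __name__ == "__main__":
    packets = ["m0", None, "m1"]
    print(data_chan(packets, delay=2, p=0.0, steps=5, rng=random.Random(0)))
    print(data_chan(packets, delay=2, p=0.3, steps=5, rng=random.Random(0)))
```

As in the definition, randomness is consumed only when a good packet is present at the channel output time.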

Formalization of the Stop-and-Wait Protocol

Definition: Stop-and-Wait Protocol
STOP_WAIT source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p =
  DATA_TRANS ws sn dataS s rem i ackS tout tf dtout dtf ∧
  DATA_CHAN dataS dataR d tprop p bseqt ∧
  DATA_RECV sn dataR sink r ∧
  ACK_TRANS sn ackR r ackty ack msg ta dta rec_flag ∧
  ACK_CHAN ackR ackS d tprop ∧
  ACK_RECV ws sn ackS rem s ∧
  INIT source rem s sink r i ackR dtout dtf dta tout tf ta rec_flag bseqt bseq

[Figure: block diagram. Sender: DATA-TRANS, ACK_RECV. Channel: DATA-CHAN, ACK-CHAN. Receiver: DATA-RECV, ACK-TRANS. Signals: source, sink, dataS, dataR, ackS, ackR.]

Functional Verification of the Stop-and-Wait Protocol

q Ensure reliable data transfer from the sender to the receiver

Definition: Functional Requirement for the Stop-and-Wait Protocol
⊢ ∀ source sink. REQ source sink =
  (∃ t. sink t = source) ∧ ∀ t n. is_prefix (sink t) (sink (t + n))

[Figure: block diagram. Sender: DATA-TRANS, ACK_RECV. Channel: DATA-CHAN, ACK-CHAN. Receiver: DATA-RECV, ACK-TRANS. Signals: source, sink, dataS, dataR, ackS, ackR.]

Functional Verification of the Stop-and-Wait Protocol

Theorem: Functional Correctness for the Stop-and-Wait Protocol
⊢ ∀ source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p.
  STOP_WAIT source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p ∧
  LIVE_ASSUMPTION abort (* Liveness constraint: data will eventually be received *)
  ⇒ REQ source sink

q The formal model of the Stop-and-Wait protocol implies the functional requirement

q Proof
  q Induction on the source list
  q Stop-and-Wait protocol definition

Performance Analysis – Message Delay

q Transmission Trial (Noiseless Channel): $t_f + t_a + 2(t_{prop} + t_{proc})$

q Transmission Trial (Channel Error): $t_f + t_{out}$

q Message Delay: $(t_f + t_{out})(G_{(1-p)} - 1) + t_f + t_a + 2(t_{prop} + t_{proc})$

Performance Analysis in HOL – Message Delay

q Message Delay is a random variable
  q Time required to successfully transmit a single data message

q rem t: remaining portion of the source list at time t
q @t: a t such that (Hilbert choice operator)
q bseqt t: infinite Boolean sequence at time t

Definition: Message Transmission Delay
⊢ ∀ rem source bseqt. MSG_DELAY rem source bseqt =
  ((@t. (rem t = TL source) ∧ (rem (t - 1) = source)),
   bseqt (@t. (rem t = TL source) ∧ (rem (t - 1) = source)))

Performance Analysis in HOL – Message Delay

Noisy Channel: $(t_f + t_{out})(G_{(1-p)} - 1) + t_f + t_a + 2(t_{prop} + t_{proc})$

Theorem: Message Delay for the Stop-and-Wait Protocol
⊢ ∀ source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p.
  STOP_WAIT source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p ∧
  ¬(NULL source) ∧ (* Source always has some data to be transmitted *)
  tprop + 1 + ta + tprop + 1 ≤ tout ∧ (* tout is at least the round-trip delay of a message *)
  LIVE_ASSUMPTION abort ∧ 0 ≤ p ∧ p < 1
  ⇒ (MSG_DELAY rem source bseqt =
      ((tf + tout) * (fst (prob_geom (1 - p) bseq) - 1) + (tf + ta + 2 * (tprop + tproc)),
       snd (prob_geom (1 - p) bseq)))

q Proof
  q Stop-and-Wait protocol definition
  q Geometric random variable properties

Performance Analysis in HOL – Average Message Delay

Average Message Delay: $(t_f + t_{out})\frac{p}{1-p} + t_f + t_a + 2(t_{prop} + t_{proc})$

q Proof
  q Stop-and-Wait protocol definition
  q Expectation and Geometric random variable properties

⊢ ∀ source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p.
  STOP_WAIT source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p ∧
  ¬(NULL source) ∧ tprop + 1 + ta + tprop + 1 ≤ tout ∧ LIVE_ASSUMPTION abort ∧ 0 ≤ p ∧ p < 1
  ⇒ expec (MSG_DELAY rem source bseqt) = (tf + tout) * (p/(1-p)) + (tf + ta + 2 * (tprop + tproc))

Theorem: Average Message Delay for the Stop-and-Wait Protocol
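The average-delay expression follows from the message-delay expression because a Geometric random variable with success probability 1 - p has mean 1/(1 - p), so the expected number of failed trials is p/(1 - p). The sketch below, with hypothetical function names and arbitrary illustrative parameter values, evaluates the closed form and compares it with a seeded Monte Carlo estimate.

```python
import random


def average_message_delay(tf: float, tout: float, ta: float,
                          tprop: float, tproc: float, p: float) -> float:
    """Closed-form average delay: (tf + tout) * p / (1 - p) + tf + ta + 2 (tprop + tproc)."""
    return (tf + tout) * p / (1 - p) + tf + ta + 2 * (tprop + tproc)


def sample_message_delay(tf: float, tout: float, ta: float, tprop: float,
                         tproc: float, p: float, rng: random.Random) -> float:
    """Draw one message delay: failed trials cost tf + tout, the last trial succeeds."""
    trials = 1
    while rng.random() < p:   # each transmission trial fails with probability p
        trials += 1
    return (tf + tout) * (trials - 1) + tf + ta + 2 * (tprop + tproc)


if __name__ == "__main__":
    params = dict(tf=2, tout=10, ta=1, tprop=3, tproc=1, p=0.5)
    rng = random.Random(2009)
    samples = [sample_message_delay(rng=rng, **params) for _ in range(20_000)]
    print("closed form:", average_message_delay(**params))
    print("simulation :", sum(samples) / len(samples))
```

The parameter values satisfy the timeout assumption of the theorem, tprop + 1 + ta + tprop + 1 ≤ tout.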

Stop-and-Wait Protocol - Summary

q Performance Analysis Results exactly match the paper-and-pencil based analysis methods
  q 100% precise

q Analysis was based on the pre-existing formalization and verification of Geometric and Bernoulli random variables and expectation properties
  q ~6000 lines of HOL code
  q ~300 man-hours

q A single Stop-and-Wait protocol model was used for both Performance Analysis and Functional Verification

Outline

ü Introduction and Motivation

ü Probabilistic Theorem Proving

ü Case Studies
  ü Coupon Collector’s Problem
  ü Stop-and-Wait Protocol
  q Reconfigurable Memory Arrays

q Conclusions

Motivation

[Figure: memory fault types: Neighborhood Pattern Sensitive Faults, Transition Faults, Stuck-at Faults, Coupling Faults.]

q Solution
  q Add Redundancy
  q Make Memory Reconfigurable

q How much redundancy?
  q Probabilistic Techniques using Computer Simulation
    § Inaccurate

q Proposed Solution
  q Theorem Proving

Reconfigurable Memory Arrays

q Memories fabricated with spare rows and columns
  q Spares can be reconfigured to replace rows and columns with fabrication faults

q Repairability
  q A memory array is repairable if a combination of spare rows and columns exists such that all faults from the memory array can be eliminated

Reconfigurable Memory Arrays

q Repairability is judged based on Probabilistic Techniques

q 3 Step Process
  q Model fault occurrence behavior with an appropriate random variable
  q Estimate statistical information regarding the number of faults, such as the average number of faults
  q A memory is termed repairable if the available spare rows and columns can fix all the estimated faults with probability 1

Stuck-at Faults

q Most common Fabrication Fault
q Occurs when a memory cell never changes its state, i.e., it is always stuck in one state
  q Stuck-at 1 Fault
  q Stuck-at 0 Fault

Formal Stuck-at Fault Model for Reconfigurable Memory Arrays

q Memory Array modeled as a bipartite graph (R,C,F)
  q R: set of vertices representing the memory rows
  q C: set of vertices representing the memory columns
  q F: set of edges, where each edge in this set represents a Stuck-at fault in the memory array and connects one vertex in R to a vertex in C

q Assumption
  q Faults are independent and identically distributed with probability p

Formal Stuck-at Fault Model for Reconfigurable Memory Arrays

[Figure: an n × n memory array (Number of Rows = n, Number of Columns = n) with sr = a n spare rows and sc = b n spare columns, shown next to its bipartite graph with row vertices ri, rj, rk, column vertices cp, cq, cr, and fault edges F = {e1, e2, e3, e4}.]

Formal Stuck-at Fault Model for Reconfigurable Memory Arrays

q The repair probability of a memory array is defined as $\Pr(|F| \le sr + sc)$, where Pr and |F| represent the probability function and the cardinality of the set F, respectively, and sr and sc are the numbers of spare rows and spare columns

q The repair probability for a square memory array is given by $\Pr(|F| \le (a+b)n)$, where $a = \frac{sr}{n}$ and $b = \frac{sc}{n}$

q Our goal is to verify that the memory array is almost always repairable if the stuck-at fault occurrence probability is $p = \frac{a+b}{n} - \frac{w(n)}{n\sqrt{n}}$, where $w(n) \to \infty$ as $n \to \infty$

Higher-order-logic Formalization

q mem_fault_model accepts three parameters: the cardinalities of the sets R and C and the probability of fault occurrence p

q It returns the total number of stuck-at faults in the memory array

q It performs a Bernoulli(p) trial for each cell in the memory and returns the number of True outcomes obtained

Definition: Stuck-At Fault Memory Model
⊢ (∀ p. mem_fault_model_helper 0 p = unit 0) ∧
  ∀ c p. mem_fault_model_helper (c + 1) p =
    bind (mem_fault_model_helper c p) (λa. bind (prob_bern p) (λb. unit (if b then (a + 1) else a)))
⊢ (∀ c p. mem_fault_model 0 c p = unit 0) ∧
  ∀ r c p. mem_fault_model (r + 1) c p =
    bind (mem_fault_model r c p) (λa. bind (mem_fault_model_helper c p) (λb. unit (a + b)))
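The recursive definitions amount to counting successes over r × c independent Bernoulli(p) trials. A minimal Python sketch of that reading follows; the monadic `bind`/`unit` plumbing over the Boolean sequence is replaced by a pseudo-random generator, which is a simplifying assumption, and the Python signatures are hypothetical.

```python
import random


def mem_fault_model_helper(c: int, p: float, rng: random.Random) -> int:
    """Number of faulty cells in one row of c cells (one Bernoulli(p) trial per cell)."""
    faults = 0
    for _ in range(c):
        if rng.random() < p:   # Bernoulli(p) trial returned True: stuck-at fault
            faults += 1
    return faults


def mem_fault_model(r: int, c: int, p: float, rng: random.Random) -> int:
    """Total number of stuck-at faults in an r x c memory array."""
    return sum(mem_fault_model_helper(c, p, rng) for _ in range(r))


if __name__ == "__main__":
    rng = random.Random(42)
    r, c, p = 64, 64, 0.01
    runs = [mem_fault_model(r, c, p, rng) for _ in range(200)]
    print("sample mean:", sum(runs) / len(runs), "binomial mean r*c*p:", r * c * p)
```

The comparison against r·c·p reflects the Binomial view of the fault count used in the subsequent slides.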

Higher-order-logic Formalization

Definition: Stuck-at Fault Memory Model for Repairability Problem
⊢ ∀ n a b w. mem_fault_model_rep n a b w = mem_fault_model n n $\left(\frac{a+b}{n} - \frac{w(n)}{n\sqrt{n}}\right)$

q Function mem_fault_model_rep accepts four parameters:
  q Cardinality of sets R and C of a square reconfigurable memory array as a natural number n
  q The fractions of spare rows and columns as real numbers a and b
  q Real sequence w of type (natural → real)

q Utilizes mem_fault_model and returns the number of stuck-at faults for the specific case of a square n x n memory array with fault occurrence probability $\frac{a+b}{n} - \frac{w(n)}{n\sqrt{n}}$

Alternate Expression for the Number of Faults

q The alternate expression is expressed in terms of the Binomial random variable

q Easy to use as we do not have to deal with the recursive definition

Lemma: Number of Stuck-at Faults in terms of Binomial R.V.
⊢ ∀ n a b w. mem_fault_model_rep n a b w = prob_bino $n^2$ $\left(\frac{a+b}{n} - \frac{w(n)}{n\sqrt{n}}\right)$

q Proof
  q Independence and identically distributed stuck-at faults assumptions
  q Formal definitions of Bernoulli and Binomial random variables

Statistical Property 1

Theorem: Average Number of Stuck-at Faults
⊢ ∀ n a b w. (0 ≤ a) ∧ (a ≤ 1) ∧ (0 ≤ b) ∧ (b ≤ 1) ∧ (1 < n) ∧ (∀ n. (0 < w(n)) ∧ (w(n) < (a+b)√n)) ⇒
  expec (λs. mem_fault_model_rep n a b w s) = $n^2\left(\frac{a+b}{n} - \frac{w(n)}{n\sqrt{n}}\right)$

q Assumptions
  q Fractions (a,b) are bounded by the interval [0,1]
  q 1<n to ensure that the memory array has more than one cell
  q Bounds on w(n) ensure that the fault probability falls within the interval [0,1]
    § No such restriction is placed on w(n) in paper-and-pencil analysis

q Proof
  q Expectation of Binomial random variable

Statistical Property 2

Theorem: Variance of Stuck-at Faults
⊢ ∀ n a b w. (0 ≤ a) ∧ (a ≤ 1) ∧ (0 ≤ b) ∧ (b ≤ 1) ∧ (1 < n) ∧ (∀ n. (0 < w(n)) ∧ (w(n) < (a+b)√n)) ⇒
  variance (λs. mem_fault_model_rep n a b w s) = $n^2\left(\frac{a+b}{n} - \frac{w(n)}{n\sqrt{n}}\right)\left(1 - \left(\frac{a+b}{n} - \frac{w(n)}{n\sqrt{n}}\right)\right)$

q Proof
  q Expectation and Variance of Binomial Random Variable

Statistical Property 3

Theorem: Tail Distribution Bound for Stuck-at Faults
⊢ ∀ n a b w. (0 ≤ a) ∧ (a ≤ 1) ∧ (0 ≤ b) ∧ (b ≤ 1) ∧ (1 < n) ∧ (∀ n. (0 < w(n)) ∧ (w(n) < (a+b)√n)) ⇒
  P {s | (fst (mem_fault_model_rep n a b w s)) ≤ (a+b)n} ≥ $1 - \frac{1}{n\,(w(n))^2}\; n^2\left(\frac{a+b}{n} - \frac{w(n)}{n\sqrt{n}}\right)\left(1 - \left(\frac{a+b}{n} - \frac{w(n)}{n\sqrt{n}}\right)\right)$

q Proof
  q Probability Axioms
  q Expectation and Variance of Binomial random variable
  q Chebyshev’s inequality
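To see how the tail bound behaves as the array grows, the sketch below evaluates the lower bound for increasing n. It is an illustration under assumptions: the fault probability is taken to be p = (a + b)/n - w(n)/(n√n), the bound is 1 - Var/(n·w(n)²) with the Binomial variance n²p(1 - p), and w(n) = ln n is an arbitrary choice of a sequence that tends to infinity and stays below (a + b)√n for a = b = 0.5. The function names are hypothetical.

```python
import math
from typing import Callable


def fault_probability(n: int, a: float, b: float, w: Callable[[int], float]) -> float:
    """Assumed stuck-at fault probability p = (a + b)/n - w(n)/(n * sqrt(n))."""
    return (a + b) / n - w(n) / (n * math.sqrt(n))


def repair_lower_bound(n: int, a: float, b: float, w: Callable[[int], float]) -> float:
    """Chebyshev-style lower bound on P{|F| <= (a + b) n}."""
    p = fault_probability(n, a, b, w)
    variance = n * n * p * (1 - p)          # variance of Binomial(n^2, p)
    return 1 - variance / (n * w(n) ** 2)   # deviation threshold is w(n) * sqrt(n)


if __name__ == "__main__":
    for n in (10, 100, 10_000, 1_000_000):
        print(n, repair_lower_bound(n, 0.5, 0.5, math.log))
```

The printed values move toward 1 as n grows, which is the behaviour the repairability result formalizes as a limit.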

Repairability Problem

q Repairability Problem: $\lim_{n\to\infty} \Pr\left(|F| \le (a+b)n\right) = 1$

Theorem: Repairability Problem of Stuck-at Faults
⊢ ∀ a b w. (0 ≤ a) ∧ (a ≤ 1) ∧ (0 ≤ b) ∧ (b ≤ 1) ∧ (∀ n. (0 < w(n)) ∧ (w(n) < (a+b)√n)) ∧ (lim (λn. 1/w(n)) = 0) ⇒
  (lim (λn. P {s | (fst (num_of_faults n a b w s)) ≤ (a+b)n}) = 1)

q Proof
  q Probability Axioms
  q Tail Distribution Bound Theorem
  q Real Analysis and Limit Theory

Reconfigurable Memory Array - Summary

q The Analysis Results exactly match the paper-and-pencil based analysis methods
  q 100% precise

q Analysis was based on the pre-existing formalization and verification of Bernoulli and Binomial random variables and Chebyshev’s inequality
  q ~1200 lines of HOL code
  q ~80 man-hours

Outline

ü Introduction and Motivation

ü Probabilistic Theorem Proving

ü Case Studies
  ü Coupon Collector’s Problem
  ü Stop-and-Wait Protocol
  ü Reconfigurable Memory Arrays

q Conclusions

Objectives

q Probabilistic Theorem Proving

q Why do we need it?
  § Exact answers (useful for the analysis of safety-critical applications)

q What is it?
  § Mathematically reason about probabilistic and statistical properties of a system using a computer-based theorem prover

q How can we apply it for the performance analysis of real-world applications?
  § Mathematically model (formalize) the system as a higher-order-logic function while modelling its random components with random variables
  § Formalize probabilistic and statistical properties as higher-order-logic theorems
  § Verify these theorems in a theorem prover

Conclusions

q Probabilistic Theorem Proving is not an alternative to approaches such as simulation or model checking

q Less critical sections of the system
  q Simulation

q Critical sections of the system that can be expressed as a Markov Chain and can be handled without the state-space explosion problem
  q Model Checking

q Critical sections of the system that cannot be handled by Model Checking
  q Theorem Proving

Ongoing and Future Work

q Theoretical Foundations
  q Probability Density Function (PDF)
  q Continuous random variables for which the CDF does not exist in a closed form
  q Variance and Moments for Continuous Random Variables
  q Multiple Continuous Random Variables
  q Discrete Time Markov Chains
  q Continuous Time Markov Chains

Ongoing and Future Work

q Applications
  q Algorithms
    § Birthday Paradox
    § Hiring Problem
    § Hat-Check Problem
    § Quicksort
  q Telecommunications
    § Automatic Repeat reQuest (ARQ) protocols
    § ARQ mechanism at the logical link control (LLC) layer of the General Packet Radio Service (GPRS)
    § Wireless Sensor Network Protocols
  q VLSI and Digital Design
    § Irrepairability Analysis of Reconfigurable Memory Arrays
    § Reliability of Digital Logic Circuits


Thank you!

q For More Information

q Visit our website §  http://hvg.ece.concordia.ca/Research/METH/PAHTP

q Contact §  [email protected]

§  [email protected]

q Ask Now!

References

q Motivation and a concise description of the tutorial
  q O. Hasan. Formal Probabilistic Analysis using Theorem Proving. Concordia University, Montreal, Canada, 2008.

q Formalization of Measure and Probability Theory and Discrete Random Variables
  q J. Hurd. Formal Verification of Probabilistic Algorithms. PhD Thesis, University of Cambridge, Cambridge, UK, 2002.

q Formalization of Continuous Random Variables
  q O. Hasan and S. Tahar. Formalization of the Standard Uniform Random Variable. Theoretical Computer Science, Vol. 382, No. 1, Elsevier, 2007, pp. 71-83.
  q O. Hasan and S. Tahar. Formalization of Continuous Probability Distributions. In: Automated Deduction, LNCS 4603, Springer Verlag, 2007, pp. 2-18.

References

q Statistical Properties of Discrete Random Variables
  q O. Hasan and S. Tahar. Verification of Expectation Properties for Discrete Random Variables in HOL. In: Theorem Proving in Higher-Order Logics, LNCS 4732, Springer Verlag, 2007, pp. 119-134.
  q O. Hasan and S. Tahar. Using Theorem Proving to Verify Expectation and Variance for Discrete Random Variables. Journal of Automated Reasoning, Vol. 41, No. 3-4, Springer Verlag, 2008, pp. 295-323.
  q O. Hasan and S. Tahar. Formal Verification of Tail Distribution Bounds in the HOL Theorem Prover. Mathematical Methods in the Applied Sciences, Vol. 32, No. 4, Wiley Interscience, March 2009, pp. 480-504.

q Statistical Properties of Continuous Random Variables
  q A. Coble. On Probability, Measure, and Integration in HOL4. Technical Report, Computing Laboratory, University of Cambridge, UK, 2009, http://www.srcf.ucam.org/~arc54/techreport.pdf.
  q O. Hasan, N. Abbasi, B. Akbarpour, S. Tahar and R. Akbarpour. Formal Reasoning about Expectation Properties for Continuous Random Variables. Formal Methods, Eindhoven, Netherlands, November 2009. (To appear)

References

q Applications
  q O. Hasan and S. Tahar. Performance Analysis of ARQ Protocols using a Theorem Prover. In: IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS'08), IEEE Computer Society, Austin, Texas, USA, April 2008, pp. 85-94.
  q O. Hasan, N. Abbasi and S. Tahar. Formal Probabilistic Analysis of Stuck-at Faults in Reconfigurable Memory Arrays. In: Integrated Formal Methods, LNCS 5423, Springer Verlag, 2009, pp. 277-291.
  q O. Hasan and S. Tahar. Performance Analysis and Functional Verification of the Stop-and-Wait Protocol in HOL. Journal of Automated Reasoning, Vol. 42, No. 1, Springer Verlag, January 2009, pp. 1-33.
  q O. Hasan and S. Tahar. Probabilistic Analysis of Wireless Systems using Theorem Proving. Electronic Notes in Theoretical Computer Science, Vol. 242, No. 2, Elsevier, July 2009, pp. 43-58.