CDM

Boolean Decision Diagrams

Klaus Sutner

Carnegie Mellon University

Fall 2019

1 Boolean Functions

• Minimal Automata

• Binary Decision Diagrams

• Reduced Ordered BDDs

Boolean Functions

Let us return to the two-element Boolean algebra B = {ff, tt}. To lighten notation we will overload and write 2 = {0, 1} instead.

Definition: A Boolean function of arity n is any function of the form

$2^n \to 2$

Note that we only consider single outputs here, $f : 2^n \to 2$ rather than $f : 2^n \to 2^m$.

Slightly more vexing is the question of whether one should include n = 0. We’ll fudge things a bit.

There Are Lots

Obviously there are $2^{2^n}$ Boolean functions of arity n.

n    $2^{2^n}$
1    4
2    16
3    256
4    65536
5    $4.3 \times 10^{9}$
6    $1.8 \times 10^{19}$
7    $3.4 \times 10^{38}$
8    $1.2 \times 10^{77}$

$2^{2^{20}} \approx 6.7 \times 10^{315652} = \infty$, for all practical intents and purposes.

Since there are so many, it is a good idea to think about how to describe and construct these functions.

Different Angles

Combinatorics: $2^n \to 2$

Logic: propositional formula

Circuits: logic gates

Datatype: BDDs

In a sense, this is all the same, but different perspectives lead to different ideas and very different algorithms.

For example, from the logic perspective it is natural to generate tautologies, a boring idea from the viewpoint of circuits.

The Challenge Problems

The propositional logic angle immediately suggests the following computational problems.

Problem: Tautology
Instance: A propositional formula ϕ.
Question: Is ϕ a tautology?

Problem: Satisfiability
Instance: A propositional formula ϕ.
Question: Is ϕ satisfiable?

Problem: Equivalence
Instance: Two propositional formulae ϕ and ψ.
Question: Are ϕ and ψ equivalent?

Alas, we know from complexity theory that these are all hard (most likely).

Connections

In principle, there really is only one problem to solve here:

ϕ is a tautology iff ¬ϕ fails to be satisfiable

ϕ is a tautology iff ϕ is equivalent to ⊤

ϕ and ψ are equivalent iff ϕ ⇔ ψ is a tautology

However, some care is necessary to deal with normal forms, which are often required by specific algorithms (such as DPLL). In this world, ϕ is very different from ¬ϕ.

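Implicational Tautologies

Here are a few tautologies involving implication, conjunction and disjunction. A human being with basic knowledge of propositional logic can immediately see that these are tautologies.

$\varphi \Rightarrow \psi_1 \land \psi_2 \equiv (\varphi \Rightarrow \psi_1) \land (\varphi \Rightarrow \psi_2)$

$\varphi \Rightarrow \psi_1 \lor \psi_2 \equiv (\varphi \Rightarrow \psi_1) \lor (\varphi \Rightarrow \psi_2)$

$\psi_1 \land \psi_2 \Rightarrow \varphi \equiv (\psi_1 \Rightarrow \varphi) \lor (\psi_2 \Rightarrow \varphi)$

$\psi_1 \lor \psi_2 \Rightarrow \varphi \equiv (\psi_1 \Rightarrow \varphi) \land (\psi_2 \Rightarrow \varphi)$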

Implicational Tautologies, II

p ⇒ p    (identity)

p ⇒ q ⇒ p    (introduction)

(p ⇒ q ⇒ r) ⇒ (q ⇒ p ⇒ r)    (interchange)

(p ⇒ q) ⇒ (q ⇒ r) ⇒ p ⇒ r    (cut)

(p ⇒ q ⇒ r) ⇒ (p ⇒ q) ⇒ p ⇒ r    (Frege cut)

((p ⇒ q) ⇒ p) ⇒ p    (Peirce)

Threshold Functions

On occasion, Boolean functions have a clear combinatorial meaning and are also easy to understand.

Definition: A threshold function $\mathrm{thr}^n_m$, $0 \le m \le n$, is an n-ary Boolean function defined by

$\mathrm{thr}^n_m(x) = \begin{cases} 1 & \text{if } \#(i \mid x_i = 1) \ge m, \\ 0 & \text{otherwise.} \end{cases}$

Lots of Boolean functions can be defined in terms of threshold functions.

$\mathrm{thr}^n_0$ is the constant tt.

$\mathrm{thr}^n_1$ is n-ary disjunction.

$\mathrm{thr}^n_n$ is n-ary conjunction.

$\mathrm{thr}^n_k(x) \land \lnot\,\mathrm{thr}^n_{k+1}(x)$ is the counting function: “exactly k out of n.”
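The definition and the four identities above can be checked mechanically for small n. The following is a minimal sketch; the names `thr` and `exactly` are illustrative, and treating the out-of-range function for k = n as constantly 0 is an assumption made here so that "exactly n out of n" is defined.

```python
from itertools import product

def thr(n, m, x):
    """Threshold function: 1 iff at least m of the n inputs are 1."""
    assert len(x) == n and 0 <= m <= n
    return 1 if sum(x) >= m else 0

def exactly(n, k, x):
    """Counting function 'exactly k out of n'.

    Assumption: for k = n the upper threshold (m = n + 1) is taken to be 0.
    """
    upper = thr(n, k + 1, x) if k < n else 0
    return thr(n, k, x) & (1 - upper)

n = 4
for x in product((0, 1), repeat=n):
    assert thr(n, 0, x) == 1                 # constant tt
    assert thr(n, 1, x) == int(any(x))       # n-ary disjunction
    assert thr(n, n, x) == int(all(x))       # n-ary conjunction
    for k in range(n + 1):
        assert exactly(n, k, x) == int(sum(x) == k)
```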

Counting with 8 Variables

And Large Formulae?

The queens formula for a 4 × 4 board: $x_{ij} = 1$ iff there is a queen in position (i, j).

The two solutions are 0100000110000010 and 0010100000010100.
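For a board this small the two solutions can be confirmed by brute force over all $2^{16}$ assignments. The sketch below does not use the queens formula itself; the function `is_solution` states the placement constraints directly, and the row-major bit order is an assumption that happens to match the two strings quoted above.

```python
from itertools import product

N = 4

def is_solution(bits):
    """bits[i*N + j] == 1 iff there is a queen on square (i, j)."""
    queens = [(i, j) for i in range(N) for j in range(N) if bits[i * N + j]]
    if len(queens) != N:
        return False
    for a in range(N):
        for b in range(a + 1, N):
            (i1, j1), (i2, j2) = queens[a], queens[b]
            # No two queens share a row, a column or a diagonal.
            if i1 == i2 or j1 == j2 or abs(i1 - i2) == abs(j1 - j2):
                return False
    return True

solutions = ["".join(map(str, bits))
             for bits in product((0, 1), repeat=N * N)
             if is_solution(bits)]
print(solutions)
```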

A Challenge

It is entirely straightforward to implement Boolean formulae as labeled trees, essentially the parse trees of the corresponding expressions.

Straightforward, but not necessarily algorithmically smart: it completely ignores all the special properties of Boolean formulae (as opposed to some other terms, even other terms of the same signature).

So is there a special purpose data structure that is custom designed just for Boolean formulae?

More Precisely . . .

We would like our implementation to make it easy to deal with decision problems such as Tautology or Satisfiability, at least in some cases.

Obviously there are complexity-theoretic constraints: if the implementation worked perfectly all the time, the problems would presumably drop down to P, an unlikely event.

So what we want is a data structure, plus attendant algorithms, that works well in some interesting cases (but we fully expect failure in others).

One plausible line of attack is to use some kind of canonical normal form: if we represent functions in normal form then equivalence, for example, comes down to equality (or perhaps isomorphism).

• Boolean Functions

2 Minimal Automata

• Binary Decision Diagrams

• Reduced Ordered BDDs

Boolean Functions as Languages

Suppose we have some Boolean function $f : 2^n \to 2$. In the spirit of disjunctive normal form, we can represent f by the collection of all the Boolean vectors $a \in 2^n$ such that f(a) = 1, the fibre of 1.

But then we can think of this collection as a binary language:

$L_f = \{\, a \in 2^n \mid f(a) = 1 \,\} \subseteq 2^n$

This language is, of course, finite, so there is a finite state machine that accepts it. In fact, the partial minimal DFA is just a DAG: all words in the language correspond to a fixed-length path from the initial state to the unique final state.

UnEqual

We already know an example: the UnEqual language

$L_k = \{\, uv \in 2^{2k} \mid u \ne v \in 2^k \,\}$

It turns out that the state complexity of $L_k$ is $3 \cdot 2^k + k - 2 = \Theta(2^k)$.

This is uncomfortably large since this Boolean function has a very simple description as a propositional formula:

$\mathrm{UE}(x) = \bigvee_i \, (x_i \oplus x_{i+k})$

This formula clearly has size Θ(k).

The Fix

The reason our DFA is large, while the formula is small, is that the machine has to contend with input bits $x_i$ and $x_{i+k}$ being far apart.

By contrast, in the formula (rather, the parse tree), they are close together.

This suggests a way to get around this problem. Change the language to

$K_k = \{\, x \in 2^{2k} \mid \exists\, i \ (x_{2i} \ne x_{2i+1}) \,\}$

assuming 0-indexing. In other words, the corresponding formula is now

$\mathrm{UE}'(x) = \bigvee_i \, (x_{2i} \oplus x_{2i+1})$

You might worry that these shenanigans won’t generalize, and you would be right. But let’s ignore this for the time being.

Much Better

The state complexity of $K_k$ is much smaller than that of $L_k$: 5k.

[Figure: the minimal DFA for $K_2$.]
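For small k the state complexities of both languages can be computed directly from the Myhill-Nerode characterization: the states of the minimal (total) DFA correspond to the distinct residual languages. The sketch below is a brute-force illustration only; the function names are made up for this example, and counting the sink as a state is an assumption that matches the value 5k for $K_k$.

```python
from itertools import product

def state_complexity(language, n):
    """Number of states of the minimal total DFA of a language of words of length n.

    States correspond to the distinct residuals { s : ws in L } over all
    prefixes w. The empty residual is the sink; it always exists because
    every word longer than n is rejected.
    """
    residuals = set()
    for d in range(n + 1):
        for letters in product("01", repeat=d):
            w = "".join(letters)
            res = frozenset(v[d:] for v in language if v.startswith(w))
            if res:
                residuals.add(res)
    return len(residuals) + 1  # +1 for the sink

def unequal(k):
    """L_k: words uv with |u| = |v| = k and u != v."""
    words = ("".join(b) for b in product("01", repeat=2 * k))
    return {w for w in words if w[:k] != w[k:]}

def unequal_interleaved(k):
    """K_k: words x of length 2k with x[2i] != x[2i+1] for some i."""
    words = ("".join(b) for b in product("01", repeat=2 * k))
    return {w for w in words if any(w[2 * i] != w[2 * i + 1] for i in range(k))}

for k in (1, 2, 3):
    print(k, state_complexity(unequal(k), 2 * k),
          state_complexity(unequal_interleaved(k), 2 * k))
```

The enumeration is exponential in k, so it is only meant as a sanity check for the first few values.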

The Complement

[Figure: the partial minimal DFA for the negation of the UnEqual function (also known as the Equal function).]

The Complement?

Note that complementation here is a bit weird: by flipping final and non-final states in a DFA M we get a DFA M′ such that

$L(M') = 2^\star - L(M).$

But that’s not really what we want for f; instead we need M′ such that

$L(M') = 2^n - L(M).$

We need to fix things up a bit.

The same holds for product automata for unions and intersections when the variables are not the same. For example, one needs to be careful when constructing the machine for the conjunction of, say, f(x, z) and g(x, y, z, u).

Example: n = 4, #12346

The obvious tree automaton on 31 states for the Boolean function

$(\lnot x_1 \land \lnot x_2 \land x_4) \lor (x_2 \land \lnot x_3)$

Trim Part

Minimal Partial

MDA

The good news is this: because of the uniqueness of the minimal DFA, we do have some kind of canonical normal form. In particular, to check whether two Boolean functions are equivalent, we can check their minimal DFAs for isomorphism. This representation is known as a minimal deterministic automaton (MDA).

Still, from the previous comments it looks like this is not quite the right setting.

Here is another nuisance: the DFA for $K_k$ should not have the chain on the right: it should just “output” 1 when we get over there and be done. This is justified since $\mathrm{UE}'(0, 1, \mathbf{y}) = 1$ no matter what $\mathbf{y}$ is.

Four Arguments

Here are the frequencies of the state complexities of all the minimal automata associated with the 65536 Boolean functions of 4 arguments.

state complexity    frequency
1                   1
6                   81
7                   162
8                   828
9                   3264
10                  11040
11                  22440
12                  27720

Recall that the natural tree automata all have size 31.

The automaton of size 1 corresponds to the constant false function; the others actually describe languages $L \subseteq 2^4$.
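A table of this kind can be recomputed by enumerating all 65536 truth tables. The sketch below represents each function as a 16-bit integer and counts residual languages as aligned blocks of the truth table; the representation and the names are assumptions of this example, and the sink is counted as a state.

```python
from collections import Counter

N = 4                      # number of variables
BITS = 1 << N              # a truth table is a 16-bit integer

def state_complexity(table):
    """States of the minimal total DFA for the language {a : f(a) = 1}.

    A prefix of length d selects an aligned block of 2^(N-d) bits of the
    truth table; two prefixes are equivalent iff their blocks agree.
    All-zero blocks correspond to the sink, which is counted once.
    """
    blocks = set()
    for d in range(N + 1):
        width = BITS >> d
        mask = (1 << width) - 1
        for i in range(1 << d):
            block = (table >> (i * width)) & mask
            if block:
                blocks.add((width, block))
    return len(blocks) + 1

histogram = Counter(state_complexity(t) for t in range(1 << BITS))
for size in sorted(histogram):
    print(size, histogram[size])
```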

• Boolean Functions

• Minimal Automata

3 Binary Decision Diagrams

• Reduced Ordered BDDs

Improvements

One might conjecture that we have to relax the constraints on our diagrams a bit: more choices should make it easier to build them (of course, while preserving normal form properties).

One fairly natural idea is to drop the fixed-length condition in an MDA: we may reach the exit without reading all the variables (in fact, always as soon as enough information has been accumulated to determine the value of the function).

The Key Papers

C. Y. Lee, Binary Decision Programs, Bell System Tech. J. 38 (1959) 4: 985–999.

S. B. Akers, Binary Decision Diagrams, IEEE Trans. Computers C-27 (1978) 6: 509–516.

R. E. Bryant, Graph-based Algorithms for Boolean Function Manipulation, IEEE Trans. Computers C-35 (1986) 8: 677–691.

Decision Trees

[Figure: the full decision tree testing x, then y, then z, with leaves a0, ..., a7; the two edges out of each node are labeled 0 and 1.]

A general purpose method: express a Boolean function as a decision tree.

Reductions

In its abstract form, there is not much one can do with a full decision tree.

But if we have concrete values for the $a_i$, things change.

We can reduce the data structure according to the following reduction rules.

• Merge terminal nodes with the same label.

• Merge nodes with the same descendants.

• Skip over nodes with just one child.

Except for the Skip part, this is the same as state merging in the realm of FSMs.

Example

[Figure: the full decision tree on x, y, z with edge labels 0 and 1 and leaf values 1 0 1 0 0 0 0 1 from left to right.]

The function $f = (x + \bar z)(\bar x + y)(\bar x + \bar y + z)$.

Preserving Sanity

[Figure: the same decision tree without edge labels; leaf values 1 0 1 0 0 0 0 1.]

A more legible version.

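Merging Terminals

[Figure: the tree after merging terminals; levels x / y, y / z, z, z, z / terminals 0 and 1.]

Merging Nodes

[Figure: after merging nodes; levels x / y, y / z, z, z / terminals 0 and 1.]

Skipping Nodes

[Figure: after skipping nodes; levels x / y / z, z, z / terminals 0 and 1.]

Final Result

[Figure: the final diagram; levels x / y / z, z / terminals 0 and 1.]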
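The three reduction rules can be sketched as a single bottom-up pass over the leaf values 1 0 1 0 0 0 0 1 of the example. This is an illustration under assumptions: nodes are plain integers, the helper `mk` and the table layout are invented for this sketch, and "a node with just one child" is read as a node whose two branches lead to the same place.

```python
def reduce_tree(leaves, variables):
    """Reduce the full decision tree with the given leaf values (left to right).

    Nodes are integers: 0 and 1 are the terminals, internal nodes get ids >= 2.
    Returns (root, table) where table maps id -> (variable, lo, hi).
    """
    unique = {}   # (variable, lo, hi) -> id; merges nodes with the same descendants
    table = {}

    def mk(var, lo, hi):
        if lo == hi:                       # skip a node whose two branches agree
            return lo
        key = (var, lo, hi)
        if key not in unique:
            unique[key] = len(unique) + 2
            table[unique[key]] = key
        return unique[key]

    # Terminals with equal labels are merged automatically because they
    # are just the integers 0 and 1.
    level = list(leaves)
    for var in reversed(variables):
        level = [mk(var, level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0], table

root, table = reduce_tree([1, 0, 1, 0, 0, 0, 0, 1], ["x", "y", "z"])
for node, (var, lo, hi) in sorted(table.items()):
    print(node, var, lo, hi)
```

The resulting table has one node for x, one for y and two for z, in line with the final diagram.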

Reality Check

Just to be clear: reducing a given decision tree is mostly of conceptual interest. The full decision tree always has exponential size, so we can only handle it for a few variables.

What we really need is a way to get at the reduced version directly: starting from the constant functions, use Boolean operators to construct more complicated ones.

All these intermediate functions are kept in reduced form, so hopefully none of them will exhibit exponential blowup.

Boole-Shannon Expansion

Here is a more algebraic description of this basic idea.

Suppose f is a Boolean function with variable x. Define the cofactors of f by

$f_x(\mathbf{u}, x, \mathbf{v}) = f(\mathbf{u}, 1, \mathbf{v})$

$f_{\bar x}(\mathbf{u}, x, \mathbf{v}) = f(\mathbf{u}, 0, \mathbf{v})$

Note that $f_x$ does not depend on x. Also $f_{xy} = f_{yx}$.

The internal nodes of our DAGs are based on the standard Boole-Shannon expansion

$f = x f_x + \bar x f_{\bar x}$
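The cofactor definitions and the expansion can be verified exhaustively on a small example. In the sketch below the helper `cofactor` and the sample function `f` are illustrative choices, not part of the lecture.

```python
from itertools import product

def cofactor(f, i, value):
    """Cofactor of f with respect to argument i: fix that argument to `value`."""
    return lambda *args: f(*args[:i], value, *args[i + 1:])

def f(x, y, z):
    # An arbitrary example function, chosen only for illustration.
    return (x & y) | (1 - z)

f_x = cofactor(f, 0, 1)      # positive cofactor: x := 1
f_notx = cofactor(f, 0, 0)   # negative cofactor: x := 0

for x, y, z in product((0, 1), repeat=3):
    # The cofactors ignore their x argument ...
    assert f_x(0, y, z) == f_x(1, y, z)
    # ... and the expansion f = x f_x + (not x) f_notx holds.
    assert f(x, y, z) == (x & f_x(x, y, z)) | ((1 - x) & f_notx(x, y, z))
    # Cofactors with respect to different variables commute.
    assert cofactor(f_x, 1, 1)(x, y, z) == cofactor(cofactor(f, 1, 1), 0, 1)(x, y, z)
```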

If-Then-Else

A good way to think about expansions is in terms of if-then-else:

$\mathrm{ite}(x, y_1, y_0) = x\, y_1 + \bar x\, y_0$

Together with the Boolean constants, if-then-else provides yet another basis:

¬x = ite(x, 0, 1)

x ∧ y = ite(x, y, 0)

x ∨ y = ite(x, 1, y)

x ⇒ y = ite(x, y, 1)

x ⊕ y = ite(x, ite(y, 0, 1), y)
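Each of these five identities can be confirmed by checking all inputs. A minimal sketch, with `ite` written out on the values 0 and 1:

```python
from itertools import product

def ite(x, y1, y0):
    """if x then y1 else y0, on values in {0, 1}."""
    return (x & y1) | ((1 - x) & y0)

for x, y in product((0, 1), repeat=2):
    assert ite(x, 0, 1) == 1 - x                 # negation
    assert ite(x, y, 0) == x & y                 # conjunction
    assert ite(x, 1, y) == x | y                 # disjunction
    assert ite(x, y, 1) == int(x <= y)           # implication
    assert ite(x, ite(y, 0, 1), y) == x ^ y      # exclusive or
```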

If-Then-Else Normal Form

It follows that we can define if-then-else normal form (INF): the only allowed operations are if-then-else and constants. Moreover, tests are performed only on variables (not compound expressions).

We can express Boole-Shannon expansion in terms of if-then-else:

$f = \mathrm{ite}(x, f_x, f_{\bar x})$

We also assume that the variables are given in a fixed order x, y, z . . . so that we can lighten notation a bit and write $f_{101}$ instead of $f_{x \bar y z}$.

Example

Consider the Boolean function $f \equiv (x_1 = y_1) \land (x_2 = y_2)$ (strictly speaking we should have written $x_i \Leftrightarrow y_i$, but that’s harder to read).

The INF of f is given by

$f = \mathrm{ite}(x_1, f_1, f_0)$

$f_0 = \mathrm{ite}(y_1, 0, f_{00})$

$f_1 = \mathrm{ite}(y_1, f_{11}, 0)$

$f_{00} = \mathrm{ite}(x_2, f_{001}, f_{000})$

$f_{11} = \mathrm{ite}(x_2, f_{111}, f_{110})$

$f_{000} = \mathrm{ite}(y_2, 0, 1)$

$f_{001} = \mathrm{ite}(y_2, 1, 0)$

$f_{110} = \mathrm{ite}(y_2, 0, 1)$

$f_{111} = \mathrm{ite}(y_2, 1, 0)$

Here the implicit order is $x_1, y_1, x_2, y_2$. Substituting back into the first line we get the INF.

Example

Sharing common subexpressions (such as $f_{000}$ and $f_{110}$):

$f = \mathrm{ite}(x_1, f_1, f_0)$

$f_0 = \mathrm{ite}(y_1, 0, f_{00})$

$f_1 = \mathrm{ite}(y_1, f_{00}, 0)$

$f_{00} = \mathrm{ite}(x_2, f_{001}, f_{000})$

$f_{000} = \mathrm{ite}(y_2, 0, 1)$

$f_{001} = \mathrm{ite}(y_2, 1, 0)$

We can now interpret these functions as nodes in a DAG just like the one obtained by reducing a decision tree.

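The DAG

[Figure: the fully reduced DAG; levels $f$ / $f_0$, $f_1$ / $f_{00}$ / $f_{000}$, $f_{001}$ / terminals 0 and 1.]

Same

[Figure: the DAG again, but this time labeled by the variables that are tested at each particular level; levels $x_1$ / $y_1$, $y_1$ / $x_2$ / $y_2$, $y_2$ / terminals 0 and 1.]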

Threshold

A fairly simple BDT for the counting function “at most 4 out of 10” (True and False nodes omitted). Number of nodes is 463, uncomfortably close to a full binary tree.

4 Queens

The INF for the 4-queens formula, and the corresponding BDT (False node omitted). Clearly, there are 2 solutions.

Petersen Graph

The Petersen graph is a standard source of examples and counterexamples in graph theory.

Hamiltonicity

How could we check whether an undirected graph is Hamiltonian using Boolean formulae?

Let V = [n] and introduce Boolean variables $p_{t,x}$ where $0 \le t \le n$, $1 \le x \le n$.

$p_{t,x} = 1$ iff the Hamiltonian cycle at time t passes through vertex x.

$p_{0,1} \land p_{n,1}$

$\forall\, t \ \exists!\, x \ p_{t,x}$

$\forall\, x > 1 \ \exists!\, t \ p_{t,x}$

$\forall\, t < n,\ x: \ p_{t,x} \Rightarrow p_{t+1,y_1} \lor \dots \lor p_{t+1,y_k}$

where $y_1, \dots, y_k$ is the neighborhood of x.

A similar formula encodes the existence of a Hamiltonian path instead.
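To see the four constraints at work, one can evaluate them on explicit assignments for a tiny graph. The sketch below is illustrative only: the graphs `cycle4` and `path4`, the dictionary layout of `p` and the function names are assumptions of this example. Since the second constraint forces exactly one vertex per time step, it suffices to try assignments that come from vertex sequences.

```python
from itertools import product

def is_model(p, n, neighbors):
    """Evaluate the four constraints on an assignment p[t][x], 0 <= t <= n, 1 <= x <= n."""
    def exactly_one(values):
        return sum(values) == 1
    start_end = p[0][1] and p[n][1]
    one_vertex_per_time = all(exactly_one(p[t][x] for x in range(1, n + 1))
                              for t in range(n + 1))
    one_time_per_vertex = all(exactly_one(p[t][x] for t in range(n + 1))
                              for x in range(2, n + 1))
    steps_follow_edges = all((not p[t][x]) or any(p[t + 1][y] for y in neighbors[x])
                             for t in range(n) for x in range(1, n + 1))
    return bool(start_end and one_vertex_per_time
                and one_time_per_vertex and steps_follow_edges)

def count_models(n, neighbors):
    """Count models among the assignments with exactly one vertex per time step."""
    count = 0
    for walk in product(range(1, n + 1), repeat=n + 1):
        p = [{x: int(walk[t] == x) for x in range(1, n + 1)} for t in range(n + 1)]
        count += is_model(p, n, neighbors)
    return count

cycle4 = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}   # the 4-cycle
path4 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}          # a path: no Hamiltonian cycle
print(count_models(4, cycle4), count_models(4, path4))
```

For the 4-cycle the models are the two directions of traversal starting at vertex 1; the path has none.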

Hamiltonian Cycle

A Boolean formula that is satisfiable iff the Petersen graph has a Hamiltonian cycle.

Converting it to a BDD produces False.

The corresponding formula for Hamiltonian paths has 24 truth assignments.

• Boolean Functions

• Minimal Automata

• Binary Decision Diagrams

4 Reduced Ordered BDDs

Binary Decision Diagrams

Fix a set $\mathrm{Var} = \{x_1, x_2, \dots, x_n\}$ of n Boolean variables.

Definition: A binary decision diagram (BDD) (over Var) is a rooted, directed acyclic graph with two terminal nodes (out-degree 0) and interior nodes of out-degree 2. The interior nodes are labeled in Var.

We write var(u) for the labels.

The two successors of an interior node u are traditionally referred to as lo(u) and hi(u), corresponding to the false and true branches.

We can think of the terminal nodes as being labeled by constants 0 and 1, indicating values false and true.

Ordered BDDs

Fix an ordering $x_1 < x_2 < \dots < x_n$ on Var. For simplicity assume that $x_i < 0, 1$.

Definition: A BDD is ordered (OBDD) if the label sequence along any path is ordered.

Thus var(u) < var(lo(u)), var(hi(u)).

In the corresponding INF, the variables are always ordered in the sense that

$\mathrm{ite}(x, \mathrm{ite}(y, f_1, f_2), \mathrm{ite}(z, f_3, f_4))$ implies x < y, z

Reduced Ordered BDDs

Definition: An OBDD is reduced (ROBDD) if it satisfies

Uniqueness: there are no nodes u ≠ v such that

var(u) = var(v), lo(u) = lo(v), hi(u) = hi(v)

Non-Redundancy: for all nodes u: lo(u) ≠ hi(u)

The uniqueness condition corresponds to shared subexpressions: we could merge u and v. Non-redundancy corresponds to taking shortcuts: we can skip ahead to the next test.

Since ROBDDs are the most important type, we simply refer to them as BDDs.

Canonicity

By a straightforward induction we can associate any BDD with root u with a Boolean function $F_u$:

$F_0 = 0 \qquad F_1 = 1$

$F_u = \mathrm{ite}(\mathrm{var}(u), F_{\mathrm{hi}(u)}, F_{\mathrm{lo}(u)})$

If the BDDs under consideration are also ordered and reduced we get a useful representation.

Theorem (Canonical Form Theorem)

For every Boolean function $f : B^n \to B$ there is exactly one ROBDD u such that $f = F_u$.

Proof

The claim is clear for n = 0, so consider a function $f : B^n \to B$, written $f(x, \mathbf{y})$.

By IH we may assume that the BDDs for $f(0, \mathbf{y})$ and $f(1, \mathbf{y})$ are unique, say, with roots u and v, respectively.

If $f(0, \mathbf{y}) = f(1, \mathbf{y})$ then u = v, and this is also the BDD for f.

Otherwise introduce a new node w and set

$\mathrm{var}(w) = x \qquad \mathrm{lo}(w) = u \qquad \mathrm{hi}(w) = v$

The resulting BDD represents f and is unique.

□

Reduction

Suppose we have a decision tree for a Boolean function and would like to transform it into the (unique) BDD.

We can use a bottom-up traversal of the DAG to merge or eliminate nodes that violate Uniqueness or Non-Redundancy.

This traversal requires essentially only local information and can be handled in (expected) linear time using a hash table.

Of course, if the starting point is indeed a full decision tree this method is of little interest. But reduction is important as a clean-up step in the construction of complicated BDDs from simpler ones.

Workspace Approach

Here is the key idea: suppose we construct all the Boolean functions in a particular computation from scratch, starting from the constants and applying the usual arsenal of Boolean connectives. All Boolean functions are kept in the same workspace.

In this scenario, all functions are kept in canonical form; if an intermediate result is not canonical, clean it up using the reduction machinery. As a consequence, every one of these functions has a unique representation: a particular node u in a BDD represents function $F_u$ (over some set of variables).

But then equivalence testing is O(1): we just compare two pointers. Very nice.

Operations on BDDs

Obviously we need a collection of operations on BDDs.

Reduce: Turn a decision tree into a BDD.

Apply: Given two BDDs u and v and a Boolean operation $\diamond$, determine the BDD for $F_u \diamond F_v$.

Restrict: Given a BDD u, a variable x and a value a, determine the BDD for $F_u[a/x]$.

Note: The purpose of Reduce is not to convert a raw (exponential size) decision tree, but to clean up after other operations.

Negation

The first operation to consider is simple negation: obtaining a BDD for $\bar f$ from a BDD for f.

Note that the two BDDs differ only in the terminal nodes, which are swapped.

This can be exploited to make complementation constant time.

Since all nodes in a BDD represent Boolean functions, it is natural to keep track of them via pointers. By adding a “negation bit” to the pointers (complement vs. regular pointers), we can obtain $\bar f$ at constant cost.

We can also add negation bits to all the internal edges of the diagram, indicating that the next node represents $\bar g$ rather than g.

Preserving Canonicity

In this setting, one can get away with a single terminal node, say, 1. Since all paths end at 1, evaluation comes down to counting the number of complement edges (modulo 2).

However, a little care is needed to preserve canonicity. We have

$f = x f_x + \bar x f_{\bar x} = \overline{\left( x\, \overline{f_x} + \bar x\, \overline{f_{\bar x}} \right)}$

so there are two ways to represent f. We adopt the convention that the then-branch is chosen to be regular.

With this convention, our BDDs are still canonical.

Ops

Cofactors coexist peacefully with Boolean operations.

Proposition

$(\bar f)_x = \overline{f_x}, \qquad (f + g)_x = f_x + g_x, \qquad (fg)_x = f_x g_x$

Hence, the apply operation can be handled by a recursive, top-down algorithm since

$\mathrm{ite}(x, s, t) \diamond \mathrm{ite}(x, s', t') = \mathrm{ite}(x, s \diamond s', t \diamond t')$

Since complementation is O(1), it even suffices to just implement, say, logical and.

Conjunction

So suppose we want to build the BDD for fg, given BDDs for f and g.

The algorithm uses top-down recursion. The exit conditions are easy:

condition            result
f = 0 or g = 0       0
f = 1                g
g = 1                f
f = g                g
$f = \bar g$         0

Note that they can all be checked in constant time.

Apply for Conjunction

handle terminal cases

suppose x is next variable

• compute $h_1 = f_x\, g_x$

• compute $h_2 = f_{\bar x}\, g_{\bar x}$

• if $h_1 = h_2$ return $h_1$

• return BDD for $\mathrm{ite}(x, h_1, h_2)$

With proper hashing the running time is O(|u||v|), though in practice the method is often faster.
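The recursion above, together with the workspace idea, can be sketched in a few lines. This is an illustration under assumptions, not the lecture's implementation: the class `Workspace` and its methods are invented here, nodes are integer ids rather than pointers, and complement edges are left out, so the exit condition for a function and its complement is not available.

```python
class Workspace:
    """All BDD nodes live in one table, so equal functions get equal node ids."""

    def __init__(self, num_vars):
        self.num_vars = num_vars
        # Terminals 0 and 1 carry the pseudo-variable num_vars (below all others).
        self.node = {0: (num_vars, None, None), 1: (num_vars, None, None)}
        self.unique = {}      # (var, lo, hi) -> id
        self.and_cache = {}   # (u, v) -> id of the BDD for the conjunction

    def mk(self, var, lo, hi):
        if lo == hi:                          # non-redundancy
            return lo
        key = (var, lo, hi)
        if key not in self.unique:            # uniqueness
            self.unique[key] = len(self.node)
            self.node[self.unique[key]] = key
        return self.unique[key]

    def var(self, i):
        """BDD for the projection x_i (variables are numbered 0 .. num_vars-1)."""
        return self.mk(i, 0, 1)

    def cofactors(self, u, x):
        var, lo, hi = self.node[u]
        return (lo, hi) if var == x else (u, u)

    def conj(self, u, v):
        # Terminal cases, all checked in constant time.
        if u == 0 or v == 0:
            return 0
        if u == 1:
            return v
        if v == 1 or u == v:
            return u
        key = (min(u, v), max(u, v))
        if key not in self.and_cache:
            x = min(self.node[u][0], self.node[v][0])        # next variable
            u0, u1 = self.cofactors(u, x)
            v0, v1 = self.cofactors(v, x)
            # mk returns the common result directly when both branches agree.
            self.and_cache[key] = self.mk(x, self.conj(u0, v0), self.conj(u1, v1))
        return self.and_cache[key]

ws = Workspace(3)
x, y, z = ws.var(0), ws.var(1), ws.var(2)
left = ws.conj(ws.conj(x, y), z)
right = ws.conj(x, ws.conj(z, y))
print(left == right)   # equivalence test by comparing node ids
```

The cache keyed on pairs of nodes is what bounds the number of recursive calls by the product of the two diagram sizes.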

Decision Problems and BDDs

Suppose we have constructed a BDD for a Boolean function f.

Then it is trivial to check if f is a tautology: the test is for f = 1 and is constant time.

Likewise, it is trivial to check if f is satisfiable: we need $f \ne 0$.

In fact, we can count satisfying truth assignments in O(|u|): they correspond to the total number of paths (including potentially phantom variables) from the root to terminal 1.

We can even construct a BDD s for all satisfying truth assignments in time O(n|s|) where $|s| = O(2^{|u|})$.
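Counting with phantom variables can be sketched as a memoised traversal: whenever an edge skips levels, each skipped variable doubles the number of assignments. The helper `build`, which produces a reduced diagram from a truth table, and the node layout are assumptions made only to keep the example self-contained.

```python
def build(leaves, n):
    """ROBDD for the truth table `leaves` over variables 0 .. n-1 (variable 0 first)."""
    unique, node = {}, {0: (n, None, None), 1: (n, None, None)}

    def mk(var, lo, hi):
        if lo == hi:
            return lo
        key = (var, lo, hi)
        if key not in unique:
            unique[key] = len(node)
            node[unique[key]] = key
        return unique[key]

    level = list(leaves)
    for var in reversed(range(n)):
        level = [mk(var, level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0], node

def sat_count(root, node, n):
    """Number of satisfying assignments; each node is processed once."""
    memo = {0: 0, 1: 1}

    def count(u):
        # Assignments to the variables var(u), ..., n-1 that lead from u to terminal 1.
        if u not in memo:
            var, lo, hi = node[u]
            # A child at level var(c) skips var(c) - var - 1 phantom variables.
            memo[u] = sum(count(c) << (node[c][0] - var - 1) for c in (lo, hi))
        return memo[u]

    # Variables above the root are phantom as well.
    return count(root) << node[root][0]

root, node = build([1, 0, 1, 0, 0, 0, 0, 1], 3)
print(sat_count(root, node, 3))   # three of the eight leaves are 1
```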

Quantifiers

In principle, we can even handle quantified Boolean formulae by expanding the quantifiers:

$\exists x\, \varphi(x) \equiv \varphi[0/x] \lor \varphi[1/x]$

$\forall x\, \varphi(x) \equiv \varphi[0/x] \land \varphi[1/x]$

So this comes down to restriction, a (quadratic) apply operation, followed by reduction.
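The quantifier expansion can be illustrated on plain Python functions, ignoring BDDs altogether. In this sketch `restrict`, `exists` and `forall` are hypothetical helper names, and quantified variables are kept as dummy arguments so that the arity does not change.

```python
def restrict(f, i, value):
    """phi[value/x_i]: fix argument i, keeping the arity unchanged."""
    return lambda *a: f(*a[:i], value, *a[i + 1:])

def exists(f, i):
    lo, hi = restrict(f, i, 0), restrict(f, i, 1)
    return lambda *a: lo(*a) | hi(*a)

def forall(f, i):
    lo, hi = restrict(f, i, 0), restrict(f, i, 1)
    return lambda *a: lo(*a) & hi(*a)

def phi(x, y):
    return x ^ y           # illustrative formula: x xor y

# "for all x there is a y with x xor y" holds; swapping the quantifiers does not.
valid = forall(exists(phi, 1), 0)
invalid = exists(forall(phi, 1), 0)
print(valid(0, 0), invalid(0, 0))
```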

Alas, since SAT is just a problem of validity of an existentially quantified Boolean formula, this is not going to be fast in general. QBF in general is even PSPACE-complete.

Bad News

Of course, there is no way to avoid exponential worst-case cost when building a BDD representing some Boolean function from scratch: after all, there are $2^{2^n}$ functions to be represented. So some representations must have size at least $2^n$ bits for information-theoretic reasons.

As it turns out, addition of k-bit natural numbers can be expressed nicely in a linear size BDD. Alas, multiplication causes problems (this should sound familiar).

Lemma: BDDs for multiplication require exponential size.

Variable Ordering

In general, the size of a BDD depends drastically on the chosen ordering of variables (see the UnEqual example above).

It would be most useful to be able to construct a variable ordering that works well with a given function. In practice, variable orderings are computed by heuristic algorithms and can sometimes be improved with local optimization and even simulated annealing algorithms.

Alas, it is NP-hard to determine whether there is an ordering that produces a BDD of size at most some given bound.
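The effect of the ordering on the UnEqual function can be observed directly for small k. The sketch below builds the reduced diagram by exhaustive expansion (exponential, so for tiny k only) and counts internal nodes; `bdd_size`, the two orderings and the choice k = 3 are assumptions of this example.

```python
def bdd_size(f, n, order):
    """Number of internal ROBDD nodes of f when variables are tested in `order`."""
    unique = {}

    def mk(var, lo, hi):
        if lo == hi:                                   # non-redundancy
            return lo
        return unique.setdefault((var, lo, hi), len(unique) + 2)   # uniqueness

    def build(depth, partial):
        if depth == n:
            return f(*[partial[i] for i in range(n)])  # terminal 0 or 1
        var = order[depth]
        lo = build(depth + 1, {**partial, var: 0})
        hi = build(depth + 1, {**partial, var: 1})
        return mk(var, lo, hi)

    build(0, {})
    return len(unique)

k = 3

def unequal(*x):
    # UE(x): some x_i differs from x_{i+k}, over 2k variables.
    return int(any(x[i] != x[i + k] for i in range(k)))

separated = list(range(2 * k))                            # x_0, x_1, ..., x_{2k-1}
interleaved = [v for i in range(k) for v in (i, i + k)]   # x_0, x_k, x_1, x_{k+1}, ...
print(bdd_size(unequal, 2 * k, separated), bdd_size(unequal, 2 * k, interleaved))
```

With the interleaved ordering each pair of compared bits is tested back to back, which is the same trick that turned $L_k$ into $K_k$.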

Symmetric Functions

Note that for symmetric Boolean functions the variable ordering does not matter: we always get the same BDD.

conjunctions, disjunctions

parity (xor)

threshold (counting)

Exercise: Figure out what the BDD for n-ary xor looks like.

Average Performance

Again: for information-theoretic reasons, there is no way a canonical form such as an MDA or BDD is always small.

As a matter of fact, one can show that both the worst-case and the average sizes of MDA and BDD agree in the limit (for many variables).

Moreover, there are exponential gaps between the sizes of standard representations for some Boolean functions. So, to really deal with Boolean functions effectively, one has to have many different implementations available, and switch back and forth to whatever works best under the circumstances.
