
J Comb Optim (2009) 18: 124–150. DOI 10.1007/s10878-008-9142-4

Using heuristics to find minimal unsatisfiable subformulas in satisfiability problems

Christian Desrosiers · Philippe Galinier · Alain Hertz · Sandrine Paroz

Published online: 5 March 2008
© Springer Science+Business Media, LLC 2008

Abstract In this paper, we propose efficient algorithms to extract minimal unsatisfiable subsets of clauses or variables in unsatisfiable propositional formulas. Such subsets yield unsatisfiable propositional subformulas that become satisfiable when any of their clauses or variables is removed. These subformulas have numerous applications, including proving unsatisfiability and post-infeasibility analysis. The algorithms we propose are based on heuristics, and thus can be applied to large instances. Furthermore, we show that, in some cases, the minimality of the subformulas can be proven with these algorithms. We also present an original algorithm to find minimum cardinality unsatisfiable subformulas in smaller instances. Finally, we report computational experiments on unsatisfiable instances from various sources that demonstrate the effectiveness of our algorithms.

Keywords Satisfiability · Heuristics · Minimal unsatisfiable subformulas (MUS)

1 Introduction

A propositional formula F in conjunctive normal form (CNF) is the conjunction of clauses from a set C = {C1, . . . , Cm}, acting on variable set X = {x1, . . . , xn}. Each

C. Desrosiers · A. Hertz (✉) · S. Paroz
Ecole Polytechnique and GERAD, Montreal, Canada
e-mail: [email protected]

C. Desrosiers
e-mail: [email protected]

S. Paroz
e-mail: [email protected]

P. Galinier
Ecole Polytechnique and CRT, Montreal, Canada
e-mail: [email protected]


clause Ci ∈ C is the disjunction of literals from a set Li, each literal being either a variable x ∈ X or the negation of this variable (¬x). A CNF formula can thus be expressed in the following way:

F = ∧_{i=1}^{m} ( ∨_{l∈Li} l )

A formula F is satisfiable if and only if there exists a truth assignment that satisfies all its clauses. The satisfiability problem (SAT) consists in finding a truth assignment (i.e. true or false) for the variables in X such that the formula F evaluates to true, or showing that no such assignment exists. The first problem to be shown NP-complete, SAT plays a key role in mathematical logic and computing theory, as well as in automated reasoning, machine vision, databases, scheduling and integrated circuit design (Franco et al. 1997). In the case of unsatisfiable SAT instances, it is often necessary to find a truth assignment for which the most clauses are satisfied (i.e. evaluate to true) in a given CNF formula. This task, known as the maximum satisfiability problem (Max-SAT), was proven NP-hard, and is fundamental to solving many practical problems in computer science (Hansen and Jaumard 1990). The Max-SAT is clearly a generalization of the SAT. The weighted Max-SAT, or Max-WSAT, is another version of the satisfiability problem where each clause Ci ∈ C has a weight wi, and the goal is to find a truth assignment that maximizes the sum of the weights of satisfied clauses. These additional weights make it possible to express preferences in the satisfaction of clauses. Since the Max-SAT is a particular instance of its weighted version, in which all weights are equal, the latter problem is thus also NP-hard. A considerable number of solution methods have been proposed for these problems, which can be roughly divided into enumeration techniques (Borchers and Furman 1999; Davis et al. 1962; Goldberg and Novikov 2000; Hansen and Jaumard 1990; Madigan et al. 2001; Shang and Wah 1997; Zhang 1997), approximation schemes (Battiti and Protasi 1998; Dantsin et al. 2002), and local search algorithms (Battiti and Protasi 1998; Mazure et al. 1997; Levesque et al. 1992; Marques Silva and Sakallah 1999; Mills and Tsang 2000).
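To make these objects concrete, here is a minimal sketch in Python (our own helper names, not code from the paper) using the common DIMACS-style integer encoding of literals; it evaluates the plain Max-SAT objective on a tiny unsatisfiable formula.

from typing import Dict, List

Clause = List[int]  # DIMACS-style literals: +i stands for x_i, -i for ¬x_i

def clause_satisfied(clause: Clause, assignment: Dict[int, bool]) -> bool:
    # A clause is a disjunction of literals, so it is satisfied if any literal is true.
    return any(assignment[abs(l)] == (l > 0) for l in clause)

def num_satisfied(clauses: List[Clause], assignment: Dict[int, bool]) -> int:
    # Unweighted Max-SAT objective: the number of clauses that evaluate to true.
    return sum(clause_satisfied(c, assignment) for c in clauses)

# A tiny unsatisfiable formula: (x1 ∨ x2) ∧ (¬x1) ∧ (¬x2).
F = [[1, 2], [-1], [-2]]
print(num_satisfied(F, {1: False, 2: True}))   # 2: at most two of the three clauses can hold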

It is often the case that a propositional formula, encoding a particular system or application, is unsatisfiable. When this occurs, it can be difficult to identify the causes of the inconsistency, and even harder to fix it. In fact, detecting the infeasibility alone is NP-hard. However, the inconsistency is, most of the time, caused by contradictions local to the problem. These contradictions form an inconsistent sub-system explaining the inconsistency of the whole system.

Definition 1 Given an unsatisfiable propositional formula F, a minimal unsatisfiable subset of clauses (MUSC) M ⊆ C is a subset of clauses that is unsatisfiable, and for which every proper clause subset is satisfiable.

The extraction of a MUSC from an unsatisfiable formula, referred to as the minimal unsatisfiable subformulas selection problem (MUSSP for short), has many benefits. The most common is related to helping a user redesign an incoherent system, modeled as a propositional formula, to make it once again coherent. To do so, the


Table 1 Example of a non-satisfiable instance of the SAT problem

Clause                 M1   M2   M3
C1  ¬x1 ∨ x2           •         •
C2  ¬x1 ∨ ¬x2          •         •
C3  x1 ∨ x3            •    •    •
C4  x1 ∨ ¬x3           •
C5  x3 ∨ ¬x5                •
C6  ¬x3 ∨ ¬x5               •    •
C7  ¬x3 ∨ x5                •    •
C8  ¬x1 ∨ x3 ∨ ¬x4          •
C9  ¬x1 ∨ x4 ∨ x5           •

user needs a small, yet complete, set of contradictions that led to the incoherence. A MUSC gives the user sufficient information to identify which design constraints to relax. Another use of MUSCs lies in proving that a given SAT instance is unfeasible. The general idea is to find an unsatisfiable subformula, whether or not minimal, using a heuristic procedure, and then to solve this subformula using an exact algorithm. Since a problem is feasible only if all its subproblems are, showing that a subformula is unsatisfiable proves that the original formula is also unsatisfiable. Furthermore, since the complexity of obtaining such a proof is exponential in the size of the problem, an exact algorithm is more likely to find a proof on the MUSC than on the original instance. In Mazure et al. (1998), small unsatisfiable subformulas, found using a local search strategy, were used to boost exact algorithms.

Table 1 shows a simple CNF instance of 9 clauses operating on 5 variables. This CNF problem is not satisfiable, and has three MUSCs, each explaining the incoherence of the formula. For example, M1 expresses the impossibility of assigning truth values to x1, x2, and x3 such that the first four clauses are simultaneously satisfied.

It is often easier to understand why a problem is incoherent by simply looking at the variables involved in this incoherence. Given a propositional formula F acting on a variable set X, and given a subset Q ⊆ X, the subformula of F induced by Q, denoted FQ, is the formula composed of the clauses Ci which have all the variables of their literal set Li in Q. We say that a subset Q of variables is satisfiable if and only if FQ is satisfiable. Hence, given a partial assignment in which only the variables in Q get a value, every clause Ci that contains at least one unassigned variable xj ∉ Q in its literal set is considered satisfiable, the reason being that it is always possible to assign a value to xj so that Ci evaluates to true. This leads to a variation of the Max-SAT problem where partial assignments are allowed and the goal is to find an assignment of the largest possible subset Q ⊆ X of variables such that the corresponding subformula FQ evaluates to true. We can further extend this problem by giving each variable xi ∈ X a weight wi, the goal then being to find an assignment that maximizes the sum of the weights of the assigned variables.
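As a small illustration of these definitions (helper names are ours), the induced subformula FQ and the partial-assignment convention can be coded directly; the example below uses the clauses C1, . . . , C9 of Table 1.

from typing import Dict, List, Set

Clause = List[int]   # +i = x_i, -i = ¬x_i

def induced_subformula(clauses: List[Clause], Q: Set[int]) -> List[Clause]:
    # F_Q keeps only the clauses whose variables all belong to Q.
    return [c for c in clauses if all(abs(l) in Q for l in c)]

def clause_ok(clause: Clause, partial: Dict[int, bool]) -> bool:
    # Under a partial assignment, a clause counts as satisfiable if some literal is
    # already true or if the clause still contains an unassigned variable.
    return any(abs(l) not in partial or partial[abs(l)] == (l > 0) for l in clause)

F = [[-1, 2], [-1, -2], [1, 3], [1, -3], [3, -5],
     [-3, -5], [-3, 5], [-1, 3, -4], [-1, 4, 5]]      # C1..C9 of Table 1
print(induced_subformula(F, {1, 2, 3}))          # clauses C1..C4: the subformula induced by {x1, x2, x3}
print(all(clause_ok(c, {2: True}) for c in F))   # True: assigning only x2 leaves every clause satisfiable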

While a MUSC is a minimal unsatisfiable subset of clauses, a similar definition can be given for variable subsets.


Definition 2 Given an unsatisfiable propositional formula F, acting on variable set X, a minimal unsatisfiable subset of variables (MUSV) is a subset M ⊆ X of variables such that M is unsatisfiable while every proper subset of M is satisfiable.

Notice that the variables composing the MUSCs are MUSVs. However, the opposite is not necessarily true. Consider, once more, the CNF instance of Table 1, which contains two MUSVs: {x1, x2, x3} and {x1, x3, x4, x5}. The first MUSV gives a subformula composed of clauses C1, C2, C3 and C4, which is also the subformula corresponding to M1. However, the second MUSV yields a subformula formed with clauses C3, C4, C5, C6, C7, C8 and C9, which is not minimal in the sense of MUSCs (i.e. C4 can be removed without making it satisfiable). The notions of MUSCs and MUSVs are closely related to that of irreducible infeasible subsystems (IISs) in the context of linear programming (Amaldi et al. 1999; Gleeson and Ryan 1990). In the case of systems of linear inequalities with integer variables, additive and subtractive heuristics were proposed to find constraint IISs (Chinneck 1997). Other heuristic algorithms have been proposed to find vertex and edge critical subgraphs, which are the equivalent of MUSVs and MUSCs for the graph k-coloring problem (Desrosiers et al. 2008). The graph k-coloring problem consists in coloring, if possible, the vertices of a graph G using at most k colors, and such that adjacent vertices have different colors. If such a coloring exists, then G is said to be k-colorable. A k-vertex (k-edge) critical subgraph is a subgraph G′ of G that is not k-colorable, and for which removing any vertex (edge) makes it k-colorable.

The MUSSP in unsatisfiable CNF formulas has been the subject of numerous studies. Recent theoretical works have shown that deciding whether a CNF formula contains a MUSC of fixed deficiency k (i.e. for which the difference between the number of clauses and variables equals k), for all k ∈ N, is NP-complete, but that efficient algorithms can be developed for small values of k (Kleine Büning and Zhao 2002; Fleischner et al. 2002). Many methods have been proposed for the general MUSSP. Mazure et al. (1998) use local search to identify the clauses that are most difficult to satisfy. These clauses form a core that is used to guide a branch and bound algorithm for the SAT problem. Bruni (2003) uses a procedure that ranks clauses based on the number of times they are conflicted during the search of an exact algorithm. Starting with an initial set of clauses, a core is then expanded or contracted by a fixed percentage of clauses, based on their rank, until the core becomes unsatisfiable. Although this procedure works well on small instances, it may take quite a long time to converge on large real-world instances. Additionally, the cores obtained by such a procedure are rarely minimal for more difficult instances. Zhang and Malik (2003) propose a technique to find unsatisfiable subformulas in large instances. This technique, which uses the resolution proof of any DLL-based SAT solver to derive an unsatisfiable subset of clauses, has two drawbacks: it offers no guarantee that the unsatisfiable subformulas found are minimal, and it relies on the hypothesis that the exact SAT solver is able to find a proof of unsatisfiability for a given instance, which is not always the case. The AMUSE algorithm, proposed by Oh et al. (2004), extends a generic DLL-based SAT solver to implicitly search for an unsatisfiable subformula instead of a satisfying assignment. This is done by adding to each clause an extra variable that acts as a selector for the clause, such that the problem becomes finding a truth assignment for


these new variables, for which the extended formula is unsatisfiable. Unlike the previous selection method, AMUSE does not need to solve the whole instance, but only the unsatisfiable subformula. However, it still offers no guarantee of minimality. Mneimneh et al. (2005) present a branch and bound algorithm that uses Max-SAT solutions to compute lower and upper bounds on the number of clauses of a MUSC. Liffiton and Sakallah (2005) propose a 2-phase algorithm, called CAMUS, aimed at finding all MUSCs. During phase 1, this algorithm builds the set F of all co-MSSs (a MSS is a maximal satisfiable set of clauses, and a co-MSS the complementary set of a MSS). During phase 2, it builds all minimal hitting sets of F, i.e. all MUSCs. This algorithm is shown to perform better than a similar technique proposed by Bailey and Stuckey (2005).

In this paper, we propose original algorithms that are guaranteed to obtain MUSCs and MUSVs in unsatisfiable CNF formulas. We show that these algorithms can be applied to relatively large SAT instances, with the use of heuristics. We also present an algorithm to select minimum cardinality MUSCs and MUSVs in small SAT instances. Finally, we give some heuristics to help find smaller MUSCs or MUSVs, which are more useful to diagnose incoherent systems. The remainder of this paper is organized as follows. The selection procedures and heuristics are presented in Sect. 2. In Sect. 3, we give some computational results of the algorithms carried out on generated and real-life benchmarks. Finally, concluding remarks are given in Sect. 4.

2 Selection algorithms and heuristics

The algorithms presented in this section are based on solution methods for the large set covering problem proposed in Galinier and Hertz (2007). That paper demonstrates that the problem of finding IISs in unfeasible constraint satisfaction problems (CSPs) can be formulated as a large set covering problem (LSCP). Since the CSP is a generalization of SAT, where variables are not restricted to take on boolean values, these algorithms, which have successfully been applied to find vertex and edge critical subgraphs and to solve the graph coloring problem (Desrosiers et al. 2008), can thus be applied to SAT. Furthermore, since the properties given in Galinier and Hertz (2007) are still valid, we will refer to that paper for their proofs.

To be more succinct, we will present one version of each algorithm, which can be used to find either MUSCs or MUSVs. This is done by introducing some generic terms in the algorithms that take on a specific meaning depending on the task. Thus, if one needs to find MUSCs, S corresponds to the set of clauses C, wi ∈ W is the weight of clause Ci ∈ C, and MaxWSAT is a procedure that takes a CNF formula F and the set of clause weights W, and returns a set U of unsatisfied clauses for which the sum f(U) of weights is minimum. On the other hand, if the goal is to select MUSVs, S is the set of variables X, wi ∈ W is the weight of variable xi ∈ X, and MaxWSAT is a procedure that takes a CNF formula F and the set of variable weights W, and returns a set U of unassigned variables for which the sum f(U) of weights is minimum. Depending on the task, a MUS will refer to a MUSC or a MUSV.
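The two instantiations of the objective f(U) can be written down directly from these definitions; the snippet below is our own illustration (names such as f_clauses are not from the paper).

from typing import Dict, List

Clause = List[int]   # +i = x_i, -i = ¬x_i

def f_clauses(clauses: List[Clause], w: List[float], assignment: Dict[int, bool]) -> float:
    # MUSC mode: U is the set of unsatisfied clauses and f(U) sums their weights.
    return sum(w[j] for j, c in enumerate(clauses)
               if not any(assignment[abs(l)] == (l > 0) for l in c))

def f_variables(variables: List[int], w: Dict[int, float], partial: Dict[int, bool]) -> float:
    # MUSV mode: U is the set of unassigned variables and f(U) sums their weights
    # (the partial assignment itself is required to satisfy every fully assigned clause).
    return sum(w[x] for x in variables if x not in partial)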


Fig. 1 The Removal algorithm

2.1 The Removal algorithm

The Removal algorithm is perhaps the simplest of all MUS selection algorithms. Similar approaches have already been proposed, for example the zMinimal algorithm in Zhang and Malik (2003), Chinneck's deletion algorithm presented in Chinneck (1997) for linear programming, or the algorithm in Herrmann and Hertz (2002) for vertex critical subgraph detection. Given an unsatisfiable CNF formula F, the Removal algorithm proceeds in a top-down approach, removing clauses or variables one by one, and re-inserting those that cause F to be satisfiable. Figure 1 contains the pseudo-code of this algorithm. The term unexplored refers to the clauses or variables that have not been tested for removal yet, and SAT to a procedure that solves the problem of the same name.
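Since the pseudo-code of Fig. 1 is not reproduced here, the sketch below (our own code, with a brute-force SAT test standing in for an exact or heuristic solver) follows the textual description above; on the clauses of Table 1, removed in index order, it returns M2.

from itertools import product
from typing import Callable, List

Clause = List[int]   # +i = x_i, -i = ¬x_i

def brute_force_sat(clauses: List[Clause]) -> bool:
    # Exact but exponential satisfiability test; sufficient for the toy examples used here.
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        a = dict(zip(vs, bits))
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def removal(clauses: List[Clause], sat: Callable[[List[Clause]], bool]) -> List[Clause]:
    # Top-down extraction: tentatively drop each clause once; re-insert it whenever its
    # removal makes the remaining formula satisfiable (it then belongs to every MUS left).
    current = list(clauses)
    i = 0
    while i < len(current):
        trial = current[:i] + current[i + 1:]
        if sat(trial):
            i += 1              # clause i is needed: keep it and move to the next one
        else:
            current = trial     # clause i can be discarded for good
    return current              # a MUSC if 'sat' is exact (Property 1 below)

# Clauses C1..C9 of Table 1; removing them in index order leaves M2 = {C3, C5, C6, C7, C8, C9}.
F = [[-1, 2], [-1, -2], [1, 3], [1, -3], [3, -5],
     [-3, -5], [-3, 5], [-1, 3, -4], [-1, 4, 5]]
print(removal(F, brute_force_sat))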

Property 1 Given an unsatisfiable CNF formula F, the Removal algorithm returns, in a finite number of steps, a MUS if SAT is an exact procedure. Otherwise, if SAT is heuristic, the Removal algorithm returns either a MUS or a satisfiable subformula.

To illustrate the Removal algorithm, consider the task of finding a MUSC in the CNF formula of Table 1. Suppose the clauses are removed following their index: we first remove C1, "destroying" in the process M1 and M3. Since M2 is the only remaining MUSC, it will be returned by the algorithm after the other clauses are removed and its clauses re-inserted. Notice that the order in which the clauses or variables are removed affects the outcome of the algorithm. Accordingly, if we had removed the clauses following the reverse order of their indices, M2 and M3 would have been destroyed first, and M1 returned.

2.2 The Insertion algorithm

The number of steps (i.e. calls to procedure SAT) required by the Removal algorithm to find a MUSC or MUSV is equal to the number of clauses or variables of the original formula, regardless of the size of this MUS. This is most inefficient in cases where the MUS is much smaller than the original formula F. For such cases, it makes more sense to use a bottom-up approach, where an initially empty formula is expanded until it becomes unsatisfiable. The Insertion algorithm, shown in Fig. 2, is such an approach that selects a MUS in a number of steps equal to the size of that


Fig. 2 The Insertion algorithm

MUS. The algorithm starts by setting the weights of all clauses or variables of F to 1. It then modifies these weights as follows. When we want to give more importance to a clause or variable s ∈ S, we set its weight to α ≥ |S|, and we say that we harden s. On the other hand, when we do not want to take a clause or variable s into account, we fix its weight to 0, and we say that s is removed. This incites procedure MaxWSAT to satisfy all hard clauses or assign all hard variables, and to ignore all removed ones. Every iteration, MaxWSAT returns a set U containing at least one clause or variable from each remaining MUS of F. The algorithm then hardens one clause or variable from U and removes the others, thus making sure that F remains unsatisfiable. When the set of hard clauses or variables becomes unsatisfiable, MaxWSAT will obtain f(U) ≥ α and the algorithm will return this set. If MaxWSAT is sub-optimal, it may happen that F becomes satisfiable (i.e. f(U) = 0), in which case failure is reported. This error can also be repaired by re-inserting clauses or variables (i.e. setting their weight back to 1) until F becomes once again unsatisfiable.
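A compact sketch of this loop, assuming an exact (here brute-force) MaxWSAT and using our own variable names; on large instances the exact oracle is replaced by the tabu search of Sect. 2.7.

from itertools import product
from typing import List, Set, Tuple

Clause = List[int]   # +i = x_i, -i = ¬x_i

def max_wsat(clauses: List[Clause], w: List[float]) -> Tuple[float, Set[int]]:
    # Exact (exponential) Max-WSAT: minimise the total weight of the unsatisfied,
    # non-removed clauses; returns (f(U), U) with U a set of clause indices.
    vs = sorted({abs(l) for c in clauses for l in c})
    best_cost, best_U = float("inf"), set()
    for bits in product([False, True], repeat=len(vs)):
        a = dict(zip(vs, bits))
        U = {j for j, c in enumerate(clauses)
             if w[j] > 0 and not any(a[abs(l)] == (l > 0) for l in c)}
        cost = sum(w[j] for j in U)
        if cost < best_cost:
            best_cost, best_U = cost, U
    return best_cost, best_U

def insertion(clauses: List[Clause]) -> List[Clause]:
    # Bottom-up MUSC extraction (Sect. 2.2): weights are 1 (normal), alpha (hard) or 0 (removed).
    n = len(clauses)
    alpha = float(n + 1)                       # any alpha >= |S| works
    w = [1.0] * n
    while True:
        cost, U = max_wsat(clauses, w)
        if cost >= alpha:                      # the hard clauses alone are unsatisfiable
            return [clauses[j] for j in range(n) if w[j] == alpha]
        if cost == 0:                          # can only happen with a heuristic MaxWSAT
            raise RuntimeError("current subformula became satisfiable")
        chosen = min(U)                        # harden one clause of U ...
        w[chosen] = alpha
        for j in U - {chosen}:                 # ... and remove the others
            w[j] = 0.0

On the formula of Table 1 this returns one of the three MUSCs; which one depends on the clause of U that is hardened at each iteration (here, arbitrarily, the one with the smallest index).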

Property 2 Given an unsatisfiable CNF formula F, the Insertion algorithm returns, in a finite number of steps, a MUS if MaxWSAT is an exact procedure. Otherwise, if MaxWSAT is not optimal, the Insertion algorithm returns either a MUS¹ or a satisfiable subformula.

Let us illustrate the Insertion algorithm on the selection of a MUSV in the CNF of Table 1. Since the set U returned by MaxWSAT must contain a variable from each of the two MUSVs {x1, x2, x3} and {x1, x3, x4, x5}, and since f(U) must be minimum, U can contain either x1 or x3; suppose it contains x3. This variable is then hardened, and the next U contains x1, which is, in turn, hardened. The next U then contains x2 and a variable from the second MUSV, say x4. We must then choose to harden one of those variables and remove the other one. Suppose x2 is hardened; the hard variables then form a MUSV, which is returned. Once again, the choice of the variable to harden determines which MUSV is returned. Thus, if we had chosen to harden x4 instead of x2, the previously selected MUSV would have been destroyed and the

¹ Unless the Repair procedure is used.


Table 2 An example where the Insertion algorithm cannot find all MUSCs

Clause M1 M2 M3 M4 M5

C1       •  •  •
C2       •  •  •
C3       •  •
C4       •  •
C5       •  •
C6       •  •

other one found. Notice, finally, that the Insertion algorithm may not be able to select all MUSs of a given formula. Consider, for example, the selection of a MUSC in the CNF of Table 2. Supposing that the clauses all have a weight of 1, the optimal set U returned by MaxWSAT clearly contains C1 and C2. Since it contains both these clauses, M1, which is the smallest MUSC of the CNF, will be destroyed.

Although both Insertion and AMUSE build a MUS step by step by adding new constraints, there are major differences between these two algorithms. In particular, Insertion iteratively solves a weighted max-SAT problem by using metaheuristics.

2.3 The HittingSet algorithm

The HittingSet algorithm differs from the previous selection algorithms in that it finds minimum cardinality unsatisfiable subformulas (MCUSs). This algorithm is based on the fact that, given an unsatisfiable formula, an optimal solution to the max-SAT problem, at iteration i, gives a set Ui of unsatisfied clauses or unassigned variables that intersects each MUS. A MUS M is therefore a hitting set (i.e. a set intersecting each set of a collection) of U = {U1, . . . , U|M|}. Evidently, a MCUS is a minimum hitting set (MHS) of collection U. The algorithm's pseudo-code is shown in Fig. 3. Every iteration, procedure MinHS returns a MHS H of an initially empty collection U. The clauses or variables of S are then modified so that only those also in H become hard. If the set U returned by procedure MaxWSAT is such that f(U) ≥ α, then H is a MCUS and is returned. Else, U is added to collection U and the same process is repeated.
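A sketch of this loop with brute-force stand-ins for both oracles (our own code; the authors use CPLEX for MinHS and tabu search for MaxWSAT). With exact oracles and an unsatisfiable input it returns a minimum-cardinality MUSC, as stated in Property 3 below.

from itertools import combinations, product
from typing import List, Set, Tuple

Clause = List[int]   # +i = x_i, -i = ¬x_i

def max_wsat(clauses: List[Clause], w: List[float]) -> Tuple[float, Set[int]]:
    # Exact Max-WSAT by enumeration, as in the Insertion sketch above.
    vs = sorted({abs(l) for c in clauses for l in c})
    best_cost, best_U = float("inf"), set()
    for bits in product([False, True], repeat=len(vs)):
        a = dict(zip(vs, bits))
        U = {j for j, c in enumerate(clauses)
             if w[j] > 0 and not any(a[abs(l)] == (l > 0) for l in c)}
        cost = sum(w[j] for j in U)
        if cost < best_cost:
            best_cost, best_U = cost, U
    return best_cost, best_U

def min_hs(collection: List[Set[int]], universe: Set[int]) -> Set[int]:
    # Minimum hitting set by increasing cardinality (exponential; small instances only).
    for k in range(len(universe) + 1):
        for H in combinations(sorted(universe), k):
            if all(set(H) & Uj for Uj in collection):
                return set(H)
    return set(universe)

def hitting_set_mcus(clauses: List[Clause]) -> List[Clause]:
    # Sect. 2.3: grow a collection of sets U that every MUS must hit, until a minimum
    # hitting set H of that collection is itself unsatisfiable; H is then a MCUS.
    n = len(clauses)
    alpha = float(n + 1)
    collection: List[Set[int]] = []
    while True:
        H = min_hs(collection, set(range(n)))
        w = [alpha if j in H else 1.0 for j in range(n)]
        cost, U = max_wsat(clauses, w)
        if cost >= alpha:
            return [clauses[j] for j in sorted(H)]
        collection.append(U)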

Property 3 Given an unsatisfiable CNF formula F, the HittingSet algorithm returns, in a finite number of steps, a MCUS if MaxWSAT and MinHS are exact procedures. Otherwise, if MinHS returns minimal hitting sets and MaxWSAT is not optimal, the HittingSet algorithm returns either a MUS or a satisfiable subformula.

Table 3 illustrates the HittingSet algorithm on the selection of a MUSV in the CNF of Table 1. Each row gives the MHS produced by MinHS, as well as the optimal truth assignment found by MaxWSAT and the corresponding set U, at a given iteration. Notice that the sets U of the first three iterations are identical to those obtained in the example given for the Insertion algorithm. The MHS of iteration 4 should


Fig. 3 The HittingSet algorithm

Table 3 Illustration of the HittingSet algorithm on the example of Table 1

It#  H               Assignment    U

1    ∅               〈0 0 - 0 1〉   {x3}
2    {x3}            〈- 1 0 0 0〉   {x1}
3    {x1, x3}        〈1 - 0 - 0〉   {x2, x4}
4    {x1, x3, x4}    〈1 - 1 0 -〉   {x2, x5}
5    {x1, x2, x3}    〈- 0 0 0 0〉   {x1}

contain either x2 or x4, in this example x4. As with the Insertion algorithm, the MCUS formed of variables x1, x2 and x3 is then destroyed. However, set U should once more contain x2, such that the only possible MHS for iteration 5 is the MCUS {x1, x2, x3}.

Although the HittingSet algorithm produces a MUS of minimum cardinality (which was not necessarily the case for the Removal and Insertion algorithms), it may require an exponential number of steps. Therefore, this algorithm is best suited for small instances. Note that one can stop the HittingSet algorithm at any time and use |H| as a lower bound on the size of the MUSs.

Although HittingSet shares some similarities with the CAMUS algorithm (Liffiton and Sakallah 2005), some fundamental differences separate these two methods. First, while CAMUS focuses on finding all MUSs, HittingSet only tries to find one with a minimum number of variables or clauses. Also, CAMUS requires computing the set of all co-MSSs, which limits the number of instances that can be solved with this algorithm. Finally, HittingSet uses metaheuristics instead of an exact solver for the SAT problem, which allows it to deal with harder SAT instances. Comparative results can be found in the section containing computational experiments.

2.4 The PreFiltering algorithm

When dealing with large instances, it can be useful to quickly filter out as many variables and clauses as possible, leaving less for the MUS selection algorithm. The PreFiltering algorithm, shown in Fig. 4, is a variation of the Insertion algorithm, where all clauses or variables of U are made hard every iteration. Since at least


Fig. 4 The PreFiltering algorithm

one clause or variable of each MUS becomes hard at each iteration, smaller MUSs are more likely to remain in the filtered formula than bigger ones. The PreFiltering algorithm thus acts as a heuristic that isolates smaller MUSs.
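Compared with the Insertion sketch, only the loop body changes: every clause of U is hardened and nothing is removed. The code below is our own; max_wsat is passed in as a parameter (for instance the brute-force or tabu search versions sketched elsewhere in this section) and the input formula is assumed to be unsatisfiable.

from typing import Callable, List, Set, Tuple

Clause = List[int]   # +i = x_i, -i = ¬x_i

def prefiltering(clauses: List[Clause],
                 max_wsat: Callable[[List[Clause], List[float]], Tuple[float, Set[int]]]
                 ) -> List[Clause]:
    # Sect. 2.4: repeatedly harden every clause of U; return the hard subformula once it
    # is unsatisfiable on its own. The result contains at least one MUS but need not be
    # minimal: it is meant as a pre-filter for the other selection algorithms.
    n = len(clauses)
    alpha = float(n + 1)
    w = [1.0] * n
    while True:
        cost, U = max_wsat(clauses, w)
        if cost >= alpha:
            return [clauses[j] for j in range(n) if w[j] == alpha]
        if cost == 0:
            raise RuntimeError("formula appears satisfiable")
        for j in U:
            w[j] = alpha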

Consider, once again, the selection of a MUSC in the CNF of Table 2, for which the Insertion algorithm is unable to find the MCUS M1. The first set U returned by MaxWSAT should contain C1 and C2. Both these clauses are made hard, such that the algorithm returns M1. In contrast to the Insertion algorithm, the PreFiltering algorithm succeeds in finding the MCUS.

2.5 Neighborhood weight heuristic

Recall that, in the Insertion algorithm, the choice of the clause or variable to harden, at each iteration, determines which MUS is obtained. The neighborhood weight heuristic uses the information in the weights of the clauses or variables to make choices leading to smaller MUSs. We define the neighbors N(C) of a clause C ∈ C as the set of clauses, excluding C, that contain at least one of the variables constrained by C. The neighborhood weight of C is then the sum of the weights of the clauses in N(C). Correspondingly, the neighbors N(x) of a variable x ∈ X are the set of variables, excluding x, which are constrained by at least one of the clauses constraining x. The neighborhood weight of x is thus the sum of the weights of the variables in N(x).

When all weights are equal (e.g. after initialization), the neighborhood weight of a clause or a variable can be considered as a measure of the density of the region surrounding this clause or variable. As clauses or variables are made hard, the neighborhood weight then evaluates the hardness of the surrounding region. Because denser MUSs usually have fewer variables and clauses, it makes sense to build denser MUSs. By choosing to harden a clause or variable with the greatest neighborhood weight, we therefore increase the density of the obtained MUS. This strategy can also be used in the Removal algorithm by considering the weight of removed clauses or variables as 0 and the others as 1. Since the clauses or variables of the selected MUS are among those considered for removal last, we remove those with the smallest neighborhood weight first, preserving the denser ones.
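The clause version of the heuristic is a direct transcription of these definitions (function names are ours); on the clauses of Table 1 with unit weights, C3, C4, C8 and C9 each share a variable with every other clause and therefore get the largest neighborhood weight.

from typing import List

Clause = List[int]   # +i = x_i, -i = ¬x_i

def clause_neighborhood_weights(clauses: List[Clause], w: List[float]) -> List[float]:
    # Weight of N(C): sum of the weights of the clauses, other than C, sharing at least
    # one variable with C (Sect. 2.5).
    var_sets = [frozenset(abs(l) for l in c) for c in clauses]
    return [sum(w[j] for j, vj in enumerate(var_sets) if j != i and vi & vj)
            for i, vi in enumerate(var_sets)]

F = [[-1, 2], [-1, -2], [1, 3], [1, -3], [3, -5],
     [-3, -5], [-3, 5], [-1, 3, -4], [-1, 4, 5]]        # C1..C9 of Table 1
print(clause_neighborhood_weights(F, [1.0] * len(F)))   # C3, C4, C8 and C9 reach the maximum value 8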

2.6 Heuristic selection speed-up

This subsection presents a technique that can be used to speed up the selection of MUSs when using heuristic versions of the Insertion and PreFiltering algorithms. In those algorithms, procedure MaxWSAT is a heuristic, rather than an exact method. Given a formula F, if this procedure returns a set U with f(U) = 1, we know that it is either optimal, or that F is satisfiable. Supposing that F is not satisfiable, the single clause or variable in U necessarily belongs to all the MUSs in F, and we can harden it right away. Furthermore, if MaxWSAT is implemented as a local search heuristic, it can visit a great number of truth assignments for which f(U) = 1. We can thus memorize the unsatisfied clauses or the unassigned variables corresponding to those assignments, and harden all of them once the search is over. This technique is particularly efficient when F contains only one MUS. In such cases, a single iteration of the selection algorithm is often enough to find the whole MUS. This idea can also be used to prove that the unsatisfiable formulas obtained using the Insertion or PreFiltering algorithm are minimal. Hence, if MaxWSAT finds f(U) = 1 at every iteration,² we know that the subformula is either the only MUS of the problem, or is satisfiable.

² Excluding the last iteration where f(U) ≥ α.

2.7 Implementation details

In the selection algorithms, procedures SAT and MaxWSAT can be exact or heuristic. However, since the SAT and Max-WSAT problems are NP-hard, exact algorithms can have some difficulties solving large instances, and, in those cases, we must turn to heuristics. To solve these problems efficiently, we have implemented a local search heuristic based on the tabu search algorithm (Glover and Laguna 1997). In Desrosiers et al. (2008), we have successfully used such a heuristic to find critical subgraphs for large instances. For the detection of MUSCs, we used as solution space the set of truth assignments, and as cost function the sum of the weights of unsatisfied clauses, which we minimize. Given a truth assignment s, a neighbor of s is obtained by inverting the truth value of a single variable. Furthermore, to avoid cycling and to escape local minima, the tabu search algorithm forbids, during τ iterations, changing the value of the variable that was just assigned, unless doing so improves the best cost found at that moment. The value of τ is given by the user as a parameter. On the other hand, for the detection of MUSVs, the solution space of MaxWSAT is the set of partial satisfiable assignments, and the cost function minimizes the sum of the weights of unassigned variables. A neighbor of a solution s is obtained by giving a truth value to an unassigned variable of s and, if necessary, by unassigning variables of unsatisfied clauses, starting with the ones that are involved in the most unsatisfied clauses, until the assignment evaluates to true. Once more, the tabu search algorithm forbids, during τ iterations, giving to an unassigned variable the value it has just lost, unless it improves the best cost found.
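The following compressed, self-contained sketch shows the MUSC-side tabu search mechanism (flip one variable per move; a just-flipped variable is tabu for τ iterations unless flipping it improves the best cost found). All names and parameter defaults are ours, and the sketch is only meant to illustrate the mechanism, not to reproduce the authors' implementation; the MUSV variant replaces flips by assigning and unassigning variables as described above.

import random
from typing import Dict, List, Set, Tuple

Clause = List[int]   # +i = x_i, -i = ¬x_i

def tabu_max_wsat(clauses: List[Clause], w: List[float],
                  tau: int = 7, max_iters: int = 5000, seed: int = 0) -> Tuple[float, Set[int]]:
    # Heuristic Max-WSAT: minimise the total weight of unsatisfied positive-weight clauses.
    rng = random.Random(seed)
    vs = sorted({abs(l) for c in clauses for l in c})
    a: Dict[int, bool] = {v: rng.random() < 0.5 for v in vs}

    def violated(assign: Dict[int, bool]) -> Set[int]:
        return {j for j, c in enumerate(clauses)
                if w[j] > 0 and not any(assign[abs(l)] == (l > 0) for l in c)}

    def cost(U: Set[int]) -> float:
        return sum(w[j] for j in U)

    tabu_until: Dict[int, int] = {}
    best_U = violated(a)
    best_cost = cost(best_U)
    for it in range(max_iters):
        if best_cost == 0:
            break
        candidates = []
        for v in vs:
            a[v] = not a[v]                    # evaluate the flip ...
            c = cost(violated(a))
            a[v] = not a[v]                    # ... and undo it
            if tabu_until.get(v, -1) <= it or c < best_cost:   # tabu unless it beats the best
                candidates.append((c, v))
        if not candidates:
            continue
        c, v = min(candidates)                 # best admissible move
        a[v] = not a[v]
        tabu_until[v] = it + tau               # forbid flipping v back for tau iterations
        if c < best_cost:
            best_cost, best_U = c, violated(a)
    return best_cost, best_U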

We have also implemented a tabu search algorithm for the minimum hitting set problem. However, for the results presented in this paper, we have used an exact algorithm for this task. This algorithm, implemented in CPLEX, solves the following integer program

min ∑_{i=1,...,m} xi

s.t. ∑_{i∈Uj} xi ≥ 1,   1 ≤ j ≤ n,   (1)

     ∑_{i=1,...,m} xi ≥ K,   (2)

     xi ∈ {0, 1},   1 ≤ i ≤ m

where Uj are the subsets produced by the MaxWSAT procedure, K is the cardinality of the previous hitting set found, and where xi equals 1 if and only if the i-th clause or variable is included in the hitting set. The role of constraint (2) is to speed up the search by reducing the size of the solution space, since there cannot be any hitting set with strictly fewer than K elements.
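For completeness, here is a small solver-free stand-in for the MinHS step (our own code, not the authors' CPLEX model): it enumerates candidate sets by increasing cardinality, starting at the size K of the previously found hitting set, which plays exactly the pruning role of constraint (2).

from itertools import combinations
from typing import List, Optional, Set

def min_hitting_set(subsets: List[Set[int]], universe: Set[int], K: int = 0) -> Optional[Set[int]]:
    # Smallest H ⊆ universe with H ∩ Uj ≠ ∅ for every subset Uj (constraint (1)); the
    # search starts at cardinality K because adding new sets Uj can never shrink the optimum.
    for k in range(K, len(universe) + 1):
        for H in combinations(sorted(universe), k):
            Hs = set(H)
            if all(Hs & Uj for Uj in subsets):
                return Hs
    return None

# The sets U produced in the MUSV example of Table 3:
collection = [{3}, {1}, {2, 4}, {2, 5}, {1}]
print(min_hitting_set(collection, {1, 2, 3, 4, 5}))   # {1, 2, 3}: the MCUS of variables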

3 Computational experiments

In this section, we present experiments that evaluate our MUS selection algorithms using various benchmark random and real-life instances. We test three different methods.

• In the first method, named "REMOVAL", we first use the exact Removal algorithm to find a MUSV by removing variables in a random order. We then remove clauses of that MUSV, using the MUSC version of the Removal algorithm, to find a MUSC. We solve the SAT problem exactly using the zChaff solver (Madigan et al. 2001).

• In the second method, named "P+Insertion", we first use the PreFiltering algorithm to reduce the number of variables. We then apply the Insertion algorithm to find a MUSV, and repeat the same process on the MUSV, this time to find a MUSC. We use the tabu search algorithm described in the last section to solve the Max-WSAT problem, the neighborhood weight heuristic to select the clauses and variables to insert in the MUS, and also the selection speed-up technique to accelerate the extraction. Because this method uses a heuristic to solve the Max-WSAT problem, it is prone to errors that may render the subset of clauses or variables satisfiable. In those cases, we randomly re-insert some previously removed clauses or variables until the subset becomes unsatisfiable once again.

• In the third method, we find MCUSs using the HittingSet algorithm with the tabu search algorithm to solve the Max-WSAT problem and the exact algorithm described in the last section to solve the minimum hitting set problem. Unlike the previous two methods, we find MCUSs of variables and clauses separately, and write "HS-V" and "HS-C", respectively, for the HittingSet method that finds MUSVs and MUSCs.

We compare these three methods with several of the most popular algorithms to extract MUSCs. First of all, zCore (Zhang and Malik 2003) and AMUSE (Oh et al. 2004) are algorithms available on the web that produce (not necessarily minimal)


unsatisfiable subformulas. They can be combined with the zMinimal algorithm (Zhang and Malik 2003) (also available on the web), which is similar to our Removal algorithm for finding MUSCs. We use the output of zCore and AMUSE as input for zMinimal. The combinations of zCore and AMUSE with zMinimal are denoted zCore+ and AMUSE+, respectively.

We also compare our results with those obtained using the MUP (Huang 2005), SMUS (Mneimneh et al. 2005) and CAMUS (Liffiton and Sakallah 2005) algorithms, which all produce MUSCs. The SMUS algorithm offers the added guarantee that its output is a MUSC with a minimum number of clauses. Also, since the CAMUS algorithm extracts all MUSCs of a CNF formula, it also finds one with a minimum number of clauses.

The algorithms proposed by Bruni (2003) and Mazure et al. (1998) find (not necessarily minimal) unsatisfiable subformulas. Since they cannot be downloaded from the web, we could not combine their output with zMinimal.

All experiments were carried out on a 2 GHz Intel Pentium IV PC with 512 kB cache and 1 GB RAM, running Linux CentOS release 4.2. All reported computing times are in seconds unless the time values end with "h", in which case the values are in hours. For our methods, we did 10 runs using different random seeds, and give the mean values of the successful runs. All values reported for zCore+ and AMUSE+ were obtained on our computer (since these algorithms could be downloaded from the web). For the other algorithms, the reported solutions and times are taken from the corresponding papers (when available). Bruni did his experiments on a 450 MHz Pentium II PC, Mazure et al. on a 133 MHz Pentium PC, MUP was tested on a 2.4 GHz Pentium IV, SMUS on a 2 GHz Pentium IV, and CAMUS on a 2.2 GHz Opteron processor with 8 GB of RAM.

In the following tables of results, "Name" gives the name of the tested instance, and "V" and "C" are the number of variables and clauses of the original instance or a MUS. Furthermore, "C'" refers to the number of clauses of the MUSC obtained after reducing a MUSV, and "t" gives the CPU times.

3.1 DIMACS instances

The first set of benchmark instances, which can be found at (Dimacs 1993), comes from the Second DIMACS Challenge Workshop. These instances are of four different types. The AIM instances, provided by Iwama et al., are artificial 3-SAT instances. The JNH instances, provided by Hooker, are difficult random instances generated by rejecting unit clauses and setting the density (clause to variable ratio) to a hard value. Lastly, the SSA and BF instances, from Van Gelder and Tsuji, are real-life instances from circuit fault analysis.

AIM instances  The AIM instances are relatively small instances which have only one MUS each (except two of them, which have two MUSs of the same size). For these instances, we found MUSs of variables and clauses with the REMOVAL, P+Insertion, HS-V and HS-C methods. The results are shown in Table 4, and are compared with those obtained by zCore+, AMUSE+ and by Mazure et al. and Bruni. We observe that our methods found MUSCs with an equal or lesser number of


Table 4 Computational results for the AIM instances. The columns give, for each instance (Name, V, C): V, C and t for zCore+, AMUSE+, Mazure and Bruni; V, C, C' and t for REMOVAL and P+Insertion; and V, C and t for HS-C and HS-V.


variables and clauses as those found by the other algorithms. For the instance aim100-1.6-2, Mazure et al. found a MUSC with one less clause than ours. We believe, however, this to be an error, since HS-C proved our MUSC with 53 clauses to be minimum. We note that in almost all cases, the MUSs are obtained very rapidly. For example, both the REMOVAL and HS-C methods find MUSCs in less than 3 seconds. The P+Insertion method is somewhat slower than REMOVAL. This is due to the fact that zChaff solves these instances exactly in a few milliseconds, faster than the time required by our tabu search algorithm. In addition, we observe that the zCore+ and AMUSE+ algorithms are significantly faster than the other algorithms.

JNH instances  Unlike the AIM instances, the JNH instances generally have many MUSCs. For these instances, we tested the REMOVAL, P+Insertion, HS-V and HS-C methods with a time limit of one hour, and compared our results with those obtained by Bruni and with zCore+ and AMUSE+. The results are shown in Table 5. For HS-V and HS-C, the given values represent the cardinality of the hitting sets found by each algorithm. When the time limit was reached, the values are preceded by a "≥" to indicate that they represent lower bounds on the number of variables or clauses of a MUS. We observe that for these larger instances, HS-V and HS-C could not find MUSs with a minimum number of variables or clauses, except for instance jnh2. Also, the P+Insertion algorithm could produce a MUSV but no MUSC within one hour of CPU time for instances jnh16 and jnh18.

We also observe that REMOVAL and P+Insertion found in many cases MUSs containing far fewer variables and clauses than those found by the other algorithms. However, zCore+ and AMUSE+ are significantly faster. Although slower, P+Insertion produces MUSs with substantially fewer variables and clauses than REMOVAL. Thus, for jnh13, REMOVAL obtained a MUSC of 66 variables and 92 clauses (on average 77.3 variables and 92.0 clauses), while P+Insertion found a MUSC of 45 variables and 55 clauses (on average 45.2 variables and 55.0 clauses). This is due to the neighborhood weight heuristic that guides the Insertion method towards smaller MUSs. Lastly, we can see that the bounds obtained by the HS methods are generally far from the values found by the other two methods. Yet, we do not know how close the actual MCUSs are to these bounds.

SSA and BF instances  The SSA and BF instances have many more variables and constraints than the AIM and JNH instances (a few thousand versus a few hundred). Table 6 shows the results obtained with zCore+, AMUSE+, REMOVAL, P+Insertion and the algorithm of Mazure et al. These instances are too big for the HittingSet methods. Again, we observe that REMOVAL obtains in many cases MUSCs with fewer variables and clauses than those obtained by the other algorithms, although it is significantly slower than zCore+ and AMUSE+. Note that both zCore+ and AMUSE+ use the zMinimal algorithm, which is similar to our Removal algorithm for finding MUSCs. The small number of variables in the MUSCs found by REMOVAL can be explained by its first phase, which is to apply Removal to find a MUSV (from which clauses are then removed to get a MUSC).

Furthermore, we notice that, as expected, the P+Insertion method is faster than REMOVAL to find MUSs containing few variables and clauses, but much slower


Table 5 Computational results for the JNH instances. The columns give, for each instance (Name, V, C): V, C and t for zCore+, AMUSE+ and Bruni; V, C, C' and t for REMOVAL and P+Insertion; and V and C for HS-V and HS-C.


Table 6 Computational results for the SSA and BF instances. The columns give, for each instance (Name, V, C): V, C and t for zCore+, AMUSE+ and Mazure; and V, C, C' and t for REMOVAL and P+Insertion.


in the other cases. Thus, for instance ssa6288-047, P+Insertion found the same MUSC as REMOVAL in a mean time of 373.9 seconds, instead of a mean time of 1440.0 seconds for REMOVAL.

3.2 Daimler Chrysler instances

The second set of benchmark instances, which can be found at (Sat benchmarks 2003), encodes different consistency properties of the configuration database of Daimler Chrysler's Mercedes car lines. The results obtained for these instances are presented in Tables 7, 8, 9 and 10.

In Table 7, we show results obtained with zCore+, AMUSE+, MUP, REMOVAL, P+Insertion, HS-C and HS-V. We have imposed a time limit of 10 hours. We observe that REMOVAL is much slower than zCore+, AMUSE+ and MUP, while producing MUSCs having about the same size. Furthermore, we see once more that P+Insertion is faster than REMOVAL when the MUSs are small, but fails to find the bigger MUSs. Finally, we notice that the HittingSet method is better suited for finding MCUSs of clauses than of variables. Thus, the HS-C method found, within an hour, a MCUS for each of the instances.

In Table 8, we compare the results obtained with zCore+, AMUSE+, SMUS, Removal, HS-C and HS-V, with a time limit of 10 hours. These instances were too difficult for the Max-WSAT heuristic of our Insertion method, and we therefore do not report any result for P+Insertion. In the cases where SMUS failed to find a MCUS, we instead give lower and upper bounds, separated by ":", on the number of clauses of a MUS. As expected, since they both guarantee finding MCUSs, our HS-C algorithm and SMUS find MUSCs with the same number of clauses. Yet, HS-C required less computational time than SMUS, even though the experiments involving SMUS were carried out on a more powerful computer. Furthermore, even though HS-C finds MUSs that have a minimum number of clauses, it is faster than REMOVAL. This is due to the fact that, unlike the REMOVAL algorithm, it builds MCUSs in a constructive manner, and the MCUSs have far fewer clauses than the original instances. Again, zCore+ and AMUSE+ are the fastest methods.

In Tables 9 and 10, we compare the results obtained with zCore+, AMUSE+, CAMUS and HS-C. When CAMUS succeeded in finding all the MUSs of an instance within a 600 seconds CPU time limit, "#MUS" gives the number of different MUSs contained in that instance, while "MUS Size" gives the smallest and largest number of clauses of these MUSs. Otherwise, if "#MUS" is preceded by ">", the given values were calculated using the MUSs found before the timeout. Finally, if "-" is given for all these values, then no MUS was found by their algorithm within the imposed time limit. We see that, while CAMUS was able to determine the smallest number of clauses of a MUS for 32 of the 84 instances within 600 seconds, our HS-C method found this value for 79 instances within the same time limit. Again, zCore+ and AMUSE+ are the fastest algorithms, but they do not necessarily produce a MUSC with a minimum number of clauses.

3.3 Hard instances from graph coloring

The instances considered so far were all relatively easy for exact algorithms (such as zChaff), making them tractable both by our Removal algorithms and by tech-


Table 7 Some computational results for Daimler Chrysler instances. The columns give, for each instance (Name, V, C): V, C and t for zCore+, AMUSE+ and MUP; V, C, C' and t for REMOVAL and P+Insertion; and V, C and t for HS-C and HS-V.


Table 8 More computational results for Daimler Chrysler instances. The columns give, for each instance (Name, V, C): V, C and t for zCore+ and AMUSE+; C and t for SMUS; V, C, C' and t for REMOVAL; and V, C and t for HS-C and HS-V.


Table 8 (Continued)

Instance (Name, V, C) | zCore+ (V, C, t) | AMUSE+ (V, C, t) | SMUS (C, t) | REMOVAL (V, C, C', t) | HS-C (V, C, t) | HS-V (V, C, t)

[The remaining data rows of Table 8 (instances in the C208, C210 and C220 series) could not be recovered from the source.]


Table 9 More computational results for Daimler Chrysler instances

Instance (Name, V, C) | zCore+ (V, C, t) | AMUSE+ (V, C, t) | CAMUS (t, #MUS, MUS size Min, MUS size Max) | HS-C (V, C, t)

C168_FW_SZ_107 1698 6599 41 47 0.8 43 50 1.1 – – – – 41.0 47.0 32.5

C168_FW_SZ_128 1698 5425 30 96 0.6 29 96 0.7 – – – – 28.0 92.0 52.9

C168_FW_SZ_41 1698 5387 27 30 0.3 25 27 0.5 – – – – 24.9 26.0 37.0

C168_FW_SZ_66 1698 5401 12 16 0.2 12 16 0.5 – – – – 12.0 16.0 11.6

C168_FW_SZ_75 1698 5422 47 48 0.3 47 48 0.5 – – – – 47.0 48.0 29.1

C168_FW_UT_2463 1909 7489 35 38 0.5 35 38 0.7 – – – – 32.0 35.0 50.7

C168_FW_UT_2468 1909 7487 32 35 0.5 32 35 0.8 – – – – 30.0 33.0 40.6

C168_FW_UT_2469 1909 7500 30 33 0.4 30 33 0.7 – – – – 29.0 32.0 38.8

C168_FW_UT_714 1909 7487 6 9 0.4 6 9 0.7 – – – – 6.0 9.0 4.0

C168_FW_UT_851 1909 7491 9 10 0.4 9 10 0.7 0.3 102 8 16 7.0 8.0 5.5

C168_FW_UT_852 1909 7489 9 10 0.4 9 10 0.7 0.3 102 8 16 7.0 8.0 5.4

C168_FW_UT_854 1909 7486 9 10 0.4 9 10 0.7 0.3 102 8 16 7.0 8.0 5.1

C168_FW_UT_855 1909 7485 9 10 0.4 9 10 0.7 0.3 102 8 16 7.0 8.0 5.2

C170_FR_RZ_32 1659 4956 30 327 0.8 30 327 0.7 0.8 32768 227 228 30.0 227.0 90.4

C170_FR_SZ_58 1659 5001 47 49 0.3 47 49 0.4 7.5 218692 46 63 43.0 46.0 19.7

C170_FR_SZ_92 1659 5082 46 131 0.5 46 131 0.6 0.1 1 131 131 46.0 131.0 39.7

C170_FR_SZ_95 1659 4955 22 63 0.3 21 62 0.5 – >23301932 53 66 22.0 52.0 27.0

C170_FR_SZ_96 1659 4955 22 64 0.3 22 63 0.4 – >10383703 67 82 22.0 53.0 28.6

C202_FS_RZ_44 1750 6199 12 18 0.3 21 27 0.5 – >7764186 29 54 12.1 18.0 11.9

C202_FS_SZ_104 1750 6201 30 35 0.3 29 34 0.5 – >4803992 70 94 19.0 24.0 15.3

C202_FS_SZ_121 1750 6181 20 22 0.3 21 23 0.5 0.1 4 22 24 20.0 22.0 11.4

C202_FS_SZ_122 1750 6179 19 33 0.3 19 33 0.6 0.1 1 33 33 19.0 33.0 10.2

C202_FS_SZ_74 1750 6355 34 150 0.7 34 150 0.7 – – – – 34.0 150.0 88.0

C202_FS_SZ_84 1750 6273 204 219 1.2 205 220 1.2 – – – – 199.0 214.0 599.3

C202_FS_SZ_95 1750 6184 11 13 0.3 13 15 0.5 – >6173760 35 50 6.0 7.0 5.4

C202_FS_SZ_97 1750 6250 26 33 0.4 24 34 0.6 – >5466033 99 125 21.0 28.0 17.1

C202_FW_RZ_57 1799 8685 30 213 1.0 30 213 1.3 0.4 1 213 213 30.0 213.0 81.3

C202_FW_SZ_100 1799 8738 22 26 0.4 29 30 0.8 – – – – 22.0 23.0 16.7

C202_FW_SZ_103 1799 10283 139 158 2.5 138 156 3.4 – – – – 133.0 148.0 424.5

C202_FW_SZ_118 1799 8811 47 130 0.8 47 130 0.9 – >11072627 130 132 46.0 129.0 110.5

C202_FW_SZ_123 1799 8686 21 37 0.5 21 37 0.7 0.2 4 36 38 20.0 36.0 13.2

C202_FW_SZ_124 1799 8684 19 33 0.4 19 33 0.8 0.1 1 33 33 19.0 33.0 11.2

C202_FW_SZ_61 1799 8745 16 18 0.4 16 18 0.7 – – – – 16.0 18.0 52.7

C202_FW_SZ_77 1799 8860 34 156 0.8 34 156 1.0 – – – – 34.0 156.0 135.9

C202_FW_SZ_87 1799 8946 243 379 3.1 244 398 3.6 – – – – 231.0 361.0 3519.9

C202_FW_SZ_96 1799 8849 205 210 1.5 215 220 1.8 – – – – 204.0 209.0 983.8

C202_FW_SZ_98 1799 8689 6 8 0.4 16 21 0.7 – – – – 6.0 7.0 6.5

C202_FW_UT_2814 2038 11352 22 28 0.6 15 19 1.1 – – – – 15.0 16.0 95.1

C202_FW_UT_2815 2038 11352 22 28 0.6 15 19 1.1 – – – – 15.0 16.0 96.9

C208_FA_RZ_43 1608 5297 8 9 0.2 6 8 0.4 – >87515 9 28 6.0 8.0 9.2

C208_FA_RZ_64 1608 5279 29 212 0.7 29 212 0.7 0.2 1 212 212 29.0 212.0 67.0

C208_FA_SZ_120 1608 5278 19 34 0.2 19 34 0.4 0.1 2 34 34 19.0 34.0 9.6


Table 9 (Continued)

Instance (Name, V, C) | zCore+ (V, C, t) | AMUSE+ (V, C, t) | CAMUS (t, #MUS, MUS size Min, MUS size Max) | HS-C (V, C, t)

C208_FA_SZ_121 1608 5278 18 32 0.2 18 32 0.4 0.1 2 32 32 18.0 32.0 9.1

C208_FA_SZ_87 1608 5299 17 20 0.2 17 19 0.4 0.9 12884 18 27 17.0 18.0 15.1

C208_FA_UT_3254 1876 7334 40 68 0.5 41 70 0.8 0.7 17408 40 74 38.0 40.0 18.6

C208_FA_UT_3255 1876 7337 40 68 0.5 41 70 0.8 1.4 52736 40 74 38.0 40.0 18.7

C208_FC_RZ_65 1654 5591 12 14 0.2 11 12 0.4 – – – – 11.0 12.0 13.8

C208_FC_RZ_70 1654 5543 29 212 0.8 29 212 0.7 0.2 1 212 212 29.0 212.0 67.5

C208_FC_SZ_107 1654 5641 36 47 0.3 38 50 0.5 – – – – 31.0 44.0 26.1

C208_FC_SZ_127 1654 5542 19 34 0.2 19 34 0.5 0.1 1 34 34 19.0 34.0 10.2

C208_FC_SZ_128 1654 5542 18 32 0.3 18 32 0.5 0.1 1 32 32 18.0 32.0 9.6

The Daimler Chrysler instances considered above are handled efficiently by exact techniques such as zCore+ and AMUSE+. In order to test our algorithms in a different context, we have generated instances which are much more difficult for exact algorithms. These five new instances were generated from random graph k-coloring problems, using Joe Culberson's converter (Culberson 2004). For each of these instances, we find a potential MUSV using the Insertion method. We then determine whether this core is satisfiable using zChaff; if the core is unsatisfiable, we have a proof that the original instance is unsatisfiable. Table 11 summarizes the results of this experiment. The columns labeled "zChaff" and "Check" give the computation time required by zChaff to solve the original instances and their extracted cores, with a computation time limit of 24 hours. We observe that, out of the five tested instances, zChaff was able to solve only one instance within the imposed time limit, while it succeeded in solving three of the extracted cores. Furthermore, zChaff required much less time to solve the cores. For example, while it failed to solve instance DGHP50_5 within 24 hours, it was able to solve its core in just 7.74 hours. Given that it took 0.87 hours to extract the core, the total time needed to prove this instance unsatisfiable was 8.61 hours. Notice that zCore+ and AMUSE+ were unable to treat these instances within the 24-hour time limit.
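To make this core-checking pipeline concrete, the following minimal Python sketch (not part of the original paper) assumes that a candidate variable set has already been produced by a heuristic such as Insertion, keeps only the clauses defined on those variables, and then runs an exact satisfiability test on the induced core; the brute-force test below is merely a stand-in for a solver like zChaff.

```python
from itertools import product

def induced_core(clauses, var_subset):
    """Clauses whose variables all belong to the candidate variable set
    (illustrative reading of the subformula induced by a set of variables)."""
    return [c for c in clauses if all(abs(lit) in var_subset for lit in c)]

def brute_force_sat(clauses):
    """Exponential stand-in for an exact solver such as zChaff
    (only meant for very small formulas)."""
    variables = sorted({abs(lit) for c in clauses for lit in c})
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return True
    return False

def prove_unsat_via_core(clauses, candidate_vars):
    """If the core induced by the candidate variables is unsatisfiable,
    then the whole formula is unsatisfiable as well."""
    return not brute_force_sat(induced_core(clauses, candidate_vars))

# Toy example: clauses x1, (not x1 or x2), (not x2) form an unsatisfiable core
# on the candidate variables {1, 2}, so the whole formula is unsatisfiable.
clauses = [[1], [-1, 2], [-2], [3, 4]]
print(prove_unsat_via_core(clauses, {1, 2}))   # True
```

If the induced core is unsatisfiable, so is the original formula, which is the kind of proof of unsatisfiability reported in Table 11.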

3.4 Overall discussion of the results

In the previous section we clearly observed that the proposed algorithms are slower than other algorithms such as zCore+ and AMUSE+ for extracting MUSCs. They have, however, several advantages compared to the most popular algorithms. First of all, our algorithms can find not only MUSCs but also MUSVs. Hence, for example, while the MUSC version of Removal is similar to zMinimal, we have shown that the MUSV version can first be applied to reduce the size of an instance, and a MUSC can then be extracted using the MUSC version of Removal. The resulting algorithm, called REMOVAL, has sometimes produced MUSCs with far fewer variables and clauses than the other algorithms. The P+Insertion algorithm not only follows the same approach (i.e., removing clauses from a MUSV to get a MUSC), but also uses a neighborhood weight heuristic in order to extract small MUSs.


Table 10 More computational results for Daimler Chrysler instances

Instance (Name, V, C) | zCore+ (V, C, t) | AMUSE+ (V, C, t) | CAMUS (t, #MUS, MUS size Min, MUS size Max) | HS-C (V, C, t)

C210_FS_RZ_23 1755 5778 22 35 0.3 23 41 0.6 – – – – 17.0 31.0 24.2

C210_FS_RZ_38 1755 5763 17 28 0.3 19 34 0.5 – >5365108 46 59 13.0 25.0 15.9

C210_FS_RZ_40 1755 5752 29 140 0.6 29 140 0.6 0.3 15 140 173 29.0 140.0 47.9

C210_FS_SZ_103 1755 5755 35 45 0.3 39 50 0.6 – – – – 35.0 45.0 31.9

C210_FS_SZ_107 1755 5762 11 15 0.3 11 15 0.5 – >2160988 48 65 11.0 15.0 6.9

C210_FS_SZ_123 1755 5921 49 176 0.7 49 176 0.8 – >10064216 177 234 49.0 176.0 83.9

C210_FS_SZ_129 1755 5753 20 33 0.3 20 33 0.5 0.1 1 33 33 20.0 33.0 10.2

C210_FS_SZ_130 1755 5753 19 31 0.3 19 31 0.5 0.1 1 31 31 19.0 31.0 9.6

C210_FS_SZ_55 1755 5781 28 45 0.3 27 46 0.6 – – – – 24.0 41.0 31.4

C210_FS_SZ_78 1755 5930 37 170 0.7 37 170 0.8 – – – – 37.0 170.0 116.5

C210_FW_RZ_30 1789 7426 22 39 0.4 27 48 0.8 – – – – 19.0 35.0 36.4

C210_FW_RZ_57 1789 7405 14 26 0.4 17 32 0.7 – >4597505 47 61 13.0 25.0 17.3

C210_FW_RZ_59 1789 7394 29 140 0.6 29 140 0.8 0.4 15 140 173 29.0 140.0 51.3

C210_FW_SZ_106 1789 7417 39 55 0.4 42 53 0.8 – – – – 38.0 49.0 37.2

C210_FW_SZ_111 1789 7404 11 15 0.4 11 15 0.6 – >6904528 39 51 11.0 15.0 7.8

C210_FW_SZ_128 1789 7412 14 23 0.4 13 23 0.7 – – – – 13.0 22.0 20.0

C210_FW_SZ_129 1789 7606 49 176 0.9 49 176 1.0 – >7528982 238 281 49.0 176.0 90.0

C210_FW_SZ_135 1789 7395 20 33 0.4 20 33 0.7 0.1 1 33 33 20.0 33.0 11.6

C210_FW_SZ_136 1789 7395 19 31 0.4 19 31 0.7 0.2 1 31 31 19.0 31.0 10.8

C210_FW_SZ_80 1789 7572 39 173 0.8 35 175 1.0 – – – – 37.0 171.0 155.3

C210_FW_SZ_90 1789 7994 216 279 3.1 228 291 3.9 – – – – 209.0 274.0 1225.7

C210_FW_SZ_91 1789 7721 214 277 2.6 216 279 3.1 – – – – 206.0 271.0 1093.4

C210_FW_UT_8630 2024 9721 29 38 0.6 25 38 1.1 – – – – 19.0 30.0 39.7

C210_FW_UT_8634 2024 9719 28 32 0.5 28 35 1.0 – – – – 18.0 23.0 30.0

C220_FV_RZ_12 1728 4512 10 11 0.2 13 14 0.4 1.7 80272 11 35 10.0 11.0 5.6

C220_FV_RZ_13 1728 4509 9 10 0.2 13 14 0.3 0.3 6772 10 27 9.0 10.0 5.6

C220_FV_RZ_14 1728 4508 10 11 0.2 10 11 0.4 0.1 80 11 18 10.0 11.0 3.8

C220_FV_SZ_114 1728 4777 47 132 0.5 47 132 0.6 – >1356651 320 364 47.0 132.0 62.3

C220_FV_SZ_121 1728 4508 14 58 0.3 14 58 0.4 0.2 9 58 65 14.0 58.0 17.3

C220_FV_SZ_39 1728 5263 199 205 2.2 211 217 2.8 – – – – 199.0 205.0 340.2

C220_FV_SZ_46 1728 4498 16 17 0.2 31 32 0.4 – >11089464 40 71 16.0 17.0 10.7

C220_FV_SZ_55 1728 5753 240 304 3.8 238 305 4.6 – >2822517 663 670 233.0 297.0 664.9

C220_FV_SZ_65 1728 4496 18 23 0.2 18 23 0.4 3.2 103442 23 40 18.3 23.0 9.7

We have observed that such a technique makes it possible, in many cases, to obtain MUSCs with significantly fewer clauses.
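As an illustration of the clause-removal strategy discussed above, here is a sketch (not taken from the paper) of the generic deletion scheme on which Removal and zMinimal are based; the brute-force satisfiability test is a placeholder for an exact solver such as zChaff, and the variable-level (MUSV) version would proceed analogously on variables instead of clauses.

```python
from itertools import product

def brute_force_sat(clauses):
    """Exponential placeholder for an exact SAT solver (illustration only)."""
    variables = sorted({abs(lit) for c in clauses for lit in c})
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return True
    return False

def deletion_musc(clauses, is_sat=brute_force_sat):
    """Deletion-based extraction of a minimal unsatisfiable subset of clauses:
    tentatively drop each clause, keep it out if the rest stays unsatisfiable,
    and restore it otherwise. Assumes the input formula is unsatisfiable."""
    core = list(clauses)
    i = 0
    while i < len(core):
        trial = core[:i] + core[i + 1:]
        if not is_sat(trial):   # clause i is not needed for unsatisfiability
            core = trial
        else:                   # dropping it restores satisfiability: keep it
            i += 1
    return core

print(deletion_musc([[1], [-1, 2], [-2], [3, -3]]))   # [[1], [-1, 2], [-2]]
```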

There are other important differences between the Insertion and the Removal algorithms. For example, the Insertion algorithm uses a tabu search heuristic to solve the Max-WSAT problem, while Removal uses the (exact) zChaff algorithm to solve the SAT problem. In many cases, the tabu search algorithm was slower than zChaff, which explains why Insertion was often slower than Removal.


Table 11 Computational results for instances coming from graph coloring

Instance (Name, V, C) | zChaff (t) | P+Insertion (V, C, t) | Check (t)

DGHP50_1 350 3641 5.23h 259 2277 0.43h 2.48h

DGHP50_2 400 4618 >24h ∗336 ∗3450 1.43h >24h

DGHP50_3 400 4650 >24h 288 2796 0.70h 17.42h

DGHP50_4 400 4602 >24h ∗288 ∗2724 2.22h >24h

DGHP50_5 400 4706 >24h 280 2619 0.87h 7.74h

However, we have built hard coloring instances that are very difficult for zChaff, while still manageable by the tabu search heuristic. For these instances, the Insertion algorithm was the only one able to find MUSVs proving the unsatisfiability of the considered instances. Another difference is that the Insertion algorithm builds MUSs in a constructive way. Therefore, when the tabu search heuristic makes no mistake, the number of steps is smaller than or equal to the number of clauses in the MUSC. Hence, it is not surprising that the Insertion algorithm is sometimes faster than the Removal algorithm when the MUSCs have far fewer clauses than the original instance.
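The constructive behaviour described above can be made concrete with the following sketch (again not from the paper): each outer iteration grows a satisfiable set of clauses until adding one more clause makes it unsatisfiable, and that transition clause is kept in the MUS under construction, so the number of outer iterations equals the size of the extracted MUSC. In the paper's Insertion algorithm the satisfiability questions are answered heuristically by tabu search on Max-WSAT; the brute-force test here is only a stand-in.

```python
from itertools import product

def brute_force_sat(clauses):
    """Exponential stand-in for the satisfiability test (illustration only)."""
    variables = sorted({abs(lit) for c in clauses for lit in c})
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return True
    return False

def insertion_musc(clauses, is_sat=brute_force_sat):
    """Constructive (insertion-style) extraction of a minimal unsatisfiable
    subset of clauses. Assumes the input formula is unsatisfiable."""
    mus = []
    while is_sat(mus):
        grown = list(mus)
        for c in clauses:
            if c in mus:
                continue
            if not is_sat(grown + [c]):
                mus.append(c)      # transition clause: it belongs to the MUS
                break
            grown.append(c)
    return mus

print(insertion_musc([[3, -3], [1], [-1, 2], [-2]]))   # [[-2], [-1, 2], [1]]
```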

The MUSC version of the HittingSet algorithm (i.e., HS-C) provides MUSCs with a minimum number of clauses (i.e., MCUSs). Unlike the Removal and the Insertion algorithms, the number of steps of HS-C is, in the worst case, exponential in the number of clauses. Therefore, it is not surprising that the algorithm failed to solve several instances, in particular some large DIMACS instances. However, when compared to SMUS and CAMUS on the Daimler instances, HS-C was the only algorithm able to solve all these instances, generally in less than one hour. It is moreover interesting to observe important differences between HS-C and the CAMUS algorithm. As mentioned in Sect. 1, CAMUS first builds the set of all co-MSSs, and then all MUSCs. When it is successful, this algorithm is much faster than our HS-C method. However, it may fail (during phase 1) if the number of co-MSSs is too large, or even (during phase 2) if the number of MUSCs is too large. Unlike CAMUS, our HS-C algorithm tries to discover an MCUS by generating only a limited number of (not necessarily minimal) co-MSSs (the sets denoted U in Fig. 3). This makes it possible to solve many instances that are out of reach for CAMUS.
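To illustrate the hitting-set side of this approach, here is a small sketch (not from the paper) that, given a collection of correction sets of clause indices playing the role of the sets U above, computes by brute force a minimum-cardinality set of clause indices intersecting all of them; in a hitting-set dualization scheme such a candidate would then be tested for unsatisfiability, and a further correction set would be generated whenever the test fails.

```python
from itertools import combinations

def minimum_hitting_set(correction_sets):
    """Smallest set of clause indices that intersects every given correction
    set. Brute force over increasing cardinality; illustration only, since
    minimum hitting set is NP-hard in general."""
    universe = sorted(set().union(*correction_sets))
    for k in range(1, len(universe) + 1):
        for candidate in combinations(universe, k):
            if all(set(candidate) & cs for cs in correction_sets):
                return set(candidate)
    return set(universe)

# Toy correction sets over clause indices.
print(minimum_hitting_set([{0, 1}, {1, 2}, {2, 3}]))   # e.g. {0, 2}
```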

4 Conclusion

We presented, in this paper, novel algorithms to find minimal unsatisfiable subsets (MUSs) of variables or clauses of a given SAT problem. We also described an algorithm that finds MUSs with a minimum number of clauses or variables. Furthermore, we showed how these algorithms can be used with exact or heuristic SAT solvers, depending on the difficulty of the SAT instances to solve. We saw that the extraction algorithms guarantee unsatisfiability and minimality when using an exact solver, and that a guarantee of minimality can be obtained, in some cases, while using a heuristic solver.


Moreover, we presented additional techniques to speed up the extraction of a MUS and to find MUSs containing fewer variables and clauses. Finally, we evaluated our algorithms with computational experiments on various instances, and compared our results with those obtained by state-of-the-art algorithms for the same task. These experiments have shown that our algorithms are not the fastest ones, but they generally produce MUSs with fewer variables and clauses. For very large instances on which exact algorithms (e.g., zChaff) cannot be applied to prove their unsatisfiability, our Insertion algorithm was the only one able to extract MUSs. Also, our HS-C algorithm can determine MUSCs of minimum cardinality on instances which are out of reach for algorithms like CAMUS. In summary, the proposed algorithms represent an interesting alternative to existing algorithms for the extraction of MUSs.

References

Amaldi E, Pfetsch ME, Trotter LEJ (1999) Some structural and algorithmic properties of the maximum feasible subsystem problem. In: Proceedings of the 7th international IPCO conference on integer programming and combinatorial optimization, June 1999, pp 45–59

Bailey J, Stuckey PJ (2005) Discovery of minimal unsatisfiable subsets of constraints using hitting set dualization. In: Proceedings of the 7th international symposium on practical aspects of declarative languages (PADL05). Lecture notes in computer science, vol 3350. Springer, Berlin, pp 174–186

Battiti R, Protasi M (1998) Approximate algorithms and heuristics for MAX-SAT. In: Ding-zhu D (ed) Handbook of combinatorial optimization, vol 1. Kluwer Academic, Boston

Borchers B, Furman J (1999) A two-phase exact algorithm for MAX-SAT and weighted MAX-SAT problems. J Comb Optim 2:299–306

Bruni R (2003) Approximating minimal unsatisfiable subformulae by means of adaptive core search. Discrete Appl Math 130(2):85–100

Chinneck JW (1997) Finding a useful subset of constraints for analysis in an infeasible linear program. INFORMS J Comput 9(2):164–174

Culberson J (2004) http://web.cs.ualberta.ca/~joe/Coloring/index.html

Dantsin E, Goerdt A, Hirsch EA, Kannan R, Kleinberg J, Papadimitriou C, Raghavan P, Schöning U (2002) A deterministic (2 − 2/(k + 1))^n algorithm for k-SAT based on local search. Theor Comput Sci 289(1):69–83

Davis M, Logemann G, Loveland D (1962) A machine program for theorem-proving. Commun ACM 5(7):394–397

Desrosiers C, Galinier P, Hertz A (2008) Efficient algorithms for finding critical subgraphs. Discrete Appl Math 156(2):244–266

Dimacs ftp site (1993) ftp://dimacs.rutgers.edu/pub/challenge/sat/benchmarks/cnf

Fleischner H, Kullmann O, Szeider S (2002) Polynomial-time recognition of minimal unsatisfiable formulas with fixed clause-variable difference. Theor Comput Sci 289(1):503–516

Franco J, Gu J, Purdom PW, Wah BW (1997) Satisfiability problem: theory and applications. In: DIMACS series in discrete mathematics and theoretical computer science, pp 19–152

Galinier P, Hertz A (2007) Solution techniques for the large set covering problem. Discrete Appl Math 155:312–326

Gleeson J, Ryan J (1990) Identifying minimally infeasible subsystems of inequalities. ORSA J Comput 2(1):61–63

Glover F, Laguna M (1997) Tabu search. Kluwer Academic, Boston

Goldberg E, Novikov Y (2000) Berkmin: a fast and robust SAT-solver. In: Design, automation, and test in Europe '02, March 2000, pp 142–149

Hansen P, Jaumard B (1990) Algorithms for the maximum satisfiability problem. Computing 44(4):279–303

Herrmann F, Hertz A (2002) Finding the chromatic number by means of critical graphs. ACM J Exp Algorithmics 7(10):1–9

Huang J (2005) Mup: a minimal unsatisfiability prover. In: Proceedings of the tenth Asia and South Pacific design automation conference (ASP-DAC-05), pp 432–437


Kleine Büning H, Zhao X (2002) Polynomial time algorithms for computing a representation for minimal unsatisfiable formulas with fixed deficiency. Inf Process Lett 84(3):147–151

Levesque H, Mitchell D, Selman B (1992) GSAT: a new method for solving hard satisfiability problems. In: Proceedings of the 10th national conference on artificial intelligence (AAAI-92), pp 440–446

Liffiton MH, Sakallah KA (2005) On finding all minimally unsatisfiable subformulas. In: Proceedings of the 8th international conference on theory and applications of satisfiability testing (SAT-2005). Lecture notes in computer science, vol 3569. Springer, Berlin, pp 173–186

Madigan CF, Malik S, Moskewicz MW, Zhang L, Zhao Y (2001) Chaff: engineering an efficient SAT solver. In: Proceedings of the 38th conference on design automation, June 2001, pp 530–535

Marques Silva JP, Sakallah KA (1999) GRASP: a search algorithm for propositional satisfiability. IEEE Trans Comput 48(5):506–521

Mazure B, Saïs L, Grégoire E (1997) Tabu search for SAT. In: Proceedings of the 14th national conference on artificial intelligence (AAAI-97), pp 281–285

Mazure B, Saïs L, Grégoire E (1998) Boosting complete techniques thanks to local search methods. Ann Math Artif Intell 22(3–4):319–331

Mills P, Tsang E (2000) Guided local search for solving SAT and weighted MAX-SAT problems. J Autom Reas 24(1):205–223

Mneimneh M, Lynce I, Andraus Z, Marques-Silva J, Sakallah K (2005) A branch-and-bound algorithm for extracting smallest minimal unsatisfiable formulas. In: Proceedings of international conference on theory and applications of satisfiability testing, vol 3569, pp 467–474

Oh Y, Mneimneh MN, Andraus ZS, Sakallah KA, Markov IL (2004) AMUSE: a minimally-unsatisfiable subformula extractor. In: Proceedings of the 41st annual conference on design automation. ACM, New York, pp 518–523

Sat benchmarks from automotive product configuration (2003) http://www-sr.informatik.uni-tuebingen.de/~sinz/DC/

Shang Y, Wah BW (1997) Discrete Lagrangian-based search for solving MAX-SAT problems. In: Proceedings of the 15th international joint conference on artificial intelligence, pp 378–383

Zhang H (1997) SATO: an efficient propositional prover. In: Proceedings of international conference on automated deduction (CADE-97), pp 272–275

Zhang L, Malik S (2003) Extracting small unsatisfiable cores from unsatisfiable boolean formulas. In: Sixth international conference on theory and applications of satisfiability testing (SAT2003), May 2003, pp 518–523