Coverity White Paper-SAT-Next Generation Static Analysis 0


    The Next Generation of Static Analysis in C/C++ Code

    Boolean Satisfiability and Path Simulation: A Perfect Match

    Ben Chelf, Coverity CTO

    Andy Chou, Coverity Chief Scientist


    Introduction

    Since its introduction, static source code analysis has had a mixed reputation with development teams due to long analysis times, excessive noise, or an unacceptable rate of false-positive results. Excessive false-positive results are the main reason why many source code analysis products quickly become shelfware after a few uses. Despite early shortcomings, the promise of static analysis remained of interest to developers because the technology offers the ability to find bugs before software is run, improving code quality and dramatically accelerating the availability of new applications. Though static analysis has historically struggled to deliver on this promise, a groundbreaking new use of Boolean satisfiability (SAT) in the field is poised to help static analysis deliver on its potential.

    This white paper will provide a brief overview of the history of static analysis and explain how the use of SAT in static analysis is enabling developers to improve the quality and security of their code by identifying a greater number of critical defects in their code with the lowest rate of false-positive results in the industry.


    A Brief History of Static Analysis

    The Need for Tools to Find Code Defects

    Before the first software application was released, the first software defect(s) had been found and eliminated. Call them bugs, errors, failures or other names not suitable for publication: software defects have existed as long as software itself. As early applications evolved to become more robust and more complex, the remaining defects became more difficult to corral. Simply stated, the more lines of code necessary to create an application, the more defects one would expect to encounter during development.

    Consider a theoretical average ratio of one defect per 1,000 lines of code (likely a gross underestimation). As a code base expands from thousands of lines, to tens of thousands, to hundreds of thousands, this defect ratio would become overwhelming for developers relying exclusively on manual techniques to control the resulting volume of defects.

    With applications assuming more critical functions for business and industry, the consequence of defects in the field now mandates that software meet specific quality standards prior to release. It is at this intersection of opportunity and risk that developers turned to software itself to try and eliminate software defects earlier in the development life-cycle. Applying static analysis to software, the automated review of code prior to run-time with the intention of identifying defects, was an obvious solution to this fundamental challenge of ensuring code quality.

    "As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs."

    Maurice Wilkes, Inventor of the EDSAC, 1949

    1st Generation Static Analysis

    The first static analysis tool appeared in the late 1970s. Known most commonly as Lint, this can be regarded as the 1st generation of commercially viable static analysis. Lint held great promise for developers when it was initially released. For the first time, developers had the ability to automate the detection of software defects very early in the application lifecycle, when they were easiest to correct. By extension, this would provide developers with greater confidence in the quality of their code prior to release. The technology behind Lint was revolutionary, because it used compilers to conduct defect checking, enabling it to become the first viable static source code analysis solution.


    False Positives vs. Noise in Static Analysis

    There are two types of inaccurate results that can cause problems for static analysis solutions: noise and false positives. A noisy result is one where the analysis is technically correct but the result is irrelevant. A false positive result is one where the analysis asserts a fact about the code (e.g., "pointer could be null at line 50 and is dereferenced") and that fact is incorrect (e.g., the pointer could not possibly be null). It's important for any static analysis to have a low rate of noise and false positives. In other words, it should be right most of the time (low false positive rate) and report results that the end users find to be relevant (low noise rate).

    In reality though, Lint was not truly designed with the goal of identifying defects that cause run-time problems. Rather, its purpose was to flag suspicious or non-portable constructs in the code to help developers code in a more consistent format. By "suspicious code" we mean code that, while technically correct from the perspective of the source code language (e.g., C, C++), might be structured so that it could possibly execute in ways that the developer did not intend. The problem with flagging suspicious code is, like compiler warnings, that code of this type could, and often would, work correctly. Because of this, and Lint's limited analysis capabilities, noise rates were extremely high, often exceeding a 10:1 ratio between noise and real defects.

    Consequently, finding genuine defects required developers to conduct time-consuming manual reviews of Lint's results, compounding the exact problem that static analysis was supposed to eliminate. For this reason, Lint was never widely adopted as a defect detection tool, although it did enjoy some limited success with a few organizations. In fact, as a testament to the quality of Lint's underlying technology, many different versions of the product still remain available today.

    2nd Generation Static Analysis

    For nearly two decades static analysis remained more fiction than fact as a commercially viable production tool for identifying defects. In early 2000, a second generation of tools (e.g., Stanford Checker) emerged that offered enough value to become commercially viable. By leveraging new technology that expanded the capabilities of first generation tools past simple pattern matching to also focus on path coverage, second generation static analysis was able to uncover more defects with real run-time implications.

    These tools could also analyze entire code bases, not just one file. By shifting focus from "suspicious constructs" to "run-time defects," developers of these new static analysis technologies recognized the need to understand more of the inter-workings of code bases. This meant combining sophisticated path analysis with inter-procedural analysis to understand what happened when the flow of control passed from one function to another within a given software system.


    False Positives and Noise and 1st Generation Tools

    False-positive and noisy results have always been a challenge for static analysis. Even the developers behind the most popular 1st generation tools recognized this. Consider a quote from Stephen Johnson, the inventor of Lint:

    "Lint tries to give information with a high degree of relevance. Messages of the form 'xxx might be a bug' are easy to generate, but are acceptable only in proportion to the fraction of real bugs they uncover. If this fraction of real bugs is too small, the messages lose their credibility and serve merely to clutter up the output, obscuring the more important messages."

    Stephen Johnson, 7/26/1978

    Despite their adoption and use by organizations, 2nd generation static analysis still had difficulty finding the sweet spot between accuracy and scalability. Some solutions were accurate for a small set of defect types, but could not scale to analyze millions of lines of code. Others could run in a short amount of time, but had accuracy rates similar to Lint, introducing the familiar false-positive and noise problems. When implemented, these tools can report defects at an acceptable ratio, but only with restricted analysis parameters.

    The problem of wrestling with an elusive sweet spot between accuracy and scalability led to a false-positive problem that, like the noise problem preventing 1st generation tools from delivering on the promise of static analysis, slowed 2nd generation tools from being more rapidly adopted. While 2nd generation tools moved static analysis forward from a technology standpoint by identifying meaningful defects, their results were often found wanting for greater accuracy by developers.

    Many 2nd generation tools suffered from the heterogeneity of build and development environments. Due to the inconsistency in development environments between organizations, 2nd generation tools could often require painful, time-consuming customization or integration efforts for teams that used anything but the most common technologies such as plain Makefiles with a single target, Ant files, or Visual Studio project files.

    3rd Generation Static Analysis

    Today, a new generation of static analysis is emerging, technology that delivers unmatched quality of results from a solution that fits within existing development processes and environments. Leveraging solvers for Boolean satisfiability (SAT) to complement traditional path simulation analysis techniques, 3rd generation static analysis is providing developers with unmatched results that are both comprehensive and accurate.

    This paper will explain the technology behind SAT in depth, but for now, let's define SAT as the problem of determining if the variables of a given Boolean formula can be assigned in such a way as to make the formula evaluate to true. It is equally important to determine whether no such assignments exist, which would imply that the function expressed by the formula is identically false for all possible variable assignments. In this latter case, we would say that the formula is unsatisfiable; otherwise it is satisfiable.

    The innovative use of SAT allows static source code analysis to find the right defects in code without a high rate of distracting false positives. It also provides new capabilities that lay the foundation for further technological advances in static analysis. Before delving into the technology, some background on the use of SAT may provide some helpful context.

    Boolean Satisfiability

    In complexity theory, the Boolean satisfiability problem (SAT) is a decision problem whose instance is a Boolean expression written using only AND, OR, NOT, variables, and parentheses. The question is: Given the expression, is there some assignment of TRUE and FALSE values to the variables that will make the entire expression true? A formula of propositional logic is said to be satisfiable if logical values can be assigned to its variables in a way that makes the formula true.


    Leveraging SAT in Chip Design

    In the field of computer-aided engineering and design, EDA (Electronic Design Automation) tools have been used with tremendous success to eliminate costly design failures in hardware devices. For decades companies such as Cadence, Mentor Graphics and Synopsys have used advanced tools to simulate the operations of circuits prior to fabrication to verify their correctness and performance.

    The use of SAT was introduced in these tools in the 1980s to create a digital simulation of hardware, allowing developers to test a design's architectural operation and inspect the accuracy of performance at both cycle and interface levels. This visibility accelerated development and improved quality of chips by allowing EDA developers to more thoroughly simulate their designs prior to fabrication.

    Because of its successful history in hardware development, the introduction of SAT to software analysis is unlike previous advancements in static analysis. Leveraging SAT solvers for performing static analysis of hardware designs is a mature, sophisticated technique that incorporates decades of refinement from our peers in the hardware tools industry. Coverity's patent-pending incorporation of this technology lays the groundwork for a new generation of static analysis, where we can analyze code with two complementary analysis techniques (SAT and Path Simulation) to deliver unmatched accuracy of results.

    This paper will explain how SAT is pushing the technological capabilities of static analysis today, and how SAT offers the prospect of even greater innovation tomorrow.


    Static Analysis Today

    The 3rd Generation Exposed

    The software development challenges of today require companies to look for innovative ways to remove the critical performance flaws and security vulnerabilities in software before it gets to the quality assurance lab, or worse, into the field. Fortunately for developers, source code analysis is now up to the task. By combining breakthrough techniques in the application of Boolean satisfiability to software with the latest advances in path simulation analysis, the most sophisticated source code analysis can boast out-of-the-box false positive rates as low as 0 percent and scalability to tens of millions of lines of C/C++ or Java code.

    Software developers can use static analysis to automatically uncover errors that are typically missed by unit testing, system testing, quality assurance, and manual code reviews. By quickly finding and fixing these hard-to-find defects at the earliest stage in the software development life cycle, organizations are saving millions of dollars in associated costs.

    The Price of Failure in Software

    A 2002 National Institute of Standards and Technology (NIST) study titled "The Economic Impacts of Inadequate Infrastructure for Software Testing" estimated that software errors cost the U.S. economy an estimated $59.5 billion annually. The report states that leveraging test infrastructures that allow developers to identify and fix defects earlier and more effectively could eliminate more than one-third of these costs. For more information on this study, visit http://www.nist.gov/public_affairs/releases/n0-0.htm.


    Software DNA Map

    An extremely accurate representation of a software system based on understanding all operations that the build system performs as well as an authentic compilation of every source file in that build system. By providing an accurate representation of an application to the function, statement, and even expression level, the Software DNA Map enables static code analysis to overcome its previous limitations of excessive false positives and deliver accurate results that developers can put to immediate use.

    Software DNA Map

    Foundation of Superior Code Analysis

    If developers want superior analysis, they must have a perfect picture of their software, because code analysis is only as accurate as the data it is based upon. Imagine someone gave a software development team a system of a few million lines of code (a system they'd been working on for a long time) and asked them to draw a map of the software system. To be useful, this map would need to address all the following details: how every single file was compiled for all targets, how each set of files was linked together, the different binaries that were generated, the functions within each file and the corresponding callgraph, all the different control flow graphs through each function, and on and on. If the developers attempted to do this by hand, it would be practically impossible.

    Today, automating this onerous task is possible, and the process of creating such a picture, known as a Software DNA Map, is the foundation that permits static analysis to significantly improve code quality and security by leveraging both path simulation analysis and SAT.


    Boolean Satisfiability (SAT)

    As we discussed, the concept of Boolean satisfiability is not a new one, but recently, efficient SAT solvers have been developed to solve very complicated formulas with millions of variables.

    A SAT solver is a computer program that takes in a formula of variables under the operations of AND, OR, and NOT and determines if there is a mapping of each individual variable to TRUE or FALSE such that the entire formula evaluates to TRUE (satisfiable). If there is at least one such mapping, the solver returns a specific satisfying assignment. If no such assignment exists, the SAT solver indicates that the formula is unsatisfiable and may provide a proof demonstrating that it is unsatisfiable.
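    The contract described above (return a satisfying assignment, or report that none exists) can be illustrated with a minimal sketch in C. It is not how a production SAT solver works: it simply tries every assignment, which is only practical for a handful of variables. The names brute_force_sat, formula_fn, example_sat, and example_unsat are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A formula over n Boolean variables; bit i of `assignment` holds variable i. */
typedef bool (*formula_fn)(uint32_t assignment);

/* Brute-force "solver": tries all 2^n assignments, so n must stay small.
 * Returns true and stores a satisfying assignment in *model if one exists;
 * returns false if the formula is unsatisfiable. */
bool brute_force_sat(formula_fn f, unsigned n, uint32_t *model)
{
    assert(n < 32);
    for (uint32_t a = 0; a < (UINT32_C(1) << n); a++) {
        if (f(a)) {
            *model = a;
            return true;
        }
    }
    return false;
}

/* (x0 OR x1) AND NOT x0: satisfiable only with x0 = FALSE, x1 = TRUE. */
bool example_sat(uint32_t a)
{
    bool x0 = a & 1u;
    bool x1 = (a >> 1) & 1u;
    return (x0 || x1) && !x0;
}

/* x0 AND NOT x0: unsatisfiable for every assignment. */
bool example_unsat(uint32_t a)
{
    bool x0 = a & 1u;
    return x0 && !x0;
}
```

    Calling brute_force_sat(example_sat, 2, &model) reports satisfiable and stores the assignment with only x1 set, while brute_force_sat(example_unsat, 1, &model) reports unsatisfiable.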

    The application of SAT to software requires that source code be represented in a form that can be plugged automatically into a SAT solver. Fortunately, with a Software DNA Map, all the necessary information from the code is available to transform it into any representation desired. Because SAT solvers deal in TRUE, FALSE, AND, OR, and NOT, the relevant portions of this program can be transformed into these constructs.

    Take an 8-bit variable as an example:

    char a;

    To represent a as TRUES and FALSES, those 8 bits (1s and 0s) can be each thought of as TRUES and FALSES, so a becomes an array of 8 Boolean values:

    a0, a1, a2, a3, a4, a5, a6, a7

    The operations that make up expressions in the code also must be translated. All expressions in code can be converted to an equivalent formula of AND, OR, and NOT. The thinking behind this is that a compiler must turn these operations into instructions in machine code and that machine code must run on a chip. The chip's circuitry is nothing more than pushing 1s and 0s (high voltage and low voltage) through a number of gates (all of which can be simplified to AND, OR, and NOT); therefore, such a mapping exists. For example, to convert the expression:

    a == 19


    into a formula, the following expression would do the trick:

    !a7 ∧ !a6 ∧ !a5 ∧ a4 ∧ !a3 ∧ !a2 ∧ a1 ∧ a0

    In this example, ∧ denotes AND, a0 is the low bit of a, and a7 is the high bit. Plugging this into a SAT solver would render the following assignment of variables for the formula to be satisfied:

    a0 = True (1)
    a1 = True (1)
    a2 = False (0)
    a3 = False (0)
    a4 = True (1)
    a5 = False (0)
    a6 = False (0)
    a7 = False (0)

    Reading that assignment as binary, 00010011, shows that it is equivalent to 19.
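    As a hedged illustration of the translation above, the following C sketch encodes the bit-level formula for a == 19 and checks by exhaustive search that exactly one 8-bit value satisfies it. The helper names (bit, formula_a_eq_19, count_models) are hypothetical, and uint8_t is used in place of char to avoid questions about the signedness of char.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit i of the 8-bit value a, viewed as the Boolean variable a_i. */
static bool bit(uint8_t a, unsigned i)
{
    assert(i < 8);
    return (a >> i) & 1u;
}

/* The formula from the text:
 * !a7 AND !a6 AND !a5 AND a4 AND !a3 AND !a2 AND a1 AND a0 */
bool formula_a_eq_19(uint8_t a)
{
    return !bit(a, 7) && !bit(a, 6) && !bit(a, 5) && bit(a, 4) &&
           !bit(a, 3) && !bit(a, 2) && bit(a, 1) && bit(a, 0);
}

/* Counts the satisfying assignments among all 256 values of a and
 * stores the last one found in *model. */
unsigned count_models(uint8_t *model)
{
    unsigned count = 0;
    for (unsigned v = 0; v < 256; v++) {
        if (formula_a_eq_19((uint8_t)v)) {
            *model = (uint8_t)v;
            count++;
        }
    }
    return count;
}
```

    The search finds a single model, the value 19, which matches the assignment listed in the text.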

    Once the entire Software DNA Map is represented in this format of TRUES, FALSES, NOTS, ANDS, and ORS, a wide variety of formulas can be constructed from this representation and SAT solvers can be applied to analyze the code for additional, more sophisticated quality and security problems. It is this bit-accurate representation of the software that enables more precise static analysis than previously was possible based solely on path simulation.

    Bit-accurate Software Representation

    A representation of a software system in terms of formulas that are made up of Boolean variables under the logical operators AND, OR, and NOT.


    With a bit-accurate representation, static analysis can now leverage SAT solvers to analyze the software in this additional way as a complement to path simulation techniques. The diagrams below show the difference between a traditional path simulation representation (Figure 1) and bit-accurate representation (Figure 2):

    [Figure 1: Path Simulation] [Figure 2: Bit-Accurate]

    Once we have a bit-accurate representation of the code, the next step is to determine what questions we pose to SAT solvers to obtain useful information from them. Equally important is an understanding of how these questions interact with existing path simulation analysis to improve the overall quality of our analysis.


    SAT: False Path Pruning

    The first application of SAT solvers in static analysis to date has been for false path pruning. When performing path simulation analysis, sometimes a defect will be reported on a path that is infeasible at runtime. This reported defect should be eliminated from the static analysis results because there is no possible combination of values of variables where that defect could actually occur in the application. The methodologies for dealing with this problem within a path simulation analysis framework pale in comparison to the abilities a bit-accurate representation and a good SAT solver can provide in this regard.

    For example, with a bit-accurate representation of the source code, a 3rd generation static analysis solution can construct a formula that is a conjunction (AND) of all the conditions in a path that lead to any given defect discovered by path simulation analysis. By solving this formula, the SAT solver can indicate if there is a set of values for the variables involved in all the conditions such that the path can actually be executed. If the SAT solver says "satisfiable," then the path is feasible at runtime. If the SAT solver says "unsatisfiable," the path can never be executed and no bug should be reported.


    Scalability in SAT and Path Simulation Analysis

    With the technological heft of combined path simulation analysis and SAT-solving, many readers might be wondering if it's possible to perform all of this analysis yet still handle large amounts of code. The foundational technology in a Software DNA Map permits 3rd generation static analysis to effectively combine Path Simulation and SAT based analysis while scaling to millions of lines of code. The power of adding SAT to Path Simulation comes from the fact that the strengths of SAT allow the analysis to compensate for the natural weaknesses that occur when traditional Path Simulation is made to scale over millions of lines of code.

    In path simulation analysis, in order to scale to millions of lines of code in a reasonable amount of time, the code must be abstracted to a higher level representation that consists of states and transitions between those states. For example, when performing analysis that looks for NULL pointer dereferences, the analysis need not care about the some 4 billion different possible values for every pointer (assuming a 32 bit world). The analysis only cares about two values: NULL and NOT NULL.

    As you can see, this simplifying assumption significantly reduces the state space that must be explored. One drawback of this technique is that it also gives up some precision of value since now we consider all pointers that are NOT NULL to have the same value, but if we are only looking for places where NULL pointers are dereferenced, this is not a problem in practice.
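    The two-value abstraction can be sketched as a small state machine in C. This is an illustrative model only, not Coverity's implementation: the type and function names (ptr_state, ptr_event, step, has_null_deref) and the choice of three events are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The only two pointer values the abstract analysis distinguishes. */
typedef enum { PTR_NULL, PTR_NOT_NULL } ptr_state;

/* Operations on one pointer that the abstract analysis tracks. */
typedef enum { EV_ASSIGN_NULL, EV_ASSIGN_ADDR, EV_DEREF } ptr_event;

/* Applies one event to the abstract state. Sets *defect when a pointer
 * in the NULL state is dereferenced. */
ptr_state step(ptr_state s, ptr_event ev, bool *defect)
{
    assert(defect != NULL);
    switch (ev) {
    case EV_ASSIGN_NULL:
        return PTR_NULL;
    case EV_ASSIGN_ADDR:
        return PTR_NOT_NULL;   /* every non-NULL address looks the same */
    case EV_DEREF:
        if (s == PTR_NULL)
            *defect = true;
        return s;
    }
    return s;
}

/* Simulates a sequence of events along one path from an initial state and
 * returns true if a NULL pointer dereference occurs on that path. */
bool has_null_deref(ptr_state init, const ptr_event *events, size_t n)
{
    bool defect = false;
    ptr_state s = init;
    for (size_t i = 0; i < n; i++)
        s = step(s, events[i], &defect);
    return defect;
}
```

    A path that assigns NULL and then dereferences is flagged, while a path that reassigns a valid address before the dereference is not, regardless of which of the billions of concrete addresses was assigned.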

    With reduced state space, we must also look for ways to reduce the complexity of the total number of paths explored through the code base. Because there are an exponential number of paths through any given software system, tracing all those paths would lead to an analysis solution that is incapable of effectively scaling. Coverity's patent-pending technology meets this challenge by aggressively caching states that have already been explored. Going back to our pointer example, imagine we had a control flow graph that looked like the following:


    Figure 3: Control Flow Graph

    There are clearly four paths through this code base (a-b-d-e-g, a-c-d-e-g, a-b-d-f-g, a-c-d-f-g). However, we may not need to explore all four paths in the code. If there are no pointer operations in any of the blocks a-g, the analysis can do the following:

    1) Explore the path a-b-d-e-g. (NOTE: There are no pointers, so no state changes occur)

    2) Backtrack to the last decision point (d)

    3) Explore f-g (hence, a-b-d-f-g). (NOTE: There are no pointers, so no state changes occur)

    4) Then backtrack to the first decision point (a). Explore the other paths that begin with a-c-d

    It is when we reach point (d) in this particular case that we employ the use of cached knowledge of states to save analysis time. In exploring the third path, once we reach (d) for the third time, we know that all of the paths below (d) have been exhausted, and since we are entering (d) with the same state that we previously entered it, we can stop analyzing paths because we know the analysis has already covered all possible cases. Obviously, this is a very simplified view of software, but hopefully it illustrates the general concept behind caching states when performing path analysis.
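    The effect of caching can be sketched with a depth-first walk over the graph described above, where block a branches to b and c, both rejoin at d, d branches to e and f, and both rejoin at g. This is a simplified illustration under those assumptions, not the actual caching scheme: the cache here is a plain table of (block, state) pairs, and the names explore and blocks_analyzed are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { A, B, C, D, E, F, G, NUM_BLOCKS };
enum { NUM_STATES = 2 };   /* e.g. NULL and NOT NULL */

/* Successors of each block, listed in the order A..G; -1 means none. */
static const int succ[NUM_BLOCKS][2] = {
    { B, C }, { D, -1 }, { D, -1 }, { E, F },
    { G, -1 }, { G, -1 }, { -1, -1 },
};

/* Cache of (block, state) pairs that have already been explored. */
static bool seen[NUM_BLOCKS][NUM_STATES];

/* Depth-first exploration; returns how many blocks were analyzed.
 * With use_cache set, a block entered in an already-seen state is skipped. */
static int explore(int block, int state, bool use_cache)
{
    assert(block >= 0 && block < NUM_BLOCKS);
    assert(state >= 0 && state < NUM_STATES);
    if (use_cache) {
        if (seen[block][state])
            return 0;
        seen[block][state] = true;
    }
    int analyzed = 1;
    for (int i = 0; i < 2; i++) {
        /* No pointer operations in any block, so the state never changes. */
        if (succ[block][i] >= 0)
            analyzed += explore(succ[block][i], state, use_cache);
    }
    return analyzed;
}

/* Explores the whole graph from block a, starting with an empty cache. */
int blocks_analyzed(bool use_cache)
{
    memset(seen, 0, sizeof seen);
    return explore(A, 0, use_cache);
}
```

    Without the cache, walking all four paths analyzes 13 blocks in total; with the cache, each of the 7 blocks is analyzed once. The gap grows exponentially as more branch-and-rejoin diamonds are chained together.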

    All reasonable 2nd generation static analysis tools must perform some sort of this abstraction and caching in order to scale to millions of lines of code and still report reasonable defects. However, as you can imagine, the abstractions can lead to simplifying assumptions which in turn can lead to false positives or false negatives. With incorrect simplifying assumptions, caching may cache out in the wrong places. Because of the abstraction and caching that path simulation analysis already performs, SAT-based techniques need to be incorporated in this caching to scale properly.


    Combining SAT-based and Path Simulation Analysis

    An area where SAT differentiates 2nd and 3rd generation static analysis is in determining which paths in your code should be explored. One challenge that has historically limited the effectiveness of path simulation-only techniques is the elimination of paths that should be explored (improperly deciding a path does not need to be explored) and the exploration of paths that should not be explored (improperly deciding that a path could actually execute at run time when in fact it cannot). To give an example of the latter, consider the control flow graph above (Figure 3), but with the decisions at blocks (a) and (d) being the following:

    [a]: if (x == 0)

    [d]: if (x != 0)

    Assuming x does not change in blocks (b) or (c), while there might appear to be 4 paths through the control flow graph, we know that because of the dependency between the condition of (a) and condition of (d), there are only 2 paths through the code base. If the analysis decides to explore the path a-b-d-e-g, this would be a FALSE path because it's impossible to execute at runtime. Moreover, if the analysis reported a defect on this path, that defect would clearly be a false positive since that path cannot exist when running the program.

    To tackle the false path problem with SAT, we rely upon path simulation analysis to explore the paths, but before reporting a defect, the analysis conjoins all the conditions that must be true on a given path and runs the resulting formula through a SAT solver. If the SAT solver reports that the formula is satisfiable, we know the path must be feasible and the defect is reported. However, if SAT reports that it is unsatisfiable, we know we have explored a false path and the defect should not be reported. Given the above example, the formula we would plug into a SAT solver would be:

    x == 0 AND x != 0

    Clearly this is not satisfiable for any values of x (x can't be both zero and not zero) so we would know to prune this path and not report a defect, even if our path simulation analysis did not have the right abstraction to capture this data.
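    The feasibility check for this example can be sketched in C as follows. To keep the sketch self-contained, the SAT solver is replaced by an exhaustive search, and x is assumed to be an 8-bit value so that the search covers only 256 cases; the names cond_fn, x_is_zero, x_is_nonzero, and path_is_feasible are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One branch condition on a path, as a predicate over x. */
typedef bool (*cond_fn)(uint8_t x);

bool x_is_zero(uint8_t x)    { return x == 0; }
bool x_is_nonzero(uint8_t x) { return x != 0; }

/* Returns true if some value of x satisfies the conjunction (AND) of all
 * n conditions, i.e. the path could execute at runtime. Exhaustive search
 * stands in for the SAT solver here. */
bool path_is_feasible(const cond_fn *conds, size_t n)
{
    assert(conds != NULL || n == 0);
    for (unsigned v = 0; v < 256; v++) {
        bool all = true;
        for (size_t i = 0; i < n && all; i++)
            all = conds[i]((uint8_t)v);
        if (all)
            return true;
    }
    return false;
}
```

    For the path a-b-d-e-g the condition list is { x_is_zero, x_is_nonzero }, which path_is_feasible rejects, so a defect found on that path would be suppressed. A path that takes the true branch at (a) and the false branch at (d) yields { x_is_zero, x_is_zero }, which is feasible.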


    Adding SAT to path simulation analysis in this way does lead to a few technical challenges. Because performing a SAT computation does introduce some system overhead, we must discriminate when to leverage SAT solvers in order to maintain scalability with large code bases. Ideally, false paths are pruned as soon as there is an inconsistency in the conditions that lead to that path.

    Consider the above example with x == 0 and x != 0, but now imagine that the function is much longer and these two decisions are the first two in the function. Using traditional path simulation techniques, we may explore what is essentially the same false path multiple times due to the caching behavior of any given check that we perform. If that occurs, then SAT may be called a tremendous number of times to compute essentially the same thing (that x can't be zero and non-zero).

    To avoid this problem, whenever we perform a false path computation using SAT and it proves to be unsatisfiable (a false path), we reduce the formula to the simplest set of conditions that lead to the formula being unsatisfiable (a known technique in SAT-solving) and then annotate those nodes in the control flow graph for later use with the path simulation engine. This way, we essentially leverage the concept of caching not only with the path simulation states, but also with the information that the SAT solver returns when it performs a path computation.

    This interaction between path simulation analysis and SAT-based techniques is at the crux of Coverity's patent-pending use of SAT in static analysis, without which there would be no way for SAT to scale beyond a few thousand lines of code.
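    One simple way to reduce an unsatisfiable set of conditions to a smaller conflicting subset is deletion-based minimization: drop each condition in turn and keep it out if the remainder is still unsatisfiable. The C sketch below uses that approach purely for illustration; the paper does not say which reduction technique is used, and production solvers typically derive such subsets from the unsatisfiability proof instead. Exhaustive search over two assumed 8-bit variables stands in for the SAT solver, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One branch condition, as a predicate over two 8-bit variables. */
typedef bool (*cond_fn)(uint8_t x, uint8_t y);

bool c_x_zero(uint8_t x, uint8_t y)    { (void)y; return x == 0; }
bool c_y_gt_5(uint8_t x, uint8_t y)    { (void)x; return y > 5; }
bool c_x_nonzero(uint8_t x, uint8_t y) { (void)y; return x != 0; }

/* Exhaustive check: is the conjunction of the conditions marked in keep[]
 * satisfiable for some (x, y)? */
static bool satisfiable(const cond_fn *conds, const bool *keep, size_t n)
{
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned y = 0; y < 256; y++) {
            bool all = true;
            for (size_t i = 0; i < n && all; i++)
                if (keep[i])
                    all = conds[i]((uint8_t)x, (uint8_t)y);
            if (all)
                return true;
        }
    }
    return false;
}

/* Deletion-based minimization. On return, keep[] marks a subset of the
 * conditions that is still unsatisfiable but becomes satisfiable if any
 * one member is removed. Returns the size of that subset.
 * Precondition: the full conjunction is unsatisfiable. */
size_t minimize_conflict(const cond_fn *conds, bool *keep, size_t n)
{
    for (size_t i = 0; i < n; i++)
        keep[i] = true;
    assert(!satisfiable(conds, keep, n));

    size_t kept = n;
    for (size_t i = 0; i < n; i++) {
        keep[i] = false;                  /* tentatively drop condition i */
        if (satisfiable(conds, keep, n))
            keep[i] = true;               /* needed for the conflict */
        else
            kept--;                       /* irrelevant to the conflict */
    }
    return kept;
}
```

    Given the path conditions x == 0, y > 5, and x != 0, the reduction discards y > 5 and keeps the two conditions on x. An analysis could then record that pair on the corresponding control flow graph nodes and recognize the same false path later without another solver call.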


    The True-Positive Result

    Consider for a moment our earlier review of the history of static analysis and the problems with excessive false positive results. By identifying false paths with SAT, developers are no longer tasked with sifting through incorrect results from their static analysis products. This enables them to focus testing and analysis efforts on potential problems that have a real possibility of compromising the project at hand.

    Early test results of leveraging SAT-based false path pruning in the field have been very promising. Based on data from more than 0 open source projects at http://www.scan.coverity.com, a joint venture between Coverity, Symantec and the U.S. Department of Homeland Security, false-positive rates dropped by an additional 5% when the false path pruning SAT solver was incorporated in the overall analysis.


    Future Uses of SAT in Static Analysis

    It's important to note that false path pruning is just one example of how SAT can be leveraged to deliver better static source code analysis. Other potential applications of SAT include the ability to find problems such as string overflows, integer overflows or dead code, and the use of static assert checking to identify difficult-to-find logic bugs. While some instances of the defects in these categories can be discovered today, SAT-based analysis will allow us to build on the success of existing path simulation analysis to reach new levels of accuracy and precision in static code analysis.

    SAT will benefit software developers and architects by fundamentally changing the way they view static analysis of their source code. Just as going from simple grep parsing through code to practical path simulation analysis earlier this decade was a huge eye-opener for many developers ("You mean, it can show me the whole path to get to a real bug?"), leveraging SAT solvers for static analysis can impress anyone who writes software ("How could you possibly know that about my code?"). Because of this, SAT is the technological breakthrough that introduces a new generation of static analysis, bringing us one step closer to delivering on its long-awaited promise.


    About Coverity

    Coverity (www.coverity.com), the leader in improving software quality and security, is a privately held company headquartered in San Francisco. Coverity's groundbreaking technology enables developers to control complexity in the development process by automatically finding and helping to repair critical software defects and security vulnerabilities throughout the application lifecycle. More than 450 leading companies including ARM, Phillips, RIM, Rockwell-Collins, Samsung and UBS rely on Coverity to help them ensure the delivery of superior software.

    Free Trial

    Request a free Coverity trial and see first hand how to rapidly detect and remediate serious defects and vulnerabilities. No changes to your code are necessary. There are no limitations on code size, and you will receive a complimentary report detailing actionable analysis results. Register for the on-site evaluation at www.coverity.com or call us at (800) 873-893.

    Headquarters

    85 Berry Street, Suite 400

    San Francisco, CA 9407

    (800) 873-893

    http://www.coverity.com

    [email protected]

    Boston

    30 Congress Street

    Suite 303

    Boston, MA 00

    (67) 933-6500

    UK

    Coverity Limited

    Magdalen Centre

    Robert Robinson Avenue

    The Oxford Science Park

    Oxford OX4 4GA

    England

    Japan

    Coverity Asia Pacific

    Level 3, Shinjuku Nomura Bldg.

    -6- Nishi-Shinjuku, Shinjuku-ku

    Tokyo 63-053

    Japan