
    Demand-driven Computation of Interprocedural Data Flow *

    Evelyn Duesterwald, Rajiv Gupta, Mary Lou Soffa

    Department of Computer Science

    University of Pittsburgh

    Pittsburgh, PA 15260

    {duester,gupta,soffa}@cs.pitt.edu

    * Partially supported by National Science Foundation Presidential Young Investigator Award CCR-9157371 and Grant CCR-9109089 to the University of Pittsburgh.

    Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association of Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. POPL '95, 1/95, San Francisco, CA, USA. © 1995 ACM 0-89791-892-1/95/0001....$3.50

    Abstract

    This paper presents a general framework for deriving demand-driven algorithms for interprocedural data flow analysis of imperative programs. The goal of demand-driven analysis is to reduce the time and/or space overhead of conventional exhaustive analysis by avoiding the collection of information that is not needed. In our framework, a demand for data flow information is modeled as a set of data flow queries. The derived demand-driven algorithms find responses to these queries through a partial reversal of the respective data flow analysis. Depending on whether minimizing time or space is of primary concern, result caching may be incorporated in the derived algorithm. Our framework is applicable to interprocedural data flow problems with a finite domain set. If the problem's flow functions are distributive, the derived demand algorithms provide information as precise as that of the corresponding exhaustive analysis. For problems with monotone but non-distributive flow functions, the provided data flow solutions are only approximate. We demonstrate our approach using the example of interprocedural copy constant propagation.

    1 Introduction

    Phrased in the traditional data flow framework [KU77], the solution to a data flow problem is expressed as the fixed point of a system of equations. Each equation expresses the solution at one program point in terms of the solution at immediately preceding (or succeeding) points. This formulation results in an inherently exhaustive solution; that is, to find the solution at one program point, the solution at all points must be computed.

    This paper presents an alternative approach to program analysis that avoids the costly computation of exhaustive solutions through the demand-driven retrieval of data flow information. We describe a general framework for deriving demand-driven algorithms that is aimed at reducing the time and/or space consumption of conventional exhaustive analyzers.
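    To make the exhaustive formulation concrete, the following is a minimal sketch of a conventional iterative solver for a simple forward "all paths" problem on a single procedure. It is an illustration under stated assumptions, not code from the paper: the function name `solve_exhaustive`, the gen/kill encoding, the node names, and the fact label "x=y" are all hypothetical. Note that the solution at every node is computed even if only one node is of interest.

```python
def solve_exhaustive(nodes, preds, gen, kill, entry, universe):
    """Iterate to a fixed point of
         IN[n]  = intersection of OUT[m] over all predecessors m of n
         OUT[n] = (IN[n] - kill[n]) | gen[n]
    with IN[entry] = {} (nothing holds at program entry)."""
    # Start from the optimistic assumption that every fact holds everywhere.
    inn = {n: set(universe) for n in nodes}
    out = {n: set(universe) for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                new_in = set()
            else:
                new_in = set(universe)
                for m in preds[n]:
                    new_in &= out[m]
            new_out = (new_in - kill.get(n, set())) | gen.get(n, set())
            if new_in != inn[n] or new_out != out[n]:
                inn[n], out[n] = new_in, new_out
                changed = True
    return inn


# Hypothetical flow graph: entry -> s1 -> loop <-> body, loop -> exit
NODES = ["entry", "s1", "loop", "body", "exit"]
PREDS = {"entry": [], "s1": ["entry"], "loop": ["s1", "body"],
         "body": ["loop"], "exit": ["loop"]}
GEN = {"s1": {"x=y"}}   # s1 establishes the fact "x=y"
KILL = {}               # no node invalidates it

solution = solve_exhaustive(NODES, PREDS, GEN, KILL, "entry", {"x=y"})
print(solution["exit"])
```

    Even though the caller here only inspects the result at `exit`, the solver had to visit and stabilize every node, which is the overhead a demand-driven algorithm aims to avoid.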

    Demand-driven analysis reduces the analysis cost by preventing the over-analysis of a program that occurs if parts of the analysis effort are spent on the collection of superfluous information. Optimizing and parallelizing compilers that exhaustively analyze a program with respect to each data flow problem of interest are likely to over-analyze the program. Typically, code transformations are applied only selectively over the program and therefore require only a subset of the exhaustive data flow solution. For example, some optimizations, such as loop optimizations, are applicable to only certain structures in a program. Even if optimizations are applicable everywhere in the program, one may want to reduce the overall optimization overhead by restricting their application to only the most frequently executed regions of the program (e.g., frequently called procedures or inner loops).

    One strategy for reducing the analysis cost in these applications is to simply limit the exhaustive analysis to only selected code regions. However, this strategy may prevent the application of otherwise safe optimizations due to the worst case assumptions that would have to be made at the entry and exit points of a selected code region. For example, data flow information that enters a selected code region from outside the region is vital in determining the side effects of procedure calls contained in that region. Similarly, data flow from outside a loop may be needed to simplify and/or determine the loop bounds or array subscripts in the loop. These applications favor a demand-driven approach that allows the reduction of the analysis cost while still providing all necessary data flow information.

    Another advantage of demand-driven analysis is its suitability for servicing on-line data flow requests in software tools. Interactive software tools that aid in debugging and understanding of complex code require information to be gathered about various aspects of a program. Typically, the information requested by a user is not exhaustive but selective, i.e., data flow for only a selected area of the program is needed. Moreover, the data flow problems to be solved are not fixed before the software tool executes but can vary depending on the user's requests. For example, during debugging a user may want to know where a certain value is defined in the program, as well as other data flow information that would help locate bugs. A demand-driven analysis approach naturally provides the capabilities to service requests whose nature and extent may vary depending on the user and the program.

    The utility of demand-driven analysis has previously been demonstrated for a number of specific analysis problems [CCF92, CHK92, CG93, SY93, SMHY93, Mas94]. Unlike these applications, the objective of our approach is to address demand-based analysis in a general way. We present a lattice based framework for the derivation of demand-driven algorithms for interprocedural data flow analysis. In this framework, a demand for a specific subset of the exhaustive solution is formulated as a set of queries. Queries may be generated automatically (e.g., by the compiler) or manually by the user (e.g., in a software tool). A query

    q = <y, n>

    raises the question as to whether a specific set of facts y is part of the exhaustive solution at program point n. A response (true or false) to the query q is determined by propagating q from point n in the reverse direction of the original analysis until all points have been encountered that contribute to the response for q. This query propagation is modeled as a partial reversal of the original data flow analysis. Specifically, by reversing the information flow associated with program points, we derive a system of query propagation rules. The response to a query is found after a finite number of applications of these rules. We present a generic demand algorithm that implements the query propagation and discuss two optimizations of the algorithm: (i) early termination to reduce the response time for a single query and (ii) result caching to optimize the performance over a sequence of queries. In the worst case, in which the amount of information demanded is equal to the exhaustive solution, the asymptotic complexity of the demand algorithm is no worse than the complexity of a standard iterative exhaustive algorithm.
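    The idea of answering a query by propagating it against the direction of the analysis can be sketched as follows. This is a simplified, single-procedure illustration for one fact of a forward "all paths" gen/kill problem, not the generic interprocedural algorithm derived in the paper; the function name `answer_query`, the gen/kill encoding, and the example flow graph are assumptions made for the sketch.

```python
from collections import deque


def answer_query(preds, gen, kill, entry, fact, node):
    """Answer the query <fact, node>: does `fact` hold on entry to `node`
    along every path from `entry`?  The query is propagated backward over
    predecessor edges; only nodes that can influence the answer are visited."""
    worklist = deque([node])
    raised = {node}              # nodes at which the query was already raised
    while worklist:
        n = worklist.popleft()
        if n == entry:
            return False         # nothing holds at program entry: stop early
        for m in preds[n]:
            if fact in gen.get(m, ()):
                continue         # m establishes the fact: resolved to true here
            if fact in kill.get(m, ()):
                return False     # m invalidates the fact: stop early
            if m not in raised:  # m passes the fact through: ask again at m
                raised.add(m)
                worklist.append(m)
    return True                  # no propagated query was falsified


# Hypothetical flow graph: entry -> s1 -> loop <-> body, loop -> exit
PREDS = {"entry": [], "s1": ["entry"], "loop": ["s1", "body"],
         "body": ["loop"], "exit": ["loop"]}
GEN = {"s1": {"x=y"}}            # s1 establishes the fact "x=y"

print(answer_query(PREDS, GEN, {}, "entry", "x=y", "exit"))
print(answer_query(PREDS, GEN, {"body": {"x=y"}}, "entry", "x=y", "exit"))
```

    Returning as soon as one propagated query is falsified corresponds to the early termination idea mentioned above. Result caching could be layered on top by remembering the answers for the nodes in `raised` across calls; that is omitted here to keep the sketch short.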

    The derivation of demand algorithms is based on a conventional exhaustive interprocedural analysis framework. Several formal frameworks for (exhaustive) interprocedural analysis have been described [CC77, Ros79, JM82, SP81, KS92]. We use the framework by Sharir and Pnueli [SP81] as the basis for our approach. We first follow the assumptions of the Sharir-Pnueli framework and consider programs with parameterless (recursive) procedures and with a single global address space. We then consider extensions to our framework to allow non-procedure valued reference parameters and local variables. These extensions are discussed for the example of demand-driven copy constant propagation.

    Our approach is applicable to monotone interprocedural data flow problems with a finite domain set (finite set of facts) and yields precise data flow solutions if all flow functions are distributive. This finiteness restriction does not apply if the program under analysis consists of only a single procedure (the intraprocedural case). The distributivity of the flow functions is needed to ensure that the derived demand algorithms are as precise as their exhaustive counterparts. Conceptually, our approach may also be applied to problems with monotone but non-distributive flow functions at the cost of reduced precision. We discuss the loss of information that is caused by non-distributive flow functions and show how our derived demand algorithms can still be used to provide approximate but safe query responses for non-distributive problems.
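    The role of distributivity can be illustrated with a small check of the condition f(x ⊓ y) = f(x) ⊓ f(y). The sketch below is a hedged illustration, not taken from the paper: the environment encoding, the `NAC` ("not a constant") marker, and the helper names `meet`, `assign_add`, and `assign_copy` are hypothetical. It compares a copy assignment, as interpreted by copy constant propagation, with an assignment that evaluates an arithmetic expression, as in general constant propagation.

```python
NAC = "NAC"  # marker for "not a constant"


def meet(e1, e2):
    """Pointwise meet of two environments (variable -> constant or NAC):
    a variable keeps its value only if both environments agree on it."""
    return {v: (e1[v] if e1[v] == e2[v] else NAC) for v in e1}


def assign_copy(env, target, source):
    """Flow function for `target = source` (copy constant propagation)."""
    out = dict(env)
    out[target] = env[source]
    return out


def assign_add(env, target, a, b):
    """Flow function for `target = a + b` (general constant propagation)."""
    out = dict(env)
    out[target] = NAC if NAC in (env[a], env[b]) else env[a] + env[b]
    return out


# Two incoming paths that disagree on a and b but agree on a + b.
p1 = {"a": 1, "b": 2, "x": NAC}
p2 = {"a": 2, "b": 1, "x": NAC}

# Copy assignment: applying f after the meet equals the meet of the results.
copy_after_meet = assign_copy(meet(p1, p2), "x", "a")
copy_per_path = meet(assign_copy(p1, "x", "a"), assign_copy(p2, "x", "a"))

# Addition: the meet of the per-path results knows x = 3, but applying f
# after the meet has already lost the values of a and b.
add_after_meet = assign_add(meet(p1, p2), "x", "a", "b")
add_per_path = meet(assign_add(p1, "x", "a", "b"), assign_add(p2, "x", "a", "b"))

print(copy_after_meet == copy_per_path)
print(add_after_meet == add_per_path)
```

    On this pair of environments the copy assignment satisfies the distributivity condition while the addition does not. A single example cannot establish distributivity in general, but it does show the kind of information that a non-distributive flow function loses at a meet point.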

    The class of distributive and finite data flow problems that can be handled precisely includes, among others, the interprocedural versions of the classical bitvector problems, such as live variables and available expressions, as well as common interprocedural problems, such as procedure side-effect analysis [CK88]. We have chosen the example of interprocedural copy constant propagation for illustrating the demand-driven framework in this paper.

    Section 2 reviews Sharir and Pnueli's interprocedural framework. In Section 3 we derive a system of query propagation rules from which we establish a generic demand algorithm. In Section 4 we discuss optimizations of the generic algorithm, which include early termination and result caching. Section 5 demonstrates the demand algorithm using the example of interprocedural copy constant propagation and presents the extensions to include reference parameters and local variables. We discuss related work in Section 6, and conclusions are given in Section 7.

    2 Background

    A program consisting of a set of (recursive) procedures is represented as an interprocedural flow graph (IFG) G = {G_1, ..., G_k} where G_p = (N_p,
