CBSE'05 1
Component-Level Dataflow Analysis
Atanas (Nasko) Rountev
Ohio State University
Outline
Interprocedural dataflow analysis
  Whole-program analysis: limitations
Problem: making dataflow analysis usable and useful for component-based software
  Technical challenges
Ongoing and future work
Uses of Dataflow Analysis
Software understanding tools
  e.g. dependence analysis for program slicing, change impact analysis, refactoring, etc.
Software testing
  e.g. dataflow-based testing; testing of object interactions in OO software
Software checking
  e.g. object protocols: open(read|write)*close
Performance optimizations in compilers
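To make the object-protocol example concrete, here is a minimal sketch that checks one concrete sequence of calls against open(read|write)*close. It is only an illustration: a dataflow-based checker would establish the property statically over all program paths, whereas this code inspects a single recorded trace. The function name follows_protocol and the trace-as-list encoding are assumptions, not part of the talk.

```python
import re

# Protocol from the slide: one open, any number of reads/writes, one close.
# Method names are joined with single spaces so each name stays a separate token.
PROTOCOL = re.compile(r"open( (read|write))* close")


def follows_protocol(calls):
    """Return True if the list of method names obeys open(read|write)*close."""
    return PROTOCOL.fullmatch(" ".join(calls)) is not None


if __name__ == "__main__":
    print(follows_protocol(["open", "read", "write", "close"]))
    print(follows_protocol(["open", "read"]))
```

The second call reports a violation because the object is never closed.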
Model for Whole-Program Analysis
Input: code for C1, code for C2, …, code for Cn
Analysis: Whole-Program Dataflow Analysis
Output: dataflow solution for C1 + C2 + … + Cn
C1 + C2 + … + Cn constitute a complete program
Implicit assumption: it is possible and desirable to analyze the source code of the entire program as a single unit
Limitations of Whole-Program Analysis
What if some of the components are only available in binary form?
What if we are building a library?
What if we are using large libraries that need to be re-analyzed from scratch?
  e.g. the standard Java libraries contain a few thousand classes
What if one part of the program changes?
  may have to re-analyze the entire program
Outline
Interprocedural dataflow analysis
  Whole-program analysis: limitations
Problem: making dataflow analysis usable and useful for component-based software
  Dozens of existing analyses could potentially become useful for component-based software
  In tools for software understanding, testing, checking, and optimization
  Technical challenges
A Simple Case: Main + Lib
Goal: the solution for Main should be as good as the solution that would have been computed by a whole-program analysis (no loss of precision)
Analysis of Lib
  Input: code for Lib
  Analysis: Component-Level Dataflow Analysis
  Output: summary for Lib; dataflow solution for Lib
Analysis of Main
  Input: code for Main; summary for Lib
  Analysis: Component-Level Dataflow Analysis
  Output: dataflow solution for Main
Component Model and Summary Info
Component = set of related procedures or classes
Component interactions: synchronous calls, shared variables
Challenge: more sophisticated component models
Summary information is computed based only on the source code of Lib
Challenge: use info from component specifications
Summary Functions
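[Figure: Main calls procedure Q in Lib; two paths, p1 and p2, run through Q]
Main calls procedure Q
  path p1: dataflow function f1
  path p2: dataflow function f2
Summary function for Q: fQ = f1 ∧ f2 (the meet of the two path functions)
  computed by the analysis of Lib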
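As an illustration of a summary function, the sketch below assumes a gen/kill ("bit-vector") family of dataflow functions and a "may" problem in which the meet is set union, as in reaching definitions. The talk does not fix a particular framework, so the GenKill class, the fact names d0, d1, d2, d9, and the choice of meet are all assumptions. Under them, fQ is built once from the two path functions of Q and then applied at a call site in Main without revisiting the code of Lib.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenKill:
    """Dataflow function f(X) = (X - kill) | gen over sets of facts."""
    gen: frozenset = frozenset()
    kill: frozenset = frozenset()

    def __call__(self, facts):
        return (frozenset(facts) - self.kill) | self.gen

    def then(self, later):
        """Composition: apply self first, then `later`."""
        return GenKill((self.gen - later.kill) | later.gen,
                       self.kill | later.kill)

    def meet(self, other):
        """Merge two alternative paths (union-based meet, an assumption)."""
        return GenKill(self.gen | other.gen, self.kill & other.kill)


# Hypothetical path functions for the two paths through Q.
f1 = GenKill(gen=frozenset({"d1"}), kill=frozenset({"d0"}))  # path p1
f2 = GenKill(gen=frozenset({"d2"}))                          # path p2

# Summary function for Q, computed from Lib alone.
f_Q = f1.meet(f2)

if __name__ == "__main__":
    incoming = {"d0", "d9"}  # facts holding at the call site in Main
    print(sorted(f_Q(incoming)))
```

Because gen/kill functions are closed under composition and meet, the summary stays as compact as a single path function; this is the kind of compact representation that is known for some analysis categories but not for all.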
Open Questions
Challenge: compact representation of dataflow functions and their transitive composition and meet
  Existing work solves this problem for some analysis categories; need generalizations
Challenge: callbacks
  e.g., function pointers in C
  e.g., virtual dispatch in C++ and Java
  Fundamental problem, not addressed adequately by existing work
Callbacks
[Figure: Main and Lib, with procedures Q and R; path p2 through Q includes the call to R]
Main calls procedure Q; during the call, Lib calls R
The function for p2 cannot be computed until Main is analyzed
Solution: summary functions for subpaths, computed during the analysis of Lib; later, compose them with the functions from Main
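The subpath idea can be sketched as follows, again assuming gen/kill dataflow functions with a union-based meet; the talk's theoretical model is more general. The PartialSummary class, its field names, and all fact names are hypothetical. The analysis of Lib stores the function for the callback-free path p1 plus the functions for the two subpaths of p2 on either side of the call to R. Once Main has been analyzed and a function for R is available, the pieces are composed into the full summary for Q.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenKill:
    """Dataflow function f(X) = (X - kill) | gen over sets of facts."""
    gen: frozenset = frozenset()
    kill: frozenset = frozenset()

    def __call__(self, facts):
        return (frozenset(facts) - self.kill) | self.gen

    def then(self, later):
        """Composition: apply self first, then `later`."""
        return GenKill((self.gen - later.kill) | later.gen,
                       self.kill | later.kill)

    def meet(self, other):
        """Merge two alternative paths (union-based meet, an assumption)."""
        return GenKill(self.gen | other.gen, self.kill & other.kill)


@dataclass(frozen=True)
class PartialSummary:
    """What the analysis of Lib can record for Q before Main is known."""
    direct: GenKill     # path p1: no callback involved
    before_cb: GenKill  # subpath of p2 from the entry of Q to the call to R
    after_cb: GenKill   # subpath of p2 from the return of R to the exit of Q

    def complete(self, f_R):
        """Plug in the function for R once Main has been analyzed."""
        f2 = self.before_cb.then(f_R).then(self.after_cb)
        return self.direct.meet(f2)


# Computed during the analysis of Lib (hypothetical values).
summary_Q = PartialSummary(
    direct=GenKill(gen=frozenset({"d1"}), kill=frozenset({"d0"})),
    before_cb=GenKill(gen=frozenset({"d2"})),
    after_cb=GenKill(kill=frozenset({"d3"})),
)

# Computed later, during the analysis of Main (hypothetical values).
f_R = GenKill(gen=frozenset({"d3", "d4"}), kill=frozenset({"d2"}))
f_Q = summary_Q.complete(f_R)

if __name__ == "__main__":
    print(sorted(f_Q({"d0"})))
```

The design choice here is that Lib is analyzed only once: a different Main, with a different R, reuses the same PartialSummary and only repeats the inexpensive complete step.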
Ongoing Work
Goal 1 (achieved): theoretical model for computing and using summary functions in the presence of callbacks
Goal 2 (ongoing): instantiate the model to common categories of analyses
  dependence analysis, pointer analysis, etc.
Goal 3: experimental evaluation
  e.g. how large are the summaries?
  Eclipse plug-in for call graph construction: needs summary info for all Java 1.4 libraries
Future Work
Beyond the traditional restrictions
  Use not only code, but also component specifications: e.g., “sharpen” the summary functions based on preconditions
  Higher level of abstraction for component interfaces and interactions
    Right now: low-level mechanisms such as procedure calls and shared variables
Extensive experimental evaluation on real-world software systems