PRESTO Research Group, Ohio State University

Interprocedural Dataflow Analysis in the Presence of Large Libraries
Atanas (Nasko) Rountev
*
Performance optimizations in compilers
Software understanding and transformation
Software testing
e.g. dataflow-based testing; testing of object interactions in OO software
Software checking
CC 2006, Scott Kagan, PRESTO Research Group
*
Components C1, C2, …, Cn form a complete program
Assumption: it is possible and desirable to analyze the source code of the entire program
code for C1
code for C2
*
Main + Lib form a complete program
What if we are using large libraries that need to be re-analyzed from scratch?
e.g. the standard Java libraries contain about 10,000 classes and 80,000 methods
need to be re-analyzed with every new Main component
code for Main
code for Lib
*
[Figure: chart over the benchmark programs jlex, jb, proxy, javacup, rabbit, sablecc, db, compress, raytrace, fractal, echo, mpegaudio, jack, jtar, jflex, jess, mindterm, muffin, and javac; the plotted values were not preserved]
A Specific Case: Main + Lib
Goal: the solution for Main should be as good as the solution that would have been computed by a whole-program analysis (no loss of precision)
code for Lib
*
Sharir-Pnueli 1981
Edge function f: L → L for effects of a statement
Path function: f = fn ∘ fn-1 ∘ … ∘ f2 ∘ f1
Phase 1: summary functions φn: L → L
solution at node n as a function of the solution at the entry of n’s procedure
Phase 2: solutions at start nodes of procedures
Phase 3: solutions at the remaining nodes
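The three phases can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes gen/kill transfer functions over sets of facts with set union at merge points, and the node names (m0 to m3, p0, p1), the fact names (d1 to d3), and all helper functions are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GenKill:
    """Transfer function f(x) = (x - kill) | gen (an assumed function space)."""
    gen: frozenset = frozenset()
    kill: frozenset = frozenset()

    def __call__(self, x):
        return (frozenset(x) - self.kill) | self.gen

    def then(self, g):
        """Composition g ∘ self: apply self first, then g."""
        gen = (self.gen - g.kill) | g.gen
        return GenKill(gen, (self.kill | g.kill) - gen)

def gk(gen=(), kill=()):
    """Build a normalized gen/kill function (kill never overlaps gen)."""
    gen = frozenset(gen)
    return GenKill(gen, frozenset(kill) - gen)

def join(f, g):
    """Merge two path functions; None stands for 'no path found yet'."""
    if f is None or g is None:
        return g if f is None else f
    gen = f.gen | g.gen
    return GenKill(gen, (f.kill & g.kill) - gen)

ID = gk()

def phase1(procs, edges, calls):
    """Summary functions phi[n]: from the entry of n's procedure to n."""
    phi = {p["entry"]: ID for p in procs.values()}
    changed = True
    while changed:
        changed = False
        # A call behaves like an edge labelled with the callee's exit summary.
        steps = edges + [(c, r, phi.get(procs[q]["exit"])) for c, r, q in calls]
        for m, n, f in steps:
            if phi.get(m) is None or f is None:
                continue
            new = join(phi.get(n), phi[m].then(f))
            if new != phi.get(n):
                phi[n], changed = new, True
    return phi

def phase2(procs, calls, phi, owner, main, init):
    """Solutions at the start nodes of procedures."""
    entry = {name: None for name in procs}
    entry[main] = frozenset(init)
    changed = True
    while changed:
        changed = False
        for c, _r, q in calls:
            if entry[owner[c]] is None or c not in phi:
                continue
            new = phi[c](entry[owner[c]]) | (entry[q] or frozenset())
            if new != entry[q]:
                entry[q], changed = new, True
    return entry

def phase3(phi, owner, entry):
    """Solutions at the remaining nodes: apply phi[n] to the entry solution."""
    return {n: phi[n](entry[owner[n]]) for n in phi if entry[owner[n]] is not None}

# Hypothetical two-procedure program: main calls p between m1 and m2.
procs = {
    "main": {"entry": "m0", "exit": "m3", "nodes": ["m0", "m1", "m2", "m3"]},
    "p": {"entry": "p0", "exit": "p1", "nodes": ["p0", "p1"]},
}
edges = [
    ("m0", "m1", gk(gen={"d1"})),
    ("m2", "m3", ID),
    ("p0", "p1", gk(gen={"d2"}, kill={"d1"})),  # one branch in p
    ("p0", "p1", gk(gen={"d3"})),               # the other branch
]
calls = [("m1", "m2", "p")]  # (call node, return node, callee)
owner = {n: name for name, p in procs.items() for n in p["nodes"]}

phi = phase1(procs, edges, calls)
entry = phase2(procs, calls, phi, owner, "main", init=set())
solution = phase3(phi, owner, entry)
```

In this sketch phi["p1"] summarizes both branches of p once, and that summary is reused at the call in main rather than re-traversing the body of p.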
*
[Figure: example interprocedural control-flow graph covering main, p2, and p3, with nodes 1 and 14 to 26 and edge functions f0 to f8]
e.g. virtual dispatch in C++ and Java
Can no longer determine φ21 and φ13 without code for ext
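A small Python analogue of the virtual-dispatch situation is sketched below. The class and function names are hypothetical and do not come from the slides; the point is only that the call inside the library procedure may reach code that does not exist when the library is analyzed.

```python
class Shape:                      # library class, available for pre-analysis
    def area(self):
        return 0.0

def total_area(shapes):           # library procedure
    # The target of s.area() depends on the run-time class of s, so the
    # effect of this call cannot be summarized from library code alone.
    return sum(s.area() for s in shapes)

class Square(Shape):              # main-component class, written later
    def __init__(self, side):
        self.side = side

    def area(self):               # overrides the library method
        return self.side * self.side

result = total_area([Square(2), Shape()])
```

Any summary computed for total_area before Square is known has to leave the effect of s.area() open, which is the difficulty the slide describes for φ21 and φ13.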
[Figure: the example graph with components main and ext and procedures p1, p2, and p3; nodes 1 to 31, edge functions f0 to f9]
Compute functions for sets of library-local paths
φ = id
[Figure: library procedures p1, p2, and p3 (nodes 7 to 28) with edge functions f2 to f8]
“Fixed” call in the library
always invokes the same library procedure independent of code for main component
“Fixed” procedure in the library
makes no calls, or
standard functional approach can be applied
For any other procedure, compute φ for the paths from node k to node n (k → n), where
k is the start node, or
k is a return from a non-fixed call, or
k is a return from a fixed call to a non-fixed procedure
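The classification above can be sketched as a small fixpoint computation over the library's call sites. The sketch treats a procedure as non-fixed when it contains a non-fixed call or a fixed call to a non-fixed procedure, which is how the conditions on k read. The call-list encoding, the function names, the placement of return nodes 12 and 17 in p1 and p2, the start nodes 7 and 14, and the call from p2 to p3 with return node 20 are all assumptions made for illustration.

```python
def non_fixed_procedures(calls):
    """calls: list of (caller, callee, return_node).

    callee is None for a non-fixed call, i.e. one whose target may depend
    on the main component (hypothetical encoding).
    """
    non_fixed = {caller for caller, callee, _ in calls if callee is None}
    changed = True
    while changed:
        changed = False
        for caller, callee, _ in calls:
            # A fixed call to a non-fixed procedure makes the caller non-fixed.
            if callee in non_fixed and caller not in non_fixed:
                non_fixed.add(caller)
                changed = True
    return non_fixed

def summary_start_nodes(proc, start_node, calls, non_fixed):
    """Nodes k from which partial summary functions are computed in proc."""
    ks = {start_node}
    for caller, callee, ret in calls:
        if caller == proc and (callee is None or callee in non_fixed):
            ks.add(ret)
    return ks

# Hypothetical library call sites.
calls = [
    ("p2", None, 17),   # non-fixed call, returning at node 17
    ("p1", "p2", 12),   # fixed call to p2, returning at node 12
    ("p2", "p3", 20),   # fixed call to p3 (invented for this example)
]
non_fixed = non_fixed_procedures(calls)
starts_p1 = summary_start_nodes("p1", 7, calls, non_fixed)
starts_p2 = summary_start_nodes("p2", 14, calls, non_fixed)
```

Under these assumptions p3 stays fixed, so the return from the call to p3 does not start a new partial summary, while nodes 12 and 17 do.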
*
17: return from a non-fixed call
12: return from a fixed call to a non-fixed procedure
[Figure: paths from k to n in library procedures p1, p2, and p3 (nodes 7 to 28, edge functions f2 to f8)]
Create a “fake” graph for the whole program
Run a whole-program analysis engine
Safe solutions for non-library nodes
precise for distributive problems
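The difference between "safe" and "precise" in the last two points can be illustrated with a short Python sketch. It is an assumption-laden toy, not taken from the slides: it contrasts a distributive gen/kill function with the transfer function of constant propagation, a standard example of a non-distributive problem, and all names in it are hypothetical.

```python
NAC = "not-a-constant"  # lattice value for "differs between paths"

def meet(u, v):
    """Meet of two constant-propagation values."""
    return u if u == v else NAC

def meet_env(e1, e2):
    """Pointwise meet of two variable environments."""
    return {var: meet(e1[var], e2[var]) for var in e1}

def assign_x(env):
    """Transfer function for the statement x = a + b."""
    out = dict(env)
    a, b = env["a"], env["b"]
    out["x"] = NAC if NAC in (a, b) else a + b
    return out

def gen_kill(facts):
    """A distributive transfer function: kill d1, generate d2."""
    return (frozenset(facts) - {"d1"}) | {"d2"}

# Non-distributive case: merging before applying the function loses x == 3.
path1 = {"a": 1, "b": 2, "x": NAC}
path2 = {"a": 2, "b": 1, "x": NAC}
per_path = meet_env(assign_x(path1), assign_x(path2))
merged = assign_x(meet_env(path1, path2))

# Distributive case: merging first gives exactly the per-path result.
s1, s2 = frozenset({"d1"}), frozenset({"d3"})
distributes = gen_kill(s1 | s2) == gen_kill(s1) | gen_kill(s2)
```

Here per_path["x"] is 3 while merged["x"] is NAC: the merged answer is still safe but less precise. For the gen/kill function both orders agree, which is why summarizing sets of paths ahead of time costs no precision in the distributive case.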
[Figure: the “fake” whole-program graph for main, ext, p1, and p2, retaining nodes 1 to 7, 11 to 14, 16, 17, 21, and 29 to 31 with edge functions f0, f1, and f9]
[Figure: chart over the benchmark programs jb, socksproxy, jlex, RabbIT2, javacup, sablecc, db, compress, fractal, raytrace, socksecho, jack, jtar, jess, mpegaudio, jflex, mindterm, muffin, and javac; the plotted values were not preserved]
[Figure: a further chart over the benchmark programs jb, socksproxy, jlex, RabbIT2, javacup, sablecc, db, compress, fractal, raytrace, socksecho, jack, jtar, jess, mpegaudio, jflex, mindterm, muffin, and javac; the plotted values were not preserved]
Compact representation of functions
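One well-known compact representation, shown here only as an assumption about what such a representation could look like, stores a bit-vector transfer function as a (gen, kill) pair of integers. Composing any number of such functions still yields a single pair, so a summary for a long path needs no more space than a single edge function. The helper names and the bit patterns below are hypothetical.

```python
from functools import reduce

def compose(first, second):
    """Return the (gen, kill) pair for 'apply first, then second'."""
    g1, k1 = first
    g2, k2 = second
    return ((g1 & ~k2) | g2, k1 | k2)

def apply(fn, facts):
    """Apply a (gen, kill) function to a bit vector of facts."""
    gen, kill = fn
    return (facts & ~kill) | gen

IDENTITY = (0, 0)

# Three hypothetical edge functions along one path, as (gen, kill) bit masks.
path = [(0b0001, 0b0010), (0b0100, 0b0001), (0b0000, 0b1000)]
summary = reduce(compose, path, IDENTITY)

def stepwise(facts):
    """Reference result: apply the edge functions one after another."""
    for fn in path:
        facts = apply(fn, facts)
    return facts
```

Applying summary to any 4-bit input gives the same result as stepwise, while the summary itself occupies only two machine words.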