28
Lecture 8. Pointer Analysis Wei Le 2014.11

Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Lecture 8. Pointer Analysis

Wei Le

2014.11

Page 2: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Outline

I What it is?

I Why it is important?

I How hard is the problem?

I Classical algorithms: Andersen-Style and Steensgaard-Style

I Pointer analysis in Java

Page 3: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

What is pointer analysis and what is alias analysis?

I Pointer analysis aims to determine what memory location a pointerpoints to?Example: In Microsoft Phoenix, memory locations are labeled withtags, and each pointer is associated with a tag

I Alias analysis aims to determine when the two variables refer to thesame memory/storage locationExample: int x; p = &x; q = p; alias pairs: *p and *q, *p and x, *qand x

Page 4: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Aliases

Aliasing can arise due to:

I Pointers:e.g., int *p, i; p = &i;

I Call-by-reference:e.g, void m(Object a, Object b) m(x,x); // a and b alias in bodyof m

I Array indexing:e.g, int i,j,a[100]; i = j; // a[i] and a[j] alias

Page 5: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Alias Analysis

I Must-alias and maybe-alias (may-alias)I Aliasing that must occur during executionI Aliasing that may occur during execution

I Representation:I Complete alias pairs: store all alias pairs explicitly, such as in [Landi

et al.’92].I Compact alias pairs: store only basic alias pairs, and derive new alias

pairs by dereference, transitivity and commutativity, such as in [Choiet al.’93].

I Points-to relations: points-to pairs to indicate one variable ispointing to another, such as in [Emmai’93]

I Equivalence sets: sets that are aliases

I An example: ”q=&p; p=&i; r=&i;”I Complete: 〈∗q, p〉 〈∗p, i〉 〈∗r , i〉 〈∗ ∗ q, ∗p〉 〈∗ ∗ q, i〉 〈∗p, ∗r〉 〈∗ ∗ q, ∗r〉I Compact: 〈∗q, p〉 〈∗p, i〉 〈∗r , i〉I Points-to: (q, p)(p, i)(r , i)I Equivalence set: {∗p, i , ∗r}

Page 6: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Why Alias Analysis Is Important?

Compiler optimization:

Bug detection: we need to precisely check the relations, values andranges of variables

......

Page 7: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

How Hard Is This Problem? [3]

Page 8: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

How Hard Is This Problem?I Research started in late 1970s.I Moderate alias analysis implemented in commercial compilers, like

gcc and DECI Aggressive alias analysis still in research stage.I Classifications of existing alias analysis:

I Intra-procedural: isolated functions only.I Inter-procedural: consider function calls and their interactions;I Type-based techniques: use type information to decide alias.I Flow-based techniques: consider alias generated by flow of

statement.

Page 9: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

How Hard Is This Problem?

I Function calls: will formal parameters and global variables bechanged?

I Function pointers: how to resolve function pointers?

I Aggregate data structure: how to handle struct and union?

I Heap objects: how to include heap object?

I Recursive data structure: will it lead to infinite number of alias?

I Pointer arithmetic: will points-to relations change due to pointerarithmetic?

I Type casting: what to do if a variable is type-casted?

Page 10: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Modeling Memory Locations

I For global variables, use a single node

I For local variables, use a single node per context, just one node ifcontext insensitive

I For dynamically allocated memoryI problem: Potentially unbounded locations created at runtimeI Need to model locations with some finite abstraction

Page 11: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Modeling Dynamic Memory Locations

I For each allocation statement, use one node per context

I One node for entire heap

I One node for each type

I Nodes based on the shape of the heap

Page 12: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Recursive Data Structure

I 1-level: combine all elements as one big cell [Emami’93,Wilson etal’95]

I k-limiting: distinguish first k-elements (k is an arbitrary number)[Landi et al’92, Cheng et al’00]

I beyond k-limiting: consider all elements, using some kinds symbolicaccess paths [Deutch’94]

I shape analysis: not only distinguish all elements, but also tell howthey are connected (connection analysis) and what their shapeis.[Ghiya et al’95]

Page 13: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Shape AnalysisGoal:

I Static code analysis that discovers and verifies properties of linked,dynamically allocated data structures in (usually imperative)computer programs, e.g., discriminating between cyclic and acycliclists and proving that two data structures cannot access the samepiece of memory.

I Identify may-alias relationshipsI x points to an acyclic list, cyclic list, tree, dag, ...I disjointedness properties: x and y point to structures that do not

share cellsI show that data-structure invariants hold

Applications:I Verification: detecting memory leaks, proving the absence of mutual

exclusive and deadlockI Optimization: parallel programs

History:I Originally formulated by Jones and Muchnick, 1981I Mooly Sagiv, Tom Reps and Reinhard Wihelm, since 1995

Page 14: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Handling struct

Two approaches:

Page 15: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Andersen-Style Pointer Analysis [1]

I Flow-insensitive, context-insensitive analysis, first for C programs(1994) later for Java

I View pointer assignments as subset constraints, also called inclusionbased algorithms

I Use constraints to propagate points-to information

Page 16: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Andersen-Style Pointer Analysis: An Example

Page 17: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Andersen-Style Pointer Analysis

I Can be cast as a graph closure problem: one node for each pts(p),pts(a)

I Each node has an associated points-to set

I Compute transitive closure of graph, and add edges according tocomplex constraints

Page 18: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Andersen-Style Pointer Analysis: Workqueue Algorithm

Page 19: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Andersen-Style Pointer Analysis: Workqueue Algorithm

Page 20: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Andersen-Style Pointer Analysis: Workqueue Algorithm

Page 21: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Andersen-Style Pointer Analysis: Cycle Elimination

Page 22: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Andersen-Style Pointer Analysis: Cycle Elimination

Page 23: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Steensgaard-Style Pointer Analysis [5]

I Uses equality constraints instead of subset constraints (1996)

I Unification based approach: assignment unifies the graph nodes,e.g., x = y (unified x and y in the same node), also called union-findalgorithm, exclusion-based approaches, nearly linear complexity

I Less precise than Andersen-style, thus more scalable

Page 24: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Steensgaard-Style Pointer Analysis: An Example

Page 25: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Andersen vs. Steensgaard Pointer Analysis

I Both are flow-insensitive and context-insensitive

I Differ in points-to set construction

I Andersen: many out edges, one variable per node

I Steensgaard: one out edge, many variables per node

Page 26: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Pointer Analysis in Java [2, 4]

I Goal: determine the set of objects pointed to by a reference variableor a reference object field.

I Different languages use pointers differently:I Most C programs have many more occurrences of the address-of (&)

operator than dynamic allocationI Java allows no stack-directed pointers, many more dynamic

allocation sites than similar-sized C programsI Java strongly typed, limits set of objects a pointer can point to (Can

improve precision)I Larger libraries in Java, more entry points in JavaI ...

Page 27: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Object-Sensitive Pointer Analysis [2, 4]

Context-Sensitivity:

I Call site: by program statement of method invocation

I Object sensitivity: by receiving object of method invocation (state ofthe object is not merged)

Page 28: Lecture 8. Pointer Analysis - Iowa State Universityweb.cs.iastate.edu/~weile/cs641/8.PointerAnalysis.pdf · Andersen-Style Pointer Analysis [1] I Flow-insensitive, context-insensitive

Lars Ole Andersen.Program analysis and specialization for the c programming language.Technical report, 1994.

Laurie Hendren.Context-sensitive points-to analysis: Is it worth it.In Compiler Construction, 15th International Conference, volume3923 of LNCS, pages 47–64. Springer, 2006.

Michael Hind.Pointer analysis: Haven’t we solved this problem yet?In Proceedings of the 2001 ACM SIGPLAN-SIGSOFT Workshop onProgram Analysis for Software Tools and Engineering, PASTE ’01,pages 54–61, New York, NY, USA, 2001. ACM.

Ana Milanova, Atanas Rountev, and Barbara G. Ryder.Parameterized object sensitivity for points-to analysis for java.ACM Trans. Softw. Eng. Methodol., 14(1):1–41, January 2005.

Bjarne Steensgaard.Points-to analysis in almost linear time.In Proceedings of the 23rd ACM SIGPLAN-SIGACT Symposium onPrinciples of Programming Languages, POPL ’96, pages 32–41, NewYork, NY, USA, 1996. ACM.