School of EECS, Peking University, “Advanced Compiler Techniques” (Fall 2011). Dataflow Analysis: Introduction. Guo, Yao. Part of the slides are adapted from MIT 6.035 “Computer Language Engineering”.

Dataflow Analysis Introduction


Page 1: Dataflow Analysis Introduction

School of EECS, Peking University

“Advanced Compiler Techniques” (Fall 2011)

Dataflow Analysis: Introduction

Guo, Yao

Part of the slides are adapted from MIT 6.035 “Computer Language Engineering”

Page 2: Dataflow Analysis Introduction

Dataflow Analysis

  • Last lecture: how to analyze and transform within a basic block
  • This lecture: how to do it for the entire procedure

Page 3: Dataflow Analysis Introduction

Outline

  • Reaching Definitions
  • Available Expressions
  • Live Variables

Page 4: Dataflow Analysis Introduction

Reaching Definitions

  • Concept of definition and use
      • a = x+y
      • is a definition of a
      • is a use of x and y
  • A definition reaches a use if the value written by the definition may be read by the use

Page 5: Dataflow Analysis Introduction

Reaching Definitions

[Figure: control flow graph of the running example]
  s = 0; a = 4; i = 0; branch on k == 0
  b = 1;  |  b = 2;        (the two branch arms)
  i < n                    (loop test, reached from both arms)
  s = s + a*b; i = i + 1;  (loop body)
  return s

Page 6: Dataflow Analysis Introduction

Reaching Definitions and Constant Propagation

  • Is a use of a variable a constant?
      • Check all reaching definitions
      • If all assign variable to same constant
      • Then use is in fact a constant
  • Can replace variable with constant
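The check described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `constant_at_use` and the encoding of a reaching definition as a `(variable, value)` pair (with `None` for a non-constant right-hand side) are hypothetical and do not come from the slides.

```python
from typing import Optional

# Hypothetical encoding: each reaching definition is (variable, value), where
# value is an int for a constant assignment and None for anything else.
def constant_at_use(var: str, reaching: list[tuple[str, Optional[int]]]) -> Optional[int]:
    """Return the constant value of `var` at a use, or None if it is not a constant."""
    values = {val for name, val in reaching if name == var}
    if len(values) == 1:
        (only,) = values
        return only  # None if the single definition is itself non-constant
    return None      # no definition, or definitions that disagree

# Definitions of a and b that reach the use in `s = s + a*b` in the running example.
reaching_at_use = [("a", 4), ("b", 1), ("b", 2)]
print(constant_at_use("a", reaching_at_use))  # 4
print(constant_at_use("b", reaching_at_use))  # None
```

The two calls mirror the next slides: a is constant because its only reaching definition is a = 4, while b is not because b = 1 and b = 2 both reach the use.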

Page 7: Dataflow Analysis Introduction

Is a Constant in s = s+a*b?

[Figure: control flow graph of the running example]
  s = 0; a = 4; i = 0; branch on k == 0
  b = 1;  |  b = 2;
  i < n
  s = s + a*b; i = i + 1;
  return s

Yes! On all reaching definitions, a = 4

Page 8: Dataflow Analysis Introduction

Constant Propagation Transform

[Figure: control flow graph after replacing a with 4]
  s = 0; a = 4; i = 0; branch on k == 0
  b = 1;  |  b = 2;
  i < n
  s = s + 4*b; i = i + 1;
  return s

Yes! On all reaching definitions, a = 4

Page 9: Dataflow Analysis Introduction

Is b Constant in s = s+a*b?

[Figure: control flow graph of the running example]
  s = 0; a = 4; i = 0; branch on k == 0
  b = 1;  |  b = 2;
  i < n
  s = s + a*b; i = i + 1;
  return s

No! One reaching definition with b = 1, one reaching definition with b = 2

Page 10: Dataflow Analysis Introduction

Splitting Preserves Information Lost at Merges

[Figure: two control flow graphs side by side]
  Original CFG:
    s = 0; a = 4; i = 0; branch on k == 0
    b = 1;  |  b = 2;
    i < n
    s = s + a*b; i = i + 1;
    return s
  Split CFG (one copy of the loop per branch arm):
    s = 0; a = 4; i = 0; branch on k == 0
    b = 1;  |  b = 2;
    i < n  |  i < n
    s = s + a*b; i = i + 1;  |  s = s + a*b; i = i + 1;
    return s  |  return s

Page 11: Dataflow Analysis Introduction

Splitting Preserves Information Lost at Merges

[Figure: two control flow graphs side by side]
  Original CFG:
    s = 0; a = 4; i = 0; branch on k == 0
    b = 1;  |  b = 2;
    i < n
    s = s + a*b; i = i + 1;
    return s
  Split CFG after constant propagation of b:
    s = 0; a = 4; i = 0; branch on k == 0
    b = 1;  |  b = 2;
    i < n  |  i < n
    s = s + a*1; i = i + 1;  |  s = s + a*2; i = i + 1;
    return s  |  return s

Page 12: Dataflow Analysis Introduction

Computing Reaching Definitions

  • Compute with sets of definitions
      • represent sets using bit vectors
      • each definition has a position in bit vector
  • At each basic block, compute
      • definitions that reach start of block
      • definitions that reach end of block
  • Do computation by simulating execution of program until reach fixed point
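As a small illustration of the bit-vector representation, the sketch below stores a set of definitions in a Python integer. The helper names `bit` and `show`, and the choice to print definition 1 as the leftmost bit (to match the vectors drawn on the slides), are assumptions made for this example.

```python
NUM_DEFS = 7  # definitions 1..7 in the running example

def bit(d: int) -> int:
    """Mask for definition d, with definition 1 as the leftmost bit."""
    return 1 << (NUM_DEFS - d)

def show(v: int) -> str:
    """Render a bit vector as a fixed-width string of 0s and 1s."""
    return format(v, f"0{NUM_DEFS}b")

first_block = bit(1) | bit(2) | bit(3)  # the set {1, 2, 3}
union = first_block | bit(4)            # set union is bitwise OR
difference = union & ~bit(1)            # set difference is AND with the complement

print(show(first_block))  # 1110000
print(show(union))        # 1111000
print(show(difference))   # 0111000
```

With this encoding, union, intersection, and difference each cost a handful of machine operations per word, which is why bit vectors are a common choice for these analyses.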

Page 13: Dataflow Analysis Introduction

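[Figure: the example CFG annotated with reaching-definition bit vectors]

Definitions (bit positions 1 2 3 4 5 6 7):
  1: s = 0;   2: a = 4;   3: i = 0;   4: b = 1;   5: b = 2;   6: s = s + a*b;   7: i = i + 1;

  Block                                     IN        OUT
  1: s = 0; 2: a = 4; 3: i = 0; k == 0      0000000   1110000
  4: b = 1;                                 1110000   1111000
  5: b = 2;                                 1110000   1110100
  i < n                                     1111111   1111111
  6: s = s + a*b; 7: i = i + 1;             1111111   0101111
  return s                                  1111111   1111111

The slide animates the iteration: the merge at i < n first sees 1111100 from the two branch arms, and becomes 1111111 once the loop body's OUT (0101111) flows back along the back edge.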

Page 14: Dataflow Analysis Introduction

Data-Flow Analysis Schema

  • Data-flow value: at every program point
  • Domain: the set of possible data-flow values for this application
  • IN[s] and OUT[s]: the data-flow values before and after each statement s
  • Data-flow problem: find a solution to a set of constraints on the IN[s]'s and OUT[s]'s, for all statements s
      • based on the semantics of the statements ("transfer functions")
      • based on the flow of control

Page 15: Dataflow Analysis Introduction

Constraints

  • Transfer function: relationship between the data-flow values before and after a statement
      • Forward: OUT[s] = f_s(IN[s])
      • Backward: IN[s] = f_s(OUT[s])
  • Within a basic block (s_1, s_2, …, s_n)
      • IN[s_{i+1}] = OUT[s_i], for all i = 1, 2, ..., n-1

Page 16: Dataflow Analysis Introduction

Data-Flow Schemas on Basic Blocks

  • Each basic block B (s_1, s_2, …, s_n) has
      • IN: data-flow values immediately before a block
      • OUT: data-flow values immediately after a block
  • IN[B] = IN[s_1]
  • OUT[B] = OUT[s_n]
  • OUT[B] = f_B(IN[B]), where f_B = f_{s_n} ∘ ··· ∘ f_{s_2} ∘ f_{s_1}
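The composition of per-statement transfer functions into a block transfer function can be sketched as follows. The helpers `make_transfer` and `compose` are hypothetical names, and the gen/kill form of each statement's function anticipates the reaching-definitions formulation used later in the deck.

```python
from functools import reduce

def make_transfer(gen: set, kill: set):
    """Transfer function of one statement in gen/kill form: f(x) = gen ∪ (x - kill)."""
    return lambda x: gen | (x - kill)

def compose(*fs):
    """compose(f1, f2, ..., fn) applies f1 first, i.e. it builds fn ∘ ... ∘ f2 ∘ f1."""
    return lambda x: reduce(lambda acc, f: f(acc), fs, x)

# Statements 6 (s = s + a*b) and 7 (i = i + 1) of the running example:
# definition 6 kills definition 1 (the other definition of s), and 7 kills 3.
f6 = make_transfer({6}, {1})
f7 = make_transfer({7}, {3})
f_B = compose(f6, f7)

print(sorted(f_B({1, 2, 3, 4, 5, 6, 7})))  # [2, 4, 5, 6, 7]
```

Because each statement function has the gen/kill shape, the composed function also has that shape, so a compiler can summarize a whole block by a single GEN set and a single KILL set.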

Page 17: Dataflow Analysis Introduction

Between Blocks

  • Forward analysis (e.g., reaching definitions)
      • IN[B] = ∪ OUT[P], over all predecessors P of B
  • Backward analysis (e.g., live variables)
      • IN[B] = f_B(OUT[B])
      • OUT[B] = ∪ IN[S], over all successors S of B

Page 18: Dataflow Analysis Introduction

Formalizing Reaching Definitions

  • Each basic block has
      • IN: set of definitions that reach beginning of block
      • OUT: set of definitions that reach end of block
      • GEN: set of definitions generated in block
      • KILL: set of definitions killed in block
  • GEN[s = s + a*b; i = i + 1;] = 0000011
  • KILL[s = s + a*b; i = i + 1;] = 1010000
  • Compiler scans each basic block to derive GEN and KILL sets
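One way a compiler might derive GEN and KILL by scanning a block is sketched below. The table `DEFS` (definition number to assigned variable) follows the numbering of the running example; the function `gen_kill` and the representation of a block as a list of definition numbers are assumptions for illustration.

```python
# Definition number -> variable it assigns, as numbered in the running example.
DEFS = {1: "s", 2: "a", 3: "i", 4: "b", 5: "b", 6: "s", 7: "i"}

def gen_kill(block: list[int]) -> tuple[set[int], set[int]]:
    """Scan the definitions of a block in order and derive its GEN and KILL sets."""
    gen: set[int] = set()
    kill: set[int] = set()
    for d in block:
        others = {e for e, v in DEFS.items() if v == DEFS[d] and e != d}
        gen -= others    # an earlier definition of the same variable is overwritten
        kill |= others   # every other definition of the variable is killed
        gen.add(d)
        kill.discard(d)  # d itself survives to the end of the block (so far)
    return gen, kill

def bits(defs: set[int]) -> str:
    return "".join("1" if d in defs else "0" for d in range(1, 8))

gen, kill = gen_kill([6, 7])  # the block s = s + a*b; i = i + 1;
print(bits(gen), bits(kill))  # 0000011 1010000
```

The printed vectors agree with the GEN and KILL values given on the slide for the loop body.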

Page 19: Dataflow Analysis Introduction

Example

[Figure not preserved in this transcript]

Page 20: Dataflow Analysis Introduction

Dataflow Equations

  • IN[b] = OUT[b1] ∪ ... ∪ OUT[bn]
      • where b1, ..., bn are predecessors of b in CFG
  • OUT[b] = (IN[b] - KILL[b]) ∪ GEN[b]
  • IN[entry] = 0000000
  • Result: system of equations

Page 21: Dataflow Analysis Introduction

Solving Equations

  • Use fixed point algorithm
  • Initialize with solution of OUT[b] = 0000000
  • Repeatedly apply equations
      • IN[b] = OUT[b1] ∪ ... ∪ OUT[bn]
      • OUT[b] = (IN[b] - KILL[b]) ∪ GEN[b]
  • Until reach fixed point
      • Until equation application has no further effect
  • Use a worklist to track which equation applications may have a further effect

Page 22: Dataflow Analysis Introduction

Reaching Definitions Algorithm

  for all nodes n in N
      OUT[n] = emptyset;    // OUT[n] = GEN[n];
  IN[Entry] = emptyset;
  OUT[Entry] = GEN[Entry];
  Changed = N - { Entry };    // N = all nodes in graph

  while (Changed != emptyset)
      choose a node n in Changed;
      Changed = Changed - { n };

      IN[n] = emptyset;
      for all nodes p in predecessors(n)
          IN[n] = IN[n] ∪ OUT[p];

      OUT[n] = GEN[n] ∪ (IN[n] - KILL[n]);

      if (OUT[n] changed)
          for all nodes s in successors(n)
              Changed = Changed ∪ { s };
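The worklist algorithm above can be sketched in Python and run on the running example. The block labels B1 to B6 are hypothetical (the slides do not name the blocks), the edges are read off the example CFG, and Python sets stand in for the bit vectors.

```python
# Hypothetical labels: B1 = entry block, B2 = "b = 1", B3 = "b = 2",
# B4 = "i < n", B5 = loop body, B6 = "return s".
NODES = ["B1", "B2", "B3", "B4", "B5", "B6"]
SUCCS = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"],
         "B4": ["B5", "B6"], "B5": ["B4"], "B6": []}
PREDS = {n: [p for p in NODES if n in SUCCS[p]] for n in NODES}
GEN = {"B1": {1, 2, 3}, "B2": {4}, "B3": {5}, "B4": set(), "B5": {6, 7}, "B6": set()}
KILL = {"B1": {6, 7}, "B2": {5}, "B3": {4}, "B4": set(), "B5": {1, 3}, "B6": set()}

def reaching_definitions(entry="B1"):
    out = {n: set() for n in NODES}
    in_ = {n: set() for n in NODES}
    out[entry] = set(GEN[entry])
    changed = [n for n in NODES if n != entry]
    while changed:
        n = changed.pop()
        in_[n] = set()
        for p in PREDS[n]:            # confluence: union over predecessors
            in_[n] |= out[p]
        new_out = GEN[n] | (in_[n] - KILL[n])
        if new_out != out[n]:         # only re-queue successors on a change
            out[n] = new_out
            for s in SUCCS[n]:
                if s not in changed:
                    changed.append(s)
    return in_, out

def bits(defs):
    return "".join("1" if d in defs else "0" for d in range(1, 8))

IN, OUT = reaching_definitions()
for n in NODES:
    print(n, bits(IN[n]), bits(OUT[n]))
```

The order in which nodes are taken from the worklist affects only the number of iterations, not the result; with these inputs the loop body ends with OUT = 0101111 and the loop test with IN = 1111111, as in the annotated figure.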

Page 23: Dataflow Analysis Introduction

Questions

  • Does the algorithm halt?
      • yes, because transfer function is monotonic
      • if increase IN, increase OUT
      • in limit, all bits are 1
  • If bit is 0, does the corresponding definition ever reach basic block?
  • If bit is 1, does the corresponding definition always reach the basic block?

Page 24: Dataflow Analysis Introduction

Outline

  • Reaching Definitions
  • Available Expressions
  • Live Variables

Page 25: Dataflow Analysis Introduction

Available Expressions

  • An expression x+y is available at a point p if
      • every path from the initial node to p must evaluate x+y before reaching p,
      • and there are no assignments to x or y after the evaluation but before p.
  • Available Expression information can be used to do global (across basic blocks) CSE
  • If expression is available at use, no need to reevaluate it

Page 26: Dataflow Analysis Introduction

Example: Available Expression

[Figure: control flow graph with four basic blocks]
  a = b + c; d = e + f; f = a + c
  g = a + c
  j = a + b + c + d
  b = a + d; h = c + f

Page 27: Dataflow Analysis Introduction

Is the Expression Available?

[Figure: the example CFG with one expression highlighted; the highlight is not preserved in this transcript]

YES!

Page 28: Dataflow Analysis Introduction

Is the Expression Available?

[Figure: the example CFG with one expression highlighted; the highlight is not preserved in this transcript]

YES!

Page 29: Dataflow Analysis Introduction

Is the Expression Available?

[Figure: the example CFG with one expression highlighted; the highlight is not preserved in this transcript]

NO!

Page 30: Dataflow Analysis Introduction

Is the Expression Available?

[Figure: the example CFG with one expression highlighted; the highlight is not preserved in this transcript]

NO!

Page 31: Dataflow Analysis Introduction

Is the Expression Available?

[Figure: the example CFG with one expression highlighted; the highlight is not preserved in this transcript]

NO!

Page 32: Dataflow Analysis Introduction

Is the Expression Available?

[Figure: the example CFG with one expression highlighted; the highlight is not preserved in this transcript]

YES!

Page 33: Dataflow Analysis Introduction

Is the Expression Available?

[Figure: the example CFG with one expression highlighted; the highlight is not preserved in this transcript]

YES!

Page 34: Dataflow Analysis Introduction

Use of Available Expressions

[Figure: the example CFG, before any replacement]
  a = b + c; d = e + f; f = a + c
  g = a + c
  j = a + b + c + d
  b = a + d; h = c + f

Page 35: Dataflow Analysis Introduction

Use of Available Expressions

[Figure: the example CFG, before any replacement]
  a = b + c; d = e + f; f = a + c
  g = a + c
  j = a + b + c + d
  b = a + d; h = c + f

Page 36: Dataflow Analysis Introduction

Use of Available Expressions

[Figure: the example CFG, before any replacement]
  a = b + c; d = e + f; f = a + c
  g = a + c
  j = a + b + c + d
  b = a + d; h = c + f

Page 37: Dataflow Analysis Introduction

Use of Available Expressions

[Figure: the example CFG with g = a + c rewritten to g = f]
  a = b + c; d = e + f; f = a + c
  g = f
  j = a + b + c + d
  b = a + d; h = c + f

Page 38: Dataflow Analysis Introduction

Use of Available Expressions

[Figure: the example CFG with g = a + c rewritten to g = f]
  a = b + c; d = e + f; f = a + c
  g = f
  j = a + b + c + d
  b = a + d; h = c + f

Page 39: Dataflow Analysis Introduction

Use of Available Expressions

[Figure: the example CFG with the operands of j reordered]
  a = b + c; d = e + f; f = a + c
  g = f
  j = a + c + b + d
  b = a + d; h = c + f

Page 40: Dataflow Analysis Introduction

Use of Available Expressions

[Figure: the example CFG with a + c in j replaced by f]
  a = b + c; d = e + f; f = a + c
  g = f
  j = f + b + d
  b = a + d; h = c + f

Page 41: Dataflow Analysis Introduction

Use of Available Expressions

[Figure: the example CFG with a + c in j replaced by f]
  a = b + c; d = e + f; f = a + c
  g = f
  j = f + b + d
  b = a + d; h = c + f

Page 42: Dataflow Analysis Introduction

Computing Available Expressions

  • Represent sets of expressions using bit vectors
  • Each expression corresponds to a bit
  • Run dataflow algorithm similar to reaching definitions
  • Big difference
      • definition reaches a basic block if it comes from ANY predecessor in CFG
      • expression is available at a basic block only if it is available from ALL predecessors in CFG

Page 43: Dataflow Analysis Introduction

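[Figure: CFG annotated with available-expression bit vectors]
  Basic blocks, in the order they appear on the slide:
    a = x+y; x == 0
    x = z; b = x+y;
    i < n
    c = x+y; i = i+c;
    d = x+y
    i = x+y;
  Expressions (bit positions 1 2 3 4):   1: x+y   2: i<n   3: i+c   4: x==0
  Bit vectors shown in the figure: 0000, 1001, 1000, 1000, 1100, 1100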

Page 44: Dataflow Analysis Introduction

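Global CSE Transform

[Figure: the same CFG, with a temporary t saving each evaluation of x+y]
  Basic blocks, in the order they appear on the slide:
    a = x+y; t = a; x == 0
    x = z; b = x+y; t = b
    i < n
    c = x+y; i = i+c;
    d = x+y
    i = x+y;
  Expressions (bit positions 1 2 3 4):   1: x+y   2: i<n   3: i+c   4: x==0
  Bit vectors shown in the figure: 0000, 1001, 1000, 1000, 1100, 1100

Must use same temp for CSE in all blocks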

Page 45: Dataflow Analysis Introduction

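Global CSE Transform

[Figure: the same CFG, with available evaluations of x+y replaced by t]
  Basic blocks, in the order they appear on the slide:
    a = x+y; t = a; x == 0
    x = z; b = x+y; t = b
    i < n
    c = t; i = i+c;
    d = t
    i = t;
  Expressions (bit positions 1 2 3 4):   1: x+y   2: i<n   3: i+c   4: x==0
  Bit vectors shown in the figure: 0000, 1001, 1000, 1000, 1100, 1100

Must use same temp for CSE in all blocks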

Page 46: Dataflow Analysis Introduction

Formalizing Analysis

  • Each basic block has
      • IN: set of expressions available at start of block
      • OUT: set of expressions available at end of block
      • GEN: set of expressions computed in block (and not killed later)
      • KILL: set of expressions killed in block (and not re-computed later)
  • GEN[x = z; b = x+y] = 1000
  • KILL[x = z; b = x+y] = 0001
  • Compiler scans each basic block to derive GEN and KILL sets

Page 47: Dataflow Analysis Introduction

Dataflow Equations

  • IN[b] = OUT[b1] ∩ ... ∩ OUT[bn]
      • where b1, ..., bn are predecessors of b in CFG
  • OUT[b] = (IN[b] - KILL[b]) ∪ GEN[b]
  • IN[entry] = 0000
  • Result: system of equations

Page 48: Dataflow Analysis Introduction

Solving Equations

  • Use fixed point algorithm
  • IN[entry] = 0000
  • Initialize OUT[b] = 1111
  • Repeatedly apply equations
      • IN[b] = OUT[b1] ∩ ... ∩ OUT[bn]
      • OUT[b] = (IN[b] - KILL[b]) ∪ GEN[b]
  • Use a worklist algorithm to reach fixed point

Page 49: Dataflow Analysis Introduction

Available Expressions Algorithm

  for all nodes n in N
      OUT[n] = E;
  IN[Entry] = emptyset;
  OUT[Entry] = GEN[Entry];
  Changed = N - { Entry };    // N = all nodes in graph

  while (Changed != emptyset)
      choose a node n in Changed;
      Changed = Changed - { n };

      IN[n] = E;    // E is set of all expressions
      for all nodes p in predecessors(n)
          IN[n] = IN[n] ∩ OUT[p];

      OUT[n] = GEN[n] ∪ (IN[n] - KILL[n]);

      if (OUT[n] changed)
          for all nodes s in successors(n)
              Changed = Changed ∪ { s };
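A Python sketch of this algorithm is shown below on a small diamond-shaped CFG. The CFG, the block labels B1 to B4, and the two-expression universe are assumptions chosen for illustration; only the GEN and KILL sets of B2 (x = z; b = x+y) mirror the values given on the earlier slide.

```python
E = {"x+y", "x==0"}  # universe of expressions tracked in this sketch

# Hypothetical diamond CFG: B1 branches to B2 and B3, which merge at B4.
NODES = ["B1", "B2", "B3", "B4"]
SUCCS = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
PREDS = {n: [p for p in NODES if n in SUCCS[p]] for n in NODES}
GEN = {"B1": {"x+y", "x==0"},  # a = x+y; x == 0
       "B2": {"x+y"},          # x = z; b = x+y
       "B3": set(),            # empty block
       "B4": {"x+y"}}          # d = x+y
KILL = {"B1": set(), "B2": {"x==0"}, "B3": set(), "B4": set()}

def available_expressions(entry="B1"):
    out = {n: set(E) for n in NODES}  # optimistic start: everything available
    in_ = {n: set() for n in NODES}
    out[entry] = set(GEN[entry])
    changed = [n for n in NODES if n != entry]
    while changed:
        n = changed.pop()
        in_[n] = set(E)
        for p in PREDS[n]:            # confluence: intersection over predecessors
            in_[n] &= out[p]
        new_out = GEN[n] | (in_[n] - KILL[n])
        if new_out != out[n]:
            out[n] = new_out
            for s in SUCCS[n]:
                if s not in changed:
                    changed.append(s)
    return in_, out

IN, OUT = available_expressions()
print(sorted(IN["B4"]))  # ['x+y']
```

At the merge, x==0 is dropped because it is killed on the path through B2, while x+y stays available because it is recomputed there after the assignment to x.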

Page 50: Dataflow Analysis Introduction

Questions

  • Does algorithm always halt?
  • If expression is available in some execution, is it always marked as available in analysis?
  • If expression is not available in some execution, can it be marked as available in analysis?

Page 51: Dataflow Analysis Introduction

Duality In Two Algorithms

  • Reaching definitions
      • Confluence operation is set union
      • OUT[b] initialized to empty set
  • Available expressions
      • Confluence operation is set intersection
      • OUT[b] initialized to the set of all expressions (E)
  • General framework for dataflow algorithms.
  • Build parameterized dataflow analyzer once, use for all dataflow problems
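A parameterized analyzer of this kind might look like the sketch below. The function `solve`, its parameter names, and the toy diamond CFG with a single fact "f" are all hypothetical; the point is only that the confluence operator and the initial value are arguments, so one worklist loop serves both a "may" (union) and a "must" (intersection) analysis.

```python
from functools import reduce

def solve(nodes, preds, succs, start, transfer, meet, init, forward=True):
    """Generic worklist solver. `after[n]` is OUT[n] when forward, IN[n] when backward."""
    sources, sinks = (preds, succs) if forward else (succs, preds)
    before = {n: set() for n in nodes}
    after = {n: set(init) for n in nodes}
    after[start] = transfer(start, set())  # boundary node starts from the empty set
    work = [n for n in nodes if n != start]
    while work:
        n = work.pop()
        incoming = [after[m] for m in sources[n]]
        before[n] = set(reduce(meet, incoming)) if incoming else set(init)
        new_after = transfer(n, before[n])
        if new_after != after[n]:
            after[n] = new_after
            for m in sinks[n]:
                if m not in work:
                    work.append(m)
    return before, after

# Toy diamond CFG: the fact "f" is generated on only one of the two paths.
NODES = ["B1", "B2", "B3", "B4"]
SUCCS = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
PREDS = {n: [p for p in NODES if n in SUCCS[p]] for n in NODES}
GEN = {"B1": set(), "B2": {"f"}, "B3": set(), "B4": set()}
KILL = {n: set() for n in NODES}

def transfer(n, x):
    return GEN[n] | (x - KILL[n])

may_in, _ = solve(NODES, PREDS, SUCCS, "B1", transfer, set.union, set())
must_in, _ = solve(NODES, PREDS, SUCCS, "B1", transfer, set.intersection, {"f"})
print(may_in["B4"])   # {'f'}: reaches along some path
print(must_in["B4"])  # set(): not available along all paths
```

Passing `forward=False` swaps the roles of predecessors and successors, which is the change a backward analysis such as live variables needs.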

Page 52: Dataflow Analysis Introduction

Outline

  • Reaching Definitions
  • Available Expressions
  • Live Variables

Page 53: Dataflow Analysis Introduction

Live Variable Analysis

  • A variable v is live at point p if
      • v is used along some path starting at p, and
      • no definition of v along the path before the use.
  • When is a variable v dead at point p?
      • No use of v on any path from p to exit node, or
      • If all paths from p redefine v before using v.

Page 54: Dataflow Analysis Introduction

What Use is Liveness Information?

  • Register allocation.
      • If a variable is dead, can reassign its register
  • Dead code elimination.
      • Eliminate assignments to variables not read later.
      • But must not eliminate last assignment to variable (such as instance variable) visible outside CFG.
      • Can eliminate other dead assignments.
      • Handle by making all externally visible variables live on exit from CFG

Page 55: Dataflow Analysis Introduction

Conceptual Idea of Analysis

  • Simulate execution
  • But start from exit and go backwards in CFG
  • Compute liveness information from end to beginning of basic blocks

Page 56: Dataflow Analysis Introduction

Liveness Example

  • Assume a, b, c visible outside method
      • So are live on exit
  • Assume x, y, z, t not visible
  • Represent liveness using bit vector
      • order is a b c x y z t

[Figure: CFG annotated with liveness bit vectors]

  Block                                live at start   live at end
  a = x+y; t = a; c = a+x; x == 0      0101110         1100111
  b = t+z;                             1000111         1100100
  c = y+1;                             1100100         1110000

Page 57: Dataflow Analysis Introduction

Dead Code Elimination

  • Assume a, b, c visible outside method
      • So are live on exit
  • Assume x, y, z, t not visible
  • Represent liveness using bit vector
      • order is a b c x y z t

[Figure: CFG annotated with liveness bit vectors]

  Block                                live at start   live at end
  a = x+y; t = a; c = a+x; x == 0      0101110         1100111
  b = t+z;                             1000111         1100100
  c = y+1;                             1100100         1110000

In the first block, c is not live after c = a+x (the c bit of 1100111 is clear and c is not read again in that block), so that assignment is dead and can be eliminated.

Page 58: Dataflow Analysis Introduction

Formalizing Analysis

  • Each basic block has
      • IN: set of variables live at start of block
      • OUT: set of variables live at end of block
      • USE: set of variables with upwards exposed uses in block (use prior to definition)
      • DEF: set of variables defined in block prior to use
  • USE[x = z; x = x+1;] = { z } (x not in USE)
  • DEF[x = z; x = x+1; y = 1;] = {x, y}
  • Compiler scans each basic block to derive USE and DEF sets
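The scan that derives USE and DEF can be sketched as follows. The function name `use_def` and the representation of a block as a list of `(target, operands)` assignments are assumptions made for this illustration.

```python
def use_def(block):
    """block: list of (target, operands) assignments in execution order."""
    use, defs = set(), set()
    for target, operands in block:
        # A read counts as upwards exposed only if the block has not defined
        # the variable before this point.
        use |= {v for v in operands if v not in defs}
        # A write counts for DEF only if the variable was not read earlier.
        if target not in use:
            defs.add(target)
    return use, defs

# The block from the slide: x = z; x = x+1; y = 1;
block = [("x", ["z"]), ("x", ["x"]), ("y", [])]
use, defs = use_def(block)
print(sorted(use), sorted(defs))  # ['z'] ['x', 'y']
```

The read of x in x = x+1 is not upwards exposed because x = z precedes it in the same block, which is why x is absent from USE.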

Page 59: Dataflow Analysis Introduction

Algorithm

  for all nodes n in N - { Exit }
      IN[n] = emptyset;
  OUT[Exit] = emptyset;
  IN[Exit] = USE[Exit];
  Changed = N - { Exit };

  while (Changed != emptyset)
      choose a node n in Changed;
      Changed = Changed - { n };

      OUT[n] = emptyset;
      for all nodes s in successors(n)
          OUT[n] = OUT[n] ∪ IN[s];

      IN[n] = USE[n] ∪ (OUT[n] - DEF[n]);

      if (IN[n] changed)
          for all nodes p in predecessors(n)
              Changed = Changed ∪ { p };
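The backward worklist algorithm can be sketched in Python on the liveness example from the earlier slide. The block labels and the CFG edges (B1 branches to B2 and B3, B2 falls through to B3, B3 leads to Exit) are assumptions inferred from the bit vectors on that slide, and USE[Exit] = {a, b, c} is one way to model the variables that are visible outside the method.

```python
# Hypothetical labels: B1 = "a = x+y; t = a; c = a+x; x == 0",
# B2 = "b = t+z", B3 = "c = y+1".
NODES = ["B1", "B2", "B3", "Exit"]
SUCCS = {"B1": ["B2", "B3"], "B2": ["B3"], "B3": ["Exit"], "Exit": []}
PREDS = {n: [p for p in NODES if n in SUCCS[p]] for n in NODES}
USE = {"B1": {"x", "y"}, "B2": {"t", "z"}, "B3": {"y"}, "Exit": {"a", "b", "c"}}
DEF = {"B1": {"a", "t", "c"}, "B2": {"b"}, "B3": {"c"}, "Exit": set()}

def live_variables(exit_node="Exit"):
    in_ = {n: set() for n in NODES}
    out = {n: set() for n in NODES}
    in_[exit_node] = set(USE[exit_node])
    changed = [n for n in NODES if n != exit_node]
    while changed:
        n = changed.pop()
        out[n] = set()
        for s in SUCCS[n]:            # confluence: union over successors
            out[n] |= in_[s]
        new_in = USE[n] | (out[n] - DEF[n])
        if new_in != in_[n]:
            in_[n] = new_in
            for p in PREDS[n]:
                if p not in changed:
                    changed.append(p)
    return in_, out

def bits(variables):
    return "".join("1" if v in variables else "0" for v in "abcxyzt")

IN, OUT = live_variables()
for n in ["B1", "B2", "B3"]:
    print(n, bits(IN[n]), bits(OUT[n]))
```

Under these assumptions the computed vectors match the six values shown on the liveness example slide, for instance 0101110 at the start of the first block and 1100111 at its end.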

Page 60: Dataflow Analysis Introduction

Similar to Other Dataflow Algorithms

  • Backward analysis, not forward
  • Still have transfer functions
  • Still have confluence operators
  • Can generalize framework to work for both forwards and backwards analyses

Page 61: Dataflow Analysis Introduction

Comparison

Page 62: Dataflow Analysis Introduction

Comparison

Available Expressions

  for all nodes n in N
      OUT[n] = E;
  IN[Entry] = emptyset;
  OUT[Entry] = GEN[Entry];
  Changed = N - { Entry };

  while (Changed != emptyset)
      choose a node n in Changed;
      Changed = Changed - { n };

      IN[n] = E;
      for all nodes p in predecessors(n)
          IN[n] = IN[n] ∩ OUT[p];

      OUT[n] = GEN[n] ∪ (IN[n] - KILL[n]);

      if (OUT[n] changed)
          for all nodes s in successors(n)
              Changed = Changed ∪ { s };

Reaching Definitions

  for all nodes n in N
      OUT[n] = emptyset;
  IN[Entry] = emptyset;
  OUT[Entry] = GEN[Entry];
  Changed = N - { Entry };

  while (Changed != emptyset)
      choose a node n in Changed;
      Changed = Changed - { n };

      IN[n] = emptyset;
      for all nodes p in predecessors(n)
          IN[n] = IN[n] ∪ OUT[p];

      OUT[n] = GEN[n] ∪ (IN[n] - KILL[n]);

      if (OUT[n] changed)
          for all nodes s in successors(n)
              Changed = Changed ∪ { s };

Live Variables

  for all nodes n in N - { Exit }
      IN[n] = emptyset;
  OUT[Exit] = emptyset;
  IN[Exit] = USE[Exit];
  Changed = N - { Exit };

  while (Changed != emptyset)
      choose a node n in Changed;
      Changed = Changed - { n };

      OUT[n] = emptyset;
      for all nodes s in successors(n)
          OUT[n] = OUT[n] ∪ IN[s];

      IN[n] = USE[n] ∪ (OUT[n] - DEF[n]);

      if (IN[n] changed)
          for all nodes p in predecessors(n)
              Changed = Changed ∪ { p };

Page 63: Dataflow Analysis Introduction

Comparison: Available Expressions vs. Reaching Definitions

[Side-by-side listing of the Available Expressions and Reaching Definitions algorithms, identical to the listings on the previous Comparison slide. The two differ in the initial values (OUT[n] = E and IN[n] = E versus emptyset) and in the confluence operator (IN[n] ∩ OUT[p] versus IN[n] ∪ OUT[p]).]

Page 64: Dataflow Analysis Introduction

Comparison: Reaching Definitions vs. Live Variable

[Side-by-side listing of the Reaching Definitions and Live Variable algorithms, as on the earlier Comparison slide. The two differ in direction: the roles of IN and OUT, Entry and Exit, and predecessors and successors are swapped, and USE and DEF take the place of GEN and KILL.]

Page 65: Dataflow Analysis Introduction

Pessimistic vs. Optimistic Analyses

  • Available expressions is optimistic (for common sub-expression elimination)
      • Assume expressions are available at start of analysis
      • Analysis eliminates all that are not available
      • Cannot stop analysis early and use current result
  • Live variables is pessimistic (for dead code elimination)
      • Assume all variables are live at start of analysis
      • Analysis finds variables that are dead
      • Can stop analysis early and use current result
  • Dataflow setup same for both analyses
  • Optimism/pessimism depends on intended use

Page 66: Dataflow Analysis Introduction

Summary

  • Dataflow Analysis
      • Control flow graph
      • IN[b], OUT[b], transfer functions, join points
  • Paired analyses and transformations
      • Reaching definitions/constant propagation
      • Available expressions/common sub-expression elimination
      • Live-variable analysis/Dead code elimination
  • Stacked analysis and transformations work together

Page 67: Dataflow Analysis Introduction

Next Time

  • Data Flow Analysis: Foundation
      • DragonBook: §9.3