Dataflow Analysis for Software Product Lines

May 24, 2012, COPLAS, DIKU

Claus Brabrand (IT University of Copenhagen; Universidade Federal de Pernambuco)

Márcio Ribeiro (Universidade Federal de Alagoas; Universidade Federal de Pernambuco)

Paulo Borba (Universidade Federal de Pernambuco)

Társis Tolêdo (Universidade Federal de Pernambuco)


< Outline >

Introduction

  Software Product Lines

  Dataflow Analysis (recap)

Dataflow Analyses for Software Product Lines:

  feature in-sensitive (A1) vs feature sensitive (A2, A3, A4)

Results:

  A1 vs A2 vs A3 vs A4 (in theory and practice)

Related Work

Conclusion

Introduction

Traditional Software Development: One program = One product

(1x CAR, 1x CELL PHONE, 1x APPLICATION)

Product Line: A "family" of products (of N "similar" products), obtained by customization:

CARS, CELL PHONES, APPLICATIONS

SPL: (Family of Programs)

Software Product Line

SPL:

Set of Features: F = { COLOR, VIDEO }

Configurations (2^F): Ø, {Color}, {Video}, {Color, Video}

Feature Model: (e.g.: ψFM ≡ VIDEO ⇒ COLOR) determines which configurations are VALID (a subset of 2^F)

Family of Programs: customize the SPL with a valid configuration to obtain one program of the family

Software Product Line

SPL: Family of Programs

Conditional compilation:

    #ifdef ( φ )
    ...
    #endif

where φ : f ∈ F | ¬φ | φ ∧ φ

Alternatively, via Aspects (as in AOSD)

Example (SPL fragment):

    Logo logo;
    ...
    #ifdef (VIDEO)
      logo = new Logo();
    #endif
    ...
    logo.use();

*** null-pointer exception! in configurations: {Ø, {COLOR}}

Similarly for, e.g.:
■ uninitialized vars
■ unused variables
■ ...

Analysis of SPLs

The Compilation Process:

[Figure: program, compile, run, result; an ERROR! shows up at run time unless we ANALYZE! the program beforehand]

...and for Software Product Lines:

[Figure: SPL, customize, compile, run, result, for each of the 2^F configurations; ERROR! at run time unless we ANALYZE! every product]

Feature-sensitive data-flow analysis !


Dataflow Analysis

Dataflow Analysis:

1) Control-flow graph

2) Lattice L (finite height)

3) Transfer functions (monotone)

Example: "sign-of-x analysis"
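The lattice and the transfer functions of the sign-of-x example can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: encoding an abstract value as the set of signs x may have, and the names `join`, `assign_zero`, `increment`, `decrement`, `is_monotone`, and `show`, are assumptions made here.

```python
from itertools import combinations

SIGNS = ("-", "0", "+")

# Assumption: an abstract value is the set of signs that x may have.
BOTTOM = frozenset()        # the slides' "_|": nothing known yet
TOP = frozenset(SIGNS)      # x may have any sign

def join(a, b):
    """Least upper bound in the lattice: the union of the possible signs."""
    return a | b

# Effect of x++ and x-- on each individual sign.
INC = {"-": {"-", "0"}, "0": {"+"}, "+": {"+"}}
DEC = {"-": {"-"}, "0": {"-"}, "+": {"0", "+"}}

def assign_zero(value):
    """Transfer function for `x = 0`: the result is always zero."""
    return frozenset({"0"})

def increment(value):
    """Transfer function for `x++`, applied sign by sign."""
    return frozenset(s for sign in value for s in INC[sign])

def decrement(value):
    """Transfer function for `x--`, applied sign by sign."""
    return frozenset(s for sign in value for s in DEC[sign])

def is_monotone(f):
    """Check that a <= b implies f(a) <= f(b) over the whole finite lattice."""
    elements = [frozenset(c) for r in range(len(SIGNS) + 1)
                for c in combinations(SIGNS, r)]
    return all(f(a) <= f(b) for a in elements for b in elements if a <= b)

def show(value):
    """Render a value in the slides' notation, e.g. '0/+'."""
    return "/".join(s for s in SIGNS if s in value) or "_|"
```

For instance, `show(decrement(increment(assign_zero(BOTTOM))))` evaluates to `0/+`, the value the later slides report for `x = 0; x++; x--`.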

Analyzing a Program

1) Program

2) Build CFG

3) Make Equations

4) Solve equations: fixed-point computation (iteration)

5) SOLUTION (least fixed point):

Annotated with program points
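Steps 2 to 5 can be illustrated on a tiny loop. The sketch below assumes a hypothetical program `x = 0; while (...) x++;` with a hand-built CFG. The node numbering, the `PREDS` and `TRANSFER` tables, and the `solve` helper are illustrative choices, not taken from the talk.

```python
from functools import reduce

BOTTOM = frozenset()
INC = {"-": {"-", "0"}, "0": {"+"}, "+": {"+"}}

def join(a, b):
    """Least upper bound of two sign sets."""
    return a | b

def increment(value):
    """Transfer function for `x++`."""
    return frozenset(s for sign in value for s in INC[sign])

# Hypothetical CFG: 0: x = 0   1: loop head   2: x++   3: exit
PREDS = {0: [], 1: [0, 2], 2: [1], 3: [1]}
TRANSFER = {
    0: lambda value: frozenset({"0"}),  # x = 0
    1: lambda value: value,             # loop head: identity
    2: increment,                       # x++
    3: lambda value: value,             # exit: identity
}

def solve(transfer, preds):
    """Iterate out[n] = f_n(join of out[p] for p in preds[n]) until stable."""
    out = {node: BOTTOM for node in transfer}
    changed = True
    while changed:
        changed = False
        for node in sorted(transfer):
            incoming = reduce(join, (out[p] for p in preds[node]), BOTTOM)
            new_value = transfer[node](incoming)
            if new_value != out[node]:
                out[node] = new_value
                changed = True
    return out

SOLUTION = solve(TRANSFER, PREDS)
```

Starting from bottom everywhere and iterating monotone functions over a finite-height lattice reaches the least fixed point. Here the loop head ends up with the sign set {0, +}.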


A1 (brute force)

A1 (feature in-sensitive): N = 2^F compilations!

    void m() {
      int x = 0;
      ifdef (A) x++;
      ifdef (B) x--;
    }

ψFM = A ∨ B

One preprocessed program and one sign analysis (over the lattice L) per valid configuration:

                  c = {A}    c = {B}    c = {A,B}
    (entry)       ⊥          ⊥          ⊥
    int x = 0;    0          0          0
    x++;          +          (absent)   +
    x--;          (absent)   -          0/+
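The brute-force strategy can be sketched as follows: enumerate the valid configurations, preprocess the method for each, and run an ordinary (feature-oblivious) sign analysis on every product. The program encoding as `(guard, statement)` pairs and all function names are assumptions for illustration. The analysis handles straight-line code only.

```python
from itertools import combinations

FEATURES = ("A", "B")
# The example method m(): (guarding feature or None, statement).
PROGRAM = [(None, "x = 0"), ("A", "x++"), ("B", "x--")]

def feature_model(config):
    """psi_FM = A or B."""
    return "A" in config or "B" in config

INC = {"-": {"-", "0"}, "0": {"+"}, "+": {"+"}}
DEC = {"-": {"-"}, "0": {"-"}, "+": {"0", "+"}}

def transfer(statement, value):
    """Sign-analysis transfer function of one statement."""
    if statement == "x = 0":
        return frozenset({"0"})
    table = INC if statement == "x++" else DEC
    return frozenset(s for sign in value for s in table[sign])

def valid_configurations(features, model):
    """All subsets of the feature set that satisfy the feature model."""
    subsets = (frozenset(c) for r in range(len(features) + 1)
               for c in combinations(features, r))
    return [c for c in subsets if model(c)]

def preprocess(program, config):
    """Derive one product: keep only statements whose guard is enabled."""
    return [stmt for guard, stmt in program if guard is None or guard in config]

def analyze(statements):
    """Plain analysis of a straight-line product, starting from bottom."""
    value = frozenset()
    for statement in statements:
        value = transfer(statement, value)
    return value

def a1(program, features, model):
    """A1: one preprocessing step and one analysis per valid configuration."""
    return {c: analyze(preprocess(program, c))
            for c in valid_configurations(features, model)}

RESULT = a1(PROGRAM, FEATURES, feature_model)
```

`RESULT` maps {A} to +, {B} to -, and {A,B} to 0/+. The cost is one full preprocess-and-analyze round per valid configuration.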

A2 (consecutive)

A2 (feature sensitive!):

    void m() {
      int x = 0;
      ifdef (A) x++;
      ifdef (B) x--;
    }

ψFM = A ∨ B

One instrumented CFG, analyzed for one valid configuration at a time. A statement's transfer function is applied (✓) only if the configuration satisfies its guard; otherwise the value is passed through unchanged (✗):

                  c = {A}    c = {B}    c = {A,B}
    (entry)       ⊥          ⊥          ⊥
    int x = 0;    0          0          0
    A: x++;       + (✓)      0 (✗)      + (✓)
    B: x--;       + (✗)      - (✓)      0/+ (✓)
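A sketch of the consecutive strategy, under the same illustrative encoding of signs as sets: the instrumented program is built once, and each transfer function takes the configuration as an extra argument and decides whether to apply or to skip. The names `lift`, `INSTRUMENTED`, and `a2` are hypothetical.

```python
INC = {"-": {"-", "0"}, "0": {"+"}, "+": {"+"}}
DEC = {"-": {"-"}, "0": {"-"}, "+": {"0", "+"}}

def assign_zero(value):
    return frozenset({"0"})

def increment(value):
    return frozenset(s for sign in value for s in INC[sign])

def decrement(value):
    return frozenset(s for sign in value for s in DEC[sign])

def lift(guard, transfer):
    """Feature-sensitive transfer function for the statement `guard: S`."""
    def lifted(config, value):
        enabled = guard is None or guard in config
        # Apply f_S if the configuration enables the statement, else identity.
        return transfer(value) if enabled else value
    return lifted

# The instrumented method m(), built once and shared by all configurations.
INSTRUMENTED = [lift(None, assign_zero), lift("A", increment), lift("B", decrement)]

# Valid configurations for psi_FM = A or B.
VALID = [frozenset({"A"}), frozenset({"B"}), frozenset({"A", "B"})]

def analyze(program, config):
    """One feature-sensitive analysis of the straight-line program."""
    value = frozenset()
    for lifted in program:
        value = lifted(config, value)
    return value

def a2(program, configs):
    """A2: the same instrumented program, one configuration after another."""
    return {c: analyze(program, c) for c in configs}

RESULT = a2(INSTRUMENTED, VALID)
```

Compared with the brute-force variant, nothing is preprocessed or rebuilt per configuration. Only the analysis itself is repeated.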

A3 (simultaneous)

A3 (feature sensitive!):

    void m() {
      int x = 0;
      ifdef (A) x++;
      ifdef (B) x--;
    }

ψFM = A ∨ B

∀c ∈ {{A},{B},{A,B}}: a single analysis over the instrumented CFG, carrying one lattice value per valid configuration at every program point:

    (entry)       ({A} = ⊥, {B} = ⊥, {A,B} = ⊥)
    int x = 0;    ({A} = 0, {B} = 0, {A,B} = 0)
    A: x++;       ({A} = +, {B} = 0, {A,B} = +)
    B: x--;       ({A} = +, {B} = -, {A,B} = 0/+)
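The simultaneous strategy can be sketched with a lifted lattice whose elements map every valid configuration to an ordinary lattice value. As before, this is an illustrative Python sketch. The dictionary representation and the names `lift` and `a3` are assumptions, not the authors' code.

```python
INC = {"-": {"-", "0"}, "0": {"+"}, "+": {"+"}}
DEC = {"-": {"-"}, "0": {"-"}, "+": {"0", "+"}}
BOTTOM = frozenset()

def assign_zero(value):
    return frozenset({"0"})

def increment(value):
    return frozenset(s for sign in value for s in INC[sign])

def decrement(value):
    return frozenset(s for sign in value for s in DEC[sign])

# Valid configurations for psi_FM = A or B.
VALID = (frozenset({"A"}), frozenset({"B"}), frozenset({"A", "B"}))

def lift(guard, transfer):
    """Lifted transfer function: update every configuration's component."""
    def lifted(value_per_config):
        return {
            config: (transfer(value)
                     if (guard is None or guard in config) else value)
            for config, value in value_per_config.items()
        }
    return lifted

PROGRAM = [lift(None, assign_zero), lift("A", increment), lift("B", decrement)]

def a3(program, configs):
    """A3: a single pass; returns the lifted value after each statement."""
    state = {config: BOTTOM for config in configs}
    trace = [state]
    for lifted in program:
        state = lifted(state)
        trace.append(state)
    return trace

TRACE = a3(PROGRAM, VALID)
FINAL = TRACE[-1]
```

`TRACE` holds the four tuples of the slide (entry, then after each statement). A single traversal of the program computes all configurations' results.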

A4 (shared)

A4 (feature sensitive!):

    void m() {
      int x = 0;
      ifdef (A) x++;
      ifdef (B) x--;
    }

ψFM = A ∨ B

Lattice values are shared between configurations; each value is guarded by a formula denoting a set of configurations:

    (entry)       ( [[ψ]] = ⊥ )
    int x = 0;    ( [[ψ]] = 0 )
    A: x++;       ( [[ψ∧¬A]] = 0, [[ψ∧A]] = + )
    B: x--;       ( [[ψ∧¬A∧¬B]] = 0, [[ψ∧A∧¬B]] = +, [[ψ∧¬A∧B]] = -, [[ψ∧A∧B]] = 0/+ )

The entry for ψ∧¬A∧¬B can be dropped: (A∨B) ∧ ¬A ∧ ¬B ≡ false, i.e., it is invalid wrt. the feature model, ψ !

…using BDD representation! (compact+efficient)
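The shared strategy can be sketched by keying lattice values on sets of configurations instead of single configurations. The sketch below uses explicit Python sets as a stand-in for the BDDs mentioned on the slide. That substitution, the `step` helper, and the straight-line-only driver `a4` are all assumptions for illustration.

```python
INC = {"-": {"-", "0"}, "0": {"+"}, "+": {"+"}}
DEC = {"-": {"-"}, "0": {"-"}, "+": {"0", "+"}}
BOTTOM = frozenset()

def assign_zero(value):
    return frozenset({"0"})

def increment(value):
    return frozenset(s for sign in value for s in INC[sign])

def decrement(value):
    return frozenset(s for sign in value for s in DEC[sign])

# [[psi]] for psi_FM = A or B, written out as an explicit set of configurations.
PSI = frozenset({frozenset({"A"}), frozenset({"B"}), frozenset({"A", "B"})})

def step(guard, transfer, shared):
    """Process `guard: S` on a shared value {set of configurations: value}."""
    result = {}
    for configs, value in shared.items():
        if guard is None:
            enabled = configs
        else:
            enabled = frozenset(c for c in configs if guard in c)
        disabled = configs - enabled
        if not enabled:                 # no configuration runs S: copy
            result[configs] = value
        elif not disabled:              # every configuration runs S: apply
            result[configs] = transfer(value)
        else:                           # mixed: split the entry in two
            result[disabled] = value
            result[enabled] = transfer(value)
    return result

PROGRAM = [(None, assign_zero), ("A", increment), ("B", decrement)]

def a4(program, psi):
    """A4 on straight-line code; returns the shared value after each statement."""
    shared = {psi: BOTTOM}
    trace = [shared]
    for guard, transfer in program:
        shared = step(guard, transfer, shared)
        trace.append(shared)
    return trace

TRACE = a4(PROGRAM, PSI)
FINAL = TRACE[-1]
```

Because every key is a subset of [[ψ]], an entry for ¬A ∧ ¬B never arises, so the final value has three entries rather than four. Entries are split only when a guard actually distinguishes their configurations.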

Specification: A1, A2, A3, A4

[Figures: the specifications of A1, A2, A3, and A4, first one by one and then side by side]


Intraprocedural Evaluation

Four (qualitatively different) SPL benchmarks

Implementation: A1, A2, A3, A4 in SOOT + CIDE

Evaluation: total time, analysis time, memory usage

Results (total time)

In theory: [Figure: complexity comparison; 2^F appears in the bounds]

In practice (Reaching Definitions):

[Chart: gain factors per benchmark; surviving labels: 6x, 8x, 14x, 3x, 5x, 3x, 1x, 1x, 1x, 2x, 2½x, 2x]

Feature sensitive (avg. gain factor): A2 (3x), A3 (4x), A4 (5x)

Results (analysis time)

In theory: [Figure: A2 vs A3; 2^F appears in the bound]

In practice (Reaching Definitions):

On average (A2 vs A3): A3 is (1.5x) faster (caching!)

TIME(A4): Depends on degree of sharing in SPL !

Results (memory usage)

In theory: [Figure: A2 vs A3; 2^F appears in the bound]

In practice (Reaching Definitions): 6.3 : 1 (average)

SPACE(A4): Depends on degree of sharing in SPL !


Related Work (DFA)

Path-sensitive DFA:

Idea of “conditionally executed statements”

Compute different analysis info along different paths (~ A2, A3, A4) to improve precision or to optimize “hot paths”

“Constant Propagation with Conditional Branches” ( Wegman and Zadeck ) TOPLAS 1991

Predicated DFA:

Guard lattice values by propositional logic predicates (~ A4), yielding “optimistic dataflow values” that are kept distinct during analysis (~ A3 and A4)

“Predicated Array Data-Flow Analysis for Run-time Parallelization” ( Moon, Hall, and Murphy ) ICS 1998

Our work: Automatically lift any DFA to SPLs (with ψFM) ⇒ feature-sensitive analysis for analyzing entire program family

Related Work (Lifting for SPLs)

Model Checking:

“Model Checking Lots of Systems: Efficient Verification of Temporal Properties in Software Product Lines” ( Classen, Heymans, Schobbens, Legay, and Raskin ) ICSE 2010

Model checks all products of an SPL at the same time (3.5x faster than one by one)! (similar goal, diff techniques)

Type Checking:

“Type Safety for Feature-Oriented Product Lines” ( Apel, Kästner, Grösslinger, and Lengauer ) ASE 2010

“Type-Checking Software Product Lines - A Formal Approach” ( Kästner and Apel ) ASE 2008

Type checking ↔ DFA (similar goal, diff techniques). Our: auto lift any DFA (uninit vars, null pointers, ...)

Parsing:

“Variability-Aware Parsing in the Presence of Lexical Macros & C.C.” ( Kästner, Giarrusso, Rendel, Erdweg, Ostermann, and Berger ) OOPSLA 2011

(similar techniques, diff goal): Split and merging parsing (~A4) and also uses instrumentation

Testing:

“Reducing Combinatorics in Testing Product Lines” ( Kim, Batory, and Khurshid ) AOSD 2011

Select relevant feature combinations for a given test case. Uses (hardwired) DFA (w/o FM) to compute reachability

Emerging Interfaces

"A Tool for Improving Maintainability of Preprocessor-based Product Lines" ( Márcio Ribeiro, Társis Tolêdo, Paulo Borba, Claus Brabrand )

CBSoft 2011: *** Best Tool Award ***


Conclusion(s)

It is possible to analyze SPLs using DFAs

We can automatically "lift" any dataflow analysis and make it feature sensitive:

A2) Consecutive

A3) Simultaneous

A4) Shared Simultaneous

A2,A3,A4 much faster (3x,4x,5x) than naive A1

A3 is (1.5x) faster than A2 (caching!)

A4 saves lots of memory vs A3 (sharing!) 6.3 : 1

Future Work

Explore how all this scales to…: INTER-procedural data-flow analysis

In particular:

…relative speed of A1 vs A2 vs A3 vs A4 ?

…which analyses are feasible vs in-feasible ?

In progress...!


< Obrigado* >

*) Thanks


BONUS SLIDES


Results (analysis time)

In theory: [Figure: A2 vs A3; 2^F appears in both bounds]

Nx1 ≠ 1xN?!

In practice (Reaching Definitions):

On average (A2 vs A3): A3 is (1.5x) faster (caching!)

TIME(A4): Depends on degree of sharing in SPL !

A2 vs A3 (caching)

Cache misses in A2 vs A3:

Normal cache: As expected, A2 incurs more cache misses (⇒ slower!)

Full/no cache*: As hypothesized, this indeed affects A2 more than A3

i.e., A3 has better cache properties than A2

*) we flush the L2 cache by traversing an 8MB “bogus array” to invalidate the cache!


IFDEF normalization

Refactor "undisciplined" (lexical) ifdefs into "disciplined" (syntactic) ifdefs:

Normalize "ifdef"s (by transformation):


Example Bug from Lampiro

Lampiro SPL (IM client for XMPP protocol):

*** uninitialized variable "logo"(if feature "GLIDER" is defined)

Similar problems with: undeclared variables, unused variables, null pointers, ...

BDD (Binary Decision Diagram)

Compact and efficient representation for boolean functions (aka. set of sets of names)

FAST: negation, conjunction, disjunction, equality !

F(A,B,C) = A ∧ (B ∨ C)

[Figure: the BDD for F (a decision tree over A, B, C) next to the minimized BDD]

Formula ~ Set of Configurations

Definitions (given F, set of feature names):

    f ∈ F          feature name
    c ∈ 2^F        configuration (set of feature names), c ⊆ F
    X ∈ 2^(2^F)    set of config's (set of sets of feature names), X ⊆ 2^F

Example ifdefs:

    F = {A,B}:      [[ B ∨ A ]]        = { {A}, {B}, {A,B} }

    F = {A,B,C}:    [[ A ∧ (B ∨ C) ]]  = { {A,B}, {A,C}, {A,B,C} }
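The correspondence between a formula and its set of configurations can be checked by brute-force enumeration. In this sketch a formula is encoded as a Python predicate over a configuration. That encoding and the names `powerset` and `denote` are illustrative assumptions.

```python
from itertools import combinations

def powerset(features):
    """2^F: every configuration over the feature set F."""
    names = sorted(features)
    return [frozenset(c) for r in range(len(names) + 1)
            for c in combinations(names, r)]

def denote(formula, features):
    """[[formula]]: the configurations over F that satisfy the formula."""
    return {config for config in powerset(features) if formula(config)}

# The two example formulas, written as predicates over a configuration.
def b_or_a(config):
    return "B" in config or "A" in config

def a_and_b_or_c(config):
    return "A" in config and ("B" in config or "C" in config)

FIRST = denote(b_or_a, {"A", "B"})
SECOND = denote(a_and_b_or_c, {"A", "B", "C"})
```

`FIRST` contains {A}, {B}, and {A,B}; `SECOND` contains {A,B}, {A,C}, and {A,B,C}, matching the two example sets.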

Feature Model (Example)

Feature Model: [Figure: feature diagram with Car, Engine, 1.0, 1.4, and Air]

Feature set: F = {Car, Engine, 1.0, 1.4, Air}

Formula: FM ≡ Car ∧ Engine ∧ (1.0 ⊕ 1.4) ∧ (Air ⇒ 1.4)

Set of configurations: [[FM]] = { {Car, Engine, 1.0}, {Car, Engine, 1.4}, {Car, Engine, 1.4, Air} }

Note: | [[FM]] | = 3 < 32 = |2^F|
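The count in the note can be reproduced by enumeration. The sketch assumes the formula reads Car ∧ Engine ∧ (1.0 ⊕ 1.4) ∧ (Air ⇒ 1.4), which is the reading consistent with the three configurations listed. The predicate encoding and variable names are illustrative.

```python
from itertools import combinations

FEATURES = ("Car", "Engine", "1.0", "1.4", "Air")

def feature_model(config):
    """Assumed reading: Car and Engine and (1.0 xor 1.4) and (Air implies 1.4)."""
    return ("Car" in config
            and "Engine" in config
            and (("1.0" in config) != ("1.4" in config))
            and ("Air" not in config or "1.4" in config))

# All 2^F candidate configurations, then the valid ones.
ALL_CONFIGS = [frozenset(c) for r in range(len(FEATURES) + 1)
               for c in combinations(FEATURES, r)]
VALID = [config for config in ALL_CONFIGS if feature_model(config)]
```

Here `len(ALL_CONFIGS)` is 32 and `len(VALID)` is 3, so a brute-force analysis only has to visit the three valid products.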

Conditional Compilation

The 'ifdef' construction:

    STM : 'ifdef' '(' φ ')' STM

Syntactic variant of lexical #ifdef

Propositional Logic:

    φ : f ∈ F | ¬φ | φ ∧ φ

where f ∈ F (finite set of feature names)

Example:

    status.print("you die");
    ifdef (DeluxeVersion && ColorDisplay) {
      player.redraw(Color.red);
      Audio.play("crash.wav");
    }
    lives = lives - 1;

In general:

    ifdef (A) {
      ...
    }

Lexical #ifdef → Syntactic ifdef

Simple transformation:

We do not handle non-syntactic '#ifdef's:

Fair assumption (also in CIDE)

Nested ifdef's also give rise to a conjunction of formulas
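How nesting turns into a conjunction can be sketched by flattening a small syntax tree. The tuple-based tree encoding (`("stmt", text)` and `("ifdef", formula, body)`) and the `flatten` helper are hypothetical and chosen only for illustration.

```python
def flatten(block, enclosing=()):
    """Pair every statement with the conjunction of its enclosing ifdef formulas."""
    result = []
    for node in block:
        if node[0] == "ifdef":
            _, formula, body = node
            # Statements inside the body inherit this guard as one more conjunct.
            result.extend(flatten(body, enclosing + (formula,)))
        else:
            guard = " && ".join(f"({formula})" for formula in enclosing) or "true"
            result.append((guard, node[1]))
    return result

# ifdef (A) { s1; ifdef (B) { s2; } } s3;
PROGRAM = [
    ("ifdef", "A", [
        ("stmt", "s1;"),
        ("ifdef", "B", [("stmt", "s2;")]),
    ]),
    ("stmt", "s3;"),
]

GUARDED = flatten(PROGRAM)
```

`GUARDED` pairs `s1;` with `(A)`, `s2;` with `(A) && (B)`, and `s3;` with `true`, so each statement ends up carrying a single formula.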

A4: Lazy Splitting (using BDDs)

For a statement "φ: S" with transfer function f_S, each entry ψ = l of the incoming value is handled by one of three cases, where l' = f_S(l):

CASE 1: "COPY" (when φ∧ψ = Ø): [ ψ = l, ... ] becomes [ ψ = l, ... ]

CASE 2: "APPLY" (when ¬φ∧ψ = Ø): [ ψ = l, ... ] becomes [ ψ = l', ... ]

CASE 3: "SPLIT" (otherwise): [ ψ = l, ... ] becomes [ ψ∧¬φ = l, ψ∧φ = l', ... ]