Finding Near-Optimal Configurations in Product Lines by ... · Finding Near-Optimal Configurations in Product Lines by Random Sampling Jeho Oh 1 ... but cheap and light as possible

Finding Near-Optimal Configurations in Product Lines by Random Sampling

Jeho Oh

1

Don Batory Margaret Myers Norbert Siegmund

Quickly find SPL configurations with

near-optimal performance for a given workload

• Configuration space is often huge: 𝑛 features ≤ 2𝑛 configurations

( 273 optional features: 1082 products, one for every atom in universe )

• Searching for the optimal configuration is daunting,

as benchmarking all configurations is infeasible

• Find a way to get good enough configurations with practical effort

2

I want a hybrid car with laser headlight, but cheap and light as possible.

Now 273/275 options left to decide…

Big Picture

3

randomly sample on configuration space for

near-optimal configurations

featuremodel

user-imposedfeature constraints

near-optimalperforming

configuration

learnperformance

model

useoptimizer

sampleconfigurations

Performance Model Approach

Our Approach

Contributions

• Allow true random sampling of configurations

• Provide statistical bounds on searching by sampling

• Directly search the space for any given workload

4

Search by Random Samplingwhich we describe next…

• To randomly sample configurations

from uniform distribution:

– Identify valid configuration space

– Select a random number in [ 1, total # of configs ]

– Return the configuration with matching number

• Binary Decision Diagram (BDD):

– Compile prop. formula into graph structure

– Derive all possible solutions (configs)

– Allows efficient sampling from traversing BDD

Random Sampling with BDD

5

(𝑟𝑜𝑜𝑡 ↔ 𝑡𝑟𝑢𝑒)∧ 𝑟𝑜𝑜𝑡 ↔ 𝐴∧ (𝐵 → 𝑟𝑜𝑜𝑡)

∧ 𝐶 ↔ 𝐴 ∧ ¬𝐷

∧ 𝐷 ↔ 𝐴 ∧ ¬𝐶

∧ (𝐷 → 𝐵)

𝑟𝑜𝑜𝑡

𝐴 𝐵

𝐶 𝐷

𝐷 implies 𝐵

Feature Model Prop. Formula

BDD

𝑟𝑜𝑜𝑡

𝐴

0 1

𝐵

𝐶

𝐷 𝐷

0 1

2

2

0 1

1

0

03

0 3

Randomly sample from space of all valid

configurations, not space of all features

• 𝑛 random numbers over unit range [0,1], 𝑥 as the number closest to 1

Analyze the distance between 𝑥 and 1, (1 − 𝑥)

• C.D.F. of 𝑥 over 𝑛: 𝑝𝑛 𝑋 ≤ 𝑥 = 0𝑥𝑛 ⋅ 𝑥𝑛−1 ⋅ 𝑑𝑥 = 𝑥𝑛

• Average distance from 𝑥 to 1: 𝐸𝑛 = 011 − 𝑥 ⋅ 𝑛 ⋅ 𝑥𝑛−1 ⋅ 𝑑𝑥 =

𝟏

𝒏+𝟏

(Equations for sampling over discrete space on Section 3.3 of the paper)

Statistics of Random Sampling (1)

6

0 10 1

𝑥 = 0.5

𝐸1 = 0.5

0 1

𝑥 = 0.67

𝐸2 = 0.33

0 1

𝑥 = 0.75

𝐸3 = 0.25

0 1

𝐸4 = 0.20

𝑥 = 0.80

Statistical regularity: 𝐸𝑛 =1

𝑛+1with bounds 𝜎𝑛 ≈

1

𝑛+1

• Correspondences:

– Selection of numbers in [ 0,1 ] ► Selection in [ 1, total # of configs ]

– Closeness to 1 ► Closeness to the best configuration, Ω

Statistics of Random Sampling (2)

7

On average, sample with the best performance has

top 𝟏𝟎𝟎

𝒏+𝟏% performance among all configurations

200

210

220

230

240

250

260

270

0 300 600 900

LLVM (Compiler infrastructure, 11 features, 1024 configs)

𝛀𝟎

𝐸1 =Ω − 𝑥

1024= 0.50𝐸3 =

Ω − 𝑥

1024= 0.25𝐸7 =

Ω − 𝑥

1024= 0.125

Per

form

an

ce

Sorted configurations

Sorted config space

Best performing config

Monotonic graph:

Closer to Ω (x-axis),

better perf. (y-axis)

Statistical Recursive Searching (SRS)

• Can search better than 99 samples for 1% by random sampling

• Feature selection:

– Makes some configs perform better

– Constrict config space

• Search within smaller and better

config space by feature selection

• Find most influential features by:

– Performance difference

– Welch’s t-Test

8

Use samples to recursively constrict the search space

200

210

220

230

240

250

260

270

0 200 400 600 800 1000

LLVM

Per

form

an

ce












9


200

210

220

230

240

250

260

270

0 200 400 600 800 1000

LLVM

Per

form

an

ce












10


200

210

220

230

240

250

260

270

0 200 400 600 800 1000

LLVM

Per

form

an

ce












11


200

210

220

230

240

250

260

270

0 200 400 600 800 1000

LLVM

Per

form

an

ce












12


200

210

220

230

240

250

260

270

0 200 400 600 800 1000

LLVM

Per

form

an

ce


4 features excluded:

- 1/16 of entire space,

- Better overall performance

EVALUATION

13

ICSE 2012

Evaluation Method

• Use ground truth data from Siegmund et al. (http://fosd.de/SPLConqueror)

• Measured accuracy of search:

– 𝜹𝒙 : % of configurations better than the best config found so far

– 𝜹𝒚 : % performance difference to Ω

• Averaged from 100 searches per different conditions

• Full result is available in Section 5 of the paper

14

SPL Type # features # configs Performance

LLVM Compiler infrastructure 11 1024 Test suite compilation time

BerkeleyDBC Database system 18 2560 Benchmark response time

X264 Video encoder 16 1152 Video encoding time

Apache Web server 9 192 Maximum server load

𝜹𝒙: Theory vs. Actual• 𝛿𝑥 from randomly sampling different # of samples

vs. theoretical value derived using same # of samples

15

Theory matches observations

0%

1%

2%

3%

4%

5%

6%

7%

8%

9%

10%

10 20 30 40 50 60 70 80 90 100

LLVM

X264

BerkeleyDBC

Apache

Theoretical

Apache_Theoretical

Dis

tan

ce t

o 𝛀

(𝜹𝒙)

# of random samples (𝒏)

𝑬𝒏 (Eq. (9))

𝑬𝟏𝟗𝟐,𝒏 (Eq. (7), Apache)

𝜹𝒙 : SRS vs. Random Sampling (non-recursive)

• 𝛿𝑥 of SRS over different # of samples per recursion

vs. theoretical 𝛿𝑥 of random sampling with same total # of samples (𝑁)

16

0%

2%

4%

6%

8%

10%

12%

5 15 25 35

LLVM

0%

2%

4%

6%

8%

10%

12%

14%

16%

5 15 25 35

BerkeleyDBC

0%

2%

4%

6%

8%

10%

12%

14%

5 15 25 35

X264

𝛿𝑥 (Measured)

Dis

tan

ce t

o 𝛀

(𝜹𝒙)

# of samples per recursion (𝒏)

Dis

tan

ce t

o 𝛀

(𝜹𝒙)


Dis

tan

ce t

o 𝛀

(𝜹𝒙)


SRS is more efficient than random sampling alone

𝐸𝑁 (Theoretical)

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

0 20 40 60 80 100 120 140 160 180

BerkeleyDBC

0%

2%

4%

6%

8%

10%

12%

14%

16%

18%

20%

0 20 40 60 80 100

LLVM

𝜹𝒚: SRS vs. Performance Models (1)

• 𝛿𝑦 over total # of samples in SRS vs.

𝛿𝑦 of perf. models assuming an ideal optimizer (always finds best predicted config)

17

SRSSarkar2015 Siegmund2012

SRS needs many fewer samples for same accuracy, and

yields much better accuracy for same number of samples

Per

form

an

ce d

iffe

ren

ce t

o 𝛀

(𝜹𝒚)

Total # of samples (𝑵)P

erfo

rman

ce d

iffe

ren

ce t

o 𝛀

(𝜹𝒚)

Total # of samples (𝑵)

0%

2%

4%

6%

8%

10%

12%

14%

16%

18%

20%

0 20 40 60 80 100

160%

162%

164%

X264

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

0 20 40 60 80 100 120 140 160 180

BerkeleyDBC

0%

2%

4%

6%

8%

10%

12%

14%

16%

18%

20%

0 20 40 60 80 100

LLVM

𝜹𝒚: SRS vs. Performance Models (2)

• 𝛿𝑦 over total # of samples in SRS vs.

𝛿𝑦 of performance models assuming an ideal optimizer

18

SRSSarkar2015 Siegmund2012

In SRS, more samples yields better accuracy

Per

form

an

ce d

iffe

ren

ce t

o 𝛀

(𝜹𝒚)


Per

form

an

ce d

iffe

ren

ce t

o 𝛀

(𝜹𝒚)


Per

form

an

ce d

iffe

ren

ce t

o 𝛀

(𝜹𝒚)


𝜹𝒙: Scalability of Searching• Combine SPLs to simulate

larger configuration spaces

• Measure 𝛿𝑥 for SRS and

non-recursive searching

19

𝐸𝑁

0%

2%

4%

6%

8%

10%

12%

5 15 25 35

Apache×X264×LLVM×BerkeleyDBC

Dis

tan

ce t

o 𝛀

(𝜹𝒙)


𝛿𝑥

Accuracy is independent of the size of the configuration space

Combined Systems # of Features # of Confgs.

Apache × LLVM × BerkeleyDBC 38 503,316,480

Apache × X264 × BerkeleyDBC 51 566,231,040

LLVM × Apache × X264 × BerkeleyDBC 62 579,820,584,960

# of random samples

Dis

tan

ce t

o 𝛀

(𝜹𝒙)

0.0%

0.5%

1.0%

1.5%

2.0%

2.5%

3.0%

3.5%

30 50 70 90 110

Theory vs. Actual

ApacheH264

BerkeleyDBC

ApacheLLVM

BerkeleyDBC

E_n (Eqn. (9))𝑬𝒏 (Eqn. (9))

0%

2%

4%

6%

8%

10%

12%

5 15 25 35

Apache × LLVM × BerkeleyDBC

0%

2%

4%

6%

8%

10%

12%

5 15 25 35

Apache × X264 × BerkeleyDBC

Dis

tan

ce t

o 𝛀

(𝜹𝒙)


Dis

tan

ce t

o 𝛀

(𝜹𝒙)


Conclusions and Contributions

1. True random sampling of configuration spaces

2. Guaranteed tight statistical bounds on finding good configurations

3. Can recursively search through configuration space for more efficient searching

4. Scalable search method - accuracy independent of the configuration space size

20

Thank You !Finding Near-Optimal Configurations in Product Lines by Random Sampling

Jeho Oh, Don Batory, Margaret Myers, Norbert Siegmund

21

Supplemental Slides from Now On

22

Statistics of Random Sampling

• 𝑛 random numbers over unit range [0,1], 𝑥 is the number closest to 1Analyze the distance between 𝑥 and 1, (1 − 𝑥)

• C.D.F. of 𝑥 over 𝑛:

𝑝𝑛 𝑋 ≤ 𝑥 = 0𝑥𝑛 ⋅ 𝑥𝑛−1 ⋅ 𝑑𝑥 = 𝑥𝑛

• Average distance from 𝑥 to 1:

𝐸𝑛 = 011 − 𝑥 ⋅ 𝑛 ⋅ 𝑥𝑛−1 ⋅ 𝑑𝑥 =

𝟏

𝒏+𝟏

• Standard deviation:ത𝐸𝑛 = 0

11 − 𝑥 2 ⋅ 𝑛 ⋅ 𝑥𝑛−1 ⋅ 𝑑𝑥 =

1

(𝑛+1)(𝑛+2)

𝜎𝑛 = ത𝐸𝑛 − 𝐸𝑛2 =

𝟏

(𝒏+𝟏)(𝒏+𝟐)−

𝟏

𝒏+𝟏 𝟐

23

0%

2%

4%

6%

8%

10%

10 20 30 40 50 60 70 80 90 100

En

sn

𝐸𝑛

𝜎𝑛

Dis

tan

ce t

o 𝟏

# of random numbers

PCS Graphs of Real Systems

• All configuration measurements sorted by performance (descending order)

24

0.8K

1.3K

1.8K

2.3K

0 50 100 150

Apache (9, 192)

0

10

20

30

40

50

0 500 1000 1500 2000 2500

BerkeleyDBC (18, 2560)

200

400

600

800

0 200 400 600 800 1000

X264 (16, 1152)

200

220

240

260

0 200 400 600 800 1000

LLVM (11, 1024)


Per

form

ance


Per

form

ance


Per

form

ance


Per

form

ance

* System Name (# of features, # of configs)

Statistical Recursive Searching(SRS)

25

Use samples to recursively reduce the config space to focus

0

5

10

15

20

25

30

35

40

45

0 500 1000 1500 2000 2500

BerkeleyDBC

Configurations sorted by performance

Per

form

ance

PS32K, HAVE_HASHPS32K, ¬HAVE_HASH

PS16K, HAVE_HASHPS16K, ¬HAVE_HASH

PS8K, ¬ HAVE_HASHPS8K, HAVE_HASH



¬ HAVE_CRYPTOHAVE_CRYPTO

1. Random sample from config space

2. From best 2 configs, get common decisions 𝑑

3. For each common decision 𝑑, measure:

𝛿𝑑 = Avg. perf. of samples with 𝑑

𝛿¬𝑑 = Avg. perf. of samples without 𝑑

Δ𝑑 = 𝛿𝑑 − 𝛿¬𝑑

4. If a Δ𝑑 indicates performance improvement,

perform Welch’s T-test to evaluate its certainty

5. Use features certain to improve performance

to constrain the configuration space to recurse

Finding Features to Recurse

26

Decisions to recurse: 𝑓1

𝜙 = 𝐹𝑀

𝑐2 𝑐1Ω0

𝑐2 = {𝑓1, 𝑓2, ¬𝑓3, 𝑓4, ¬𝑓5, 𝑓7, ¬𝑓8}

𝑐1 = {𝑓1, 𝑓2, 𝑓3, ¬𝑓4, 𝑓5, 𝑓7, ¬𝑓8}

𝑑 = {𝑓1, 𝑓2, 𝑓7, ¬𝑓8}

Common Decisions 𝑓1 𝑓2 𝑓7 ¬𝑓8

Δ𝑑 improves perf. Yes No No Yes

𝜙 = 𝐹𝑀 ∧ 𝑓1Ω0

Welch’s T-test Pass - - Fail

Documents

Finding Near-Optimal Configurations in Product Lines by ... · Finding Near-Optimal Configurations in Product Lines by Random Sampling Jeho Oh 1 ... but cheap and light as possible