

A Genetic Algorithm for Optimized Feature Selection with Resource Constraints in Software Product Lines

Jianmei Guo a,∗, Jules White b, Guangxin Wang a, Jian Li a, Yinglin Wang a

a Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Road, Minhang, Shanghai 200240, China

b Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA 24060, USA

Abstract

Software product line (SPL) engineering is a software engineering approach to building configurable software systems. SPLs commonly use a feature model to capture and document the commonalities and variabilities of the underlying software system. A key challenge when using a feature model to derive a new SPL configuration is determining how to find an optimized feature selection that minimizes or maximizes an objective function, such as total cost, subject to resource constraints. To help address the challenges of optimizing feature selection in the face of resource constraints, this paper presents GAFES, an artificial intelligence approach, based on genetic algorithms (GAs), for optimized feature selection in SPLs. Our empirical results show that GAFES can produce solutions with 86-97% of the optimality of other automated feature selection algorithms and in 45-99% less time than existing exact and heuristic feature selection techniques.

Keywords: Software product lines, Product derivation, Feature models, Optimization, Genetic algorithm

1. Introduction

Software product lines (SPLs) are a software engineering approach for creating configurable software applications that can be adapted to a variety of requirement sets (Clements & Northrop, 2001). SPLs are built around a set of common software components with points of variability that allow the customization of the product. For example, the Bold Stroke SPL (Boeing, 2002), developed by the Boeing company, comprises a wide range of reusable artifacts, such as a configurable architecture, set of application components, development processes, and development tools, which are used to create Operational Flight Programs for a variety of Boeing aircraft.

∗ Corresponding author. Tel.: +86-21-34204415; fax: +86-21-34204728.

Email addresses: [email protected] (Jianmei Guo), [email protected] (Jules White), [email protected] (Guangxin Wang), [email protected] (Jian Li), [email protected] (Yinglin Wang)


Developing an SPL comprises two distinct development processes: 1) domain engineering, which is the process of developing the common and variable software artifacts of the SPL (Pohl et al., 2005), and 2) product derivation, which is the process of configuring the reusable software artifacts for a customer- or market-specific set of requirements. The core principle behind SPL engineering is that the increased cost of developing a set of reusable and customizable software artifacts can be amortized across multiple products (Deelstra et al., 2005), such as different Boeing aircraft.

An important component of an SPL is a model or roadmap that dictates the rules governing how the points of variability can be configured. A widely used approach for modeling the commonalities and variabilities in an SPL is Feature-Oriented Product Line Engineering, which captures and represents the commonalities and variabilities of systems in terms of features (Kang et al., 1990, 1998, 2002).


Features are essential abstractions of product characteristics relevant to customers and are typically increments of functionality (Kang et al., 2002; Batory et al., 2006).

Every product derived from a feature-oriented SPL is represented by a unique and valid combination of features. The valid combinations of features are governed by the SPL's feature model, which defines relationships between the features in an SPL. As shown in Figure 1, a feature model is a tree-like structure that defines the relationships among features in a hierarchical manner. The relationships between the features indicate the choices that the customer can make when customizing a product. For example, in Figure 1, the Alternative relationship between the two features "Static" and "Dynamic" indicates that developers can either choose a static or dynamic mechanism for implementing the memory allocation (the "MemAlloc" feature) in the database, but not both. The goal of product derivation is to select an optimal combination of features from the feature model that meets the stakeholder requirements.

Open Problem ⇒ SPL Feature Selection Optimization with Resource Constraints. Despite the benefits of SPLs, industrial case studies have shown that product derivation is still "a time-consuming and expensive activity" (Deelstra et al., 2004, 2005). Developers face a number of challenges when attempting to derive an optimized feature selection that meets an arbitrary set of requirements. For example, developers of the database SPL shown in Figure 1 may need to derive the set of features that meets a set of functional requirements for a mobile application while simultaneously minimizing memory consumption (the optimization goal). Some configurations of the database may reduce memory consumption but require more disk space than can be supported by the mobile platform (a resource constraint).

Even in a small feature model, feature combinatorics can produce an exponential number of product configurations. For example, about 280 different product configurations can be derived from the relatively simple feature model shown in Figure 1. Moreover, once a feature selection is made, it must be verified to conform to the myriad constraints in the feature model. Additional resource constraints or non-functional requirements, such as the binary size, performance, or total budget for a product (Siegmund et al., 2008; White et al., 2009), make the feature selection process even more complex and difficult. Previous research has shown that finding an optimal feature selection that conforms to both the feature model constraints and the resource constraints is an NP-hard problem (White et al., 2009). To overcome the complexity of selecting an optimized feature selection that meets resource constraints, developers need tools to help automate the process.

SPL feature selection optimization with resource constraints is similar to configuration optimization problems that have been addressed by other automated feature selection approaches that do not consider resource constraints (Benavides et al., 2008). Many researchers in the SPL community have applied and extended various AI techniques to solve the SPL feature selection problem, e.g., using CSP (Benavides et al., 2005), BDD (Czarnecki & Wasowski, 2007; Mendonca et al., 2009) and SAT solvers (Batory, 2005), but have not considered resource constraints in their work. Moreover, these exact techniques show exponential time complexity and do not scale to large industrial feature models with hundreds or thousands of features (White et al., 2009). Other researchers have developed polynomial-time approximation algorithms for selecting highly optimal feature sets (White et al., 2009), but their approaches still require significant computing time. In addition, some visualization techniques (Sellier & Mannion, 2007; Botterweck et al., 2007) have been devised to assist developers in feature selection. However, industrial-sized feature models can have hundreds or thousands of features (Steger et al., 2004; Loesch & Ploedereder, 2007), which makes a manual feature selection process challenging, particularly for an NP-hard feature selection problem with resource constraints.

Solution Approach ⇒ Genetic Feature Selection Optimization Algorithms. In this paper, we introduce an AI approach, based on genetic algorithms (GAs), to SPL feature selection optimization with resource constraints. A GA is a stochastic and global search heuristic that mimics natural evolution (Mitchell, 1996). It can quickly scan a vast population of possible solutions. GAs often work well for highly constrained problems, such as the Traveling Salesman (Muhlenbein, 1989) and Satisfiability problems (De Jong & Spears, 1989).

Figure 1: Example feature model for the FAME-DBMS SPL (diagram omitted; its legend distinguishes mandatory, optional, non-terminal, and terminal features, And-, Or-, and Alternative-groups, and [1..n] cardinalities)

In this paper, we adapt GAs to the SPL feature selection problem, and propose a GA-based AI approach, called GAFES. GAFES can quickly derive an optimized feature selection by evaluating different configurations that both optimize product capabilities and honor resource limitations. GAFES combines a novel feature selection repair operator and a penalty function for resource constraints to obtain fast feature selection times. Our empirical results show that GAFES can produce feature selections with objective function scores that are within 86-97% of the feature selections produced by prior AI feature selection techniques, such as FCF (White et al., 2009) and CSP/SAT-based approaches (Benavides et al., 2005; Batory, 2005; Czarnecki & Wasowski, 2007), for feature models whose size ranges from 10 to 10,000 features. Moreover, in our experiments, GAFES derives feature selections 45-99% faster than these existing heuristic and exact feature selection optimization techniques.

The main contributions of this paper to the study of AI-based optimized feature selection for SPLs are summarized as follows:

• We show how to adapt GAs to SPL feature selection optimization with resource constraints and propose a modified GA named GAFES to derive an optimized feature selection subject to the feature model constraints and the resource constraints.

• We present a repair operator that allows a GA or other evolutionary algorithm to transform an arbitrary feature set into a valid feature combination that conforms to the feature model constraints.

• We describe a penalty function, based on the ratio of objective function value to consumed resources, that can improve the search process of a GA or evolutionary algorithm so that it meets resource constraints.

• We present empirical results from experiments on feature models with 10 to 10,000 features. The results show that GAFES can produce feature selections with objective function scores that are within 86-97% of feature selections produced by other feature selection techniques. GAFES, however, derives feature selections 45-99% faster than existing heuristic and exact feature selection techniques.

The remainder of this paper is organized as follows: Section 2 contains a brief introduction to feature models; Section 3 presents a motivating example; Section 4 formalizes the SPL feature selection problem; Section 5 discusses the challenges of adapting GAs to the SPL feature selection problem; Section 6 details our genetic feature selection optimization algorithm, called GAFES, and analyzes its algorithmic complexity; Section 7 introduces our experiments with feature model generation and test pattern generation, and presents our empirical results; Section 8 compares our work with related work; and Section 9 presents concluding remarks and lessons learned.

2. Feature Modeling Background

Figure 1 shows a partial feature model of an embedded database SPL called FAME-DBMS inspired from (Rosenmuller et al., 2008; Thum et al., 2009). A feature model is a tree of features. Every node in the tree has one parent except the root feature (e.g., 'DB'). A terminal feature (e.g., 'NutOS') is a leaf and a non-terminal feature (e.g., 'OS') is an interior node of a feature diagram (Batory, 2005; Thum et al., 2009). Every non-terminal feature represents a composition of features that are its descendants.

A feature model is organized hierarchically and is graphically depicted as an AND-OR feature diagram (Kang et al., 1990). Cross-tree constraints are used to represent non-hierarchical composition rules comprising mutual dependency (requires) and mutual exclusion (excludes) relationships (Kang et al., 1990). There are two cross-tree constraints in Figure 1, "LRU requires BTree" and "LFU requires BTree".

New products are derived from a feature model by finding a configuration of terminal features (Thum et al., 2009). A feature selection represents a specific product satisfying customer requirements. The actual resource consumption and the benefits of a product can be calculated from the set of terminal features (White et al., 2009). For example, a feature selection from the FAME-DBMS feature model can be used to calculate the amount of memory consumed by the buffer management configuration that is contained in the feature selection.

A feature selection is valid if the selection of features is allowed by the constraints described in the feature model. Connections between a feature and its group of children define the constraints on feature selection. The constraints that can be used to govern the allowable child feature selections are And- (e.g., 'OS', 'BufferMgr', 'DebugLogging', and 'Storage'), Or- (e.g., 'get', 'put' and 'delete'), and Alternative-groups (e.g., 'NutOS' and 'Win'). The members of And-groups can be either mandatory (e.g., 'OS', 'BufferMgr', and 'Storage') or optional (e.g., 'DebugLogging'). Or-groups and Alternative-groups have their own cardinalities (Czarnecki & Wasowski, 2007). For example, at least 1 and at most 3 API features ('get', 'put' and 'delete') must be selected. Other constraints have been proposed, such as cardinality constraints (Czarnecki & Wasowski, 2007), but we focus on the core constraints that are common across all feature modeling approaches.

The rules for selecting features from a feature model can be summarized as follows (Guo & Wang, 2010): if a feature is selected, its parent must also be selected. If a feature is selected, all of its mandatory children participating in an And-group must be selected. For example, in Figure 1, "Storage" has a mandatory sub-feature "API", which must also be selected if "Storage" is selected. If the selected feature has an Or-group containing children, at least one child must be selected, and in Alternative-groups, exactly one child is selected. For example, at least 1 and at most 3 child features of "API" can be selected. "Indexing" requires the selection of either its "BTree" or "Unindexed" sub-feature, but not both.
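These selection rules can be phrased as a mechanical check. The following Python sketch encodes a small hypothetical fragment of the Figure 1 model (the dictionary-based representation is ours, not the paper's implementation) and tests whether a given selection satisfies the parent, And-, Or-, Alternative-group, and cross-tree rules.

# Minimal validity check for a feature selection, following the rules above.
# The feature-model encoding below is a hypothetical illustration covering
# only a fragment of Figure 1, not the representation used by GAFES.
PARENT = {"OS": "DB", "Storage": "DB", "API": "Storage", "Indexing": "Storage",
          "get": "API", "put": "API", "delete": "API",
          "BTree": "Indexing", "Unindexed": "Indexing"}
MANDATORY = {"DB": ["OS", "Storage"], "Storage": ["API", "Indexing"]}   # And-group children
OR_GROUPS = {"API": ["get", "put", "delete"]}                            # at least one child
ALT_GROUPS = {"Indexing": ["BTree", "Unindexed"]}                        # exactly one child
REQUIRES = [("LRU", "BTree"), ("LFU", "BTree")]
EXCLUDES = []

def is_valid(selection):
    sel = set(selection)
    for f in sel:
        if f in PARENT and PARENT[f] not in sel:          # parent rule
            return False
    for parent, children in MANDATORY.items():
        if parent in sel and not all(c in sel for c in children):
            return False
    for parent, children in OR_GROUPS.items():
        if parent in sel and not any(c in sel for c in children):
            return False
    for parent, children in ALT_GROUPS.items():
        if parent in sel and sum(c in sel for c in children) != 1:
            return False
    for a, b in REQUIRES:
        if a in sel and b not in sel:
            return False
    for a, b in EXCLUDES:
        if a in sel and b in sel:
            return False
    return True

print(is_valid({"DB", "OS", "Storage", "API", "Indexing", "get", "BTree"}))                 # True
print(is_valid({"DB", "OS", "Storage", "API", "Indexing", "get", "BTree", "Unindexed"}))    # False: Alternative-group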

3. Motivating Example

Figure 2 shows a scenario of a sensor network that applies FAME-DBMS, which is excerpted from (FAME-DBMS, 2002). In this motivating example, sensor nodes measure values for temperature, air pressure, and luminance in a biological environment to monitor the growth conditions of different plants. A biological scientist can also manually augment the sensor data by entering additional information about the plants into a PDA. Scientists can use a PDA to retrieve information from the sensor network through a data access point. All the data is stored in a server for further analysis. In every embedded system of this scenario, such as the sensor nodes and the PDA, the FAME-DBMS is used to store and retrieve the data.

The key to product derivation is to select a good feature combination from a feature model according to the requirements of customers and vendors. In practice, feature selection must consider the trade-off between the resource consumption and value of the target product and the limited vendor budget.

Figure 2: A sample application scenario for the FAME-DBMS (diagram omitted; it shows sensor nodes, data access points, a PDA, and a server connected by wireless and Bluetooth links)

Table 1: Example resource consumption and performance of a subset of the FAME-DBMS features

Feature                          Perf.  CPU  Mem.  Cost
DB/OS/NutOS                      8      4    2048  10
DB/OS/Win                        5      2    1024  15
...
DB/DebugLogging                  2      3    256   50
...
DB/Storage/Indexing/Btree        8      4    512   30
DB/Storage/Indexing/Unindexed    4      2    128   20

In the motivating example, the resource consumption comes from the consumption of CPU, memory, available budget, and development staff time (White et al., 2009). Regardless of what configuration of the system is chosen, the resource consumption must not exceed the available values of the target platform. For example, the total memory consumed by the database deployed to the PDA cannot be more than the memory available on the PDA. Moreover, the product may need to be optimized for a specific characteristic, such as minimizing the total cost of each node in the sensor network or maximizing maintainability (Siegmund et al., 2008), performance, stability, etc.

Taking the feature model shown in Figure 1 as an example, Table 1 shows example information about the provided performance and the consumed resources of a subset of the FAME-DBMS features. Each feature is identified by the path from the root feature in the model to it. In this example, customers want a target product (an embedded database) with maximum performance, but that fits within a limited budget. This budget constraint is a resource limitation, e.g., CPU ≤ 40, Memory ≤ 4096, and Cost ≤ 200. Thus the problem is to derive a target product with the maximum performance subject to the resource limitations. For feature models with hundreds or thousands of features and a large number of feature combinations, choosing an optimized feature selection is hard.
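As a concrete illustration of the budget check, the short sketch below sums the per-feature consumption from the Table 1 rows shown above for one candidate subset and compares the totals against the limits CPU ≤ 40, Memory ≤ 4096, and Cost ≤ 200; the particular subset is chosen only for illustration.

# Resource check for one candidate selection, using the Table 1 rows above.
# Tuples are (perf, cpu, mem, cost); the chosen subset is illustrative only.
table = {
    "DB/OS/NutOS":               (8, 4, 2048, 10),
    "DB/DebugLogging":           (2, 3,  256, 50),
    "DB/Storage/Indexing/Btree": (8, 4,  512, 30),
}
limits = {"cpu": 40, "mem": 4096, "cost": 200}

selection = ["DB/OS/NutOS", "DB/DebugLogging", "DB/Storage/Indexing/Btree"]
perf = sum(table[f][0] for f in selection)
cpu  = sum(table[f][1] for f in selection)
mem  = sum(table[f][2] for f in selection)
cost = sum(table[f][3] for f in selection)

ok = cpu <= limits["cpu"] and mem <= limits["mem"] and cost <= limits["cost"]
print(f"performance={perf}, cpu={cpu}, mem={mem}, cost={cost}, feasible={ok}")
# performance=18, cpu=11, mem=2816, cost=90, feasible=True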

4. Feature Selection with Resource Constraints Problem Formalization

More formally, SPL feature selection optimization with resource constraints can be defined as follows. Let F = {f_i}, 1 ≤ i ≤ n, denote all n features defined in a feature model and C all the dependency constraints and cross-tree constraints depicted by the arcs in the feature diagram, e.g., in Figure 1, 'OS', 'BufferMgr', and 'Storage' are required child features of 'DB'. Every feature f_i ∈ F has an associated resource consumption, r(f_i) ∈ Z, and a provided value, v(f_i) ∈ Z. R(F) indicates the set of resources consumed by all the features (e.g., columns 3-5 in Table 1) and V(F) the set of values provided by all the features (e.g., column 2 in Table 1). Rc includes the resource constraints, e.g., CPU ≤ 40, Memory ≤ 4096, and Cost ≤ 200.

Here, we only consider the resource consumed by and the value provided by terminal features because the actual resource consumption and the benefits of a product mainly come from terminal features (White et al., 2009). That is to say, the resource consumed by and the value provided by all the non-terminal features are 0. Thus:

Definition 1. Given a feature model with n features F = {f_i}, 1 ≤ i ≤ n, and a set of constraints C, the goal of the feature selection problem is to find a feature subset S ⊆ 2^F from all valid feature combinations defined in the feature model such that

V(S) (i.e., Σ_{f_i ∈ S} v(f_i)) is maximized    (1)

subject to

S conforms to C    (2)

and

R(S) (i.e., Σ_{f_i ∈ S} r(f_i)) ≤ Rc    (3)

for the resource constraint Rc ∈ Z+.

5. Challenges of Adapting GAs to SPL Feature Selection Optimization

To make well-informed configuration decisions, developers need the ability to easily generate and evaluate different feature selections that both satisfy resource limitations and optimize specific product capabilities, e.g., minimizing total cost or required CPU and memory in the motivating example. For example, developers of the FAME-DBMS system might need to compare two different SPL configurations that are optimized to minimize memory consumption. In one configuration, the developers might opt to slightly relax their budget constraint to determine what additional performance and storage capacity could be obtained for slightly more money.

Generating valid feature selections and evaluating them is computationally complex and time consuming, and thus these types of comparative analysis are difficult. From Definition 1, we can see that the SPL feature selection optimization problem with resource constraints is a highly constrained problem. SPL feature selection optimization with resource constraints has been proven to be NP-hard and thus exact algorithms for the problem do not scale well (White et al., 2009).

One approach to addressing this problem is to use an AI technique, such as a GA, to automate the feature selection process. However, there are a number of challenges to developing a GA for SPL feature selection optimization with resource constraints. In the remainder of this section, we show that there are three key challenges to developing a GA for optimized feature selection with resource constraints: 1) randomly generating initial solutions tends to produce invalid starting points for a GA; 2) traditional GAs for general feature selection (Yang & Honavar, 1998; Oh et al., 2004) generate an arbitrary feature set, which may not conform to the feature model constraints; and 3) when facing feature selection problems that include resource constraints, GAs require some form of repair or penalty approach, which is hard to derive.

5.1. Challenge 1: Randomly Seeding the Initial Population with Valid Feature Selections

A GA is a stochastic algorithm that mimics natural evolution. Its basic steps and running mechanism are shown in Figure 3. The GA first maintains an initial population of solutions (called individuals or chromosomes). A binary encoding is used to represent a potential solution as a chromosome (e.g., a string such as "10010101"). As in the case of biological evolution, the evolutionary process begins with an initial population of chromosomes and proceeds over a series of generations. In each generation, every chromosome in the current population is evaluated by a fitness function, which is an objective function that determines the optimality of a chromosome so that chromosomes can be compared against each other. The best chromosomes are selected as parents and undergo genetic operations, such as crossover and mutation, to generate a new offspring from them. The offspring then replaces a member of the population in the next generation based on a replacement policy. The evolutionary process continues until a chromosome satisfying a minimum criterion is found or a fixed number of generations is reached.
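The evolutionary loop just described can be summarized in a few lines. The sketch below is a generic GA skeleton, not GAFES itself: the initialization, fitness, crossover, mutation, and replacement functions are placeholders to be filled in with the problem-specific operators discussed in Section 6.

import random

# Generic GA skeleton following the loop in Figure 3; all operators are
# caller-supplied placeholders, not the GAFES-specific ones of Section 6.
def genetic_algorithm(init_population, fitness, crossover, mutate, replace,
                      max_generations=100):
    population = init_population()
    for _ in range(max_generations):
        p1, p2 = random.sample(population, 2)        # select parents
        offspring = mutate(crossover(p1, p2))        # genetic operations
        population = replace(population, offspring, fitness)
        # a stopping criterion on fitness could also terminate the loop early
    return max(population, key=fitness)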

To employ a GA for feature selection, a group of feature sets must be generated randomly to obtain an initial population. However, since they are generated randomly, these initial feature sets may not be valid feature combinations that conform to the feature model constraints.

Figure 3: The running mechanism of GAs (diagram omitted; it shows the loop of population initialization, fitness calculation, parent selection, genetic operations producing offspring, replacement, and the stopping test on a found solution or the maximum generation)

For example, according to traditional GAs, a feature set {Win, Dynamic, LRU, API, get, BTree, Unindexed} could be randomly selected from the feature model shown in Figure 1 and be used as an initial population member. However, this random feature set is not a valid product configuration because only one feature from "BTree" and "Unindexed" can be selected. Thus randomly generating initial solutions tends to produce a large set of invalid starting points, which decreases the probability that a valid or optimized solution will be found. In order to develop a good GA for feature selection, developers must devise methods for handling the random generation of a set of valid feature selections. Section 6.3 describes how we address this challenge by applying an algorithm that transforms randomly generated feature sets into valid feature selections.

5.2. Challenge 2: Generating Valid Feature Selections During Feature Selection Evolution

After a GA has generated an initial population, it proceeds to select population members to combine to produce new solutions. A core tenet of a GA is that combining two solutions has the potential to yield another good or better solution. When population members represent feature selections, however, combining two arbitrary solutions without violating feature model constraints is hard.

For example, suppose there are two initial valid solutions for the feature model shown in Figure 1, {DB, OS, BufferMgr, Storage, Win, InMemory, API, Indexing, get, Unindexed} and {DB, OS, BufferMgr, Storage, NutOs, Persistent, API, Indexing, MemAlloc, PageRepl, get, BTree, Dynamic, LRU}. Using the chromosome encoding process described in Section 6.1, we perform a breadth-first traversal (a.k.a. level-order traversal) of the feature model and encode the two initial solutions into two strings, representing their genetic chromosomes: "1110101011100100010000" and "1110110101111100100110". Then we perform a uniform crossover operation (presented in Section 6.6) with a random crossover mask "1100111001001110110011". Finally we get a new offspring "1110101101110100010100" that indicates an invalid feature selection {DB, OS, BufferMgr, Storage, Win, Persistent, API, Indexing, MemAlloc, get, Unindexed, Dynamic}. Section 6.2 describes how we address this challenge by introducing an algorithm, called fmTransform, to transform the newly generated offspring into a valid feature selection that conforms to the feature model constraints.
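The offspring above can be reproduced mechanically: assuming the convention that a mask bit of 1 copies the bit from the first parent and a 0 copies it from the second, the strings from this example combine as follows (a small sketch for verification, using the exact strings quoted above).

# Uniform crossover on the two chromosomes from the example above:
# mask bit 1 -> take the bit from parent 1, mask bit 0 -> from parent 2.
p1   = "1110101011100100010000"
p2   = "1110110101111100100110"
mask = "1100111001001110110011"

offspring = "".join(a if m == "1" else b for a, b, m in zip(p1, p2, mask))
print(offspring)  # 1110101101110100010100 -- the invalid selection discussed above
assert offspring == "1110101101110100010100"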

5.3. Challenge 3: Generating Feature Selections that Fit Resource Constraints

The SPL feature selection problem becomes more complex when resource constraints, such as CPU ≤ 40, Memory ≤ 4096, and Cost ≤ 200, must be considered. These considerations add further constraints that the population members may violate. Not only may the randomly generated initial population members consume more resources than are available, but the evolutionary combination of chromosomes may yield offspring with excess resource consumption.

One common approach used in GAs to handle situations where generated offspring may represent invalid solutions is to use a repair operator to fix invalid solutions. A repair operator takes an invalid solution as input and generates a valid solution. The key challenge to applying repair operators to resource consumption violations is that determining how to reach a valid feature selection that satisfies resource constraints from an arbitrary feature selection is an NP-hard problem (White et al., 2009).


Another approach that is commonly employed is to design the objective function so that invalid solutions always score lower than valid solutions. However, designing an objective function to balance resource consumption against desired SPL product capability is hard. Section 6.4 describes how we address this challenge by computing the proportion of the generated value to the consumed resources as the fitness function in GAFES.

6. GAFES: A GA-based AI Approach to Optimized Feature Selection in SPLs

Following the idea of GAs, we designed a GA-based AI approach to automate SPL feature selection optimization with resource constraints, GAFES, which is presented in Algorithm 1. In the following sections, we present a detailed specification of GAFES. The remainder of this section is structured as follows: 1) we define a mathematical representation for the chromosomes (solutions) (Section 6.1); 2) we present an algorithm, called fmTransform, that can repair a feature selection that violates the feature model constraints (Section 6.2); 3) we describe how GAFES uses a combination of random feature selection and fmTransform to obtain an initial population (Section 6.3); 4) we define the fitness function, present the mechanism of selection and replacement, and give the termination conditions for GAFES (Section 6.4); 5) we define the genetic operations including crossover and mutation (Section 6.6); and 6) we discuss how we determine the initial parameter values for GAFES, including the size of the population, the maximum generation, the crossover probability and the mutation rate.

6.1. Feature Chromosome Encoding

A chromosome in our GAFES represents a feature combination in a feature model. For a feature model with n features, each chromosome is composed of n genes and each gene represents a feature. Here, a string with n binary digits is used to encode a chromosome. A binary digit represents a feature, values 1 and 0 meaning selected and unselected. For example, a chromosome "1110101011100100010000" means that the features {DB, OS, BufferMgr, Storage, Win, InMemory, API, Indexing, get, Unindexed} are selected, if a breadth-first traversal of the feature model shown in Figure 1 is performed.

Algorithm 1: GAFES
Input: a feature model with F = {f_i}, 1 ≤ i ≤ n, C, R(F), V(F), and Rc.
Output: S ⊆ 2^F.
1   Initialize population P;
2   repeat
3       select two parent chromosomes p1 and p2 from P;
4       offspring = crossover(p1, p2);
5       mutation(offspring);
6       fmTransform(offspring);
7       if R(offspring) ≤ Rc then
8           replace(P, offspring);
9   until stopping condition;
10  return S that is the fittest chromosome in P;
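One straightforward way to realize this encoding is to fix the gene order by a breadth-first traversal of the feature tree and map selections to bit strings. The sketch below uses a small hypothetical tree (not the full Figure 1 model, so the gene order differs from the example string above) to show the idea.

from collections import deque

# Breadth-first gene ordering for a toy feature tree (hypothetical fragment,
# not the exact ordering used for Figure 1 in the paper).
children = {"DB": ["OS", "Storage"], "OS": ["NutOS", "Win"],
            "Storage": ["API", "Indexing"], "API": ["get", "put", "delete"],
            "Indexing": ["BTree", "Unindexed"]}

def bfs_order(root):
    order, queue = [], deque([root])
    while queue:
        f = queue.popleft()
        order.append(f)
        queue.extend(children.get(f, []))
    return order

ORDER = bfs_order("DB")

def encode(selection):                       # feature set -> bit string
    return "".join("1" if f in selection else "0" for f in ORDER)

def decode(chromosome):                      # bit string -> feature set
    return {f for f, bit in zip(ORDER, chromosome) if bit == "1"}

print(ORDER)
print(encode({"DB", "OS", "Win", "Storage", "API", "get", "Indexing", "BTree"}))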

6.2. fmTransform: An Algorithm for Arbitrary Feature Set Transformation

As described in Section 5, the key to adapting GAs to the SPL feature selection problem is to develop a mechanism to transform an arbitrary (maybe invalid) feature selection into a valid feature combination. We designed an algorithm, called fmTransform, to achieve this task. As shown in Algorithm 2, the input is a feature model with its features F and its constraints C, and an arbitrary feature set SR. Generally SR is generated randomly. The output is a valid feature combination SV transformed from SR according to C. SE contains all the features that should not be selected.

Algorithm 2 works as follows. For each feature f ∈ SR, if f ∉ SV and f ∉ SE, then f is included in SV. Meanwhile, a path passing through the feature f is extended until one end of the path reaches the root feature and the other reaches a terminal feature. All the features in the path are also included in SV. The above process is repeated until all the features in SR are traversed. After traversing SR, we traverse all the features in SV, find those features whose children have not yet been included in SV, and put them into SG. For each feature in SG, we generate its path from the root feature to a terminal feature and then include all the features along the path in SV, which guarantees that the final feature combination is complete and valid.

Step  Input feature in S_R  Initial operation    S_V                                             S_E
1     Win                   Includes('Win')      Win, OS, DB, BufferMgr, Storage, API, Indexing  NutOS
2     Dynamic               Includes('Dynamic')  (+) Dynamic, MemAlloc, Persistent, PageRepl     (+) Static, InMemory
3     LRU                   Includes('LRU')      (+) LRU                                         (+) LFU
4     API                   /                    (=)                                             (=)
5     get                   Includes('get')      (+) get                                         (=)
6     BTree                 Includes('BTree')    (+) BTree                                       (+) Unindexed
7     Unindexed             /                    (=)                                             (=)

(+) means the current result equals the result of the previous step plus the following elements.
(=) means the current result equals the result of the previous step.

Figure 4: Demonstration of executing fmTransform

Algorithm 2: fmTransform(SR)
Input: F, C, SR ⊆ 2^F.
Output: SV ⊆ 2^F.
1   SV ← ∅; SE ← ∅; SG ← ∅;
2   foreach feature f ∈ SR do
3       if f ∉ SV and f ∉ SE then IncludeFeature(f);
4   end
5   foreach feature f ∈ SV do
6       if every child feature of f ∉ SV then SG ← f;
7   end
8   foreach feature f ∈ SG do
9       while f is not a terminal feature do
10          f ← randomly select a child feature of f;
11          IncludeFeature(f);
12      end
13  end

Figure 4 demonstrates a sample process of executing fmTransform on the feature model shown in Figure 1. The input SR is {Win, Dynamic, LRU, API, get, BTree, Unindexed}. When a feature in SR is input, a set of features required by the input feature and the feature model constraints is included in SV, and another set of features excluded by the input feature and the constraints is included in SE. The final output SV is {DB, OS, Win, BufferMgr, Persistent, MemAlloc, Dynamic, PageRepl, LRU, Storage, API, get, Indexing, BTree}, and SE is {NutOS, Static, InMemory, LFU, Unindexed}.

Algorithm 3: IncludeFeature(f)
1   if f ∉ SV and f ∉ SE then SV ← f;
2   if f is not the root feature then
3       IncludeFeature(the parent feature of f);
4   if f ∈ an Alternative-group then
5       ExcludeFeature(all f's brother features in the same group);
6   if f's children ∈ an And-group then
7       foreach feature f′ ∈ the group of f's children do
8           if f′ is mandatory then IncludeFeature(f′);
9       end
10  if f excludes f′ ∈ C or f′ excludes f ∈ C then
11      ExcludeFeature(f′);
12  if f requires f′ ∈ C then IncludeFeature(f′);

Algorithm 4: ExcludeFeature(f)
1   if f ∉ SV and f ∉ SE then SE ← f;
2   foreach feature f′ ∈ the group of f's children do ExcludeFeature(f′);
3   if f ∈ an And-group and f is mandatory then
4       ExcludeFeature(the parent feature of f);
5   if f′ requires f ∈ C then ExcludeFeature(f′);

Algorithm 3 and Algorithm 4 determine whether a feature should be included (∈ SV) or excluded (∈ SE). They work in terms of the following rules. First, if a feature is included and is not the root feature, then its parent should also be included; if a feature is excluded, then its children should also be excluded.

9

Page 10: A Genetic Algorithm for Optimized Feature Selection with ...jules/jianmei.pdf · The results show that GAFES can produce fea-ture selections that are within objective function scores

Second, for an Alternative-group, if one of its member features f is included, then all of f's brother features in the same group should be excluded. Third, for an And-group, if its parent feature is included, then all of its mandatory features should be included; if one of its mandatory features is excluded, then its parent feature should be excluded. Fourth, for a constraint A requires B, if A is included, then B should be included; if B is excluded, then A should be excluded. Fifth, for a constraint A excludes B, whichever of A or B is included, the other one should be excluded. Sixth, for other features (e.g., the features in an Or-group), if necessary, any feature is selected randomly in order to generate a complete feature combination.
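These rules can be expressed compactly as two mutually recursive procedures. The sketch below mirrors the spirit of IncludeFeature and ExcludeFeature on a simplified, hypothetical model representation; it covers the parent, Alternative-group, mandatory-child, and requires/excludes rules, but omits the random completion of Or-groups and the cascading rules of ExcludeFeature.

# Simplified IncludeFeature / ExcludeFeature in the spirit of Algorithms 3-4.
# The model encoding is a hypothetical illustration; Or-group completion and
# the full cascading logic of ExcludeFeature are omitted.
PARENT    = {"OS": "DB", "NutOS": "OS", "Win": "OS"}
ALT_GROUP = {"NutOS": ["NutOS", "Win"], "Win": ["NutOS", "Win"]}   # feature -> its Alternative-group
MANDATORY = {"DB": ["OS"]}                                          # And-group: mandatory children
REQUIRES  = {"LRU": ["BTree"]}
EXCLUDES  = {}                                                      # feature -> mutually exclusive features

S_V, S_E = set(), set()

def include(f):
    if f in S_V or f in S_E:
        return
    S_V.add(f)
    if f in PARENT:                                   # rule 1: parent must be selected
        include(PARENT[f])
    for sibling in ALT_GROUP.get(f, []):              # rule 2: exclude Alternative siblings
        if sibling != f:
            exclude(sibling)
    for child in MANDATORY.get(f, []):                # rule 3: mandatory And-children
        include(child)
    for g in REQUIRES.get(f, []):                     # rule 4: requires
        include(g)
    for g in EXCLUDES.get(f, []):                     # rule 5: excludes
        exclude(g)

def exclude(f):
    if f in S_V or f in S_E:
        return
    S_E.add(f)

include("Win")
print(sorted(S_V), sorted(S_E))   # ['DB', 'OS', 'Win'] ['NutOS']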

Since a feature model with cross-tree constraints can encode an arbitrary satisfiability problem, it is not always possible to find a valid feature selection. The fmTransform algorithm uses a retry counter to limit the time spent attempting to repair a feature selection. In practice, we have found it rare for fmTransform to be unable to generate a valid feature selection, but more research is needed to identify feature model architectures for which it does not perform well.

6.3. Initial Population

The initial population represents a set of initial solutions. GAFES' algorithm for generating the initial population is shown in Algorithm 5. A string with n binary digits is randomly generated to represent a chromosome. Each randomly generated chromosome essentially represents a random feature set in a feature model. By applying the algorithm fmTransform, these random feature sets are then transformed into a set of valid feature combinations that conform to the feature model constraints. As described in Section 5.1, all the transformed feature combinations must also satisfy the resource constraints, which means that some initial solutions may be invalid starting points. To prevent the initial population generation step from running indefinitely, GAFES controls the parameter retries to generate an initial population within a limited number of retries. Moreover, GAFES controls the parameter d to obtain the expected number of selected features in every generated chromosome. Here, the function random(1) generates a random float number within [0, 1].

Algorithm 5: Initial population generation
1   i = 1;
2   repeat
3       retries = 0;
4       while retries < 2 * ||P|| do
5           foreach gene g in chromosome_i do
6               if random(1) < d/n then g = 1 else g = 0;
7           end
8           fmTransform(chromosome_i);
9           if R(chromosome_i) ≤ Rc then
10              i++; break;
11          else
12              retries++; continue;
13          end
14      end
15  until i > ||P||;
16  sort all chromosomes in P nonincreasingly in terms of their fitness;
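A direct transcription of this generation loop might look as follows; it is a sketch under the assumption that the repair, resource, and fitness functions are passed in as callables, with the density parameter d controlling the expected number of initially selected features.

import random

# Sketch of Algorithm 5: seed pop_size chromosomes, repairing each random bit
# string with a caller-supplied repair function (e.g. an fmTransform-style
# repair) and rejecting those that violate the resource limit, with a bounded
# number of retries per population slot.
def initial_population(n, pop_size, d, repair, resources, r_limit, fitness):
    population = []
    while len(population) < pop_size:
        for _ in range(2 * pop_size):                # retry limit per chromosome slot
            bits = [1 if random.random() < d / n else 0 for _ in range(n)]
            chromosome = repair(bits)
            if resources(chromosome) <= r_limit:
                population.append(chromosome)
                break
        else:                                        # retries exhausted for this slot
            break
    return sorted(population, key=fitness, reverse=True)   # nonincreasing fitness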

6.4. Feature Selection Fitness Evaluation

In order to evaluate the solutions and select the best one, a fitness metric is needed. For SPL feature selection optimization with resource constraints, the obvious approach is to allow a developer to supply a domain-specific objective function that describes the value of a solution. The challenge, however, is that directly employing this type of function does not penalize solutions that violate resource constraints. As we described in Section 5.3, this is a significant problem.

To overcome this challenge, GAFES transforms the objective function, supplied by the developer, in order to penalize solutions which overconsume resources. The basic heuristic that GAFES uses to transform the objective function is that the best solutions will produce more value per unit of resource consumption. Thus the fitness of a chromosome Ch is defined as the value of the solution produced by the domain-specific objective function divided by the net aggregate resource consumption. More formally, GAFES' objective function is defined as:

Fitness(Ch) = V(Ch)/R(Ch). (4)

where:

10

Page 11: A Genetic Algorithm for Optimized Feature Selection with ...jules/jianmei.pdf · The results show that GAFES can produce fea-ture selections that are within objective function scores

• Ch - is a chromosome in the population, indicating a feature selection in a feature model.

• V(Ch) - is the value of the solution Ch, determined by the domain-specific objective function provided by the developer. For example, the value produced by a specific feature selection indicates the possible performance, cost, maintainability, stability, etc.

• R(Ch) - is the sum of the resources consumed by all the features included in the chromosome Ch. The resources consumed by a specific feature indicate the possible CPU, memory, development staff time, etc.

Developers can adapt the fitness function to optimize for different measurements like performance, maintenance, etc. by changing V(Ch). R(Ch) can be seen as a penalty factor of the fitness function, which makes solutions that provide higher value per unit of resource consumption preferred. When facing multiple kinds of resources, R(Ch) can be designed as a multi-dimensional vector where each dimension indicates one kind of resource. This approach also permits weighting of resources to reflect importance.
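In code, the penalized fitness is just the value-to-resource ratio of Equation (4); when several resource types are tracked, the denominator can be a weighted sum over the resource vector. The sketch below shows one such weighting; the weights and data layout are illustrative assumptions, not values from the paper.

# Fitness(Ch) = V(Ch) / R(Ch), with R(Ch) aggregated over several resource
# dimensions by illustrative weights (the paper leaves the weighting open).
def fitness(selection, value, resources, weights=(1.0, 1.0, 1.0)):
    v = sum(value[f] for f in selection)
    r = sum(w * sum(resources[f][i] for f in selection)
            for i, w in enumerate(weights))
    return v / r if r > 0 else 0.0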

6.5. Selection and Replacement of Feature Selections

The chromosome selection process for the next generation adopts a simple random selection mechanism. For each generation, two parent chromosomes are selected randomly from the population. The crossover operation generates a new chromosome (offspring) out of the two parents, and the mutation operation slightly perturbs the offspring.

GAFES adopts the replacement mechanism developed in (Bui & Moon, 1996). If the generated offspring is superior to both parents, it replaces the similar parent; if it is in between the two parents, it replaces the inferior parent; otherwise, the most inferior chromosome in the population is replaced. GAFES stops when the number of generations reaches the predefined maximum generation G.
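This replacement rule can be written down directly. The sketch below assumes Hamming distance as the parent-similarity measure between bit strings, which is our assumption, since the paper only cites (Bui & Moon, 1996) for the policy without naming the measure.

# Replacement policy after (Bui & Moon, 1996) as described above.
# Hamming distance is an assumed similarity measure between bit strings.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def replace(population, p1, p2, offspring, fitness):
    f_off, f1, f2 = fitness(offspring), fitness(p1), fitness(p2)
    if f_off >= max(f1, f2):                       # better than both parents
        victim = p1 if hamming(offspring, p1) <= hamming(offspring, p2) else p2
    elif f_off >= min(f1, f2):                     # between the two parents
        victim = p1 if f1 < f2 else p2             # the inferior parent
    else:                                          # worse than both parents
        victim = min(population, key=fitness)      # the worst population member
    population[population.index(victim)] = offspring
    return population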

6.6. Crossover and Mutation

GAFES uses the standard crossover and mutation operations.

Figure 5: An example of uniform crossover and point mutation (diagram omitted)

As shown in Figure 5, a uniform crossover operation is used to combine bits sampled uniformly from the two parents. In this case, the crossover mask is generated as a random bit string with each bit chosen at random and independent of the others. The point mutation operation produces small random changes to the bit string by choosing a single bit at random, then changing its value. Mutation is performed after crossover.

6.7. Parameters

No systematic parameter optimization process has so far been attempted, but we use the following parameters in our experiments: population size ||P|| = 30, maximum generation G = 100, crossover probability = 1 (always applied), and mutation rate = 0.1.

6.8. Algorithmic Complexity

We first analyze the algorithmic complexity of fmTransform. As shown in Algorithm 3 and Algorithm 4, both IncludeFeature(f) and ExcludeFeature(f) cost O(cm log n), where m is the average number of child features for each feature and c is the maximum number of cross-tree constraints (i.e., all the requires and excludes relationships) in the feature model. Thus, in Algorithm 2, steps 2∼4 cost O(||SR|| * (cm log n)). Steps 5∼7 cost O(||SV|| * m). Steps 8∼13 cost O(||SG|| * cm log^2 n). Since ||SR|| < n, ||SV|| < n, and ||SG|| < n, the total time of Algorithm 2, T(fmTransform), is O(cmn log n + mn + cmn log^2 n) = O(cmn log^2 n).

The algorithmic complexity of GAFES shown in Algorithm 1 can be decomposed as follows. The first step requires O(||P|| * log ||P|| + ||P||^2 * T(fmTransform)) = O(||P||^2 * T(fmTransform)) time to generate an initial population, as shown in Algorithm 5. The parent selection operation in step 3 costs O(1), the crossover operation in step 4 O(n), the mutation operation in step 5 O(1), and the replace operation in step 8 O(||P||). Thus, steps 2∼9 cost O(G * (||P|| + n + T(fmTransform))) = O(G * T(fmTransform)). Step 10 costs O(||P||). Here, ||P|| = 30 and G = 100. Thus, the total time of Algorithm 1 is O(||P||^2 * T(fmTransform) + G * T(fmTransform)) = O(||P||^2 * T(fmTransform)) = O(||P||^2 * cmn log^2 n). If there are no cross-tree constraints, the complexity is reduced to O(||P||^2 * mn log^2 n). Both algorithmic complexities are polynomial.

7. Empirical Results

In this section, we present empirical results from our experiments and evaluate our approach. As an approximation algorithm, our approach cannot guarantee that the generated approximation answer is optimal. For effectiveness evaluation, we introduce an important metric, optimality (Approximation Answer / Optimal Answer), to measure how close the algorithm can get to the optimal answer. Another important consideration is the time consumption and scalability, because we must ensure that our approach is efficient even if there are hundreds or thousands of features.

7.1. Experimental Setup

We first adopt Thum's method (Thum et al., 2009) to randomly generate feature models with different characteristics. We also use the test pattern generation technique devised by (Akbar et al., 2001) to generate random optimization problem instances for which we knew the optimal answer. Then we parametrically control the size of feature models for a thorough runtime evaluation.

The independent parameter in our experiments is the number of features in a feature model. The time needed to generate an approximation answer and the optimality are measured as dependent variables. To reduce the fluctuations in the dependent variables caused by random generation, we performed 100 repetitions for each configuration of independent parameters, i.e., we generated 100 random feature models with the same parameters and each performed the same optimization problem. All measurements were performed on the same Windows XP machine with an Intel Pentium 4 CPU at 3.0 GHz and 2 GB RAM.

7.1.1. Feature Model Generation

The algorithm to randomly generate feature models of size n is as follows (Thum et al., 2009): starting with a single root node, the generator runs several iterations of the generation process. In each iteration, an existing node without children is randomly selected, and between one and ten children are randomly added. These child nodes are connected to the parent either by an And- (50% probability), Or- (25% probability) or Alternative-group (25% probability). Children in an And-group are optional with a 50% probability. This iterative process is continued until the feature model has n features. All features with children are considered non-terminal. Moreover, we also generate cross-tree constraints (requires and excludes). For every 10 features, one constraint is generated by the following algorithm: two different features are randomly selected, and then are connected randomly by a requires (50% probability) or excludes (50% probability) link. According to Thum's survey (Thum et al., 2009), these parameters are backed up by most of the surveyed feature models and represent a rough average. Thus, these generated feature models basically reflect the characteristics of realistic feature models.
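The generation procedure is easy to prototype. The sketch below follows the probabilities given above (1-10 children per expansion, 50/25/25 group split, 50% optional And-children, one cross-tree constraint per 10 features); the dictionary representation is a hypothetical illustration, and the SAT-based filtering described next is not included.

import random

# Random feature model generator following the parameters above; the dict
# representation is an assumed illustration, and unsatisfiable models are
# not filtered out here (that is done with a SAT solver, see below).
def generate_feature_model(n):
    features = ["f0"]                                   # root
    nodes = {"f0": {"children": [], "group": None, "optional": []}}
    leaves = ["f0"]
    while len(features) < n:
        parent = random.choice(leaves)                  # a node without children
        leaves.remove(parent)
        k = min(random.randint(1, 10), n - len(features))
        group = random.choices(["and", "or", "alt"], weights=[0.5, 0.25, 0.25])[0]
        for _ in range(k):
            f = f"f{len(features)}"
            features.append(f)
            nodes[f] = {"children": [], "group": None, "optional": []}
            nodes[parent]["children"].append(f)
            leaves.append(f)
            if group == "and" and random.random() < 0.5:
                nodes[parent]["optional"].append(f)     # optional And-child
        nodes[parent]["group"] = group
    constraints = []
    for _ in range(len(features) // 10):                # one constraint per 10 features
        a, b = random.sample(features, 2)
        kind = random.choice(["requires", "excludes"])
        constraints.append((a, kind, b))
    return nodes, constraints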

The above generated feature models can be easily translated into propositional formulas (Batory, 2005; Thum et al., 2009). We use the SAT solver sat4j (http://www.sat4j.org) to validate these feature models and discard all feature models that do not have a single valid configuration (mostly due to poorly chosen cross-tree constraints). We repeat the entire process until the appropriate number of valid feature models is generated.

We fixed the following parameters: maximum number of children = 10; type of child group = (50%, 25%, 25%); optional child = 50%; number of cross-tree constraints = 0.1*n; variables in cross-tree constraints = 2. Again, Thum's survey (Thum et al., 2009) shows that these parameters reflect real feature models.

7.1.2. Test Pattern Generation

The optimization problem in product derivation is initialized by the following pseudo random numbers:


resource consumed by each terminal feature r(tf_i) = random(Rmax); resource consumed by each non-terminal feature r(nf_i) = 0; value per unit resource unitV = random(UnitVmax); value for each terminal feature v(tf_i) = r(tf_i) * unitV + random(Vmax); value for each non-terminal feature v(nf_i) = 0. Here, Rmax, UnitVmax, and Vmax are the upper bounds of the resource requirement, the unit price of resource, and the extra value of a feature beyond its consumed resource price. The value of each item is not directly proportional to the resource consumption. The function random(i) returns an integer from 0 to (i − 1), following the uniform distribution. The total resource constraint Rc = Rmax * nTF * 0.8, where nTF is the number of all terminal features in a feature model.

According to (Akbar et al., 2001), if we want to generate a problem for which an optimal answer is known, the following reinitializations have to be done. If S is the set of features selected by an approximation algorithm, then R(S) = Σ_{f_i ∈ S} r(f_i), i.e., exactly equal to the sum of consumed resources of all selected features, and the value associated with each selected feature is v(f_i) = r(f_i) * unitV + Vmax. Thus V(S) = Σ_{f_i ∈ S} v(f_i) = Σ_{f_i ∈ S} (r(f_i) * unitV + Vmax). This reinitialization ensures maximum value per unit resource for the selected features.
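The initialization and the reinitialization step can be sketched as follows; random(i) is realized by the uniform integer generator described above, and the reinitialization overwrites the values of the features actually selected by the algorithm so that the resulting optimum is known. The function names and data layout are our own illustrative choices.

import random

# Test-pattern initialization as described above (Rmax, UnitVmax, Vmax are the
# constants used in Sections 7.2 and 7.3); terminal features only, since
# non-terminal features contribute 0 resource and 0 value.
def init_problem(terminal_features, Rmax=10, UnitVmax=10, Vmax=20):
    unitV = random.randrange(UnitVmax)
    r = {f: random.randrange(Rmax) for f in terminal_features}
    v = {f: r[f] * unitV + random.randrange(Vmax) for f in terminal_features}
    Rc = Rmax * len(terminal_features) * 0.8          # total resource constraint
    return r, v, Rc, unitV

def reinitialize_for_known_optimum(selected, r, v, unitV, Vmax=20):
    # give every selected feature the maximum value per unit of resource,
    # so that the selected set becomes the known optimal answer
    for f in selected:
        v[f] = r[f] * unitV + Vmax
    return sum(v[f] for f in selected), sum(r[f] for f in selected)   # V(S), R(S)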

7.2. Experiment 1: Small Feature Models

We implemented GAFES using Java. For comparison, we implemented White's method (White et al., 2009), called Filtered Cartesian Flattening (FCF), that can transform SPL feature selection optimization with resource constraints into a Multi-dimensional Multiple-choice Knapsack Problem (MMKP). Then we solved the MMKP problem to generate an optimized feature selection using two kinds of optimization techniques: the Branch and Bound with Linear Programming (BBLP) (Alsuwaiyel, 1999) and the Modified Heuristic (M-HEU) algorithms (Akbar et al., 2001). M-HEU is a heuristic technique. It puts an upper limit on the number of upgrades and downgrades that can be performed (Akbar et al., 2001). BBLP can find an exact solution. But finding exact solutions is NP-hard, which means that it is not feasible to apply BBLP to all practical cases (e.g., n > 500). Therefore, we first performed an experiment on small feature models whose size n varies from 10 to 200.

Hypothesis. Our hypothesis was that GAFES would be faster than FCF+BBLP because GAFES is an approximation algorithm and FCF+BBLP an exact one. We also expected GAFES to be faster than FCF+M-HEU, based on general experience with traditional GAs. In addition, we believed GAFES could obtain better optimality than the other two approaches.

Experiment 1 Results. Table 2 shows a comparison among FCF+BBLP, FCF+M-HEU, and GAFES. Here, the experiment is performed on feature models whose size n varies from 10 to 200, as listed in column 1. 100 feature models for each size are generated randomly with the same parameters, as described in Section 7.1.1, and we only take the mean value of their results. We used the constants Rmax = 10, UnitVmax = 10, and Vmax = 20 for the generation of test cases, as explained in Section 7.1.2. Data is not initialized for a predefined maximum total value for the selected features.

The columns T_BBLP, T_M-HEU, and T_GA show the time requirement for the FCF+BBLP, FCF+M-HEU, and GAFES solutions. The column T_GAInit gives the time requirement for the initialization of GAFES. The column T_GAInit/T_GA presents the proportion of the initialization to the whole GAFES run in time consumption. We can see that the initialization time occupies a considerable proportion (about 30-45%) of the whole time.

The columns (T_BBLP − T_GA)/T_BBLP and (T_M-HEU − T_GA)/T_M-HEU indicate the proportion of the time saved by GAFES to the whole time consumption of FCF+BBLP and FCF+M-HEU respectively. For tiny feature models with 10 features, GAFES consumed more time than FCF+BBLP and FCF+M-HEU. For feature models whose size varies from 50 to 200, GAFES reduced the time consumption of FCF+BBLP by 94-99% and that of FCF+M-HEU by 92-97%.

The column V_BBLP gives the average value earned from the FCF+BBLP solutions. The columns V_M-HEU/V_BBLP and V_GA/V_BBLP indicate the average standardized value earned by the two heuristics (i.e., FCF+M-HEU and GAFES) with respect to the exact solutions generated by FCF+BBLP. FCF+M-HEU obtains 98-99% optimality while GAFES obtains 87-90%, which, contrary to our hypothesis, did not beat FCF's optimality.

13

Page 14: A Genetic Algorithm for Optimized Feature Selection with ...jules/jianmei.pdf · The results show that GAFES can produce fea-ture selections that are within objective function scores

Table 2: Experimental results for small feature models

n    T_BBLP(ms)  T_M-HEU(ms)  T_GAInit(ms)  T_GA(ms)  T_GAInit/T_GA  (T_BBLP-T_GA)/T_BBLP  (T_M-HEU-T_GA)/T_M-HEU  V_BBLP   V_M-HEU/V_BBLP  V_GA/V_BBLP  V_GA/V_M-HEU  σ_M-HEU  σ_GA
10   0.95        0.79         1.03          3.15      0.3270         -2.3158               -2.9873                 348.37   0.986           0.866        0.878         0.0685   0.0699
50   202.99      160.35       3.64          12.27     0.2967         0.9396                0.9235                  1362.05  0.989           0.874        0.884         0.0540   0.0402
100  1813.74     1217.44      12.39         31.79     0.3897         0.9825                0.9739                  2211.03  0.981           0.903        0.920         0.0356   0.0439
200  8800.76     3070.32      45.77         101.53    0.4508         0.9885                0.9669                  3825.20  0.976           0.895        0.917         0.0498   0.0582

The column V_GA/V_M-HEU shows that GAFES produces solutions with 88-92% of the optimality of FCF+M-HEU. In addition, σ_M-HEU and σ_GA are the standard deviations of the standardized total value achieved over the 100 feature models, given to indicate stability.

7.3. Experiment 2: Large Feature Models

From Table 2, we can see that the time requirements of FCF+BBLP solutions increase dramatically with the size of feature models because of the exponential computation complexity of exact techniques. It is impractical to test the performance of FCF+BBLP solutions for larger feature models. Hence, in this experiment, we only performed FCF+M-HEU and GAFES on large feature models whose size varies from 500 to 10000. Moreover, due to the lack of solutions from exact techniques (i.e., FCF+BBLP), we adopt the test pattern generation technique described in Section 7.1.2 to estimate the optimality achieved by the two heuristics (i.e., FCF+M-HEU and GAFES).

Hypothesis. Based on the results of Experiment 1, our hypothesis for this experiment was that GAFES would be faster than FCF+M-HEU and would scale better. But, also based on the first experiment, we expected GAFES to have slightly lower optimality than FCF+M-HEU.

Experiment 2 Results. Table 3 shows a comparison between FCF+M-HEU and GAFES. As in Experiment 1, 100 feature models for each size are generated randomly with the same parameters, as described in section 7.1.1, and we report only the mean of their results. We again used the constants Rmax = 10, UnitVmax = 10, and Vmax = 20 to generate test cases, as explained in section 7.1.2. The test data are not initialized with a predefined maximum total value for the selected features.

The columns TM−HEU and TGA show the time required by the FCF+M-HEU and GAFES solutions. From the column TGAInit/TGA, we can see that the time required for GAFES initialization, TGAInit, accounts for 20-23% of the total time of GAFES. The column (TM−HEU−TGA)/TM−HEU shows that GAFES reduces the time consumption of FCF+M-HEU by 45-94%.

The column MaxV indicates the average value generated by the reinitialization in the test pattern generation technique. Taking MaxV as the optimal answer, FCF+M-HEU obtains 89-93% optimality, while GAFES obtains about 86%. The column VGA/VM−HEU shows that GAFES produces solutions with 93-97% of the optimality of FCF+M-HEU. From the columns σM−HEU and σGA, we can see that all the values achieved in the 100 random feature models for each size remain relatively stable.
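As a concrete check, the column VGA/VM−HEU is simply the ratio of VGA/MaxV to VM−HEU/MaxV: for n = 10000 in Table 3, 0.857/0.887 ≈ 0.966, i.e., GAFES reaches about 97% of FCF+M-HEU's estimated optimality on the largest models.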

7.4. Discussion of Results & Threats to Validity

From Table 2 and Table 3, we can conclude that GAFES performs more efficiently than existing exact (i.e., FCF+BBLP) and heuristic (i.e., FCF+M-HEU) approaches. That is, it produced nearly as optimal results (within 86-90% of optimal) in 45-99% less time. Only for tiny feature models with 10 features did GAFES consume more time than FCF+BBLP and FCF+M-HEU, which is caused by the fixed number of retries for generating an initial population and the fixed maximum number of generations defined in GAFES. For most of the feature models, whose size varies from 20 to 10000, GAFES reduces the time consumption of FCF+BBLP or FCF+M-HEU by 45-99%. Especially for large feature models (size > 500), GAFES scales better than FCF+M-HEU. As for optimality, GAFES stays well above 86% optimality, which is 88-97% of the optimality of FCF+M-HEU. In addition, the stability of the solution quality is almost the same for the two heuristics.



Table 3: Experimental results for large feature models

n       TM−HEU(ms)   TGAInit(ms)   TGA(ms)     TGAInit/TGA   (TM−HEU−TGA)/TM−HEU   MaxV       VM−HEU/MaxV   VGA/MaxV   VGA/VM−HEU   σM−HEU   σGA
500     6193.43      88.63         384.42      0.2306        0.9379                348.37     0.933         0.863      0.925        0.0530   0.0512
1000    14083.91     354.55        1513.68     0.2342        0.8925                1362.05    0.905         0.862      0.952        0.0699   0.0539
2000    33270.84     1453.25       6211.57     0.2340        0.8133                2211.03    0.897         0.861      0.960        0.0780   0.0484
5000    91429.64     6213.59       31358.35    0.1981        0.6570                3825.20    0.890         0.856      0.962        0.0769   0.0506
10000   194093.20    22486.81      107632.03   0.2089        0.4455                77470.71   0.887         0.857      0.966        0.0715   0.0523

Threats to internal validity are influences on the measured times of the exact and heuristic solutions that we have not considered. We cannot rule out that computation time depends on particular shapes of feature models or particular resource constraints. However, to limit the effect of any particular feature model, all input feature models are generated automatically by simulating known feature models, and each measurement is repeated 100 times with freshly generated feature models.

Threats to external validity are conditions that limit our ability to generalize the results of our experiments to industrial practice. We generated feature models with the described algorithm and parameters, and confirmed that they align well with known feature models from existing publications in the SPL community. We cannot guarantee that our generated test cases are typical in practice. However, to simulate practical situations as far as possible, we randomly generate the consumed resources and the provided value for each terminal feature. Moreover, the test data are not initialized with a predefined maximum total value for the selected features. The total resource constraint is generated dynamically for each random feature model.

8. Related Work

This section compares our work on GAFES to related work on AI approaches and other approaches for feature selection optimization, including exact and heuristic techniques and visualization techniques.

8.1. Exact Techniques for SPL Feature Selection

Many exact techniques from the AI community have been applied and extended to solve the SPL feature selection optimization problem. Benavides et al. (Benavides et al., 2005) considered resource constraints in the product derivation process and applied Constraint Satisfaction Problems (CSPs) to model and solve feature selection problems automatically. Their technique works well for small-scale problems where an approximation technique is not needed. For large-scale problems, however, their technique is too computationally demanding. In contrast, GAFES works well on large-scale problems.

Mannion (Mannion, 2002), Batory (Batory, 2005), and Czarnecki et al. (Czarnecki & Wasowski, 2007) applied propositional logic to automated feature selection. However, these techniques were not designed to handle integer resource constraints and thus cannot handle the SPL feature selection optimization with resource constraints. Moreover, these techniques depend on SAT or BDD solvers that use exponential algorithms. GAFES is a polynomial-time algorithm that can handle integer resource constraints and thus can solve the SPL feature selection optimization with resource constraints even on large-scale problems.

Zhang et al. (Zhang et al., 2003) proposed a technique for reasoning about SPL variant quality attributes using Bayesian Belief Networks. Jarzabek et al. (Jarzabek et al., 2006) developed another technique in this area based on soft goals. However, these techniques are designed for selecting features in situations where it is difficult to predict the impact of a feature selection, whereas in GAFES the exact impact of each feature selection on the resource consumption of the variant is known. The two kinds of technique are therefore complementary: if the exact impact of feature selections on variant quality is not known, the techniques of Zhang or Jarzabek can be used; if the impact is known, GAFES is appropriate.



8.2. Heuristics for SPL Feature Selection

Although exact algorithms for NP-hard problems have exponential complexity, each NP-hard problem typically has a number of approximation algorithms that can solve it with acceptable optimality. Search-based software engineering (Harman, 2007) advocates applying optimization techniques from the operations research and heuristic (or metaheuristic) computation research communities to software engineering. It already has several successful application domains in software engineering (Harman, 2007), such as optimizing the search for requirements to form the next release (Bagnall et al., 2001) and optimizing test data selection and prioritization (Walcott et al., 2006). However, to the best of our knowledge, only one prior work, by White et al. (White et al., 2009), provides an approximation algorithm for the SPL feature selection optimization with resource constraints.

White et al. (White et al., 2009) provided a polynomial-time approximation algorithm for selecting a highly optimal set of features that adheres to a set of resource constraints. They proposed the FCF technique, which transforms the optimized feature selection problem into an MMKP problem, and then used the M-HEU technique (Akbar et al., 2001) to solve the MMKP problem. However, their approach still requires significant computing time for large-scale problems. Moreover, their approach involves two approximation processes: one in the FCF stage, which transforms all And- and Or-groups into Alternative-groups, and another in the M-HEU stage, which generates an approximate solution. In addition, their approach demands higher resource tightness, which is a common problem for heuristic techniques that solve the MMKP problem (Akbar et al., 2001; White et al., 2009).

GAFES also provides a heuristic solution. Compared with White's method (FCF+M-HEU), it has only one approximation process, i.e., the genetic process of generating an approximate solution. Moreover, GAFES avoids the limitation on resource tightness because it transforms the SPL feature selection optimization into a genetic search problem rather than an MMKP problem. Experiments show that GAFES can reduce the time consumption of FCF+M-HEU by 45-97%.

8.3. Feature Model Visualization Techniques

Some researchers have developed special visualization techniques to assist developers in decision making during the product derivation process. Sellier and Mannion (Sellier & Mannion, 2007) proposed a visualization metamodel for representing inter-dependencies between SPLs and described a tool to realize these visualizations. Botterweck et al. (Botterweck et al., 2007) presented a metamodel that describes staged feature configuration and introduced a tool that illustrates the advantages of interactive visualization in managing feature configuration. However, these approaches focus more on the functional requirements of a product and their dependencies and less on non-functional requirements or resource constraints (Siegmund et al., 2008). Moreover, industrial-sized feature models with hundreds or thousands of features usually make a fully manual feature selection process impossible.

9. Concluding Remarks & Lessons Learned

To quickly derive an optimal product configuration, developers need algorithmic techniques that automatically generate a valid feature selection that optimizes the desired product properties. However, finding a feature selection with optimal product capability subject to required constraints is an NP-hard problem. Although there are numerous heuristic techniques for many NP-hard problems, they cannot directly support the SPL feature selection optimization with resource constraints because they are not designed to handle the various structural and semantic constraints defined in a feature model. This lack of heuristics limits the scale of feature models on which developers can realistically optimize and evaluate a large number of possible product configurations.

This paper presents a GA-based feature selection approach, GAFES, for automated product derivation in SPLs. The key to successfully adapting GAs to the SPL feature selection optimization is an algorithm that can transform an arbitrary feature set into a valid feature combination conforming to the feature model constraints. Experiments show that GAFES achieves an average of 86-90% optimality even for large feature models. Moreover, GAFES can derive a feature selection 45-99% faster than existing heuristic and exact feature selection techniques.

From our research on GAFES, we learned the following important lessons:

1. For large-scale feature models, exact techniques suffer from exponential algorithmic complexity and typically require days, months, or more to solve the SPL feature selection optimization with resource constraints. In general, heuristic techniques have polynomial algorithmic complexity. GAFES is a heuristic and is 45-97% faster than existing heuristics (e.g., FCF+M-HEU) for the SPL feature selection problem.

2. GAFES produces solutions with 86-97% of the optimality of prior feature selection techniques. However, prior feature selection techniques, such as FCF+BBLP and FCF+M-HEU, obtain better optimality at the cost of longer solving time, which is caused by traversing a larger solution space. In addition, the optimality gained by GAFES remains relatively stable across different sizes of feature models.

3. GAFES is most applicable to domains, such as distributed real-time and embedded systems, where the precise impact of feature selections on the performance and resource consumption of the product is known.

4. A key component of GAFES is the algorithm fmTransform, which can, in most cases, transform an arbitrary feature set into a valid feature combination in a feature model. Since many stochastic heuristics, such as GAs and Ant Colony Optimization techniques, work on a range of randomly generated feature sets and ultimately produce an arbitrary feature selection, fmTransform can be seen as a trigger for applying these stochastic heuristics to the SPL feature selection problem; the sketch below illustrates this usage.
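To make the role of the repair step concrete, the following is a minimal sketch of a repair-based GA in Python. It assumes a function fm_transform(feature_model, candidate) that returns a valid feature combination derived from an arbitrary candidate set (or None if no repair is found) and a fitness function to maximize; the set-based encoding, operator choices, parameter values, and helper names are illustrative assumptions, not the GAFES implementation described earlier in this paper.

    import random

    def repair_based_ga(feature_model, all_features, fitness, fm_transform,
                        pop_size=50, generations=100, mutation_rate=0.02,
                        max_init_retries=5000, seed=None):
        # fm_transform(feature_model, candidate) -> valid feature set or None (assumed).
        # fitness(selection) -> number to maximize, e.g. total value of a
        # resource-feasible selection (assumed).
        rng = random.Random(seed)
        features = list(all_features)

        def repair(candidate):
            return fm_transform(feature_model, candidate)

        # Initialization: repair arbitrary random feature sets into valid ones,
        # with a bounded number of retries (GAFES also bounds its retries).
        population = []
        for _ in range(max_init_retries):
            if len(population) >= pop_size:
                break
            valid = repair({f for f in features if rng.random() < 0.5})
            if valid is not None:
                population.append(valid)
        if len(population) < 2:
            return max(population, key=fitness, default=None)

        for _ in range(generations):
            population.sort(key=fitness, reverse=True)
            survivors = population[: max(2, len(population) // 2)]
            next_gen = list(survivors)
            attempts = 0
            while len(next_gen) < pop_size and attempts < 100 * pop_size:
                attempts += 1
                a, b = rng.sample(survivors, 2)
                # Uniform crossover on the raw sets, then bit-flip mutation.
                child = {f for f in features
                         if (f in a if rng.random() < 0.5 else f in b)}
                child ^= {f for f in features if rng.random() < mutation_rate}
                valid = repair(child)          # validity restored only here
                if valid is not None:
                    next_gen.append(valid)
            population = next_gen

        return max(population, key=fitness)

The point of the sketch is that the genetic operators manipulate arbitrary feature sets freely, and validity is restored only by the repair call; that separation is what allows such stochastic heuristics to be applied to constrained feature models.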

In future work, we plan to adapt more heuristics, such as ant colony optimization (Al-Ani, 2005) and particle swarm optimization (Lin et al., 2008), to the SPL feature selection optimization with resource constraints.

Acknowledgments

Funding was provided by the National Natural Science Foundation of China (NSFC No. 60773088), the National High-tech R&D Program of China (863 Program No. 2009AA04Z106), and the Key Program of Basic Research of Shanghai Municipal S&T Commission (No. 08JC1411700).

Akbar, M. M., Manning, E. G., Shoja, G. C., Khan, S., 2001. Heuristic Solutions for the Multiple-Choice Multi-dimension Knapsack Problem. In: Proceedings of ICCS'01, San Francisco, CA, USA, pp. 659-668.
Al-Ani, A., 2005. An Ant Colony Optimization Based Approach for Feature Selection. In: Proceedings of AIML'05, Cairo, Egypt.
Alsuwaiyel, M. H., 1999. Algorithms: design techniques and analysis. World Scientific.
Bagnall, A. J., Rayward-Smith, V. J., Whittley, I., 2001. The next release problem. Information & Software Technology 43 (14), 883-890.
Batory, D., 2005. Feature Models, Grammars, and Propositional Formulas. In: Proceedings of SPLC'05, Rennes, France, pp. 7-20.
Batory, D., Benavides, D., Ruiz-Cortes, A., 2006. Automated analysis of feature models: challenges ahead. Communications of the ACM 49 (12), 45-47.
Benavides, D., Martin-Arroyo, P. T., Cortes, A. R., 2005. Automated Reasoning on Feature Models. In: Proceedings of CAiSE'05, Porto, Portugal, pp. 491-503.
Benavides, D., Ruiz-Cortes, A., Batory, D., Heymans, P., 2008. First International Workshop on Analysis of Software Product Lines (ASPL'08). In: Proceedings of SPLC'08, Limerick, Ireland, pp. 385.
Boeing Company, 2002. Bold Stroke Avionics Software Family. URL: <http://splc.net/fame/boeing.html>.
Botterweck, G., Nestor, D., Preubner, A., Cawley, C., Thiel, S., 2007. Towards Supporting Feature Configuration by Interactive Visualisation. In: Proceedings of 1st International Workshop on Visualisation in Software Product Line Engineering (ViSPLE 2007), Tokyo, Japan, pp. 125-131.
Bui, T. N., Moon, B. R., 1996. Genetic Algorithm and Graph Partitioning. IEEE Transactions on Computers 45 (7), 841-855.
Clements, P., Northrop, L., 2001. Software Product Lines: Practices and Patterns. Addison-Wesley, Boston, MA, USA.
Czarnecki, K., Wasowski, A., 2007. Feature Diagrams and Logics: There and Back Again. In: Proceedings of SPLC'07, Kyoto, Japan, pp. 23-34.
Deelstra, S., Sinnema, M., Bosch, J., 2004. Experiences in Software Product Families: Problems and Issues During Product Derivation. In: Proceedings of SPLC'04, Boston, MA, USA, pp. 165-182.
Deelstra, S., Sinnema, M., Bosch, J., 2005. Product derivation in software product families: a case study. Journal of Systems and Software 74 (2), 173-194.
Dudley, G., Joshi, N., Ogle, D. M., Subramanian, B., Topo, B. B., 2004. Autonomic Self-Healing Systems in a Cross-Product IT Environment. In: Proceedings of ICAC'04, New York, NY, USA, pp. 312-313.
FAME-DBMS, Project Prototypes. URL: <http://fame-dbms.org/index.shtml>.
Guo, J., Wang, Y., 2010. Towards Consistent Evolution of Feature Models. In: Proceedings of SPLC'10, Jeju Island, South Korea, pp. 451-455.
Harman, M., 2007. The Current State and Future of Search Based Software Engineering. In: Proceedings of Workshop on the Future of Software Engineering (ISCE'07), Minneapolis, MN, USA, pp. 342-357.
Jarzabek, S., Yang, B., Yoeun, S., 2006. Addressing quality attributes in domain analysis for product lines. Software, IEE Proceedings 153, 61-73.
De Jong, K. A., Spears, W. M., 1989. Using Genetic Algorithms to Solve NP-Complete Problems. In: Proceedings of ICGA'89, George Mason University, Fairfax, Virginia, USA, pp. 124-132.
Kang, K. C., Cohen, S. G., Hess, J. A., Novak, W. E., Peterson, A. S., 1990. Feature-Oriented Domain Analysis (FODA) Feasibility Study, Technical Report CMU/SEI-90-TR-021. Software Engineering Institute, CMU.
Kang, K. C., Kim, S., Lee, J., Kim, K., Shin, E., Huh, M., 1998. FORM: A feature-oriented reuse method with domain-specific reference architectures. Annals of Software Engineering 5 (1), 143-168.
Kang, K. C., Lee, J., Donohoe, P., 2002. Feature-oriented product line engineering. IEEE Software 19 (4), 58-65.
Lin, S.-W., Ying, K.-C., Chen, S.-C., Lee, Z.-J., 2008. Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Systems with Applications 35 (4), 1817-1824.
Loesch, F., Ploedereder, E., 2007. Optimization of Variability in Software Product Lines. In: Proceedings of SPLC'07, Kyoto, Japan, pp. 151-162.
Mannion, M., 2002. Using First-Order Logic for Product Line Model Validation. In: Proceedings of SPLC'02, San Diego, CA, USA, pp. 176-187.
Mendonca, M., Branco, M., Cowan, D. D., 2009. S.P.L.O.T.: software product lines online tools. In: Proceedings of OOPSLA, Orlando, Florida, USA, pp. 761-762.
Mitchell, M., 1996. An introduction to genetic algorithms. MIT Press, Cambridge, MA.
Muhlenbein, H., 1989. Parallel Genetic Algorithms, Population Genetics and Combinatorial Optimization. In: Proceedings of ICGA'89, George Mason University, Fairfax, Virginia, USA, pp. 416-421.
Oh, I.-S., Lee, J.-S., Moon, B. R., 2004. Hybrid Genetic Algorithms for Feature Selection. IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (11), 1424-1437.
Pohl, K., Bockle, G., van der Linden, F., 2005. Software Product Line Engineering: Foundations, Principles, and Techniques. Springer-Verlag, Berlin Heidelberg.
Rosenmuller, M., Siegmund, N., Schirmeier, H., Sincero, J., Apel, S., Leich, T., Spinczyk, O., Saake, G., 2008. FAME-DBMS: Tailor-made Data Management Solutions for Embedded Systems. In: Proceedings of EDBT Workshop on Software Engineering for Tailor-made Data Management, Nantes, France, pp. 1-6.
Sellier, D., Mannion, M., 2007. Visualizing Product Line Requirement Selection Decision. In: Proceedings of 2nd International Workshop on Requirements Engineering Visualization (REV'07), New Delhi, India.
Siegmund, N., Rosenmuller, M., Kuhlemann, M., Kastner, C., Saake, G., 2008. Measuring Non-Functional Properties in Software Product Line for Product Derivation. In: Proceedings of APSEC'08, Beijing, China.
Steger, M., Tischer, C., Boss, B., Muller, A., Pertler, O., Stolz, W., Ferber, S., 2004. Introducing PLA at Bosch Gasoline Systems: Experiences and Practices. In: Proceedings of SPLC'04, Boston, MA, USA, pp. 34-50.
Thum, T., Batory, D. S., Kastner, C., 2009. Reasoning about edits to feature models. In: Proceedings of ICSE'09, Vancouver, Canada, pp. 254-264.
Walcott, K. R., Soffa, M. L., Kapfhammer, G. M., Roos, R. S., 2006. Time aware test suite prioritization. In: Proceedings of ISSTA'06, Portland, Maine, USA, pp. 1-12.
White, J., Dougherty, B., Schmidt, D. C., 2009. Selecting highly optimal architectural feature sets with Filtered Cartesian Flattening. Journal of Systems and Software 82 (8), 1268-1284.
Yang, J. H., Honavar, V., 1998. Feature Subset Selection Using a Genetic Algorithm. IEEE Intelligent Systems 13 (2), 44-49.
Zhang, H., Jarzabek, S., Yang, B., 2003. Quality Prediction and Assessment for Product Lines. In: Proceedings of CAiSE'03, Klagenfurt, Austria, pp. 681-695.
