Research Article

Ensemble Merit Merge Feature Selection for Enhanced Multinomial Classification in Alzheimer's Dementia

T. R. Sivapriya,1,2 A. R. Nadira Banu Kamal,3 and P. Ranjit Jeba Thangaiah4

1 Department of Computer Science, Lady Doak College, Madurai, Tamil Nadu 625002, India
2 The Alzheimer's Disease Neuroimaging Initiative, San Diego, CA 92093-0949, USA
3 Department of Computer Science, TBAK College, Kilakarai, Tamil Nadu 623517, India
4 Department of Computer Applications, Karunya University, Coimbatore, Tamil Nadu 641114, India

Correspondence should be addressed to T. R. Sivapriya; [email protected]

Received 8 February 2015; Revised 8 May 2015; Accepted 18 May 2015

Academic Editor: Maria N. D. S. Cordeiro

Hindawi Publishing Corporation, Computational and Mathematical Methods in Medicine, Volume 2015, Article ID 676129, 11 pages, http://dx.doi.org/10.1155/2015/676129

Copyright © 2015 T. R. Sivapriya et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The objective of this study is to develop an ensemble classifier with Merit Merge feature selection that will enhance the efficiency of classification in multivariate multiclass medical data for effective disease diagnostics. The large volumes of features extracted from brain Magnetic Resonance Images and neuropsychological tests for diagnosis lead to more complexity in classification procedures. A higher level of objectivity than what readers have is needed to produce reliable dementia diagnostic techniques. An ensemble approach trained with features selected from multiple biomarkers facilitated accurate classification when compared with conventional classification techniques. The ensemble approach to feature selection is experimented with classifiers such as Naïve Bayes, Random forest, Support Vector Machine, and C4.5. Feature search is done with Particle Swarm Optimisation to retrieve the subset of features for further selection with the ensemble classifier. Features selected by the proposed C4.5 ensemble classifier with Particle Swarm Optimisation search, coupled with the Merit Merge technique (CPEMM), outperformed bagging feature selection with SVM, NB, and Random forest classifiers. The proposed CPEMM feature selection found the best subset of features that efficiently discriminated normal individuals and patients affected with Mild Cognitive Impairment and Alzheimer's Dementia with 98.7% accuracy.

1. Introduction

Dementia is a neuropsychiatric disease widespread in many countries that affects people in older age [1]. Early diagnosis helps in palliative care, mitigation, and prevention of disease progression. Accurate diagnosis of crucial factors that cause the disease is vital for timely treatment [2]. Several high-dimensional pattern classification techniques have been built upon methods of computational anatomy, functional neuroimaging [3], and neuropsychological analysis, demonstrating that classification of individuals, in contrast to group analysis, can be achieved with relatively high accuracy. Recently there has been growing interest in high-dimensional feature selection and classification methods that can combine information from whole brain measurement [4] and neuropsychological data [5] to discriminate between individual subjects. Moreover, another study indicates that not only the older population but also men and women under the age of 50 are affected by dementia [6]. Several studies have proved the effective utilization of neuropsychological test data [7-9] for earlier diagnosis of dementia and of conversion from Mild Cognitive Impairment to Dementia.

The application of artificial intelligence techniques to cognitive measures provides enhanced feature-specific analytic methods for neuropsychological data and has already been experimented with for the diagnosis of dementia caused by Alzheimer's disease [10]. Automated classification of Dementia with PET images has been done with structural warping of neuroimaging data [11]. Klöppel et al. developed automated classification of Magnetic Resonance scans and compared the performance of the computerized method with a radiologist [12]. Larner has reviewed the importance of cognitive screening instruments and their accuracy in the diagnosis of Dementia [13]. A diagnostic method




was developed using a neuropsychological test improved by multivariate analyses using PCA [7]. A research report comparing conventional statistical classifiers and machine learning methods demonstrated the comparable, improved performance of the machine learning methods [14]. A study by Quintana et al. provides evidence that Artificial Neural Networks can be a useful tool for the analysis of neuropsychological profiles related to clinical syndromes. Yu et al. developed a Support Vector Machine model for the prediction of common diseases, in this case the occurrence of diabetes and prediabetes [15]. Hachesu et al. applied Neural Networks, Decision Tree, and SVM to determine and predict the length of stay of cardiac patients [16].

Kabir et al. presented a new feature selection (FS) algorithm based on the wrapper approach using Neural Networks [17]. The vital aspect of this algorithm is the automatic determination of Neural Network architectures during the feature selection process. Maldonado et al. have applied SVM for simultaneous feature selection and classification [18]. A new approach for classification of high-dimensional microarray data has been evolved [19]. Chen et al. applied classification trees to larger datasets in Bioinformatics [20]. Calle et al. developed a new strategy for genome data profiling with Random forest [21].

Several studies with multimodal data [22] have proven the classification efficiency of Random forest [14, 20, 21]. In a study for differentiation of MCI from AD, Naïve Bayes, SVM, NN, and Decision Tree (DT) were used for feature selection, and Naïve Bayes was used as the base classifier [23]. In that study, Naïve Bayes and DT gave better results when compared with SVM.

Relevance of This Study. Attribute selection plays a key role in building a good classifier that can efficiently delineate the patient records with absolute accuracy and efficiency. This study proposes an ensemble feature selection approach using the J48 classifier with a PSO search strategy and the Merit Merge technique to do the following:

(a) Find the optimal subset that can effectively delineate the three classes, Normal (NL), Mild Cognitive Impairment (MCI), and Alzheimer's Dementia (AD), with ensemble feature selection.

(b) Find all possible subset combinations that can increase the accuracy in the discrimination of Mild Cognitive Impairment from Dementia.

(c) Train and test an ensemble model that can effectively classify multiclass medical data.

2. Feature Selection and Classification

2.1. Feature Selection. Feature selection is an important step that determines the performance of a classifier. Dimension reduction [24] is compulsory for better classification of larger datasets. Feature extraction selects the most relevant, nonredundant features of interest from the given data. In general, feature selection can be performed by filter, wrapper [17], and embedded methods. Several studies have been reported for feature selection with Support Vector Machine [18, 25, 26] and Random forest [21]. Uncu and Turksen developed a new approach with a combination of filters and wrappers for feature selection [27].

Particle Swarm Optimisation (PSO) is a search technique that is a proven feature selection mechanism [28]. The capability of PSO is that it can search a very large search space and find solutions quickly compared to other evolutionary search techniques like the Genetic Algorithm. Optimisation of the solution plays a great role in classification and clustering applications. PSO has been used not only for feature selection [29]; it has also been applied to the optimization of parameters in machine learning algorithms like SVM.
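For feature selection, PSO is typically run in its binary form, where each particle is a 0/1 mask over the feature set and velocities bias the probability of including each feature. The following is a minimal illustrative sketch, not the paper's exact configuration; the `merit` scoring function and the inertia/acceleration constants are assumptions:

```python
import math
import random

def pso_feature_search(n_features, merit, n_particles=10, iters=30, seed=0):
    """Minimal binary PSO: each particle is a 0/1 mask over the features.
    Velocities bias the per-bit probability of selecting a feature."""
    rng = random.Random(seed)
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best masks
    pbest_val = [merit(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best mask
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]                       # inertia
                             + 1.4 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.4 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = 1 if rng.random() < sig(vel[i][d]) else 0
            val = merit(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

In practice the merit function would be the wrapper score of a classifier trained on the masked features; here any callable that scores a mask works.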

2.2. Bagging. Bagging follows a bootstrap method of data selection for classification and uses classifiers of the same type. Bagging follows a sampling-with-replacement procedure for selecting a set of data as input for a classifier. Since the classifiers are of the same type, a majority vote across the ensemble formulates the final result. A boosting ensemble follows a sequential method where every classifier is formed based on the output and error of the previously constructed classifier [30]. The second classifier performs better than the first, and the same holds for the consecutively constructed classifiers. Hence it takes more time for model construction and complexity increases; moreover, it can result in overfitting of the given data. An ensemble classifier is a supervised learning model [31] that employs a group of multiple classifiers to improve classification accuracy. It combines many weak learners in order to generate a strong learning algorithm. The aim of applying the ensemble method is to overcome the risk of overfitting by an individual classifier.
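The bootstrap-and-vote procedure above can be sketched in a few lines. The `learners`/`fit` interface, where each same-type learner constructor returns a predict function, is an illustrative assumption:

```python
import random
from collections import Counter

def bagging_predict(train, x, learners, seed=0):
    """Bagging sketch: each learner of the same type is fitted on its own
    bootstrap sample (drawn with replacement), and the ensemble answers
    with the majority vote of the individual predictions."""
    rng = random.Random(seed)
    n = len(train)
    votes = []
    for fit in learners:
        boot = [train[rng.randrange(n)] for _ in range(n)]  # sample with replacement
        predict = fit(boot)                                 # fit returns a predictor
        votes.append(predict(x))
    return Counter(votes).most_common(1)[0][0]              # majority vote
```

Because every learner sees a different resample, the vote averages out the variance of any single classifier, which is the overfitting-reduction argument made above.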

2.3. Classification

2.3.1. Support Vector Machines. Support Vector Machines (SVMs) were introduced in 1995 by Cortes and Vapnik [32]. In terms of theory, SVMs are well founded and have proved to be very efficient in classification tasks. The advantages of such classifiers are that they are independent of the dimensionality of the feature space and that the results obtained are very accurate, although the training time is very high. Support Vector Machines are feed-forward networks with a single layer of nonlinear units. Their design has good generalization performance as an objective and for that reason follows the principle of structural risk minimization, which is rooted in VC dimension theory.

The training points for which the equality in the separating-plane constraint

∀i, y_i(x_i · w + b) − 1 ≥ 0, (1)

is satisfied, that is, those which wind up lying on one of the hyperplanes H1, H2, and whose removal would change the solution found, are called Support Vectors (SVs). This algorithm is firmly grounded in the framework of statistical learning theory, Vapnik-Chervonenkis (VC) theory, which improves the generalization ability of learning machines to unseen data. In the last few years, Support Vector Machines have shown excellent performance in many real-world applications, including object recognition, face detection, and dementia diagnosis in images.


Table 1: Datasets used in the study.

Dataset                    | Number of instances | Number of attributes | Number of classes | AD  | Normal | MCI
Neuropsychological dataset | 750                 | 48                   | 3                 | 150 | 200    | 400
Neuroimaging dataset       | 650                 | 108                  | 2                 | 250 | 250    | 200
Baseline combined data     | 870                 | 65                   | 3                 | 140 | 280    | 450
Combined dataset           | 750                 | 40                   | 2                 | 150 | 200    | 400

2.3.2. Random Forest. Random forest trees, introduced by Breiman [33], are a method of building a forest of uncorrelated trees with randomized node optimization and bagging. The out-of-bag error is used as an estimate of the generalization error. Random forest (RF) is used to measure variable importance through permutation [34]. The general technique of bootstrap aggregation is applied in the training algorithm. In a Random forest implementation, only the number of trees in the forest and the number of attributes used for prediction need to be defined [35].
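The out-of-bag estimate mentioned above can be sketched as follows: each record is scored only by the trees whose bootstrap sample did not contain it, so no separate hold-out set is needed. The `fit` interface returning a predictor is an assumption for illustration:

```python
import random
from collections import Counter

def oob_error(data, fit, n_trees=50, seed=0):
    """Out-of-bag error sketch: train each tree on a bootstrap sample and
    let every record be voted on only by trees whose sample missed it."""
    rng = random.Random(seed)
    n = len(data)
    votes = [Counter() for _ in range(n)]        # per-record label votes
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        predict = fit([data[i] for i in idx])    # one "tree" per bootstrap
        in_bag = set(idx)
        for i in range(n):
            if i not in in_bag:                  # record is out-of-bag here
                votes[i][predict(data[i][0])] += 1
    scored = [i for i, v in enumerate(votes) if v]
    wrong = sum(1 for i in scored if votes[i].most_common(1)[0][0] != data[i][1])
    return wrong / len(scored) if scored else float('nan')
```

With enough trees nearly every record is out-of-bag for some of them, which is why the error stabilises as the forest grows (cf. the 600-tree observation later in Section 3).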

2.3.3. C4.5. The C4.5 algorithm is used to generate a Decision Tree that can be used for classification problems [36]. The Decision Tree is built using the entropy values obtained from the given data. C4.5 uses binary splits or multivalued splits in the selection of attributes. Performance of the algorithm varies between cross validation and the train-test method; the average accuracy across several folds should be taken as the evaluation measure. As with all other classifiers, precision and recall increase with more records in the training dataset. J48 is the Java implementation of C4.5 in the Weka tool. C4.5 is an improvement of the ID3 algorithm and is capable of handling both discrete and continuous values. Another advantage is that fields with missing values need not be imputed with any values; rather, such a field is not used in the calculation of entropy and information gain.
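The entropy-driven split selection can be illustrated with a short sketch. Representing records as (attribute-dict, label) pairs is an illustrative choice, not Weka's internal format:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a class-label list, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr):
    """Information gain of a multivalued split of (attrs, label) rows on
    `attr`: label entropy minus the weighted entropy of the partitions."""
    labels = [y for _, y in rows]
    groups = {}
    for x, y in rows:
        groups.setdefault(x[attr], []).append(y)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

C4.5 proper normalises this gain by the split information (gain ratio) and handles continuous thresholds, but the entropy computation at the core is the one shown.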

2.3.4. Naïve Bayes. The Naïve Bayes classifier is a statistical technique [37] that is applied for classification in data mining problems. It is based on probabilistic outcomes for the given data. It is a supervised learning technique, and hence prior knowledge can be incorporated in its learning process. It is therefore well suited for medical diagnostics, where the knowledge of the domain expert can be incorporated beforehand in order to achieve higher performance.
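A categorical Naïve Bayes of this kind can be sketched in a few lines; the optional `priors` argument stands in for the domain-expert knowledge mentioned above, and the Laplace-smoothing scheme is an assumption for illustration:

```python
from collections import Counter, defaultdict

def train_nb(rows, priors=None, alpha=1.0):
    """Categorical Naive Bayes sketch over (attrs, label) rows. `priors`
    lets expert knowledge replace empirical class frequencies; `alpha`
    is Laplace smoothing for unseen attribute values."""
    classes = Counter(y for _, y in rows)
    n = len(rows)
    prior = priors or {c: k / n for c, k in classes.items()}
    counts = defaultdict(Counter)            # (class, attr) -> value counts
    for x, y in rows:
        for a, v in x.items():
            counts[(y, a)][v] += 1

    def predict(x):
        def score(c):
            s = prior[c]
            for a, v in x.items():
                cnt = counts[(c, a)]
                s *= (cnt[v] + alpha) / (sum(cnt.values()) + alpha * (len(cnt) + 1))
            return s
        return max(prior, key=score)         # class with highest posterior score

    return predict
```

Passing, say, `priors={'AD': 0.2, 'MCI': 0.5, 'NL': 0.3}` (hypothetical values) is how clinical base rates would be injected before training data is seen.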

3. Experimental Design

The reason for selecting the C4.5 classifier is that it provides better accuracy when compared with Random forest, Naïve Bayes, and Support Vector Machine in multiclass classification problems. Ensemble feature selection is done with C4.5, SVM, RF, and NB, followed by classification with C4.5. AdaBoost has the disadvantage of overfitting, and its model construction involved more time and complexity. Hence the bagging approach is selected for the multiclass dataset classification.

3.1. Dataset. Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial Magnetic Resonance Imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of Mild Cognitive Impairment (MCI) and early Alzheimer's disease (AD). Table 1 shows the details of the datasets used in the study, and Table 2 lists the attributes in the dataset.

3.2. Preprocessing. Preprocessing precedes classification for noise removal and missing-data management. Data was partitioned based on the month of visit, and records in each partition were clustered based on the diagnosis in that visit. Data was normalized with z-score normalization, and values of selected attributes were normalized to the range 0 to 1. In prediction of the length of stay of patients, classwise mean values of the respective classes were used to replace numeric missing values, and the mode of the different classes replaced nominal or ordinal missing values. Moving average (MA) operators are used for handling missing values in time series data; MA has been applied to medical data and nonstationary signals as well [38]. The expectation maximization (EM) algorithm was used to impute the missing data in one study [39]; EM has already been applied in the analysis of Alzheimer's data and found to be more effective than multiple imputation methods [40]. Attributes with more than 40% missing data were removed from the attribute set to avoid misclassification and bias.
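The z-score normalisation and classwise mean imputation steps can be sketched as follows. The record layout, dicts of numeric attributes paired with a diagnosis label, is illustrative; the sketch assumes each class has at least one observed value per attribute:

```python
from statistics import mean, stdev

def zscore(col):
    """z-score normalise a numeric column (assumes non-constant values)."""
    m, s = mean(col), stdev(col)
    return [(v - m) / s for v in col]

def impute_classwise_mean(rows):
    """Replace missing (None) numeric values with the mean of the same
    attribute computed over records of the same diagnosis class."""
    means = {}
    classes = {y for _, y in rows}
    attrs = rows[0][0].keys()
    for c in classes:
        for a in attrs:
            observed = [x[a] for x, y in rows if y == c and x[a] is not None]
            means[(c, a)] = mean(observed)
    return [({a: (x[a] if x[a] is not None else means[(y, a)]) for a in x}, y)
            for x, y in rows]
```

Class-conditioned imputation preserves the separation between diagnosis groups that a global mean would blur, which matters when the imputed attribute is itself discriminative.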

3.3. Ensemble Feature Selection. There are three phases in the proposed Merit Merge feature selection technique. The base classifier to be applied for feature selection is determined in Phase I by comparing the classifiers reported in the literature with the ensemble classifiers. After the identification of the base classifier, PSO search is coupled with ensemble classifiers to identify feature sets with higher merit. The ensemble model is trained and tested with the feature set to obtain the optimal subset that can be used for the multinomial classification.

Phase I. This phase determines the base classifier that can be used for modelling the ensemble classification model. Clinical dementia ratio is a key attribute in the discrimination of NL, MCI, and AD. Hence that key attribute is removed and the performance of the classifiers is compared. It was noted in a previous study that classification by NB outperformed SVM; hence those classifiers are compared with C4.5 in the classification of our multiclass problem. Figure 1 shows the steps in the selection of the base classifier. The dataset containing both neuropsychological test data and neuroimaging measures


Table 2: List of attributes derived from neuropsychological test and neuroimaging measures.

Neuropsychological and neuroimaging measures: Average FDG-PET of angular, temporal, and posterior cingulate; Mini Mental State Examination-baseline; Average PIB SUVR of frontal cortex, anterior cingulate, precuneus cortex, and parietal cortex; Ventricles measure; Average AV45 SUVR of frontal, anterior cingulate, precuneus, and parietal cortex relative to the cerebellum; Hippocampus-baseline volume; Clinical dementia ratio-SB; Whole brain-baseline volume; ADAS 11; UCSF entorhinal-baseline volume; ADAS 13; UCSF fusiform-baseline volume; Mini Mental Scale Examination score; UCSF Med Temp-baseline; RAVLT (forgetting); UCSF ICV-baseline; RAVLT (5 sum); MOCA-baseline; Functional Assessment Questionnaire; Pt ECog-Memory-baseline; MOCA; Pt ECog-Language-baseline; Pt ECog-Memory; Pt ECog-VisSpat-baseline; Pt ECog-Language; Pt ECog-Plan-baseline; Pt ECog-Visual; Pt ECog-Organ-baseline; Pt ECog-Plan; Pt ECog-Div atten-baseline; Pt ECog-Organ; Pt ECog-Total-baseline; Pt ECog-Div atten; SP ECog-Mem-baseline; Pt ECog-Total; SP ECog-Lang-baseline; SP ECog-Memory; SP ECog-VisSpat-baseline; SP ECog-Language; SP ECog-Plan-baseline; SP ECog-Visual; SP ECog-Organ-baseline; SP ECog-Plan; SP ECog-Div atten-baseline; SP ECog-Organ; SP ECog-Total-baseline; SP ECog-Attention; Average FDG-PET of angular, temporal, and posterior cingulate at baseline; SP ECog-Total; Average PIB SUVR of frontal cortex, anterior cingulate, precuneus cortex, and parietal cortex at baseline; UCSF ventricles measures; Average AV45 (PET ligand) SUVR of frontal, anterior cingulate, precuneus, and parietal cortex relative to the cerebellum at baseline; UCSF hippocampus measure; CDR-SB; UCSF whole brain measure; ADAS 11 baseline; UCSF entorhinal measure; ADAS 13 baseline; UCSF fusiform measure; UCSF temporal measure; RAVLT (forgetting) baseline; UCSF ICV; RAVLT (5 sum) baseline.

Pt: patient; ECog: everyday cognition test; SP: study partner; ADAS: Alzheimer's disease assessment scale; MOCA: Montreal Cognitive Assessment; RAVLT: Rey Auditory Verbal Learning Test; ICV: intracranial volume; SUVR: Standard Uptake Value Ratio; CDR-SB: Clinical Dementia Rating Sum of Boxes.

with 870 instances was classified by NB, RF, SVM, and the C4.5 decision tree. The dataset without the clinical dementia ratio attribute was again classified with the four classifiers. This is done to evaluate the sensitivity of a classifier even in the absence of relevant attributes. Since the C4.5 decision tree classifier outperformed the other classifiers in the multiclass classification, the ensemble approach with PSO search is proposed and tested in this work. Naïve Bayes provided better accuracy compared to RF and SVM. Ensemble feature selection is performed with a C4.5 tree having binary splits and pruning with the minimum description length technique. The Random forest ensemble is implemented with 100 to 1000 trees; the out-of-bag error reduced and remained constant with 600 or more trees. The support vector machine is implemented with LIBSVM. The Radial Basis Function (RBF) kernel was used for classification, as it showed higher accuracy than other kernels. The kernel parameters C and γ are optimized with grid search.

Given a pair of values x_i, x_j, the RBF kernel used to find the separating hyperplane is defined as follows:

k(x_i, x_j) = exp(−γ ‖x_i − x_j‖²). (2)
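Equation (2) translates directly to code; a grid search over C and γ would then simply loop over candidate value pairs and keep the one with the best cross-validated accuracy. A minimal kernel sketch:

```python
from math import exp

def rbf_kernel(xi, xj, gamma):
    """RBF kernel of equation (2): k(xi, xj) = exp(-gamma * ||xi - xj||^2).
    Returns 1.0 for identical points and decays toward 0 with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return exp(-gamma * sq_dist)
```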

It was observed that the sensitivity of J48 for each class was higher compared to NB, SVM, and RF. Hence J48 is selected as the base classifier for feature selection and classification; J48 is the base ensemble classifier used in CPEMM.

Phase II. An overview of the steps in Phases II and III is presented in Figure 2. Ensemble feature selection is performed with C4.5 having binary splits and pruning. The number of iterations in the PSO search is experimented with in the range 60-100. Feature subsets reduced in size as the number of iterations increased: with a smaller number of iterations, the ensemble search selected subsets with more features, and as the iterations increased to find the best optimal solution, PSO generated subsets with fewer features. PSO search combined with the NB, RF, and C4.5 ensembles generated the feature subsets, which were sorted based on the merit given by the search technique. It was observed that the C4.5 ensemble selected the optimum subsets with Particle Swarm Optimisation in the minimum number of iterations compared with NB and RF; the RF ensemble returned good subsets on the 2-class dataset. The binary split at each node implemented in C4.5 selected relevant features with minimum iterations. The CPEMM technique is presented as Algorithm 1, following the overview of Phases II and III in Figure 2.

Phase III. For each classifier, the subset with the highest merit is considered for evaluation by the base classifier C4.5, and the accuracy is stored for further comparison.

Case 1. If subsets have equal merit, each subset is evaluated individually and also as a single subset after merging. If the merged subset does not increase accuracy, the individual subsets are selected as the relevant feature set.

Case 2. If more than 50% of the subsets at the top of the sorted subset list have the same merit, the number of iterations


[Figure 1: Selection of base classifier for ensemble feature selection. Phase I: the multivariate dataset is preprocessed and divided into a dataset with CDR and a dataset without CDR; each is classified with SVM, NB, RF, and C4.5, the performance is evaluated, and the base classifier for ensemble feature selection is selected.]

is increased to get a minimal feature subset for evaluation. One limitation is that if the increase in iterations does not return a reduced subset, this case should be probed further for enhancing feature selection.

Case 3. If there is a successive subset with much lower merit, the search for subsets is terminated.
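Cases 1 and 3 amount to a merge-while-it-helps loop over the merit-sorted subset list. A sketch under assumptions: the `evaluate` scorer and the `merit_gap` cutoff for Case 3 are illustrative stand-ins for the classifier accuracy and merit threshold, and Case 2 (rerunning the search with more iterations) is left out:

```python
def merit_merge(subsets, evaluate, merit_gap=0.05):
    """Merit Merge sketch: walk (merit, features) pairs sorted by descending
    merit, merge each next subset into the current set, and keep the merge
    only when accuracy does not drop (Case 1). Stop when the next subset's
    merit falls by more than `merit_gap` (Case 3)."""
    subsets = sorted(subsets, key=lambda s: -s[0])
    best_feats = set(subsets[0][1])
    best_acc = evaluate(best_feats)
    prev_merit = subsets[0][0]
    for m, feats in subsets[1:]:
        if prev_merit - m > merit_gap:        # Case 3: merit falls off sharply
            break
        merged = best_feats | set(feats)
        acc = evaluate(merged)
        if acc >= best_acc:                   # Case 1: keep a helpful merge
            best_feats, best_acc = merged, acc
        prev_merit = m
    return best_feats, best_acc
```

In the usage below the feature names and the scoring function are hypothetical: the scorer rewards three "informative" features and penalises extras.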

5-fold cross validation ensured that all the instances are used in model development [41]. Alternate records are left out and the model is trained with the remaining records in every consecutive execution of the loop. The ensemble classifiers, implemented in the Weka tool, were run on a Pentium processor with 2.53 GHz speed, 4 GB RAM, and a 64-bit operating system. Statistical analysis of the feature selection methods and of the performance of the classifiers was implemented in R.

4. Results

The results are evaluated based on the performance of the classifier when fed the different feature sets selected by

(i) C4.5, NB, and RF coupled with PSO;

(ii) the CPEMM approach.

Accuracy, precision, and recall of the classifier are evaluated with the four datasets listed in Table 3.

4.1. Performance Measures. Any classification result can have an error rate: the classifier will either fail to identify dementia or misclassify a normal patient as demented. It is common to describe this error rate by the terms True Positive, False Positive, True Negative, and False Negative, as follows.

True Positive (TP): the result of classification is positive in the presence of the clinical abnormality. True Negative (TN): the result of classification is negative in the absence of the clinical abnormality. False Positive (FP): the result of classification is positive in the absence of the clinical abnormality. False Negative (FN): the result of classification is negative in the presence of the clinical abnormality.

Accuracy defines the overall correctness of the model. Precision defines the proportion of correct classifications obtained for each class; its value falls between 0 and 1. Recall corresponds to the True Positive rate. Consider:

Accuracy = No. of correct classifications / Total no. of classifications, (3)

Precision = TP / (TP + FP), (4)

Recall = TP / (TP + FN). (5)
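Equations (3)-(5) are a direct computation from the four counts of the confusion matrix:

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall per equations (3)-(5)."""
    total = tp + fp + tn + fn
    return {'accuracy': (tp + tn) / total,   # eq. (3): correct / all
            'precision': tp / (tp + fp),     # eq. (4)
            'recall': tp / (tp + fn)}        # eq. (5)
```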

Receiver Operating Characteristic (ROC) curves are used to analyse the prediction capability of machine learning techniques used for classification and clustering [42]. ROC analysis is a graphical representation comparing the True Positive rate and the False Positive rate in classification results. The area under the curve (AUC) characterizes the ROC of a classifier: the larger the value of AUC, the more effective the performance of the classifier. Press' Q test was used to evaluate the statistical significance of the difference in accuracy yielded by the classifiers. Given "m" samples, "n" correct classifications, and "p" groups, the test statistic was evaluated as follows:

Q = (m − np)² / (m(p − 1)) ∼ χ². (6)
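Equation (6) as a one-liner; the result is compared against the chi-square critical value (3.84 at the 0.05 level with one degree of freedom):

```python
def press_q(m, n, p):
    """Press' Q of equation (6): Q = (m - n*p)^2 / (m * (p - 1)),
    for m samples, n correct classifications, and p groups."""
    return (m - n * p) ** 2 / (m * (p - 1))
```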

Naïve Bayes, C4.5 Decision Tree, Random forest, and SVM yielded statistically significant accuracy. It was found that feature selection by CPEMM considerably increased the percentage of records that were correctly classified. The C4.5 classifier combined with the CPEMM methodology provided the highest statistically significant difference in performance when compared with PSO and the conventional ensemble-based feature selection technique, as shown in Figure 3. A higher median accuracy of 0.987 was yielded by the proposed CPEMM combination.


[Figure 2: Steps in Phase II and Phase III. Phase II: the dataset is passed to ensemble feature selection (NB, SVM, RF, and J48) with PSO search, producing optimized feature subsets. Phase III: subsets with minimal difference in merit are selected; the highest-ranking feature subset is evaluated, the next subset from the sorted list is merged with the current set, the classifier is evaluated with the new subset, and, when the terminating condition is true, the best subset for Normal, MCI, and Dementia is finalized.]


Algorithm CPEMM (DS, thresholds)
Input: DS → dataset; merit_threshold → merit value; acc_threshold → accuracy threshold; flag = 1
Output: ffs → finalised feature set
A → accuracy vector; MS → merged set vector; bs → bootstrap vector;
m → merit of subsets obtained from wrapper feature selection;
N → sorted subset list in descending order; S_i → subset with highest ranking;
k ← number of subsets generated (1, …, K)

(1) Initialize subset s = null.
(2) Generate feature subsets with the ensemble:
      Initialise number of iterations n
      Repeat
          Generate subset pset_k from DS
          Verify merit; S_k ← pset_k
          Optimize the feature subset; m_k ← merit of pset_k
      Until n iterations
(3) Evaluate the accuracy of the classifier with subsets S_i, i = 1, …, m:
      i = 1; j = 2
      Repeat
          Evaluate accuracy A_i of S_i
          Repeat
              diff = m_i − m_j
              MS_ij ← S_i and S_j merged
              A_MSij ← evaluate accuracy of MS_ij
              If A_i ≤ A_MSij then
                  Append MS_ij to ffs; flag = 1
              else
                  flag = 0; Append S_i to ffs
              endif
              Increment j
          until flag = 0 or j > k or diff ≥ merit_threshold
          Increment i
      until A_i ≥ acc_threshold
(4) bs_i ← bootstrap vector from ffs:
      Repeat
          Train classifier c_i with bs_i
          Evaluate the out-of-bag error
      Until n sets are bootstrapped, with 5-fold cross validation

Algorithm 1: CPEMM feature selection.
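The Phase III merging logic above can be sketched in Python (an illustrative reconstruction under our reading of the pseudocode; the helper names and the toy merit/accuracy functions are hypothetical, not from the published implementation):

```python
def merit_merge(subsets, merits, evaluate, merit_threshold):
    """Greedy Merit Merge sketch: subsets are sorted by descending merit;
    each subset is merged with its successors while the merit gap stays
    small and merging does not reduce classifier accuracy."""
    ffs = []                                  # finalised candidate sets
    for i, base in enumerate(subsets):
        current = set(base)
        acc = evaluate(current)
        for j in range(i + 1, len(subsets)):
            if merits[i] - merits[j] >= merit_threshold:
                break                         # merit difference too large
            merged = current | set(subsets[j])
            merged_acc = evaluate(merged)
            if merged_acc >= acc:             # keep merges that help
                current, acc = merged, merged_acc
            else:
                break                         # merging hurt accuracy; stop
        ffs.append(current)
    return max(ffs, key=evaluate)             # best consolidated subset

# Toy demonstration: "accuracy" rewards covering features {0, 1, 2}.
toy_eval = lambda s: len(s & {0, 1, 2}) / 3
best = merit_merge([{0}, {1}, {2}], [0.9, 0.8, 0.7], toy_eval, 0.5)
```

In a real run, `evaluate` would train and score the bagged J48 ensemble on the candidate subset; here it is a stand-in so the control flow can be followed.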

[Figure 3 plots classification accuracy (roughly 0.7 to 1.1 on the y-axis) for each ensemble classifier combined with PSO and CPEMM: Naïve Bayes-PSO, J48-PSO, RF-PSO, SVM-PSO, NB-CPEMM, J48-CPEMM, and RF-CPEMM.]

Figure 3: Accuracy of classifiers with features selected by PSO and CPEMM methods.


Table 3: Results for the 3-class dataset with ensembles of NB, J48, RF, and SVM, and with the CPEMM ensemble.

Dataset | NB Ensemble | J48 Ensemble | RF Ensemble | SVM Ensemble | NB-CPEMM | J48-CPEMM | RF-CPEMM
NSDS    | 0.838/0.854/0.838 | 0.911/0.895/0.889 | 0.936/0.941/0.799 | 0.647/0.564/0.677 | 0.899/0.904/0.912 | 0.932/0.958/0.933 | 0.936/0.941/0.799
NIDS    | 0.756/0.737/0.556 | 0.936/0.941/0.937 | 0.545/0.645/0.545 | 0.667/0.685/0.734 | 0.834/0.895/0.738 | 0.904/0.934/0.907 | 0.545/0.645/0.545
CIDS-I  | 0.863/0.833/0.823 | 0.916/0.933/0.923 | 0.963/0.963/0.963 | 0.673/0.699/0.788 | 0.896/0.812/0.822 | 0.987/0.955/0.963 | 0.963/0.963/0.963
CIDS-II | 0.756/0.654/0.563 | 0.945/0.931/0.924 | 0.797/0.723/0.634 | 0.685/0.676/0.657 | 0.885/0.823/0.813 | 0.971/0.965/0.923 | 0.888/0.856/0.833

(Each cell gives Acc/Pre/Rec. NSDS: neuropsychological dataset; NIDS: neuroimaging dataset; CIDS: combined dataset. Acc: accuracy; Pre: precision; Rec: recall.)


Table 4: Results with the maximal feature subset obtained by the divide and merge feature selection technique.

Classifier | Normal class (Acc/Pre/Rec) | Dementia (Acc/Pre/Rec) | Mild Cognitive Impairment (Acc/Pre/Rec)
J48    | 0.963/0.963/0.963 | 0.978/0.953/0.967 | 0.966/0.968/0.987
J48-TT | 0.983/0.972/0.977 | 0.986/0.973/0.963 | 0.977/0.976/0.977
J48-CV | 0.965/0.954/0.945 | 0.966/0.956/0.955 | 0.964/0.954/0.945

(TT: training and testing; CV: cross validation.)

[Figure 4 plots the ROC area (0 to 1.2 on the y-axis) for Random forest, J48, and Naïve Bayes across the dataset and feature selection combinations EN-dataset-I through EN-dataset-IV and CPEMM-dataset-I through CPEMM-dataset-IV: ROC analysis for datasets with CPEMM.]

Figure 4: Comparison of area under the curve obtained by the ordinary ensemble (EN) versus CPEMM for the four datasets.

and the C4.5 ensemble classifier, while the other classifiers tested, that is, NB, RF, and SVM, had medians of approximately 0.75 to 0.88.

With C4.5 as the base classifier, the features selected by PSO in combination with the NB, SVM, C4.5, and RF ensembles are listed in Table 3. The CPEMM method is applied to merge subsets based on merit, and the resultant subsets from each dataset are evaluated with the C4.5 classifier. The accuracy obtained for each class (NL, AD, and MCI) is evaluated in each dataset. The sensitivity of the classifier in the multiclass classification using the CPEMM approach is tabulated in Table 4.

CPEMM was applied to the feature sets obtained by NB, C4.5, and RF, since the sensitivity of the SVM classifier was very low compared with the other classifiers. The nonlinear RBF kernel was the best fitting kernel for SVM, yet the accuracy obtained was below 70%. Hence the CPEMM strategy was applied and tested with NB, C4.5, and RF.

The discriminating efficiency of J48 with respect to the three classes, Normal, Dementia, and Mild Cognitive Impairment, was evaluated. Classification of the Normal class had higher sensitivity compared to the delineation of Mild Cognitive Impairment and Dementia; the results are given in Table 4. Ensemble feature selection returned a list of subsets with higher merit, and the CPEMM technique merged these and evaluated the accuracy of successive subsets with higher merits. The efficiency of the classifier with features selected using CPEMM is compared with that of features selected by conventional ensemble feature selection through the ROC analysis in Figure 4: the ROC area obtained with each of the four datasets under individual ensemble feature selection is plotted alongside the ROC area obtained with CPEMM. Table 5 describes the ensemble model used for the classification.

Table 5: Description of the J48 ensemble model used for the multiclass classification.

Split method: binary split
Cross validation accuracy: 0.976
AUC with CV: 0.971
Train and test accuracy: 0.986
AUC with train and test: 0.987
Common features selected by all methods: MMSE, CDR, hippocampus volume, and everyday cognition measures
Features added by CPEMM: entorhinal measures, CDR-SB, and Rey Auditory Verbal Learning Test (immediate)

CPEMM yielded higher area under the curve values for all four datasets examined in our study.

5. Conclusion

The C4.5 classifier provided better accuracy and sensitivity in the multiclass classification of Alzheimer's Dementia. The ensemble of C4.5 classifiers selected the best-fit subset for the evaluation of the three different classes, with the highest Recall value of 0.987 for the class MCI. Features selected by the C4.5 algorithm further increased the performance of the Random forest and Naïve Bayes classifiers as well. The proposed ensemble with PSO search selected the minimal subset needed for the discrimination of the diseases. The Merit Merge approach further enhanced feature selection by identifying the effective consolidated subset that could be used for the clinical diagnosis of dementia and of Mild Cognitive Impairment that will lead to dementia. Our work also confirmed that the performance of SVM in the delineation of Mild Cognitive Impairment and Dementia is very low compared to Random forest, Naïve Bayes, and the C4.5 algorithm, as reported by Williams et al. [23]. Although the performance of Random forest was comparable to C4.5 and NB in the discrimination of 2-class data, an accuracy of only approximately 75% was obtained for the 3-class problem. CPEMM was able to predict the relevant features for all datasets, especially the CIDS. The proposed split and merge ensemble approach can be applied to any 3-class classification problem, and it can be extended to the classification of high-dimensional datasets, such as microarray data, with preliminary feature reduction.

Classification with NB for the discrimination of Dementia and MCI in a previous study resulted in an accuracy of around 80% and a sensitivity of approximately 70% [14, 23]. Our CPEMM, based on a Bagging ensemble of J48 with the Merit Merge technique, yielded a higher accuracy of 98.7% in the train and test method [43]. The Bagging approach, learning from more than one classifier, found the minimal subset for effective diagnostics, and the Merit Merge approach found all possible highly relevant subsets that contribute towards the multiclass classification. The proposed approach yielded a statistically significant difference, with a mean area under the curve of approximately 0.977 in the multivariate classification of Dementia.

Bagging ensemble models provide a promising, statistically significant machine learning method with low error for disease diagnosis. The proposed methodology can be applied to disease state prediction even with class-imbalanced datasets.

Disclosure

T. R. Sivapriya is a registered user of ADNI. Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this work.

Acknowledgments

The authors thank Professor Kumanan, Head of the Department of Psychiatry, Madurai Medical College, Tamil Nadu, India, for many helpful discussions in the study and interpretation of data for building the classifier. Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

[1] M. Prince, E. Albanese, M. Guerchet, and M. Prina, Alzheimer Report: Dementia and Risk Reduction: An Analysis of Protective and Modifiable Factors, Alzheimer's Disease International, London, UK, 2014, http://www.alz.co.uk/research/WorldAlzheimerReport2014.pdf.

[2] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro, and C. P. Ferri, "The global prevalence of dementia: a systematic review and metaanalysis," Alzheimer's & Dementia, vol. 9, no. 1, pp. 63e2–75e2, 2013.

[3] Y. Xia, "GA and AdaBoost-based feature selection and combination for automated identification of dementia using FDG-PET imaging," in Intelligent Science and Intelligent Data Engineering, vol. 7202 of Lecture Notes in Computer Science, pp. 128–135, Springer, Berlin, Germany, 2012.

[4] B. Magnin, L. Mesrob, S. Kinkingnehun et al., "Support vector machine-based classification of Alzheimer's disease from whole-brain anatomical MRI," Neuroradiology, vol. 51, no. 2, pp. 73–83, 2009.

[5] M. C. Tierney, R. Moineddin, and I. McDowell, "Prediction of all-cause dementia using neuropsychological tests within 10 and 5 years of diagnosis in a community-based sample," Journal of Alzheimer's Disease, vol. 22, no. 4, pp. 1231–1240, 2010.

[6] M. N. Rossor, N. C. Fox, C. J. Mummery, J. M. Schott, and J. D. Warren, "The diagnosis of young-onset dementia," The Lancet Neurology, vol. 9, no. 8, pp. 793–806, 2010.

[7] R. M. Chapman, M. Mapstone, J. W. McCrary et al., "Predicting conversion from mild cognitive impairment to Alzheimer's disease using neuropsychological tests and multivariate methods," Journal of Clinical and Experimental Neuropsychology, vol. 33, no. 2, pp. 187–199, 2011.

[8] A. Pozueta, E. Rodríguez-Rodríguez, J. L. Vázquez-Higuera et al., "Detection of early Alzheimer's disease in MCI patients by the combination of MMSE and an episodic memory test," BMC Neurology, vol. 11, article 78, 2011.

[9] M. Quintana, J. Guardia, G. Sánchez-Benavides et al., "Using artificial neural networks in clinical neuropsychology: high performance in mild cognitive impairment and Alzheimer's disease," Journal of Clinical and Experimental Neuropsychology, vol. 34, no. 2, pp. 195–208, 2012.

[10] R. Cuingnet, E. Gerardin, J. Tessieras et al., "Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database," NeuroImage, vol. 56, no. 2, pp. 766–781, 2011.

[11] M. Q. Wilks, H. Protas, M. Wardak et al., "Automated VOI analysis in FDDNP PET using structural warping: validation through classification of Alzheimer's disease patients," International Journal of Alzheimer's Disease, vol. 2012, Article ID 512069, 8 pages, 2012.

[12] S. Klöppel, C. M. Stonnington, J. Barnes et al., "Accuracy of dementia diagnosis—a direct comparison between radiologists and a computerized method," Brain, vol. 131, no. 11, pp. 2969–2974, 2008.

[13] A. Larner, "Comparing diagnostic accuracy of cognitive screening instruments: a weighted comparison approach," Dementia and Geriatric Cognitive Disorders Extra, vol. 3, no. 1, pp. 60–65, 2013.

[14] J. Maroco, D. Silva, A. Rodrigues, M. Guerreiro, I. Santana, and A. de Mendonça, "Data mining methods in the prediction of dementia: a real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests," BMC Research Notes, vol. 4, article 299, 14 pages, 2011.

[15] W. Yu, T. Liu, R. Valdez, M. Gwinn, and M. J. Khoury, "Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes," BMC Medical Informatics and Decision Making, vol. 10, no. 1, article 16, 2010.

[16] P. R. Hachesu, M. Ahmadi, S. Alizadeh, and F. Sadoughi, "Use of data mining techniques to determine and predict length of stay of cardiac patients," Healthcare Informatics Research, vol. 19, no. 2, pp. 121–129, 2013.

[17] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16–18, pp. 3273–3283, 2010.

[18] S. Maldonado, R. Weber, and J. Basak, "Simultaneous feature selection and classification using kernel-penalized support vector machines," Information Sciences, vol. 181, no. 1, pp. 115–128, 2011.

[19] J. Geraci, M. Dharsee, P. Nuin et al., "Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems," Bioinformatics, vol. 30, no. 5, pp. 712–718, 2014.

[20] X. Chen, M. Wang, and H. Zhang, "The use of classification trees for bioinformatics," Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55–63, 2011.

[21] M. L. Calle, V. Urrea, A.-L. Boulesteix, and N. Malats, "AUC-RF: a new strategy for genomic profiling with random forest," Human Heredity, vol. 72, no. 2, pp. 121–132, 2011.

[22] K. R. Gray, P. Aljabar, R. A. Heckemann, A. Hammers, and D. Rueckert, "Random forest-based similarity measures for multi-modal classification of Alzheimer's disease," NeuroImage, vol. 65, pp. 167–175, 2013.

[23] J. A. Williams, A. Weakley, D. J. Cook, and M. Schmitter-Edgecombe, "Machine learning techniques for diagnostic differentiation of mild cognitive impairment and dementia," in Proceedings of the 27th AAAI Conference on Artificial Intelligence Workshop, 2013.

[24] B. Chizi and O. Maimon, "Dimension reduction and feature selection," in Data Mining and Knowledge Discovery Handbook, pp. 93–111, 2005.

[25] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157–1182, 2003.

[26] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[27] O. Uncu and I. B. Turksen, "A novel feature selection approach: combining feature wrappers and filters," Information Sciences, vol. 177, no. 2, pp. 449–466, 2007.

[28] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 81–86, Seoul, The Republic of Korea, May 2001.

[29] H. H. Inbarani, A. T. Azar, and G. Jothi, "Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis," Computer Methods and Programs in Biomedicine, vol. 113, no. 1, pp. 175–185, 2014.

[30] P. Bühlmann and T. Hothorn, "Boosting algorithms: regularization, prediction and model fitting," Statistical Science, vol. 22, no. 4, pp. 477–505, 2007.

[31] L. Rokach, Pattern Classification Using Ensemble Methods, World Scientific, Singapore, 2009.

[32] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[34] A. Liaw and M. Wiener, "Classification and regression by randomForest," R News, vol. 2, no. 3, pp. 18–22, 2002.

[35] S. Janitza, C. Strobl, and A. L. Boulesteix, "An AUC-based permutation variable importance measure for random forests," BMC Bioinformatics, vol. 14, article 119, 2013.

[36] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, Boston, Mass, USA, 1993.

[37] S. B. Kotsiantis, "Supervised machine learning: a review of classification techniques," Informatica, vol. 31, no. 3, pp. 249–268, 2007.

[38] H. Azami, K. Mohammadi, and B. Bozorgtabar, "An improved signal segmentation using moving average and Savitzky-Golay filter," Journal of Signal and Information Processing, vol. 3, no. 1, pp. 39–44, 2012.

[39] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, Hoboken, NJ, USA, 2nd edition, 2008.

[40] G. M. D'Angelo, J. Luo, and C. Xiong, "Missing data methods for partial correlation," Journal of Biometrics & Biostatistics, vol. 3, no. 8, p. 155, 2012.

[41] S. Arlot and A. Celisse, "A survey of cross-validation procedures for model selection," Statistics Surveys, vol. 4, pp. 40–79, 2010.

[42] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.

[43] J. C. Xu, L. Sun, Y. P. Gao, and T. H. Xu, "An ensemble feature selection technique for cancer recognition," Bio-Medical Materials and Engineering, vol. 24, no. 1, pp. 1001–1008, 2014.


was developed using neuropsychological tests, improved by multivariate analyses using PCA [7]. A research report comparing conventional statistical classifiers and machine learning methods demonstrated the comparable, improved performance of the machine learning methods [14]. A study by Quintana et al. provides evidence that Artificial Neural Networks can be a useful tool for the analysis of neuropsychological profiles related to clinical syndromes. Yu et al. developed a Support Vector Machine model for the prediction of common diseases, in the case of occurrence of diabetes and prediabetes [15]. Hachesu et al. applied Neural Networks, Decision Trees, and SVM to determine and predict the length of stay of cardiac patients [16].

Kabir et al. presented a new feature selection (FS) algorithm based on the wrapper approach using Neural Networks [17]. The vital aspect of this algorithm is the automatic determination of Neural Network architectures during the feature selection process. Maldonado et al. have applied SVM for simultaneous feature selection and classification [18]. A new approach for the classification of high-dimensional microarray data has also evolved [19]. Chen et al. applied classification trees to larger datasets in bioinformatics [20]. Calle et al. developed a new strategy for genome data profiling with Random forest [21].

Several studies with multimodal data [22] have proven the classification efficiency of Random forest [14, 20, 21]. In a study on the differentiation of MCI from AD, Naïve Bayes, SVM, NN, and Decision Tree (DT) were used for feature selection, and Naïve Bayes was used as the base classifier [23]. In that study, Naïve Bayes and DT gave better results when compared with SVM.

Relevance of This Study. Attribute selection plays a key role in building a good classifier that can efficiently delineate patient records with high accuracy and efficiency. This study proposes an ensemble feature selection approach using the J48 classifier with a PSO search strategy and the Merit Merge technique to do the following:

(a) find the optimal subset that can effectively delineate the three classes, Normal (NL), Mild Cognitive Impairment (MCI), and Alzheimer's Dementia (AD), with ensemble feature selection;

(b) find all possible subset combinations that can increase the accuracy in the discrimination of Mild Cognitive Impairment from Dementia;

(c) train and test an ensemble model that can effectively classify multiclass medical data.

2. Feature Selection and Classification

2.1. Feature Selection. Feature selection is an important step that determines the performance of a classifier. Dimension reduction [24] is compulsory for better classification of larger datasets. Feature extraction selects the most relevant, nonredundant features of interest from the given data. In general, feature selection can be performed by filter, wrapper [17], and embedded methods. Several studies have been reported on feature selection with Support Vector Machines [18, 25, 26] and Random forest [21]. Uncu and Turksen developed a new approach combining filters and wrappers for feature selection [27].

Particle Swarm Optimisation (PSO) is a search technique that is a proven feature selection mechanism [28]. The capability of PSO is that it can search a very large search space and find solutions quickly compared to other evolutionary search techniques like Genetic Algorithms. Optimisation plays a great role in classification and clustering applications. PSO has been used not only for feature selection [29]; it has also been applied to the optimization of parameters in machine learning algorithms like SVM.
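As a rough illustration of how a binary PSO can search the feature-subset space, the sketch below maps particle velocities through a sigmoid into per-bit selection probabilities. The toy fitness function merely stands in for the wrapper merit a real run would compute, and all names here are our own:

```python
import numpy as np

def pso_feature_search(n_features, fitness, n_particles=20, n_iter=40, seed=0):
    """Binary PSO sketch: particles are 0/1 masks over features; velocities
    are squashed through a sigmoid into per-bit selection probabilities."""
    rng = np.random.default_rng(seed)
    pos = (rng.random((n_particles, n_features)) < 0.5).astype(int)
    vel = rng.normal(0.0, 0.1, (n_particles, n_features))
    pbest = pos.copy()                                  # per-particle bests
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()            # global best mask
    for _ in range(n_iter):
        r1 = rng.random(vel.shape)
        r2 = rng.random(vel.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = (rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(int)
        fit = np.array([fitness(p) for p in pos])
        better = fit > pbest_fit
        pbest[better] = pos[better]
        pbest_fit[better] = fit[better]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest

# Toy fitness: reward three "relevant" features, penalise subset size.
toy_fitness = lambda mask: mask[:3].sum() - 0.1 * mask.sum()
best_mask = pso_feature_search(10, toy_fitness)
```

In the study's setting, `fitness` would be the merit of a wrapper evaluation by the ensemble classifier rather than this toy score.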

2.2. Bagging. Bagging follows a bootstrap method of data selection for classification and uses classifiers of the same type. It follows a sampling-with-replacement procedure for selecting a set of data as input for each classifier; since the classifiers are of the same type, a majority vote across the ensemble formulates the final result. The boosting ensemble, in contrast, follows a sequential method, where every classifier is formed based on the output and error of the previously constructed classifier [30]: the second classifier performs better than the first, and the same holds for the consecutively constructed classifiers. Hence boosting takes more time for model construction, its complexity increases, and it can result in overfitting of the given data. An ensemble classifier is a supervised learning model [31] that employs a group of multiple classifiers to improve classification accuracy; it combines many weak learners in order to generate a strong learning algorithm. The aim of applying an ensemble method is to overcome the risk of overfitting by an individual classifier.
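For readers who want to reproduce the idea, a bagging ensemble of decision trees with bootstrap sampling and majority voting can be set up in a few lines of scikit-learn (our stand-in on synthetic data; the study itself used Weka-based implementations):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

# Synthetic 3-class data standing in for the dementia dataset.
X, y = make_classification(n_samples=300, n_features=20, n_informative=6,
                           n_classes=3, n_clusters_per_class=1,
                           random_state=42)

# BaggingClassifier bootstraps the data (sampling with replacement) for
# each base decision tree and aggregates predictions by majority vote.
bag = BaggingClassifier(n_estimators=50, oob_score=True, random_state=42)
bag.fit(X, y)
oob_accuracy = bag.oob_score_   # accuracy estimated on out-of-bag samples
```

The out-of-bag score gives a generalization estimate without a held-out set, which is why the paper tracks it alongside cross validation.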

2.3. Classification

2.3.1. Support Vector Machines. Support Vector Machines (SVMs) were introduced in 1995 by Cortes and Vapnik [32]. In terms of theory, SVMs are well founded and have proved to be very efficient in classification tasks. The advantages of such classifiers are that they are independent of the dimensionality of the feature space and that the results obtained are very accurate, although the training time is high. Support Vector Machines are feed-forward networks with a single layer of nonlinear units. Their design has good generalization performance as an objective and, for that reason, follows the principle of structural risk minimization that is rooted in VC dimension theory.

The training points for which equality holds in the separating-plane constraint, that is,

∀i: y_i (x_i · w + b) − 1 ≥ 0, (1)

those which wind up lying on one of the margin hyperplanes H1, H2, and whose removal would change the solution found, are called Support Vectors (SVs). This algorithm is firmly grounded in the framework of statistical learning theory, Vapnik-Chervonenkis (VC) theory, which improves the generalization ability of learning machines to unseen data. In the last few years, Support Vector Machines have shown excellent performance in many real-world applications, including object recognition, face detection, and dementia diagnosis in images.
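To make the notion of support vectors concrete, the toy scikit-learn sketch below fits a linear SVM and reads back the training points lying on the margin hyperplanes (illustrative data of our own; scikit-learn wraps LIBSVM, which the study also used):

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated point clouds; labels 0 and 1.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [3., 3.], [3., 4.], [4., 3.]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)
# The support vectors are the points on the margin hyperplanes; removing
# any of them would change the separating plane found.
n_sv = len(clf.support_vectors_)
```

Only a subset of the six training points ends up as support vectors; the rest could be dropped without moving the decision boundary.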


Table 1: Datasets used in the study.

Dataset                    | Instances | Attributes | Classes | AD  | Normal | MCI
Neuropsychological dataset | 750       | 48         | 3       | 150 | 200    | 400
Neuroimaging dataset       | 650       | 108        | 2       | 250 | 250    | 200
Baseline combined data     | 870       | 65         | 3       | 140 | 280    | 450
Combined dataset           | 750       | 40         | 2       | 150 | 200    | 400

2.3.2. Random Forest. Random forest trees, introduced by Breiman [33], are a method of building a forest of uncorrelated trees with randomized node optimization and bagging. The out-of-bag error is used as an estimate of the generalization error. Random forest (RF) can measure variable importance through permutation [34]. The general technique of bootstrap aggregation is applied in the training algorithm. In a Random forest implementation, only the number of trees in the forest and the number of attributes for prediction need to be defined [35].
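A minimal scikit-learn counterpart of this setup, showing the out-of-bag error estimate and variable-importance scores (note that scikit-learn's `feature_importances_` is impurity-based; permutation importance is a separate utility):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=30, n_informative=8,
                           n_classes=3, n_clusters_per_class=1,
                           random_state=7)

# The study grew 100-1000 trees and saw the OOB error flatten near 600;
# a smaller forest is used here purely to keep the sketch fast.
rf = RandomForestClassifier(n_estimators=100, oob_score=True,
                            random_state=7).fit(X, y)
oob_error = 1.0 - rf.oob_score_         # generalization-error estimate
importances = rf.feature_importances_   # impurity-based importance scores
```

The importance vector is normalized, so ranking features by it is a quick proxy for the variable-importance screening described above.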

2.3.3. C4.5. The C4.5 algorithm is used to generate a Decision Tree for classification problems [36]. The Decision Tree is built using the entropy values obtained from the given data. C4.5 uses binary or multivalued splits in the selection of attributes. Performance of the algorithm varies between cross validation and the train-test method; the average accuracy across several folds should be taken as the evaluation measure. As with all other classifiers, precision and recall increase with more records in the training dataset. J48 is the Java implementation of C4.5 in the Weka tool. C4.5 is an improvement of the ID3 algorithm and is capable of handling both discrete and continuous values. Another advantage is that fields with missing values need not be imputed; rather, such a field is simply not used in the calculation of entropy and information gain.
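The entropy and information-gain computation that drives C4.5's attribute selection reduces to a few lines; the sketch below (with toy labels of our own) scores a candidate binary split by the gain it yields:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of the parent node minus the size-weighted entropy of the
    child nodes produced by a candidate split."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder

labels = ["AD", "AD", "MCI", "MCI", "NL", "NL"]
# Candidate binary split that isolates the AD records:
gain = information_gain(labels, [["AD", "AD"], ["MCI", "MCI", "NL", "NL"]])
```

C4.5 evaluates every candidate attribute split this way and picks the one with the highest (gain-ratio-adjusted) score.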

2.3.4. Naïve Bayes. The Naïve Bayes classifier is a statistical technique [37] applied for classification in data mining problems. It is based on the probabilistic outcomes of the given data. It is a supervised learning technique, and hence prior knowledge can be incorporated into its learning process. It is therefore well suited to medical diagnostics, where the knowledge of the domain expert can be incorporated beforehand in order to achieve higher performance.
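This ability to inject prior knowledge is visible, for example, in scikit-learn's GaussianNB, where class priors (such as known disease prevalence) can be passed in directly (illustrative one-feature data of our own):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# One measurement per record; class 0 clusters near 1.0, class 1 near 5.0.
X = np.array([[1.0], [1.2], [0.9], [5.0], [5.2], [4.9]])
y = np.array([0, 0, 0, 1, 1, 1])

# `priors` encodes domain-expert knowledge about class prevalence.
nb = GaussianNB(priors=[0.7, 0.3]).fit(X, y)
pred = nb.predict(np.array([[1.1], [5.1]]))
```

With well-separated class-conditional distributions, the moderate prior skew does not change the predictions, but with overlapping classes it shifts the decision boundary toward the rarer class.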

3. Experimental Design

The reason for selecting the C4.5 classifier is that it provides better accuracy when compared with Random forest, Naïve Bayes, and Support Vector Machine in multiclass classification problems. Ensemble feature selection is done with C4.5, SVM, RF, and NB, followed by classification with C4.5. AdaBoost has the disadvantage of overfitting, and its model construction involves more time and complexity; hence the bagging approach is selected for the multiclass dataset classification.

3.1. Dataset. Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial Magnetic Resonance Imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of Mild Cognitive Impairment (MCI) and early Alzheimer's disease (AD). Table 1 shows the details of the data sets used in the study; Table 2 lists the attributes in the dataset.

3.2. Preprocessing. Preprocessing precedes classification for noise removal and missing data management. Data was partitioned based on the month of visit, and records in each partition were clustered based on the diagnosis in that visit. Data was normalized with z-score normalization; values of selected attributes were normalized to a range from 0 to 1. In the prediction of length of stay of patients, classwise mean values of the respective classes were used to replace numeric missing values, and the mode of the different classes replaced nominal or ordinal missing values. Moving average (MA) operators are used for handling missing values in time series data; MA has been applied to medical data and nonstationary signals as well [38]. The expectation maximization (EM) algorithm was used to impute missing data in one study [39]; EM has already been applied in the analysis of Alzheimer's data and found to be more effective than multiple imputation methods [40]. Attributes with more than 40% missing data were removed from the attribute set to avoid misclassification and bias.
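The z-score normalization and class-wise mean imputation described above can be sketched as follows (our own minimal NumPy implementation, not the study's code):

```python
import numpy as np

def zscore(col):
    """Standardize a numeric column, ignoring missing (NaN) entries."""
    col = np.asarray(col, dtype=float)
    return (col - np.nanmean(col)) / np.nanstd(col)

def impute_classwise_mean(col, labels):
    """Replace each NaN with the mean of the values in its own class."""
    col = np.asarray(col, dtype=float)
    out = col.copy()
    for cls in np.unique(labels):
        mask = labels == cls
        out[mask & np.isnan(col)] = np.nanmean(col[mask])
    return out

labels = np.array(["NL", "NL", "AD", "AD"])
values = np.array([1.0, np.nan, 3.0, 5.0])
imputed = impute_classwise_mean(values, labels)  # NaN filled with NL mean
standardized = zscore(np.array([0.0, 10.0]))
```

Imputing within each diagnostic class, rather than over the whole column, keeps the filled values consistent with the class-conditional distributions that the classifiers later model.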

3.3. Ensemble Feature Selection. There are three phases in the proposed Merit Merge feature selection technique. The base classifier to be applied for feature selection is determined in Phase I by comparing the classifiers reported in the literature with the ensemble classifiers. After the identification of the base classifier, PSO search is coupled with ensemble classifiers to identify feature sets with higher merit. The ensemble model is then trained and tested with the feature sets to obtain the optimal subset that can be used for the multinomial classification.

Phase I. This phase determines the base classifier to be used for modelling the ensemble classification model. Clinical dementia ratio is a key attribute in the discrimination of NL, MCI, and AD; hence this key attribute is removed and the performance of the classifiers is compared. It was noted in a previous study that classification by NB outperformed SVM; hence those classifiers are compared with C4.5 in the classification of our multiclass problem. Figure 1 shows the steps in the selection of the base classifier. The data set containing both neuropsychological test data and neuroimaging measures


Table 2: List of attributes derived from neuropsychological tests and neuroimaging measures.

Neuropsychological and neuroimaging measures: average FDG-PET of angular, temporal, and posterior cingulate (current and baseline); average PIB SUVR of frontal cortex, anterior cingulate, precuneus cortex, and parietal cortex (current and baseline); average AV45 (PET ligand) SUVR of frontal, anterior cingulate, precuneus, and parietal cortex relative to the cerebellum (current and baseline); clinical dementia ratio-SB and CDR-SB; ADAS 11 (current and baseline); ADAS 13 (current and baseline); Mini Mental State Examination score (current and baseline); RAVLT forgetting (current and baseline); RAVLT 5-sum (current and baseline); Functional Assessment Questionnaire; MOCA (current and baseline); Pt ECog measures for Memory, Language, Visual/Spatial, Plan, Organisation, Divided attention, and Total (current and baseline); SP ECog measures for Memory, Language, Visual/Spatial, Plan, Organisation, Divided attention, Attention, and Total (current and baseline); ventricles measure and UCSF ventricles measures; hippocampus-baseline volume and UCSF hippocampus measure; whole brain-baseline volume and UCSF whole brain measure; UCSF entorhinal-baseline volume and measure; UCSF fusiform-baseline volume and measure; UCSF Med Temp-baseline and UCSF temporal measure; UCSF ICV (current and baseline).

(Pt: patient; ECog: Everyday Cognition test; SP: study partner; ADAS: Alzheimer's Disease Assessment Scale; MOCA: Montreal Cognitive Assessment; RAVLT: Rey Auditory Verbal Learning Test; ICV: intracranial volume; SUVR: Standard Uptake Value Ratio; CDR-SB: Clinical Dementia Rating Sum of Boxes.)

with 870 instances was classified by NB, RF, SVM, and C4.5 decision tree. The data set without the clinical dementia ratio attribute is again classified with the four classifiers. This is done to evaluate the sensitivity of the classifiers even in the absence of relevant attributes. Since the C4.5 decision tree classifier outperformed the other classifiers in the multiclass classification, an ensemble approach with PSO search is proposed and tested in this work. Naïve Bayes provided better accuracy compared to RF and SVM. Ensemble feature selection is performed with a C4.5 tree having binary split and pruning with the minimum description length technique. The Random forest ensemble is implemented with 100 to 1000 trees; the out-of-bag error reduced and remained constant with 600 or more trees. The support vector machine is implemented with LIBSVM. The Radial Basis Function (RBF) kernel was used for classification, since it showed higher accuracy than the other kernels. The kernel parameters C and γ are optimized with grid search.

Given a pair of values x_i and x_j, the RBF kernel used to find the separating hyperplane is defined as follows:

    k(x_i, x_j) = exp(−γ ‖x_i − x_j‖²). (2)

It was observed that the sensitivity of J48 for each class was higher compared to NB, SVM, and RF. Hence J48 is selected as the base classifier for feature selection and classification; J48 is the base ensemble classifier used in CPEMM.

Phase II. An overview of the steps in Phases II and III is presented in Figure 2. Ensemble feature selection is performed with C4.5 having binary split and pruning. The number of iterations in the PSO search is varied in the range 60-100. Feature subsets were reduced in size as the number of iterations increased: with a smaller number of iterations the ensemble search selected subsets with more features, and as the iterations increased to find the best optimal solution, PSO generated subsets with fewer features. PSO search combined with the NB, RF, and C4.5 ensembles generated the feature subsets, which were sorted based on the merit given by the search technique. It was observed that the C4.5 ensemble selected the optimum subsets with Particle Swarm Optimisation in the minimum number of iterations compared with NB and RF; the RF ensemble returned good subsets on the 2-class dataset. Binary split at the node, as implemented in C4.5, selected relevant features with minimum iterations. The CPEMM technique is presented as Algorithm 1, following the overview of Phases II and III in Figure 2.
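The wrapper-style PSO subset search can be sketched as follows. This is a minimal, self-contained binary PSO; the merit function is a toy stand-in for the classifier-based merit used in the study, and the particle count, clamp bound, and `relevant` set are illustrative assumptions, not the paper's configuration:

```python
import math
import random

random.seed(7)

def pso_feature_search(n_features, merit, n_particles=10, iterations=60):
    """Binary PSO: each particle is a 0/1 mask over features; merit() scores a mask."""
    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    pos = [[random.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [merit(p) for p in pos]
    gbest = pbest[max(range(n_particles), key=lambda i: pbest_score[i])][:]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = random.random(), random.random()
                v = vel[i][d] + 2.0 * r1 * (pbest[i][d] - pos[i][d]) \
                              + 2.0 * r2 * (gbest[d] - pos[i][d])
                vel[i][d] = max(-4.0, min(4.0, v))  # clamp velocity
                # Sample the bit with probability sigmoid(velocity)
                pos[i][d] = 1 if random.random() < sigmoid(vel[i][d]) else 0
            score = merit(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > merit(gbest):
                    gbest = pos[i][:]
    return gbest

# Toy merit: features 0-2 are "relevant"; smaller subsets score slightly higher,
# mirroring the preference for subsets with fewer features.
relevant = {0, 1, 2}
def merit(mask):
    hits = sum(1 for f, on in enumerate(mask) if on and f in relevant)
    return hits - 0.1 * sum(mask)

best = pso_feature_search(8, merit)
```

In the paper the merit of a candidate mask comes from the wrapped ensemble classifier; here it is a deterministic toy so the search behaviour is easy to inspect.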

Phase III. For each classifier, the subset with the highest merit is considered for evaluation by the base classifier C4.5, and the accuracy is stored for further comparison.

Case 1. If subsets have equal merit, each subset is evaluated individually and also as a single subset after merging. If the merged subset does not increase accuracy, the individual subsets are selected as the relevant feature set.

Case 2. If more than 50% of the subsets at the top of the sorted subset list have the same merit, the number of iterations


Figure 1: Selection of the base classifier for ensemble feature selection (Phase I: preprocessing of the multivariate dataset, division into datasets with and without CDR, classification of each with SVM, NB, RF, and C4.5, performance evaluation, and selection of the base classifier).

is increased to get a minimal feature subset for evaluation. One limitation is that the increase in iterations may not return a reduced subset; this case should be probed further for enhancing feature selection.

Case 3. If a successive subset has a much lower merit, the search for subsets is terminated.

5-fold cross validation ensured that all the instances are used in model development [41]. Alternate records are left out and trained with the remaining records in every consecutive execution of the loop. The ensemble classifiers, implemented in the Weka tool, were run on a Pentium processor with 2.53 GHz speed, 4 GB RAM, and a 64-bit operating system. Statistical analysis of the feature selection methods and of the performance of the classifiers was implemented in R.
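The fold construction described above can be sketched as follows (a minimal illustration; the interleaved `i::k` slicing mirrors the alternate-record scheme, and the instance count 870 matches the baseline combined dataset):

```python
def k_fold_indices(n_instances, k=5):
    """Partition instance indices into k folds; each fold serves once as the test set."""
    indices = list(range(n_instances))
    folds = [indices[i::k] for i in range(k)]  # interleaved "alternate records" folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_indices(870, k=5)  # 5 train/test splits covering every instance
```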

4. Results

The results are evaluated based on the performance of the classifier by feeding the different feature sets selected by

(i) C4.5, NB, and RF coupled with PSO;
(ii) the CPEMM approach.

Accuracy, precision, and recall of the classifier are evaluated with the four datasets listed in Table 3.

4.1. Performance Measures. Any classification result could have an error rate: it may either fail to identify dementia or misclassify a normal patient as demented. It is common to describe this error rate by the terms True Positive, False Positive, True Negative, and False Negative, as follows. True Positive (TP): the result of classification is positive in the presence of the clinical abnormality. True Negative (TN): the result of classification is negative in the absence of the clinical abnormality. False Positive (FP): the result of classification is positive in the absence of the clinical abnormality. False Negative (FN): the result of classification is negative in the presence of the clinical abnormality.

Accuracy defines the overall correctness of the model. Precision defines the proportion of correct classifications obtained for each class; its value falls between 0 and 1. Recall corresponds to the True Positive rate. Consider:

    Accuracy = (no. of correct classifications) / (total no. of classifications), (3)

    Precision = TP / (TP + FP), (4)

    Recall = TP / (TP + FN). (5)
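Formulas (3)-(5) translate directly into code; the confusion counts below are hypothetical, chosen only to exercise the three measures:

```python
def accuracy(tp, tn, fp, fn):
    """Overall correctness: correct classifications over all classifications."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Fraction of positive predictions that are truly positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """True Positive rate: fraction of actual positives that are detected."""
    return tp / (tp + fn)

# Hypothetical counts for one class of the confusion matrix
tp, tn, fp, fn = 90, 80, 10, 20
acc, pre, rec = accuracy(tp, tn, fp, fn), precision(tp, fp), recall(tp, fn)
```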

Receiver operating characteristic (ROC) curves are used to analyse the prediction capability of machine learning techniques used for classification and clustering [42]. ROC analysis is a graphical representation comparing the True Positive rate and the False Positive rate in classification results. The area under the curve (AUC) characterizes the ROC of a classifier: the larger the value of AUC, the more effective the performance of the classifier. Press's Q test was used to evaluate the statistical significance of the difference in accuracy yielded by the classifiers. Given "m" samples, "n" correct classifications, and "p" groups, the test statistic was evaluated as follows:

    Q = (m − np)² / (m(p − 1)) ∼ χ². (6)
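Press's Q of (6) is a one-line computation; the sample counts below are hypothetical, and the critical value 3.841 is the chi-square value at one degree of freedom and α = 0.05:

```python
def press_q(m, n, p):
    """Press's Q statistic: m samples, n correct classifications, p groups."""
    return (m - n * p) ** 2 / (m * (p - 1))

CHI2_CRIT_1DF = 3.841  # chi-square critical value, 1 df, alpha = 0.05

# Hypothetical example: 870 samples, 3 classes, 840 classified correctly
q = press_q(870, 840, 3)
significant = q > CHI2_CRIT_1DF  # classifier performs better than chance
```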

Naïve Bayes, C4.5 Decision Tree, Random forest, and SVM yielded statistically significant accuracy. It was found that feature selection by CPEMM considerably increased the percentage of records that were correctly classified. The C4.5 classifier combined with the CPEMM methodology provided the highest statistically significant difference in performance when compared with PSO and the conventional ensemble based feature selection technique, as shown in Figure 3. A higher median of 0.987 was yielded by the proposed combination of CPEMM


Figure 2: Steps in Phase II and Phase III (Phase II: PSO search over the dataset with ensemble feature selection by NB, SVM, RF, and J48, yielding optimized feature subsets; Phase III: select subsets with minimal difference in merit, evaluate the highest-ranking feature subset, merge the next subset from the sorted list with the current set, classify with the new subset until the terminating condition holds, and finalize the best subset for Normal, MCI, and Dementia).


Algorithm CPEMM (DS, threshold)
Input: DS → dataset; merit_threshold → merit value; acc_threshold → accuracy threshold; flag = 1
Output: ffs → finalised feature set
A → accuracy vector; MS → merged set vector; bs → bootstrap vector
m → merit of subsets obtained from wrapper feature selection
N → sorted subset list in descending order
S_i → subset with highest ranking
k ← no. of subsets generated (1..K)

(1) Initialize subset s = null.
(2) Generate feature subsets with the ensemble:
        Initialise no. of iterations n
        Repeat
            Generate subset pset_k from DS
            Verify merit
            S_k ← pset_k (optimization of feature subset)
            m_k ← merit of pset_k
        Until n iterations
(3) Evaluate accuracy of the classifier with each subset:
        i = 1, j = 2
        Evaluate accuracy A_i of S_i
        Repeat
            Repeat
                diff = m_i − m_j
                MS_ij ← S_i and S_j merged
                A_MSij ← evaluate accuracy of MS_ij
                If A_i ≤ A_MSij
                    Append MS_ij to ffs
                    flag = 1
                else
                    flag = 0
                    Append S_i to ffs
                endif
                Increment j
            until flag = 0 or j ≥ k or diff ≥ threshold
            Increment i
        until A_i ≥ acc_threshold
(4) bs_i ← bootstrap vector from ffs
        Repeat
            Train classifier c_i with bs_i
            Evaluate out-of-bag error
        Until n sets are bootstrapped with 5-fold cross validation

Algorithm 1
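The Phase III merge loop of Algorithm 1 can be sketched as follows; `evaluate` is a stand-in for training the J48 base classifier on a feature set, and the attribute names, merits, and thresholds are hypothetical:

```python
def merit_merge(subsets, evaluate, merit_threshold, acc_threshold):
    """Merge merit-sorted feature subsets while the merged set keeps accuracy up.

    subsets: list of (merit, feature_set) pairs sorted by merit, descending.
    evaluate: returns classifier accuracy for a feature set (stand-in for J48).
    """
    ffs = []
    i = 0
    while i < len(subsets):
        merit_i, current = subsets[i]
        acc_i = evaluate(current)
        for j in range(i + 1, len(subsets)):
            merit_j, candidate = subsets[j]
            if merit_i - merit_j >= merit_threshold:
                break  # Case 3: a much lower merit terminates the search
            merged = current | candidate
            acc_merged = evaluate(merged)
            if acc_merged >= acc_i:       # merging helped: keep the merged set
                current, acc_i = merged, acc_merged
            else:                          # merging hurt: keep the current set
                break
        ffs.append(current)
        if acc_i >= acc_threshold:
            break
        i += 1
    return ffs

# Toy evaluation: accuracy grows with coverage of a hypothetical relevant set.
relevant = {"MMSE", "CDR", "hippocampus"}
def evaluate(fs):
    return len(fs & relevant) / len(relevant)

subsets = [(0.9, {"MMSE"}), (0.9, {"CDR"}), (0.88, {"hippocampus"}), (0.3, {"ICV"})]
result = merit_merge(subsets, evaluate, merit_threshold=0.5, acc_threshold=0.99)
```

With these toy inputs the first three subsets (equal or near-equal merit) merge into one consolidated set, and the low-merit `{"ICV"}` subset terminates the search, mirroring Cases 1 and 3.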

Figure 3: Accuracy of classifiers with features selected by the PSO and CPEMM methods (Naïve Bayes PSO, J48 PSO, RF PSO, SVM PSO, NB CPEMM, J48 CPEMM, and RF CPEMM).


Table 3: Results for the 3-class dataset with ensembles of NB, J48, RF, and SVM and with the CPEMM ensemble.

Dataset    NB Ensemble         J48 Ensemble        RF Ensemble         SVM Ensemble        NB-CPEMM            J48-CPEMM           RF-CPEMM
           Acc   Pre   Rec     Acc   Pre   Rec     Acc   Pre   Rec     Acc   Pre   Rec     Acc   Pre   Rec     Acc   Pre   Rec     Acc   Pre   Rec
NSDS       0.838 0.854 0.838   0.911 0.895 0.889   0.936 0.941 0.799   0.647 0.564 0.677   0.899 0.904 0.912   0.932 0.958 0.933   0.936 0.941 0.799
NIDS       0.756 0.737 0.556   0.936 0.941 0.937   0.545 0.645 0.545   0.667 0.685 0.734   0.834 0.895 0.738   0.904 0.934 0.907   0.545 0.645 0.545
CIDS-I     0.863 0.833 0.823   0.916 0.933 0.923   0.963 0.963 0.963   0.673 0.699 0.788   0.896 0.812 0.822   0.987 0.955 0.963   0.963 0.963 0.963
CIDS-II    0.756 0.654 0.563   0.945 0.931 0.924   0.797 0.723 0.634   0.685 0.676 0.657   0.885 0.823 0.813   0.971 0.965 0.923   0.888 0.856 0.833

NSDS: neuropsychological dataset; NIDS: neuroimaging dataset; CIDS: combined dataset. Acc: accuracy; Pre: precision; Rec: recall.


Table 4: Results with the maximal feature subset obtained by the divide and merge feature selection technique.

Classifier   Normal class           Dementia               Mild Cognitive Impairment
             Acc    Pre    Rec      Acc    Pre    Rec      Acc    Pre    Rec
J48          0.963  0.963  0.963    0.978  0.953  0.967    0.966  0.968  0.987
J48-TT       0.983  0.972  0.977    0.986  0.973  0.963    0.977  0.976  0.977
J48-CV       0.965  0.954  0.945    0.966  0.956  0.955    0.964  0.954  0.945

TT: training and testing; CV: cross validation.

Figure 4: Comparison of the area under the curve obtained by the ordinary ensemble (EN) vs. CPEMM for the four datasets (ROC area of EN and CPEMM feature selection on datasets I-IV, classified with Random forest, J48, and Naïve Bayes).

and the C4.5 ensemble classifier, while the other classifiers tested, that is, NB, RF, and SVM, had medians approximately between 0.75 and 0.88.

With C4.5 as the base classifier, the features selected by PSO in combination with the NB, SVM, C4.5, and RF ensembles are listed in Table 3. The CPEMM method is applied to merge subsets based on merit, and the resultant subsets from each dataset are evaluated with the C4.5 classifier. The accuracy obtained for each class (NL, AD, and MCI) is evaluated in each dataset. The sensitivity of the classifier to the multiclass classification using the CPEMM approach is tabulated in Table 4.

CPEMM was applied to the feature sets obtained by NB, C4.5, and RF, since the sensitivity of the SVM classifier was very low compared with the other classifiers. The nonlinear RBF kernel was the best fitting kernel for SVM, yet the accuracy obtained was below 70%. Hence the CPEMM strategy is applied and tested with NB, C4.5, and RF.

The discriminating efficiency of J48 with respect to the three classes Normal, Dementia, and Mild Cognitive Impairment is evaluated. Classification of the Normal class had higher sensitivity compared to the delineation of Mild Cognitive Impairment and Dementia; the results are given in Table 4. Ensemble feature selection returned a list of subsets with higher merit, and the CPEMM technique merged and evaluated the accuracy of successive subsets with higher merits. The efficiency of the classifier with features selected using CPEMM versus features selected with conventional ensemble feature selection is compared through ROC analysis in Figure 4. The ROC area obtained with the four datasets is plotted in the graph; the ROC of individual ensemble feature selection is plotted alongside the ROC obtained with CPEMM. Table 5

Table 5: Description of the J48 ensemble model used for the multiclass classification.

Details                                       Value
Split method                                  Binary split
Cross validation accuracy                     0.976
AUC with CV                                   0.971
Train and test accuracy                       0.986
AUC with train and test                       0.987
Common features selected by all methods       MMSE, CDR, hippocampus volume, and everyday cognition measures
Features added by CPEMM                       Entorhinal measures, CDR-SB, and Rey Auditory Verbal Learning Test-immediate

describes the features of the ensemble model used for the classification.

CPEMM yielded higher area under the curve values for all four datasets experimented with in our study.

5. Conclusion

The C4.5 classifier provided better accuracy and sensitivity in the multiclass classification of Alzheimer's Dementia. The ensemble of the C4.5 classifier selected the best fit subset for the evaluation of the three different classes, with the highest recall value of 98.7% for the class MCI. It was evident that features selected by the C4.5 algorithm further increased the performance of the Random forest and Naïve Bayes classifiers as well. The proposed ensemble with PSO search selected the minimal subset that


is needed for the discrimination of diseases. The Merit Merge approach further enhanced feature selection by identifying the effective consolidated subset that can be used for the clinical diagnosis of dementia and of Mild Cognitive Impairment that will lead to dementia. Our work also confirmed that the performance of SVM in the delineation of Mild Cognitive Impairment and Dementia is very low compared to the Random forest, Naïve Bayes, and C4.5 algorithms, as mentioned by Williams et al. [23]. Although the performance of Random forest was comparable to C4.5 and NB in the discrimination of the 2-class data, an accuracy of only approximately 75% was achieved for the 3-class problem. CPEMM was able to predict the relevant features for all datasets, especially the CIDS. The proposed split and merge ensemble approach can be applied to any 3-class classification problem. It can also be extended to the classification of high-dimensional datasets like microarray data, with preliminary feature reduction.

Classification with NB for the discrimination of Dementia and MCI in a previous study resulted in an accuracy of around 80% and a sensitivity of approximately 70% [14, 23]. Our CPEMM, based on a Bagging ensemble of J48 with the Merit Merge technique, yielded a higher accuracy of 98.7% in the train and test method [43]. The bagging approach, learning from more than one classifier, found the minimal subset for effective diagnostics. The Merit Merge approach found all possible highly relevant subsets that contribute towards the multiclass classification. The proposed approach yielded a statistically significant difference, with a mean area under the curve of approximately 0.977 in the multivariate classification of Dementia.

Bagging ensemble models provide a promising, statistically significant machine learning method with a low error rate for disease diagnosis. The proposed methodology can be applied for disease state prediction even with class-imbalanced datasets.

Disclosure

T. R. Sivapriya is the registered user of ADNI. Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this work.

Acknowledgments

The authors thank Professor Kumanan, Head of the Department of Psychiatry, Madurai Medical College, Tamil Nadu, India, for many helpful discussions in the study and in the interpretation of data for building the classifier. Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd. and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

[1] M. Prince, E. Albanese, M. Guerchet, and M. Prina, Alzheimer Report: Dementia and Risk Reduction: An Analysis of Protective and Modifiable Factors, Alzheimer's Disease International, London, UK, 2014, http://www.alz.co.uk/research/WorldAlzheimerReport2014.pdf.

[2] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro, and C. P. Ferri, "The global prevalence of dementia: a systematic review and metaanalysis," Alzheimer's & Dementia, vol. 9, no. 1, pp. 63.e2–75.e2, 2013.

[3] Y. Xia, "GA and AdaBoost-based feature selection and combination for automated identification of dementia using FDG-PET imaging," in Intelligent Science and Intelligent Data Engineering, vol. 7202 of Lecture Notes in Computer Science, pp. 128–135, Springer, Berlin, Germany, 2012.

[4] B. Magnin, L. Mesrob, S. Kinkingnehun et al., "Support vector machine-based classification of Alzheimer's disease from whole-brain anatomical MRI," Neuroradiology, vol. 51, no. 2, pp. 73–83, 2009.

[5] M. C. Tierney, R. Moineddin, and I. McDowell, "Prediction of all-cause dementia using neuropsychological tests within 10 and 5 years of diagnosis in a community-based sample," Journal of Alzheimer's Disease, vol. 22, no. 4, pp. 1231–1240, 2010.

[6] M. N. Rossor, N. C. Fox, C. J. Mummery, J. M. Schott, and J. D. Warren, "The diagnosis of young-onset dementia," The Lancet Neurology, vol. 9, no. 8, pp. 793–806, 2010.

[7] R. M. Chapman, M. Mapstone, J. W. McCrary et al., "Predicting conversion from mild cognitive impairment to Alzheimer's disease using neuropsychological tests and multivariate methods," Journal of Clinical and Experimental Neuropsychology, vol. 33, no. 2, pp. 187–199, 2011.

[8] A. Pozueta, E. Rodríguez-Rodríguez, J. L. Vázquez-Higuera et al., "Detection of early Alzheimer's disease in MCI patients by the combination of MMSE and an episodic memory test," BMC Neurology, vol. 11, article 78, 2011.

[9] M. Quintana, J. Guradia, G. Sánchez-Benavides et al., "Using artificial neural networks in clinical neuropsychology: high performance in mild cognitive impairment and Alzheimer's disease," Journal of Clinical and Experimental Neuropsychology, vol. 34, no. 2, pp. 195–208, 2012.

[10] R. Cuingnet, E. Gerardin, J. Tessieras et al., "Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database," NeuroImage, vol. 56, no. 2, pp. 766–781, 2011.

[11] M. Q. Wilks, H. Protas, M. Wardak et al., "Automated VOI analysis in FDDNP PET using structural warping: validation through classification of Alzheimer's disease patients," International Journal of Alzheimer's Disease, vol. 2012, Article ID 512069, 8 pages, 2012.

[12] S. Klöppel, C. M. Stonnington, J. Barnes et al., "Accuracy of dementia diagnosis: a direct comparison between radiologists and a computerized method," Brain, vol. 131, no. 11, pp. 2969–2974, 2008.

[13] A. Larner, "Comparing diagnostic accuracy of cognitive screening instruments: a weighted comparison approach," Dementia and Geriatric Cognitive Disorders Extra, vol. 3, no. 1, pp. 60–65, 2013.

[14] J. Maroco, D. Silva, A. Rodrigues, M. Guerreiro, I. Santana, and A. de Mendonça, "Data mining methods in the prediction of dementia: a real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests," BMC Research Notes, vol. 4, article 299, 14 pages, 2011.

[15] W. Yu, T. Liu, R. Valdez, M. Gwinn, and M. J. Khoury, "Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes," BMC Medical Informatics and Decision Making, vol. 10, no. 1, article 16, 2010.

[16] P. R. Hachesu, M. Ahmadi, S. Alizadeh, and F. Sadoughi, "Use of data mining techniques to determine and predict length of stay of cardiac patients," Healthcare Informatics Research, vol. 19, no. 2, pp. 121–129, 2013.

[17] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16–18, pp. 3273–3283, 2010.

[18] S. Maldonado, R. Weber, and J. Basak, "Simultaneous feature selection and classification using kernel-penalized support vector machines," Information Sciences, vol. 181, no. 1, pp. 115–128, 2011.

[19] J. Geraci, M. Dharsee, P. Nuin et al., "Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems," Bioinformatics, vol. 30, no. 5, pp. 712–718, 2014.

[20] X. Chen, M. Wang, and H. Zhang, "The use of classification trees for bioinformatics," Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55–63, 2011.

[21] M. L. Calle, V. Urrea, A.-L. Boulesteix, and N. Malats, "AUC-RF: a new strategy for genomic profiling with random forest," Human Heredity, vol. 72, no. 2, pp. 121–132, 2011.

[22] K. R. Gray, P. Aljabar, R. A. Heckemann, A. Hammers, and D. Rueckert, "Random forest-based similarity measures for multi-modal classification of Alzheimer's disease," NeuroImage, vol. 65, pp. 167–175, 2013.

[23] J. A. Williams, A. Weakley, D. J. Cook, and M. Schmitter-Edgecombe, "Machine learning techniques for diagnostic differentiation of mild cognitive impairment and dementia," in Proceedings of the 27th AAAI Conference on Artificial Intelligence Workshop, 2013.

[24] B. Chizi and O. Maimon, "Dimension reduction and feature selection," in Data Mining and Knowledge Discovery Handbook, pp. 93–111, 2005.

[25] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157–1182, 2003.

[26] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[27] O. Uncu and I. B. Turksen, "A novel feature selection approach: combining feature wrappers and filters," Information Sciences, vol. 177, no. 2, pp. 449–466, 2007.

[28] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 81–86, Seoul, The Republic of Korea, May 2001.

[29] H. H. Inbarani, A. T. Azar, and G. Jothi, "Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis," Computer Methods and Programs in Biomedicine, vol. 113, no. 1, pp. 175–185, 2014.

[30] P. Bühlmann and T. Hothorn, "Boosting algorithms: regularization, prediction and model fitting," Statistical Science, vol. 22, no. 4, pp. 477–505, 2007.

[31] L. Rokach, Pattern Classification Using Ensemble Methods, World Scientific, Singapore, 2009.

[32] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[34] A. Liaw and M. Wiener, "Classification and regression by randomForest," R News, vol. 2, no. 3, pp. 18–22, 2002.

[35] S. Janitza, C. Strobl, and A. L. Boulesteix, "An AUC-based permutation variable importance measure for random forests," BMC Bioinformatics, vol. 14, article 119, 2013.

[36] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, Boston, Mass, USA, 1993.

[37] S. B. Kotsiantis, "Supervised machine learning: a review of classification techniques," Informatica, vol. 31, no. 3, pp. 249–268, 2007.

[38] H. Azami, K. Mohammadi, and B. Bozorgtabar, "An improved signal segmentation using moving average and Savitzky-Golay filter," Journal of Signal and Information Processing, vol. 3, no. 1, pp. 39–44, 2012.

[39] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, Hoboken, NJ, USA, 2nd edition, 2008.

[40] G. M. D'Angelo, J. Luo, and C. Xiong, "Missing data methods for partial correlation," Journal of Biometrics & Biostatistics, vol. 3, no. 8, p. 155, 2012.

[41] S. Arlot and A. Celisse, "A survey of cross-validation procedures for model selection," Statistics Surveys, vol. 4, pp. 40–79, 2010.

[42] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.

[43] J. C. Xu, L. Sun, Y. P. Gao, and T. H. Xu, "An ensemble feature selection technique for cancer recognition," Bio-Medical Materials and Engineering, vol. 24, no. 1, pp. 1001–1008, 2014.




Table 1: Datasets used in the study.

Dataset                      Instances   Attributes   Classes   AD    Normal   MCI
Neuropsychological dataset   750         48           3         150   200      400
Neuroimaging dataset         650         108          2         250   250      200
Baseline combined data       870         65           3         140   280      450
Combined dataset             750         40           2         150   200      400

2.3.2. Random Forest. Random forest trees, introduced by Breiman [33], are a method of building a forest of uncorrelated trees with randomized node optimization and bagging. The out-of-bag error is used as an estimate of the generalization error. Random forest (RF) is also used to measure variable importance through permutation [34]. The general technique of bootstrap aggregation is applied in the training algorithm. In a Random forest implementation, only the number of trees in the forest and the number of attributes used for prediction need to be defined [35].
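The bootstrap-aggregation step and its out-of-bag sets can be sketched as follows (a minimal illustration, not the Weka implementation; the instance count 650 matches the neuroimaging dataset, and the tree count is illustrative):

```python
import random

random.seed(42)

def bootstrap_oob(n_instances, n_trees=100):
    """Draw bootstrap samples; instances left out of a sample form its out-of-bag set."""
    samples, oob_sets = [], []
    for _ in range(n_trees):
        # Sample n_instances indices with replacement
        sample = [random.randrange(n_instances) for _ in range(n_instances)]
        samples.append(sample)
        # Out-of-bag instances: never drawn for this tree
        oob_sets.append(set(range(n_instances)) - set(sample))
    return samples, oob_sets

samples, oob_sets = bootstrap_oob(650, n_trees=100)
# Roughly (1 - 1/n)^n ~ e^-1 ~ 36.8% of instances are out of bag per tree;
# each tree is evaluated on its own OOB set to estimate generalization error.
avg_oob_fraction = sum(len(s) for s in oob_sets) / (100 * 650)
```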

2.3.3. C4.5. The C4.5 algorithm is used to generate a Decision Tree for classification problems [36]. The Decision Tree is built using the entropy values obtained from the given data. C4.5 uses binary split or multivalued split in the selection of attributes. The performance of the algorithm varies between cross validation and the train-test method; the average accuracy across several folds should be taken as the evaluation measure. As with all other classifiers, precision and recall increase with more records in the training dataset. J48 is the Java implementation of C4.5 in the Weka tool. C4.5 is an improvement of the ID3 algorithm and is capable of handling both discrete and continuous values. Another advantage is that fields with missing values need not be imputed with any values; rather, such a field is simply not used for the calculation of entropy and information gain.
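The entropy computation that drives C4.5's split selection can be sketched as follows (the class counts for the hypothetical binary split are illustrative, not taken from the ADNI data):

```python
import math

def entropy(class_counts):
    """Shannon entropy of a class distribution, in bits."""
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)

def information_gain(parent_counts, child_count_lists):
    """Entropy reduction from a (binary or multiway) split, as used by C4.5."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child) for child in child_count_lists)
    return entropy(parent_counts) - weighted

# Hypothetical binary split of 100 records over three classes (NL, MCI, AD)
gain = information_gain([40, 40, 20], [[40, 10, 0], [0, 30, 20]])
```

The split with the highest gain (or gain ratio, in full C4.5) is chosen at each node.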

2.3.4. Naïve Bayes. The Naïve Bayes classifier is a statistical technique [37] that is applied for classification in data mining problems. It is based on the probabilistic outcomes of the given data. It is a supervised learning technique, and hence prior knowledge can be incorporated in its learning process. This makes it well suited for medical diagnostics, where the knowledge of the domain expert can be incorporated beforehand in order to achieve higher performance.
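The role of priors in Naïve Bayes can be illustrated with a direct application of Bayes' rule; the prior and likelihood values below are hypothetical, not estimates from the ADNI data:

```python
def naive_bayes_posterior(priors, likelihoods):
    """Posterior class probabilities from priors and per-class likelihoods of one record."""
    unnorm = {c: priors[c] * likelihoods[c] for c in priors}
    z = sum(unnorm.values())  # normalizing constant
    return {c: v / z for c, v in unnorm.items()}

# Hypothetical priors (e.g., from expert knowledge of class prevalence)
# and likelihoods of one observed test score under each class.
priors = {"NL": 0.3, "MCI": 0.5, "AD": 0.2}
likelihoods = {"NL": 0.1, "MCI": 0.6, "AD": 0.3}
posterior = naive_bayes_posterior(priors, likelihoods)
```

The predicted class is the one with the highest posterior; here the strong MCI likelihood combined with its prior dominates.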

3. Experimental Design

The reason for the selection of the C4.5 classifier is that it provides better accuracy when compared with Random forest, Naïve Bayes, and Support Vector Machine in multiclass classification problems. Ensemble feature selection is done with C4.5, SVM, RF, and NB, followed by classification with C4.5. AdaBoost has the disadvantage of overfitting, and its model construction involves more time and complexity; hence the bagging approach is selected for the multiclass dataset classification.

3.1. Dataset. Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial Magnetic Resonance Imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of Mild Cognitive Impairment (MCI) and early Alzheimer's disease (AD). Table 1 shows the details of the data sets used in the study. Table 2 lists the attributes in the dataset.

3.2. Preprocessing. Preprocessing precedes classification, for noise removal and missing data management. Data was partitioned based on the month of visit, and records in each partition were clustered based on the diagnosis in that visit. Data was normalized with z-score normalization, and the values of selected attributes were normalized to a range from 0 to 1. In predicting the length of stay of patients, class-wise mean values of the respective classes were used to replace numeric missing values, and the mode of the different classes replaced nominal or ordinal missing values. Moving average (MA) operators are used for handling missing values in time series data; MA has been applied to medical data and nonstationary signals as well [38]. The expectation maximization (EM) algorithm was used to impute missing data in one study [39]; EM has already been applied in the analysis of Alzheimer's data and found to be more effective than multiple imputation methods [40]. Attributes with more than 40% missing data were removed from the attribute set to avoid misclassification and bias.
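The z-score normalization and class-wise mean imputation steps above can be sketched as follows (illustrative Python, not the authors' code; missing values are represented as None):

```python
from statistics import mean, stdev

def zscore(values):
    """z-score normalization: (x - mean) / standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def classwise_mean_impute(values, labels):
    """Replace each missing (None) numeric value with the mean of its class."""
    class_means = {}
    for c in set(labels):
        present = [v for v, l in zip(values, labels) if l == c and v is not None]
        class_means[c] = mean(present)
    return [class_means[l] if v is None else v for v, l in zip(values, labels)]
```

Using the class-wise mean rather than the global mean keeps the imputed value consistent with the record's diagnostic group, which matters when the groups have very different score distributions.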

3.3. Ensemble Feature Selection. There are 3 phases in the proposed Merit Merge feature selection technique. The base classifier to be applied for feature selection is determined in Phase I by comparing the classifiers reported in the literature with the ensemble classifiers. After the identification of the base classifier, PSO search is coupled with ensemble classifiers to identify feature sets with higher merit. The ensemble model is trained and tested with the feature set to obtain the optimal subset that can be used for the multinomial classification.

Phase I. This phase determines the base classifier that can be used for modelling the ensemble classification model. Clinical dementia ratio is a key attribute in the discrimination of NL, MCI, and AD; hence that key attribute is removed and the performance of the classifiers is compared. It was noted in a previous study that classification by NB outperformed SVM, so those classifiers are compared with C4.5 in the classification of our multiclass problem. Figure 1 shows the steps in the selection of the base classifier. The data set containing both neuropsychological test data and neuroimaging measures

4 Computational and Mathematical Methods in Medicine

Table 2: List of attributes derived from neuropsychological tests and neuroimaging measures.

Neuropsychological and neuroimaging measures: average FDG-PET of angular, temporal, and posterior cingulate; average PIB SUVR of frontal cortex, anterior cingulate, precuneus cortex, and parietal cortex; average AV45 SUVR of frontal, anterior cingulate, precuneus, and parietal cortex relative to the cerebellum; clinical dementia ratio-SB; ADAS 11; ADAS 13; Mini Mental Scale Examination score; RAVLT (forgetting); RAVLT (5 sum); Functional Assessment Questionnaire; MOCA; Pt ECog (Memory, Language, Visual, Plan, Organ, Div atten, Total); SP ECog (Memory, Language, Visual, Plan, Organ, Attention, Total); UCSF ventricles, hippocampus, whole brain, entorhinal, fusiform, and temporal measures; UCSF ICV.

Corresponding baseline measures: Mini Mental State Examination-baseline; ventricles measure; hippocampus-baseline volume; whole brain-baseline volume; UCSF entorhinal-, fusiform-, and Med Temp-baseline volumes; UCSF ICV-baseline; MOCA-baseline; Pt ECog (Memory, Language, VisSpat, Plan, Organ, Div atten, Total)-baseline; SP ECog (Mem, Lang, VisSpat, Plan, Organ, Div atten, Total)-baseline; average FDG-PET of angular, temporal, and posterior cingulate at baseline; average PIB SUVR of frontal cortex, anterior cingulate, precuneus cortex, and parietal cortex at baseline; average AV45 (PET ligand) SUVR of frontal, anterior cingulate, precuneus, and parietal cortex relative to the cerebellum at baseline; CDR-SB; ADAS 11 baseline; ADAS 13 baseline; RAVLT (forgetting) baseline; RAVLT (5 sum) baseline.

Pt: patient; ECog: Everyday Cognition test; SP: study partner; ADAS: Alzheimer's Disease Assessment Scale; MOCA: Montreal Cognitive Assessment; RAVLT: Rey Auditory Verbal Learning Test; ICV: intracranial volume; SUVR: Standard Uptake Value Ratio; CDR-SB: Clinical Dementia Rating Sum of Boxes.

with 870 instances was classified by NB, RF, SVM, and the C4.5 decision tree. The data set without the clinical dementia ratio attribute was again classified with the four classifiers; this was done to evaluate the sensitivity of the classifiers even in the absence of relevant attributes. Since the C4.5 decision tree classifier outperformed the other classifiers in the multiclass classification, an ensemble approach with PSO search is proposed and tested in this work. Naïve Bayes provided better accuracy than RF and SVM. Ensemble feature selection is performed with a C4.5 tree having binary split and pruning with the minimum description length technique. The Random forest ensemble is implemented with 100 to 1000 trees; the out-of-bag error decreased and remained constant with 600 or more trees. The support vector machine is implemented with LIBSVM. The Radial Basis Function (RBF) kernel was used for classification, as it showed higher accuracy than other kernels. The kernel parameters C and γ are optimized with grid search.

Given a pair of values x_i, x_j, the RBF kernel used to find the separating hyperplane is defined as follows:

k(x_i, x_j) = exp(−γ ‖x_i − x_j‖²). (2)
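The RBF kernel of equation (2) is a one-liner; the sketch below is illustrative only (LIBSVM computes this internally during training):

```python
import math

def rbf_kernel(xi, xj, gamma):
    """RBF kernel k(xi, xj) = exp(-gamma * ||xi - xj||^2), as in equation (2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)
```

A larger γ makes the kernel more local (similarity decays faster with distance), which is why γ is tuned jointly with C in the grid search.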

It was observed that the sensitivity of J48 for each class was higher than that of NB, SVM, and RF. Hence J48 is selected as the base classifier for feature selection and classification; J48 is the base ensemble classifier used in CPEMM.

Phase II. An overview of the steps in Phases II and III is presented in Figure 2. Ensemble feature selection is performed with C4.5 having binary split and pruning. The number of iterations in the PSO search was experimented with in the range 60–100. Feature subsets were reduced in size as the number of iterations increased: with a smaller number of iterations, the ensemble search selected subsets with more features, and as the iterations increased to find the best optimal solution, PSO generated subsets with fewer features. PSO search combined with the NB, RF, and C4.5 ensembles generated the feature subsets, which were sorted based on the merit given by the search technique. It was observed that the C4.5 ensemble selected the optimum subsets with Particle Swarm Optimisation in the minimum number of iterations compared with NB and RF; the RF ensemble returned good subsets on the 2-class dataset. The binary split at the node implemented in C4.5 selected relevant features with minimum iterations. The CPEMM technique is presented as Algorithm 1, following the overview of Phases II and III in Figure 2.

Phase III. For each classifier, the subset with the highest merit is considered for evaluation by the base classifier C4.5, and the accuracy is stored for further comparison.

Case 1. If subsets have equal merit, each subset is evaluated individually and also as a single subset after merging. If the merged subset does not increase accuracy, the individual subsets are selected as the relevant feature set.

Case 2. If more than 50% of the subsets at the top of the sorted subset list have the same merit, the number of iterations

Figure 1: Selection of the base classifier for ensemble feature selection. In Phase I, the multivariate dataset is preprocessed and divided into a dataset with CDR and a dataset without CDR; each is classified with SVM, NB, RF, and C4.5, the performance is evaluated, and the base classifier for ensemble feature selection is selected.

is increased to get a minimal feature subset for evaluation. One limitation is that, if the increase in iterations does not return a reduced subset, this case should be probed further to enhance feature selection.

Case 3. If there is a successive subset with much lower merit, the search for subsets is terminated.

5-fold cross validation ensured that all the instances are used in model development [41]. Alternate records are left out and the model is trained with the remaining records in every consecutive execution of the loop. The ensemble classifiers, implemented in the Weka tool, were run on a Pentium processor with 2.53 GHz speed, 4 GB RAM, and a 64-bit operating system. Statistical analysis of the feature selection methods and of the performance of the classifiers was implemented in R.
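The k-fold partitioning used above can be sketched as an index generator (illustrative only; the study used the Weka implementation):

```python
def kfold_splits(n, k=5):
    """Index splits for k-fold cross validation: every record appears in
    exactly one test fold, so all instances are used for evaluation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f in range(k) if f != held_out for i in folds[f]]
        yield train, test
```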

4. Results

The results are evaluated based on the performance of the classifier when fed the different feature sets selected by

(i) C4.5, NB, and RF coupled with PSO;
(ii) the CPEMM approach.

Accuracy, precision, and recall of the classifier are evaluated with the four datasets listed in Table 3.

4.1. Performance Measures. Any classification result can have an error rate: the classifier may fail to identify dementia or may misclassify a normal subject as demented. It is common to describe this error rate with the terms True Positive, False Positive, True Negative, and False Negative, as follows. True Positive (TP): the result of classification is positive in the presence of the clinical abnormality. True Negative (TN): the result of classification is negative in the absence of the clinical abnormality. False Positive (FP): the result of classification is positive in the absence of the clinical abnormality. False Negative (FN): the result of classification is negative in the presence of the clinical abnormality.

Accuracy defines the overall correctness of the model. Precision defines the proportion of correct classifications obtained for each class; its value falls between 0 and 1. Recall corresponds to the True Positive rate. Consider

Accuracy = (no. of correct classifications) / (total no. of classifications), (3)

Precision = TP / (TP + FP), (4)

Recall = TP / (TP + FN). (5)
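The three measures can be computed directly from confusion-matrix counts; a minimal sketch (illustrative only):

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall
```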

Receiver operating characteristic (ROC) curves are used to analyse the prediction capability of machine learning techniques used for classification and clustering [42]. ROC analysis is a graphical representation comparing the True Positive rate and the False Positive rate of classification results. The area under the curve (AUC) characterizes the ROC of a classifier: the larger the value of the AUC, the more effective the performance of the classifier. Press's Q test was used to evaluate the statistical significance of the difference in accuracy yielded by the classifiers. Given "m" samples, "n" correct classifications, and "p" groups, the test statistic is evaluated as follows:

Q = (m − np)² / (m(p − 1)) ∼ χ². (6)
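Equation (6) is straightforward to compute; the comparison against the χ² critical value is omitted in this illustrative sketch:

```python
def press_q(m, n, p):
    """Press's Q statistic, Q = (m - n*p)^2 / (m * (p - 1)), which is
    compared against a chi-square distribution to test significance."""
    return (m - n * p) ** 2 / (m * (p - 1))
```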

Naïve Bayes, the C4.5 Decision Tree, Random forest, and SVM yielded statistically significant accuracy. It was found that feature selection by CPEMM considerably increased the percentage of records that were correctly classified. The C4.5 classifier combined with the CPEMM methodology provided the highest statistically significant difference in performance when compared with PSO and the conventional ensemble-based feature selection technique, as shown in Figure 3. A higher median of 0.987 was yielded by the proposed combination of CPEMM


Figure 2: Steps in Phase II and Phase III. In Phase II, PSO search over ensemble feature selection (NB, SVM, RF, and J48) produces optimized feature subsets. In Phase III, subsets with minimal difference in merit are selected; the highest-ranked feature subset is evaluated, the next subset from the sorted list is merged with the current set, the classifier accuracy is reevaluated, and the loop repeats until the terminating condition holds, finalizing the best subset for Normal, MCI, and Dementia.


Algorithm CPEMM (DS, merit threshold, acc threshold)
Input: DS → dataset; merit threshold → merit value; acc threshold → accuracy threshold; flag = 1.
Output: ffs → finalised feature set.
A → accuracy vector; MS → merged-set vector; bs → bootstrap vector;
m → merits of subsets obtained from wrapper feature selection;
N → subset list sorted in descending order of merit;
S_i → subset with the ith highest ranking;
k → number of subsets generated (1..K).

(1) Initialize subset s = null.
(2) Generate feature subsets with the ensemble:
      Initialise number of iterations n
      Repeat
          Generate subset pset_k from DS
          Verify merit; S_k ← pset_k
          Optimise the feature subset; m_k ← merit of pset_k
      Until n iterations
(3) Evaluate accuracy of the classifier with the subsets:
      i = 1; j = 2
      Evaluate accuracy A_i of S_i
      Repeat
          Repeat
              diff = m_i − m_j
              MS_ij ← S_i and S_j merged
              A_MSij ← evaluate accuracy of MS_ij
              If A_i ≤ A_MSij
                  Append MS_ij to ffs; flag = 1
              else
                  flag = 0; append S_i to ffs
              endif
              Increment j
          until flag = 0 or j ≥ k or diff ≥ merit threshold
          Increment i
      until A_i ≥ acc threshold
(4) bs_i ← bootstrap vector from ffs
      Repeat
          Train classifier c_i with bs_i
          Evaluate out-of-bag error
      Until n sets are bootstrapped with 5-fold cross validation

Algorithm 1: CPEMM feature selection.
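A simplified, runnable rendering of the merge loop at the heart of Algorithm 1 (illustrative Python only: `evaluate` stands in for training and scoring the C4.5 ensemble, and the paper's exact termination rules are approximated):

```python
def merit_merge(subsets, merits, evaluate, merit_threshold):
    """Greedy Merit Merge sketch: walk the merit-sorted subset list, merging
    the next subset into the current one while classifier accuracy does not
    drop and the merit gap to the previous subset stays below the threshold."""
    order = sorted(range(len(subsets)), key=lambda i: merits[i], reverse=True)
    current = set(subsets[order[0]])
    best_acc = evaluate(current)
    prev_merit = merits[order[0]]
    for idx in order[1:]:
        if prev_merit - merits[idx] >= merit_threshold:
            break                          # successor has much lower merit: stop
        merged = current | set(subsets[idx])
        acc = evaluate(merged)
        if acc >= best_acc:                # merging helped (or tied): keep merged set
            current, best_acc = merged, acc
            prev_merit = merits[idx]
        else:                              # merging hurt accuracy: keep previous set
            break
    return sorted(current), best_acc
```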

Figure 3: Accuracy of classifiers with features selected by the PSO and CPEMM methods (Naïve Bayes-PSO, J48-PSO, RF-PSO, SVM-PSO, NB-CPEMM, J48-CPEMM, and RF-CPEMM).


Table 3: Results for the 3-class dataset with ensembles of NB, J48, RF, and SVM and with the CPEMM ensemble (values are Acc/Pre/Rec).

| Dataset | NB Ensemble | J48 Ensemble | RF Ensemble | SVM Ensemble | NB-CPEMM | J48-CPEMM | RF-CPEMM |
| NSDS | 0.838/0.854/0.838 | 0.911/0.895/0.889 | 0.936/0.941/0.799 | 0.647/0.564/0.677 | 0.899/0.904/0.912 | 0.932/0.958/0.933 | 0.936/0.941/0.799 |
| NIDS | 0.756/0.737/0.556 | 0.936/0.941/0.937 | 0.545/0.645/0.545 | 0.667/0.685/0.734 | 0.834/0.895/0.738 | 0.904/0.934/0.907 | 0.545/0.645/0.545 |
| CIDS-I | 0.863/0.833/0.823 | 0.916/0.933/0.923 | 0.963/0.963/0.963 | 0.673/0.699/0.788 | 0.896/0.812/0.822 | 0.987/0.955/0.963 | 0.963/0.963/0.963 |
| CIDS-II | 0.756/0.654/0.563 | 0.945/0.931/0.924 | 0.797/0.723/0.634 | 0.685/0.676/0.657 | 0.885/0.823/0.813 | 0.971/0.965/0.923 | 0.888/0.856/0.833 |

NSDS: neuropsychological dataset; NIDS: neuroimaging dataset; CIDS: combined dataset. Acc: accuracy; Pre: precision; Rec: recall.


Table 4: Results with the maximal feature subset obtained by the divide-and-merge feature selection technique (values are Acc/Pre/Rec).

| Classifier | Normal class | Dementia | Mild Cognitive Impairment |
| J48 | 0.963/0.963/0.963 | 0.978/0.953/0.967 | 0.966/0.968/0.987 |
| J48-TT | 0.983/0.972/0.977 | 0.986/0.973/0.963 | 0.977/0.976/0.977 |
| J48-CV | 0.965/0.954/0.945 | 0.966/0.956/0.955 | 0.964/0.954/0.945 |

TT: training and testing; CV: cross validation.

Figure 4: ROC analysis for the datasets with CPEMM: comparison of the area under the curve obtained by the ordinary ensemble (EN) versus CPEMM on datasets I–IV, for Random forest, J48, and Naïve Bayes.

and the C4.5 ensemble classifier, while the other classifiers tested, that is, NB, RF, and SVM, had medians approximately between 0.75 and 0.88.

With C4.5 as the base classifier, the features selected by PSO in combination with the NB, SVM, C4.5, and RF ensembles are listed in Table 3. The CPEMM method is applied to merge subsets based on merit, and the resultant subsets from each dataset are evaluated with the C4.5 classifier. The accuracy obtained for each class (NL, AD, and MCI) is evaluated in each dataset. The sensitivity of the classifier in the multiclass classification using the CPEMM approach is tabulated in Table 4.

CPEMM was applied to the feature sets obtained by NB, C4.5, and RF, since the sensitivity of the SVM classifier was very low compared with the other classifiers. The nonlinear RBF kernel was the best fitting kernel for SVM, yet the accuracy obtained was below 70%. Hence the CPEMM strategy is applied and tested with NB, C4.5, and RF.

The discriminating efficiency of J48 with respect to the three classes Normal, Dementia, and Mild Cognitive Impairment is evaluated. Classification of the Normal class had higher sensitivity than the delineation of Mild Cognitive Impairment and Dementia; the results are given in Table 4. Ensemble feature selection returned a list of subsets with higher merit, and the CPEMM technique merged and evaluated the accuracy of successive subsets with higher merits. The efficiency of the classifier with features selected using CPEMM versus features selected with conventional ensemble feature selection is compared through the ROC analysis in Figure 4, where the ROC area obtained with the four datasets is plotted for individual ensemble feature selection and for CPEMM. Table 5

Table 5: Description of the J48 ensemble model used for the multiclass classification.

| Details | Value |
| Split method | Binary split |
| Cross validation accuracy | 0.976 |
| AUC with CV | 0.971 |
| Train and test accuracy | 0.986 |
| AUC with train and test | 0.987 |
| Common features selected by all methods | MMSE, CDR, hippocampus volume, and everyday cognition measures |
| Features added by CPEMM | Entorhinal measures, CDR-SB, and Rey Auditory Verbal Learning Test-immediate |

describes the features of the ensemble model for classification.

CPEMM yielded higher area under the curve values for all four datasets experimented with in our study.

5. Conclusion

The C4.5 classifier provided better accuracy and sensitivity in the multiclass classification of Alzheimer's Dementia. The ensemble of C4.5 classifiers selected the best-fit subset for the evaluation of the three different classes, with the highest Recall value of 0.987 for the class MCI. It was evident that the features selected by the C4.5 algorithm further increased the performance of the Random forest and Naïve Bayes classifiers as well. The proposed ensemble with PSO search selected the minimal subset that

10 Computational and Mathematical Methods in Medicine

is needed for the discrimination of the diseases. The Merit Merge approach further enhanced feature selection by identifying the effective consolidated subset that could be used for the clinical diagnosis of dementia and of Mild Cognitive Impairment that will lead to dementia. Our work also confirmed that the performance of SVM in the delineation of Mild Cognitive Impairment and Dementia is very low compared to the Random forest, Naïve Bayes, and C4.5 algorithms, as mentioned by Williams et al. [23]. Although the performance of Random forest was comparable to C4.5 and NB in the discrimination of the 2-class data, it provided an accuracy of only approximately 75% for the 3-class problem. CPEMM was able to predict the relevant features for all datasets, especially the CIDS. The proposed split-and-merge ensemble approach can be applied to any 3-class classification problem, and it can be extended to the classification of high-dimensional datasets such as microarray data, given preliminary feature reduction.

Classification with NB for the discrimination of Dementia and MCI in a previous study resulted in accuracy of around 80% and sensitivity of approximately 70% [14, 23]. Our CPEMM, based on a Bagging ensemble of J48 with the Merit Merge technique, yielded a higher accuracy of 98.7% with the train and test method [43]. The bagging approach, learning from more than one classifier, found the minimal subset for effective diagnostics, and the Merit Merge approach found all the highly relevant subsets that contribute towards the multiclass classification. The proposed approach yielded a statistically significant difference, with a mean area under the curve of approximately 0.977 in the multivariate classification of Dementia.

Bagging ensemble models provide a promising, statistically significant machine learning method with a low error rate for disease diagnosis. The proposed methodology can be applied to disease state prediction even with class-imbalanced datasets.

Disclosure

T. R. Sivapriya is a registered user of ADNI. Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this work.

Acknowledgments

The authors thank Professor Kumanan, Head of the Department of Psychiatry, Madurai Medical College, Tamil Nadu, India, for many helpful discussions in the study and in the interpretation of data for building the classifier. Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

[1] M. Prince, E. Albanese, M. Guerchet, and M. Prina, World Alzheimer Report 2014: Dementia and Risk Reduction: An Analysis of Protective and Modifiable Factors, Alzheimer's Disease International, London, UK, 2014, http://www.alz.co.uk/research/WorldAlzheimerReport2014.pdf.

[2] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro, and C. P. Ferri, "The global prevalence of dementia: a systematic review and metaanalysis," Alzheimer's & Dementia, vol. 9, no. 1, pp. 63.e2–75.e2, 2013.

[3] Y. Xia, "GA and AdaBoost-based feature selection and combination for automated identification of dementia using FDG-PET imaging," in Intelligent Science and Intelligent Data Engineering, vol. 7202 of Lecture Notes in Computer Science, pp. 128–135, Springer, Berlin, Germany, 2012.

[4] B. Magnin, L. Mesrob, S. Kinkingnehun et al., "Support vector machine-based classification of Alzheimer's disease from whole-brain anatomical MRI," Neuroradiology, vol. 51, no. 2, pp. 73–83, 2009.

[5] M. C. Tierney, R. Moineddin, and I. McDowell, "Prediction of all-cause dementia using neuropsychological tests within 10 and 5 years of diagnosis in a community-based sample," Journal of Alzheimer's Disease, vol. 22, no. 4, pp. 1231–1240, 2010.

[6] M. N. Rossor, N. C. Fox, C. J. Mummery, J. M. Schott, and J. D. Warren, "The diagnosis of young-onset dementia," The Lancet Neurology, vol. 9, no. 8, pp. 793–806, 2010.

[7] R. M. Chapman, M. Mapstone, J. W. McCrary et al., "Predicting conversion from mild cognitive impairment to Alzheimer's disease using neuropsychological tests and multivariate methods," Journal of Clinical and Experimental Neuropsychology, vol. 33, no. 2, pp. 187–199, 2011.

[8] A. Pozueta, E. Rodríguez-Rodríguez, J. L. Vázquez-Higuera et al., "Detection of early Alzheimer's disease in MCI patients by the combination of MMSE and an episodic memory test," BMC Neurology, vol. 11, article 78, 2011.

[9] M. Quintana, J. Guardia, G. Sánchez-Benavides et al., "Using artificial neural networks in clinical neuropsychology: high performance in mild cognitive impairment and Alzheimer's disease," Journal of Clinical and Experimental Neuropsychology, vol. 34, no. 2, pp. 195–208, 2012.

[10] R. Cuingnet, E. Gerardin, J. Tessieras et al., "Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database," NeuroImage, vol. 56, no. 2, pp. 766–781, 2011.

[11] M. Q. Wilks, H. Protas, M. Wardak et al., "Automated VOI analysis in FDDNP PET using structural warping: validation through classification of Alzheimer's disease patients," International Journal of Alzheimer's Disease, vol. 2012, Article ID 512069, 8 pages, 2012.

[12] S. Klöppel, C. M. Stonnington, J. Barnes et al., "Accuracy of dementia diagnosis: a direct comparison between radiologists and a computerized method," Brain, vol. 131, no. 11, pp. 2969–2974, 2008.

[13] A. Larner, "Comparing diagnostic accuracy of cognitive screening instruments: a weighted comparison approach," Dementia and Geriatric Cognitive Disorders Extra, vol. 3, no. 1, pp. 60–65, 2013.

[14] J. Maroco, D. Silva, A. Rodrigues, M. Guerreiro, I. Santana, and A. de Mendonça, "Data mining methods in the prediction of dementia: a real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests," BMC Research Notes, vol. 4, article 299, 14 pages, 2011.

[15] W. Yu, T. Liu, R. Valdez, M. Gwinn, and M. J. Khoury, "Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes," BMC Medical Informatics and Decision Making, vol. 10, no. 1, article 16, 2010.

[16] P. R. Hachesu, M. Ahmadi, S. Alizadeh, and F. Sadoughi, "Use of data mining techniques to determine and predict length of stay of cardiac patients," Healthcare Informatics Research, vol. 19, no. 2, pp. 121–129, 2013.

[17] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16–18, pp. 3273–3283, 2010.

[18] S. Maldonado, R. Weber, and J. Basak, "Simultaneous feature selection and classification using kernel-penalized support vector machines," Information Sciences, vol. 181, no. 1, pp. 115–128, 2011.

[19] J. Geraci, M. Dharsee, P. Nuin et al., "Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems," Bioinformatics, vol. 30, no. 5, pp. 712–718, 2014.

[20] X. Chen, M. Wang, and H. Zhang, "The use of classification trees for bioinformatics," Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55–63, 2011.

[21] M. L. Calle, V. Urrea, A.-L. Boulesteix, and N. Malats, "AUC-RF: a new strategy for genomic profiling with random forest," Human Heredity, vol. 72, no. 2, pp. 121–132, 2011.

[22] K. R. Gray, P. Aljabar, R. A. Heckemann, A. Hammers, and D. Rueckert, "Random forest-based similarity measures for multi-modal classification of Alzheimer's disease," NeuroImage, vol. 65, pp. 167–175, 2013.

[23] J. A. Williams, A. Weakley, D. J. Cook, and M. Schmitter-Edgecombe, "Machine learning techniques for diagnostic differentiation of mild cognitive impairment and dementia," in Proceedings of the 27th AAAI Conference on Artificial Intelligence Workshop, 2013.

[24] B. Chizi and O. Maimon, "Dimension reduction and feature selection," in Data Mining and Knowledge Discovery Handbook, pp. 93–111, 2005.

[25] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157–1182, 2003.

[26] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[27] O. Uncu and I. B. Turksen, "A novel feature selection approach: combining feature wrappers and filters," Information Sciences, vol. 177, no. 2, pp. 449–466, 2007.

[28] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 81–86, Seoul, The Republic of Korea, May 2001.

[29] H. H. Inbarani, A. T. Azar, and G. Jothi, "Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis," Computer Methods and Programs in Biomedicine, vol. 113, no. 1, pp. 175–185, 2014.

[30] P. Bühlmann and T. Hothorn, "Boosting algorithms: regularization, prediction and model fitting," Statistical Science, vol. 22, no. 4, pp. 477–505, 2007.

[31] L. Rokach, Pattern Classification Using Ensemble Methods, World Scientific, Singapore, 2009.

[32] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[34] A. Liaw and M. Wiener, "Classification and regression by randomForest," R News, vol. 2, no. 3, pp. 18–22, 2002.

[35] S. Janitza, C. Strobl, and A. L. Boulesteix, "An AUC-based permutation variable importance measure for random forests," BMC Bioinformatics, vol. 14, article 119, 2013.

[36] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, Boston, Mass, USA, 1993.

[37] S. B. Kotsiantis, "Supervised machine learning: a review of classification techniques," Informatica, vol. 31, no. 3, pp. 249–268, 2007.

[38] H. Azami, K. Mohammadi, and B. Bozorgtabar, "An improved signal segmentation using moving average and Savitzky-Golay filter," Journal of Signal and Information Processing, vol. 3, no. 1, pp. 39–44, 2012.

[39] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, Hoboken, NJ, USA, 2nd edition, 2008.

[40] G. M. D'Angelo, J. Luo, and C. Xiong, "Missing data methods for partial correlation," Journal of Biometrics & Biostatistics, vol. 3, no. 8, p. 155, 2012.

[41] S. Arlot and A. Celisse, "A survey of cross-validation procedures for model selection," Statistics Surveys, vol. 4, pp. 40–79, 2010.

[42] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.

[43] J. C. Xu, L. Sun, Y. P. Gao, and T. H. Xu, "An ensemble feature selection technique for cancer recognition," Bio-Medical Materials and Engineering, vol. 24, no. 1, pp. 1001–1008, 2014.


4 Computational and Mathematical Methods in Medicine

Table 2: List of attributes derived from neuropsychological test and neuroimaging measures.

Neuropsychological and neuroimaging measures:
Average FDG-PET of angular, temporal, and posterior cingulate
Mini Mental State Examination-baseline
Average PIB SUVR of frontal cortex, anterior cingulate, precuneus cortex, and parietal cortex
Ventricles measure
Average AV45 SUVR of frontal, anterior cingulate, precuneus, and parietal cortex relative to the cerebellum
Hippocampus-baseline volume
Clinical dementia ratio-SB
Whole brain-baseline volume
ADAS 11
UCSF entorhinal-baseline volume
ADAS 13
UCSF fusiform-baseline volume
Mini Mental Scale Examination score
UCSF Med Temp-baseline
RAVLT (forgetting)
UCSF ICV-baseline
RAVLT (5 sum)
MOCA-baseline
Functional Assessment Questionnaire
Pt ECog-Memory-baseline
MOCA
Pt ECog-Language-baseline
Pt ECog-Memory
Pt ECog-VisSpat-baseline
Pt ECog-Language
Pt ECog-Plan-baseline
Pt ECog-Visual
Pt ECog-Organ-baseline
Pt ECog-Plan
Pt ECog-Div atten-baseline
Pt ECog-Organ
Pt ECog-Total-baseline
Pt ECog-Div atten
SP ECog-Mem-baseline
Pt ECog-Total
SP ECog-Lang-baseline
SP ECog-Memory
SP ECog-VisSpat-baseline
SP ECog-Language
SP ECog-Plan-baseline
SP ECog-Visual
SP ECog-Organ-baseline
SP ECog-Plan
SP ECog-Div atten-baseline
SP ECog-Organ
SP ECog-Total-baseline
SP ECog-Attention
Average FDG-PET of angular, temporal, and posterior cingulate at baseline
SP ECog-Total
Average PIB SUVR of frontal cortex, anterior cingulate, precuneus cortex, and parietal cortex at baseline
UCSF ventricles measures
Average AV45 (PET ligand) SUVR of frontal, anterior cingulate, precuneus, and parietal cortex relative to the cerebellum at baseline
UCSF hippocampus measure
CDR-SB
UCSF whole brain measure
ADAS 11 baseline
UCSF entorhinal measure
ADAS 13 baseline
UCSF fusiform measure
UCSF temporal measure
RAVLT (forgetting) baseline
UCSF ICV
RAVLT (5 sum) baseline

Pt: patient; ECog: everyday cognition test; SP: study partner; ADAS: Alzheimer's disease assessment scale; MOCA: Montreal Cognitive Assessment; RAVLT: Rey Auditory Verbal Learning Test; ICV: intracranial volume; SUVR: standard uptake value ratio; CDR-SB: Clinical Dementia Rating Sum of Boxes.

with 870 instances was classified by NB, RF, SVM, and the C4.5 decision tree. The dataset without the clinical dementia ratio attribute was then classified with the same four classifiers, in order to evaluate the sensitivity of each classifier even in the absence of highly relevant attributes. Since the C4.5 decision tree classifier outperformed the other classifiers in the multiclass classification, an ensemble approach with PSO search is proposed and tested in this work. Naïve Bayes provided better accuracy than RF and SVM. Ensemble feature selection is performed with a C4.5 tree using binary splits and pruning with the minimum description length technique. The Random forest ensemble is implemented with 100 to 1000 trees; the out-of-bag error decreased and then remained constant at 600 trees and above. The support vector machine is implemented with LIBSVM. The Radial Basis Function (RBF) kernel was used for classification, since it showed higher accuracy than the other kernels. The kernel parameters C and γ are optimised with grid search.

Given a pair of feature vectors x_i and x_j, the RBF kernel used to find the separating hyperplane is defined as follows:

    k(x_i, x_j) = exp(−γ ‖x_i − x_j‖²)  (2)

It was observed that the sensitivity of J48 for each class was higher than that of NB, SVM, and RF. Hence J48 is selected as the base classifier for feature selection and classification; J48 is the base ensemble classifier used in CPEMM.

Phase II. An overview of the steps in Phases II and III is presented in Figure 2. Ensemble feature selection is performed with C4.5 using binary splits and pruning. The number of PSO search iterations was varied in the range 60–100. Feature subsets shrank as the number of iterations increased: with fewer iterations, the ensemble search selected subsets with more features, and as the iterations increased towards the best optimal solution, PSO generated subsets with fewer features. PSO search combined with the NB, RF, and C4.5 ensembles generated the feature subsets, which were sorted by the merit assigned by the search technique. It was observed that the C4.5 ensemble selected the optimal subsets with Particle Swarm Optimisation in the fewest iterations compared with NB and RF; the RF ensemble returned good subsets on the 2-class dataset. The binary split at each node in C4.5 selected relevant features with minimal iterations. The CPEMM technique is presented as Algorithm 1, following the overview of Phases II and III in Figure 2.
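The wrapper search described above can be illustrated with a toy binary PSO, where each particle is a bit-mask over features and a merit function stands in for the wrapper's subset score. This is a hedged sketch, not the authors' implementation; the particle count, attraction, and mutation probabilities below are illustrative.

```python
import random

def binary_pso(n_features, merit, n_particles=10, n_iters=60, seed=0):
    """Toy binary PSO: each particle is a feature bit-mask scored by `merit`.
    Higher merit is better; the global-best mask attracts the swarm."""
    rng = random.Random(seed)
    swarm = [[rng.randint(0, 1) for _ in range(n_features)]
             for _ in range(n_particles)]
    best = list(max(swarm, key=merit))
    for _ in range(n_iters):
        for p in swarm:
            for d in range(n_features):
                if rng.random() < 0.3:      # attraction toward the global best
                    p[d] = best[d]
                elif rng.random() < 0.1:    # random exploration (bit flip)
                    p[d] = 1 - p[d]
        cand = max(swarm, key=merit)
        if merit(cand) > merit(best):
            best = list(cand)
    return best

# Hypothetical merit: agreement with a known-relevant feature pattern.
target = [1, 0, 1, 1, 0, 0]
mask = binary_pso(6, lambda m: sum(a == b for a, b in zip(m, target)))
print(mask)
```

In the real wrapper, the merit of a bit-mask would come from cross-validated accuracy of the base classifier trained on the selected features.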

Phase III. For each classifier, the subset with the highest merit is considered for evaluation by the base classifier C4.5, and the accuracy is stored for further comparison.

Case 1. If subsets have equal merit, each subset is evaluated individually and also as a single subset after merging. If the merged subset does not increase accuracy, the individual subsets are selected as the relevant feature set.

Case 2. If more than 50% of the subsets at the top of the sorted subset list have the same merit, the number of iterations


[Figure 1: Selection of base classifier for ensemble feature selection. Phase I: the multivariate dataset is preprocessed and divided into a dataset with CDR and a dataset without CDR; each is classified with SVM, NB, RF, and C4.5; performance is evaluated to select the base classifier for ensemble feature selection.]

is increased to obtain a minimal feature subset for evaluation. One limitation is that, if the increase in iterations does not return a reduced subset, the case should be probed further to enhance feature selection.

Case 3. If there is a successive subset with much lower merit, the search for subsets is terminated.

5-fold cross validation ensured that all instances are used in model development [41]. Alternate records are left out, and the model is trained with the remaining records in every consecutive execution of the loop. The ensemble classifiers, implemented in the Weka tool, were run on a Pentium processor at 2.53 GHz with 4 GB RAM and a 64-bit operating system. Statistical analysis of the feature selection methods and of classifier performance was implemented in R.
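The interleaved fold assignment described here ("alternate records are left out") can be sketched as follows. This is an illustrative stdlib-only version, not the Weka implementation used in the study.

```python
def k_fold_indices(n_samples, k=5):
    """Partition sample indices into k interleaved folds; each fold serves
    once as the held-out test set while the rest form the training set."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_indices(10, k=5)
print(len(splits))    # 5
print(splits[0][1])   # [0, 5]  (every 5th record held out in fold 0)
```

Because the folds partition the indices, every instance appears in exactly one test fold, which is what guarantees that all instances are used in model development.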

4. Results

The results are evaluated based on the performance of the classifier fed with the different feature sets selected by

(i) C4.5, NB, and RF coupled with PSO;
(ii) the CPEMM approach.

Accuracy, precision, and recall of the classifier are evaluated on the four datasets listed in Table 3.

4.1. Performance Measures. Any classification result could have an error rate: it may fail to identify dementia or misclassify a normal patient as demented. This error rate is commonly described with the terms True Positive, False Positive, True Negative, and False Negative, defined as follows.

True Positive (TP): the result of classification is positive in the presence of the clinical abnormality. True Negative (TN): the result of classification is negative in the absence of the clinical abnormality. False Positive (FP): the result of classification is positive in the absence of the clinical abnormality. False Negative (FN): the result of classification is negative in the presence of the clinical abnormality.

Accuracy defines the overall correctness of the model. Precision defines the number of correct classifications obtained for each class; its value falls between 0 and 1. Recall corresponds to the True Positive rate. Consider

    Accuracy = No. of correct classifications / Total no. of classifications  (3)

    Precision = TP / (TP + FP)  (4)

    Recall = TP / (TP + FN)  (5)

Receiver operating characteristic (ROC) curves are used to analyse the prediction capability of machine learning techniques used for classification and clustering [42]. ROC analysis is a graphical representation comparing the True Positive rate and the False Positive rate in classification results. The area under the curve (AUC) characterises the ROC of a classifier: the larger the AUC, the more effective the classifier's performance. Press's Q test was used to evaluate the statistical significance of the difference in accuracy yielded by the classifiers. Given m samples, n correct classifications, and p groups, the test statistic is evaluated as follows:

    Q = (m − np)² / (m(p − 1)) ∼ χ²  (6)
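Press's Q in (6) is simple to compute. The sketch below uses the paper's sample size of 870 with a hypothetical number of correct classifications, and compares Q against the usual chi-square critical value with one degree of freedom (3.841 at α = 0.05).

```python
def press_q(m, n, p):
    """Press's Q statistic: Q = (m - n*p)^2 / (m * (p - 1)),
    for m samples, n correct classifications, and p groups."""
    return (m - n * p) ** 2 / (m * (p - 1))

# Hypothetical example: 870 samples, 850 classified correctly, 3 classes.
q = press_q(870, 850, 3)
print(round(q, 2))   # 1622.07
print(q > 3.841)     # True: accuracy differs significantly from chance
```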

Naïve Bayes, the C4.5 Decision Tree, Random forest, and SVM all yielded statistically significant accuracy. Feature selection by CPEMM considerably increased the percentage of correctly classified records. The C4.5 classifier combined with the CPEMM methodology provided the highest statistically significant difference in performance when compared with PSO and the conventional ensemble-based feature selection technique, as shown in Figure 3. A higher median of 0.987 was yielded by the proposed combination of CPEMM


[Figure 2: Steps in Phase II and Phase III. Phase II: the dataset undergoes ensemble feature selection (NB, SVM, RF, and J48) with PSO search, yielding optimised feature subsets. Phase III: subsets with minimal difference in merit are selected; the highest-ranking feature subset is evaluated, the next subset from the sorted list is merged with the current set, the classifier is run with the new subset, and its accuracy is evaluated until the terminating condition holds, finalising the best subset for the Normal, MCI, and Dementia classes.]


Algorithm CPEMM (DS, threshold)
Input: DS → dataset; merit_threshold → merit value;
       acc_threshold → accuracy threshold; flag = 1
Output: ffs → finalised feature set
A → accuracy vector; MS → merged set vector; bs → bootstrap vector
m → merit of subsets obtained from wrapper feature selection
N → sorted subset list in descending order
S_i → subset with the highest ranking
k ← no. of subsets generated (1..K)

(1) Initialize subset s = null
(2) Generate feature subsets with the ensemble:
      Initialise no. of iterations n
      Repeat
          Generate subset pset_k from DS
          Verify merit: S_k ← pset_k
          Optimisation of feature subset: m_k ← merit of pset_k
      Until n iterations
(3) Evaluate accuracy of the classifier with subsets S_i, i = 1..m:
      m_i = m(1); j = 2
      Evaluate accuracy A_i of S_i
      Repeat
          Repeat
              diff = m_i − m_j
              MS_ij ← S_i and S_j merged
              A_MSij ← evaluate accuracy of MS_ij
              If A_i ≤ A_MSij
                  Append MS_ij to ffs
                  flag = 1
              else
                  flag = 0
                  Append S_i to ffs
              endif
              Increment j
          until flag = 0 or j < k or diff ≥ threshold
          Increment i
      until A_i ≥ acc_threshold
(4) bs_i ← bootstrap vector from ffs
      Repeat
          Train classifier c_i with bs_i
          Evaluate the out-of-bag error
      Until n sets are bootstrapped with 5-fold cross validation

Algorithm 1
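The merge step of Algorithm 1 can be compressed into a short sketch: subsets are visited in descending merit order, and a merge is kept only when it does not reduce accuracy. This is an illustrative rendering, not the authors' Weka implementation; the evaluator and feature names are hypothetical.

```python
def merit_merge(subsets, evaluate, acc_threshold=0.95):
    """Greedy Merit Merge sketch. `subsets` is a list of (merit, features)
    pairs; `evaluate(features)` returns the classifier accuracy on that set."""
    ranked = sorted(subsets, key=lambda s: s[0], reverse=True)
    current = set(ranked[0][1])          # start from the highest-merit subset
    best_acc = evaluate(current)
    for merit, subset in ranked[1:]:
        merged = current | set(subset)
        merged_acc = evaluate(merged)
        if merged_acc >= best_acc:       # keep the merge only if it helps
            current, best_acc = merged, merged_acc
        if best_acc >= acc_threshold:    # stop once accuracy is good enough
            break
    return current, best_acc

# Hypothetical evaluator: accuracy grows with how many "useful" features appear.
useful = {"MMSE", "CDR", "hippocampus"}
score = lambda feats: len(useful & set(feats)) / len(useful)
feats, acc = merit_merge([(0.9, ["MMSE"]), (0.8, ["CDR"]), (0.7, ["age"])], score)
print(sorted(feats), round(acc, 2))  # ['CDR', 'MMSE', 'age'] 0.67
```

The `>=` comparison mirrors the algorithm's "If A_i ≤ A_MSij, append the merged set" rule: a merge that ties on accuracy is still retained.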

[Figure 3: Accuracy of classifiers with features selected by the PSO and CPEMM methods. Accuracy (y-axis, 0.7–1.1) is plotted for the Naïve Bayes PSO, J48 PSO, RF PSO, SVM PSO, NB CPEMM, J48 CPEMM, and RF CPEMM ensemble combinations.]


Table 3: Results for the 3-class dataset with ensembles of NB, J48, RF, and SVM, and with the CPEMM ensembles (Acc/Pre/Rec per classifier).

Dataset | NB Ensemble | J48 Ensemble | RF Ensemble | SVM Ensemble | NB-CPEMM | J48-CPEMM | RF-CPEMM
NSDS    | 0.838/0.854/0.838 | 0.911/0.895/0.889 | 0.936/0.941/0.799 | 0.647/0.564/0.677 | 0.899/0.904/0.912 | 0.932/0.958/0.933 | 0.936/0.941/0.799
NIDS    | 0.756/0.737/0.556 | 0.936/0.941/0.937 | 0.545/0.645/0.545 | 0.667/0.685/0.734 | 0.834/0.895/0.738 | 0.904/0.934/0.907 | 0.545/0.645/0.545
CIDS-I  | 0.863/0.833/0.823 | 0.916/0.933/0.923 | 0.963/0.963/0.963 | 0.673/0.699/0.788 | 0.896/0.812/0.822 | 0.987/0.955/0.963 | 0.963/0.963/0.963
CIDS-II | 0.756/0.654/0.563 | 0.945/0.931/0.924 | 0.797/0.723/0.634 | 0.685/0.676/0.657 | 0.885/0.823/0.813 | 0.971/0.965/0.923 | 0.888/0.856/0.833

NSDS: Neuropsychological dataset; NIDS: Neuroimaging dataset; CIDS: combined dataset. Acc: Accuracy; Pre: Precision; Rec: Recall.


Table 4: Results with the maximal feature subset obtained by the divide and merge feature selection technique (Acc/Pre/Rec per class).

Classifier | Normal class | Dementia | Mild Cognitive Impairment
J48    | 0.963/0.963/0.963 | 0.978/0.953/0.967 | 0.966/0.968/0.987
J48-TT | 0.983/0.972/0.977 | 0.986/0.973/0.963 | 0.977/0.976/0.977
J48-CV | 0.965/0.954/0.945 | 0.966/0.956/0.955 | 0.964/0.954/0.945

TT: training and testing; CV: cross validation.

[Figure 4: Comparison of area under the curve obtained by the ordinary ensemble vs. CPEMM for the four datasets. ROC area (y-axis, 0–1.2) is plotted for ensemble (EN) and CPEMM feature selection on datasets I–IV, for Random forest, J48, and Naïve Bayes.]

and the C4.5 ensemble classifier, while the other classifiers tested, that is, NB, RF, and SVM, had medians of approximately 0.75 to 0.88.

With C4.5 as the base classifier, the features selected by PSO in combination with the NB, SVM, C4.5, and RF ensembles are listed in Table 3. The CPEMM method is applied to merge subsets based on merit, and the resultant subsets from each dataset are evaluated with the C4.5 classifier. The accuracy obtained for each class (NL, AD, and MCI) is evaluated in each dataset. The sensitivity of the classifier in multiclass classification using the CPEMM approach is tabulated in Table 4.

CPEMM was applied to the feature sets obtained by NB, C4.5, and RF, since the sensitivity of the SVM classifier was very low compared with the other classifiers. The nonlinear RBF kernel was the best-fitting kernel for SVM, yet the accuracy obtained was below 70%. Hence the CPEMM strategy is applied and tested with NB, C4.5, and RF.

The discriminating efficiency of J48 with respect to the three classes, Normal, Dementia, and Mild Cognitive Impairment, is evaluated. Classification of the Normal class had higher sensitivity than the delineation of Mild Cognitive Impairment and Dementia; the results are given in Table 4. Ensemble feature selection returned a list of subsets with higher merit, and the CPEMM technique merged and evaluated the accuracy of successive subsets with the highest merits. The efficiency of the classifier with features selected using CPEMM versus features selected with conventional ensemble feature selection is compared through ROC analysis in Figure 4: the ROC area obtained with the four datasets is plotted, with the ROC of individual ensemble feature selection plotted alongside the ROC obtained with CPEMM. Table 5

Table 5: Description of the J48 ensemble model used for the multiclass classification.

Details | Value
Split method | Binary split
Cross validation accuracy | 0.976
AUC with CV | 0.971
Train and test accuracy | 0.986
AUC with train and test | 0.987
Common features selected by all methods | MMSE, CDR, hippocampus volume, and everyday cognition measures
Features added by CPEMM | Entorhinal measures, CDR-SB, and Rey Auditory Verbal Learning Test-immediate

describes the features of the ensemble model used for classification.

CPEMM yielded higher area-under-the-curve values for all four datasets examined in our study.

5. Conclusion

The C4.5 classifier provided better accuracy and sensitivity in the multiclass classification of Alzheimer's Dementia. The ensemble of C4.5 classifiers selected the best-fitting subset for the evaluation of the three classes, with the highest Recall value of 0.987 for the MCI class. It was evident that the features selected by the C4.5 algorithm also further increased the performance of the Random forest and Naïve Bayes classifiers. The proposed ensemble with PSO search selected the minimal subset that


is needed for the discrimination of diseases. The Merit Merge approach further enhanced feature selection by identifying the effective consolidated subset that can be used for the clinical diagnosis of dementia and of the Mild Cognitive Impairment that leads to dementia. Our work also confirmed that the performance of SVM in the delineation of Mild Cognitive Impairment and Dementia is very low compared to Random forest, Naïve Bayes, and the C4.5 algorithm, as reported by Williams et al. [23]. Although the performance of Random forest was comparable to C4.5 and NB in the discrimination of 2-class data, it provided an accuracy of only approximately 75% on the 3-class problem. CPEMM was able to predict the relevant features for all datasets, especially the CIDS. The proposed split-and-merge ensemble approach can be applied to any 3-class classification problem, and it can be extended to the classification of high-dimensional datasets, such as microarray data, with preliminary feature reduction.

Classification with NB for the discrimination of Dementia and MCI in previous studies resulted in an accuracy of around 80% and a sensitivity of approximately 70% [14, 23]. Our CPEMM, based on a Bagging ensemble of J48 with the Merit Merge technique, yielded a higher accuracy of 98.7% in the train and test method [43]. The Bagging approach, learning from more than one classifier, found the minimal subset for effective diagnostics. The Merit Merge approach found all highly relevant possible subsets that contribute towards the multiclass classification. The proposed approach yielded a statistically significant difference, with a mean area under the curve of approximately 0.977 in the multivariate classification of Dementia.

Bagging ensemble models provide a promising, statistically significant machine learning method for disease diagnosis. The proposed methodology can be applied for disease state prediction even with class-imbalanced datasets.

Disclosure

T. R. Sivapriya is a registered user of ADNI. Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this work.

Acknowledgments

The authors thank Professor Kumanan, Head of the Department of Psychiatry, Madurai Medical College, Tamil Nadu, India, for many helpful discussions in the study and in the interpretation of data for building the classifier. Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd. and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research provides funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

[1] M. Prince, E. Albanese, M. Guerchet, and M. Prina, Alzheimer Report: Dementia and Risk Reduction: An Analysis of Protective and Modifiable Factors, Alzheimer's Disease International, London, UK, 2014, http://www.alz.co.uk/research/WorldAlzheimerReport2014.pdf.

[2] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro, and C. P. Ferri, "The global prevalence of dementia: a systematic review and metaanalysis," Alzheimer's & Dementia, vol. 9, no. 1, pp. 63e2–75e2, 2013.

[3] Y. Xia, "GA and AdaBoost-based feature selection and combination for automated identification of dementia using FDG-PET imaging," in Intelligent Science and Intelligent Data Engineering, vol. 7202 of Lecture Notes in Computer Science, pp. 128–135, Springer, Berlin, Germany, 2012.

[4] B. Magnin, L. Mesrob, S. Kinkingnehun et al., "Support vector machine-based classification of Alzheimer's disease from whole-brain anatomical MRI," Neuroradiology, vol. 51, no. 2, pp. 73–83, 2009.

[5] M. C. Tierney, R. Moineddin, and I. McDowell, "Prediction of all-cause dementia using neuropsychological tests within 10 and 5 years of diagnosis in a community-based sample," Journal of Alzheimer's Disease, vol. 22, no. 4, pp. 1231–1240, 2010.

[6] M. N. Rossor, N. C. Fox, C. J. Mummery, J. M. Schott, and J. D. Warren, "The diagnosis of young-onset dementia," The Lancet Neurology, vol. 9, no. 8, pp. 793–806, 2010.

[7] R. M. Chapman, M. Mapstone, J. W. McCrary et al., "Predicting conversion from mild cognitive impairment to Alzheimer's disease using neuropsychological tests and multivariate methods," Journal of Clinical and Experimental Neuropsychology, vol. 33, no. 2, pp. 187–199, 2011.

[8] A. Pozueta, E. Rodríguez-Rodríguez, J. L. Vázquez-Higuera et al., "Detection of early Alzheimer's disease in MCI patients by the combination of MMSE and an episodic memory test," BMC Neurology, vol. 11, article 78, 2011.

[9] M. Quintana, J. Guardia, G. Sánchez-Benavides et al., "Using artificial neural networks in clinical neuropsychology: high performance in mild cognitive impairment and Alzheimer's disease," Journal of Clinical and Experimental Neuropsychology, vol. 34, no. 2, pp. 195–208, 2012.

[10] R. Cuingnet, E. Gerardin, J. Tessieras et al., "Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database," NeuroImage, vol. 56, no. 2, pp. 766–781, 2011.

[11] M. Q. Wilks, H. Protas, M. Wardak et al., "Automated VOI analysis in FDDNP PET using structural warping: validation through classification of Alzheimer's disease patients," International Journal of Alzheimer's Disease, vol. 2012, Article ID 512069, 8 pages, 2012.

[12] S. Klöppel, C. M. Stonnington, J. Barnes et al., "Accuracy of dementia diagnosis—a direct comparison between radiologists and a computerized method," Brain, vol. 131, no. 11, pp. 2969–2974, 2008.

[13] A. Larner, "Comparing diagnostic accuracy of cognitive screening instruments: a weighted comparison approach," Dementia and Geriatric Cognitive Disorders Extra, vol. 3, no. 1, pp. 60–65, 2013.

[14] J. Maroco, D. Silva, A. Rodrigues, M. Guerreiro, I. Santana, and A. de Mendonça, "Data mining methods in the prediction of dementia: a real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests," BMC Research Notes, vol. 4, article 299, 14 pages, 2011.

[15] W. Yu, T. Liu, R. Valdez, M. Gwinn, and M. J. Khoury, "Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes," BMC Medical Informatics and Decision Making, vol. 10, no. 1, article 16, 2010.

[16] P. R. Hachesu, M. Ahmadi, S. Alizadeh, and F. Sadoughi, "Use of data mining techniques to determine and predict length of stay of cardiac patients," Healthcare Informatics Research, vol. 19, no. 2, pp. 121–129, 2013.

[17] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16–18, pp. 3273–3283, 2010.

[18] S. Maldonado, R. Weber, and J. Basak, "Simultaneous feature selection and classification using kernel-penalized support vector machines," Information Sciences, vol. 181, no. 1, pp. 115–128, 2011.

[19] J. Geraci, M. Dharsee, P. Nuin et al., "Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems," Bioinformatics, vol. 30, no. 5, pp. 712–718, 2014.

[20] X. Chen, M. Wang, and H. Zhang, "The use of classification trees for bioinformatics," Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55–63, 2011.

[21] M. L. Calle, V. Urrea, A.-L. Boulesteix, and N. Malats, "AUC-RF: a new strategy for genomic profiling with random forest," Human Heredity, vol. 72, no. 2, pp. 121–132, 2011.

[22] K. R. Gray, P. Aljabar, R. A. Heckemann, A. Hammers, and D. Rueckert, "Random forest-based similarity measures for multi-modal classification of Alzheimer's disease," NeuroImage, vol. 65, pp. 167–175, 2013.

[23] J. A. Williams, A. Weakley, D. J. Cook, and M. Schmitter-Edgecombe, "Machine learning techniques for diagnostic differentiation of mild cognitive impairment and dementia," in Proceedings of the 27th AAAI Conference on Artificial Intelligence Workshop, 2013.

[24] B. Chizi and O. Maimon, "Dimension reduction and feature selection," in Data Mining and Knowledge Discovery Handbook, pp. 93–111, 2005.

[25] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157–1182, 2003.

[26] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[27] O. Uncu and I. B. Turksen, "A novel feature selection approach: combining feature wrappers and filters," Information Sciences, vol. 177, no. 2, pp. 449–466, 2007.

[28] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 81–86, Seoul, The Republic of Korea, May 2001.

[29] H. H. Inbarani, A. T. Azar, and G. Jothi, "Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis," Computer Methods and Programs in Biomedicine, vol. 113, no. 1, pp. 175–185, 2014.

[30] P. Bühlmann and T. Hothorn, "Boosting algorithms: regularization, prediction and model fitting," Statistical Science, vol. 22, no. 4, pp. 477–505, 2007.

[31] L. Rokach, Pattern Classification Using Ensemble Methods, World Scientific, Singapore, 2009.

[32] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[34] A. Liaw and M. Wiener, "Classification and regression by randomForest," R News, vol. 2, no. 3, pp. 18–22, 2002.

[35] S. Janitza, C. Strobl, and A. L. Boulesteix, "An AUC-based permutation variable importance measure for random forests," BMC Bioinformatics, vol. 14, article 119, 2013.

[36] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, Boston, Mass, USA, 1993.

[37] S. B. Kotsiantis, "Supervised machine learning: a review of classification techniques," Informatica, vol. 31, no. 3, pp. 249–268, 2007.

[38] H. Azami, K. Mohammadi, and B. Bozorgtabar, "An improved signal segmentation using moving average and Savitzky-Golay filter," Journal of Signal and Information Processing, vol. 3, no. 1, pp. 39–44, 2012.

[39] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, Hoboken, NJ, USA, 2nd edition, 2008.

[40] G. M. D'Angelo, J. Luo, and C. Xiong, "Missing data methods for partial correlation," Journal of Biometrics & Biostatistics, vol. 3, no. 8, p. 155, 2012.

[41] S. Arlot and A. Celisse, "A survey of cross-validation procedures for model selection," Statistics Surveys, vol. 4, pp. 40–79, 2010.

[42] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.

[43] J. C. Xu, L. Sun, Y. P. Gao, and T. H. Xu, "An ensemble feature selection technique for cancer recognition," Bio-Medical Materials and Engineering, vol. 24, no. 1, pp. 1001–1008, 2014.


Computational and Mathematical Methods in Medicine

[Figure 1: Selection of base classifier for ensemble feature selection. Phase I: the multivariate dataset is preprocessed and divided into a dataset with CDR and a dataset without CDR; each is classified with SVM, NB, RF, and C4.5, the performance is evaluated, and the base classifier for ensemble feature selection is selected.]

is increased to obtain a minimal feature subset for evaluation. One limitation is that, if the increase in iterations does not return a reduced subset, the case should be probed further to enhance feature selection.

Case 3. If a successive subset has a much lower merit, the search for subsets is terminated.

5-fold cross validation ensured that all the instances are used in the model development [41]. Alternate records are left out and the model is trained with the remaining records in every consecutive execution of the loop. The ensemble classifiers, implemented in the Weka tool, were run on a Pentium processor with 2.53 GHz speed, 4 GB RAM, and a 64-bit operating system. Statistical analysis of the feature selection methods and of the performance of the classifiers was implemented in R.
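The fold construction described above (alternate records left out on each pass, so that every instance appears in exactly one test fold) can be sketched as follows. This is an illustrative interleaved 5-fold split in Python, not the Weka implementation used in the study.

```python
# Sketch of the 5-fold cross-validation protocol described above:
# records are assigned to folds in an interleaved fashion, each
# instance lands in exactly one test fold, and the remaining
# records form the training set on every pass.
def five_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs covering all instances."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]  # interleaved
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train_idx, test_idx

# Every index lands in exactly one test fold:
seen = []
for train_idx, test_idx in five_fold_indices(20):
    assert set(train_idx).isdisjoint(test_idx)
    seen.extend(test_idx)
print(sorted(seen) == list(range(20)))   # True
```

In the paper the classifier is retrained on each training split and evaluated on the held-out fold; here only the index bookkeeping is shown.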

4. Results

The results are evaluated based on the performance of the classifier by feeding it the different feature sets selected by

(i) C4.5, NB, and RF coupled with PSO;
(ii) the CPEMM approach.

Accuracy, precision, and recall of the classifier are evaluated with the four datasets listed in Table 3.

4.1. Performance Measures. All classification results could have an error rate: the classifier will either fail to identify dementia or misclassify a normal patient as demented. It is common to describe this error rate by the terms True Positive, False Positive, True Negative, and False Negative, as follows. True Positive (TP): the result of classification is positive in the presence of the clinical abnormality. True Negative (TN): the result of classification is negative in the absence of the clinical abnormality. False Positive (FP): the result of classification is positive in the absence of the clinical abnormality. False Negative (FN): the result of classification is negative in the presence of the clinical abnormality.

Accuracy defines the overall correctness of the model. Precision defines the number of correct classifications obtained for each class; its value falls between 0 and 1. Recall corresponds to the True Positive rate. Consider

Accuracy = (no. of correct classifications) / (total no. of classifications),   (3)

Precision = TP / (TP + FP),   (4)

Recall = TP / (TP + FN).   (5)
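As a concrete illustration of equations (3)-(5), the three metrics can be computed directly from the four confusion counts; the counts below are made up for the example, and equation (3) reduces to (TP + TN)/total in the two-class case.

```python
# Accuracy, precision, and recall from the confusion counts defined
# above; the counts here are illustrative, not from the study.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)   # eq. (3): correct / total

def precision(tp, fp):
    return tp / (tp + fp)                    # eq. (4)

def recall(tp, fn):
    return tp / (tp + fn)                    # eq. (5): True Positive rate

tp, tn, fp, fn = 90, 85, 5, 10
print(round(accuracy(tp, tn, fp, fn), 3))    # 0.921
print(round(precision(tp, fp), 3))           # 0.947
print(round(recall(tp, fn), 3))              # 0.9
```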

Receiver operating characteristic (ROC) curves are used to analyse the prediction capability of machine learning techniques used for classification and clustering [42]. ROC analysis is a graphical representation comparing the True Positive rate and the False Positive rate in classification results. The area under the curve (AUC) characterizes the ROC of a classifier; the larger the AUC, the more effective the performance of the classifier. Press's Q test was used to evaluate the statistical significance of the difference in accuracy yielded by the classifiers. Given "m" samples, "n" correct classifications, and "p" groups, the test statistic was evaluated as follows:

Q = (m - np)^2 / (m(p - 1)) ~ χ^2.   (6)
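Equation (6) can be computed directly. The counts below are illustrative, and the statistic is referred to a chi-square distribution with one degree of freedom (critical value 3.841 at the 0.05 level).

```python
# Press's Q test from equation (6): Q = (m - n*p)^2 / (m*(p - 1)),
# compared against the chi-square critical value with 1 degree of
# freedom (3.841 at alpha = 0.05). Counts below are illustrative.
def press_q(m, n, p):
    """m: samples, n: correct classifications, p: number of groups."""
    return (m - n * p) ** 2 / (m * (p - 1))

m, n, p = 300, 260, 3          # e.g. 260 of 300 correct, 3 classes
q = press_q(m, n, p)
print(round(q, 2))             # 384.0
if q > 3.841:                  # chi-square, 1 df, alpha = 0.05
    print("accuracy significantly better than chance")
```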

Naïve Bayes, C4.5 Decision Tree, Random forest, and SVM yielded statistically significant accuracy. It was found that feature selection by CPEMM considerably increased the percentage of records that were correctly classified. The C4.5 classifier combined with the CPEMM methodology provided the highest statistically significant difference in performance when compared with PSO and the conventional ensemble based feature selection technique, as shown in Figure 3. A higher median of 0.987 was yielded by the proposed combination of CPEMM

[Figure 2: Steps in Phase II and Phase III. Phase II: PSO search over the dataset with ensemble feature selection (NB, SVM, RF, and J48) produces optimized feature subsets. Phase III: subsets with minimal difference in merit are selected; the feature subset with the highest ranking is evaluated, the next feature subset from the sorted list is merged with the current set, the classifier is re-evaluated with the new subset, and, when the terminating condition is true, the best subsets for Normal, MCI, and Dementia are finalized.]

Algorithm CPEMM (DS, threshold)
Input: DS → dataset; merit_threshold → merit value; acc_threshold → accuracy threshold; flag = 1
Output: ffs → finalised feature set
A → accuracy vector; MS → merged set vector; bs → bootstrap vector
m → merit of subsets obtained from wrapper feature selection
N → subset list sorted in descending order
S_i → subset with highest ranking
k ← no. of subsets generated (1..K)

(1) Initialize subset s = null.
(2) Generate feature subsets with the ensemble:
        Initialise no. of iterations n
        Repeat
            Generate subset pset_k from DS
            Verify merit
            S_k ← pset_k
            Optimize the feature subset
            m_k ← merit of pset_k
        Until n iterations
(3) Evaluate accuracy of the classifier with subset s_i, i = 1..m:
        m_i = m(1); j = 2
        Evaluate accuracy A_i of S_i
        Repeat
            Repeat
                diff = m_i - m_j
                MS_ij ← S_i and S_j merged
                A_MSij ← accuracy of MS_ij
                If A_i ≤ A_MSij
                    Append MS_ij to ffs
                    flag = 1
                else
                    flag = 0
                    Append S_i to ffs
                endif
                Increment j
            until flag = 0 or j < k or diff ≥ threshold
            Increment i
        until A_i ≥ acc_threshold
(4) bs_i ← bootstrap vector from ffs:
        Repeat
            Train classifier c_i with bs_i
            Evaluate out-of-bag error
        Until n sets are bootstrapped with 5-fold cross validation

Algorithm 1
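A minimal Python sketch of the Phase III merge loop of Algorithm 1, under a simplified reading: subsets arrive sorted by merit, and a subset is merged with the next one while the merit gap stays below the threshold and the merged set does not reduce accuracy. The `evaluate` function is a hypothetical stand-in for training and testing the base classifier on a feature subset.

```python
# Simplified sketch of the Merit Merge loop: subsets are sorted by
# merit; a subset absorbs the next one whenever the merged set does
# not reduce accuracy and the merit gap is below the threshold.
def merit_merge(subsets, merits, evaluate, merit_threshold):
    ffs = []                                  # finalised feature sets
    i = 0
    while i < len(subsets):
        current = set(subsets[i])
        j = i + 1
        while j < len(subsets):
            if merits[i] - merits[j] >= merit_threshold:
                break                         # merit gap too large
            merged = current | set(subsets[j])
            if evaluate(merged) >= evaluate(current):
                current = merged              # merging did not hurt accuracy
                j += 1
            else:
                break                         # merging hurt accuracy
        ffs.append(sorted(current))
        i = j
    return ffs

# Toy evaluation: accuracy grows with subset size (illustrative only)
subsets = [["mmse", "cdr"], ["hippo"], ["ecog"]]
merits = [0.9, 0.88, 0.86]
result = merit_merge(subsets, merits, lambda s: 0.5 + 0.1 * len(s), 0.1)
print(result)   # [['cdr', 'ecog', 'hippo', 'mmse']]
```

Subset generation with PSO and the bootstrap/out-of-bag step of stage (4) are omitted here; only the merge logic is illustrated.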

[Figure 3: Accuracy of classifiers with features selected by the PSO and CPEMM methods. Accuracy (0.7-1.1) of each ensemble classifier combined with PSO and CPEMM: Naïve Bayes-PSO, J48-PSO, RF-PSO, SVM-PSO, NB-CPEMM, J48-CPEMM, and RF-CPEMM.]

Table 3: Results for the 3-class datasets with ensembles of NB, J48, RF, and SVM, and with the CPEMM ensemble (NSDS: neuropsychological dataset; NIDS: neuroimaging dataset; CIDS: combined dataset; Acc: accuracy; Pre: precision; Rec: recall).

          NB Ensemble        J48 Ensemble       RF Ensemble        SVM Ensemble       NB-CPEMM           J48-CPEMM          RF-CPEMM
          Acc   Pre   Rec    Acc   Pre   Rec    Acc   Pre   Rec    Acc   Pre   Rec    Acc   Pre   Rec    Acc   Pre   Rec    Acc   Pre   Rec
NSDS      0.838 0.854 0.838  0.911 0.895 0.889  0.936 0.941 0.799  0.647 0.564 0.677  0.899 0.904 0.912  0.932 0.958 0.933  0.936 0.941 0.799
NIDS      0.756 0.737 0.556  0.936 0.941 0.937  0.545 0.645 0.545  0.667 0.685 0.734  0.834 0.895 0.738  0.904 0.934 0.907  0.545 0.645 0.545
CIDS-I    0.863 0.833 0.823  0.916 0.933 0.923  0.963 0.963 0.963  0.673 0.699 0.788  0.896 0.812 0.822  0.987 0.955 0.963  0.963 0.963 0.963
CIDS-II   0.756 0.654 0.563  0.945 0.931 0.924  0.797 0.723 0.634  0.685 0.676 0.657  0.885 0.823 0.813  0.971 0.965 0.923  0.888 0.856 0.833

Table 4: Results with the maximal feature subset obtained by the divide and merge feature selection technique (TT: training and testing; CV: cross validation).

            Normal class         Dementia             Mild Cognitive Impairment
Classifier  Acc   Pre   Rec      Acc   Pre   Rec      Acc   Pre   Rec
J48         0.963 0.963 0.963    0.978 0.953 0.967    0.966 0.968 0.987
J48-TT      0.983 0.972 0.977    0.986 0.973 0.963    0.977 0.976 0.977
J48-CV      0.965 0.954 0.945    0.966 0.956 0.955    0.964 0.954 0.945

[Figure 4: Comparison of the area under the curve obtained by the ordinary ensemble versus CPEMM for the four datasets. ROC area (0-1.2) for EN and CPEMM feature selection on datasets I-IV, for the Random forest, J48, and Naïve Bayes classifiers.]

and C4.5 ensemble classifier, while the other classifiers tested, that is, NB, RF, and SVM, had medians ranging from approximately 0.75 to 0.88.

With C4.5 as base classifier, the features selected by PSO in combination with the NB, SVM, C4.5, and RF ensembles are listed in Table 3. The CPEMM method is applied to merge subsets based on merit. The resultant subsets from each dataset are evaluated with the C4.5 classifier. The accuracy obtained for each class (NL, AD, and MCI) is evaluated on each dataset. The sensitivity of the classifier in the multiclass classification using the CPEMM approach is tabulated in Table 4.

CPEMM was applied to the feature sets obtained by NB, C4.5, and RF, since the sensitivity of the SVM classifier was very low compared with the other classifiers. The nonlinear RBF kernel was the best fitting kernel for SVM; yet the accuracy obtained was below 70%. Hence the CPEMM strategy was applied and tested with NB, C4.5, and RF.

The discriminating efficiency of J48 with respect to the three classes Normal, Dementia, and Mild Cognitive Impairment is evaluated. Classification of the Normal class had higher sensitivity compared to the delineation of Mild Cognitive Impairment and Dementia. The results are given in Table 4. Ensemble feature selection returned a list of subsets with higher merit; the CPEMM technique merged and evaluated the accuracy of successive subsets with higher merits. The efficiency of the classifier with features selected using CPEMM versus features selected with conventional ensemble feature selection is compared through ROC analysis in Figure 4. The ROC area obtained with the four datasets is plotted in the graph; the ROC of individual ensemble feature selection is plotted alongside the ROC obtained with CPEMM. Table 5

Table 5: Description of the J48 ensemble model used for the multiclass classification.

Split method: binary split
Cross validation accuracy: 0.976
AUC with CV: 0.971
Train and test accuracy: 0.986
AUC with train and test: 0.987
Common features selected by all methods: MMSE, CDR, hippocampus volume, and everyday cognition measures
Features added by CPEMM: entorhinal measures, CDR-SB, and Rey Auditory Verbal Learning Test (immediate)

describes the features of the ensemble model used for classification.

CPEMM yielded higher area under the curve values for all four datasets examined in our study.

5. Conclusion

The C4.5 classifier provided better accuracy and sensitivity in the multiclass classification of Alzheimer's Dementia. The ensemble of C4.5 classifiers selected the best fit subset for the evaluation of the three different classes, with the highest Recall value of 0.987 for the class MCI. It was evident that the features selected by the C4.5 algorithm further increased the performance of the Random forest and Naïve Bayes classifiers as well. The proposed ensemble with PSO search selected the minimal subset that


is needed for the discrimination of diseases. The Merit Merge approach further enhanced the feature selection by identifying the effective consolidated subset that could be used for the clinical diagnosis of dementia and of Mild Cognitive Impairment that will lead to dementia. Our work also confirmed that the performance of SVM in the delineation of Mild Cognitive Impairment and Dementia is very low compared to the Random forest, Naïve Bayes, and C4.5 algorithms, as mentioned by Williams et al. [23]. Although the performance of Random forest was comparable to C4.5 and NB in the discrimination of 2-class data, an accuracy of only approximately 75% was obtained for the 3-class problem. CPEMM was able to predict the relevant features for all datasets, especially the CIDS. The proposed split and merge ensemble approach can be applied to any 3-class classification problem. It can also be extended to the classification of high-dimensional datasets, such as microarray data, with preliminary feature reduction.

Classification with NB for the discrimination of Dementia and MCI in previous studies resulted in accuracy of around 80% and sensitivity of approximately 70% [14, 23]. Our CPEMM, based on a Bagging ensemble of J48 with the Merit Merge technique, yielded a higher accuracy of 98.7% in the train and test method [43]. The Bagging approach, learning from more than one classifier, found the minimal subset for effective diagnostics. The Merit Merge approach found all highly relevant subsets that contribute towards the multiclass classification. The proposed approach yielded a statistically significant difference, with a mean area under the curve of approximately 0.977 in the multivariate classification of Dementia.

Bagging ensemble models provide a promising, statistically significant machine learning method for disease diagnosis with a low error rate. The proposed methodology can be applied for disease state prediction even with class-imbalanced datasets.

Disclosure

T. R. Sivapriya is a registered user of ADNI. Data used in preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this work.

Acknowledgments

The authors thank Professor Kumanan, Head of the Department of Psychiatry, Madurai Medical College, Tamil Nadu, India, for many helpful discussions in the study and in the interpretation of data for building the classifier. Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

[1] M. Prince, E. Albanese, M. Guerchet, and M. Prina, Alzheimer Report: Dementia and Risk Reduction: An Analysis of Protective and Modifiable Factors, Alzheimer's Disease International, London, UK, 2014, http://www.alz.co.uk/research/WorldAlzheimerReport2014.pdf.
[2] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro, and C. P. Ferri, "The global prevalence of dementia: a systematic review and metaanalysis," Alzheimer's & Dementia, vol. 9, no. 1, pp. 63.e2-75.e2, 2013.
[3] Y. Xia, "GA and AdaBoost-based feature selection and combination for automated identification of dementia using FDG-PET imaging," in Intelligent Science and Intelligent Data Engineering, vol. 7202 of Lecture Notes in Computer Science, pp. 128-135, Springer, Berlin, Germany, 2012.
[4] B. Magnin, L. Mesrob, S. Kinkingnehun, et al., "Support vector machine-based classification of Alzheimer's disease from whole-brain anatomical MRI," Neuroradiology, vol. 51, no. 2, pp. 73-83, 2009.
[5] M. C. Tierney, R. Moineddin, and I. McDowell, "Prediction of all-cause dementia using neuropsychological tests within 10 and 5 years of diagnosis in a community-based sample," Journal of Alzheimer's Disease, vol. 22, no. 4, pp. 1231-1240, 2010.
[6] M. N. Rossor, N. C. Fox, C. J. Mummery, J. M. Schott, and J. D. Warren, "The diagnosis of young-onset dementia," The Lancet Neurology, vol. 9, no. 8, pp. 793-806, 2010.
[7] R. M. Chapman, M. Mapstone, J. W. McCrary, et al., "Predicting conversion from mild cognitive impairment to Alzheimer's disease using neuropsychological tests and multivariate methods," Journal of Clinical and Experimental Neuropsychology, vol. 33, no. 2, pp. 187-199, 2011.
[8] A. Pozueta, E. Rodríguez-Rodríguez, J. L. Vázquez-Higuera, et al., "Detection of early Alzheimer's disease in MCI patients by the combination of MMSE and an episodic memory test," BMC Neurology, vol. 11, article 78, 2011.
[9] M. Quintana, J. Guardia, G. Sánchez-Benavides, et al., "Using artificial neural networks in clinical neuropsychology: high performance in mild cognitive impairment and Alzheimer's disease," Journal of Clinical and Experimental Neuropsychology, vol. 34, no. 2, pp. 195-208, 2012.
[10] R. Cuingnet, E. Gerardin, J. Tessieras, et al., "Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database," NeuroImage, vol. 56, no. 2, pp. 766-781, 2011.
[11] M. Q. Wilks, H. Protas, M. Wardak, et al., "Automated VOI analysis in FDDNP PET using structural warping: validation through classification of Alzheimer's disease patients," International Journal of Alzheimer's Disease, vol. 2012, Article ID 512069, 8 pages, 2012.
[12] S. Klöppel, C. M. Stonnington, J. Barnes, et al., "Accuracy of dementia diagnosis: a direct comparison between radiologists and a computerized method," Brain, vol. 131, no. 11, pp. 2969-2974, 2008.
[13] A. Larner, "Comparing diagnostic accuracy of cognitive screening instruments: a weighted comparison approach," Dementia and Geriatric Cognitive Disorders Extra, vol. 3, no. 1, pp. 60-65, 2013.
[14] J. Maroco, D. Silva, A. Rodrigues, M. Guerreiro, I. Santana, and A. de Mendonça, "Data mining methods in the prediction of dementia: a real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests," BMC Research Notes, vol. 4, article 299, 14 pages, 2011.
[15] W. Yu, T. Liu, R. Valdez, M. Gwinn, and M. J. Khoury, "Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes," BMC Medical Informatics and Decision Making, vol. 10, no. 1, article 16, 2010.
[16] P. R. Hachesu, M. Ahmadi, S. Alizadeh, and F. Sadoughi, "Use of data mining techniques to determine and predict length of stay of cardiac patients," Healthcare Informatics Research, vol. 19, no. 2, pp. 121-129, 2013.
[17] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16-18, pp. 3273-3283, 2010.
[18] S. Maldonado, R. Weber, and J. Basak, "Simultaneous feature selection and classification using kernel-penalized support vector machines," Information Sciences, vol. 181, no. 1, pp. 115-128, 2011.
[19] J. Geraci, M. Dharsee, P. Nuin, et al., "Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems," Bioinformatics, vol. 30, no. 5, pp. 712-718, 2014.
[20] X. Chen, M. Wang, and H. Zhang, "The use of classification trees for bioinformatics," Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55-63, 2011.
[21] M. L. Calle, V. Urrea, A.-L. Boulesteix, and N. Malats, "AUC-RF: a new strategy for genomic profiling with random forest," Human Heredity, vol. 72, no. 2, pp. 121-132, 2011.
[22] K. R. Gray, P. Aljabar, R. A. Heckemann, A. Hammers, and D. Rueckert, "Random forest-based similarity measures for multi-modal classification of Alzheimer's disease," NeuroImage, vol. 65, pp. 167-175, 2013.
[23] J. A. Williams, A. Weakley, D. J. Cook, and M. Schmitter-Edgecombe, "Machine learning techniques for diagnostic differentiation of mild cognitive impairment and dementia," in Proceedings of the 27th AAAI Conference on Artificial Intelligence Workshop, 2013.
[24] B. Chizi and O. Maimon, "Dimension reduction and feature selection," in Data Mining and Knowledge Discovery Handbook, pp. 93-111, 2005.
[25] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157-1182, 2003.
[26] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1-3, pp. 389-422, 2002.
[27] O. Uncu and I. B. Turksen, "A novel feature selection approach: combining feature wrappers and filters," Information Sciences, vol. 177, no. 2, pp. 449-466, 2007.
[28] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 81-86, Seoul, The Republic of Korea, May 2001.
[29] H. H. Inbarani, A. T. Azar, and G. Jothi, "Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis," Computer Methods and Programs in Biomedicine, vol. 113, no. 1, pp. 175-185, 2014.
[30] P. Bühlmann and T. Hothorn, "Boosting algorithms: regularization, prediction and model fitting," Statistical Science, vol. 22, no. 4, pp. 477-505, 2007.
[31] L. Rokach, Pattern Classification Using Ensemble Methods, World Scientific, Singapore, 2009.
[32] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.
[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[34] A. Liaw and M. Wiener, "Classification and regression by random forest," R News, vol. 2, no. 3, pp. 18-22, 2002.
[35] S. Janitza, C. Strobl, and A.-L. Boulesteix, "An AUC-based permutation variable importance measure for random forests," BMC Bioinformatics, vol. 14, article 119, 2013.
[36] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, Boston, Mass, USA, 1993.
[37] S. B. Kotsiantis, "Supervised machine learning: a review of classification techniques," Informatica, vol. 31, no. 3, pp. 249-268, 2007.
[38] H. Azami, K. Mohammadi, and B. Bozorgtabar, "An improved signal segmentation using moving average and Savitzky-Golay filter," Journal of Signal and Information Processing, vol. 3, no. 1, pp. 39-44, 2012.
[39] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, Hoboken, NJ, USA, 2nd edition, 2008.
[40] G. M. D'Angelo, J. Luo, and C. Xiong, "Missing data methods for partial correlation," Journal of Biometrics & Biostatistics, vol. 3, no. 8, p. 155, 2012.
[41] S. Arlot and A. Celisse, "A survey of cross-validation procedures for model selection," Statistics Surveys, vol. 4, pp. 40-79, 2010.
[42] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861-874, 2006.
[43] J. C. Xu, L. Sun, Y. P. Gao, and T. H. Xu, "An ensemble feature selection technique for cancer recognition," Bio-Medical Materials and Engineering, vol. 24, no. 1, pp. 1001-1008, 2014.


Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this work

Acknowledgments

The authors thank Professor Kumanan Head of Departmentof Psychiatry Madurai Medical College Tamil Nadu Indiafor many helpful discussions in the study and interpreta-tion of data for building the classifier Data collection andsharing for this project was funded by the Alzheimerrsquos Dis-ease Neuroimaging Initiative (ADNI) (National Institutes ofHealth Grant U01 AG024904) and DODADNI (Department

of Defense award number W81XWH-12-2-0012) ADNI isfunded by the National Institute on Aging the NationalInstitute of Biomedical Imaging and Bioengineering andthrough generous contributions from the following AbbVieAlzheimerrsquos Association Alzheimerrsquos Drug Discovery Foun-dation Araclon Biotech BioClinica Inc Biogen Bristol-Myers Squibb Company CereSpir Inc Eisai Inc ElanPharmaceuticals Inc Eli Lill and Company EuroImmun FHoffmann-La Roche Ltd and its affiliated company Genen-tech Inc Fujirebio GE Healthcare IXICO Ltd JanssenAlzheimer Immunotherapy Research amp Development LLCJohnson amp Johnson Pharmaceutical Research amp Develop-ment LLC Lumosity Lundbeck Merck amp Co Inc MesoScale Diagnostics LLC NeuroRx Research NeurotrackTechnologies Novartis Pharmaceuticals Corporation PfizerInc Piramal Imaging Servier Takeda Pharmaceutical Com-pany andTransitionTherapeuticsTheCanadian Institutes ofHealth Research is providing funds to support ADNI clinicalsites in Canada Private sector contributions are facilitatedby the Foundation for the National Institutes of Health(wwwfnihorg) The grantee organization is the NorthernCalifornia Institute for Research and Education and thestudy is coordinated by the Alzheimerrsquos Disease CooperativeStudy at the University of California San Diego ADNI dataare disseminated by the Laboratory for Neuro Imaging at theUniversity of Southern California




Computational and Mathematical Methods in Medicine 7

Algorithm CPEMM(DS, threshold)
Input: DS → dataset; merit_threshold → merit value;
       acc_threshold → accuracy threshold; flag = 1
Output: ffs → finalised feature set
Variables:
  A → accuracy vector; MS → merged set vector; bs → bootstrap vector;
  m → merits of subsets obtained from wrapper feature selection;
  N → subset list sorted in descending order of merit;
  S_i → subset with highest ranking;
  k ← number of subsets generated (1..K)

(1) Initialise subset s = null
(2) Generate feature subsets with the ensemble:
      Initialise number of iterations n
      Repeat
          Generate subset pset_k from DS
          Verify merit; S_k ← pset_k
          Optimise the feature subset; m_k ← merit of pset_k
      Until n iterations
(3) Evaluate the accuracy of the classifier with each subset:
      i = 1; m_i = m(1); j = 2
      Evaluate accuracy A_i of S_i
      Repeat
          Repeat
              diff = m_i − m_j
              MS_ij ← S_i and S_j merged
              A_MSij ← accuracy of MS_ij
              If A_i ≤ A_MSij then
                  Append MS_ij to ffs; flag = 1
              else
                  flag = 0; Append S_i to ffs
              endif
              Increment j
          Until flag = 0 or j ≥ k or diff ≥ threshold
          Increment i
      Until A_i ≥ acc_threshold
(4) Bootstrap and validate:
      bs_i ← bootstrap vector from ffs
      Repeat
          Train classifier c_i with bs_i
          Evaluate out-of-bag error
      Until n sets are bootstrapped, with 5-fold cross validation

Algorithm 1
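The merge phase in step (3) above can be sketched in code. The following is an illustrative Python sketch, not the authors' implementation: the `merit_merge` function, the toy accuracy scorer, and the feature names are assumptions chosen for demonstration.

```python
# Illustrative sketch of CPEMM's merit-merge phase (step (3) of Algorithm 1):
# walk merit-ranked subsets and keep merging the next subset in while the
# merged set's accuracy does not drop and the merit gap stays small.

def merit_merge(subsets, merits, evaluate_accuracy,
                merit_threshold=0.5, acc_threshold=0.95):
    """Return a list of consolidated feature subsets (the 'ffs' of Algorithm 1)."""
    # Sort subsets by merit, best first (the sorted list N in Algorithm 1).
    order = sorted(range(len(subsets)), key=lambda i: merits[i], reverse=True)
    subsets = [subsets[i] for i in order]
    merits = [merits[i] for i in order]

    ffs = []
    for i in range(len(subsets)):
        best = list(subsets[i])
        acc = evaluate_accuracy(best)
        for j in range(i + 1, len(subsets)):
            if merits[i] - merits[j] >= merit_threshold:
                break  # merit gap too large: stop merging into S_i
            merged = sorted(set(best) | set(subsets[j]))
            merged_acc = evaluate_accuracy(merged)
            if merged_acc >= acc:
                best, acc = merged, merged_acc  # merging helped: keep it
            else:
                break  # merging stopped helping
        ffs.append(best)
        if acc >= acc_threshold:
            break  # accuracy target reached
    return ffs

# Toy demo: accuracy grows with the number of (hypothetical) relevant features.
relevant = {"MMSE", "CDR", "hippocampus_volume"}
score = lambda s: len(relevant & set(s)) / len(relevant)
subsets = [["MMSE", "CDR"], ["MMSE", "hippocampus_volume"], ["age"]]
merits = [0.9, 0.8, 0.1]
print(merit_merge(subsets, merits, score)[0])
# -> ['CDR', 'MMSE', 'hippocampus_volume']
```

In this toy run the first two subsets are merged because the merged set scores at least as well, while the low-merit third subset is excluded by the merit-gap test.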

Figure 3: Accuracy of classifiers with features selected by PSO and CPEMM methods. The chart plots accuracy (0.7 to 1.1) for each ensemble classifier combined with PSO and CPEMM: Naïve Bayes PSO, J48 PSO, RF PSO, SVM PSO, NB CPEMM, J48 CPEMM, and RF CPEMM.


Table 3: Results for the 3-class datasets with ensembles of NB, J48, RF, and SVM, and with the CPEMM ensembles.

Dataset | NB Ensemble       | J48 Ensemble      | RF Ensemble       | SVM Ensemble      | NB-CPEMM          | J48-CPEMM         | RF-CPEMM
        | Acc   Pre   Rec   | Acc   Pre   Rec   | Acc   Pre   Rec   | Acc   Pre   Rec   | Acc   Pre   Rec   | Acc   Pre   Rec   | Acc   Pre   Rec
NSDS    | 0.838 0.854 0.838 | 0.911 0.895 0.889 | 0.936 0.941 0.799 | 0.647 0.564 0.677 | 0.899 0.904 0.912 | 0.932 0.958 0.933 | 0.936 0.941 0.799
NIDS    | 0.756 0.737 0.556 | 0.936 0.941 0.937 | 0.545 0.645 0.545 | 0.667 0.685 0.734 | 0.834 0.895 0.738 | 0.904 0.934 0.907 | 0.545 0.645 0.545
CIDS-I  | 0.863 0.833 0.823 | 0.916 0.933 0.923 | 0.963 0.963 0.963 | 0.673 0.699 0.788 | 0.896 0.812 0.822 | 0.987 0.955 0.963 | 0.963 0.963 0.963
CIDS-II | 0.756 0.654 0.563 | 0.945 0.931 0.924 | 0.797 0.723 0.634 | 0.685 0.676 0.657 | 0.885 0.823 0.813 | 0.971 0.965 0.923 | 0.888 0.856 0.833

(NSDS: neuropsychological dataset; NIDS: neuroimaging dataset; CIDS: combined dataset. Acc: accuracy; Pre: precision; Rec: recall.)


Table 4: Results with the maximal feature subset obtained by the divide-and-merge feature selection technique.

Classifier | Normal class      | Dementia          | Mild Cognitive Impairment
           | Acc   Pre   Rec   | Acc   Pre   Rec   | Acc   Pre   Rec
J48        | 0.963 0.963 0.963 | 0.978 0.953 0.967 | 0.966 0.968 0.987
J48-TT     | 0.983 0.972 0.977 | 0.986 0.973 0.963 | 0.977 0.976 0.977
J48-CV     | 0.965 0.954 0.945 | 0.966 0.956 0.955 | 0.964 0.954 0.945

(TT: training and testing; CV: cross validation.)

Figure 4: ROC analysis for datasets with CPEMM. Comparison of the area under the curve obtained by ordinary ensemble feature selection (EN) versus CPEMM for the four datasets: for each dataset (I to IV), the ROC area (0 to 1.2) of EN and of CPEMM is plotted for the Random forest, J48, and Naïve Bayes classifiers.

and the C4.5 ensemble classifier, while the other classifiers tested, that is, NB, RF, and SVM, had a median approximately between 0.75 and 0.88.

With C4.5 as the base classifier, the features selected by PSO in combination with the NB, SVM, C4.5, and RF ensembles are listed in Table 3. The CPEMM method is applied to merge subsets based on merit, and the resultant subsets from each dataset are evaluated with the C4.5 classifier. The accuracy obtained for each class (NL, AD, and MCI) is evaluated on each dataset, and the sensitivity of the classifier in multiclass classification using the CPEMM approach is tabulated in Table 4.

CPEMM was applied to the feature sets obtained by NB, C4.5, and RF, since the sensitivity of the SVM classifier was very low compared with the other classifiers. The nonlinear RBF kernel was the best-fitting kernel for SVM, yet the accuracy obtained was below 70%. Hence the CPEMM strategy was applied and tested with NB, C4.5, and RF.
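Sensitivity in this comparison is per-class recall. As a minimal, self-contained sketch (the labels and predictions below are invented toy values, not study data):

```python
# Per-class sensitivity (recall): of the cases that truly belong to a class,
# the fraction the classifier recovered.

def per_class_recall(y_true, y_pred, classes):
    recall = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        actual = sum(1 for t in y_true if t == c)
        recall[c] = tp / actual if actual else 0.0
    return recall

# Toy example with the three diagnostic classes used in this study.
y_true = ["NL", "NL", "MCI", "MCI", "AD", "AD"]
y_pred = ["NL", "NL", "MCI", "AD", "AD", "AD"]
print(per_class_recall(y_true, y_pred, ["NL", "MCI", "AD"]))
# -> {'NL': 1.0, 'MCI': 0.5, 'AD': 1.0}
```

A classifier can score well overall while still being insensitive to one class, which is why the per-class values are reported separately in Table 4.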

The discriminating efficiency of J48 with respect to the three classes (Normal, Dementia, and Mild Cognitive Impairment) was evaluated. Classification of the Normal class had higher sensitivity than the delineation of Mild Cognitive Impairment and Dementia; the results are given in Table 4. Ensemble feature selection returned a list of subsets with higher merit, and the CPEMM technique merged and evaluated the accuracy of successive subsets with higher merits. The efficiency of the classifier with features selected using CPEMM, compared with features selected by conventional ensemble feature selection, is shown through ROC analysis in Figure 4: the ROC area obtained with the four datasets is plotted, with the ROC of individual ensemble feature selection plotted alongside the ROC obtained with CPEMM. Table 5 describes the features of the ensemble model used for classification.

Table 5: Description of the J48 ensemble model used for the multiclass classification.

Details                                   Value
Split method                              Binary split
Cross validation accuracy                 0.976
AUC with CV                               0.971
Train and test accuracy                   0.986
AUC with train and test                   0.987
Common features selected by all methods   MMSE, CDR, hippocampus volume, and everyday cognition measures
Features added by CPEMM                   Entorhinal measures, CDR-SB, and the Rey Auditory Verbal Learning Test (immediate)

CPEMM yielded higher area under the curve values for all four datasets examined in our study.
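The area under the ROC curve compared in Figure 4 can be computed directly from scores and labels through the rank-sum identity; the sketch below uses invented toy scores, not study data.

```python
# AUC via the Mann-Whitney identity: the probability that a randomly chosen
# positive case receives a higher score than a randomly chosen negative case
# (ties counted as one half).

def auc(labels, scores):
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.7, 0.4, 0.2]  # one positive scored below a negative
print(auc(labels, scores))  # 8 of 9 positive-negative pairs ranked correctly
```

For the 3-class problem of this study, such a binary AUC would be computed per class (one-versus-rest) and averaged, which is how a single "ROC area" per method can be reported.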

5. Conclusion

The C4.5 classifier provided better accuracy and sensitivity in the multiclass classification of Alzheimer's Dementia. The ensemble of C4.5 classifiers selected the best-fitting subset for the evaluation of the three different classes, with the highest recall value of 98.7% for the class MCI. It was evident that features selected by the C4.5 algorithm also further increased the performance of the Random forest and Naïve Bayes classifiers. The proposed ensemble with PSO search selected the minimal subset needed for the discrimination of the diseases. The Merit Merge approach further enhanced feature selection by identifying the effective consolidated subset that could be used for the clinical diagnosis of dementia and of Mild Cognitive Impairment that will lead to dementia. Our work also confirmed that the performance of SVM in the delineation of Mild Cognitive Impairment and Dementia is very low compared with Random forest, Naïve Bayes, and the C4.5 algorithm, as reported by Williams et al. [23]. Although the performance of Random forest was comparable to that of C4.5 and NB in the discrimination of 2-class data, it achieved an accuracy of only approximately 75% on the 3-class problem. CPEMM was able to predict the relevant features for all datasets, especially the CIDS. The proposed split-and-merge ensemble approach can be applied to any 3-class classification problem. With preliminary feature reduction, it can also be extended to the classification of high-dimensional datasets such as microarray data.

Classification with NB for the discrimination of Dementia and MCI in previous studies resulted in an accuracy of around 80% and a sensitivity of approximately 70% [14, 23]. Our CPEMM, based on a Bagging ensemble of J48 with the Merit Merge technique, yielded a higher accuracy of 98.7% with the train-and-test method [43]. The Bagging approach, learning from more than one classifier, found the minimal subset for effective diagnostics, and the Merit Merge approach found all highly relevant subsets that contribute towards the multiclass classification. The proposed approach yielded a statistically significant difference, with a mean area under the curve of approximately 0.977 in the multivariate classification of Dementia.

Bagging ensemble models provide a promising, low-error, and statistically significant machine learning method for disease diagnosis. The proposed methodology can be applied to disease state prediction even with class-imbalanced datasets.
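The bagging-with-out-of-bag-validation loop of step (4) of Algorithm 1 can be sketched as below. The majority-class stub learner and the toy, class-imbalanced dataset are assumptions for illustration only; the study's actual base learner is J48.

```python
# Sketch of a bagging loop: draw bootstrap samples, fit one learner per
# sample, and estimate error on the out-of-bag (OOB) rows each learner
# never saw during training.

import random

def bagged_oob_error(data, n_models, seed=0):
    rng = random.Random(seed)
    oob_errors = []
    for _ in range(n_models):
        idx = [rng.randrange(len(data)) for _ in range(len(data))]  # bootstrap
        chosen = set(idx)
        train = [data[i] for i in idx]
        oob = [row for i, row in enumerate(data) if i not in chosen]
        # Stub learner: predict the majority label of the bootstrap sample.
        labels = [y for _, y in train]
        majority = max(set(labels), key=labels.count)
        if oob:  # a bootstrap sample occasionally covers every row
            oob_errors.append(sum(y != majority for _, y in oob) / len(oob))
    return sum(oob_errors) / len(oob_errors)

# Toy, class-imbalanced dataset: 6 "NL" rows versus 2 "AD" rows.
data = [([0], "NL")] * 6 + [([1], "AD")] * 2
err = bagged_oob_error(data, n_models=25)
print(0.0 <= err <= 1.0)  # -> True
```

Because each learner sees a resampled training set, the OOB rows give an almost-free error estimate without a separate validation split, which is one reason bagging suits small clinical datasets.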

Disclosure

T. R. Sivapriya is a registered user of ADNI. Data used in preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this work.

Acknowledgments

The authors thank Professor Kumanan, Head of the Department of Psychiatry, Madurai Medical College, Tamil Nadu, India, for many helpful discussions in the study and in the interpretation of data for building the classifier. Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

[1] M. Prince, E. Albanese, M. Guerchet, and M. Prina, Alzheimer Report: Dementia and Risk Reduction: An Analysis of Protective and Modifiable Factors, Alzheimer's Disease International, London, UK, 2014, http://www.alz.co.uk/research/WorldAlzheimerReport2014.pdf.

[2] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro, and C. P. Ferri, "The global prevalence of dementia: a systematic review and metaanalysis," Alzheimer's & Dementia, vol. 9, no. 1, pp. 63.e2–75.e2, 2013.

[3] Y. Xia, "GA and AdaBoost-based feature selection and combination for automated identification of dementia using FDG-PET imaging," in Intelligent Science and Intelligent Data Engineering, vol. 7202 of Lecture Notes in Computer Science, pp. 128–135, Springer, Berlin, Germany, 2012.

[4] B. Magnin, L. Mesrob, S. Kinkingnehun et al., "Support vector machine-based classification of Alzheimer's disease from whole-brain anatomical MRI," Neuroradiology, vol. 51, no. 2, pp. 73–83, 2009.

[5] M. C. Tierney, R. Moineddin, and I. McDowell, "Prediction of all-cause dementia using neuropsychological tests within 10 and 5 years of diagnosis in a community-based sample," Journal of Alzheimer's Disease, vol. 22, no. 4, pp. 1231–1240, 2010.

[6] M. N. Rossor, N. C. Fox, C. J. Mummery, J. M. Schott, and J. D. Warren, "The diagnosis of young-onset dementia," The Lancet Neurology, vol. 9, no. 8, pp. 793–806, 2010.

[7] R. M. Chapman, M. Mapstone, J. W. McCrary et al., "Predicting conversion from mild cognitive impairment to Alzheimer's disease using neuropsychological tests and multivariate methods," Journal of Clinical and Experimental Neuropsychology, vol. 33, no. 2, pp. 187–199, 2011.

[8] A. Pozueta, E. Rodríguez-Rodríguez, J. L. Vázquez-Higuera et al., "Detection of early Alzheimer's disease in MCI patients by the combination of MMSE and an episodic memory test," BMC Neurology, vol. 11, article 78, 2011.

[9] M. Quintana, J. Guardia, G. Sánchez-Benavides et al., "Using artificial neural networks in clinical neuropsychology: high performance in mild cognitive impairment and Alzheimer's disease," Journal of Clinical and Experimental Neuropsychology, vol. 34, no. 2, pp. 195–208, 2012.

[10] R. Cuingnet, E. Gerardin, J. Tessieras et al., "Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database," NeuroImage, vol. 56, no. 2, pp. 766–781, 2011.

[11] M. Q. Wilks, H. Protas, M. Wardak et al., "Automated VOI analysis in FDDNP PET using structural warping: validation through classification of Alzheimer's disease patients," International Journal of Alzheimer's Disease, vol. 2012, Article ID 512069, 8 pages, 2012.

[12] S. Klöppel, C. M. Stonnington, J. Barnes et al., "Accuracy of dementia diagnosis: a direct comparison between radiologists and a computerized method," Brain, vol. 131, no. 11, pp. 2969–2974, 2008.

[13] A. Larner, "Comparing diagnostic accuracy of cognitive screening instruments: a weighted comparison approach," Dementia and Geriatric Cognitive Disorders Extra, vol. 3, no. 1, pp. 60–65, 2013.

[14] J. Maroco, D. Silva, A. Rodrigues, M. Guerreiro, I. Santana, and A. de Mendonça, "Data mining methods in the prediction of dementia: a real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests," BMC Research Notes, vol. 4, article 299, 14 pages, 2011.

[15] W. Yu, T. Liu, R. Valdez, M. Gwinn, and M. J. Khoury, "Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes," BMC Medical Informatics and Decision Making, vol. 10, no. 1, article 16, 2010.

[16] P. R. Hachesu, M. Ahmadi, S. Alizadeh, and F. Sadoughi, "Use of data mining techniques to determine and predict length of stay of cardiac patients," Healthcare Informatics Research, vol. 19, no. 2, pp. 121–129, 2013.

[17] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16–18, pp. 3273–3283, 2010.

[18] S. Maldonado, R. Weber, and J. Basak, "Simultaneous feature selection and classification using kernel-penalized support vector machines," Information Sciences, vol. 181, no. 1, pp. 115–128, 2011.

[19] J. Geraci, M. Dharsee, P. Nuin et al., "Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems," Bioinformatics, vol. 30, no. 5, pp. 712–718, 2014.

[20] X. Chen, M. Wang, and H. Zhang, "The use of classification trees for bioinformatics," Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55–63, 2011.

[21] M. L. Calle, V. Urrea, A.-L. Boulesteix, and N. Malats, "AUC-RF: a new strategy for genomic profiling with random forest," Human Heredity, vol. 72, no. 2, pp. 121–132, 2011.

[22] K. R. Gray, P. Aljabar, R. A. Heckemann, A. Hammers, and D. Rueckert, "Random forest-based similarity measures for multi-modal classification of Alzheimer's disease," NeuroImage, vol. 65, pp. 167–175, 2013.

[23] J. A. Williams, A. Weakley, D. J. Cook, and M. Schmitter-Edgecombe, "Machine learning techniques for diagnostic differentiation of mild cognitive impairment and dementia," in Proceedings of the 27th AAAI Conference on Artificial Intelligence Workshop, 2013.

[24] B. Chizi and O. Maimon, "Dimension reduction and feature selection," in Data Mining and Knowledge Discovery Handbook, pp. 93–111, 2005.

[25] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157–1182, 2003.

[26] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[27] O. Uncu and I. B. Turksen, "A novel feature selection approach: combining feature wrappers and filters," Information Sciences, vol. 177, no. 2, pp. 449–466, 2007.

[28] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 81–86, Seoul, The Republic of Korea, May 2001.

[29] H. H. Inbarani, A. T. Azar, and G. Jothi, "Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis," Computer Methods and Programs in Biomedicine, vol. 113, no. 1, pp. 175–185, 2014.

[30] P. Bühlmann and T. Hothorn, "Boosting algorithms: regularization, prediction and model fitting," Statistical Science, vol. 22, no. 4, pp. 477–505, 2007.

[31] L. Rokach, Pattern Classification Using Ensemble Methods, World Scientific, Singapore, 2009.

[32] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[34] A. Liaw and M. Wiener, "Classification and regression by random forest," R News, vol. 2, no. 3, pp. 18–22, 2002.

[35] S. Janitza, C. Strobl, and A. L. Boulesteix, "An AUC-based permutation variable importance measure for random forests," BMC Bioinformatics, vol. 14, article 119, 2013.

[36] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, Boston, Mass, USA, 1993.

[37] S. B. Kotsiantis, "Supervised machine learning: a review of classification techniques," Informatica, vol. 31, no. 3, pp. 249–268, 2007.

[38] H. Azami, K. Mohammadi, and B. Bozorgtabar, "An improved signal segmentation using moving average and Savitzky-Golay filter," Journal of Signal and Information Processing, vol. 3, no. 1, pp. 39–44, 2012.

[39] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, Hoboken, NJ, USA, 2nd edition, 2008.

[40] G. M. D'Angelo, J. Luo, and C. Xiong, "Missing data methods for partial correlation," Journal of Biometrics & Biostatistics, vol. 3, no. 8, p. 155, 2012.

[41] S. Arlot and A. Celisse, "A survey of cross-validation procedures for model selection," Statistics Surveys, vol. 4, pp. 40–79, 2010.

[42] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.

[43] J. C. Xu, L. Sun, Y. P. Gao, and T. H. Xu, "An ensemble feature selection technique for cancer recognition," Bio-Medical Materials and Engineering, vol. 24, no. 1, pp. 1001–1008, 2014.

8 Computational and Mathematical Methods in Medicine

Table 3: Results for the 3-class dataset with ensembles of NB, J48, RF, and SVM, and with the CPEMM ensembles (NSDS: neuropsychological dataset; NIDS: neuroimaging dataset; CIDS: combined dataset; Acc: accuracy; Pre: precision; Rec: recall).

Dataset   NB Ensemble          J48 Ensemble         RF Ensemble          SVM Ensemble         NB-CPEMM             J48-CPEMM            RF-CPEMM
          Acc   Pre   Rec      Acc   Pre   Rec      Acc   Pre   Rec      Acc   Pre   Rec      Acc   Pre   Rec      Acc   Pre   Rec      Acc   Pre   Rec
NSDS      0.838 0.854 0.838    0.911 0.895 0.889    0.936 0.941 0.799    0.647 0.564 0.677    0.899 0.904 0.912    0.932 0.958 0.933    0.936 0.941 0.799
NIDS      0.756 0.737 0.556    0.936 0.941 0.937    0.545 0.645 0.545    0.667 0.685 0.734    0.834 0.895 0.738    0.904 0.934 0.907    0.545 0.645 0.545
CIDS-I    0.863 0.833 0.823    0.916 0.933 0.923    0.963 0.963 0.963    0.673 0.699 0.788    0.896 0.812 0.822    0.987 0.955 0.963    0.963 0.963 0.963
CIDS-II   0.756 0.654 0.563    0.945 0.931 0.924    0.797 0.723 0.634    0.685 0.676 0.657    0.885 0.823 0.813    0.971 0.965 0.923    0.888 0.856 0.833


Table 4: Results with the maximal feature subset obtained by the divide and merge feature selection technique (TT: training and testing; CV: cross validation).

Classifier   Normal class           Dementia               Mild Cognitive Impairment
             Acc    Pre    Rec      Acc    Pre    Rec      Acc    Pre    Rec
J48          0.963  0.963  0.963    0.978  0.953  0.967    0.966  0.968  0.987
J48-TT       0.983  0.972  0.977    0.986  0.973  0.963    0.977  0.976  0.977
J48-CV       0.965  0.954  0.945    0.966  0.956  0.955    0.964  0.954  0.945

[Figure 4: Comparison of the area under the curve obtained by the ordinary ensemble versus CPEMM for the four datasets. The chart ("ROC analysis for datasets with CPEMM") plots ROC area (y-axis, 0 to 1.2) for EN and CPEMM feature selection on datasets I-IV (x-axis), for the Random forest, J48, and Naïve Bayes classifiers.]

and C4.5 ensemble classifier, while the other classifiers tested, that is, NB, RF, and SVM, had medians ranging from approximately 0.75 to 0.88.

With C4.5 as the base classifier, the features selected by PSO in combination with the NB, SVM, C4.5, and RF ensembles are listed in Table 3. The CPEMM method is applied to merge subsets based on merit, and the resultant subsets from each dataset are evaluated with the C4.5 classifier. The accuracy obtained for each class (NL, AD, and MCI) is evaluated on each dataset. The sensitivity of the classifier in the multiclass classification using the CPEMM approach is tabulated in Table 4.
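The paper gives no pseudocode for the merge step. Under the assumption that CPEMM unions merit-ranked candidate subsets greedily and keeps a union only when the evaluation score improves, a minimal stdlib sketch could look like the following; the subset contents and the evaluate() scorer are invented for illustration, not the authors' exact procedure:

```python
# Hypothetical sketch of a merit-based merge; the feature names, merits, and
# scoring callback below are illustrative assumptions.

def merit_merge(subsets, evaluate):
    """subsets: list of (feature_set, merit) pairs from the ensemble selectors.
    evaluate: callback returning a classification score for a feature set."""
    # Rank the candidate subsets by merit, highest first.
    ranked = sorted(subsets, key=lambda s: s[1], reverse=True)
    best_features = set(ranked[0][0])
    best_score = evaluate(best_features)
    # Greedily merge the next-best subset; keep the union only if it improves.
    for features, _merit in ranked[1:]:
        candidate = best_features | set(features)
        score = evaluate(candidate)
        if score > best_score:
            best_features, best_score = candidate, score
    return best_features, best_score

# Toy run: the scorer rewards covering the two "informative" features.
informative = {"MMSE", "CDR"}
subsets = [({"MMSE"}, 0.9), ({"CDR"}, 0.8), ({"noise"}, 0.1)]
best, score = merit_merge(subsets, lambda fs: len(fs & informative) / 2)
```

On the toy input the merge keeps the union of the two high-merit subsets and discards the noise feature, since adding it does not raise the score.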

CPEMM was applied to the feature sets obtained by NB, C4.5, and RF, since the sensitivity of the SVM classifier was very low compared with the other classifiers. The nonlinear RBF kernel was the best-fitting kernel for SVM, yet the accuracy obtained was below 70%. Hence the CPEMM strategy was applied and tested with NB, C4.5, and RF.

The discriminating efficiency of J48 with respect to the three classes Normal, Dementia, and Mild Cognitive Impairment was evaluated. Classification of the Normal class had higher sensitivity than the delineation of Mild Cognitive Impairment and Dementia; the results are given in Table 4. Ensemble feature selection returned a list of subsets with higher merit, and the CPEMM technique merged and evaluated the accuracy of successive subsets with higher merits. The efficiency of the classifier with features selected using CPEMM versus features selected with conventional ensemble feature selection is compared through ROC analysis in Figure 4. The ROC area obtained with each of the four datasets is plotted in the graph, with the ROC of individual ensemble feature selection shown alongside the ROC obtained with CPEMM.
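The per-method ROC areas in Figure 4 come from the study's data, but the AUC statistic itself is straightforward to compute from classifier scores. A minimal stdlib sketch of the rank-based (Mann-Whitney) form of AUC, the quantity described by Fawcett [42]:

```python
def roc_auc(scores, labels):
    """AUC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs the scores rank correctly, ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    pairs = len(pos) * len(neg)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / pairs
```

For a 3-class problem such as NL/AD/MCI, this binary AUC would be computed one-vs-rest per class and averaged; the mean AUC values reported here presumably aggregate per-class curves in that spirit.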

Table 5: Description of the J48 ensemble model used for the multiclass classification.

Details                                   Value
Split method                              Binary split
Cross validation accuracy                 0.976
AUC with CV                               0.971
Train and test accuracy                   0.986
AUC with train and test                   0.987
Common features selected by all methods   MMSE, CDR, hippocampus volume, and everyday cognition measures
Features added by CPEMM                   Entorhinal measures, CDRSB, and Rey Auditory Verbal Learning Test-immediate

Table 5 describes the features of the ensemble model used for classification.

CPEMM yielded higher area under the curve values for all four datasets used in our study.

5. Conclusion

The C4.5 classifier provided better accuracy and sensitivity in the multiclass classification of Alzheimer's Dementia. The ensemble of C4.5 classifiers selected the best-fit subset for the evaluation of the three different classes, with the highest recall value of 0.987 for the class MCI. It was evident that features selected by the C4.5 algorithm further increased the performance of the Random forest and Naïve Bayes classifiers as well. The proposed ensemble with PSO search selected the minimal subset that


is needed for the discrimination of diseases. The Merit Merge approach further enhanced the feature selection by identifying the effective consolidated subset that could be used for the clinical diagnosis of dementia and of Mild Cognitive Impairment that will lead to dementia. Our work also confirmed that the performance of SVM in the delineation of Mild Cognitive Impairment and Dementia is very low compared to the Random forest, Naïve Bayes, and C4.5 algorithms, as reported by Williams et al. [23]. Although the performance of Random forest was comparable to C4.5 and NB in the discrimination of 2-class data, it achieved an accuracy of only approximately 75% on the 3-class problem. CPEMM was able to predict the relevant features for all datasets, especially the CIDS. The proposed split and merge ensemble approach can be applied to any 3-class classification problem, and it can be extended to the classification of high-dimensional datasets such as microarray data, given preliminary feature reduction.

Classification with NB for the discrimination of Dementia and MCI in previous studies resulted in accuracy of around 80% and sensitivity of approximately 70% [14, 23]. Our CPEMM, based on a Bagging ensemble of J48 with the Merit Merge technique, yielded a higher accuracy of 98.7% under the train and test method [43]. The Bagging approach, learning from more than one classifier, found the minimal subset for effective diagnostics, while the Merit Merge approach found all highly relevant subsets that contribute towards the multiclass classification. The proposed approach yielded a statistically significant difference, with a mean area under the curve of approximately 0.977 in the multivariate classification of Dementia.
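The bagging procedure underlying the ensemble can be sketched in a few lines. This is a generic illustration, not the authors' implementation: the paper's base learner is J48/C4.5, for which the one-feature threshold stump below is a deliberately simple stand-in.

```python
import random
from collections import Counter

def fit_stump(sample):
    """One-feature threshold learner: split at the midpoint of class means."""
    ones = [x for x, y in sample if y == 1]
    zeros = [x for x, y in sample if y == 0]
    if not ones or not zeros:            # degenerate bootstrap sample:
        majority = Counter(y for _, y in sample).most_common(1)[0][0]
        return lambda x, m=majority: m   # fall back to the majority class
    t = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
    return lambda x, t=t: 1 if x >= t else 0

def bagging_train(data, fit, n_models=25, seed=0):
    """Fit n_models base learners, each on a bootstrap resample of data."""
    rng = random.Random(seed)
    return [fit([rng.choice(data) for _ in data]) for _ in range(n_models)]

def bagging_predict(models, x):
    """Majority vote over the ensemble's predictions."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

# Toy run on a well-separated one-feature problem.
data = [(x, 0) for x in (0, 1, 2, 3)] + [(x, 1) for x in (10, 11, 12, 13)]
models = bagging_train(data, fit_stump)
```

Because each base learner sees a different bootstrap resample, the majority vote averages out individual learners' variance, which is the property the Bagging ensemble exploits here.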

Bagging ensemble models provide a promising, statistically significant machine learning method for disease diagnosis with low error. The proposed methodology can be applied for disease state prediction even with class-imbalanced datasets.

Disclosure

T. R. Sivapriya is a registered user of ADNI. Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this work.

Acknowledgments

The authors thank Professor Kumanan, Head of the Department of Psychiatry, Madurai Medical College, Tamil Nadu, India, for many helpful discussions in the study and the interpretation of data for building the classifier. Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

[1] M. Prince, E. Albanese, M. Guerchet, and M. Prina, Alzheimer Report: Dementia and Risk Reduction: An Analysis of Protective and Modifiable Factors, Alzheimer's Disease International, London, UK, 2014, http://www.alz.co.uk/research/WorldAlzheimerReport2014.pdf.

[2] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro, and C. P. Ferri, "The global prevalence of dementia: a systematic review and metaanalysis," Alzheimer's & Dementia, vol. 9, no. 1, pp. 63e2-75e2, 2013.

[3] Y. Xia, "GA and AdaBoost-based feature selection and combination for automated identification of dementia using FDG-PET imaging," in Intelligent Science and Intelligent Data Engineering, vol. 7202 of Lecture Notes in Computer Science, pp. 128-135, Springer, Berlin, Germany, 2012.

[4] B. Magnin, L. Mesrob, S. Kinkingnehun et al., "Support vector machine-based classification of Alzheimer's disease from whole-brain anatomical MRI," Neuroradiology, vol. 51, no. 2, pp. 73-83, 2009.

[5] M. C. Tierney, R. Moineddin, and I. McDowell, "Prediction of all-cause dementia using neuropsychological tests within 10 and 5 years of diagnosis in a community-based sample," Journal of Alzheimer's Disease, vol. 22, no. 4, pp. 1231-1240, 2010.

[6] M. N. Rossor, N. C. Fox, C. J. Mummery, J. M. Schott, and J. D. Warren, "The diagnosis of young-onset dementia," The Lancet Neurology, vol. 9, no. 8, pp. 793-806, 2010.

[7] R. M. Chapman, M. Mapstone, J. W. McCrary et al., "Predicting conversion from mild cognitive impairment to Alzheimer's disease using neuropsychological tests and multivariate methods," Journal of Clinical and Experimental Neuropsychology, vol. 33, no. 2, pp. 187-199, 2011.

[8] A. Pozueta, E. Rodríguez-Rodríguez, J. L. Vázquez-Higuera et al., "Detection of early Alzheimer's disease in MCI patients by the combination of MMSE and an episodic memory test," BMC Neurology, vol. 11, article 78, 2011.

[9] M. Quintana, J. Guradia, G. Sánchez-Benavides et al., "Using artificial neural networks in clinical neuropsychology: high performance in mild cognitive impairment and Alzheimer's disease," Journal of Clinical and Experimental Neuropsychology, vol. 34, no. 2, pp. 195-208, 2012.

[10] R. Cuingnet, E. Gerardin, J. Tessieras et al., "Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database," NeuroImage, vol. 56, no. 2, pp. 766-781, 2011.

[11] M. Q. Wilks, H. Protas, M. Wardak et al., "Automated VOI analysis in FDDNP PET using structural warping: validation through classification of Alzheimer's disease patients," International Journal of Alzheimer's Disease, vol. 2012, Article ID 512069, 8 pages, 2012.

[12] S. Klöppel, C. M. Stonnington, J. Barnes et al., "Accuracy of dementia diagnosis—a direct comparison between radiologists and a computerized method," Brain, vol. 131, no. 11, pp. 2969-2974, 2008.

[13] A. Larner, "Comparing diagnostic accuracy of cognitive screening instruments: a weighted comparison approach," Dementia and Geriatric Cognitive Disorders Extra, vol. 3, no. 1, pp. 60-65, 2013.

[14] J. Maroco, D. Silva, A. Rodrigues, M. Guerreiro, I. Santana, and A. de Mendonça, "Data mining methods in the prediction of dementia: a real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests," BMC Research Notes, vol. 4, article 299, 14 pages, 2011.

[15] W. Yu, T. Liu, R. Valdez, M. Gwinn, and M. J. Khoury, "Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes," BMC Medical Informatics and Decision Making, vol. 10, no. 1, article 16, 2010.

[16] P. R. Hachesu, M. Ahmadi, S. Alizadeh, and F. Sadoughi, "Use of data mining techniques to determine and predict length of stay of cardiac patients," Healthcare Informatics Research, vol. 19, no. 2, pp. 121-129, 2013.

[17] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16-18, pp. 3273-3283, 2010.

[18] S. Maldonado, R. Weber, and J. Basak, "Simultaneous feature selection and classification using kernel-penalized support vector machines," Information Sciences, vol. 181, no. 1, pp. 115-128, 2011.

[19] J. Geraci, M. Dharsee, P. Nuin et al., "Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems," Bioinformatics, vol. 30, no. 5, pp. 712-718, 2014.

[20] X. Chen, M. Wang, and H. Zhang, "The use of classification trees for bioinformatics," Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55-63, 2011.

[21] M. L. Calle, V. Urrea, A.-L. Boulesteix, and N. Malats, "AUC-RF: a new strategy for genomic profiling with random forest," Human Heredity, vol. 72, no. 2, pp. 121-132, 2011.

[22] K. R. Gray, P. Aljabar, R. A. Heckemann, A. Hammers, and D. Rueckert, "Random forest-based similarity measures for multi-modal classification of Alzheimer's disease," NeuroImage, vol. 65, pp. 167-175, 2013.

[23] J. A. Williams, A. Weakley, D. J. Cook, and M. Schmitter-Edgecombe, "Machine learning techniques for diagnostic differentiation of mild cognitive impairment and dementia," in Proceedings of the 27th AAAI Conference on Artificial Intelligence Workshop, 2013.

[24] B. Chizi and O. Maimon, "Dimension reduction and feature selection," in Data Mining and Knowledge Discovery Handbook, pp. 93-111, 2005.

[25] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157-1182, 2003.

[26] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1-3, pp. 389-422, 2002.

[27] O. Uncu and I. B. Turksen, "A novel feature selection approach combining feature wrappers and filters," Information Sciences, vol. 177, no. 2, pp. 449-466, 2007.

[28] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 81-86, Seoul, Republic of Korea, May 2001.

[29] H. H. Inbarani, A. T. Azar, and G. Jothi, "Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis," Computer Methods and Programs in Biomedicine, vol. 113, no. 1, pp. 175-185, 2014.

[30] P. Bühlmann and T. Hothorn, "Boosting algorithms: regularization, prediction and model fitting," Statistical Science, vol. 22, no. 4, pp. 477-505, 2007.

[31] L. Rokach, Pattern Classification Using Ensemble Methods, World Scientific, Singapore, 2009.

[32] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.

[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.

[34] A. Liaw and M. Wiener, "Classification and regression by randomForest," R News, vol. 2, no. 3, pp. 18-22, 2002.

[35] S. Janitza, C. Strobl, and A. L. Boulesteix, "An AUC-based permutation variable importance measure for random forests," BMC Bioinformatics, vol. 14, article 119, 2013.

[36] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, Boston, Mass, USA, 1993.

[37] S. B. Kotsiantis, "Supervised machine learning: a review of classification techniques," Informatica, vol. 31, no. 3, pp. 249-268, 2007.

[38] H. Azami, K. Mohammadi, and B. Bozorgtabar, "An improved signal segmentation using moving average and Savitzky-Golay filter," Journal of Signal and Information Processing, vol. 3, no. 1, pp. 39-44, 2012.

[39] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, Hoboken, NJ, USA, 2nd edition, 2008.

[40] G. M. D'Angelo, J. Luo, and C. Xiong, "Missing data methods for partial correlation," Journal of Biometrics & Biostatistics, vol. 3, no. 8, p. 155, 2012.

[41] S. Arlot and A. Celisse, "A survey of cross-validation procedures for model selection," Statistics Surveys, vol. 4, pp. 40-79, 2010.

[42] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861-874, 2006.

[43] J. C. Xu, L. Sun, Y. P. Gao, and T. H. Xu, "An ensemble feature selection technique for cancer recognition," Bio-Medical Materials and Engineering, vol. 24, no. 1, pp. 1001-1008, 2014.


[15] W Yu T Liu R Valdez M Gwinn and M J Khoury ldquoAppli-cation of support vector machine modeling for prediction ofcommon diseases the case of diabetes and pre-diabetesrdquo BMCMedical Informatics and Decision Making vol 10 no 1 article16 2010

[16] P R HachesuM Ahmadi S Alizadeh and F Sadoughi ldquoUse ofdata mining techniques to determine and predict length of stayof cardiac patientsrdquoHealthcare Informatics Research vol 19 no2 pp 121ndash129 2013

[17] M M Kabir M M Islam and K Murase ldquoA new wrapper fea-ture selection approach using neural networkrdquo Neurocomput-ing vol 73 no 16ndash18 pp 3273ndash3283 2010

[18] S Maldonado R Weber and J Basak ldquoSimultaneous featureselection and classification using kernel-penalized supportvector machinesrdquo Information Sciences vol 181 no 1 pp 115ndash128 2011

[19] J Geraci M Dharsee P Nuin et al ldquoExploring high dimen-sional data with Butterfly a novel classification algorithm basedondiscrete dynamical systemsrdquoBioinformatics vol 30 no 5 pp712ndash718 2014

[20] XChenMWang andHZhang ldquoTheuse of classification treesfor bioinformaticsrdquoData Mining and Knowledge Discovery vol1 no 1 pp 55ndash63 2011

[21] M L Calle V Urrea A-L Boulesteix and N Malats ldquoAUC-RF a new strategy for genomic profiling with random forestrdquoHuman Heredity vol 72 no 2 pp 121ndash132 2011

[22] K R Gray P Aljabar R A Heckemann A Hammers and DRueckert ldquoRandom forest-based similarity measures for multi-modal classification of Alzheimerrsquos diseaserdquo NeuroImage vol65 pp 167ndash175 2013

[23] J A Williams A Weakley D J Cook and M Schmitter-Edgecombe ldquoMachine learning techniques for diagnostic dif-ferentiation of mild cognitive impairment and dementiardquo inProceedings of the 27thAAAIConference onArtificial IntelligenceWorkshop 2013

[24] B Chizi and O Maimon ldquoDimension reduction and featureselectionrdquo in Data Mining and Knowledge Discovery Handbookpp 93ndash111 2005

[25] I Guyon and A Elisseeff ldquoAn introduction to variable andfeature selectionrdquo Journal of Machine Learning Research vol 3pp 1157ndash1182 2003

[26] I Guyon J Weston S Barnhill and V Vapnik ldquoGene selec-tion for cancer classification using support vector machinesrdquoMachine Learning vol 46 no 1ndash3 pp 389ndash422 2002

[27] O Uncu and I B Turksen ldquoA novel feature selection approachcombining feature wrappers and filtersrdquo Information Sciencesvol 177 no 2 pp 449ndash466 2007

[28] R C Eberhart and Y Shi ldquoParticle swarm optimizationdevelopments applications and resourcesrdquo in Proceedings of theIEEE International Conference on Evolutionary Computationpp 81ndash86 Seoul The Republic of Korea May 2001

[29] H H Inbarani A T Azar and G Jothi ldquoSupervised hybridfeature selection based on PSO and rough sets for medicaldiagnosisrdquo Computer Methods and Programs in Biomedicinevol 113 no 1 pp 175ndash185 2014

[30] P Buhlmann and T Hothorn ldquoBoosting algorithms regulariza-tion prediction and model fittingrdquo Statistical Science vol 22no 4 pp 477ndash505 2007

[31] L Rocach Pattern Classification Using Ensemble MethodsWorld Scientific Singapore 2009

[32] C Cortes and V Vapnik ldquoSupport-vector networksrdquo MachineLearning vol 20 no 3 pp 273ndash297 1995

[33] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[34] A Liaw and M Wiener ldquoClassification and regression byrandom forestrdquo R News vol 2 no 3 pp 18ndash22 2002

[35] S Janitza C Strobl and A L Boulesteix ldquoAn AUC-based per-mutation variable importance measure for random forestsrdquoBMC Bioinformatics vol 14 article 119 2013

[36] J R Quinlan C45 Programs for Machine Learning MorganKaufmann Boston Mass USA 1993

[37] S B Kotsiantis ldquoSupervised machine learning a review of clas-sification techniquesrdquo Informatica vol 31 no 3 pp 249ndash2682007

[38] H Azami K Mohammadi and B Bozorgtabar ldquoAn improvedsignal segmentation using moving average and Savitzky-Golayfilterrdquo Journal of Signal and Information Processing vol 3 no 1pp 39ndash44 2012

[39] G J McLachlan and T KrishnanThe EMAlgorithm and Exten-sions Wiley Hoboken NJ USA 2nd edition 2008

[40] G M DrsquoAngelo J Luo and C Xiong ldquoMissing data methodsfor partial correlationrdquo Journal of Biometrics amp Biostatistics vol3 no 8 p 155 2012

[41] S Arlot andA Celisse ldquoA survey of cross-validation proceduresfor model selectionrdquo Statistics Surveys vol 4 pp 40ndash79 2010

[42] T Fawcett ldquoAn introduction to ROC analysisrdquo Pattern Recogni-tion Letters vol 27 no 8 pp 861ndash874 2006

[43] J C Xu L Sun Y P Gao and T H Xu ldquoAn ensemble featureselection technique for cancer recognitionrdquo Bio-Medical Mate-rials and Engineering vol 24 no 1 pp 1001ndash1008 2014

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 10: Research Article Ensemble Merit Merge Feature Selection ...downloads.hindawi.com/journals/cmmm/2015/676129.pdf · Research Article Ensemble Merit Merge Feature Selection for Enhanced

10 Computational and Mathematical Methods in Medicine

is needed for the discrimination of diseases. The Merit Merge approach further enhanced feature selection by identifying an effective consolidated subset that can be used for the clinical diagnosis of dementia and of Mild Cognitive Impairment that will progress to dementia. Our work also confirmed that the performance of SVM in delineating Mild Cognitive Impairment and Dementia is very low compared to Random forest, Naïve Bayes, and the C4.5 algorithm, as reported by Williams et al. [23]. Although the performance of Random forest was comparable to C4.5 and NB in discriminating the 2-class data, it achieved only approximately 75% accuracy on the 3-class problem. CPEMM was able to predict the relevant features for all datasets, especially the CIDS. The proposed split and merge ensemble approach can be applied to any 3-class classification problem and can be extended to the classification of high-dimensional datasets, such as microarray data, with preliminary feature reduction.

Classification with NB for the discrimination of Dementia and MCI in a previous study resulted in accuracy of around 80% and sensitivity of approximately 70% [14, 23]. Our CPEMM, based on a Bagging ensemble of J48 with the Merit Merge technique, yielded a higher accuracy of 98.7% under the train-and-test method [43]. The Bagging approach, learning from more than one classifier, found the minimal subset for effective diagnostics. The Merit Merge approach found all highly relevant subsets that contribute to the multiclass classification. The proposed approach yielded a statistically significant difference, with a mean area under the curve of approximately 0.977 in the multivariate classification of Dementia.
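The merit-based merging described above can be illustrated with a minimal, hypothetical sketch. This is not the authors' CPEMM implementation: the averaged-merit measure, the greedy merge rule, the threshold value, and the feature names are all assumptions made for illustration.

```python
def merit_merge(feature_merits, threshold=0.7):
    """Greedily merge top-ranked features into one consolidated subset.

    feature_merits maps each feature name to its merit scores across
    several bagged models (a stand-in for the per-classifier merits).
    A feature is merged in only if the candidate subset's average
    merit stays at or above the threshold.
    """
    # Average each feature's merit across the bagged models.
    avg = {f: sum(ms) / len(ms) for f, ms in feature_merits.items()}
    # Rank features from highest to lowest averaged merit.
    ranked = sorted(avg, key=avg.get, reverse=True)
    selected = []
    for f in ranked:
        candidate = selected + [f]
        if sum(avg[x] for x in candidate) / len(candidate) >= threshold:
            selected = candidate
    return selected

# Toy merits from three hypothetical bagged models.
merits = {
    "MMSE": [0.9, 0.8, 0.85],
    "HippVol": [0.7, 0.75, 0.8],
    "Age": [0.4, 0.35, 0.3],
    "Noise": [0.1, 0.05, 0.2],
}
print(merit_merge(merits))  # ['MMSE', 'HippVol']
```

Here the low-merit features are rejected because adding them would drag the subset's average merit below the threshold, which mirrors the idea of retaining only the consolidated subset that keeps discrimination high.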

Bagging ensemble models provide a promising, statistically robust machine learning method for disease diagnosis. The proposed methodology can be applied to disease state prediction even with class-imbalanced datasets.

Disclosure

T. R. Sivapriya is the registered user of ADNI. Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this work.

Acknowledgments

The authors thank Professor Kumanan, Head of the Department of Psychiatry, Madurai Medical College, Tamil Nadu, India, for many helpful discussions in the study and in the interpretation of data for building the classifier. Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

[1] M. Prince, E. Albanese, M. Guerchet, and M. Prina, Alzheimer Report: Dementia and Risk Reduction: An Analysis of Protective and Modifiable Factors, Alzheimer's Disease International, London, UK, 2014, http://www.alz.co.uk/research/WorldAlzheimerReport2014.pdf.

[2] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro, and C. P. Ferri, "The global prevalence of dementia: a systematic review and metaanalysis," Alzheimer's & Dementia, vol. 9, no. 1, pp. 63.e2–75.e2, 2013.

[3] Y. Xia, "GA and AdaBoost-based feature selection and combination for automated identification of dementia using FDG-PET imaging," in Intelligent Science and Intelligent Data Engineering, vol. 7202 of Lecture Notes in Computer Science, pp. 128–135, Springer, Berlin, Germany, 2012.

[4] B. Magnin, L. Mesrob, S. Kinkingnehun, et al., "Support vector machine-based classification of Alzheimer's disease from whole-brain anatomical MRI," Neuroradiology, vol. 51, no. 2, pp. 73–83, 2009.

[5] M. C. Tierney, R. Moineddin, and I. McDowell, "Prediction of all-cause dementia using neuropsychological tests within 10 and 5 years of diagnosis in a community-based sample," Journal of Alzheimer's Disease, vol. 22, no. 4, pp. 1231–1240, 2010.

[6] M. N. Rossor, N. C. Fox, C. J. Mummery, J. M. Schott, and J. D. Warren, "The diagnosis of young-onset dementia," The Lancet Neurology, vol. 9, no. 8, pp. 793–806, 2010.

[7] R. M. Chapman, M. Mapstone, J. W. McCrary, et al., "Predicting conversion from mild cognitive impairment to Alzheimer's disease using neuropsychological tests and multivariate methods," Journal of Clinical and Experimental Neuropsychology, vol. 33, no. 2, pp. 187–199, 2011.

[8] A. Pozueta, E. Rodríguez-Rodríguez, J. L. Vázquez-Higuera, et al., "Detection of early Alzheimer's disease in MCI patients by the combination of MMSE and an episodic memory test," BMC Neurology, vol. 11, article 78, 2011.

[9] M. Quintana, J. Guradia, G. Sánchez-Benavides, et al., "Using artificial neural networks in clinical neuropsychology: high performance in mild cognitive impairment and Alzheimer's disease," Journal of Clinical and Experimental Neuropsychology, vol. 34, no. 2, pp. 195–208, 2012.

[10] R. Cuingnet, E. Gerardin, J. Tessieras, et al., "Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database," NeuroImage, vol. 56, no. 2, pp. 766–781, 2011.

[11] M. Q. Wilks, H. Protas, M. Wardak, et al., "Automated VOI analysis in FDDNP PET using structural warping: validation through classification of Alzheimer's disease patients," International Journal of Alzheimer's Disease, vol. 2012, Article ID 512069, 8 pages, 2012.

[12] S. Klöppel, C. M. Stonnington, J. Barnes, et al., "Accuracy of dementia diagnosis—a direct comparison between radiologists and a computerized method," Brain, vol. 131, no. 11, pp. 2969–2974, 2008.

[13] A. Larner, "Comparing diagnostic accuracy of cognitive screening instruments: a weighted comparison approach," Dementia and Geriatric Cognitive Disorders Extra, vol. 3, no. 1, pp. 60–65, 2013.

[14] J. Maroco, D. Silva, A. Rodrigues, M. Guerreiro, I. Santana, and A. de Mendonça, "Data mining methods in the prediction of dementia: a real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests," BMC Research Notes, vol. 4, article 299, 14 pages, 2011.

[15] W. Yu, T. Liu, R. Valdez, M. Gwinn, and M. J. Khoury, "Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes," BMC Medical Informatics and Decision Making, vol. 10, no. 1, article 16, 2010.

[16] P. R. Hachesu, M. Ahmadi, S. Alizadeh, and F. Sadoughi, "Use of data mining techniques to determine and predict length of stay of cardiac patients," Healthcare Informatics Research, vol. 19, no. 2, pp. 121–129, 2013.

[17] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16–18, pp. 3273–3283, 2010.

[18] S. Maldonado, R. Weber, and J. Basak, "Simultaneous feature selection and classification using kernel-penalized support vector machines," Information Sciences, vol. 181, no. 1, pp. 115–128, 2011.

[19] J. Geraci, M. Dharsee, P. Nuin, et al., "Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems," Bioinformatics, vol. 30, no. 5, pp. 712–718, 2014.

[20] X. Chen, M. Wang, and H. Zhang, "The use of classification trees for bioinformatics," Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55–63, 2011.

[21] M. L. Calle, V. Urrea, A.-L. Boulesteix, and N. Malats, "AUC-RF: a new strategy for genomic profiling with random forest," Human Heredity, vol. 72, no. 2, pp. 121–132, 2011.

[22] K. R. Gray, P. Aljabar, R. A. Heckemann, A. Hammers, and D. Rueckert, "Random forest-based similarity measures for multi-modal classification of Alzheimer's disease," NeuroImage, vol. 65, pp. 167–175, 2013.

[23] J. A. Williams, A. Weakley, D. J. Cook, and M. Schmitter-Edgecombe, "Machine learning techniques for diagnostic differentiation of mild cognitive impairment and dementia," in Proceedings of the 27th AAAI Conference on Artificial Intelligence Workshop, 2013.

[24] B. Chizi and O. Maimon, "Dimension reduction and feature selection," in Data Mining and Knowledge Discovery Handbook, pp. 93–111, 2005.

[25] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157–1182, 2003.

[26] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[27] O. Uncu and I. B. Turksen, "A novel feature selection approach: combining feature wrappers and filters," Information Sciences, vol. 177, no. 2, pp. 449–466, 2007.

[28] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 81–86, Seoul, The Republic of Korea, May 2001.

[29] H. H. Inbarani, A. T. Azar, and G. Jothi, "Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis," Computer Methods and Programs in Biomedicine, vol. 113, no. 1, pp. 175–185, 2014.

[30] P. Bühlmann and T. Hothorn, "Boosting algorithms: regularization, prediction and model fitting," Statistical Science, vol. 22, no. 4, pp. 477–505, 2007.

[31] L. Rokach, Pattern Classification Using Ensemble Methods, World Scientific, Singapore, 2009.

[32] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[34] A. Liaw and M. Wiener, "Classification and regression by randomForest," R News, vol. 2, no. 3, pp. 18–22, 2002.

[35] S. Janitza, C. Strobl, and A. L. Boulesteix, "An AUC-based permutation variable importance measure for random forests," BMC Bioinformatics, vol. 14, article 119, 2013.

[36] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, Boston, Mass, USA, 1993.

[37] S. B. Kotsiantis, "Supervised machine learning: a review of classification techniques," Informatica, vol. 31, no. 3, pp. 249–268, 2007.

[38] H. Azami, K. Mohammadi, and B. Bozorgtabar, "An improved signal segmentation using moving average and Savitzky-Golay filter," Journal of Signal and Information Processing, vol. 3, no. 1, pp. 39–44, 2012.

[39] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, Hoboken, NJ, USA, 2nd edition, 2008.

[40] G. M. D'Angelo, J. Luo, and C. Xiong, "Missing data methods for partial correlation," Journal of Biometrics & Biostatistics, vol. 3, no. 8, p. 155, 2012.

[41] S. Arlot and A. Celisse, "A survey of cross-validation procedures for model selection," Statistics Surveys, vol. 4, pp. 40–79, 2010.

[42] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.

[43] J. C. Xu, L. Sun, Y. P. Gao, and T. H. Xu, "An ensemble feature selection technique for cancer recognition," Bio-Medical Materials and Engineering, vol. 24, no. 1, pp. 1001–1008, 2014.
