

Contents lists available at ScienceDirect

Clinical Epidemiology and Global Health

journal homepage: www.elsevier.com/locate/cegh

Original article

Lesion analysis towards melanoma detection using soft computing techniques

Rohan Gaonkar, Kanwaljeet Singh, G.R. Prashanth, Venkatanareshbabu Kuppili∗

Department of Electronics and Communication Engineering, National Institute of Technology, Goa, India

ARTICLE INFO

Keywords: Skin lesion; Binary bat algorithm; Epiluminescence; Gray level co-occurrence matrix (GLCM); Morphological operations

ABSTRACT

Introduction: The incidence of melanoma has been increasing worldwide. This study proposes an efficient detection method based on bio-inspired algorithms and neural networks.
Objective: The main goal of this study is to reduce the complexity of the classifier through feature selection, thereby reducing classification time, and to balance specificity and sensitivity.
Materials and methods: The approach is divided into three basic steps: lesion segmentation, feature extraction and classification. Segmentation and feature extraction were performed using image processing techniques. A novel fitness function is proposed and optimized using the Binary Bat Algorithm (BBA) to obtain the most relevant feature set.
Result: Support Vector Machine (SVM) and Radial Basis Function Network (RBFN) classifiers were used. SVM and RBFN produced accuracies of 87% and 91%, respectively, for the K10 protocol. Specificity and sensitivity for SVM in the K10 protocol were 82% and 92%, respectively; for RBFN they were 90% and 93%. Our approach achieved a balance between specificity and sensitivity.
Conclusion: With simple network structures such as RBFN and SVM, we obtained results better than those of other, more complex networks.

1. Introduction

Numerous infections and anomalies affect the skin; one such abnormality is melanoma. Normal cells in the human body grow in a controlled way, whereas abnormal cells grow in an uncontrolled manner. This uncontrolled growth of cells is what we call cancer. Skin cancer can be cured in its initial stages, but if it is not diagnosed early it can prove life threatening.

Melanoma is named after the melanocyte, the cell from which it originates.1 The disease is curable if diagnosed in time, but the similarity in appearance between skin moles and early-stage melanomas is one of the main obstacles to its detection. Left unchecked, melanoma spreads to other parts of the body and can become untreatable. Early detection is therefore of paramount importance in melanoma treatment.2

Biopsy is the method most commonly used to detect skin cancer. It involves testing and analysis of the affected tissue in the laboratory. This can be painful and often causes inflammation or even spread of the lesion to the surrounding areas. A less painful and reliable method for melanoma detection is therefore needed, and computer-based diagnosis can prove to be a better alternative to conventional techniques. Several features differentiate skin cancer from normal moles: asymmetry, irregularity of the border, color, diameter, and evolution. A melanoma typically has a diameter of more than 6 mm and is characterized by brown or red color.

Many researchers have worked on this problem using different network architectures. Andre Esteva et al. classified melanoma using image processing and obtained an accuracy of 76.9%.3 Ning Situ et al. used a bag-of-features approach to detect melanoma in epiluminescence microscopy images; with Naive Bayes and SVM classifiers they achieved an accuracy of 82.21%.2 Garg N. et al. used an image processing approach to detect melanoma.4 Aya Abu Ali et al. used a Convolutional Neural Network (CNN) for melanoma detection and obtained an accuracy of 81.1%.5 Neural networks are usually computationally expensive compared to traditional classification algorithms: the computational complexity, or training time, of a neural network depends on the data size as well as on the network's depth.

https://doi.org/10.1016/j.cegh.2019.11.003
Received 26 April 2019; Received in revised form 23 October 2019; Accepted 7 November 2019; Available online 18 November 2019

∗ Corresponding author.
E-mail addresses: [email protected] (R. Gaonkar), [email protected] (K. Singh), [email protected] (G.R. Prashanth), [email protected] (V. Kuppili).

Clinical Epidemiology and Global Health 8 (2020) 501–508
2213-3984/ © 2019 INDIACLEN. Published by Elsevier, a division of RELX India, Pvt. Ltd. All rights reserved.


We propose a novel approach to melanoma detection based on a combination of a neural network and a bio-inspired algorithm.6 Our main focus is on improving accuracy and reducing the complexity of the network. A total of 22 features were extracted from the lesion images using the GLCM technique. We designed a fitness function to choose the most relevant features from the available feature set; this function poses a multi-objective optimization problem.7 To optimize it we adopted BBA, a multi-variable optimization algorithm. Finally, SVM and RBFN were used to classify images as skin moles or melanoma.

Among the many heuristic optimization algorithms, the bat algorithm mimics the echolocation behavior of bats to find the globally best solution. It has been shown to outperform other widely used algorithms such as Particle Swarm Optimization (PSO), Genetic Algorithm (GA) and Glowworm Swarm Optimization (GSO).8 We used the binary version of the bat algorithm. SVM is one of the most frequently used Machine Learning (ML) techniques.9 In SVM, kernel methods transform the problem into a high-dimensional space, called the feature space, in which the classes can be linearly separated; the algorithm then finds the best hyperplane separating the two classes.

RBFN measures the similarity of the input to examples from the training set and classifies it accordingly.10 Each neuron of the RBFN can be regarded as a prototype that stores one example from the training set and computes the Euclidean distance from the input to that prototype. If the test input is closer to the ‘class A’ prototypes it is classified as ‘class A’; otherwise it is classified as ‘class B’.

Since our data were limited, we carried out K-fold cross-validation on the input data, using four protocols (K = 2, 3, 5, 10). In each protocol the input data are divided into K partitions, of which K-1 are used for training and the remaining one for testing. This is repeated for each partition and the results are averaged over the K runs.
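The partitioning scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `k_fold_indices` and `cross_validate`, the shuffling seed and the `evaluate` callback are all assumptions introduced here.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # hypothetical fixed seed
    # Striding spreads any remainder so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_validate(n_samples, k, evaluate):
    """Average evaluate(train_idx, test_idx) over the K folds."""
    scores = [evaluate(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return sum(scores) / k
```

With 300 samples and K = 10, each fold trains on 270 samples and tests on the remaining 30; `evaluate` would train a classifier on `train_idx` and return its score on `test_idx`.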

2. Proposed approach

The process of melanoma detection can be broadly divided into four basic steps:

1. Lesion segmentation.
2. Feature extraction from the lesion.
3. Feature selection using optimization.
4. Classification.

Before segmentation, the input images must be preprocessed. The whole system architecture is depicted in Fig. 1.

2.1. Preprocessing

Color constancy is the first preprocessing step. It is an important factor in computer vision tasks such as image processing, color extraction,5 and color appearance modeling.11

When an image is taken, its color is determined not only by the intrinsic properties of the object but also by the light source under which it is captured. To make our classification system color-robust, the effects of the light source need to be filtered out. We used the gray world algorithm for this purpose.12

The gray world algorithm is a color constancy method which assumes that a scene is, on average, neutral gray. This assumption holds only if the scene has a good color distribution. Comparing the average color of the image with gray gives an estimate of the color cast of the illumination; the algorithm computes this estimate from the mean of each of the three channels of the image. To normalize channel i of the image, each pixel value is scaled by the factor

$$s_i = \frac{avg}{avg_i} \tag{1}$$

where $s_i$ is the scaling factor, $avg$ is the illumination estimate and $avg_i$ is the mean of channel i.
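As a rough illustration of Equation (1), the following NumPy sketch applies gray world scaling to an RGB image. It assumes that the illumination estimate avg is the mean of the three channel means and that pixel values lie in [0, 255]; the function name `gray_world` is hypothetical and not taken from the paper.

```python
import numpy as np

def gray_world(image):
    """Scale each channel of an H x W x 3 image by s_i = avg / avg_i."""
    img = np.asarray(image, dtype=np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)     # avg_i for each channel
    gray = channel_means.mean()                         # avg (assumed: mean of channel means)
    scale = gray / np.maximum(channel_means, 1e-12)     # guard against an all-zero channel
    return np.clip(img * scale, 0, 255)
```

After correction the three channel means coincide (up to clipping), which is exactly the gray world assumption.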

The next preprocessing step is noise removal. Skin images are often affected by the presence of hair, and we used morphological operations to filter out this noise. Morphology is an image processing technique that processes images based on the shapes and structures they contain. Morphological operations process an image with a structuring element, usually a 3 × 3 matrix, which is moved over the image pixel by pixel and adjusts each pixel depending on the values of its neighbors. A morphological operation can be made sensitive to a particular shape in the input picture by choosing the shape and size of the structuring element.13 To remove hair from the images, we dilated the input image using a line as the structuring element, adjusting the length and angle of inclination of the line to suit our experiment. Dilation increases the size of an object; the degree of growth depends on the type and size of the structuring element. The dilation of an image X by a structuring element Y is given by

$$X \oplus Y = \{\, z \mid (\hat{Y})_z \cap X \neq \emptyset \,\} \tag{2}$$

where $\hat{Y}$ is the reflection of Y about its origin and $(\hat{Y})_z$ is its translation by z. The resulting image has a one at every position z of the structuring element's origin where the reflected element overlaps at least one foreground pixel of X, and a zero elsewhere. Dilation therefore enlarges regions of pixels with value one and shrinks regions of pixels with value zero.6–13
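Equation (2) can be made concrete with a small NumPy sketch of binary dilation. This is an illustration only: it assumes a binary image, an odd-sized structuring element with its origin at the centre, and zero padding at the borders. The function name `dilate` is hypothetical; the paper's own processing was done in MATLAB.

```python
import numpy as np

def dilate(image, selem):
    """Binary dilation of a 2-D 0/1 image by an odd-sized structuring element."""
    image = np.asarray(image, dtype=bool)
    selem = np.asarray(selem, dtype=bool)[::-1, ::-1]   # reflect Y about its origin
    h, w = selem.shape
    ch, cw = h // 2, w // 2                             # origin assumed at the centre
    padded = np.pad(image, ((ch, ch), (cw, cw)))        # zero padding at the borders
    out = np.zeros_like(image)
    # A pixel is set if the reflected element, centred there, hits any foreground pixel.
    for r in range(h):
        for c in range(w):
            if selem[r, c]:
                out |= padded[r:r + image.shape[0], c:c + image.shape[1]]
    return out.astype(np.uint8)
```

For example, dilating a single foreground pixel with a horizontal line element `[[1, 1, 1]]` produces a three-pixel horizontal segment; a longer or rotated line would be defined analogously.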

2.2. Segmentation

To perform operations on a specific region of an image, the input image must first be thresholded. Thresholding is mostly bi-level: a pixel whose value is greater than some threshold is assigned to one class, otherwise it is assigned to the other class.11 In an iterative thresholding or exhaustive search algorithm,14 the optimum threshold is found by minimizing the classification error rate of the pixels; this is called Minimum Error Thresholding. The object class is the skin lesion and the background class is the normal skin. Segmentation was performed using MATLAB. Fig. 2 depicts the results of preprocessing the skin lesion images.

Fig. 1. System Architecture. Processing involved in the detection of melanoma.
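The exhaustive threshold search can be sketched as follows. The paper does not give its criterion, so this example assumes the Kittler-Illingworth formulation of minimum error thresholding, in which both classes are modelled as Gaussians; the function name and the 8-bit grayscale input are also assumptions.

```python
import numpy as np

def minimum_error_threshold(gray):
    """Exhaustive search for the threshold minimising the Kittler-Illingworth criterion."""
    hist = np.bincount(np.asarray(gray).astype(np.uint8).ravel(), minlength=256)
    p = hist / hist.sum()                       # normalised gray-level histogram
    levels = np.arange(256)
    best_t, best_j = None, np.inf
    for t in range(1, 255):
        lo, hi = slice(0, t + 1), slice(t + 1, 256)
        p1, p2 = p[lo].sum(), p[hi].sum()       # class priors
        if p1 <= 0 or p2 <= 0:
            continue
        m1 = (levels[lo] * p[lo]).sum() / p1    # class means
        m2 = (levels[hi] * p[hi]).sum() / p2
        v1 = ((levels[lo] - m1) ** 2 * p[lo]).sum() / p1    # class variances
        v2 = ((levels[hi] - m2) ** 2 * p[hi]).sum() / p2
        if v1 <= 0 or v2 <= 0:
            continue
        j = (1 + p1 * np.log(v1) + p2 * np.log(v2)
             - 2 * (p1 * np.log(p1) + p2 * np.log(p2)))
        if j < best_j:
            best_t, best_j = t, j
    return best_t
```

A binary lesion mask would then be obtained as `gray <= t` or `gray > t`, depending on whether the lesion is darker or brighter than the surrounding skin.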

3. Feature extraction

Feature extraction is the process of extracting from an image the distinctive features that best represent it. Using extracted features rather than the whole image as input reduces the computational cost and keeps the complexity of the classifier to a minimum: instead of the whole image, only the most important features describing the image are fed to the network.

Features such as geometry, color, and GLCM (texture) features help in classifying an image as a skin mole or melanoma. Moles can be distinguished from melanoma by extracting these features and training a classifier on them. To extract features from the cancer images we used the GLCM technique, which extracts several features from the lesion. Table 1 shows features extracted using the GLCM technique. We worked on a total of 300 images, of which 150 were melanoma images and the rest were skin moles.
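To show what GLCM features look like in practice, the sketch below builds a normalised, symmetric co-occurrence matrix for one pixel offset and derives a handful of texture features from it. The quantisation to 8 gray levels, the horizontal offset and the exact feature formulas (in particular the homogeneity variant) are assumptions made for illustration; the paper does not state the settings it used.

```python
import numpy as np

def glcm(gray, levels=8, offset=(0, 1)):
    """Normalised, symmetric co-occurrence matrix of an 8-bit image for one offset."""
    q = (np.asarray(gray, dtype=np.float64) * levels / 256).astype(int)
    q = q.clip(0, levels - 1)                   # quantise to `levels` gray levels
    dr, dc = offset
    rows, cols = q.shape
    a = q[max(0, -dr):rows - max(0, dr), max(0, -dc):cols - max(0, dc)]
    b = q[max(0, dr):rows - max(0, -dr), max(0, dc):cols - max(0, -dc)]
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1)     # count co-occurring level pairs
    m = m + m.T                                 # make the matrix symmetric
    return m / m.sum()

def glcm_features(p):
    """A few texture features computed from a normalised GLCM p."""
    i, j = np.indices(p.shape)
    nz = p[p > 0]
    return {
        "contrast": float(((i - j) ** 2 * p).sum()),
        "dissimilarity": float((np.abs(i - j) * p).sum()),
        "homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
        "uniformity": float((p ** 2).sum()),
        "entropy": float(-(nz * np.log(nz)).sum()),
    }
```

In a full pipeline, matrices for several offsets (for example 0°, 45°, 90° and 135°) are commonly averaged before computing the features.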

4. Feature selection

The next task was to select the most pertinent features for learning and classification. We used BBA for this purpose.

4.1. Feature selection using BBA

This section describes the BBA approach to feature selection.

Bats are fascinating animals whose advanced capacity to locate food has attracted scientists in various fields. Bats emit loud, short pulses of sound and use echolocation to locate their prey. Echolocation works like a biological sonar: when the sound hits an object, the echo returns to the ears of the bat, which can then estimate the location of the object. This mechanism allows bats to locate their prey even in complete darkness. Based on this behavior, Yang16 developed the bat algorithm, a meta-heuristic optimization technique for finding the global optimum. The technique assumes that a group of bats tracks food using its echolocation capability. Yang modeled the algorithm using the following rules:

1. Bats use echolocation to sense distance and to distinguish between food or prey and background barriers.

2. Let $a_i$ be the position of bat $b_i$ and $s_i$ its velocity. Let $p_i$ be the frequency, $\lambda$ the varying wavelength and $L_0$ the loudness. Bats can automatically adjust the wavelength of the emitted pulses, along with the rate of pulse emission, depending on where the target is located.

3. Although loudness can vary in many ways, Yang assumes that it varies from a positive value $L_0$ to a minimum constant value $L_{min}$.

Initially, the position $a_i$, velocity $s_i$ and frequency $p_i$ are initialized for each bat $b_i$. Let T be the maximum number of iterations. At every time step t, the movement of the bats is updated according to the following equations:

$$p_i = p_{min} + (p_{max} - p_{min})\,\beta \tag{3}$$

$$s_i^j(t) = s_i^j(t-1) + \left[\hat{a}^j - a_i^j(t-1)\right] p_i \tag{4}$$

$$a_i^j(t) = a_i^j(t-1) + s_i^j(t) \tag{5}$$

Fig. 2. Preprocessing of Skin Images. Color constancy, performed using the gray world algorithm, is the first step of preprocessing. A. Skin lesion image before color constancy. B. Skin cancer image after color constancy. The next step is noise removal. C. Lesion image affected by the presence of hair. D. Image after morphological operations used to filter the noise (hair). Segmentation is performed in order to operate on a specific region of the image. E. Original grayscale image. F. Image after segmentation.

Table 1
Features extracted using GLCM.15

| Feature Extracted | |
|---|---|
| Uniformity | Entropy (done) |
| Dissimilarity | Contrast |
| Inverse difference | Correlation |
| Homogeneity | Auto-correlation |
| Cluster Shade | Cluster Prominence |
| Maximum probability | Sum of Squares |
| Sum Average | Sum Variance |
| Sum Entropy | Difference variance |
| Difference entropy | Information measures of correlation (1) |
| Information measures of correlation (2) | Maximal correlation coefficient |
| Inverse difference normalized (INN) | Inverse difference moment normalized (IDN) |


Here $\beta$ is a random number in the interval [0, 1], and $a_i^j(t)$ denotes the value of decision variable j for bat i at time step t. Equation (3) controls the range and speed of the bats' movement. After the solutions of all m bats have been evaluated, the current global best solution is determined; it is denoted by $\hat{a}^j$.

Yang used random walks to increase the variability of the possible solutions. One solution is selected from among the current best solutions, and a random walk is applied to generate a new solution:

$$a_{new} = a_{old} + \epsilon\,\bar{L}(t) \tag{6}$$

where $\bar{L}(t)$ is the average loudness of all the bats at time t and $\epsilon \in [-1, 1]$ sets the strength and direction of the random walk. After every iteration the loudness $L_i$ and the pulse emission rate $z_i$ are updated as follows:

$$L_i(t+1) = \alpha L_i(t) \tag{7}$$

$$z_i(t+1) = z_i(0)\left[1 - \exp(-\gamma t)\right] \tag{8}$$

where $\alpha$ and $\gamma$ are ad-hoc constants. When the algorithm starts, the loudness $L_i(0)$ and emission rate $z_i(0)$ are chosen randomly. Usually, $L_i(0) \in [1, 2]$ and $z_i(0) \in [0, 1]$.

4.2. Binary Bat Algorithm (BBA)

As described in the previous section, each bat moves through continuous-valued positions in the search space. For feature selection, however, the search space can be modeled as an n-dimensional Boolean lattice in which each bat moves along the corners of a hypercube. Because the problem is to select or not to select each feature, the position of a bat is represented by a binary vector. We used the binary version of the bat algorithm, which restricts the new position of a bat to binary values by means of a sigmoid function:17

$$S(v_i^j) = \frac{1}{1 + e^{-v_i^j}} \tag{9}$$

where $v_i^j$ is the velocity of bat i in dimension j (denoted $s_i^j$ in Equation (4)). Using Equation (9), the position update becomes

$$x_i^j = \begin{cases} 1 & \text{if } S(v_i^j) > \sigma, \\ 0 & \text{otherwise,} \end{cases} \tag{10}$$

with $\sigma \sim U(0, 1)$. Equation (10) thus ensures that each coordinate of a bat in the Boolean lattice takes a binary value, which indicates the presence or absence of a feature. The features selected by BBA are shown in Table 2.

Algorithm 1. Bat algorithm
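The listing of Algorithm 1 is not reproduced in this text, so the following Python sketch is offered only as one plausible, simplified reading of the binary bat algorithm built from Equations (3), (4) and (7) to (10). Several choices are assumptions rather than the authors' procedure: the local random walk of Equation (6) is replaced by flipping one random bit of the current best solution, fitness is maximised, and the default parameter values (`n_bats`, `n_iter`, `p_min`, `p_max`, `alpha`, `gamma`) are illustrative.

```python
import numpy as np

def binary_bat(fitness, n_features, n_bats=20, n_iter=50,
               p_min=0.0, p_max=2.0, alpha=0.9, gamma=0.9, seed=0):
    """Simplified binary bat algorithm; returns (best feature mask, best fitness)."""
    rng = np.random.default_rng(seed)
    pos = rng.integers(0, 2, size=(n_bats, n_features))   # binary positions
    vel = np.zeros((n_bats, n_features))                   # velocities
    loud = rng.uniform(1.0, 2.0, n_bats)                   # L_i(0) in [1, 2]
    rate0 = rng.uniform(0.0, 1.0, n_bats)                  # z_i(0) in [0, 1]
    rate = np.zeros(n_bats)                                # z_i(t), grows toward z_i(0)
    fit = np.array([fitness(x) for x in pos], dtype=float)
    best = pos[fit.argmax()].copy()
    best_fit = fit.max()
    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            freq = p_min + (p_max - p_min) * rng.random()          # Eq. (3)
            vel[i] += (best - pos[i]) * freq                       # Eq. (4)
            prob = 1.0 / (1.0 + np.exp(-vel[i]))                   # Eq. (9)
            cand = (prob > rng.random(n_features)).astype(int)     # Eq. (10)
            if rng.random() > rate[i]:
                # Assumed local search: flip one random bit of the best solution.
                cand = best.copy()
                cand[rng.integers(n_features)] ^= 1
            cand_fit = fitness(cand)
            if cand_fit >= fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha                                   # Eq. (7)
                rate[i] = rate0[i] * (1 - np.exp(-gamma * t))      # Eq. (8)
            if fit[i] > best_fit:
                best, best_fit = pos[i].copy(), fit[i]
    return best, float(best_fit)
```

In the feature-selection setting, `fitness` would train and evaluate a classifier on the features whose mask entries are 1; for a quick check, any function of the binary mask (for example, the number of bits matching a known target mask) can be used instead.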

4.3. Fitness function

In bio-inspired algorithms, the population evolves by optimizing an objective function, also called a fitness function.18 We propose a new fitness function that maximizes accuracy, reduces the dimension of the input and balances specificity and sensitivity. The fitness function helps in selecting the feature subset with the best classification performance.

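The fitness function is expressed as follows:

$$Fitness = \alpha\left(1 - \frac{n}{F}\right) + \beta\,(Acc) + \gamma\,\bigl(1 - |sp - st|\bigr) \tag{11}$$

where $\alpha$, $\beta$ and $\gamma$ are constants such that $\alpha + \beta + \gamma = 1$, and

n = Number of features selected
F = Total number of features
Acc = Accuracy
sp = Specificity
st = Sensitivity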
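A fitness function with the three stated goals can be sketched as below. The functional form used here, a weighted sum of a feature-reduction term, the accuracy and a specificity/sensitivity balance term, is an assumption inferred from the variables listed for Equation (11) and should be checked against the published equation; the weight values are purely illustrative.

```python
def fitness(n_selected, n_total, acc, sp, st, alpha=0.2, beta=0.5, gamma=0.3):
    """Assumed form: alpha*(1 - n/F) + beta*Acc + gamma*(1 - |sp - st|).

    acc, sp and st are fractions in [0, 1]; the weights must sum to 1.
    """
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    reduction = 1.0 - n_selected / n_total      # rewards smaller feature subsets
    balance = 1.0 - abs(sp - st)                # rewards sp close to st
    return alpha * reduction + beta * acc + gamma * balance
```

Under this form, a subset that keeps accuracy fixed but uses fewer features, or that narrows the gap between specificity and sensitivity, receives a higher score.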

5. Classification

In this paper, we used two classifiers, RBFN and SVM, to distinguish moles from melanoma.

5.1. Radial Basis Function Network (RBFN)

RBFN measures the similarity of the input to examples from the training set and classifies it accordingly. Each neuron of the RBFN can be regarded as a prototype that stores one example from the set used for network training. To classify a test image, each neuron calculates the Euclidean distance from the input to its prototype. If the test input is closer to the ‘class A’ prototypes it is classified as ‘class A’; otherwise it is classified as ‘class B’.

Each RBFN neuron computes a measure of the similarity between the input and its prototype vector. Input vectors that are close to the prototype vector produce a response close to 1. Fig. 3 depicts the structure of the RBFN. Several choices of similarity function are possible, but the Gaussian function is the most popular.

Table 2
Features selected using the Binary Bat Algorithm.

| Feature Selected | |
|---|---|
| Uniformity | Dissimilarity |
| Inverse difference | Correlation |
| Homogeneity | Cluster Prominence |
| Sum Average | Sum Variance |
| Difference variance | Information measures of correlation (1) |
| Maximal correlation coefficient | Inverse difference normalized (INN) |

The Gaussian function can be written as

$$g(y) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(y-\mu)^2 / (2\sigma^2)} \tag{12}$$

where y is the input, $\sigma$ is the standard deviation and $\mu$ is the mean. The activation function of an RBFN neuron is similar to the Gaussian function and is typically written as

$$\varphi(y) = e^{-\beta \lVert y - \mu \rVert^2} \tag{13}$$

For the activation function $\varphi$ we are not directly interested in the value of the standard deviation $\sigma$, so some simplifying modifications are made. First, the outer coefficient $\frac{1}{\sigma\sqrt{2\pi}}$ is eliminated. This term normally controls the height of the Gaussian; during training, the output nodes learn the correct coefficient to apply in order to obtain the desired result. Second, the inner coefficient $\frac{1}{2\sigma^2}$ is replaced by a single parameter $\beta$, which controls the width of the function.
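The activation described above, and the way an output node combines the hidden-layer responses, can be sketched as follows. The two-prototype network, the weights and the width parameter `beta` are illustrative assumptions, not the trained network of the paper (which used 10 hidden neurons).

```python
import numpy as np

def rbf_activations(x, prototypes, beta):
    """phi_k(x) = exp(-beta * ||x - mu_k||^2) for every prototype mu_k."""
    diff = np.asarray(prototypes, dtype=float) - np.asarray(x, dtype=float)
    return np.exp(-beta * (diff ** 2).sum(axis=1))

def rbfn_predict(x, prototypes, beta, weights, bias=0.0):
    """Weighted sum of activations; a positive score means class A, otherwise class B."""
    score = rbf_activations(x, prototypes, beta) @ np.asarray(weights, dtype=float) + bias
    return "A" if score > 0 else "B"
```

An input that coincides with a prototype yields an activation of exactly 1 for that neuron, and the activation decays toward 0 as the Euclidean distance grows.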

5.2. Support Vector Machine (SVM)

SVM14 is one of the most frequently used supervised ML techniques. It relies on two main ideas. First, kernel methods transform the problem into a high-dimensional space, called the feature space, in which training samples from different classes can be linearly separated. Second, SVM finds the best hyperplane separating the two classes.

Hyperplanes are decision boundaries that help to classify data points. The number of features determines the dimension of the hyperplane: with 2 features the hyperplane is a line, and with 3 features it is a two-dimensional plane. The hyperplane becomes difficult to visualize when the number of features exceeds 3. Support vectors are the data points closest to the hyperplane; they determine its orientation and position, and the classification margin is maximized with respect to them. Deleting the support vectors changes the position of the hyperplane.

These classifiers were used to detect melanoma, with BBA performing feature selection. Out of a total of 22 features, the most pertinent ones were selected by BBA.

The input data were split into K groups; the accuracy obtained on each group was calculated and then averaged over K. This method is called K-fold cross-validation. Cross-validation is mainly used in machine learning to estimate the skill of a model on unseen data, that is, to use a limited sample to estimate how the model is expected to perform in general when making predictions on data not used during training. The accuracy for different values of K was recorded.

6. Results

We applied the cross-validation protocols to our data to measure the classification performance of our network. Table 3 reports the accuracy, specificity and sensitivity for the different cross-validation protocols. Table 4 compares the training and testing times of SVM and RBFN for all four cross-validation protocols.
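For reference, the three reported measures follow directly from the confusion matrix. The sketch below assumes binary labels with melanoma encoded as the positive class (1) and that both classes occur in the test set; the function name is hypothetical.

```python
def confusion_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

In a K-fold protocol these values would be computed on each test fold and then averaged over the K folds.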

6.1. Effect of size of the data

The cross-validation protocol allows the number of training and testing samples to vary. We used four protocols: K2, K3, K5 and K10. In K2 cross-validation the data set is divided into two equal groups, one used for training and the other for testing. In K3 the data are divided into 3 parts, 2 used for training and 1 for testing. Similarly, the data are divided into K parts for K5 and K10, of which K-1 parts are used for training. Accuracy increases as the size of the dataset increases, mainly because the training sample size grows with the data size. The results are plotted in Fig. 4.

Fig. 3. RBFN Architecture. Architecture of the RBFN paradigm using a single-layer feed-forward neural network. It comprises an input layer, a hidden layer and an output layer. The input vector (feature vector) is fed to the network. The hidden layer consists of neurons with a Gaussian activation function.

Table 3
Comparison between SVM-based and RBFN-based learning methods.

| Cross Validation | Accuracy (%), SVM | Accuracy (%), RBFN | Specificity, True Negative (%), SVM | Specificity, True Negative (%), RBFN | Sensitivity, True Positive (%), SVM | Sensitivity, True Positive (%), RBFN | Area Under Curve, SVM | Area Under Curve, RBFN |
|---|---|---|---|---|---|---|---|---|
| K2 | 85.55 | 89.35 | 84.7 | 80.5 | 86.4 | 92.2 | 0.86 | 0.89 |
| K3 | 85.65 | 88.85 | 80.2 | 85.4 | 91.1 | 92.3 | 0.86 | 0.89 |
| K5 | 86.10 | 90.25 | 90.0 | 90.2 | 82.0 | 90.3 | 0.86 | 0.90 |
| K10 | 87.35 | 91.85 | 82.5 | 90.6 | 92.2 | 93.1 | 0.87 | 0.92 |

Table 4
Time comparison between SVM and RBFN.

| Training & Testing Time | CV (a) | SVM | RBFN | Speed-Up (c) (%) |
|---|---|---|---|---|
| Average Training Time (sec (b)) | K2 | 1.35 | 1.27 | 5.97 |
| | K3 | 1.35 | 1.29 | 4.19 |
| | K5 | 1.36 | 1.29 | 5.45 |
| | K10 | 1.48 | 1.29 | 12.99 |
| Average Testing Time (sec (b)) | K2 | 0.051 | 0.049 | 11.92 |
| | K3 | 0.040 | 0.039 | 12.29 |
| | K5 | 0.039 | 0.036 | 13.60 |
| | K10 | 0.020 | 0.018 | 13.41 |

(a) CV: cross-validation protocol.
(b) sec: seconds.
(c) $\text{Speed-up} = \frac{Time_{SVM} - Time_{RBFN}}{Time_{SVM}} \times 100$

6.2. Time analysis of RBFN and SVM

For the K2 protocol the training time is lowest for both SVM and RBFN, and it increases as the number of folds increases. For testing, the order is reversed. This happens because, as we move from K2 to K10, the training data size increases and the testing data size decreases. Both SVM and RBFN have short training and testing times, and RBFN in particular requires less training time than SVM.

6.3. Performance evaluation

For all four cross-validation protocols, Receiver Operating Characteristic (ROC) curves were plotted and the Area Under Curve (AUC) was computed to evaluate the performance of both the RBFN and the SVM systems.

6.4. ROC curves

The performance of both RBFN and SVM was evaluated using ROC plots for all the cross-validation protocols. The results are plotted in Fig. 5. RBFN performs better than SVM in all the cross-validation protocols. We used SVM with a Gaussian kernel. The RBFN has 10 neurons (receptors) in the hidden layer and its radial function has a spread of 5. In the K10 cross-validation protocol, RBFN has an AUC of 0.91 compared to 0.87 for SVM.
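As an illustration of how an ROC curve and its AUC are obtained from classifier scores, the sketch below sweeps the decision threshold over the sorted scores and integrates with the trapezoidal rule. It assumes binary labels (1 for melanoma), distinct score values and at least one sample of each class; it is not the plotting code used for Fig. 5.

```python
import numpy as np

def roc_points(y_true, scores):
    """False and true positive rates as the threshold sweeps down the sorted scores."""
    y = np.asarray(y_true)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    y = y[order]
    tps = np.cumsum(y == 1)                      # true positives at each cut
    fps = np.cumsum(y == 0)                      # false positives at each cut
    tpr = np.concatenate([[0.0], tps / tps[-1]])
    fpr = np.concatenate([[0.0], fps / fps[-1]])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
```

A classifier that ranks every melanoma above every mole reaches an AUC of 1, while random ranking gives a value near 0.5.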

7. Discussion

We proposed a novel fitness function that maximizes accuracy, minimizes the number of selected features and balances specificity and sensitivity. Most earlier work on this topic concentrated mainly on improving accuracy. However, accuracy alone cannot be used to assess the reliability and performance of a network; specificity and sensitivity also have to be taken into account. Our fitness function therefore improves accuracy while balancing specificity and sensitivity. Earlier researchers used CNNs and other complex networks to detect melanoma, applying the network directly to the images. We instead first extracted features from the images, which decreased the dimension of the input data and enabled us to use simple classifiers such as RBFN and SVM.

Fig. 4. Accuracy vs. Data size Curves. Accuracy analysis for A. K2, B. K3, C. K5 and D. K10 cross-validation protocols for different data sizes. The original data are randomly partitioned into different sizes and accuracy is computed for each cross-validation protocol. Accuracy is found to increase with increasing data size.

Our image dataset contained lesions of different sizes, so our image processing algorithm had to be scale invariant. Image resolution is also an important factor when applying the GLCM technique for feature extraction. We therefore had to account for both the scale and the resolution of the image dataset. The next problem was segmenting the lesion from the original image. If input images are not segmented properly, the performance of the classification network suffers; moreover, we cannot simply segment lesions manually, because our main goal is to automate the detection of melanoma. We therefore adopted minimum error thresholding to segment the lesion images. The feature extraction task was relatively simpler. The next task was to design a novel fitness function that had not been used by anyone before. In designing the fitness function we mainly focused on balancing specificity and sensitivity. Finally, we had to choose the network parameters that best suit our classification problem. Since our data was limited, we adopted K-fold cross-validation. The results for each cross-validation were averaged over the K folds and plotted for each value of K. We also experimented with data size and observed a slight increase in accuracy as the data size was increased. This could be because of the larger training set: as the network was exposed to more data, it was able to learn more about the input data.
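The cross-validation procedure described above can be sketched as follows, assuming a toy one-feature dataset and a nearest-centroid rule as a stand-in for the SVM and RBFN classifiers. The helper names `k_fold_splits`, `nearest_centroid_accuracy` and `mean_cv_accuracy`, the synthetic data, and the random seeds are all hypothetical and serve only to show how per-fold accuracies are averaged for each K.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def nearest_centroid_accuracy(x, y, train_idx, test_idx):
    """Stand-in classifier: assign each test value to the closer class mean."""
    means = {}
    for label in (0, 1):
        values = [x[j] for j in train_idx if y[j] == label]
        means[label] = sum(values) / len(values)
    correct = sum(
        1 for j in test_idx
        if min(means, key=lambda c: abs(x[j] - means[c])) == y[j]
    )
    return correct / len(test_idx)


def mean_cv_accuracy(x, y, k):
    """Average the test accuracy over the K folds."""
    scores = [nearest_centroid_accuracy(x, y, tr, te)
              for tr, te in k_fold_splits(len(x), k)]
    return sum(scores) / k


# Synthetic data: class 0 clustered near 0.2, class 1 near 0.8.
rng = random.Random(1)
x = ([rng.gauss(0.2, 0.05) for _ in range(50)]
     + [rng.gauss(0.8, 0.05) for _ in range(50)])
y = [0] * 50 + [1] * 50
for k in (2, 3, 5, 10):
    print(k, mean_cv_accuracy(x, y, k))
```

Each sample is used for testing exactly once per protocol, so the averaged score uses all of the limited data while never testing on a sample the model was trained on.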

In the results section we have also discussed the average training and testing times for the classifiers that we used. In the next section, approaches used by other researchers are benchmarked.

8. Benchmarking

Many researchers have already used machine learning approaches to classify melanoma. Ning Situ et al. used a Bag-of-Features approach to detect melanoma. They used Naive Bayes classification and SVM and obtained an accuracy of 82.21%.2–5 Garg N. et al. used an image processing approach to detect melanoma.3–4 Aya Abu Ali et al. used a CNN for detection of melanoma and obtained an accuracy of 81.1%.2–5 Keun-Kwang Lee et al. used a pre-trained network to extract features and used SVM for classification, obtaining an accuracy of 90.74%.4–10 Yihua Ding et al. used the K-Nearest Neighbors (KNN) algorithm for classification and obtained an accuracy of 89%.19 Table 5 summarizes the methods used by other researchers in detecting melanoma.

Fig. 5. ROC curves. ROC curves for A. K2, B. K3, C. K5 and D. K10 cross-validation protocols. An ROC curve plots the true positive rate against the false positive rate and measures the performance of a model at various threshold settings. The AUC is directly related to the accuracy of the model.
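For readers comparing classifiers by ROC curves such as those in Fig. 5, the following minimal sketch shows how the curve and its AUC can be computed from classifier scores. The function names `roc_points` and `auc` and the example scores and labels are made up for illustration; they are not data from this study.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples tied at the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores (higher means more likely melanoma); 1 = melanoma.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
curve = roc_points(scores, labels)
print(auc(curve))  # 0.875
```

In this example 14 of the 16 melanoma/benign pairs are ranked correctly, which is why the area equals 0.875.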

9. Conclusion

Our study proposed a method to detect a dangerous cancer like melanoma without complex networks such as CNNs or Deep Neural Networks (DNNs), using instead simpler network architectures like RBFN and SVM. We extracted GLCM (texture) features from images and selected the most pertinent features with a bio-inspired algorithm. We proposed a novel fitness function to balance specificity and sensitivity and used the bio-inspired algorithm to optimize it. With this simpler approach, we were able to produce results that are on par with those of more complex network structures.

We obtained good accuracy with relatively simple network architectures and were also able to balance specificity and sensitivity. However, the number of input data samples has to be increased so that the network can detect melanoma in images taken under different illumination conditions. Our approach used GLCM features. The feature set can be extended further to include color features in other color domains such as HSV (Hue, Saturation, Value). This should help in obtaining better results.
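As a rough indication of what such HSV color features could look like, the sketch below converts lesion pixels from RGB to HSV with Python's standard `colorsys` module and summarizes them by their mean hue, saturation and value. The function name `hsv_color_features`, the choice of mean statistics, and the sample pixel values are assumptions for illustration; this study did not use these features.

```python
import colorsys
import math


def hsv_color_features(pixels):
    """Mean hue (circular), saturation and value of 8-bit RGB pixels."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    # Hue is an angle, so average it on the unit circle, not arithmetically.
    sin_sum = sum(math.sin(2 * math.pi * h) for h, _, _ in hsv)
    cos_sum = sum(math.cos(2 * math.pi * h) for h, _, _ in hsv)
    mean_hue = (math.atan2(sin_sum, cos_sum) / (2 * math.pi)) % 1.0
    mean_sat = sum(s for _, s, _ in hsv) / len(hsv)
    mean_val = sum(v for _, _, v in hsv) / len(hsv)
    return mean_hue, mean_sat, mean_val


# Hypothetical lesion pixels: three shades of brown.
lesion_pixels = [(120, 70, 40), (100, 60, 30), (140, 80, 50)]
print(hsv_color_features(lesion_pixels))
```

Because the value channel carries most of the brightness information, hue and saturation statistics of this kind tend to be less sensitive to illumination changes than raw RGB values, which is one motivation for considering the HSV domain.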

Declaration of competing interest

The authors declare that they have no competing interests.

References

1. Agarwal Ashi, Issac Ashish, Dutta Malay Kishore, Riha Kamil, Uher Vaclac. Automated skin lesion segmentation using K-Means clustering from digital dermoscopic images. Telecommunications and Signal Processing (TSP) 2017 40th International Conference on. 2017; 2017:743–748.

2. Situ Ning, Yuan Xiaojing, Chen Ji, Zouridakis George. Malignant melanoma detection by Bag-of-Features classification. 2008 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Aug. 2008; Aug. 2008:20–25.

3. Garg N, Sharma V, Kaur P. Melanoma skin cancer detection using image processing. Sensors and Image Processing, Advances in Intelligent Systems and Computing. vol. 651. Singapore: Springer; 2009.

4. Lee Keun-Kwang, Ulzii-Orshikh. The skin cancer classification using deep convolutional neural network. Multimed Tools Appl. 2018:9909–9924.

5. Abu Ali Aya, Al-Marzouqi Hasan. Melanoma detection using regular convolutional neural networks. 2017 International Conference on Electrical and Computing Technologies and Applications (ICECTA). Nov. 2017; Nov. 2017:21–23.

6. Salkuti Surender. Optimal reactive power scheduling using cuckoo search algorithm. Int J Electr Comput Eng. 2017;7:2349. https://doi.org/10.11591/ijece.v7i5.pp2349-2356.

7. Surender Reddy S, Bijwe PR, Abhyankar AR. Faster evolutionary algorithm based optimal power flow using incremental variables. Int J Electr Power Energy Syst. 2014;54:198–210. https://doi.org/10.1016/j.ijepes.2013.07.019.

8. Surender Reddy S, Srinivasa Rathnam C. Optimal power flow using Glowworm Swarm optimization. Int J Electr Power Energy Syst. 2016;80:128–139. https://doi.org/10.1016/j.ijepes.2016.01.036.

9. Burges CJC. Simplified support vector decision rules. Proc. 13th Int'l Conf. Machine Learning. 1996; 1996:71–77.

10. Broomhead DS, Lowe David. Multivariable functional interpolation and adaptive networks. Complex Syst. 1988:321–355.

11. Alquran Hiam, Abu Qasmieh Isam, Alqudah Ali Mohammad, et al. The melanoma skin cancer detection and classification using support vector machine. 2017 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT). Oct. 2017; Oct. 2017:11–13.

12. Gijsenij A, Gevers T, van de Weijer J. Computational color constancy: survey and experiments. IEEE Trans Image Process. 2011.

13. Golbon-Haghighi MH, Saeidi-manesh H, Zhang G, Zhang Y. Pattern synthesis for the cylindrical polarimetric phased array radar (CPPAR). Electromagn Waves. 2003:87–98.

14. Cortes Corinna, Vapnik Vladimir N. Support-vector networks. Mach Learn. 1995:273–297.

15. Henawy Ibrahim El-Bakry, El Hadad Hazem M, Mastorakis Hagar, Nikos. Muzzle feature extraction based on gray level co-occurrence matrix. Isr J Vet Med. 2016.

16. Yang XS. A new metaheuristic bat-inspired algorithm. Nature Inspired Cooperative Strategies for Optimization (NICSO 2010). 2010; 2010:65–74.

17. Nakamura RYM, Pereira LAM, Costa KA, Rodrigues D, Papa JP, Yang XS. BBA: a binary bat algorithm for feature selection. 2012 25th SIBGRAPI Conference on Graphics, Patterns and Images. 2012; 2012.

18. Salkuti Surender, Panigrahi Bijaya. Optimal power flow using clustered adaptive teaching learning-based optimization. Int J Bio-Inspired Comput. 2017;9:226. https://doi.org/10.1504/IJBIC.2017.084316.

19. Yihua Ding, Qizhi Zhang, Jiang Bruce, Z Wang. Automatic diagnosis of melanoma using machine learning methods on a spectroscopic system. BMC Med Imaging.

Table 5. Benchmark table.

SN   Author name               Data type   Classifier   Accuracy (%)
1    Ning Situ et al.2         Melanoma    SVM          82.21
2    Garg N. et al.3           Melanoma    SVM          85.55
3    Keun-Kwang Lee et al.4    Melanoma    SVM          90.74
4    Aya Abu Ali et al.5       Melanoma    CNN          81
5    Yihua Ding et al.19       Melanoma    KNN          89
