
International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 – 6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 7, July (2014), pp. 71-82 © IAEME


LEAF IDENTIFICATION BASED ON FUZZY C MEANS AND NAÏVE BAYESIAN CLASSIFICATION

Shilpa Ankalaki¹, Laxmidevi Noolvi², Dr. Jharna Majumdar³

¹Department of CSE (PG), NMIT, Bangalore-560064, India

²Department of CSE, Assistant Professor, NMIT, Bangalore-560064, India

³Dean R&D, Prof and Head CSE (PG), NMIT, Bangalore-560064, India

ABSTRACT

Recognition of plants has become an active area of research, as many plant species are at risk of extinction. In this paper, efficient shape features, including moment invariants, are extracted during the feature extraction phase. The proposed system uses the Fuzzy C Means clustering method to group similar leaf images and Naïve Bayesian classification to assign a leaf image to one of the clusters. Different distance measures can be used to identify the closest match of the leaf; the proposed system uses Euclidean distance to search for the most similar leaf within the cluster.

Keywords: Euclidean Distance, Fuzzy C Means Clustering, Naïve Bayesian Classification.

1. INTRODUCTION

Plants are one of the most important forms of life on earth. They maintain the balance of oxygen and carbon dioxide in the earth's atmosphere [13]. The relationship between plants and human beings is also very close: plants are an important means of livelihood and production for human beings, and they are vitally important for environmental protection. However, recognizing the plant species on earth is an important and difficult task. Many species carry significant information for the development of human society, and many plants are at risk of extinction [10]. It is therefore necessary to set up a database for plant protection [3-4]. The proposed method concentrates on leaf shape features rather than color features, because the color of a leaf may change with the climate or because of disease, which makes color features unreliable.

INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND TECHNOLOGY (IJARET)

ISSN 0976 - 6480 (Print) ISSN 0976 - 6499 (Online) Volume 5, Issue 7, July (2014), pp. 71-82 © IAEME: http://www.iaeme.com/IJARET.asp Journal Impact Factor (2014): 7.8273 (Calculated by GISI) www.jifactor.com



2. PROPOSED METHODOLOGY

The proposed methodology consists of two phases:

i. Learning phase

ii. Identification phase

Fig. 1 shows the flow diagram of the learning phase and the identification phase.

2.1 Image Acquisition

Leaves usually grow in clusters, which makes it difficult to automatically separate the features of one leaf from the unneeded background. We therefore created leaf image plates: each leaf is placed on a light panel and photographed with a digital camera. In this way, we obtain an image that contains only one leaf.

2.2 Image Pre-processing

The raw data, depending on the data acquisition type, is subjected to a number of pre-processing steps to make it usable in the descriptive stages of analysis. Pre-processing aims to produce image data that the leaf identification system can process quickly and accurately.

2.2.1 Conversion of Color image to Gray scale image

The colors of plant leaves are usually green. Moreover, shading and variations in water, nutrients, atmosphere and season can change the color, so the color feature has low reliability. We therefore recognize plants from the gray-level image of the leaf. Fig. 2 shows the pre-processing of a leaf image.

The RGB image is first converted into a grayscale image. Eq. (1) is the formula used to convert the RGB value of a pixel into its grayscale value.

Gray = 0.2989 * R + 0.5870 * G + 0.1140 * B (1)

where R, G and B are the red, green and blue components of the pixel, respectively.
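As an illustration of Eq. (1), the following minimal sketch converts an image stored as nested lists of (R, G, B) tuples into gray levels. The function names and the nested-list image representation are assumptions made for this example and are not part of the original system.

```python
def rgb_to_gray(r, g, b):
    """Weighted sum of Eq. (1); inputs are 0-255 channel values."""
    return 0.2989 * r + 0.5870 * g + 0.1140 * b


def image_to_gray(rgb_image):
    """Convert a nested list of (R, G, B) tuples into rounded gray levels."""
    return [[round(rgb_to_gray(*pixel)) for pixel in row] for row in rgb_image]


# Hypothetical 1 x 3 image: pure green, white and black pixels.
sample = [[(0, 255, 0), (255, 255, 255), (0, 0, 0)]]
gray = image_to_gray(sample)
```

Green contributes most to the gray level because it carries the largest weight in Eq. (1).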

Fig.1: Flow diagram of learning and identification phase


2.2.2 Generation of Binary Image

In the analysis of images, it is essential to separate the objects of interest from the background. The techniques used to do this are referred to as thresholding techniques. The pixels belonging to the region of interest are known as foreground pixels, and the remaining pixels are known as background pixels. The image data is converted to a two-level binary image with pixel values of 0 or 255 using thresholding: all pixels above a certain level are assigned 255 and the remaining pixels are assigned 0. The proposed methodology uses the statistical mean method and Otsu's method for automatic thresholding.
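The thresholding step can be sketched as follows. This is a plain-Python version of Otsu's method (choose the level that maximizes the between-class variance of the histogram) followed by the 0/255 assignment described above. The function names and the tiny sample image are assumptions for illustration only.

```python
def otsu_threshold(gray):
    """Return the gray level t that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]            # number of pixels with value <= t
        sum_bg += t * hist[t]
        w_fg = total - w_bg        # number of pixels with value > t
        if w_bg == 0 or w_fg == 0:
            continue
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(gray, t):
    """Pixels above t become 255 and the remaining pixels become 0."""
    return [[255 if value > t else 0 for value in row] for row in gray]


# Hypothetical image with a dark row and a bright row.
gray = [[10, 12, 11], [200, 210, 205]]
t = otsu_threshold(gray)
binary = binarize(gray, t)
```

The statistical mean method mentioned in the text would simply use the average gray level as `t` instead.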

2.2.3 Extraction of Boundary

Boundary extraction can be applied to any image containing only boundary information. Once a single boundary point is found, the operation seeks all other pixels on that boundary. The boundary can be extracted using the chain code technique. The system defines the boundary of the leaf in terms of x-y coordinates [11]. From a starting point, the system traces the boundary coordinates in a clockwise direction.

Fig.2: Pre Processing of leaf image

2.3 Feature Extraction

Feature extraction involves extracting the geometric features that represent the shape of the leaf. These features are used by the classifier to classify the leaf image. The individual features are discussed below.

2.3.1 Aspect ratio: The aspect ratio [1] is the ratio between the length and the width of the minimum bounding rectangle of the leaf image. It is a scale-invariant feature.

$$\text{Aspect Ratio} = \frac{\text{Length}}{\text{Width}} \tag{2}$$

2.3.2 Rectangularity: Rectangularity measures how closely the shape of the leaf approaches a rectangle, that is, the similarity between the leaf and a rectangle. To calculate it, first create the bounding box of the leaf image, then take the ratio of the leaf area to the area of the bounding box.

$$\text{Rectangularity} = \frac{\text{Leaf area}}{\text{Length} \times \text{Width}} \tag{3}$$

2.3.3 Perimeter: The total number of pixels on the leaf boundary.


2.3.4 Roundness: Roundness [2][8] measures how closely the shape of the leaf approaches that of a circle. The difference between a leaf and a circle is calculated using Eq. (4).

$$\text{Roundness} = \frac{4\pi \times \text{Area}}{\text{Perimeter}^2} \tag{4}$$
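The aspect ratio, rectangularity, perimeter and roundness features can be sketched on a binary mask as follows. Several choices here are assumptions: the bounding box is axis-aligned, a boundary pixel is a leaf pixel with at least one 4-connected background (or out-of-image) neighbour, and roundness is taken as 4π × Area / Perimeter². The function name and the sample mask are hypothetical.

```python
import math


def shape_features(mask):
    """Aspect ratio, rectangularity, perimeter and roundness of a 0/1 mask."""
    rows, cols = len(mask), len(mask[0])
    pixels = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    area = len(pixels)

    # Axis-aligned bounding box (an assumption of this sketch).
    height = max(r for r, _ in pixels) - min(r for r, _ in pixels) + 1
    width = max(c for _, c in pixels) - min(c for _, c in pixels) + 1
    length, breadth = max(height, width), min(height, width)

    def is_boundary(r, c):
        # A leaf pixel is on the boundary if a 4-neighbour is background.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or not mask[nr][nc]:
                return True
        return False

    perimeter = sum(1 for r, c in pixels if is_boundary(r, c))
    return {
        "aspect_ratio": length / breadth,
        "rectangularity": area / (length * breadth),
        "perimeter": perimeter,
        "roundness": 4 * math.pi * area / perimeter ** 2,
    }


# Hypothetical 4 x 6 filled rectangle inside a 6 x 8 image.
mask = [[1 if 1 <= r <= 4 and 1 <= c <= 6 else 0 for c in range(8)]
        for r in range(6)]
features = shape_features(mask)
```

For a filled rectangle the rectangularity is exactly 1, which makes it a convenient sanity check before running the function on real leaf masks.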

2.3.5 Sphericity: Sphericity [4][5] is the ratio of the radius of the incircle of the leaf object (ri) to the radius of the excircle of the leaf object (rc). The incircle and excircle are shown in Fig. 3(a).

2.3.6 Principal axes: The principal axes of a given shape are uniquely defined as the two line segments that cross each other orthogonally at the centroid of the shape and represent the directions with zero cross-correlation. In this way, a contour is seen as an instance of a statistical distribution. Fig. 3(b) shows the principal axes.

2.3.7 Eccentricity: Eccentricity is defined as the ratio of the minor principal axis to the major principal axis.

2.3.8 Tooth Feature: A tooth point [14] is a pixel on the contour that has a high curvature, i.e., it is a peak. To determine whether a point Pi on the contour is a tooth point, we examine the angle θ subtended at Pi by its neighbors Pi-k and Pi+k (where k is a threshold). Fig. 3(c) shows an example. If the angle θ is within a particular range, then Pi is a tooth; otherwise, it is not. Two different types of leaves may have nearly the same number of teeth at a particular threshold [12], so we compute the tooth-based features at multiple increasing threshold values.
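A minimal sketch of the tooth test is given below. It assumes the contour is an ordered, closed list of distinct (x, y) points with more than 2k entries. The angle range of 30 to 100 degrees is a placeholder chosen for this example, since the paper does not state the range it uses.

```python
import math


def angle_at(contour, i, k):
    """Angle in degrees subtended at contour[i] by contour[i-k] and contour[i+k]."""
    n = len(contour)
    px, py = contour[i]
    ax, ay = contour[(i - k) % n]
    bx, by = contour[(i + k) % n]
    v1, v2 = (ax - px, ay - py), (bx - px, by - py)
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against rounding before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def count_teeth(contour, k, low=30.0, high=100.0):
    """Count contour points whose angle lies in [low, high] degrees (assumed range)."""
    return sum(1 for i in range(len(contour))
               if low <= angle_at(contour, i, k) <= high)


# Hypothetical contour: a square of side 4 sampled at unit steps.
square = ([(x, 0) for x in range(4)] + [(4, y) for y in range(4)]
          + [(x, 4) for x in range(4, 0, -1)] + [(0, y) for y in range(4, 0, -1)])
teeth_by_k = {k: count_teeth(square, k) for k in (1, 2)}
```

Evaluating `count_teeth` for several values of k gives the multi-threshold tooth features described in the text.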

Fig. 3: (a) Incircle and excircle, (b) Principal axes, (c) Tooth detection at two different thresholds, (d) Black occupancy, (e) Convex hull

2.3.9 Black occupancy: The input leaf image is divided into 36 equal boxes (a 6 x 6 grid), and black occupancy is the number of boxes that are occupied by leaf pixels. It is a scale-invariant feature. Fig. 3(d) shows the black occupancy.
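The black occupancy count can be sketched as below, assuming the leaf is given as a 0/1 mask in which non-zero pixels belong to the leaf. The function name and the sample mask are hypothetical.

```python
def black_occupancy(mask, grid=6):
    """Count the grid x grid boxes that contain at least one leaf pixel."""
    rows, cols = len(mask), len(mask[0])
    occupied = set()
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                # Map the pixel to the index of the box that contains it.
                occupied.add((r * grid // rows, c * grid // cols))
    return len(occupied)


# Hypothetical 12 x 12 mask with leaf pixels in the top-left 6 x 4 region.
mask = [[1 if r < 6 and c < 4 else 0 for c in range(12)] for r in range(12)]
boxes = black_occupancy(mask)
```

Because the boxes scale with the image size, the count stays the same when the image is resized, which is why the feature is scale invariant.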

2.3.10 Convex Hull: The convex hull of a set of points S is the boundary of the smallest convex region that contains all the points of S inside it or on its boundary. The convex hull problem has a number of applications, including partitioning problems, shape testing problems, and separation problems. Fig. 3(e) shows the convex hull. The algorithm for the convex hull is given in Appendix A.

Using the convex hull, we can derive two more features based on the shape of the leaf image and its convex hull.

• Convexity

Convexity is defined as the ratio between the perimeter of the convex hull of the leaf and the perimeter of the leaf:

Convexity = Convex Perimeter / Leaf Perimeter


• Solidity

Solidity is defined as the ratio between the area of the leaf and the area of its convex hull:

Solidity = Area of Leaf / Area of Convex Hull
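As a worked example of the two hull-based features, the sketch below treats the leaf outline and its convex hull as polygons and uses the shoelace formula for the areas. Solidity follows the prose definition (leaf area divided by hull area). The L-shaped outline and its hull vertices are hypothetical values supplied by hand.

```python
import math


def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as ordered (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(vertices):
    """Sum of the edge lengths of a closed polygon."""
    return sum(math.dist(p, q)
               for p, q in zip(vertices, vertices[1:] + vertices[:1]))


# Hypothetical L-shaped leaf outline and its convex hull.
leaf = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
hull = [(0, 0), (4, 0), (4, 2), (2, 4), (0, 4)]

convexity = polygon_perimeter(hull) / polygon_perimeter(leaf)
solidity = polygon_area(leaf) / polygon_area(hull)
```

Both values are at most 1 and equal 1 only for a convex outline, so deeply lobed leaves score lower on both features.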

2.3.11 Leaf Vein Extraction: The leaf vein structure [3][6] is one of the important features of the leaf. Leaf veins can be extracted using grayscale morphological operations. We perform the morphological top-hat transformation on the grayscale image with disk-shaped structuring elements of radius 2 and 3. The result resembles the leaf vein structure. The leaf vein features are Av1/A and Av2/A, where A is the leaf area and Av1 and Av2 are the total numbers of pixels on the leaf veins obtained with structuring elements of radius 2 and 3, respectively. Algorithm 2 in Appendix A describes the leaf vein extraction. Fig. 6 shows the leaf vein extraction using a disk-shaped structuring element of radius 2.

Fig. 6: Leaf vein extraction

2.4 Normalization of Database Features

Data normalization is a useful step, often adopted prior to designing a classifier, as a precaution when the feature values vary over different dynamic ranges [8]. In the absence of normalization, features with large values have a stronger influence on the cost function used in designing the classifier. After normalization, the values of all features lie in a predetermined range. Normalization is done using the following formula:

$$X = \frac{x_i - x_{min}}{x_{max} - x_{min}}$$

where X is the new value of the feature, x_i is the original value of the feature, x_min is the smallest value of the original feature, and x_max is the largest value of the original feature.
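A minimal sketch of this min-max normalization, applied column by column to a small feature database, is shown below. The function names, the handling of constant columns and the sample values are assumptions for illustration.

```python
def min_max_normalize(column):
    """Scale a list of feature values to [0, 1]; constant columns map to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def normalize_features(rows):
    """Normalize each feature (column) of a list of feature vectors."""
    columns = [min_max_normalize(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# Hypothetical database: two features with very different dynamic ranges.
database = [[1.0, 200.0], [2.0, 400.0], [3.0, 300.0]]
normalized = normalize_features(database)
```

After scaling, both features contribute on the same [0, 1] scale to any distance computed between leaves.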

3. LEARNING PHASE

During the learning phase, the first step is to acquire a leaf image from the image database. The input image is then pre-processed, its geometric features are extracted, and the extracted features are added to the feature database. In the same way, geometric features are extracted for all images in the image database and added to the feature database. Feature normalization is then applied to the feature database so that all features lie within a predetermined range. The proposed system uses the Fuzzy C Means clustering method to cluster similar leaves based on the normalized feature database.

3.1 Fuzzy C Means Clustering

Data clustering is the process of dividing data elements into classes or clusters so that items in the same class are as similar as possible and items in different classes are as dissimilar as possible. Depending on the nature of the data and the purpose for which clustering is being used, different measures of similarity may be used to place items into classes, where the similarity measure controls how the clusters are formed. Examples of measures that can be used in clustering include distance, connectivity, and intensity. In fuzzy clustering, data elements can belong to more than one cluster, and each element has an associated set of membership levels. These indicate the strength of the association between that data element and a particular cluster. Fuzzy clustering is the process of assigning these membership levels and then using them to assign data elements to one or more clusters.

Fuzzy C Means takes the feature database as its input. The user needs to specify the number of clusters required. The algorithm assigns each data point a membership in each cluster on the basis of the distance between the cluster centre and the data point: the nearer a data point is to a cluster centre, the higher its membership in that cluster. The memberships of each data point sum to one across all clusters. Algorithm 3 in Appendix A describes the Fuzzy C Means clustering algorithm.

4. IDENTIFICATION PHASE

During identification, the first step is to read the test leaf image. The second step is pre-processing of the input leaf image, and the third step is feature extraction. The features of the given leaf are normalized and stored in a feature vector. The next step is to search for the cluster to which the input image belongs; Naïve Bayesian classification is used for this purpose.

4.1 Naïve Bayesian Classification

Naïve Bayesian classification is a supervised classification method; it takes its prior knowledge from the clusters. It is possible to use Naïve Bayesian classification without a clustering method, but the database must then be created so that all similar leaves are in one class. The proposed system therefore uses unsupervised clustering to form the classes and supervised classification to assign the leaf image to one of these classes. Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class. Naïve Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is called class conditional independence. In the proposed methodology, Naïve Bayesian classification is used to find the probability that the input leaf image belongs to each cluster. The input leaf image is assigned to the cluster with the maximum probability. The Naïve Bayesian classifier is based on Bayes' theorem, which is given as follows:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

where

X: feature vector of the given leaf.

C: leaf clusters.

i: index of the cluster.

P(C_i|X) is the probability that the cluster C_i holds given the observed data tuple X.

P(X|C_i) is the posterior probability of X conditioned on C_i.

P(X) is the prior probability of X.

P(C_i) is the prior probability, or a priori probability, of C_i.

The calculation of P(C_i|X), P(X|C_i), P(C_i) and P(X) is described in Algorithm 4 of Appendix A.


4.2 Euclidean Distance

In the proposed methodology, the Euclidean distance measure [7][13] is used to identify the leaf image within the cluster. To identify the leaf, the distance is computed between the feature vector of the input leaf image and the feature vectors of all leaves in the cluster selected by the Naïve Bayesian classifier. The leaf image with the minimum distance to the input leaf is identified as the recognized leaf.
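The within-cluster search can be sketched as a nearest-neighbour lookup. The dictionary layout, the leaf names and the feature values below are hypothetical and only illustrate the minimum-distance rule.

```python
import math


def nearest_leaf(query, cluster):
    """Return (name, distance) of the cluster member closest to the query vector."""
    return min(((name, math.dist(query, vec)) for name, vec in cluster.items()),
               key=lambda pair: pair[1])


# Hypothetical cluster of normalized feature vectors keyed by leaf name.
cluster = {"leaf_a": [0.1, 0.4], "leaf_b": [0.9, 0.9]}
name, distance = nearest_leaf([0.2, 0.4], cluster)
```

Because the features are normalized beforehand, no single feature dominates the distance.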

5. EXPERIMENTAL RESULTS

The proposed methodology uses invariant shape features. A training database of 50 plant leaves was considered. The proposed leaf identification system recognizes a leaf correctly even when it is damaged. The experiments were designed to classify each test image into a single class. Since all the leaf images were taken by us, their true classes are known. In our experiment, the Fuzzy C Means clustering method is used to cluster similar leaves. The clusters of sample leaves obtained using Fuzzy C Means are shown in Fig. 7. Naïve Bayesian classification and the Euclidean distance method are used for identification. Fig. 8 shows the identification of the cluster containing the input image using Naïve Bayesian classification. Fig. 9 shows the identification of the leaf within the cluster using the Euclidean distance measure.

Fig 7: Clusters obtained using Fuzzy C Means Clustering method

Fig.8: Identification of cluster which contains input image to be identified.


Fig.9: Identification of input image

6. CONCLUSION

The work described in this paper is concerned with two challenging phases in image analysis applications: feature extraction and classification. Since no general feature extraction method is available for all types of images, an experiment needs to be conducted to determine the suitable methods for plant leaf images. Therefore, an investigation of suitable shape features and moment invariant techniques for the feature extraction of plant leaf images was presented. The proposed methodology gives 83.24% accuracy. One limitation of this research is the small sample of leaf images. The future scope of this research is the identification of compound leaves with different backgrounds and the improvement of the performance of the identification system.

7. ACKNOWLEDGMENT

The authors acknowledge Prof. N R Shetty, Director, Nitte Meenakshi Institute of Technology and Dr. H C Nagaraj, Principal, Nitte Meenakshi Institute of Technology for providing the support and infrastructure to carry out our research.

REFERENCES

1. Chia-Ling Lee and Shu-Yuan Chen, “Classification for Leaf Images”, 16th IPPR Conference on Computer Vision, Graphics and Image Processing (CVGIP 2003).

2. Qingfeng Wu, Changle Zhou and Chaonan Wang, “Feature Extraction and Automatic Recognition of Plant Leaf Using Artificial Neural Network”, in A. Gelbukh, S. Torres, I. López (Eds.), Avances en Ciencias de la Computación, 2006, pp. 5-12.

3. S. G. Wu, F. S. Bao, E. Y. Xu, Y-X. Wang, Y-F. Chang and Q-L. Xiang, “A Leaf Recognition Algorithm for Plant Classification Using Probabilistic Neural Network”, IEEE 7th International Symposium on Signal Processing and Information Technology, Cairo, 2007.

4. J. Du, X. Wang and G. Zhang, “Leaf shape based plant species recognition”, Applied Mathematics and Computation, vol. 185-2, pp. 883-893, February 2007.

5. David Knight, James Painte and Matthew Potter, “Automatic Plant Leaf Classification for a Mobile Field Guide”.

6. Xiaodong Zheng and Xiaojie Wang, “Leaf Vein Extraction Based on Gray-scale Morphology”, I.J. Image, Graphics and Signal Processing, 2010, 2, 25-31, published online December 2010 in MECS (http://www.mecs-press.org/).

7. Chomtip Pornpanomchai, Chawin Kuakiatngam, Pitchayuk Supapattranon and Nititat Siriwisesokul, “Leaf and Flower Recognition System (e-Botanist)”, IACSIT International Journal of Engineering and Technology, Vol. 3, No. 4, August 2011.

8. Abdul Kadir, Lukito Edi Nugroho, Adhi Susanto and Paulus Insap Santosa, “Leaf Classification Using Shape, Color, and Texture Features”, International Journal of Computer Trends and Technology, July to Aug Issue 2011.

9. Jyotismita Chaki and Ranjan Parekh, “Plant Leaf Recognition using Shape based Features and Neural Network classifiers”, (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 2, No. 10, 2011.

10. Prof. Meeta Kumar, Mrunali Kamble, Shubhada Pawar, Prajakta Patil and Neha Bonde, “Survey on Techniques for Plant Leaf Classification”, International Journal of Modern Engineering Research (IJMER), www.ijmer.com, Vol. 1, Issue 2, pp. 538-544, ISSN: 2249-6645.

11. Chomtip Pornpanomchai, Supolgaj Rimdusit, Piyawan Tanasap and Chutpong Chaiyod, “Thai Herb Leaf Image Recognition System (THLIRS)”, Kasetsart Journal: Natural Science, May 2011, 45: 551-562.

12. Akhil Arora, Ankit Gupta, Nitesh Bagmar, Shashwat Mishra and Arnab Bhattacharya, “A Plant Identification System using Shape and Morphological Features on Segmented Leaflets: Team IITK, CLEF 2012”.

13. Anant Bhardwaj, Manpreet Kaur and Anupam Kumar, “Recognition of plants by Leaf Image using Moment Invariant and Texture Analysis”, International Journal of Innovation and Applied Studies, ISSN 2028-9324, Vol. 3, No. 1, May 2013, pp. 237-248, © 2013 Innovative Space of Scientific Research Journals.

14. Vijay Satti and Anshul Satya, “An Automatic Leaf Recognition System For Plant Identification Using Machine Vision Technology”, International Journal of Engineering Science and Technology (IJEST), ISSN: 0975-5462, Vol. 5, No. 04, April 2013.

15. Jyotismita Chaki and Ranjan Parekh, “Designing an Automated System for Plant Leaf Recognition”, International Journal of Advances in Engineering & Technology, Jan 2012, © IJAET, ISSN: 2231-1963.

16. Laura Keyes and Adam Winstanley, “Using Moment Invariants for Classifying Shapes on Large_Scale Maps”.

17. George H. John and Pat Langley, “Estimating Continuous Distributions in Bayesian Classification”, in Proceedings of the Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann Publishers, San Mateo, 1995.

18. James C. Bezdek, Robert Ehrlich and William Full, “FCM: The Fuzzy C-Means Clustering Algorithm”, Computers & Geosciences, Vol. 10, No. 2-3, pp. 191-203, 1984.

19. Garima Agarwal, Rekha Nair and Pravin Shrinath, “A Review of Plant Leaf Classification Features and Techniques”, International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 5, 2013, pp. 204-216, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.

APPENDIX A

Algorithm 1: Algorithm for convex hull

The idea is to use one extreme edge as an anchor for finding the next. Suppose the algorithm has found an extreme edge whose unlinked endpoint is x.

• For each point y of the set S, compute the angle θ.

• The point that yields the smallest θ must determine an extreme edge.

• The output of this algorithm is all the points on the hull in boundary traversal order.


Algorithm: Convex Hull

Input: Color leaf image I (M x N)

Output: Convex hull of the given image

Steps:

Step 1: Generate the grayscale image.

Step 2: Generate the binary image.

Step 3: Find the lowest point (smallest y coordinate).

Step 4: Find the extreme points on every row and store them in an array.

Step 5: Let i0 be the index of the lowest point in this array, and set i ← i0.

Step 6: For every point stored in the array, compute the counterclockwise angle θ at the point pi (the point with index i).

Step 7: Find the point with the smallest θ.

Step 8: Let k be the index of the point with the smallest θ.

Step 9: Output (pi, pk) as a hull edge.

Step 10: i ← k

Step 11: Repeat from Step 6 until i = i0.
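The wrapping loop of Steps 3 to 11 can be sketched as follows. Instead of computing the angle θ explicitly, this version uses a cross-product orientation test to pick the next hull point, which is a substitution made for this example. It also works directly on a list of (x, y) points rather than on an image, and the function names are hypothetical.

```python
def cross(o, a, b):
    """Positive when the path o -> a -> b turns counterclockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def gift_wrap(points):
    """Return the hull vertices in counterclockwise order from the lowest point."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    start = min(pts, key=lambda p: (p[1], p[0]))   # lowest point, then leftmost
    hull, current = [], start
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # Take p if it lies to the right of current -> candidate,
            # or if it is collinear with it and farther away.
            if turn < 0 or (turn == 0 and
                            dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:
            return hull


# Hypothetical point set: a square with one interior and one edge point.
points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = gift_wrap(points)
```

Each pass over the points finds one hull edge, so the running time is proportional to the number of points times the number of hull vertices.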

Algorithm 2: Algorithm for vein extraction

Algorithm: To find the veins of a leaf

Input: Grayscale leaf image

Output: Leaf vein structure

Step 1: Read the grayscale image.

Step 2: Let f be the grayscale leaf image and b be the disk-shaped structuring element. A structuring element of radius 2, 3, 4 or 5 can be used.

Step 3: Perform erosion on the grayscale image using the disk-shaped structuring element, that is, find the minimum value in the neighborhood and assign it to the pixel at the origin of the structuring element.

Step 4: Perform dilation on the output of the erosion using the same structuring element, that is, find the maximum value in the neighborhood and assign it to the pixel at the origin of the structuring element. Erosion followed by dilation is called the morphological opening operation.

Step 5: Subtract the result of the opening operation from the original grayscale image. This process is called the top-hat transformation.

Step 6: Convert the result of the top-hat transformation into a binary image.

Step 7: Compute Av1/A, where Av1 is the vein area obtained with the structuring element of radius 2 and A is the leaf area.

Step 8: The leaf vein structure can be extracted using different structuring elements, such as disk-shaped structuring elements of radius 2, 3, 4 and 5, and the vein feature obtained with each structuring element is stored as a feature.
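The steps of the vein extraction algorithm can be sketched in plain Python as below. The border handling (neighbours outside the image are ignored), the binarization threshold and all function names are assumptions made for this example, and the sample image is synthetic.

```python
def disk_offsets(radius):
    """Offsets (dr, dc) covered by a flat disk-shaped structuring element."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if dr * dr + dc * dc <= radius * radius]


def _morph(image, offsets, pick):
    """Apply min (erosion) or max (dilation) over each pixel's neighbourhood."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = pick(image[r + dr][c + dc] for dr, dc in offsets
                             if 0 <= r + dr < rows and 0 <= c + dc < cols)
    return out


def erode(image, offsets):
    return _morph(image, offsets, min)


def dilate(image, offsets):
    return _morph(image, offsets, max)


def top_hat(image, radius):
    """Top-hat transformation: the image minus its morphological opening."""
    offsets = disk_offsets(radius)
    opened = dilate(erode(image, offsets), offsets)
    return [[p - o for p, o in zip(img_row, open_row)]
            for img_row, open_row in zip(image, opened)]


def vein_ratio(image, leaf_area, radius, threshold):
    """Av / A: thresholded top-hat pixels divided by the leaf area."""
    vein_pixels = sum(1 for row in top_hat(image, radius)
                      for v in row if v > threshold)
    return vein_pixels / leaf_area


# Synthetic 7 x 7 image: uniform background with a thin bright vertical line.
image = [[200 if c == 3 else 50 for c in range(7)] for r in range(7)]
ratio = vein_ratio(image, leaf_area=49, radius=1, threshold=100)
```

The opening removes structures thinner than the structuring element, so subtracting it from the original leaves only thin details such as the line in the example.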

Algorithm 3: Algorithm for Fuzzy C Means

The algorithm for Fuzzy C Means clustering [18] is as follows:

Algorithm: The Fuzzy C Means algorithm

Input:

• c: the number of clusters,

• D: a data set containing n objects.

Output: A set of c clusters.


Method:

Let X = {x1, x2, x3, ..., xn} be the set of data points and V = {v1, v2, v3, ..., vc} be the set of centers.

Step 1: Randomly select 'c' cluster centers.

Step 2: Calculate the fuzzy membership 'µij' using Eq. (8):

$$\mu_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{ik}} \right)^{\frac{2}{m-1}}} \tag{8}$$

Step 3: Compute the fuzzy centers 'vj' using Eq. (9):

$$v_j = \frac{\sum_{i=1}^{n} (\mu_{ij})^m x_i}{\sum_{i=1}^{n} (\mu_{ij})^m}, \quad \forall j = 1, 2, \ldots, c \tag{9}$$

where

'n' is the number of data points.

'vj' represents the jth cluster center.

'm' is the fuzziness index, m ∈ [1, ∞); Eq. (8) requires m > 1.

'c' represents the number of cluster centers.

'µij' represents the membership of the ith data point in the jth cluster.

'dij' represents the Euclidean distance between the ith data point and the jth cluster center.

The main objective of the Fuzzy C Means algorithm is to minimize the least square error, which is calculated using Eq. (10):

$$J(U, V) = \sum_{i=1}^{n} \sum_{j=1}^{c} (\mu_{ij})^m \, \lVert x_i - v_j \rVert^2 \tag{10}$$

where '||xi – vj||' is the Euclidean distance between the ith data point and the jth cluster center.

Step 4: Repeat Steps 2 and 3 until the minimum 'J' value is achieved or ||U(k+1) − U(k)|| < β,

where 'k' is the iteration step, 'β' is the termination criterion between [0, 1], 'U = (µij)n*c' is the fuzzy membership matrix, and 'J' is the objective function.
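The iteration of Steps 1 to 4 can be sketched in plain Python as follows. Several details are assumptions of this example rather than choices stated in the paper: the initial centers are sampled from the data with a fixed seed, m defaults to 2, the change in U is measured by its largest absolute entry, and a point that coincides with a center receives full membership in it.

```python
import math
import random


def fuzzy_c_means(data, c, m=2.0, beta=1e-5, max_iter=100, seed=0):
    """Return (centers, memberships) for a list of equal-length feature vectors."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, c)]   # Step 1
    dims = len(data[0])
    u_prev = None
    u = []
    for _ in range(max_iter):
        # Step 2: memberships from the distances to every center.
        u = []
        for x in data:
            d = [math.dist(x, v) for v in centers]
            if 0.0 in d:
                row = [1.0 if dj == 0.0 else 0.0 for dj in d]
                total = sum(row)
                row = [value / total for value in row]
            else:
                row = [1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in d)
                       for dj in d]
            u.append(row)
        # Step 3: centers as membership-weighted means of the data.
        for j in range(c):
            weights = [u[i][j] ** m for i in range(len(data))]
            total = sum(weights)
            if total == 0.0:
                continue
            centers[j] = [sum(w * x[k] for w, x in zip(weights, data)) / total
                          for k in range(dims)]
        # Step 4: stop when the largest membership change falls below beta.
        if u_prev is not None:
            change = max(abs(a - b)
                         for ra, rb in zip(u, u_prev) for a, b in zip(ra, rb))
            if change < beta:
                break
        u_prev = u
    return centers, u


# Hypothetical normalized features forming two well-separated groups.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
          [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
centers, memberships = fuzzy_c_means(points, c=2)
labels = [row.index(max(row)) for row in memberships]
```

Taking the cluster with the largest membership, as in `labels`, gives a hard assignment when one is needed for the later classification step.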

Algorithm 4: Naïve Bayesian Classification

Input: Leaf features of the training data set

Output: Classification of the test leaf

Step 1: Apply any clustering method to the training data set to form the clusters.

Step 2: Store the features of the input leaf in a feature vector.

Step 3: Find, for each cluster, the probability that the cluster holds given the features of the input leaf image. This probability is called the posterior probability and is calculated using Eq. (11):

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)} \tag{11}$$

where

X: feature vector of the given leaf.

C: leaf clusters.

i: index of the cluster.


The calculation of P(C_i), P(X|C_i) and P(X) is given in the following steps.

Step 4: Calculate the class probability P(C_i) using Eq. (12). It is a constant value for each class.

$$P(C_i) = \frac{|C_{i,D}|}{|D|} \tag{12}$$

where

|D|: total number of training tuples in the database D.

|C_{i,D}|: number of training tuples of class C_i in D.

Step 5: P(X) is constant for all classes, so to maximize the posterior probability only P(X|C_i) × P(C_i) needs to be maximized. P(X) can be calculated using Eq. (13):

$$P(X) = P(X \mid C_1)\,P(C_1) + \ldots + P(X \mid C_c)\,P(C_c) \tag{13}$$

where c is the number of clusters.

Step 6: In order to reduce the computation in evaluating P(X|C_i), the naive assumption of class conditional independence is made. This presumes that the values of the attributes are conditionally independent of one another, given the class label of the tuple. Thus,

$$P(X \mid C_i) = P(x_1, x_2, \ldots, x_n \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \ldots \times P(x_n \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$

where X is the feature vector with attributes {x1, x2, ..., xn} and n is the total number of attributes.

Step 7: For continuous-valued features, P(x_k|C_i) is found by applying the Gaussian distribution, i.e. the probability that the leaf belongs to the cluster is computed with respect to each feature. This involves the following steps.

• Calculate the mean for each cluster with respect to each feature.

• Calculate the variance for each cluster with respect to each feature, and apply the Gaussian distribution formula [17] shown in Eq. (14):

$$g(x, \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{14}$$

Finally, the probability that the leaf belongs to a particular class is given by Eq. (15):

$$P(x_k \mid C_i) = g(x_k, \mu_{C_i}, \sigma_{C_i}) \tag{15}$$

The leaf image being tested belongs to the cluster with the highest probability.
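The classification steps can be sketched as a small Gaussian Naïve Bayes routine. Two choices are assumptions of this example: the computation is done with log-probabilities to avoid numerical underflow (which leaves the maximizing cluster unchanged), and a small floor is placed on the standard deviation so that features with zero variance do not cause division by zero. The cluster labels and feature values are hypothetical.

```python
import math


def log_gaussian(x, mu, sigma):
    """Logarithm of the Gaussian density with mean mu and standard deviation sigma."""
    return (-((x - mu) ** 2) / (2.0 * sigma ** 2)
            - math.log(math.sqrt(2.0 * math.pi) * sigma))


def fit(clusters, min_sigma=1e-3):
    """clusters maps a cluster label to its list of training feature vectors."""
    total = sum(len(rows) for rows in clusters.values())
    model = {}
    for label, rows in clusters.items():
        stats = []
        for column in zip(*rows):
            mu = sum(column) / len(column)
            var = sum((x - mu) ** 2 for x in column) / len(column)
            stats.append((mu, max(math.sqrt(var), min_sigma)))
        # Prior: fraction of training tuples that fall in this cluster.
        model[label] = (len(rows) / total, stats)
    return model


def classify(model, x):
    """Return the label that maximizes log P(Ci) + sum_k log P(xk | Ci)."""
    def score(label):
        prior, stats = model[label]
        return math.log(prior) + sum(log_gaussian(xk, mu, sigma)
                                     for xk, (mu, sigma) in zip(x, stats))
    return max(model, key=score)


# Hypothetical clusters of normalized two-feature vectors.
clusters = {
    "cluster_1": [[0.10, 0.20], [0.20, 0.10], [0.15, 0.15]],
    "cluster_2": [[0.80, 0.90], [0.90, 0.80], [0.85, 0.85]],
}
model = fit(clusters)
predicted = classify(model, [0.12, 0.18])
```

P(X) is never computed here because it is the same for every cluster and does not affect which cluster attains the maximum.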