5
978-1-4244-9182-7/10/$26.00 ©2010 IEEE 138 Extraction of Image Features for an Effective CBIR System P. Gangadhara Reddy 1 , JNTU College of Engineering Anantapur, Anantapur, A.P, India. Abstract - In this paper, we propose a content-based image retrieval system based on an efficient combination of both color and texture features. According to HSV (Hue, Saturation, and Value) color space, we quantified the color space into non-equal intervals, and then construct a one dimensional feature vector and represented the color feature. Similarly, the work of texture feature extraction is obtained by using Gray level co-occurrence matrix (GLCM) or Color co-occurrence matrix (CCM) and then we combine color features and GLCM as well as CCM separately. Depending on the former, image retrieval based on multi-feature fusion is achieved by using normalized Euclidean distance classifier. Experiments reveal that the use of both color and texture based on CCM has better effective performance and advantage. Keywords-Image retrieval; GLCM; CCM; I. INTRODUCTION With the rapid development of multimedia database, the application of digital library and image search engine become more and more widely used. Traditional text-based data retrieval approach has difficulties on large database. The first problem is that manually label images requires large amount of labor. The second is that the subjectivity of human perception causes different image annotation for a particular image. The drawbacks of text-based approach led to researches in content-based image retrieval, CBIR, where instead of key words, the image is indexed by its visual content. It makes full use of image content features (color, texture, shape, etc.), which are analyzed and extracted automatically by computer to achieve the effective retrieval [3,4]. Using a single feature for image retrieval cannot be a good solution for the accuracy and efficiency. High- dimensional feature will reduce the query efficiency; low- dimensional feature will reduce query accuracy, so it may be a better way using multi features for image retrieval. Color and texture are the most important visual features. HSV color space is widely used in computer graphics, visualization in scientific computing and other fields [3]. In this space, hue is used to distinguish colors, saturation is the percentage of white light added to a pure color and value refers to the perceived light intensity. The advantage of HSV color space is that it is closer to human conceptual understanding of colors and has the ability to separate chromatic and achromatic components. Therefore, we select the HSV color space to extract the color features according to hue, saturation and value. Texture feature is a kind of visual characteristics that does not rely on color or intensity and reflects the intrinsic phenomenon of images. Everyone can recognize texture, but it is more difficult to define. Unlike color, texture occurs over a region rather than at a point. Texture has qualities such as periodicity and scale; it can be described in terms of direction, coarseness, contrast etc., Firstly, we discuss the color and texture features separately. On this basis, a new method using integrated features is provided, and experiment is done on some database images, a satisfactory result was achieved. II. FEATURE EXTRACTION A. Extraction of color Features Here we are using HSV color space, because it is widely used in computer graphics, scientific computing and other fields [5].In this model, Hue is used to distinguish colors, Saturation is the percentage of white light added to a pure color and Value refers to the perceived light intensity [6]. The advantage of HSV color space is that it is closer to human conceptual Understanding of colors and has the ability to separate Chromatic and achromatic components. B. Non-interval quantization Because of a large range of each component, it is essential to quantify HSV space component to reduce computation and improve efficiency. At the same time, because the human eye to distinguish colors is limited, do not need to calculate all segments. Unequal interval quantization according the human color perception has been applied on H, S, and V components. Here we divide the Hue into eight parts and the Saturation and Value into three parts separately in accordance with the human eye perception to distinguish. In accordance with the different colors and subjective color perception quantification, quantified hue (H), saturation (S) and intensity (V) are showed as equation (1). S = V=

Feature Extraction

  • Upload
    sumit

  • View
    188

  • Download
    6

Embed Size (px)

Citation preview

Page 1: Feature Extraction

978-1-4244-9182-7/10/$26.00 ©2010 IEEE 138

Extraction of Image Features for an Effective CBIR

System

P. Gangadhara Reddy1,

JNTU College of Engineering Anantapur,

Anantapur, A.P, India.

Abstract - In this paper, we propose a content-based image

retrieval system based on an efficient combination of both color

and texture features. According to HSV (Hue, Saturation, and

Value) color space, we quantified the color space into non-equal

intervals, and then construct a one dimensional feature vector

and represented the color feature. Similarly, the work of texture

feature extraction is obtained by using Gray level co-occurrence

matrix (GLCM) or Color co-occurrence matrix (CCM) and then

we combine color features and GLCM as well as CCM

separately. Depending on the former, image retrieval based on

multi-feature fusion is achieved by using normalized Euclidean

distance classifier. Experiments reveal that the use of both color

and texture based on CCM has better effective performance and advantage.

Keywords-Image retrieval; GLCM; CCM;

I. INTRODUCTION

With the rapid development of multimedia database, the application of digital library and image search engine become more and more widely used. Traditional text-based data retrieval approach has difficulties on large database. The first problem is that manually label images requires large amount of labor. The second is that the subjectivity of human perception causes different image annotation for a particular image. The drawbacks of text-based approach led to researches in content-based image retrieval, CBIR, where instead of key words, the image is indexed by its visual content. It makes full use of image content features (color, texture, shape, etc.), which are analyzed and extracted automatically by computer to achieve the effective retrieval [3,4]. Using a single feature for image retrieval cannot be a good solution for the accuracy and efficiency. High-dimensional feature will reduce the query efficiency; low-dimensional feature will reduce query accuracy, so it may be a better way using multi features for image retrieval. Color and texture are the most important visual features. HSV color space is widely used in computer graphics, visualization in scientific computing and other fields [3]. In this space, hue is used to distinguish colors, saturation is the percentage of white light added to a pure color and value refers to the perceived light intensity. The advantage of HSV color space is that it is closer to human conceptual understanding of colors and has the ability to separate chromatic and achromatic components. Therefore, we select the HSV color space to extract the color features according to hue, saturation and value. Texture feature is a kind of visual characteristics that does not rely on color or intensity and reflects the intrinsic phenomenon of

images. Everyone can recognize texture, but it is more difficult to define. Unlike color, texture occurs over a region rather than at a point. Texture has qualities such as periodicity and scale; it can be described in terms of direction, coarseness, contrast etc., Firstly, we discuss the color and texture features separately. On this basis, a new method using integrated features is provided, and experiment is done on some database images, a satisfactory result was achieved.

II. FEATURE EXTRACTION

A. Extraction of color Features Here we are using HSV color space, because it is

widely used in computer graphics, scientific computing and other fields [5].In this model, Hue is used to distinguish colors, Saturation is the percentage of white light added to a pure color and Value refers to the perceived light intensity [6]. The advantage of HSV color space is that it is closer to human conceptual Understanding of colors and has the ability to separate Chromatic and achromatic components.

B. Non-interval quantization Because of a large range of each component, it is

essential to quantify HSV space component to reduce computation and improve efficiency. At the same time, because the human eye to distinguish colors is limited, do not need to calculate all segments. Unequal interval quantization according the human color perception has been applied on H, S, and V components.

Here we divide the Hue into eight parts and the Saturation and Value into three parts separately in accordance with the human eye perception to distinguish. In accordance with the different colors and subjective color perception quantification, quantified hue (H), saturation (S) and intensity (V) are showed as equation (1).

S = V=

Page 2: Feature Extraction

139

(1)

In accordance with the quantization level above, the H, S,

V three-dimensional feature vector for different values of with different weight to form one dimensional feature vector G:

G = QSQVH + QVS + V (2)

Where QS is quantified series of S, QV is quantified

Series of V.

Here we set QS = QV = 3, then

G = 9H + 3S + V (3)

In this way, three-component vector of HSV form one-dimensional vector, which quantize the whole color space for the 72 kinds of main colors. So we can handle 72 bins of one-dimensional histogram. This quantification can be effective in reducing the images by effects of light intensity, but also reducing the computational time and complexity.

As the components of feature vector may have different physical meaning entirely, their rate of change may be very different. It will be much of the deviation of the calculation of the similarity if we do not normalize, so we must normalize the components to the same range. The process of normalization is to make the components of feature vector equal importance. By this way, we extract the color features using the one dimensional feature vector.

C. Texture Feature Extraction

1. Based on GLCM

Gray Level Co-occurrence Matrix (GLCM) is a matrix of frequencies at which two pixels, separated by a certain vector, occur in the image. The distribution in the matrix will be depending on the angular and distance relationship between pixels. By varying the separation vector it allows to capture different texture characteristics. After making the GLCM symmetrical, there is still one step to take before texture measures can be calculated. The measures require that each GLCM cell contain not a count, but rather a probability. It is defined by P (a,b|d,θ) which expresses the probability of the couple pixels at θ direction and d interval. When θ and d is determined, P (a, b d, θ) is showed by P(a, b). Once the normalized GLCM has been created, various features can be computed from it. The following are some of the features calculated from normalized co-occurrence matrix P (a, b).

P (a, b|d,θ) = (4)

GLCM expresses the texture feature based on the

correlation of the couple pixels gray-level at different positions. In this paper, four features are selected includes energy, contrast, entropy, inverse difference.

Energy :

E= 2 (5)

Energy is a gray-scale image texture measure of homogeneity changing, reflecting the distribution of image gray-scale uniformity of weight and texture.

Contrast:

I= 2P(i, j) (6)

Contrast is the main diagonal near the moment of inertia, which measure the value of the matrix is distributed and images of local changes in number, reflecting the image clarity and texture of shadow depth. Contrast is large means texture is deeper.

Entropy:

S=- (7)

Entropy measures image texture randomness, when the space co-occurrence matrix for all values is equal, it achieved the minimum value; on the other hand, if the value of co-occurrence matrix is very uneven, its value is greater. Therefore, the maximum entropy implied by the image gray distribution is random.

Inverse difference:

H= 2 (8)

It measures local changes in image texture number. Its value

in large is illustrated that image texture between the different

regions of the lack of change and partial very evenly. Where P

is a normalized co-occurrence matrix and P(i, j ) is the

probability of the source pixel having value i and the target

pixel having value j for any pixel pair fitting the pairing relationship used to populate the matrix.

1. Feature Extraction based on CCM

In GLCM, we extract the texture features from a gray

scale image .Whereas in Color Co-occurrence Matrix (CCM) we extract the texture features from one of the color plane co-occurrence matrix. In this, we are using R, G, H and I planes for texture extraction as follows.

Assuming color image is divided into N × N image sub-block, for anyone image sub-block T (a, b) (1 ≤ a ≤ N, 1 ≤ b ≤ N), using the main color image extraction algorithm to calculate the main color C (a, b). For any two 4-connected

Page 3: Feature Extraction

140

image sub-block T(a, b) and T(k ,l ) ( a − k =1 and b = l ; or b − l = 1 and a = k) , if its corresponds to the main color and in the HSV space to meet the following condition[8,9]:

(1) Cb and Ca belong to the same color of magnitude, that is, its HSV components ha = hb, sa = s b, va = v b;

(2) Ca And Cb don’t belong to the same color of magnitude, but satisfy sa ∗ 3 + va = s b ∗ 3 + v b and hb− hb = 1; or satisfy ha = hb, sa = s b and va, v b ∈{0,1} . We can say image sub-block T (a, b) and T (k, l) are color connected. According to the concept of color-connected regions, we can make each sub-block of the entire image into a unique color of connected set S = {Ra} (1 ≤ a ≤ M) in accordance with guidelines 4-connected.

The set S corresponds to the color-connected region. For

each color-connected region{Ra} (1 ≤ a ≤ M) , the color components R, G in RGB color space and H in HSV color space are respectively extracted the CCM at the direction δ = 1;θ = 0° ,45° ,90° ,135° . The same operation is done with I (intensity of the image). The statistic features Extracted from CCM are as follows:

(9)

∙m (10)

(11)

Where, if m (a, b) = 0, log [m (a, b)] = 0

(12)

Through this method, we can get a 16 dimensional texture feature for component R, G, H and I, each component correspond to four statistic values E, I, S and H.

F=[FR,FG,FH,FI]=[fRE,fRI,fRS,fRI,..,fII,fIS,fIH]

III.EXPERIMENT AND ANALYSIS

In this paper, experimental data set contains 1000 images from Corel database of images, divided into 10 categories, each category has 100 images. Experimental images covers a wealthy of content, including landscapes, animals, plants, flowers, transport (cars, buses) and so on. Selection of each type in the 80 images as training samples, 20 samples for testing. We study two kinds of feature extraction techniques: feature extraction techniques based on the HSV color space and texture feature extraction technology. A texture feature extraction techniques, we introduce two different extraction methods the gray co-occurrence matrix and CCM. Color and texture are just in part describing the characteristics of images. Image database varies some images dramatic ups and downs in

gray-level, showing a very strong texture characteristic, and some images from a number of smooth but the colors are different regional composition.

Only simple features of image information cannot get comprehensive description of image content. We consider the color and texture features combining not only be able to express more image information, but also to describe image from the different aspects for more detailed information in order to obtain better search results.

As shown in the figure 1, the similarity measure from two

types of characteristic features, including color features and texture features. Two types of characteristics of images represent different aspects of property. So during the Euclidean similarity measure when necessary the appropriate weights to combine them to consider this end.

Therefore, in carrying out Euclidean similarity measure we should consider necessary appropriate weights to combine them. We construct the Euclidean calculation model as follows:

D (A, B) = ω1D (FCA,FCB) +ω2D(FTA,FTB) (13)

Retrieval algorithm flow is as follows:

Figure 1. Algorithm scheme

Page 4: Feature Extraction

141

Normalized form is as follows:

D(A,B)= 1 + 2 (14)

\

Figure.2. Retrieval Result Based on HSV Color Space

Figure.3. Retrieval Result Based on Color+GLCM

Here ω1 is the weight of color features, ω2 is the weight of

texture features, FCA and FCB represents the 72-dimensional

color features for image A, B. For a method based on GCM,

FTA and FTB on behalf of 4-dimensional texture features

correspond to image A and B, another based on CCM, FTA and

FTB on behalf of 16-dimensional texture features.

In this paper, we combine color features and two

different kinds of texture features separately; the value of ω

through experiments shows that at the timeω1 =ω2 = 0.5, with better retrieval performance.

The following experiments are part of search results and test

statistics. Figure 2, 3 for the same retrieved images were

compared, figure 4 for the classic picture plane conducted a

search. Table 1 shows the image retrieval result. In this table,

recall and precision are computed. Obviously, in three feature

extraction mode, integrated feature has effective performance,

especially, color combined with CCM is better than others.

TABLE I. RETRIEVAL RESULT CONTRAST

Figure.4. Retrieval Result Based on Color+CCM

Retrieval

mode recall(%) precision(%)

Color 23.1 37.8

Color+GLCM 27.5 39.8

Color+CCM 28.3 42.1

Page 5: Feature Extraction

142

IV. CONCLUSION

This paper presents an approach based on HSV color Space and texture characteristics of the image retrieval.

Through the quantification of HSV color space, we combine

color features and gray- level co-occurrence matrix as well as

CCM separately, using normalized Euclidean distance

classifier. Through the image retrieval experiment, indicating

that the use of color features and texture characteristics of the

image retrieval method is superior to a single color image

retrieval method, and color characteristics combining color

texture features for the integrated characteristics of color

image retrieval has obvious advantages retrieval. Apart from

reflecting they CCM texture features, it also reflects the

composition of its color, and improve the performance of image retrieval has important research value.

REFERENCES

[1] A. Pentland, R. W. S. Picard, Sclaroff, “Photobook: Content-based manipulation of image databases,” In International Journal of

Computer Vision, vol.18, no.3, 1996, pp. 233–254.

[2] G. Sharma and H. J. Trussell. Digital color imaging. IEEE

[3] H. T. Shen, B. C. Ooi, K. L. Tan, “Giving meanings to www images,” Proceedings of ACM Multimedia, 2000, pp.39–48.

[4] B.S. Manjunath and Wei-Ying Ma. Image Databases. John Wiley and

Sons, New York, 2002.

[5] R.M. Haralick. Statistical and Structural Approaches to Texture.

Proceedings of the IEEE, 67:786–804, 1979

[6] M. Mirmehdi and M. Petrou, “Segmentation of colour textures,” IEEE Transaction on Pattern Analysis and Machine Intelligence, vol. 2, no.

2, pp. 142-159, 2000.