Source: iris.sel.eesc.usp.br/wvc/Anais_WVC2013/Oral/1/6.pdf


Iterative Technique for Content-Based Image Retrieval using Multiple SVM Ensembles

    Douglas Natan Meireles Cardoso, Dionei José Muller, Fellipe Alexandre,

    Luiz Antônio Pereira Neves, Pedro Machado Guillen Trevisani.

    Federal University of Paraná Rua Dr. Alcides Vieira Arcoverde, 1225, Curitiba, PR, Brazil.

    Gilson Antonio Giraldi National Laboratory for Scientific Computing - LNCC

    Av. Getulio Vargas, 333, Quitandinha, Petrópolis, RJ, Brazil.

Abstract—This paper applies Support Vector Machine (SVM) ensembles, based on the "one-against-all" SVM multi-class approach, for Content-Based Image Retrieval (CBIR). Given a database previously divided into N classes, a first ensemble with N SVM machines is trained. Given a query image, this SVM ensemble is used to find the candidate classes for the query classification. Next, a new ensemble is constructed with the "one-against-all" strategy in order to narrow the target search. The process stops when only one class is returned, which completes the query classification stage. This class is used in the final step for image similarity computation and retrieval. Before constructing the SVM ensembles, we pre-process the images with the Discrete Cosine Transform (DCT) for feature extraction. We report the accuracy of our approach on the Corel and Ground Truth image databases and show that its average accuracy outperforms a related work.

    I. INTRODUCTION

Nowadays, we observe a huge amount of images stored in electronic format, particularly in biological and medical applications. Therefore, efficient content-based image retrieval (CBIR) techniques become a fundamental requirement for searching and retrieving images from a large digital image database [1].

CBIR is a class of techniques which uses visual contents (features) to search images from a database, based on a query image provided by the user. In this process, visual contents such as shape descriptors and color features are extracted from each image of the database. The same is done for the user's request in the form of a query image. Then, some engine is used for feature comparison in order to obtain the target images; that is, the images of the database that are most similar to the query one. The whole pipeline for CBIR can be roughly divided into three modules [2]: (1) Feature extraction module; (2) Query module; (3) Retrieval module.

The first module includes techniques to convert an input image into a numerical array, normally called a feature vector. The idea of this step is to obtain a more compact representation of the image. Therefore, the feature space generally has a lower dimension than the original image space. Feature spaces are usually composed of shape, color, texture, histogram, edge, and image-transform features [3], [4]. The latter encompasses linear operations, like the Fourier, Sine and Cosine transforms, as well as Wavelet approaches [5], [6], [7].
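As a toy illustration of such a feature extractor (not the one used in this paper), a per-channel color histogram maps an RGB image to a short normalized vector; the bin count and the function name here are our own illustrative choices:

```python
import numpy as np

def color_histogram_features(image, bins=8):
    """Map an RGB image (H x W x 3, uint8) to a compact feature vector:
    a normalized per-channel intensity histogram with `bins` bins each."""
    feats = []
    for c in range(3):  # one histogram per R, G, B channel
        hist, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        feats.append(hist)
    v = np.concatenate(feats).astype(float)
    return v / v.sum()  # normalize so vectors are comparable across image sizes

# A 4x4 synthetic image yields a 3*8 = 24-dimensional feature vector.
img = np.random.default_rng(0).integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
v = color_histogram_features(img)
```

Note how the histogram discards spatial layout, which is exactly why such features are compact but also why transform-based features can be more discriminative.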

The query module takes the query image, performs its feature extraction, and can provide resources to make modifications to the query image or even to integrate image keywords into the query [8]. Finally, the retrieval module computes some measure of similarity between the query and the database images. Then, the obtained quantities are sorted and the images with the highest similarities are returned as the targets.

One important point in this dataflow is the incorporation of prior information through some human interaction with the database. For example, in [9] the database is segmented into manually defined classes. In this case, the query module performs the query image classification; that is, it automatically labels the query image according to the class it belongs to. The retrieval module must then search for targets only among the images that belong to the same class as the query.

In the case of the CBIR approach proposed in [10], two SVM ensembles based on the "one-against-all" SVM multi-class approach [11] are considered: one for dimensionality reduction and another one for classification, as parts of the feature and query modules, respectively. The feature module engine takes an input RGB image, resizes it to 128 × 128 resolution and performs a suitable color space transformation. Then, it applies Daubechies' wavelet transform and constructs a feature vector from the obtained low-pass image components. Next, the first SVM ensemble computes a reduced feature vector, with dimension equal to the number of pre-defined classes, which will represent the input image in the further operations. Once a query image is presented, the second SVM ensemble performs its classification. Finally, the Euclidean distances from the query image to its class images are used as a similarity measure for image retrieval.

In this paper we also consider a multiple SVM ensemble for CBIR. In the feature extraction step we obtain a compact representation of the image by using the Discrete Cosine Transform (DCT) instead of the Daubechies wavelet applied in [10]. The DCT implementation is simpler than the Daubechies wavelet, and we have obtained suitable results with it. Then, as in [10], we construct N SVM models, one for each class of the database.
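A minimal sketch of DCT-based feature extraction in this spirit, assuming an orthonormal DCT-II built as a matrix product and a simple square low-frequency ("zonal") cut-off; the helper names and the 8 × 8 block size are illustrative, not the paper's exact expression:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so the 2-D DCT of a square image A is C @ A @ C.T."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.cos(np.pi * (2 * m + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    c[0, :] = np.sqrt(1.0 / n)  # DC row gets the smaller normalization factor
    return c

def zonal_features(gray, keep=8):
    """2-D DCT of a square grayscale image followed by a zonal mask that
    keeps only the `keep` x `keep` low-frequency block, flattened to a vector."""
    n = gray.shape[0]
    c = dct_matrix(n)
    coeffs = c @ gray @ c.T
    return coeffs[:keep, :keep].ravel()

# A 16x16 image is reduced to an 8*8 = 64-dimensional feature vector.
img = np.random.default_rng(1).random((16, 16))
v = zonal_features(img, keep=8)
```

Because the transform is orthonormal, the retained low-frequency block concentrates most of the image energy, which is what makes the zonal mask a reasonable compression step.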

Given the query image, this SVM ensemble is used to find the candidate classes for the query classification. Specifically, each SVM i returns a real number that is interpreted as the probability that the query belongs to the corresponding class Ci. We then select only the classes whose probability is larger than the mean one. Next, a new SVM ensemble is constructed with the selected classes, using the same strategy as before, and applied to narrow the target search. The process stops when only one class is returned, which completes the query classification stage. This class is used in the final step for image similarity computation and retrieval. Before constructing the SVM ensembles, we pre-process the images with the Discrete Cosine Transform (DCT) for feature extraction. The method is "iterative" in the sense that each pass of the main loop takes the result of the previous one in order to refine the classification of the query. We report the accuracy of our approach on the Corel and Ground Truth image databases and show that its average accuracy outperforms the reference [10].
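In numpy terms, the above-the-mean selection rule reduces to a one-line comparison; the score values below are made up for illustration:

```python
import numpy as np

# Hypothetical per-class SVM outputs y_i for one query image: one score per
# class, interpreted as the probability that the query belongs to class C_i.
scores = np.array([0.10, 0.55, 0.80, 0.20, 0.65])

# Keep only the classes whose score exceeds the mean score (the pruning rule).
candidates = np.flatnonzero(scores > scores.mean())
# The mean here is 0.46, so classes 1, 2 and 4 survive to the next iteration.
```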

The paper is organized as follows. In Section II we review the basic elements of image processing and the SVM model. Section III discusses the details of our proposal for CBIR. The experimental results are presented in Section IV. Finally, we offer conclusions and perspectives for this work in Section V.

    II. BACKGROUND

In this presentation a true-color RGB image I is represented by a (generalized) matrix I ∈ ℜ^(m×n×3), where m × n is the spatial resolution and the three slices hold the R, G and B channels.

Specifically, for training each SVM model i, we take all k images from class i and label them as 1. Then, using random sampling, we choose (2k)/(N − 1) images from the classes other than i and label them as −1 [10]. The obtained set of feature vectors and corresponding labels lm:

S = {(l1, x1) , (l2, x2) , · · ·} , where each xm is a feature vector and lm ∈ {−1, 1}, is used to train SVM model i. Given the query image z, each SVM model i returns a value yi, and we select the classes Ci such that yi > ȳ, where ȳ is the mean of the yi values. Let us suppose that there are L classes that satisfy this condition. Then, we apply Algorithm (1), with N ← L and the image classes set updated to Φ = {C1, C2, · · ·, CL}, to construct another L support vector machines. Next, we feed each new SVM model with the query image z and check for the classes such that yi > ȳ, in order to obtain another subset of candidate classes. We repeat this process until only one class C satisfies the condition yi > ȳ. Algorithm 2 summarizes the whole process.
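The one-against-all training-set construction described above can be sketched as follows, under one plausible reading in which (2k)/(N − 1) images are drawn from each of the other classes; the function and variable names are our own:

```python
import random

def build_training_set(classes, i, seed=0):
    """Assemble the one-against-all training set for SVM model i.
    `classes` maps class index -> list of feature vectors.
    Positives: all k images of class i, labeled +1.
    Negatives: 2k/(N-1) images randomly sampled from each other class, labeled -1."""
    rng = random.Random(seed)
    k, n = len(classes[i]), len(classes)
    per_class = max(1, (2 * k) // (n - 1))
    samples = [(+1, x) for x in classes[i]]
    for j, imgs in classes.items():
        if j != i:
            samples.extend((-1, x) for x in rng.sample(imgs, min(per_class, len(imgs))))
    return samples

# Tiny example: 3 classes of 4 images each -> 4 positives plus
# 2*4 // 2 = 4 negatives from each of the 2 remaining classes.
classes = {c: [[c, m] for m in range(4)] for c in range(3)}
S = build_training_set(classes, i=0)
```

Under this reading, each model sees roughly twice as many negatives as positives, which is a common way to keep the one-against-all problems from being overwhelmed by the majority class.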

Then, as in [10], the Euclidean distances between the query image and all the images belonging to its class are computed and sorted. The images with the lowest Euclidean distances are considered similar images and are returned by the system. This completes the image retrieval step of our method.
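The retrieval step thus reduces to a nearest-neighbor ranking by Euclidean distance within the predicted class; a minimal numpy sketch, with made-up 2-D feature vectors:

```python
import numpy as np

def retrieve(query_vec, class_vecs, top=3):
    """Rank the images of the query's class by Euclidean distance to the query
    feature vector; the closest ones are returned as the retrieved targets."""
    d = np.linalg.norm(class_vecs - query_vec, axis=1)
    order = np.argsort(d)  # ascending: smallest distance = most similar
    return order[:top], d[order[:top]]

q = np.array([0.0, 0.0])
db = np.array([[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]])
idx, dist = retrieve(q, db, top=2)
# Distances are 5, 1 and 2, so images 1 and 2 are returned first.
```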

Algorithm 2 Query image classification
Input: Image classes set: Φ = {C1, C2, · · ·, CN}; SVM parameters; query image: z.
Set L ← N.
while L > 1 do
    Apply Algorithm (1) to generate the SVM models SVM1, SVM2, · · ·, SVML.
    Compute y1, y2, · · ·, yL using expression (6).
    Calculate ȳ using equation (7).
    Select the classes C1, C2, · · ·, CK such that yCj > ȳ.
    Update the image classes set: Φ ← {C1, C2, · · ·, CK}.
    Set L ← K.
end while
Output: Class C of the query image.
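Algorithm 2 can be sketched as the following loop, with a mock scoring function standing in for training and evaluating the ensemble (expressions (6) and (7) are replaced here by fixed illustrative scores and a plain mean; the tie-breaking guard is our own addition):

```python
import numpy as np

def classify_query(class_ids, score_fn):
    """Iteratively prune the candidate classes for a query, as in Algorithm 2.
    `score_fn(ids)` stands in for training the one-against-all ensemble on the
    surviving classes and evaluating it on the query: one score per class id."""
    ids = list(class_ids)
    while len(ids) > 1:
        y = np.asarray(score_fn(ids))  # y_i for each surviving class
        keep = [c for c, yi in zip(ids, y) if yi > y.mean()]
        if not keep:                   # all scores tied: pick the max directly
            keep = [ids[int(np.argmax(y))]]
        ids = keep
    return ids[0]

# Mock scorer: class 7 always receives the highest score for this query.
true_scores = {3: 0.2, 5: 0.6, 7: 0.9, 9: 0.1}
winner = classify_query([3, 5, 7, 9], lambda ids: [true_scores[c] for c in ids])
```

Each pass discards the below-average classes, so the candidate set shrinks strictly until a single class remains, which is the termination argument behind the while loop.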

    IV. EXPERIMENTAL RESULTS

In this section we demonstrate the potential of our proposal by using the Corel [16] and Ground Truth [17] RGB image databases. The former is composed of 1000 images in 10 different categories, while the latter is composed of 1109 images divided into 22 categories. We chose these databases in order to allow a straightforward comparison between our results and the ones presented in [10]. Figures 1, 2 and 3 show samples of the [16] database, and Figures 4 and 5 show samples of the [17] database.

Since our methodology is supervised, some human interaction is needed to pre-classify the images and segment the database into classes. This step has the advantage of incorporating prior information into the system, since humans are experts in visual pattern recognition. Accordingly, our database classification follows [10]: we selected all images from [16] and divided them into 10 classes, and we selected 228 images from [17] and divided them into 5 classes, named according to Tables I and II.

The feature extraction step takes an input RGB image and resizes it to 128 × 128 resolution in order to normalize the input data. Then, we compute the DCT for each database image and perform a zonal mask operation, given by expression (1). So, we need to s
