
www.applebuz.com

REPORT ON IMAGE MINING

CONTENTS

Introduction
Process Of Image Mining
Techniques in Image Mining
Issues in image mining
Feature Extraction for Image Mining
Google Earth as image database
Image Mining applications
Conclusion

Abstract

Due to the digitization of data and advances in technology, it has become extremely easy to obtain and store large quantities of data, particularly multimedia data. Fields ranging from commercial to military need to analyze these data quickly and efficiently. At present, tools for mining images are few and require human intervention. Feature selection and extraction is the pre-processing step of image mining, and it is a critical step in the whole process. Our approach to mining images, that is, extracting patterns and deriving knowledge from large collections of images, deals mainly with the identification and extraction of unique features for a particular domain. Though various features are available, the aim is to identify the best features and thereby extract relevant information from the images. We have tried various methods for extraction; the features extracted and the techniques used are evaluated for their contribution to solving the problem. Experimental results show that the features used are sufficient to identify the patterns in the images. The extracted features were evaluated for goodness and tested on test images. An interactive system was developed which allows the user to define new features and to resolve uncertain regions. Our goal is to achieve a middle-level understanding of image semantics, to bridge the semantic gap existing in the field of image mining and retrieval with the help of a popular search engine.

Introduction

Image mining [1] deals with the extraction of implicit knowledge, image data relationships or other patterns not explicitly stored in images, and uses ideas from computer vision, image processing, image retrieval, data mining, machine learning, databases and AI. The fundamental challenge in image mining is to determine how the low-level pixel representation contained in an image or an image sequence can be effectively and efficiently processed to identify high-level spatial objects and relationships. A typical image mining process involves preprocessing, transformations and feature extraction, mining (to discover significant patterns in the extracted features), evaluation and interpretation, and obtaining the final knowledge. Various techniques from existing domains are also applied to image mining, including object recognition, learning, clustering and classification, to name a few. Association rule mining is a well-known data mining technique that aims to find interesting patterns in very large databases. Some preliminary work has been done to apply association rule mining to sets of images to find interesting patterns.

Vast amounts of image data, such as medical images and satellite images, are generated every day. Analysis of these images can reveal useful information to users. Image mining concepts are used to analyze such collections of images; image mining is the extension of data mining to the image domain.

Image mining deals with the extraction of implicit knowledge, not explicitly stored in the images, from large collections of images. However, there is general agreement that sufficient tools are not available for the analysis of images. One issue is the effective identification of features in the images, and another is extracting them. A further difficulty is knowing the image domain and obtaining a priori knowledge of what information is required from the image. This is one of the reasons the image mining process cannot be completely automated.

Extracting association rules from images remains a complex, tedious process. It typically requires writing several hundred lines of code to read images, extract features and apply a mining algorithm such as APRIORI [2]. iARM is a scripting language that makes it easier to extract association rules from images. It allows defining a list of source image files and customizing association rule parameters. These parameters include number of terms, filters on text feature and configuration of signal features (i.e. color), support and confidence. Using iARM, association rules can be extracted by writing simple, easy to understand code that can be written and maintained by end users (i.e. knowledge workers) with no programming knowledge.
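To make the terms support and confidence concrete, here is a minimal Apriori-style sketch in Python. It is not iARM and not the code of [2]; the feature labels (such as "color=blue") and the four example images are hypothetical, and it assumes each image has already been reduced to a set of discrete feature labels.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every distinct item on its own.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates whose support reaches the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-itemsets, then prune any
        # candidate that has an infrequent k-item subset.
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # confidence = support(whole itemset) / support(antecedent)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return rules


# Hypothetical feature labels for four images (illustration only).
images = [
    {"color=blue", "texture=smooth", "label=water"},
    {"color=blue", "texture=smooth", "label=water"},
    {"color=green", "texture=rough", "label=land"},
    {"color=blue", "texture=rough", "label=land"},
]

frequent = apriori(images, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.9)
for antecedent, consequent, support, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), support, confidence)
```

With these thresholds the rule "texture=smooth -> label=water" is kept, while "color=blue -> texture=smooth" is dropped because only two of the three blue images are smooth.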

Image mining has developed alongside tremendous growth in large and detailed image databases. The most important areas belonging to image mining are: image knowledge extraction, content-based image retrieval, video retrieval, video sequence analysis, change detection, model learning, and object recognition. Knowledge extraction from an image collection can take two types of input data: the original images and symbolic descriptions of the images.

Process Of Image Mining

Image Processing
- principal component analysis
- texture feature extraction
- classification & clustering

Databases
- object-oriented database
- image database

Graphical User Interface
- query
- browsing
- visualization

Techniques in Image Mining

1. New Mathematical Techniques in Image Mining:
- Algebraic Approaches
- Discrete Mathematics Techniques
- Structural and Syntactic Techniques
- Multiple Classifiers
- Other Mathematical Techniques

2. Image Models and Image Features

3. Automation of Image Mining:
- Image Mining, Computer Vision and Knowledge-Based Systems
- Image Databases. Image Knowledge Bases
- Image Mining Technologies
- Linguistic Tools
- Image Science Ontologies

Issues in image mining

The various issues in image mining are [18]:

Image Mining for Modeling of Forest Fires From Meteosat Images

1. Stochastic methods for image mining and data quality (DAQUAL)

2. In agricultural studies, topics like precision agriculture and crop modeling will be addressed

3. In environmental studies, the topic of spatial/temporal scales is still an ongoing issue for research.

4. Health issues concern the quantitative modeling of epidemics of various kinds.

5. Hydrology focuses on model-based Geostatistics for rainfall prediction.

The increasing number of image archives has made image mining an important task because of its potential to discover useful image patterns and relationships from a large set of images. A framework for extracting knowledge from a sequence of images has been proposed by Hsu, Lee and Goh. The framework is composed of two modules: image analysis and knowledge processing [19].

Feature Extraction for Image Mining

In general, images have the following features: color, texture, shape, edge, shadows, temporal details, etc. The features that were most promising were color, texture and edge. The reasons are as follows:

1. Color: Egeria occurs in two colors, pink (rusty rose) and black. ... elements can be compared to these spectra.

2. Texture: Texture is defined as a neighborhood feature [RHC99] as a region or a block

3. Edge: An edge is simply a large change in frequency. This is particularly important here, as the distinction between the dark Egeria and the lighter water bodies or land can be considered an edge.

Color Feature Extraction

Some of the techniques tried were average color in gray scale, average color in RGB format [GW92] and average color in YCBCR (Y is the luminance and CB, CR are the chrominance components) [GW92]. The average color of a block is

$$\text{Average color} = \frac{\text{sum of the intensities of all pixels in the current block}}{\text{total pixels in the block}}$$

We evaluated the various methods using Precision and Recall (introduced in the next section, which compares the Precision and Recall values of the methods), and found that YCBCR performs better than the other two. Hence we used it as the basis of color extraction, as shown in the image below (1000_2m_lvi2.tif).

Confusion Matrix

The confusion matrix is a table structure which describes all possible outcomes of a prediction result. The possible outcomes of a two-class prediction are True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN).

True positives and true negatives are the correctly classified instances. A false positive is an outcome incorrectly classified as positive (yes) when it is in fact negative (no). False positives are the false alarms in the classification process. A false negative is an outcome incorrectly predicted as negative when it is in fact positive. False negatives are the instances that the classifier failed to detect. Table 4 shows a two-class confusion matrix.

                     Predicted Class
                     Yes     No
Actual Class   Yes   TP      FN
               No    FP      TN

Table 4. Confusion Matrix

Evaluation criteria

Precision: It is defined as the fraction of the retrieved information that is relevant.

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall: It is defined as the fraction of all relevant information that is retrieved.

$$\text{Recall} = \frac{TP}{TP + FN}$$

Accuracy: It is the overall success rate of the classifier. We will not be using this criterion, as it does not give a correct measure of the goodness of the classifier.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
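The three formulas can be checked with a short Python sketch. The label lists are made-up illustrative data, not results from this report, and the helper names are hypothetical. The second example shows one common reason accuracy can mislead: when positives are rare, a classifier that never predicts "Yes" still scores a high accuracy.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for two-class labels given as booleans (True = Yes)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1
        elif not a and not p:
            tn += 1
        elif p:
            fp += 1  # predicted Yes, actually No (false alarm)
        else:
            fn += 1  # predicted No, actually Yes (missed instance)
    return tp, tn, fp, fn


def precision(tp, fp):
    # Returning 0.0 when nothing was predicted positive is a convention
    # chosen for this sketch, not something the report specifies.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up labels for eight instances.
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn)
print(precision(tp, fp), recall(tp, fn), accuracy(tp, tn, fp, fn))

# Skewed case: 1 positive in 100, classifier always answers No.
skewed_actual = [True] + [False] * 99
skewed_predicted = [False] * 100
s_tp, s_tn, s_fp, s_fn = confusion_counts(skewed_actual, skewed_predicted)
print(accuracy(s_tp, s_tn, s_fp, s_fn), recall(s_tp, s_fn))
```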

Generality: It can be defined as the ability of the system to learn well from the training image(s) and then show acceptable prediction accuracy on the other testing images in general.

Scalability: This measure tells whether a classification scheme can be modified so that it can be applied to larger images or to a larger number of images without any noticeable change in its efficiency or performance.

Reduction in Expert interaction: The aim of this project is to reduce the load on the experts of manually classifying the images themselves. We say that a classification scheme is good if it classifies most of the instances with certainty on its own and is uncertain only for a small fraction of the instances, which are the only ones classified manually by the experts.

We will be using precision, recall, generality and reduction in expert interaction as our evaluation criteria. The schemes chosen here have been extensively modified to achieve high generality without losing any information in the training image itself. As the first set of images did not have a coverage set, we could not compare the results with the actual classes to get the TP, TN, FP and FN values, and so we could not calculate the precision and recall values for the testing images. For the first set of images we therefore have only generality and reduction in expert interaction as our evaluation measures (Table 5). For the second set we can calculate the precision and recall values, so these values are also appended to the evaluation results (Table 6).

RETRIEVAL BASED ON HIGH LEVEL SEMANTIC FEATURES

- by high level color properties
- by high level texture properties
- by high level shape properties

A set of high level semantic features, defined during the image mining process, is used. They combine high level color, texture and shape properties with high level semantic features defined by the expert during image mining. In our example, for the query "Find sea or sky images with a region with warm color", the images in Figure 1 are retrieved. The result of the query "Find house images" is given in Figure 2. In the first query we use color descriptors to find images whose color regions satisfy warm contrast, together with the textures of sea and sky. In the second query we use shape descriptors for the house forms.

Figure 1. Sea, sky images with a region with warm color

Figure 2. House images
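Color queries of this kind rest on low-level color features. The block-average color in YCBCR described under Color Feature Extraction can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the report does not say which RGB to YCBCR conversion was used, so the full-range BT.601 (JPEG/JFIF) coefficients are assumed, and the function names, the nested-list image layout and the block size are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (assumed: full-range BT.601, as in JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def block_average_ycbcr(image, block_size):
    """Average (Y, Cb, Cr) per block.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Returns {(block_row, block_col): (avg_y, avg_cb, avg_cr)}.
    """
    height, width = len(image), len(image[0])
    features = {}
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            # Collect the converted pixels of this block (edge blocks may be smaller).
            pixels = [
                rgb_to_ycbcr(*image[row][col])
                for row in range(top, min(top + block_size, height))
                for col in range(left, min(left + block_size, width))
            ]
            count = len(pixels)
            # Sum of each channel over the block divided by the pixel count.
            features[(top // block_size, left // block_size)] = tuple(
                sum(channel) / count for channel in zip(*pixels)
            )
    return features


# Tiny 2 x 4 test image: a white 2 x 2 block next to a black 2 x 2 block.
white, black = (255, 255, 255), (0, 0, 0)
image = [
    [white, white, black, black],
    [white, white, black, black],
]
features = block_average_ycbcr(image, block_size=2)
for block, average in sorted(features.items()):
    print(block, average)
```

Each block yields one three-component feature vector, which is the kind of per-block value the average color formula describes.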

Google Earth as image database

other software tools, Image Cutter and Google Map Creator

Image Shape has 3 options:

1) Rectangle - for all images apart from panoramas. The Set Field of View (FOV) button provides options relating to the camera settings, but for the most part these can be ignored, as the software will automatically set the correct sizes and aspect ratio.

2) Cylinder - for partial panoramic images or images that do not cover a full 360 x 180 degree field of view. The FOV settings are important for cylindrical images; these should be noted down from your stitching software and entered accordingly.

3) Panoramic Sphere - for use with 360 x 180 degree panorama images.

Image Mining applications

medical/biological images (training; research)

biometrics/security

satellite image analysis

photo collections

museum images

logos

Conclusion

Very soon after the introduction of the notion of data mining in the nineties, it became clear that knowledge discovery, a term often used for data mining techniques, was not just applicable to the digging up of more or less hidden data patterns in traditional databases. Every day, the average person with a computer faces a growing flow of multimedia information, particularly via the internet. But this ocean of information would be useless without the ability to manipulate, classify, archive and access it quickly and selectively. Multimedia is dominated by images with respect to bandwidth and complexity. Media mining refers to a technique whereby a user can retrieve, organize, and manage media data. Media mining has a huge number of emerging applications with different usage models.

Image mining is more than just an extension of data mining to the image domain. It is an interdisciplinary endeavor that draws upon expertise in computer vision, image processing, image retrieval, data mining, machine learning, databases, and artificial intelligence. Image mining has developed alongside tremendous growth in large and detailed image databases. The increasing number of image archives has made image mining an important task because of its potential to discover useful image patterns and relationships from a large set of images. The authors are exploring image mining in depth in order to propose algorithms for improving the efficiency and effectiveness of image mining.

[Figure: image mining process diagram. Blocks: Image Repository; Image Processing Module (PCA, Texture Feature Extraction, Clustering, Land Cover Classification); Databases Module (Image Database, Object-Oriented Database); Graphical User Interface (Browsing / Query).]