ImAge: an extensible agent-based architecture for image retrieval


  • Int J Digit Libr (2000) 2: 236–250

    International Journal on Digital Libraries, Springer-Verlag 2000

    ImAge: an extensible agent-based architecture for image retrieval

    Hiranmay Ghosh1, Santanu Chaudhury2, Chetan Arora2, Paramjeet Nirankari2

    1 Centre for Development of Telematics, 9th floor, Akbar Bhawan, New Delhi 110021, India; E-mail: ghosh@cdotd.ernet.in
    2 Department of Electrical Engineering, Indian Institute of Technology, New Delhi 110016, India; E-mail:

    Abstract. We present an open and extensible architecture, ImAge, for content-based image retrieval in a distributed environment. The architecture proposes the use of system components with standard public interfaces for implementing retrieval functionality. The standardization of the components and their encapsulation in autonomous software agents result in functional stratification and easy extensibility. Collaboration of the independent retrieval resources in ImAge results in enhanced system capability. Reuse of existing retrieval resources is achieved by encapsulating them in agents with standard interfaces. The addition of independent agents with domain knowledge adds the capability of processing conceptual queries, while reusing the existing system components for feature-based retrieval. A communication protocol allows the declaration of the capabilities of the system components and negotiations for optimal resource selection for solving a retrieval problem. The use of mobile agents alleviates network bottlenecks. This paper describes a prototype implementation that validates the architecture.

    Key words: Content-based image retrieval – Digital library – Multi-agent system – Distributed architecture – Conceptual query interpretation

    1 Introduction

    The availability of digital images for different application domains calls for effective retrieval tools. An image, which is a two-dimensional array of image pixels, encodes an enormous amount of information. Research in content-based image retrieval investigates new ways to interpret image data (pattern recognition algorithms) and to establish similarities between the images using such an interpretation. The effectiveness of different retrieval algorithms depends on the application. The same set of images needs to be interpreted differently by different retrieval methods to meet diverse user requirements. In a networked world, the image collections, the retrieval tools and the users are expected to be distributed across multiple locations. This paper addresses the problem of designing an image retrieval system that can fulfil the needs of a distributed environment and disparate constraints.

    Correspondence to: Santanu Chaudhury

    The existing image repositories adopt different retrieval paradigms and implement a few retrieval methods. Some of them use aggregate image features such as the color histogram and texture [5, 10], some use segmentation information, i.e., image regions with relatively homogeneous properties [4], while some others associate semantic meaning to the image segments using some domain knowledge [6, 15, 18]. An image repository adopts some data model for representation of the image data. The images are indexed with one or more entities in the data model and are retrieved using a combination of these indices in the context of a query. The nature of the queries that can be satisfied by a repository is limited by the data model implemented in the collection. For example, a retrieval system like WebSeek [5] that supports some semantic categorization of images and indexing based on aggregate image features in its data model cannot support a query requiring segmentation. The different image repositories on the internet exhibit heterogeneity with respect to the data models and hence, with respect to their access mechanisms.

    An integrated framework for a multimedia digital library reusing the existing heterogeneous network resources has been attempted in UMDL [2] and the Stanford University Digital Library [19] projects. These systems use some mediator software to coordinate retrieval from the multiple repositories, which may have different organizations and different built-in retrieval methods. The loose coupling between the producers and consumers of information and the mechanism of dynamic resource discovery make the systems amenable to easy extension. The systems direct a query transparently to a set of capable repositories. However, the architecture does not enhance the capabilities of the individual repositories. As a result, retrieval is restricted to the repositories having a built-in capability to process a query. Moreover, there can be heterogeneity in the local interpretations resulting in an inconsistent set of documents being retrieved.

    There are some examples of extensible image databases, where the data model of a repository can be enhanced by the action of external agencies. In MOODS [12], the system stores a set of low-level media features, while a user can provide the rules for their interpretation using a script language. Thus, the data model of the system can be extended by adding new user scripts. In Mirror [7], some demons visit the database to extract new media features to augment the capability of the system. In either case, the extension becomes a permanent feature of the system, is done in anticipation, and cannot be dynamically tailored to the needs of a specific query.

    In this paper, we present ImAge, an open and extensible architecture for a digital image library, where the retrieval functionality is implemented through the interaction of standard reusable components. These components represent various entities required for a retrieval system, for example, the query interface of a repository, the data entities that populate a repository, and the pattern recognition routines that transform a data object into another. The components may have different internal structures but are encapsulated with standard public interface definitions. The different repositories may support different sets of the data and query objects, thereby having their own individual character.

    The approach followed in ImAge has several advantages. The definition of standard component interfaces allows separation of the different functional units of the retrieval system, such as query interpretation, classification and pattern recognition methods. These independent modules can be encapsulated into autonomous software agents. The agents collaborate with each other during retrieval using the methods defined in their public interfaces. New agents conforming to the interface specifications can be dynamically incorporated in the system, resulting in its extensibility. The agents can declare their capability set, which is used for negotiation in the context of a retrieval. The architecture includes a mechanism for benchmarking these agents against some common benchmark data to ascertain their relative merits. Components encapsulating semantic knowledge can also be added to the system, resulting in the capability to process conceptual queries. The standardization of the interfaces makes it possible for independent research teams to contribute image analysis routines and domain knowledge to the system independently of the underlying repository structures. These routines can be used with any image repository, resulting in effective resource sharing. They can also build upon one another using public interfaces. It is also possible to include the existing image retrieval resources (e.g., WebSeek [5], QBIC [10], BlobWorld [4], etc.) in the architecture, by encapsulating them into autonomous agents conforming to the standard interface definitions.

    The ImAge architecture, which is motivated by UMDL, proposes a new communication framework which allows the autonomous agents encapsulating the different system components to collaborate during a retrieval. The different retrieval resources can be contributed by independent research groups and may exist anywhere in the network. We encapsulate the pattern recognition routines as mobile agents, so that they can travel across a wide area network to the repository sites and analyze the images at their source.

    We have implemented a prototype image retrieval system, ImAge, based on this architecture. The basic system supports query by example using the extracted image features. Though the implementation is generic, we have experimented with the system on a collection of tourism-related images. An extended implementation includes conceptual knowledge in the domain of tourism and supports conceptual query. The system can be easily extended to other applications by incorporating appropriate domain knowledge.

    The aim of our research is the development of an architecture that will support content-based image retrieval from a multitude of distributed repositories which support a standard retrieval protocol. We explore the possibility of encapsulating the retrieval resources to realize standard interfaces, so that they can collaborate during retrieval. We do not consider the development of specific retrieval algorithms, such as data models and pattern recognition algorithms, as part of this research. There is currently a strong research interest in multimedia retrieval methods and adequate availability of the retrieval resources has been assumed.

    The rest of this paper is organized as follows: Section 2 presents an overview of the multi-agent architecture and describes the various roles played by the agents in the system. Section 3 describes the protocol for capability negotiation and selection of agent teams. Section 4 describes the communication architecture for the agents constituting the system. Sections 5 and 6 describe some global policies for formulating search strategy. Section 7 describes vertical extension of the basic feature-based retrieval system for conceptual query processing. Finally, we conclude (Sect. 9) with a summary of our contribution and scope of future work.


    2 Architecture

    ImAge has been modeled as an open society of autonomous and communicating software agents. Each agent in the society implements an independent unit of retrieval functionality. The collaboration of these agents results in solving a retrieval problem. New agents can dynamically join the society and contribute to its growth. In order that an autonomous agent can contribute in an open society, we define some definite roles in the system. An agent participates in the system in one of these predefined roles. We have followed an object-oriented approach. An agent class has been associated with each of the roles in the system. Every agent is viewed as an object belonging to an agent class. Each agent class is characterized by a public interface definition, which defines its functional behavior. Each agent in an agent class implements the public interface in its own way. Each agent class has a generic implementation that implements its public interface. Every agent belonging to a class extends the generic agent and implements its technology-specific methods. For example, the generic Search Agent (SA) defines an abstract method similarity() that returns the similarity value between two images. It is extended by every agent of that class with a feature-specific algorithm. ImAge puts no restrictions on the internal design or knowledge representation techniques of an agent.
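    The relationship between a generic agent and its feature-specific extensions can be illustrated with a minimal Python sketch. Only the method name similarity() comes from the text; the class names, the rank() helper, the representation of an image as a normalized histogram, and the histogram-intersection measure are assumptions made for illustration and are not the prototype's actual code.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence

Feature = Sequence[float]  # assumption: an image is reduced to a normalized histogram


class SearchAgent(ABC):
    """Generic Search Agent: fixes the public interface of the agent class."""

    @abstractmethod
    def similarity(self, image_a: Feature, image_b: Feature) -> float:
        """Return the similarity value between two images."""

    def rank(self, query: Feature, candidates: List[Feature]) -> List[Feature]:
        """Hypothetical generic behavior shared by all agents of the class:
        order the candidates by decreasing similarity to the query."""
        return sorted(candidates, key=lambda c: self.similarity(query, c), reverse=True)


class HistogramSearchAgent(SearchAgent):
    """Hypothetical feature-specific agent using histogram intersection."""

    def similarity(self, image_a: Feature, image_b: Feature) -> float:
        # Sum of bin-wise minima; equals 1.0 for identical normalized histograms.
        return sum(min(a, b) for a, b in zip(image_a, image_b))
```

    A caller depends only on the SearchAgent interface, so an agent built on a different feature can be substituted without changing the calling code.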

    The agent classes in ImAge and their interactions are shown in Fig. 1. A User Interface Agent (UIA) provides the human-machine interface of the system. It encapsulates the knowledge about the users, e.g., a user's preferences, feedback, history, etc. A UIA can implement any type of user interface, e.g., natural language interface, query by example, etc. However, it must communicate the query to the rest of the system in a standard form. A Search Coordination Agent (SCA) encapsulates the knowledge and the heuristic methods for solving a retrieval problem and its optimization. It accepts the user query from a UIA, interprets it and interacts with the other agents for planning and scheduling the retrieval subproblems. A Collection Agent (CA) forms a layer of abstraction over an image repository. It encapsulates the repository structure and produces a standard view of the different data elements available with the repository. It declares the capabilities of a repository in terms of its query and data services to the external world. A Search Agent (SA) encapsulates a specific image retrieval algorithm. Search Agents are developed independently of any repository structure and are made available in the network for public use. These agents can build upon one another to derive a complex data model. They are designed as mobile agents so that they can travel to the collection sites and analyze the documents at their sources.

    Fig. 1. Agent interaction

    Since ImAge allows dynamic growth, the agents in the system cannot be aware of each other's existence. The Registration Agents (RAs) maintain a list of the agents available in the system with their capability descriptions and provide a mechanism for dynamic resource discovery. Since the architecture encourages agents to be freely installed in the system, the system may be populated with a number of agents with similar capabilities but with different performance figures. The Benchmark Agents (BAs) benchmark the agents against a common set of data, which enables optimal choice of agents for solving a retrieval problem. The agent classes are described in more detail in the following subsections.
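    The interplay of registration, discovery and benchmarking can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the protocol of the prototype: the record fields, the capability labels, the thresholded-accuracy score and the select_best() helper are all hypothetical, since the text above does not specify how capabilities are described or how benchmark figures are computed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional, Sequence, Tuple

Feature = Sequence[float]
LabelledPair = Tuple[Feature, Feature, bool]  # (image a, image b, "are they similar?")


@dataclass
class AgentRecord:
    """Hypothetical capability description held by a Registration Agent."""
    name: str
    agent_class: str                      # e.g. "SA" or "CA"
    capabilities: FrozenSet[str]          # assumed free-form capability labels
    similarity: Callable[[Feature, Feature], float]
    benchmark_score: Optional[float] = None


class RegistrationAgent:
    """Keeps the list of available agents and answers discovery requests."""

    def __init__(self) -> None:
        self._records: Dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        self._records[record.name] = record

    def discover(self, agent_class: str, capability: str) -> List[AgentRecord]:
        return [r for r in self._records.values()
                if r.agent_class == agent_class and capability in r.capabilities]


class BenchmarkAgent:
    """Scores agents against one common data set (assumed metric: accuracy)."""

    def __init__(self, pairs: List[LabelledPair], threshold: float = 0.5) -> None:
        if not pairs:
            raise ValueError("benchmark data must not be empty")
        self.pairs = pairs
        self.threshold = threshold

    def score(self, record: AgentRecord) -> float:
        correct = sum((record.similarity(a, b) >= self.threshold) == label
                      for a, b, label in self.pairs)
        record.benchmark_score = correct / len(self.pairs)
        return record.benchmark_score


def select_best(candidates: List[AgentRecord]) -> Optional[AgentRecord]:
    """Pick the benchmarked candidate with the highest score, if any."""
    scored = [c for c in candidates if c.benchmark_score is not None]
    return max(scored, key=lambda c: c.benchmark_score) if scored else None
```

    Because every candidate is scored on the same data, agents with overlapping capabilities become directly comparable, which is the property the Benchmark Agents are meant to provide.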

    2.1 User Interface Agent

    A UIA provides the human-machine interface of ImAge. It is possible to have different types of user interfaces in ImAge that incorporate different forms of inputs, such as query by example, keywords, natural language input, etc. A user can select an appropriate UIA depending on his/her convenience. Every UIA should, however, translate the query to a standard form which is understood by the rest of the system (see Sect. 4.2). Besides this, a UIA should be able to handle a few other functions, such as the convenient display of results, user registration, maintenance of history, and accepting user feedback.

    Since the functionality of a UIA largely depends on the nature of the supported interface, there is no generic implementation for this agent class. A UIA is shielded from the complexity of the actual retrieval mechanism, which involves interaction of many retrieval resources. It views the SCA as a complete search engine and submits the user queries to the latter in interactive or non-interactive modes.
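    The division of labor between a UIA and the SCA can be sketched in Python. The standard query form itself is defined in Sect. 4.2 and is not reproduced here, so the Query record below is only a placeholder whose fields are assumptions; the KeywordUIA class and the submit() method name are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass(frozen=True)
class Query:
    """Placeholder for the standard query form (fields are assumptions)."""
    kind: str                  # e.g. "keywords" or "example"
    payload: Tuple[str, ...]   # keyword terms, or identifiers of example images
    interactive: bool = False


class SearchCoordinator(Protocol):
    """What a UIA needs from an SCA: a single entry point for queries."""

    def submit(self, query: Query) -> List[str]:
        ...


class KeywordUIA:
    """Hypothetical UIA for keyword input; other UIAs would build the same Query type."""

    def __init__(self, sca: SearchCoordinator) -> None:
        self.sca = sca
        self.history: List[Query] = []   # maintenance of the user's query history

    def ask(self, text: str, interactive: bool = False) -> List[str]:
        # Translate the interface-specific input into the common form,
        # record it, and hand it to the SCA as if it were a complete search engine.
        query = Query("keywords", tuple(text.lower().split()), interactive)
        self.history.append(query)
        return self.sca.submit(query)
```

    The UIA never refers to Collection Agents or Search Agents; everything behind submit() remains the SCA's concern.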

    2.2 Search Coordinator Agent

    A Search Coordinator Agent (SCA) coordinates the retrieval process utilizing the available resources in the system. Collaboration is achieved using a two-phase protocol as in [...

