Infomaster: An Information Integration Tool
O. M. Duschka and M. R. Genesereth
Presentation by Cui Tao
Introduction
Huge amount of information online:
– Distribution: not every query can be answered by the data in a single database
  • Fragmentation: horizontal, vertical
– Heterogeneity
  • Notational heterogeneity:
    – Different access languages and protocols: parsing HTML, SQL, OQL, Z39.50
  • Conceptual heterogeneity:
    – Semantic mismatches
– Instability
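The two kinds of fragmentation can be illustrated with a minimal Python sketch. The car listings, field names, and source names below are hypothetical and chosen only for illustration; they do not come from the Infomaster system.

```python
# Hypothetical data: one logical "cars" relation spread over several sources.

# Horizontal fragmentation: each source holds a subset of the rows.
source_a = [{"id": 1, "make": "BMW", "year": 1996}]
source_b = [{"id": 2, "make": "Audi", "year": 1995}]

# Vertical fragmentation: another source holds a subset of the columns,
# linked to the rows above by a shared key ("id").
prices = {1: {"price": 30000}, 2: {"price": 21000}}


def integrate(row_fragments, column_fragment):
    """Union the row fragments, then join in the extra columns by key."""
    rows = [row for fragment in row_fragments for row in fragment]
    return [{**row, **column_fragment.get(row["id"], {})} for row in rows]


cars = integrate([source_a, source_b], prices)
```

Undoing horizontal fragmentation amounts to a union, and undoing vertical fragmentation amounts to a join on the shared key.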
Introduction
Intelligent agents
– Search and find desired information
– Convert formats
– Translate between different contexts
– Etc.
– Not feasible yet
– Considerable research in ontologies and natural language understanding is required
Introduction
Infomaster: an information integration tool
– Provide integrated access
– Manage evolving information sources
– Add new information sources
– Remove outdated information sources
Tested Application Areas
Newspaper classifieds
– Provide a uniform search interface
– Gather corresponding classifieds from all relevant newspapers
Product catalogs
– Provide terminology translation
Campus databases
Descriptions of Relationships
Interface relations and site relations are both described in terms of base relations
Interface relation vs. base relation (diagram: Interface, Base)
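One way to picture the three kinds of relations is the Python sketch below. The relation names (`car`, `price`, `car_ad`, `site_cars`) and the `Description` class are hypothetical, and the representation is an assumption made for illustration rather than Infomaster's actual description language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Description:
    """A relation described by a conjunction of base-relation atoms."""
    head: str    # the interface or site relation being described
    body: tuple  # base-relation atoms, each as (relation_name, args)


# Hypothetical base relations: car(Id, Make, Year), price(Id, Amount).
BASE_RELATIONS = {"car", "price"}

# An interface relation (what the user sees), in terms of base relations.
car_ad = Description(
    head="car_ad(Id, Make, Year, Amount)",
    body=(("car", ("Id", "Make", "Year")), ("price", ("Id", "Amount"))),
)

# A site relation (what one source stores), also in terms of base relations.
site_cars = Description(
    head="site_cars(Id, Make, Year)",
    body=(("car", ("Id", "Make", "Year")),),
)


def uses_only_base(description):
    """Check that a description mentions nothing but base relations."""
    return all(name in BASE_RELATIONS for name, _ in description.body)
```

Because both kinds of description share the base vocabulary, interface relations and site relations never need to refer to each other directly.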
Query Processing
Example: BMWs built in 1996 that are for sale at a price below their average market value.
Reduction: interface relations → base relations
Simple:
User’s query --- Interface relation --- Base relation
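The reduction step can be sketched in Python as a simple substitution: each interface atom in the query is replaced by its definition over base relations. The `car_ad` interface relation, the `car` and `price` base relations, and the tuple encoding of atoms are assumptions made for this sketch, and the query is a simplified version of the BMW example (the market-value comparison is left out).

```python
# Hypothetical definitions: interface relation name -> function building
# the base-relation atoms that define it.
INTERFACE_DEFINITIONS = {
    "car_ad": lambda car_id, make, year, amount: [
        ("car", (car_id, make, year)),
        ("price", (car_id, amount)),
    ],
}


def reduce_query(atoms):
    """Replace every interface atom by its definition over base relations."""
    reduced = []
    for name, args in atoms:
        if name in INTERFACE_DEFINITIONS:
            reduced.extend(INTERFACE_DEFINITIONS[name](*args))
        else:
            reduced.append((name, args))  # not an interface atom: keep as is
    return reduced


# Simplified user query: car ads for BMWs built in 1996.
user_query = [("car_ad", ("Id", "BMW", 1996, "P"))]
base_query = reduce_query(user_query)
```

The step is simple because interface relations are defined directly in terms of base relations, so unfolding the definitions is enough.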
Example rewritten query:
Abduction: base relations → site relations
Site relations are expressed in terms of base relations, but not vice versa
Query rewriting problem: answering queries using views
Abduction: use a standard model elimination theorem prover
Abduction: base relations → site relations
Definitions (symbols missing in the source):
– The set of all descriptions of the site relations
– A set of site relations
– The rewritten user query after the reduction step
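To give a feel for answering queries using views, the Python sketch below finds, for each base relation in a reduced query, the site relations whose descriptions mention it. This is only a relevance filter, not the model elimination prover the slides refer to, and the site names and descriptions are hypothetical.

```python
# Hypothetical site descriptions: site relation -> base relations it covers.
SITE_DESCRIPTIONS = {
    "site_cars": {"car"},
    "site_prices": {"price"},
    "site_listings": {"car", "price"},
}


def candidate_sites(base_query):
    """For each base relation in the query, list sites that could supply it."""
    return {
        relation: sorted(
            site for site, covered in SITE_DESCRIPTIONS.items()
            if relation in covered
        )
        for relation in base_query
    }


plan = candidate_sites(["car", "price"])
```

A real rewriting would also have to check that variables and constants line up across the chosen site relations; this sketch stops at collecting candidates.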
Conclusions
The first integration system:
– Arbitrary positive relational algebra user queries
– DB descriptions
Efficient optimization by using:
– Integrity constraints
– Local completeness information
Flexible use of query planning:
– Expressive description language
– Constraints
– Background theories
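As a rough illustration of how integrity constraints can help optimization, the Python sketch below skips sites whose constraints rule out any matching data. The site names and year ranges are hypothetical, and the pruning rule is an assumption for illustration rather than Infomaster's actual optimizer.

```python
# Hypothetical constraints: each site only holds cars from a year range.
SITE_YEAR_RANGES = {
    "site_classics": (1950, 1979),
    "site_recent": (1990, 1999),
}


def relevant_sites(query_year):
    """Keep only sites whose year range can contain the queried year."""
    return [
        site for site, (low, high) in SITE_YEAR_RANGES.items()
        if low <= query_year <= high
    ]


sites = relevant_sites(1996)
```

For a query about cars built in 1996, a site known to hold only cars from before 1980 never needs to be contacted.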
Related Works
Information Manifold project and SIMS project:
– Explore the use of description logics for describing information sources
Occam project
– Uses general AI planning techniques to generate information gathering plans
TSIMMIS project
– Uses pattern matching techniques to match user queries against predefined queries