Improving Search in P2P Networks
Presenters: Mu, Ai Lu, Min
Date: Nov 25, 2004
25/11/04 Improving Search in P2P 2
Outline
• Introduction to JXTA Search
• Architecture and Components
• Design Goals
• Query Routing Protocol (QRP)
• Query Resolution
• Summary
Current P2P Search Models
Two main models of P2P networks:
The centralized client/server model;
The decentralized model.
Searching Centralized Networks
A central index locates files quickly and efficiently;
The central server is a single point of failure and a visible target for attacks on the network;
Clients may receive outdated information because the central server's index is only updated periodically.
Searching Decentralized Networks
Removes the central structure of the network;
Searching a decentralized network is slower;
A file is not guaranteed to be found even if it is on the network, because the query's TTL (time to live) may expire before the query reaches the peer that holds it.
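The TTL limitation can be illustrated with a minimal Python sketch of flooding search. The overlay graph, the peer names, and the `flood_search` helper are hypothetical and do not belong to any particular P2P protocol; the sketch only shows that a query with too small a TTL never reaches a peer that does hold the file.

```python
from collections import deque


def flood_search(neighbors, files, start, wanted, ttl):
    """Breadth-first flooding from `start`; each hop uses up one unit of TTL.

    neighbors: dict peer -> list of directly connected peers (hypothetical overlay)
    files:     dict peer -> set of file names stored on that peer
    Returns the first peer found holding `wanted`, or None if the TTL runs out.
    """
    seen = {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if wanted in files.get(peer, set()):
            return peer
        if remaining == 0:
            continue  # TTL expired: the query is not forwarded any further
        for nxt in neighbors.get(peer, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return None


# A chain of peers A - B - C - D; only D holds the file.
overlay = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
shared = {"D": {"song.mp3"}}
```

Starting at peer A, a TTL of 2 stops the query at C, so the file on D is not found, while a TTL of 3 reaches D.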
Why JXTA Search?
Most Web content is invisible to current search engines;
JXTA Search addresses this problem by providing a unique query routing protocol that makes content visible and facilitates its use.
Introduction to JXTA
Originally developed by Sun;
JXTA is a set of open, generalized peer-to-peer (P2P) protocols that allow any connected device on the network, from cell phone to PDA and from PC to server, to communicate and collaborate as peers;
The JXTA protocols are independent of any programming language, and multiple implementations exist.
JXTA Search
JXTA Search is a decentralized P2P search engine.
Defines an XML protocol, the Query Routing Protocol (QRP), which enables search in P2P networks.
Supports both “Wide Search” and “Deep Search”.
Open source code (http://search.jxta.org)
JXTA Search
Wide search of distributed devices, such as PCs, PDAs, and cell phones.
Deep search of rich content sources, such as Web servers.
Architecture and Components
The JXTA Search network architecture consists of the following components:
• Registration Service
• Provider Service
• Consumer Service
• Hub Service
JXTA Search Hub Service
At the heart of JXTA Search is the "router/resolver".
The JXTA Search Hub Service consists of two subcomponents: the Router and the Resolver.
JXTA Search Router
- routes and manages query connections,
- collates results and returns them to consumers.
JXTA Search Resolver
- maintains an index of provider registrations,
- when a query is received, matches it against the set of providers that may be good at answering it.
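The division of labor between the two hub subcomponents can be sketched in Python as follows. This is only an illustration: the class names, the `register`, `resolve`, and `route` methods, and the two sample providers are hypothetical and are not taken from the JXTA Search code base.

```python
class Resolver:
    """Keeps an index of provider registrations, keyed by query-space."""

    def __init__(self):
        self.index = {}  # query-space -> list of provider callables

    def register(self, query_space, provider):
        self.index.setdefault(query_space, []).append(provider)

    def resolve(self, query_space):
        """Return the providers that may be able to answer the query."""
        return list(self.index.get(query_space, []))


class Router:
    """Sends a query to the resolved providers and collates their results."""

    def __init__(self, resolver):
        self.resolver = resolver

    def route(self, query_space, query):
        results = []
        for provider in self.resolver.resolve(query_space):
            results.extend(provider(query))  # each provider returns a list
        return results


# Two hypothetical providers registered in the same query-space.
def shop_a(query):
    return [f"shop-a: {query}"]


def shop_b(query):
    return [f"shop-b: {query}"]


resolver = Resolver()
resolver.register("http://www.sellbooks.com/JxtaSearch", shop_a)
resolver.register("http://www.sellbooks.com/JxtaSearch", shop_b)
router = Router(resolver)
```

Calling `router.route("http://www.sellbooks.com/JxtaSearch", "Java")` contacts only the two registered providers and returns their collated results; a query in an unknown query-space returns an empty list.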
Architecture
Distributed Search
• Central to the JXTA Search infrastructure are "hubs".
• Each hub has a series of providers that form its local network.
• These providers typically have something in common.
• Hubs are expected to become an efficient way to group peers by similar content or geography.
Design Goals
• Simplicity: any client and server can be incorporated;
• Structure: all queries to the JXTA Search network are XML messages conforming to a queryspace, in which providers register templates describing the structure of queries they can accept;
• Extensibility: arbitrary queryspaces can be used;
• Scalability: peers can join the network dynamically by sending a registration message.
Outline
• Introduction to JXTA Search
• Architecture and Components
• Design Goals
• Query Routing Protocol (QRP)
- Query Messages
- Response Messages
- Registration Messages
• Query Resolution
• Summary
Query Routing Protocol (QRP)
QRP defines mechanisms for sending and responding to queries, as well as metadata for nodes in the network.
Queryspaces
Providers may have widely different types of content or resources in their datastores.
The notion of a queryspace is used to define the structure of a query and its associated registration.
Queryspaces are a fundamental component of the JXTA Search framework. Like XML namespaces, queryspaces do not necessarily reference actual content; they are simply identifiers used by providers and consumers to find each other.
QRP - Query Messages
Query messages are structured as follows:
The default namespace is http://search.jxta.org.
The query message is contained within the envelope <request>...</request>.
The query's unique ID is specified in a UUID attribute of the <request> tag (query-uuid in the example below).
The query space is specified in the query-space attribute of the <request> tag.
The query data can be arbitrary XML. The <query> tag marks the start of the actual query data, which is either free text in a <text> tag or elements from any other namespace specified by the query-space definition.
E.g. a query for books on Java:
<?xml version='1.0'?>
<request xmlns="http://search.jxta.org"
         xmlns:b="http://www.sellbooks.com/JxtaSearch"
         query-uuid="1C8DAC3036A811D584AEC2C23"
         query-space="http://www.sellbooks.com/JxtaSearch">
  <query>
    <b:author>Bill Joy</b:author>
    <b:title>Java</b:title>
  </query>
</request>
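A query message of this shape can be produced with Python's standard `xml.etree.ElementTree` module, as in the sketch below. The `build_query` helper is hypothetical; the attribute names `query-uuid` and `query-space` are copied from the example message, and placing the field elements in the query-space namespace is an assumption based on that example.

```python
import xml.etree.ElementTree as ET

JXTA_NS = "http://search.jxta.org"
BOOKS_NS = "http://www.sellbooks.com/JxtaSearch"

# Serialize the JXTA Search namespace as the default one and the bookstore
# namespace with the "b" prefix, as in the example message.
ET.register_namespace("", JXTA_NS)
ET.register_namespace("b", BOOKS_NS)


def build_query(query_uuid, query_space, fields):
    """Build a <request> envelope whose <query> holds one element per field."""
    request = ET.Element(
        f"{{{JXTA_NS}}}request",
        {"query-uuid": query_uuid, "query-space": query_space},
    )
    query = ET.SubElement(request, f"{{{JXTA_NS}}}query")
    for name, value in fields.items():
        # Assumption: field elements live in the query-space namespace.
        ET.SubElement(query, f"{{{query_space}}}{name}").text = value
    return ET.tostring(request, encoding="unicode")


message = build_query(
    "1C8DAC3036A811D584AEC2C23",
    BOOKS_NS,
    {"author": "Bill Joy", "title": "Java"},
)
```

Building the message through an XML library rather than by string concatenation ensures that attribute values are quoted and that both namespaces are declared.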
QRP - Response Messages
The response message is structured as follows:
The default namespace is http://search.jxta.org.
The response message is enveloped within the <responses>...</responses> tags, with each specific response enveloped in <response>...</response> tags.
The body of the response is contained within the <data>...</data> tags. It can be arbitrary well-formed XML.
<?xml version='1.0'?>
<responses xmlns="http://search.jxta.org"
           xmlns:b="http://www.sellbooks.com/JxtaSearch"
           query-uuid="1C8DAC3036A811D584AEC2C23">
  <response>
    <data>
      <b:authors>Bill Joy, Guy Steele</b:authors>
      <b:URL>http://www.sellbooks.com/0201310082</b:URL>
      <b:title>The Java Language Specification, Second Edition</b:title>
      <b:price>$39.95</b:price>
    </data>
  </response>
</responses>
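On the consumer side, a response message can be unpacked along the following lines. This is a sketch using Python's standard `xml.etree.ElementTree`; the `parse_responses` helper is hypothetical, and the sample is a shortened, well-formed version of the response shown above.

```python
import xml.etree.ElementTree as ET

JXTA_NS = "http://search.jxta.org"


def parse_responses(xml_text):
    """Return (query_uuid, results); each result maps local tag name -> text."""
    root = ET.fromstring(xml_text)
    results = []
    for response in root.findall(f"{{{JXTA_NS}}}response"):
        data = response.find(f"{{{JXTA_NS}}}data")
        if data is None:
            continue  # skip responses without a body
        # Drop the "{namespace}" part of each tag and trim whitespace.
        results.append(
            {child.tag.split("}")[-1]: (child.text or "").strip() for child in data}
        )
    return root.get("query-uuid"), results


sample = """<?xml version='1.0'?>
<responses xmlns="http://search.jxta.org"
           xmlns:b="http://www.sellbooks.com/JxtaSearch"
           query-uuid="1C8DAC3036A811D584AEC2C23">
  <response>
    <data>
      <b:title>The Java Language Specification, Second Edition</b:title>
      <b:price>$39.95</b:price>
    </data>
  </response>
</responses>"""
```

The returned query ID lets a consumer pair each batch of responses with the request that produced it.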
QRP - Registration Messages
Information providers must register with the JXTA Search network.
To register, a provider contacts an access point with a registration message.
The registration message is an XML document with three components:
• The queryspace URL. When queries are posted to this queryspace, the provider's predicates are checked for matches.
• A set of predicates. Each predicate defines the structure and content of the queries the provider is interested in.
• The provider's query server endpoint, which is either a JXTA pipe ID or a URL. Queries that match one of the provider's predicates are posted to this endpoint.
QRP - Registration Messages
<?xml version='1.0'?>
<register xmlns="http://search.jxta.org"
          xmlns:b="http://www.sellbooks.com/JxtaSearch">
  <!-- The query server -->
  <query-server>http://www.sellbooks.com/exec/jxtasearch.pl</query-server>
  <!-- The query space -->
  <query-space uri="http://www.sellbooks.com/JxtaSearch">
    <!-- The predicate body -->
    <predicate>
      <query>
        <b:author>
          <quote>Bill Joy</quote><quote>Guy Steele</quote>
        </b:author>
        <b:title>The Java Language Specification, Second Edition</b:title>
      </query>
    </predicate>
  </query-space>
</register>
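A hub that receives such a registration needs to extract the three components before it can index them. The sketch below shows one way to do that with Python's standard library. The `Registration` dataclass and the `parse_registration` helper are hypothetical, and the sample is a cleaned-up copy of the registration above with quoted attribute values and a full `http://` query-space URI.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

JXTA_NS = "http://search.jxta.org"


@dataclass
class Registration:
    query_server: str  # JXTA pipe ID or URL
    query_space: str   # value of the uri attribute
    predicates: list = field(default_factory=list)  # one dict per <predicate>


def _terms(element):
    """Collect the <quote> children if present, else the element's own text."""
    quotes = [q.text.strip() for q in element.findall(f"{{{JXTA_NS}}}quote") if q.text]
    return quotes or [(element.text or "").strip()]


def parse_registration(xml_text):
    root = ET.fromstring(xml_text)
    server = root.findtext(f"{{{JXTA_NS}}}query-server", default="").strip()
    space = root.find(f"{{{JXTA_NS}}}query-space")
    reg = Registration(server, space.get("uri"))
    for predicate in space.findall(f"{{{JXTA_NS}}}predicate"):
        query = predicate.find(f"{{{JXTA_NS}}}query")
        # Map each field's local name to the terms the provider accepts.
        reg.predicates.append({el.tag.split("}")[-1]: _terms(el) for el in query})
    return reg


sample = """<register xmlns="http://search.jxta.org"
          xmlns:b="http://www.sellbooks.com/JxtaSearch">
  <query-server>http://www.sellbooks.com/exec/jxtasearch.pl</query-server>
  <query-space uri="http://www.sellbooks.com/JxtaSearch">
    <predicate>
      <query>
        <b:author><quote>Bill Joy</quote><quote>Guy Steele</quote></b:author>
        <b:title>The Java Language Specification, Second Edition</b:title>
      </query>
    </predicate>
  </query-space>
</register>"""
```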
Query Resolution
Queries are resolved by a resolver, which matches query terms against registration terms. Providers whose registration terms match the query terms are returned by the resolver.
The minimal condition for matching a query to a provider is that the query has the same query-space as the provider's registration.
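The resolution step can be sketched in Python as follows. Only the query-space check comes from the text above. The term-matching rule used here (a query value must be a case-insensitive substring of some registered term, for every field that the query and the predicate share) is an assumption made for illustration, as are the function names and the second, music-related provider.

```python
def predicate_matches(query_terms, predicate):
    """Assumed rule: every field shared by query and predicate must match."""
    shared = [name for name in query_terms if name in predicate]
    if not shared:
        return False  # nothing in common, so the predicate cannot match
    return all(
        any(query_terms[name].lower() in term.lower() for term in predicate[name])
        for name in shared
    )


def resolve(query_space, query_terms, registrations):
    """Return the query servers of all providers whose registration matches."""
    return [
        reg["query_server"]
        for reg in registrations
        # Minimal condition: the query-space must be identical.
        if reg["query_space"] == query_space
        and any(predicate_matches(query_terms, p) for p in reg["predicates"])
    ]


registrations = [
    {
        "query_server": "http://www.sellbooks.com/exec/jxtasearch.pl",
        "query_space": "http://www.sellbooks.com/JxtaSearch",
        "predicates": [
            {
                "author": ["Bill Joy", "Guy Steele"],
                "title": ["The Java Language Specification, Second Edition"],
            }
        ],
    },
    {
        # Hypothetical provider in a different query-space.
        "query_server": "http://music.example.org/search",
        "query_space": "http://music.example.org/JxtaSearch",
        "predicates": [{"artist": ["Bill Joy"]}],
    },
]
```

With these registrations, a book query for author "Bill Joy" and title "Java" resolves only to the bookstore's query server; the music provider is never contacted because its query-space differs.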
Query Resolution
Goal: provide an efficient query resolution and routing service, that is, determine the set of providers to which a given query should be routed.
Method 1: send all queries to all providers. This is inefficient; JXTA Search aims for greater efficiency.
Method 2: define a framework for providers to register the types of queries they are interested in.
JXTA Advantages
• Simplicity & Robustness: JXTA defines a simple, lean framework for P2P applications and defers complex implementation details to implementing applications.
• Interoperability & Ubiquity: allows a wide range of peers, such as sensors, PDAs, appliances, network routers, desktop computers, data-center servers, and storage systems, to interact with one another.
• Language & Platform independence
• Clear-cut distinction between policies and mechanisms: to keep the core small and elegant, there is an architectural distinction between core mechanisms and optional policies.
• Flexibility: implementation & incremental improvement
• Openness: open source is available at http://search.jxta.org
JXTA Disadvantages
• Heavy Applications
- JXTA implements a minimal P2P infrastructure and leaves several issues for applications to address.
- Example: reliable end-to-end communication on top of an unreliable transport.
• Security & Trust models
- To date, JXTA relies heavily on credentials and digests for authentication and protection.
- Applications are required to implement their own security models.
- Trustworthiness of peers.
Summary
A novel approach to query routing in distributed networks.
Uses a simple XML protocol combined with powerful but simple indexing and matching engines.
Provides developers with the capability to connect multiple consumer and provider applications for the purposes of information discovery and exchange.
The End
Thank you!