36
Efficient Search in Efficient Search in Peer to Peer Networks Peer to Peer Networks By: Beverly Yang Hector Garcia-Molina Presented By: Anshumaan Rajshiva Date: May 20,2002

Efficient Search in Peer to Peer Networks

  • Upload
    maylin

  • View
    41

  • Download
    0

Embed Size (px)

DESCRIPTION

Efficient Search in Peer to Peer Networks. By: Beverly Yang Hector Garcia-Molina Presented By: Anshumaan Rajshiva Date: May 20,2002. P2P Networks. Distributed systems in which nodes of equal roles and capabilities exchange information and services directly with each other. Key Challenges. - PowerPoint PPT Presentation

Citation preview

Page 1: Efficient Search in Peer to Peer Networks

Efficient Search in Peer to Efficient Search in Peer to Peer NetworksPeer Networks

By: Beverly Yang

Hector Garcia-Molina

Presented By:Anshumaan RajshivaDate: May 20,2002

Page 2: Efficient Search in Peer to Peer Networks

P2P NetworksP2P Networks

Distributed systems in which nodes of equal roles and capabilities exchange information and services directly with each other.

Page 3: Efficient Search in Peer to Peer Networks

Key ChallengesKey ChallengesEfficient techniques for search and retrieval

of data.Best search techniques for a system

depends on the needs of the application.Current search techniques in “loose” P2P

systems tend to be very inefficient, either generating too much load on the system, or providing for a very bad user experience.

Page 4: Efficient Search in Peer to Peer Networks

Current Reason for Current Reason for Inefficiency Inefficiency

Queries are processed by more nodes than desired.

Page 5: Efficient Search in Peer to Peer Networks

Suggested ImprovementSuggested Improvement

Processing queries through fewer nodes.

Page 6: Efficient Search in Peer to Peer Networks

TechniquesTechniques

Iterative DeepeningDirected BFSLocal Indices

Page 7: Efficient Search in Peer to Peer Networks

Problem FrameworkProblem Framework

P2P: Undirected graph Vertices: nodes in the n/w

Edges: Open connections between neighbors.

Message will travel from A to B in hops. Length of the path: Number of hopsSource of query: Node submitting the query

Page 8: Efficient Search in Peer to Peer Networks

Problem FrameworkProblem Framework

When a node receives a query it should process the query locally and respond to the query/forward/drop the query

Address of the source node will be unknown to the responding node (scheme used by Gnutella)

Page 9: Efficient Search in Peer to Peer Networks

Metrics Metrics

Cost: Average Aggregate Bandwidth Average Aggregate Processing Cost

Quality of Results: Number of Results Satisfaction of the

query

Page 10: Efficient Search in Peer to Peer Networks

CostCost

Message propagates across nodes,each node spend some processing resources on behalf of the query

Main cost described in terms of bandwidth and processing cost

Page 11: Efficient Search in Peer to Peer Networks

CostCost

Average Aggregate Bandwidth: The average over a set of representative queries of the aggregate BW consumed(in bytes) over each edge on behalf of the query

Average Aggregate Processing Cost: The average over a set of representative queries of the aggregate processing power consumed at each node on behalf of the query

Page 12: Efficient Search in Peer to Peer Networks

Quality of ResultsQuality of Results

Results from the perspective of userNumber of results: the size of total result setSatisfaction of the query:a query is satisfied

if Z or more results are returned, where Z is some value specified by user

Time to satisfaction: how long the user must wait for the Zth result to arrive

Page 13: Efficient Search in Peer to Peer Networks

Current TechniquesCurrent Techniques Gnutella: BFS technique is used with depth limit

of D, where D= TTL of the message.At all levels <D query is processed by each node and results are sent to source and at level D query is dropped.

Freenet: uses DFS with depth limit D.Each node forwards the query to a single neighbor and waits for a definite response from the neighbor before forwarding the query to another neighbor(if the query was not satisfied), or forwarding the results back to the query source(if query was satisfied).

Page 14: Efficient Search in Peer to Peer Networks

Stop and Think…Stop and Think…

Quality of results measured only by number of results then BFS is ideal

If Satisfaction is metrics of choice BFS wastes much bandwidth and processing power

With DFS each node processes the query sequentially,searches can be terminated as soon as the query is satisfied, thereby minimizing cost.But poor response time due to the above

Page 15: Efficient Search in Peer to Peer Networks

Broadcast PolicyBroadcast Policy

BFS and DFS falls on opposite extremes of bandwidth/processing cost and response time.

Need to find some middle ground between the two extremes, while maintaining quality of results.

Page 16: Efficient Search in Peer to Peer Networks

Iterative DeepeningIterative Deepening

When satisfaction is the metric of choiceMultiple BFS are initiated with successively

larger depths, until query is satisfied or the maximum depth limit D is reached

Page 17: Efficient Search in Peer to Peer Networks

Iterative DeepeningIterative Deepening

System wide policy specifying at what depth the iterations are to occur

Last depth in policy must be set to DA waiting period W ( time between

successive iterations in the policy)must be specified

Page 18: Efficient Search in Peer to Peer Networks

Working of Iterative Working of Iterative DeepeningDeepening

Policy P{a,b,c} S initiates a BFS of depth a by sending out a query

message with TTL=a to all its neighbors Once a node at depth a receives and process the

message, instead of dropping it, the node will store the message temporarily

Query becomes frozen there at all nodes a hops away from S (Frontier nodes)

S receives response from those nodes that have processed the query so far.

Page 19: Efficient Search in Peer to Peer Networks

Working of Iterative Working of Iterative DeepeningDeepening

After waiting for time W if S finds that the query has already been satisfied, then it does nothing.

Otherwise, if the query is not yet satisfied,s will start the next iteration, initiating BFS at depth b.

S send out a resend message with TTL=a. A node that receives a resend message,simply

unfreeze the query(stored temporarily) and forward the same with TTL= b-a to its neighbors.

This process continues in the similar fashion till TTL=D is reached.At depth D, the query is dropped

Page 20: Efficient Search in Peer to Peer Networks

Working of Iterative Working of Iterative DeepeningDeepening

To identify queries with Resend messages, every query is assigned a system wide “unique identifier”.

The resend message will contain the identifier of the query it is representing and nodes at the frontier of a search will know which query to unfreeze by inspecting this identifier.

Page 21: Efficient Search in Peer to Peer Networks

Iterative DeepeningIterative Deepening

Source

Node1Node2 Level 1

Node3 Node4 Level 2

Page 22: Efficient Search in Peer to Peer Networks

Directed BFSDirected BFSIf minimizing response time is important

then Directed BFS.Strategy used is to send queries to a subset

of nodes that will return many returns quickly by intelligently selecting those nodes based on some parameters.

For this purpose, a node will maintain statistics on its neighbors

Page 23: Efficient Search in Peer to Peer Networks

Directed BFSDirected BFS

These statistics will be based on the number of results that were received through the neighbors for past queries

By sending the queries to small subset of the nodes, the cost incurred will be reduced significantly

The quality of results is not decreased significantly,provided we make neighbor selection intelligently

Page 24: Efficient Search in Peer to Peer Networks

Local IndicesLocal Indices

A node maintains an index over the data of each node within r hops of itself, where r is a system wide variable called radius

When a node receives a Query message, it can then process the query on behalf of every node within r hops of itself

Collections of many nodes can be searched by processing the query at few nodes, while keeping the cost low

Page 25: Efficient Search in Peer to Peer Networks

Local IndicesLocal Indices

R should be small.The index will be small- typically of the

order of 50 KB- independent of the size of the network

Page 26: Efficient Search in Peer to Peer Networks

Working of Local IndicesWorking of Local Indices

Policy specifies at which depth query will be processed

To create and maintain the indices at each node

All nodes at depths not listed in the policy will simply forward the query to the next depth

Page 27: Efficient Search in Peer to Peer Networks

Maintaining IndicesMaintaining Indices

Joining a new node: sends a join message with TTL=r and all the nodes within r hops update their indices.

Join message contains the metadata about the joining node

When a node receives this join message it, in turn, send join message containing its meta data directly to the new node.New node updates its indices

Page 28: Efficient Search in Peer to Peer Networks

Maintaining IndicesMaintaining Indices

Node dies: Other nodes update their indices based on the timeouts

Updating the node: When a node updates its collection, his node will send out a small update message with TTL= r, containing the metadata of the affected item.All nodes receiving this message subsequently update their index.

Page 29: Efficient Search in Peer to Peer Networks

Results for Iterative Results for Iterative DeepeningDeepening

Page 30: Efficient Search in Peer to Peer Networks

Results for Iterative Results for Iterative DeepeningDeepening

Page 31: Efficient Search in Peer to Peer Networks

Results for Directed BFSResults for Directed BFS

Page 32: Efficient Search in Peer to Peer Networks

Results for Directed BFSResults for Directed BFS

Page 33: Efficient Search in Peer to Peer Networks

Results for Local IndicesResults for Local Indices

Page 34: Efficient Search in Peer to Peer Networks

ConclusionsConclusions

Compared to current techniques used in existing systems, the discussed techniques greatly reduce the aggregate cost of processing query over the entire system, while maintaining the quality of results

Schemes are simple and practical to implement on the existing systems.

Page 35: Efficient Search in Peer to Peer Networks

ConclusionsConclusions