A Statistical Physics approach for Modeling P2P Systems

Giovanna Carofiglio 1, R. Gaeta 2, M. Garetto 1, P. Giaccone 1, E. Leonardi 1, M. Sereno 2

MAMA Workshop joint with ACM SIGMETRICS 2005, Banff, June 6-10, 2005

1 Politecnico di Torino, 2 Università di Torino, Italy

MAMA Workshop, Sigmetrics ‘05

Outline

Motivation
Basic Model
Extended Model
Content Search
Download effects

P2P System Architecture

[Figure labels: peers, clients, server]

A possible definition

Decentralized, self-organizing distributed systems, in which all or most communication is symmetric.

Peer-to-Peer traffic

P2P is the single largest generator of traffic

P2P traffic significantly outweighs web traffic

P2P traffic is continuing to grow

P2P Applications

Communication: Voice over IP (Skype), Instant Messaging

Distributed Computation: Seti@home, UnitedDevices, Distributed Science

File Sharing: BitTorrent, KaZaA, Gnutella, eDonkey, Napster, etc.

DHTs: Chord, CAN, Pastry, Tapestry

Wireless Ad hoc Networking

Motivation

Most of the Internet traffic is generated by p2p applications.

Performance studies of p2p systems may be useful to drive the design of future applications.

Analytical models help analyze large and complex p2p networks.

Modeling techniques

Traditional Markov Models

A detailed microscopic description is provided, but with a huge state space.

It is computationally expensive to analyze large systems such as p2p systems (with millions of users and shared contents).

Fluid models

Network dynamics are described with an increased level of abstraction, neglecting stochastic information.

Scalability: the model is based on a set of differential equations that is invariant w.r.t. the size of the network (number of users, link capacity).

Model description

[1] F. Clevenot, P. Nain, “A Simple Model for the Analysis of SQUIRREL”, Infocom 2004, Hong Kong, Mar 2004.

[2] D. Qiu, R. Srikant, “Modeling and Performance Analysis of BitTorrent-like Peer-to-Peer Networks”, Sigcomm 2004, U.S.A.

We model a generic p2p system without focusing on a particular implementation.

Starting from a fluid approach as in [1] and [2], our model moves to a second-order diffusion approximation, in which the stochasticity of the network dynamics plays a relevant role.

The model provides a description of users/contents dynamics both in transient and in steady state.

Model structure

Users dynamics

Contents dynamics

Search phase

Download phase

Users dynamics (1)

The number of users in the p2p network changes dynamically according to:

Enter-leave dynamics
λu = new users’ arrival rate
1/μu = average subscription time

Active-Sleeping mode
1/μas = average active time
1/μsa = average sleeping time

Users in sleeping mode do not interact at all with the other users of the community.

Users dynamics (2)

The evolution of the number of users in active or sleeping mode, Ua and Us respectively, can be described by two fluid differential equations:

Term annotations on the two equations: new users; sleeping users who become active; active users who become sleeping (appears twice); active users who leave the system.
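A minimal sketch of how two fluid equations of this kind could be integrated numerically. The equations used here are a reconstruction from the term annotations and the rate definitions (λu, μu, μas, μsa), assuming that only active users leave the system: dUa/dt = λu + μsa·Us − (μas + μu)·Ua and dUs/dt = μas·Ua − μsa·Us. They are not copied from the slide, and the function name is illustrative. The parameter values are those of the content-disappearance case in this deck.

```python
def simulate_users(lam_u, mu_u, mu_as, mu_sa, ua0, us0, t_end, dt=1.0):
    """Forward-Euler integration of the assumed fluid equations.

    Assumed (reconstructed) model, not taken verbatim from the slides:
        dUa/dt = lam_u + mu_sa * Us - (mu_as + mu_u) * Ua
        dUs/dt = mu_as * Ua - mu_sa * Us
    """
    ua, us = float(ua0), float(us0)
    steps = int(t_end / dt)
    for _ in range(steps):
        d_ua = lam_u + mu_sa * us - (mu_as + mu_u) * ua
        d_us = mu_as * ua - mu_sa * us
        ua += dt * d_ua
        us += dt * d_us
    return ua, us


# Parameters of the content-disappearance case: 0.1 arrivals/s,
# 4000 s subscription time, 400 s active and sleeping periods.
ua, us = simulate_users(
    lam_u=0.1, mu_u=1 / 4000, mu_as=1 / 400, mu_sa=1 / 400,
    ua0=10, us0=10, t_end=200_000,
)
print(f"Ua = {ua:.1f}, Us = {us:.1f}")
```

Under these assumptions the fixed point is Ua = λu/μu and Us = Ua·μas/μsa, so the long-run values can be checked by hand against the integration.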

Content Dynamics

The evolution of the number of available copies of a content is driven by 2 phenomena:

the generation of new copies (downloads or off-on transitions)

the cancellation of existing copies

θ = average request rate

1/μh, 1/μ’h = average content holding time for active/sleeping users

Note: ps = ps(μ’h) is the probability that a sleeping user still has the considered content when it becomes active.

Brownian Motion

Content dynamics are modelled through a second-order diffusion approximation.

Each content is a particle with instantaneous position x(t) moving according to a Brownian motion, described by a Langevin equation.

The evolution of the pdf f(x,t) follows the Fokker-Planck equation.
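For reference, the standard textbook forms of the two equations named on this slide, written with the drift m(x,t) and variance σ²(x,t) that the deck uses for its diffusion parameters. The noise term W(t) (a standard Wiener process) and the Itô interpretation are assumptions of this sketch; the slide's own notation was not recovered.

```latex
\begin{align}
  dx(t) &= m(x,t)\,dt + \sigma(x,t)\,dW(t), \\
  \frac{\partial f(x,t)}{\partial t}
    &= -\frac{\partial}{\partial x}\bigl[m(x,t)\,f(x,t)\bigr]
       + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\bigl[\sigma^2(x,t)\,f(x,t)\bigr].
\end{align}
```

The first line is the Langevin (stochastic differential) equation for the particle position; the second is the corresponding Fokker-Planck equation for its density.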

Content diffusion equation

Introduction of new contents in the system

A content disappears when no more copies are available. The rate at which a content disappears is:

The pdf F(x,t) of the number of copies follows the F.P. equation with boundary conditions for :

Diffusion Parameters

hh = coefficient of variation of the holding time

hr = coefficient of variation of the inter-request time

m(x,t) expresses the average speed at which the content-particle moves along the x axis.

The variance σ²(x,t) expresses the burstiness of the processes.
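To make the roles of the drift and the variance concrete, here is a small Euler-Maruyama sketch of a single content-particle. It is an illustration under stated assumptions, not the authors' solver: `drift` and `sigma` are hypothetical callables standing in for m(x,t) and σ(x,t), whose expressions were not recovered from the slide, and absorption at x = 0 is our way of representing a content that has no copies left.

```python
import math
import random


def simulate_copies(x0, drift, sigma, t_end, dt=0.1, seed=0):
    """Euler-Maruyama path of dx = m(x,t) dt + sigma(x,t) dW, absorbed at x = 0.

    drift(x, t) and sigma(x, t) are placeholders for the model's m and sigma.
    Returns the position at t_end, or 0.0 if the path hit zero earlier.
    """
    rng = random.Random(seed)
    x, t = float(x0), 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Wiener increment over dt: Gaussian with variance dt.
        noise = rng.gauss(0.0, 1.0) * math.sqrt(dt)
        x += drift(x, t) * dt + sigma(x, t) * noise
        t += dt
        if x <= 0.0:
            return 0.0  # no copies left: the content has disappeared
    return x


# Hypothetical example: constant positive drift, state-dependent noise.
final = simulate_copies(
    x0=1.0,
    drift=lambda x, t: 0.05,
    sigma=lambda x, t: 0.3 * math.sqrt(max(x, 0.0)),
    t_end=100.0,
)
print(final)
```

Repeating the simulation over many seeds and counting the paths that return 0.0 gives a Monte Carlo estimate of the disappearance probability for the chosen placeholder functions.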

Case: Content disappearance (1)

In a single-content scenario we study the probability that the content disappears as a function of the users’ dynamics.

Initial condition
Active users = 10
Sleeping users = 10
Available copies = 1

Network parameters
λu = users’ arrival rate = 0.1 users/s
1/μu = avg subscription time = 4000 s
1/μas = avg active period = 400 s
1/μsa = avg sleeping period = 400 s
θ = average request rate
1/μh, 1/μ’h = avg content holding time for active/sleeping users = 100 s

Case: Content disappearance (2)

[Authors' note: Which plot do we show? A comparison of the model with Michele's simulator? The model only?]

Dual distribution

Relations between users’ and contents’ dynamics

The number of active and sleeping users at time t

The number of copies available at time t

Dual equations

Ga(x,t) and Gs(x,t) are the pdfs of the number of active and sleeping users having x contents:

Term annotations on the equations: new users; active users who become sleeping or leave the system; sleeping users who become active.

Diffusion parameters

As for the contents diffusion equation, m(x,t) expresses the average speed at which the copy-particle moves along the x axis, while σ²(x,t) expresses the variance of the associated process.

ra = rate of generation of new copies

da/s = rate of cancellation of existing copies

Multi-contents case (1)

In a multi-content scenario, still assuming ideal search and download, we study the steady-state distribution of the contents among users.

Initial condition
Active users = 2500
Sleeping users = 7500
Available copies = 1

Network parameters
λu = users’ arrival rate = 0 users/s
1/μu = avg subscription time = inf
1/μas = avg active period = 6 h
1/μsa = avg sleeping period = 18 h
θ = average request rate = 2 c/h
λc = contents’ introduction rate = 1/600 c/s
1/μh, 1/μ’h = avg content holding time for active/sleeping users = 10 h, 8 h

Multi-contents case (2)

[Authors' note: Which plots do we show? A comparison of the model with Michele's simulator? The model only?]

The contents’ transfer rate

In a non-ideal p2p system the transfer rate of the contents changes dynamically according to:

the probability of a successful search phit(x,t) (related to content diffusion and the search algorithm)

the probability of a successful download pdown(x,t) (related to network congestion, user impatience, and on-off dynamics)

The effective retrieval rate becomes:

Both the search and the download models require knowledge of F(x,t) and provide the corresponding probability as a function of time.

Search Phase

Search algorithm: flooding in an unstructured p2p network.

For each content request, a query message is forwarded to all the neighbors up to distance max_ttl.

Graph model: the P2P network topology is modeled as a random finite graph.

We consider a Generalized Random Graph (GRG) to allow an arbitrary vertex degree distribution.

[Figure legend: active peer; application-level connection]
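The flooding rule described on this slide can be sketched as a breadth-first search bounded by max_ttl. This is only an illustration: the graph generator below builds a simple Erdős-Rényi style graph as a stand-in, not a Generalized Random Graph with an arbitrary degree distribution, and all function names are hypothetical.

```python
import random
from collections import deque


def random_graph(n, p, seed=0):
    """Adjacency list of a random graph where each edge exists with probability p.

    Illustrative stand-in only; it does not reproduce a GRG degree distribution.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def flood_search(adj, source, holders, max_ttl):
    """Return True if some peer in `holders` is within max_ttl hops of `source`."""
    visited = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_ttl:
            continue  # the query is not forwarded beyond max_ttl hops
        for nb in adj[node]:
            if nb in visited:
                continue
            if nb in holders:
                return True
            visited.add(nb)
            frontier.append((nb, dist + 1))
    return False


# Usage: fraction of peers that find the content with a small TTL.
graph = random_graph(200, 0.02, seed=1)
holders = {0, 50, 100}
hits = sum(flood_search(graph, s, holders, max_ttl=2)
           for s in graph if s not in holders)
print(hits / (len(graph) - len(holders)))
```

Averaging the outcome over many sources, as in the usage lines, gives an empirical hit ratio that can be compared with an analytical hit probability.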

GRG Model

Given the probability distribution {pk} that a vertex has k edges departing from it, we can define the generating function:

It can be shown that the generating function of the number of first neighbors with a copy of the content is:

where α = x/Ua, x = number of copies, Ua = number of active users.

The composition of these generating functions gives the generating function of the number of neighbors at distance h.
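A small numerical sketch of the generating-function idea. It uses the standard definition G0(z) = Σk pk z^k and assumes, as our own simplification, that each neighbor holds a copy independently with probability α = x/Ua, in which case the number of first neighbors with a copy has generating function G0(1 − α + αz). The slide's exact expressions were not recovered, and the function names and the Poisson example are illustrative.

```python
import math


def pgf(pk, z):
    """G0(z) = sum_k p_k z^k for a degree distribution given as a list pk[k]."""
    return sum(p * z**k for k, p in enumerate(pk))


def pgf_first_neighbors_with_copy(pk, alpha, z):
    """Generating function of the number of first neighbors holding a copy.

    Assumes each neighbor holds a copy independently with probability
    alpha = x / Ua (binomial thinning of the degree distribution).
    """
    return pgf(pk, 1.0 - alpha + alpha * z)


def poisson_pk(mean, kmax):
    """Poisson degree distribution truncated at kmax (example input only)."""
    return [math.exp(-mean) * mean**k / math.factorial(k) for k in range(kmax + 1)]


pk = poisson_pk(mean=4.0, kmax=60)
alpha = 0.1  # e.g. x = 10 copies among Ua = 100 active users
# Probability that no first neighbor holds a copy:
print(pgf_first_neighbors_with_copy(pk, alpha, 0.0))
```

Evaluating the thinned generating function at z = 0 gives the probability that none of the first neighbors holds the content, which is the quantity a one-hop search depends on.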

GRG Topology

To compute the pdf of the GRG node degree we adopt an M/M/∞ queue.

Assuming that an external observer joins the network, the number of customers in the queue corresponds to the number of connections established by the observer.

Now we can define the generating function for the number of neighbors at distance up to max_ttl that have a copy of the content:

From this we derive the hit probability:
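Two standard facts help to read this slide, written here in notation of our own choosing (ρ and G_ttl are assumed symbols, not taken from the slide). The stationary number of customers in an M/M/∞ queue is Poisson with mean ρ, the ratio of the arrival rate to the per-customer service rate, so the degree distribution and its generating function take a closed form. And if G_ttl(z) denotes the generating function of the number of peers within max_ttl hops that hold a copy, the probability of finding at least one of them is one minus its value at z = 0.

```latex
\begin{align}
  p_k &= e^{-\rho}\,\frac{\rho^{k}}{k!}, \qquad
  G_0(z) = \sum_{k \ge 0} p_k z^{k} = e^{\rho (z-1)}, \\
  p_{hit} &= 1 - G_{ttl}(0).
\end{align}
```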

Download Phase

Assumptions:
The transport network is ideal.
Bandwidth on the client side is infinite.
The peer from which the desired content is downloaded is chosen randomly among those storing that content.

The dynamics of download at each peer are modelled by an M/G/1-PS queue.

Problem: the download request rate arriving at peers is not known a priori!

It depends on:

The contents’ distribution at peers

The policy used by the system to distribute the load among peers

Probability of successful download (1)

Single Content Case

Let θ be the popularity of a content, present in x copies in a network with Ua active peers.

Download request rate:

Assuming that the requests form a Poisson process, the queue becomes an M/G/1-PS queue with average delay:

Given a download rate y = θsphit, the probability of successful download is:
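As a reminder of how the average delay of an M/G/1-PS queue behaves, the sketch below uses the classical result that the mean sojourn time is E[S]/(1 − ρ), with ρ the arrival rate times the mean service time, independently of the service-time distribution. The function name and the example numbers are hypothetical; the per-peer request rate would come from the model, and its expression was not recovered from the slide.

```python
def mg1_ps_mean_delay(arrival_rate, mean_service_time):
    """Mean sojourn time of an M/G/1 processor-sharing queue: E[S] / (1 - rho)."""
    rho = arrival_rate * mean_service_time
    if rho >= 1.0:
        raise ValueError("unstable queue: utilization rho must be below 1")
    return mean_service_time / (1.0 - rho)


# Hypothetical numbers: 0.005 requests/s reaching a peer, 100 s mean transfer time.
print(mg1_ps_mean_delay(arrival_rate=0.005, mean_service_time=100.0))
```

The delay grows without bound as the utilization approaches 1, which is why the load seen by each peer matters for the probability that a download completes.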

Probability of successful download (2)

Multiple Content Case

From F(x) we derive the probability that a peer has k contents, present in x copies:

(F(x) is the pdf of the number of copies available for the content.)

The overall download request rate seen by a peer is:

The overall probability of successful download is:

Probability of successful download (3)

Since all Z(x) are independent, we can approximate the distribution of Y around its average with a normal distribution.

The probability of successful download becomes:

Notes

my and σy are the first two moments of Y.

The integral is restricted to the interval for numerical reasons.
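A hedged sketch of the kind of computation this slide describes: averaging a per-load success probability over a normal approximation of Y, with the integral truncated to a finite interval. The half-width `k` (in standard deviations), the callable `p_down`, and the function names are all assumptions of this example; the slide's integrand and interval were not recovered.

```python
import math


def normal_pdf(y, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def expected_success(p_down, mean, std, k=4.0, n=2000):
    """Trapezoidal estimate of the integral of p_down(y) * N(y; mean, std)
    over the truncated interval [mean - k*std, mean + k*std]."""
    lo, hi = mean - k * std, mean + k * std
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * p_down(y) * normal_pdf(y, mean, std)
    return total * h


# Hypothetical success curve: downloads succeed less often as the load y grows.
print(expected_success(lambda y: math.exp(-max(y, 0.0)), mean=0.5, std=0.1))
```

Truncating at a few standard deviations discards only the negligible tails of the normal density, which keeps the numerical integration cheap and well behaved.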

Conclusions

We defined a stochastic fluid model of a p2p system able to describe users and contents dynamics both in transient and stationary regime.

A supporting model makes it possible to account for the effects of search and download on system performance.

Work in progress…

Analytical solution of the equations in steady state
Model extension to classes of different users
Model extension to classes of different contents
Comparison between model and simulations in realistic scenarios

Thank you!