Introduction to Network Mathematics (2) - Probability and Queueing


Yuedong Xu, 10/08/2012

Purpose
• All networking systems are stochastic
– Analyzing the performance of a protocol (e.g. TCP), a strategy (e.g. peer selection), a system (e.g. a data center), etc.

Outline
• Probability Basics
• Stochastic Process
• Baby Queueing Theory
• Statistics
• Application to P2P
• Summary

Probability Basics
• Review
– Probability: a way to measure the likelihood that a possible outcome will occur.
• Between 0 and 1
– Events A and B
• A ∪ B: union (A or B)
• A ∩ B: intersection (A and B)
[Figure: Venn diagrams of A ∩ B and A ∪ B]

Probability Basics
• Review (cont'd)
– P(A ∪ B): prob. that either A or B happens
• P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
– P(A|B): prob. that A happens, given that B has happened
• P(A|B) = P(A ∩ B)/P(B)
– P(A ∩ B): prob. that both A and B happen
• P(A ∩ B) = P(A|B)*P(B) = P(B|A)*P(A)

Probability Basics
• If A and B are mutually exclusive
– P(A ∩ B) = 0
– P(A ∪ B) = P(A) + P(B)
– P(A|B) = 0
• If A and B are independent
– P(A ∩ B) = P(A)*P(B)
– P(A ∪ B) = P(A) + P(B) - P(A)*P(B)
– P(A|B) = P(A)
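The union, conditional-probability, and independence rules can be checked by brute-force enumeration. The sketch below assumes a hypothetical sample space (one roll of a fair die) and hypothetical events A, B, C chosen for illustration; none of these come from the slides.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}      # "even"
B = {1, 2, 3, 4}   # "at most four" (independent of A for this die)
C = {1, 3, 5}      # "odd" (mutually exclusive with A)

p_union = prob(A | B)            # P(A ∪ B)
p_inter = prob(A & B)            # P(A ∩ B)
p_a_given_b = p_inter / prob(B)  # P(A|B) = P(A ∩ B)/P(B)

print(p_union, p_inter, p_a_given_b)
```

Exact fractions are used so that the identities hold without floating-point tolerance.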


– Theorem of total probability:
Events {B_i, i = 1, 2, …, k} are mutually exclusive and together cover the sample space. Then
$P(A) = \sum_{i=1}^{k} P(A \mid B_i)\, P(B_i)$


– Bayes' Theorem
Suppose that B_1, B_2, …, B_k form a partition of S:
$B_i \cap B_j = \emptyset \ (i \neq j); \quad \bigcup_{i} B_i = S$
Then
$\Pr(B_i \mid A) = \frac{\Pr(A \mid B_i)\Pr(B_i)}{\Pr(A)} = \frac{\Pr(A \mid B_i)\Pr(B_i)}{\sum_{j=1}^{k}\Pr(A \cap B_j)} = \frac{\Pr(A \mid B_i)\Pr(B_i)}{\sum_{j=1}^{k}\Pr(B_j)\Pr(A \mid B_j)}$
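As a small numeric illustration of Bayes' theorem over a partition, the sketch below computes the posterior Pr(B_i | A) from priors Pr(B_i) and likelihoods Pr(A | B_i). The scenario (a packet takes one of three paths, A is the event "packet lost") and all numbers are hypothetical.

```python
def posterior(priors, likelihoods):
    """Return Pr(B_i | A) for each i, given Pr(B_i) and Pr(A | B_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # Pr(A ∩ B_i)
    total = sum(joint)  # Pr(A), by the theorem of total probability
    return [j / total for j in joint]

# Hypothetical numbers: path choice probabilities and per-path loss probabilities.
priors = [0.5, 0.3, 0.2]
loss_given_path = [0.01, 0.05, 0.10]

post = posterior(priors, loss_given_path)
print(post)  # probability that a lost packet travelled over each path
```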


– A permutation is an ordered arrangement of objects. The number of different permutations of n distinct objects is n! ("n factorial"):
n! = n · (n – 1) · (n – 2) · (n – 3) · … · 3 · 2 · 1
Example: How many different surveys are required to cover all possible question arrangements if there are 7 questions in a survey?
7! = 7 · 6 · 5 · 4 · 3 · 2 · 1 = 5040 surveys

Probability Basics
• Review (cont'd)
The number of permutations of n elements taken r at a time is
$_{n}P_{r} = \frac{n!}{(n-r)!}$
where n is the number in the group and r is the number taken from the group.
Example: You are required to read 5 books from a list of 8. In how many different orders can you do so?
$_{8}P_{5} = \frac{8!}{(8-5)!} = \frac{8!}{3!} = \frac{8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{3 \cdot 2 \cdot 1} = 6720$ ways


A combination is a selection of r objects from a group of n things when order does not matter. The number of combinations of r objects selected from a group of n objects is
$_{n}C_{r} = \frac{n!}{(n-r)!\, r!}$
where n is the number in the collection and r is the number taken from the collection.
Example: You are required to read 5 books from a list of 8. In how many different ways can you do so if the order doesn't matter?
$_{8}C_{5} = \frac{8!}{3!\, 5!} = \frac{8 \cdot 7 \cdot 6 \cdot 5!}{3!\, 5!} = 56$ combinations
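The three counting examples (7 survey questions, 5 books out of 8 with and without order) can be reproduced with Python's standard `math` module, assuming Python 3.8 or later for `math.perm` and `math.comb`. The variable names are illustrative only.

```python
import math

surveys = math.factorial(7)   # orderings of 7 questions: 7!
orders = math.perm(8, 5)      # ordered reading lists: 8!/(8-5)!
choices = math.comb(8, 5)     # unordered selections: 8!/(3! 5!)

print(surveys, orders, choices)
```

Each unordered selection of 5 books can be read in 5! orders, which is why `orders` equals `choices * math.factorial(5)`.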


– Discrete random variable (r.v.)
• Binomial distribution, Poisson distribution, and so on

– Continuous random variable
• Uniform distribution, Normal distribution, Gamma distribution, and so on

Probability Basics

We enter the more advanced phase!

Never get confused by the concepts!

Probability Basics
• Key Concepts
– Probability mass function (pmf)
Used for discrete r.v. Suppose that X: S → A is a discrete r.v. defined on a sample space S. Then the probability mass function f_X: A → [0, 1] for X is defined as
$f_X(x) = \Pr(X = x) = \Pr(\{s \in S : X(s) = x\})$

– Probability density function (pdf)
i) Used for continuous r.v.;
ii) A function that describes the relative likelihood for this r.v. to take on a given value. A random variable X has density f, where f is a non-negative Lebesgue-integrable function, if:
$\Pr[a \le X \le b] = \int_a^b f(x)\, dx$

– Cumulative distribution function (cdf)
For a discrete r.v. X
$F(x) = \Pr(X \le x) = \sum_{x_i \le x} p(x_i)$
For a continuous r.v. X
$F(x) = \Pr(X \le x) = \int_{-\infty}^{x} f(u)\, du$

– Probability generating function (pgf)
i) Used for discrete r.v.
ii) A power series representation of pmf
For a discrete r.v. X taking values in {0, 1, 2, …}
$G(z) = E[z^X] = \sum_{x=0}^{\infty} p(x)\, z^x$
where p is the probability mass function of X.


– Moment generating function (mgf): a way to represent a probability distribution
– What is a "moment" of the r.v. X?
kth moment:
$E[X^k] = \sum_{x} x^k p(x)$ if X is discrete
$E[X^k] = \int_{-\infty}^{\infty} x^k f(x)\, dx$ if X is continuous

– The moment-generating function of r.v. X is
$M_X(t) = E[e^{tX}]$
wherever this expectation exists.
– Why is the mgf extremely important? It gives a unified way to represent the higher-order properties of a r.v., such as expectation, variance, etc.


– In college courses, we learn how to compute
• Mean and variance of a r.v.
• The joint distribution of two or more r.v.s
But they are studied case by case!
– Any unified approach?


– Major properties of mgf
i) Calculating moments
Mean: $E(X) = M_X^{(1)}(0)$
Variance: $V(X) = M_X^{(2)}(0) - \left(M_X^{(1)}(0)\right)^2$

– Major properties of mgf
ii) Calculating the distribution of a sum of r.v.s
Given independent r.v.s X_1 and X_2, and the sum Y = X_1 + X_2, what is the distribution of Y?
If we know the mgfs $M_{X_1}(t)$ and $M_{X_2}(t)$, then
$M_Y(t) = M_{X_1}(t) \cdot M_{X_2}(t)$
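The product property for the mgf of a sum of independent r.v.s can be checked numerically. The sketch below assumes two hypothetical binomial r.v.s with the same success probability, builds the pmf of their sum by explicit convolution, and compares the mgf of the sum with the product of the individual mgfs at one arbitrary point t.

```python
import math

def binomial_pmf(n, p):
    """pmf of a Binomial(n, p) r.v. as a list indexed by the value r."""
    return [math.comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

def convolve(pmf_a, pmf_b):
    """pmf of the sum of two independent non-negative integer r.v.s."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out

def mgf(pmf, t):
    """M(t) = E[e^{tX}] for a pmf given as a list indexed by value."""
    return sum(math.exp(t * x) * px for x, px in enumerate(pmf))

x1 = binomial_pmf(3, 0.4)   # hypothetical X1
x2 = binomial_pmf(5, 0.4)   # hypothetical X2, independent of X1
y = convolve(x1, x2)        # pmf of Y = X1 + X2, the "hard way"

t = 0.7                     # arbitrary evaluation point
lhs = mgf(y, t)
rhs = mgf(x1, t) * mgf(x2, t)
print(lhs, rhs)
```

With equal p, the sum is itself Binomial(8, 0.4), which gives a second check on the convolution.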

Probability Basics
• Commonly Used Distributions
– Binomial distribution: if there are only two possible outcomes (call them 1/0, yes/no, or success/failure) in each of n independent trials, then the probability of exactly r "successes" is
$P(X = r) = \binom{n}{r} p^r (1-p)^{n-r}$
where
p = probability of success
1 - p = probability of failure
r = # successes out of n trials


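– mgf of Binomial distribution:
$M_X(t) = E[e^{tX}] = \sum_{r=0}^{n} e^{tr} \binom{n}{r} p^r (1-p)^{n-r}$
$\quad = \sum_{r=0}^{n} \binom{n}{r} (pe^t)^r (1-p)^{n-r} = (pe^t + 1 - p)^n$
$M_X'(t) = n (pe^t + 1 - p)^{n-1} pe^t$
$M_X''(t) = n(n-1)(pe^t + 1 - p)^{n-2} (pe^t)^2 + n (pe^t + 1 - p)^{n-1} pe^t$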

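– mgf of Binomial distribution (cont'd), with $M_X(t) = (pe^t + 1 - p)^n$:
$E(X) = M_X'(0) = n(p + 1 - p)^{n-1} p = np$
$E(X^2) = M_X''(0) = n(n-1)(p+1-p)^{n-2}p^2 + n(p+1-p)^{n-1}p = n(n-1)p^2 + np = np[(n-1)p + 1] = n^2p^2 - np^2 + np$
$V(X) = E(X^2) - (E(X))^2 = n^2p^2 - np^2 + np - n^2p^2 = np - np^2 = np(1-p)$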
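The moment calculation for the binomial mgf, M(t) = (pe^t + 1 - p)^n, can be sanity-checked by differentiating it numerically at t = 0. This is a rough sketch using central finite differences (an approximation chosen here for simplicity, not a method from the slides); n and p are hypothetical values.

```python
import math

def binomial_mgf(t, n, p):
    """M_X(t) = (p e^t + 1 - p)^n for X ~ Binomial(n, p)."""
    return (p * math.exp(t) + 1 - p) ** n

def derivative_at_zero(f, order, h=1e-3):
    """Central finite-difference estimate of f'(0) or f''(0)."""
    if order == 1:
        return (f(h) - f(-h)) / (2 * h)
    if order == 2:
        return (f(h) - 2 * f(0.0) + f(-h)) / h**2
    raise ValueError("only first and second derivatives are supported")

n, p = 10, 0.3  # hypothetical parameters

def M(t):
    return binomial_mgf(t, n, p)

mean = derivative_at_zero(M, 1)       # approximates E(X) = np
second = derivative_at_zero(M, 2)     # approximates E(X^2)
variance = second - mean**2           # approximates np(1 - p)
print(mean, variance)
```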


– Exponential distribution: a continuous r.v. whose pdf and cdf are
$f(t) = \lambda e^{-\lambda t}; \quad F(t) = P(X \le t) = 1 - e^{-\lambda t}, \quad t \ge 0$
– Example: 1/λ is the mean waiting time for the next bus if the time until the next bus arrives is exponentially distributed.


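– mgf of exponential distribution (written in terms of the mean θ = 1/λ):
$M_X(t) = E[e^{tX}] = \int_0^\infty e^{tx}\, \frac{1}{\theta} e^{-x/\theta}\, dx = \frac{1}{\theta}\int_0^\infty e^{-x(1-\theta t)/\theta}\, dx$
$\quad = \frac{1}{\theta}\int_0^\infty e^{-x/\theta^*}\, dx \quad \text{where } \theta^* = \frac{\theta}{1-\theta t},\ t < \frac{1}{\theta}$
$\quad = \frac{\theta^*}{\theta}\left[-e^{-x/\theta^*}\right]_0^\infty = \frac{1}{1-\theta t}(0 + 1) = (1-\theta t)^{-1}$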


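– mgf of exponential distribution (cont'd), with $M_X(t) = (1-\theta t)^{-1}$ and mean θ = 1/λ:
$M_X'(t) = -1(1-\theta t)^{-2}(-\theta) = \theta(1-\theta t)^{-2}$
$M_X''(t) = -2\theta(1-\theta t)^{-3}(-\theta) = 2\theta^2(1-\theta t)^{-3}$
$E(X) = M_X'(0) = \theta$
$V(X) = M_X''(0) - (M_X'(0))^2 = 2\theta^2 - \theta^2 = \theta^2$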

– Continuous r.v.

Name               Moment generating function
Uniform on [a, b]  $M_X(t) = \int_a^b \frac{e^{tx}}{b-a}\, dx = \frac{e^{tb} - e^{ta}}{t(b-a)}$
Normal             $e^{\mu t + \frac{1}{2}\sigma^2 t^2}$
Gamma              $(1 - \theta t)^{-k}$


– Discrete r.v.

Name       Moment generating function
Bernoulli  $1 - p + pe^t$
Poisson    $e^{\lambda(e^t - 1)}$
Geometric  $\frac{pe^t}{1 - (1-p)e^t}$

Probability Basics
• Advanced Distributions in Networking
– Power law distribution
$P[X = x] \sim c\, x^{-\alpha}$
– Intuitive meaning:
• The prob. that you have 1 billion USD is extremely small (continuous example)
• Lin Dan (the badminton player ranked x = 1) gets much more media exposure than an unknown player ranked x = 10 (discrete example)



Probability Basics
• Examples of power-law
a. Word frequency
b. Paper citations
c. Web hits
d. P2P file popularity
e. Wealth of the richest people
f. Frequencies of surnames
g. Populations of cities

Probability Basics
• Laplace and Z-transform
– The Laplace transform is essentially the m.g.f. of a non-negative r.v.
– The Z-transform (ZT) is essentially the m.g.f. of a discrete r.v.
• The purpose is to compute the distribution of r.v.s in an easier way

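Probability Basics
• Laplace transform
– The moments can again be determined by differentiation, where $L_X(s) = E[e^{-sX}]$:
$E[X^k] = (-1)^k \left.\frac{d^k L_X(s)}{ds^k}\right|_{s=0}, \quad k = 1, 2, \ldots$
– The LT of a sum of independent r.v.s is the product of the LTs: for $X = X_1 + \cdots + X_n$,
$L_X(s) = \prod_{i=1}^{n} L_{X_i}(s)$
No need to compute the convolutions one by one!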

Probability Basics
• Take home messages
– The moment generating function is vital in computing probability distributions
– The Laplace transform (and Z-transform) has many applications

Probability Basics
• Sub-summary
– Review basic knowledge of probability
– Highlight important concepts
– Review some commonly used distributions
– Introduce Laplace and Z transforms


Stochastic Process
• Concepts
– Random variable: a standalone variable
– Stochastic process: a stochastic process X(t) is a family of random variables indexed by a time parameter t
[Figure: a sample path of X(t) plotted against time; X(t) is a random variable for each fixed t]

Stochastic Process
To be more accurate,
• A stochastic process N = {N(t), t ∈ T} is a collection of r.v.s, i.e., for each t in the index set T, N(t) is a random variable
– t: time
– N(t): state at time t
– If T is a countable set, N is a discrete-time stochastic process
– If T is continuous, N is a continuous-time stochastic process

Stochastic Process
Counting process
• A stochastic process {N(t), t ≥ 0} is said to be a counting process if N(t) is the total number of events that have occurred up to time t. Hence, some properties of a counting process are
– N(t) ≥ 0
– N(t) is integer valued
– If s < t, N(t) ≥ N(s)
– For s < t, N(t) – N(s) equals the number of events occurring in the interval (s, t]

Stochastic Process
Poisson process
• Def. A: the counting process {N(t), t ≥ 0} is said to be a Poisson process having rate λ, λ > 0, if
– N(0) = 0;
– The process has independent increments
– The number of events in any interval of length t is Poisson distributed with mean λt, that is, for all s, t ≥ 0,
$P[N(t+s) - N(s) = n] = e^{-\lambda t}\, \frac{(\lambda t)^n}{n!}, \quad n = 0, 1, 2, \ldots$
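A Poisson process of rate λ can be simulated by summing independent exponential inter-arrival times with mean 1/λ. That equivalence is a standard characterization that is assumed here rather than stated on the slide. The sketch below compares the empirical mean count over an interval of length t with λt; the rate, interval length, seed, and number of runs are all hypothetical.

```python
import math
import random

def poisson_pmf(n, lam, t):
    """P[N(t+s) - N(s) = n] for a Poisson process of rate lam."""
    return math.exp(-lam * t) * (lam * t) ** n / math.factorial(n)

def count_arrivals(lam, t, rng):
    """Count events in (0, t] by summing exponential inter-arrival times."""
    clock, count = rng.expovariate(lam), 0
    while clock <= t:
        count += 1
        clock += rng.expovariate(lam)
    return count

rng = random.Random(2012)          # fixed seed so the run is repeatable
lam, t, runs = 2.0, 3.0, 20000     # hypothetical rate, interval length, sample size
samples = [count_arrivals(lam, t, rng) for _ in range(runs)]
empirical_mean = sum(samples) / runs

print(empirical_mean, lam * t)     # the two values should be close
```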

Stochastic Process
• Markov process
– Q: What is a Markov process? Is it a new process?
– A: No, it refers to any stochastic process that satisfies the Markov property!

Stochastic Process
• Markov process
$P[X(t_{n+1}) \le x_{n+1} \mid X(t_n) = x_n, X(t_{n-1}) = x_{n-1}, \ldots, X(t_1) = x_1] = P[X(t_{n+1}) \le x_{n+1} \mid X(t_n) = x_n]$
– The probabilistic future of the process depends only on the current state, not on the history
– We are mostly concerned with discrete-space Markov processes, commonly referred to as Markov chains
– Discrete-time Markov chains
– Continuous-time Markov chains

Stochastic Process
• Discrete Time Markov Chain
– $P[X_{n+1} = j \mid X_n = k_n, X_{n-1} = k_{n-1}, \ldots, X_0 = k_0] = P[X_{n+1} = j \mid X_n = k_n]$
– discrete time, discrete space
– a finite-state DTMC if its state space is finite
– a homogeneous DTMC if $P[X_{n+1} = j \mid X_n = i]$ does not depend on n for all i, j, i.e., $P_{ij} = P[X_{n+1} = j \mid X_n = i]$, where $P_{ij}$ is the one-step transition prob.

Stochastic Process
• Discrete Time Markov Chain
P = [P_ij] is the transition matrix (rows: current state; columns: next state)

      A     B     C     D
A     0     1     0     0
B     0.8   0     0.2   0
C     0.3   0     0.5   0.2
D     0     0.05  0     0.95

[Figure: representation as a directed graph over states A, B, C, D, with edges labeled by transition probabilities]

Stochastic Process
• Continuous Time Markov Chain
– Continuous time, discrete state
– $P[X(t) = j \mid X(s) = i, X(s_{n-1}) = i_{n-1}, \ldots, X(s_0) = i_0] = P[X(t) = j \mid X(s) = i]$
– A continuous-time M.C. is homogeneous if
• $P[X(t+u) = j \mid X(s+u) = i] = P[X(t) = j \mid X(s) = i] = P_{ij}[t-s]$, where t > s
– Chapman-Kolmogorov equation
For all t > 0, s > 0, i, j ∈ I
$p_{ij}(t+s) = \sum_{k \in I} p_{ik}(t)\, p_{kj}(s)$


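P = [P_ij] is called the intensity matrix (off-diagonal entries are transition rates; each row sums to 0)

      A     B     C     D
A    -1.2   1.2   0     0
B     0.8  -1     0.2   0
C     0.3   0    -0.5   0.2
D     0     0.1   0    -0.1

[Figure: representation as a directed graph over states A, B, C, D, with edges labeled by transition rates]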


– Irreducible Markov chain: a Markov chain is irreducible if the corresponding graph is strongly connected.
[Figure: two example graphs, one over states A, B, C, D, E and one over states A, B, C, D, labeled "irreducible" and "reducible"]

Stochastic Process
• Continuous Time Markov Chain
• Ergodic Markov chain: a Markov chain is ergodic if i) its graph is strongly connected; ii) it is not periodic.
[Figure: graph over states A, B, C, D, E with periodic behavior in the transitions A->B->C->D: not ergodic]
[Figure: graph over states A, B, C, D: ergodic]
Ergodic Markov chains are important because the corresponding Markov process is guaranteed to converge to a unique distribution in which all states have strictly positive probability.

Stochastic Process
• Steady State - DTMC:
Let π = (π1, π2, …, πm) be the m-dimensional row vector of steady-state (unconditional) probabilities for the state space S = {1, …, m} (e.g. m = 3):
$(\pi_1, \pi_2, \pi_3) = (\pi_1, \pi_2, \pi_3) \begin{pmatrix} 0.90 & 0.07 & 0.03 \\ 0.02 & 0.82 & 0.16 \\ 0.20 & 0.12 & 0.68 \end{pmatrix}$
where the matrix holds the transition probabilities, together with
π1 + π2 + π3 = 1,
π1 ≥ 0, π2 ≥ 0, π3 ≥ 0
Solve the linear system: π = πP, Σ_j πj = 1, πj ≥ 0, j = 1, …, m
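One simple way to approximate the steady-state vector of the 3-state example is to apply π ← πP repeatedly until it stops changing. This power-iteration sketch is an alternative to solving the linear system directly; the helper names, tolerance, and iteration cap are illustrative choices, and only the matrix entries come from the slide.

```python
# Transition matrix of the 3-state example (each row sums to 1).
P = [
    [0.90, 0.07, 0.03],
    [0.02, 0.82, 0.16],
    [0.20, 0.12, 0.68],
]

def step(pi, P):
    """One update pi -> pi P (row vector times matrix)."""
    m = len(P)
    return [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]

def steady_state(P, tol=1e-12, max_iter=100000):
    """Iterate pi <- pi P from the uniform distribution until convergence."""
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(max_iter):
        nxt = step(pi, P)
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")

pi = steady_state(P)
print(pi)
```

The iteration converges here because every entry of P is positive, so the chain is ergodic.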

Stochastic Process
• Steady State – CTMC
– The computation is based on the flow balance equations.
– This will be covered in the following slides on baby queueing theory

Stochastic Process
• Sub-summary
– A stochastic process is a collection of r.v.s indexed by time
– A Markov process is a stochastic process whose future depends only on the current state.


Baby Queueing Theory
• Queueing theory is the most important tool (not one of) to evaluate the performance of computing systems
• (Kleinrock) "We study the phenomena of standing, waiting, and serving, and we call this study Queueing Theory." "Any system in which arrivals place demands upon a finite capacity resource may be termed a queueing system."

Baby Queueing Theory
• You want quick and insightful answers about
– Delay
– Delay variation (jitter)
– Packet loss
– Efficient sharing of bandwidth
– Performance of various traffic types (audio/video, file transfer, interactive)
– Call rejection rate
– Performance of packet/flow scheduling
– And so on ……

Baby Queueing Theory
• Our slides will cover
– Basic terms of queueing theory
– Basic queueing models
– Basic analytical approaches and results
– Basic knowledge of queueing networks
– Application to P2P networks

Baby Queueing Theory
• Basic terms
Arrival and service are stochastic processes
[Figure: a queueing system, in which customers arrive, wait in the queue, and are served by the server]

Baby Queueing Theory
• Basic terms
A/B/m/K/N
– A: Arrival Process
• M: Markovian
• D: Deterministic
• Er: Erlang
• G: General
– B: Service Process
• M: Markovian
• D: Deterministic
• Er: Erlang
• G: General
– m: Number of servers, m = 1, 2, …
– K: Storage Capacity, K = 1, 2, … (if ∞ then it is omitted)
– N: Number of customers, N = 1, 2, … (for closed networks; otherwise it is omitted)

Baby Queueing Theory
• Basic terms
• We are interested in steady state behavior
– Even though it is possible to pursue transient results, it is a significantly more difficult task.
• E[S]: average system time (average time spent in the system)
• E[W]: average waiting time (average time spent waiting in queue(s))
• E[X]: average queue length
• E[U]: average utilization (fraction of time that the resources are being used)
• E[R]: average throughput (rate at which customers leave the system)
• E[L]: average customer loss (rate at which customers are lost, or probability that a customer is lost)

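Baby Queueing Theory
• M/M/1 – Steady state
Meaning: Poisson arrivals, exponentially distributed service times, one server and infinite capacity buffer.
[Figure: birth-death chain over states 0, 1, 2, …, j-1, j, …, with arrival rate λ_j from state j to state j+1 and service rate μ_j from state j to state j-1 (here, λ_j = λ and μ_j = μ)]
At steady state, we obtain (due to flow balance)
$\lambda_0 \pi_0 - \mu_1 \pi_1 = 0 \;\Rightarrow\; \pi_1 = \frac{\lambda_0}{\mu_1}\, \pi_0$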

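Baby Queueing Theory
• M/M/1 – Steady state
In general
$\lambda_{j-1}\pi_{j-1} + \mu_{j+1}\pi_{j+1} - (\lambda_j + \mu_j)\pi_j = 0 \;\Rightarrow\; \pi_j = \frac{\lambda_0 \cdots \lambda_{j-1}}{\mu_1 \cdots \mu_j}\, \pi_0$
Making the sum equal to 1
$\pi_0 \left(1 + \sum_{j=1}^{\infty} \frac{\lambda_0 \cdots \lambda_{j-1}}{\mu_1 \cdots \mu_j}\right) = 1$
Solution exists if
$S = 1 + \sum_{j=1}^{\infty} \frac{\lambda_0 \cdots \lambda_{j-1}}{\mu_1 \cdots \mu_j} < \infty$
Letting λ_j = λ and μ_j = μ, we have
$\pi_0 \left(1 + \sum_{j=1}^{\infty} \left(\frac{\lambda}{\mu}\right)^j\right) = 1$
for λ/μ = ρ < 1
$\pi_0 = 1 - \rho, \quad \pi_j = (1-\rho)\rho^j, \quad j = 1, 2, \ldots$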

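Baby Queueing Theory
• M/M/1 - Performance
Server Utilization
$E[U] = \sum_{j=1}^{\infty} \pi_j = 1 - \pi_0 = \rho$
Throughput
$E[R] = \mu \sum_{j=1}^{\infty} \pi_j = \mu(1 - \pi_0) = \lambda$
Expected Queue Length
$E[X] = \sum_{j=0}^{\infty} j\pi_j = (1-\rho)\sum_{j=0}^{\infty} j\rho^j = (1-\rho)\rho\, \frac{d}{d\rho}\left(\sum_{j=0}^{\infty}\rho^j\right) = (1-\rho)\rho\, \frac{d}{d\rho}\left(\frac{1}{1-\rho}\right) = \frac{\rho}{1-\rho}$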

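Baby Queueing Theory
• M/M/1 - Performance
Average System Time
$E[X] = \lambda E[S] \;\Rightarrow\; E[S] = \frac{1}{\lambda}E[X] = \frac{1}{\lambda}\cdot\frac{\rho}{1-\rho} = \frac{1}{\mu(1-\rho)}$
Average waiting time in queue (Z denotes the service time, E[Z] = 1/μ)
$E[S] = E[W] + E[Z] \;\Rightarrow\; E[W] = E[S] - E[Z] = \frac{1}{\mu(1-\rho)} - \frac{1}{\mu} = \frac{\rho}{\mu(1-\rho)}$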
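The M/M/1 steady-state results (ρ = λ/μ, E[X] = ρ/(1-ρ), E[S] = 1/(μ-λ), E[W] = E[S] - 1/μ) fit in one small helper. This is a sketch with a hypothetical function name and hypothetical input rates, not code from the lecture.

```python
def mm1_metrics(lam, mu):
    """Steady-state M/M/1 metrics for arrival rate lam and service rate mu."""
    if lam >= mu:
        raise ValueError("steady state requires lam < mu (rho < 1)")
    rho = lam / mu
    mean_jobs = rho / (1 - rho)            # E[X]
    mean_system_time = 1 / (mu - lam)      # E[S] = 1 / (mu (1 - rho))
    mean_wait = mean_system_time - 1 / mu  # E[W] = E[S] - E[Z]
    return {
        "utilization": rho,                # E[U]
        "throughput": lam,                 # E[R]
        "mean_jobs": mean_jobs,
        "mean_system_time": mean_system_time,
        "mean_wait": mean_wait,
    }

m = mm1_metrics(lam=0.4, mu=0.5)  # hypothetical rates, rho = 0.8
print(m)
```

The output satisfies Little's law, E[X] = λE[S], which makes a convenient consistency check.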

Baby Queueing Theory
• M/M/1 - Example
[Figure: E[X], E[W], and E[S] plotted against rho = λ/μ from 0 to 1, with μ = 0.5; vertical axis: delay (time units) / number of customers, from 0 to 40]

Baby Queueing Theory
• Little's Law – obtaining delay
a(t): the process that counts the number of arrivals up to t.
d(t): the process that counts the number of departures up to t.
N(t) = a(t) - d(t): the number of customers in the system at time t.
γ(t): the area between a(t) and d(t) up to time t.
[Figure: a(t) and d(t) plotted against time t; the area between the two curves is γ(t)]
Average arrival rate (up to t): λ_t = a(t)/t
Average time each customer spends in the system: T_t = γ(t)/a(t)
Average number in the system: N_t = γ(t)/t

Baby Queueing Theory
• Little's Law – obtaining delay
$N_t = \lambda_t T_t$
Taking the limit as t goes to infinity
$E[N] = \lambda E[T]$
where E[N] is the expected number of customers in the system, λ is the arrival rate into the system, and E[T] is the expected time in the system.
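The bookkeeping behind Little's law can be traced on a tiny sample path. The arrival and departure times below are hypothetical, and the sketch assumes every customer has left by the end of the observation window, so that the area γ(t) under N(t) equals the sum of the individual times in system.

```python
# Hypothetical sample path: customer i arrives at arrivals[i], leaves at departures[i].
arrivals = [0.0, 1.0, 1.5, 4.0]
departures = [2.0, 3.0, 3.5, 6.0]
t_end = 6.0  # observation window (0, t_end]; the system is empty at t_end

a_t = len(arrivals)                                        # a(t): arrivals up to t
gamma = sum(d - a for a, d in zip(arrivals, departures))   # area under N(t)

lam_t = a_t / t_end   # average arrival rate up to t
T_t = gamma / a_t     # average time each customer spends in the system
N_t = gamma / t_end   # average number in the system

print(N_t, lam_t * T_t)  # both sides of N_t = lam_t * T_t
```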

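Baby Queueing Theory
• M/M/m – Steady state
Meaning: Poisson arrivals, exponentially distributed service times, m identical servers and infinite buffer.
[Figure: birth-death chain over states 0, 1, 2, …, m, m+1, …, with arrival rate λ between consecutive states and service rates μ, 2μ, 3μ, …, mμ, mμ, …]
$\lambda_j = \lambda \quad \text{and} \quad \mu_j = \begin{cases} j\mu & \text{if } 0 \le j < m \\ m\mu & \text{if } j \ge m \end{cases}$

Baby Queueing Theory
• M/M/m – Steady state
– The analysis can be done using flow balance equations (in the same way as M/M/1)
– How can we compare M/M/1 to M/M/m? What are the insights we can get?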

Baby Queueing Theory
• M/M/m vs M/M/1
Suppose that customers arrive according to a Poisson process with rate λ = 1. You are given three options:
A. Install a single server with processing capacity μ1 = 1.5
B. Install two identical servers with processing capacities μ2 = 0.75 and μ3 = 0.75, fed from one shared queue
C. Split the incoming traffic into two queues, each with probability 0.5, and have μ2 = 0.75 and μ3 = 0.75 serve one queue each
[Figure: the three systems A, B, and C]

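Baby Queueing Theory
• M/M/m vs M/M/1
Throughput
It is easy to see that all three systems have the same throughput: E[R_A] = E[R_B] = E[R_C] = λ
Server Utilization
$E[U_A] = \frac{\lambda}{\mu_1} = \frac{1}{1.5} = \frac{2}{3}$
$E[U_B] = \frac{\lambda}{\mu_2} = \frac{1}{0.75} = \frac{4}{3}$
Therefore, each server is 2/3 utilized
$E[U_C] = \frac{0.5\lambda}{\mu_2} = \frac{0.5}{0.75} = \frac{2}{3}$
Therefore, all servers are similarly loaded.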

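Baby Queueing Theory
• M/M/m vs M/M/1
Probability of being idle
$\pi_{0A} = 1 - \frac{\lambda}{\mu_1} = \frac{1}{3}$
For each server in C
$\pi_{0C} = 1 - \frac{\lambda/2}{\mu_2} = \frac{1}{3}$
For B, with m = 2 and ρ = λ/(mμ_2) = 2/3,
$\pi_{0B} = \left[1 + \sum_{j=1}^{m-1}\frac{(m\rho)^j}{j!} + \frac{(m\rho)^m}{m!}\cdot\frac{1}{1-\rho}\right]^{-1} = \left[1 + \frac{4}{3} + \frac{(4/3)^2}{2}\cdot\frac{1}{1 - 2/3}\right]^{-1} = \frac{1}{5}$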

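Baby Queueing Theory
• M/M/m vs M/M/1
Queue length and delay
$E[X_A] = \frac{\lambda}{\mu_1 - \lambda} = \frac{1}{1.5 - 1} = 2$
With m = 2, ρ = 2/3, and π_{0B} = 1/5,
$E[X_B] = m\rho + \frac{(m\rho)^m \rho}{m!\,(1-\rho)^2}\,\pi_{0B} = \frac{12}{5}$
For each queue in C
$E[X_C] = \frac{\lambda/2}{\mu_2 - \lambda/2} = \frac{0.5}{0.75 - 0.5} = 2$
so the total number in system C is $E[\tilde{X}_C] = 2E[X_C] = 4$
By Little's law:
$E[S_A] = \frac{1}{\lambda}E[X_A] = 2$
$E[S_B] = \frac{1}{\lambda}E[X_B] = \frac{12}{5}$
$E[S_C] = \frac{1}{\lambda}E[\tilde{X}_C] = 4$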
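The three options (A: one fast server, B: two slow servers sharing one queue, C: two slow servers with separate queues) can be compared numerically with the standard M/M/m steady-state formulas. The helper below is a sketch with a hypothetical name; it treats M/M/1 as the special case m = 1 and models option C as two independent M/M/1 queues, each fed at rate λ/2.

```python
import math

def mmm_metrics(lam, mu, m):
    """Return (pi0, E[X], E[S]) for an M/M/m queue with per-server rate mu."""
    rho = lam / (m * mu)
    if rho >= 1:
        raise ValueError("steady state requires lam < m * mu")
    a = lam / mu  # offered load, equal to m * rho
    pi0 = 1.0 / (
        sum(a**j / math.factorial(j) for j in range(m))
        + a**m / (math.factorial(m) * (1 - rho))
    )
    mean_jobs = a + pi0 * a**m * rho / (math.factorial(m) * (1 - rho) ** 2)
    return pi0, mean_jobs, mean_jobs / lam  # E[S] via Little's law

lam = 1.0
pi0_a, x_a, s_a = mmm_metrics(lam, 1.5, 1)            # option A
pi0_b, x_b, s_b = mmm_metrics(lam, 0.75, 2)           # option B
pi0_c, x_c_each, s_c = mmm_metrics(lam / 2, 0.75, 1)  # one queue of option C
x_c = 2 * x_c_each                                    # both queues of option C

print(s_a, s_b, s_c)  # mean system times of A, B, C
```

The ordering of the three delays illustrates the usual insight: one fast server beats two pooled slow servers, which in turn beat two separate queues.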

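Baby Queueing Theory
• M/M/1/K
Meaning: Poisson arrivals, exponentially distributed service times, one server and finite capacity buffer K.
[Figure: birth-death chain over states 0, 1, 2, …, K-1, K, with arrival rate λ and service rate μ between consecutive states]
Using the birth-death result with λ_j = λ and μ_j = μ, we obtain
$\pi_j = \rho^j \pi_0, \quad j = 0, 1, 2, \ldots, K$
for λ/μ = ρ. Therefore (for ρ ≠ 1)
$\pi_0 = \frac{1}{\sum_{j=0}^{K}\rho^j} = \frac{1-\rho}{1-\rho^{K+1}}$
$\pi_j = \frac{(1-\rho)\rho^j}{1-\rho^{K+1}}, \quad j = 1, 2, \ldots, K$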

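Baby Queueing Theory
• M/M/1/K - Performance
Server Utilization
$E[U] = 1 - \pi_0 = 1 - \frac{1-\rho}{1-\rho^{K+1}} = \frac{\rho(1-\rho^K)}{1-\rho^{K+1}}$
Throughput
$E[R] = \mu(1 - \pi_0) = \frac{\lambda(1-\rho^K)}{1-\rho^{K+1}}$
Blocking Probability
$P_B = \pi_K = \frac{(1-\rho)\rho^K}{1-\rho^{K+1}}$
Probability that an arriving customer finds the queue full (at state K)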

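Baby Queueing Theory
• M/M/1/K - Performance
Expected Queue Length
$E[X] = \sum_{j=0}^{K} j\pi_j = \frac{1-\rho}{1-\rho^{K+1}}\sum_{j=0}^{K} j\rho^j = \frac{(1-\rho)\rho}{1-\rho^{K+1}}\,\frac{d}{d\rho}\left(\sum_{j=0}^{K}\rho^j\right)$
$\quad = \frac{(1-\rho)\rho}{1-\rho^{K+1}}\,\frac{d}{d\rho}\left(\frac{1-\rho^{K+1}}{1-\rho}\right) = \frac{\rho\left[1-(K+1)\rho^{K}+K\rho^{K+1}\right]}{(1-\rho)(1-\rho^{K+1})}$
System time
$E[X] = \lambda(1 - \pi_K)E[S]$
where λ(1 − π_K) is the net arrival rate (no losses).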
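The M/M/1/K quantities can be computed directly from the truncated geometric distribution π_j ∝ ρ^j, j = 0, …, K, which also covers the case ρ = 1 where the closed forms divide by zero. The function name and the input values below are hypothetical.

```python
def mm1k_metrics(lam, mu, K):
    """Steady-state M/M/1/K metrics computed from pi_j proportional to rho^j."""
    rho = lam / mu
    weights = [rho**j for j in range(K + 1)]
    norm = sum(weights)
    pi = [w / norm for w in weights]

    blocking = pi[K]                      # P_B = pi_K
    utilization = 1 - pi[0]               # E[U]
    throughput = mu * utilization         # E[R]
    mean_jobs = sum(j * pj for j, pj in enumerate(pi))    # E[X]
    mean_system_time = mean_jobs / (lam * (1 - blocking))  # Little's law with net arrival rate
    return {
        "pi": pi,
        "blocking": blocking,
        "utilization": utilization,
        "throughput": throughput,
        "mean_jobs": mean_jobs,
        "mean_system_time": mean_system_time,
    }

r = mm1k_metrics(lam=1.0, mu=2.0, K=3)  # hypothetical: rho = 0.5, room for 3 customers
print(r["blocking"], r["throughput"], r["mean_jobs"])
```

In steady state the throughput must equal the net arrival rate λ(1 - π_K), which the result can be checked against.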

Baby Queueing Theory
• More difficult queueing models
– M/G/1
– G/M/1
– G/G/1
In other words, if the inter-arrival time or the service time follows a more general distribution, the performance analysis is more challenging.
Then, we may use various approximation techniques to obtain the asymptotic behavior.

Baby Queueing Theory
• Queueing Networks
– A single queue is usually not enough to model complicated job scheduling or packet delivery
– Queueing network: a model in which jobs departing from one queue arrive at another queue (or possibly the same queue)

Baby Queueing Theory
• Open queueing network
– Jobs arrive from external sources, circulate, and eventually depart
– What is the delay of traversing multiple queues?

Baby Queueing Theory
• Closed queueing network
– Machine repairman problem

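Baby Queueing Theory
• Example 1 – Tandem network
– k M/M/1 queues in series
– Each individual queue can be analyzed independently of other queues
– Arrival rate = λ. If μ_i is the service rate for the ith server:
$\rho_i = \frac{\lambda}{\mu_i}, \quad P(n_i \text{ jobs at the } i\text{th queue}) = (1-\rho_i)\rho_i^{n_i}$

Baby Queueing Theory
• Example 1 – Tandem network
Joint probability of queue lengths:
$P(n_1, n_2, \ldots, n_k) = \prod_{i=1}^{k}(1-\rho_i)\rho_i^{n_i}$
product form network!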
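For a tandem of k M/M/1 queues with common arrival rate λ and service rates μ_i, the product form means the joint probability of the queue lengths is the product of the per-queue M/M/1 probabilities (1 - ρ_i)ρ_i^{n_i}, with ρ_i = λ/μ_i. The sketch below assumes that standard result, and adds the mean end-to-end delay as the sum of the per-queue M/M/1 system times; function names and rates are hypothetical.

```python
def tandem_joint_probability(lam, mus, lengths):
    """P(n_1, ..., n_k) for k M/M/1 queues in series (product form)."""
    prob = 1.0
    for mu, n in zip(mus, lengths):
        rho = lam / mu
        if rho >= 1:
            raise ValueError("each queue needs lam < mu")
        prob *= (1 - rho) * rho**n
    return prob

def tandem_mean_delay(lam, mus):
    """Mean end-to-end delay: sum of the per-queue M/M/1 system times 1/(mu_i - lam)."""
    return sum(1.0 / (mu - lam) for mu in mus)

lam, mus = 1.0, [2.0, 4.0]  # hypothetical two-stage tandem
p_empty = tandem_joint_probability(lam, mus, [0, 0])
delay = tandem_mean_delay(lam, mus)
print(p_empty, delay)
```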

Baby Queueing Theory
• Insights
– Queueing networks are in general very difficult to analyze, even intractable!
– If each queue can be analyzed independently, we may be lucky enough to analyze the queueing network in product form!
– Next objective: what kinds of queues have this product-form property?

Baby Queueing Theory
• Jackson networks
Jackson (1963) showed that any arbitrary open network of m-server queues with exponentially distributed service times has a product form.
In general, the internal flow in such networks is not Poisson, in particular when there are feedbacks in the network.

Baby Queueing Theory
• BCMP networks
– Gordon and Newell (1967) showed that any arbitrary closed network of m-server queues with exponentially distributed service times also has a product form solution
– Baskett, Chandy, Muntz, and Palacios (1975) showed that product form solutions exist for an even broader class of networks (whether open or closed)

Baby Queueing Theory
• BCMP networks
– k servers
– R ≥ 1 classes of customers
– Customers may change class
$p_{ir,js}$ = Pr[a customer of class r completing service at node i moves to node j as a customer of class s]
$\mu_{ir}$ = the mean service rate for class r at node i
Allowing class changes means that a customer can have different mean service rates for different visits to the same node.

Baby Queueing Theory
• BCMP networks
Servers may be of only four types:
– First-come-first-served (FCFS)
– Processor sharing (PS)
– Infinite servers (IS or delay centers), and
– Last-come-first-served-preemptive-resume (LCFS-PR)
Still quite limited!

Baby Queueing Theory
• Relationships of queueing networks
[Figure: nested classes of networks, from outermost to innermost: Product Form Networks, Denning & Buzen, BCMP, Jackson]

Baby Queueing Theory
• Sub-summary
– Little's law: mean delay = mean # of jobs / arrival rate
– Flow balance approach to solve CTMC
– Classic queueing models and their performance
– Only product-form queueing networks are tractable to analyze


Statistics


Summary
• Basic knowledge of probability
– Moment generating function, Laplace transform
• Basic stochastic processes
– Solving steady state of Markov chain
• Baby queueing theory
– M/M/1, M/M/m, M/M/1/K, Jackson, BCMP
• Statistics
– To be added

Thanks!


George Kingsley Zipf, 1902-1950
Zipf distribution:
• Named after George Zipf
• Describes the frequency of occurrence of words
• Very useful in characterizing
- File popularity
- Keyword occurrence
- Importance of nodes
- and so on ……

– Zipf distribution: the lower an element is ranked (larger k), the lower its frequency of occurrence.
$f(k; s, N) = \frac{1/k^s}{\sum_{n=1}^{N} 1/n^s}$
N: the number of elements; k: their rank; s: the exponent characterizing the distribution
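A Zipf pmf over N ranked elements can be tabulated in a few lines, assuming the usual normalized form f(k; s, N) = (1/k^s) / Σ_{n=1..N} (1/n^s). The catalogue size and exponent below are hypothetical, e.g. the popularity of 10 files in a P2P system with s = 1.

```python
def zipf_pmf(N, s):
    """Return [f(1), ..., f(N)] with f(k) proportional to 1 / k^s."""
    weights = [1.0 / k**s for k in range(1, N + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]

pmf = zipf_pmf(N=10, s=1.0)  # hypothetical: 10 files, exponent 1
for rank, p in enumerate(pmf, start=1):
    print(rank, round(p, 4))
```

With s = 1 the top-ranked element is requested k times as often as the element of rank k.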


– Zipf distribution: example
