Review - University of Calgary in Albertapages.cpsc.ucalgary.ca/.../teaching/W07/CPSC531/assignments/Revi… · Law of Total Probability • Let B 1, B 2, B 3, …, B k be mutually

CPSC 531: Systems Modeling and Simulation

Review


Independent Events

• Independent events are those that don’t have any effect on each other. That is, knowing one of them occurs does not provide any information about the occurrence of the other event. Mathematically, A and B are independent if P(A|B) = P(A) and P(B|A) = P(B)

• From the definition of conditional probability, we have P(AB) = P(A|B)P(B). Therefore, if A and B are independent events, P(AB) = P(A)P(B).


Law of Total Probability

• Let B1, B2, B3, …, Bk be mutually disjoint and collectively exhaustive events from the sample space S. Then, for any event A in S, we have

$$P(A) = \sum_{j=1}^{k} P(B_j)\,P(A \mid B_j)$$

Explanation: A = (B1A) ∪ (B2A) ∪ … ∪ (BkA).

The (BjA)’s are disjoint events. Therefore, using laws of conditional probability we get:

$$P(A) = \sum_{j=1}^{k} P(B_j A) = \sum_{j=1}^{k} P(B_j)\,P(A \mid B_j), \quad \text{if } P(B_j) > 0 \text{ for } j = 1, \dots, k$$

[Figure: sample space partitioned into B1, B2, B3, B4, with event A drawn across the partition.]


Bayes’ Theorem

• Partition: The events B1, B2, B3, …, Bk form a partition of a set S if they are mutually disjoint and $\bigcup_{i=1}^{k} B_i = S$.

• Bayes’ Theorem: Suppose that B1, B2, B3, …, Bk form a partition of sample space S such that P(Bj) > 0 for j = 1, …, k. Let A be an event in S such that P(A) > 0. Then, for i = 1, …, k,

$$P(B_i \mid A) = \frac{P(B_i)\,P(A \mid B_i)}{\sum_{j=1}^{k} P(B_j)\,P(A \mid B_j)}$$
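The Law of Total Probability and Bayes’ Theorem can be sketched in a few lines of Python. The priors and likelihoods below are made-up numbers chosen only for illustration (three partition events, for example three servers that each handle a share of requests, with A being "request fails"); the function names are likewise hypothetical.

```python
def total_probability(priors, likelihoods):
    """P(A) = sum_j P(B_j) * P(A | B_j) over a partition B_1..B_k."""
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, likelihoods))


def bayes_posteriors(priors, likelihoods):
    """Return the list of P(B_i | A) for i = 1..k using Bayes' theorem."""
    p_a = total_probability(priors, likelihoods)
    return [p_b * p_a_given_b / p_a
            for p_b, p_a_given_b in zip(priors, likelihoods)]


# Hypothetical inputs (assumptions, not from the slides).
priors = [0.5, 0.3, 0.2]          # P(B_j); must sum to 1
likelihoods = [0.01, 0.02, 0.05]  # P(A | B_j)

p_a = total_probability(priors, likelihoods)
posteriors = bayes_posteriors(priors, likelihoods)
print(p_a, posteriors)
```

Because the Bj form a partition, the posteriors always sum to 1, which is a convenient sanity check.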


Random Variables

• A random variable is a real-valued mapping that assigns a numerical value to each possible outcome of an experiment.

• Consider arrival of jobs at a CPU. Let X be the number of jobs that arrive per unit time. X is a random variable that can take the values {0,1,2,…}.


Discrete Random Variables and PMF

• A random variable X is said to be discrete if the number of possible values of X is finite or, at most, an infinite sequence of different values.

• Discrete random variables are characterized by the probabilities of the values attained by the variable. These probabilities are referred to as the Probability Mass Function (PMF) of X. Mathematically, we define PMF as:

$$p_X(x) = P(X = x) = P(\{s \mid X(s) = x\}) = \sum_{X(s) = x} P(s)$$


Properties of PMF and CDF

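PMF:

$$0 \le p_X(x) \le 1, \ \forall x \in \Re, \qquad \sum_{x \in \Re} p_X(x) = 1 \ \text{ or } \ \sum_{i} p_X(x_i) = 1$$

CDF:

$$F_X(t) = P(X \le t) = P(-\infty < X \le t) = \sum_{x \le t} p_X(x), \qquad 0 \le F_X(x) \le 1$$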


Expectation

• Definition: weighted average of possible values of X.

$$E[X] = \sum_{x} x\,p_X(x) = \mu$$

$$E[cX] = c\,E[X]$$

$$E\left[\sum_{i=1}^{n} c_i X_i\right] = \sum_{i=1}^{n} c_i\,E[X_i]$$

• ci’s are constants
• works even if Xi’s are not independent
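As a small illustration of the definition and of linearity with dependent variables, the sketch below uses a fair six-sided die X and the fully dependent variable Y = X². The die and the coefficients 2 and 3 are assumptions chosen for the example; exact fractions avoid rounding noise.

```python
from fractions import Fraction


def expectation(pmf):
    """E[X] = sum over x of x * p_X(x), with pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


# Hypothetical example: a fair die (not from the slides).
die = {x: Fraction(1, 6) for x in range(1, 7)}
square = {x * x: Fraction(1, 6) for x in range(1, 7)}  # PMF of Y = X^2

e_x = expectation(die)
e_y = expectation(square)

# Left side: E[2X + 3Y] computed directly from the joint behaviour (Y = X^2).
lhs = sum((2 * x + 3 * x * x) * p for x, p in die.items())
# Right side: linearity of expectation, valid although X and Y are dependent.
rhs = 2 * e_x + 3 * e_y
print(e_x, e_y, lhs, rhs)
```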


Binomial Random Variable

• Consider n Bernoulli trials, where each trial can result in a success with probability p. The number of successes X in such an n-trial sequence is a binomial random variable.

• The PMF for this random variable is given by:

$$p_X(k) = P(\{X = k\}) = \begin{cases} \binom{n}{k}\,p^{k}(1-p)^{n-k}, & k = 0, 1, 2, \dots, n \\ 0, & \text{otherwise} \end{cases}$$

where p is the probability of success of a Bernoulli trial. E[X] = np
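A minimal sketch of the binomial PMF, using only the standard library. The values n = 10 and p = 1/2 mirror one of the plotted cases; the function name is an assumption for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); zero outside k = 0..n."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(pmf)                                # should be 1
mean = sum(k * pk for k, pk in enumerate(pmf))  # should equal n * p
print(total, mean)
```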


Binomial PMF

[Figure: Binomial Distribution (n = 10); P({X = k}) versus number of successes k, plotted for p = 1/4, p = 1/2, and p = 3/4.]


Geometric Random Variable

• The number of Bernoulli trials, X, until first success is a Geometric random variable.

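• PMF is given as:

$$p_X(k) = \begin{cases} p\,(1-p)^{k-1}, & k = 1, 2, \dots \\ 0, & \text{otherwise} \end{cases}$$

• CDF is given as:

$$F_X(t) = \sum_{i=1}^{t} p\,(1-p)^{i-1} = 1 - (1-p)^{t}, \quad t \ge 0$$

• Mean and variance:

$$E[X] = \frac{1}{p}, \qquad Var(X) = \frac{1-p}{p^{2}}$$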
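The geometric PMF, CDF, mean, and variance can be checked numerically. This is a sketch with p = 0.5 (the plotted case); the infinite sums for the moments are truncated at 200 terms, which is an assumption that is harmless here because the tail is negligible.

```python
def geometric_pmf(k, p):
    """P(X = k): first success occurs on trial k (k = 1, 2, ...)."""
    return p * (1 - p) ** (k - 1) if k >= 1 else 0.0


def geometric_cdf(t, p):
    """P(X <= t) = 1 - (1 - p)^t for integer t >= 0."""
    return 1 - (1 - p) ** t if t >= 0 else 0.0


p = 0.5
partial_sum = sum(geometric_pmf(i, p) for i in range(1, 6))  # matches the CDF at t = 5

# Truncated sums approximating E[X] = 1/p and Var(X) = (1 - p)/p^2.
approx_mean = sum(k * geometric_pmf(k, p) for k in range(1, 200))
approx_var = sum(k * k * geometric_pmf(k, p) for k in range(1, 200)) - approx_mean**2
print(partial_sum, approx_mean, approx_var)
```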


Geometric PMF

[Figure: Geometric Distribution (p = 0.5); P({X = k}) versus number of trials until first success k.]


Example: Modeling Packet Loss

• Geometric r.v. gives number of trials required to get first success

• It is easy to see $p_X(k) = (1-p)^{k-1}p$, k = 1, 2, …, where p is the probability of success of a trial

• Modeling packet losses seen at a router
  • We can model using a Bernoulli process {Y0, Y1, Y2, …} where Yi represents a Bernoulli trial for packet number i
  • We can say:
    P{Yi = 1} = p (i.e., a packet loss)
    P{Yi = 0} = 1 − p (i.e., no loss)
  • So the number of packet transmissions up to and including the first loss, X, is geometrically distributed:
    $P(\{X = n\}) = p\,(1-p)^{n-1}$, n = 1, 2, … (good length distribution: n − 1 successful transmissions precede the loss)


Poisson Random Variable

• A discrete random variable, X, that takes only non-negative integer values is said to be Poisson with parameter λ > 0, if X has the following PMF:

$$p_X(k) = \begin{cases} \dfrac{\lambda^{k} e^{-\lambda}}{k!}, & k = 0, 1, 2, \dots \\ 0, & \text{otherwise} \end{cases}$$

• Poisson PMF with parameter λ is a good approximation of Binomial PMF with parameters n and p, provided λ = np, n is very large, and p is very small.


Poisson PMF

[Figure: Poisson Distribution; P[X = k] versus number of events k, plotted for λ = 0.5, λ = 1, and λ = 5.]


Poisson Approximation to Binomial

[Figure: Binomial Distribution (n = 100, p = 0.02); P({X = k}) versus number of successes k.]

[Figure: Poisson Distribution (λ = 2); P({X = k}) versus number of events k.]

Binomial distribution with large n and small p can be approximated by Poisson distribution with λ = np
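The closeness of the two distributions can be checked numerically for the plotted case (n = 100, p = 0.02, λ = np = 2). This is a sketch using only the standard library; the function names are assumptions for illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k) if 0 <= k <= n else 0.0


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k) if k >= 0 else 0.0


n, p = 100, 0.02
lam = n * p  # Poisson parameter matching the binomial mean

# Largest pointwise difference between the two PMFs over k = 0..10.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
for k in range(6):
    print(k, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
print("max gap:", max_gap)
```

Increasing n while holding np fixed should shrink the gap further.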


Poisson Random Variable (cont.)

• CDF of Poisson Random Variable:

$$F_X(t) = \sum_{k=0}^{t} \frac{\lambda^{k} e^{-\lambda}}{k!}, \quad t \ge 0$$

• Mean and variance:

$$E[X] = Var(X) = \lambda$$

• Consider N independent Poisson random variables Xi, i = 1, 2, 3, …, N, with parameters λi. Then X = X1 + X2 + … + XN is also a Poisson r.v. with parameter λ = λ1 + λ2 + … + λN


Example: Job arrivals

• Consider modeling number of job arrivals at a shop in an interval (0,t]

• Let λ be the rate of arrival of jobs

• In an interval ∆t → 0:
  P{one arrival in ∆t} = λ∆t
  P{two or more arrivals in ∆t} is negligible

• Divide the interval (0,t] into n subintervals of equal lengths

• Assume arrival of jobs in each interval to be independent of arrivals in another interval


Example: Job arrivals (…)

• If n → ∞, the interval can be viewed as a sequence of Bernoulli trials with

$$p = \lambda\,\Delta t = \frac{\lambda t}{n}$$

• The number of successes k in n trials can be given by the Binomial distribution’s PMF

$$\binom{n}{k}\,p^{k}(1-p)^{n-k}$$


Example: Job arrivals (…)

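Substitute p = λt/n to get

$$\binom{n}{k}\left(\frac{\lambda t}{n}\right)^{k}\left(1 - \frac{\lambda t}{n}\right)^{n-k}, \quad k = 0, 1, \dots, n$$

Setting n → ∞, the above reduces to

$$\frac{(\lambda t)^{k}\,e^{-\lambda t}}{k!}, \quad k = 0, 1, 2, \dots$$

Letting t = 1, the probability of k events in time interval (0, 1] is

$$\frac{\lambda^{k}\,e^{-\lambda}}{k!}$$

which is the Poisson distribution.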


Continuous Random Variable

• A random variable X is said to be continuous if there exists a non-negative function f(x), ∀x ∈ (−∞, ∞), with the property that for any set A of real numbers:

$$P(\{X \in A\}) = \int_{A} f(x)\,dx$$

• f(x) is called the “probability density function” (PDF) of X


Properties of PDF

$$f(x) \ge 0, \ \forall x$$

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

i.e., area under the curve equals one.

$$P(\{a \le X \le b\}) = \int_{a}^{b} f(x)\,dx$$

i.e., B = [a, b] and we want to find P{X ∈ B}.


Properties of PDF (continued)

Continuous distributions assign 0 value to individual values:

$$P(\{X = a\}) = \int_{a}^{a} f(x)\,dx = 0$$

Consequence of the above property:

$$P(\{a \le X \le b\}) = P(\{a < X < b\}) = P(\{a \le X < b\}) = P(\{a < X \le b\})$$


Cumulative Distribution Function

• The CDF $F_X(\cdot)$ of a continuous random variable X with PDF $f_X(\cdot)$ can be obtained as follows:

$$F_X(x) = P(\{X \in (-\infty, x]\}) = \int_{-\infty}^{x} f_X(t)\,dt$$


CDF - PDF Relationship

• The PDF can be obtained from the CDF and vice versa:

$$F_X'(x) = \frac{dF_X(x)}{dx} = f_X(x)$$

• Distribution of a continuous random variable can be represented using either the PDF or the CDF.


PDF and CDF of Uniform R.V.

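• The PDF of a uniform random variable X in the interval [a, b] is:

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b \\ 0, & \text{otherwise} \end{cases}$$

• The CDF of X is:

$$F(x) = \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a < x < b \\ 1, & x \ge b \end{cases}$$

How did we get F(x)? For a < x < b:

$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{a}^{x} \frac{dt}{b-a} = \frac{x-a}{b-a}$$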


Uniform R.V. PDF and CDF

[Figure: PDF of Uniform R.V. (a=1, b=3); f(x) versus x.]

[Figure: CDF of Uniform R.V. (a=1, b=3); F(x) versus x.]


Exponential Distribution

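$$f_X(x) = \begin{cases} \beta e^{-\beta x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

$$F_X(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-\beta x}, & x \ge 0 \end{cases}$$

$$E[X] = \frac{1}{\beta}$$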


Exponential Models

• This distribution has been used to model:
  • Inter-arrival times between IP packets
  • Inter-arrival times between calls at a call centre
  • Inter-arrival times between web sessions from a web client
  • Service time distributions
  • Lifetime of products

• Widely used in queuing theory


Exponential PDF and CDF

[Figure: PDF of Exponential Distribution; f(x) versus x, plotted for β=0.5, β=1.0, β=2.0, and β=4.0.]

[Figure: CDF of Exponential Distribution; F(x) versus x, plotted for β=2.0.]


Memory-less Property of Exponential Distribution

• Suppose inter-arrival times of IP packets are modelled using the Exponential distribution. The memory-less property states that the distribution of the remaining time until the next packet arrival is independent of how long there have already been no packet arrivals.

• Suppose X is an exponentially distributed r.v. and X ≥ t (i.e., there have been no arrivals during the first t time units). Then,

P({ X ≥ t+h | X ≥ t }) = P({ X ≥ h })
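The property follows from P(X ≥ x) = e^{−βx}: the conditional probability is e^{−β(t+h)} / e^{−βt} = e^{−βh}. A quick numerical check, with β, t, and h chosen arbitrarily for illustration (they are assumptions, not values from the slides):

```python
from math import exp


def exp_survival(x, beta):
    """P(X >= x) = 1 - F_X(x) = exp(-beta * x) for x >= 0, and 1 for x < 0."""
    return exp(-beta * x) if x >= 0 else 1.0


beta, t, h = 2.0, 1.5, 0.7  # hypothetical rate and durations

# P(X >= t + h | X >= t) = P(X >= t + h) / P(X >= t), since {X >= t+h} implies {X >= t}.
conditional = exp_survival(t + h, beta) / exp_survival(t, beta)
unconditional = exp_survival(h, beta)
print(conditional, unconditional)
```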


Pareto Distribution

• If X is a random variable with a Pareto distribution, then its PDF is given by:

$$f(x) = \frac{k\,x_m^{k}}{x^{k+1}}, \quad x \ge x_m, \ x_m > 0, \ k > 0$$

where $x_m$ is the minimum possible value of X, also called a location parameter, and k is positive, also called a shape parameter.

• CDF of Pareto distribution is given by:

$$F(x) = 1 - \left(\frac{x_m}{x}\right)^{k}$$
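A minimal sketch of the Pareto PDF and CDF, evaluated with the parameters used in the plots (x_min = 3, k = 1.2). The function names and the sample points are assumptions for illustration; the last lines check numerically that the PDF is the derivative of the CDF.

```python
def pareto_pdf(x, x_m, k):
    """f(x) = k * x_m^k / x^(k+1) for x >= x_m, else 0."""
    return k * x_m**k / x ** (k + 1) if x >= x_m else 0.0


def pareto_cdf(x, x_m, k):
    """F(x) = 1 - (x_m / x)^k for x >= x_m, else 0."""
    return 1 - (x_m / x) ** k if x >= x_m else 0.0


x_m, k = 3.0, 1.2

below_10 = pareto_cdf(10.0, x_m, k)       # fraction of values at or below 10
tail_100 = 1 - pareto_cdf(100.0, x_m, k)  # fraction of values above 100

# Central-difference derivative of the CDF at x = 10 should match the PDF there.
eps = 1e-6
numeric_pdf = (pareto_cdf(10.0 + eps, x_m, k) - pareto_cdf(10.0 - eps, x_m, k)) / (2 * eps)
print(below_10, tail_100, numeric_pdf, pareto_pdf(10.0, x_m, k))
```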


Pareto PDF and CDF

[Figure: Pareto PDF with x_min = 3, k = 1.2; probability density f(x) versus x, with x on a logarithmic axis from 10 to 1000.]

[Figure: Pareto CDF with x_min = 3, k = 1.2; cumulative distribution F(x) versus x, with x on a logarithmic axis from 10 to 1000.]

Example: Distribution of file sizes on a web server

• A PDF shows a high probability that a file will be under 10 KB in size, and a very small probability of being larger than 100 KB
• A CDF curve shows proportion of files within certain size threshold, e.g. nearly all files are under 100 KB in size


Pareto Models

• This highly right-skewed distribution is heavy-tailed, meaning that the random variable can take extreme values.

• Common models:
  • Distribution of income
  • Distribution of files in a P2P system
  • The values of oil reserves in oil fields (a few large fields, many small fields)
  • The length distribution in jobs assigned to supercomputers (a few large ones, many small ones)
  • The standardized price returns on individual stocks


Normal Distribution

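• X is a normal random variable with mean µ and variance σ² if X has the following PDF:

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}, \quad -\infty < x < \infty$$

• The CDF of a normal distribution is:

$$F_X(x) = P(\{X \le x\}) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(t-\mu)^{2}}{2\sigma^{2}}}\,dt$$

• There is no closed form for $F_X(x)$.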


PDF of Normal Distribution

[Figure: PDF of Normal Distribution (µ = 0); f(x) versus x from -6 to 6, plotted for σ=0.5, σ=1, and σ=2.]

This PDF has a “bell” shape with “peak” at x=0.


Standard Normal Distribution

• If X is normally distributed with parameters µ and σ², then

$$Z = \frac{X - \mu}{\sigma}$$

is normally distributed with parameters 0 and 1. The distribution of Z is called the “standard normal distribution”.


Computing CDF of Normal Distribution

• Transform X to standard normal distribution Z and use tables

• If $X \sim N(\mu, \sigma^{2})$, then $Z = \dfrac{X - \mu}{\sigma}$ is $N(0, 1)$

• Area under standard normal curve in (-∞, z) is equal to area under normal curve in (-∞, x)

• Same method is used to obtain P(a < X < b), by calculating P(X < b) − P(X < a)

• Alternative formula:

$$F_X(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right]$$


Normal CDF - Example

$$F_X(x) = P(X \le x) = P(Z \le z) \quad \text{for } z = \frac{x - \mu}{\sigma}$$

For N(5, 4), find $F_X(5.4)$:

$$F_X(5.4) = P(X \le 5.4) = P(Z \le z) \quad \text{for } z = \frac{5.4 - 5}{2} = 0.2$$

P(Z ≤ 0.2) is read from the table called "Standard Normal distribution table".

The table gives area under the curve, but always check which area is given, e.g. over the interval (0, z).


Normal CDF - Example

• Rows mark z-value up to 1 decimal digit
• Columns mark z-value’s 2nd decimal digit
• For z = 0.2 read value at (0.2, 0.0) → 0.0793
• Then account for the rest of the area on the left of y-axis to get $F_X(5.4)$ = 0.5 + 0.0793 = 0.5793

z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
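The table lookup can be cross-checked with the erf form of the normal CDF, which Python exposes as `math.erf`. This is a sketch for the example N(5, 4), i.e. µ = 5 and σ = 2; the function name is an assumption for illustration.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """F_X(x) = 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


mu, sigma = 5.0, 2.0       # N(5, 4): variance 4, so sigma = 2
z = (5.4 - mu) / sigma     # standardized value, about 0.2
f_54 = normal_cdf(5.4, mu, sigma)

# Area over (0, z), which is what the table above lists.
area_0_to_z = normal_cdf(z, 0.0, 1.0) - 0.5
print(z, f_54, area_0_to_z)
```

The computed value agrees with the table-based result 0.5 + 0.0793 = 0.5793 to four decimal places.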


Chi-Square Test

• Prepare a histogram of the empirical data with k cells

• Let Oi and Ei be the observed and expected frequency of the ith cell, respectively. Compute the following:

$$\chi_0^{2} = \sum_{i=1}^{k} \frac{(O_i - E_i)^{2}}{E_i}$$

• $\chi_0^{2}$ has a Chi-Square distribution with (k-1) degrees of freedom


Chi-Square Test (continued …)

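• Define a null hypothesis, H0, that observations come from a specified distribution

• The null hypothesis cannot be rejected at a significance level of α if

$$\chi_0^{2} < \chi^{2}_{[1-\alpha,\ k-s-1]}$$

where the right-hand side is obtained from a table and s is the number of distribution parameters estimated from the data (s = 0 when the distribution is fully specified).

Meaning of significance level: α = P(reject H0 | H0 is true)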


More on Chi-Square Test

• Errors in cells with small Ei’s affect the test statistic more than cells with large Ei’s.

• Minimum size of Ei debated: [BCNN05] recommends a value of 3 or more; if not, combine adjacent cells.

• Test designed for discrete distributions and large sample sizes only. For continuous distributions, Chi-Square test is only an approximation (i.e., level of significance holds only for n→∞).


Chi-Square Test Example

• Example: 500 random numbers generated using a random number generator; observations categorized into cells at intervals of 0.1, between 0 and 1. At level of significance of 0.1, are these numbers IID U(0,1)?

Interval   Oi    Ei    (Oi-Ei)^2/Ei
1          50    50    0
2          48    50    0.08
3          49    50    0.02
4          42    50    1.28
5          52    50    0.08
6          45    50    0.5
7          63    50    3.38
8          54    50    0.32
9          50    50    0
10         47    50    0.18
Total      500         5.84

$\chi_0^{2} = 5.84$; $\chi^{2}_{[0.9,\ 9]} = 14.68$ from the table. Since 5.84 < 14.68, the hypothesis cannot be rejected (i.e., it is accepted) at a significance level of 0.10.
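The test statistic for this example can be recomputed directly from the observed counts. This is a sketch; the critical value 14.68 is taken from the slide rather than computed, and the variable names are assumptions.

```python
# Observed frequencies for the ten cells [0, 0.1), [0.1, 0.2), ..., [0.9, 1.0).
observed = [50, 48, 49, 42, 52, 45, 63, 54, 50, 47]

n = sum(observed)           # total number of observations
k = len(observed)           # number of cells
expected = n / k            # under U(0,1) every cell expects n/k observations

# Chi-square statistic: sum over cells of (O_i - E_i)^2 / E_i.
chi_sq = sum((o - expected) ** 2 / expected for o in observed)

critical_value = 14.68      # chi-square table value for [0.9, 9], as quoted on the slide
reject_h0 = chi_sq >= critical_value
print(n, chi_sq, reject_h0)
```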


Fundamental a.k.a. Operational Laws

• Utilization Law
• Forced Flow Law
• Service Demand Law
• Little’s Law
• Interactive Response Time Law


Utilization Law

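$$U_i = \frac{B_i}{T} = \frac{C_i}{T} \times \frac{B_i}{C_i} = X_i S_i$$

where, over an observation period of length T, $B_i$ is the time resource i is busy and $C_i$ is the number of completions at resource i, so that $X_i = C_i/T$ is the throughput and $S_i = B_i/C_i$ is the average service time.

Utilization of a resource (system) is equal to the product of the throughput of the resource (system) and average service time of the resource (system)

• Utilization of a resource is the fraction of time that resource is busy.
• Ui is always between 0 and 1.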


Forced Flow Law

• Each “system-level” request may require multiple visits to a system “resource”.
  • E.g., a database transaction may require several disk accesses

• This law relates system throughput to the resource throughput

Xk = Vk × X0

[Figure: a system containing resources 1, 2, and 3.]

• A system consists of many resources

• Vi := average # of visits per request to resource i

• Xi := throughput at resource i


Service Demand Law

Di := mean time spent by a typical request obtaining service from resource i

• Contrast Di with Si

Si := mean service time per visit for resource i

Di = Si Vi

Di = Ui/Xi × Xi/X0 = Ui/X0

• Typically X0 and Ui are easier to obtain than Si and Vi.
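The Utilization, Forced Flow, and Service Demand laws can be checked against each other on a single set of measurements. The numbers below are hypothetical (a 60-second window, 120 completed requests, and a disk that served 600 visits while busy for 30 seconds); the variable names are assumptions that follow the slide notation.

```python
# Hypothetical measurements over one observation window (not from the slides).
T = 60.0        # observation period, seconds
C0 = 120        # system-level requests completed
C_disk = 600    # completions (visits) at the disk
B_disk = 30.0   # seconds the disk was busy

X0 = C0 / T               # system throughput (requests/s)
X_disk = C_disk / T       # disk throughput (visits/s)
S_disk = B_disk / C_disk  # mean service time per visit (s)
V_disk = C_disk / C0      # average visits per request
U_disk = B_disk / T       # utilization, measured directly
D_disk = S_disk * V_disk  # service demand per request (s)

print("Utilization Law:    ", U_disk, "=", X_disk * S_disk)
print("Forced Flow Law:    ", X_disk, "=", V_disk * X0)
print("Service Demand Law: ", D_disk, "=", U_disk / X0)
```

In practice only T, C0, and B_disk are needed to obtain the service demand, which is the point of the last bullet above.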


Little’s Law

• The most famous Operational Law
• Average number in system equals product of the departure rate of customers (i.e., throughput of the system) and the average time each customer spends in the system.

Ni = Xi × Ri

[Figure: a black box (e.g., a pub) with Arrivals entering and Completions leaving; Number in System = N, time in system Ri.]


Interactive Response Time Law

X0 = System throughput

N = # of clients (terminals)

Z = Client’s avg. think time

R = Avg. System Response time

Let Nt = avg. # of clients in think mode

Let Nw = avg. # of clients waiting for response

Nt + Nw = N

Nt = X0 Z [Little’s Law applied to Box 1]

Nw = X0 R [Little’s Law applied to Box 2]

⇒ N = X0(R+Z)

⇒ R = (N/X0) - Z
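A short sketch of the Interactive Response Time Law, with Little’s Law applied to each box as in the derivation above. The inputs (20 clients, 5 s think time, throughput of 2 requests/s) are hypothetical values chosen for illustration, and the function name is an assumption.

```python
def interactive_response_time(n_clients, throughput, think_time):
    """R = N / X0 - Z (Interactive Response Time Law)."""
    return n_clients / throughput - think_time


# Hypothetical measurements (not from the slides).
N = 20     # number of clients (terminals)
X0 = 2.0   # system throughput, requests per second
Z = 5.0    # average think time, seconds

R = interactive_response_time(N, X0, Z)

# Little's Law per box: clients thinking and clients waiting must add up to N.
N_t = X0 * Z   # average number of clients in think mode (Box 1)
N_w = X0 * R   # average number of clients waiting for a response (Box 2)
print(R, N_t, N_w)
```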

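[Figure: closed system in which the Terminals (Box 1, think time Z) submit requests to the Subsystem (Box 2, response time R) at throughput X.]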
