Ch.4 Review of Basic Probability and Statistics
4.1 Introduction
The role of probability and statistics in simulation:
• Model a probabilistic system
• Validate the simulation model
• Choose the input probability distributions
• Generate random samples from the input distributions
• Perform statistical analyses of the simulation output data
• Design the simulation experiments
4.2 Random variables and their properties
• Experiment: a process whose outcome is not known with certainty.
• Sample space (S): the set of all possible outcomes of an experiment.
• Sample points: the outcomes themselves.
• Random variable (X, Y, Z): a function that assigns a real number to each point in the sample space S.
• The corresponding values are denoted x, y, z.
Examples
• Flipping a coin: S = {H, T}
• Tossing a die: S = {1, 2, …, 6}
• Flipping two coins: S = {(H,H), (H,T), (T,H), (T,T)}; X = the number of heads that occurs
• Rolling a pair of dice: S = {(1,1), (1,2), …, (6,6)}; X = the sum of the two dice
Distribution (cumulative) function
F(x) = P(X ≤ x) for −∞ < x < ∞

where P(X ≤ x) is the probability associated with the event {X ≤ x}.

Properties:
1. 0 ≤ F(x) ≤ 1 for all x.
2. F(x) is nondecreasing [i.e., if x₁ < x₂, then F(x₁) ≤ F(x₂)].
3. lim_{x→∞} F(x) = 1 and lim_{x→−∞} F(x) = 0.
Discrete random variable
A random variable X is said to be discrete if it can take on at most a countable number of values x₁, x₂, ….

The probability that X takes on the value xᵢ is

p(xᵢ) = P(X = xᵢ) for i = 1, 2, …

Then

Σ_{i=1}^{∞} p(xᵢ) = 1

p(x) is called the probability mass function, and the distribution function is

F(x) = Σ_{xᵢ ≤ x} p(xᵢ) for all x

For an interval I = [a, b],

P(X ∈ I) = Σ_{a ≤ xᵢ ≤ b} p(xᵢ)
Examples

[Figure: p(x) for the demand-size random variable X, with p(1) = 1/6, p(2) = 1/3, p(3) = 1/3, p(4) = 1/6.]

P(2 ≤ X ≤ 3) = p(2) + p(3) = 1/3 + 1/3 = 2/3

[Figure: F(x) for the demand-size random variable X — a step function rising from 1/6 to 1/2, 5/6, and 1.]
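The pmf and distribution function above can be checked with a short script. This is an illustrative sketch: the pmf values p(1) = 1/6, p(2) = p(3) = 1/3, p(4) = 1/6 are read off the figure, and the function names are our own.

```python
from fractions import Fraction as Fr

# Demand-size pmf read off the figure: p(1)=1/6, p(2)=p(3)=1/3, p(4)=1/6
p = {1: Fr(1, 6), 2: Fr(1, 3), 3: Fr(1, 3), 4: Fr(1, 6)}

def cdf(x):
    """F(x) = sum of p(x_i) over all x_i <= x."""
    return sum(prob for xi, prob in p.items() if xi <= x)

def prob_interval(a, b):
    """P(X in [a, b]) = sum of p(x_i) over a <= x_i <= b."""
    return sum(prob for xi, prob in p.items() if a <= xi <= b)

assert sum(p.values()) == 1             # the pmf sums to one
assert prob_interval(2, 3) == Fr(2, 3)  # P(2 <= X <= 3) = p(2) + p(3) = 2/3
```

Exact rational arithmetic (`fractions`) avoids any floating-point doubt when checking small pmfs like this one.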
Continuous random variables

A random variable X is said to be continuous if there exists a nonnegative function f(x) such that for any set of real numbers B,

P(X ∈ B) = ∫_B f(y) dy and ∫_{−∞}^{∞} f(y) dy = 1

f(x) is called the probability density function.

P(X = x) = P(X ∈ [x, x]) = ∫_x^x f(y) dy = 0

P(X ∈ [x, x + Δx]) = ∫_x^{x+Δx} f(y) dy

F(x) = P(X ∈ (−∞, x]) = ∫_{−∞}^x f(y) dy for all x

f(x) = F′(x)

P(X ∈ I) = ∫_a^b f(y) dy = F(b) − F(a) for I = [a, b]

[Figure: interpretation of the probability density function — the areas under f(x) over the intervals [x, x + Δx] and [x′, x′ + Δx] give P(X ∈ [x, x + Δx]) and P(X ∈ [x′, x′ + Δx]).]
Uniform random variable on the interval [0, 1]

f(x) = 1 if 0 ≤ x ≤ 1
     = 0 otherwise

If 0 ≤ x ≤ 1, then

F(x) = ∫_0^x f(y) dy = ∫_0^x 1 dy = x

[Figure: f(x) for a uniform random variable on [0, 1] — a density of height 1 on [0, 1].]

[Figure: F(x) for a uniform random variable on [0, 1] — a line rising from 0 to 1 on [0, 1].]

P(X ∈ [x, x + Δx]) = ∫_x^{x+Δx} f(y) dy = F(x + Δx) − F(x) = (x + Δx) − x = Δx

where 0 ≤ x < x + Δx ≤ 1.
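As a quick sanity check on F(x) = x, one can compare the empirical distribution function of pseudo-random Uniform(0, 1) draws against x. A minimal sketch; the sample size and seed are arbitrary choices:

```python
import random

random.seed(0)
n = 100_000
samples = [random.random() for _ in range(n)]  # Uniform(0, 1) draws

# The empirical CDF at x should be close to F(x) = x for 0 <= x <= 1
for x in (0.1, 0.25, 0.5, 0.9):
    empirical = sum(s <= x for s in samples) / n
    assert abs(empirical - x) < 0.01
```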
Exponential random variable with mean β:

f(x) = (1/β) e^{−x/β} if x ≥ 0, and 0 otherwise
F(x) = 1 − e^{−x/β} for x ≥ 0

[Figure: f(x) for an exponential random variable with mean β — a density decaying from 1/β at x = 0.]

[Figure: F(x) for an exponential random variable with mean β — a curve rising from 0 toward 1.]
Joint probability mass function

If X and Y are discrete random variables, then let

p(x, y) = P(X = x, Y = y) for all x, y

where p(x, y) is called the joint probability mass function of X and Y.

X and Y are independent if

p(x, y) = p_X(x) p_Y(y) for all x, y

where

p_X(x) = Σ_{all y} p(x, y)
p_Y(y) = Σ_{all x} p(x, y)

are the (marginal) probability mass functions of X and Y.
Example 4.9
Suppose that X and Y are jointly discrete random variables with

p(x, y) = xy/27 for x = 1, 2 and y = 2, 3, 4
        = 0 otherwise

Then

p_X(x) = Σ_{y=2}^{4} xy/27 = x/3 for x = 1, 2
p_Y(y) = Σ_{x=1}^{2} xy/27 = y/9 for y = 2, 3, 4

Since p(x, y) = xy/27 = p_X(x) p_Y(y) for all x, y, the random variables X and Y are independent.
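Example 4.9 is easy to verify exactly with rational arithmetic. A sketch using Python's fractions module; the variable names are our own:

```python
from fractions import Fraction as Fr

def p(x, y):
    """Joint pmf of Example 4.9: p(x, y) = xy/27 on x = 1, 2 and y = 2, 3, 4."""
    return Fr(x * y, 27) if x in (1, 2) and y in (2, 3, 4) else Fr(0)

# Marginals by summing the joint pmf over the other variable
pX = {x: sum(p(x, y) for y in (2, 3, 4)) for x in (1, 2)}
pY = {y: sum(p(x, y) for x in (1, 2)) for y in (2, 3, 4)}

assert pX == {1: Fr(1, 3), 2: Fr(2, 3)}               # p_X(x) = x/3
assert pY == {2: Fr(2, 9), 3: Fr(1, 3), 4: Fr(4, 9)}  # p_Y(y) = y/9
# Independence: p(x, y) = p_X(x) p_Y(y) for every (x, y)
assert all(p(x, y) == pX[x] * pY[y] for x in (1, 2) for y in (2, 3, 4))
```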
Joint probability density function

The random variables X and Y are jointly continuous if there exists a nonnegative function f(x, y) such that for all sets of real numbers A and B,

P(X ∈ A, Y ∈ B) = ∫_B ∫_A f(x, y) dx dy

X and Y are independent if

f(x, y) = f_X(x) f_Y(y) for all x and y

where

f_X(x) = ∫_{−∞}^{∞} f(x, y) dy
f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx

are the (marginal) probability density functions of X and Y, respectively.
Example 4.11
Suppose that X and Y are jointly continuous random variables with

f(x, y) = 24xy for x ≥ 0, y ≥ 0, and x + y ≤ 1
        = 0 otherwise

Then

f_X(x) = ∫_0^{1−x} 24xy dy = 12xy² |_0^{1−x} = 12x(1 − x)² for 0 ≤ x ≤ 1
f_Y(y) = ∫_0^{1−y} 24xy dx = 12yx² |_0^{1−y} = 12y(1 − y)² for 0 ≤ y ≤ 1

Since

f(1/2, 1/2) = 6 ≠ (3/2)(3/2) = f_X(1/2) f_Y(1/2)

X and Y are not independent.
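The marginal f_X and the failure of independence in Example 4.11 can also be checked numerically. A sketch using a midpoint-rule integral; the grid size is an arbitrary choice:

```python
def f(x, y):
    """Joint density of Example 4.11."""
    return 24 * x * y if x >= 0 and y >= 0 and x + y <= 1 else 0.0

def f_X(x, n=10_000):
    """Marginal f_X(x) = integral of f(x, y) over y in [0, 1 - x], midpoint rule."""
    h = (1 - x) / n
    return sum(f(x, (j + 0.5) * h) for j in range(n)) * h

# f_X(x) should match the closed form 12x(1 - x)^2
for x in (0.2, 0.5, 0.8):
    assert abs(f_X(x) - 12 * x * (1 - x) ** 2) < 1e-6

# Not independent: f(1/2, 1/2) = 6, but f_X(1/2) f_Y(1/2) = (3/2)(3/2) = 9/4
# (f_Y = f_X here by the symmetry of the density in x and y)
assert abs(f(0.5, 0.5) - f_X(0.5) * f_X(0.5)) > 1
```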
Mean or expected value

μᵢ = E(Xᵢ) = Σ_{j=1}^{∞} xⱼ p_{Xᵢ}(xⱼ) if Xᵢ is discrete
           = ∫_{−∞}^{∞} x f_{Xᵢ}(x) dx if Xᵢ is continuous

The mean is one measure of central tendency in the sense that it is the center of gravity of the distribution.

Examples 4.12–4.13
For the demand-size random variable, the mean is given by

μ = 1(1/6) + 2(1/3) + 3(1/3) + 4(1/6) = 5/2

For the uniform random variable on [0, 1], the mean is given by

μ = ∫_0^1 x f(x) dx = ∫_0^1 x dx = 1/2
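Both means in Examples 4.12–4.13 can be reproduced in a few lines. A sketch; the pmf is the demand-size one from earlier, and the midpoint-rule grid size is arbitrary:

```python
from fractions import Fraction as Fr

# Demand-size pmf: p(1)=1/6, p(2)=p(3)=1/3, p(4)=1/6
p = {1: Fr(1, 6), 2: Fr(1, 3), 3: Fr(1, 3), 4: Fr(1, 6)}
mean_demand = sum(x * px for x, px in p.items())
assert mean_demand == Fr(5, 2)  # matches 5/2

# Uniform(0, 1): E(X) = integral of x dx over [0, 1], midpoint rule
n = 1000
mean_uniform = sum((i + 0.5) / n for i in range(n)) / n
assert abs(mean_uniform - 0.5) < 1e-9  # matches 1/2
```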
Properties of means
1. E(cX) = cE(X)
2. E(Σ_{i=1}^{n} cᵢXᵢ) = Σ_{i=1}^{n} cᵢE(Xᵢ), even if the Xᵢ's are dependent.
Median

The median x₀.₅ of the random variable Xᵢ is defined to be the smallest value of x such that F_{Xᵢ}(x) ≥ 0.5.

[Figure: the median x₀.₅ for a continuous random variable — the area under f_{Xᵢ}(x) to the left of x₀.₅ is 0.5.]
Example 4.14

The median may be a better measure of central tendency than the mean.

1. Consider a discrete random variable X that takes on each of the values 1, 2, 3, 4, and 5 with probability 0.2. Clearly, the mean and the median of X are both 3.

2. Now consider a random variable Y that takes on each of the values 1, 2, 3, 4, and 100 with probability 0.2. The mean and the median of Y are 22 and 3, respectively.

Note that the median is insensitive to this change in the distribution.
Variance

Var(Xᵢ) = σᵢ² = E[(Xᵢ − μᵢ)²] = E(Xᵢ²) − μᵢ²

For the demand-size random variable,

E(X²) = 1²(1/6) + 2²(1/3) + 3²(1/3) + 4²(1/6) = 43/6
Var(X) = E(X²) − μ² = 43/6 − (5/2)² = 11/12

For the uniform random variable on [0, 1],

E(X²) = ∫_0^1 x² f(x) dx = ∫_0^1 x² dx = 1/3
Var(X) = E(X²) − μ² = 1/3 − (1/2)² = 1/12

[Figure: density functions for continuous random variables with large and small variances σ².]
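The two variance computations above can be confirmed the same way as the means. A sketch; the demand-size pmf is the one from the earlier example, and the integration grid is arbitrary:

```python
from fractions import Fraction as Fr

# Demand-size pmf: p(1)=1/6, p(2)=p(3)=1/3, p(4)=1/6
p = {1: Fr(1, 6), 2: Fr(1, 3), 3: Fr(1, 3), 4: Fr(1, 6)}
mu = sum(x * px for x, px in p.items())
ex2 = sum(x * x * px for x, px in p.items())

assert ex2 == Fr(43, 6)
assert ex2 - mu ** 2 == Fr(11, 12)   # Var(X) = E(X^2) - mu^2

# Uniform(0, 1): E(X^2) = 1/3 by midpoint rule, so Var = 1/3 - 1/4 = 1/12
n = 100_000
ex2_u = sum(((i + 0.5) / n) ** 2 for i in range(n)) / n
assert abs(ex2_u - 1 / 3) < 1e-6
assert abs((ex2_u - 0.25) - 1 / 12) < 1e-6
```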
Properties of the variance
1. Var(X) ≥ 0
2. Var(cX) = c²Var(X)
3. Var(Σ_{i=1}^{n} Xᵢ) = Σ_{i=1}^{n} Var(Xᵢ) if the Xᵢ's are independent (or uncorrelated).
Standard deviation

σᵢ = √(σᵢ²)

If Xᵢ is a normal random variable, the probability that Xᵢ is between μᵢ − 1.96σᵢ and μᵢ + 1.96σᵢ is 0.95.
Covariance

The covariance between the random variables Xᵢ and Xⱼ is a measure of their dependence:

Cov(Xᵢ, Xⱼ) = Cᵢⱼ = E[(Xᵢ − μᵢ)(Xⱼ − μⱼ)] = E(XᵢXⱼ) − μᵢμⱼ

Note that Cᵢⱼ = Cⱼᵢ, and Cᵢⱼ = σᵢ² if i = j.
Example 4.17
For the jointly continuous random variables X and Y in Example 4.11,

E(XY) = ∫_0^1 ∫_0^{1−x} xy f(x, y) dy dx = ∫_0^1 x² (∫_0^{1−x} 24y² dy) dx = ∫_0^1 8x²(1 − x)³ dx = 2/15

E(X) = ∫_0^1 x f_X(x) dx = ∫_0^1 12x²(1 − x)² dx = 2/5

E(Y) = ∫_0^1 y f_Y(y) dy = ∫_0^1 12y²(1 − y)² dy = 2/5

Cov(X, Y) = E(XY) − E(X)E(Y) = 2/15 − (2/5)(2/5) = −2/75
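The value Cov(X, Y) = −2/75 can be confirmed by numerical integration over the triangle. A midpoint-rule sketch; the grid size is an arbitrary choice:

```python
def exy(n=1000):
    """E(XY) over the triangle x >= 0, y >= 0, x + y <= 1 with density 24xy,
    using the midpoint rule in x and, for each x, in y over [0, 1 - x]."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (1 - x) / n
        # inner integral of y * 24xy over y in [0, 1 - x]
        inner = sum(24 * x * ((j + 0.5) * hy) ** 2 for j in range(n)) * hy
        total += x * inner * hx
    return total

e_xy = exy()
cov = e_xy - (2 / 5) * (2 / 5)   # E(X) = E(Y) = 2/5 from the example

assert abs(e_xy - 2 / 15) < 1e-4
assert abs(cov - (-2 / 75)) < 1e-4
```

Integrating y over [0, 1 − x] for each fixed x keeps the integrand smooth inside each strip, which is why the plain midpoint rule converges cleanly here.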
If Xᵢ and Xⱼ are independent random variables, then Cᵢⱼ = 0 and Xᵢ and Xⱼ are uncorrelated. Generally, the converse is not true.
Correlated

If Cᵢⱼ > 0, then Xᵢ and Xⱼ are said to be positively correlated: Xᵢ > μᵢ and Xⱼ > μⱼ tend to occur together, and Xᵢ < μᵢ and Xⱼ < μⱼ tend to occur together.

If Cᵢⱼ < 0, then Xᵢ and Xⱼ are said to be negatively correlated: Xᵢ > μᵢ and Xⱼ < μⱼ tend to occur together, and Xᵢ < μᵢ and Xⱼ > μⱼ tend to occur together.
Correlation

ρᵢⱼ = Cᵢⱼ / √(σᵢ²σⱼ²) for i = 1, 2, …, n and j = 1, 2, …, n

−1 ≤ ρᵢⱼ ≤ 1

If ρᵢⱼ is close to +1, then Xᵢ and Xⱼ are highly positively correlated.
If ρᵢⱼ is close to −1, then Xᵢ and Xⱼ are highly negatively correlated.

For the random variables in Example 4.11,

Var(X) = Var(Y) = 1/25
Cor(X, Y) = Cov(X, Y) / √(Var(X)Var(Y)) = (−2/75)/(1/25) = −2/3
4.3 Simulation output data and stochastic processes

A stochastic process is a collection of "similar" random variables ordered over time, which are all defined on a common sample space.

• Discrete-time stochastic process: X₁, X₂, …
• Continuous-time stochastic process: {X(t), t ≥ 0}

The state space is the set of all possible values that these random variables can take on.

simulation: input random variables → output stochastic process

Example 4.19
Consider an M/M/1 queue with IID interarrival times A₁, A₂, …, IID service times S₁, S₂, …, and FIFO service. Define the discrete-time stochastic process of delays in queue D₁, D₂, … by

D₁ = 0
Dᵢ₊₁ = max{Dᵢ + Sᵢ − Aᵢ₊₁, 0} for i = 1, 2, …

Dᵢ and Dᵢ₊₁ are positively correlated.

The state space: the set of nonnegative real numbers.
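The delay recursion of Example 4.19 is straightforward to simulate. A minimal sketch with exponential interarrival and service times; the rates, run length, and seed are arbitrary choices, not values from the text:

```python
import random

def mm1_delays(lam, omega, n, seed=12345):
    """Delays in queue of an M/M/1 FIFO queue, via the recursion
    D_1 = 0, D_{i+1} = max(D_i + S_i - A_{i+1}, 0),
    with arrival rate lam and service rate omega."""
    rng = random.Random(seed)
    d = [0.0]
    for _ in range(n - 1):
        s = rng.expovariate(omega)   # service time S_i
        a = rng.expovariate(lam)     # interarrival time A_{i+1}
        d.append(max(d[-1] + s - a, 0.0))
    return d

delays = mm1_delays(lam=1.0, omega=2.0, n=50_000)

# Successive delays are positively correlated: lag-1 sample correlation > 0
m = sum(delays) / len(delays)
num = sum((delays[i] - m) * (delays[i + 1] - m) for i in range(len(delays) - 1))
den = sum((x - m) ** 2 for x in delays)
assert num / den > 0.2
```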
Example 4.20
For the queueing system of Example 4.19, let Q(t) be the number of customers in the queue at time t. Then {Q(t), t ≥ 0} is a continuous-time stochastic process with state space {0, 1, 2, …}.
Covariance-stationary

Assumptions about the stochastic process are necessary to draw inferences in practice.

A discrete-time stochastic process X₁, X₂, … is said to be covariance-stationary if

μᵢ = μ for i = 1, 2, … and −∞ < μ < ∞
σᵢ² = σ² for i = 1, 2, … and σ² < ∞

and Cᵢ,ᵢ₊ⱼ = Cov(Xᵢ, Xᵢ₊ⱼ) is independent of i for j = 1, 2, ….

Covariance-stationary process

For a covariance-stationary process, the mean and variance are stationary over time, and the covariance between two observations Xᵢ and Xᵢ₊ⱼ depends only on the separation j and not on the actual time values i and i + j.

We denote the covariance and correlation between Xᵢ and Xᵢ₊ⱼ by Cⱼ and ρⱼ, respectively, where

ρⱼ = Cᵢ,ᵢ₊ⱼ / √(σᵢ²σᵢ₊ⱼ²) = Cⱼ/σ² = Cⱼ/C₀ for j = 0, 1, 2, …
Example 4.22
Consider the output process of delays D₁, D₂, … for a covariance-stationary M/M/1 queue with ρ = λ/ω < 1.
Warmup period

In general, output processes for queueing systems are positively correlated.

If X₁, X₂, … is a stochastic process beginning at time 0 in a simulation, then it is quite likely not to be covariance-stationary. However, for some simulations Xₖ₊₁, Xₖ₊₂, … will be approximately covariance-stationary if k is large enough, where k is the length of the warmup period.
4.4 Estimation of means, variances, and correlations
Suppose X₁, X₂, …, Xₙ are IID random variables with finite population mean μ and finite population variance σ².

Sample mean: X̄(n) = Σ_{i=1}^{n} Xᵢ / n

Sample variance: S²(n) = Σ_{i=1}^{n} [Xᵢ − X̄(n)]² / (n − 1)

Unbiased estimators: E[X̄(n)] = μ and E[S²(n)] = σ².
[Figure: density function for X̄(n), with the first and second observations of X̄(n) marked.]

How close is X̄(n) to μ?
Var[X̄(n)] = Var[(1/n) Σ_{i=1}^{n} Xᵢ]
          = (1/n²) Var[Σ_{i=1}^{n} Xᵢ] (because the Xᵢ's are independent)
          = (1/n²) Σ_{i=1}^{n} Var(Xᵢ)
          = (1/n²) nσ² = σ²/n

An unbiased estimator of Var[X̄(n)] is

V̂ar[X̄(n)] = S²(n)/n = Σ_{i=1}^{n} [Xᵢ − X̄(n)]² / [n(n − 1)]
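The identity Var[X̄(n)] = σ²/n can be checked by replicating the sample mean many times. A sketch with Uniform(0, 1) data; the sample size, replication count, and seed are arbitrary:

```python
import random

random.seed(1)
n, reps = 10, 20_000
sigma2 = 1 / 12   # variance of Uniform(0, 1)

# Build 'reps' independent sample means of n IID Uniform(0, 1) observations
means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

grand = sum(means) / reps
var_of_mean = sum((m - grand) ** 2 for m in means) / (reps - 1)

assert abs(grand - 0.5) < 0.01                # E[Xbar(n)] = mu = 1/2
assert abs(var_of_mean - sigma2 / n) < 0.001  # Var[Xbar(n)] = sigma^2 / n
```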
How close X̄(n) is to μ is assessed by constructing a confidence interval.

[Figure: distributions of X̄(n) for small and large n — the density of X̄(n) concentrates around μ as n grows.]

S²(n)/n is an unbiased estimator of Var[X̄(n)] when the Xᵢ's are independent, since independent Xᵢ's are uncorrelated and hence ρⱼ = 0. However, simulation output data are almost always correlated.
Suppose X₁, X₂, …, Xₙ are from a covariance-stationary stochastic process. Then X̄(n) is still an unbiased estimator of μ; however, S²(n) is no longer an unbiased estimator of σ², since

E[S²(n)] = σ² [1 − 2 Σ_{j=1}^{n−1} (1 − j/n) ρⱼ / (n − 1)] (1)

If ρⱼ > 0, then E[S²(n)] < σ².
For a covariance-stationary process:

Var[X̄(n)] = σ² [1 + 2 Σ_{j=1}^{n−1} (1 − j/n) ρⱼ] / n (2)

If one estimates Var[X̄(n)] from S²(n)/n (correct in the IID case), there are two errors:
• the bias in S²(n) as an estimator of σ²
• the neglect of the correlation terms in Eq. (2)

Solution: combine Eq. (1) and Eq. (2):

E[S²(n)/n] = {[n/a(n)] − 1} / (n − 1) · Var[X̄(n)] (3)

where a(n) = 1 + 2 Σ_{j=1}^{n−1} (1 − j/n) ρⱼ.

If ρⱼ > 0, then a(n) > 1 and E[S²(n)/n] < Var[X̄(n)].
Example 4.24
Suppose D₁, D₂, …, D₁₀ are from the process of delays for a covariance-stationary M/M/1 queue with ρ = 0.9. From Eqs. (1) and (3),

E[S²(10)] = 0.0328 σ²
E[S²(10)/10] = 0.0034 Var[D̄(10)]

where σ² = Var(Dᵢ), D̄(10) = Σ_{i=1}^{10} Dᵢ / 10, and S²(10)/10 = Σ_{i=1}^{10} [Dᵢ − D̄(10)]² / 90.

Thus, S²(10)/10 is a gross underestimate of Var[D̄(10)], and we are likely to be overly optimistic about the closeness of D̄(10) to μ = E(Dᵢ).
Estimation of ρⱼ:

ρ̂ⱼ = Ĉⱼ / S²(n), where Ĉⱼ = Σ_{i=1}^{n−j} [Xᵢ − X̄(n)][Xᵢ₊ⱼ − X̄(n)] / (n − j)

In general, "good" estimates of the ρⱼ's will be difficult to obtain unless n is very large and j is small relative to n.
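The estimator above translates directly into code. A sketch, tested on a synthetic AR(1) sequence — an illustrative process chosen because its true lag-1 correlation is known, not something from the text:

```python
import random

def rho_hat(xs, j):
    """Estimate rho_j as C_hat_j / S^2(n), with
    C_hat_j = sum_{i=1}^{n-j} (X_i - Xbar)(X_{i+j} - Xbar) / (n - j)."""
    n = len(xs)
    xbar = sum(xs) / n
    c_j = sum((xs[i] - xbar) * (xs[i + j] - xbar) for i in range(n - j)) / (n - j)
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return c_j / s2

# AR(1) with phi = 0.5 has true lag-j correlation rho_j = 0.5^j
rng = random.Random(7)
xs, x = [], 0.0
for _ in range(100_000):
    x = 0.5 * x + rng.gauss(0, 1)
    xs.append(x)

assert abs(rho_hat(xs, 1) - 0.5) < 0.05
```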
4.5.1 Confidence Intervals

Let Zₙ = [X̄(n) − μ] / √(σ²/n) and let Fₙ(z) = P(Zₙ ≤ z).

Central limit theorem: Fₙ(z) → Φ(z) as n → ∞, where Φ(z), the distribution function of a normal random variable with μ = 0 and σ² = 1, is given by

Φ(z) = (1/√(2π)) ∫_{−∞}^{z} e^{−y²/2} dy for −∞ < z < ∞

If n is "sufficiently large", the random variable Zₙ will be approximately distributed as a standard normal random variable, regardless of the underlying distribution of the Xᵢ's. For large n, the sample mean X̄(n) is approximately distributed as a normal random variable with mean μ and variance σ²/n.
Since σ² is unknown, replace it by S²(n) and use the statistic tₙ = [X̄(n) − μ] / √(S²(n)/n). For large n,

P(−z₁₋α/₂ ≤ [X̄(n) − μ]/√(S²(n)/n) ≤ z₁₋α/₂)
  = P(X̄(n) − z₁₋α/₂ √(S²(n)/n) ≤ μ ≤ X̄(n) + z₁₋α/₂ √(S²(n)/n)) ≈ 1 − α

where z₁₋α/₂ is the upper 1 − α/2 critical point for a standard normal random variable.
[Figure: standard normal density with shaded area 1 − α between −z₁₋α/₂ and z₁₋α/₂.]
If n is sufficiently large, an approximate 100(1 − α) percent confidence interval for μ is given by

X̄(n) ± z₁₋α/₂ √(S²(n)/n)

Interpretation I: If one constructs a very large number of independent 100(1 − α) percent confidence intervals, each based on n observations, where n is sufficiently large, the proportion of these confidence intervals that contain (cover) μ should be 1 − α.
Interpretation II: If the Xᵢ's are normal random variables, the random variable tₙ = [X̄(n) − μ]/√(S²(n)/n) has a t distribution with n − 1 degrees of freedom (df), and an exact (for any n ≥ 2) 100(1 − α) percent confidence interval for μ is given by

X̄(n) ± tₙ₋₁,₁₋α/₂ √(S²(n)/n)

where tₙ₋₁,₁₋α/₂ is the upper 1 − α/2 critical point for the t distribution with n − 1 df.

[Figure 4.16: density functions for the t distribution with 4 df and for the standard normal distribution.]
Example 4.26
Suppose the 10 observations 1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09 are from a normal distribution, and we wish to construct a 90% confidence interval for μ.

X̄(10) = 1.34 and S²(10) = 0.17, so the confidence interval is

X̄(10) ± t₉,₀.₉₅ √(S²(10)/10) = 1.34 ± 1.83 √(0.17/10) = 1.34 ± 0.24
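Example 4.26 can be reproduced directly. A sketch in plain Python; the critical value t₉,₀.₉₅ ≈ 1.833 is hard-coded, since the standard library has no t-quantile function:

```python
import math

# The 10 observations from Example 4.26
xs = [1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09]
n = len(xs)

xbar = sum(xs) / n                               # sample mean Xbar(10)
s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)  # sample variance S^2(10)

t_crit = 1.833                                   # t_{9, 0.95}
half = t_crit * math.sqrt(s2 / n)                # confidence-interval half-width

assert abs(xbar - 1.34) < 0.01
assert abs(s2 - 0.17) < 0.01
assert abs(half - 0.24) < 0.01   # the 90% CI is 1.34 +/- 0.24
```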
Coverage depends on the skewness ν = E[(X − μ)³] / (σ²)^{3/2}:

Table 4.1 Estimated coverages based on 500 experiments

Distribution      Skewness ν   n=5    n=10   n=20   n=40
Normal            0.00         0.910  0.902  0.898  0.900
Exponential       2.00         0.854  0.878  0.870  0.890
Chi square        2.83         0.810  0.830  0.848  0.890
Lognormal         6.18         0.758  0.768  0.842  0.852
Hyperexponential  6.43         0.584  0.586  0.682  0.774
4.5.2 Hypothesis tests for the mean

To test the null hypothesis H₀: μ = μ₀:

If |X̄(n) − μ₀| is large, H₀ is not likely to be true.

If H₀ is true (and the Xᵢ's are normal), the statistic tₙ = [X̄(n) − μ₀]/√(S²(n)/n) will have a t distribution with n − 1 df.

• If |tₙ| > tₙ₋₁,₁₋α/₂, reject H₀.
• If |tₙ| ≤ tₙ₋₁,₁₋α/₂, "accept" H₀.
Example 4.27
For Example 4.26, to test the null hypothesis H₀: μ = 1 at level α = 0.10:

t₁₀ = [X̄(10) − 1]/√(S²(10)/10) = 0.34/√(0.17/10) = 2.65 > 1.83 = t₉,₀.₉₅

We reject H₀.
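The test in Example 4.27 follows the same lines. A sketch reusing the Example 4.26 data, with the critical value again hard-coded:

```python
import math

# The 10 observations from Example 4.26
xs = [1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09]
n, mu0 = len(xs), 1.0

xbar = sum(xs) / n
s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
t_n = (xbar - mu0) / math.sqrt(s2 / n)   # test statistic

t_crit = 1.833                           # t_{9, 0.95}
assert abs(t_n - 2.65) < 0.02            # matches the value in the example
assert abs(t_n) > t_crit                 # so H0: mu = 1 is rejected
```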
4.6 The Strong Law of Large Numbers

Theorem 4.2: X̄(n) → μ w.p. 1 as n → ∞.
Example 4.29
Homework II: Probability and statistics theory
A.M. Law and W.D. Kelton, Simulation Modeling and Analysis, 3rd edition, pp. 261-263.
Problems 4.1, 4.2, 4.4, 4.7, 4.9, 4.10, 4.13, 4.20, 4.21, 4.23, 4.24, 4.25, 4.26