ENM 307 SIMULATION
Anadolu University,
Department of Industrial Engineering
2011-2012 Spring, Asst. Prof. Dr. Gürkan ÖZTÜRK
Input Modelling
Input modeling
Four steps in the development of a useful model of input data
◦ Collect data from the real system of interest.
◦ Identify a probability distribution to represent the input process.
◦ Choose parameters that determine a specific instance of the distribution family.
◦ Evaluate the chosen distribution and the associated parameters for goodness of fit.
Standalone programs
◦ ExpertFit®, Stat::Fit®
Integrated programs
◦ Arena’s Input Analyzer®, @Risk’s BestFit®
Data Collection
Are data readily available?
Data collection is one of the biggest tasks in solving a real problem.
Even when data are available, they have rarely been recorded in a form that is directly useful for simulation input modeling.
If the input data are
◦ inaccurately collected,
◦ inappropriately analyzed or
◦ not representative of the environment,
the simulation output will be misleading and possibly damaging or costly when used for policy or decision making.
Data Collection
The Laundromat
◦ Ten washing machines and six dryers.
◦ The interarrival-time distribution was not homogeneous: it varied with the time of day and the day of the week.
◦ Open 7 days a week, 16 hours per day (112 hours per week).
◦ Limited resources: the two students available were also taking four courses.
◦ Time constraint: the simulation was to be completed in a 4-week period.
◦ The distribution of time between arrivals during one week might not have been followed during the next week.
◦ As a compromise, a sample of times was selected, and the interarrival-time distributions were classified according to arrival rate: high, medium, and low.
◦ Service-time distributions also presented a difficult problem: various service combinations, numbered machines, membership, and dependence between washer and dryer demands.
◦ Machine breakdowns also occurred: the length of a breakdown varied from a few moments to several days.
Data Collection
Lessons can be learned from an actual experience at data collection.
Suggestions to enhance and facilitate data collection
1. A useful expenditure of time is in planning.
2. Try to analyze the data as they are being collected.
3. Try to combine homogeneous data sets.
4. Be aware of the possibility of data censoring.
5. To discover whether there is a relationship between two variables, build a scatter diagram.
6. Consider the possibility that a sequence of observations that appears to be independent actually has autocorrelation.
7. Keep in mind the difference between input data and output data.
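Suggestions 5 and 6 can be checked numerically as well as visually. The sketch below is only an illustration: the function names and both data sets are hypothetical, and the sample correlation and the lag-1 sample autocorrelation are used here as simple screening statistics, not as formal tests.

```python
from statistics import mean


def pearson_correlation(xs, ys):
    """Sample correlation between paired observations xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def lag1_autocorrelation(xs):
    """Lag-1 sample autocorrelation of observations kept in collection order."""
    m = mean(xs)
    num = sum((xs[i] - m) * (xs[i + 1] - m) for i in range(len(xs) - 1))
    den = sum((x - m) ** 2 for x in xs)
    return num / den


# Hypothetical paired data: washer loads and dryer loads per customer.
washer_loads = [1, 2, 2, 3, 1, 4, 2, 3]
dryer_loads = [1, 1, 2, 2, 1, 3, 2, 2]

# Hypothetical service times (minutes), in the order they were observed.
service_times = [2.0, 2.5, 3.1, 3.6, 4.0, 3.7, 3.0, 2.4, 2.1, 2.6, 3.2, 3.8]

print("washer/dryer correlation:", round(pearson_correlation(washer_loads, dryer_loads), 3))
print("lag-1 autocorrelation:", round(lag1_autocorrelation(service_times), 3))
```

Values far from zero suggest that the two variables, or successive observations, should not be modeled as independent inputs.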
Identifying the distribution with data
Histograms
◦ A frequency distribution or histogram is useful in identifying the shape of a distribution.
A histogram is constructed as follows:
1. Divide the range of the data into intervals.
2. Label the horizontal axis to conform to the intervals selected.
3. Find the frequency of occurrences within each interval.
4. Label the vertical axis so that the total occurrences can be plotted for each interval.
5. Plot the frequencies on the vertical axis.
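The construction steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming equal-width intervals and data no smaller than the chosen starting point; the function names and the sample data are hypothetical.

```python
import math


def build_histogram(data, width, start=0.0):
    """Count observations in intervals [start + k*width, start + (k+1)*width).

    Assumes every observation is >= start.
    """
    n_bins = int(math.floor((max(data) - start) / width)) + 1
    counts = [0] * n_bins
    for x in data:
        k = int((x - start) // width)  # index of the interval containing x
        counts[k] += 1
    intervals = [(start + k * width, start + (k + 1) * width) for k in range(n_bins)]
    return intervals, counts


def print_histogram(intervals, counts):
    """Text plot: one row per interval, one '*' per observation."""
    for (lo, hi), c in zip(intervals, counts):
        print(f"{lo:6.1f} <= x < {hi:6.1f} | {'*' * c} ({c})")


# Hypothetical observations, grouped into intervals of width 3.
sample = [0.4, 1.2, 2.9, 3.0, 3.5, 4.1, 6.2, 7.7, 8.9, 11.5]
intervals, counts = build_histogram(sample, width=3.0)
print_histogram(intervals, counts)
```

Rerunning the sketch with a different `width` is a quick way to see how much the apparent shape depends on the intervals chosen.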
Arrivals per period    Frequency
0                      12
1                      10
2                      19
3                      17
4                      10
5                      8
6                      7
7                      5
8                      5
9                      3
10                     3
11                     1
[Figure: histogram of frequency versus number of arrivals per period]
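A frequency table like the one above can be summarized and compared with a candidate discrete distribution. The sketch below is an illustration only: it assumes the Poisson family as the candidate and uses the sample mean as the rate estimate; the variable and function names are hypothetical.

```python
import math

# Frequency table from the slide: number of arrivals per period and its frequency.
arrivals = list(range(12))
frequency = [12, 10, 19, 17, 10, 8, 7, 5, 5, 3, 3, 1]

n_periods = sum(frequency)
sample_mean = sum(k * f for k, f in zip(arrivals, frequency)) / n_periods


def poisson_pmf(k, rate):
    """P(X = k) for a Poisson random variable with the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


print(f"periods observed: {n_periods}, mean arrivals per period: {sample_mean:.2f}")
print("k  observed  expected (Poisson, assumed)")
for k, f in zip(arrivals, frequency):
    expected = n_periods * poisson_pmf(k, sample_mean)
    print(f"{k:2d} {f:8d} {expected:9.1f}")
```

Large gaps between the observed and expected columns would argue against the assumed family; a formal goodness-of-fit test is the next step in the four-step procedure.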
Component life data (50 observations):
79.919    3.081    0.062    1.961    5.845
3.027     6.505    0.021    0.013    0.123
6.769     59.899   1.192    34.760   5.009
18.387    0.141    43.565   24.420   0.433
144.695   2.663    17.967   0.091    9.003
0.941     0.878    3.371    2.157    7.579
0.624     5.380    3.148    7.078    23.960
0.590     1.928    0.300    0.002    0.543
7.004     31.764   1.005    1.147    0.219
3.217     14.382   1.008    2.336    4.562

Component life      Frequency
0 ≤ xj < 3          24
3 ≤ xj < 6          9
6 ≤ xj < 9          5
9 ≤ xj < 12         1
12 ≤ xj < 15        1
15 ≤ xj < 18        1
18 ≤ xj < 21        1
21 ≤ xj < 24        1
24 ≤ xj < 27        1
27 ≤ xj < 30        0
30 ≤ xj < 33        1
33 ≤ xj < 36        1
36 ≤ xj < 39        0
39 ≤ xj < 42        0
42 ≤ xj < 45        1
...
144 ≤ xj < 147      1
[Figure: Histogram of component life]
Selecting the family of distributions
The purpose of preparing a histogram is to infer a known pdf or pmf.
There are literally hundreds of probability distributions.
Examples of physical properties of the distributions
◦ Binomial
◦ Negative Binomial (includes the geometric distribution)
◦ Poisson
◦ Normal
◦ Lognormal
◦ Exponential
◦ Gamma
◦ Beta
◦ Erlang
◦ Discrete or Continuous Uniform
◦ Triangular
◦ Empirical
Quantile-Quantile plots
When there is a small number of data points (<= 30), a histogram can be ragged.
Our perception of the fit depends on the widths of the histogram intervals.
Even if the intervals are chosen well, grouping data into cells makes it difficult to compare a histogram to a continuous probability density function.
A q-q plot is a useful tool for evaluating distribution fit.
◦ If X is a random variable with cdf F, then the q-quantile of X is that value $\gamma$ such that
$F(\gamma) = P(X \le \gamma) = q$, for $0 < q < 1$. When F has an inverse, $\gamma = F^{-1}(q)$.
◦ Let {xi, i=1,2,…,n} be a sample of data from X.
◦ Order the observations from the smallest to the largest: {yj, j=1,2,…,n}, where y1 ≤ y2 ≤ … ≤ yn.
◦ Then $y_j$ is approximately $F^{-1}\left(\frac{j - 1/2}{n}\right)$.
xi       j    (j-0.5)/n   yj       zj      F^-1((j-0.5)/n)
99.79    1    0.025       99.55    -1.54   -1.96
100.26   2    0.075       99.56    -1.51   -1.44
100.23   3    0.125       99.62    -1.29   -1.15
99.55    4    0.175       99.65    -1.19   -0.93
99.96    5    0.225       99.79    -0.69   -0.76
99.56    6    0.275       99.82    -0.59   -0.60
100.41   7    0.325       99.83    -0.55   -0.45
100.27   8    0.375       99.85    -0.48   -0.32
99.62    9    0.425       99.90    -0.31   -0.19
99.90    10   0.475       99.96    -0.09   -0.06
100.17   11   0.525       99.98    -0.02   0.06
99.98    12   0.575       100.02   0.12    0.19
100.02   13   0.625       100.06   0.26    0.32
99.65    14   0.675       100.17   0.65    0.45
100.06   15   0.725       100.23   0.86    0.60
100.33   16   0.775       100.26   0.97    0.76
99.83    17   0.825       100.27   1.00    0.93
100.47   18   0.875       100.33   1.21    1.15
99.82    19   0.925       100.41   1.50    1.44
99.85    20   0.975       100.47   1.71    1.96
Sample mean m = 99.99, sample standard deviation s = 0.2832; zj = (yj - m)/s.
[Figure: q-q plot of the ordered observations against the standard normal quantiles]
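The q-q table can be reproduced with the Python standard library. The sketch below is an illustration under one assumption: the candidate distribution is normal, so the theoretical quantiles come from the standard normal inverse cdf (`statistics.NormalDist().inv_cdf`). The function name is hypothetical; the data are the 20 observations from the slide.

```python
from statistics import NormalDist, mean, stdev

# The 20 observations x_i from the slide, in collection order.
observations = [
    99.79, 100.26, 100.23, 99.55, 99.96, 99.56, 100.41, 100.27, 99.62, 99.90,
    100.17, 99.98, 100.02, 99.65, 100.06, 100.33, 99.83, 100.47, 99.82, 99.85,
]


def normal_qq_points(data):
    """Return (standard normal quantile, ordered observation) pairs.

    The j-th smallest observation y_j is paired with F^-1((j - 0.5) / n),
    where F is the standard normal cdf (the assumed candidate family).
    """
    n = len(data)
    std_normal = NormalDist()
    return [
        (std_normal.inv_cdf((j - 0.5) / n), y)
        for j, y in enumerate(sorted(data), start=1)
    ]


m, s = mean(observations), stdev(observations)
print(f"sample mean = {m:.2f}, sample standard deviation = {s:.4f}")
for quantile, y in normal_qq_points(observations):
    print(f"{quantile:6.2f}  {y:7.2f}")
```

If the normal assumption is reasonable, the printed pairs lie close to the straight line y = m + s * quantile.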
The evaluation of a q-q plot
1. The observed values will never fall exactly on a straight line.
2. The ordered values are not independent, because they have been ranked.
3. The variances of the extremes are much higher than the variances in the middle of the plot.
q-q plots can also be used to compare two samples of data to see whether they can be represented by the same distribution.
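For the two-sample case, one simple approach is to plot the ordered values of one sample against the ordered values of the other. The sketch below is a minimal illustration that assumes both samples have the same size; the function name and both samples are hypothetical.

```python
def two_sample_qq_points(sample_a, sample_b):
    """Pair the j-th smallest value of sample_a with the j-th smallest of sample_b.

    Assumes equal sample sizes, so that matching order statistics estimate
    the same quantile in both samples.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("this sketch only handles samples of equal size")
    return list(zip(sorted(sample_a), sorted(sample_b)))


# Hypothetical interarrival times (minutes) collected in two different weeks.
week_1 = [1.8, 0.4, 3.1, 0.9, 2.2, 5.6, 0.2, 1.1]
week_2 = [0.5, 2.0, 1.0, 0.3, 6.1, 2.9, 1.6, 1.2]

for a, b in two_sample_qq_points(week_1, week_2):
    print(f"{a:5.1f}  {b:5.1f}")
```

Points that stay close to the line y = x suggest that the two samples can be represented by the same distribution.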