
1

OR II
GSLM 52800


2

Outline

introduction to discrete-time Markov Chain

problem statement

long-term average cost per unit time

solving the MDP by linear programming


3

States of a Machine

State Condition

0 Good as new

1 Operable – minor deterioration

2 Operable – major deterioration

3 Inoperable – output of unacceptable quality


4

Transition of States

(rows: current state; columns: next state; entries are one-step transition probabilities)

State 0 1 2 3

0 0 7/8 1/16 1/16

1 0 3/4 1/8 1/8

2 0 0 1/2 1/2

3 0 0 0 1


5

Possible Actions

Decision Action Relevant States

1 Do nothing 0, 1, 2

2 Overhaul (return to state 1) 2

3 Replace (return to state 0) 1, 2, 3


6

Problem

adopting different collections of actions leads to different long-term average costs per unit time

problem: find the policy that minimizes the long-term average cost per unit time


7

Costs of the Problem

cost of defective items: state 0: 0; state 1: 1000; state 2: 3000

cost of replacing the machine = 4000

cost of lost production during a machine replacement = 2000 (so a replacement costs 6000 in total)

cost of overhauling (at state 2) = 2000 (together with the lost production, an overhaul costs 4000 in total, the figure used below)


8

Policy R_d: Always Replace When in a State ≠ 0

from state 0 the machine always deteriorates in one step, and from any other state it is replaced and returns to state 0 in one step, so the chain alternates between state 0 and the other states

half of the time the machine is in state 0, with cost 0

half of the time it is in one of the other states, each with cost 6000 because of machine replacement

average cost per unit time = (1/2)(0) + (1/2)(6000) = 3000

[Transition diagram under R_d: state 0 → 1, 2, 3 with probabilities 7/8, 1/16, 1/16; states 1, 2, and 3 → 0 with probability 1.]


9

Long-Term Average Cost of a Positive Recurrent, Irreducible Discrete-Time Markov Chain

for a positive recurrent, irreducible discrete-time Markov chain with M+1 states 0, …, M, the stationary probabilities \pi_i satisfy

balance equations: \pi_j = \sum_{i=0}^{M} \pi_i p_{ij}, \quad j = 0, \ldots, M

normalization equation: \sum_{i=0}^{M} \pi_i = 1

only M of the M+1 balance equations are independent, so solve M of them together with the normalization equation
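As a numerical illustration (not from the original slides), the sketch below solves M of the balance equations together with the normalization equation for a given transition matrix; numpy and the function name stationary_distribution are my own choices. It is shown here on the chain induced by policy R_a from the next slide, and reproduces the stationary probabilities quoted there.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary probabilities of an irreducible discrete-time Markov chain.

    Solves the balance equations pi = pi P, with one of them replaced by
    the normalization equation sum(pi) = 1.
    """
    n = P.shape[0]                 # number of states, M + 1
    A = P.T - np.eye(n)            # balance equations: (P^T - I) pi = 0
    A[-1, :] = 1.0                 # replace the last one by sum(pi) = 1
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# transition matrix under policy R_a (replace at failure, otherwise do nothing)
P_Ra = np.array([
    [0.0, 7/8, 1/16, 1/16],
    [0.0, 3/4, 1/8,  1/8 ],
    [0.0, 0.0, 1/2,  1/2 ],
    [1.0, 0.0, 0.0,  0.0 ],
])

print(stationary_distribution(P_Ra))   # [2/13, 7/13, 2/13, 2/13]
```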


10

Policy R_a: Replace at Failure, but Otherwise Do Nothing

[Transition diagram under R_a: state 0 → 1, 2, 3 with probabilities 7/8, 1/16, 1/16; state 1 → 1, 2, 3 with probabilities 3/4, 1/8, 1/8; state 2 → 2, 3 with probabilities 1/2, 1/2; state 3 → 0 with probability 1.]

balance equations:
\pi_0 = \pi_3
\pi_1 = \tfrac{7}{8}\pi_0 + \tfrac{3}{4}\pi_1
\pi_2 = \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 + \tfrac{1}{2}\pi_2
\pi_3 = \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 + \tfrac{1}{2}\pi_2

normalization: \pi_0 + \pi_1 + \pi_2 + \pi_3 = 1

solution: \pi_0 = \tfrac{2}{13}, \; \pi_1 = \tfrac{7}{13}, \; \pi_2 = \tfrac{2}{13}, \; \pi_3 = \tfrac{2}{13}

long-term average cost per unit time = \pi_0(0) + \pi_1(1000) + \pi_2(3000) + \pi_3(6000) = \tfrac{25000}{13} \approx 1923
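For completeness (the intermediate algebra is not shown on the slide): the second balance equation gives \tfrac{1}{4}\pi_1 = \tfrac{7}{8}\pi_0, so \pi_1 = \tfrac{7}{2}\pi_0; substituting into the third, \tfrac{1}{2}\pi_2 = \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\cdot\tfrac{7}{2}\pi_0 = \tfrac{1}{2}\pi_0, so \pi_2 = \pi_0, and likewise \pi_3 = \pi_0. The normalization equation then gives \pi_0\left(1 + \tfrac{7}{2} + 1 + 1\right) = 1, i.e., \pi_0 = \tfrac{2}{13}.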


11

Policy R_b: Replace in State 3, and Overhaul in State 2

balance equations:
\pi_0 = \pi_3
\pi_1 = \tfrac{7}{8}\pi_0 + \tfrac{3}{4}\pi_1 + \pi_2
\pi_2 = \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1
\pi_3 = \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1

normalization: \pi_0 + \pi_1 + \pi_2 + \pi_3 = 1

solution: \pi_0 = \tfrac{2}{21}, \; \pi_1 = \tfrac{5}{7}, \; \pi_2 = \tfrac{2}{21}, \; \pi_3 = \tfrac{2}{21}

long-term average cost per unit time = \pi_0(0) + \pi_1(1000) + \pi_2(4000) + \pi_3(6000) = \tfrac{35000}{21} \approx 1667

[Transition diagram under R_b: state 0 → 1, 2, 3 with probabilities 7/8, 1/16, 1/16; state 1 → 1, 2, 3 with probabilities 3/4, 1/8, 1/8; state 2 → 1 with probability 1 (overhaul); state 3 → 0 with probability 1 (replace).]


12

Policy R_c: Replace in States 2 and 3

balance equations:
\pi_0 = \pi_2 + \pi_3
\pi_1 = \tfrac{7}{8}\pi_0 + \tfrac{3}{4}\pi_1
\pi_2 = \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1
\pi_3 = \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1

normalization: \pi_0 + \pi_1 + \pi_2 + \pi_3 = 1

solution: \pi_0 = \tfrac{2}{11}, \; \pi_1 = \tfrac{7}{11}, \; \pi_2 = \tfrac{1}{11}, \; \pi_3 = \tfrac{1}{11}

long-term average cost per unit time = \pi_0(0) + \pi_1(1000) + \pi_2(6000) + \pi_3(6000) = \tfrac{19000}{11} \approx 1727

[Transition diagram under R_c: state 0 → 1, 2, 3 with probabilities 7/8, 1/16, 1/16; state 1 → 1, 2, 3 with probabilities 3/4, 1/8, 1/8; states 2 and 3 → 0 with probability 1 (replace).]
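A quick cross-check of the three averages above (added here; the stationary distributions and per-state costs are taken directly from the slides, and the dictionary layout is my own):

```python
import numpy as np

# (stationary distribution, cost per unit time in each state), from the slides
policies = {
    "R_a (replace at failure)":          (np.array([2/13, 7/13, 2/13, 2/13]),
                                          np.array([0, 1000, 3000, 6000])),
    "R_b (replace in 3, overhaul in 2)": (np.array([2/21, 15/21, 2/21, 2/21]),
                                          np.array([0, 1000, 4000, 6000])),
    "R_c (replace in 2 and 3)":          (np.array([2/11, 7/11, 1/11, 1/11]),
                                          np.array([0, 1000, 6000, 6000])),
}

for name, (pi, cost) in policies.items():
    # long-term average cost per unit time = sum_i pi_i * cost_i
    print(f"{name}: {pi @ cost:.0f}")    # about 1923, 1667, 1727
```

Together with the 3000 per unit time found earlier for R_d, this confirms that R_b is the cheapest of the four policies considered.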


13

Problem

in this case the minimum-cost policy is R_b, i.e., replacing in state 3 and overhauling in state 2

question: is there an efficient way to find the minimum-cost policy when there are many states and many possible actions? Checking all possible policies one by one quickly becomes impractical.


14

Linear Programming Approach for an MDP

let

D_{ik} = the probability of adopting decision k at state i

\pi_i = the stationary probability of state i

y_{ik} = P(state i and decision k)

C_{ik} = the cost of adopting decision k at state i


15

Linear Programming Approach for an MDP

recall the balance and normalization equations:
\pi_j = \sum_{i=0}^{M} \pi_i p_{ij}, \quad j = 0, \ldots, M; \qquad \sum_{i=0}^{M} \pi_i = 1

in terms of the y_{ik}:

y_{ik} = \pi_i D_{ik}, \quad i = 0, \ldots, M

\sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} = 1

\sum_{k=1}^{K} y_{jk} = \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} \, p_{ij}(k), \quad j = 0, 1, \ldots, M

y_{ik} \ge 0, \quad i = 0, 1, \ldots, M; \; k = 1, \ldots, K

E(C) = \sum_{i=0}^{M} \sum_{k=1}^{K} C_{ik} D_{ik} \pi_i = \sum_{i=0}^{M} \sum_{k=1}^{K} C_{ik} y_{ik}


16

Linear Programming Approach for an MDP

\min Z = \sum_{i=0}^{M} \sum_{k=1}^{K} C_{ik} y_{ik}

s.t. \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} = 1,

\sum_{k=1}^{K} y_{jk} - \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} \, p_{ij}(k) = 0, \quad j = 0, 1, \ldots, M,

y_{ik} \ge 0, \quad i = 0, 1, \ldots, M; \; k = 1, \ldots, K

at an optimal basic solution, each D_{ik} = y_{ik} / \sum_{k} y_{ik} equals 0 or 1, i.e., the optimal policy is deterministic


17

Linear Programming Approach for an MDP

actions that can be adopted at each state:

state 0: do nothing (i.e., k = 1)

state 1: do nothing or replace (i.e., k = 1 or 3)

state 2: do nothing, overhaul, or replace (i.e., k = 1, 2, or 3)

state 3: replace (i.e., k = 3)

variables: y_{01}, y_{11}, y_{13}, y_{21}, y_{22}, y_{23}, and y_{33}


18

Linear Programming Approach for an MDP

\min Z = 1000 y_{11} + 6000 y_{13} + 3000 y_{21} + 4000 y_{22} + 6000 y_{23} + 6000 y_{33}

s.t.

y_{01} + y_{11} + y_{13} + y_{21} + y_{22} + y_{23} + y_{33} = 1

y_{01} - (y_{13} + y_{23} + y_{33}) = 0

y_{11} + y_{13} - (\tfrac{7}{8} y_{01} + \tfrac{3}{4} y_{11} + y_{22}) = 0

y_{21} + y_{22} + y_{23} - (\tfrac{1}{16} y_{01} + \tfrac{1}{8} y_{11} + \tfrac{1}{2} y_{21}) = 0

y_{33} - (\tfrac{1}{16} y_{01} + \tfrac{1}{8} y_{11} + \tfrac{1}{2} y_{21}) = 0

y_{ik} \ge 0 for all i, k

State 0 1 2 3

0 0 7/8 1/16 1/16

1 0 3/4 1/8 1/8

2 0 0 1/2 1/2

3 0 0 0 1
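The LP above is small enough to solve with any solver; the following sketch (not from the original slides) uses scipy.optimize.linprog, with my own variable ordering y01, y11, y13, y21, y22, y23, y33. The equality constraints are the normalization equation and the four balance equations written out above.

```python
import numpy as np
from scipy.optimize import linprog

# variable order: y01, y11, y13, y21, y22, y23, y33
c = np.array([0, 1000, 6000, 3000, 4000, 6000, 6000])   # objective coefficients

A_eq = np.array([
    # normalization: all y_ik sum to 1
    [1,      1,     1,    1,    1,   1,   1],
    # state 0: y01 - (y13 + y23 + y33) = 0
    [1,      0,    -1,    0,    0,  -1,  -1],
    # state 1: y11 + y13 - (7/8 y01 + 3/4 y11 + y22) = 0
    [-7/8,   1/4,   1,    0,   -1,   0,   0],
    # state 2: y21 + y22 + y23 - (1/16 y01 + 1/8 y11 + 1/2 y21) = 0
    [-1/16, -1/8,   0,    1/2,  1,   1,   0],
    # state 3: y33 - (1/16 y01 + 1/8 y11 + 1/2 y21) = 0
    [-1/16, -1/8,   0,   -1/2,  0,   0,   1],
])
b_eq = np.array([1, 0, 0, 0, 0])

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 7)
print(res.x)     # approximately [2/21, 5/7, 0, 0, 2/21, 0, 2/21]
print(res.fun)   # approximately 1667
```

Reading off which y_{ik} are positive and using D_{ik} = y_{ik} / \sum_{k} y_{ik} then gives the deterministic policy reported on the next slide.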


19

Linear Programming Approach for an MDP

solving the LP gives y_{01} = 2/21, y_{11} = 5/7, y_{13} = 0, y_{21} = 0, y_{22} = 2/21, y_{23} = 0, y_{33} = 2/21

optimal policy at state 0: do nothing

state 1: do nothing

state 2: overhaul

state 3: replace

i.e., the LP recovers policy R_b, consistent with the direct comparison of the policies above