Optimal Control of “Deviant” Behavior
Gustav Feichtinger
Institute for Mathematical Methods in Economics
Vienna University of Technology
http://www.eos.tuwien.ac.at/OR/
Roadmap
• Prologue
• Optimal Control Applications in Economics and OR
• Pontryagin’s Maximum Principle
• Complex Behavior of Optimal Solutions
• The American Cocaine Epidemic
• The LHY – Model
• Countering Terrorism
• Conclusions
Prologue
The economics of ‘deviant’ behavior
Four cartoons:
• Sherlock Holmes vs. Professor Moriarty
• The corrupt politician
• Addiction and binge drinking
• Escalation from ‘light’ to ‘heavy’
The economics of crime: Gary Becker (1968)
How much crime should be tolerated?
Deviant behavior: violence, terrorism, corruption, extortion, illicit (& legal) drug consumption, HIV / AIDS
Epidemiological feedbacks: macro ↔ micro
• Easterlin effect
• Double-edged sword effect
• Stock-dependent recruitment
T. Schelling: social interactions, peer effects
How to start smoking: asymptotic influence of smokers and non smokers
Nonlinearities
Heterogeneities
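Schelling diagram
[Figure: utility curves $u_n(C)$ and $u_c(C)$ plotted against $C$ on [0, 1], with the tipping point $C_{\text{tipping}}$ marked on the $C$ axis.]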
C … degree of corruption,
c … corrupt, n … non-corrupt
Binary decision!
$u_c(C)$ … utility of being corrupt
$u_n(C)$ … utility of being non-corrupt
stable equilibrium
unstable equilibrium
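The tipping point in the diagram is the corruption level at which both choices pay the same. A minimal sketch, assuming hypothetical linear utility curves (the slide gives no functional forms, so `u_corrupt`, `u_noncorrupt`, and the adjustment rule `next_level` are illustrative inventions):

```python
def u_corrupt(C):
    """Hypothetical utility of being corrupt at corruption level C in [0, 1]."""
    return 0.2 + 0.9 * C


def u_noncorrupt(C):
    """Hypothetical utility of being non-corrupt at corruption level C."""
    return 0.6 + 0.1 * C


def tipping_point(lo=0.0, hi=1.0, tol=1e-10):
    """Bisection on u_c(C) - u_n(C); assumes exactly one crossing on [lo, hi]."""
    diff = lambda C: u_corrupt(C) - u_noncorrupt(C)
    if diff(lo) * diff(hi) > 0:
        raise ValueError("no sign change: no interior tipping point")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def next_level(C, step=0.1):
    """Crude adjustment rule: corruption spreads when it pays more, recedes otherwise."""
    drift = step if u_corrupt(C) > u_noncorrupt(C) else -step
    return min(1.0, max(0.0, C + drift))


def long_run(C0, rounds=50):
    """Iterate the adjustment rule from an initial corruption level C0."""
    C = C0
    for _ in range(rounds):
        C = next_level(C)
    return C


C_tip = tipping_point()
```

Under these assumed curves the crossing is the unstable equilibrium: starting just below it, `long_run` ends at the non-corrupt corner, and starting just above it, at the fully corrupt corner, which are the two stable equilibria.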
Gary S. Becker
University Professor, Department of Economics and Sociology
Professor, Graduate School of Business
The University of Chicago
Nobel Laureate
The Journal of Political Economy, Vol. 76, No. 2 (Mar.-Apr., 1968), pp. 169-217
Nobel Prize in Economic Sciences, 1992, for having extended the domain of microeconomic analysis to a wide range of human behavior and interaction, including nonmarket behavior
Optimization
NO Dynamics
Dynamics
NO Optimization
„In general this is a difficult problem […]“
Jonathan P. Caulkins
Stever Professor of Operations Research and Public Policy
Carnegie Mellon University
INFORMS President's Award, INFORMS 2010 Annual Meeting
The Corrupt Politician
Cyclical behavior can be rational from the politician’s point of view
Two sides of a coin
So far the simplest type of model in economics that leads to an optimal stable limit cycle
Result: Under certain parameter constellations a stable Hopf cycle is optimal
Feichtinger & Wirl (92)
[Diagram: signed influences (+ / −) among corruption c, accumulated corruption k, and popularity p.]
c(t) … actual (current) corruption (bribery)
p(t) … ‘popularity’ (reputation) of the politician
politician benefits both from corruption and popularity
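$$\max_{c(\cdot)} \int_0^\infty e^{-rt}\,\bigl[U(c) + V(p)\bigr]\,dt$$
[Figures: $U(c)$ plotted against $c \ge 0$; $V(p)$ plotted against $p \ge 0$.]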
k(t) … accumulated corruption (observed level of corruption)
2 states … k, p
1 control … c
totally separable, strictly concave
$$\dot k = c - \delta k, \qquad \dot p = g(p) - f(k)$$
[Figures: $g(p)$ plotted against $p \ge 0$; $f(k)$ plotted against $k \ge 0$.]
[Figure: regions along the $g'(\hat p)$ axis, labeled instability, limit cycles, and saddle-point stability.]
$\hat p$ … steady-state popularity at this equilibrium
High popularity sufficient for stable corruption.
For weak popularity: cyclical behavior may be optimal
[Figure: the optimal cycle in $k$ and $p$.]
Popularity p is the leading variable, bad reputation (- p) lags behind k (and c)
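For orientation, the optimality conditions of this two-state, one-control model can be written out explicitly. The following is a sketch under the assumption that the state dynamics are $\dot k = c - \delta k$ and $\dot p = g(p) - f(k)$ and that the optimal $c$ is interior; the depreciation rate $\delta$ and the adjoint names $\lambda$ (for $k$) and $\mu$ (for $p$) are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
H &= U(c) + V(p) + \lambda\,(c - \delta k) + \mu\,\bigl(g(p) - f(k)\bigr),\\
H_c &= U'(c) + \lambda = 0,\\
\dot\lambda &= r\lambda - H_k = (r + \delta)\,\lambda + \mu f'(k),\\
\dot\mu &= r\mu - H_p = \bigl(r - g'(p)\bigr)\,\mu - V'(p).
\end{aligned}
```

The term $g'(p)$ enters the adjoint equation for popularity directly, which is consistent with the slide's use of $g'(\hat p)$ to separate the stability regimes.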
Evans (1924), Ramsey (1928), Hotelling (1931)
Kurz (1965, 1968), Cass (1965,1966), Shell (1967,1969), Arrow & Kurz (1970)
Intriligator (1971), Stöppler (1974)
von Weizsäcker (1967), Näslund (1966, 1969), Thompson (1968), Bensoussan et al. (1974), Stepan (1977), Ludwig (1978), van Loon (1985)
Sethi & Thompson (1981), Kamien & Schwartz (1981), Seierstad & Sydsaeter (1986), Feichtinger & Hartl (1986), Leonard & Long (1992), Dockner et al. (2000), Grass et al. (2008)
Optimal Control Applications in Economics and OR
• Growth theory, capital accumulation, human capital
• Extraction of non-renewable resources, management of renewable resources
• Control of environmental pollution
• Dynamics of the firm: Production, inventory, finance, marketing, maintenance and replacement, planning, research and development, queuing systems
• Health planning: epidemiology of infectious diseases, illicit drugs
• Control of diseases, e.g. cancer treatment, pest control
• Control of violence and terrorism
• ‘Planning the unusual’
Pontryagin’s Maximum Principle
OCM:
$$\max_{u(\cdot)} \int_0^\infty e^{-rt}\,g(x,u)\,dt$$
$$\text{s.t.}\quad \dot x = f(x,u), \qquad x(0) = x_0$$
Discounted autonomous optimal control model with $T = \infty$
$x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$
Pontryagin, Boltyanskii, Gamkrelidze, Mishchenko (1962, 1959), Pesch & Bulirsch (1994), Carathéodory (1926, 1935)
H= λ0 g(x,u) + λf(x,u)
current-value Hamiltonian
V(x) ... optimal value at state x
λ ... c.v. adjoint variable (shadow price)
$\lambda \in \mathbb{R}^n$
$\lambda = V_x$
Hamiltonian as generalized utility function:
H = g + λf
[H] = [g] = value / time
[$\dot x$] = state / time
[λ] = [$V_x$] = value / state
Necessary and sufficient optimality conditions
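$$u^* = u(x,\lambda) = \arg\max_u H(x,u,\lambda), \qquad \dot x = H_\lambda, \qquad \dot\lambda = r\lambda - H_x$$
canonical system in (x,λ), qualitative analysis
economic interpretation
$$J = \begin{pmatrix} H_{\lambda x} & H_{\lambda\lambda} \\ -H_{xx} & r - H_{x\lambda} \end{pmatrix}$$
tr J = r > 0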
n = 1: saddle points and unstable steady states (foci and nodes), i.e. stability and limit cycles are ruled out
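The trace property can be checked on a concrete one-state case. A minimal sketch, assuming a hypothetical linear-quadratic problem that is not from the slides: maximize the discounted integral of $-(x^2 + u^2)/2$ subject to $\dot x = u - \delta x$. Maximizing $H = -(x^2+u^2)/2 + \lambda(u - \delta x)$ gives $u = \lambda$, so the canonical system is $\dot x = \lambda - \delta x$, $\dot\lambda = x + (r+\delta)\lambda$.

```python
import math


def canonical_jacobian(r, delta):
    """Jacobian of (x', lam') for the hypothetical LQ problem described above.

    x'   = lam - delta * x
    lam' = r * lam - H_x = x + (r + delta) * lam
    """
    return [[-delta, 1.0],
            [1.0, r + delta]]


def eigenvalues_2x2(J):
    """Real eigenvalues of a 2x2 matrix, plus its trace and determinant."""
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    disc = tr * tr / 4.0 - det
    if disc < 0:
        raise ValueError("complex eigenvalue pair")
    root = math.sqrt(disc)
    return tr / 2.0 - root, tr / 2.0 + root, tr, det


r, delta = 0.04, 0.1  # illustrative discount and decay rates
low, high, trace, det = eigenvalues_2x2(canonical_jacobian(r, delta))
```

Here the determinant is negative, so the eigenvalues have opposite signs and sum to $r$: the steady state is a saddle point, one of the two cases the slide allows for n = 1.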
Faced with x(t) at t ∈ [0,T), the control u(t) has
– a direct effect: instantaneous profit g(x,u)
– an indirect effect: change of the state
‘after me the deluge’ vs. Stalinist policy
Limiting transversality condition
Michel (1982)
dt/dxx
xxufgr),u,x(Hmax
$$\lim_{t\to\infty} e^{-rt}\lambda(t) = 0$$
Sufficient optimality condition: concavity of $H$ w.r.t. $(x,u)$ (Mangasarian), or of $\max_u H$ w.r.t. $x$ (Arrow)
Finite planning horizon T
S(x(T)) … salvage value at T
$$\max_{u(\cdot)} \int_0^T e^{-rt}\,g(x,u)\,dt + e^{-rT} S(x(T))$$
TPBV problem: canonical system with
– n initial conditions $x(0) = x_0$
– n terminal conditions $\lambda(T) = S_x(x(T))$
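A two-point boundary value problem of this kind can be solved by shooting on the unknown initial costate. The sketch below is an illustration under assumptions, not the method used in the talk: it takes a hypothetical linear-quadratic problem (integrand $-(x^2+u^2)/2$, dynamics $\dot x = u - \delta x$, zero salvage value, so $\lambda(T) = 0$) and exploits the fact that for a linear canonical system the terminal costate is an affine function of $\lambda(0)$, so two trial integrations suffice.

```python
def rhs(x, lam, r, delta):
    """Canonical system of the hypothetical LQ problem (with u = lam)."""
    return lam - delta * x, x + (r + delta) * lam


def integrate(x0, lam0, T, r, delta, steps=2000):
    """Classical RK4 integration of (x, lam) from t = 0 to t = T."""
    h = T / steps
    x, lam = x0, lam0
    for _ in range(steps):
        k1x, k1l = rhs(x, lam, r, delta)
        k2x, k2l = rhs(x + 0.5 * h * k1x, lam + 0.5 * h * k1l, r, delta)
        k3x, k3l = rhs(x + 0.5 * h * k2x, lam + 0.5 * h * k2l, r, delta)
        k4x, k4l = rhs(x + h * k3x, lam + h * k3l, r, delta)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        lam += h * (k1l + 2 * k2l + 2 * k3l + k4l) / 6.0
    return x, lam


def shoot(x0, T, r, delta):
    """Find lam(0) such that lam(T) = 0, using linearity in lam(0)."""
    _, miss0 = integrate(x0, 0.0, T, r, delta)  # terminal costate for lam(0) = 0
    _, miss1 = integrate(x0, 1.0, T, r, delta)  # terminal costate for lam(0) = 1
    return -miss0 / (miss1 - miss0)


x0, T, r, delta = 1.0, 5.0, 0.04, 0.1  # illustrative values
lam0 = shoot(x0, T, r, delta)
xT, lamT = integrate(x0, lam0, T, r, delta)
```

For nonlinear canonical systems the affine shortcut no longer applies, and the same idea needs an iterative root finder on $\lambda(0)$ or a collocation solver.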
‚Complex‘ Solutions of Optimal Control Models
- multiple equilibria
- limit cycles
The American Cocaine Epidemic
Multiple Steady States and DNSS Points in an Optimal Control Model of a
Drug User Population
J.P. Caulkins, G. Tragler, G. Feichtinger
• Optimal mix of instruments to control an epidemic,
• Dynamic process, nonlinear feedbacks, social interactions
• Other diffusion processes: marketing, HIV/AIDS, social norms, etc.
• Interaction of instruments
• Heterogeneities: compartment models, distributed control
Drug Problems Are Important
• In the US
– $180 B per year in social cost
– ~20,000 premature deaths per year
• In Austria
– 181 drug-related deaths in 2005 (about 1/3 the per capita rate in the US)
• Globally
– 200 million drug users
– 25 million problem drug users
– Ebbing in US/Australia; growing elsewhere
Why Dynamic Optimization?
Drug problems are very dynamic
[Figure: U.S. cocaine epidemic. Demand (light-user-year equivalents) from heavy users and from light users, 1972–2012.]
“Heavy use”: consumption at least once a week
“Light use”: consumption at least once within the last year, but less than weekly
Why Dynamic Optimization?
Drug problems cause enormous social costs
Social costs due to U.S. cocaine consumption in 2000:
Total consumption in 2000: 225 metric tons
Social costs (Caulkins et al., 2002): 215 $ / gram
(225,000,000 grams) × (215 $ / gram) = 48,375,000,000 $ (approx. 50 billion $)
Why Dynamic Optimization?
A lot of money is spent on the control of drug problems
Drug control expenditures in 2000 (U.S.A., cocaine):
Prevention: 0.9 billion $
Treatment: 2.0 billion $
Law enforcement: 9.2 billion $
Total expenditures in 2000: 12.1 billion $
Social costs + budget in one year: approx. 60 billion $
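The arithmetic behind these two slides can be reproduced directly from the quoted figures. The variable names below are illustrative; the numbers are the ones given above.

```python
GRAMS_PER_METRIC_TON = 1_000_000

consumption_tons = 225        # U.S. cocaine consumption in 2000
social_cost_per_gram = 215    # $ per gram (Caulkins et al., 2002)

social_cost = consumption_tons * GRAMS_PER_METRIC_TON * social_cost_per_gram

# Drug control expenditures in 2000, in billion $
control_spending_billion = {
    "prevention": 0.9,
    "treatment": 2.0,
    "law enforcement": 9.2,
}
budget_billion = sum(control_spending_billion.values())

# Social costs plus control budget, in billion $
total_billion = social_cost / 1e9 + budget_billion
```

The unrounded total is about 60.5 billion $, which the slide rounds to 60 billion $; law enforcement accounts for roughly three quarters of the control budget.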
Our Goal
Our goal is to find the optimal mix of control instruments, such as prevention, treatment, or law enforcement, that minimizes the social costs / harm caused by drug consumption.
Our Approach
• We construct (sometimes simple, sometimes less simple) models of drug consumption and the involved feedback effects;
• we solve these models as dynamic optimization problems (by using methods of optimal control theory);
• we determine if and how the optimal mix of interventions changes with time; and
• we cautiously draw conclusions about the optimal policy.
Nonlinear Feedback #1a
• Initiation is an increasing function of the number of current users
– Friends introduce friends to drugs
– “Epidemic” models, not merely “diffusion” models
Nonlinear Feedback #1b
• Enforcement swamping
– Market participants respond to incentives, such as enforcement intensity (not level), so
– dN/dt = f(u/N)
– Akin to Schelling’s “tipping point” model of corruption
Modelling the Controls
Australian Heroin Drought
Moore, T.J., J.P. Caulkins, A. Ritter, P. Dietze, S. Monagle and J. Pruden, 2005, Heroin Markets in Australia: Current Understanding and Future Possibilities, DPMP Monograph Series, Fitzroy: Turning Point Alcohol and Drug Centre
Drug Prices and Their Effects
The main idea is that higher prices induce lower levels of use.
Accordingly, assume that demand may be described by the formula
$$c\,p^{-\omega}$$
with $c$ denoting a constant and $\omega$ being the (absolute value of the) price elasticity of demand. Then, …
Analogously: U.S. Data
Caulkins, J.P., 2001, „Drug Prices and Emergency Department Mentions for Cocaine and Heroin“, American Journal of Public Health 91(9), 1446-1448.
[Figure: price per pure gram and ER mentions per year, 1981–1995. Series: 0.1 × heroin price, cocaine price, heroin ER mentions / 200, cocaine ER mentions / 200.]
[Figure: number of cocaine-related ER mentions, 1978–1996, predicted by price vs. actual.]
Price changes can explain 97.5% of the variation in ER Mentions for cocaine
$$\text{cocaine ER mentions} = 5.85 \times 10^{7} \times \text{price}^{-1.30}$$
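A constant-elasticity fit of this form implies that a given percentage change in price always produces the same percentage change in ER mentions, whatever the starting price. A minimal sketch using the two fitted constants quoted above; the function names and the example prices are illustrative, and the price unit is assumed to be the one used in the original fit.

```python
SCALE = 5.85e7      # fitted constant from the slide
EXPONENT = -1.30    # fitted price exponent from the slide


def predicted_er_mentions(price):
    """Predicted annual cocaine ER mentions at a given price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return SCALE * price ** EXPONENT


def relative_change(price, pct_increase):
    """Relative change in predicted mentions after a pct_increase % price rise."""
    before = predicted_er_mentions(price)
    after = predicted_er_mentions(price * (1 + pct_increase / 100.0))
    return after / before - 1.0


change = relative_change(price=100.0, pct_increase=10.0)
```

With an exponent of −1.30, a 10% price increase lowers predicted mentions by somewhat more than 10%, which is the sense in which demand for this outcome is price-elastic.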
1-State Model with (up to) 3 Controls
$$\dot A(t) = \underbrace{k\,p(A(t),v(t))^{-a}\,A(t)^{\alpha}\,\Psi(w(t))}_{\text{initiation}} - \underbrace{c\,\beta(A(t),u(t))\,A(t)}_{\text{quitting due to treatment}} - \underbrace{\mu\,p(A(t),v(t))^{b}\,A(t)}_{\text{natural outflow}}$$
“Enforcement swamping”:
$$p(A(t),v(t)) = d + e\,\frac{v(t)}{A(t)}$$
“Cream skimming”:
$$\beta(A(t),u(t)) = \left(\frac{u(t)}{A(t)}\right)^{z}$$
Prevention:
$$\Psi(w) = h + (1-h)\,e^{-mw}$$
[Figure: reduction of initiation as a function of prevention spending, for spending between 0 and $1 \times 10^{10}$.]
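The three feedbacks of the one-state model can be sketched in a few lines. This is an illustration under assumptions: the state equation is taken to be $\dot A = k\,p^{-a}A^{\alpha}\Psi(w) - c\,\beta A - \mu\,p^{b}A$ with $p = d + e\,v/A$, $\beta = (u/A)^z$ and $\Psi(w) = h + (1-h)e^{-mw}$, and every number in `PARAMS` is an illustrative placeholder, not a calibrated value from the talk.

```python
import math

# Illustrative parameter values only; NOT taken from the slides.
PARAMS = dict(k=4272.0, a=0.25, alpha=0.3, c=0.04323, z=0.6,
              mu=0.181282, b=0.25, d=0.06792, e=0.01, h=0.84, m=2.37e-9)


def price(A, v, P=PARAMS):
    """Enforcement swamping: a fixed budget v raises price less when A is large."""
    return P["d"] + P["e"] * v / A


def beta(A, u, P=PARAMS):
    """Cream skimming: per-user treatment spending has diminishing returns (z < 1)."""
    return (u / A) ** P["z"]


def prevention_factor(w, P=PARAMS):
    """Share of initiation that remains under prevention spending w (floor h)."""
    return P["h"] + (1.0 - P["h"]) * math.exp(-P["m"] * w)


def dA_dt(A, u, v, w, P=PARAMS):
    """Initiation minus quitting due to treatment minus natural outflow."""
    p = price(A, v, P)
    initiation = P["k"] * p ** (-P["a"]) * A ** P["alpha"] * prevention_factor(w, P)
    treated = P["c"] * beta(A, u, P) * A
    natural = P["mu"] * p ** P["b"] * A
    return initiation - treated - natural


def simulate(A0, u, v, w, years=5.0, dt=0.01):
    """Explicit Euler path of A under constant spending (short horizons only)."""
    A = A0
    for _ in range(int(round(years / dt))):
        A += dt * dA_dt(A, u, v, w)
    return A


A_free = simulate(5e6, 0.0, 0.0, 0.0)
A_ctrl = simulate(5e6, 1e8, 1e8, 1e8)
```

Constant spending is used only to keep the sketch short; in the optimal control problem u, v and w vary with A. The Euler step is adequate for the short horizon shown but would need care if A became very small, because $\beta$ grows without bound as $A \to 0$.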
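Optimal Control Model
$$\min_{u(t),\,v(t),\,w(t) \ge 0} \int_0^\infty \underbrace{e^{-rt}}_{\text{discounting}} \Bigl[\underbrace{\kappa\,p(A(t),v(t))^{-\omega}A(t)}_{\text{social costs due to consumption}} + \underbrace{u(t)}_{\text{treatment spending}} + \underbrace{v(t)}_{\text{law enforcement spending}} + \underbrace{w(t)}_{\text{prevention spending}}\Bigr]\,dt$$
s.t.
$$\dot A(t) = \underbrace{k\,p(A(t),v(t))^{-a}\,A(t)^{\alpha}\,\Psi(w(t))}_{\text{initiation}} - \underbrace{c\,\beta(A(t),u(t))\,A(t)}_{\text{quitting due to treatment}} - \underbrace{\mu\,p(A(t),v(t))^{b}\,A(t)}_{\text{natural outflow}}$$
together with a budget constraint relating u(t) + v(t) + w(t) to G and A(t)
κ … damage in $ per gram of consumed cocaine (60-500 $/g)
dynamic cost-benefit analysis: optimal mix of instruments
u … treatment
v … enforcement
w … prevention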
Solving the Optimal Control Models
• Pontryagin‘s Maximum Principle
• Computers / Numerical Tools / Software
http://orcos.tuwien.ac.at/research/ocmat_software/
D. Grass, J.P. Caulkins, G. Feichtinger, G. Tragler and D.A. Behrens, 2008, Optimal Control of Nonlinear Processes – With Applications in Drugs, Corruption, and Terror (Springer, Heidelberg), ISBN: 978-3-540-77646-8.
http://www.springer.com/economics/game+theory/book/978-3-540-77646-8
http://www.eos.tuwien.ac.at/OR/OCMat/
Optimal mix of u and v
[Slide formulas: assumptions (“Ass.”) relating u, v, w, the budget G and the number of users A; an amount of 16,000 $ p.a. is quoted.]
Intertemporal trade-off between damage of addicts (social costs) and costs of fighting them … saddle-point ‚stability‘.
Skiba points separating multiple equilibria
Eradication of ‘crime’
Dechert & Nishimura (83), Sethi (77), Skiba (78)
Spend much if the epidemic is still small! But …
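[Figure: optimal enforcement $v$ against the number of users $A$, showing the DNSS point $A_{DNSS}$, a boundary equilibrium and an interior equilibrium.]
Bifurcation Diagram: Multiple Steady States
[Figure: bifurcation diagram of the steady states $A$ against the parameter $c$.]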
There exists a DNSS (“tipping”) point. If the initial number of users is
- below that point: it is optimal to drive use down to the low-use steady state;
- above that point: an “accommodation” strategy is optimal.
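Phase portrait with two optimal steady state solutions
[Figure: phase portrait with the DNSS point $A_{DNSS}$.]
Optimal Policy: Feedback Rule in Phase Portrait
[Figure: feedback rule in the phase portrait, with the DNSS point $A_{DNSS}$.]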
Skiba points
DNSS sets (Dechert, Nishimura, Sethi, Skiba)
Buridan‘s donkey
Multiple equilibria, history (path)-dependence
2 attractors (here: saddle points), $x_L$ and $x_H$
Unstable long-run equilibrium in between
Indifference point $x_0$: more than one optimal solution emanates from $x_0$
Threshold point $x_0$: every neighborhood of $x_0$ contains starting points of solutions that converge to distinct limit sets (weak DNSS point)
Skiba (DNSS) point: indifference and threshold point
If x0 is a threshold but not an indifference point: weak Skiba point
DNSS points separate different basins of attraction and create an option about how to proceed
Strong DNSS point
[Figure: $x_I$, $x^{strong}_{DNSS}$, $x_{II}$; unstable focus.]
Weak DNSS point
[Figure: $x_I$, $x^{weak}_{DNSS}$, $x_{II}$; unstable node.]
Unstable focus or node
Concave models: unstable nodes, continuous controls
Non-concave models: foci or nodes, jump in the control
Grass et al. (2008) Chap. 5: Multiple equilibria, points of indifference and thresholds; Skiba points and curves
Multiple Equilibria
[DNS(S) / Skiba] Thresholds
Tipping Points
Optimal Spending for Prevention (w), Treatment (u), and Law Enforcement (v) as Functions of the Number of Cocaine Users (A)
“Eradicate or Accommodate”
Skiba/DNS(S) Threshold
Multiple Equilibria, Thresholds, and Bifurcations
The LHY – Model
Memory, Contagion, and Capture Rates: Characterizing the Types of Addictive Behavior that Are Prone to Repeated Epidemics
D. Behrens, J.P. Caulkins, G. Tragler, G. Feichtinger
The U.S. cocaine epidemic
[Figure: historical data vs. model.]
Uncontrolled Dynamics and Historical Data
[Figure: phase plane of light users L (0 to 8×10^6) and heavy users H (0 to 2×10^6) with the curves $\dot H = 0$ and $\dot L = 0$; historical trajectory (1970–1996), modeled trajectories, and the modeled 1970 trajectory.]
Nonlinear Feedback #2
• Initiation may be increasing in the number of contented users, but not all users are happy or are good “advertisements” for the merits of drug use
– Distinguish L = light and H = heavy users
– The Musto effect posits negative feedback from H and/or from H/L
Early Intervention is Valuable
[Figure: phase plane of light users L (0 to 6×10^6) and heavy users H (0 to 2×10^6), comparing the uncontrolled modeled epidemic with the controlled one (optimally allocated budget); marked years 1967, 1975, 1979, 1996.]
Dramatic Result
Treat and Prevent, But Not at the Same Time
[Figure: control spending in billion dollars (0 to 1.2) for prevention and treatment, 1967–1997.]
Nonlinear Feedback #3
• “Musto Effect”
– Adverse consequences of addiction become known 10 or so years after initiation
• “Adverse Reactions”
– Adverse reactions can happen much sooner, particularly with novice users who initiated recently
– Experience of adverse reactions may “disappear” from the social environment relatively quickly
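$$\dot L = I(L,Y) - (a+b)L, \qquad L(0) = L_0,$$
$$\dot H = bL - gH, \qquad H(0) = H_0,$$
$$\dot Y = H - \delta Y, \qquad Y(0) = Y_0,$$
$$I(L,Y) = \tau + sL\exp\!\left(-q\,\frac{Y}{L}\right)$$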
L = Number of light users,
H = Number of heavy users, and
Y = Decaying heavy user years.
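The uncontrolled LHY dynamics can be simulated with a simple Euler scheme. This is a sketch under assumptions: the equations are taken to be $\dot L = I(L,Y) - (a+b)L$, $\dot H = bL - gH$, $\dot Y = H - \delta Y$ with $I(L,Y) = \tau + sL\exp(-qY/L)$. The values of a, b, g, q and the initial state are those quoted elsewhere in the talk; s, τ and δ are assumed illustrative values.

```python
import math

# a, b, g, q: values quoted in the talk for the stability diagram.
A_QUIT, B_ESC, G_QUIT, Q = 0.163, 0.024, 0.062, 3.443
# s, tau, delta: assumed illustrative values, not given in this section.
S, TAU, DELTA = 0.61, 50_000.0, 0.291


def initiation(L, Y):
    """Initiation: positive feedback from L, damped by the memory ratio Y/L."""
    return TAU + S * L * math.exp(-Q * Y / L)


def step(L, H, Y, dt):
    """One explicit Euler step of the uncontrolled LHY system."""
    dL = initiation(L, Y) - (A_QUIT + B_ESC) * L
    dH = B_ESC * L - G_QUIT * H
    dY = H - DELTA * Y
    return L + dt * dL, H + dt * dH, Y + dt * dY


def simulate(L0=1_400_000.0, H0=130_000.0, Y0=110_000.0, years=40, dt=0.01):
    """Return a list of (time, L, H, Y) tuples starting from the base-case state."""
    path = [(0.0, L0, H0, Y0)]
    L, H, Y = L0, H0, Y0
    for i in range(1, int(round(years / dt)) + 1):
        L, H, Y = step(L, H, Y, dt)
        path.append((i * dt, L, H, Y))
    return path


path = simulate()
peak_L_time = max(path, key=lambda row: row[1])[0]
peak_H_time = max(path, key=lambda row: row[2])[0]
```

Under these assumed values light use peaks before heavy use, because heavy users accumulate slowly through escalation and their growing memory stock Y is what eventually dampens initiation.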
General Two-Stage Escalation Model
Stage #1 (L): tax evasion; don't buckle up; light drug use; risky behavior
Stage #2 (H): detection; road casualties; heavy drug use; HIV/HCV infection
[Figure: flow chart of the LHY model. Initiation (I) flows into light users (L); light users desist at rate a or escalate at rate b to heavy users (H); heavy users desist at rate g and feed the memory of heavy users' years (Y), which is forgotten at rate δ. Occasional drug use feeds back positively on initiation; drug abuse feeds back negatively.]
Flow chart of the LHY model
[Figure: modeled L, H, and Y (0 to 8M), 1970–2010: modeled epidemic and modeled memory of heavy user years.]
Plot of modeled numbers of users, including the memory of abuse and the “sequence of peaks”, for the base case parameter set. L0 = 1,400,000, H0 = 130,000, Y0 = 110,000.
Table 1: Examples for the LHY-model’s possible fields of application
Problem type | Stage 1 (risky) | Stage 2 (harmful) | Control interventions
Taxes | tax evasion | detected | tax audits
Seat belts | don't buckle up | road casualties | information campaigns, controls
Illicit drugs | light use | addictive (heavy) use | prevention, treatment, enforcement
Violence | ‘boys are boys’ | substantial violence | education, prevention
Sexual harassment | dirty jokes | sex-violence | prevention
HIV / AIDS | unsafe sex | HIV infected | condoms, medication
Marketing | satisfied customer | unsatisfied customer | product quality, service, price
Why present-oriented societies undergo cycles of drug epidemics
'Those who forget the past are condemned to repeat it.'
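$$J = \int_0^\infty e^{-rt}\bigl[\kappa(\mu_L L + \mu_H H) + u + w\bigr]\,dt$$
s.t.
$$\dot L = I(L,Y)\,\Psi(w) - (a+b)L, \qquad L(0) = L_0,$$
$$\dot H = bL - \bigl(g + \beta(H,u)\bigr)H, \qquad H(0) = H_0,$$
$$\dot Y = H - \delta Y, \qquad Y(0) = Y_0,$$
$$u \ge 0, \quad w \ge 0$$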
Efficiency functions
prevention: $\Psi(w) = h + (1-h)\,e^{-mw}$
treatment: $\beta(H,u) = c\left(\frac{u}{H}\right)^{d}$
Stability regions for any combination of s and δ values for a = 0.163, b = 0.024, g = 0.062, and q = 3.443.
[Figure: regions of cycles, damped oscillation, and monotonic stability in the plane spanned by δ and s.]
Moving To Opportunity
Keeping “the rich” while placing “the poor”
Trade-off between positive effects (assimilation, social capillarity) & negative effects (“flight”)
Schelling (1971), Caulkins et al. (2005)
Side step 1:
A Production / Inventory Model
“Intensity splitting”
$$\min \int_0^\infty e^{-rt}\left[\frac{h}{2}x^2 + \varphi(v) + \frac{c}{2}u^2\right]dt$$
$$\dot x = v - d, \qquad \dot v = u$$
x … stock of inventory
v … production rate
d … constant demand
u … rate of production change
φ(v) … concave-convex production cost
How can constant demand lead to periodic production?
Side step 2:
Production / inventory model: a triple indifference point (Steindl, 04)
Deissenberg et al. (2004), Wagener (2003, 2006)
Wagener‘s conjecture: in an n-state optimal control model there are generically no indifference points of multiplicity higher than n
Countering Terrorism
Modeling a “Stock” of Terrorists is not Common but Has Precedent
• Keohane and Zeckhauser, 2003: “The ecology of terror defense”
• Castillo-Chavez & Song, 2003: “Models for the transmission dynamics of fanatical behaviors”
• Greiner, 2004: “How much to spend on the fight against terrorism? An optimal control approach”
• Faria, 2004: “Terror cycles”
• Kaplan et al., 2005: “What happened to suicide bombings in Israel? Insights from a terror stock model”
• Udwadia et al., 2006: “A dynamical model of terrorism”
Lessons from Drug Modeling that Might Apply to Modeling Counter-Terror
• Think of there being a population or “stock” of terrorists
• Focus on “flows” in and out of that stock
– Levels of associated “reputation” or “perception” stocks can affect those flows
– Pay attention to the possibility that some interventions that increase an outflow might simultaneously increase inflow
Examples of Feedback from Perceptions/Reputation to Inflow
• Mountain climbing is safer, so now there are more deaths from mountain climbing (MacCoun)
• Prevalence of HIV affects sexual behavior and hence inflow into stock of …
– Susceptibles: Blower, Doyle, & Lewis (2000), Greenhalgh et al. (2001)
– and/or Infecteds: Zeiler (2007)
• Drug epidemics end when enough people have progressed to states of use that manifest the drugs’ potential ill effects (Musto, 1987; Kleiman, 1992; Behrens et al. 1999, 2000)
Data Limitations
• Drug data are poor
• Terror stock data are worse
– So models are stylized and not validated
– Value stems from articulating ideas in a language as precise as mathematics
• Not from hypothesis testing or
• Providing specific, quantitative guidance
Counter-Terror Measures in a Multi-Stage Scenario
„Fire strategies“: territorial bombing, aggressively searching all people, activities involving significant collateral damage
inconvenience to third parties, resentment by population, stimulation of recruitment rates, elimination of current terrorists
„Water strategies“: intelligence driven arrests or „surgical“ operations against almost certainly guilty individuals
no harm to innocent parties, higher acceptance by population, expensive, difficult to apply
A terrorist attack occurs at t = 0
Stage 1: modest counter measures, „water strategy“
Stage 2: additional, more aggressive measures, „fire strategy“
(side effect: increased inflow of recruits to terror organisation)
x(t) … number of terrorists at time t
u(t) … „water strategy“ at time t
v(t) … „fire strategy“ at time t
Stage 1:
s.t.
Stage 2:
s.t.
where
Conclusions
Flavor of Pontryagin’s Maximum Principle
– Qualitative insights into structural properties of optimal solutions
– Bifurcation analysis of NLDS
Epidemiological aspects
– Feedbacks: prevalence-dependent initiation & transition rates
– Social interactions
Heterogeneities:
– Compartment models
– PDE vs. OLG
Dynamic games:
– Strategic interactions in intertemporal competitive situations
J. Lesourne’s ocean statement