Optimal Control of “Deviant” Behavior · Deviant behavior: violence, terrorism, corruption, extortion, illicit (& legal) drug consumption, HIV / AIDS Epidemiological feedbacks:

Optimal Control of “Deviant” Behavior

Gustav Feichtinger

Institute for Mathematical Methods in Economics
Vienna University of Technology

[email protected]
http://www.eos.tuwien.ac.at/OR/

Roadmap

• Prologue
• Optimal Control Applications in Economics and OR
• Pontryagin’s Maximum Principle
• Complex Behavior of Optimal Solutions
• The American Cocaine Epidemic
• The LHY – Model
• Countering Terrorism
• Conclusions

Prologue

The economics of ‘deviant’ behavior

Four cartoons:
• Sherlock Holmes vs. Professor Moriarty
• The corrupt politician
• Addiction and binge drinking
• Escalation from ‘light’ to ‘heavy’

The economics of crime: Gary Becker (1968)

How much crime should be tolerated?

Deviant behavior: violence, terrorism, corruption, extortion, illicit (& legal) drug consumption, HIV / AIDS

Epidemiological feedbacks: macro micro
• Easterlin effect
• Double-edged sword effect
• Stock-dependent recruitment

T. Schelling: social interactions, peer effects

How to start smoking: asymptotic influence of smokers and non smokers

Nonlinearities

Heterogeneities


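Schelling diagram

[Figure: utilities $u_c(C)$ and $u_n(C)$ plotted against the degree of corruption $C \in [0, 1]$; the tipping point $C_{\text{tipping}}$ is marked on the $C$ axis, and stable and unstable equilibria are labelled.]

C … degree of corruption
c … corrupt, n … non-corrupt
$u_c(C)$ … utility of being corrupt
$u_n(C)$ … utility of being non-corrupt

Binary decision!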
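A minimal sketch of the tipping logic behind a Schelling diagram, under stated assumptions: the linear utility gap, the parameters `a` and `b`, and the adjustment rule are hypothetical choices for illustration and are not taken from the slides. The share of corrupt agents C grows when being corrupt pays more than being non-corrupt, so C = 0 and C = 1 are stable and the interior crossing is unstable.

```python
def utility_gap(C, a=1.0, b=0.5):
    """Hypothetical gap u_c(C) - u_n(C); positive means corruption pays."""
    u_c = a * C   # assumption: corruption pays more the more others are corrupt
    u_n = b       # assumption: constant utility of staying non-corrupt
    return u_c - u_n


def long_run_corruption(C0, horizon=200.0, dt=0.05):
    """Euler simulation of dC/dt = C (1 - C) (u_c(C) - u_n(C))."""
    C = C0
    for _ in range(int(round(horizon / dt))):
        C += dt * C * (1.0 - C) * utility_gap(C)
        C = min(1.0, max(0.0, C))  # keep the share inside [0, 1]
    return C


if __name__ == "__main__":
    # With a = 1 and b = 0.5 the assumed tipping point is C = b / a = 0.5.
    for start in (0.4, 0.6):
        print(start, "->", round(long_run_corruption(start), 3))
```

Starting just below the assumed tipping point drives corruption toward zero, starting just above it drives corruption toward one, which is the binary-decision pattern the diagram illustrates.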

Gary S. Becker

University Professor, Department of Economics and Sociology
Professor, Graduate School of Business
The University of Chicago

Nobel Laureate: Nobel Prize 1992 in Economic Sciences "for having extended the domain of microeconomic analysis to a wide range of human behavior and interaction, including nonmarket behavior"

The Journal of Political Economy, Vol. 76, No. 2 (Mar. - Apr., 1968), pp. 169-217

Optimization

NO Dynamics

Dynamics

NO Optimization

„In general this is a difficult problem […]“

Jonathan P. Caulkins

Stever Professor of Operations Research and Public Policy

Carnegie Mellon University

INFORMS President's Award, INFORMS 2010 Annual Meeting

The Corrupt Politician

Cyclical behavior can be rational from the politician’s point of view

Two sides of a coin

Up to now the simplest type of model leading to an optimal stable limit cycle in economics

Result: Under certain parameter constellations a stable Hopf cycle is optimal

Feichtinger & Wirl (92)

[Figure: feedback diagram linking c, k and p with signed (+/−) influences.]

c(t) … actual (current) corruption (bribery)

p(t) … ‘popularity’ (reputation) of the politician

k(t) … accumulated corruption (observed level of corruption)

The politician benefits both from corruption and popularity:

$$\max_{c(\cdot)} \int_0^\infty e^{-rt}\,\big[U(c) + V(p)\big]\,dt$$

[Figures: utility functions $U(c)$ plotted against $c$ and $V(p)$ plotted against $p$.]

2 states … k, p

1 control … c

Objective: totally separable, strictly concave

State dynamics:

$$\dot k = c - \delta k, \qquad \dot p = g(p) - f(k)$$

[Figures: $g(p)$ plotted against $p$; $f(k)$ plotted against $k$.]

[Figure: stability of the steady state depending on $\hat p$: regions of instability, limit cycles, and saddle-point stability.]

$\hat p$ … steady-state popularity of this equilibrium

High popularity sufficient for stable corruption.

For weak popularity: cyclical behavior may be optimal

[Figure: cycle involving accumulated corruption k and popularity p.]

Popularity p is the leading variable, bad reputation (- p) lags behind k (and c)

Evans (1924), Ramsey (1928), Hotelling (1931)

Kurz (1965, 1968), Cass (1965, 1966), Shell (1967, 1969), Arrow & Kurz (1970)

Intriligator (1971), Stöppler (1974)

von Weizsäcker (1967), Näslund (1966, 1969), Thompson (1968), Bensoussan et al. (1974), Stepan (1977), Ludwig (1978), van Loon (1985)

Sethi & Thompson (1981), Kamien & Schwartz (1981), Seierstad & Sydsaeter (1986), Feichtinger & Hartl (1986), Leonard & Long (1992), Dockner et al. (2000), Grass et al. (2008)

Optimal Control Applications in economics and OR

• Growth theory, capital accumulation, human capital

• Extraction of non-renewable resources, management of renewable resources

• Control of environmental pollution

• Dynamics of the firm: Production, inventory, finance, marketing, maintenance and replacement, planning, research and development, queuing systems


• Health planning: epidemiology of infectious diseases, illicit drugs

• Control of diseases, e.g. cancer treatment, pest control

• Control of violence and terrorism

• ‘Planning the unusual’


Pontryagin’s Maximum Principle

OCM:

$$\max_{u(\cdot)} \int_0^\infty e^{-rt}\, g(x,u)\,dt \quad \text{s.t.} \quad \dot x = f(x,u), \quad x(0) = x_0$$

Discounted autonomous optimal control model with $T = \infty$

$x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$

Pontryagin, Boltyanskii, Gamkrelidze, Mishchenko (1962, 1959), Pesch & Bulirsch (1994), Carathéodory (1926, 1935)

H= λ0 g(x,u) + λf(x,u)

current-value Hamiltonian

V(x) ... optimal value at state x

λ ... c.v. adjoint variable (shadow price), $\lambda \in \mathbb{R}^n$

$\lambda = V_x$

Hamiltonian as generalized utility function:

H = g + λf

[H] = [g] = value / time
[$\dot x$] = state / time
[λ] = [$V_x$] = value / state

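Necessary and sufficient optimality conditions

$$u = u(x,\lambda) = \arg\max_u H(x,u,\lambda), \qquad \dot x = H_\lambda, \qquad \dot\lambda = r\lambda - H_x$$

Canonical system in $(x,\lambda)$: qualitative analysis, economic interpretation

Jacobian of the canonical system:

$$J = \begin{pmatrix} H_{\lambda x} & H_{\lambda\lambda} \\ -H_{xx} & r - H_{x\lambda} \end{pmatrix}$$

$\operatorname{tr} J = r > 0$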

n = 1: saddle points and unstable steady states (foci and nodes), i.e. stability and limit cycles are ruled out
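The statement for n = 1 follows from the trace condition: the two eigenvalues of the 2×2 Jacobian are $r/2 \pm \sqrt{r^2/4 - \det J}$, so they sum to r > 0 and cannot both have negative real part. A minimal sketch of the resulting classification, where the function name and the sample numbers are illustrative assumptions:

```python
import cmath


def classify_steady_state(det_J, r):
    """Classify a steady state of the 2-D canonical system, using tr J = r > 0."""
    if r <= 0:
        raise ValueError("the discount rate r must be positive")
    disc = r * r / 4.0 - det_J
    root = cmath.sqrt(disc)
    eigenvalues = (r / 2.0 + root, r / 2.0 - root)
    if det_J < 0:
        kind = "saddle point"      # one negative and one positive real eigenvalue
    elif det_J == 0:
        kind = "degenerate"        # one eigenvalue is exactly zero
    elif disc < 0:
        kind = "unstable focus"    # complex pair with real part r / 2 > 0
    else:
        kind = "unstable node"     # two positive real eigenvalues
    return kind, eigenvalues


if __name__ == "__main__":
    for det_J in (-1.0, 0.0001, 1.0):
        print(det_J, classify_steady_state(det_J, r=0.04)[0])
```

No branch yields a stable steady state, and a Hopf bifurcation would need a purely imaginary pair, which is impossible when the real parts sum to r > 0.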


Faced with x(t) at $t \in [0,T)$ the control u(t) has

– a direct effect: instantaneous profit g(x,u)
– an indirect effect: change of the state

‘after me the deluge’ vs. Stalinist policy

Limiting transversality condition

Michel (1982)

$\dot x = dx/dt$, $\max_u H(x,u,\lambda)$ with $H = g + \lambda f$

$$\lim_{t\to\infty} e^{-rt}\,\lambda(t) = 0$$

Sufficient optimality condition: concavity of $H$ w.r.t. $(x,u)$, or of $\max_u H$ w.r.t. $x$ (Mangasarian)

Finite planning horizon T

$$\max_{u(\cdot)} \int_0^T e^{-rt}\, g(x,u)\,dt + e^{-rT} S(x(T))$$

$S(x(T))$ … salvage value at T

TPBV problem: canonical system with
– n initial conditions $x(0) = x_0$
– n terminal conditions $\lambda(T) = S_x(x(T))$

‚Complex‘ Solutions of Optimal Control Models

- multiple equilibria

- limit cycles

The American Cocaine Epidemic

Multiple Steady States and DNSS Points in an Optimal Control Model of a Drug User Population

J.P. Caulkins, G. Tragler, G. Feichtinger

• Optimal mix of instruments to control an epidemic,

• Dynamic process, nonlinear feedbacks, social interactions

• Other diffusion processes: marketing, HIV/AIDS, social norms, etc.

• Interaction of instruments

• Heterogeneities: compartment models, distributed control

Drug Problems Are Important

• In the US
– $180 B per year in social cost
– ~20,000 premature deaths per year
• In Austria
– 181 drug-related deaths per year in 2005 (about 1/3 the per capita rate in the US)
• Globally
– 200 million drug users
– 25 million problem drug users
– Ebbing in US/Australia; growing elsewhere

Why Dynamic Optimization?

Drug problems are very dynamic

[Figure: U.S. cocaine epidemic, 1972-2012. Demand (light user-year equivalents) from heavy users and from light users, by year.]

“Heavy use”: consumption at least once a week

“Light use”: consumption at least once within the last year, but less than weekly

Why Dynamic Optimization?

Drug problems cause enormous social costs

Social costs due to U.S. cocaine consumption in 2000:

Total consumption in 2000: 225 metric tons
Social costs (Caulkins et al., 2002): 215 $ / gram

(225,000,000 grams) × (215 $ / gram) = 48,375,000,000 $ (approx. 50 billion $)

Why Dynamic Optimization?

A lot of money is spent for the control of drug problems

Drug control expenditures in 2000 (U.S.A., cocaine):

Prevention: 0.9 billion $
Treatment: 2.0 billion $
Law enforcement: 9.2 billion $
--------------------------------------------------
Expenditures in 2000: 12.1 billion $

Social costs + budget in one year: 60 billion $
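The arithmetic behind these figures can be checked with a short script. The numbers are those quoted on the slides; the variable names are illustrative only.

```python
# Figures quoted on the slides for U.S. cocaine in the year 2000.
consumption_grams = 225 * 1_000_000   # 225 metric tons expressed in grams
social_cost_per_gram = 215            # $ per gram (Caulkins et al., 2002)

social_cost = consumption_grams * social_cost_per_gram

control_spending = {                  # $ per year
    "prevention": 0.9e9,
    "treatment": 2.0e9,
    "law enforcement": 9.2e9,
}
budget = sum(control_spending.values())
total = social_cost + budget

if __name__ == "__main__":
    print(f"social cost: {social_cost / 1e9:.3f} billion $")
    print(f"budget:      {budget / 1e9:.1f} billion $")
    print(f"total:       {total / 1e9:.1f} billion $")
```

The total of roughly 60.5 billion $ is what the slide rounds to 60 billion $.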

Our Goal

Our goal is to find the optimal mix of control instruments such as prevention, treatment, or law enforcement, which minimizes the social costs / harm caused by drug consumption.

Our Approach

• We construct (sometimes simple, sometimes less simple) models of drug consumption and the involved feedback effects;

• we solve these models as dynamic optimization problems (by using methods of optimal control theory);

• we determine if and how the optimal mix of interventions changes with time; and

• we cautiously draw conclusions about the optimal policy.

Nonlinear Feedback #1a

• Initiation is an increasing function of the number of current users
– Friends introduce friends to drugs
– “Epidemic” models not merely “diffusion” models

Nonlinear Feedback #1b

• Enforcement swamping
– Market participants respond to incentives, such as enforcement intensity (not level), so
– dN/dt = f(u/N)
– Akin to Schelling’s “tipping point” model of corruption

Modelling the Controls

Australian Heroin Drought

Moore, T.J., J.P. Caulkins, A. Ritter, P. Dietze, S. Monagle and J. Pruden, 2005, Heroin Markets in Australia: Current Understanding and Future Possibilities, DPMP Monograph Series, Fitzroy: Turning Point Alcohol and Drug Centre

Drug Prices and Their Effects

Main idea is that higher prices induce lower levels of use.

In this sense, assume that demand may be described by the formula

$$c\,p^{-\eta}$$

with c denoting a constant and $\eta$ being the (absolute) price elasticity of demand. Then, …

Analogously: U.S. Data

Caulkins, J.P., 2001, „Drug Prices and Emergency Department Mentions for Cocaine and Heroin“, American Journal of Public Health 91(9), 1446-1448.

[Figure: price per pure gram and ER mentions per year, 1981-1995. Series: 0.1 × heroin price, cocaine price, heroin ER mentions/200, cocaine ER mentions/200.]

[Figure: number of cocaine-related ER mentions, 1978-1996, actual vs. predicted by price.]

Price changes can explain 97.5% of the variation in ER mentions for cocaine:

$$\text{cocaine ER mentions} = 5.85 \times 10^{7} \times \text{price}^{-1.30}$$
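A minimal sketch of the fitted constant-elasticity relationship, using the coefficient 5.85 × 10^7 and the exponent −1.30 quoted above. The function name and the sample prices are assumptions for illustration; the price is taken to be in the units of the underlying data series (price per pure gram).

```python
def predicted_er_mentions(price, scale=5.85e7, elasticity=-1.30):
    """Constant-elasticity fit: ER mentions = scale * price ** elasticity."""
    if price <= 0:
        raise ValueError("price must be positive")
    return scale * price ** elasticity


if __name__ == "__main__":
    for price in (100.0, 200.0, 400.0):   # illustrative prices only
        print(price, round(predicted_er_mentions(price)))
```

With an elasticity of −1.30, doubling the price multiplies predicted mentions by 2^(−1.30), i.e. cuts them by roughly 59%.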

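1-State Model with (up to) 3 Controls

$$\dot A(t) = \underbrace{k\,p(A(t),v(t))^{-a}\,A(t)^{\alpha}\,\psi(w(t))}_{\text{initiation}} - \underbrace{c\,\beta(A(t),u(t))\,A(t)}_{\text{quitting due to treatment}} - \underbrace{\mu\,p(A(t),v(t))^{b}\,A(t)}_{\text{natural outflow}}$$

„Enforcement swamping“:

$$p(A(t),v(t)) = d + e\,\frac{v(t)}{A(t)}$$

„Cream skimming“:

$$\beta(A(t),u(t)) = \left(\frac{u(t)}{A(t)}\right)^{z}$$

Prevention:

$$\psi(w) = h + (1-h)\,e^{-mw}$$

[Figure: reduction of initiation $\psi(w)$ as a function of prevention spending $w$, for $w$ between 0 and $1 \times 10^{10}$.]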
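A minimal sketch of how the three control channels of the one-state model can be coded, assuming the functional forms p(A, v) = d + e·v/A (enforcement swamping), β(A, u) = (u/A)^z (cream skimming) and ψ(w) = h + (1 − h)·e^(−mw) (prevention). All numeric defaults are illustrative placeholders, not calibrated values, and the function names are hypothetical.

```python
import math

# All numeric defaults are illustrative placeholders, not calibrated values.


def price(A, v, d=0.06, e=0.02):
    """Enforcement swamping: the price depends on enforcement per user, v / A."""
    return d + e * v / A


def treatment_effect(A, u, z=0.6):
    """Cream skimming: diminishing returns to treatment spending per user (0 < z < 1)."""
    return (u / A) ** z


def prevention_effect(w, h=0.84, m=2.37e-9):
    """Fraction of initiation remaining after prevention spending w; tends to h."""
    return h + (1.0 - h) * math.exp(-m * w)


def users_growth(A, u, v, w, k=4000.0, alpha=0.3, a=0.25, b=0.25, c=0.04, mu=0.2):
    """Right-hand side of the state equation: initiation minus both outflows."""
    p = price(A, v)
    initiation = k * p ** (-a) * A ** alpha * prevention_effect(w)
    quitting_due_to_treatment = c * treatment_effect(A, u) * A
    natural_outflow = mu * p ** b * A
    return initiation - quitting_due_to_treatment - natural_outflow


if __name__ == "__main__":
    A = 5.0e6
    print(users_growth(A, u=0.0, v=1.0e9, w=0.0))
    print(users_growth(A, u=1.0e9, v=1.0e9, w=0.0))
```

Because enforcement enters only through v/A, a fixed enforcement budget has less effect on price as the number of users grows, which is the swamping effect named on the slide.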

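Optimal Control Model

$$\min_{u(t),\,v(t),\,w(t)\,\ge\,0} \int_0^\infty e^{-rt}\Big(\kappa\,\theta A(t)\,p(A(t),v(t))^{-\omega} + u(t) + v(t) + w(t)\Big)\,dt$$

s.t.

$$u(t) + v(t) + w(t) = G\,A(t)$$

$e^{-rt}$ … discounting
$\theta A\,p^{-\omega}$ … consumption
$\kappa\,\theta A\,p^{-\omega}$ … social costs due to consumption
$\kappa$ … damage in $ per gram of consumed cocaine (60-500 $/g)
u … treatment spending
v … law enforcement spending
w … prevention spending

Dynamic cost-benefit analysis: optimal mix of instruments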

Solving the Optimal Control Models

• Pontryagin‘s Maximum Principle

• Computers / Numerical Tools / Software

http://orcos.tuwien.ac.at/research/ocmat_software/

D. Grass, J.P. Caulkins, G. Feichtinger, G. Tragler and D.A. Behrens, 2008, Optimal Control of Nonlinear Processes – With Applications in Drugs, Corruption, and Terror (Springer, Heidelberg), ISBN: 978-3-540-77646-8.

http://www.springer.com/economics/game+theory/book/978-3-540-77646-8

http://www.eos.tuwien.ac.at/OR/OCMat/

Optimal mix of u and v

Ass.: $w = 0$, $u + v = G\,A$, $G$ = 16,000 $ p.a.

Intertemporal trade-off between damage of addicts (social costs) and costs of fighting them … saddle-point ‚stability‘.

Skiba points separating multiple equilibria

Eradication of ‚crime‘

Dechert & Nishimura (83), Sethi (77), Skiba (78)

Spend much if the epidemic is still small! But …

[Figure: enforcement v as a function of the number of users A; the DNSS point $A_{DNSS}$ separates the boundary equilibrium from the interior equilibrium.]

Bifurcation Diagram: Multiple Steady States

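[Figure: bifurcation diagram of the steady states A against the parameter c.]

There exists a DNSS („tipping“) point; if the initial number of users is …

- below that point: it is optimal to drive use down to the low-use steady state;
- above that point: „accommodation“ strategy.

Phase portrait with two optimal steady state solutions

[Figure: phase portrait with the DNSS point $A_{DNSS}$.]

Optimal Policy: Feedback Rule in Phase Portrait

[Figure: optimal feedback rule in the phase portrait, with the DNSS point $A_{DNSS}$.]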

Skiba points

DNSS sets (Dechert, Nishimura, Sethi, Skiba)

Buridan‘s donkey

Multiple equilibria, history (path)-dependence

2 attractors (here: saddle points), $x_L$ and $x_H$

Unstable long-run equilibrium in between

Indifference point $x_0$: more than one optimal solution emanating from $x_0$

Threshold point $x_0$: in every neighborhood of $x_0$ there are solutions starting there and converging to distinct limit sets (weak DNSS point)

Skiba (DNSS) point: indifference and threshold point

If $x_0$ is a threshold but not an indifference point: weak Skiba point

DNSS points separate different basins of attraction and create an option about how to proceed

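Strong DNSS point

[Figure: equilibria $x_I$ and $x_{II}$ with the strong DNSS point $x_{DNSS}^{\text{strong}}$ between them; unstable focus.]

Weak DNSS point

[Figure: equilibria $x_I$ and $x_{II}$ with the weak DNSS point $x_{DNSS}^{\text{weak}}$; unstable node.]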

Unstable focus or node

Concave models: unstable nodes, continuous controls

Non-concave models: foci or nodes, jump in the control

Grass et al. (2008) Chap. 5: Multiple equilibria, points of indifference and thresholds; Skiba points and curves

Multiple Equilibria, [DNS(S) / Skiba] Thresholds, Tipping Points

Optimal Spending for Prevention (w), Treatment (u), and Law Enforcement (v) as Functions of the Number of Cocaine Users (A)

“Eradicate or Accommodate”: Skiba/DNS(S) Threshold

Multiple Equilibria, Thresholds, and Bifurcations

The LHY – Model

Memory, Contagion, and Capture Rates: Characterizing the Types of Addictive Behavior that Are Prone to Repeated Epidemics

D. Behrens, J.P. Caulkins, G. Tragler, G. Feichtinger

The U.S. cocaine epidemic

[Figure: historical data vs. model.]

Uncontrolled Dynamics and Historical Data

[Figure: phase portrait in the (L, H) plane, with L up to 8 × 10^6 and H up to 2 × 10^6. Shown are the historical trajectory (years 1970 to 1996 marked), modeled trajectories, the modeled 1970 trajectory, and the curves labelled $\dot H = 0$ and $\dot L = 0$.]

Nonlinear Feedback #2

• Initiation may be increasing in the number of contented users, but not all users are happy, or are good “advertisements” for the merits of drug use
– Distinguish L = light and H = heavy users
– The Musto effect posits negative feedback from H and/or from H/L

Early Intervention is Valuable

[Figure: (L, H) phase portrait comparing the uncontrolled modeled epidemic with the controlled epidemic (optimally allocated budget); the years 1967, 1975, 1979 and 1996 are marked.]

Dramatic Result: Treat and Prevent, But Not at the Same Time

[Figure: control spending in billion dollars (0 to 1.2) on prevention and on treatment, by year, 1967-1997.]

Nonlinear Feedback #3

• “Musto Effect”
– Adverse consequences of addiction become known 10 or so years after initiation
• “Adverse Reactions”
– Adverse reactions can happen much sooner, particularly with novice users who initiated recently
– Experience of adverse reactions may “disappear” from social environment relatively quickly

$$\dot L = I(L,Y) - (a+b)L, \qquad L(0) = L_0,$$
$$\dot H = bL - gH, \qquad H(0) = H_0,$$
$$\dot Y = H - \delta Y, \qquad Y(0) = Y_0,$$

$$I(L,Y) = \tau + sL\exp\!\left(-q\,\frac{Y}{L}\right)$$

L = Number of light users,

H = Number of heavy users, and

Y = Decaying heavy user years.
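A minimal simulation sketch of the uncontrolled LHY dynamics, assuming the form dL/dt = I(L,Y) − (a+b)L, dH/dt = bL − gH, dY/dt = H − δY with initiation I(L,Y) = τ + s·L·exp(−qY/L). The values of a, b, g, q and the initial state are those quoted on the slides; τ, s and δ below are hypothetical placeholders chosen for illustration, and the explicit Euler scheme is a simplification.

```python
import math


def initiation(L, Y, tau, s, q):
    """Initiation: baseline inflow plus contagion from L, damped by memory Y."""
    return tau + s * L * math.exp(-q * Y / L)


def simulate_lhy(L0=1.4e6, H0=1.3e5, Y0=1.1e5, years=40.0, dt=0.01,
                 a=0.163, b=0.024, g=0.062, q=3.443,
                 tau=5.0e4, s=0.61, delta=0.291):
    """Explicit Euler integration; tau, s and delta are assumed values."""
    L, H, Y = L0, H0, Y0
    path = [(0.0, L, H, Y)]
    steps = int(round(years / dt))
    for i in range(1, steps + 1):
        dL = initiation(L, Y, tau, s, q) - (a + b) * L
        dH = b * L - g * H
        dY = H - delta * Y
        L, H, Y = L + dt * dL, H + dt * dH, Y + dt * dY
        path.append((i * dt, L, H, Y))
    return path


if __name__ == "__main__":
    path = simulate_lhy()
    for t, L, H, Y in path[::500]:   # print every 5 model years
        print(f"t={t:5.1f}  L={L:12.0f}  H={H:12.0f}  Y={Y:12.0f}")
```

Lowering the memory term's persistence (a larger `delta`) weakens the negative feedback from past heavy use, which is the mechanism the slides associate with repeated epidemics.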

General Two-Stage Escalation Model

Stage #1 (L):
– Tax evasion
– Don‘t buckle up
– Light drug use
– Risky behavior

Stage #2 (H):
– Detection
– Road casualties
– Heavy drug use
– HIV/HCV infection

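Flow chart of the LHY model

[Figure: Initiation (I) feeds the stock of Light Users (L). Light users leave at the rate of desistance (a) or move on to Heavy Users (H) at the rate of escalation (b); heavy users leave at the rate of desistance (g). Heavy use accumulates in the Memory of Heavy Users’ Years (Y), which decays at the forgetting rate ($\delta$). There is a positive feedback of occasional drug use on initiation and a negative feedback of drug abuse on initiation.]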

[Figure: modeled epidemic (light users L, heavy users H) and modeled memory of heavy user years (Y), 1970-2010, on a scale from 0 to 8M.]

Plot of modeled numbers of users including the memory of abuse and “sequence of peaks” for the base case parameter set. L0 = 1,400,000, H0 = 130,000, Y0 = 110,000.

Table 1: Examples for the LHY-model’s possible fields of application

Problem type | Stage 1 (risky) | Stage 2 (harmful) | Control interventions
taxes | tax evasion | detected | tax audits
seat-belts | don‘t buckle up | road casualties | information campaigns, controls
illicit drugs | light use | addictive (heavy) use | prevention, treatment, enforcement
violence | ‚boys are boys‘ | substantial violence | education, prevention
sexual harassment | dirty jokes | sex-violence | prevention
HIV / AIDS | unsafe sex | HIV infected | condoms, medication
marketing | satisfied customer | unsatisfied customer | product quality, service, price

Why present-oriented societies undergo cycles of drug epidemics

'Those who forget the past are condemned to repeat it. '

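$$J = \int_0^\infty e^{-rt}\big(\kappa(\mu_L L + \mu_H H) + u + w\big)\,dt$$

$$\dot L = I(L,Y)\,\psi(w) - (a+b)L, \qquad L(0) = L_0,$$
$$\dot H = bL - \big(g + \beta(H,u)\big)H, \qquad H(0) = H_0,$$
$$\dot Y = H - \delta Y, \qquad Y(0) = Y_0,$$
$$u, w \ge 0.$$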

Efficiency functions

Prevention:

$$\psi(w) = h + (1-h)\,e^{-mw}$$

Treatment:

$$\beta(H,u) = c\left(\frac{u}{H}\right)^{d}$$

Stability regions for any combination of s and $\delta$ values for a = 0.163, b = 0.024, g = 0.062, and q = 3.443.

[Figure: parameter plane with s on one axis (ticks from 0 to 0.8 and from 0 to 0.7), divided into regions of monotonous stability, damped oscillation, and cycles.]

Moving To Opportunity

Keeping „the rich“ while placing „the poor“

Trade-off between positive effects (assimilation, social capillarity) & negative effects („flight“)

Schelling (1971), Caulkins et al. (2005)

Side step 1:

A Production / Inventory Model: „Intensity splitting“

$$\min_{u(\cdot)} \int_0^\infty e^{-rt}\Big(\tfrac{h}{2}\,x^2 + \varphi(v) + \tfrac{c}{2}\,u^2\Big)\,dt$$

$$\dot x = v - d, \qquad \dot v = u$$

x … stock of inventory
v … production rate
d … constant demand
u … rate of production change
φ(v) … concave-convex production cost

How can constant demand lead to periodic production?

Side step 2:

Production / inventory model: a triple indifference point (Steindl, 04)

Deissenberg et al. (2004), Wagener (2003,2006)

Wagener‘s conjecture: In an n-state optimal control model there are generically no indifference points of higher multiplicity than n

Lessons from Drug Modeling that Might Apply to Modeling Counter-Terror

• Think of there being a population or “stock” of terrorists

Countering Terrorism

Modeling a “Stock” of Terrorists is not Common but Has Precedent

• Keohane and Zeckhauser, 2003: “The ecology of terror defense”
• Castillo-Chavez & Song, 2003: “Models for the transmission dynamics of fanatical behaviors”
• Greiner, 2004: “How much to spend on the fight against terrorism? An optimal control approach”
• Faria, 2004: “Terror cycles”
• Kaplan et al., 2005: “What happened to suicide bombings in Israel? Insights from a terror stock model”
• Udwadia et al., 2006: “A dynamical model of terrorism”

Lessons from Drug Modeling that Might Apply to Modeling Counter-Terror

• Think of there being a population or “stock” of terrorists
• Focus on “flows” in and out of that stock
– Levels of associated “reputation” or “perception” stocks can affect those flows
– Pay attention to the possibility that some interventions that increase an outflow might simultaneously increase inflow

Examples of Feedback from Perceptions/Reputation to Inflow

• Mountain climbing is safer, so now there are more deaths from mountain climbing (MacCoun)
• Prevalence of HIV affects sexual behavior and hence inflow into stock of …
– Susceptibles and/or
  • Blower, Doyle, & Lewis (2000), Greenhalgh et al. (2001)
– Infecteds
  • Zeiler (2007)
• Drug epidemics end when enough people have progressed to states of use that manifest the drugs’ potential ill effects (Musto, 1987; Kleiman, 1992; Behrens et al. 1999, 2000)

Data Limitations

• Drug data are poor
• Terror stock data are worse
– So models are stylized and not validated
– Value stems from articulating ideas in a language as precise as mathematics
  • Not from hypothesis testing or
  • Providing specific, quantitative guidance

Counter-Terror Measures in a Multi-Stage Scenario

„Fire strategies“: territorial bombing, aggressively searching all people, activities involving significant collateral damage

inconvenience to third parties, resentment by population, stimulation of recruitment rates, elimination of current terrorists

„Water strategies“: intelligence driven arrests or „surgical“ operations against almost certainly guilty individuals

no harm to innocent parties, higher acceptance by population, expensive, difficult to apply


Occurrence of a terrorist attack at t = 0

Stage 1: modest counter measures, „water strategy“

Stage 2: additional, more aggressive measures, „fire strategy“

(side effect: increased inflow of recruits to terror organisation)

x(t) … number of terrorists at time t

u(t) … „water strategy“ at time t

v(t) … „fire strategy“ at time t


Stage 1:

s.t.

Stage 2:

s.t.

where


Conclusions

Flavor of Pontryagin’s Maximum Principle

– Qualitative insights into structural properties of optimal solutions

– Bifurcation analysis of NLDS

Epidemiological aspects

– Feedbacks: prevalence dep. initiation & transition rates

– Social interactions


Heterogeneities:

– Compartment models
– PDE vs. OLG

Dynamic games:

– Strategic interactions in intertemporal competitive situations

J. Lesourne’s ocean statement
