Experiments in Robust Optimization


Daniel Bienstock

Columbia University

12-06-2006


Motivation

Robust Optimization

Optimization under parameter (data) uncertainty

Ben-Tal and Nemirovsky, El Ghaoui et al

Bertsimas et al

Uncertainty is modeled by assuming that data is not known precisely, and will instead lie in known sets.

Example: a coefficient a_i is uncertain. We allow a_i ∈ [l_i, u_i].

Typically, a minimization problem becomes a min-max problem.


Motivation

Example: Linear Programs with Row-Wise uncertainty

Ben-Tal and Nemirovsky, 1999

min c^T x

Subject to:

Ax ≥ b for all A ∈ U

x ∈ X

U = uncertainty set → the i-th row of A belongs to an ellipsoidal set E_i, e.g. Σ_j α_ij^2 (a_ij − ā_ij)^2 ≤ 1

→ can be solved using SOCP techniques

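As an aside not on the original slides, a minimal cvxpy sketch of one such robust row: for the ellipsoid E_i = { a_i : Σ_j α_ij^2 (a_ij − ā_ij)^2 ≤ 1 }, the worst case of a_i^T x ≥ b_i is ā_i^T x − ‖x/α_i‖_2 ≥ b_i (componentwise division), a second-order cone constraint. All data below is illustrative.

```python
# Sketch: robust LP rows with ellipsoidal uncertainty, posed as SOCP constraints.
# Data (a_bar, alpha, b, c) is made up for illustration.
import numpy as np
import cvxpy as cp

np.random.seed(0)
m, n = 5, 8
a_bar = 1.0 + np.random.rand(m, n)     # nominal rows of A
alpha = 5.0 * np.ones((m, n))          # ellipsoid scalings (larger = tighter set)
b = np.ones(m)
c = np.random.rand(n)

x = cp.Variable(n, nonneg=True)
constraints = []
for i in range(m):
    # worst case of a_i^T x over the ellipsoid: a_bar_i^T x - || x / alpha_i ||_2
    constraints.append(a_bar[i] @ x - cp.norm(cp.multiply(1.0 / alpha[i], x), 2) >= b[i])

prob = cp.Problem(cp.Minimize(c @ x), constraints)
prob.solve()
print("robust objective:", prob.value)
```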

Motivation

Other forms of optimization under uncertainty

Stochastic programming

Adversarial queueing, online optimization

“Risk-aware” optimization

Optimization of utility functions as a substitute for handling infeasibilities


Motivation

Scenario I: Stability

Data is fairly accurate, though possibly noisy – small errors are possible


→ Idiosyncratic decisions and small changes in data could have major impact


Motivation

Scenario II: Hedging

Significant, but within order-of-magnitude, data uncertainty

Example: A certain parameter, α, is volatile. Its long-term average is 1.5, but we could expect changes of the order of 0.3.

Possibly more than just noise

Could use deviations to our advantage, especially if there are several uncertain parameters that act “correlated”

Are we guarding against risk or are we hedging?


Motivation

Scenario III: Insurance

Real-world data can exhibit undesirable and unexpected behavior

Classical goal: how can we protect without becoming too risk-averse?

Need to clearly spell out the desired tradeoff between risk and performance

Magnitude and geometry of risk are not the same


Motivation

Application: Portfolio Optimization

min λ x^T Q x − µ^T x

Subject to:

Ax ≥ b

µ = vector of “returns”, Q = “covariance” matrix

x = vector of “asset weights”

Ax ≥ b : general linear constraints

λ ≥ 0 = “risk-aversion” multiplier


Motivation

Robust Portfolio Optimization

Goldfarb and Iyengar, 2001

→ Q and µ are uncertain

Robust Problem

min_x { max_{Q ∈ Q} λ x^T Q x − min_{µ ∈ E} µ^T x }

Subject to: Σ_j x_j = 1, x ≥ 0

→ When Q is an ellipsoid and E is a product of intervals the robust problem can be solved as an SOCP


Motivation

Linear Programs with Row-Wise uncertainty

Bertsimas and Sim, 2002

max c^T x

Subject to:

Ax ≤ b for all A ∈ U

x ≥ 0

→ in every row i at most Γ_i coefficients can change: ā_ij − δ_ij ≤ a_ij ≤ ā_ij + δ_ij

→ for all other coefficients: a_ij = ā_ij.

Robust problem can be formulated as a (larger) linear program.

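As an illustration only (the reformulation is the standard one; the code is not from the talk), a cvxpy sketch of the resulting larger LP for x ≥ 0: protecting row i against up to Γ_i deviations adds auxiliary variables z_i and p_ij obtained from LP duality over the row's uncertainty set. All data is made up.

```python
# Sketch of the Bertsimas-Sim robust counterpart as a single (larger) LP, x >= 0.
# Data (A_bar, delta, b, c, Gamma) is illustrative.
import numpy as np
import cvxpy as cp

np.random.seed(1)
m, n = 4, 6
A_bar = np.random.rand(m, n)          # nominal coefficients
delta = 0.1 * np.random.rand(m, n)    # deviation magnitudes delta_ij
b = 2.0 * np.ones(m)
c = np.random.rand(n)
Gamma = 2 * np.ones(m)                # at most Gamma_i coefficients deviate per row

x = cp.Variable(n, nonneg=True)
z = cp.Variable(m, nonneg=True)       # dual of the budget Gamma_i
p = cp.Variable((m, n), nonneg=True)  # dual of the per-coefficient deviation bounds

constraints = []
for i in range(m):
    # worst-case row i:  A_bar_i x + Gamma_i z_i + sum_j p_ij <= b_i
    constraints.append(A_bar[i] @ x + Gamma[i] * z[i] + cp.sum(p[i]) <= b[i])
    # z_i + p_ij >= delta_ij x_j  for all j
    constraints.append(z[i] + p[i] >= cp.multiply(delta[i], x))

prob = cp.Problem(cp.Maximize(c @ x), constraints)
prob.solve()
print("robust optimum:", prob.value)
```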


Motivation

Equivalent formulation

max c^T x

Subject to:

Ax ≤ b for all A ∈ U

x ≥ 0

→ in every row i exactly Γ_i coefficients change: a_ij = ā_ij + δ_ij

→ for all other coefficients: a_ij = ā_ij.


Data-driven Model definition

Robust Portfolio Optimization

A different uncertainty model

→ Want to model that deviations of the returns µ_j from their nominal values are rare but could be significant

A simple example

Parameters: 0 ≤ γ ≤ 1, integer N ≥ 0; for each asset j: µ̄_j = expected return, 0 ≤ δ_j small (possibly zero)

Well-behaved asset j: µ̄_j − δ_j ≤ µ_j ≤ µ̄_j + δ_j

Misbehaving asset j: (1 − γ)µ̄_j ≤ µ_j ≤ µ̄_j

At most N assets misbehave


Data-driven Model definition

A more comprehensive setting

Parameters: 0 ≤ γ_1 ≤ γ_2 ≤ . . . ≤ γ_K ≤ 1, integers 0 ≤ n_i ≤ N_i, 1 ≤ i ≤ K

for each asset j: µ̄_j = expected return

between n_i and N_i assets j satisfy:

(1 − γ_i)µ̄_j ≤ µ_j ≤ (1 − γ_{i−1})µ̄_j, for each i ≥ 1 (γ_0 = 0)


Data-driven Model definition

A more comprehensive setting

Parameters: 0 ≤ γ_1 ≤ γ_2 ≤ . . . ≤ γ_K ≤ 1, integers 0 ≤ n_i ≤ N_i, 1 ≤ i ≤ K; for each asset j: µ̄_j = expected return

between n_i and N_i assets j satisfy: (1 − γ_i)µ̄_j ≤ µ_j ≤ (1 − γ_{i−1})µ̄_j

Σ_j µ_j ≥ Γ Σ_j µ̄_j ; Γ > 0 a parameter

(R. Tutuncu) For 1 ≤ h ≤ H, a set (“tier”) T_h of assets, and a parameter Γ_h > 0:

for each h, Σ_{j∈T_h} µ_j ≥ Γ_h Σ_{j∈T_h} µ̄_j

Note: only downwards changes are modeled


Data-driven Benders

General methodology:

Benders’ decomposition (= cutting-plane algorithm)

Generic problem: min_{x∈X} max_{d∈D} f(x, d)

→ Maintain a finite subset D̂ of D (a “model”)

GAME

1. Implementor: solve min_{x∈X} max_{d∈D̂} f(x, d), with solution x*

2. Adversary: solve max_{d∈D} f(x*, d), with solution d*

3. Add d* to D̂, and go to 1.

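A minimal sketch of this loop, not from the slides: solve_implementor and solve_adversary are hypothetical callables standing in for the two optimization problems, and the loop stops once the adversary cannot improve on the current model by more than a tolerance.

```python
# Generic implementor/adversary cutting-plane loop for min_x max_{d in D} f(x, d).
# solve_implementor(D_hat) -> (x_star, value of min_x max_{d in D_hat} f(x, d))
# solve_adversary(x_star)  -> (d_new,  value of max_{d in D} f(x_star, d))
# Both are placeholders to be supplied by the user of this sketch.
def cutting_plane(solve_implementor, solve_adversary, d_init, tol=1e-6, max_iters=100):
    D_hat = [d_init]                              # finite "model" of the uncertainty set
    x_star, upper = None, float("inf")
    for _ in range(max_iters):
        x_star, lower = solve_implementor(D_hat)  # step 1: implementor
        d_new, upper = solve_adversary(x_star)    # step 2: adversary
        if upper - lower <= tol:                  # model already captures the worst case
            break
        D_hat.append(d_new)                       # step 3: enlarge the model, repeat
    return x_star, upper
```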


Data-driven Benders

Why this approach

Decoupling of implementor and adversary yields considerably simpler, and smaller, problems

Decoupling allows us to use more sophisticated uncertainty models

If the number of iterations is small, the implementor's problem is a small “convex” problem

Most progress will be achieved in initial iterations – permits “soft” termination criteria


Data-driven Algorithm

Implementor’s problem

A convex quadratic program

At iteration m, solve

min λ x^T Q x − r

Subject to:

Ax ≥ b

r ≤ µ_(i)^T x, i = 1, . . . , m

Here, µ_(1), . . . , µ_(m) are given return vectors

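For concreteness (illustrative data, not the author's code), the implementor's QP written in cvxpy; the quadratic term uses a factor matrix F so that Q = F F^T is positive semidefinite, and the constraint matrix A simply encodes a fully invested, long-only portfolio.

```python
# Sketch of the implementor's QP at iteration m:
#   min lambda * x^T Q x - r   s.t.  Ax >= b,  r <= mu_i^T x for each stored mu_i.
import numpy as np
import cvxpy as cp

np.random.seed(2)
n, lam = 10, 1.0
F = np.random.rand(n, 3)                 # factor loadings, so Q = F F^T is PSD
A = np.vstack([np.ones(n), np.eye(n)])   # sum(x) >= 1 and x >= 0 (illustrative)
b = np.concatenate([[1.0], np.zeros(n)])
mus = [np.random.rand(n) for _ in range(3)]   # return vectors supplied by the adversary

x = cp.Variable(n)
r = cp.Variable()
constraints = [A @ x >= b] + [r <= mu @ x for mu in mus]
# lam * x^T (F F^T) x written as lam * ||F^T x||^2 to keep the problem DCP
prob = cp.Problem(cp.Minimize(lam * cp.sum_squares(F.T @ x) - r), constraints)
prob.solve()
print("worst-case modeled return r:", r.value)
```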

Data-driven Algorithm

Adversarial problem: A mixed-integer program

x* = given asset weights

min Σ_j x*_j µ_j

Subject to:

µ̄_j (1 − Σ_i γ_i y_ij) ≤ µ_j ≤ µ̄_j (1 − Σ_i γ_{i−1} y_ij), ∀ j (γ_0 = 0)

Σ_i y_ij ≤ 1, ∀ j (each asset in at most one segment)

n_i ≤ Σ_j y_ij ≤ N_i, 1 ≤ i ≤ K (segment cardinalities)

Σ_{j∈T_h} µ_j ≥ Γ_h Σ_{j∈T_h} µ̄_j, 1 ≤ h ≤ H (tier ineqs.)

µ_j free, y_ij = 0 or 1, all i, j


Data-driven Experiments

Example: 2464 assets, 152-factor model. CPU time: 500 seconds

10 segments (a “heavy tail”); 6 tiers: the top five deciles lose at most 10% each, total loss ≤ 5%


Data-driven Experiments

Same run

2464 assets, 152 factors; 10 segments, 6 tiers


Data-driven Experiments

Summary of average problems with 3-4 segments, 2-3 tiers

     columns   rows   iterations   time (sec.)   imp. time   adv. time
 1       500     20           47          1.85        1.34        0.46
 2       500     20            3          0.09        0.01        0.03
 3       703    108            1          0.29        0.13        0.04
 4       499    140            3          3.12        2.65        0.05
 5       499     20           19          0.42        0.21        0.17
 6      1338     81            7          0.45        0.17        0.08
 7      2019    140            8         41.53        39.6        0.36
 8      2443    153            2         12.32        9.91        0.07
 9      2464    153          111        100.81       60.93       36.78


Data-driven Approximation

Why the adversarial problem is “easy”

(x* = given asset weights)

min Σ_j x*_j µ_j

Subject to:

µ̄_j (1 − Σ_i γ_i y_ij) ≤ µ_j ≤ µ̄_j (1 − Σ_i γ_{i−1} y_ij), ∀ j

Σ_i y_ij ≤ 1, ∀ j (each asset in at most one segment)

n_i ≤ Σ_j y_ij ≤ N_i, ∀ i (segment cardinalities)

Σ_{j∈T_h} µ_j ≥ Γ_h (Σ_{j∈T_h} µ̄_j), ∀ h (tier inequalities)

µ_j free, y_ij = 0 or 1, all i, j


Data-driven Approximation

Why the adversarial problem is “easy”

( K = no. of segments, H = no. of tiers)

Theorem. For every fixed K and H, and for every ε > 0, there is an algorithm that finds a solution to the adversarial problem with optimality relative error ≤ ε, in time polynomial in ε^{-1} and n (= no. of assets).


Data-driven Approximation

The simplest case

max Σ_j x*_j δ_j

Subject to:

Σ_j δ_j ≤ Γ

0 ≤ δ_j ≤ u_j y_j, y_j = 0 or 1, all j

Σ_j y_j ≤ N

· · · a cardinality constrained knapsack problem
B. (1995), DeFarias and Nemhauser (2004)

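A toy brute-force check of this problem (not from the slides): for a fixed support the continuous part is a fractional knapsack that is filled greedily by decreasing x*_j, and the sketch enumerates all supports of size at most N, so it is only meant for very small instances. All numbers are illustrative.

```python
# Cardinality constrained knapsack:
#   max sum_j x*_j delta_j  s.t.  sum_j delta_j <= Gamma,
#   0 <= delta_j <= u_j y_j,  y binary,  sum_j y_j <= N.
from itertools import combinations

def fill_greedy(support, x_star, u, Gamma):
    """Best delta on a fixed support: spend the budget on the largest x*_j first."""
    value, budget = 0.0, Gamma
    for j in sorted(support, key=lambda j: -x_star[j]):
        d = min(u[j], budget)
        value += x_star[j] * d
        budget -= d
    return value

def card_knapsack(x_star, u, Gamma, N):
    n = len(x_star)
    best = 0.0
    for size in range(1, N + 1):
        for support in combinations(range(n), size):
            best = max(best, fill_greedy(support, x_star, u, Gamma))
    return best

x_star = [0.30, 0.25, 0.20, 0.15, 0.10]   # asset weights
u = [0.5, 1.0, 1.0, 2.0, 2.0]             # per-asset deviation caps
print(card_knapsack(x_star, u, Gamma=2.0, N=2))
```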

Data-driven More experiments

What is the impact of the uncertainty model

All runs on the same data set with 1338 columns and 81 rows

1 segment: (200, 0.5); robust random return = 4.57, 157 assets

2 segments: (200, 0.25), (100, 0.5); robust random return = 4.57, 186 assets

2 segments: (200, 0.2), (100, 0.6); robust random return = 3.25, 213 assets

2 segments: (200, 0.1), (100, 0.8); robust random return = 1.50, 256 assets

1 segment: (100, 1.0); robust random return = 1.24, 281 assets


VAR Definition

Ambiguous chance-constrained models

1. The implementor chooses a vector x* of assets

2. The adversary chooses a probability distribution P for the returns vector

3. A random returns vector µ is drawn from P

→ Implementor wants to choose x* so as to minimize value-at-risk (conditional value at risk, etc.)

Erdogan and Iyengar (2004), Calafiore and Campi (2004)

→ We want to model correlated errors in the returns


VAR Definition

Uncertainty set

Given a vector x* of assets, the adversary

1. Chooses a vector w ∈ R^n (n = no. of assets) with 0 ≤ w_j ≤ 1 for all j.

2. Chooses a random variable 0 ≤ δ ≤ 1

→ Random return: µ_j = µ̄_j (1 − δ w_j) (µ̄ = nominal returns).

Definition: Given reals ν and 0 ≤ θ ≤ 1, the value-at-risk of x* is the real ρ ≥ 0 such that

Prob(ν − µ^T x* ≥ ρ) ≥ θ

→ The adversary wants to maximize VAR

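To make the definition concrete, a small simulation sketch with made-up numbers (not from the slides): draw δ from a discrete distribution, set µ_j = µ̄_j(1 − δ w_j), and take the largest ρ such that at least a θ fraction of the sampled losses ν − µ^T x* is ≥ ρ.

```python
# Empirical value-at-risk in the sense of the slide: the largest rho with
# Prob(nu - mu^T x* >= rho) >= theta, estimated from samples.
import numpy as np

rng = np.random.default_rng(3)
n = 50
mu_bar = 0.05 + 0.05 * rng.random(n)      # nominal returns mu_bar_j
x_star = np.full(n, 1.0 / n)              # equal-weight portfolio
w = rng.random(n)                         # adversary's directions, 0 <= w_j <= 1
nu = mu_bar @ x_star                      # reference value: nominal portfolio return

delta_vals = np.array([0.0, 0.25, 0.5, 1.0])
pi = np.array([0.05, 0.45, 0.30, 0.20])   # adversary's probabilities for delta
theta = 0.9

samples = 100_000
delta = rng.choice(delta_vals, size=samples, p=pi)
# loss = nu - mu^T x*  with  mu_j = mu_bar_j (1 - delta w_j)
losses = delta * ((mu_bar * w) @ x_star)
# VaR per the slide: the largest rho with Prob(loss >= rho) >= theta
losses_desc = np.sort(losses)[::-1]
rho = losses_desc[int(np.floor(theta * samples)) - 1]
print("empirical value-at-risk:", rho)
```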


VAR Definition

OR ...

Given a vector x* of assets, the adversary

1. Chooses a vector w ∈ R^n (n = no. of assets) with 0 ≤ w_j ≤ W for all j.

2. Chooses a random variable 0 ≤ δ ≤ 1

→ Random return: µ_j = µ̄_j − δ w_j (µ̄ = nominal returns).

Definition: Given reals ν and 0 ≤ θ ≤ 1, the value-at-risk of x* is the real ρ ≥ 0 such that

Prob(ν − µ^T x* ≥ ρ) ≥ θ

→ The adversary wants to maximize VAR


VAR Definition

The classical factor model for returns

µ = µ̄ + V^T f + ε

where

µ̄ = expected return,

V = “factor exposure matrix”,

f = a bounded random variable,

ε = residual errors

V is r × n with r << n.

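A short simulation sketch of this model with made-up dimensions and distributions (not from the slides):

```python
# Simulating returns from the factor model mu = mu_bar + V^T f + eps,
# with a bounded factor vector f and residual errors eps.
import numpy as np

rng = np.random.default_rng(4)
n, r = 200, 5                             # n assets, r << n factors
mu_bar = 0.05 + 0.05 * rng.random(n)      # expected returns
V = rng.normal(scale=0.1, size=(r, n))    # factor exposure matrix (r x n)

samples = 10_000
f = rng.uniform(-1.0, 1.0, size=(samples, r))        # bounded factor draws
eps = rng.normal(scale=0.01, size=(samples, n))      # residual errors
mu = mu_bar + f @ V + eps                            # each row is one return vector

print("sample mean vs mu_bar (first 3 assets):")
print(mu.mean(axis=0)[:3], mu_bar[:3])
```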

VAR Adversarial problem

→ Random return_j: µ_j = µ̄_j (1 − δ w_j), where 0 ≤ w_j ≤ 1 ∀ j, and 0 ≤ δ ≤ 1 is a random variable.

A discrete distribution:

We are given fixed values 0 = δ_0 ≤ δ_1 ≤ ... ≤ δ_K = 1 (example: δ_i = i/K)

Adversary chooses π_i = Prob(δ = δ_i), 0 ≤ i ≤ K

The π_i are constrained: we have fixed bounds, π_i^l ≤ π_i ≤ π_i^u (and possibly other constraints)

Tier constraints: for sets (“tiers”) T_h of assets, 1 ≤ h ≤ H, we require: E(δ Σ_{j∈T_h} w_j) ≤ Γ_h (given), or, (Σ_i δ_i π_i) Σ_{j∈T_h} w_j ≤ Γ_h

Cardinality constraint: w_j > 0 for at most N indices j


VAR Adversarial problem

The adversarial problem is “easy”

K = no. of points in discrete distribution, H = no. of tiers

Theorem

Without the cardinality constraint, for each fixed K and H the adversarial problem can be solved as a polynomial number of linear programs.

With the cardinality constraint, for each fixed K and H the adversarial problem can be solved as a polynomial number of knapsack problems.


VAR Adversarial problem

Adversarial problem as an MIP

Recall: random return_j µ_j = µ̄_j (1 − δ w_j), where δ = δ_i (given) with probability π_i (chosen by adversary), 0 ≤ δ_0 ≤ δ_1 ≤ . . . ≤ δ_K = 1 and 0 ≤ w_j ≤ 1

min_{π,w,V} min_{1≤i≤K} V_i

Subject to

0 ≤ w_j ≤ 1, all j; π_i^l ≤ π_i ≤ π_i^u, all i; Σ_i π_i = 1,

V_i = Σ_j µ̄_j (1 − δ_i w_j) x*_j, if π_i + π_{i+1} + . . . + π_K ≥ 1 − θ

V_i = M (large), otherwise

(Σ_i δ_i π_i) Σ_{j∈T_h} w_j ≤ Γ_h, for each tier h


VAR Adversarial problem

Approximation

(Σ_i δ_i π_i) Σ_{j∈T_h} w_j ≤ Γ_h, for each tier h   (∗)

Let N > 0 be an integer. For 1 ≤ k ≤ N, write

(k/N) Σ_{j∈T_h} w_j ≤ Γ_h + M (1 − z_hk), where

z_hk = 1 if (k−1)/N < Σ_i δ_i π_i ≤ k/N

z_hk = 0 otherwise

Σ_k z_hk = 1

and M is large

Lemma. Under reasonable conditions, replacing (∗) with this system changes the value of the problem by at most a factor of (1 + 1/N).


VAR

Implementor’s problem

Find a near-optimal solution with minimum value-at-risk

Nominal problem:

v* = min_x λ x^T Q x − µ̄^T x

Subject to:

Ax ≥ b


VAR

Implementor’s problem

Find a near-optimal solution with minimum value-at-risk

Given asset weights x , we have:

value-at-risk ≥ ρ, if the adversary can produce a return vector µ with

Prob(ν − µ^T x ≥ ρ) ≥ θ

where ν is a fixed reference value.


VAR

Implementor’s problem

Find a near-optimal solution with minimum value-at-risk

Implementor’s problem at iteration r:

min V

Subject to:

λ x^T Q x − µ̄^T x ≤ (1 + ε) v*

Ax ≥ b

V ≥ ν − Σ_j µ̄_j (1 − δ_{i(t)} w_j^{(t)}) x_j, t = 1, 2, . . . , r − 1

Here, δ_{i(t)} and w^{(t)} are the adversary's output at iteration t < r.


VAR Runs

First set of experiments

1338 assets, 41 factors, 81 rows

problem: find a “near optimal” solution with minimum value-at-risk, for a given threshold probability θ

experiment: investigate different values of θ

“near optimal”: want solutions that are at most 1% more expensive than optimal

random variable δ:


VAR Runs

1338 assets, 41 factors, 81 rows, ≤ 1% suboptimality

  θ     time (sec.)   iters.     VAR      VAR as %
 .89        1.18          2    3.43131      71.50
 .90        1.42          3    3.74498      78.04
 .91        1.42          3    3.74498      78.04
 .92        3.47         11    4.05669      84.53
 .93        6.35         29    4.05721      84.54
 .94        6.37         20    4.05721      84.54
 .95       26.59         51    4.35481      90.74
 .96       26.25         51    4.35481      90.74
 .97       26.20         51    4.35481      90.74
 .98       33.07         58    4.63938      96.67
 .99       33.11         58    4.63938      96.67


VAR Runs

Second set of experiments

Fix θ = 0.90 but vary suboptimality criterion


VAR Runs

Typical convergence behavior


VAR Runs

Heavy tail, proportional error (100 points):

Heavy tail, constant error (100 points):


VAR Runs

         Proportional         Constant
  θ      Time      Its      Time     Its
 .85      4.79       8     11.84      11
 .86      2.32       3      8.27       8
 .87      6.40      10      9.55      11
 .88      4.34       4     18.10      24
 .89      8.00      14      5.85       6
 .90      2.58       4     13.54      20
 .91      4.79       9     16.31      23
 .92      7.99      15     13.13      22
 .93     13.43      27     22.47      40
 .94     10.04      15     21.99      40
 .95      9.59      16     11.90      25
 .96      6.63      17     29.89      54
 .97     48.43     110     16.45      35
 .98     20.25      53     20.25      45
 .99     22.02      52     21.89      47


VAR Harder case

A difficult case

2464 columns, 152 factors, 3 tiers

time = 6191 seconds

258 iterations

implementor time = 6123 seconds, adversarial time = 20 seconds


VAR Harder case

Implementor runtime


VAR Harder case

Implementor’s problem at iteration r

min V

Subject to:

λ x^T Q x − µ̄^T x ≤ (1 + ε) v*

Ax ≥ b

V ≥ ν − Σ_j µ̄_j (1 − δ_{i(t)} w_j^{(t)}) x_j, t = 1, 2, . . . , r − 1

Here, δ_{i(t)} and w^{(t)} are the adversary's output at iteration t < r.


VAR Harder case

Implementor’s problem at iteration r

Approximate version

min V

Subject to:

2λ x_(k)^T Q x − λ x_(k)^T Q x_(k) − µ̄^T x ≤ (1 + ε) v*, ∀ k < r

Ax ≥ b

V ≥ ν − Σ_j µ̄_j (1 − δ_{i(k)} w_j^{(k)}) x_j, ∀ k < r

Here, δ_{i(k)} and w^{(k)} are the adversary's output at iteration k < r, and x_(k) is the implementor's output at iteration k.

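A small numerical check of why this is safe (illustrative, not from the slides): for positive semidefinite Q the tangent 2 x_(k)^T Q x − x_(k)^T Q x_(k) never exceeds x^T Q x and matches it at x = x_(k), so each linearized constraint is an outer approximation of the quadratic one, tight at the previous iterate.

```python
# Check on random data: q_lin(x) = 2 x_k^T Q x - x_k^T Q x_k  satisfies
# q_lin(x) <= x^T Q x for all x, with equality at x = x_k (Q PSD).
import numpy as np

rng = np.random.default_rng(5)
n = 8
F = rng.normal(size=(n, n))
Q = F @ F.T                      # PSD matrix

x_k = rng.normal(size=n)         # linearization point (previous iterate)
quad = lambda x: x @ Q @ x
q_lin = lambda x: 2 * x_k @ Q @ x - x_k @ Q @ x_k

for _ in range(5):
    x = rng.normal(size=n)
    assert q_lin(x) <= quad(x) + 1e-9    # tangent lies below the quadratic
print("equality at x_k:", np.isclose(q_lin(x_k), quad(x_k)))
```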

VAR Harder case

Does it work?

Before: 258 iterations, 6191 seconds

Linearized: 1776 iterations, 3969 seconds


VAR Harder case

Averaging

x_(k) is the implementor's output at iteration k.

Define y_(1) = x_(1)

For k > 1, y_(k) = λ x_(k) + (1 − λ) y_(k−1), 0 ≤ λ ≤ 1

Input y_(k) to the adversary

Old ideas, also Nesterov, Nemirovsky (2003)

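In code the averaging is just a running convex combination of the iterates (a few illustrative lines; the smoothing weight is written lam here to avoid clashing with the risk-aversion multiplier):

```python
# Running average of implementor iterates; y_k is what the adversary sees.
import numpy as np

def averaged_iterates(xs, lam=0.5):
    y = np.array(xs[0], dtype=float)          # y_(1) = x_(1)
    out = [y.copy()]
    for x_k in xs[1:]:
        y = lam * np.asarray(x_k, float) + (1.0 - lam) * y   # y_(k)
        out.append(y.copy())
    return out

# usage: feed each averaged point to the adversarial problem instead of x_(k)
iterates = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
print(averaged_iterates(iterates))
```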


VAR Harder case

Does it work?

Default: 258 iterations, 6191 seconds

Linearized: 1776 iterations, 3969 seconds

Averaging plus Linearized: 860 iterations, 530 seconds


Future work

Other robust models

Min-max expected loss with orthogonal missing factor

Random return = µ̄ • (1 − δw), where −1 ≤ w_j ≤ 1 ∀ j, and 0 ≤ δ ≤ 1 is a random variable.

normalization constraints, e.g. Σ_j w_j = 0

errors in covariance matrix Q

robust problem: → min_x max_{Q∈Q} λ x^T Q x − µ^T x


Future work

On-going work: a provably good version of Benders’ algorithm
