
arXiv:2002.02149v2 [math.OC] 11 Feb 2021

EFFICIENT SCENARIO GENERATION FOR HEAVY-TAILED CHANCE CONSTRAINED OPTIMIZATION

JOSE BLANCHET, FAN ZHANG, AND BERT ZWART

Abstract. We consider a generic class of chance-constrained optimization problems with heavy-tailed (i.e., power-law type) risk factors. In this setting, we use the scenario approach to obtain a constant approximation to the optimal solution with a computational complexity that is uniform in the risk tolerance parameter. We additionally illustrate the efficiency of our algorithm in the context of solvency in insurance networks.

1. Introduction

In this paper, we consider the following family of chance constrained optimization problems:

minimize c⊤x
subject to P(φ(x, L) > 0) ≤ δ,        (CCPδ)
x ∈ R^{dx},

where x ∈ R^{dx} is a dx-dimensional decision vector and L is a dl-dimensional random vector in R^{dl}. The elements of L are often referred to as risk factors; the function φ : R^{dx} × R^{dl} → R is often assumed to be convex in x and often models a cost constraint; the parameter δ > 0 is the risk tolerance level. Our framework encompasses joint chance constraints of the form P(φi(x, L) > 0, ∃i ∈ {1, . . . , n}) ≤ δ, by setting φ(x, L) = max_{i=1,...,n} φi(x, L).

Chance constrained optimization problems have a rich history in Operations Research. Introduced by Charnes et al. (1958), chance constrained optimization formulations have proved to be versatile in modeling and decision making in a wide range of settings. For example, Prekopa (1970) used these types of formulations in the context of production planning. The work of Bonami and Lejeune (2009) illustrates how to take advantage of chance constrained optimization formulations in the context of portfolio selection. In the context of power and energy control, the use of chance constrained optimization is illustrated in Andrieu et al. (2010). These are just examples of the wide range of applications that have benefited (and continue to benefit) from chance constrained optimization formulations and tools.

Consequently, there has been a significant amount of research effort devoted to the solution of chance constrained optimization problems. Unfortunately, however, these types of problems are provably NP-hard in the worst case, see Luedtke et al. (2010). As a consequence, much of the methodological effort has been placed into developing: a) solutions in the case of specific models; b) convex and, more generally, tractable relaxations; c) combinatorial optimization tools; d) Monte Carlo sampling schemes. Of course, hybrid approaches have also been developed. For example, as a combination of type b) and type d) approaches, Hong et al. (2011) show that the solution to a chance constrained optimization problem can be approximated by optimization problems with constraints represented as the difference of two convex functions. In turn, this is further approximated by solving a sequence of convex optimization problems, each of which can be solved by a gradient based Monte Carlo method. Another example is Pena-Ordieres et al. (2020), which combines relaxations of type b) with the sample-average approximation associated with type d) methods. In addition to the aforementioned types, Hong et al. (2020) provide an upper bound for the chance constrained optimization problem using robust optimization with a data-driven uncertainty set, achieving a dimension independent sample complexity.

Examples of type a) approaches include the study of Gaussian or elliptical distributions when φ is affine both in L and x. In this case, the problem admits a conic programming formulation, which can be efficiently solved, see Lagoa et al. (2005). Type b) approaches include Hillier (1967), Seppala (1971), Ben-Tal and Nemirovski (2000, 2002), Prekopa (2003), Bertsimas and Sim (2004), Nemirovski and Shapiro (2006a), Chen et al. (2010), Tong et al. (2020). These approaches usually integrate probabilistic inequalities such as Chebyshev's bound, Bonferroni's bound, Bernstein's approximations, or large deviation principles to construct tractable analytical approximations. Type c) methods are based on branch-and-bound algorithms, which connect squarely with the class of tools studied in areas such as integer programming, see Ahmed and Shapiro (2008), Luedtke et al. (2010), Kucukyavuz (2012), Luedtke (2014), Zhang et al. (2014), Lejeune and Margot (2016). Type d) methods include the sample gradient method, the sample average approximation, and the scenario approach. The sample gradient method is usually combined with a smooth approximation, see Hong et al. (2011) for example. The sample average approximations studied by Luedtke and Ahmed (2008) and Barrera et al. (2016), although simplifying the constraint's probabilistic structure by replacing the population distribution with the sampled empirical distribution, are nevertheless hard to solve due to non-convex feasible regions. The method we consider in this paper is the scenario approach. The scenario approach was introduced and studied in Calafiore and Campi (2005) and further developed in a series of papers, including Calafiore and Campi (2006) and Nemirovski and Shapiro (2006b).

The scenario approach is the most popular generic method for (approximately) solving chance constrained optimization problems. The idea is to sample a number N of scenarios (each scenario consists of a sample of L) and enforce the constraint in all of these scenarios. The intuition is that if, for any scenario L^{(i)}, the constraint φ(x, L^{(i)}) ≤ 0 is convex in x, and δ > 0 is small, we expect that by suitably choosing N the chance constrained region can be approximated by enforcing φ(x, L^{(i)}) ≤ 0 for all i = 1, . . . , N, leading to a good and, in some sense, tractable (if N is of moderate size) approximation of the chance constrained region. Of course, this intuition is correct only when δ > 0 is small, and we expect the choice of N to be largely influenced by this asymptotic regime.


By choosing N sufficiently large, the scenario approach allows obtaining both upper and lower bounds which become asymptotically tighter as δ → 0. In a celebrated paper, Calafiore and Campi (2006) provide rigorous support for this claim. In particular, given a confidence level β ∈ (0, 1), if N ≥ (2/δ) × log(1/β) + 2d + (2d/δ) × log(2/δ), then with probability at least 1 − β, the optimal solution of the scenario approach relaxation is feasible for the original chance constrained problem and, therefore, an upper bound to the problem is obtained.

Unfortunately, the required sample size N grows as (1/δ) × log(1/δ) as δ becomes small, limiting the scope of scenario methods in applications. Many applications of chance constrained optimization require a very small δ. For example, in 5G ultra-reliable communication system design, the failure probability δ is no larger than 10⁻⁵, see Alsenwi et al. (2019); for fixed income portfolio optimization, an investment grade portfolio has a historical default rate of 10⁻⁴, as reported by Frank (2008).

Motivated by this, Nemirovski and Shapiro (2006b) developed a method that lowers the required sample size to the order of log(1/δ), making additional assumptions on the function φ (which is taken to be bi-affine) and on the risk factors L, which are assumed to be light-tailed. Specifically, the moment generating function E[exp(sL)] is assumed to be finite in a neighborhood of the origin. No guarantee is given in terms of how far the upper bound is from the optimal value of the problem as δ → 0.

In the present paper, we focus on improving the scalability of N in terms of 1/δ for the practically important case of heavy-tailed risk factors. Heavy-tailed distributions appear in a wide range of applications in science, engineering and business, see e.g., Embrechts et al. (2013), Wierman and Zwart (2012), but, in some aspects, are not as well understood as light tails. One reason is that techniques from convex duality cannot be applied, as the moment generating function of L does not exist in a neighborhood of 0. In addition, probabilistic inequalities exploited in Nemirovski and Shapiro (2006b) do not hold in this setting. Only very recently, a versatile algorithm for heavy-tailed rare event simulation has been developed in Chen et al. (2019).

The main contribution of our paper is an algorithm that provides a sample complexity for N which is bounded in 1/δ, assuming a versatile class of heavy-tailed distributions for L. Specifically, we shall assume that L follows a semi-parametric class of models known as multivariate regular variation, which is quite standard in multivariate heavy-tail modeling, cf. Embrechts et al. (2013), Resnick (2013). A precise definition is given in Section 5. Moreover, our estimator is shown to be within a constant factor of the solution to (CCPδ) with high probability, uniformly as δ → 0. We are not aware of other approaches that provide a uniform performance guarantee of this type.

We illustrate our assumptions and our framework with a risk problem of independent interest. This problem consists of computing a collective salvage fund in a network of financial entities whose liabilities and payments are settled in an optimal way using the Eisenberg-Noe model, see Eisenberg and Noe (2001). The salvage fund is computed to minimize its size while guaranteeing that the probability of collective default after settlements is less than a small prescribed margin. To demonstrate the broad applicability of our method, we present a portfolio optimization problem with value-at-risk constraints as an additional running example.

The rest of the paper is organized as follows. In Section 2, we introduce the portfolio optimization problem and the minimal salvage fund problem as particular applications of chance constrained optimization. We employ both problems as running examples to provide a concrete and intuitive explanation of the concepts we introduce throughout the paper. In Section 3, we provide a brief review of the scenario approach of Calafiore and Campi (2006).

The ideas behind our main algorithmic contributions are given in Section 4, where we introduce their intuition, rooted in ideas originating from rare event simulation. Our algorithm requires the construction of several auxiliary functions and sets. How to do this is detailed in Section 5, in which we also present several additional technical assumptions required by our constructions. In Section 5, we also explain that our procedure results in an estimate which is within a constant factor of the optimal solution of the underlying chance constrained problem with high probability as δ → 0. In Section 6 we show that the assumptions imposed are valid in our motivating example (as well as in a second example with a quadratic cost structure inside the probabilistic constraint). Numerical results for the examples are provided in Section 7. Throughout our discussion in each section we present a series of results which summarize the main ideas of our constructions. To keep the discussion fluid, we present the corresponding proofs in Appendix A unless otherwise indicated.

Notation: in the sequel, R_+ = [0, +∞) is the set of non-negative real numbers, R_{++} = (0, +∞) is the set of positive real numbers, and R̄ = [−∞, +∞] is the extended real line. A column vector of zeros is denoted by 0, and a column vector of ones is denoted by 1. For any matrix Q, the transpose of Q is denoted by Q⊤; the Frobenius norm of Q is denoted by ‖Q‖F. The identity matrix is denoted by I. For two column vectors x, y ∈ R^d, we say x ⪯ y if and only if y − x ∈ R^d_+. For α ∈ R and x ∈ R^d, we use α · x to denote the scalar multiplication of x by α. For α ∈ R and E ⊆ R^d, we define α · E = {α · x | x ∈ E}. The optimal value of an optimization problem (P) is denoted by Val(P). We also use Landau's notation. In particular, if f(·) and g(·) are non-negative real valued functions, we write f(t) = O(g(t)) if f(t) ≤ c₀ × g(t) for some c₀ ∈ (0, ∞), and f(t) = Ω(g(t)) if f(t) ≥ g(t)/c₀ for some c₀ ∈ (0, ∞).

2. Running Examples

2.1. Portfolio Optimization with VaR Constraint. We first introduce a portfolio optimization problem. Suppose that there are d assets to invest in. If we invest a dollar in the i-th asset, the investment has mean return µi and a non-negative random loss Li. Let x = (x_1, . . . , x_d) represent the amounts invested in the different assets, and let µ = (µ_1, . . . , µ_d) and L = (L_1, . . . , L_d). We assume that L follows a multivariate heavy-tailed distribution, in a way made precise later on. The portfolio manager's goal is to maximize the mean return of the portfolio, which is equal to µ⊤x, with a portfolio

risk constraint prescribed by a risk measure called value-at-risk (VaR). The VaR at level 1 − δ ∈ (0, 1) for a random variable X is defined as

VaR_{1−δ}(X) = min{z ∈ R : F_X(z) ≥ 1 − δ}.

For a given number η > 0, we formulate the following portfolio optimization problem:

maximize µ⊤x
subject to VaR_{1−δ}(x⊤L) ≤ η,
x ∈ R^d_{++}.

Using the definition of VaR and the fact that the cumulative distribution function is right continuous, we conclude that VaR_{1−δ}(x⊤L) ≤ η is equivalent to P(x⊤L − η > 0) ≤ δ. In order to facilitate the technical exposition, we apply the change of variable x_i ↦ 1/x_i to homogenize the constraint function, yielding the following equivalent chance constrained optimization problem in standard form:

maximize Σ_{i=1}^d (µ_i/x_i)
subject to P(φ(x, L) > 0) ≤ δ,        (1)
x ∈ R^d_{++},

where φ(x, l) = Σ_{i=1}^d (l_i/x_i) − η. Despite the nonlinear objective, (Calafiore and Campi 2005, Section 4.3) shows that it admits an epigraphic reformulation with a linear objective, so that the standard scenario approach is applicable.
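To make the standard form (1) concrete, the following minimal sketch checks feasibility of a candidate decision by Monte Carlo; the parameter values (the scales ell, the candidate x, and δ) are illustrative choices of ours, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Empirical check of the constraint in (1): after the substitution x_i -> 1/x_i,
# the VaR constraint holds iff P(sum_i L_i / x_i - eta > 0) <= delta.
d, eta, delta = 10, 1000.0, 0.01
ell = np.full(d, 2.0)                        # Pareto scales: P(L_i > l) = ell_i / l for l >= ell_i
x = np.full(d, 0.5)                          # candidate decision in the transformed variables
L = ell / (1.0 - rng.random((10**6, d)))     # inverse-transform sampling of the losses
violation = np.mean((L / x).sum(axis=1) > eta)
print(violation, violation <= delta)         # empirical P(phi(x, L) > 0) and feasibility flag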

2.2. Minimal Salvage Fund. Suppose that there are d entities or firms, which we can interpret as (re)insurance firms. Let L = (L_1, . . . , L_d) ∈ R^d_+ denote the vector of incurred losses, where L_i denotes the total incurred loss that entity i is responsible to pay. We assume that L follows a multivariate heavy-tailed distribution as in the previous example. Let Q = (Q_{i,j} : i, j ∈ {1, . . . , d}) be a deterministic matrix where Q_{i,j} denotes the amount of money received by entity j when entity i pays one dollar. We assume that Q_{i,j} ≥ 0 and Σ_{j=1}^d Q_{i,j} < 1. Let x = (x_1, . . . , x_d) denote the amount of the salvage fund allocated to each entity, and let y* = (y*_1, . . . , y*_d) denote the amount of the final settlement. The final settlement is determined by the following optimization problem:

y* = y*(x, L) = arg max{1⊤y | 0 ⪯ y ⪯ L, (I − Q⊤)y ⪯ x}.

In words, the system maximizes the payments subject to the constraint that nobody pays more than what they have (in the final settlement), and nobody pays more than what they owe. Notice that y* = y*(x, L) is also a random variable (the randomness comes from L) satisfying 0 ⪯ y* ⪯ L. Suppose that entity i goes bankrupt if the deficit L_i − y*_i ≥ m_i, where m ∈ R^d_+ is a given vector. We are interested in finding the minimal salvage fund that ensures no bankruptcy happens with probability at least 1 − δ. The problem can be formulated as a chance constrained programming problem as follows:

minimize 1⊤x
subject to P(L − y*(x, L) ⪯ m) ≥ 1 − δ,        (2)
x ∈ R^d_{++}.


Now we write problem (2) in standard form. Notice that L − y*(x, L) ⪯ m if and only if φ(x, L) ≤ 0, where φ(x, L) is defined as follows:

φ(x, L) := min_{b,y} {b | L − y − m ⪯ b · 1, (I − Q⊤)y ⪯ x, y ⪰ 0}.

Therefore, problem (2) is equivalent to

minimize 1⊤x
subject to P(φ(x, L) > 0) ≤ δ,        (3)
x ∈ R^d_{++}.

3. Review of Scenario Approach

As mentioned in the introduction, a popular approach to solving the chance constrained problem is the scenario approach developed by Calafiore and Campi (2006). They suggest approximating the probabilistic constraint P(φ(x, L) > 0) ≤ δ by N sampled constraints φ(x, L^{(i)}) ≤ 0 for i = 1, . . . , N, where L^{(1)}, . . . , L^{(N)} are independent samples. Instead of solving the original chance constrained problem (CCPδ), which is usually intractable, we solve the following optimization problem:

minimize c⊤x
subject to φ(x, L^{(i)}) ≤ 0, i = 1, . . . , N,        (SPN)
x ∈ R^{dx}.

The total sample size N should be large enough to ensure that a feasible solution to the sampled problem (SPN) is also a feasible solution to the original problem (CCPδ) with a high confidence level. According to Calafiore and Campi (2006), for any given confidence level parameter β ∈ (0, 1), if

N ≥ (2/δ) log(1/β) + 2d + (2d/δ) log(2/δ),

then any feasible solution to the sampled optimization problem (SPN) is also a feasible solution to (CCPδ) with probability at least 1 − β. However, when δ is small, the total number of sampled constraints is of order Ω((1/δ) log(1/δ)), which can be a problem for implementation. For example, as we shall see in Section 7, when β = 10⁻⁵, d = 15 and δ = 10⁻³, the number of sampled constraints N is required to be larger than 2 × 10⁵. In contrast, our method only requires sampling 2 × 10³ constraints.
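For reference, the bound above is straightforward to evaluate; a small sketch (the helper name is ours):

import math

def cc_sample_size(delta, beta, d):
    # Calafiore-Campi bound: N >= (2/delta) log(1/beta) + 2d + (2d/delta) log(2/delta)
    return math.ceil(2 / delta * math.log(1 / beta)
                     + 2 * d
                     + 2 * d / delta * math.log(2 / delta))

# beta = 1e-5, d = 15, delta = 1e-3 gives roughly 2.5e5 sampled constraints,
# consistent with the 2 x 10^5 figure quoted above.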

4. General Algorithmic Idea

To facilitate the development of our algorithm, we introduce some additional notation and a desired technical property. As we shall see, if the technical property is satisfied, then there is a natural way to construct a scenario approach based algorithm that only requires O(1) total sampled constraints. We exploit key intuition borrowed from rare event simulation. A common technique, exploited for example in Chen et al. (2019), is the construction of a so-called super set, which contains the rare event of interest. The super set should be constructed so that its probability is of the same order as that of the rare event of interest. If the conditional distribution given being in the super set is accessible, this can be used as an efficient sampling scheme. The first part of this section simply articulates the elements involved in setting the stage for constructing such a set in the outcome space of L. Later, in Section 5, we will impose assumptions in order to ensure that the probability of the super set, which eventually we will denote by Cδ, is suitably controlled as δ → 0. Simply collecting the elements necessary to construct Cδ requires introducing some super sets involving the decision space, since the optimal decision is unknown.

Figure 1. Pictorial illustration of Oδ and Cδ.

Let Fδ ⊆ R^{dx} denote the feasible region of the chance constrained optimization problem (CCPδ), i.e.,

Fδ := {x ∈ R^{dx} | P(φ(x, L) > 0) ≤ δ}.        (4)

Here, the subscript δ emphasizes that the feasible region Fδ is parametrized by the risk level δ. For any fixed x ∈ R^{dx}, let Vx := {L ∈ R^{dl} | φ(x, L) > 0} denote the violation event at x.

Property 1. For any δ > 0, there exist a set Oδ ⊆ R^{dx} and an event Cδ ⊆ R^{dl} that satisfy the following statements.

a) The feasible set Fδ is a subset of Oδ.
b) The event Cδ contains the violation event Vx for any x ∈ Oδ.
c) There exists a constant M > 0, independent of δ, such that P(L ∈ Cδ) ≤ M · δ.

In the rest of this paper, we will refer to Oδ as the outer approximation set, and to Cδ as the uniform conditional event. A graphical illustration of Oδ and Cδ is shown in Figure 1.

Now, given Oδ and Cδ that satisfy Property 1, we define the conditional sampled problem (CSPδ,N′):

minimize c⊤x
subject to φ(x, L^{(i)}_δ) ≤ 0, i = 1, . . . , N′,        (CSPδ,N′)
x ∈ Oδ,

where the L^{(i)}_δ are i.i.d. samples generated from the conditional distribution (L | L ∈ Cδ).


We now present the main result of this section in Lemma 1, which validates that (CSPδ,N′) is an effective and sample efficient scenario approximation, by combining (Calafiore and Campi 2006, Theorem 2) and Property 1. The proof of Lemma 1 is presented in Section 4.1.

Lemma 1. Suppose that Property 1 is imposed, and let β > 0 be a given confidence level.

(1) Let δ′ = δ/P(L ∈ Cδ) ≥ 1/M and let N′ be any integer that satisfies

N′ ≥ (2/δ′) log(1/β) + 2d + (2d/δ′) log(2/δ′).        (5)

With probability at least 1 − β, if the conditional sampled problem (CSPδ,N′) is feasible, then its optimal solution x*_N ∈ Fδ and Val(CSPδ,N′) ≥ Val(CCPδ).

(2) Let N′ be any integer such that N′ ≤ βδ^{-1}P(L ∈ Cδ). Assume that the chance constrained problem (CCPδ) is feasible. Then, with probability at least 1 − β, Val(CCPδ) ≥ Val(CSPδ,N′).

Remark 1. Note that the lower bound given in (5) is not greater than 2M log(1/β) + 2d + 2dM log(2M), which is independent of δ. Therefore, Lemma 1 shows that the chance constrained problem (CCPδ) can be approximated by (CSPδ,N′) with sample complexity bounded uniformly as δ → 0, as long as Property 1 is satisfied.

Remark 2. Efficiently generating samples of (L | L ∈ Cδ) when δ → 0 requires rare event simulation techniques. For example, when L is light-tailed, exponential tilting can be applied to achieve O(1) sample complexity uniformly in δ; when L is heavy-tailed, with the help of specific problem structure, one can apply importance sampling, see Blanchet and Liu (2010), or Markov chain Monte Carlo, see Gudmundsson and Hult (2014), to design an efficient sampling scheme. The specific structure of our salvage fund example results in Cδ being the complement of a box, which makes the sampling very tractable if the elements of L are independent.

Even if the aforementioned rare event simulation techniques are hard to apply in practice, we can still apply a simple acceptance-rejection procedure to sample the conditional distribution (L | L ∈ Cδ). It costs O(1/δ) samples of L on average to obtain one sample of (L | L ∈ Cδ), since P(L ∈ Cδ) = O(δ). Consequently, the total complexity of generating L^{(i)}_δ, i = 1, . . . , N′ and solving (CSPδ,N′) is O(1/δ), which is still much more efficient than the scenario approach in Calafiore and Campi (2006), because the latter requires computational complexity O(((1/δ) log(1/δ))³) for solving a linear programming problem with O((1/δ) log(1/δ)) sampled constraints by the interior point method.

Although Property 1 may seem restrictive at first glance, we are able to construct the sets Oδ and Cδ for a rich class of functions φ(x, L), including the constraint function of the minimal salvage fund problem. As we shall see in the proof of Lemma 1, once Oδ and Cδ are constructed, the sampled problem (CSPδ,N′) is a tractable approximation to the problem (CCPδ). We explain how to construct the sets Oδ and Cδ in the next section under some additional assumptions. These assumptions relate in particular to the distribution of L. It turns out that, if L is heavy-tailed, the construction of Oδ and Cδ becomes tractable.

4.1. Proof of Lemma 1. If Property 1 is satisfied, (CCPδ) is equivalent to

minimize c⊤x
subject to P(φ(x, L) > 0 | L ∈ Cδ) ≤ δ/P(L ∈ Cδ),        (6)
x ∈ Oδ ⊆ R^{dx}.

Let δ′ := δ/P(L ∈ Cδ) ≥ 1/M denote the risk level in the equivalent problem (6). The sampled optimization problem related to problem (6) is given by

minimize c⊤x
subject to φ(x, L^{(i)}_δ) ≤ 0, i = 1, . . . , N′,        (CSPδ,N′)
x ∈ Oδ,

where the L^{(i)}_δ are independently sampled from P(· | L ∈ Cδ). Notice that

N′ ≥ (2/δ′) log(1/β) + 2d + (2d/δ′) log(2/δ′).

According to (Calafiore and Campi 2006, Corollary 1 and Theorem 2), with probability at least 1 − β, if the sampled problem (CSPδ,N′) is feasible, then the optimal solution to problem (CSPδ,N′) is feasible for the chance constrained problem (6), and thus it is also feasible for (CCPδ). The proof of the first statement is complete.

Now we turn to the second statement. Note that the equivalence between (CCPδ) and (6) is still valid, so it is sufficient to compare the optimal values of (6) and (CSPδ,N′). By applying (Calafiore and Campi 2006, Theorem 2) again, with probability at least 1 − β the value of (CSPδ,N′) is smaller than or equal to the optimal value of

minimize c⊤x
subject to P(φ(x, L) > 0 | L ∈ Cδ) ≤ 1 − (1 − β)^{1/N′},        (7)
x ∈ Oδ ⊆ R^{dx}.

The proof is complete by using 1 − (1 − β)^{1/N′} ≥ β/N′ ≥ δ/P(L ∈ Cδ), so that Val(7) ≤ Val(6) = Val(CCPδ).

5. Constructing outer approximations and summary of the algorithm

In this section, we come full circle with the intuition borrowed from rare event simulation explained at the beginning of Section 4. The scale-free properties of heavy-tailed distributions (to be reviewed momentarily), coupled with natural (polynomial) growth conditions (like the linear loss) given by the structure of the optimization problem, provide the necessary ingredients to show that the set Cδ has probability of order O(δ). In this section, we present two methods for the construction of Oδ and Cδ satisfying Property 1. We mostly focus on our "scaling method", presented in Section 5.1, which is facilitated precisely by the scale-free property that we will impose on L. After showing the construction of the outer sets under the scaling method, we summarize the algorithm at the end of Section 5.1. We supply a lower bound guaranteeing a constant approximation for the output of the algorithm in Section 5.2. Our second method for outer approximation constructions is summarized in Section 5.3. This method is simpler to apply because it is based on linear approximations; however, it is somewhat less powerful because it assumes that φ(x, L) is jointly convex.

5.1. Scaling Method. We are now ready to state our assumption on the distribution of L. We assume that the distribution of L is multivariate regularly varying, a definition that we review first. For background, we refer to Resnick (2013). Let M_+(R^{dl}\{0}) denote the set of all Radon measures on the space R^{dl}\{0} (recall that a measure is Radon if it assigns finite mass to all compact sets). If µ_n(·), µ(·) ∈ M_+(R^{dl}\{0}), then µ_n converges to µ vaguely, denoted by µ_n →v µ, if for all compactly supported continuous functions f : R^{dl}\{0} → R_+,

lim_{n→∞} ∫_{R^{dl}\{0}} f(x) µ_n(dx) = ∫_{R^{dl}\{0}} f(x) µ(dx).

L is multivariate regularly varying with limit measure µ(·) ∈ M_+(R^{dl}\{0}) if

P(x^{-1}L ∈ ·) / P(‖L‖2 > x) →v µ(·), as x → ∞.

Assumption 1. L is multivariate regularly varying with limit measure µ(·) ∈ M_+(R^{dl}\{0}).

We give some intuition behind this definition. Write L in terms of polar coordinates, with R the radius and Θ a random variable taking values on the unit sphere. The radius R = ‖L‖2 has a one-dimensional regularly varying tail (i.e., we can write P(R > x) = ℓ(x)x^{-α} for a slowly varying function ℓ and α > 0). The angle Θ, conditioned on R being large, converges weakly (as R → ∞) to a limiting random variable. The distribution of this limit can be expressed in terms of the measure µ. For another recent application of multivariate regular variation in operations research, see Kley et al. (2016).
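To illustrate the polar-coordinate intuition, here is a toy generator of multivariate regularly varying vectors, built as a Pareto radius times an independent random direction on the nonnegative part of the unit sphere; it is an illustration of the definition, not the loss model used later in the experiments.

import numpy as np

rng = np.random.default_rng(0)

def sample_mrv(n, d, alpha=1.0):
    # Radius with P(R > x) = x^{-alpha} for x >= 1: one-dimensional regular variation.
    R = (1.0 - rng.random(n)) ** (-1.0 / alpha)
    # Angular part: a random direction on the unit sphere intersected with R^d_+.
    Theta = np.abs(rng.standard_normal((n, d)))
    Theta /= np.linalg.norm(Theta, axis=1, keepdims=True)
    return R[:, None] * Theta      # L = R * Theta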

We proceed to analyze the feasible region Fδ as δ → 0. Intuitively, if the violation probability P(φ(x, L) > 0) has a strictly positive lower bound on any given compact set, then Fδ will ultimately be disjoint from that compact set as δ → 0. Thus, the set Fδ is expelled to infinity as δ → 0 in this case. Fδ moves in the direction in which φ(x, L) becomes small, so that the violation probability becomes smaller. For instance, if x is one dimensional and φ(x, L) is increasing in x, then Fδ moves in the negative direction. Consider the portfolio optimization problem as another example, in which min_{i=1,...,d} x_i → +∞ as δ → 0.

Now we begin to construct the outer approximation set Oδ. To this end, we introduce an auxiliary function which we shall call a level function.

Definition 1. We say that π : R^{dx} → [0, +∞] is a level function if

(1) for any α ≥ 0 and x ∈ R^{dx}, we have π(α · x) = α · π(x),
(2) lim_{δ→0} inf_{x∈Fδ} π(x) = +∞.

We also define the level set Π = {x ∈ R^{dx} | π(x) = 1}.

As Fδ moves to infinity, the level function helps characterize the 'moving direction' of Fδ as well as the correct rate of scaling as δ becomes small. As we shall see in the proof of Lemma 2, for any δ small enough we can choose some αδ and define

Oδ := ∪_{α≥αδ} (α · Π) ⊇ Fδ.

To construct Oδ, we first select the level set Π, and then derive the scaling rate αδ. The level function π and the shape of Π should be chosen in accordance with the moving direction of Fδ so as to reduce the size of Oδ, in order to achieve better sample complexity. For example, when φ(x, L) = −‖x‖2 − L, the level function π can be chosen as the Euclidean norm and Π can be chosen as the unit sphere in R^{dx}. For the portfolio optimization problem, the level function can be chosen as π(x) = min_{i=1,...,d} x_i + ∞ · I(x ∉ R^{dx}_{++}), in accordance with our intuition that min_{i=1,...,d} x_i → ∞, and the level set can be chosen as Π = {x ∈ R^{dx} | min_{i=1,...,d} x_i = 1}. Therefore, it is natural to impose the following assumption about the existence of the level function.

Assumption 2. There exist a level function π and a level set Π.

To analyze the asymptotic shape of the uniform conditional event Cδ, we connect the asymptotic distribution of L to the asymptotic distribution of φ(x, L). We pick a continuous non-decreasing function h : R_{++} → R_{++} with lim_{α→+∞} h(α) = +∞ to characterize the scaling rate of L. In addition, we pick another positive function r : R_{++} → R_{++} to characterize the scaling rate of φ(α · x, h(α) · L). Intuitively, the scaling functions r(·) and h(·) should ensure that the family {(1/r(α)) φ(α · x, h(α) · L)}_{α≥1} is tight. For the minimal salvage fund problem with fixed δ, as the deficit φ(x, L) is asymptotically linear with respect to the salvage fund x and the loss L, we can simply pick r(α) = h(α) = α. We next introduce two auxiliary functions Ψ+ and Ψ−.

Definition 2. Let Ψ+ : R^{dl} → R and Ψ− : R^{dl} → R be two Borel measurable functions. We say Ψ+ (resp. Ψ−) is the asymptotic uniform upper (resp. lower) bound of (1/r(α)) φ(α · x, h(α) · l) over the level set x ∈ Π if for any compact set K ⊆ R^{dl},

lim inf_{α→∞} inf_{l∈K} ( Ψ+(l) − sup_{x∈Π} [ (1/r(α)) φ(α · x, h(α) · l) ] ) ≥ 0,        (8a)

lim sup_{α→∞} sup_{l∈K} ( Ψ−(l) − inf_{x∈Π} [ (1/r(α)) φ(α · x, h(α) · l) ] ) ≤ 0.        (8b)

In Section 6, we show for the salvage fund example how Ψ+ and Ψ− can be written as maxima or minima of affine functions. Here, we employ the functions Ψ+ and Ψ− to define the events Cε,− and Cε,+, which serve as inner and outer approximations of the event ∪_{x∈Π} Vx, where Vx = {l ∈ R^{dl} | φ(x, l) > 0} is the violation event at x.


Definition 3. For ε > 0, let Cε,+ (resp. Cε,−) be the ε-outer (resp. inner) approximation event

Cε,+ := {l ∈ R^{dl} | Ψ+(l) ≥ −ε},        (9a)

Cε,− := {l ∈ R^{dl} | Ψ−(l) ≥ +ε}.        (9b)

We now define Oδ := ∪_{α≥αδ} α · Π. The following property ensures that the shape of Π is appropriate and αδ is large enough, hence that Oδ is an outer approximation of Fδ.

Property 2. There exists δ0 such that for any δ < δ0, we have an explicitly computable constant αδ that satisfies

P(‖L‖2 > h(αδ)) = O(δ) and Fδ ⊆ ∪_{α≥αδ} α · Π = Oδ.

If the violation probability is easy to analyze, we directly derive the expression of αδ and verify Property 2. Otherwise, we resort to Lemma 2, which provides a sufficient condition for Property 2 by analyzing the asymptotic distribution. The proof of Lemma 2 is deferred to Appendix A.

Lemma 2. Suppose that Assumptions 1 and 2 hold. If there exist an asymptotic uniform lower bound function Ψ−(·) as given in (8b) and ε > 0 such that µ(Cε,−) > 0, then Property 2 is satisfied.

We impose the following Assumption 3 on the asymptotic uniform upper bound Ψ+(·) so that we can employ the multivariate regular variation of L to estimate P(L ∈ α · Cε,+) for large scaling factors α.

Assumption 3. There exists an event S ⊆ R^{dl} with µ(S^c) < ∞ such that

S ⊆ α · S and Ψ+(l) ≤ Ψ+(α · l), ∀l ∈ S, α ≥ 1.

In addition, there exists some ε > 0 such that Cε,+ is bounded away from the origin, i.e., inf_{l∈Cε,+} ‖l‖2 > 0.

For the minimal salvage fund problem, since the deficit function φ(x, L) is coordinatewise nondecreasing with respect to the loss vector L, it is reasonable to assume that its asymptotic bound Ψ+(·) is also coordinatewise nondecreasing. For this example, the closed form expression of Ψ+(·) and the detailed verification of all the assumptions are deferred to Proposition 9. Our next result summarizes the construction of the outer approximation sets.

Theorem 3. Suppose that Property 2 and Assumption 3 are imposed. Then there exists δ0 > 0 such that the sets

Oδ = ∪_{α≥αδ} α · Π, Cδ = h(αδ) · (Cε,+ ∪ K^c ∪ S^c)        (10)

satisfy Property 1 for all δ < δ0. Here, S is given in Assumption 3 and K is a ball in R^{dl} with µ(K^c) < ∞.

With the aid of Lemma 1 and Theorem 3, we provide Algorithm 1 for approximating (CCPδ), whose number of sampled constraints is bounded uniformly in 1/δ.

Algorithm 1: Scenario Approach with Optimal Scenario Generation

Input: constraint function φ, risk tolerance parameter δ, confidence level β, and all the elements and constants appearing in Property 2 and Assumption 3.
1. Compute the expressions of the sets Oδ and Cδ by (10);
2. Compute the required number of samples N′ by (5);
3. for i = 1, . . . , N′ do
4.     Sample L^{(i)}_δ using acceptance-rejection or importance sampling;
5. end
6. Solve the conditional sampled problem (CSPδ,N′).
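As a minimal sketch of step 6, assume for simplicity that each sampled constraint is affine in x (the salvage fund example has this form by Lemma 8 below; general convex constraints can be passed to CVXPY directly). The callbacks phi_affine and O_constraints are hypothetical problem-specific ingredients.

import cvxpy as cp

def solve_csp(c, phi_affine, O_constraints, scenarios):
    # Solve (CSP_{delta,N'}) given N' scenarios drawn from (L | L in C_delta),
    # e.g. by the acceptance-rejection sketch of Remark 2.
    x = cp.Variable(len(c))
    constraints = O_constraints(x)          # encodes x in O_delta
    for l in scenarios:
        a, b = phi_affine(l)                # scenario constraint phi(x, l) = a @ x + b
        constraints.append(a @ x + b <= 0)
    problem = cp.Problem(cp.Minimize(c @ x), constraints)
    problem.solve()
    return x.value, problem.value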

In Section 5.2, our objective is to show that the output of the previous algorithm is guaranteed to be within a constant factor of the optimal solution to (CCPδ) with high probability, uniformly in δ.

5.2. Constant Approximation Guarantee. We shall work under the setting of Theorem 3, so we enforce Property 2 and Assumption 3. We want to show that there exists some constant Λ > 1, independent of δ, such that Val(CCPδ) ≤ Val(CSPδ,N′) ≤ Λ × Val(CCPδ) with high probability. This indicates that our result guarantees a constant approximation to (CCPδ) for regularly varying distributions (under our assumptions) with O(1) sample complexity as δ → 0, with high probability.

Note that Val(CSPδ,N′) ≤ Λ × Val(CCPδ) is meaningful only if Val(CCPδ) > 0. We assume that the outer approximation set is good enough that the following natural assumption is valid.

Assumption 4. There exists δ > 0 such that min_{x∈Oδ} c⊤x > 0.

The previous assumption will typically hold if c has strictly positive entries. Theorem 3 and the form of Oδ guarantee that the norm of the optimal solution of (CSPδ,N′) grows in proportion to αδ, so we also assume the following scaling property for φ(x, l).

Assumption 5. There exists a function φlim : (R^{dx}\{0}) × (R^{dl}\{0}) → R such that for every compact set E ⊆ R^{dl}\{0}, we have

lim_{α→∞} sup_{l∈E} | (1/r(α)) φ(α · x, h(α) · l) − φlim(x, l) | = 0.

In addition, φlim(x, l) is continuous in l.

Assumption 5 is satisfied by both running examples. For the portfolio optimization problem, we have φ(x, l) = Σ_{i=1}^d (l_i/x_i) − η, thus φlim(x, l) = φ(x, l). For the minimal salvage fund problem, we have φlim(x, l) = φ(x, l + m), so that |α^{-1}φ(α · x, α · l) − φlim(x, l)| ≤ α^{-1}‖m‖∞ and |φlim(x, l) − φlim(x, l′)| ≤ ‖l − l′‖1.

We define the following optimization problem, which will serve as an asymptotic upper bound of (CSPδ,N′) in stochastic order as δ → 0:

minimize c⊤x
subject to φlim(x, L^{(i)}_lim) ≤ 0, i = 1, . . . , N′,        (CSPlim,N′)
x ∈ ∪_{α≥1} α · Π,

where the L^{(i)}_lim are i.i.d. samples of a random variable Llim whose distribution is characterized by P(Llim ∈ Cε,+ ∪ K^c ∪ S^c) = 1 and P(Llim ∈ E) = µ(E)/µ(Cε,+ ∪ K^c ∪ S^c) for all measurable sets E ⊆ Cε,+ ∪ K^c ∪ S^c.

Theorem 4. Let β > 0 be a given confidence level and let N′ be a fixed integer that satisfies (5). If Assumptions 4 and 5 are enforced, and (CSPlim,N′) satisfies Slater's condition with probability one, then there exist δ0 > 0 and Λ > 0 such that

P( Val(CCPδ) ≤ Val(CSPδ,N′) ≤ Λ × Val(CCPδ) ) ≥ 1 − 2β, ∀δ < δ0.

Slater's condition (see Section 5.2.3 in Boyd and Vandenberghe (2004) for reference) can be verified directly on the problem (CSPlim,N′). This condition is satisfied in the salvage fund problem by standard linear programming duality.

5.3. Linear Approximation Method. Suppose that the constraint function φ(x, l) is jointly convex in (x, l), and that L is multivariate regularly varying. In this section we develop a simpler method to construct the outer approximation set Oδ and the uniform conditional event Cδ.

We first introduce a crucial assumption used in the construction of Oδ and Cδ.

Assumption 6. There exists a convex piecewise linear function φ−(x, l) : R^{dx} × R^{dl} → R of the form

φ−(x, l) = max_{i=1,...,N} {a_i⊤l + b_i⊤x + c_i}, with a_i ∈ R^{dl}, b_i ∈ R^{dx} and c_i ∈ R for i = 1, . . . , N,

such that:

(1) φ−(x, l) ≤ φ(x, l), ∀(x, l) ∈ R^{dx} × R^{dl};
(2) there exists some constant C ∈ R_+ such that φ(x, l) ≤ 0 whenever φ−(x, l) ≤ −C.

If φ(x, l) itself is a piecewise affine function, then Assumption 6 is satisfied by simply taking φ−(x, l) = φ(x, l). For general jointly convex functions, the following lemma verifies Assumption 6 when φ(x, l) has a compact zero sublevel set.

Lemma 5. If the constraint function φ(x, L) : R^{dx} × R^{dl} → R is convex and twice continuously differentiable, and it has a compact zero sublevel set Zφ := {(x, l) ∈ R^{dx} × R^{dl} | φ(x, l) ≤ 0}, then Assumption 6 is satisfied.
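One standard way to obtain such a φ− for a smooth convex φ is to take the maximum of finitely many supporting hyperplanes (first-order lower bounds); the sketch below illustrates this idea under our own choice of anchor points and is not the construction used in the proof.

import numpy as np

def piecewise_linear_minorant(phi, grad_phi, anchors):
    # For convex phi, each tangent plane phi(z0) + grad_phi(z0) @ (z - z0) lies
    # below phi, so their pointwise maximum is a convex piecewise linear lower
    # bound of the form required by Assumption 6 (z concatenates (x, l)).
    planes = [(grad_phi(z0), phi(z0) - grad_phi(z0) @ z0) for z0 in anchors]
    return lambda z: max(g @ z + c for g, c in planes)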

With Assumption 6 enforced, we are now ready to state the main result of this section, which fully summarizes the construction of Oδ and Cδ.

Theorem 6. If Assumptions 1 and 6 hold, we can construct Oδ and Cδ satisfying Property 1 as

Oδ := ∩_{i=1}^N {x ∈ R^{dx} | b_i⊤x + c_i + F^{-1}_{a_i⊤L}(δ) ≤ 0}, Cδ := ∪_{i=1}^N {L ∈ R^{dl} | a_i⊤L + C > F^{-1}_{a_i⊤L}(δ)},

where F^{-1}_{a_i⊤L}(δ) = inf{x ∈ R | P(a_i⊤L > x) ≤ δ}.
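When the quantiles F^{-1}_{a_i⊤L}(δ) are not available in closed form, they can be estimated from samples of L; the sketch below builds Monte Carlo membership tests for Oδ and Cδ, using np.quantile as a simple plug-in estimator (the helper names are ours).

import numpy as np

def build_sets(a, b, c, C, L_samples, delta):
    # a: (N, d_l), b: (N, d_x), c: (N,) from Assumption 6; L_samples: (n, d_l).
    q = np.quantile(L_samples @ a.T, 1.0 - delta, axis=0)  # estimates F^{-1}_{a_i^T L}(delta)
    in_O = lambda x: bool(np.all(b @ x + c + q <= 0))      # x in O_delta
    in_C = lambda l: bool(np.any(a @ l + C > q))           # l in C_delta
    return in_O, in_C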

6. Verifying the Assumptions in Examples

In this section, we verify the elements required to apply our algorithm. We provide explicit expressions for the sets Oδ and Cδ in the statements of the propositions. The detailed verification process and the steps for constructing the sets Oδ and Cδ are presented as proofs in Appendix A.

6.1. Portfolio Optimization with VaR Constraint. In this section, we verify that Theorem 3 is applicable to an equivalent form of the portfolio optimization problem (1).

Proposition 7. The portfolio optimization problem (1) satisfies all assumptions required by Theorem 3, and the sets Oδ and Cδ admit the explicit expressions

Oδ = {x ∈ R^d_{++} | η · x ⪰ F^{-1}_{1⊤L}(δ) · 1}, Cδ = {l ∈ R^d_{++} | 2 · 1⊤l ≥ F^{-1}_{1⊤L}(δ)}.

6.2. Minimal Salvage Fund. The key observation for solving the minimal salvage fund problem (3) is the following lemma, which provides a closed form piecewise linear expression for the constraint function φ(x, L).

Lemma 8. In the minimal salvage fund problem (3), we have

φ(x, L) = max_{i=1,...,d} {L_i − e_i⊤(I − Q⊤)^{-1}x − m_i},

where e_i denotes the unit vector in the i-th coordinate.

Now we prove that Theorem 6 is applicable to the minimal salvage fund problem (3).

Proposition 9. The minimal salvage fund problem (3) satisfies all assumptions required by Theorem 6, and the sets Oδ and Cδ admit the explicit expressions

Oδ = ∩_{i=1}^d {x ∈ R^d | F^{-1}_{Li}(δ) ≤ e_i⊤(I − Q⊤)^{-1}x + m_i}, Cδ = ∪_{i=1}^d {L ∈ R^d | L_i > F^{-1}_{Li}(δ)}.

6.3. Quadratic Model. In this section, we consider a model with a quadratic control term in x as an additional example. Suppose that the constraint function φ(x, L) : R^{dx} × R^{dl} → R is defined as

φ(x, L) = x⊤Qx + x⊤AL,        (11)

where Q ∈ R^{dx×dx} is a symmetric matrix and A ∈ R^{dx×dl} is a matrix with rank(A) = dx, i.e., there exists σ > 0 such that ‖A⊤x‖2 ≥ σ‖x‖2.

Proposition 10. Consider the chance constrained optimization model with constraint function defined as in (11).

(1) If Q is a positive semi-definite matrix and L has a positive density, there exists some δ such that the problem is infeasible.

(2) If Q has a negative eigenvalue and L is multivariate regularly varying, the model satisfies all the assumptions required by Theorem 3.

7. Numerical Experiments

In order to empirically study the computational complexity and compare the quality of the solutions, in this section we conduct numerical experiments for two scenario generation algorithms:

(1) the efficient scenario generation approach proposed in this paper (abbreviated as Eff-Sc);
(2) the scenario approach in Calafiore and Campi (2006) (abbreviated as CC-Sc).

In Section 7.1, we present the results for the portfolio optimization problem. In Section 7.2, we present the results for the minimal salvage fund problem. The numerical experiments are conducted on a laptop with a 2.2 GHz Intel Core i7 CPU, and the sampled linear programming problems are solved using CVXPY (Diamond and Boyd (2016)) with the MOSEK solver (MOSEK ApS (2020)).

7.1. Portfolio Optimization with VaR Constraint. First, we present the parameter selection and the implementation details for the numerical experiment of the portfolio optimization problem (1). Suppose that there are d = 10 assets to invest in, and that the parameters of the problem are chosen as follows:

• The mean return vector is µ = (1.0, 1.5, 2.0, 2.5, 3, 1.6, 1.2, 1.1, 1.8, 2.2).
• The L_i are independent with Pareto cumulative distribution function P(L_i > l) = ℓ_i/l, for l ≥ ℓ_i.
• ℓ = (ℓ_1, . . . , ℓ_d) = (2.1, 1.3, 1.6, 2.5, 2.7, 1.3, 1.9, 1.5, 2.2, 2.3).
• The loss threshold is η = 1000.

Now we explain the implementation details of Eff-Sc. Recall the expressions of Oδ and Cδ from Proposition 7, which involve the analytically unknown quantity F^{-1}_{1⊤L}(δ). Since quantile estimation is much more computationally efficient than solving the sampled optimization problem, we generate samples of L to estimate a confidence interval for F^{-1}_{1⊤L}(δ) with a large enough confidence level 1 − o(β), and we denote the resulting confidence interval by (LB, UB). We replace the expressions of Oδ and Cδ by their sampled conservative approximations, i.e.,

Oδ = {x ∈ R^d_{++} | η · x ⪰ UB · 1}, Cδ = {l ∈ R^d_{++} | 2 · 1⊤l ≥ LB}.
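A sketch of this quantile estimation step under the experiment's Pareto model; the order-statistic half-width w used to bracket the quantile is a heuristic of ours, whereas the paper only requires a confidence interval at level 1 − o(β).

import numpy as np

rng = np.random.default_rng(0)
d, delta, n = 10, 1e-3, 10**6
ell = np.array([2.1, 1.3, 1.6, 2.5, 2.7, 1.3, 1.9, 1.5, 2.2, 2.3])

S = np.sort((ell / (1.0 - rng.random((n, d)))).sum(axis=1))  # sorted samples of 1^T L
k = int(np.ceil((1.0 - delta) * n))                          # empirical (1 - delta)-quantile index
w = int(5 * np.sqrt(n * delta * (1.0 - delta)))              # heuristic normal-approximation half-width
LB, UB = S[max(k - w, 0)], S[min(k + w, n - 1)]              # bracket for F^{-1}_{1^T L}(delta)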

[Figure 2: log-log plots of (a) the required number of samples against δ and (b) the CPU time (s) against δ, for CC-Sc and Eff-Sc.]
Figure 2. Comparison of computational efficiency for the portfolio optimization problem, in terms of the required number of samples shown in Figure 2a and the CPU time used shown in Figure 2b. We test δ ∈ {0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1}.

The value of P(L ∈ Cδ) is also estimated using the generated samples. We compute the required number of samples N′ using Lemma 1, and the samples of Lδ are generated via acceptance-rejection.

In Figure 2, we compare the efficiency of Eff-Sc and CC-Sc. Figure 2a presents the required number of samples for both algorithms, from which one can readily see that Eff-Sc requires significantly fewer samples than CC-Sc, especially for problems with small δ. In Figure 2b we compare the running times of both methods. Whereas Eff-Sc costs slightly more time for δ around 0.1 due to the overhead of computing Oδ and Cδ, its computational time stays nearly constant uniformly in δ, indicating that Eff-Sc is a substantially more efficient algorithm than CC-Sc.

Finally, we compare Eff-Sc and CC-Sc in terms of the optimal values of the sampled problems and the violation probabilities of the optimal solutions. Because both methods require generating random samples, the produced solutions are random; thus, the optimal values and the violation probabilities are also random. To compare the distributions of these random quantities, we conduct 10³ independent experiments. In each experiment, we execute both algorithms to obtain two solutions, and then we evaluate the solutions' violation probabilities using 10⁶ samples of L. We employ boxplots (see McGill et al. (1978)) to depict the samples' distribution through their quantiles. A boxplot consists of two parts, a box and a set of whiskers. The box is drawn from the 25% quantile to the 75% quantile, with a horizontal line drawn in the middle to denote the median. The two whiskers indicate the 5% and 95% quantiles, respectively, and the scattered points represent all remaining sample points beyond the whiskers.

In Figure 3, we present (a) the optimal values and (b) the violation probabilities. One can readily see from Figure 3a that the optimal value of Eff-Sc is stochastically larger than the optimal value of CC-Sc, while Figure 3b indicates that the optimal solutions produced by both methods are feasible in all 10³ experiments. Overall, with both methods successfully and conservatively approximating the probabilistic constraint, Eff-Sc is more computationally efficient and less conservative, producing solutions with better objective values than its counterpart.

[Figure 3: box plots of (a) the optimal value of the sampled problem and (b) the violation probability of the solution, against δ, for CC-Sc and Eff-Sc; panel (b) also marks the reference line "violation probability = δ".]
Figure 3. Comparison of the quality of optimal solutions for the portfolio optimization problem, in terms of the optimal value shown in Figure 3a and the solutions' violation probabilities shown in Figure 3b. Here δ ∈ {0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1}, and the box plots are generated using 1000 experiments.

7.2. Minimal Salvage Fund. In this section we conduct a numerical experiment for the minimal salvage fund problem (3). In the experiment we pick d ∈ {10, 15, 20} to test the performance of the problem in different dimensions.

For each fixed d, the parameters of problem (3) are chosen as follows:

• Q = (Q_{i,j} : i, j ∈ {1, . . . , d}), where Q_{i,j} = 1/d if i ≠ j and Q_{i,j} = 0 otherwise.
• m = (m_i : i ∈ {1, . . . , d}), where m_i = 10 for each i.
• The L_i are i.i.d. with Pareto cumulative distribution function P(L_i > l) = 1/l, for l ≥ 1.

Recall the explicit expressions for the sets Oδ and Cδ from Proposition 9. To solve the conditional sampled problem (CSPδ,N′), it remains to sample L^{(i)}_δ and to compute N′, the required number of samples. When δ ≤ 10⁻³, solving the optimization problem (CSPδ,N′) costs much more time than simulating L^{(i)}_δ, even though a simple acceptance-rejection scheme is applied to sample L^{(i)}_δ in our experiments. We fix the confidence level parameter β = 10⁻⁵ and set δ′ = δ/P(L ∈ Cδ) ≥ 1/d; then we can compute N′ by the first part of Lemma 1.
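Because Cδ here is the complement of a box and the coordinates of L are independent, the conditional law (L | L ∈ Cδ) can be sampled exactly at O(1) cost per sample, as anticipated in Remark 2, by conditioning on the smallest index J whose coordinate exceeds its quantile. The sketch below does this for the i.i.d. Pareto losses of this experiment (helper name ours):

import numpy as np

rng = np.random.default_rng(0)

def sample_L_given_C(q, n):
    # Exact sampler of (L | L_i > q_i for some i) for i.i.d. Pareto losses
    # with P(L_i > l) = 1/l, l >= 1, via the first exceeding index J.
    d = len(q)
    F = 1.0 - 1.0 / q                                  # F(q_i) = P(L_i <= q_i)
    pJ = (1.0 - F) * np.concatenate(([1.0], np.cumprod(F)[:-1]))
    pJ /= pJ.sum()                                     # distribution of J given L in C_delta
    out = np.empty((n, d))
    for k in range(n):
        J = rng.choice(d, p=pJ)
        U = rng.random(d)
        L = 1.0 / (1.0 - U)                            # coordinates after J: unconditional Pareto
        L[:J] = 1.0 / (1.0 - U[:J] * F[:J])            # before J: conditioned on L_i <= q_i
        L[J] = q[J] / (1.0 - U[J])                     # at J: conditioned on L_J > q_J
        out[k] = L
    return out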

Similar to Figure 2 for the portfolio optimization problem, we compare the efficiency of Eff-Sc and CC-Sc for different d and δ in Figure 4, in terms of (a) the required number of samples and (b) the CPU time for solving the sampled approximation problem.

[Figure 4: log-log plots of (a) the required number of samples against δ and (b) the CPU time (s) against δ, for CC-Sc and Eff-Sc with d ∈ {10, 15, 20}.]
Figure 4. Comparison of computational efficiency for the minimal salvage fund problem, in terms of the required number of samples shown in Figure 4a and the CPU time used shown in Figure 4b. We test d ∈ {10, 15, 20} and δ ∈ {0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1}.

[Figure 5: box plots of (a) the optimal value of the sampled problem and (b) the violation probability of the solution, against δ, for CC-Sc and Eff-Sc; panel (b) also marks the reference line "violation probability = δ".]
Figure 5. Comparison of the quality of optimal solutions for the minimal salvage fund problem, in terms of the optimal value shown in Figure 5a and the solutions' violation probabilities shown in Figure 5b. Here d = 15, δ ∈ {0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1}, and the box plots are generated using 1000 experiments.

We observe that Eff-Sc has uniformly smaller sample complexity and computational complexity than CC-Sc, with the advantage becoming significant for small δ. In particular, the required number of samples and the CPU time used are bounded for Eff-Sc, while they quickly deteriorate for CC-Sc as δ becomes smaller. It is also worth noting that Eff-Sc is consistently more efficient than CC-Sc for all tested dimensions.

Finally, we compare the optimal values of the sampled problems and the violation probabilities of the optimal solutions in Figure 5. We present in Figure 5a the optimal values and in Figure 5b the violation probabilities, with fixed dimension d = 15 (we provide additional results for d = 5 and d = 10 in Appendix B.1). One can readily see from Figure 5a that the optimal value of Eff-Sc is stochastically smaller than the optimal value of CC-Sc, while Figure 5b indicates that the optimal solutions produced by both methods are feasible in all 10³ experiments. Therefore, we draw the same conclusion as from the portfolio optimization experiment: Eff-Sc efficiently produces less conservative solutions.

Acknowledgement

The authors are grateful to Alexander Shapiro for helpful comments. The research of Bert Zwart is supported by NWO grant 639.033.413. The research of Jose Blanchet is supported by the Air Force Office of Scientific Research under award number FA9550-20-1-0397, NSF grants 1915967, 1820942, 1838576, DARPA award N660011824028, and China Merchants Bank.

References

Ahmed S, Shapiro A (2008) Solving chance-constrained stochastic programs via sampling andinteger programming. State-of-the-Art Decision-Making Tools in the Information-IntensiveAge, 261–269 (INFORMS).

Alsenwi M, Pandey SR, Tun YK, Kim KT, Hong CS (2019) A chance constrained based formu-lation for dynamic multiplexing of embb-urllc traffics in 5g new radio. 2019 InternationalConference on Information Networking (ICOIN), 108–113 (IEEE).

Andrieu L, Henrion R, Romisch W (2010) A model for dynamic chance constraints in hydropower reservoir management. European Journal of Operational Research 207(2):579–589.

Barrera J, Homem-de Mello T, Moreno E, Pagnoncelli BK, Canessa G (2016) Chance-constrained problems and rare events: an importance sampling approach. MathematicalProgramming 157(1):153–189.

Ben-Tal A, Nemirovski A (2000) Robust solutions of linear programming problems contaminatedwith uncertain data. Mathematical programming 88(3):411–424.

Ben-Tal A, Nemirovski A (2002) Robust optimization–methodology and applications. Mathe-matical programming 92(3):453–480.

Bertsimas D, Sim M (2004) The price of robustness. Operations research 52(1):35–53.

Blanchet J, Liu J (2010) Efficient importance sampling in ruin problems for multidimensionalregularly varying random walks. Journal of Applied Probability 47(2):301–322.

Bonami P, Lejeune MA (2009) An exact solution approach for portfolio optimization problemsunder stochastic and integer constraints. Operations Research 57(3):650–670.

Boyd S, Vandenberghe L (2004) Convex optimization (Cambridge university press).

Calafiore G, Campi MC (2005) Uncertain convex programs: randomized solutions and confi-dence levels. Mathematical Programming 102(1):25–46.

Calafiore G, Campi MC (2006) The scenario approach to robust control design. IEEE Transac-tions on Automatic Control 51(5):742–753.

Page 21: arXiv:2002.02149v2 [math.OC] 11 Feb 2021

EFFICIENT SCENARIO GENERATION FOR CHANCE CONSTRAINED OPTIMIZATION 21

Charnes A, Cooper WW, Symonds GH (1958) Cost horizons and certainty equivalents: an approach to stochastic programming of heating oil. Management Science 4(3):235–263.

Chen B, Blanchet J, Rhee CH, Zwart B (2019) Efficient rare-event simulation for multiple jump events in regularly varying random walks and compound Poisson processes. Mathematics of Operations Research 44(3):919–942.

Chen W, Sim M, Sun J, Teo CP (2010) From CVaR to uncertainty set: Implications in joint chance-constrained optimization. Operations Research 58(2):470–485.

Diamond S, Boyd S (2016) CVXPY: A Python-embedded modeling language for convex optimization. Journal of Machine Learning Research 17(83):1–5.

Eisenberg L, Noe TH (2001) Systemic risk in financial systems. Management Science 47(2):236–249.

Embrechts P, Klüppelberg C, Mikosch T (2013) Modelling Extremal Events: for Insurance and Finance, volume 33 (Springer Science & Business Media).

Frank B (2008) Municipal Bond Fairness Act. 110th Congress, 2d Session, House of Representatives, Report 110–835.

Gudmundsson T, Hult H (2014) Markov chain Monte Carlo for computing rare-event probabilities for a heavy-tailed random walk. Journal of Applied Probability 51(2):359–376.

Hillier FS (1967) Chance-constrained programming with 0-1 or bounded continuous decision variables. Management Science 14(1):34–57.

Hong LJ, Huang Z, Lam H (2020) Learning-based robust optimization: Procedures and statistical guarantees. Management Science.

Hong LJ, Yang Y, Zhang L (2011) Sequential convex approximations to joint chance constrained programs: A Monte Carlo approach. Operations Research 59(3):617–630.

Kley O, Klüppelberg C, Reinert G (2016) Risk in a large claims insurance market with bipartite graph structure. Operations Research 64(5):1159–1176.

Küçükyavuz S (2012) On mixing sets arising in chance-constrained programming. Mathematical Programming 132(1-2):31–56.

Lagoa CM, Li X, Sznaier M (2005) Probabilistically constrained linear programs and risk-adjusted controller design. SIAM Journal on Optimization 15(3):938–951.

Lejeune MA, Margot F (2016) Solving chance-constrained optimization problems with stochastic quadratic inequalities. Operations Research 64(4):939–957.

Luedtke J (2014) A branch-and-cut decomposition algorithm for solving chance-constrained mathematical programs with finite support. Mathematical Programming 146(1-2):219–244.

Luedtke J, Ahmed S (2008) A sample approximation approach for optimization with probabilistic constraints. SIAM Journal on Optimization 19(2):674–699.

Luedtke J, Ahmed S, Nemhauser GL (2010) An integer programming approach for linear programs with probabilistic constraints. Mathematical Programming 122(2):247–272.

McGill R, Tukey JW, Larsen WA (1978) Variations of box plots. The American Statistician 32(1):12–16.

MOSEK ApS (2020) MOSEK Fusion API for Python. URL https://docs.mosek.com/9.2/pythonfusion.pdf.

Nemirovski A, Shapiro A (2006a) Convex approximations of chance constrained programs. SIAM Journal on Optimization 17(4):969–996.


Nemirovski A, Shapiro A (2006b) Scenario approximations of chance constraints. Probabilistic and Randomized Methods for Design under Uncertainty, 3–47 (Springer).

Peña-Ordieres A, Luedtke JR, Wächter A (2020) Solving chance-constrained problems via a smooth sample-based nonlinear approximation. SIAM Journal on Optimization 30(3):2221–2250.

Prékopa A (1970) On probabilistic constrained programming. Proceedings of the Princeton Symposium on Mathematical Programming, volume 113, 138 (Princeton, NJ).

Prékopa A (2003) Probabilistic programming. Handbooks in Operations Research and Management Science 10:267–351.

Resnick SI (2013) Extreme Values, Regular Variation and Point Processes (Springer).

Seppälä Y (1971) Constructing sets of uniformly tighter linear approximations for a chance constraint. Management Science 17(11):736–749.

Tong S, Subramanyam A, Rao V (2020) Optimization under rare chance constraints. arXiv preprint arXiv:2011.06052.

Wierman A, Zwart B (2012) Is tail-optimal scheduling possible? Operations Research 60(5):1249–1257.

Zhang M, Küçükyavuz S, Goel S (2014) A branch-and-cut method for dynamic decision making under joint chance constraints. Management Science 60(5):1317–1333.


Appendix A. Proofs of Technical Results

A.1. Proofs for Section 5.

Proof of Lemma 2. We will derive an expression for αδ ensuring that Fδ ⊆ ⋃_{α≥αδ} α·Π for δ small enough. Because of Assumption 2, for any α₀ > 0 there exists δ small enough such that Fδ ⊆ ⋃_{α≥α₀} α·Π. Therefore, it suffices to prove that Fδ and ⋃_{α<αδ} α·Π are disjoint; in other words, that

(12) P(φ(α·x, L) > 0) > δ, ∀α < αδ, x ∈ Π, δ < δ₀.

Let ε be a positive number such that μ(Cε,−) > 0. Pick the set K in (8b) as a compact set such that 0 < μ(K ∩ Cε,−) < ∞. It follows from inequality (8b) that there exists a constant α₁ such that

(13) Ψ−(l) − ε ≤ inf_{x∈Π} [ (1/r(α)) φ(α·x, h(α)·l) ], ∀l ∈ K, α > α₁.

Therefore, for any α ≥ α₁ we have

(14) P( min_{x∈Π} φ(α·x, L) > 0 ) = P( min_{x∈Π} (1/r(α)) φ(α·x, L) > 0 )
     ≥ P( Ψ−(L/h(α)) ≥ ε; L/h(α) ∈ K )
     = P( L ∈ h(α)·(K ∩ Cε,−) ),

where the inequality is due to (13). Recall from Assumption 1 that L is regularly varying, so

lim_{α→∞} P( L ∈ h(α)·(K ∩ Cε,−) ) / P(‖L‖2 > h(α)) = μ(K ∩ Cε,−).

Therefore, there exists a number α₂ such that

(15) P( L ∈ h(α)·(K ∩ Cε,−) ) ≥ (1/2) P(‖L‖2 > h(α)) μ(K ∩ Cε,−), ∀α ≥ α₂.

Note that the right-hand side of (15) is nonincreasing in α. Thus, setting δ₁ := (1/2) P(‖L‖2 > h(α₂)) μ(K ∩ Cε,−), for any δ ≤ δ₁ there exists αδ ≥ α₂ satisfying

(16) (1/2) P(‖L‖2 > h(αδ)) μ(K ∩ Cε,−) = δ, so that (1/2) P(‖L‖2 > h(α)) μ(K ∩ Cε,−) ≥ δ for all α, δ with α₂ ≤ α < αδ and 0 < δ ≤ δ₁.

Substituting (16) into (15) and combining with (14), we have

P(φ(α·x, L) > 0) ≥ P( min_{x∈Π} φ(α·x, L) > 0 ) > δ, ∀α, x, δ s.t. max(α₁, α₂) ≤ α < αδ, x ∈ Π, 0 < δ ≤ δ₁.

Moreover, Assumption 2 guarantees the existence of δ₂ such that

P(φ(α·x, L) > 0) > δ, ∀α < max(α₁, α₂), x ∈ Π, δ < δ₂.

Consequently, (12) is proved with δ₀ = min(δ₁, δ₂).
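
The defining equation (16) pins down αδ once the tail of ‖L‖2 is known. As a concrete illustration, the following sketch solves it numerically in a stylized setting; the Pareto tail P(‖L‖2 > t) = t^(−a), the scaling h(α) = α, and the value of the mass μ(K ∩ Cε,−) below are illustrative assumptions, not quantities fixed by the paper.

```python
import numpy as np
from scipy.optimize import brentq

# Illustrative assumptions (not from the paper): Pareto tail
# P(||L||_2 > t) = t**(-a) for t >= 1, scaling h(alpha) = alpha,
# and a stand-in value for mu(K ∩ C_{eps,-}).
a = 1.5          # tail index of ||L||_2
mu_mass = 0.3    # stand-in for mu(K ∩ C_{eps,-})

def tail(t):
    # P(||L||_2 > t) for the assumed Pareto model.
    return min(1.0, t ** (-a))

def alpha_delta(delta):
    """Solve (1/2) * P(||L||_2 > h(alpha)) * mu_mass = delta for alpha,
    as in equation (16); here h(alpha) = alpha."""
    f = lambda alpha: 0.5 * tail(alpha) * mu_mass - delta
    return brentq(f, 1.0, 1e12)

for delta in [1e-2, 1e-3, 1e-4]:
    print(delta, alpha_delta(delta))
```

For this Pareto model the root is available in closed form, αδ = (2δ/μ)^(−1/a), which the root-finder reproduces; the point is that αδ grows only polynomially as δ → 0.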


Proof of Theorem 3. We construct a uniform conditional event Cδ that contains all the sets Vx for x ∈ Oδ. Due to definition (8) and limδ→0 αδ = ∞, there exists δ₀ such that for all δ < δ₀,

(17) Ψ+(l) + ε ≥ sup_{x∈Π} [ (1/r(α)) φ(α·x, h(α)·l) ], ∀l ∈ K, α > αδ.

Notice that for any x ∈ Oδ there exists αx ≥ αδ such that x ∈ αx·Π. Consequently, it follows from (17) that

φ(x, l) > 0 ⟹ Ψ+( l/h(αx) ) ≥ −ε, ∀x ∈ Oδ, l ∈ h(αx)·K.

Applying Assumption 3 yields

Ψ+( l/h(αδ) ) ≥ Ψ+( l/h(αx) ) ≥ −ε, ∀x ∈ Oδ, l ∈ h(αx)·(K ∩ S).

Recall that K is a ball in R^dl (thus K ⊆ (h(αx)/h(αδ))·K) and that S ⊆ (h(αx)/h(αδ))·S by Assumption 3; it follows that h(αδ)·(K ∩ S) ⊆ h(αx)·(K ∩ S). Consequently, whenever l ∈ Vx for some x ∈ Oδ, we either have l ∈ h(αx)·(K ∩ S), implying Ψ+( l/h(αδ) ) ≥ −ε, or we have l ∈ (h(αx)·(K ∩ S))^c ⊆ (h(αδ)·(K ∩ S))^c. Summarizing these two scenarios,

⋃_{x∈Oδ} Vx ⊆ { l ∈ R^dl | Ψ+( l/h(αδ) ) ≥ −ε } ∪ (h(αδ)·(K ∩ S))^c = h(αδ)·( Cε,+ ∪ K^c ∪ S^c ).

Thus, we define the conditional set Cδ as

Cδ := h(αδ)·( Cε,+ ∪ K^c ∪ S^c ).

It remains to analyze the probability of the uniform conditional event Cδ. As L is multivariate regularly varying,

lim_{δ→0} P(L ∈ Cδ) / P(‖L‖2 > h(αδ)) = μ(Cε,+ ∪ K^c ∪ S^c).

Recalling that P(‖L‖2 > h(αδ)) = O(δ) and invoking Property 2, we get

lim sup_{δ→0} δ⁻¹ P(L ∈ Cδ) < ∞.

Hence, the proof is complete.

Proof of Theorem 4. Using Lemma 1, we immediately have P( Val(CCPδ) ≤ Val(CSPδ,N′) ) ≥ 1 − β; it remains to show that there exists Λ > 0 such that P( Val(CSPδ,N′) ≤ Λ × Val(CCPδ) ) ≥ 1 − β.

For simplicity, in the proof we will use Lδ as shorthand for (L | L ∈ Cδ), the random variable with the conditional distribution of L given L ∈ Cδ. By scaling x by a factor αδ in (CSPδ,N′), we obtain the equivalent optimization problem

(18) minimize c⊤x
     subject to (1/r(αδ)) φ(αδ·x, L^(i)_δ) ≤ 0, i = 1, ..., N′,
     x ∈ ⋃_{α≥1} α·Π,

where the L^(i)_δ are i.i.d. samples of Lδ. Notice that Val(CSPδ,N′) = αδ × Val(18).

For any compact set E ⊆ Cε,+ ∪ K^c ∪ S^c (so that h(αδ)·E ⊆ Cδ), since L is multivariate regularly varying,

lim_{δ→0} P( (h(αδ))⁻¹ Lδ ∈ E ) = lim_{δ→0} P( L ∈ h(αδ)·E ) / P(L ∈ Cδ)
  = [ lim_{δ→0} P( L ∈ h(αδ)·E ) / P(‖L‖2 > h(αδ)) ] / [ lim_{δ→0} P(L ∈ Cδ) / P(‖L‖2 > h(αδ)) ]
  = μ(E) / μ(Cε,+ ∪ K^c ∪ S^c).

Thus (h(αδ))⁻¹ Lδ converges vaguely to Llim. As the limiting measure is a probability measure, the family { (h(αδ))⁻¹ Lδ | δ > 0 } is tight, and consequently the convergence in distribution (h(αδ))⁻¹ Lδ → Llim follows directly from the vague convergence; see Resnick (2013). Since all the samples are i.i.d., we also have

(h(αδ))⁻¹ · (L^(1)_δ, ..., L^(N′)_δ) → (L^(1)_lim, ..., L^(N′)_lim) in distribution.

Now we define a family of deterministic optimization problems, denoted by (DP(l1, ..., lN′)) and parameterized by (l1, ..., lN′), as follows:

(DP(l1, ..., lN′)) minimize c⊤x
                   subject to φlim(x, li) ≤ 0, i = 1, ..., N′,
                   x ∈ ⋃_{α≥1} α·Π.

Then, there exists a compact set E₁ ⊆ R^(dl×N′) such that:

(1) Problem (DP(l1, ..., lN′)) satisfies Slater's condition if (l1, ..., lN′) ∈ E₁;
(2) P( (h(αδ))⁻¹ · (L^(1)_δ, ..., L^(N′)_δ) ∈ E₁ ) ≥ 1 − β for all δ > 0.

For every (l1, ..., lN′) ∈ E₁ and ε > 0, due to Slater's condition, there exists a feasible solution x ∈ ⋃_{α≥1} α·Π such that sup_{j=1,...,N′} φlim(x, lj) < −ε. Since φlim(x, l) is continuous in l, there exists an open neighborhood U around (l1, ..., lN′) such that sup_{(l1,...,lN′)∈U} sup_{j=1,...,N′} φlim(x, lj) < −ε/2. Notice that such a feasible solution x and neighborhood U exist for every (l1, ..., lN′) ∈ E₁. By compactness, E₁ admits a finite open cover {Ui}_{i=1,...,m}; let {xi}_{i=1,...,m} be the corresponding feasible solutions. Due to Assumption 5, there exists δ₁ > 0 such that for all δ < δ₁,

(19) sup_{(l1,...,lN′)∈E₁} sup_{i=1,...,m} sup_{j=1,...,N′} | (1/r(αδ)) φ(αδ·xi, h(αδ)·lj) − φlim(xi, lj) | < ε/2.

Therefore, by the triangle inequality, it follows that if δ < δ₁,

sup_{(l1,...,lN′)∈Ui} sup_{j=1,...,N′} (1/r(αδ)) φ(αδ·xi, h(αδ)·lj) < 0.

Consequently, xi is a feasible solution of problem (18) whenever (h(αδ))⁻¹·(L^(1)_δ, ..., L^(N′)_δ) ∈ Ui, which further implies that αδ⁻¹ × Val(CSPδ,N′) ≤ c⊤xi. As a result, we have

Val(CSPδ,N′) ≤ αδ × max_{i=1,...,m} c⊤xi, if (h(αδ))⁻¹ · (L^(1)_δ, ..., L^(N′)_δ) ∈ E₁.

Note that Val(CCPδ) ≥ inf_{x∈Oδ} c⊤x = αδ × inf{ c⊤x | x ∈ ⋃_{α≥1} α·Π }. Therefore, letting

Λ = ( inf{ c⊤x | x ∈ ⋃_{α≥1} α·Π } )⁻¹ × ( max_{i=1,...,m} c⊤xi ) > 0,

it follows that

P( Val(CSPδ,N′) ≤ Λ × Val(CCPδ) ) ≥ P( (h(αδ))⁻¹ · (L^(1)_δ, ..., L^(N′)_δ) ∈ E₁ ) ≥ 1 − β.

The statement is concluded by using the union bound, combining this lower bound with the upper bound implied by Theorems 1 and 3, hence obtaining the factor 2β.

Proof of Lemma 5. Without loss of generality, assume that R is an integer such that

Zφ = { (x, l) ∈ R^dx × R^dl | φ(x, l) ≤ 0 } ⊆ [−R, R]^(dx+dl).

Let N₁ = (2R + 1)^(dx+dl), and let (x(i), l(i)), i = 1, ..., N₁, be the integer lattice points in [−R, R]^(dx+dl). In addition, let aᵢ = (∂φ/∂L)(x(i), l(i)), bᵢ = (∂φ/∂x)(x(i), l(i)), and cᵢ = φ(x(i), l(i)) − (∂φ/∂L)(x(i), l(i))⊤l(i) − (∂φ/∂x)(x(i), l(i))⊤x(i) for i = 1, ..., N₁; then define φ1,−(x, l) = max_{i=1,...,N₁} aᵢ⊤l + bᵢ⊤x + cᵢ. Since the function φ(x, l) is convex, we can invoke the supporting hyperplane theorem to deduce that aᵢ⊤l + bᵢ⊤x + cᵢ ≤ φ(x, l) for i = 1, ..., N₁, and consequently φ1,−(x, l) ≤ φ(x, l). In addition, since φ(x, l) ≥ 0 on the boundary of the cube [−R, R]^(dx+dl), there exists a constant C₁ such that −C₁·R ± C₁·xᵢ ≤ φ(x, l) for i = 1, ..., dx and −C₁·R ± C₁·lᵢ ≤ φ(x, l) for i = 1, ..., dl, for all (x, l) ∈ R^dx × R^dl. Therefore, with φ2,−(x, l) being the maximum of the aforementioned N₂ = 2(dx + dl) linear functions, we have φ2,−(x, l) ≤ φ(x, l), and we also have that φ2,−(x, l) ≤ 0 implies (x, l) ∈ [−R, R]^(dx+dl).

Define φ−(x, l) = max{ φ1,−(x, l), φ2,−(x, l) }. We can summarize the properties of φ−(x, l) as follows: (1) φ−(x, l) is a piecewise linear function of the form max_{i=1,...,N} aᵢ⊤l + bᵢ⊤x + cᵢ, where N = N₁ + N₂; (2) φ−(x, l) ≤ φ(x, l); (3) φ−(x, l) ≤ 0 implies (x, l) ∈ [−R, R]^(dx+dl). To complete the proof, it remains to verify for φ−(x, l) the second statement of Assumption 6.

As φ−(x, l) ≤ 0 implies (x, l) ∈ [−R, R]^(dx+dl), it suffices to prove that there exists some universal constant C ∈ R+ such that φ(x, l) − φ−(x, l) ≤ C for all (x, l) ∈ [−R, R]^(dx+dl). For an arbitrary point (x, l) ∈ [−R, R]^(dx+dl), there exists a lattice point (x(i), l(i)) such that ‖(x, l) − (x(i), l(i))‖2 ≤ √(dx + dl)/2. Next, since φ(x, l) is twice continuously differentiable, the gradient ∇φ(x, l) is Lipschitz over [−R, R]^(dx+dl), with Lipschitz constant denoted by Mφ. Therefore, for any (x, l) ∈ [−R, R]^(dx+dl),

φ(x, l) − φ−(x, l) ≤ φ(x, l) − φ1,−(x, l) ≤ min_{i=1,...,N₁} ( φ(x, l) − (aᵢ⊤l + bᵢ⊤x + cᵢ) ) ≤ (1/4) Mφ² √(dx + dl).

The proof is now complete.
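
The lattice construction behind φ1,− is straightforward to reproduce numerically. Below is a minimal sketch for a toy smooth convex φ in dimension dx = dl = 1; the particular φ and the box radius R are hypothetical stand-ins chosen only to exercise the construction.

```python
import numpy as np

# Toy convex, twice continuously differentiable phi (illustrative choice).
R = 3
phi  = lambda x, l: x**2 + l**2 - 4.0
dphi = lambda x, l: (2.0 * x, 2.0 * l)   # (d phi / d x, d phi / d l)

# Supporting hyperplanes at the integer lattice points of [-R, R]^2,
# as in the proof: phi_1m(x, l) = max_i a_i * l + b_i * x + c_i.
lattice = [(x, l) for x in range(-R, R + 1) for l in range(-R, R + 1)]
planes = []
for x0, l0 in lattice:
    bx, al = dphi(x0, l0)
    c = phi(x0, l0) - al * l0 - bx * x0
    planes.append((al, bx, c))

def phi_1m(x, l):
    return max(a * l + b * x + c for a, b, c in planes)

# Sanity check: the minorant never exceeds phi, and the gap stays bounded
# over the box, in line with the lemma.
pts = np.random.uniform(-R, R, size=(1000, 2))
gaps = [phi(x, l) - phi_1m(x, l) for x, l in pts]
assert min(gaps) >= -1e-9          # phi_1m <= phi everywhere
print("max gap over the box:", max(gaps))
```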


Proof of Theorem 6. Since φ−(x, L) ≤ φ(x, L), the probability constraint P(φ(x, L) > 0) ≤ δ implies that P(φ−(x, L) > 0) ≤ δ, which further implies P(aᵢ⊤L + bᵢ⊤x + cᵢ > 0) ≤ δ for i = 1, ..., N. Therefore, we have −bᵢ⊤x − cᵢ ≥ F⁻¹_{aᵢ⊤L}(δ) for i = 1, ..., N, which implies Fδ ⊆ Oδ.

Then, consider x ∈ Oδ and L ∈ Vx = { L ∈ R^dl | φ(x, L) > 0 }. It follows from the second statement of Assumption 6 that φ(x, L) > 0 implies φ−(x, L) + C > 0. Thus, there exists an index i such that aᵢ⊤L + bᵢ⊤x + cᵢ + C > 0. As x ∈ Oδ implies that bᵢ⊤x + cᵢ + F⁻¹_{aᵢ⊤L}(δ) ≤ 0, we obtain

aᵢ⊤L − F⁻¹_{aᵢ⊤L}(δ) + C ≥ aᵢ⊤L + bᵢ⊤x + cᵢ + C > 0.

Therefore, the conditional set Cδ can be constructed as

Cδ := ⋃_{i=1,...,N} { L ∈ R^dl | aᵢ⊤L + C > F⁻¹_{aᵢ⊤L}(δ) }.

Thus, as the distribution of aᵢ⊤L is regularly varying in dimension one for each i, we have lim sup_{δ→0} δ⁻¹ P(L ∈ Cδ) ≤ N, completing the proof.
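
Since Cδ is a finite union of half-space-type events, membership is cheap to test, and scenarios from L | L ∈ Cδ can then be generated; the sketch below uses plain acceptance-rejection purely for illustration (for very small δ one would use an efficient conditional sampler instead). The Pareto marginals, the vectors aᵢ, and the constant C are hypothetical, and F⁻¹_{aᵢ⊤L}(δ) is interpreted as the (1 − δ)-quantile of aᵢ⊤L, here estimated by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 3, 4
A = np.abs(rng.normal(size=(N, d)))   # hypothetical rows a_i, nonnegative
C = 1.0                               # slack constant from Assumption 6 (stand-in)
delta = 1e-2

def sample_L(n):
    # Hypothetical heavy-tailed risk factors: i.i.d. Pareto(a = 1.5) marginals.
    return rng.uniform(size=(n, d)) ** (-1.0 / 1.5)

# Monte Carlo estimate of the (1 - delta)-quantile of each a_i^T L.
Z = sample_L(200_000) @ A.T
q = np.quantile(Z, 1.0 - delta, axis=0)

def in_C_delta(L):
    # L lies in C_delta iff a_i^T L + C exceeds the quantile for some i.
    return np.any(A @ L + C > q)

# Crude acceptance-rejection sampler for L | L in C_delta (illustration only).
def sample_conditional(n):
    out = []
    while len(out) < n:
        L = sample_L(1)[0]
        if in_C_delta(L):
            out.append(L)
    return np.array(out)

print(sample_conditional(5))
```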

A.2. Proofs for Section 6.

Proof of Proposition 7. Let φ(x, l) = Σ_{i=1,...,d} (lᵢ/xᵢ) − η and π(x) = min_{i=1,...,d} xᵢ. The level set is Π = { x ∈ R^d_{++} | min_{i=1,...,d} xᵢ = 1 }. Let h(α) = α and r(α) = 1; it follows that (1/r(α)) φ(α·x, h(α)·l) = φ(x, l). In view of the inequalities φ(x, l) ≤ 1⊤l − η and φ(x, l) ≥ min_{i=1,...,d} lᵢ − η when x ∈ Π, we choose the asymptotic uniform bounds as

Ψ+(l) = 1⊤l − η,  Ψ−(l) = min_{i=1,...,d} lᵢ − η.

Furthermore, by definition we construct the two approximation sets

Cε,+ = { l ∈ R^d_{++} | 1⊤l ≥ η − ε },  Cε,− = { l ∈ R^d_{++} | min_{i=1,...,d} lᵢ ≥ η + ε }.

With all the elements that we have already defined, Assumption 1 follows directly from the assumption on the distribution of L. We now turn to verifying Assumption 2. As π(α·x) = α·π(x) by the definition of π(x), it suffices to prove that lim_{δ→0} inf_{x∈Fδ} π(x) = +∞. In view of φ(x, L) ≤ 1⊤L/π(x) − η, we have

Fδ = { x ∈ R^d_{++} | P(φ(x, L) > 0) ≤ δ } ⊆ { x ∈ R^d_{++} | P( 1⊤L > η·π(x) ) ≤ δ } = { x ∈ R^d_{++} | η·π(x) ≥ F⁻¹_{1⊤L}(δ) }.

Consequently, we have inf_{x∈Fδ} π(x) ≥ η⁻¹ F⁻¹_{1⊤L}(δ). Taking the limit as δ → 0, we conclude that lim_{δ→0} inf_{x∈Fδ} π(x) = +∞.

As Assumptions 1 and 2 are both satisfied, and we also have μ(Cε,−) > 0, Property 2 is verified by Lemma 2. In addition, if ε ∈ (0, η), the set Cε,+ is bounded away from the origin. Thus Assumption 3 is verified with S = R^d.

Finally, we provide closed-form expressions for Oδ and Cδ. Define αδ = η⁻¹·F⁻¹_{1⊤L}(δ); it then follows that

Oδ = ⋃_{α≥αδ} α·Π = { x ∈ R^d_{++} | η·π(x) ≥ F⁻¹_{1⊤L}(δ) },  Cδ = h(αδ)·( Cε,+ ∪ K^c ∪ S^c ) = αδ·Cε,+ = { l ∈ R^d_{++} | 1⊤l ≥ (1 − ε/η)·F⁻¹_{1⊤L}(δ) }.

Setting ε = η/2 gives the expressions in the statement of the proposition.
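
These closed forms are easy to instantiate once the quantile F⁻¹_{1⊤L}(δ) is available. A minimal sketch, assuming i.i.d. Pareto losses (an illustrative distribution, not the paper's data) and interpreting F⁻¹_{1⊤L}(δ) as the (1 − δ)-quantile of 1⊤L estimated by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(1)
d, eta, delta = 5, 10.0, 1e-3

# Illustrative heavy-tailed losses: i.i.d. Pareto(a = 1.2) marginals.
S = (rng.uniform(size=(1_000_000, d)) ** (-1.0 / 1.2)).sum(axis=1)
q = np.quantile(S, 1.0 - delta)     # Monte Carlo estimate of F^{-1}_{1^T L}(delta)

alpha_delta = q / eta               # alpha_delta = eta^{-1} * F^{-1}_{1^T L}(delta)

def in_O_delta(x):                  # O_delta = {x > 0 : eta * min_i x_i >= q}
    return eta * np.min(x) >= q

def in_C_delta(l):                  # C_delta = {l > 0 : 1^T l >= q/2} for eps = eta/2
    return np.sum(l) >= 0.5 * q

print(alpha_delta, in_O_delta(np.full(d, q / eta)), in_C_delta(np.full(d, q)))
```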

Proof of Lemma 8. We start by establishing some properties of I − Q⊤. Since Q is a nonnegative matrix whose row sums are less than 1, it is a sub-stochastic matrix and all of its eigenvalues are less than 1 in magnitude. This further implies: (1) I − Q⊤ is invertible; and (2) (I − Q⊤)⁻¹ = I + Σ_{n≥1} (Q⊤)^n is a nonnegative matrix with strictly positive diagonal entries.

Notice that y = (I − Q⊤)⁻¹x is the unique vector such that (I − Q⊤)y = x. Let (y′, b′) be an optimal solution of

φ(x, L) = min_{y,b} { b | L − y − m ≤ b·1, (I − Q⊤)y ≤ x, y ∈ R^d_+, b ∈ R },

where the vector inequalities hold componentwise. We have (I − Q⊤)y′ ≤ (I − Q⊤)y = x; multiplying both sides by the nonnegative matrix (I − Q⊤)⁻¹ yields y′ ≤ y. Now let b = max_{i=1,...,d}(Lᵢ − yᵢ − mᵢ), so that (y, b) is a feasible solution of the above problem. It follows from y′ ≤ y that b′ = max_{i=1,...,d}(Lᵢ − y′ᵢ − mᵢ) ≥ max_{i=1,...,d}(Lᵢ − yᵢ − mᵢ) = b; thus (y, b) is also optimal, which completes the proof.
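
The reduction of the network LP to a closed form is cheap to verify numerically. The sketch below compares the closed form max_i(Lᵢ − yᵢ − mᵢ), with y = (I − Q⊤)⁻¹x, against a direct LP solve; the random sub-stochastic Q and the values of m, x, and L are hypothetical problem data.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
d = 4
Q = rng.uniform(size=(d, d))
Q *= 0.9 / Q.sum(axis=1, keepdims=True)        # sub-stochastic: row sums 0.9
m = rng.uniform(1.0, 2.0, size=d)
x = rng.uniform(0.5, 1.5, size=d)
L = rng.uniform(size=d) ** (-1.0 / 1.5)        # hypothetical Pareto losses

# Closed form from Lemma 8: y = (I - Q^T)^{-1} x, phi = max_i (L_i - y_i - m_i).
y = np.linalg.solve(np.eye(d) - Q.T, x)
phi_closed = np.max(L - y - m)

# The defining LP: minimize b s.t. L - y - m <= b*1, (I - Q^T) y <= x, y >= 0.
c = np.r_[np.zeros(d), 1.0]                    # variables (y, b)
A_ub = np.block([[-np.eye(d), -np.ones((d, 1))],
                 [np.eye(d) - Q.T, np.zeros((d, 1))]])
b_ub = np.r_[m - L, x]
res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None)] * d + [(None, None)])
print(phi_closed, res.fun)                     # the two values should agree
```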

Proof of Proposition 9. Assumption 1 follows directly from the assumptions of the example. We now turn to verifying Assumption 6. Using Lemma 8, we define φ−(x, l) = φ(x, l) = max_{i=1,...,d} lᵢ − eᵢ⊤(I − Q⊤)⁻¹x − mᵢ. Therefore, Assumption 6 is satisfied with N = d, aᵢ = eᵢ, bᵢ = −(I − Q)⁻¹eᵢ, cᵢ = −mᵢ, and C = 0. Plugging these values into the expressions for Oδ and Cδ given in Theorem 6, we obtain the expressions shown in the statement of the proposition.
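
A minimal sketch instantiating the resulting Oδ and Cδ, assuming standard Pareto(a) marginals for the Lᵢ so that the tail quantile F⁻¹_{Lᵢ}(δ) equals δ^(−1/a) (an illustrative assumption, as are Q and m):

```python
import numpy as np

a, delta, d = 1.5, 1e-3, 4
rng = np.random.default_rng(3)
Q = rng.uniform(size=(d, d))
Q *= 0.9 / Q.sum(axis=1, keepdims=True)          # hypothetical sub-stochastic Q
m = rng.uniform(1.0, 2.0, size=d)
q = delta ** (-1.0 / a) * np.ones(d)             # F^{-1}_{L_i}(delta) for Pareto(a)

inv = np.linalg.inv(np.eye(d) - Q.T)

def in_O_delta(x):
    # O_delta: e_i^T (I - Q^T)^{-1} x + m_i >= F^{-1}_{L_i}(delta) for all i.
    return np.all(inv @ x + m >= q)

def in_C_delta(L):
    # C_delta: some coordinate exceeds its own tail quantile (C = 0 here).
    return np.any(L > q)

x = 1.1 * (np.eye(d) - Q.T) @ (q - m)            # scaled tight point, inside O_delta
print(in_O_delta(x), in_C_delta(q + 1.0), in_C_delta(np.ones(d)))
```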

The following lemma is used in the proof of Proposition 10.

Lemma 11. There exist sets S₁, ..., S_{2dl} ⊆ R^dl with positive Lebesgue measure such that, for any z ∈ R^dl with ‖z‖2 = 1, there exists some Sᵢ ⊆ { l ∈ R^dl | z⊤l > 1 }.

Proof of Lemma 11. Let eᵢ denote the unit vector along the i-th coordinate of R^dl for i = 1, ..., dl. Fix z = (z₁, ..., z_{dl}) ∈ R^dl with ‖z‖2 = 1 and define θᵢ to be the angle between z and eᵢ, which satisfies cos(θᵢ) = z⊤eᵢ. Since Σ_{i=1,...,dl} cos(θᵢ)² = 1, there exists some i such that cos(θᵢ)² ≥ 1/dl, and thus zᵢ ∈ [−1, −1/√dl] ∪ [1/√dl, 1]. Then, defining

S_{2i−1} = { l = (l₁, ..., l_{dl}) ∈ R^dl | lᵢ > 0, lᵢ² ≥ (dl − 1) Σ_{j≠i} lⱼ² },
S_{2i} = { l = (l₁, ..., l_{dl}) ∈ R^dl | lᵢ < 0, lᵢ² ≥ (dl − 1) Σ_{j≠i} lⱼ² },

we have either S_{2i−1} ⊆ { l ∈ R^dl | z⊤l > 1 } or S_{2i} ⊆ { l ∈ R^dl | z⊤l > 1 }. Thus the proof is complete.


Proof of Proposition 10. For the first statement, since x⊤Qx ≥ 0 and A⊤x ∈ R^dl, and invoking the assumption that L has a positive density,

min_{y∈R^dl\{0}} P(y⊤L > 0) ≥ min_{y:‖y‖2=1} P(y⊤L > 0) > 0.

For the second statement, Assumption 1 is easy to verify. Notice that α⁻²φ(α·x, α·L) = φ(x, L) for all α > 0, so we pick the scaling functions h(α) = α and r(α) = α². Let λmax and λmin denote the maximal and minimal eigenvalues of Q, respectively. The rest of the proof is divided into two cases.

Case 1 (λmax < 0): We pick the level set Π = { x ∈ R^dx | ‖x‖2 = 1 }. Since lim_{δ→0} inf_{x∈Fδ} ‖x‖2 = ∞, Assumption 2 is verified. Next, we show Property 2 directly instead of using Lemma 2. For any x ∈ α·Π we have

min_{x∈α·Π} P(x⊤Qx + x⊤AL > 0) ≥ min_{x∈Π} P( αλmin + x⊤AL > 0 )
  = min_{x∈Π} P( x⊤AL/‖A⊤x‖2 > −αλmin/‖A⊤x‖2 )
  ≥ min_{z:‖z‖2=1} P( z⊤L > −ασ⁻¹λmin )
  ≥ min_{i=1,...,2dl} P( L ∈ −ασ⁻¹λmin·Sᵢ ),

where the last step applies Lemma 11. Thus αδ can be chosen such that P(‖L‖2 > h(αδ)) = O(δ) and min_{i=1,...,2dl} P(L ∈ −αδσ⁻¹λmin·Sᵢ) > δ. As a result, Property 2 is verified. We next derive the asymptotic uniform bound Ψ+. Observing that

sup_{x∈Π} φ(x, L) ≤ λmax + ‖A‖F·‖L‖2,

we define Ψ+(L) := λmax + ‖A‖F·‖L‖2. Assumption 3 now follows from the definition of Ψ+.

Case 2 (λmax ≥ 0): The level set Π is chosen as the unbounded set Π = { x ∈ R^dx | x⊤Qx = −‖x‖2 }, and we have min_{x∈Π} ‖x‖2 = 1/|λmin|. For any x ∈ α·Π we have

min_{x∈α·Π} P(x⊤Qx + x⊤AL > 0) ≥ min_{x∈Π} P( x⊤AL > α )
  = min_{x∈Π} P( x⊤AL/‖A⊤x‖2 > α/‖A⊤x‖2 )
  ≥ min_{z:‖z‖2=1} P( z⊤L > −ασ⁻¹λmin )
  ≥ min_{i=1,...,2dl} P( L ∈ −ασ⁻¹λmin·Sᵢ ),

where the last step again applies Lemma 11. Thus we can pick an αδ satisfying Property 2. Now note that sup_{x∈Π} φ(x, L) is bounded by

sup_{x∈Π} φ(x, L) ≤ sup_{x∈Π} ‖x‖2(‖AL‖2 − 1) ≤ −(1/2)|λmin|⁻¹ · I(‖AL‖2 ≤ 1/2) + ∞ · I(‖AL‖2 > 1),

so we can pick Ψ+(L) := −(1/2)|λmin|⁻¹ · I(‖AL‖2 ≤ 1/2) + ∞ · I(‖AL‖2 > 1). Consequently, Assumption 3 follows immediately.
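
Both the scaling relation α⁻²φ(α·x, α·L) = φ(x, L) and the Case-1 envelope sup_{x∈Π} φ(x, L) ≤ λmax + ‖A‖F·‖L‖2 are immediate to check numerically. A small sketch with hypothetical Q and A (Q built negative definite so that Case 1 applies):

```python
import numpy as np

rng = np.random.default_rng(4)
dx, dl = 4, 3
A = rng.normal(size=(dx, dl))
B = rng.normal(size=(dx, dx))
Q = -(B @ B.T) - 0.1 * np.eye(dx)      # symmetric with lambda_max < 0 (Case 1)

def phi(x, L):
    return x @ Q @ x + x @ A @ L

# Scaling check: with h(alpha) = alpha and r(alpha) = alpha^2,
# the normalized constraint function is scale-free.
x, L, alpha = rng.normal(size=dx), rng.normal(size=dl), 7.3
assert np.isclose(phi(alpha * x, alpha * L) / alpha**2, phi(x, L))

# Case-1 envelope: sup over the unit sphere sits below lambda_max + ||A||_F ||L||_2.
lam_max = np.linalg.eigvalsh(Q).max()
xs = rng.normal(size=(2000, dx))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
sup_phi = max(phi(v, L) for v in xs)
bound = lam_max + np.linalg.norm(A, 'fro') * np.linalg.norm(L)
assert sup_phi <= bound + 1e-9
print(sup_phi, bound)
```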


Appendix B. Additional Numerical Results

B.1. Additional Results for Minimal Salvage Fund. In this section we present additional numerical experiments demonstrating that the quality of the solutions produced by Eff-Sc is better than that of the solutions produced by CC-Sc. See Figure 6 for dimension d = 5 and Figure 7 for d = 10.

[Figure 6 here; graphics omitted in this text version. Panel (a): Optimal Value of the Sampled Problem, plotted against δ on logarithmic axes. Panel (b): Violation Probability of the Solution, plotted against δ together with the reference line Violation Probability = δ. Both panels compare the methods CC-Sc and Eff-Sc.]

Figure 6. Comparison of the quality of optimal solutions given by Eff-Sc and CC-Sc for d = 5, in terms of the optimal value shown in Figure 6a and the solutions' violation probabilities shown in Figure 6b.

[Figure 7 here; graphics omitted in this text version. Panel (a): Optimal Value of the Sampled Problem, plotted against δ on logarithmic axes. Panel (b): Violation Probability of the Solution, plotted against δ together with the reference line Violation Probability = δ. Both panels compare the methods CC-Sc and Eff-Sc.]

Figure 7. Comparison of the quality of optimal solutions given by Eff-Sc and CC-Sc for d = 10, in terms of the optimal value shown in Figure 7a and the solutions' violation probabilities shown in Figure 7b.


Department of Management Science and Engineering, Stanford University, Stanford, CA 94305

Email address: [email protected]

Department of Management Science and Engineering, Stanford University, Stanford, CA 94305

Email address: [email protected]

Centrum Wiskunde & Informatica, Amsterdam 1098 XG, Netherlands, and Eindhoven University of Technology.

Email address: [email protected]