When Machine Learning Meets Robust Optimization – DDARO Models, Algorithms and Applications for Industry 4.0
Fengqi You
Roxanne E. and Michael J. Zak Professor
Process-Energy-Environmental Systems Engineering (PEESE)
School of Chemical and Biomolecular Engineering
Cornell University, Ithaca, New York
www.peese.org
PSE-BR, May 21, 2019
• Four major campuses
• Ithaca, New York (main)
• One of the most beautiful campuses in the U.S.
• Carbon neutral by 2035
• Living lab for energy research
• Roosevelt Is., New York (tech)
• Manhattan, New York (med)
• Doha, Qatar (med)
Cornell University
Cornell University in Ithaca, New York
• Conventional approach for optimization under uncertainty
• “Fit” uncertainty data into a probability distribution or uncertainty set
• Optimization results can be at most as good as the uncertainty model
• “Close the loop” between optimization inputs and results, but not with uncertainty information
Decision Making under Uncertainty
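$$
\begin{aligned}
\min_{x} \quad & c^T x + \max_{u \in U} \min_{y \in \Omega(x,u)} b^T y \\
\text{s.t.} \quad & Ax \ge d, \quad x \in S_x \\
& \Omega(x,u) = \{ y \in S_y : Wy \ge h - Tx - Mu \}
\end{aligned}
$$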
Optimal Decision
Optimization Models and Algorithms
“Fitted” Uncertainty Model
Asymmetry
Multimode
Multiclass
High-dimension
Correlation
Optimization under Uncertainty from the Data Lens
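$$
\begin{aligned}
\min_{x} \quad & c^T x + \max_{u \in U} \min_{y \in \Omega(x,u)} b^T y \\
\text{s.t.} \quad & Ax \ge d, \quad x \in S_x \\
& \Omega(x,u) = \{ y \in S_y : Wy \ge h - Tx - Mu \}
\end{aligned}
$$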
Optimal Decision
Optimization Models and Algorithms
Machine Learning Models & Algorithms
Uncertainty “Big” Data
• Integrate data-driven (ML) system & model-based (optimization) system
• Leads to new (and better) modeling frameworks and needs new algorithms
• Applications: Manufacturing, sustainability, energy systems, agriculture, …
Data-Driven Decision making under Uncertainty
• Let Uncertainty Data “Speak” in Math Programming
• Distributionally robust optimization [Delage & Ye, 10]
• Data-driven chance constrained stochastic program [Guan, 16]
• Data-driven static/adaptive robust optimization
• Balance the conservativeness and computational tractability
• When Machine Learning meets Robust Optimization
• Data-driven static robust optimization [Bertsimas et al., 17]
• Data-driven RO with kernel learning [Shang & You, 17]
• Data-driven RO with PCA & kernel smoothing [Ning & You, 18]
• Data-driven distributionally robust optimization [Shang & You, 18]
• Data-driven adaptive robust optimization [Ning & You, 17]
• Data-driven stochastic robust optimization [Ning & You, 18]
• Data-driven multi-stage adaptive RO [Ning & You, 17]
Data-Driven Decision Making under Uncertainty
Background: Nominal vs. Robust Optimization
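• Nominal optimization: $\min_{p} J(p)$
• Robust optimization: $\min_{p} \max_{\Delta p \in U} J(p + \Delta p)$
Cost function: $J(p) = \| f_{\text{model}}(p) - f_{\text{target}} \|_2^2$ ($J$ nonconvex)
Robust optimum = best worst case
[Figure: cost landscape around the nominal point $p_0$ under perturbations $\Delta p \in U$]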
• Two main components
• Decisions: All the decisions are made “here-and-now”
• Uncertainty set: Often constructed based on a priori and relatively simple assumptions about uncertainty
• Drawback: Solution could be overly conservative
Background: Static Robust Optimization
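$$
\begin{aligned}
\min_{x} \max_{u \in U} \quad & f_0(x, u) \\
\text{s.t.} \quad & f_i(x, u) \le 0, \quad \forall u \in U, \ \forall i
\end{aligned}
$$
Decisions: $x$. Uncertainty set: $U$.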
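As a small illustration of the "best worst case" idea, the sketch below evaluates the inner maximization for a linear function f(x, u) = u^T x over a box uncertainty set, once by checking every vertex of the box and once in closed form. The function names and all numbers are hypothetical and do not come from the talk.

```python
from itertools import product


def worst_case_by_vertices(x, lower, upper):
    """Maximize u^T x over the box lower <= u <= upper by checking every vertex."""
    best = float("-inf")
    for vertex in product(*zip(lower, upper)):
        value = sum(u_i * x_i for u_i, x_i in zip(vertex, x))
        best = max(best, value)
    return best


def worst_case_closed_form(x, lower, upper):
    """Same maximum, coordinate by coordinate: pick whichever bound is worse."""
    return sum(max(lo * x_i, hi * x_i) for x_i, lo, hi in zip(x, lower, upper))


# Hypothetical data: two decisions, each coefficient known only up to an interval.
x = [2.0, -1.0]
lower = [1.0, 0.5]
upper = [3.0, 2.0]
print(worst_case_by_vertices(x, lower, upper))
print(worst_case_closed_form(x, lower, upper))
```

Vertex enumeration grows as 2^n, which is one reason the closed form (or a dual reformulation) is preferred for larger problems.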
• “Wait-and-see” decisions made after uncertainty is revealed
• Well represents the sequential decision-making process
• Less conservative than Static Robust Optimization
• Recourse decisions address feasibility issues
Two-Stage Adaptive Robust Optimization (ARO)
$$
\begin{aligned}
\min_{x} \quad & c^T x + \max_{u \in U} \min_{y \in \Omega(x,u)} b^T y \\
\text{s.t.} \quad & Ax \ge d, \quad x \in S_x \\
& \Omega(x,u) = \{ y \in S_y : Wy \ge h - Tx - Mu \}
\end{aligned}
$$
“Here-and-now” decisions: $x$. Uncertainty: $u$. “Wait-and-see” decisions: $y$.
• Box Uncertainty Set
• Soyster (1973)
• Pros: Tractable
• Cons: Very conservative
$$U_{\text{box}} = \{ u \mid u_i^L \le u_i \le u_i^U, \ \forall i \}$$
• Ellipsoidal Uncertainty Set
• Ben-Tal and Nemirovski (1998)
• Pros: Control conservatism
• Cons: Introduces a nonlinear function into the model
$$U_{\text{ellipsoid}} = \{ u \mid u^T \Sigma^{-1} u \le 1 \} = \{ u \mid \| \Sigma^{-1/2} u \|_2 \le 1 \}$$
Uncertainty Sets – “Heart” of Robust Optimization
• Budget/Gamma Uncertainty Set
• Bertsimas and Sim (2003)
• Pros: Control conservatism
• Cons: Suitable only for independent and symmetric uncertainty
$$U_{\text{budget}} = \Big\{ u \ \Big|\ u_i = \bar{u}_i + \Delta u_i z_i, \ -1 \le z_i \le 1, \ \sum_i |z_i| \le \Gamma, \ \forall i \Big\}$$
• Polyhedral Uncertainty Set
• Bertsimas and Ruiter (2016)
• Pros: Flexible structure to model uncertainty
• Cons: Difficulty in deriving ‘optimal’ polyhedral coefficients
$$U_{\text{polyhedral}} = \{ u \mid w_j^T u \le v_j, \ j = 1, \ldots, s \}$$
Uncertainty Sets – “Heart” of Robust Optimization
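To make the four classic sets concrete, here is a minimal sketch of membership tests for the box, ellipsoidal, budget and polyhedral sets. The function names, the argument layout (plain Python lists) and the sample numbers are illustrative assumptions, not part of the talk.

```python
def in_box(u, lower, upper):
    """lower_i <= u_i <= upper_i for every coordinate."""
    return all(lo <= u_i <= hi for u_i, lo, hi in zip(u, lower, upper))


def in_ellipsoid(u, sigma_inv):
    """u^T Sigma^{-1} u <= 1, with Sigma^{-1} given as a list of rows."""
    n = len(u)
    quad = sum(u[i] * sigma_inv[i][j] * u[j] for i in range(n) for j in range(n))
    return quad <= 1.0


def in_budget(u, nominal, deviation, gamma):
    """u_i = nominal_i + deviation_i * z_i with |z_i| <= 1 and sum |z_i| <= gamma.

    Assumes every deviation_i is strictly positive.
    """
    z = [abs(u_i - n_i) / d_i for u_i, n_i, d_i in zip(u, nominal, deviation)]
    return max(z) <= 1.0 and sum(z) <= gamma


def in_polyhedron(u, w_rows, v):
    """w_j^T u <= v_j for every row j."""
    return all(sum(w * u_i for w, u_i in zip(row, u)) <= v_j
               for row, v_j in zip(w_rows, v))


# Hypothetical point near a corner of the unit box around the origin.
corner = [0.9, 0.9]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(in_box(corner, [-1.0, -1.0], [1.0, 1.0]))
print(in_ellipsoid(corner, identity))
print(in_budget(corner, [0.0, 0.0], [1.0, 1.0], gamma=1.0))
print(in_polyhedron(corner, [[1.0, 1.0]], [1.5]))
```

The corner point is inside the box but outside the other three sets, which is the sense in which those sets trim the box's conservative corners.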
• Research Questions on Uncertainty Set
• Classic uncertainty sets
• Fixed geometric shape
• Always convex
• One-set-fits-all
• How to derive the model/set(s) from uncertainty data?
• Data-driven uncertainty set(s) with ML
• “Optimal” polyhedral uncertainty set(s)
• Why use a convex set for uncertainty?
• Piecewise linear uncertainty sets
• Non-convex uncertainty sets from multiple basic convex sets
• ML + RO leads to new data-driven methods and also novel & previously intractable RO paradigms
Uncertainty Sets – “Heart” of Robust Optimization
Example 1: Data-driven uncertainty set for ARO
[Figure: four panels over axes u1 and u2 (0 to 60): uncertainty data, box type uncertainty set, budgeted uncertainty set, data-driven uncertainty set]
• The “bridge” between data and uncertainty set
• Dirichlet Process (DP) Mixture Model [Blei & Jordan, 06]
• A powerful Bayesian nonparametric model
• Ability to adjust its complexity to that of data
Data-Driven Uncertainty Set for ARO
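$$
\begin{aligned}
\beta_k \mid \alpha &\sim \mathrm{Beta}(1, \alpha) \\
\eta_k \mid F_0 &\sim F_0 \\
l_i \mid \{\beta_1, \beta_2, \ldots\} &\sim \mathrm{Mult}(\pi(\beta)) \\
o_i \mid l_i &\sim p(o_i \mid \eta_{l_i})
\end{aligned}
$$
“Stick breaking”: $F = \sum_{k=1}^{\infty} \pi_k \delta_{\eta_k}$
Data sample → Dirichlet process mixture model → variational inference → predictive posterior $\Pr(\text{new observation} \mid \text{dataset})$ → uncertainty set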
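The "stick breaking" label refers to the standard construction of Dirichlet process mixture weights: each Beta(1, α) draw breaks a fraction off the remaining stick, so the k-th weight is that fraction times whatever is left after the first k−1 breaks. The sketch below draws a truncated set of weights; the function name, the truncation level and the seed are illustrative assumptions.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw the first `truncation` mixture weights of a Dirichlet process.

    beta_k ~ Beta(1, alpha) is the fraction broken off the remaining stick,
    so pi_k = beta_k * prod_{j<k} (1 - beta_j).
    Returns the weights and the length of stick left over after truncation.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation):
        beta_k = rng.betavariate(1.0, alpha)
        weights.append(beta_k * remaining)
        remaining *= 1.0 - beta_k
    return weights, remaining


rng = random.Random(0)  # fixed seed so the run is repeatable
weights, leftover = stick_breaking_weights(alpha=1.0, truncation=10, rng=rng)
print(weights)
print(leftover)
```

A small α tends to put most of the mass on the first few components, while a large α spreads it over many, which is how the model adjusts its complexity to the data.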
Variational Inference for DDANRO Uncertainty Set
Flowchart ($q$ is the variational distribution):
• Input: uncertainty data $\{u^{(1)}, \ldots, u^{(N)}\}$
• Update $q(\beta_k)$
• Update $q(\eta_k, H_k)$
• Update $q(l_i)$
• Evaluate the evidence lower bound (ELBO); if its relative change between successive iterations is within tol (Yes), stop; otherwise (No), repeat the updates
• Output: inference results, which give the parameters in uncertainty sets, including $s_i$, $\mu_i$ and $\Psi_i$
$$q(l, \beta, \eta, H) = \prod_{i=1}^{N} q(l_i) \prod_{k=1}^{M-1} q(\beta_k) \prod_{k=1}^{M} q(\eta_k, H_k)$$
• The “bridge” between data and uncertainty set
Data-Driven Uncertainty Set for ARO
• The predictive posterior is a mixture of Student’s t-distributions
Component 1, Component 2, …, Component m → Uncertainty set 1, Uncertainty set 2, …, Uncertainty set m
Uncertainty transformation: $u = \mu_i + s_i \Psi_i^{1/2} \xi$
Multiple basic sets for high-fidelity descriptions of uncertainties
Data-Driven Uncertainty Set for ARO
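$$U = \bigcup_{i:\, \gamma_i \ge \gamma^*} \Big\{ u \ \Big|\ u = \mu_i + s_i \Psi_i^{1/2} z, \ \|z\|_\infty \le 1, \ \|z\|_1 \le \Phi_i \Big\}$$
• Union of basic uncertainty sets
• Uncertainty set using l1 and l∞ norms
• $\Phi_i$: budget of data-driven uncertainty set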
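Each basic set is an affine image u = μ + s Ψ^{1/2} z of a polytope in z-space bounded by the l∞ norm (at most 1) and the l1 norm (at most a budget Φ). When the budget is an integer between 1 and the dimension, the vertices in z-space have exactly Φ entries equal to ±1 and zeros elsewhere. The sketch below enumerates them and maps them to u-space. The function name, the symmetric stand-in for Ψ^{1/2} and all numbers are assumptions for illustration.

```python
from itertools import combinations, product


def basic_set_vertices(mu, s, psi_half, phi):
    """Vertices of {u = mu + s * psi_half @ z : ||z||_inf <= 1, ||z||_1 <= phi}.

    Assumes phi is an integer with 1 <= phi <= len(mu); psi_half is a
    square-root factor of the scale matrix, given as a list of rows.
    """
    n = len(mu)
    vertices = []
    for support in combinations(range(n), phi):
        for signs in product((-1.0, 1.0), repeat=phi):
            z = [0.0] * n
            for index, sign in zip(support, signs):
                z[index] = sign
            u = [mu[r] + s * sum(psi_half[r][c] * z[c] for c in range(n))
                 for r in range(n)]
            vertices.append(u)
    return vertices


# Hypothetical component: center (30, 40), scale 2, symmetric square-root factor.
mu = [30.0, 40.0]
psi_half = [[2.0, 1.0], [1.0, 2.0]]
for vertex in basic_set_vertices(mu, 2.0, psi_half, phi=1):
    print(vertex)
```

Because the factor matrix has off-diagonal entries, the resulting polytope is tilted, which is how a basic set can follow correlation in the data.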
Data-Driven Adaptive Nested Robust Optimization
$$
\begin{aligned}
\min_{x} \quad & c^T x + \max_{i \in \{1, \ldots, m\}} \max_{u \in U_i} \min_{y \in \Omega(x,u)} b^T y \\
\text{s.t.} \quad & Ax \ge d, \quad x \in S_x \\
& U_i = \{ u \mid u = \mu_i + s_i \Psi_i^{1/2} z, \ \|z\|_\infty \le 1, \ \|z\|_1 \le \Phi_i \} \\
& \Omega(x,u) = \{ y \in S_y : Wy \ge h - Tx - Mu \}
\end{aligned}
$$
Uncertainty set using l1 and l∞ norms for each component i (DDANRO1∩∞)
Model Features
• Size depends on data
• Multi-level (min-max-max-min) optimization
Advantages
• Adaptive to uncertainty
• Less conservative
• Captures the nature of uncertainty data
Challenge: How to solve the multi-level optimization problem?
• Multi-level optimization to single-level
Computational Algorithm
Multi-level (min-max-max-min) → decomposition over extreme points → single-level (large-scale!)
$$
\begin{aligned}
\min_{x, \eta, y^l} \quad & c^T x + \eta \\
\text{s.t.} \quad & Ax \ge d \\
& \eta \ge b^T y^l, \quad \forall l \in L \\
& Tx + Wy^l \ge h - Mu^l, \quad \forall l \in L \\
& x \in S_x, \quad y^l \in S_y, \quad \forall l \in L
\end{aligned}
$$
Here $u^l$, $l \in L$, are the extreme points of the uncertainty sets.
• Features of the algorithm
• Comparison: Unlike the C&CG algorithm, it has a set of sub-problems
• Convergence: A finite number of extreme points of the uncertainty set implies convergence in a finite number of iterations
Semi-infinite program (SIP)
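The convergence claim rests on a standard convexity argument, sketched here under assumptions the slide does not state: the recourse variables are continuous and nonnegative ($S_y = \mathbb{R}^{n}_{+}$), and the recourse problem is feasible and bounded for every $u$ considered.

```latex
\begin{aligned}
Q(x,u) &= \min_{y \ge 0} \left\{ b^T y : Wy \ge h - Tx - Mu \right\} \\
       &= \max_{\pi \ge 0} \left\{ \pi^T (h - Tx - Mu) : W^T \pi \le b \right\}
          && \text{(LP duality)} \\[4pt]
\max_{u \in U_i} Q(x,u) &= \max_{l} \; Q(x, u^{l}),
          && u^{l} \text{ ranging over the extreme points of } U_i
\end{aligned}
```

For fixed $x$, the dual form shows that $Q(x,u)$ is a pointwise maximum of functions affine in $u$, hence convex in $u$. A convex function attains its maximum over a polytope at an extreme point, and each basic set has finitely many, so only finitely many constraints of the semi-infinite program can ever be generated.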
Tailored Row & Column Generation Algorithm
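Master problem
$$
\begin{aligned}
\min_{x, \eta, y^l} \quad & c^T x + \eta \\
\text{s.t.} \quad & Ax \ge d \\
& \eta \ge b^T y^l, \quad \forall l \in L \\
& Tx + Wy^l \ge h - Mu^l, \quad \forall l \in L \\
& x \in S_x, \quad y^l \in S_y, \quad \forall l \in L
\end{aligned}
$$
Sub-problems (one per basic uncertainty set $U_i$)
$$
\begin{aligned}
Q_i(x) = \max_{u \in U_i} \min_{y} \quad & b^T y \\
\text{s.t.} \quad & Wy \ge h - Tx - Mu, \quad y \in S_y
\end{aligned}
$$
Master problem → sub-problems: first-stage decisions
Sub-problems → master problem: optimality or feasibility cuts
• Multi-level optimization to single-level SIP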
Example 2: ARO under data-driven uncertainties
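$$
\begin{aligned}
\min_{x} \quad & 3x_1 + 5x_2 + \max_{u \in U} \min_{y} \ 6y_1 + 10y_2 \\
\text{s.t.} \quad & x_1 + x_2 \le 100 \\
& x_1 + y_1 \ge u_1 \\
& x_2 + y_2 \ge u_2 \\
& x_i, y_i \ge 0, \quad i = 1, 2
\end{aligned}
$$
Uncertainties: the uncertainty set is constructed directly from data.
[Figure: uncertainty data over axes u1 and u2 (0 to 60)]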
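For this small model the inner minimization has a closed form, because the cheapest feasible recourse is y_i = max(u_i − x_i, 0). The sketch below evaluates the worst-case total cost of a given first-stage decision over a union of polytopes, each described by its vertices, with one sub-problem per polytope. It reads the constraints as x_i + y_i ≥ u_i, and the vertex lists are made up for illustration since the data behind the slide is not available. The capacity constraint x1 + x2 ≤ 100 is not checked here.

```python
def recourse_cost(x, u, penalties=(6.0, 10.0)):
    """Inner minimization: the cheapest y >= 0 with x_i + y_i >= u_i."""
    return sum(p * max(u_i - x_i, 0.0) for p, x_i, u_i in zip(penalties, x, u))


def worst_case_total_cost(x, vertex_sets, first_stage_costs=(3.0, 5.0)):
    """First-stage cost plus the worst recourse cost over a union of polytopes.

    Each polytope is given by its list of vertices. The recourse cost is
    convex in u, so checking vertices is enough for each polytope.
    """
    first_stage = sum(c * x_i for c, x_i in zip(first_stage_costs, x))
    worst = max(max(recourse_cost(x, u) for u in vertices)
                for vertices in vertex_sets)
    return first_stage + worst


# Hypothetical union of two rectangles standing in for two data clusters.
cluster_a = [(10.0, 10.0), (20.0, 10.0), (20.0, 25.0), (10.0, 25.0)]
cluster_b = [(35.0, 40.0), (45.0, 40.0), (45.0, 50.0), (35.0, 50.0)]
print(worst_case_total_cost((30.0, 40.0), [cluster_a, cluster_b]))
print(worst_case_total_cost((45.0, 50.0), [cluster_a, cluster_b]))
```

With these made-up numbers, covering the upper corner of the second cluster in the first stage is cheaper than paying the recourse penalty, because the recourse costs are twice the first-stage costs.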
Motivating Example 2
| | ARO with box uncertainty set | ARO with budget uncertainty set | Data-driven ARO with l1 and l∞ norms based sets |
|---|---|---|---|
| Min. obj. | 453.0 | 431.3 | 320.4 |
| First-stage decisions | x1 = 45.2, x2 = 54.8 | x1 = 47.9, x2 = 52.1 | x1 = 33.9, x2 = 43.7 |

[Figure: data-driven, budget based and box based uncertainty sets over axes u1 and u2 (0 to 60)]
Example 3: ARO under correlated uncertainties
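$$
\begin{aligned}
\min_{x} \quad & 3x_1 + 5x_2 + \max_{u \in U} \min_{y} \ 6y_1 + 10y_2 \\
\text{s.t.} \quad & x_1 + x_2 \le 100 \\
& x_1 + y_1 \ge u_1 \\
& x_2 + y_2 \ge u_2 \\
& x_i, y_i \ge 0, \quad i = 1, 2
\end{aligned}
$$
Uncertainties: the uncertainty set is constructed directly from data.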
Results of Example 3
| | ARO with box uncertainty set | ARO with budget uncertainty set | Data-driven ARO with l1 and l∞ norms based set |
|---|---|---|---|
| Min. obj. | 824.8 | 732.3 | 620.3 |
| First-stage decisions | x1 = 20.3, x2 = 79.7 | x1 = 32.2, x2 = 67.8 | x1 = 41.4, x2 = 58.6 |

[Figure: objective function value (500 to 850) versus data coverage of uncertainty set (0% to 100%), for ARO with budgeted set and the proposed DDANRO]
[Figure: data-driven, box based and budgeted based uncertainty sets over axes u1 (30 to 90) and u2 (20 to 100)]
Example 4: ARO with 3D Uncertainty Data
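$$
\begin{aligned}
\min_{x} \quad & 3x_1 + 5x_2 + 6x_3 + \max_{u \in U} \min_{y} \ 12y_1 + 20y_2 + 30y_3 \\
\text{s.t.} \quad & x_1 + x_2 + x_3 \le 100 \\
& x_1 + y_1 \ge u_1 \\
& x_2 + y_2 \ge u_2 \\
& x_3 + y_3 \ge u_3 \\
& x_i, y_i \ge 0, \quad i = 1, 2, 3
\end{aligned}
$$
Uncertainties: the uncertainty set is constructed directly from data.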
Results of Example 4: Box Set v.s. DDANRO
| | ARO with box uncertainty set (or budget=3) | Data-driven ARO with l1 and l∞ norms based set |
|---|---|---|
| Min. obj. | 1,741.2 | 1,251.8 |
| First-stage decisions | x1 = 0.0, x2 = 28.3, x3 = 71.7 | x1 = 13.7, x2 = 37.3, x3 = 49.0 |

Data coverage = 100%
Results of Example 4: budget = 2 v.s. DDANRO
| | ARO with budgeted uncertainty set (budget=2) | Data-driven ARO with l1 and l∞ norms based set |
|---|---|---|
| Min. obj. | 1,534.3 | 1,204.1 |
| First-stage decisions | x1 = 4.7, x2 = 34.8, x3 = 60.5 | x1 = 10.5, x2 = 40.0, x3 = 49.5 |

Data coverage = 80.25%
Results of Example 4: budget = 1 v.s. DDANRO
| | ARO with budgeted uncertainty set (budget=1) | Data-driven ARO with l1 and l∞ norms based set |
|---|---|---|
| Min. obj. | 1,189.5 | 1,145.4 |
| First-stage decisions | x1 = 8.6, x2 = 32.5, x3 = 58.9 | x1 = 9.9, x2 = 41.1, x3 = 49.0 |

Data coverage = 40.75%
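The comparisons above are made at equal data coverage, that is, the fraction of observed data points that fall inside the uncertainty set. A minimal sketch of that measurement for a budgeted set follows; the helper names and the five sample points are hypothetical, chosen only so that the coverage values are easy to check by hand.

```python
def data_coverage(points, contains):
    """Fraction of data points for which the membership test `contains` is true."""
    inside = sum(1 for u in points if contains(u))
    return inside / len(points)


def budget_set(nominal, deviation, gamma):
    """Return a membership test for the budgeted set with budget gamma."""
    def contains(u):
        z = [abs(u_i - n_i) / d_i for u_i, n_i, d_i in zip(u, nominal, deviation)]
        return max(z) <= 1.0 and sum(z) <= gamma
    return contains


# Hypothetical 3D data around a nominal value of 50 with deviation 10.
points = [(50, 50, 50), (60, 50, 50), (60, 60, 50), (60, 60, 60), (40, 55, 50)]
nominal, deviation = (50, 50, 50), (10, 10, 10)
for gamma in (1, 2, 3):
    print(gamma, data_coverage(points, budget_set(nominal, deviation, gamma)))
```

With three uncertain parameters, a budget of 3 reproduces the box set, so it covers every point that lies within the per-coordinate bounds.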
• Idea 1: Union of basic convex sets to represent the entire uncertainty space
• Small pieces/shapes to cover all data points
• Overlaps between the basic shapes are allowed
• Each basic set is modeled as a constraint
• Idea 2: Beyond simple clustering - optimal polyhedrons to derive the uncertainty sets
• For the “one-cluster” case, it is better than typical methods
• Can we use Convex Hull?
• Impossible for general higher dimension problems (> 3 dimensions); otherwise, P = NP (all integer programs can be solved as a linear program)
• Scalable method for large-scale applications
• Previous examples have low dimensions for visual inspection
Summary: Insights from Numerical Examples
[Figures: uncertainty sets from the numerical examples, over axes u1 and u2]
• Uncertain parameters from historical data
• Demands of 4 products (correlated uncertainty)
• Processing times of 3 reactions (with outliers)
• Asymmetry, multimode and correlated data
Objective
• Maximize profit
Constraints
• Assignment constraint
• Time constraint
• Batch size constraint
• Mass balance constraint
• Storage constraint
• Demand constraint
Application 1: Batch Process Scheduling
[Figure: results for static robust optimization with box uncertainty set, ARO with budgeted uncertainty set, and DDANRO]
Affected by outliers in processing time data
• DDANRO yields the highest profit ($46,597)
• Reduces conservatism of ARO solution in the presence of outlier-corrupted data.
Data-Driven Robust Batch Scheduling Results
Application 2: Process Network Planning
Objective
• Maximize NPV
Constraints
• Expansion constraint
• Investment constraint
• Mass balance constraint
• Capacity constraint
• Demand constraint
• Supply constraint
Process Network
• 38 processes
• 28 chemicals
Uncertainty
• Supply (10): 10,000 uncertain supply data points for 10 feedstocks
• Demand (16): 16,000 uncertain demand data points for 16 products
Data-Driven Robust Process Network Planning
| | Static robust optimization w/ box uncertainty | ARO with budget based uncertainty (Γd=3, Γs=2) | DDANRO (Φd=3, Φs=2) |
|---|---|---|---|
| Max. NPV (m.u.) | 761.79 | 799.03 | 857.38 |

[Figure: robust design and planning results for time period 4 (left: SRO with box uncertainty; right: DDANRO)]
Computational Results for Application 2
| | Int. Variables | Cont. Var. | Constraints |
|---|---|---|---|
| Original ARO | 152 | 681 | 945 |
| Master (last iter.) | 152 | 7,450 | 9,748 |
| Subproblem | 112 | 13,033 | 38,067 |

Total CPU (s): 466.4
Summary on DDANRO
Ning, C., & You, F. (2017). Data-Driven Adaptive Nested Robust Optimization: General Modeling Framework and Efficient Computational Algorithm for Decision Making under Uncertainty. AIChE Journal, 63, 3790–3817.
Labeled Multi-Class Uncertainty Data
Example 1: Solar power generation labeled by climate & weather [Shi et al., 12]

| | Weather |
|---|---|
| d1 | Sunny |
| d2 | Sunny |
| d3 | Cloudy |
| … | … |

Example 2: The process data are collected from 6 operating modes, which differ in mass ratio or production rate [Downs & Vogel, 93]

| Mode | G/H Mass Ratio | Production Rate (stream 11) |
|---|---|---|
| 1 | 50/50 | 7,038 kg/h G and 7,038 kg/h H |
| 2 | 10/90 | 1,408 kg/h G and 12,669 kg/h H |
| 3 | 90/10 | 10,000 kg/h G and 1,111 kg/h H |
| 4 | 50/50 | Maximum production rate |
| 5 | 10/90 | Maximum production rate |
| 6 | 90/10 | Maximum production rate |
Data-Driven Stochastic Robust Optimization
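Data-Driven SRO Framework
Data-Driven Uncertainty Model: labeled uncertainty data $D = \{ (u^{(i)}, c^{(i)}) \}_{i=1}^{N}$ (uncertainty data $u^{(i)}$, label $c^{(i)}$)
1 Maximum likelihood estimation → probability of data classes $p_s$
2 A group of Dirichlet process mixture models → uncertainty sets
Stochastic Robust Optimization
$$
\begin{aligned}
\min_{x} \quad & c^T x + \sum_{s} p_s \max_{i \in \{1, \ldots, m_s\}} \max_{u_s \in U_{s,i}} \min_{y_s \in \Omega(x, u_s)} b^T y_s \\
\text{s.t.} \quad & Ax \ge d, \quad x \in \mathbb{R}_+^{n_1} \times \mathbb{Z}^{n_2} \\
& U_{s,i} = \{ u_s \mid u_s = \mu_{s,i} + s_{s,i} \Psi_{s,i}^{1/2} z, \ \|z\|_\infty \le 1, \ \|z\|_1 \le \Phi_{s,i} \} \\
& \Omega(x, u_s) = \{ y_s \in \mathbb{R}_+^{n_3} : W y_s \ge h - Tx - M u_s \}
\end{aligned}
$$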
Ning, C. & You, F. (2018). Data-Driven Stochastic Robust Optimization: General Computational Framework and Algorithm Leveraging Machine Learning for Optimization under Uncertainty in the Big Data Era. Computers & Chemical Engineering, 111, 115-133.
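The outer layer of the stochastic robust model can be sketched in a few lines: the maximum likelihood estimate of each class probability is simply the class frequency, and the second-stage objective is the probability-weighted sum of per-class worst-case costs. The function names, labels and cost values below are hypothetical; in the actual framework each per-class worst case would come from solving the robust sub-problems for that class.

```python
from collections import Counter


def class_probabilities(labels):
    """Maximum likelihood estimate of class probabilities: p_s = N_s / N."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def expected_worst_case(probabilities, worst_case_by_class):
    """Expectation over data classes of the per-class worst-case recourse cost."""
    return sum(p * worst_case_by_class[label] for label, p in probabilities.items())


# Hypothetical labels for four days of data and made-up worst-case costs.
labels = ["sunny", "sunny", "cloudy", "sunny"]
probabilities = class_probabilities(labels)
print(probabilities)
print(expected_worst_case(probabilities, {"sunny": 100.0, "cloudy": 300.0}))
```

Taking the expectation across classes while staying robust within each class is what makes the approach less conservative than a single worst case over all the data.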
Data-Driven RO using Support Vector Method
Data → SVC → SVC-induced uncertainty sets
SVC (Dual as QP), with the Weighted Generalized Intersection Kernel
$$
\begin{aligned}
\min_{\alpha} \quad & \alpha^T K \alpha - \alpha^T \operatorname{diag}(K) \\
\text{s.t.} \quad & 0 \le \alpha_i \le 1/(N\nu), \quad i = 1, \ldots, N \\
& \sum_{i=1}^{N} \alpha_i = 1
\end{aligned}
$$
• Flexible & Compact geometry
• Control “fraction of outliers”
RO with Uncertain Parameters: $\max_{a \in U(D)} a^T x \le b$
Tractable Robust Formulation: [Equation: linear constraints with multipliers indexed by the support vectors (SV)]
Robust Mixed-Integer Linear Program
Application in Process Network Design: uncertain demand data → SVC
Shang, C., Huang, X., & You, F. (2017). Data-Driven Robust Optimization Based on Kernel Learning. Computers & Chemical Engineering, 106, 464–479.
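The "fraction of outliers" control follows from the box constraint in the SVC dual: each multiplier is capped at 1/(Nν) and the multipliers sum to one, so at most νN samples can sit at the cap. The sketch below groups samples by their multiplier in the usual support vector reading (zero: interior, strictly between: boundary support vector, at the cap: outlier). The function name and the multiplier values are illustrative assumptions, not output of an actual solver.

```python
def classify_by_multiplier(alpha, nu, tol=1e-9):
    """Split sample indices by their dual multiplier alpha_i.

    alpha_i = 0               -> interior point (not a support vector)
    0 < alpha_i < 1 / (N nu)  -> boundary support vector
    alpha_i = 1 / (N nu)      -> outlier (left outside the set)
    """
    n = len(alpha)
    cap = 1.0 / (n * nu)
    groups = {"interior": [], "boundary": [], "outlier": []}
    for i, a in enumerate(alpha):
        if a <= tol:
            groups["interior"].append(i)
        elif a >= cap - tol:
            groups["outlier"].append(i)
        else:
            groups["boundary"].append(i)
    return groups


# Hypothetical multipliers for N = 10 samples with nu = 0.2 (cap = 0.5).
alpha = [0.5, 0.3, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
groups = classify_by_multiplier(alpha, nu=0.2)
print(groups)
```

Only the support vectors carry nonzero multipliers, which is why the resulting robust counterpart stays compact even with many data points.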
Data-Driven RO with PCA & Kernel Smoothing
Data-Driven Uncertainty Set Construction
• Principal Component Analysis
• Kernel Smoothing Method
• Handles high-dimension, correlation and asymmetry
Data-Driven Robust Optimization
• Data-driven uncertainty set
• General data-driven robust counterpart: [Equation: minimize $c^T x$ over $x$ subject to constraints built from the PCA projection $P$ and the kernel density quantile functions $\hat{F}_{\text{KDE}}^{-1}$]
3D Illustrative Example
Application to MPC
1 Control System Diagram
2 Data-Driven Uncertainty Set for MPC
3 Result of Data-Driven Robust MPC
Ning, C. & You, F. (2018). Data-Driven Decision Making under Uncertainty Integrating Robust Optimization with Principal Component Analysis and Kernel Smoothing Methods. Computers & Chemical Engineering, 112, 190–210.
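The kernel smoothing step can be illustrated in one dimension: after projecting the data onto a principal component, a kernel density estimate gives a smooth CDF, and its quantiles bound the uncertainty along that component. The sketch below uses a Gaussian kernel and bisection; the kernel choice, bandwidth, function names and sample values are assumptions made for illustration, not details taken from the talk.

```python
import math


def kde_cdf(value, samples, bandwidth):
    """CDF of a Gaussian kernel density estimate, evaluated at `value`."""
    total = sum(0.5 * (1.0 + math.erf((value - s) / (bandwidth * math.sqrt(2.0))))
                for s in samples)
    return total / len(samples)


def kde_quantile(level, samples, bandwidth, tol=1e-8):
    """Invert the KDE CDF by bisection to get the quantile at `level` in (0, 1)."""
    low = min(samples) - 10.0 * bandwidth
    high = max(samples) + 10.0 * bandwidth
    while high - low > tol:
        mid = 0.5 * (low + high)
        if kde_cdf(mid, samples, bandwidth) < level:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Hypothetical projections of the data onto one principal component.
projections = [-1.0, 0.0, 1.0]
lower = kde_quantile(0.05, projections, bandwidth=0.5)
upper = kde_quantile(0.95, projections, bandwidth=0.5)
print(lower, upper)
```

For skewed data the two quantiles sit at different distances from the center, which is how this kind of set can capture asymmetry that a symmetric box or budget set cannot.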
Data-Driven Multistage ARO Based on RKDE
Multistage Adaptive Robust Optimization
$$
\min_{x} \; c^T x + \max_{u_1 \in U_1} \min_{y_1 \in \Omega_1(x, y_0, u_1)} d_1^T y_1 + \cdots + \max_{u_T \in U_T} \min_{y_T \in \Omega_T(x, y_{T-1}, u_T)} d_T^T y_T
$$
Model formulation for multistage decision making:
• Stage 0: “here-and-now” decision
• Stage 1: uncertainty realization, then recourse decision
• …
• Stage T: uncertainty realization, then recourse decision
Data-Driven Uncertainty Model
Application: Batch Scheduling
Ning, C., & You, F. (2017). A Data-Driven Multistage Adaptive Robust Optimization Framework for Planning and Scheduling under Uncertainty. AIChE Journal, 63, 4343–4369.
Deep Learning Based Stochastic Program w/ GAN
Zhao, S. & You, F. (2019). Deep Learning based Stochastic Chance Constrained Programming with Generative Adversarial Network.
DDARO for Steam and Energy Systems
Zhao, L., Ning, C. & You, F. (2018). Operational Optimization of Industrial Steam Systems under Uncertainty Using Data-Driven Adaptive Robust Optimization. AIChE Journal. doi:10.1002/aic.16500
[Figure: process flowsheet for the steam system application (equipment and stream tags omitted)]
DDARO for Electric Power Systems Operations
Ning, C. & You, F. (2019). Data-Driven Adaptive Robust Unit Commitment under Wind Power Uncertainty: A Bayesian Nonparametric Approach. IEEE Trans. on Power Systems. 34, 2409-2418.
Data-Driven RO for Stochastic MPC
Chance-Constrained Stochastic MPC
• “Soft” state constraints
• Distribution of w is typically unknown…
Data-Driven Robust Optimization Approach to Stochastic MPC
• Uncertainty training data → SVC → compact data-driven set
• Uncertainty calibration data → calibration → calibrated set
Shang, C. & You, F. (2019). A data-driven robust optimization approach to stochastic model predictive control, Journal of Process Control. 75, 24-39.
Applications to Building Energy Control (Toboggan Lodge @ Cornell)
• Reduced Energy Consumption
• Modest Constraint Violations
Active Online Learning of Temp. Forecast Error
Building Dynamics Based on Thermal Balance
Learning-based Robust MPC for Irrigation
• Min: Water consumption
• Moisture level constraints to ensure crop yield and avoid devastation
Data → SVC → Active Uncertainty Learning & Online Data Analytics → Optimal Irrigation Control
Shang, C., Chen, W. H., Stroock, A. D., & You, F. (2019). Robust model predictive control of irrigation systems with active uncertainty learning and data analytics. IEEE Trans. on Control Systems Technology. DOI: 10.1109/TCST.2019.2916753
Uncertain forecast errors (evapotranspiration forecast error, precipitation forecast error) → data-driven uncertainty set → in-depth online data analytics → conditional uncertainty set
Optimal Control Profile of Soil Moisture using Real Weather Data
• Holistic learning-based stochastic control integrating mechanistic models & data-driven uncertainty learning
• Zero (0) probability of soil moisture deficiency
• Minimum water consumption (> 40% saving of water)
• Good computational efficiency
• Data as valuable assets for optimal control decisions
• New integrated (optimal) decision-making frameworks
• Integrating data-driven and model-based systems into a cohesive whole
• Leveraging big data analytics for optimization under uncertainty
• Bringing theory & methods in both fields to the next level
• Data-driven scientific discovery of new & powerful RO paradigms
• Flexible and powerful uncertainty sets that were previously intractable
• Modeling & algorithmic frameworks for big data driven optimization
• Contribute to new mathematical programming theory and algorithms
• Empowering machine learning from data analytics to integrated decision support (from data to information, and to optimal decisions)
• Applications: Manufacturing, sustainability, energy systems, agriculture, …
The “Meeting” of ML and RO Leads to …
Comp. & Chem. Eng. 125 (2019) 434-448