Concepts in Global Sensitivity Analysis
IMA UQ Short Course, June 23, 2015
A good reference is Global Sensitivity Analysis: The Primer, Saltelli et al. (2008).
WARNING: These slides are meant to complement the oral presentation in the short course. Use out of context at your own risk.
Paul Constantine Colorado School of Mines inside.mines.edu/~pconstan activesubspaces.org @DrPaulynomial
http://www.sfu.ca/~ssurjano/index.html
Von Neumann, John, and Herman H. Goldstine. "Numerical inverting of matrices of high order." Bulletin of the American Mathematical Society 53.11 (1947): 1021-1099.
• What kinds of science/engineering models do you care about?
• Do you have a simulation that you trust? What are the inputs and outputs?
• How would you characterize the uncertainty in the inputs? In other words, what do you know about the unknown inputs?
• What question are you trying to answer with your model?
f(x)
x
• Finite-dimensional vector
• Independent components
• Centered and scaled to remove units
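Those three bullets can be implemented with a one-line affine map. A minimal NumPy sketch; the bounds `lb`, `ub`, and the sample point are made-up illustrations:

```python
import numpy as np

# Map each input from its physical range [lb, ub] to [-1, 1],
# removing units so the components are comparable. Bounds are illustrative.
lb = np.array([0.5, 100.0])    # hypothetical lower bounds
ub = np.array([1.5, 300.0])    # hypothetical upper bounds

def normalize(x):
    return 2.0 * (x - lb) / (ub - lb) - 1.0

x = np.array([1.0, 200.0])     # a point in physical units
x_scaled = normalize(x)        # the midpoint of each range maps to 0
```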
[Figure: response vs. time over [0, 2], comparing Perturbation 1 to the Baseline]
[Figure: response vs. time over [0, 2], comparing Perturbation 2 to the Baseline]

Difference metrics relative to the baseline:

                          Perturbation 1   Perturbation 2
2-norm difference         20.5             31.6
infinity-norm difference  2.0              1.8
difference at final time  0.0              0.0
[Figure: baseline response vs. time over [0, 2]]

Which perturbation shows the largest change?
The output f is:
• Scalar-valued
• “Smooth”
• No “noise”!
Sensitivity analysis seeks to identify the most important parameters.
• What are the most important parameters in your model?
• What are the least important parameters?
• What does it mean for a parameter to be important?
∂f/∂x_i (x)

Derivatives measure local sensitivity. But we want something global.
Some Global Sensitivity Metrics 1. Morris’ elementary effects 2. Sobol sensitivity indices 3. Mean (squared) derivatives 4. Active subspaces
Morris’ Elementary Effects (Like bad approximations to average derivatives)
Elementary effect:

EE_ij(h) = [ f(x_j + h e_i) − f(x_j) ] / h

Step size on a p-level grid:

h ∈ { 2n/(p−1) : n = 1, …, p−1 }

μ_i(h) = (1/N) Σ_{j=1}^N EE_ij(h)

μ*_i(h) = (1/N) Σ_{j=1}^N |EE_ij(h)|
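The formulas above can be sketched in a few lines. This is a minimal NumPy illustration, not Morris' full winding-trajectory design; the model `f`, the settings (N, h, domain [0,1]^m), and the sampling of base points are made up, and the linear model makes every elementary effect exact:

```python
import numpy as np

# Sketch of Morris' elementary effects:
#   EE_ij(h) = (f(x_j + h*e_i) - f(x_j)) / h, averaged over N base points x_j.

def f(x):
    # toy linear model: x1 matters 5x more than x2
    return 5.0 * x[0] + 1.0 * x[1]

def morris_mu_star(f, m, N, h, rng):
    mu_star = np.zeros(m)
    for _ in range(N):
        x = rng.uniform(0.0, 1.0 - h, size=m)   # keep x + h*e_i inside [0,1]^m
        fx = f(x)
        for i in range(m):
            xp = x.copy()
            xp[i] += h
            mu_star[i] += abs((f(xp) - fx) / h)  # |EE_ij(h)|
    return mu_star / N

rng = np.random.default_rng(0)
mu_star = morris_mu_star(f, m=2, N=50, h=0.25, rng=rng)
# For this linear f the elementary effects are exact: mu*_1 = 5, mu*_2 = 1.
```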
Sensitivity indices
Variance-based decompositions
f(x) = f_0                                    (constant)
     + Σ_{i=1}^m f_i(x_i)                     (functions of one variable)
     + Σ_{i=1}^m Σ_{j>i} f_{i,j}(x_i, x_j)    (functions of two variables)
     + ⋯                                      (functions of 3, 4, … variables)
     + f_{1,…,m}(x_1, …, x_m)                 (function of all m variables)
f_0 = E[f]

f_i = E[f | x_i] − f_0

f_{i,j} = E[f | x_i, x_j] − f_i − f_j − f_0

⋮

f_{1,…,m} = f(x) − “everything else”

The components are orthogonal functions, so the variance decomposes:

Var[f] = Σ_i Var[f_i] + Σ_{i,j} Var[f_{i,j}] + ⋯ + Var[f_{1,…,m}]
Sobol indices

First-order sensitivity index:

S_i = Var[f_i] / Var[f]

Interaction effects:

S_{i1,…,ik} = Var[f_{i1,…,ik}] / Var[f]

Total effect (e.g., sum everything with a “1”):

S_T1 = S_1 + S_{1,2} + S_{1,3} + S_{1,2,3}
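A common way to estimate first-order indices is Monte Carlo with the Saltelli pick-freeze scheme. A minimal sketch; the toy model, sample size, and input density are illustrative choices, picked so the true indices are S_1 = 0.2 and S_2 = 0.8:

```python
import numpy as np

# Saltelli "pick-freeze" estimate of first-order Sobol indices.

def f(x):
    # toy model: for x_i ~ U(-1,1), Var[f] = 1/3 + 4/3, so S_1 = 0.2, S_2 = 0.8
    return x[..., 0] + 2.0 * x[..., 1]

rng = np.random.default_rng(0)
N, m = 200_000, 2
A = rng.uniform(-1, 1, size=(N, m))
B = rng.uniform(-1, 1, size=(N, m))
fA, fB = f(A), f(B)
var = np.var(np.concatenate([fA, fB]))

S = np.zeros(m)
for i in range(m):
    ABi = A.copy()
    ABi[:, i] = B[:, i]                  # freeze all inputs except x_i
    S[i] = np.mean(fB * (f(ABi) - fA)) / var
```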
Mean (squared) derivatives

E[ ∂f/∂x_i ]    vs.    E[ (∂f/∂x_i)² ]

Kucherenko et al., DGSM, RESS (2008)
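When gradients are unavailable, these mean squared derivatives can be estimated by finite differences at Monte Carlo samples. A minimal sketch; the model, sampling density, and step size are illustrative:

```python
import numpy as np

# Estimate nu_i = E[(df/dx_i)^2] by central finite differences
# at Monte Carlo samples from an assumed uniform density on [-1,1]^2.

def f(x):
    return np.exp(0.7 * x[..., 0] + 0.3 * x[..., 1])

rng = np.random.default_rng(0)
N, m, h = 10_000, 2, 1e-5
X = rng.uniform(-1, 1, size=(N, m))

nu = np.zeros(m)
for i in range(m):
    Xp, Xm = X.copy(), X.copy()
    Xp[:, i] += h
    Xm[:, i] -= h
    nu[i] = np.mean(((f(Xp) - f(Xm)) / (2 * h)) ** 2)

# For this f, df/dx_1 = 0.7 f and df/dx_2 = 0.3 f, so nu_1/nu_2 = (0.7/0.3)^2.
```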
Let’s play!
Think of an interesting bivariate function.
Estimating with Monte Carlo is noisy.
[Figure: Monte Carlo error (1e-5 to 1e0) vs. number of samples (1e2 to 1e6)]
What is it good for?
• Sensitivity metrics can be hard to interpret if not zero.
• May provide or confirm understanding.
• Lots of ideas for using them as weights for anisotropic approximation schemes.
• Would like to use them to reduce the dimension.
AUDIENCE POLL
How many dimensions is “high” dimensions?
APPROXIMATION: f(x) ≈ f̃(x)

INTEGRATION: ∫ f(x) ρ dx

OPTIMIZATION: minimize_x f(x)
A grid-based study with 10 points per dimension at 1 second per evaluation:

Dimension   Points      Time
1           10          10 s
2           100         ~1.6 min
3           1,000       ~16 min
4           10,000      ~2.7 hours
5           100,000     ~1.1 days
6           1,000,000   ~1.6 weeks
…           …           …
20          1e20        3 trillion years (240x the age of the universe)

Ways out: “reduced order models,” “better designs,” “dimension reduction.”
f(x1, x2) = exp(0.7 x1 + 0.3 x2)

This function changes only along one direction, [0.7, 0.3]; the orthogonal direction is flat.
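For this function the gradient is always parallel to [0.7, 0.3], so the averaged outer product of gradients has exactly one nonzero eigenvalue. A quick NumPy check, with an illustrative uniform sampling density:

```python
import numpy as np

# For f(x1, x2) = exp(0.7*x1 + 0.3*x2), grad f = f(x) * [0.7, 0.3],
# so C = E[grad f grad f^T] is rank one: a one-dimensional active subspace.

rng = np.random.default_rng(0)
a = np.array([0.7, 0.3])
X = rng.uniform(-1, 1, size=(10_000, 2))
G = np.exp(X @ a)[:, None] * a     # exact gradients at the samples
C = G.T @ G / len(X)               # Monte Carlo estimate of C
lam, W = np.linalg.eigh(C)         # eigenvalues in ascending order
# lam[0] is numerically zero; W[:, 1] is the direction of change.
```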
bookstore.siam.org/sl02/ ($27 with coupon code BKSL15)
DEFINE the active subspace.

Consider a function and its gradient vector,

f = f(x),  x ∈ R^m,  ∇f(x) ∈ R^m,  ρ : R^m → R_+

The average outer product of the gradient and its eigendecomposition,

C = ∫ (∇f)(∇f)^T ρ dx = W Λ W^T

Partition the eigendecomposition,

Λ = diag(Λ_1, Λ_2),  W = [W_1  W_2],  W_1 ∈ R^{m×n}

Rotate and separate the coordinates,

x = W W^T x = W_1 W_1^T x + W_2 W_2^T x = W_1 y + W_2 z

y: active variables,  z: inactive variables

Compare with the covariance matrix: ∫ x x^T ρ dx  vs.  ∫ (∇f)(∇f)^T ρ dx
The eigenvectors indicate perturbations that change the function more, on average.
LEMMA 1:

λ_i = ∫ ( (∇f)^T w_i )² ρ dx,  i = 1, …, m

LEMMA 2:

∫ (∇_y f)^T (∇_y f) ρ dx = λ_1 + ⋯ + λ_n

∫ (∇_z f)^T (∇_z f) ρ dx = λ_{n+1} + ⋯ + λ_m
DISCOVER the active subspace with random sampling.

Draw samples: x_j ∼ ρ

Compute: f_j = f(x_j) and ∇f_j = ∇f(x_j)

Approximate with Monte Carlo:

C ≈ (1/N) Σ_{j=1}^N ∇f_j ∇f_j^T = Ŵ Λ̂ Ŵ^T

Equivalent to the SVD of the samples of the gradient:

(1/√N) [∇f_1 ⋯ ∇f_N] = Ŵ √Λ̂ V̂^T

Low-rank approximation of the collection of gradients:

(1/√N) [∇f_1 ⋯ ∇f_N] ≈ Ŵ_1 √Λ̂_1 V̂_1^T

Called an active subspace method in T. Russi’s 2010 Ph.D. thesis, Uncertainty Quantification with Experimental Data in Complex System Models.
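The SVD equivalence can be checked numerically. A minimal sketch using analytic gradients of the illustrative model f(x) = exp(0.7 x1 + 0.3 x2), whose single active direction is [0.7, 0.3] normalized:

```python
import numpy as np

# Discover the active subspace from N sampled gradients via the SVD:
# left singular vectors estimate W; squared singular values estimate
# the eigenvalues of C.

rng = np.random.default_rng(0)
a = np.array([0.7, 0.3])
X = rng.uniform(-1, 1, size=(500, 2))
grads = np.exp(X @ a)[:, None] * a     # rows are gradient samples f(x)*a

U, s, Vt = np.linalg.svd(grads.T / np.sqrt(len(X)), full_matrices=False)
eigvals = s**2                          # estimated eigenvalues of C
W1 = U[:, 0]                            # estimated active direction
```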
Let’s be abundantly clear about the problem we are trying to solve.

✖ Low-rank approximation of the collection of gradients
✖ Low-dimensional linear approximation of the gradient: ∇f(x) ≈ W_1 a(x)
✔ Approximate a function of many variables by a function of a few linear combinations of the variables: f(x) ≈ g(W_1^T x)

How do you construct g?

What is the approximation error?

What is the effect of the approximate eigenvectors?
Define the conditional expectation:

g(y) = ∫ f(W_1 y + W_2 z) ρ(z|y) dz,   f(x) ≈ g(W_1^T x)

Define the Monte Carlo approximation:

ĝ(y) = (1/N) Σ_{i=1}^N f(W_1 y + W_2 z_i),   z_i ∼ ρ(z|y)
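The Monte Carlo approximation of g is easy to sketch when ρ is taken to be standard Gaussian, since then z given y is also standard normal. The two-variable model and the subspace below are illustrative:

```python
import numpy as np

# Monte Carlo estimate of the conditional average g(y) for the
# illustrative model f(x) = exp(0.7*x1 + 0.3*x2) under rho = N(0, I).

rng = np.random.default_rng(0)
a = np.array([0.7, 0.3])
W1 = a / np.linalg.norm(a)              # active direction
W2 = np.array([-W1[1], W1[0]])          # orthogonal (inactive) direction

def f(x):
    return np.exp(x @ a)

def g_hat(y, N=100):
    # average f over the inactive variable z at a fixed active variable y
    z = rng.standard_normal(N)
    X = np.outer(np.full(N, y), W1) + np.outer(z, W2)
    return f(X).mean()

# Here a is parallel to W1, so f is constant in z and g(y) = exp(||a|| * y):
# the conditional average reproduces f exactly along the active direction.
```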
EXPLOIT active subspaces for response surfaces with conditional averaging.

THEOREM (conditional average):

( ∫ ( f(x) − g(W_1^T x) )² ρ dx )^{1/2} ≤ C_P (λ_{n+1} + ⋯ + λ_m)^{1/2}

THEOREM (Monte Carlo approximation):

( ∫ ( f(x) − ĝ(W_1^T x) )² ρ dx )^{1/2} ≤ C_P (1 + N^{−1/2}) (λ_{n+1} + ⋯ + λ_m)^{1/2}

Define the subspace error:

ε = dist(W_1, Ŵ_1)

THEOREM (estimated subspace):

( ∫ ( f(x) − ĝ(Ŵ_1^T x) )² ρ dx )^{1/2} ≤ C_P ( ε (λ_1 + ⋯ + λ_n)^{1/2} + (λ_{n+1} + ⋯ + λ_m)^{1/2} )

Here C_P is a Poincaré-type constant, λ_1, …, λ_n are the eigenvalues for the active variables, and λ_{n+1}, …, λ_m are the eigenvalues for the inactive variables.
THE BIG IDEA
1. Choose points in the domain of g.
2. Estimate conditional averages at each point.
3. Construct the approximation in n < m dimensions.
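The three steps can be sketched end to end. A minimal illustration under an assumed standard Gaussian density and an illustrative exponential model; a polynomial fit in the single active variable plays the role of the cheap model in step 3:

```python
import numpy as np

# Steps: (1) choose points in the domain of g, (2) estimate conditional
# averages there, (3) fit a model in the n=1 active variable.

rng = np.random.default_rng(0)
a = np.array([0.7, 0.3])
W1 = a / np.linalg.norm(a)
W2 = np.array([-W1[1], W1[0]])

def f(x):
    return np.exp(x @ a)

# 1. points in the domain of g (the active variable y)
ys = np.linspace(-2.0, 2.0, 9)

# 2. conditional averages at each point (Monte Carlo over z given y)
g_vals = []
for y in ys:
    z = rng.standard_normal(50)
    X = np.outer(np.full(50, y), W1) + np.outer(z, W2)
    g_vals.append(f(X).mean())

# 3. a 1-D response surface: polynomial fit in y, evaluated at W1^T x
coeffs = np.polyfit(ys, np.log(g_vals), 1)   # log g is linear in y here

def surrogate(x):
    return np.exp(np.polyval(coeffs, x @ W1))
```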
There’s an active subspace in this parameterized PDE.

Two-d Poisson with 100-term Karhunen-Loeve coefficients:

−∇ · (a ∇u) = 1,  x ∈ D
u = 0,  x ∈ Γ_1
n · a ∇u = 0,  x ∈ Γ_2
DIMENSION REDUCTION: 100 to 1

[Figure: estimated eigenvalues (indices 1-6, ranging 1e-13 to 1e-6) with bootstrap intervals, and subspace distance (1e-2 to 1e0) vs. subspace dimension 1-6]
[Figure: quantity of interest (0 to 3e-3) plotted against the active variable (-3 to 3): the 100-dimensional input collapses to a one-dimensional trend]
Active subspaces can be sensitivity metrics.

[Figure: components of the first eigenvector (indices 1-100, values -0.4 to 1) for β = 0.01 (short correlation length) and β = 1 (long correlation length)]
Questions?
• How do the active subspaces relate to the coordinate-based sensitivity metrics?
• How does this relate to PCA/POD?
• How many gradient samples do I need?
• How new is all this?
Paul Constantine Colorado School of Mines
activesubspaces.org @DrPaulynomial