
  • Sparse Polynomial Chaos Surrogate for ACME Land Model via Iterative Bayesian Compressive Sensing

    K. Sargsyan¹, D. Ricciuto², P. Thornton²
    C. Safta¹, B. Debusschere¹, H. Najm¹

    ¹ Sandia National Laboratories, Livermore, CA
    ² Oak Ridge National Laboratory, Oak Ridge, TN

    Sponsored by DOE, Biological and Environmental Research, under Accelerated Climate Modeling for Energy (ACME).

    Sandia National Laboratories is a multi-program laboratory operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.


  • OUTLINE

    • Surrogates needed for complex models

    • Polynomial Chaos (PC) surrogates do well with uncertain inputs

    • Bayesian regression provides results with an uncertainty certificate

    • Compressive sensing ideas deal with high-dimensionality


  • Application of Interest: ACME Land Model

    http://www.cesm.ucar.edu/models/clm/

    A single-site, 1000-yr simulation takes ∼10 hrs on 1 CPU
    Involves ∼70 input parameters; some dependent
    Non-smooth input-output relationship


  • Surrogate construction: scope and challenges

    Construct surrogate for a complex model f (λ) to enable

    • Global sensitivity analysis
    • Optimization
    • Forward uncertainty propagation
    • Input parameter calibration
    • · · ·

    • Computationally expensive model simulations, data sparsity
      • Need to build accurate surrogates with as few training runs as possible

    • High-dimensional input space
      • Too many samples needed to cover the space
      • Too many terms in the polynomial expansion


  • Polynomial Chaos surrogate for f (λ)

    • Scale the input parameters λi ∈ [ai, bi]:

      \lambda_i = \frac{a_i + b_i}{2} + \frac{b_i - a_i}{2}\, x_i

    • Forward function f(·), output u:

      u(x) = f(\lambda(x)) \approx \sum_{k=0}^{K-1} c_k \Psi_k(x) \equiv g(x)

    • Global sensitivity information for free: Sobol indices, variance-based decomposition.

    • Bayesian inference useful for finding the c_k:

      P(c|u) \propto P(u|c)\, P(c)

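    A minimal sketch in Python of the construction above: scale a physical parameter to the standard interval, then fit and evaluate a Legendre PC expansion. Ordinary least squares stands in here for the Bayesian inference of the c_k (made Bayesian on the following slides), and the interval [0, 10] and the toy model cos(λ) are illustrative assumptions, not the ALM setup.

    import numpy as np
    from numpy.polynomial import legendre as L

    def scale_to_standard(lam, a, b):
        """Map lambda in [a, b] to x in [-1, 1] (inverting the scaling above)."""
        return (2.0 * lam - (a + b)) / (b - a)

    # Training runs of a toy "model" f(lambda) = cos(lambda) on [0, 10]
    a, b = 0.0, 10.0
    rng = np.random.default_rng(0)
    lam_train = rng.uniform(a, b, 50)
    u_train = np.cos(lam_train)
    x_train = scale_to_standard(lam_train, a, b)

    K = 12                                            # number of PC terms
    P = L.legvander(x_train, K - 1)                   # P[n, k] = Psi_k(x_n)
    c, *_ = np.linalg.lstsq(P, u_train, rcond=None)   # least-squares c_k

    # Evaluate the surrogate g(x) against the true model at a new point
    x_new = scale_to_standard(3.0, a, b)
    print(L.legval(x_new, c), np.cos(3.0))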

  • Bayesian inference of PC surrogate: high-d, low-data regime

    y = u(x) \approx \sum_{k=0}^{K-1} c_k \Psi_k(x),
    \qquad \Psi_k(x_1, x_2, \ldots, x_d) = \psi_{k_1}(x_1)\, \psi_{k_2}(x_2) \cdots \psi_{k_d}(x_d)

    • Issues:
      • how to properly choose the basis set?
        [Figure: candidate basis multi-indices on a 2-d grid, orders 0-10 in Dim 1 and Dim 2.]
      • need to work in the underdetermined regime N < K: fewer data than bases (d.o.f.)

    • Discover the underlying low-d structure in the model
      • get help from the machine learning community



  • In a different language....

    • N training data points (x_n, u_n) and K basis terms Ψ_k(·)
    • Projection matrix P_{N×K} with P_{nk} = Ψ_k(x_n)
    • Find regression weights c = (c_0, . . . , c_{K−1}) so that

      u \approx P c    or    u_n \approx \sum_k c_k \Psi_k(x_n)

    • The number of polynomial basis terms grows fast: a p-th order, d-dimensional basis has a total of K = (p + d)!/(p! d!) terms. For example, d = 68 and p = 3 already give K = 57,155 terms.
    • For limited data and a large basis set (N < K) this is a sparse signal recovery problem ⇒ need some regularization/constraints.

    • Least squares:           argmin_c { ||u − Pc||_2 }
    • The ‘sparsest’:          argmin_c { ||u − Pc||_2 + α ||c||_0 }
    • Compressive sensing:     argmin_c { ||u − Pc||_2 + α ||c||_1 }

    In Bayesian terms, the misfit ||u − Pc||_2 corresponds to the likelihood and the sparsity penalty α||c||_1 to the prior.

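    A minimal sketch of the compressive-sensing option above, solved with ISTA (iterative soft thresholding) for the ℓ1-penalized objective. This is a generic illustration of sparse recovery in the N < K regime, not the Bayesian CS solver used in this work; the sizes, the synthetic matrix P, the planted sparse c, and α are assumptions.

    import numpy as np

    def ista(P, u, alpha, n_iter=5000):
        """argmin_c 0.5*||u - P c||_2^2 + alpha*||c||_1 via proximal gradient."""
        step = 1.0 / np.linalg.norm(P, 2) ** 2       # 1/L, L = largest eigenvalue of P^T P
        c = np.zeros(P.shape[1])
        for _ in range(n_iter):
            z = c - step * (P.T @ (P @ c - u))       # gradient step on the misfit
            c = np.sign(z) * np.maximum(np.abs(z) - step * alpha, 0.0)   # soft threshold
        return c

    rng = np.random.default_rng(1)
    N, K = 40, 200                                   # underdetermined: N < K
    P = rng.standard_normal((N, K))
    c_true = np.zeros(K)
    c_true[[3, 17, 90]] = [1.0, -2.0, 0.5]           # sparse coefficient vector
    u = P @ c_true
    c_hat = ista(P, u, alpha=0.1)
    print(np.flatnonzero(np.abs(c_hat) > 0.1))       # typically recovers {3, 17, 90}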

  • BCS removes unnecessary basis terms

    Left: f(x, y) = cos(x + 4y).    Right: f(x, y) = cos(x² + 4y).

    [Figure: two panels of log-magnitude spectral coefficients on an Order (dim 1) × Order (dim 2) grid, orders 0-10, color scale −18 to −2. The square (i, j) represents the (log) spectral coefficient for the basis term ψi(x)ψj(y).]


  • Iterative Bayesian Compressive Sensing (iBCS)

    Iterative BCS: we implement an iterative procedure that allows increasing the order for the relevant basis terms while maintaining the dimensionality reduction [Sargsyan et al. 2014], [Jakeman et al. 2015]. Combine basis growth and reweighting!

    [Diagram: Model data and an Initial Basis feed a loop of Weighted BCS → Sparse Basis → Basis Growth → New Basis → Reweighting; the iterations terminate in the Final Basis.]

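    A schematic sketch of the reweight-and-grow loop in the diagram. The weighted ℓ1 solve (following the reweighting idea of Candès, Wakin & Boyd, cited in the Literature) is a stand-in for the weighted BCS step, and grow_basis implements only a simple neighbor-adding rule, not the exact growth rule of [Sargsyan et al. 2014].

    import numpy as np

    def solve_weighted_l1(P, u, w, alpha=0.1, n_iter=3000):
        """argmin_c 0.5*||u - P c||^2 + alpha * sum_k w_k |c_k| (proximal gradient)."""
        step = 1.0 / np.linalg.norm(P, 2) ** 2
        c = np.zeros(P.shape[1])
        for _ in range(n_iter):
            z = c - step * (P.T @ (P @ c - u))
            c = np.sign(z) * np.maximum(np.abs(z) - step * alpha * w, 0.0)
        return c

    def grow_basis(mindex):
        """Add admissible neighbors: bump each surviving multi-index by one
        order in each dimension (a simple stand-in for the growth rule)."""
        out = {tuple(m) for m in mindex}
        for m in mindex:
            for i in range(len(m)):
                mm = list(m)
                mm[i] += 1
                out.add(tuple(mm))
        return sorted(out)

    def ibcs_like(P, u, n_outer=5, eps=1e-3):
        """Outer loop of the diagram: weighted sparse solve, then reweight."""
        w = np.ones(P.shape[1])                # first pass: unweighted
        for _ in range(n_outer):
            c = solve_weighted_l1(P, u, w)
            w = 1.0 / (np.abs(c) + eps)        # small coefficients penalized more
            # a full iBCS step would also call grow_basis on the surviving
            # multi-indices here and rebuild P before the next pass
        return c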

  • Basis set growth: simple anisotropic function

    [Figure: retained basis terms for a simple anisotropic function.]

  • Basis set growth: ... added outlier term

    [Figure: retained basis terms after adding an outlier term.]

  • Application of Interest: ACME Land Model

    http://www.cesm.ucar.edu/models/clm/

    A single-site, 1000-yr simulation takes ∼10 hrs on 1 CPU
    Involves ∼70 input parameters; some dependent
    Non-smooth input-output relationship


  • FLUXNET experiment

    [World map: locations of the FLUXNET sites.]

    96 FLUXNET sites covering major biomes and plant functional types

    Varying 68 input parameters over given ranges; 5 steady-state outputs

    Ensemble of 3000 runs on Titan, DOE Leadership Computing Facility at Oak Ridge National Lab


  • Sparse PC surrogate and uncertainty decomposition for the ACME Land Model

    Main effect sensitivities: rank input parameters
    Joint sensitivities: most influential input couplings
    About 200 polynomial basis terms in the 68-dimensional space
    Sparse PC will further be used for
    • sampling in a reduced space
    • parameter calibration against experimental data

    [Pie charts, Site #31: main-effect sensitivities of GPP (leading parameters bp, br_mr, froot_leaf, mp, frootcn, q10_mr, slatop) and TLAI (bp, slatop, froot_leaf, br_mr, fr_fcel, flivewd).]


    [Pie charts, Site #31: main-effect sensitivities of TOTVEGC (leading parameters bp, br_mr, mp, froot_leaf, frootcn, slatop) and TOTSOMC (br_mr, cn_s3, bp, q10_mr, cn_s2, froot_leaf, slatop).]


    [Matrix plot: sensitivities of the five outputs (GPP, TLAI, TOTVEGC, EFLX_LH_TOT, TOTSOMC) with respect to the leading parameters (MP, BP, ROOTB_PAR, SLATOP, FLNR, FROOTCN, FROOT_LEAF, LEAF_LONG, BDNR, BR_MR, Q10_MR, CN_S2, CN_S3, RF_S2S3, RF_S3S4, R_MORT); color scale 10⁻³ to 10⁻².]


    [Bar chart: main-effect sensitivities of GPP (gross primary productivity) across the 96 sites; leading parameters include soilpsi_off, mp, bp, slatop, crit_onset_swi, leafcn, flnr, frootcn, froot_leaf, fstor2tran, bdnr, br_mr, q10_mr.]

    [Bar chart: main-effect sensitivities of TLAI (leaf area index) across the 96 sites; leading parameters include soilpsi_off, bp, slatop, leafcn, flnr, frootcn, froot_leaf, leaf_long, bdnr, br_mr.]

    [Bar chart: main-effect sensitivities of TOTVEGC (vegetation carbon) across the 96 sites; leading parameters include mp, bp, leafcn, flnr, frootcn, froot_leaf, r_mort, fstor2tran, bdnr, br_mr.]

    [Bar chart: main-effect sensitivities of EFLX_LH_TOT (latent heat flux) across the 96 sites; leading parameters include soilpsi_off, cn_s3, rootb_par, slatop, leafcn, flnr, frootcn, froot_leaf, br_mr, q10_mr.]

    [Bar chart: main-effect sensitivities of TOTSOMC (soil organic matter carbon) across the 96 sites; leading parameters include cn_s3, slatop, leafcn, flnr, bp, frootcn, froot_leaf, fstor2tran, br_mr, q10_mr.]

  • Summary

    Surrogate models are necessary for complex models
    • Replace the full model for both forward and inverse UQ

    Uncertain inputs
    • Polynomial Chaos surrogates are well-suited

    Limited training dataset
    • Bayesian methods handle limited information well

    Curse of dimensionality
    • The hope is that not too many dimensions matter
    • Compressive sensing (CS) ideas ported from machine learning
    • We implemented an iteratively reweighted Bayesian CS algorithm that reduces dimensionality and increases order on the fly

    Open issues
    • Computational design: what is the best sampling strategy?
    • Overfitting is still present; cross-validation techniques help

    Software: employed the SNL-CA UQ library UQTk (www.sandia.gov/uqtoolkit)


  • Literature

    • M. Tipping, “Sparse Bayesian learning and the relevance vector machine”, J. Machine Learning Research, 1, pp. 211-244, 2001.

    • S. Ji, Y. Xue and L. Carin, “Bayesian compressive sensing”, IEEE Trans. Signal Proc., 56:6, 2008.

    • S. Babacan, R. Molina and A. Katsaggelos, “Bayesian compressive sensing using Laplace priors”, IEEE Trans. Image Proc., 19:1, 2010.

    • E. J. Candès, M. Wakin and S. Boyd, “Enhancing sparsity by reweighted ℓ1 minimization”, J. Fourier Anal. Appl., 14, pp. 877-905, 2008.

    • A. Saltelli, “Making best use of model evaluations to compute sensitivity indices”, Comp. Phys. Comm., 145, 2002.

    • K. Sargsyan, C. Safta, H. Najm, B. Debusschere, D. Ricciuto and P. Thornton, “Dimensionality reduction for complex models via Bayesian compressive sensing”, Int. J. for Uncertainty Quantification, 4(1), pp. 63-93, 2014.

    • J. Jakeman, M. Eldred and K. Sargsyan, “Enhancing ℓ1-minimization estimates of polynomial chaos expansions using basis selection”, J. Comp. Phys., 289, pp. 18-34, 2015.

    Thank You

  • Random variables represented by Polynomial Chaos

    X \simeq \sum_{k=0}^{K-1} c_k \Psi_k(\eta)

    • η = (η_1, . . . , η_d) standard i.i.d. r.v.;
      Ψ_k standard polynomials, orthogonal w.r.t. π(η):

      \Psi_k(\eta_1, \eta_2, \ldots, \eta_d) = \psi_{k_1}(\eta_1)\, \psi_{k_2}(\eta_2) \cdots \psi_{k_d}(\eta_d)

    • Typical truncation rule: total order p, i.e. k_1 + k_2 + . . . + k_d ≤ p.
      Number of terms is K = (d+p)!/(d! p!).

    • Essentially, a parameterization of a r.v. by deterministic spectral modes c_k.

    • Most common standard polynomial-variable pairs:
      (continuous) Gauss-Hermite, Legendre-Uniform; (discrete) Poisson-Charlier.

    [Wiener, 1938; Ghanem & Spanos, 1991; Xiu & Karniadakis, 2002; Le Maître & Knio, 2010]
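    A small helper illustrating the total-order truncation rule: enumerate the multi-indices with k_1 + ... + k_d ≤ p and check the count against K = (d+p)!/(d!p!) = C(d+p, p).

    from itertools import product
    from math import comb

    def total_order_mindex(d, p):
        """All multi-indices (k_1, ..., k_d) with k_1 + ... + k_d <= p."""
        return [k for k in product(range(p + 1), repeat=d) if sum(k) <= p]

    # The count matches K = (d+p)!/(d!p!) = C(d+p, p)
    for d, p in [(2, 3), (5, 4)]:
        assert len(total_order_mindex(d, p)) == comb(d + p, p)
    print(len(total_order_mindex(2, 10)))   # 66 terms for d = 2, p = 10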

  • Bayesian inference of PC surrogate

    u \simeq \sum_{k=0}^{K-1} c_k \Psi_k(x) \equiv g_c(x)

    \underbrace{P(c|D)}_{\text{Posterior}} \propto \underbrace{P(D|c)}_{\text{Likelihood}} \, \underbrace{P(c)}_{\text{Prior}}

    • Data consists of training runs D ≡ {(x_i, u_i)}_{i=1}^{N}

    • Likelihood with a Gaussian noise model, with σ² fixed or inferred:

      L(c) = P(D|c) = \left( \frac{1}{\sigma \sqrt{2\pi}} \right)^{N} \prod_{i=1}^{N} \exp\left( -\frac{(u_i - g_c(x_i))^2}{2\sigma^2} \right)

    • Prior on c is chosen to be conjugate: uniform or Gaussian.

    • Posterior is a multivariate normal, c ∼ MVN(µ, Σ).

    • The (uncertain) surrogate is then a Gaussian process:

      \sum_{k=0}^{K-1} c_k \Psi_k(x) = \Psi(x)^T c \in \mathcal{GP}\left( \Psi(x)^T \mu,\; \Psi(x) \Sigma \Psi(x')^T \right)
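    A minimal sketch of the conjugate case above: Gaussian likelihood with fixed σ and a zero-mean Gaussian prior on c gives the closed-form MVN posterior. The prior scale τ is an illustrative assumption, not a value from this work.

    import numpy as np

    def pc_posterior(P, u, sigma=0.1, tau=10.0):
        """Posterior mean and covariance of c for u = P c + noise,
        noise ~ N(0, sigma^2 I), prior c ~ N(0, tau^2 I)."""
        K = P.shape[1]
        Sigma = np.linalg.inv(P.T @ P / sigma**2 + np.eye(K) / tau**2)
        mu = Sigma @ P.T @ u / sigma**2
        return mu, Sigma

    # At new points with basis rows Psi (shape M x K), the surrogate is the
    # Gaussian process of the slide: mean Psi @ mu, covariance Psi @ Sigma @ Psi.T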


  • Sensitivity information comes free with PC surrogate, but not with piecewise PC

    g(x_1, \ldots, x_d) = \sum_{k=0}^{K-1} c_k \Psi_k(x)

    • Main effect sensitivity indices

      S_i = \frac{Var[E(g(x)|x_i)]}{Var[g(x)]} = \frac{\sum_{k \in I_i} c_k^2 \|\Psi_k\|^2}{\sum_{k>0} c_k^2 \|\Psi_k\|^2}

      where I_i is the set of bases with only x_i involved

    • Joint sensitivity indices

      S_{ij} = \frac{Var[E(g(x)|x_i, x_j)]}{Var[g(x)]} - S_i - S_j = \frac{\sum_{k \in I_{ij}} c_k^2 \|\Psi_k\|^2}{\sum_{k>0} c_k^2 \|\Psi_k\|^2}

      where I_{ij} is the set of bases with only x_i and x_j involved

    • For piecewise PC, need to resort to Monte Carlo estimation [Saltelli, 2002].
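    A sketch of how the main-effect indices fall out of the PC coefficients, assuming an orthonormal basis so that ||Ψ_k|| = 1 and the norms drop out; mindex holds the multi-indices (k_1, ..., k_d) of the retained terms.

    import numpy as np

    def main_effect_sobol(c, mindex):
        """S_i: fraction of the variance carried by bases involving only x_i."""
        c = np.asarray(c, float)
        mindex = np.asarray(mindex, int)
        var_total = np.sum(c[mindex.sum(axis=1) > 0] ** 2)   # sum over k > 0
        S = np.zeros(mindex.shape[1])
        for i in range(mindex.shape[1]):
            only_i = (mindex[:, i] > 0) & (np.delete(mindex, i, axis=1).sum(axis=1) == 0)
            S[i] = np.sum(c[only_i] ** 2) / var_total
        return S

    # Example: basis {1, x1, x2, x1*x2} with coefficients (0.5, 2, 1, 1);
    # total variance 6, so S_1 = 4/6 and S_2 = 1/6 (the x1*x2 term is joint)
    print(main_effect_sobol([0.5, 2.0, 1.0, 1.0], [(0, 0), (1, 0), (0, 1), (1, 1)]))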

  • Basis normalization helps the success rate

    [Plot: success rate (0 to 1) vs. number of measurements (0 to 45) for the LU basis and the LU normalized basis; normalization improves the success rate.]

  • Input correlations: Rosenblatt transformation

    • Rosenblatt transformation maps any (not necessarily independent) set of random variables λ = (λ_1, . . . , λ_d) to uniform i.i.d.’s {x_i}_{i=1}^{d} [Rosenblatt, 1952].

      x_1 = F_1(\lambda_1)
      x_2 = F_{2|1}(\lambda_2|\lambda_1)
      x_3 = F_{3|2,1}(\lambda_3|\lambda_2, \lambda_1)
      \vdots
      x_d = F_{d|d-1,\ldots,1}(\lambda_d|\lambda_{d-1}, \ldots, \lambda_1)

      [Figure: 3-d scatter of dependent input samples in (Labile, Cellulose, Lignin) space.]

    • Inverse Rosenblatt transformation λ = R^{-1}(x) ensures a well-defined input PC construction

      \lambda_i = \sum_{k=0}^{K-1} \lambda_{ik} \Psi_k(x)

    • Caveat: the conditional distributions are often hard to evaluate accurately.
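    A sketch of the transformation for a correlated bivariate Gaussian, one of the few cases where the conditional CDFs are available in closed form; the correlation ρ and the toy inputs are illustrative assumptions.

    import numpy as np
    from scipy import stats

    rho = 0.7   # assumed correlation of the toy inputs

    def rosenblatt_gauss2d(lam1, lam2):
        """Map correlated standard Gaussians (lam1, lam2) to i.i.d. uniforms."""
        x1 = stats.norm.cdf(lam1)                    # x1 = F1(lam1)
        cond_mean = rho * lam1                       # lam2 | lam1 is Gaussian
        cond_std = np.sqrt(1.0 - rho**2)
        x2 = stats.norm.cdf(lam2, loc=cond_mean, scale=cond_std)   # x2 = F2|1(lam2|lam1)
        return x1, x2

    # Sanity check: the transformed samples should be uncorrelated uniforms
    rng = np.random.default_rng(0)
    z = rng.standard_normal((10000, 2))
    lam1 = z[:, 0]
    lam2 = rho * z[:, 0] + np.sqrt(1.0 - rho**2) * z[:, 1]
    x1, x2 = rosenblatt_gauss2d(lam1, lam2)
    print(np.corrcoef(x1, x2)[0, 1])                 # close to 0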

  • Strong discontinuities/nonlinearities challenge global polynomial expansions

    • Basis enrichment [Ghosh & Ghanem, 2005]

    • Stochastic domain decomposition
      • Wiener-Haar expansions, multiblock expansions, multiwavelets [Le Maître et al., 2004, 2007]
      • also known as multielement PC [Wan & Karniadakis, 2009]
      • Smart splitting, discontinuity detection [Archibald et al., 2009; Chantrasmi, 2011; Sargsyan et al., 2011; Jakeman et al., 2012]

    • Data domain decomposition
      • Mixture PC expansions [Sargsyan et al., 2010]
      • Data clustering, classification
      • Piecewise PC expansions

  • Piecewise PC expansion with classification

    • Cluster the training dataset into non-overlapping subsets D_1 and D_2, on which the behavior of the function is smoother
    • Construct global PC expansions g_i(x) = \sum_k c_{ik} \Psi_k(x) using each dataset individually (i = 1, 2)
    • Declare a surrogate

      g_s(x) = g_1(x) if x ∈* D_1,   g_2(x) if x ∈* D_2

      * Requires a classification step to find out which cluster x belongs to. We applied Random Decision Forests (RDF).

    • Caveat: the sensitivity information is harder to obtain.
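    A minimal sketch of the cluster-fit-classify recipe above on a 1-d discontinuous toy function. sklearn's RandomForestClassifier stands in for the RDF classifier (the slides do not specify the implementation), and a cubic polynomial fit per cluster stands in for each local PC expansion; the toy model and cluster labels are assumptions.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Toy discontinuous model on [-1, 1]; clusters D1 (x < 0) and D2 (x >= 0)
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, (400, 1))
    u = np.where(x[:, 0] < 0, np.cos(4 * x[:, 0]), 2.0 + x[:, 0])
    labels = (x[:, 0] >= 0).astype(int)              # cluster assignment (given here)

    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(x, labels)
    fits = [np.polynomial.Polynomial.fit(x[labels == i, 0], u[labels == i], 3)
            for i in (0, 1)]                         # one local fit per cluster

    def g_s(x_new):
        """Evaluate the local expansion selected by the classifier."""
        lab = clf.predict(x_new.reshape(-1, 1))
        return np.array([fits[k](xi) for k, xi in zip(lab, x_new)])

    print(g_s(np.array([-0.5, 0.5])))                # approx cos(-2) and 2.5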
