25
Design of computer experiments Fabrice Gamboa 28/02/2013 Fabrice Gamboa Bertrand Iooss

Fabrice Gamboa Bertrand Iooss - Institut de Mathématiques ... › ~gamboa › newwab › ...0.6 0.8 Example: p = 2 –N= 16 Baranquilla course 2013 – Design of computer experiments

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

  • Design of computer experiments

    Fabrice Gamboa

    28/02/2013

    Fabrice Gamboa Bertrand Iooss

  • Typical engineering practice : One-At-a-Time (OAT) design

    X2P2P3

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 2

    Main remarks :

    OAT brings some information, but potentially wrong

    Exploration is poor: Non monotonicity ? Discontinuity ? Interaction ?

    Leave large unexplored zones of the domain (curse of dimensionality)

    X1P1 P2

  • Model exploration goalGOAL : explore as best as possible the behaviour of the code

    Put some points in the whole input space in order to « maximize » the amount of information on the model output

    Contrary to an uncertainty propagation step, it depends on p

    Regular mesh with n levels N =n p simulations Ex: p =2, n =3N =9

    p = 10, n=3N = 59049

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 3

    To minimize N, needs to have some techniques ensuring good « coverage » of the input spaceSimple random sampling (Monte Carlo) does not ensure this

    0.0 0.2 0.4 0.6 0.8

    0.2

    0.4

    0.6

    0.8

    1.0

    V1

    V2

    0.2 0.4 0.6 0.8

    0.2

    0.4

    0.6

    0.8

    V1

    V2

    Monte Carlo Optimized design

    Ex: p = 2N = 10

    N = 59049

  • Objectives

    When the objectives is to discover what happens inside the model and whenno model computations have been realized, we want to respect the twofollowing constraints:

    • To spread the points over the input space in order to capture nonlinearities of the model output,

    • To ensure that this input space coverage is robust with respect todimension reduction.

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 4

    Therefore, we look some design which insures the « best coverage » of theinput space

    Main question:

    • How to define this « best » ?

  • Exploration in physical experimentationDesign of experiments develops strategies to define experiments in order to obtain the required information as efficie ntly as possible

    Designs for numericalexperiments

    Characteristics

    Deterministic experiments (no error), Full factorial design 23

    Designs for real experiments

    Estimate parameters of linear regressionwith a minimal number of points

    Examples :

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 5

    Deterministic experiments (no error),

    Large number of input variables,

    Large range of input variation domain,

    Multiple output variables,

    Strong interactions between inputs,

    High non linearity in the model

    space filling designs (uniformcoverage in the input space)

    parameter 1

    parameter 2

    parameter 3

    Full factorial design 23

    Fractional factorial design 23-1

  • Space filling designs

    Sparsity of the space of the input variables in high dimension

    The learning design choice is made in order to have an optimal coverage of the input domain

    The space filling designs are good candidates.

    0.6

    0.8

    1.0

    V2

    0.4

    0.6

    0.8

    V2

    Simple Simple RandomRandomSampleSample

    Space Space FillingFilling

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 6

    Example: Sobol sequence

    Two possible criteria:1. Distance criteria between the points: minimax, maximin, …2. Uniformity criteria of the design (discrepancy measures)

    0.0 0.2 0.4 0.6 0.8

    0.2

    0.4

    V1

    V2

    0.2 0.4 0.6 0.8

    0.2

    0.4

    V1

    V2RandomRandom

    SampleSample(SRS)(SRS)

    FillingFillingDesignDesign(SFD)(SFD)

  • • Minimax design DMI : Minimize the maximal distance between one point of the domain and one point of the design

    All points in [0,1]p are not too far from a design point

    ),(min),( re whe

    ),(max),(maxmin

    )0(

    )0(xxdDxd

    DxdDxd

    Dx

    MIxxD

    ∈=

    = [ Johnson et al. 1990 ][ Koehler & Owen 1996 ]

    Geometrical criteria (1/2)

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 7

    => One of the best design, but too expensive to find DMI

  • • p = 1 ; Xi = (2i-1)/(2N) ; φmM = 1 / 2N

    • p > 1 : sphere recovering

    Minimax design

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 8

    [ www.spacefillingdesigns.nl ]

  • Geometrical criteria (2/2)

    - Mindist distance: ( L2 norm for example)

    Maximin design ΞNMm : maximize minimal distance between two points of the design

    ),(min)( )2()1(, )2()1(

    xxdNxx

    N

    Ξ∈=Ξφ

    ),(min),(minmax )2()1(,

    )2()1(

    ,Mm

    )2()1()2()1(xxdxxd

    NNN xxxx Ξ∈Ξ∈Ξ=

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 9

    - …

  • • p = 1 ; Xi = (i-1)/(N-1) ; φmM = 1 / (N-1)

    • p > 1 : sphere packing

    Maximin design

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 10[ www.spacefillingdesigns.nl ] [ www.packomania.com ]

  • Space filling measure of a design: the discrepancyMeasure of the maximal deviation between the distribution of the sample’s points to an uniform distribution

    ⇒ Measure of deviation from the uniformity

    Geometrical interpretation:Comparison between the volume of intervals and the number points within these intervals

    ×××=∈ pp ttttQtQ 21 [,0[[,0[[,0[)(,[1,0[)( K

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 11

    Lower the discrepancy is, the more the points of

    the design D fill the all space

    ∏=∈

    −=

    ×××=∈p

    ii

    tQ

    tQ

    p

    tN

    ND

    ttttQtQ

    p1

    )(

    [1,0[)(

    21

    sup)(disc

    [,0[[,0[[,0[)(,[1,0[)( K

  • Link with the integration problem

    ( )

    ( ) ( )

    =

    =

    =

    =∑

    N

    x

    xfN

    I

    dxxfI

    pNi

    i

    N

    i

    iN

    p

    1)(Var

    [1,0[in points random of sequence a with

    )(1

    : Carlo Monte

    )(

    ...1)(

    1

    )(MC

    [1,0[

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 12

    ( ) ( )

    =⇒==ΕN

    ON

    NIII NN

    1)(VarVar; MCMC ε

    General property (Koksma-Hlawka inequality):

    With a low discrepancy sequence D (quasi Monte Carlo sequence) :

    Well-known choice: Sobol’ sequence

    ( )

    =

    N

    NO

    plnε

    )(disc)( DfV ×≤ε

  • ( )

    ( )2/12

    [1,0[ 1)(

    *2

    1)(

    [1,0[

    *

    ))(Volume(11

    ))(Volume(11

    sup

    )(

    )(

    −=Ξ

    −=Ξ

    ∫ ∑

    =∈

    =∈

    tt

    t

    tx

    txt

    dQN

    D

    QN

    D

    p

    i

    ip

    N

    iQ

    N

    N

    iQ

    N

    L2 discrepancySeveral definitions, depending on considered norms and intervals

    Choice allowing computations : L 2222 discrepancy

    L2 discrepancy at origin :

    [ Hickernell 1998 ]

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 13

    Missing property: taking into account uniformity of the point projections On lower-dimensional subspaces of [0,1[ p

    => Modified L 2 discrepancies

    ( ){ }

    )in scoordinate of cube(unit on )( of projection )( and

    ,...,1with

    ))(Volume(11

    Ø

    2

    1)(2 )(

    uCQQ

    pu

    dQN

    D

    uu

    u C

    u

    N

    iQ

    N

    uu

    iu

    tt

    tttx

    =⊂

    −=Ξ ∑ ∫ ∑≠ =

  • Discrepancy computation in practice

    • Modified L2-discrepancy (intervals with minimal boundary 0)

    • Centered L2-discrepancy (intervals with boundary one vertex of the unit cube)

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 14

    • Symetric L2-discrepancy (intervals with boundary one « even » vertex of the unit cube)

    ∑∏

    ∑∏

    = =

    = =

    −−−+−++

    −−−+−

    =

    N

    ji

    p

    k

    jk

    ik

    jk

    ik

    N

    i

    p

    k

    ik

    ik

    p

    xxxxN

    xxN

    D

    1, 1

    )()()()(2

    1 1

    2)()(

    2

    2

    1

    2

    1

    2

    1

    2

    1

    2

    11

    1

    2

    1

    2

    1

    2

    1

    2

    11

    2

    12

    13)(disc

  • Sobol’sequence vs. Random sample vs. regular grid

    [ From: Kucherenko, 2010 ]

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 15

  • Example - N = 150 - Dimension = 8

    Sobol Sobol scrambling Owen

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 16

  • Example - N = 150 - Dimension = 8

    Halton

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 17

  • Pathologies on 2D projections

    Halton

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 18

  • Important property: robustness in terms of subproje ctions

    Most of the times, the function f (X) has low effective dimensions:- in the truncation sense (p1 = number of influent inputs) ⇒⇒⇒⇒ p1

  • Latin Hypercube Sample (LHS)

    Property: Uniform projections on margins

    Principle: p variables, N points ⇒⇒⇒⇒ LHS(p,N)

    Divide each dimension in N intervalsTake one point in each stratum

    Exemple : p =2, N =4

    Most often, only a small number of variables are influent

    [ McKay et al. 1979 ]

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 20

    Each level is taken only one time by each variable⇒⇒⇒⇒ Each column of the design is a permutation of { 1,2,..,N }

    Exemple : p =2, N =4

  • Algorithm of LHS( p,N) – Stein methodran = matrix(runif(N*p),nrow=N,ncol=p) #tirage de N x p valeurs selon loi

    U[0,1]

    x = matrix(0,nrow=N,ncol=p) # constructi on de la matrice x

    for (i in 1:p) {

    idx = sample(1:N) #vecteur de permutations des enti ers {1,2,…,N}

    P = (idx-ran[,i]) / N # vecteur de probabilités

    x[,i]

  • Simple methiod: produce a large number (for ex 1000) of different LHS. Then, choos the best with respect to a criterion φ (.) (« space filling »)

    BUT: the number of LHSis huge :

    Methods via optimization algo (ex: minimisation of φ(.) via simulated annealing) :

    Optimisation of LHS => Space-filling LHS

    Example : LHS(2,16)

    Maximin criterion ( )pN!

    [ Park 1993; Morris & Mitchell 1995 ]

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 22

    1. Initialisation of a design Ξ (LHS initial) and a temperature T

    2. While T > 0 : 1. Produce a neighbor Ξ new of Ξ (permutation of 2 components in a column)

    2.replace Ξ by Ξ new with proba

    3.decrease T

    3. Stop criterion => Ξ is the optimal solution

    Ξ−Ξ− 1 , )()(expmin newT

    φφ

  • Examples of optimized LHSJoining the two properties (space filling and LHS)

    0.6

    0.8

    Example: p = 2 – N = 16

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 23

    Maximin LHS Low wrap-around For comparison:discrepancy LHS Sobol sequence

    0.2 0.4 0.6 0.8

    0.2

    0.4

  • Summary on the design of numerical experiments

    Goal: Sample a high dimensionam space in an « optimal » manner (obtain the maximum of information on the behaviour of the outputZ / X є Rp)

    Problem: a pure random sample (Monte Carlo) badly fills the space

    1.« Space filling » designs are good candidates:

    - Based on a distance criterion between points (minimax, maximin, …)

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 24

    - Based on a citerion of uniform distribution of the points (discrepancy)

    2.Property of uniform projections on margins can be obtained via the Latin hypercube designs (LHS)

    3.It is possible to couple 1 and 2

  • – Fang et al., Design and modeling for computer experiments, Chapman & Hall, 2006

    – J.C. Helton, J.D. Johnson, C.J. Salaberryet C.B. Storlie: Survey of sampling-based methods for uncertainty and sensitivity analysis. Reliability Engineering and System Safety, 91:1175–1209, 2006.

    – Kleijnen, The design and analysis of simulation experiments, Springer, 2008

    Bibliography

    Baranquilla course 2013 – Design of computer experiments - F. Gamboa & B. Iooss 25

    – Kleijnen, The design and analysis of simulation experiments, Springer, 2008

    – Koehler & Owen, Computer experiments, 1996

    – A. Saltelli, K. Chan & E.M. Scott, Sensitivity analysis, Wiley, 2000

    – A. Saltelli et al., Global sensitivity analysis - The primer. Wiley, 2008.