SIMPAT:
Stochastic Simulation with Patterns
G. Burc Arpat
Stanford Center for Reservoir Forecasting
Stanford University, Stanford, CA 94305-2220
April 26, 2004
Abstract
Flow in a reservoir is mostly controlled by the connectivity of extreme perme-
abilities (both high and low) which are generally associated with marked, multiple-
scale geological patterns. Thus, accurate characterization of such patterns is re-
quired for successful flow performance and prediction studies.
In this paper, a new pattern-based geostatistical algorithm (SIMPAT) is proposed
that redefines reservoir characterization as an image construction problem. The
approach utilizes the training image concept of multiple-point geostatistics but
instead of exporting statistics, it infers multiple-scale geological patterns that occur
within the training image and uses these patterns as the building blocks for image
(reservoir) construction. The method works equally well with both continuous and
categorical variables while conditioning to a variety of local subsurface data such
as well logs and 3D seismic.
The paper includes the technical details of the SIMPAT algorithm and various
complex examples to demonstrate the flexibility of the approach.
1 Introduction
Sequential simulation is one of the most widely used stochastic imaging techniques within
the Earth Sciences. The sequential simulation idea found its first use in the traditional
variogram-based geostatistics algorithms such as sequential Gaussian simulation (SGSIM)
and sequential indicator simulation (SISIM) as described in Deutsch and Journel (1998).
In the early nineties, multiple-point geostatistics introduced the training image concept
proposing to replace the variogram with the training image within an extended sequential
simulation framework (Guardiano and Srivastava, 1993). Later, the landmark algorithm
(SNESIM) of Strebelle (2000) made the use of multiple-point geostatistics practical for
real reservoirs (Caers et al., 2003; Strebelle et al., 2002).
In this paper, a different approach to the use of training images within the extended
sequential simulation framework is explored. The problem is redefined as an image
processing problem where one finds the geological patterns in a training image instead of
exporting multiple-point statistics through conditional probabilities. Once these patterns
are found, the method applies sequential simulation such that, during the simulation, a
local pattern similarity criterion is honored instead of local conditional distributions.
The method accounts for multiple-scale interactions of geological patterns through a new
multiple-grid approach where the scale relations are tightly coupled (See Tran (1994) for
the details of the traditional multiple-grid method). The same algorithm works equally
well with both categorical and continuous variables such as facies and permeability while
conditioning to a variety of local subsurface data such as well logs and 3D seismic.
This algorithm is named SIMPAT (SIMulation with PATterns). For a discussion about
the general philosophy of the basic algorithm, the reader is referred to Caers and Arpat
(2004). This paper is intended to document the technical details of SIMPAT. The paper
first introduces a formal notation for the proposed algorithm (Section 2). Then, a detailed
discussion of the SIMPAT algorithm using this notation is given (Section 3). This section is
followed by the results section where several examples are discussed (Section 4). Finally,
future work plans about the algorithm are laid out in Section 5.
2 Notation
For clarity, this section introduces the required notation for explaining the SIMPAT
algorithm on a binary (sand/non-sand) case. The extension to multiple categories and
continuous variables is discussed later in Section 3.1.1.
2.1 Grids and Templates
Define i(u) as a realization of an indicator variable I(u) modeling the occurrence of
a binary event (For example, a model for the spatial distribution of two facies, e.g.
sand/non-sand):
i(u) = 1 if at u the event occurs (sand); 0 if at u the event does not occur (non-sand)    (1)
where u = (x, y, z) ∈ G and G is the regular Cartesian grid discretizing the field of
study. When node u of i is unknown or missing, it is denoted by i(u) = χ.
iT(u) indicates a specific multiple-point event of indicator values within a template
T centered at u; i.e., iT(u) is the vector:
iT(u) = {i(u + h0), i(u + h1), i(u + h2), . . . , i(u + hα), . . . , i(u + hnT−1)} (2)
where the hα vectors are the vectors defining the geometry of the nT nodes of template
T and α = 0, . . . , nT − 1. The vector h0 = 0 identifies the central location u.
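The template geometry and the multiple-point event of Eq. (2) can be sketched as follows for a 2D case. The function names, the 2D restriction and the sentinel value used for χ are illustrative assumptions, not part of SIMPAT itself:

```python
import numpy as np

CHI = -1  # sentinel for an unknown/missing node, i.e. i(u) = chi in the text

def template_offsets(nx, ny):
    """Offset vectors h_alpha for an nx x ny 2D template; h_0 = (0, 0) first."""
    rx, ry = nx // 2, ny // 2
    return [(0, 0)] + [(dx, dy) for dy in range(-ry, ry + 1)
                       for dx in range(-rx, rx + 1) if (dx, dy) != (0, 0)]

def event(grid, u, offsets):
    """Multiple-point event i_T(u): grid values at u + h_alpha (CHI off-grid)."""
    vals = []
    for dx, dy in offsets:
        x, y = u[0] + dx, u[1] + dy
        if 0 <= x < grid.shape[0] and 0 <= y < grid.shape[1]:
            vals.append(int(grid[x, y]))
        else:
            vals.append(CHI)  # node falls outside the grid: unknown
    return vals
```

For a 3 x 3 template, `template_offsets(3, 3)` returns the nT = 9 offset vectors with the central location h0 = (0, 0) listed first, matching the convention of Eq. (2).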
To distinguish the realization (the simulation grid), the training image, the primary
data (such as well data) and the secondary data (such as seismic), the notations re, ti,
dt1 and dt2 are used in place of i. For example,
tiT(u′) = {ti(u′ + h0), ti(u′ + h1), ti(u′ + h2), . . . , ti(u′ + hα), . . . , ti(u′ + hnT−1)}    (3)
denotes a multiple-point event scanned from the training image ti at location u′ and
u′ ∈ G′ where G′ is the regular Cartesian grid discretizing the training image. Notice
that the training image grid G′ need not be the same as the simulation grid G.
2.2 Patterns
A pattern patkT is the particular k-th configuration of the previous vector of indicator
values tiT(u′) of the training image ti, with each indicator value now denoted by patkT(hα),
where k = 0, . . . , npatT − 1 and npatT is the total number of available patterns in the
pattern database associated with the training image ti.
The pattern geometry is again defined by a template T containing nT nodes with the
vectors hα where α = 0, . . . , nT − 1. That configuration is location-independent, hence
the definition of a particular pattern is itself location-independent. Thus:
patkT(hα) = 1 if at the α-th template location hα the event occurs (sand); 0 if at hα the event does not occur (non-sand)    (4)
Different from iT(u) defined in Eq. (2), a pattern patkT cannot have unknown/missing
nodes since patterns are extracted from the training image ti and the training image is
always fully known by definition. The k-th pattern is then defined by the vector patkT of
nT binary indicator values:
patkT = {patkT(h0), patkT(h1), patkT(h2), . . . , patkT(hα), . . . , patkT(hnT−1)}    (5)
where k = 0, . . . , npatT − 1. All npatT patterns are defined on the same template T.
In any finite training image ti, there is a finite maximum number ntipatT of patterns
defined over a template T that can be extracted. A filter can be applied to discard
patterns with undesirable characteristics, reducing this count to the final, smaller npatT.
Therefore, the final number npatT of patterns retained might be smaller than ntipatT.
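A minimal sketch of how a pattern database might be scanned from a 2D training image, with a uniqueness filter as the reduction from ntipatT to npatT. The function name, the 2D restriction and the choice of filter are assumptions for illustration:

```python
import numpy as np

def build_pattern_database(ti, n):
    """Scan a 2D training image ti with an n x n template and keep only the
    unique, fully informed patterns (one simple filtering choice)."""
    r = n // 2
    seen = set()
    for x in range(r, ti.shape[0] - r):
        for y in range(r, ti.shape[1] - r):
            pat = ti[x - r:x + r + 1, y - r:y + r + 1]
            seen.add(tuple(pat.ravel()))  # tuple key: duplicates collapse
    return [np.array(p).reshape(n, n) for p in seen]
```

Because patterns are stored as hashable tuples, duplicate configurations scanned at different locations u′ collapse into a single database entry, so npatT ≤ ntipatT as stated above.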
2.3 Data Events
A data event devT(u) is defined as the vector of indicator values,
devT(u) = {devT(u + h0), devT(u + h1), . . . , devT(u + hα), . . . , devT(u + hnT−1)} (6)
with devT(u + hα) = re(u + hα) where re is the realization (simulation grid) indicator
function as previously defined in Section (2.1). In other words, devT(u) = reT(u).
2.4 Similarity
d⟨x, y⟩ is used to denote a generic dissimilarity (distance) function. The vector entries
x and y can be patterns or data events. Ideally, d⟨x, y⟩ should satisfy the conditions:
1. d⟨x, y⟩ ≥ 0,
2. d⟨x, y⟩ = 0 if and only if x = y,
3. d⟨x, y⟩ = d⟨y, x⟩,
4. d⟨x, z⟩ ≤ d⟨x, y⟩ + d⟨y, z⟩.
In practice, the last two conditions, which are required to make d⟨x, y⟩ a metric, are
sometimes relaxed for certain distance functions.
A commonly used distance function is the Minkowski metric (Duda et al., 2001).
Applied to the case of a data event and a pattern, one has
d⟨devT(u), patkT⟩ = ( Σ_{α=0}^{nT−1} |devT(u + hα) − patkT(hα)|^q )^{1/q}    (7)
where q ≥ 1 is a selectable parameter. Setting q = 2 gives the familiar Euclidean distance
while setting q = 1 gives the Manhattan or the “city block” distance.
The entries of d⟨x, y⟩, i.e. the vectors x and y, might contain unknown/missing
components. For example, a data event devT(u) may have unknown nodes. In such a
case, these missing components are not used in the distance calculation, i.e. if u + hα is
unknown, the node α is skipped during the summation such that:
d⟨devT(u), patkT⟩ = ( Σ_{α=0}^{nT−1} dα⟨devT(u + hα), patkT(hα)⟩ )^{1/q}    (8)
and,
dα⟨xα, yα⟩ = |xα − yα|^q if xα and yα are both known, i.e. ≠ χ; 0 if either xα or yα is unknown/missing    (9)
The above expression prevents the node hα from contributing to the similarity calcu-
lation if it contains an unknown/missing value in either one of the entries of dα⟨xα, yα⟩.
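Eqs. (7) through (9) can be sketched as a single function. Here χ is represented by Python's None and the function name is illustrative:

```python
def distance(dev, pat, q=1, chi=None):
    """Minkowski distance between a data event and a pattern, Eqs. (7)-(9):
    nodes where the data event is unknown (chi) are skipped, since patterns
    themselves are always fully informed."""
    total = 0.0
    for x, y in zip(dev, pat):
        if x is not chi:            # Eq. (9): skip unknown/missing entries
            total += abs(x - y) ** q
    return total ** (1.0 / q)
```

With q = 1 this is the Manhattan ("city block") distance and with q = 2 the Euclidean distance, as noted in the text.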
2.5 Multiple-grids
On a Cartesian grid, the multiple-grid view of a grid G is defined by a set of cascading
coarse grids Gg and sparse templates Tg instead of a single fine grid and one large dense
template where g = 0, . . . , ng − 1 and ng is the total number of multiple-grids for grid G.
See Fig. 1 for an illustration of multiple-grid concepts.
[Figure 1 panels: (a) coarse grid, (b) fine grid, (c) coarse template, (d) fine (base) template; full/empty and coarse/fine nodes marked]
Figure 1: Illustration of multiple-grid concepts for ng = 2.
The g-th (0 ≤ g ≤ ng − 1) coarse grid is constituted by every 2^g-th node of the final
grid (g = 0, i.e. G0 = G). If T is a template defined by vectors hα, α = 0, . . . , nT − 1,
where nT is the number of nodes in the template, then the template Tg used for the grid
Gg is defined by hgα = 2^g · hα and has the same configuration of nT nodes as T but
with spacing 2^g times larger.
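The coarse-template construction hgα = 2^g · hα can be sketched as (function name illustrative):

```python
def expand_template(offsets, g):
    """Template T^g for the g-th coarse grid: scale every offset vector
    h_alpha by 2**g. The configuration (n_T nodes) is unchanged; only the
    spacing grows by a factor of 2**g."""
    s = 2 ** g
    return [(s * dx, s * dy) for dx, dy in offsets]
```

Note that the number of nodes nT is the same on every multiple-grid; only the spatial extent of the template changes.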
2.6 Dual Templates
For a given coarse template Tg, one can define a 3D bounding box around the nodes of
Tg, i.e. the smallest imaginary box that contains all the nodes of the template Tg (Fig. 2).
Then, the dual template T̃g of Tg is defined as the template that contains all the nodes
within this bounding box. In other words, the dual template T̃g contains the coarse nodes
of the current coarse grid Gg as well as all the nodes of the finest grid G0 which fall inside
the bounding box. The total number of nodes within the dual template T̃g is denoted by
nT̃ and typically nT << nT̃. Fig. 2 illustrates a coarse grid template (also called a
“primal template”) and its corresponding dual template for g = 1 for a base template of
size 3 × 3.
[Figure 2 panels: (a) primal template, (b) dual template, (c) base template; full/empty nodes and bounding box marked]
Figure 2: Illustration of a primal template and its corresponding dual template for g = 1.
By definition, a dual template T̃g always shares the center node hg0 with its corre-
sponding primal template Tg. Using this property, the dual patkT̃ of a pattern patkT is
defined as the pattern extracted at the same location u in the ti as patkT but using the
template T̃g instead of Tg. A dual data event devT̃(u) can be defined similarly.
3 The SIMPAT Algorithm
In this section, the details of the SIMPAT algorithm are presented using the above nota-
tion. For clarity, the algorithm is first presented without considering the details of the
application of the multiple-grid approach (such as the use of dual templates) and data
conditioning. These topics are elaborated in the subsequent subsections.
3.1 Single-grid, Unconditional SIMPAT
The basic algorithm can be divided into two modules: Preprocessing and simulation.
Preprocessing of the training image ti:
P-1. Scan the training image ti using the template T for the grid G to obtain all existing
patterns patkT, k = 0, . . . , ntipatT − 1 that occur over the training image.
P-2. Reduce the number of patterns to npatT by applying filters to construct the pattern
database. Typically, only unique patterns are taken.
Simulation on the realization re:
S-1. Define a random path on the grid G of re to visit each node only once. Note that
there is a variation of this step as explained in Section 3.1.3.
S-2. At each node u, retrieve the data event devT(u) and find the pattern pat∗T that
minimizes d⟨devT(u), patkT⟩ for k = 0, . . . , npatT − 1, i.e. pat∗T is the ‘most similar’
pattern to devT(u). See Section 3.1.4 for details of how the most similar pattern is
searched within the pattern database.
S-3. Once the most similar pattern pat∗T is found, assign pat∗T to devT(u), i.e. for all
the nT nodes u + hα within the template T, devT(u + hα) = pat∗T(hα). See Section
3.1.2 for details of why the entire pattern pat∗T is ‘pasted’ on to the realization re.
S-4. Move to the next node of the random path and repeat the above steps until all the
grid nodes along the random path are exhausted.
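Putting the preprocessing and simulation modules together, a deliberately minimal single-grid, unconditional sketch for a binary 2D case might look as follows. All names, the Manhattan distance choice and the −1 sentinel for χ are assumptions for illustration; the actual implementation is considerably more elaborate:

```python
import random
import numpy as np

CHI = -1  # sentinel for an unknown node

def simpat_single_grid(ti, shape, n, seed=0):
    """Minimal single-grid, unconditional SIMPAT sketch (steps S-1 to S-4)
    with an n x n template on a binary 2D case."""
    r = n // 2
    # P-1/P-2: unique, fully informed patterns scanned from the training image
    keys = {tuple(ti[x - r:x + r + 1, y - r:y + r + 1].ravel())
            for x in range(r, ti.shape[0] - r)
            for y in range(r, ti.shape[1] - r)}
    pats = [np.array(k).reshape(n, n) for k in keys]

    re = np.full(shape, CHI)                        # realization, all unknown
    path = [(x, y) for x in range(r, shape[0] - r)  # S-1: random path
            for y in range(r, shape[1] - r)]
    random.seed(seed)
    random.shuffle(path)

    for x, y in path:
        dev = re[x - r:x + r + 1, y - r:y + r + 1]
        known = dev != CHI                          # unknown nodes are skipped
        # S-2: most similar pattern (linear search, Manhattan distance)
        best = min(pats, key=lambda p: np.abs(dev - p)[known].sum())
        re[x - r:x + r + 1, y - r:y + r + 1] = best  # S-3: paste whole pattern
    return re                                        # S-4: path exhausted
```

Even this toy version exhibits the key property discussed in Section 3.1.2: since entire patterns are pasted, previously assigned node values may be overwritten at later steps of the random path.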
3.1.1 Multiple-categories and Continuous Variables
Section 2 explains the concepts used in the SIMPAT algorithm applied to a binary case.
Yet, one may notice that the only time the actual values of the training image ti or the
realization re are used is when calculating the distance between a pattern and a data
event. Within this context, extension to multiple-categories and continuous variables
can be implemented in a trivial manner. As long as the distance function chosen is
capable of handling the nature of the simulation variable (i.e. categorical or continuous),
all the concepts explained in Section 2 as well as the above basic version of the SIMPAT
algorithm remain intact, i.e. one does not need to modify the algorithm to accommodate
multiple-category or continuous variables.
The Minkowski metric explained in Section 2.4 is indifferent to the nature of the
simulation variable and thus can be used with both multiple-category and continuous
variables. There exist other distance functions that can only operate on a certain type
of variable. Examples of such distance functions are discussed in Section 5.
3.1.2 Enhancing Pattern Reproduction
Step S-3 of the above algorithm dictates that, at node u, the entire content of the most
similar pattern pat∗T is assigned to the current data event devT(u). The rationale behind
this pasting of the entire pattern pat∗T on to the data event devT(u) is to improve the
overall pattern reproduction within the realization re.
The above aim is achieved through two mechanisms: First, assigning values to mul-
tiple nodes of re at every step of the random path results in a rapid reduction of the
unknown nodes within re. Second, assigning the entire pat∗T to devT(u) communicates
the shape of the pattern existing at node u ‘better’: When another node u′ is visited
along the random path that happens to fall in the neighborhood of u, the distance calcu-
lations include not only the value of u but many of the nT nodes previously determined
at u. Thus, the ‘shape information’ provided by pat∗T is retained in a stronger manner
by encouraging a greater interaction between the patterns selected at neighboring nodes.
It is important to understand that nodes assigned by pasting pat∗T on to devT(u) are
not marked ‘visited’, i.e. they will be visited again during the traversal of the random
path. Furthermore, although the current node u is marked visited after Step S-3, its
value is not fixed. When visiting another node u′ that falls in the neighborhood of u,
the previously determined value of u might be updated depending on the selected pat∗T
for node u′. In other words, the algorithm is allowed to revise its previous decisions
regarding the most similar pattern selection, i.e. any node value is temporary and may
change until the simulation completes. Due to this property, any node u of re is updated
approximately nT times during a simulation.
3.1.3 Skipping Grid Nodes
Whenever a pat∗T defined over a template T is pasted on to the realization, the algorithm
actually decides on the values of several nodes of re, not only the visited central value.
One might consider not visiting all or some of these nodes later during the simulation,
i.e. they might be skipped while visiting the remaining nodes of the random path. This
results in visiting fewer nodes within re and thus improves the CPU efficiency
of the algorithm. In SIMPAT, skipping of already calculated nodes is achieved by visiting
all grid nodes separated by a distance (called the “skip size”). In essence, this is a
modification of the Step S-1.
Consider the 2D 9 × 9 Cartesian grid given in Fig. 3. Assume a 5 × 5 template is
used and is currently located on u = u20. If the template T is defined by hα vectors
with α = 0, . . . , nT − 1 and nT = 25, then, the data event devT(u) is defined by vectors
u+hα such that u+hα = {u0, . . . ,u4,u9, . . . ,u13,u18, . . . ,u22,u27, . . . ,u31,u36, . . . ,u40}.
Assume the skip size is set to 3. During the simulation, when node u = u20 is visited,
the values of the most similar pattern pat∗T are used to populate all the values of the
data event devT(u). As the skip size is decided to be 3, all the nodes within the 3 × 3
neighborhood of the node u defined by the hβ vectors, β = 0, . . . , 8 such that u +
hβ = {u10,u11,u12,u19,u20,u21,u28,u29,u30} are marked ‘visited’ and removed from the
random path. This removal from the random path does not mean the values of these
nodes are fixed as explained in Section 3.1.2. It simply means the algorithm will not
perform an explicit most similar pattern search for these nodes.
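The bookkeeping described above, marking a skip-size neighborhood as visited, might be sketched as (the function name is hypothetical):

```python
def nodes_to_skip(u, skip):
    """Nodes marked 'visited' around u for a given skip size: the
    skip x skip neighborhood centered at u is removed from the random
    path, though its values remain free to be updated later."""
    r = skip // 2
    x, y = u
    return {(x + dx, y + dy)
            for dx in range(-r, r + 1) for dy in range(-r, r + 1)}
```

For a skip size of 3, each explicit most-similar-pattern search settles a 3 x 3 = 9-node neighborhood, which is the factor-of-9 reduction in searches described in the example below.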
Figure 3: (a) Visiting node u = u20. The 5 × 5 template is placed on the node. Due to
the skip size of 3, the nodes within the 3 × 3 shaded area are marked visited and removed
from the random path. (b) Visiting node u = u23 later during the simulation.
In the above example, a single most similar pattern search is performed but a total of
9 values are calculated. Hence, the number of nodes that needs to be visited is decreased
by a factor of 9, which results in 9 times fewer most similar pattern searches.
Thus, depending on the skip size, considerable speed gains might be achieved.
In general, only skip sizes much smaller than the template size should be considered.
The sensitivity of the overall pattern reproduction quality of a realization to the skip
size is discussed further in Section 4.2.
3.1.4 Searching for the Most Similar Pattern
In Step S-2, the algorithm requires finding the most similar pattern pat∗T given a data
event devT(u), minimizing d⟨devT(u), patkT⟩ for k = 0, . . . , npatT − 1. This is a well un-
derstood problem in computational geometry known as the “nearest-neighbor problem”
or sometimes the “post office problem” due to Knuth (1997).
This problem can be trivially solved in O(npatT) time by calculating all possible
d⟨devT(u), patkT⟩ and taking the minimum-distance patkT as pat∗T. Yet, this trivial
solution (also known as “linear search”) can be highly CPU demanding, especially when
both npatT and nT are large; typically due to a large and complex 3D training image
calling for the use of a large template.
The computational geometry literature has many known algorithms that solve the
exact nearest-neighbor problem in O(log npatT) time. Approximate solutions
that perform better than this logarithmic time are also known. For a comprehensive
review of exact and approximate solutions, the reader is referred to Smid (1997).
Typically, it is not possible to take a computational geometry algorithm and imme-
diately apply it to SIMPAT since SIMPAT operates on data events with missing (unknown)
nodes, which is uncommon in computer science and thus generally not addressed by
the above cited solutions. Furthermore, the proposed solutions generally apply to cases
where nT < 100, whereas nT > 250 is typical in SIMPAT. Due to these issues, currently,
SIMPAT employs a slightly improved variation of the linear search for finding the most
similar pattern (see Section 4.5 of Duda et al. (2001) for the details of this improved
search). Further possible improvements of this scheme are discussed in Section 5.
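One common improvement over plain linear search, in the spirit of the partial-distance methods discussed by Duda et al. (2001), is to abandon a candidate pattern as soon as its running distance exceeds the best distance found so far. The source does not spell out its exact variant, so the sketch below is an assumption (names illustrative, Manhattan distance, χ as None):

```python
def most_similar(dev, patterns, chi=None):
    """Linear search with a partial-distance shortcut: stop summing the
    per-node terms as soon as the running distance can no longer beat
    the best distance found so far."""
    best, best_d = None, float("inf")
    for pat in patterns:
        d = 0.0
        for x, y in zip(dev, pat):
            if x is not chi:          # unknown nodes do not contribute
                d += abs(x - y)       # Manhattan (q = 1) term
                if d >= best_d:       # early exit: cannot improve
                    break
        else:                         # summation completed: new best
            best, best_d = pat, d
    return best, best_d
```

The worst case is still O(npatT), but in practice many candidates are rejected after only a few of their nT terms are summed.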
3.2 Multiple-grid, Unconditional SIMPAT
In SIMPAT, the multiple-grid simulation of a realization is achieved by successively apply-
ing the single-grid algorithm to the multiple-grids starting from the coarsest grid Gng−1.
After each multiple-grid simulation, g is set to g− 1 and this succession of multiple-grids
continues until g = 0. The application of the multiple-grid approach can further be
discussed in two parts: Interaction between grids and use of dual templates.
3.2.1 Interaction Between Coarse and Fine Grids
At the beginning of the multiple-grid simulation, g is set to ng − 1, i.e. first the coarsest
grid Gng−1 is to be simulated. Then, the single-grid SIMPAT algorithm is applied to this
grid Gg using template Tg. When the current multiple-grid simulation is completed, the
values calculated on the current grid are transferred to Gg−1 and g is set to g − 1. It
should be noted that, different from the classical multiple-grid approach (Strebelle, 2002;
Tran, 1994), SIMPAT does not ‘freeze’ the coarse grid values when they are transferred to
a finer grid, i.e. such values are still allowed to be updated and visited by the algorithm
in the subsequent multiple-grid simulations.
On a grid Gg, the previously calculated coarse grid values contribute to the distance
calculation of Step S-2, i.e. if a previous coarse grid value exists on node u, this value
is taken into account when minimizing d⟨devTg(u), patkTg⟩. Thus, the new value of
node u is calculated conditioned to both the coarse grid value and the newly available
neighborhood information devTg(u).
3.2.2 Using Dual Templates
Dual templates allow the algorithm to populate the nodes of the finest grid along with
the nodes of the current coarse grid.
To use dual templates, Step S-3 of the algorithm is modified. On node u, first,
the most similar pattern pat∗Tg is found based on the minimization of the distance
d⟨devTg(u), patkTg⟩. Then, the corresponding dual pattern pat∗T̃g is retrieved from the
pattern database. Finally, instead of populating the nodes of devTg(u), the algorithm
populates all the nodes of the dual data event devT̃g(u), i.e. not only the coarse nodes
but also the fine nodes of re are updated.
Dual templates are used only when populating the nodes of the dual data event and
not during the distance calculations. In other words, Step S-2 of the algorithm is not
modified: the distance calculations are still performed on the original grid template Tg
using the data event devTg(u). This property means the use of dual templates has no
effect on the current multiple-grid simulation as the finer nodes calculated by the dual
pattern pat∗T̃g are never taken into account during the distance calculations of the
current grid. The effects of using dual templates will only be seen when simulating the
next finer grid.
Use of dual templates can be envisioned as a dimensionality reduction method. In-
stead of minimizing d⟨devT̃g(u), patkT̃g⟩, where each vector entry to the distance func-
tion has nT̃g nodes such that nT̃g >> nTg, the algorithm uses the minimization of
d⟨devTg(u), patkTg⟩ as an approximation to the more costly minimization between the
dual data event and the dual pattern.
3.3 Data Conditioning
This section explains the details of how data conditioning to both primary and secondary
data is performed.
3.3.1 Conditioning to Primary Data
In SIMPAT, conditioning to primary data is performed in Step S-2 of the algorithm, during
the search for the pattern yielding the minimum d⟨devT(u), patkT⟩, k = 0, . . . , npatT − 1.
If a conditioning datum exists on devT(u + hα), the algorithm first checks whether
patkT(hα) is within a threshold εd of this datum, i.e. |devT(u + hα) − patkT(hα)| ≤ εd.
Typically, this threshold εd is taken as zero for hard conditioning information such as
well data (εd = 0). If the pattern patkT does not fulfill this condition, it is skipped and
the algorithm searches for the next most similar pattern until a match is found. If none
of the available patterns fulfill the condition, the algorithm selects the pattern that
minimizes the distance calculated over only the nodes of devT(u) that carry conditioning
information. If several patkT fulfill this condition, then a second minimization is
performed on the non-conditioning nodes of devT(u) using only these patterns. Note
that the case where no pattern fulfills the condition generally occurs when the training
image is not representative of the conditioning data.
The above conditioning methodology used in SIMPAT can be summarized as follows:
1. At node u, retrieve the data event devT(u).
2. Divide the data event into two parts: devT,1 and devT,2 such that devT,1 con-
tains only the conditioning nodes of devT(u) and devT,2 contains only the non-
conditioning and uninformed nodes of devT(u).
3. Minimize the distance d⟨devT,1, patkT⟩ to find the most similar pattern pat∗T to
the conditioning data event. If only one such pattern exists, take pat∗T as the final
most similar pattern and proceed with Step S-3 of the SIMPAT algorithm.
4. If several patterns minimize the distance d⟨devT,1, patkT⟩, perform another mini-
mization, namely d⟨devT,2, patkT⟩, only within these conditioned patterns to find
the final most similar pattern pat∗T.
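The two-stage minimization of the list above might be sketched as follows for fully informed data events (unknown nodes and the threshold εd are omitted for brevity; names are illustrative):

```python
def conditioned_best(dev, cond_mask, patterns):
    """Two-stage search: first minimize the distance on the conditioning
    nodes only (dev_T,1), then break ties using the remaining nodes
    (dev_T,2). Manhattan distance, binary values."""
    def dist(pat, mask):
        # distance restricted to the nodes selected by the mask
        return sum(abs(x - y) for x, y, m in zip(dev, pat, mask) if m)

    d1_min = min(dist(p, cond_mask) for p in patterns)
    ties = [p for p in patterns if dist(p, cond_mask) == d1_min]
    free = [not m for m in cond_mask]          # non-conditioning nodes
    return min(ties, key=lambda p: dist(p, free))
```

The first minimization guarantees the conditioning data is honored as closely as the pattern database allows; the second ensures the best overall pattern is chosen among the tied candidates.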
Conditioning data might pose certain problems when used in a multiple-grid set-
ting. A particular multiple-grid Gg might not include the location containing the condi-
tioning data. In such a case, as conditioning is performed during the distance calculations
using the coarse template Tg, the conditioning data will not be included within the cal-
culation, resulting in poorly-conditioned results.
SIMPAT avoids the above situation by making use of dual templates when condi-
tioning to primary data. While minimizing d⟨devT,1, patkT⟩, each corresponding dual
pattern patkT̃ is checked against the dual data event devT̃(u) to discover whether
patkT̃ conflicts with the primary data. Such conflicting patterns are ignored during the
minimization. Recall that, by definition, a dual template contains extra fine grid nodes
in addition to the coarse grid nodes of the primal template. Thus, checking devT̃(u)
against patkT̃ guarantees that the final most similar pattern pat∗T selected is conditioned
to the primary data at the finest grid and not only at the current coarse grid.
3.3.2 Conditioning to Secondary Data
In Earth Sciences, secondary data typically refers to data obtained from an indirect
measurement such as seismic surveys. Thus, secondary data is nothing but a ‘filtered’
view of the original field of study (reservoir), where the filtering is performed by some
forward model F. Generally, the forward model F is not fully known and is approximated
by a known model F∗. Furthermore, secondary data typically exists in exhaustive form,
i.e. for every node u of the field of study, dt2(u) is known where dt2 is the vector of all
secondary data values.
Within this context, in SIMPAT, conditioning to secondary data calls for additional
steps to the basic algorithm. First, a secondary training image ti2 is required. This
training image can be obtained by applying the approximate forward model F∗ to ti.
Once the secondary training image is obtained, Step P-1 of the algorithm is modified
such that, for every patkT of the primary training image, a corresponding secondary
pattern pat2kT is extracted from the secondary training image from the same location.
In essence, patterns now exist in pairs in the pattern database.
Once the preprocessing module of the algorithm is modified this way, another modi-
fication is done to Step S-2, i.e. the search for the most similar pattern. Instead
of minimizing d⟨devT(u), patkT⟩, the algorithm now minimizes the summation of
d⟨devT(u), patkT⟩ and d⟨dev2T(u), pat2kT⟩, where dev2T(u) denotes the secondary
data event obtained from dt2, i.e. dev2T(u) = dt2T(u).
The net result of the above modifications is that, for every node u, the algorithm
now finds the most similar pattern not only based on the previously calculated nodes
but also based on the secondary data. When there is also primary data available, this
minimization is performed only after the patterns that condition to the primary data are
found as explained in Section 3.3.1, i.e. primary data has priority over secondary data.
The values of the secondary data events and the secondary patterns need not be
in the same range as the values of the primary data event and the primary patterns.
This is typically the case when seismic information is used as secondary data. In
such cases, the combined distance minimization might result in biased distances as
for a node u, d⟨dev2T(u), pat2kT⟩ might be several orders of magnitude greater than
d⟨devT(u), patkT⟩, effectively dominating the distance calculation and shadowing the
contribution of the previously calculated re nodes to the combined distance.
To prevent the above problem, the individual distances are normalized such that both
d⟨devT(u), patkT⟩ and d⟨dev2T(u), pat2kT⟩ range between the same values; typically,
[0, 1]. Furthermore, a weight is attached to the combined summation to let the user of
the algorithm give more weight to either the primary or the secondary values, reflecting
the ‘trust’ of the user in the secondary data. Thus, for the pairs devT(u), patkT and
dev2T(u), pat2kT, the final form of the distance function becomes:
d1,2⟨·, ·⟩ = ω × d1⟨devT(u), patkT⟩ + (1 − ω) × d2⟨dev2T(u), pat2kT⟩    (10)
where d1,2⟨·, ·⟩ is the final combined distance that will be minimized to find the most
similar pattern pat∗T, d1⟨·, ·⟩ and d2⟨·, ·⟩ are the normalized distances, ω is the weight
factor and typically ω ∈ [0, 1]. If desired, a local ω(u) can be used instead of a global ω.
This reflects the fact that one has varying degrees of ‘trust’ in the secondary data in
different regions of the reservoir.
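Eq. (10) might be sketched as below. Normalizing each distance by its maximum possible value over the pattern database is one simple choice of normalization, assumed here for illustration:

```python
def combined_distance(d1_raw, d2_raw, d1_max, d2_max, w=0.5):
    """Eq. (10): weighted sum of the primary and secondary distances,
    each normalized to [0, 1] so that neither term dominates the other.
    w plays the role of omega (the user's 'trust' weight)."""
    d1 = d1_raw / d1_max if d1_max > 0 else 0.0
    d2 = d2_raw / d2_max if d2_max > 0 else 0.0
    return w * d1 + (1.0 - w) * d2
```

With w = 1 the secondary data is ignored entirely; with w = 0 only the secondary distance drives the pattern selection. A location-dependent w(u) would simply replace the scalar weight per visited node.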
4 Examples
In this section, the results of several runs performed using the SIMPAT algorithm are dis-
cussed. The results are presented under three subsections: (1) Simple 2D, unconditional
examples; (2) complex 3D, multiple-category and continuous, unconditional examples
and (3) complex 3D, multiple-category, conditional examples. All results discussed in
this section are presented at the end of the section. Reported CPU times are obtained
on a 2.4 GHz Intel Pentium IV processor.
4.1 2D Simple Examples
Fig. 5 illustrates the intermediate steps of an application of the multiple-grid, uncon-
ditional SIMPAT to a binary (sand/non-sand) case. Using the 250 × 250 training image
given in Fig. 5a, the final realization (Fig. 5d) is obtained using an 11 × 11 base template,
3 multiple-grids and a skip size of 2. Fig. 5b and Fig. 5c are the intermediate multiple-
grid results for the coarsest and one finer grids, g = 2 and g = 1. Despite the fact
that these results are taken from intermediate steps of SIMPAT, they are still completely
informed due to the use of dual templates as explained in Section 3.2.2. In the figure,
the algorithm starts by ‘roughly’ placing the channel patterns on to the realization re
during the coarsest grid simulation. In essence, the multiple-grid approach ignores the
finer-scale details on the coarser grids. Later, the algorithm ‘corrects’ the details of the
realization as subsequent multiple-grid simulations are performed. Fig. 5e and Fig. 5f
show closeups of the same region on the second coarsest grid and on the finest grid.
Here, the distance calculations detect the broken channels that do not exist in the
training image and replace them with connected channel patterns.
Fig. 6 provides some insight into the sensitivity to template size when a fixed skip size of 2 is used. As expected, using a small template (3 × 3) results in poor pattern reproduction (Fig. 6b). Increasing the template size to 5 × 5 (which nearly triples the number of template nodes nT) improves the pattern reproduction (Fig. 6c). Successive increases in the template size continue to improve the final results; yet, it appears
that after a certain threshold, increasing the template size does not provide any further
improvements. This threshold (sometimes called “the optimum template size”) is related
to the complexity of the patterns found in a training image. Given a training image,
automatic discovery of the optimum template size is one of the active research topics of
pattern-based geostatistics.
Fig. 7 shows the application of SIMPAT to different types of 2D training images. Of these, Fig. 7e and 7f demonstrate the capability of pattern-based geostatistics to inherently reproduce the lower-order statistics of the training image. A visual check of the SIMPAT realization in Fig. 7f shows that the algorithm successfully reproduces the Gaussian patterns found in the training image. Variogram comparisons between the two figures further confirm this check (Fig. 4).
Figure 4: Comparison of 0◦, 45◦ and 90◦ variograms of Fig. 7e and 7f. The color red
denotes the calculated variograms of the SIMPAT realization.
4.2 3D Complex Examples
Fig. 8a is an example of wavy bed lamination. This continuous training image of size 90 × 90 × 80 is generated using the SBED software (Wen et al., 1998). In this example, the training image contains complex patterns at the finest scale. Yet, the larger-scale patterns are fairly repetitive and can thus be considered 'simple'. A template size of 11 × 11 × 7 is chosen to reflect the complexity of the fine-scale patterns. Due to the repetitive and simple nature of the larger-scale patterns, a skip size of 6 is used on all grids, which significantly reduced the CPU time of the run (35 minutes). The SIMPAT unconditional realization for this case is shown in Fig. 8b.
Fig. 9a is a 7-facies, complex training image of size 100 × 75 × 25, depicting a tidal channel system. Another notable property of Fig. 9a is that the image is highly non-stationary, especially at the larger scales, where one facies appears only locally (the facies represented by the yellow color, which runs along the x-axis). Fig. 9b shows the unconditional SIMPAT realization obtained using a 15 × 15 × 5 template and a skip size of 4 on all grids. As the figure illustrates, the algorithm successfully captures the non-stationary, local behavior of the training image while also retaining the other, more stationary facies relations. The CPU time for this run was approximately 120 minutes.
Fig. 10 studies the sensitivity to the skip size. The figure demonstrates that the pattern reproduction quality of realizations might be sensitive to the skip size setting, especially for complex 3D training images where one facies dominates the others (such as the non-channel, background facies of Fig. 10a). In such cases, one might observe proportion reproduction problems, which, in turn, might cause pattern reproduction problems. Fig. 10b and Fig. 10c reveal this issue: when a higher skip size is used, the realization honors the patterns of the training image better. Removing this sensitivity to skip size is an active area of research and is further discussed in Section 5. In this case, the better realization of Fig. 10b required approximately 100 minutes of CPU time; Fig. 10c, although it produced worse results, took approximately 400 minutes.
4.3 3D Conditional Examples
Fig. 11 shows a 100 × 100 × 50 synthetic reference case with 6 facies generated using the SBED software. Two different data sets are sampled from this reference case to test the primary conditioning capabilities of SIMPAT (Fig. 11b and 11c).
The first, dense data set is used with a training image that is highly representative
of the reference (Fig. 12a). Using such a training image guarantees that, during the simulation, the number of conflicting patterns is kept to a minimum. Yet, in a real reservoir, one would expect a reasonable amount of conflict between the available data and the selected training image. Thus, the final SIMPAT realization (Fig. 12c) obtained for this case should be viewed as a check of the conditioning capabilities of SIMPAT rather than as a representative example of primary data conditioning in a real reservoir.
The second, sparse data set is used with a training image that contains patterns which are more likely to conflict with the available data (Fig. 13a). In this case, the data dictate stacked channels whereas the training image contains only isolated channels. Combined with the sparse data set, the use of this training image can be considered a representative conditioning example for a real reservoir. Fig. 13c is the final conditional SIMPAT realization obtained.
Fig. 14, 15 and 16 demonstrate secondary data conditioning using SIMPAT. In this case, the secondary data is obtained by applying a seismic forward model F∗ to the binary reference case (see Wu and Journel (2004) for details of the forward model used). The same model is applied to the training image to obtain the secondary training image. The final SIMPAT realization (Fig. 16b) is generated using an 11 × 11 × 3 template and a skip size of 2. The realization conditions to the secondary data relatively well, but pattern reproduction is somewhat degraded, as evidenced by the disconnected channel pieces in Fig. 16b. It is believed that this problem occurs because the template T used for the secondary data events dev2T(u) might not be representative of the volume support of the secondary data. This issue is further discussed in Section 5.
[Figure 5 panels: (a) training image (250 × 250, sand/non-sand); (b) multiple-grid result (g = 2); (c) multiple-grid result (g = 1); (d) SIMPAT realization (g = 0); (e) closeup area of (c); (f) closeup area of (d)]
Figure 5: Intermediate steps of the multiple-grid, unconditional SIMPAT. (e) and (f) are
the closeups of the marked regions on (c) and (d).
[Figure 6 panels: (a) training image (250 × 250, sand/non-sand); (b)–(f) SIMPAT realizations with template sizes 3 × 3, 5 × 5, 7 × 7, 11 × 11 and 21 × 21]
Figure 6: SIMPAT realizations obtained using different template sizes with a fixed skip
size of 2 for all grids.
[Figure 7 panels: (a), (c), (e) training images (binary sand/non-sand; 3-facies with channel and levee; continuous); (b), (d), (f) the corresponding SIMPAT realizations]
Figure 7: SIMPAT realizations obtained from different training images: (a) is a binary training image generated using the SBED software; (c) is a 3-facies training image generated using an object-based model and (e) is a continuous training image generated from an unconditional SGSIM run.
Figure 8: A wavy bed lamination training image (continuous) and its SIMPAT realization.
Figure 9: A tidal channel training image (7 facies) and its SIMPAT realization.
[Figure 10 panels: (a) training image; (b) SIMPAT realization with skip sizes 4, 4, 2; (c) SIMPAT realization with skip sizes 2, 2, 1]
Figure 10: A channel training image (4 facies) and two SIMPAT realizations obtained
using different skip size settings. Each figure shows two z-slices.
[Figure 11 panels: (a) reference; (b) dense well data (150 wells); (c) sparse well data (50 wells)]
Figure 11: Reference for primary data conditioning and two data sets randomly sampled
from this reference.
[Figure 12 panels: (a) training image; (b) primary data (150 wells); (c) SIMPAT realization]
Figure 12: Primary data conditioning result obtained using a training image that is
highly representative of the reference given in Fig. 11 and a dense data set.
[Figure 13 panels: (a) training image; (b) primary data (50 wells); (c) SIMPAT realization]
Figure 13: Primary data conditioning result obtained using a training image whose patterns conflict with the given sparse data set.
Figure 14: Reference for secondary data conditioning and the corresponding synthetic
seismic obtained using an approximate forward model F∗.
Figure 15: Training image for secondary data conditioning and the corresponding sec-
ondary training image obtained using the same F∗ as Fig. 14.
Figure 16: The conditional SIMPAT realization generated using the secondary data shown
in Fig. 14 and the training image in Fig. 15.
5 Future Work
In this section, several ongoing additions to the current SIMPAT implementation in various
stages of development and testing are discussed.
5.1 Conditional SIMPAT
As discussed in Section 4.3, there are two main issues related to the conditioning capabilities of SIMPAT that require further study:
1. Conflicting training image patterns and primary data: As discussed in conjunction with Fig. 13, a conflict between the training image patterns and the primary data might result in poor pattern reproduction. One possible solution is to modify the primary conditioning scheme such that, when comparing the conditioning data event dev1T(u) to patkT, data closer to the center node u of the template T are given more weight in the distance calculation. This modification is expected to improve pattern reproduction without sacrificing too much data conditioning when the training image patterns and the primary data are at most moderately conflicting. If there is a strong disagreement between the training image and the primary data, the user is advised to revise the training image selection instead.
2. The issue of volume support when conditioning to secondary data: The secondary data conditioning methodology explained in Section 3.3.2 dictates that the same template T be used for both the primary data event devT(u) and the secondary data event dev2T(u). Yet, in the Earth Sciences, secondary data (such as seismic) typically has a different volume support than that of the primary variable. Thus, using a different template T2 for the secondary data events and patterns to reflect this difference might improve the pattern reproduction quality of the realizations. Fig. 17 – 19 illustrate the case where a different template T2 is used for the secondary data events and patterns. In the figures, the primary template T has size 11 × 11 × 3 and the secondary template T2 has size 5 × 5 × 7.
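The center-weighting proposed in item 1 above can be sketched as follows. The inverse-distance weight form and the `decay` parameter are hypothetical choices for illustration, not part of the SIMPAT implementation:

```python
import numpy as np

def center_weighted_distance(dev, pat, offsets, decay=1.0):
    """Weighted Manhattan distance where data-event nodes closer to the
    template center node u receive larger weights in the comparison."""
    h = np.asarray(offsets, dtype=float)
    r = np.sqrt((h ** 2).sum(axis=1))       # |h_alpha|: distance to center
    w = 1.0 / (1.0 + decay * r)             # closer node -> larger weight
    return float((w * np.abs(dev - pat)).sum() / w.sum())

# 1D template: one node left of center, the center itself, one node right
offsets = [(-1, 0), (0, 0), (1, 0)]
dev = np.array([0, 1, 0])
pat_center_off = np.array([0, 0, 0])   # mismatch at the center node
pat_edge_off = np.array([1, 1, 0])     # mismatch at an edge node
d_center = center_weighted_distance(dev, pat_center_off, offsets)
d_edge = center_weighted_distance(dev, pat_edge_off, offsets)
# A mismatch at the center is penalized more than one at the edge
print(d_center > d_edge)
```

With such a weighting, hard data near u dominate the most similar pattern search, which is exactly the behavior item 1 calls for.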
Figure 17: Reference case for secondary data conditioning using a different T2. The
secondary data is obtained by applying a low-pass filter on the binary facies model.
Figure 18: Training image for secondary data conditioning using a different T2.
Figure 19: The conditional SIMPAT realization generated using the secondary data shown
in Fig. 17 and the training image in Fig. 18.
5.2 Relational Distance
The Minkowski metric explained in Section 2.4 can be used with both multiple-category and continuous variables. Yet, the generalized nature of this distance function might not be desired in all cases, especially when used with multiple categories: in the Earth Sciences, multiple categories are typically used to generate facies models of the field of study (reservoir). It is well understood that, in such models, certain facies exhibit tight relations (for example, a channel facies and an attached levee facies) and that such relations should be honored in the generated realizations.
In SIMPAT, facies relations are described through the training image patterns. The distance function used (Manhattan distance) is indifferent to these relations, i.e., it does not differentiate between facies and treats them 'equally'. For large templates, this property of the Manhattan distance might be undesirable due to its possible effect on the most similar pattern search (Duda et al., 2001). For example, the sensitivity to skip size discussed in Section 4.2 is believed to be one such undesirable effect. To circumvent the problem, one might envision replacing the Manhattan distance with a distance function in which the facies relations are expressed by defining certain categories to be 'less distant' to each other.
One possible implementation of the distance function described above can be achieved through defining a matrix of category (facies) relations and using this matrix to calculate the individual pixel distances within the summation over the template nodes hα. Consider the following general distance function:

d⟨devT(u), patkT⟩ = ∑_{α=0}^{nT} dα⟨devT(u + hα), patkT(hα)⟩   (11)
where dα⟨xα, yα⟩ is currently defined as |xα − yα|. Instead, one can define dα⟨xα, yα⟩ such that each xα, yα pair points to an entry in a matrix of relative distances between categories. For categorical variables, xα and yα are indicator values, i.e., xα, yα ∈ {0, 1, . . . , nf − 1} where nf is the number of categories. Thus, an xα, yα pair can be taken as the column and row indices of the matrix, and dα⟨xα, yα⟩ as the cell value of the matrix at that column and row. Unknown/missing values are trivially handled
by defining the unknown identifier χ as another unique category and setting the matrix entry to zero for all xα, yα pairs in which either xα or yα equals χ.
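The matrix lookup can be sketched as below. The facies codes and the relation values are hypothetical and chosen only for illustration:

```python
import numpy as np

# Hypothetical codes: 0 = background, 1 = channel, 2 = levee, 3 = chi (unknown).
# R[x, y] is the relative distance between categories x and y; channel and
# levee are declared 'less distant' (0.3) to each other than either is to
# the background, and every pair involving chi contributes zero, which
# handles unknown/missing nodes in the data event.
CHI = 3
R = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.3, 0.0],
    [1.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

def relational_distance(dev, pat, relation=R):
    """Sum over template nodes of d_alpha<x, y> = relation[x, y]."""
    return float(relation[dev, pat].sum())

dev = np.array([1, 1, CHI, 0])   # data event with one unknown node
pat = np.array([1, 2, 0, 0])     # candidate pattern
print(relational_distance(dev, pat))   # only the channel-vs-levee node counts
```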
The above distance function is named "Relational Distance" and will be implemented in a future version of SIMPAT. Two benefits are expected from the use of Relational Distance: (1) better pattern reproduction due to explicit honoring of facies relations within the distance function and (2) lower sensitivity to skip size.
5.3 Conditioning to Local Angles and Affinities
Constraining the realizations using available local angle and affinity information (typically
obtained from seismic or via expert input) can be achieved by using multiple training
images in a single simulation run.
The SNESIM algorithm uses a similar approach (Zhang, 2002). Consider the case of angle conditioning only. SNESIM first divides the angle data into classes. Say the angles of the field in question range between [−90, 90] degrees and a total of 6 angle classes are desired. SNESIM implicitly rotates the given training image by −75°, −45°, −15°, 15°, 45° and 75° and uses these rotated training images for different parts of the simulation grid, as dictated by the angle data. For example, if the angle at node u falls within [0, 30] degrees, the training image rotated by 15° is used to perform the data event lookup. A similar mechanism is utilized for affinities.
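The binning described above can be sketched as follows; the function name is illustrative:

```python
def angle_class(theta, n_classes=6, lo=-90.0, hi=90.0):
    """Map a local angle to its class index and class-center rotation.

    With 6 classes over [-90, 90] degrees the class centers are
    -75, -45, -15, 15, 45 and 75, matching the rotations listed above.
    """
    width = (hi - lo) / n_classes
    idx = min(int((theta - lo) // width), n_classes - 1)  # clamp theta == hi
    center = lo + (idx + 0.5) * width
    return idx, center

# An angle of 20 degrees falls in the [0, 30] bin, so the training image
# rotated by 15 degrees is used for the data event lookup at that node.
print(angle_class(20.0))   # (3, 15.0)
```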
A slightly different approach will be used in future implementations of SIMPAT. Instead of implicitly transforming the training images internally, SIMPAT will accept explicit training images for different parts of the simulation grid (called regions). For convenience, a separate software tool that performs angle and affinity transformations on training images will be provided to SIMPAT users.
To understand the approach better, consider again the case of angle conditioning only. Simply put, one can control the angle of individual patterns by using rotated training images in different parts of the simulation grid while ensuring a smooth transition between these parts (regions). At a given node u, the most similar pattern search is performed using the particular rotated training image, thus resulting in a rotated pattern. The same mechanism is also applicable to affinity conditioning. When both angle and affinity information are to be retained at the same location, a training image that is both rotated and scaled can be used.
5.4 Computational Issues
Step S-2 of the SIMPAT algorithm requires finding the most similar pattern pat∗T for a given data event devT(u) by minimizing d⟨devT(u), patkT⟩ for k = 0, . . . , npatT − 1. As explained in Section 3.1.4, SIMPAT currently employs a modified linear search to perform this minimization.
For complex training images and large simulation grids, the performance of linear search is undesirable, especially when a large template is used. One solution is to parallelize the algorithm and exploit multiple CPUs to improve the search performance. From an algorithmic point of view, parallelization of SIMPAT is trivial: one can divide the pattern database into ncpu parts, where ncpu is the number of CPUs available. Then, instead of minimizing d⟨devT(u), patkT⟩ for k = 0, . . . , npatT − 1, each CPU minimizes d⟨devT(u), patkT⟩ for k = α, . . . , β, where α and β denote the beginning and ending indices of ranges of size npatT/ncpu. In essence, each CPU searches for a most similar pattern pat∗,cT, where c denotes the CPU index and c = 0, . . . , ncpu − 1. Once the results of these minimizations are obtained, a final minimization of d⟨devT(u), pat∗,cT⟩ is performed to find the final most similar pattern pat∗T. This method of parallelization is currently being tested and is expected to be included in the next version of SIMPAT.
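The chunked minimization can be sketched as below. Here the per-chunk searches run serially, but each `local_best` call is independent and could be dispatched to its own CPU (the names are illustrative, not SIMPAT's):

```python
import numpy as np

def chunked_argmin(dev, patterns, n_cpu=4):
    """Split the pattern database into n_cpu ranges [alpha, beta), find the
    best pattern in each range independently, then minimize over the
    per-chunk winners to obtain the final most similar pattern."""
    bounds = np.linspace(0, len(patterns), n_cpu + 1, dtype=int)

    def local_best(alpha, beta):   # would run on CPU c in a parallel version
        dists = [np.abs(dev - patterns[k]).sum() for k in range(alpha, beta)]
        j = int(np.argmin(dists))
        return alpha + j, dists[j]

    winners = [local_best(a, b) for a, b in zip(bounds[:-1], bounds[1:]) if b > a]
    best_k, _ = min(winners, key=lambda w: w[1])
    return best_k

# 100 distinct binary patterns of 9 nodes (flattened 3x3 templates)
patterns = np.array([[int(b) for b in format(k, "09b")] for k in range(100)])
dev = patterns[42].copy()
print(chunked_argmin(dev, patterns))   # -> 42
```

Because the final reduction only compares ncpu candidate distances, the result is identical to the single-CPU linear search.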
Another possible solution to the CPU time problem is the use of dual templates. As explained in Section 2.6, once the multiple-grid simulation of the coarsest grid Gng−1 is complete, the entire realization re is fully informed, as dual templates populate the finest grid nodes in addition to the coarse grid nodes. Thus, the algorithm guarantees that for any grid Gg other than the coarsest grid Gng−1, the data event devTg(u) is also fully informed and does not contain unknown/missing nodes. One can envision exploiting this property of dual templates by applying a dimensionality reduction method to devTg(u) and the patterns patkTg and calculating the distance on these reduced data events and patterns. Such a reduction in the dimensionality of the most similar pattern search problem might provide significant performance gains. Furthermore, this method can be used in conjunction with the above parallelization methodology. One example of such a dimensionality reduction method is discussed in Zhang (2004).
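One generic way to realize such a scheme is a principal-component projection via SVD. This is an assumption-laden stand-in (Zhang (2004) uses a filter-based approach, not necessarily PCA), meant only to show why full informedness matters: the projection has no notion of missing nodes.

```python
import numpy as np

def pca_basis(patterns, n_components=4):
    """Mean and leading principal directions of the pattern database."""
    mean = patterns.mean(axis=0)
    _, _, vt = np.linalg.svd(patterns - mean, full_matrices=False)
    return mean, vt[:n_components].T      # basis shape: (n_T, n_components)

def reduced_argmin(dev, patterns, mean, basis):
    """Most similar pattern search in the reduced space.

    Valid only for fully informed data events (as guaranteed by dual
    templates on the finer grids), since every node is projected."""
    z_dev = (dev - mean) @ basis
    z_pat = (patterns - mean) @ basis
    return int(np.argmin(((z_pat - z_dev) ** 2).sum(axis=1)))

rng = np.random.default_rng(1)
patterns = rng.normal(size=(50, 9))       # 50 continuous 9-node patterns
mean, basis = pca_basis(patterns)
print(reduced_argmin(patterns[10], patterns, mean, basis))   # -> 10
```

Each distance now costs n_components multiply-adds instead of n_T, and the chunked search above applies unchanged to the reduced patterns.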
6 Acknowledgments
I would like to thank Prof. Jef Caers and Prof. Andre Journel of Stanford University for their advice and continual support. Many parts of the algorithm were refined and corrected thanks to Jef's suggestions and his thorough testing of the software implementation of SIMPAT. Dr. Sebastien Strebelle from ChevronTexaco set the initial motivations for different parts of the study and endured long hours of discussions about the methodology, sharing his experience on the subject matter. Tuanfeng Zhang from Stanford University spent several days with me explaining the implementation details of SNESIM and shared his ideas about possible improvements. Jianbing Wu and Jenya Polyakova (Stanford University) patiently helped with the testing of the software implementation of SIMPAT. Jianbing also generated one of the data sets used as a secondary data conditioning example. Dr. Renjun Wen from Geomodeling Technology Corp. kindly provided the SBED software, which was used to generate realistic and highly complex training images. The wonderful GEMS software of Nicolas Remy (Stanford University) was indispensable for visualizing the results. Finally, Anuar Bitanov (Agip) and Sunderrajan Krishnan (Stanford University) also contributed to different parts of the study by helping me see certain concepts more clearly.
References
J. Caers and G. B. Arpat. Multiple-point geostatistics: Choosing for stationarity or sim-
ilarity. SCRF Annual Meeting Report 17, Stanford Center for Reservoir Forecasting,
Stanford, CA 94305-2220, 2004.
J. Caers, S. Strebelle, and K. Payrazyan. Stochastic integration of seismic and geological
scenarios: A submarine channel saga. The Leading Edge, pages 192–196, March 2003.
C. Deutsch and A. Journel. GSLIB: Geostatistical Software Library. Oxford University
Press, Oxford, 2nd edition, 1998.
R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John Wiley & Sons, Inc., New York, 2nd edition, 2001.
F. Guardiano and R. Srivastava. Multivariate geostatistics: Beyond bivariate moments.
In A. Soares, editor, Geostatistics-Troia, pages 133–144. Kluwer Academic Publica-
tions, Dordrecht, 1993.
D. E. Knuth. The Art of Computer Programming. Addison-Wesley Pub. Co., 3rd edition, 1997.
M. Smid. Closest-point problems in computational geometry. In J. Sack and J. Urrutia, editors, Handbook on Computational Geometry. Germany, 1997.
S. Strebelle. Sequential Simulation Drawing Structures from Training Images. PhD thesis,
Stanford University, 2000.
S. Strebelle. Conditional simulation of complex geological structures using multiple-point
statistics. Mathematical Geology, Jan. 2002.
S. Strebelle, K. Payrazyan, and J. Caers. Modeling of a deepwater turbidite reservoir conditional to seismic data using multiple-point geostatistics. In SPE ATCE Proceedings, number SPE 77425. Society of Petroleum Engineers, Oct. 2002.
T. Tran. Improving variogram reproduction on dense simulation grids. Computers and
Geosciences, 20(7):1161–1168, 1994.
R. Wen, A. Martinius, A. Nass, and P. Ringrose. Three-dimensional simulation of small-scale heterogeneity in tidal deposits - a process-based stochastic simulation method. In 4th Annual Conf. Proceedings. International Association for Mathematical Geology, 1998.
J. Wu and A. Journel. Water saturation prediction using 4D seismic data. SCRF Annual
Meeting Report 17, Stanford Center for Reservoir Forecasting, Stanford, CA 94305-
2220, May 2004.
T. Zhang. Rotation and affinity invariance in multiple-point geostatistics. SCRF Annual
Meeting Report 15, Stanford Center for Reservoir Forecasting, Stanford, CA 94305-
2220, May 2002.
T. Zhang. A filter approach to modeling patterns from training images. SCRF Annual
Meeting Report 17, Stanford Center for Reservoir Forecasting, Stanford, CA 94305-
2220, May 2004.