
Supervised Multiattribute Classification

Kurt J. Marfurt (The University of Oklahoma)

3D Seismic Attributes for Prospect Identification and Reservoir Characterization

15-1

Course Outline

• Introduction
• Complex Trace, Horizon, and Formation Attributes
• Multiattribute Display
• Spectral Decomposition
• Geometric Attributes
• Attribute Expression of Geology
  • Tectonic Deformation
  • Clastic Depositional Environments
  • Carbonate Depositional Environments
  • Shallow Stratigraphy and Drilling Hazards
  • Igneous and Intrusive Reservoirs and Seals
• Impact of Acquisition and Processing on Attributes
• Attribute Prediction of Fractures and Stress
• Data Conditioning
• Inversion for Acoustic and Elastic Impedance
• Image Enhancement and Object Extraction
• Interactive Multiattribute Analysis
• Statistical Multiattribute Analysis
• Unsupervised Multiattribute Classification
• Supervised Multiattribute Classification
• Attributes and Hydraulic Fracturing of Shale Reservoirs
• Attribute Expression of the Mississippi Lime

15-2

Multiattribute Analysis Tools

Machine Learning Attribute Analysis

Supervised Learning
• Statistical Pattern Recognition
• Support Vector Machines
• Projection Pursuit
• Artificial Neural Networks

Unsupervised Learning
• K-means
• Mixture Models
• Kohonen Self-Organizing Maps
• Generative Topographic Maps

Interpreter-Driven Attribute Analysis

Interactive Analysis
• Cross-correlation on Maps
• Cross-plotting and Geobodies
• Connected Component Labeling
• Component Analysis
• Image Grand Tour

Statistical Analysis
• Analysis of Variance (ANOVA, MANOVA)
• Multilinear Regression
• Kriging with external drift
• Collocated co-kriging

15-3

Artificial Neural Nets (ANN)

[Figure: biological neurons]

15-4

Artificial Neural Nets (ANN)

Objective: from continuous input measurements (e.g., seismic attributes):

• Predict a continuous output (e.g., porosity)

• Predict discrete lithologies (e.g., wet sand, gas sand, limestone, shale, ...)

15-5

Artificial Neural Nets (ANN)

[Figure: the "duck test" as a classifier. The input attributes (looks like a duck? quacks like a duck? walks like a duck?) are combined with observations to yield an output of +1 (yes) or 0 (no).]

15-6

Linear Neurons used in Predictive Deconvolution

(Courtesy Rock Solid Images)

[Figure: a linear neuron as an N-long prediction operator, w. Input samples a1, a2, ..., aN, plus a bias input a0 = 1 with weight w0, feed a single perceptron whose output is the prediction

y = Σ wi ai   (i = 1, ..., N)

applied at a given prediction distance along a 0–3 s trace.]

15-7
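A linear neuron of this kind is simply a least-squares prediction filter. Below is a minimal Python sketch (illustrative only, not the Rock Solid Images implementation): solve for an N-long operator w that predicts the trace sample one prediction distance ahead. The synthetic trace, operator length, and prediction distance are all assumptions.

```python
import numpy as np

# Minimal sketch: design an N-long linear prediction operator w by least squares.
def prediction_filter(trace, n_weights, pred_distance):
    rows, targets = [], []
    for k in range(n_weights, len(trace) - pred_distance):
        rows.append(trace[k - n_weights:k])        # N past samples a_1..a_N
        targets.append(trace[k + pred_distance])   # sample to be predicted
    A, z = np.asarray(rows), np.asarray(targets)
    w, *_ = np.linalg.lstsq(A, z, rcond=None)      # y = sum_i w_i a_i
    return w

# Assumed synthetic trace: a periodic (multiple-like) component plus noise
t = np.arange(1500)
trace = np.sin(2 * np.pi * t / 50) + 0.1 * np.random.default_rng(0).normal(size=1500)
w = prediction_filter(trace, n_weights=20, pred_distance=10)
```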

The Perceptron

[Figure: a single perceptron. The input attributes ai, together with a bias input a0 = 1, are multiplied by unknown weights wi and summed:

y = Σ wi ai   (i = 0, ..., N)

The output r ramps from 0 ("no") to 1 ("yes"): r = 1 if y ≥ +0.5 and r = 0 if y ≤ -0.5. A smooth alternative is the logistic function r = 1/(1 + e^-y).]

15-8
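As a concrete illustration, here is a minimal Python sketch of this perceptron: the weighted sum of the attributes (with bias input a0 = 1) passed through a hard threshold placed at y = 0, midway between the slide's ±0.5 decision levels.

```python
import numpy as np

def perceptron(attributes, weights):
    """Hard-threshold perceptron: r = 1 ('yes') if y > 0, else r = 0 ('no')."""
    a = np.concatenate(([1.0], attributes))  # prepend the bias input a0 = 1
    y = np.dot(weights, a)                   # y = sum of w_i * a_i, i = 0..N
    return 1 if y > 0 else 0
```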

Inverter

input a1   output r
    0          1
    1          0

A single perceptron with weights w1 = -1 and w0 = +0.5 (bias input a0 = 1) implements the inverter:

y = -1*0 + 0.5*1 = +0.5 → r = 1 (yes)
y = -1*1 + 0.5*1 = -0.5 → r = 0 (no)

15-9

Boolean OR

input a1   input a2   output r
    0          0          0
    0          1          1
    1          0          1
    1          1          1

With weights w1 = w2 = 1 and w0 = -0.5:

y = 1*0 + 1*0 - 0.5*1 = -0.5 → r = 0
y = 1*0 + 1*1 - 0.5*1 = +0.5 → r = 1
y = 1*1 + 1*0 - 0.5*1 = +0.5 → r = 1
y = 1*1 + 1*1 - 0.5*1 = +1.5 → r = 1

15-10

Boolean AND

input a1   input a2   output r
    0          0          0
    0          1          0
    1          0          0
    1          1          1

With weights w1 = w2 = 1 and w0 = -1.5:

y = 1*0 + 1*0 - 1.5*1 = -1.5 → r = 0
y = 1*0 + 1*1 - 1.5*1 = -0.5 → r = 0
y = 1*1 + 1*0 - 1.5*1 = -0.5 → r = 0
y = 1*1 + 1*1 - 1.5*1 = +0.5 → r = 1

15-11
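The inverter, OR, and AND weights of the last three slides drop straight into the perceptron sketch from slide 15-8 (weights ordered [w0, w1, ...]; this snippet continues that example):

```python
# Verify the truth tables (weights ordered [w0, w1, ...]):
print(perceptron(np.array([0]), [0.5, -1.0]),    # inverter: a1 = 0 -> r = 1
      perceptron(np.array([1]), [0.5, -1.0]))    # inverter: a1 = 1 -> r = 0

for a1 in (0, 1):
    for a2 in (0, 1):
        a = np.array([a1, a2])
        print(a1, a2,
              perceptron(a, [-0.5, 1.0, 1.0]),   # Boolean OR
              perceptron(a, [-1.5, 1.0, 1.0]))   # Boolean AND
```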

Boolean XOR

input a1   input a2   output r
    0          0          0
    0          1          1
    1          0          1
    1          1          0

No single perceptron can reproduce this truth table. It doesn't work!

15-12

Linear Separability

[Figure: crossplots of the AND, OR, and XOR truth tables in the (a1, a2) plane. For AND and OR a single straight line separates the 0 outputs from the 1 outputs (OK!), but no single line separates the XOR classes (can't separate!).]

15-13

Boolean XOR: the hidden layer!

input a1   input a2   output r
    0          0          0
    0          1          1
    1          0          1
    1          1          0

Feed the attributes into two hidden-layer perceptrons: h1 = Boolean OR (w1 = w2 = 1, w0 = -0.5) and h2 = Boolean AND (w1 = w2 = 1, w0 = -1.5). The output perceptron combines them as y = 1*h1 - 1*h2 - 0.5*1:

(a1, a2) = (0, 0): h1 = 0, h2 = 0, y = 1*0 - 1*0 - 0.5*1 = -0.5 → r = 0
(a1, a2) = (0, 1): h1 = 1, h2 = 0, y = 1*1 - 1*0 - 0.5*1 = +0.5 → r = 1
(a1, a2) = (1, 0): h1 = 1, h2 = 0, y = 1*1 - 1*0 - 0.5*1 = +0.5 → r = 1
(a1, a2) = (1, 1): h1 = 1, h2 = 1, y = 1*1 - 1*1 - 0.5*1 = -0.5 → r = 0

XOR is solved by adding the hidden layer!

15-14
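Continuing the same sketch, the two hidden perceptrons (OR and AND) and the output perceptron reproduce the XOR truth table:

```python
def xor(a1, a2):
    h1 = perceptron(np.array([a1, a2]), [-0.5, 1.0, 1.0])    # hidden: Boolean OR
    h2 = perceptron(np.array([a1, a2]), [-1.5, 1.0, 1.0])    # hidden: Boolean AND
    return perceptron(np.array([h1, h2]), [-0.5, 1.0, -1.0]) # y = 1*h1 - 1*h2 - 0.5

for a1 in (0, 1):
    for a2 in (0, 1):
        print(a1, a2, xor(a1, a2))  # prints the XOR truth table: 0, 1, 1, 0
```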

A typical neural network

[Figure: perceptrons arranged in an input layer, a hidden layer, and an output layer. (Ross, 2002)]

15-15

Decision workflow

1. Choose the classes you wish to discriminate

2. Choose attributes that differentiate these classes

3. Train using calibrated or “truth” data

4. Validate with “truth” data not used in the training step

5. Apply to the target data

6. Interpret the results

(van der Baan and Jutten, 2000)

15-16
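As one concrete, entirely hypothetical realization of steps 2 through 5, here is a scikit-learn sketch; the attribute matrix, "truth" classes, and network size are all invented for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical data: one row of 3 seismic attributes per control point (step 2),
# with interpreted classes as "truth" (step 3).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # stand-in for interpreted classes

# Steps 3-4: train on part of the truth data, validate on the held-out rest
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
net.fit(X_train, y_train)
print("validation accuracy:", net.score(X_test, y_test))

# Step 5: apply the trained network to the target data
target_attributes = rng.normal(size=(1000, 3))
predicted_classes = net.predict(target_attributes)
```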

Alternative perceptrons

[Figure: three output functions, fs[r(w)], fG[r(w)], and fh[r(w)], applied to the perceptron output r(w): one yielding discrete output classes (e.g., lithology), and two differentiable functions for continuous output classes (e.g., porosity) and for intermediate results in the hidden layer. (van der Baan and Jutten, 2000)]

15-17

2-attribute example with a single decision boundary

[Figure: attributes a1 and a2, weights w0 (bias), w1, and w2, and a single perceptron r(w) whose thresholded sum y gives an output of 0 or 1, i.e., one decision boundary. (van der Baan and Jutten, 2000)]

15-18

Example of two attributes with a single decision boundary

[Figure: crossplot of attributes a1 and a2, with Class 1 separated from Class 2 by the linear decision boundary y = 0, i.e., a2 = -(w1/w2) a1 - w0/w2. (van der Baan and Jutten, 2000)]

Brad says: "We could have more than one decision boundary!"

15-19

Example of two attributes with three decision boundaries

[Figure: explicit representation. Each of three hidden-layer perceptrons has its own weights and contributes one decision boundary; a further set of weights combines the hidden-layer outputs into a single perceptron with output 0 or 1. (van der Baan and Jutten, 2000)]

15-20

Example of two attributes with three decision boundaries

This is a more compact representation of the previous image:

[Figure: compact representation of the attributes, weights, hidden layer, perceptron, and 0-or-1 output. (van der Baan and Jutten, 2000)]

15-21

Example of two attributes with three decision boundaries

[Figure: in the (a1, a2) plane, boundaries 1, 2, and 3 fence off the Class 1 region; the surrounding regions all belong to Class 2. (van der Baan and Jutten, 2000)]

15-22

The danger of too many boundaries (hidden neurons)

(courtesy Brad Wallet, OU)

Brad says: "You can overfit your data by putting in too many decision boundaries, thereby overdividing your attribute space!"

15-23

The danger of too many degrees of freedom (polynomial fitting)

[Figure: the same a1 vs. a2 data fit with 1st-, 2nd-, and 7th-order polynomials, each panel annotated with its prediction error. The 7th-order fit honors the training points but predicts withheld data poorly.]

15-24
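A small numerical sketch of this experiment (synthetic data, not the original figure's): fit half the points, hold out the rest, and compare prediction errors for 1st-, 2nd-, and 7th-order fits.

```python
import numpy as np

# Synthetic stand-in for the slide's data: a noisy, roughly linear trend
rng = np.random.default_rng(1)
a1 = np.linspace(0.0, 1.0, 30)
a2 = 2.0 * a1 + 0.3 * rng.normal(size=30)
train, test = np.arange(0, 30, 2), np.arange(1, 30, 2)   # interleaved split

for order in (1, 2, 7):
    coeff = np.polyfit(a1[train], a2[train], order)       # fit the training points
    resid = np.polyval(coeff, a1[test]) - a2[test]        # error on withheld points
    print(f"order {order}: prediction error {np.sqrt(np.mean(resid**2)):.3f}")
```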

The danger of too many attributes

[Figure: training data and validation data fit by the multilinear regression

y = w0 + Σ wj aj   (j = 1, ..., J)

With one attribute the fit is a 2D hyperplane (a line); with two attributes, a 3D hyperplane (a plane); with three attributes, a 4D hyperplane. Each added attribute fits the training data better, whether or not it improves the prediction of the validation data.]

15-25
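The same danger, sketched for the regression above: each added (here deliberately random) attribute lowers the training misfit, while the validation misfit tells the real story. Everything below is invented for illustration.

```python
import numpy as np

# One genuinely predictive attribute plus J-1 random ones; fit the first
# half of the control points, validate on the second half.
rng = np.random.default_rng(2)
n = 20
real = rng.normal(size=n)
y = real + 0.2 * rng.normal(size=n)

for J in (1, 3, 8, 15):
    cols = [np.ones(n), real] + [rng.normal(size=n) for _ in range(J - 1)]
    A = np.column_stack(cols)                        # y = w0 + sum_j w_j a_j
    w, *_ = np.linalg.lstsq(A[:n // 2], y[:n // 2], rcond=None)
    fit = A @ w
    train_rms = np.sqrt(np.mean((fit[:n // 2] - y[:n // 2]) ** 2))
    valid_rms = np.sqrt(np.mean((fit[n // 2:] - y[n // 2:]) ** 2))
    print(f"J = {J:2d}: training rms {train_rms:.3f}, validation rms {valid_rms:.3f}")
```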

A feed-forward network

One of several ways of estimating the weights, w (easily understood by geophysicists), is to use a Taylor series expansion. Let's define a⁰ = input attributes and z⁰ = output measurements, with an initial guess based on random weights, w.

Equation predicting the output from the input:
z = f[a⁰, r(w)]

Prediction error given the current weights, w:
Δz = z⁰ - f[a⁰, r(w)]

Sensitivity of the output to the weights (the Jacobian matrix; note that f must be differentiable!):
A(w) = ∂f[a⁰, r(w)]/∂w

The Taylor expansion then links the error to a weight update Δw:
Δz ≈ A(a⁰, w) Δw

(van der Baan and Jutten, 2000)

15-26

Tomography

Δtk = tk⁰ - tk(s) ≈ Σ (∂tk/∂sj) Δsj   (j = 1, ..., J)

Known output (measurements): the traveltimes tk⁰. Differentiable model system: tk(s). Unknown model parameters: the slownesses sj.

15-27

Neural networks

Δz = z⁰ - f[a⁰, r(w)] ≈ A Δw

Known output ("truth" data): z⁰. Known input (attributes): a⁰. Known previous model result: f[a⁰, r(w)]. Unknown model parameters: the weights, w.

15-28

Computing the weights, w

Δz ≈ A(a⁰, w) Δw, where the Jacobian A(a⁰, w) = ∂f[a⁰, r(w)]/∂w

The differentiable model system requires a differentiable perceptron, f[r(w)]!

(van der Baan and Jutten, 2000)

15-29

Iterative least-squares solution using the normal equations:

Δw = [Aᵀ(w) A(w) + εI]⁻¹ Aᵀ(w) Δz

where the εI term provides Levenberg-Marquardt (or Tikhonov) regularization.

15-30
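A toy sketch of this damped update for a two-weight differentiable "network" whose Jacobian can be written by hand; the model f, its Jacobian, and the damping value are invented for illustration.

```python
import numpy as np

def lm_step(w, f, jac, z0, eps=1e-2):
    """One update: w <- w + [A^T A + eps*I]^(-1) A^T (z0 - f(w))."""
    dz = z0 - f(w)                         # prediction error, delta-z
    A = jac(w)                             # Jacobian A(w)
    lhs = A.T @ A + eps * np.eye(len(w))   # regularized normal equations
    return w + np.linalg.solve(lhs, A.T @ dz)

# Invented differentiable model: f(w) = [sin(w0) + w1, w0*w1]
f = lambda w: np.array([np.sin(w[0]) + w[1], w[0] * w[1]])
jac = lambda w: np.array([[np.cos(w[0]), 1.0], [w[1], w[0]]])

z0 = f(np.array([0.5, -0.3]))              # "truth" data from known weights
w = np.array([0.2, 0.1])                   # initial guess
for _ in range(100):
    w = lm_step(w, f, jac, z0)
print(w, f(w) - z0)                        # the misfit shrinks toward zero
```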

A typical neural network

[Figure: input layer, hidden layer, and output layer, repeated from slide 15-15. (Ross, 2002)]

15-31

Example 1. Mapping a stratigraphic depositional system

(Ruffo et al., 2009) 15-32

Seismic line perpendicular to the channel system

(Ruffo et al., 2009) 15-33

Seismic facies classification using a neural network classifier

(Ruffo et al., 2009) 15-34

Use 4-way averaged vertical 2D GLCM attributes parallel to dip at a suite of azimuths

(Ruffo et al., 2009) 15-35

Seeding the facies classification algorithm

(Ruffo et al., 2009) 15-36

Lithofacies classification

(Ruffo et al., 2009) 15-37

Lithofacies classification scheme

(Ruffo et al., 2009) 15-38

Lithofacies classification

(Ruffo et al., 2009) 15-39

Seismic facies overlain on the seismic data

(Ruffo et al., 2009) 15-40

Horizon slice

(Ruffo et al., 2009) 15-41

Example 2. Clustering of λρ- and μρ-volumes

(Chopra and Pruden, 2003) 15-42

Neural network estimation

[Figure: neural network estimates of the gamma-ray response and of porosity (the latter with a mask generated from the gamma-ray response).]

(Chopra and Pruden, 2003) 15-43

San Luis Pass weather prediction exercise

August 24, 2005 – sunny
August 25, 2005 – storms
August 26, 2005 – sunny
August 27, 2005 – sunny
August 28, 2005 – sunny
August 29, 2005 – storms

Exercise: flip 6 coins. Heads = sunny; tails = stormy.

Read out your correlation rate:
0/6 = -1.00, 1/6 = -0.67, 2/6 = -0.33, 3/6 = 0.00, 4/6 = +0.33, 5/6 = +0.67, 6/6 = +1.00

15-44

San Luis Pass weather prediction exercise

Which coins best predict the weather in San Luis Pass?
Should Marfurt go fishing?

15-45
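A quick simulation of the exercise (assuming sunny = 0, storms = 1): roughly one coin-flipper in nine matches at least five of the six days purely by chance, so in a large class somebody's coins will always look predictive.

```python
import numpy as np

rng = np.random.default_rng(3)
weather = np.array([0, 1, 0, 0, 0, 1])           # Aug 24-29: sunny = 0, storms = 1
flips = rng.integers(0, 2, size=(100_000, 6))    # 100,000 six-flip "interpreters"
matches = (flips == weather).sum(axis=1)
print((matches >= 5).mean())                     # ~0.11, i.e., 7/64 by the binomial
```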

(Kalkomey, 1997)

Potential risks when using seismic attributes as predictors of reservoir properties

When the sample size is small, the uncertainty about the value of the true correlation can be large:

• given 10 wells with a correlation of r = 0.8, the 95% confidence interval is [0.34, 0.95];

• given only 5 wells with a correlation of r = 0.8, the 95% confidence interval is [-0.28, 0.99]!

15-46
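These intervals can be reproduced with the Fisher z-transform, which is presumably (my assumption) how they were computed: artanh(r) is approximately normal with standard error 1/√(n-3).

```python
import numpy as np

def corr_ci(r, n, z95=1.96):
    """95% confidence interval for a sample correlation via Fisher's z."""
    z = np.arctanh(r)              # z-transform of r
    half = z95 / np.sqrt(n - 3)    # standard error of z is 1/sqrt(n - 3)
    return np.tanh(z - half), np.tanh(z + half)

print(corr_ci(0.8, 10))   # ~( 0.34, 0.95)
print(corr_ci(0.8, 5))    # ~(-0.28, 0.99)
```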

(Kalkomey, 1997)

Spurious Correlations

A spurious correlation is a sample correlation that is large in absolute value purely by chance.

15-47

(Kalkomey, 1997)

The more attributes, the more spurious correlations!

15-48
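A sketch of why: the largest |r| found when screening K truly uncorrelated random attributes against a handful of wells grows steadily with K.

```python
import numpy as np

rng = np.random.default_rng(4)
wells = rng.normal(size=10)                     # a reservoir property at 10 wells
for K in (1, 5, 25, 100):
    attrs = rng.normal(size=(K, 10))            # K truly uncorrelated attributes
    best = max(abs(np.corrcoef(a, wells)[0, 1]) for a in attrs)
    print(f"{K:3d} attributes screened: max |r| = {best:.2f}")
```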

(Kalkomey, 1997)

Risk = expected loss due to our uncertainty about the truth × cost of making a bad decision

The cost of a Type I error (using a seismic attribute to predict a reservoir property to which it is actually uncorrelated) is:
• an inaccurate prediction, biased by the attribute; and
• inflated confidence in the inaccurate prediction, since the apparent prediction errors are small.

The cost of a Type II error (rejecting a seismic attribute for use in predicting a reservoir property with which it is truly correlated) is:
• a less accurate prediction than if we'd used the seismic attribute; and
• larger prediction errors than if we'd used the attribute.

15-49

Validation of Attribute Anomalies

1. Basic QC
• Is the well tie good?
• Are the interpreted horizons consistent and accurate?
• Are the correlations statistically meaningful?
• Is there a physical or well-documented reason for an attribute to correlate with the reservoir property to be predicted?

2. Validation
• Does the prediction correlate to control not used in training?
• Does the prediction make geologic sense?
• Does the prediction fit production data?
• Can you validate the correlation through forward modeling?

(Hart, 2002) 15-50

Validation of Attribute Anomalies (porosity prediction in the lower Brushy Canyon)

[Figure: porosity maps from a probabilistic neural network and from multivariate linear regression. The right-hand map has higher statistical significance and is geologically more realistic.]

(Hart, 2002) 15-51

Validation of Attribute Anomalies (through modeling of the Smackover Formation)

[Figure: seismic, instantaneous frequency, and envelope panels for field data and model data.]

Seismic attribute correlations: "Trust, but verify!"

(Hart, 2002) 15-52

Validation of Attribute Anomalies (through engineering and geology)

[Figure: neural network prediction (R = 0.96), multivariate linear regression prediction (R = 0.89), and a dip map. Engineering and geologic analyses indicate that fractures, associated with high-dip areas, play an important role in enhancing gas production from these tight carbonates. Stars indicate locations of wells drilled in 1999.]

(Hart, 2002) 15-53


Neural Networks

In Summary

• Neural networks find linear and nonlinear trends in the seismic data that can help correlate well control to maps and formations.

• Avoid using cyclical attributes (phase, strike, ...) with neural networks.

• A good neural network application will mimic the interpreter who trains it.

• Don't ask a poor interpreter to train a neural network!

• Lack of sufficient control or use of too many attributes can lead to false-positive and false-negative predictions!

"Understand your assumptions! Quality control your results! Avoid mindless interpretation!" (Bob Sheriff, 2004)

15-55

15-56

Recommended