
Deep learning seismic tomography

Mauricio Araya-Polo (Shell), Joseph Jennings* (Shell, Stanford) and Stuart Farris (Stanford)

March 30, 2018

Velocity model building (tomography)

• Estimate subsurface wave speed from seismic data

• Arguably the most important and difficult task in exploration geophysics

• Common approaches:

1. Refraction/reflection tomography

2. Full waveform inversion

Formulation of tomography problem

$$J(m) = \frac{1}{2}\,\lVert f(m) - d_{\mathrm{obs}} \rVert_2^2$$

- $m$: velocity model

- $d_{\mathrm{obs}}$: recorded data

- $f(m)$: predicted data

• $f(m)$ models synthetic data via the wave/Eikonal equation

• Solved using gradient-based optimization techniques

• Computationally demanding
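To make the objective concrete, here is a minimal sketch of minimizing $J(m)$ by gradient descent. It assumes a hypothetical linear forward operator $f(m) = Gm$ in place of a wave or Eikonal solver; the matrix `G`, the step size, and the iteration count are illustrative choices, not values from the talk.

```python
# Toy illustration only: a small linear operator G stands in for the
# wave/Eikonal solver, so the gradient has the closed form G^T (G m - d_obs).

def forward(G, m):
    """Predicted data f(m) = G m (hypothetical linear forward model)."""
    return [sum(g * mi for g, mi in zip(row, m)) for row in G]

def objective(G, m, d_obs):
    """J(m) = 1/2 * ||f(m) - d_obs||_2^2."""
    r = [p - d for p, d in zip(forward(G, m), d_obs)]
    return 0.5 * sum(ri * ri for ri in r)

def gradient(G, m, d_obs):
    """Gradient of J for the linear case: G^T (G m - d_obs)."""
    r = [p - d for p, d in zip(forward(G, m), d_obs)]
    return [sum(G[i][j] * r[i] for i in range(len(G))) for j in range(len(m))]

def gradient_descent(G, d_obs, m0, step=0.1, n_iter=500):
    """Fixed-step gradient descent starting from the model m0."""
    m = list(m0)
    for _ in range(n_iter):
        g = gradient(G, m, d_obs)
        m = [mi - step * gi for mi, gi in zip(m, g)]
    return m

G = [[2.0, 0.0], [1.0, 1.0], [0.0, 1.0]]   # assumed toy operator
m_true = [1.5, 3.0]                         # assumed "true" model
d_obs = forward(G, m_true)                  # noise-free synthetic data
m_est = gradient_descent(G, d_obs, m0=[0.0, 0.0])
```

In real tomography each evaluation of `forward` and `gradient` requires solving the wave or Eikonal equation, which is where the computational cost comes from.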

Observed data ($d_{\mathrm{obs}}$)

Velocity model ($m$) from tomography (Almomin, 2016)

Another perspective on tomography

$f(m)$ is known deterministically

Statistical learning

$$f : X \to Y$$

- $X$: random input vector (observed data)

- $Y$: random output variable (velocity models)

$$\arg\min_f \; L(Y, f(X))$$

- $L(Y, f(X))$: loss function

$$L(Y, f(X)) = (Y - f(X))^2 \quad \text{(squared error loss)}$$

$$\Rightarrow \; f(x) = E(Y \mid X = x)$$
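The statement that squared error loss leads to the conditional mean $E(Y \mid X = x)$ can be checked numerically. The sketch below assumes a small made-up data set with a discrete input, estimates the conditional mean per input value, and compares its mean squared error against a shifted predictor. All names and numbers are illustrative.

```python
from collections import defaultdict
from statistics import mean

def conditional_mean_predictor(xs, ys):
    """Return f(x) = average of y over samples sharing that x (empirical E[Y | X = x])."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    table = {x: mean(values) for x, values in groups.items()}
    return lambda x: table[x]

def mean_squared_error(f, xs, ys):
    """Average of the squared error loss (y - f(x))^2 over the samples."""
    return mean((y - f(x)) ** 2 for x, y in zip(xs, ys))

# Made-up samples: three observations for each of two input values.
xs = [0, 0, 0, 1, 1, 1]
ys = [1.0, 2.0, 3.0, 10.0, 12.0, 14.0]

f_star = conditional_mean_predictor(xs, ys)

def shifted(x):
    """Any other predictor, here the conditional mean plus a constant offset."""
    return f_star(x) + 0.5

mse_star = mean_squared_error(f_star, xs, ys)
mse_shifted = mean_squared_error(shifted, xs, ys)
```

On these samples `mse_star` is smaller than `mse_shifted`; moving the prediction away from the per-group mean in either direction increases the squared error.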

A different perspective on tomography

Estimated via statistical learning methods

Deep learning as a seismic imaging tool

• 2D fault prediction from shot gathers (Zhang, C. et al., 2014)

• 3D fault prediction from shot gathers (Araya-Polo, M. et al., 2017)

Why deep learning for velocity estimation?

• Potential for less computational burden

• Automated (no human-curated analysis of gathers)

Deep learning training workflow

[Workflow diagram: labels ($Y$) and features ($X$) feed the training objective $\arg\min_f L(Y, f(X))$]

Features for deep-learning tomography

Choosing the feature

• Suspected poor performance on raw data and dimensionality issues

• Many seismic attributes from which to choose

• Cheap to compute

Semblance (velocity spectrum)

• A basic velocity analysis tool (Taner and Koehler, 1969)

• Contains "apparent velocity" information

• Cheap to compute

Earth Model

Seismic Experiment

Common midpoint gather (CMP)

Synthetic CMP (muted direct arrival)

• Data redundancy

Synthetic CMP (NMO hyperbola)

NMO-corrected CMP
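The NMO hyperbola and the correction that flattens it can be sketched as follows. This assumes the standard hyperbolic moveout relation $t(x) = \sqrt{t_0^2 + x^2 / v^2}$ for a single trial velocity $v$; the sampling interval, offset, and velocity below are made-up values, and the linear interpolation is one simple choice among several.

```python
import math

def nmo_traveltime(t0, offset, velocity):
    """Hyperbolic reflection traveltime t(x) = sqrt(t0^2 + x^2 / v^2)."""
    return math.sqrt(t0 * t0 + (offset / velocity) ** 2)

def nmo_correct_trace(trace, offset, velocity, dt):
    """Flatten one trace: the output sample at t0 takes the input value at t(x).

    Linear interpolation between input samples; output samples whose
    traveltime falls past the end of the trace are left at zero.
    """
    n = len(trace)
    out = [0.0] * n
    for j in range(n):
        pos = nmo_traveltime(j * dt, offset, velocity) / dt
        lo = int(pos)
        if lo + 1 < n:
            w = pos - lo
            out[j] = (1.0 - w) * trace[lo] + w * trace[lo + 1]
    return out

# Assumed example: a reflection with t0 = 0.4 s recorded at 1000 m offset
# in a 2000 m/s medium arrives near 0.64 s, i.e. sample 160 at dt = 4 ms.
dt = 0.004
trace = [0.0] * 300
trace[160] = 1.0
corrected = nmo_correct_trace(trace, offset=1000.0, velocity=2000.0, dt=dt)
peak_sample = max(range(len(corrected)), key=lambda j: corrected[j])
```

With the correct velocity the event moves back to its zero-offset time (sample 100 here), so the same reflection lines up across all offsets of the gather. With a wrong velocity the event stays curved.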

Semblance

For a particular velocity, the NMO-corrected image $q[j, k]$ is processed in three steps:

• Stack over offset

• Smooth along time

• Normalization

Terms:

• $q$ - NMO-corrected image for a particular velocity

• $j$ - time index

• $k$ - offset index

• output index

• length of smoothing window

Semblance (velocity spectrum)
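A minimal sketch of the three steps named on the slides (stack over offset, smooth along time, normalization). It assumes the commonly used semblance definition written in the docstring. The output index `i`, the window half-length `half_window`, and the function name are assumptions, since the talk's own symbols for these are not given.

```python
def semblance(q, half_window):
    """Semblance trace for one NMO-corrected gather q[j][k] (time j, offset k).

    Assumed definition, with n_x the number of offsets:
        s[i] = sum_{j in window(i)} (sum_k q[j][k])^2
               / (n_x * sum_{j in window(i)} sum_k q[j][k]^2)
    where window(i) covers samples i - half_window .. i + half_window.
    """
    n_t = len(q)
    n_x = len(q[0])
    stack_sq = [sum(row) ** 2 for row in q]           # stack over offset, then square
    energy = [sum(v * v for v in row) for row in q]   # energy per time sample
    s = []
    for i in range(n_t):
        lo = max(0, i - half_window)
        hi = min(n_t, i + half_window + 1)
        num = sum(stack_sq[lo:hi])                    # smooth along time
        den = n_x * sum(energy[lo:hi])                # normalization
        s.append(num / den if den > 0.0 else 0.0)
    return s

# A perfectly flattened event (identical amplitude on every offset)...
q_flat = [[0.0] * 4, [1.0] * 4, [0.0] * 4]
# ...versus traces that cancel when stacked.
q_incoherent = [[1.0, -1.0, 1.0, -1.0] for _ in range(3)]

s_flat = semblance(q_flat, half_window=1)
s_incoherent = semblance(q_incoherent, half_window=1)
```

Repeating this for each trial velocity and placing the resulting traces side by side gives a velocity spectrum: values near 1 mark velocities that flatten an event, values near 0 mark velocities that do not.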

Training data

• $X$ (feature)

• $Y$ (label)

Deep learning training workflow

[Workflow diagram, repeated: labels ($Y$) and features ($X$) feed the training objective $\arg\min_f L(Y, f(X))$]

Deep neural network training

Training the network

• 10,000 synthetic pseudorandom velocity models
  - 80% training, 20% testing

• Squared error loss: $L(Y, f(X)) = (Y - f(X))^2$

• Metrics: structural similarity index (SSIM) and $R^2$ score

SSIM

• Compares mean and standard deviation within a window

• Mean SSIM (MSSIM): average of SSIM over all windows

• SSIM, MSSIM $\le 1$

• Example: image $x$ compared with $y_1$ gives MSSIM = 0.14

• Example: image $x$ compared with $y_2$ gives MSSIM = 0.84
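A minimal sketch of the window-based comparison, assuming the widely used SSIM formula with stabilizing constants $C_1 = (K_1 L)^2$ and $C_2 = (K_2 L)^2$, where $L$ is the data range. The non-overlapping uniform windows, the window size, and the constants are simplifying assumptions; the talk does not state which settings it used.

```python
def ssim_window(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """SSIM of two equally sized flat lists of pixel values (one window)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

def mssim(img_x, img_y, win=4, data_range=1.0):
    """Mean SSIM over non-overlapping win x win windows of two 2D lists."""
    n_rows, n_cols = len(img_x), len(img_x[0])
    scores = []
    for r in range(0, n_rows - win + 1, win):
        for c in range(0, n_cols - win + 1, win):
            wx = [img_x[i][j] for i in range(r, r + win) for j in range(c, c + win)]
            wy = [img_y[i][j] for i in range(r, r + win) for j in range(c, c + win)]
            scores.append(ssim_window(wx, wy, data_range))
    return sum(scores) / len(scores)

# Made-up 8 x 8 test images: a smooth ramp and its inverse.
ramp = [[(8 * i + j) / 63.0 for j in range(8)] for i in range(8)]
inverse = [[1.0 - v for v in row] for row in ramp]

score_same = mssim(ramp, ramp)
score_inverse = mssim(ramp, inverse)
```

An image compared with itself scores 1 (up to rounding), while the inverted image scores much lower because the structure term, driven by the covariance, turns negative.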

Training results

Deep learning testing workflow

Test set (20% of the total data)

Deep learning testing results

• MSSIM = 0.66170

• MSSIM = 0.72079

• MSSIM = 0.78219

• MSSIM = 0.76253

FWI results

MSSIM = 0.49666

Deep learning testing results

MSSIM = 0.66170

Multiscale FWI results

MSSIM = 0.84702

DNN vs multiscale FWI results

• DNN: MSSIM = 0.66170

• Multiscale FWI: MSSIM = 0.84702

Conclusions/future work

• We estimated a tomography operator via deep learning

• Semblance was the input feature

• Extend the method to 3D data

• Test on real data

Questions?

Coefficient of determination ($R^2$)

$$R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}}$$

where

$$SS_{\mathrm{res}} = \sum_{i=1}^{m} (y_i - f(x_i))^2 \quad \text{(error in prediction)},$$

$$SS_{\mathrm{tot}} = \sum_{i=1}^{m} (y_i - \mu_y)^2 \quad \text{(variance in the labels)}.$$

• $R^2 = 1 \Rightarrow$ zero error (we fit the data perfectly)

• $R^2 = 0 \Rightarrow$ no learning occurred; we predict $\mu_y$

• $R^2 < 0 \Rightarrow$ we perform worse than just predicting $\mu_y$
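The three cases can be reproduced with a few lines of code. This is a small sketch using made-up labels and predictions; the function name `r2_score` is an illustrative choice.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    mu_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))   # error in prediction
    ss_tot = sum((y - mu_y) ** 2 for y in y_true)                # variance in the labels
    return 1.0 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]                            # made-up labels, mean 2.5

r2_perfect = r2_score(y, y)                         # perfect fit -> 1.0
r2_mean = r2_score(y, [2.5, 2.5, 2.5, 2.5])         # always predict the mean -> 0.0
r2_worse = r2_score(y, [4.0, 3.0, 2.0, 1.0])        # reversed predictions -> -3.0
```

The last case has $SS_{\mathrm{res}} = 20$ against $SS_{\mathrm{tot}} = 5$, so $R^2 = 1 - 4 = -3$: the predictor is worse than always predicting $\mu_y$.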

Deep learning testing results