Non-parametric regression and neural-network infill drilling recovery models for carbonate reservoirs

Non-parametric regression and neural-network in®ll drillingrecovery models for carbonate reservoirs

C.H. Wu*, R.D. Soto, P.P. Valko, A.M. Bubela

Texas A&M University, College Station, TX 77843, USA

Received 15 March 1999; received in revised form 9 September 1999; accepted 9 September 1999

Abstract

This work introduces non-parametric regression and neural network models for forecasting the in®ll drillingultimate oil recovery from reservoirs in San Andres and Clearfork carbonate formations in West Texas.

Development of the oil recovery forecast models helps understand the relative importance of dominant reservoircharacteristics and operations variables, reproduce recoveries for units included in the database, forecast recoveriesfor possible new units in similar geological settings, and make operations decisions. The variety of applications

demands the creation of multiple recovery forecast models. One of the signi®cant constraints for the modeldevelopment is the limited number of ®eld data that are inexact and often exhibit uncertain relationships. Theinexact and uncertain relationship may also encompass a large number of possible independent variables. This

situation mandates proper selection of independent variables for the in®ll drilling recovery model. Non-parametricregression and multivariate principal component analysis are used to identify the dominant and the optimumnumber of independent variables. The advantage of the non-parametric regression is easy to use and can quicklyprovide results that reveal the dominant independent variables and relative characteristics of the relationships. The

disadvantage is retaining a large variance of forecast results for a particular data set. The insight of interdependencyof the variables gained in non-parametric regression and multivariate principal component analysis is employed todevelop an e�ective neural network. The neural network in®ll drilling recovery model is capable of forecasting the

oil recovery with less error variance. This work shows that a multiple use of various modeling techniques mayprovide a healthy interaction between the di�erent approaches and thereby, a better oil recovery forecast. 7 2000Elsevier Science Ltd. All rights reserved.

Keywords: West Texas; Oil recovery; Multivariate principal component; Dominant variables; Forecast models; Reservoir engineer-

ing

1. Introduction

The amount of oil that can be recovered from an oil

reservoir is dependent on the reservoir characteristics,

recovery method, number of wells, and operations e�-ciency. Consequent to the primary recovery, water-¯ood is often used as a secondary recovery process.

Since most reservoirs are heterogeneous, in®ll drillingafter an initial water ¯ood permits production of oilfrom parts of the reservoir that might otherwise be

bypassed (Shao et al., 1994a; Wu et al., 1989).Researchers and engineers have shown that the in®ll

Computers & Geosciences 26 (2000) 975±987

0098-3004/00/$ - see front matter 7 2000 Elsevier Science Ltd. All rights reserved.

PII: S0098-3004(00 )00033-9

* Corresponding author.

E-mail address: [email protected] (C.H. Wu).

drilling in the West Texas carbonate reservoirs has

indeed accelerated and increased the oil recovery (Luet al., 1993). However, the dominant mechanisms andfactors that a�ect the oil recovery are not fully ident-

i®ed or may be subjected to varying degrees of uncer-tainty. To forecast in®ll drilling recovery e�ciency ofan individual unit or a reservoir is a painful task for

most reservoir-engineering professionals.Many approaches are used to evaluate in®ll drilling

recovery e�ciency. The approach we take here is stat-istical. It is based on an oil recovery e�ciency database(Wu et al., 1989, 1997) developed for speci®c produ-

cing formations in a particular geological basin.The database includes oil recovery and reservoir and

production data of 21 units in the San Andres for-mation and 23 units in the Clearfork formation. Esti-mation of ultimate and incremental in®ll drilling

recovery has been a di�cult task because of the limiteddata and the uncertainties in the independent variables(Shao et al., 1994b; Wu et al., 1997). Many attempts

have been made to develop the in®ll drilling oil recov-ery forecast models with reference to reservoir rock

and ¯uid properties (Lu et al., 1993; Shao et al.,1994b; Wu et al., 1997). Linear regression, statisticalanalysis, or fuzzy logic was used to analyze the oil

recovery data. Whereas these reported models haveimproved over the years, they are not entirely satisfac-tory due to the inexact nature of the data set and the

inherent limitations in the models themselves.Much e�ort was spent in determining the dominant

and the optimum number of the independent variables(Lu et al., 1993; Shao et al., 1994b; Wu et al., 1997).Some of these independent variables go beyond the

basic reservoir and ®eld properties. Additional par-ameters such as the primary ultimate recovery (PRUR)

and the initial water¯ood ultimate recovery (IWUR)are found needed for forecasting IDUR (Lu et al.,1993).

Non-parametric regression analysis and neural net-work modeling are proposed to create additional fore-cast models that would improve the degree of

consistency and accuracy. The non-parametricapproach is based on the work of Breiman and Fried-

man (1985) and Xue and Datta-Gupta (1996). Whereasthe non-parametric regression analysis provided betteridenti®cation of dominant independent variables and

more consistent forecast, it did not improve the var-iance of forecast results. The application of neural net-works for modeling nonlinear systems has increased

substantially in recent years (Nikravesh et al., 1996;Azimi-Sajadi and Liou, 1992). However, one of the

problems that needs to be addressed is the determi-nation of the dominant and the optimum number ofindependent variables for practical ®eld application.

The non-parametric regression analysis and the princi-pal components and factor analysis (Johnson and

Wichern, 1998) are used to identify the dominant inde-pendent variables to develop more e�cient and realis-

tic neural-networks models.

2. Non-parametric regression approach for estimating

optimal transformations for multiple regressions

Non-parametric regression is one of the novelapproaches to constructing a suitable model descrip-tion from available information. It is developed to alle-

viate the problem of parametric regression that oftenleads to erroneous results caused by the mismatchbetween assumed model structure and the behavior of

the actual data. In non-parametric regression we donot ®x a priori the `form' of the dependency of thedependent variable on the independent variables. In

fact, one of the main results of non-parametric re-gression is the form of the relationship.Non-parametric regression is intended to build a

model in the form,

y � f ÿ10 � f1�x 1� � f2�x 2� . . .� fn�xn�� 1�

where the inverse transformation fÿ10 and the individ-ual transformations, f1, f2, . . . , fn are selected to maxi-mize the correlation between the right-hand and left-hand sides of the relation

f0� y� � f1�x 1� � f2�x 2� . . .� fn�xn� �2�subject to some constraints.

In this case the symbol fi�xi � does not necessarilymean a certain algebraic expression. It is rather a re-lationship de®ned point-wise. The method of alternat-ing conditional expectations (ACE) (Breiman and

Friedman, 1985) constructs and modi®es the individualtransformations in order to maximize the correlationin the transformed space. To make the ACE algorithm

really work, however, one has to imply some kind ofrestriction on the smoothness of the individual trans-formations by the way a certain ``smoother'' is used to

construct and improve the transformations.The results of the ACE algorithm consist of the indi-

vidual transformations, given in the form of a point-by-point plot or table. The resulting ``model'' can be

used to predict a y value corresponding to a set ofindependent variables, x 1, x 2, . . ., xn: The problem is,of course, that the transformations are known only

point-by point, therefore some kind of interpolation isneeded to ``calculate'' the transformed variables and toapply the inverse transformation at the end. Fortu-

nately, for many physically sound problems the shapeof the individual transformations is rather simple, andhence (maximum) quadratic polynomials are usually

C.H. Wu et al. / Computers & Geosciences 26 (2000) 975±987976

Table

1

SanAndresdata

I

FIE

LD/U

NIT

OOIP

(MSTB)

AREA

(acres)

DEPTH

(feet)

NET

(feet)

GROSS

(feet)

POR

(%)

SW

(%)

API

VIS

(sp)

FVF

(RB/STB)

PRESS

(psi)

PERM

(md)

1Adair``SA''

169,439

5338

4800

50

105

14.1

35.0

34

1.6

1.12

1875

3.7

2FuhramnM/BL10``GBSA''

78,383

6134

4300

41

250

7.7

40.0

31

3.5

1.15

1600

2.4

3FuhramnM/BL9``GBSA''

55,939

3948

4450

41

250

7.0

30.0

29

3.3

1.10

1600

4.0

4Johnson/``G

B''``SA''

63,003

3720

4150

50

130

6.7

21.8

33

3.6

1.20

1595

5.3

5Johnson/`À

B''``SA''

18,247

840

4100

60

148

8.0

30.0

39

1.3

1.20

2500

1.8

6Levella/N

Cen

UN

``SA''

131,981

11,250

4750

31

70

8.0

25.0

31

2.5

1.23

1690

1.8

7Mabee/JEMabee

À'``SA''

279,112

13,030

4700

40

50

10.5

29.0

32

2.4

1.08

1905

1.5

8Means``SA''

376,693

14,328

4300

55

300

9.0

28.8

29

6.2

1.04

1850

29.0

9Ownby``SA''

47,508

2960

5200

32

85

14.1

38.1

32

1.5

1.35

1800

4.5

10

Ownby/BLGilstrap``SA''

3643

160

5235

40

85

14.2

38.0

31

2.0

1.20

1800

4.5

11

Sable

``SA''

33,331

1340

5200

57

79

9.0

25.0

32

2.2

1.20

1550

1.5

12

Sem

inle

``SA''

1,154,378

15,699

5300

126

154

12.0

16.0

35

1.1

1.34

2020

31.2

13

Shafter

``SA''

184,381

11,080

4300

55

200

6.5

25.0

32

1.3

1.25

1865

5.0

14

Slaughter/IgoeSM

``SA''

63,155

2124

4930

49

120

11.2

14.1

32

1.4

1.23

1710

5.0

15

Triple-N

``GB''

18,683

2040

4325

20

20

12.1

40.0

32

1.8

1.23

2129

6.6

16

Wasson/Bennet

``SA''

394,925

7027

5100

130

865

10.0

27.0

33

1.6

1.31

1805

1.7

17

Wasson/C

ornell``SA''

182,409

1923

4900

220

305

8.5

15.0

33

1.3

1.30

1850

3.7

18

Wasson/D

enver

``SA''

2,172,316

25,505

4800

141

290

12.0

15.0

33

1.8

1.31

1805

5.0

19

Wasson/R

eborts``SA''

394,971

13,575

4900

68

255

8.5

15.0

33

1.6

1.31

1805

5.0

20

Wasson/W

illard

``SA''

699,419

13,360

5100

130

200

8.5

20.0

32

1.8

1.31

1805

1.5

21

WestSem

inole

``SA''

196,021

3640

5112

118

210

9.9

18.0

32

1.0

1.38

2020

20.8

C.H. Wu et al. / Computers & Geosciences 26 (2000) 975±987 977

enough to represent the individual transformations inalgebraic form.

3. Neural networks

A neural network is a series of layers with nodesand weights that represent complex relationships

among input and output variables. The ®rst layer hasinput nodes representing the input variables speci®edby the problem. The node of a hidden layer uses the

sum of the weighted outputs of previous layer and sig-moid function to provide output for the nodes in thesubsequent hidden layer. The number of nodes in eachlayer and the weights are determined by trials and by

optimization. The objective of the neural network is toobtain optimal weights to give a best value for thenodes of the output layer.

Performance learning is one of the most importantsteps where network parameters are adjusted in ane�ort to optimize the performance of the network.

Two steps are involved in this optimization process.The ®rst step is to de®ne a quantitative measure of net-work performance, performance index. It represents a

global error of the neural network de®ned as,

E �Xp

Xk

� yrealpk ÿ yestimated

pk � �3�

where the inner summation is over all nodes in theoutput layer, and the outer sum is over the number of

the training set. During training, the size of error gen-erally decreases until it reaches a threshold level.The second step of the optimization process is to

search the parameter space to reduce the performanceindex. A steepest descent algorithm is often used forthe optimization. Mathematically, the weights are

adjusted as follows:

Wnew �Wold � Z � d � o �4�where Z is learning rate and o is the output value from

hidden nodes. d is the error signal term produced bythe nodes in the hidden or output layer,

d � ÿ @E

@ �SWold � o� : �5�

Before training the network, the learning rate (Z ) forthe network must be speci®ed. The new weights arethen used to calculate the new output. The procedure

is repeated till a tolerance is satis®ed.

4. Data set

The input data are tabulated in Tables 1, 2, 3, and

4. Tables 1 and 2 list the reservoir and operationsproperties of the San Andres units. The dependent

Table 2

San Andres data II

FIELD/UNIT NOPW PRUR

(MSTB)

NOWW IWUR

(MSTB)

DIWUR

(MSTB>

NOIW IDUR (MSTD) DIDUR

(MSTB)

1 Adair ``SA'' 109 21,398 130 44,352 22,954 178 65,401 21,049

2 Fuhramn Masho/BL10 ``GBSA'' 108 7733 118 9957 2224 133 10,513 556

3 Fuhramn Masho/BL9 ``GBSA'' 77 6354 136 8435 2081 158 10,343 1908

4 Johnson/``GB'' ``SA'' 83 9690 116 15,528 5838 149 17,413 1885

5 Johnson/`ÀB'' ``SA'' 15 1548 38 3842 2294 93 5812 1970

6 Levelland/N Cen UN ``SA'' 268 18,112 363 33,947 15,835 489 57,198 23,251

7 Mabee/JE Mabee À' ``SA'' 290 34,316 592 74,618 40,302 620 88,786 14,168

8 Means ``SA'' 299 64,477 398 127,512 63,035 754 151,695 24,183

9 Ownby ``SA'' 49 6224 59 10,886 4662 72 16,175 5289

10 Ownby/BL Gilstrap ``SA'' 4 389 5 1257 868 8 1593 336

11 Sable ``SA'' 37 4209 64 9203 4994 71 11,094 1891

12 Seminle ``SA'' 327 196,265 523 448,771 252,506 604 537,711 88,940

13 Shafter ``SA'' 258 21,039 326 32,961 11,922 369 35,201 2240

14 Slaughter/IGOE Smith ``SA'' 42 9044 82 25,786 16,742 97 27,732 1946

15 Triple-N ``GB'' 23 2096 40 4966 2870 73 6876 1910

16 Wasson/Bennet ``SA'' 213 36,325 293 97,279 60,954 468 119,502 22,223

17 Wasson/Cornell ``SA'' 71 21,718 92 64,338 42,620 128 67,765 3427

18 Wasson/Denver ``SA'' 386 207,826 593 383,090 175,264 1417 943,060 559,970

19 Wasson/Reborts ``SA'' 194 45,491 377 101,171 55,680 424 111,600 10,429

20 Wasson/Willard ``SA'' 223 59,029 304 102,844 43,815 461 178,312 75,468

21 West Seminole ``SA'' 65 10,073 93 28,537 18,464 152 40,421 11,884


Table

3

Clearfork

data

I

FIE

LD/U

NIT

OOIP

(MSTB)

AREA

(acres)

DEPTH

(feet)

NET

(feet)

GROSS

(feet)

POR

(%)

SW

(%)

API

VIS

(cp)

FVF

(RB/STD)

PRESS

(psi)

PERM

(md)

1DiamondM/Jack

3004

320

3170

34

105

7.0

40.0

30.5

2.4

1.18

1600

8.0

2DiamondM/M

cLaAC

16574

720

3170

32

65

7.0

38.0

30.5

2.4

1.18

1200

3.0

3DollarhideÀ

B'

72,873

2631

6500

68

357

8.9

18.0

37.0

0.6

1.39

2890

8.4

4Flanagan/C

learfork

81,812

4850

6380

32

468

11.4

24.9

32.2

1.7

1.26

1875

5.2

5Fullerton

1,032,853

29,542

6700

87

500

10.0

22.3

42.0

0.5

1.50

2940

3.0

6Goldsm

ith5600/C

A610,244

15,200

5600

75

350

15.0

31.0

38.0

0.7

1.50

2330

28.0

7Goldsm

ith/Landreth

119,967

7814

5550

39

345

9.6

26.0

39.0

0.5

1.40

2330

2.6

8Lee

Harrison/W

est

20,698

920

4850

44

84

12.5

42.0

25.0

8.7

1.10

2000

4.0

9Monahans

111,620

4700

4600

60

600

10.0

25.0

37.0

8.1

1.47

2200

2.0

10

NorthRiley

``CF''

140,362

6960

6300

65

66

7.7

33.0

32.0

2.6

1.29

2760

12.0

11

Ownby/U

CFU

39,283

2133

6525

78

259

5.0

30.0

27.0

1.7

1.15

2400

1.2

12

Smyer/East

80,445

3120

6700

84

700

7.0

35.0

29.0

1.8

1.15

2400

7.7

13

Prentice/6700/6700CLFK

161,577

6828

6700

73

696

8.2

41.4

28.0

1.7

1.15

2400

3.0

14

Prentice/N

E51,362

2000

6450

100

370

6.2

38.6

28.0

1.7

1.15

2400

3.0

15

Robertson/N

orth

274,757

4696

5800

236

1300

6.3

30.0

31.0

1.2

1.38

2950

0.7

16

Russell/7000CFU

209,836

8510

7350

101

307

5.3

24.0

34.7

0.8

1.28

2600

1.0

17

Smyer/East

63,419

4410

5800

36

110

8.3

33.0

26.5

5.8

1.08

2100

3.4

18

Smyer/Ellwood`À''

81,877

4320

5990

39

174

8.3

20.0

25.0

5.1

1.06

1858

5.0

19

Wasson72/G

aines

108,446

4400

5675

85

760

6.4

27.0

35.0

1.0

1.25

2600

1.0

20

Wasson72/G

ibson

151,836

3760

6600

169

725

5.5

30.0

31.0

1.5

1.25

2700

0.5

21

Wasson72/South

240,354

4961

6400

137

1255

7.7

26.0

32.0

1.4

1.25

2600

5.5

22

Wasson72/Y

oakum

80,032

7400

5675

37

336

6.4

27.0

35.0

1.0

1.24

2600

0.5

23

WassonNECF/N

orth

69,224

4320

6400

81

234

5.1

35.0

30.0

1.5

1.30

2643

0.2


variables are primary ultimate oil recovery (PRUR),initial water¯ood ultimate recovery (IWUR) and in®ll

drilling ultimate recovery (IDUR). The labels andunits of each variable are referred to in Appendix A.During the data analysis and model development, it is

found that IWUR is strongly dependent on PRUR;and IDUR on PRUR and IWUR. The sequentialdependency complicates the development of depend-

able IDUR models. Table 3 and 4 list the properties ofClearfork units.

5. The dominant independent variables

Both San Andres and Clearfork units are combined

as a data set with BASIN 1 indicating San Andresunits and BASIN 2 Clearfork units. The input datainclude the reservoir and ¯uid properties. They also

include the operations properties such as the numberof primary recovery wells, the number of initial water-¯ood wells, and the number of in®ll wells. Since the oil

recovery is assumed to be partially dependent on theproductive area and the number of wells drilled (aver-age well spacing by taking a ratio), the well locations

need not be speci®ed.The non-parametric regression with alternative con-

ditional expectation (ACE) algorithm provides point-wise transformations that reveal a wealth of infor-

mation on the signi®cance of the given variable. Theindependent variable can be ranked by the range of itstransformed values. For dominant independent vari-

ables, the range is greater than unity, indicating thatthe transformed dependent variable depends heavily onthe actual value of the given independent variable.

Table 5 shows the ranking of the available independentvariables to forecast PRUR, IWUR and IDUR, re-spectively.

When the ranges of two or more transformed inde-pendent variables are basically the same, the shape ofthe transformation function can be used to assist theselection of dominant variables. Monotonic behavior is

anticipated from reservoir engineering considerations,and the nonparametric transformation should re¯ectthis condition for selected dominant variables. A vari-

able is preferred if the shape of the transformationdemonstrates a higher degree of linearity.After examining the graphs and ranges of the trans-

formed variables for all the independent variables, thefour dominant independent variables for PRUR are:productive area, net pay, oil formation volume factor,

and API gravity. The dominant independent variablesfor IWUR are: primary ultimate recovery, number of

Table 4

Clearfork data II

FIELD/UNIT NOPW PRUR

(MSTB)

NOWW IWUR

(MSTB)

DIWUR

(MSTB)

NOIW IDUR

(MSTB)

DIDUR

(MSTB)

1 Diamond M/Jack 5 346 9 585 239 17 786 201

2 Diamond M/McLa AC 1 11 513 18 620 107 33 851 231

3 Dollarhide ÀB' 77 13,663 80 25,511 11,848 175 37,070 11,560

4 Flanagan/Clearfork Cons 93 12,307 105 27,333 15,026 110 29,987 2653

5 Fullerton 739 119,055 821 217,634 98,579 1136 325,222 107,588

6 Goldsmith 5600/CA Gldsmith 461 64,109 661 119,520 55,411 800 121,038 1518

7 Goldsmith/Landreth (2) 191 28,936 195 46,281 17,345 260 65,440 19,159

8 Lee Harrison/West 12 2212 19 2861 649 26 3513 652

9 Monahans 63 5245 124 12,176 6931 235 20,933 8757

10 North Riley ``CF'' 131 18,378 139 23,718 5340 232 38,688 14,970

11 Ownby/UCFU 42 3729 43 7079 3351 69 9472 2393

12 Smyer/East 73 18,861 74 24,068 5208 95 36,601 12,533

13 Prentice/6700/6700 CLFK 129 33,200 139 77,712 44,512 273 99,519 21,807

14 Prentice/NE 50 6554 69 16,322 9768 125 23,696 7374

15 Robertson/North 104 27,841 124 35,682 7841 361 67,538 31,856

16 Russell/7000 CFU 185 38,902 198 54,866 15,964 304 62,680 7814

17 Smyer/East 52 4949 100 14,507 9557 134 14,592 86

18 Smyer/Ellwood `À'' 108 8191 135 20,365 12,174 154 26,402 6037

19 Wasson 72/Gaines 105 17,563 107 21,869 4305 138 23,655 1787

20 Wasson 72/Gibson 85 12,798 96 14,968 2170 118 21,973 7004

21 Wasson 72/South 121 41,098 171 59,401 18,303 184 73,063 13,662

22 Wasson 72/Yoakum 91 13,697 130 15,621 1924 145 17,662 2040

23 Wasson NE CF/North 82 9986 96 13,400 3414 117 17,212 3812


water¯ood wells, productive area, and porosity. The

dominant independent variables for IDUR are: num-

ber of initial water¯ood wells, initial water¯ood ulti-

mate recovery, number of in®ll wells, and primary

ultimate recovery.

It is surprising to note that the PRUR is a strong

function of the original oil in place as implied by the

dominant input variables such as productive area and

net pay. It is dependent more on operations e�orts

(better reservoir and production engineering) than on

the variation of ¯uid properties. It is also surprising to

note that the PRUR has a great in¯uence on the

IWRU and thereby, the IDUR. It is not surprising to

note that IWUR and IDUR are heavily dependent on

the number of in®ll drillings. It is also interesting to

note that the non-parametric regression method ®nds

that basically the same model describes the oil recovery

of the two basins.

Multivariate statistical analysis is also performed to

determine the dominant independent variables with

reference to that investigated by the non-parametric re-

gression analysis. Table 6 shows the results of a princi-

pal component analysis for PRUR. To describe the

possible relationships among the variables and deter-

mine if there is any possibility for grouping variables,

we used the principal component factor analysis (John-

son and Wichern, 1998). Table 7 shows an output of

the rotated factor loading. The ®rst factor grouped the

primary ultimate oil recovery (PRUR) with the pro-

ductive area and the number of primary recovery wells

(NOPW) as indicated by the loading values. The ®rst

factor shows the dependence of the primary ultimateoil recovery on the productive area and the number of

wells for primary recovery. The second factor groupsthe depth and initial reservoir pressure. The third fac-tor groups the API gravity and FVF. The fourth factor

groups gross and net thickness. The other factors showthat the water saturation, porosity, permeability, andviscosity are also important independent variables.Similar principal component analysis was performed

for IWUR and IDUR.

Table 5

Range and ranking of transformed independent variables

Variable PRUR IWUR IDUR

Max Min Range Rank Max Min Range Rank Max Min Range Rank

Area 0.9 ÿ4.1 5.0 1 1.2 ÿ0.2 1.4 3 0.1 ÿ0.3 0.4 6

Net 0.9 ÿ1.0 1.9 2 0.0 ÿ0.1 0.1 13 0.0 0.0 0.1 13

POR 0.2 ÿ0.4 0.6 9 0.2 ÿ0.4 0.6 4 0.1 ÿ0.3 0.4 7

SW 0.2 ÿ0.1 0.3 11 0.1 ÿ0.1 0.2 11 0.1 ÿ0.2 0.3 8

API 0.3 ÿ0.8 1.1 4 0.2 ÿ0.1 0.3 9 0.1 ÿ0.1 0.2 10

VIS 0.3 ÿ0.3 0.6 7 0.1 0.0 0.2 12 0.1 ÿ0.1 0.1 14

FVF 0.8 ÿ0.6 1.4 3 0.0 0.0 0.1 14 0.0 0.0 0.1 15

Depth 0.1 ÿ0.4 0.5 10 0.4 ÿ0.2 0.6 5 0.1 ÿ0.1 0.2 11

Gross 0.4 ÿ0.5 0.8 5 0.1 ÿ0.3 0.4 6 0.0 0.0 0.0 17

Press 0.1 ÿ0.1 0.2 12 0.1 ÿ0.3 0.4 7 0.1 ÿ0.2 0.2 12

Perm 0.2 ÿ0.3 0.6 8 0.0 ÿ0.1 0.1 15 0.1 ÿ0.1 0.3 9

NOPW 0.1 ÿ0.5 0.6 6 0.1 ÿ0.1 0.2 10 0.1 ÿ0.5 0.6 5

Basin 0.1 ÿ0.1 0.2 13 0.2 ÿ0.3 0.4 8 0.0 ÿ0.1 0.1 16

PRUR 0.9 ÿ3.4 4.2 1 0.3 ÿ1.5 1.8 4

NOWW 0.2 ÿ2.0 2.2 2 2.3 ÿ0.5 2.8 1

IWUR 0.5 ÿ2.1 2.6 2

NOIW 0.4 ÿ1.9 2.3 3

Table 6

PRUR principal components

Eigenvalue Di�erence Proportion Cumulative

PRIN1 1.37501 0.48509 0.35456 0.35456

PRIN2 0.88991 0.49477 0.22947 0.58404

PRIN3 0.39514 0.12847 0.10189 0.68593

PRIN4 0.26667 0.05136 0.06876 0.75469

PRIN5 0.21531 0.02799 0.05552 0.81021

PRIN6 0.18732 0.02899 0.04830 0.85851

PRIN7 0.15832 0.02338 0.04083 0.89934

PRIN8 0.13494 0.04149 0.03480 0.93414

PRIN9 0.09346 0.02555 0.02410 0.95824

PRIN10 0.06791 0.02022 0.01751 0.97575

PRIN11 0.04769 0.02118 0.01230 0.98804

PRIN12 0.02651 0.00981 0.00684 0.99488

PRIN13 0.01669 0.01352 0.00431 0.99918

PRIN14 0.00317 0.00000 0.00082 1.00000


Results of the principal component and factor analy-sis indicate that the dominant independent variables

for PRUR are: productive area, initial reservoir press-ure, API gravity, FVF, net pay, initial water satur-ation, and porosity. The dominant independent

variables for IWUR are productive area, number of in-itial water¯ood wells, initial water saturation, API,

FVF, net pay, porosity, and viscosity. The dominantindependent variables for IDUR are productive area,

the number of in®ll wells, API, FVF, net pay, vis-cosity, initial water saturation, and porosity. Theseidenti®ed dominant independent variables are similar

to that identi®ed by the non-parametric regressionanalysis.

Table 7

PRUR rotated factor loadings. Most important loadings are shown in bold

Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6 Factor 7 Factor 8

Basin ÿ0.01042 0.56157 0.13817 ÿ0.11926 0.31755 0.09918 0.11858 ÿ0.18657Area 0.46452 ÿ0.00983 ÿ0.06421 ÿ0.07218 0.06722 0.08984 ÿ0.17765 ÿ0.09534Net ÿ0.05270 ÿ0.14476 ÿ0.11316 0.61773 ÿ0.05406 ÿ0.08267 0.06656 0.05867

POR ÿ0.06108 0.08698 ÿ0.00592 0.06310 0.05034 1.02195 ÿ0.13180 0.01320

SW 0.10311 ÿ0.07040 ÿ0.03856 0.11391 0.97638 ÿ0.18867 0.02639 0.06323

API ÿ0.03532 ÿ0.08297 0.58774 ÿ0.12251 0.00345 0.08644 ÿ0.04098 ÿ0.13480VIS 0.03059 0.04800 0.05336 0.06179 0.00214 ÿ0.17756 ÿ0.06160 0.49233

FVF ÿ0.14050 0.01949 0.51102 0.08040 0.17243 ÿ0.08363 0.01365 0.12214

Depth 0.00448 0.56556 ÿ0.34269 ÿ0.15889 ÿ0.20299 ÿ0.36244 ÿ0.12011 0.40166

Gross 0.01756 ÿ0.07490 0.07738 0.79419 0.18190 0.28622 ÿ0.09721 0.03603

Press ÿ0.01853 0.32704 0.08920 ÿ0.03770 ÿ0.05751 0.06433 0.09882 ÿ0.04928Perm ÿ0.10414 0.04018 ÿ0.00424 ÿ0.01917 ÿ0.04296 0.04682 0.32101 ÿ0.17247NOPW 0.42865 0.00465 0.00249 ÿ0.07670 0.02886 0.18458 ÿ0.09693 ÿ0.10831PRUR 0.31515 ÿ0.03005 ÿ0.18314 0.12197 ÿ0.03446 0.00673 0.07818 0.05876

Table 8

San Andres recoveries calculated from non-parametric model (for forecasting IWUR known PRUR is used, for calculating IDUR

known IWUR is used)

Unit PRUR (MSTB) IWUR (MSTB) IDUR (MSTB)

Observed Calculated Observed Calculated Observed Calculated

1 21,398 23,515 44,352 78,109 65,401 60,913

2 7733 10,774 9957 17,225 10,513 11,160

3 6354 6929 8435 17,112 10,343 8872

4 9690 7305 15,528 17,391 17,413 16,302

5 1548 1935 3842 4950 5812 4584

6 18,112 15,720 33,947 37,691 57,198 52,429

7 34,316 38,283 74,618 133,393 88,786 98,852

8 64,477 57,935 127,512 239,654 151,695 181,826

9 6224 4215 10,886 15,072 16,175 12,861

10 389 219 1257 1363 1593 1650

11 4209 3160 9203 9710 11,094 9795

12 196,265 181,363 448,771 577,523 537,711 554,002

13 21,039 24,336 32,961 42,634 35,201 37,698

14 9044 7831 25,786 28,663 27,732 28,430

15 2096 2223 4966 5112 6876 6247

16 36,325 33,662 97,279 134,534 119,502 139,923

17 21,718 20,973 64,338 51,856 67,765 83,231

18 207,826 214,968 383,090 686,479 943,060 699,007

19 45,491 49,840 101,171 114,810 111,600 123,205

20 59,029 60,746 102,844 120,265 178,312 220,204

21 10,073 20,517 28,537 39,888 40,421 47,290


Fig. 1. E�ect of productive area on IDUR predicted by neural network model for san andres units.

Table 9

Clearfork recoveries calculated from non-parametric model (for forecasting IWUR known PRUR is used, for calculating IDUR

known IWUR is used)

Unit PRUR (MSTB) IWUR (MSTB) IDUR (MSTB)

Observed Calculated Observed Calculated Observed Calculated

1 346 300 585 426 786 617

2 513 680 620 621 851 667

3 13,663 14,503 25,510 19,268 37,070 37,043

4 12,307 15,073 27,333 20,653 29,986 33,712

5 119,055 152,657 217,634 195,753 325,222 365,656

6 64,108 76,725 119,519 179,009 121,038 171,679

7 28,935 23,947 46,280 36,124 65,439 59,119

8 2212 1151 2861 3766 3513 4016

9 5245 6058 12,176 12,250 20,933 21,400

10 18,377 21,566 23,717 23,525 38,687 36,325

11 3728 4408 7079 4430 9472 8984

12 18,860 14,810 24,068 23,742 36,601 25,810

13 33,200 28,030 77,712 43,758 99,519 106,229

14 6553 7594 16,321 8852 23,696 18,852

15 27,840 22,881 35,682 34,415 67,537 111,505

16 38,901 43,968 54,865 43,288 62,680 80,443

17 4949 7824 14,506 8865 14,592 19,104

18 8191 9902 20,365 16,574 26,402 29,322

19 17,563 18,007 21,868 21,478 23,655 25,265

20 12,798 18,635 14,968 16,105 21,972 22,219

21 41,098 37,767 59,401 60,923 73,062 64,144

22 13,697 16,191 15,621 15,894 17,661 19,343

23 9986 7012 13,400 10,872 17,212 16,962


6. In®ll drilling oil recovery forecast models

The results of the non-parametric regression forecastare shown in Tables 8 and 9. As can be seen, the var-

iance of forecast results is large and comparable to

that of regression and statistical forecast models (Wu

et al., 1997). The large variance of the forecast resultsindicates that from the given set of data the in¯uenceof the variable could not be revealed in a satisfactory

manner. The real behavior is masked by the stochastic

Fig. 2. E�ect of number of in®ll wells over that of initial water¯ooding on IDUR predicted by neural network model for San

Andres units.

Fig. 3. Calculated PRUR from neural network model vs actual PRUR.


errors, that become detrimental because of limited datapoint.

The dominant independent variables identi®ed for

each model are used to develop the neural network oilrecovery models. The 44-sample data set was divided

into three subsets: one for training (72%), validation

Fig. 4. Calculated IWUR from neural network model vs actual IWUR.

Fig. 5. Calculated IDUR from neural network model vs actual IDUR.


(14%) and testing (14%). A series of sensitivity analy-sis with di�erent neural network topologies is per-

formed to develop the best neural network models.For the sensitivity analysis of variable dependency, wemade a series of runs for each basin (San Andres and

Clearfork) by varying the value of each independentvariable while keeping other independent variables atindividual mean. As an example, Fig. 1 shows the

dependency of calculated IDUR on the productivearea. The monotonely increasing relationship indicatesphysically meaningful dependency of the variables.

Fig. 2 shows the dependency of the calculated IDURon the number of in®ll wells. Similar sensitivity analy-sis was made for other independent variables.The ®nal topology of the neural network for predic-

tion of PRUR has 7 neurons in the input layer andtwo hidden layers with 12 and 10 neurons, respectively.For IWUR, the topology has 9 neurons in the input

layer and two hidden layers with 10 and 6 neurons, re-spectively. The topology of the neural network forIDUR has 9 neurons in the input layer and two hidden

layers with 10 and 8 neurons, respectively. Fig. 3shows the calculated PRUR (including training set,validation set and testing set) vs the actual PRUR.

Fig. 4 shows the calculated IWUR vs the actualIWUR. Fig. 5 shows the calculated IDUR vs theactual IDUR. The correlation coe�cients are 0.998,0.992, and 0.995, respectively.

7. Conclusions

1. Non-parametric regression and neural networkmodels are developed for forecasting the in®ll dril-ling ultimate oil recovery from reservoirs in SanAndres and Clearfork carbonate formations in West

Texas.2. Development of the non-parametric regression oil

recovery forecast models helps understand the rela-

tive importance of dominant reservoir characteristicsand operations variables, reproduce recoveries forunits included in the database, and forecast recov-

eries for units in similar geological basins.3. Non-parametric regression and multivariate princi-

pal component analysis help identify the dominantand the optimum number of independent variables.

The advantage of the non-parametric regression isthat it is easy to use and can quickly reveal thedominant independent variables and relative charac-

teristics of the relationships. The disadvantage isthat it retains a large variance of forecast results fora particular data set.

4. The insight of interdependency of the variablesgained in non-parametric regression and multivari-ate principal component analysis is employed to

develop an e�ective neural network. The neural net-work in®ll drilling recovery model is capable of

forecasting the oil recovery with less error variance.

Acknowledgements

This material is based in part upon work supportedby the Texas Advanced Technology Program underGrant No. ATP-036327-066 1998. The InstitutoColombiano del Petroleo support for part of the works

is also appreciated.

Appendix A. Abbreviations used in text

ACE=alternating conditional expectationsAPI=API gravity of crude oil

AREA=productive area (acres)BASIN=basin index: 1 for San Adres, 2 for Clear-

forkE=performance index

f= transformed functionFVF=oil formation volume factor (RB/STB)GROSS=gross pay (feet)

IDUR=in®ll drilling ultimate oil recovery (MSTB)IWUR=initial water¯ood ultimate oil recovery

(MSTB)

k=node index of output layern= index of independent variablesNET=net pay (feet)

NOIW=number of wells after in®ll drillingNOPW=number of wells for primary recoveryNOWW=number of wells for initial water¯oodingPRUR=primary ultimate oil recovery (MSTB)

o=output value from hidden nodesOOIP=original oil in place (MSTB)p= index of training set

PERM=permeability (md)POR=porosity (fraction)PRESS=initial reservoir pressure (psia)

PRUR=primary ultimate recovery (MSTB)RB=reservoir barrelSTB=stock tank barrel, oil production unitSW=initial water saturation (%)

VIS=oil viscosity (cp)W=weightsx= independent variable

y=dependent variablez=value of individual transformd=error signal

Z=learning rateDIWUR=IWUR±PRUR (MSTB)DIDUR=IDUR±IWUR (MSTB)


References

Azimi-Sajadi, M., Liou, R.J., 1992. Fast learning process of

multilayer neural networks using recursive least squares

method. IEEE Transactions On Signal Processing 40 (2),

446±449.

Breiman, L., Friedman, J.H., 1985. Estimating optimal trans-

formations for multiple regression and correlation. Journal

of the American Statistical Association, September, p. 580.

Johnson, R.A., Wichern, D.W., 1998. Applied Multivariate

Statistical Analysis, 4th ed. Printice-Hall Inc, Upper

Saddle River, New Jersey, 816 pp.

Lu, G.F., Brimhall, R.M., Wu, C.H., 1993. Geographical dis-

tribution and forecast models of in®ll drilling oil recovery

for Permian Basin carbonate reservoir. SPE 16503 (poster).

1993 Annual Technical Conference and Exhibition,

Houston.

Nikravesh, M., Kovscek, A.R., Patzek, T.W., Soroush, M.,

1996. Identi®cation and control of industrial scale pro-

cesses via neural networks. In: 1996 Chemical Process

Control V, Tahoe City, CA, pp. 284±287.

Shao, H., Brimhall, R.M., Ahr, W.M., Wu, C.H., 1994a.

Integrated recovery e�ciency forecast models for San

Andres reservoirs of Central Basin Platform and Northern

Shelf, West Texas. SPE 27697 (poster). 1994 Permian

Basin Oil & Gas Conference, Midland, Texas, 7 pp.

Shao, H., Brimhall, R.M., Ahr, W.M., Wu, C.H., 1994b. An

application of idealized carbonate depositional sequences

for reservoir characterization. SPE 28458 (poster). 1994

Annual Technical Conference and Exhibition, New

Orleans, 13 pp.

Wu, C.H., Laughlin, B.A., Jardon, M., 1989. In®ll drilling

enhances water¯ood recovery. Journal of Petroleum

Technology October, 1088±1095.

Wu, C.H., Lu, G.F., Gillespie, W., Yen, J., 1997. Statistical

and fuzzy in®ll drilling models for carbonate reservoirs.

SPE 37728. 1997 SPE Middle East Oil Show &

Conference, Bahrain, 20 pp.

Xue, G., Datta-Gupta, A., 1996. A new approach to seismic

data integration during reservoir characterization using op-

timal non-parametric transformations. SPE 36500. 1996

SPE Annual Technical Conference, Denver, 14 pp.


Documents

Non-parametric regression and neural-network infill drilling recovery models for carbonate reservoirs