

WATER RESOURCES BULLETIN VOL. 14, NO. 5 AMERICAN WATER RESOURCES ASSOCIATION OCTOBER 1978

GENERATION OF DAILY PRECIPITATION OVER AN AREA¹

Clarence Richardson²

ABSTRACT: A model was developed of the periodic-stochastic structure of daily precipitation over an area. The model is based on a multivariate normal distribution. The square roots of daily precipitation at a point were found to approximate a sample from a univariate normal distribution that had been truncated at zero. The zero daily precipitation amounts were considered negative amounts of unknown quantity. The multivariate normal distribution was used to describe the variation of daily precipitation over an area. The periodic fluctuations of the model parameters were described with Fourier series. The model was tested using data from two areas of different precipitation characteristics. Data generated with the model contained many of the statistical characteristics observed in the historical data. (KEY TERMS: daily precipitation; areal precipitation; generation; stochastic structure; periodic.)

INTRODUCTION

Precipitation is a natural occurrence that results from the interaction of complex atmospheric processes. Because of the complexity of the processes, precipitation cannot be described in purely deterministic terms. Future behavior of the processes must be predicted on a probability basis. The precipitation process also contains periodic components due to the seasonal variation within the year and persistence in both time and space. A model of precipitation over an area must describe each of these characteristics.

The objective of this study was to develop a model of the periodic-stochastic structure of daily precipitation over an area. The model should be capable of generating long samples of daily precipitation at selected locations within an area or watershed. Use of the model in this sense should provide a better understanding of the temporal and spatial characteristics of daily precipitation within a region. Samples generated with the model could be used as inputs to deterministic hydrologic models, for regional drought studies, for design of water resource projects, and other applications.

¹Paper No. 77163 of the Water Resources Bulletin. Discussions are open until June 1, 1979. Contribution from the Agricultural Research Service, USDA, in cooperation with Texas Agricultural Experiment Station, Texas A&M University.

²Agricultural Engineer, Agricultural Research Service, USDA, P.O. Box 748, Temple, Texas 76501.


MODEL DEVELOPMENT

Daily precipitation series for all stations in an area contain many zero values. Most precipitation models developed for daily or shorter time intervals have been restricted to a single station and utilize a Markov chain model for describing the probabilities of occurrence or nonoccurrence of precipitation (Smith and Schreiber, 1973; Pattison, 1965). These models cannot easily be generalized to describe the probabilities of rainfall at multiple points. The model developed in this study utilizes a multivariate normal distribution to describe both the occurrence and the amounts of daily precipitation at multiple points within an area. The assumption of a multivariate normal distribution means that the marginal distributions (rainfall at a point) must be normally distributed. The transformation of point rainfall to a normally distributed random variable does not ensure that the precipitation at several points is multivariate normal, because normal marginal distributions are a necessary but not a sufficient condition for a multivariate normal distribution. In this study, however, it is assumed that if precipitation at each station in an area is transformed so that each sample approximates a sample from a univariate normal distribution, precipitation over the area can be described by a multivariate normal distribution.

The distribution of daily precipitation at a point is actually a mixed distribution, containing both discrete and continuous variable values. For a given day, there is a finite probability of zero rainfall, while the distribution of rainfall amounts greater than zero must be described by a continuous probability density function. In this study, the square roots of nonzero daily precipitation data at a point were found to approximate a sample from a univariate normal distribution that had been truncated at zero. The zero values are considered negative amounts of unknown quantity. The concept is illustrated in Figure 1. The integral of the normal distribution from $-\infty$ to 0 gives the probability of zero daily precipitation, and the remainder of the distribution describes the distribution of rainfall amounts for days with rainfall greater than zero. Nonzero daily precipitation amounts less than 0.01 inch are recorded as a trace. The trace amounts are treated as zero in this study.
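The truncated-normal idea above can be illustrated with a short Python sketch. It evaluates the integral of a normal density from minus infinity to zero, which the text interprets as the probability of a dry day. The parameter values and the function names (`normal_cdf`, `dry_day_probability`) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def dry_day_probability(mu, sigma):
    """P(y <= 0) for y ~ N(mu, sigma^2): the mass assigned to zero rainfall."""
    return normal_cdf((0.0 - mu) / sigma)

# Hypothetical mean and standard deviation of the square-root series
# for one day of the year (illustrative values, not from the paper).
mu, sigma = -0.5, 1.0
p_dry = dry_day_probability(mu, sigma)
print(f"probability of zero precipitation: {p_dry:.3f}")
```

A negative mean, as in this example, corresponds to a day on which dry conditions are more likely than wet conditions.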

Daily precipitation amounts are usually not independent of preceding values. Meteorological conditions on a given day tend to carry over, or persist, into later times. Similarly, precipitation amounts for stations in a region are correlated, with the degree of correlation depending on interstation distances and other factors (Yevjevich and Karplus, 1973).

Mathematical Model of Daily Precipitation

Let $x_{p,\tau}(i)$ be the daily precipitation at station $i$, with $p$ the year and $\tau$ the day within the year. Let $y_{p,\tau}(i)$ be the daily precipitation after application of a normalizing transformation. A fractional power transformation has been used by several researchers (Stidd, 1953; Franz, 1970) to normalize precipitation data. The square root transformation was shown by Richardson (1977) to adequately transform daily precipitation from stations in Central Texas to a normal distribution. Therefore, the square root transformation, given by

$$y_{p,\tau}(i) = \left[x_{p,\tau}(i)\right]^{1/2} \tag{1}$$

was used in this study.

Figure 1. The Truncated Normal Distribution of the Square Root of Daily Precipitation at a Point. [Plot of the observed frequency histogram and the fitted frequency function; horizontal axis: square root of daily precipitation, -2.0 to 2.0; annotation: P(y = TRACE).]

Let $\mu_\tau(i)$ and $\sigma_\tau(i)$ be the mean and standard deviation, respectively, of $y_{p,\tau}(i)$ for day $\tau$. The mean and standard deviation of $y_{p,\tau}(i)$ must be estimated for each day of the year because of the periodicity of $\mu_\tau(i)$ and $\sigma_\tau(i)$ within the year. The method of moments could not be used to estimate $\mu_\tau(i)$ and $\sigma_\tau(i)$ because the data were truncated at zero. A method given by Cohen (1950) for obtaining maximum likelihood estimates of the mean and variance of normal populations from truncated samples was adapted for estimating $\mu_\tau(i)$ and $\sigma_\tau(i)$ (Richardson, 1977).
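The estimation problem described above can be sketched in Python. The block below is not Cohen's (1950) procedure or the adaptation used by the author. It swaps in a standard expectation-maximization (EM) iteration for a normal sample that is censored at zero, which converges to the maximum likelihood estimates under the same assumption that dry days are unobserved negative values. The function name `fit_censored_normal` and the synthetic test sample are hypothetical.

```python
import math
import random

def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fit_censored_normal(positive_values, n_zero, n_iter=500):
    """EM estimate of (mu, sigma) when n_zero values are known only to be <= 0."""
    n = len(positive_values) + n_zero
    s1 = sum(positive_values)
    s2 = sum(v * v for v in positive_values)
    # Starting point: moments of the sample with dry days counted as zero.
    mu = s1 / n
    sigma = math.sqrt(max(s2 / n - mu * mu, 1e-6))
    if n_zero == 0:
        return mu, sigma
    for _ in range(n_iter):
        alpha = (0.0 - mu) / sigma
        lam = _pdf(alpha) / max(_cdf(alpha), 1e-300)
        # E-step: first and second moments of y given y <= 0.
        e1 = mu - sigma * lam
        var_c = sigma * sigma * (1.0 - alpha * lam - lam * lam)
        e2 = var_c + e1 * e1
        # M-step: update the moments with the expected censored values filled in.
        mu = (s1 + n_zero * e1) / n
        sigma = math.sqrt(max((s2 + n_zero * e2) / n - mu * mu, 1e-12))
    return mu, sigma

# Synthetic check with hypothetical parameters (mu = -0.5, sigma = 1.0).
random.seed(1)
sample = [random.gauss(-0.5, 1.0) for _ in range(50000)]
wet = [v for v in sample if v > 0.0]
mu_hat, sigma_hat = fit_censored_normal(wet, len(sample) - len(wet))
print(mu_hat, sigma_hat)
```

Only the count of dry days and the wet-day values enter the estimate, which mirrors the information that is actually available in a daily record.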

The sample estimates of $\mu_\tau(i)$ and $\sigma_\tau(i)$ are subject to large sampling errors because of the small sample sizes obtained when only nonzero values of $y_{p,\tau}(i)$ are considered. The sampling errors may be minimized by smoothing the 365 estimates of $\mu_\tau(i)$ and $\sigma_\tau(i)$ using Fourier series or other smoothing techniques. Smoothing also reduces the number of parameters required to describe the seasonal variations in $\mu_\tau(i)$ and $\sigma_\tau(i)$. The random component, $\epsilon_{p,\tau}(i)$, with the periodic mean and standard deviation removed, is then given by

$$\epsilon_{p,\tau}(i) = \frac{y_{p,\tau}(i) - \mu_\tau(i)}{\sigma_\tau(i)} \tag{2}$$

This results in truncated $\epsilon_{p,\tau}(i)$ series that have a mean of zero and a standard deviation of unity for all seasons of the year (Richardson, 1977). The $\epsilon_{p,\tau}(i)$ series for a given station is serially correlated because of the tendency of precipitation to persist in time. The $\epsilon_{p,\tau}(i)$ series for closely spaced stations are also cross correlated because of the tendency of precipitation to persist in space.
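The smoothing and standardization steps can be sketched as follows. The block fits a mean plus a fixed number of harmonics to one value per day by direct projection, which is the least-squares fit when the days are equally spaced over a full period, and then removes the periodic mean and standard deviation from a transformed value. The function names and the synthetic daily estimates are hypothetical; the default of six harmonics follows the number reported later for Network I.

```python
import math

def fourier_smooth(values, n_harmonics=6):
    """Smooth one value per day with a mean plus n_harmonics Fourier terms."""
    n = len(values)
    mean = sum(values) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k / n
        a = 2.0 / n * sum(v * math.cos(w * t) for t, v in enumerate(values))
        b = 2.0 / n * sum(v * math.sin(w * t) for t, v in enumerate(values))
        coeffs.append((a, b))
    smooth = []
    for t in range(n):
        s = mean
        for k, (a, b) in enumerate(coeffs, start=1):
            w = 2.0 * math.pi * k * t / n
            s += a * math.cos(w) + b * math.sin(w)
        smooth.append(s)
    return smooth

def standardize(y, mu, sigma):
    """Remove the periodic mean and standard deviation from a transformed value."""
    return (y - mu) / sigma

# Hypothetical daily estimates: an annual cycle plus a high-frequency wiggle
# standing in for sampling error.
n_days = 365
cycle = [1.0 + 0.5 * math.cos(2.0 * math.pi * t / n_days) for t in range(n_days)]
noisy = [c + 0.3 * math.sin(2.0 * math.pi * 20 * t / n_days) for t, c in enumerate(cycle)]
smoothed = fourier_smooth(noisy)
```

In this synthetic case the wiggle sits at harmonic 20, above the six retained harmonics, so the smoothed curve recovers the annual cycle.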

If a first-order autoregressive model is assumed to describe the time dependence of $\epsilon_{p,\tau}(i)$, the $\epsilon_{p,\tau}(i)$ sequence is given by

$$\epsilon_{p,\tau}(i) = \rho_1(i)\,\epsilon_{p,\tau-1}(i) + \xi_{p,\tau}(i) \tag{3}$$

where $\rho_1(i)$ is the lag-one autocorrelation coefficient, and $\xi_{p,\tau}(i)$ is a time-independent random component. The $\xi_{p,\tau}(i)$ series for closely spaced stations are independent in sequence but dependent in space. The linear cross-correlation coefficient between $\xi_{p,\tau}(i)$ and $\xi_{p,\tau}(j)$ may be used to express the degree of linear association between the series for station $i$ and station $j$. The linear space dependence may be expressed by

[Equation (4) is not recoverable from the source text.]

where $\zeta_{p,\tau}(i)$ is a random component that is independent in both time and space.

Multivariate Generation Model

The multivariate generation procedure is the inverse of the process described above. The $\epsilon_{p,\tau}(i)$ series that are dependent in both time and space are generated using the procedure given by Matalas (1967). The basic equation is

$$\epsilon_{\tau+1} = A\,\epsilon_{\tau} + B\,\zeta_{\tau+1} \tag{5}$$

where $\zeta_{\tau+1}$ is a vector of $m$ random components for day $\tau+1$; $\epsilon_{\tau+1}$ and $\epsilon_{\tau}$ are vectors whose values are the generated series for $m$ stations with the means removed; and $A$ and $B$ are $m \times m$ matrices whose elements are defined so that the new sequences preserve standard deviations, lag-one autocorrelation coefficients, and lag-zero cross-correlation coefficients determined from precipitation data. The $A$ and $B$ matrices are given by

$$A = M_1 M_0^{-1} \tag{6}$$

and

$$B B^T = M_0 - M_1 M_0^{-1} M_1^T \tag{7}$$

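where $-1$ and $T$ denote the inverse and transpose of the matrix, respectively. The $M_0$ and $M_1$ matrices were defined by Matalas (1967) as the lag-zero and the lag-one covariance matrices, respectively. With the approach used in this study, the $y_{p,\tau}(i)$ series are standardized, Equation (2), resulting in $\epsilon_{p,\tau}(i)$ series that are stationary with zero means and unity variances. Assuming that each $\epsilon_{p,\tau}(i)$ sequence approximates a lag-one autoregressive model, the $M_0$ and $M_1$ matrices may be simplified to

$$M_0 = \begin{bmatrix} 1 & \rho_0(1,2) & \cdots & \rho_0(1,m) \\ \rho_0(2,1) & 1 & \cdots & \rho_0(2,m) \\ \vdots & \vdots & \ddots & \vdots \\ \rho_0(m,1) & \rho_0(m,2) & \cdots & 1 \end{bmatrix} \tag{8}$$

$$M_1 = \begin{bmatrix} \rho_1(1) & \rho_1(1)\rho_0(1,2) & \cdots & \rho_1(1)\rho_0(1,m) \\ \rho_1(2)\rho_0(2,1) & \rho_1(2) & \cdots & \rho_1(2)\rho_0(2,m) \\ \vdots & \vdots & \ddots & \vdots \\ \rho_1(m)\rho_0(m,1) & \rho_1(m)\rho_0(m,2) & \cdots & \rho_1(m) \end{bmatrix} \tag{9}$$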

$M_0$ is simply the lag-zero cross-correlation matrix and is symmetric, with each element of the principal diagonal equal to unity. $M_1$ contains the lag-one serial correlations on the diagonal, and the off-diagonal elements are the products of the lag-one serial correlations and the lag-zero cross correlations.
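The construction of the coefficient matrices can be sketched in Python using the coefficients reported in Table 1. Two points are assumptions rather than statements from the paper: the off-diagonal element in row i of the lag-one matrix is taken as the lag-one serial correlation of station i times the lag-zero cross correlation, and B is obtained as a Cholesky factor, which is one of several valid solutions of Equation (7). The helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (intended for small matrices)."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def cholesky(s):
    """Lower-triangular L with L L^T = s, for symmetric positive-definite s."""
    n = len(s)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low

def matalas_matrices(rho1, m0):
    """Return (A, B) with A = M1 M0^-1 and B B^T = M0 - M1 M0^-1 M1^T."""
    m = len(rho1)
    # Assumed convention: M1[i][j] = rho1[i] * M0[i][j].
    m1 = [[rho1[i] * m0[i][j] for j in range(m)] for i in range(m)]
    a = matmul(m1, inverse(m0))
    prod = matmul(a, transpose(m1))
    bbt = [[m0[i][j] - prod[i][j] for j in range(m)] for i in range(m)]
    return a, cholesky(bbt)

# Table 1 values, in the order Palestine, Dialville, Crockett.
rho1 = [0.41, 0.31, 0.53]
M0 = [[1.0, 0.88, 0.85], [0.88, 1.0, 0.76], [0.85, 0.76, 1.0]]
A, B = matalas_matrices(rho1, M0)
```

Under the assumed convention the lag-one matrix equals a diagonal matrix of serial correlations times the lag-zero matrix, so A reduces to that diagonal matrix. A different index convention would give a full A matrix.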

New $\epsilon_{p,\tau}(i)$ values are generated using Equation (5). Values of $y_{p,\tau}(i)$ are produced by multiplying $\epsilon_{p,\tau}(i)$ by $\sigma_\tau(i)$ and adding $\mu_\tau(i)$. The $x_{p,\tau}(i)$ values are obtained by setting the negative $y_{p,\tau}(i)$ values to zero and applying the inverse of the normalizing transformation to the positive $y_{p,\tau}(i)$ values.
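The generation and back-transformation steps can be sketched as follows. The two-station coefficient matrices, the constant mean and standard deviation, and the function names are hypothetical placeholders; in the model described above the mean and standard deviation vary with the day of the year, and the matrices come from the fitted correlations.

```python
import random

def generate_residuals(a, b, n_days, rng):
    """Iterate eps[t+1] = A eps[t] + B zeta[t+1] with standard normal zeta."""
    m = len(a)
    eps = [0.0] * m  # arbitrary starting state; a warm-up period may be preferable
    series = []
    for _ in range(n_days):
        zeta = [rng.gauss(0.0, 1.0) for _ in range(m)]
        eps = [
            sum(a[i][j] * eps[j] for j in range(m))
            + sum(b[i][j] * zeta[j] for j in range(m))
            for i in range(m)
        ]
        series.append(eps)
    return series

def to_precipitation(eps, mu, sigma):
    """Rescale a residual, set negative values to zero, and invert the square root."""
    y = eps * sigma + mu
    return y * y if y > 0.0 else 0.0

# Hypothetical two-station coefficients (not values from the paper).
A = [[0.4, 0.0], [0.0, 0.3]]
B = [[0.9, 0.0], [0.7, 0.5]]
mu, sigma = -0.5, 1.0  # held constant here for brevity

rng = random.Random(42)
residuals = generate_residuals(A, B, 365, rng)
rain = [[to_precipitation(e, mu, sigma) for e in day] for day in residuals]
```

Squaring is the inverse of the square root transformation of Equation (1); a different normalizing transformation would need its own inverse.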


TESTS OF THE MODEL

Two precipitation networks were used to test the model. Network I included three U.S. Weather Bureau precipitation stations in eastern Texas (Palestine, Dialville, and Crockett). The interstation distances ranged from 25 to 41 miles. Mean annual precipitation in the network is about 42 inches. The distribution of precipitation within the year is bimodal, with the primary peak occurring in the spring and a secondary peak occurring in the fall. Daily precipitation data for 40 years (1933-1972) were assembled and used in evaluating the model parameters.

Network II was composed of three precipitation stations near Hastings, Nebraska, operated by the Agricultural Research Service and designated G42, D45, and B32. The stations were more closely spaced than those in the first network, with interstation distances ranging from 1.7 to 4.2 miles. Mean annual precipitation is about 24 inches, most of which occurs in the summer, with a monthly high of about 4.3 inches in June. A 28-year (1940-1967) daily precipitation record was used to test the model.

Network I

The parameters that are needed for the generation model are: (1) $\mu_\tau(i)$ and $\sigma_\tau(i)$, (2) $\rho_1(i)$, and (3) $\rho_0(i,j)$. The square root transformation was applied to the daily precipitation data for the three stations in network I (northeast Texas). Maximum likelihood estimates of $\mu_\tau(i)$ and $\sigma_\tau(i)$ were obtained. Fourier series with six harmonics were used to smooth $\mu_\tau(i)$ and $\sigma_\tau(i)$ and to describe the seasonal variation of the parameters. The Fourier series representations of $\mu_\tau$ and $\sigma_\tau$ for the Palestine station are shown in Figure 2.

The $\epsilon_{p,\tau}(i)$ series were calculated using Equation (2). The lag-one autocorrelation coefficients, $\rho_1(i)$, were determined from the $\epsilon_{p,\tau}(i)$ series, considering only the cases when $\epsilon_{p,\tau}(i)$ and $\epsilon_{p,\tau+1}(i)$ were both nonzero. The estimates of $\rho_1(i)$ that are obtained by considering only nonzero pairs from the truncated samples are less than the estimates that are obtained from untruncated samples. The estimates of $\rho_1(i)$ that were obtained using only nonzero pairs were corrected using an expression derived by Regier and Hamdan (1971) relating the correlation of a bivariate standard normal distribution that had been truncated at a given point to the correlation of the untruncated distribution. The estimates of $\rho_1(i)$ for each station are given in Table 1.

The lag-zero cross-correlation coefficients, $\rho_0(i,j)$, were determined from each pair of $\epsilon_{p,\tau}(i)$ series, considering only the cases when $\epsilon_{p,\tau}(i)$ and $\epsilon_{p,\tau}(j)$ were both nonzero. The estimates obtained from the truncated samples were corrected as shown above. The estimates of $\rho_0(i,j)$ for each pair of stations are also given in Table 1.
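The uncorrected correlation estimates described above can be sketched in Python. The block computes a lag-one serial correlation and a lag-zero cross correlation using only the pairs in which both values come from wet days. Dry days are marked with `None`, a representation chosen here for illustration. The Regier and Hamdan (1971) correction for truncation is not implemented, so these values correspond to the estimates before correction.

```python
import math

def pearson(xs, ys):
    """Sample linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def lag_one_wet_pairs(series):
    """Lag-one correlation from consecutive days that are both wet (not None)."""
    pairs = [(a, b) for a, b in zip(series, series[1:])
             if a is not None and b is not None]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])

def cross_wet_pairs(series_i, series_j):
    """Lag-zero cross correlation from days that are wet at both stations."""
    pairs = [(a, b) for a, b in zip(series_i, series_j)
             if a is not None and b is not None]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])

# Hypothetical standardized residuals; None marks a dry day.
station_i = [0.2, 0.9, None, 1.4, 0.7, None, 0.3, 1.1]
station_j = [0.1, 1.0, 0.4, 1.2, None, None, 0.5, 0.9]
r1 = lag_one_wet_pairs(station_i)
r0 = cross_wet_pairs(station_i, station_j)
```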

The model was used to generate a 50-year sample of daily precipitation for each of the three stations in network I, using the model parameters determined from the observed data. Area-weighted daily precipitation, defined as the average of the daily precipitation for the three stations, was calculated from both the observed and generated samples. If the model is a good description of daily precipitation over an area, the generated sample of weighted daily precipitation should closely resemble the observed sample in terms of important statistical characteristics. Several statistics were selected for comparing the generated and observed samples of weighted daily precipitation. The statistics are as follows:

(1) The longest wet run each year, defined as the length of consecutive wet days preceded and followed by dry days.
(2) The date of the longest wet run, as the day of the year when the longest wet run begins.
(3) The longest dry run each year.
(4) The date of the longest dry run each year.
(5) The maximum precipitation event, defined as the largest amount of precipitation from consecutive wet days, each year.
(6) The maximum daily precipitation each year.
(7) Total precipitation for the year.
(8) Total precipitation for each 28-day period of the year.

Figure 2. Fourier Series Description of $\mu_\tau$ and $\sigma_\tau$ Estimated from Data from Palestine, Texas. [Horizontal axis: days, 0 to 360.]
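The first seven statistics listed above can be computed from one year of daily values with a short Python sketch. The function names are hypothetical, and the treatment of runs that touch the start or end of the year is a simplification: each year is processed on its own, so a run is cut at the year boundary.

```python
def area_average(stations):
    """Average daily precipitation across stations (equal weights)."""
    return [sum(day) / len(day) for day in zip(*stations)]

def annual_statistics(daily):
    """Run-length and total statistics for one year of daily precipitation."""
    runs = []  # each run: (is_wet, start_day, length, total_amount)
    for day, amount in enumerate(daily, start=1):
        wet = amount > 0.0
        if runs and runs[-1][0] == wet:
            _, start, length, total = runs[-1]
            runs[-1] = (wet, start, length + 1, total + amount)
        else:
            runs.append((wet, day, 1, amount))
    wet_runs = [r for r in runs if r[0]]
    dry_runs = [r for r in runs if not r[0]]
    longest_wet = max(wet_runs, key=lambda r: r[2], default=(True, None, 0, 0.0))
    longest_dry = max(dry_runs, key=lambda r: r[2], default=(False, None, 0, 0.0))
    return {
        "longest_wet_run": longest_wet[2],
        "date_longest_wet_run": longest_wet[1],
        "longest_dry_run": longest_dry[2],
        "date_longest_dry_run": longest_dry[1],
        "max_event": max((r[3] for r in wet_runs), default=0.0),
        "max_daily": max(daily),
        "annual_total": sum(daily),
    }

# Hypothetical eight-day record for three stations (inches).
stations = [
    [0.0, 0.6, 1.2, 0.0, 0.0, 0.0, 2.1, 0.0],
    [0.0, 0.3, 0.9, 0.0, 0.0, 0.0, 1.8, 0.0],
    [0.0, 0.6, 0.9, 0.0, 0.0, 0.0, 2.1, 0.0],
]
weighted = area_average(stations)
stats = annual_statistics(weighted)
```

A day counts as wet in the averaged series when any station records precipitation, which is one reason wet runs of the weighted series can be longer than those at a single station.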

TABLE 1. Autocorrelation and Cross-Correlation Coefficients for Stations in Network I (Northeast Texas).

Station                 Autocorrelation Coefficient, $\rho_1(i)$
Palestine               0.41
Dialville               0.31
Crockett                0.53

Station Pair            Cross-Correlation Coefficient, $\rho_0(i,j)$
Palestine-Dialville     0.88
Palestine-Crockett      0.85
Dialville-Crockett      0.76

Each statistic was determined for each year from the weighted observed and generated data. The hypothesis that the statistics from the observed and generated data were from the same population was tested using the distribution-free Smirnov two-sample test. The results for the first seven statistics are shown in Table 2. The hypothesis that the samples from the observed and generated data were from the same population was accepted for all statistics, except the maximum precipitation events and the maximum daily precipitation. Both the maximum precipitation events and the maximum daily precipitation from the generated data tended to be greater than those from the observed data. The distributions of run lengths (both wet and dry runs) and the dates of occurrence of runs from the generated data were not significantly different from those from the observed data. This indicated that the tendency of wet or dry periods to persist in time and the time of occurrence of wet or dry periods within the year were reproduced well in the generated data. The distribution of annual precipitation amounts from the generated data also closely approximated that from the observed data.
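The two-sample comparison can be sketched in Python. The block computes the Smirnov statistic, the largest absolute difference between the two empirical distribution functions, and compares it with the common large-sample approximation to the 5% critical value. The approximation and the sample values are assumptions made for illustration; the paper does not state how its critical values were obtained.

```python
import math

def smirnov_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both samples past every value equal to x before comparing.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def critical_value_5pct(n, m):
    """Large-sample approximation to the two-sided 5% critical value."""
    return 1.36 * math.sqrt((n + m) / (n * m))

# Hypothetical annual totals (inches) standing in for observed and generated years.
observed = [38.1, 44.0, 41.2, 52.3, 35.7, 47.9, 40.4, 43.6]
generated = [39.5, 42.8, 36.9, 50.1, 45.2, 41.7, 33.8, 48.4, 44.9, 37.6]
d_stat = smirnov_statistic(observed, generated)
reject = d_stat > critical_value_5pct(len(observed), len(generated))
```

For the sample sizes used with Network I (40 observed years and 50 generated years), `critical_value_5pct(40, 50)` gives the threshold against which each statistic's D value would be compared under this approximation.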

The means and standard deviations of the precipitation amounts for each 28-day period of the year are shown in Figure 3 for both the generated and the observed data. The seasonal pattern of the means and the standard deviations of the generated data corresponded closely with that of the observed data. The largest 28-day mean occurred during period 5 for both the observed and generated data. The smallest mean 28-day precipitation occurred during period 8. The distributions of the generated 28-day precipitation amounts were not significantly different from the distributions of the observed 28-day amounts for any period, except period 3.
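The 28-day totals can be computed with a short Python sketch. Thirteen periods of 28 days cover 364 days; how the remaining day of the year was handled is not stated in the text, so this sketch simply leaves it out, which is an assumption. The function name is hypothetical.

```python
def period_totals(daily, period_length=28, n_periods=13):
    """Sum daily precipitation over consecutive fixed-length periods."""
    return [
        sum(daily[k * period_length:(k + 1) * period_length])
        for k in range(n_periods)
    ]

# Hypothetical year with 0.1 inch of precipitation on every day.
year = [0.1] * 365
totals = period_totals(year)
```

Applying the function to each year of the observed and generated series gives, for every period, one sample of totals per series, from which the means and standard deviations can be taken.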

Network II

The parameters of the daily precipitation generation model were determined from the data for the three stations in network II (Hastings, Nebraska). The Fourier series descriptions of $\mu_\tau$ and $\sigma_\tau$ for station G42 are shown in Figure 4. The lag-one autocorrelation coefficients and the lag-zero cross-correlation coefficients are given in Table 3. The seasonal variation of $\mu_\tau$ and $\sigma_\tau$, shown in Figure 4, for network II is distinctly different from that shown in Figure 2 for network I. The cross-correlation coefficients for the stations in network II are greater than those for the stations in network I because the stations in network II are closer together than those in network I.

TABLE 2. Statistics of Weighted Observed and Generated Rainfall for a Three-Station Precipitation Network in Central Texas.

                                          Observed            Generated
Statistic                                 Mean    Std. Dev.   Mean    Std. Dev.
Longest wet run (days)                    8.4     3.0         7.4     2.0
Date of longest wet run                   195.3   99.6        184.9   122.3
Longest dry run (days)                    20.7    6.6         18.8    4.8
Date of longest dry run                   225.7   87.5        210.2   84.4
Maximum precipitation event (inches)      5.3     2.4         7.1     3.6*
Maximum daily precipitation (inches)      2.8     1.2         3.8     1.4*
Annual precipitation (inches)             42.2    9.6         41.0    9.9

*The distributions of the statistic from the generated and observed samples are significantly different at the 5% level, according to a Smirnov two-sample test.

A 50-year sample of daily precipitation was generated for each of the three stations in network II, using the parameters obtained from the observed data. Area-weighted daily precipitation was calculated for both the observed and generated samples. The eight statistics described above were determined for each year from the weighted observed and generated data. The first seven statistics from the observed and generated data are compared in Table 4. The results resembled those for network I. The distributions of run lengths (wet and dry) and dates of longest runs obtained from the generated data were not significantly different from those obtained from the observed data. The distribution of annual precipitation amounts from the generated data was also not significantly different from that from the observed data. The maximum precipitation events and the maximum daily precipitation from the generated data were significantly greater than those from the observed data.

The means and standard deviations of 28-day precipitation amounts are shown in Figure 5. The seasonal pattern of precipitation in network II is considerably different from that in network I. The means and standard deviations from the generated data are about the same as those from the observed data. None of the distributions of the 28-day precipitation amounts were different at the 5% level.

CONCLUSIONS

The model of daily precipitation over an area that is proposed here is based on a multivariate normal distribution. The multivariate normal approach had previously been applied to hydrologic series that did not contain zeros, like continuous streamflow (Matalas, 1967) or monthly precipitation (Yevjevich and Karplus, 1973). This study was an attempt to apply the multivariate normal approach to the intermittent process of daily precipitation that contains many zero values and is persistent in both time and space.

Figure 3. Means and Standard Deviations of the 28-Day Totals of Weighted Precipitation for Network I. [Two panels, each showing generated and observed values.]

The value of the model depends on its ability to generate new precipitation series that correctly reproduce, in a statistical sense, characteristics that are observed in the historical series. The model was tested by generating new precipitation sequences for two areas with different precipitation regimes. The model performed satisfactorily in several aspects, while other aspects need further improvement.

The generated sequences successfully reproduced many precipitation characteristics. For both test areas, the distributions of lengths of runs of wet or dry days and the times of occurrence of runs obtained from the generated data were not statistically different from those from the observed data. Similarly, the generated data successfully reproduced the distributions of annual precipitation and the seasonal nature of precipitation as reflected by the distributions of 28-day totals.

Figure 4. Fourier Series Description of $\mu_\tau$ and $\sigma_\tau$ Estimated from Data from Station G42 near Hastings, Nebraska. [Horizontal axis: days, 0 to 360.]

TABLE 3. Autocorrelation and Cross-Correlation Coefficients for Stations in Network II (Hastings, Nebraska).

Station         Autocorrelation Coefficient, $\rho_1(i)$
G42             0.39
D45             0.35
B32             0.29

Station Pair    Cross-Correlation Coefficient, $\rho_0(i,j)$
G42-D45         0.98
G42-B32         0.96
D45-B32         0.99

TABLE 4. Statistics of Weighted Observed and Generated Rainfall for a Three-Station Precipitation Network near Hastings, Nebraska.

                                          Observed            Generated
Statistic                                 Mean    Std. Dev.   Mean      Std. Dev.
Longest wet run (days)                    5.2     1.2         3.7       1.7
Date of longest wet run                   177.2   57.3        186.9     67.1
Longest dry run (days)                    36.1    13.7        34.5      10.6
Date of longest dry run                   284.3   88.2        211.[?]   135.1
Maximum precipitation event (inches)      3.4     1.7         5.5       [illegible]*
Maximum daily precipitation (inches)      2.5     1.0         3.3       1.4*
Annual precipitation (inches)             23.6    6.5         25.4      1.5

*The distributions of the statistic from the generated and observed samples are significantly different at the 5% level, according to a Smirnov two-sample test.

The model was not able to accurately reproduce the distributions of the annual maximum precipitation events and the annual maximum daily precipitation. The failure of the model to reproduce these extremes in precipitation amounts is probably due to the inability of the square root transformation to accurately transform daily precipitation to a normal distribution for all times of the year. Possibly a power transformation that varies with season of the year would help alleviate this problem.

LITERATURE CITED

Cohen, A. C., 1950. Estimating the Mean and Variance of Normal Populations from Singly Truncated and Doubly Truncated Samples. Annals of Math. Statist. 21:557-569.

Franz, D. D., 1970. Hourly Rainfall Synthesis for a Network of Stations. Stanford University Dept. of Civil Engineering, Technical Report 126, 141 pp.

Matalas, N. C., 1967. Mathematical Assessment of Synthetic Hydrology. Water Resources Research 3(4):937-945.

Pattison, A., 1965. Synthesis of Hourly Rainfall Data. Water Resources Research 1(4):489-498.

Regier, M. H. and M. A. Hamdan, 1971. Correlation in a Bivariate Normal Distribution with Truncation in Both Variables. Austral. J. Statist. 13(3):77-83.

Richardson, C. W., 1977. A Model of Stochastic Structure of Daily Precipitation over an Area. Colorado State University Hydrology Paper No. 91, 45 pp.

Smith, R. E. and H. A. Schreiber, 1973. Point Processes of Seasonal Thunderstorm Rainfall: 1. Distribution of Rainfall Events. Water Resources Research 9(4):871-884.

Stidd, C. K., 1953. Cube-Root Normal Precipitation Distributions. Transactions, American Geophysical Union, 34(1):31-35.

Yevjevich, V. and A. K. Karplus, 1973. Area-Time Structure of the Monthly Precipitation Process. Colorado State University Hydrology Paper No. 64, 45 pp.

Figure 5. Means and Standard Deviations of the 28-Day Totals of Weighted Precipitation for Network II. [Two panels, each showing generated and observed values; horizontal axis: 28-day period, 1 to 13.]