
Journal of Forecasting, Vol. 13, 369-381 (1994)

Modelling Non-normal First-order Autoregressive Time Series

C. H. SIM University of Malaya, Malaysia

ABSTRACT We shall first review some non-normal stationary first-order autoregressive models. The models are constructed with a given marginal distribution (logistic, hyperbolic secant, exponential, Laplace, or gamma) and the requirement that the bivariate joint distribution of the generated process must be sufficiently simple so that the parameter estimation and forecasting problems of the models can be addressed. A model-building approach that consists of model identification, estimation, diagnostic checking, and forecasting is then discussed for this class of models.

KEY WORDS  Model-building methodology; Non-normal AR(1) models; Monte Carlo simulation; Bootstrap technique

INTRODUCTION

A class of models for non-normal time series, with specified marginal distribution, has been discussed by Granger and Newbold (1976) and Janacek and Swift (1990). In their approach, the non-normal series {Y_t} is taken as an instantaneous transformation T(Z_t) of an underlying Gaussian ARMA process {Z_t}. The relationship between the correlation structures of {Y_t} and {Z_t} is obtained by expanding T(Z_t) in terms of Hermite polynomials. An advantage of these models is that the estimation and forecasting of non-normal time series can be handled by the standard model-building methodology of Box and Jenkins (1976). However, as pointed out by Swift and Janacek (1991), the optimal quadratic loss forecast of the series {Y_t} cannot be obtained as the distribution of the predictor is unknown, and this class of models is only suitable for non-normal data that do not exhibit strong directionality. Note that Y_t = T(Z_t) is time-reversible if and only if Z_t is also time-reversible (Weiss, 1975). Thus, as Gaussian processes are time-reversible, the series {Y_t} is time-reversible if {Z_t} is a Gaussian process.

In this paper we shall restrict our discussion to the problem of modelling a class of stationary first-order autoregressive time series (with given non-normal marginal distribution) where the distribution of its predictor can be obtained explicitly. Specifically, we shall propose a model-building methodology for the modelling of time-irreversible logistic, hyperbolic secant, exponential, and Laplace AR(1) time series {X_t} which follow the well-known stochastic

CCC 0277-6693/94/040369-13  © 1994 by John Wiley & Sons, Ltd.  Received August 1992, Revised July 1993


difference equation

X_t = αX_{t−1} + ε_t,  t = 1, 2, ...   (1)

where |α| < 1 (except for the exponential model, where 0 < α < 1), and the innovation process {ε_t} is a sequence of independent and identically distributed (IID) random variables with characteristic function (CF)

φ_ε(s) = E[exp(−isε)] = φ_X(s)/φ_X(αs)

in which φ_X(s) is the CF of the stationary non-normal process {X_t}.

We shall also discuss the modelling of the gamma AR(1) process of Sim (1990), which is generated by replacing the αX of model (1) with α * X, 0 < α < 1. The operator ' * ' is defined as

α * X = Σ_{i=1}^{N(X)} W_i   (2)

where (i) the W_i are IID exponential random variables with parameter β and (ii) for each fixed positive value of x, N(x) is a Poisson random variable with parameter αβx.

Both models (1) and (2) are constructed with the requirement that the bivariate joint distribution of the generated process {X_t} must be sufficiently simple so that the parameter estimation and forecasting problems of the models can be addressed. In our discussion we shall follow the Box-Jenkins model-building approach, namely, model identification, estimation, diagnostic checking, and forecasting.

Some properties of the above-mentioned non-normal AR(1) models are discussed in the next section. In fitting the non-normal models to data we can clearly identify a model from their marginal distribution and first-order Markovian correlation structure. These are discussed in the third section together with the conditional least-squares (CLS) and maximum likelihood (ML) estimation procedures of the models. The diagnostic checking and forecasting procedures of the models are discussed in the fourth and fifth sections, respectively. The proposed procedures are then illustrated by using the monthly average discharge of the Mekong river in Thailand.

PROPERTIES OF SOME NON-NORMAL AR(1) MODELS

The autocorrelation function (ACF) of {X_t}, constructed from the difference equations (1) and (2), is easily derived as

Corr(X_{t+j}, X_t) = α^j,  j ≥ 0

which is the same as that of the standard Gaussian AR(1) process. Another interesting property of the models discussed is that the conditional means of X_{t+j} given X_t = y are all simple linear functions of y (see fourth section) despite the fact that {X_t} is not a Gaussian process.
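The geometric decay Corr(X_{t+j}, X_t) = α^j can be checked numerically for any IID innovation sequence. The sketch below (our illustration, not part of the paper; Laplace innovations and the parameter values are assumed for concreteness) simulates model (1) and compares sample autocorrelations with α^j:

```python
import numpy as np

def sample_acf(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = x - x.mean()
    return np.dot(x[lag:], x[:-lag]) / np.dot(x, x)

rng = np.random.default_rng(0)
alpha, n = 0.6, 200_000

# Model (1): X_t = alpha * X_{t-1} + eps_t, here with Laplace innovations.
eps = rng.laplace(size=n)
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = alpha * x[t - 1] + eps[t]

for j in (1, 2, 3):
    print(j, sample_acf(x, j), alpha ** j)
```

With a long series the sample ACF at lags 1-3 should sit very close to α, α², α³, whatever the innovation distribution.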

Logistic model
The logistic AR(1) process {X_t} was constructed by Sim (1993). It was shown that in order for the process {X_t} to have the required logistic marginal probability density function (PDF)

f_X(x) = ¼ sech²(½x),  −∞ < x < ∞


the innovation process {ε_t} of model (1) has to be a sequence of IID random variables with PDF

f_ε(z) = sin(απ) / {2απ[cosh(z) + cos(απ)]},  −∞ < z < ∞

The conditional density of X_{t+j} given X_t = y is

f_{X_{t+j}|X_t}(x|y) = sin(α^j π) / {2α^j π[cosh(x − α^j y) + cos(α^j π)]},  −∞ < x, y < ∞

An alternative logistic AR(1) model is that of Arnold and Robertson (1989). However, its usefulness in practice is limited by the fact that explicit forms of its conditional density and likelihood function are intractable.
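As a quick numerical check (ours, not from the paper), the innovation density above can be sampled by the inversion formula given in the penultimate section, ε = ln[sin(απU)/sin(απ(1 − U))]; the resulting AR(1) path should then carry the standard logistic marginal, whose mean is 0 and whose variance is π²/3:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, n = 0.5, 200_000  # assumed illustration values

# logistic AR(1) innovations by inversion
u = rng.uniform(size=n)
eps = np.log(np.sin(alpha * np.pi * u) / np.sin(alpha * np.pi * (1 - u)))

x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = alpha * x[t - 1] + eps[t]
x = x[1000:]  # drop a burn-in so the marginal is approximately stationary

print(x.mean(), x.var())  # should be near 0 and pi**2/3
```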

Hyperbolic secant model
The process {X_t} of model (1) is a hyperbolic secant AR(1) process (Rao and Johnson, 1988) with marginal PDF

f_X(x) = ½ sech(½πx),  −∞ < x < ∞

and conditional density

f_{X_{t+j}|X_t}(x|y) = cos(½α^j π) cosh[½π(x − α^j y)] / {cosh[π(x − α^j y)] + cos(α^j π)},  −∞ < x, y < ∞

where the innovation process {ε_t} is a sequence of IID random variables with PDF

f_ε(z) = cos(½απ) cosh(½πz) / [cosh(πz) + cos(απ)],  −∞ < z < ∞

Exponential model
By assuming that the process {X_t} has an exponential marginal PDF with parameter λ, Gaver and Lewis (1980) showed that the innovation process {ε_t} of model (1) takes the form

ε_t = 0    with probability α
ε_t = E_t  with probability 1 − α

where 0 < α < 1 and {E_t} is a sequence of IID exponential random variables with parameter λ. The conditional density of X_{t+j} given X_t = y is

f_{X_{t+j}|X_t}(x|y) = α^j δ(x − α^j y) + λ(1 − α^j) exp[−λ(x − α^j y)] U(x − α^j y),  −∞ < x, y < ∞

where δ(x) is the Dirac delta function and U(u) is the unit step function. This exponential AR(1) model has been generalized by Lawrance and Lewis (1981) and Sim (1990) to the two-parameter NEAR(1) model and the three-parameter GEAR(1) model, respectively. Both models have a tractable joint PDF and both are likely candidates for our model-building approach.
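The Gaver-Lewis construction can be simulated directly; the sketch below (our illustration, with λ and α assumed) checks that exactly-zero innovations occur with probability α and that the stationary marginal keeps its exponential mean 1/λ:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, lam, n = 0.5, 1.0, 200_000  # assumed illustration values

x = np.empty(n)
x[0] = rng.exponential(1 / lam)  # start in the stationary exponential marginal
zero_innovations = 0
for t in range(1, n):
    if rng.uniform() < alpha:      # eps_t = 0 with probability alpha
        x[t] = alpha * x[t - 1]
        zero_innovations += 1
    else:                          # eps_t = E_t ~ exponential(lam)
        x[t] = alpha * x[t - 1] + rng.exponential(1 / lam)

print(x.mean(), zero_innovations / (n - 1))
```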

Laplace model
The construction of a Laplace AR(1) process (Dewald and Lewis, 1985) was similar to that of the exponential AR(1) process. The innovation process {ε_t} was shown to be

ε_t = 0    with probability α²
ε_t = L_t  with probability 1 − α²

where |α| < 1, and {L_t} is a sequence of IID standard Laplace variates. The marginal PDF and conditional density of X_{t+j} given X_t = y are, respectively,

f_X(x) = ½ exp(−|x|),  −∞ < x < ∞

and

f_{X_{t+j}|X_t}(x|y) = α^{2j} δ(x − α^j y) + ½(1 − α^{2j}) exp(−|x − α^j y|),  −∞ < x, y < ∞

Gamma model
The gamma AR(1) process {X_t}, with gamma (β(1 − α), ν) marginal distribution, was constructed by Sim (1990) as

X_t = α * X_{t−1} + ε_t

where {ε_t} is a sequence of IID gamma (β, ν) random variables with β, ν > 0 and the operator ' * ' is defined as in model (2). The marginal density of {X_t} and its conditional density are, respectively,

f_X(x) = [β(1 − α)]^ν x^{ν−1} exp[−β(1 − α)x] / Γ(ν),  x > 0

and

f_{X_{t+j}|X_t}(x|y) = θ exp[−θ(x + α^j y)] [x/(α^j y)]^{(ν−1)/2} I_{ν−1}(2θ(α^j xy)^{1/2}),  x, y > 0

where θ = β(1 − α)/(1 − α^j), 0 < α < 1, and I_r(z) is the modified Bessel function of the first kind and of order r. Another well-developed gamma model is the GAR(1) model of Gaver and Lewis (1980). The GAR(1) model was constructed from the simple difference equation (1). However, unlike the exponential AR(1) model, its intractable joint PDF makes it an unlikely candidate for our model-building approach.
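The α* operator can be simulated from its definition in model (2): draw N(x) ~ Poisson(αβx) and add N(x) IID exponential(β) summands, which is a single gamma(N(x), 1/β) draw. The sketch below (our illustration under assumed parameter values) checks the stationary mean ν/[β(1 − α)]:

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, beta, nu, n = 0.5, 1.0, 2.0, 100_000  # assumed illustration values

x = np.empty(n)
x[0] = rng.gamma(nu, 1 / (beta * (1 - alpha)))  # stationary gamma marginal
for t in range(1, n):
    # alpha * x: a Poisson(alpha*beta*x) number of IID exponential(beta) summands
    n_jumps = rng.poisson(alpha * beta * x[t - 1])
    thinned = rng.gamma(n_jumps, 1 / beta) if n_jumps > 0 else 0.0
    x[t] = thinned + rng.gamma(nu, 1 / beta)  # add the gamma(beta, nu) innovation

print(x.mean())  # should be near nu / (beta * (1 - alpha))
```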

Note that the conditional density, and thus the joint PDF, of all the models in this section, except the gamma model, are not symmetric in x and y. Consequently, the generated processes, except the gamma AR(1) process, are not time-reversible, unlike the Gaussian AR(1) process.

MODEL IDENTIFICATION AND ESTIMATION

At the identification stage we shall tentatively select a suitable model from the class of models discussed in this paper. The standard identification procedure of Box and Jenkins (1976) is carried out by matching patterns in the sample ACF and partial autocorrelation function (PACF) of the observed series with the theoretical patterns of known models. Thus the sampling distributions of the sample ACF and PACF are needed to check whether the sample ACF and PACF are effectively zero after some specific lag. Recently, the non-parametric bootstrap technique has been used by Aczel and Josephy (1992) in estimating the sampling distributions of the sample ACF and PACF based on information in a single sample from an unknown population. In their procedure the non-parametric bootstrap resampling scheme of Künsch (1989) was used to generate bootstrap replications from the observations x_0, x_1, ..., x_n.


For each replicate, the estimated values of ρ_k and φ_kk, k = 1, 2, ..., are computed. These bootstrap values can then be used to construct the bootstrap (empirical) distributions and the 100(1 − a)% prediction limits of each of the statistics ρ_k and φ_kk, k = 1, 2, ... .
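The moving-block resampling scheme of Künsch can be sketched as follows (a simplified illustration of the idea, not Aczel and Josephy's exact implementation): resample overlapping blocks of the series, recompute the lag-1 sample autocorrelation on each replicate, and read prediction limits off the empirical percentiles.

```python
import numpy as np

def lag1_acf(x):
    x = x - x.mean()
    return np.dot(x[1:], x[:-1]) / np.dot(x, x)

def block_bootstrap_acf(x, block_len=20, n_boot=500, seed=0):
    """Moving-block bootstrap replicates of the lag-1 sample autocorrelation."""
    rng = np.random.default_rng(seed)
    n = len(x)
    n_blocks = n // block_len
    reps = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        resampled = np.concatenate([x[s:s + block_len] for s in starts])
        reps[b] = lag1_acf(resampled)
    return reps

# toy series: AR(1) with exponential innovations
rng = np.random.default_rng(4)
x = np.empty(500)
x[0] = 1.0
for t in range(1, 500):
    x[t] = 0.5 * x[t - 1] + rng.exponential()

reps = block_bootstrap_acf(x)
lo, hi = np.percentile(reps, [5, 95])
print(lo, hi)  # 90% bootstrap limits for rho_1
```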

If the ACF and PACF of the sample data are consistent with those of the AR(1) model, then the remaining problem in selecting an AR(1) model from our proposed models is to identify an appropriate marginal distribution from the sample data. The chi-square and Kolmogorov-Smirnov tests are the best-known statistical tests of goodness-of-fit. Other widely used methods are the quantile-quantile (Q-Q) probability plots (e.g. Hamilton, 1992) and the percentage-percentage (P-P) probability plots (e.g. Gan and Koehler, 1990).

After the model has been identified, one can then proceed to obtain the CLS and ML estimates of the unknown parameters in the identified model. For a given set of observations x_0, x_1, ..., x_n that follow an AR(1) model with unknown vector parameter θ, the CLS estimate of θ is obtained by minimizing the sum of squares (Klimko and Nelson, 1978)

S_n(θ) = Σ_{t=1}^{n} [x_t − E(X_t | x_{t−1})]²

where E(X_t | x_{t−1}) is the conditional expectation of X_t given X_{t−1} = x_{t−1}. The CLS estimators are generally not efficient, but they can be used as initial values for the more desirable ML estimation procedure. The ML estimate of the unknown parameter θ is obtained by numerically maximizing the log-likelihood function

log L(θ) = Σ_{t=1}^{n} log f_{X_t|X_{t−1}}(x_t | x_{t−1})

or by solving its likelihood equations with the Newton-Raphson method.
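For a concrete instance of numerically maximizing the log-likelihood, the sketch below (our illustration: the standard logistic model is assumed, the transition density is the innovation density evaluated at x_t − αx_{t−1}, and a crude grid search stands in for Newton-Raphson) recovers α from simulated data:

```python
import numpy as np

def loglik(alpha, x):
    """Conditional log-likelihood of the standard logistic AR(1) model, alpha in (0, 1)."""
    z = x[1:] - alpha * x[:-1]
    return np.sum(np.log(np.sin(alpha * np.pi))
                  - np.log(2 * alpha * np.pi)
                  - np.log(np.cosh(z) + np.cos(alpha * np.pi)))

rng = np.random.default_rng(5)
alpha_true, n = 0.6, 20_000

# simulate via the inversion formula for the logistic innovations
u = rng.uniform(size=n)
eps = np.log(np.sin(alpha_true * np.pi * u) / np.sin(alpha_true * np.pi * (1 - u)))
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = alpha_true * x[t - 1] + eps[t]

# maximize the log-likelihood over a fine grid of alpha values
grid = np.linspace(0.01, 0.99, 981)
alpha_ml = grid[np.argmax([loglik(a, x) for a in grid])]
print(alpha_ml)
```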

Exponential model
The parameter estimation procedure of the exponential AR(1) model is well established. Having observed x_0, x_1, ..., x_n, Gaver and Lewis (1980) pointed out that a natural estimate for the model parameter α is

α̂ = min{x_t/x_{t−1}: t = 1, 2, ..., n}

and this was shown to be a consistent estimator by Andel (1989). The CLS estimates of α and λ are given as

α̂_C = [n Σ x_t x_{t−1} − (Σ x_t)(Σ x_{t−1})] / [n Σ x²_{t−1} − (Σ x_{t−1})²]   (3)

and

λ̂_C = (1 − α̂_C)/ε̄,  where ε̄ = (1/n) Σ (x_t − α̂_C x_{t−1})   (4)

(all sums running over t = 1, ..., n) respectively. Both estimators are shown by Billard and Mohamed (1991) to be strongly consistent and asymptotically normally distributed. Simulation studies by Billard and Mohamed also suggest that they perform better than the Yule-Walker estimators of Lawrance and Lewis (1981).
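Since E(X_t | x_{t−1}) = αx_{t−1} + (1 − α)/λ is linear, the CLS estimates reduce to ordinary least squares of x_t on x_{t−1}. A sketch (our illustration of the idea, with assumed parameter values, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(6)
alpha_true, lam_true, n = 0.7, 2.0, 50_000

# simulate the exponential (Gaver-Lewis) AR(1) model
x = np.empty(n)
x[0] = rng.exponential(1 / lam_true)
for t in range(1, n):
    eps = 0.0 if rng.uniform() < alpha_true else rng.exponential(1 / lam_true)
    x[t] = alpha_true * x[t - 1] + eps

# least squares of x_t on x_{t-1}: slope -> alpha, intercept -> (1 - alpha)/lambda
A = np.column_stack([x[:-1], np.ones(n - 1)])
(slope, intercept), *_ = np.linalg.lstsq(A, x[1:], rcond=None)
alpha_cls = slope
lam_cls = (1 - alpha_cls) / intercept

# the "natural" estimate: smallest observed ratio x_t / x_{t-1}
alpha_nat = np.min(x[1:] / x[:-1])
print(alpha_cls, lam_cls, alpha_nat)
```

The minimum-ratio estimate is essentially exact here because the innovation is exactly zero in a positive fraction of the steps.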


For a given value of α, the ML estimate of λ is easily derived as (Sim, 1987)

λ̂ = n′ / Σ′ (x_t − αx_{t−1})

where the sum Σ′ runs over, and n′ is the number of, pairs (x_t, x_{t−1}) such that x_t > αx_{t−1}.

Gamma model
The system of likelihood equations for a gamma AR(1) model is complicated. We shall discuss an iterative estimation procedure which involves maximizing the log-likelihood function with respect to one parameter while holding all others at their current trial estimates. This is done for each parameter in turn and usually requires several passes through all the parameters. The first step is to obtain the CLS estimates of α, ν, and β as starting values for the succeeding iterative ML procedure. Given observations x_0, x_1, ..., x_n, the CLS estimate of α is the same as the α̂_C (equation (3)) of the exponential model. The CLS estimate of ν/β is easily derived as

(ν/β)^ = (1/n) Σ_{t=1}^{n} (x_t − α̂_C x_{t−1})

By taking α = α̂_C and ν = β(ν/β)^, the individual estimates (ν̂, β̂) of (ν, β) that maximize the log-likelihood function of x_0, x_1, ..., x_n can then be obtained by considering all the trial values of β. The second step of the estimation procedure is to obtain the ML estimate of one of the parameters while holding the other two parameters at their latest trial estimates. For instance, by taking α = α̂_C and ν = ν̂, we can obtain the ML estimate β̂_L of β by solving its likelihood equation with the Newton-Raphson method. This second step of the ML estimation procedure is repeated, for each parameter in turn, until all the parameter estimates converge.

Logistic and hyperbolic secant models
As the estimation procedure of the hyperbolic secant model is similar to that of the logistic model, we shall discuss only the ML procedure of the latter. For a model with a standard logistic marginal distribution, the ML estimate of α can easily be obtained from its likelihood equation (Sim, 1993) by using the CLS estimate of α as an initial value. For a model with a two-parameter logistic marginal distribution, the ML procedure is much more difficult. Note that if Z has the standard logistic PDF as defined above, then the transformed variable X = uZ + μ, u > 0, has the two-parameter PDF f_X(x) = (1/4u) sech²[(x − μ)/2u]. The AR(1) model with a two-parameter logistic marginal has conditional density

f_{X_{t+j}|X_t}(x|y) = sin(α^j π) / {2uα^j π (cosh[(x − μ)/u − α^j (y − μ)/u] + cos(α^j π))}

The CLS estimate α̂_C of α is given as in equation (3), and the CLS estimate of μ is μ̂_C = ε̄/(1 − α̂_C), where ε̄ = (1/n) Σ (x_t − α̂_C x_{t−1}) is as in equation (4). The iterative ML estimation procedure begins by taking the CLS estimates of α and μ as fixed values in solving the likelihood equation of u. Then the second step of the ML procedure, as discussed for the gamma model, is used iteratively to obtain the ML estimates of α, μ, and u.

Simulation experiments have been performed to provide a better understanding of the above two-step iterative ML procedure. For each value of α = 0.2(0.2)0.8 with u = 1.0 and μ = 2.0,


Table I. Simulation results for the logistic model with α = 0.2(0.2)0.8, u = 1.0, μ = 2.0 (standard deviations in parentheses)

α     α̂                û                μ̂
0.2   0.1937 (0.0713)   0.9936 (0.0680)   1.9768 (0.1421)
0.4   0.3983 (0.0646)   1.0128 (0.0729)   1.9904 (0.1484)
0.6   0.5927 (0.0499)   1.0145 (0.1077)   1.9744 (0.1896)
0.8   0.8007 (0.0291)   1.0106 (0.1185)   1.9885 (0.2329)

100 samples, each of size 200, were generated from the logistic AR(1) model. The ML estimates of α, u, and μ were then obtained for each sample, and the mean and standard deviation (in parentheses) of each of the parameters were then calculated over the set of 100 samples. These results are summarized in Table I, and we conclude that the ML estimates of α, u, and μ are not significantly different from their true values.

Laplace model
For a model with a two-parameter Laplace marginal distribution f_X(x) = (1/2u) exp[−|x − μ|/u], u > 0, the CLS estimate α̂_C of α is given as in equation (3). If α is known, the ML estimate of μ that minimizes Σ |ŷ_t − μ(1 − α)| and thus maximizes its log-likelihood function is

μ̂_L = (median of ŷ_1, ŷ_2, ..., ŷ_{n′}) / (1 − α)

where ŷ_t = x_t − αx_{t−1} and n′ is the number of pairs (x_t, x_{t−1}) such that (x_t − αx_{t−1}) − μ(1 − α) ≠ 0. The ML estimate of u can then be obtained as

û_L = (1/n′) Σ′ |ŷ_t − μ̂_L(1 − α)|

where the sum Σ′ runs over the n′ pairs with nonzero deviations.
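A quick numerical check of the median-based estimate (our illustration; α is assumed known and the parameter values are assumed): simulate the two-parameter Laplace model, form ŷ_t = x_t − αx_{t−1}, and rescale the sample median.

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, mu, u_scale, n = 0.6, 2.0, 1.0, 50_000  # assumed illustration values

x = np.empty(n)
x[0] = mu
for t in range(1, n):
    # innovation: 0 with probability alpha^2, otherwise a standard Laplace variate
    eps = 0.0 if rng.uniform() < alpha**2 else rng.laplace()
    x[t] = mu * (1 - alpha) + alpha * x[t - 1] + u_scale * eps

y_hat = x[1:] - alpha * x[:-1]       # y_hat is centred at mu*(1 - alpha)
mu_ml = np.median(y_hat) / (1 - alpha)
print(mu_ml)
```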

DIAGNOSTIC CHECKING

The distributional assumption of the identified model that needs to be checked is that its IID innovation process {ε_t} follows the required distribution. This check can be done by using the standard goodness-of-fit tests or the graphical P-P and Q-Q probability plots to test whether the fitted residuals ε̂_t follow the specified marginal. To check the independence assumption of {ε_t}, we can once again obtain a good estimate of the sampling distribution of the sample ACF of {ε̂_t} by using the non-parametric bootstrap method of Künsch as discussed by Aczel and Josephy (1992). Note that for the logistic, hyperbolic secant, exponential, and Laplace models, we have ε̂_t = x_t − α̂x_{t−1}. However, for our gamma AR(1) model, the fitted residuals ε̂_t are not available. Thus, an alternative procedure must be used in checking the adequacy of the fitted gamma model.


A diagnostic analysis that can be used without the fitted residuals {ε̂_t} is the parametric bootstrap procedure of Tsay (1992). His basic idea is that a fitted parametric model is adequate if it can successfully reproduce some special characteristics of the underlying process, such as long-memory dependence, spectral density function, time reversibility, skewness, etc. As all the innovation processes {ε_t} of this paper have known probability distributions, the tentatively identified and fitted model can thus be used repeatedly to generate samples. These generated samples are then used to construct empirical distributions of the functionals (e.g. ACF, PN ratio, skewness) designed to describe specific features of interest. Each of these empirical distributions then serves as a reference distribution to which the corresponding functional quantity of the observed data can be compared. For instance, if time reversibility is our characteristic of interest and the fitted model is adequate, then the PN ratio of the observed sample should fall within the 100(1 − a)% probability limits of the ratio, constructed by using the ½a-th and (1 − ½a)-th percentiles of its empirical distribution. Note that the PN ratio is a statistic for describing the time reversibility of a time series. For details readers are referred to Tsay (1992).
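Tsay's idea can be sketched for the skewness functional (our illustration; an exponential AR(1) model with assumed fitted parameters stands in for a real fit): simulate many series from the fitted model, build the empirical distribution of the sample skewness, and check whether the observed value falls inside its probability limits.

```python
import numpy as np

def skewness(x):
    z = x - x.mean()
    return np.mean(z**3) / np.mean(z**2) ** 1.5

def simulate_ear1(alpha, lam, n, rng):
    """Generate one series from the (assumed fitted) exponential AR(1) model."""
    x = np.empty(n)
    x[0] = rng.exponential(1 / lam)
    for t in range(1, n):
        eps = 0.0 if rng.uniform() < alpha else rng.exponential(1 / lam)
        x[t] = alpha * x[t - 1] + eps
    return x

rng = np.random.default_rng(8)
alpha_hat, lam_hat, n_obs = 0.5, 1.0, 240  # assumed "fitted" values

observed = simulate_ear1(alpha_hat, lam_hat, n_obs, rng)  # stand-in for real data

# reference distribution of the skewness under the fitted model
reps = np.array([skewness(simulate_ear1(alpha_hat, lam_hat, n_obs, rng))
                 for _ in range(500)])
lo, hi = np.percentile(reps, [5, 95])
print(lo, skewness(observed), hi)  # observed value vs 90% probability limits
```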

Algorithms for generating the required random variables with logistic, hyperbolic secant, Laplace, exponential, and gamma marginals are readily available in Devroye (1986). The innovation process {ε_t} of the logistic and hyperbolic secant models can be generated by the inversion method as

ε = ln[sin(απU) / sin(απ(1 − U))]

and

ε = (2/π) sinh⁻¹{cos(½απ) sinh[ln(tan(½πU))]}

respectively, where U is uniformly distributed over the interval (0, 1).
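The second inversion formula can be checked numerically (our illustration, with assumed parameter values): driving model (1) with these innovations should yield the standard hyperbolic secant marginal, which has mean 0 and variance 1.

```python
import numpy as np

rng = np.random.default_rng(9)
alpha, n = 0.5, 200_000  # assumed illustration values

u = rng.uniform(size=n)
u = np.clip(u, 1e-12, 1 - 1e-12)  # guard the tan/log against the endpoints

# eps = (2/pi) * asinh( cos(alpha*pi/2) * sinh( ln tan(pi*u/2) ) )
inner = np.cos(alpha * np.pi / 2) * np.sinh(np.log(np.tan(np.pi * u / 2)))
eps = (2 / np.pi) * np.arcsinh(inner)

x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = alpha * x[t - 1] + eps[t]
x = x[1000:]  # burn-in

print(x.mean(), x.var())  # marginal should be near mean 0, variance 1
```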

PREDICTOR AND PREDICTION INTERVAL

Given the information set I_n = {x_0, x_1, ..., x_n}, the variable to be predicted, X_{n+j}, j > 0, is fully characterized by the conditional PDF P(x < X_{n+j} < x + dx | I_n). Classical statistical theory (see Rao, 1973, p. 264) tells us that the minimum mean square error point forecast of X_{n+j} given the information set I_n is x̂_n(j) = E(X_{n+j} | I_n). Two salient features of all the AR(1) processes discussed in this paper are that (1) their conditional PDF f_{X_{n+j}|X_n}(x|y) can be expressed explicitly and (2) the conditional expectation of X_{n+j}, given I_{n−1} and X_n = y, is a linear function of y despite the fact that {X_t} is not a Gaussian process. For our non-normal AR(1) models, the j-step-ahead forecast of X_{n+j}, given the information set I_{n−1} and X_n = y, is the same as that obtained from the standard Gaussian AR(1) model, that is,

E(X_{n+j} | I_{n−1}, X_n = y) = (1 − α^j)μ_X + α^j y

where μ_X is the mean value of the stationary process {X_t}. We define the statistics L = l(x_0, x_1, ..., x_n) and U = u(x_0, x_1, ..., x_n) that satisfy

P(L < X_{n+j} < U | I_n) = 1 − p

as the two-sided 100(1 − p)% prediction limits for X_{n+j} given the information set I_n. For the logistic, hyperbolic secant, and Laplace models, which have symmetrical marginals, we can

Table II. Expressions for a in the prediction limits of the logistic and hyperbolic secant models

Model                a
Logistic             ln{sin[(1 − ½p)α^j π] / sin(½pα^j π)}
Hyperbolic secant    (2/π) sinh⁻¹{cos(½α^j π) sinh[ln(tan(½π(1 − ½p)))]}

deduce that

L = x̂_n(j) − a,  U = x̂_n(j) + a

where the expressions for a are given as in Table II. The lower and upper prediction limits of the exponential model are given as

L = max{0, α^j y − ln[(1 − ½p)/(1 − α^j)]/λ}

and

U = α^j y − ln[½p/(1 − α^j)]/λ

respectively. For the case of the gamma model, the lower prediction limit of X_{n+j} given the information set I_{n−1} and X_n = y is the solution u of

½p = Σ_{k=0}^{∞} [e^{−θα^j y}(θα^j y)^k / k!] [γ(ν + k, θu)/Γ(ν + k)]   (5)

where θ = β(1 − α)/(1 − α^j) and γ(a, z) is the (lower) incomplete gamma function. Similarly, the upper prediction limit of X_{n+j} given X_n = y is the solution u of the above expression with ½p replaced by 1 − ½p. Note that if ν is a positive integer, equation (5) can be expressed as

½p = P[χ²_{2ν}(2θα^j y) ≤ 2θu]

where χ²_{2ν}(λ) has a non-central chi-square distribution with 2ν degrees of freedom and non-centrality parameter λ.
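The exponential-model limits can be validated by simulating from the one-step transition (our illustration, with assumed parameter values). Note that when ½p < α^j the point mass at α^j y already exceeds the lower tail probability, so the interval is conservative and the realized coverage is at least 1 − p:

```python
import numpy as np

rng = np.random.default_rng(10)
alpha, lam, p, y = 0.5, 1.0, 0.05, 2.0  # assumed illustration values; j = 1
aj = alpha  # alpha**j with j = 1

L = max(0.0, aj * y - np.log((1 - p / 2) / (1 - aj)) / lam)
U = aj * y - np.log((p / 2) / (1 - aj)) / lam

# simulate X_{n+1} | X_n = y directly from the transition
m = 200_000
eps = np.where(rng.uniform(size=m) < aj, 0.0, rng.exponential(1 / lam, size=m))
x_next = aj * y + eps

coverage = np.mean((x_next > L) & (x_next < U))
print(L, U, coverage)  # coverage should be at least 1 - p
```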

AN EXAMPLE: DISCHARGE OF THE MEKONG RIVER

As an illustration, the gamma AR(1) model is fitted to the standardized monthly flows of the Mekong river in Thailand. The historical data consist of 240 values of the monthly average discharge (January 1951 to December 1970) taken from the Unesco Press publication entitled Discharge of Selected Rivers of the World. The seasonal effects of the monthly river flows are removed by using the widely used Winters' seasonal decomposition method. For ease of parameter estimation, the seasonally adjusted data are standardized by dividing by their standard deviation.

The standardized data are positively skewed and their ACF and PACF are found to be consistent with the correlation structure of the AR(1) process. Figure 1 shows a histogram plot of the standardized Mekong river flows with the best-fitting gamma distribution superimposed on it. The chi-square and Kolmogorov-Smirnov goodness-of-fit tests applied to the fitted

Figure 1. Histogram of the standardized Mekong river flows

distribution give significance probabilities (p-values) of 0.672 and 0.998, respectively, indicating that the gamma distribution provides a good fit to the standardized river flows. To verify that the sample PACF {φ̂_kk} is effectively zero after lag 1, 500 bootstrap replications are generated by using the non-parametric bootstrap method of Künsch (1989). A histogram of the 500 bootstrap values of φ̂₂₂ is shown in Figure 2. The 5th and 95th percentiles of φ̂₂₂ are −0.1169 and 0.0613, respectively, indicating that φ̂₂₂ is not significantly different from zero at the 0.10 significance level.

The fitted gamma AR(1) model, with ML estimates α̂_L = 0.40679, β̂_L = 7.3916, and ν̂_L = 20.0422, gives a residual sum of squares (RSS) of 189.458 and a sum of absolute deviations (SAD) of 154.928, whereas the standard Gaussian AR(1) model gives an RSS of 190.086 and a SAD of 157.401. To diagnose the adequacy of our fitted gamma model, we consider two simple functionals. The first is the sample skewness γ, and the second is the PN ratio that characterizes the time reversibility of a time series. The empirical PDF of γ (Figure 3(a)) and the PN ratio (Figure 3(b)) are obtained by using 500 samples, each with 240 observations, that are generated from the fitted gamma AR(1) model. Both the sample skewness (0.4155) and the PN ratio (0.9752) of the standardized Mekong river flows fall within their respective 90% probability limits of (0.1225, 0.7199) and (0.8819, 1.1339). Thus, as the fitted gamma model has successfully reproduced the skewness and PN ratio of the standardized Mekong river flows, we conclude that the fitted model is adequate.

Figure 4 shows the actual values (solid line) and l-step-ahead forecasts (dashed line) made at time origin 228 for lead times l = 1, 2, ..., 12 for the standardized Mekong river flows {x_t}. Also shown are the 95% prediction limits (dot-dash lines for the gamma model, dotted lines for the Gaussian model) for X_{228+l}, l = 1, 2, ..., 12. Figure 4 clearly shows that the gamma AR(1) model gives much narrower prediction limits than those obtained using the standard

Figure 2. Bootstrap histogram for the φ̂₂₂ of standardized Mekong river flows

Figure 3. Model checking for standardized Mekong river flows: (a) empirical PDF of the skewness; (b) empirical PDF of the PN ratio

Figure 4. Forecast plot for standardized Mekong river flows: actual values (solid line), forecast values (dashed line), prediction limits of the gamma AR(1) model (dot-dash lines), and Gaussian AR(1) model (dotted lines)

Gaussian AR(1) model. Furthermore, the asymmetric prediction limits of the gamma model capture the asymmetric nature of the underlying distribution of the standardized river flows.

The computations of this example were performed on a 486-based personal computer using the Microsoft FORTRAN compiler and the STATGRAPHICS statistical graphics system.

CONCLUSIONS

The modelling of non-normal time series is in its early stage of development. In this paper we have discussed only a general framework of model building based on a class of non-normal AR(1) models with specified marginals. The identification and diagnostic checking procedures of this paper are by no means complete. Many questions are still to be answered and much work remains to be done. For example, a more ambitious checking procedure (which needs a reasonable amount of data) would be to ensure that the fitted model is capable of generating the actual joint PDF of the observed series, rather than just the marginal. Further work is also needed in investigating properties of the estimators, predictors, and test statistics of non-normal models.

ACKNOWLEDGEMENTS

The author is indebted to the referees for careful reading and constructive comments on the original version of this paper.


REFERENCES

Aczel, A. D. and Josephy, N. H., 'Using the bootstrap for improved ARIMA model identification', Journal of Forecasting, 11 (1992), 71-80.

Andel, J., 'Non-negative autoregressive processes', Journal of Time Series Analysis, 10 (1989), 1-11.

Arnold, B. C. and Robertson, C. A., 'Autoregressive logistic processes', Journal of Applied Probability, 26 (1989), 524-31.

Billard, L. and Mohamed, F. Y., 'Estimation of the parameters of an EAR(p) process', Journal of Time Series Analysis, 12 (1991), 179-92.

Box, G. E. P. and Jenkins, G. M., Time Series Analysis, Forecasting and Control, 2nd edition, San Francisco: Holden-Day, 1976.

Devroye, L., Non-Uniform Random Variate Generation, New York: Springer-Verlag, 1986.

Dewald, L. S. and Lewis, P. A. W., 'A new Laplace second-order autoregressive time series model-NLAR(2)', IEEE Transactions on Information Theory, 31 (1985), 645-51.

Gan, F. F. and Koehler, K. J., 'Goodness-of-fit tests based on P-P probability plots', Technometrics, 32 (1990), 289-303.

Gaver, D. P. and Lewis, P. A. W., 'First-order autoregressive gamma sequences and point processes', Advances in Applied Probability, 12 (1980), 727-45.

Granger, C. W. J. and Newbold, P., 'Forecasting transformed series', Journal of the Royal Statistical Society, B38 (1976), 189-203.

Hamilton, L. C., Regression With Graphics: A Second Course in Applied Statistics, Belmont: Wadsworth, 1992.

Janacek, G. J. and Swift, A. L., 'A class of models for non-normal time series', Journal of Time Series Analysis, 11 (1990), 19-32.

Klimko, L. A. and Nelson, P. I., 'On conditional least squares estimation for stochastic processes', Annals of Statistics, 6 (1978), 629-42.

Künsch, H. R., 'The jackknife and the bootstrap for general stationary observations', Annals of Statistics, 17 (1989), 1217-41.

Lawrance, A. J. and Lewis, P. A. W., 'A new autoregressive time series model in exponential variables (NEAR(1))', Advances in Applied Probability, 13 (1981), 826-45.

Rao, C. R., Linear Statistical Inference and Its Applications, 2nd edition, New York: John Wiley, 1973.

Rao, P. S. and Johnson, D. H., 'A first-order AR model for non-Gaussian time series', Proceedings of the IEEE International Conference on ASSP, 3 (1988), 1534-7.

Sim, C. H., 'A stochastic bivariate process associated with the EAR(1) model', IEEE Transactions on Information Theory, 33 (1987), 47-51.

Sim, C. H., 'First-order autoregressive models for gamma and exponential processes', Journal of Applied Probability, 27 (1990), 325-32.

Sim, C. H., 'First-order autoregressive logistic processes', Journal of Applied Probability, 30 (1993), 467-70.

Swift, A. L. and Janacek, G. J., 'Forecasting non-normal time series', Journal of Forecasting, 10 (1991), 501-20.

Tsay, R. S., 'Model checking via parametric bootstraps in time series analysis', Applied Statistics, 41 (1992), 1-15.

Weiss, G., 'Time reversibility of linear stochastic processes', Journal of Applied Probability, 12 (1975), 143-71.

Author's biography: Chiaw-Hock Sim is an Associate Professor in the Department of Mathematics, University of Malaya, Malaysia. He received a BSc degree in mathematics from Nanyang University, Singapore, and MA and PhD degrees in mathematical statistics from the University of Lancaster and the University of Malaya, respectively. His research interests include time-series modelling, data simulation, and statistical quality control.

Author’s address: Chiaw-Hock Sim, Department of Mathematics, University of Malaya, 59100 Kuala Lumpur, Malaysia.