Issues on Spurious Behaviors Christos Agiakloglou University of Piraeus Universidad Complutense de Madrid March 18, 2014


Three topics on spurious behaviors

Spurious Correlations for stationary AR(1) processes (with Apostolos Tsimpanos)

The Balance between Size and Power (with Charalambos Agiropoulos)

Spurious Regressions for non-linear or time varying coefficient processes (with Anil Bera)

Spurious Behavior

Granger and Newbold (1974) set the cat among the pigeons with their simulation results showing that when two independent drift-free random walks are used in a simple linear regression, one ends up with a significant t-statistic 76% of the time.

This phenomenon was introduced by Yule (1926) as a spurious correlation.

Agiakloglou and Tsimpanos (2012) examined the spurious correlation phenomenon for stationary processes and they found no evidence of spurious behavior using the true variance of the sample correlation coefficient of the two independent AR(1) processes.

However, it was left to Phillips (1986) to mathematically prove the Granger-Newbold simulation results, showing that the usual t-statistic does not have a limiting distribution.

Entorf (1997) considered random walks with drift.

Granger, Hyung and Jeon (2001) found spurious results even for stationary independent AR(1) processes.

The story continues

Marmol (1995), who generalized the work of Phillips (1986) to higher-order integrated processes, also showed that the Durbin-Watson statistic converges in probability to zero. Low values of this statistic are therefore expected in the presence of spurious regressions, a finding that was also indicated by Granger and Newbold (1974).

Agiakloglou (2009) showed that evidence of serially correlated errors will also appear in the case of two independent stationary AR(1) processes, not only in the first moments but also in the second moments, as described by an ARCH(1) error structure.

Recall that the presence of serially correlated errors in the context of spurious regression had also been investigated by Newbold and Davies (1978) for variables that were generated by non-stationary moving average processes.

Tsay (1999) also examined spurious regression for I(1) processes with infinite variance errors.

Two Examples

Hendry (1980) “Econometrics – Alchemy or science?” studied the relationship between rainfall and the inflation rate in the U.K.

Ferson, Sarkissian and Simin (2003) “Spurious regression in financial Economics” found that many predictive stock return regressions in the literature, based on individual predicting variables, may be spurious.

Spurious Correlations for Stationary AR(1) Processes

An alternative approach for testing for linear association for two independent stationary AR(1) processes

Story Number One

Spurious Correlations vs. Spurious Regressions

Two similar if not identical terms referring to the same phenomenon of obtaining false evidence about the existence of a linear relationship between two variables.

In the case of spurious correlations the analyst has no indication about the existence of such behavior, unless he or she has some a priori information about the relationship between these two variables, such that a high absolute value of the sample correlation coefficient will be considered very suspicious.

In the case of spurious regressions the analyst will have an indication, such as a low value of the Durbin-Watson statistic, as Granger and Newbold (1974) have pointed out; see also Marmol (1995) and Agiakloglou (2009).

More difficult to detect spurious correlations.

Testing for Linear Association

The test of zero correlation in population is based on the following hypotheses:

H0: ρ = 0 against H1: ρ ≠ 0

and it is implemented by the usual t statistic, i.e.,

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{T - 2}}}$$

where r is the sample correlation coefficient and T is the sample size.

The t statistic follows a t distribution with (T - 2) degrees of freedom and the null hypothesis will not be rejected if its absolute value is less than the critical value.
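The test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `sample_correlation` and `t_statistic` and the two short data vectors are hypothetical and do not come from the slides.

```python
import math


def sample_correlation(x, y):
    """Pearson sample correlation coefficient r of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, T):
    """Usual t statistic for H0: rho = 0, referred to a t(T - 2) distribution."""
    return r * math.sqrt((T - 2) / (1.0 - r * r))


# Hypothetical data, chosen only to exercise the two functions.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
r = sample_correlation(x, y)
print(round(r, 4), round(t_statistic(r, len(x)), 4))
```

The statistic is then compared with the critical value of the t distribution with T - 2 degrees of freedom (or with 1.96 for large T, as in the simulations that follow).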

Frequency Distribution of r

Yule (1926) studied the properties of the sample correlation coefficient of two random variables and noticed that the major factor which determines this spurious behavior is the shape of the frequency distribution of the correlation coefficient of the two series.

More precisely, if the distribution has a U shape it is certain that spurious correlations will arise.

Banerjee, A., Dolado, J., Galbraith, J. W. and Hendry, D. F. (1993), using Monte Carlo analysis, examined the frequency distribution of the correlation coefficient for various orders of integrated independent time series, verifying Yule’s (1926) initial results.

They concluded:

I) Frequency Distribution of r

A) If the two series are stationary white noise processes, the frequency distribution of the correlation coefficient will be symmetric around zero and will resemble a normal distribution.

Figure: Frequency distribution for the correlation coefficient between two independent white noise processes (T = 100, 10,000 replications).

II) Frequency Distribution of r

B) If the two processes are non-stationary I(1) processes, the frequency distribution of the correlation coefficient will be a semi-ellipse.

Figure: Frequency distribution for the correlation coefficient between two independent I(1) processes (T = 100, 10,000 replications).

III) Frequency Distribution of r

C) If the two processes are non-stationary I(2) processes, the frequency distribution has a U shape, with values near -1 and +1 being the most likely to occur.

Figure: Frequency distribution for the correlation coefficient between two independent I(2) processes (T = 100, 10,000 replications).

Frequency Distribution of t

Consider:

a) two independent white noise processes,

b) two independent I(1) random walk processes, i.e., Yt = Yt-1 + ut,

c) two independent non-stationary I(2) processes, i.e., Yt = 2Yt-1 – Yt-2 + ut,

for a sample size of 100 observations using 10,000 replications.

Besides the percentage of rejections of the null hypothesis, the standard deviation of the t statistic deviates strongly from one in the two non-stationary cases, as opposed to the white noise case, where it remains equal to one regardless of the sample size.

Note that the scales of the graphical presentations are not the same.
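The three cases can be reproduced approximately with a small Monte Carlo sketch. It is not the authors' code: the function names, the zero initial conditions, the seed and the reduced number of replications are assumptions made here for illustration, and only the Python standard library is used.

```python
import math
import random


def integrate(series):
    """Cumulative sum; each pass raises the order of integration by one."""
    total, out = 0.0, []
    for value in series:
        total += value
        out.append(total)
    return out


def make_series(T, order, rng):
    """order 0: white noise; 1: Y_t = Y_{t-1} + u_t; 2: Y_t = 2Y_{t-1} - Y_{t-2} + u_t."""
    series = [rng.gauss(0.0, 1.0) for _ in range(T)]
    for _ in range(order):
        series = integrate(series)  # assumes zero initial conditions
    return series


def t_stat(x, y):
    """Usual t statistic for zero correlation: r * sqrt((T - 2) / (1 - r^2))."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((T - 2) / (1.0 - r * r))


def rejection_rate(T, order, reps, seed=0, crit=1.96):
    """Share of replications with |t| > crit for two independent series."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = make_series(T, order, rng)
        y = make_series(T, order, rng)
        if abs(t_stat(x, y)) > crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for order, label in [(0, "white noise"), (1, "I(1)"), (2, "I(2)")]:
        print(label, rejection_rate(T=100, order=order, reps=2000))
```

With enough replications the white noise rate should sit near the nominal 5%, while the I(1) and I(2) rates should be far above it, in line with the simulation table.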

Simulation results for spurious correlations for white noise and non-stationary I(1) & I(2) processes based on 10,000 replications

T        % rejections of H0   Mean of r   Std. dev. of r   Mean of t   Std. dev. of t
         (|t| > 1.96)

White noise processes
100      5.03                 0.00        0.10             0.00        1.00
500      5.24                 0.00        0.04             0.00        1.00
1,000    5.10                 0.00        0.03             0.01        1.00

I(1) processes
100      76.95                0.00        0.49             0.03        7.38
500      89.32                0.00        0.49             0.01        16.37
1,000    92.76                0.00        0.49             0.07        23.37

I(2) processes
100      94.82                0.00        0.84             0.03        49.01
500      97.51                0.00        0.82             0.28        98.55
1,000    98.18                -0.01       0.82             -0.40       139.39

I) Frequency Distribution of t

Figure: Frequency distribution for the t statistic between two independent white noise processes (T = 100, 10,000 replications).

II) Frequency Distribution of t

Figure: Frequency distribution for the t statistic between two independent I(1) processes (T = 100, 10,000 replications).

III) Frequency Distribution of t

Figure: Frequency distribution for the t statistic between two independent I(2) processes (T = 100, 10,000 replications).

Spurious correlations for AR(1)

Consider two independent AR(1) processes Xt and Yt generated from the following DGP:

$$X_t = \phi_x X_{t-1} + \varepsilon_{xt} \qquad (2)$$

and

$$Y_t = \phi_y Y_{t-1} + \varepsilon_{yt} \qquad (3)$$

where the errors εxt and εyt are each white noise N(0, 1) processes independent of each other and the autoregressive parameters are allowed to take values of 0.0, 0.2, 0.5, 0.8 and 0.9.

Note that if φx = φy = 1, both processes are non-stationary random walk processes without drift, whereas if φx = φy = 0, both processes are white noise processes.

Simulation Results

Unlike the two non-stationary cases, previously discussed and especially the I(1) case, the percentage of rejections of the null hypothesis of zero correlation remains unchanged regardless of the sample size.

It is only affected by the magnitude of the autoregressive parameters.

We get more spurious results as the value of the autoregressive parameter increases.

For example, for φx = φy = 0.5, the null hypothesis is rejected approximately 13% of the time at the 5% nominal level, whereas for φx = φy = 0.9 this figure becomes approximately 52%.
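A minimal sketch of this experiment for stationary AR(1) processes follows. The function names, the seed, the number of replications and the choice to draw the starting value from the stationary distribution are assumptions of this illustration; the slides do not state how the series were initialised.

```python
import math
import random


def ar1(T, phi, rng):
    """Stationary AR(1) with N(0, 1) errors, |phi| < 1.

    The pre-sample value is drawn from the stationary law N(0, 1 / (1 - phi^2)).
    """
    x = rng.gauss(0.0, 1.0) / math.sqrt(1.0 - phi * phi)
    out = []
    for _ in range(T):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def t_stat(x, y):
    """Usual t statistic for zero correlation: r * sqrt((T - 2) / (1 - r^2))."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((T - 2) / (1.0 - r * r))


def rejection_rate_ar1(T, phi_x, phi_y, reps, seed=0, crit=1.96):
    """Share of replications in which H0: rho = 0 is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        if abs(t_stat(ar1(T, phi_x, rng), ar1(T, phi_y, rng))) > crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for phi in (0.0, 0.5, 0.9):
        print(phi, rejection_rate_ar1(T=100, phi_x=phi, phi_y=phi, reps=2000))
```

Running it for several sample sizes should show rejection rates that move with the autoregressive parameters but hardly at all with T.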

Percentage of rejections of the null hypothesis of zero correlation at the 5% nominal level (|t| > 1.96) for two independent stationary AR(1) processes based on 10,000 replications

T        φx     φy = 0.0   0.2     0.5     0.8     0.9
100      0.0    5.03
         0.2    5.18       6.20
         0.5    5.38       7.77    12.85
         0.8    5.04       9.86    19.53   35.23
         0.9    4.81       10.44   22.42   41.12   50.27
500      0.0    5.24
         0.2    5.13       5.77
         0.5    5.31       7.92    13.15
         0.8    4.90       9.44    19.72   34.78
         0.9    5.42       10.46   22.67   42.56   52.22
1,000    0.0    5.10
         0.2    5.25       6.00
         0.5    4.73       7.38    12.87
         0.8    5.30       9.94    19.92   36.05
         0.9    4.83       10.53   22.55   43.21   52.01

I) Further Simulation Results

Frequency distribution for the correlation coefficient

Clearly, if the decision as to whether or not spurious behavior exists were based on the shape of the frequency distribution of the correlation coefficient, the analyst would have no indication in this case.

The frequency distribution for the correlation coefficient of two independent stationary AR(1) processes is symmetric around mean zero and it looks very similar to the white noise case previously presented.


Figure: Frequency distribution for the correlation coefficient between two independent AR(1) processes for φx = φy = 0.5 (T = 100, 10,000 replications).

Figure: Frequency distribution for the correlation coefficient between two independent AR(1) processes for φx = φy = 0.9 (T = 100, 10,000 replications).

II) Further Simulation Results

Frequency distribution for the t statistic

As in the case of non-stationary processes, the problem of spurious correlations appears because the value of the standard deviation of the t statistic for testing the null hypothesis of zero correlation is not one for all values of the autoregressive parameters.

Thus, although the frequency distribution for the t statistic is symmetric around mean zero, it becomes flatter than the standard normal distribution as the value of the autoregressive parameters increases.

The standard deviation of the t statistic is affected only by the values of the autoregressive parameters and not by the sample size.

Standard deviation of the t statistic for testing the null hypothesis of zero correlation for two independent stationary AR(1) processes based on 10,000 replications

T        φx     φy = 0.0   0.2     0.5     0.8     0.9
100      0.0    1.00
         0.2    1.02       1.05
         0.5    1.01       1.11    1.30
         0.8    1.00       1.17    1.52    2.12
         0.9    1.02       1.21    1.62    2.45    3.00
500      0.0    1.00
         0.2    1.01       1.04
         0.5    1.00       1.11    1.29
         0.8    1.02       1.18    1.51    2.10
         0.9    1.00       1.21    1.62    2.48    3.10
1,000    0.0    1.00
         0.2    1.00       1.05
         0.5    1.00       1.11    1.28
         0.8    1.01       1.17    1.52    2.15
         0.9    1.00       1.20    1.61    2.47    3.07


Figure: Frequency distribution for the t statistic between two independent AR(1) processes for φx = φy = 0.5 (T = 100, 10,000 replications).

Figure: Frequency distribution for the t statistic between two independent AR(1) processes for φx = φy = 0.9 (T = 100, 10,000 replications).

Variance of r of two independent AR(1) processes

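For two independent stationary AR(1) processes Xt and Yt generated by equations (2) and (3) with autocorrelation coefficients ρx and ρy respectively we have:

$$E\left[\left(\frac{1}{T}\sum_{t} X_t Y_t\right)^2\right] = \frac{1}{T^2} E\left[\sum_{t} X_t^2 Y_t^2 + 2\sum_{t<s} X_t Y_t X_s Y_s\right]$$

and since

$$\sum_{t<s} X_t Y_t X_s Y_s = \sum_{t} X_t X_{t+1} Y_t Y_{t+1} + \sum_{t} X_t X_{t+2} Y_t Y_{t+2} + \dots$$

we have

$$E\left[\left(\frac{1}{T}\sum_{t} X_t Y_t\right)^2\right] = \frac{\sigma_x^2 \sigma_y^2}{T^2}\left[T + 2(T-1)\rho_x\rho_y + 2(T-2)\rho_x^2\rho_y^2 + \dots\right]$$

which approximately is equal to:

$$E\left[\left(\frac{1}{T}\sum_{t} X_t Y_t\right)^2\right] \approx \frac{\sigma_x^2 \sigma_y^2}{T}\left(1 + \frac{2\rho_x\rho_y}{1 - \rho_x\rho_y}\right)$$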


Hence, the variance of the sample correlation coefficient between the two independent stationary AR(1) series is approximately defined as:

$$Var(r) \approx \frac{1}{T}\left(\frac{1 + \rho_x\rho_y}{1 - \rho_x\rho_y}\right)$$

or equivalently as:

$$Var(r) \approx \frac{1}{T}\left(\frac{1 + \phi_x\phi_y}{1 - \phi_x\phi_y}\right)$$

since ρx = φx and ρy = φy for AR(1) processes.

For more evidence about the proof of this variance see Bartlett (1935).

McGregor (1962) also verifies the existence of this variance by determining the approximate null distribution of the sample correlation coefficient of two stationary Markov chain processes using the steepest descents method proposed by Daniels (1954 and 1956).
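As a rough consistency check, offered here as an illustration rather than as part of the original derivation: for small r and large T the classical statistic is approximately r√T, so its standard deviation under independence should be close to the square root of T·Var(r). Plugging in equal autoregressive parameters gives

$$\sqrt{T\,Var(r)} \approx \sqrt{\frac{1 + \phi_x\phi_y}{1 - \phi_x\phi_y}}: \qquad \sqrt{\frac{1.25}{0.75}} \approx 1.29 \;(\phi = 0.5), \qquad \sqrt{\frac{1.64}{0.36}} \approx 2.13 \;(\phi = 0.8), \qquad \sqrt{\frac{1.81}{0.19}} \approx 3.09 \;(\phi = 0.9)$$

These values are close to the simulated standard deviations of the t statistic reported earlier for the same parameter pairs (1.28 to 1.30, 2.10 to 2.15 and 3.00 to 3.10, respectively).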


The degree of accuracy of this variance depends on three things:

a) on the sign of the autoregressive parameters,

b) on the absolute magnitude of the two autoregressive coefficients and

c) on the sample size.

One should expect less accuracy:

a) if φx and φy are both positive (or negative),

b) if their absolute magnitude is close to one and

c) if the sample size is small.

Therefore, it is interesting to investigate the accuracy of this variance in the context of spurious correlations for all positive values of the autoregressive parameters and for various sample sizes.

Simulation Results using the Var(r) of two independent AR(1) processes

Series of two independent AR(1) processes Xt and Yt previously defined are generated for values of the autoregressive parameter of 0.0, 0.2, 0.5, 0.8 and 0.9 and for sample sizes of 100, 500 and 1,000 observations.

Based on the sample correlation coefficient of these two series, the test for zero correlation is conducted by replacing the denominator of the usual t statistic by the square root of the variance previously defined.

The simulation results support no evidence of spurious correlations.

Empirical levels are close to nominal levels for moderate and large sample sizes.
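The corrected test can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names, the seed, the stationary start-up draw and the use of the true autoregressive parameters in the variance are assumptions of the sketch.

```python
import math
import random


def ar1(T, phi, rng):
    """Stationary AR(1) with N(0, 1) errors, started from the stationary law."""
    x = rng.gauss(0.0, 1.0) / math.sqrt(1.0 - phi * phi)
    out = []
    for _ in range(T):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def sample_correlation(x, y):
    """Pearson sample correlation coefficient r."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corrected_t(r, T, phi_x, phi_y):
    """r divided by the square root of Var(r) = (1/T)(1 + phi_x phi_y)/(1 - phi_x phi_y)."""
    var_r = (1.0 + phi_x * phi_y) / (T * (1.0 - phi_x * phi_y))
    return r / math.sqrt(var_r)


def corrected_rejection_rate(T, phi_x, phi_y, reps, seed=0, crit=1.96):
    """Share of replications in which the corrected statistic exceeds crit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        r = sample_correlation(ar1(T, phi_x, rng), ar1(T, phi_y, rng))
        if abs(corrected_t(r, T, phi_x, phi_y)) > crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for phi in (0.5, 0.9):
        print(phi, corrected_rejection_rate(T=500, phi_x=phi, phi_y=phi, reps=2000))
```

For moderate and large T the printed rates should be close to the 5% nominal level, which is the pattern the slides report.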

Percentage of rejections of the null hypothesis of zero correlation at the 5% nominal level (|t| > 1.96) for two independent stationary AR(1) processes using the approximate variance of their sample correlation coefficient based on 10,000 replications

T        φx     φy = 0.0   0.2     0.5     0.8     0.9
100      0.0    4.76
         0.2    5.22       5.13
         0.5    5.18       5.08    4.56
         0.8    4.96       4.72    4.59    3.41
         0.9    5.08       4.73    4.46    2.75    1.59
500      0.0    5.21
         0.2    5.09       5.03
         0.5    5.07       5.03    5.25
         0.8    4.91       4.77    5.07    4.81
         0.9    5.09       4.88    5.11    4.61    4.56
1,000    0.0    4.98
         0.2    5.21       4.81
         0.5    4.90       5.13    4.90
         0.8    4.80       5.08    5.09    4.77
         0.9    5.09       4.88    4.92    4.79    5.16

Further Simulation Results using the Var(r) of two indep. AR(1) processes

Frequency distribution for the corrected t statistic

The frequency distribution of the corrected t statistic for testing the null hypothesis of zero correlation using the variance of two independent AR(1) processes is very close to the standard normal distribution since the standard deviation of the t statistic is one for almost all cases.

Standard deviation of the t statistic for testing the null hypothesis of zero correlation for two independent stationary AR(1) processes using the approximate variance of their sample correlation coefficient based on 10,000 replications

T        φx     φy = 0.0   0.2     0.5     0.8     0.9
100      0.0    1.01
         0.2    1.01       1.00
         0.5    1.00       1.00    0.98
         0.8    1.01       1.00    0.99    0.95
         0.9    1.01       1.00    0.98    0.92    0.88
500      0.0    1.00
         0.2    1.00       1.00
         0.5    1.01       1.01    1.01
         0.8    1.00       0.99    0.99    0.99
         0.9    1.01       1.00    1.00    0.99    0.99
1,000    0.0    1.00
         0.2    1.00       1.00
         0.5    1.00       1.01    1.00
         0.8    0.99       1.01    1.01    0.99
         0.9    1.01       1.00    1.00    1.00    1.00

Concluding Remarks

Using the approximate variance of the sample correlation coefficient of two independent stationary AR(1) processes, this study shows that the spurious behavior can be eliminated for large and moderate sample sizes, even for large values of the autoregressive parameter.

The Balance between Size and Power

in testing for linear association for two stationary AR(1) processes

Story Number Two

Fisher (1915) revealed some of the properties of the sample correlation coefficient, r, indicating that it is a biased estimator of the population correlation coefficient, ρ, for normal populations, and proving that E[r] = ρ – ρ(1 – ρ²)/(2N).

See also Kenney and Keeping (1951) and Sawkins (1944).

Clearly, the bias is not a large number, taking into account that the correlation coefficient takes values from -1 to 1.
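To give a feel for the size of this bias, here is a worked value; ρ = 0.5 and N = 50 are chosen only for illustration and do not come from the slides:

$$E[r] - \rho = -\frac{\rho(1 - \rho^2)}{2N} = -\frac{0.5\,(1 - 0.25)}{2 \cdot 50} = -0.00375$$

so r would be centred at roughly 0.496 instead of 0.5.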

However, if one is concerned with the accuracy of the t-test for testing the null hypothesis of zero correlation, especially when the absolute value of the sample correlation coefficient is small, this bias may affect the variance and, therefore, the test.

Sample Correlation Coefficient

Consider and using the following three t statistics:

The test Statistics

Consider two independent AR(1) stationary processes generated by the following DGP:

$$X_t = \phi_x X_{t-1} + \varepsilon_{xt}$$

and

$$Y_t = \phi_y Y_{t-1} + \varepsilon_{yt}$$

where the errors εxt and εyt are white noise N(0,1) processes, independent of each other, and the autoregressive parameters are allowed to take values of 0.0, 0.2, 0.5, 0.8 and 0.9.

Sample sizes of 50, 100, 500 and 1000 observations are used.

Simulation Process

T        Statistic   φ = 0.0   0.2     0.5     0.8     0.9
50       t           5.89      6.63    13.20   33.46   48.01
         t'          5.45      5.16    4.53    1.89    0.07
         t''         5.35      5.51    5.44    4.11    2.56
100      t           5.58      5.88    12.75   35.10   50.43
         t'          5.40      4.77    4.42    3.34    1.60
         t''         5.38      4.87    4.81    4.54    3.79
500      t           4.74      6.02    12.39   35.41   52.19
         t'          4.70      5.19    4.63    4.59    4.31
         t''         4.67      5.10    4.82    4.87    4.84
1000     t           5.22      6.12    12.55   35.89   52.61
         t'          5.21      5.22    4.55    4.64    4.73
         t''         5.20      5.23    4.63    4.93    5.13

Percentage of rejections of the null hypothesis of zero correlation at the 5% nominal level (|t| > 1.96) for two independent stationary AR(1) processes based on 10,000 replications

Using the empirical values of the autoregressive parameters we get better size for large values of the autoregressive parameter.

This is probably due to the fact that the estimates were not so close to the true large values of the autoregressive parameters for small and moderate sample sizes.

For example, for φ = 0.9 and T = 100, the mean values of the estimates were 0.8622 and 0.8614.

Comments

T        Mean estimate (two rows per T, as in the original), φ = 0.0   0.2      0.5      0.8      0.9
50       -0.0196   0.1674   0.4474   0.7291   0.8202
         -0.0206   0.1660   0.4484   0.7298   0.8216
100      -0.0096   0.1842   0.4754   0.7651   0.8622
         -0.0099   0.1833   0.4747   0.7651   0.8614
500      -0.0016   0.1968   0.4951   0.7930   0.8927
         -0.0021   0.1975   0.4954   0.7935   0.8925
1000     -0.0005   0.1981   0.4972   0.7969   0.8964
         -0.0009   0.1987   0.4980   0.7964   0.8965

Mean values of the estimates of the autoregressive parameters of the AR(1) processes based on 10,000 replications

Power of the test

Consider two linearly dependent AR(1) processes Xt and Yt

for t = 1, 2, …, T, such that:

where ρ is their correlation coefficient.

Using matrix notation, we may write:

where the errors εxt and εyt are white noise N(0, 1) processes, but not independent of each other.

Equivalently, using matrices the former equation can be expressed as a VAR(1) model:

where and are vectors with being a matrix.

It can be shown that:

where and are the covariance matrices of and respectively defined as:

and

where and are the variances of and , respectively, defined by an AR(1) process, or is their covariance and or is the covariance of the error terms with unit variances.

Simulation Process

Using vectorization, the above equation becomes:

where vec stands for vectorisation, ⊗ is the Kronecker product and I is the identity matrix.

It is easy to show that the equation can be written precisely as:

from which the desired correlation ρ, between the two series, is determined by the following equation:

Simulation Process – Cont. I

Hence, to generate two dependent AR(1) processes with desired correlation ρ, we need to generate random errors εxt and εyt with unit variances and correlation given by:
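One way to carry out this step is sketched below. The error correlation used in the code is derived here from the stationary AR(1) moments and is not copied from the slides: with unit-variance errors, Var(Xt) = 1/(1 − φx²), Var(Yt) = 1/(1 − φy²) and Cov(Xt, Yt) = ρε/(1 − φxφy), so ρε = ρ(1 − φxφy)/√((1 − φx²)(1 − φy²)). The function names, the burn-in length and the seed are likewise assumptions of this illustration.

```python
import math
import random


def error_correlation(rho, phi_x, phi_y):
    """Correlation of the errors that yields correlation rho between X_t and Y_t."""
    rho_e = rho * (1.0 - phi_x * phi_y) / math.sqrt(
        (1.0 - phi_x ** 2) * (1.0 - phi_y ** 2)
    )
    if abs(rho_e) > 1.0:
        raise ValueError("requested rho is not attainable for these parameters")
    return rho_e


def correlated_ar1(T, rho, phi_x, phi_y, rng, burn_in=200):
    """Two AR(1) series with unit-variance errors and contemporaneous correlation rho."""
    rho_e = error_correlation(rho, phi_x, phi_y)
    scale = math.sqrt(max(0.0, 1.0 - rho_e ** 2))
    x = y = 0.0
    xs, ys = [], []
    for t in range(T + burn_in):
        ex = rng.gauss(0.0, 1.0)
        ey = rho_e * ex + scale * rng.gauss(0.0, 1.0)  # Var(ey) = 1, Corr(ex, ey) = rho_e
        x = phi_x * x + ex
        y = phi_y * y + ey
        if t >= burn_in:  # discard start-up values (burn-in is an assumption)
            xs.append(x)
            ys.append(y)
    return xs, ys


def sample_correlation(x, y):
    """Pearson sample correlation coefficient r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    xs, ys = correlated_ar1(20000, rho=0.6, phi_x=0.5, phi_y=0.5, rng=random.Random(3))
    print(round(sample_correlation(xs, ys), 3))
```

When φx = φy the required error correlation equals ρ itself; for unequal parameters it is larger in absolute value, and some values of ρ cannot be attained, which is why the helper raises an error in that case.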

Simulation Process – Cont. II

T      ρ      Statistic   φ = 0.0   0.2      0.5      0.8     0.9
50     0.2    t           29.8      30.5     34.5     45.1    53.3
              t'          28.5      26.3     17.6     5.4     0.2
              t''         28.3      26.9     20.0     9.9     4.9
       0.4    t           84.3      83.6     78.9     71.1    70.0
              t'          83.3      80.5     61.5     19.9    1.4
              t''         83.3      81.3     65.4     30.8    13.4
       0.6    t           99.7      99.6     98.6     92.8    88.2
              t'          99.7      99.5     95.8     54.1    7.4
              t''         99.7      99.5     96.8     69.3    36.8
       0.8    t           100.0     100.0    100.0    99.8    98.8
              t'          100.0     100.0    100.0    93.7    37.3
              t''         100.0     100.0    100.0    97.7    74.6

Percentage of rejections of the null hypothesis of zero correlation at the 5% nominal level (|t| > 1.96) for two dependent stationary AR(1) processes based on 10,000 replications

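T      ρ      Statistic   φ = 0.0   0.2      0.5      0.8      0.9
100    0.2    t           53.4      53.1     52.3     54.7     60.5
              t'          52.6      49.1     34.1     13.3     4.9
              t''         52.5      49.8     35.9     16.4     9.1
       0.4    t           98.7      98.5     96.4     86.9     81.0
              t'          98.6      98.2     90.7     47.9     19.2
              t''         98.7      98.2     91.5     55.4     30.4
       0.6    t           100.0     100.0    100.0    99.2     96.3
              t'          100.0     100.0    99.9     89.1     52.7
              t''         100.0     100.0    100.0    93.2     68.5
       0.8    t           100.0     100.0    100.0    100.0    99.9
              t'          100.0     100.0    100.0    99.9     93.1
              t''         100.0     100.0    100.0    100.0    97.9

Percentage of rejections of the null hypothesis of zero correlation at the 5% nominal level (|t| > 1.96) for two dependent stationary AR(1) processes based on 10,000 replications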

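T      ρ      Statistic   φ = 0.0   0.2      0.5      0.8      0.9
500    0.2    t           99.3      99.2     97.6     89.2     82.0
              t'          99.3      99.1     94.2     56.2     30.4
              t''         99.3      99.0     94.4     57.5     32.6
       0.4    t           100.0     100.0    100.0    100.0    99.4
              t'          100.0     100.0    100.0    99.4     86.1
              t''         100.0     100.0    100.0    99.5     88.7
       0.6    t           100.0     100.0    100.0    100.0    100.0
              t'          100.0     100.0    100.0    100.0    99.9
              t''         100.0     100.0    100.0    100.0    100.0
       0.8    t           100.0     100.0    100.0    100.0    100.0
              t'          100.0     100.0    100.0    100.0    100.0
              t''         100.0     100.0    100.0    100.0    100.0

Percentage of rejections of the null hypothesis of zero correlation at the 5% nominal level (|t| > 1.96) for two dependent stationary AR(1) processes based on 10,000 replications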

The test has low power for small and moderate sample sizes, especially for low values of the correlation coefficient, a result that has also been indicated by Zimmerman et al. (2003) for normal populations.

However, as the sample size increases the power of the test also increases, regardless of the values of ρ and φ.

In general the classical t test has larger power than the other two tests. The difference is obvious for large values of the autoregressive parameter, low values of the correlation coefficient and small sample size.

For example, for T = 100, ρ = 0.2 and φ = 0.9, the null hypothesis is rejected 60.5% of the time using the classical t test, as opposed to 4.9% and 9.1% using the t' and the t'' statistic respectively.

For small values of the autoregressive parameters the power of the test is very similar for all three cases, regardless of the value of the correlation coefficient.

On the other hand, the test has larger power using the t'' rather than the t' statistic for all cases.

Comments

Spurious Regressions for non-linear or time varying coefficient processes

Effects of ARCH on Spurious Regressions

Story Number Three

Spurious with nonlinear series

The spurious result is known as a linear behavior.

Lee, Kim and Newbold (2005) investigated the possibility of spurious nonlinear relationships between two independent random walks.

However, to the best of our knowledge, there is no study on the spurious regression phenomenon for two independent nonlinear processes.

Nonlinear models are discussed in Granger and Terasvirta (1993). Chaos?

A special case of nonlinear models, such as AR(1) with ARCH(1) effects, is discussed in Bera, Higgins and Lee (1996) and in (1992).

Randomness of the autoregressive parameter apparently increases the unconditional variability for both series damping the degree of linear relationship.

AR(1) Process

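Consider a stationary AR(1) process with mean zero:

$$X_t = \phi X_{t-1} + \varepsilon_t$$

where

$$\varepsilon_t \sim iid(0, \sigma^2)$$

and the unconditional variance of X is:

$$V(X_t) = \frac{\sigma^2}{1 - \phi^2}$$

for all absolute values of the autoregressive parameter less than one.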

Processes with Time Varying Parameters

The ARMA models are typically considered to have constant autoregressive and moving average parameters.

When taste, technology and policy change it is difficult to assume constant parameters.

For example, consider

where

This particular model structure is called a time-varying parameter autoregressive model, i.e., VPAR(1).

φt is unobserved, but it can be estimated iteratively using the Kalman Filter.

In the case where α = 0, the above model gives a “random coefficient model”, which is discussed in detail in Nicholls and Quinn (1982).

There are several forms in which φt can be defined.

Non linear AR Processes

Granger (2008) said that any linear AR model can be approximated as a non-linear model.

For example, an AR(2) model with mean zero:

can be expressed as an AR(1) process with random or time varying coefficient, i.e.,:

where

AR(1) Process with Random Coefficient

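Consider:

$$X_t = \phi_t X_{t-1} + \varepsilon_t$$

where $\phi_t \sim (0, \alpha_1)$ and $\varepsilon_t \sim (0, \alpha_0)$

for $\alpha_0 > 0$ and $0 \le \alpha_1 < 1$.

The conditional mean and variance of Xt are:

$$E(X_t \mid X_{t-1}) = X_{t-1} E(\phi_t) + E(\varepsilon_t) = 0$$

$$V(X_t \mid X_{t-1}) = X_{t-1}^2 V(\phi_t) + V(\varepsilon_t) = \alpha_0 + \alpha_1 X_{t-1}^2$$

where the unconditional variance is defined as:

$$V(X_t) = \frac{\alpha_0}{1 - \alpha_1}$$

The process behaves like an ARCH(1).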

AR(1) Process with Random Coefficient II

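Consider:

$$X_t = \phi_t X_{t-1} + \varepsilon_t$$

where $\phi_t \sim (\phi, \alpha_1)$ and $\varepsilon_t \sim (0, \alpha_0)$

for |φ| < 1, and $\alpha_0 > 0$, $0 \le \alpha_1 < 1$.

The conditional mean and variance of Xt are:

$$E(X_t \mid X_{t-1}) = X_{t-1} E(\phi_t) + E(\varepsilon_t) = \phi X_{t-1}$$

$$V(X_t \mid X_{t-1}) = X_{t-1}^2 V(\phi_t) + V(\varepsilon_t) = \alpha_0 + \alpha_1 X_{t-1}^2$$

The process behaves like an AR(1) and ARCH(1).

This process is the general case.

So what is the unconditional variance of Xt?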

Unconditional Variance for AR(1) + ARCH(1) Process

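The unconditional variance is defined as:

$$V(X_t) = E[V(X_t \mid X_{t-1})] + V[E(X_t \mid X_{t-1})]$$

where

$$E(X_t \mid X_{t-1}) = X_{t-1} E(\phi_t) + E(\varepsilon_t) = \phi X_{t-1}$$

$$V(X_t \mid X_{t-1}) = X_{t-1}^2 V(\phi_t) + V(\varepsilon_t) = \alpha_0 + \alpha_1 X_{t-1}^2$$

Hence

$$V(X_t) = E[\alpha_0 + \alpha_1 X_{t-1}^2] + V[\phi X_{t-1}]$$

which is

$$V(X_t) = \alpha_0 + \alpha_1 V(X_t) + \phi^2 V(X_t)$$

or

$$V(X_t) = \frac{\alpha_0}{1 - \alpha_1 - \phi^2}$$

So the unconditional variance for an AR(1) + ARCH(1) depends on?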
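The unconditional variance of the AR(1) with random coefficient can be checked numerically, using the fact that for a stationary zero-mean process the total-variance decomposition gives V(Xt) = α0/(1 − α1 − φ²). The sketch below is an illustration under assumptions not stated in the slides: φt and εt are drawn as independent normals with the required means and variances, and the function names, burn-in and seed are hypothetical.

```python
import math
import random


def simulate_random_coefficient_ar1(T, phi, alpha0, alpha1, rng, burn_in=500):
    """X_t = phi_t X_{t-1} + eps_t with phi_t ~ N(phi, alpha1), eps_t ~ N(0, alpha0).

    Normality and independence of phi_t and eps_t are assumptions of this sketch.
    """
    x = 0.0
    out = []
    for t in range(T + burn_in):
        phi_t = rng.gauss(phi, math.sqrt(alpha1))
        eps_t = rng.gauss(0.0, math.sqrt(alpha0))
        x = phi_t * x + eps_t
        if t >= burn_in:
            out.append(x)
    return out


def theoretical_variance(phi, alpha0, alpha1):
    """alpha0 / (1 - alpha1 - phi^2), defined only when the denominator is positive."""
    denom = 1.0 - alpha1 - phi * phi
    if denom <= 0.0:
        raise ValueError("unconditional variance is not finite")
    return alpha0 / denom


def sample_variance(xs):
    """Sample variance around the sample mean."""
    n = len(xs)
    m = sum(xs) / n
    return sum((v - m) ** 2 for v in xs) / (n - 1)


if __name__ == "__main__":
    xs = simulate_random_coefficient_ar1(100000, 0.5, 1.0, 0.2, random.Random(7))
    print(round(sample_variance(xs), 3), round(theoretical_variance(0.5, 1.0, 0.2), 3))
```

The finite-variance condition α1 + φ² < 1 is stricter than |φ| < 1 alone, which is worth keeping in mind when reading results for parameter pairs such as φ = 0.9 with α1 = 0.9.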

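AR(1) + ARCH(1) Process

Consider:

$$X_t = \phi_t X_{t-1} + \varepsilon_t$$

where $\phi_t \sim (\phi, \alpha_1)$ and $\varepsilon_t \sim (0, \alpha_0)$

for |φ| < 1, and $\alpha_0 > 0$, $0 \le \alpha_1 < 1$.

Another way of defining this process is an AR(1) process with ARCH(1) errors in variable, i.e.,

$$X_t = \phi X_{t-1} + \varepsilon_t$$

where

$$\varepsilon_t = u_t \sqrt{\alpha_0 + \alpha_1 X_{t-1}^2}$$

and ut ~ iidN(0, 1). Under this set up is defined as: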

Simulation Process

Consider two independent series Yt and Xt for t = 1, 2, …, T generated by the AR(1) + ARCH(1) specification, previously defined, under the following set up:

α0 = 1.0

α1 = 0.0, 0.2, 0.5, 0.8 and 0.9

Φ = 0.0, 0.2, 0.5, 0.8, 0.9 and 1.0.

To examine the presence of a linear relationship between these two variables the following model:

$$Y_t = \alpha + \beta X_t + e_t$$

is estimated using simple ordinary least squares regression for sample sizes of 50, 100 and 500 observations.

The objective is to determine how many times the null hypothesis β = 0 is rejected.
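The experiment can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it uses the "ARCH(1) errors in variable" form of the process, includes an intercept in the regression, rejects when |t| > 1.96, and starts each series at zero with a burn-in. All of these choices, together with the function names and the seed, are assumptions of the sketch.

```python
import math
import random


def ar1_arch1(T, phi, alpha0, alpha1, rng, burn_in=100):
    """X_t = phi X_{t-1} + u_t sqrt(alpha0 + alpha1 X_{t-1}^2), u_t ~ iid N(0, 1)."""
    x = 0.0
    out = []
    for t in range(T + burn_in):
        eps = rng.gauss(0.0, 1.0) * math.sqrt(alpha0 + alpha1 * x * x)
        x = phi * x + eps
        if t >= burn_in:
            out.append(x)
    return out


def ols_t_beta(y, x):
    """t statistic for beta = 0 in the OLS regression y_t = alpha + beta x_t + e_t."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    ssr = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    se_beta = math.sqrt(ssr / (T - 2) / sxx)
    return beta / se_beta


def rejection_proportion(T, phi, alpha1, reps, alpha0=1.0, seed=0, crit=1.96):
    """Proportion of replications in which H0: beta = 0 is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y = ar1_arch1(T, phi, alpha0, alpha1, rng)
        x = ar1_arch1(T, phi, alpha0, alpha1, rng)
        if abs(ols_t_beta(y, x)) > crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for alpha1 in (0.0, 0.9):
        print(alpha1, rejection_proportion(T=100, phi=0.9, alpha1=alpha1, reps=1000))
```

Because start-up conventions and random draws differ, the printed proportions will not match the reported table digit for digit; the sketch is meant only to show the mechanics of the experiment.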

Simulation Results

CASE I: Linear Processes:

Φ = 1.0 and α1 = 0.0. Yt and Xt are independent random walks, which corresponds to the Granger and Newbold (1974) set up.

The rejection proportion increases as the sample size increases; for T = 500 it is almost 90%.

Φ < 1.0 and α1 = 0.0. Yt and Xt are independent AR(1) processes, which corresponds to the Granger et al. (2001) set up.

The rejection proportion increases only as the value of the autoregressive parameter increases and not as the sample size increases.

Linear processes can be seen as a special case of non-linear processes.

Simulation Results

CASE II: Non-Linear Processes:

Φ = 1.0 and different values of α1. As we increase the degree of non-linearity the rejection proportion decreases, i.e., for α1 = 0.9 and T = 500 it becomes 0.074, close to the nominal level, as opposed to 0.896 in the absence of ARCH.

This implies that the performance of unit root tests and cointegration tests might be abruptly affected by conditional heteroscedasticity in time series variables.

Φ < 1.0 and different values of α1. The spurious results found by Granger et al. (2001) almost disappear.

For example, for Φ = 0.9 and α1 = 0.9 at the 5% nominal level the proportion of rejections becomes 0.089 for T = 500, as opposed to 0.516 in the absence of ARCH.

Spurious regression essentially looks for the presence of linear relationships between two independent series.

Randomness of the autoregressive parameter apparently increases the unconditional variability of both series damping the degree of linear relationship.

Proportions of rejections of the null hypothesis that β = 0 at the 5% nominal level based on 1,000 replications

α1     T      φ = 1.0   0.9     0.8     0.5     0.2     0.0
0.0    50     0.682     0.462   0.351   0.130   0.068   0.055
       100    0.762     0.519   0.349   0.127   0.061   0.047
       500    0.896     0.516   0.347   0.145   0.055   0.048
0.2    50     0.445     0.361   0.276   0.128   0.063   0.044
       100    0.474     0.385   0.292   0.121   0.068   0.049
       500    0.401     0.377   0.281   0.127   0.067   0.040
0.5    50     0.257     0.216   0.211   0.089   0.053   0.064
       100    0.217     0.200   0.213   0.100   0.052   0.060
       500    0.129     0.185   0.195   0.123   0.050   0.037
0.8    50     0.161     0.153   0.128   0.095   0.062   0.047
       100    0.151     0.125   0.137   0.078   0.055   0.060
       500    0.090     0.091   0.098   0.090   0.060   0.052
0.9    50     0.154     0.149   0.154   0.077   0.067   0.058
       100    0.142     0.133   0.097   0.085   0.049   0.058
       500    0.074     0.089   0.084   0.073   0.056   0.053

Note: For each value of α1, the three rows correspond to sample sizes of 50, 100 and 500 observations.

General Comments

It is very difficult to analyze time series data.

To avoid spurious behaviors, do not use time series data.

Spurioucity does exist in Greece.
