
Sustainable Development in Popular Newspapers

How is coverage in De Telegraaf influenced by other newspapers’ attention to sustainable development?

ARIMA modelling with (G)ARCH and Fractional Integration

Assignment 6

Mark Boukes ([email protected])
5616298

1st semester 2010/2011
Dynamic Data Analysis

Lecturer: Dr. R. Vliegenthart
December 23, 2010

Communication Science (Research MSc)
Faculty of Social and Behavioural Sciences
University of Amsterdam


Table of contents

Introduction
Method
Results
  ARIMA model
  The conditional variance
Conclusion
Reference
Do File


Introduction

In this study I investigate how news coverage in one newspaper influences the coverage of another. For this purpose I chose a specific topic, sustainable development, which seems to have received a lot of media attention in recent years. The topic was chosen because it relates to several parts of society, such as the economy and science, and is also relevant to the man in the street.

As these different parts of society are represented by different media, it is interesting to see how these media influence each other in the amount of attention they pay to this issue. Will an increase in attention in a business newspaper result in an increase in attention in a newspaper that deals with popular issues, and how is that newspaper affected by a newspaper that focuses more on scientific issues? In brief: how is attention in a popular newspaper affected by attention in scientific and economic newspapers? The most-read popular newspaper in the Netherlands is De Telegraaf; a newspaper with a mainly financial or business focus is Het Financieele Dagblad; and NRC Handelsblad is known for its relatively large attention to scientific developments. The effects of both Het Financieele Dagblad and NRC Handelsblad on the coverage of De Telegraaf are expected to be positive. My hypothesis is therefore:

An increase in the number of articles about sustainable development in Het Financieele Dagblad or in NRC Handelsblad is likely to be followed by an increase in the number of articles on this topic in De Telegraaf in future weeks.

Method

To investigate whether changes in the number of articles about sustainable development in NRC Handelsblad and Het Financieele Dagblad affect De Telegraaf’s news, a dataset was created via a computer-assisted content analysis, conducted using the digital archive of the Web-based version of LexisNexis. Articles were selected via the Boolean search term duurza! OR "groene energie" OR "zonne-energie" OR "windenergie". The period analysed runs from 1 January 1999 until 31 December 2009. This period was chosen because information about De Telegraaf is only available from 1999. The search procedure was repeated three times, once for each newspaper, so that three variables could be created by aggregating the data on a weekly basis. A weekly basis was chosen because it is more detailed than a monthly basis, whereas a daily basis would lead to many days on which no coverage was found, which would mean that many cases had to be filled in by hand. A total of 35225 articles were found for 581 weeks: 18501 in Het Financieele Dagblad, 10335 in NRC Handelsblad and 6389 in De Telegraaf.
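The weekly aggregation described above can be sketched as follows. This is a minimal Python illustration, not the procedure used in the study (the data came from LexisNexis and were analysed in Stata). The function name `weekly_counts`, the sample dates and the use of ISO weeks are assumptions made for the example; the week definition used in the study may differ.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_counts(article_dates, start, end):
    """Count articles per ISO (year, week) between start and end, inclusive.

    Weeks without any article get an explicit 0, so the series has no gaps.
    """
    counts = Counter(d.isocalendar()[:2] for d in article_dates)
    series = []
    seen = set()
    day = start
    while day <= end:
        key = day.isocalendar()[:2]  # (ISO year, ISO week number)
        if key not in seen:
            seen.add(key)
            series.append((key, counts.get(key, 0)))
        day += timedelta(days=1)
    return series


# Hypothetical publication dates for one newspaper
sample = [date(1999, 1, 4), date(1999, 1, 6), date(1999, 1, 19)]
weekly = weekly_counts(sample, date(1999, 1, 4), date(1999, 1, 24))
```

Filling empty weeks with zeros is what keeps the series regularly spaced, which the time series models in the following sections require.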


To analyse the effects on the coverage of De Telegraaf, an adequate ARIMA model was first developed for the time series of this variable using Stata 10.1. A GARCH term was then added to model the volatility, after which the independent variables were added, resulting in a multivariate ARIMA model.

Results

In this section I specify how the analysis was conducted and discuss the results. I followed the ARIMA framework described by Vliegenthart (n.d.) to build the base ARIMA model; GARCH terms were then added to take heteroscedasticity into account, and finally the independent variables were added to the model.

ARIMA model

Figure 1 plots the time series of the attention in the three newspapers for the period under study. Het Financieele Dagblad appears to pay the most attention to sustainable development and De Telegraaf the least. The amount of attention seems to rise slightly over time, but augmented Dickey-Fuller tests do not confirm this (see Table 1): the hypothesis of a unit root is rejected for all three series. However, the results were close to insignificant and the graphs also show some upward trend. Therefore, the data were fractionally integrated at a level of 0.352, following the Robinson (1995) multivariate estimate of the long-memory (fractional integration) parameter for the number of articles in De Telegraaf.

Figure 1. The number of articles about sustainable development over time in the three newspapers of interest.
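To make the fractional integration step concrete, the sketch below computes the weights of the fractional difference operator (1 - L)^d from its binomial expansion and applies them to a series. It is an illustration in Python with hypothetical function names and made-up data, not the routine used in the study. The do-file at the end of this paper applies only the first lag of this expansion (y_t - 0.352 * y_{t-1}), which corresponds to `max_lag=1` here.

```python
def frac_diff_weights(d, max_lag):
    """Weights w_0..w_max_lag of (1 - L)^d, using w_k = w_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, max_lag + 1):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d, max_lag):
    """Apply the truncated filter; the first max_lag observations are dropped."""
    w = frac_diff_weights(d, max_lag)
    return [
        sum(w[k] * series[t - k] for k in range(max_lag + 1))
        for t in range(max_lag, len(series))
    ]


# Hypothetical weekly article counts
counts = [10, 12, 9, 15, 11]
first_lag_only = frac_diff(counts, d=0.352, max_lag=1)   # same form as the do-file
longer_filter = frac_diff(counts, d=0.352, max_lag=3)    # keeps more of the expansion
```

The weights beyond the first lag are small but decay slowly, which is the long-memory property that the parameter d describes.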


Table 1. Results of augmented Dickey-Fuller tests for the number of articles over time

Augmented Dickey-Fuller test        Telegraaf      FD           NRC
Random walk without drift           -6.560         -4.990       -5.596
Random walk with drift              -10.878        -10.104      -14.525
Random walk with drift and trend    -16.487        -15.716      -16.751

                                    FI.Telegraaf   FI.FD        FI.NRC
Random walk without drift           -11.756        -8.938       -10.011
Random walk with drift              -18.478        -17.020      -22.947
Random walk with drift and trend    -25.563        -24.102      -25.383

Note. All tests indicate the absence of a unit root. FI = fractionally integrated.
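The statistics in Table 1 are t-ratios from Dickey-Fuller regressions. As a rough illustration, the sketch below computes the statistic for the simplest variant (random walk without drift, no lagged differences) in plain Python. The function name and the simulated series are hypothetical, and this is not the Stata routine used in the study. For this variant, values well below the 5% critical value of about -1.95 lead to rejection of a unit root.

```python
import math
import random


def dickey_fuller_t(y):
    """t-statistic of rho in  dy_t = rho * y_{t-1} + e_t  (no constant, no lags)."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(x * x for x in lagged)
    rho = sum(x * dy for x, dy in zip(lagged, diffs)) / sxx
    rss = sum((dy - rho * x) ** 2 for x, dy in zip(lagged, diffs))
    sigma2 = rss / (len(diffs) - 1)  # one estimated parameter
    return rho / math.sqrt(sigma2 / sxx)


# Simulated example: white noise is stationary, so the statistic is strongly negative
random.seed(1)
noise = [random.gauss(0, 1) for _ in range(300)]
t_noise = dickey_fuller_t(noise)
```

The drift and trend variants add a constant and a linear time term to the same regression and use different critical values.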

The next step was to predict the number of articles about sustainable development as well as possible from its own past, with autoregressive (AR) terms, moving average (MA) terms, or both. This was done by inspecting the autocorrelation function (ACF) and the partial autocorrelation function (PACF). The ACF graph showed an unclear pattern, while the PACF graph displayed a declining pattern for the first lags. This pattern is indicative of a process with a moving average term at lag 1. An ARIMA(0,0.352,1) model thus seemed the right choice. This model was tested for autocorrelation with the Ljung-Box Q test statistic and for conditional heteroscedasticity with the Ljung-Box Q test on the squared residuals (Q2). A significant result was found for the Ljung-Box Q test for autocorrelation (20 lags), meaning that the null hypothesis of white noise was rejected and that the absence of autocorrelation cannot be assumed (Q = 807.40, p < 0.001). Therefore, to remove the autocorrelation from the ARIMA model, it was extended with an autoregressive term at lag 1 and a moving average term at lag 2, following the ACF and PACF graphs at every extension, until the residuals of the AR(1)-I(0.352)-MA(1,2) model showed no autocorrelation (Q = 24.89, p = 0.206). However, it did not seem possible to reduce the Ljung-Box test statistic Q2, based on the squared residuals, to insignificance with AR or MA terms only. This means that there is a strong temporal dependency in the variance of the number of Telegraaf articles about sustainable development: heteroscedasticity. The residuals of the last ARIMA model were saved and are analysed later to see how the conditional variance of the number of articles in De Telegraaf was affected by changes in the number of articles in the other two newspapers.
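The Ljung-Box Q statistic used in this step can be sketched in a few lines of Python. The function names and the toy series are hypothetical, and the code only shows how the statistic is built from sample autocorrelations. Applying it to residuals tests for remaining autocorrelation; applying it to squared residuals gives the Q2 check for conditional heteroscedasticity. Under the null hypothesis of white noise, Q with 20 lags is compared with a chi-square distribution with 20 degrees of freedom (5% critical value about 31.41).

```python
def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(x, lags=20):
    """Q = n (n + 2) * sum over k = 1..lags of r_k^2 / (n - k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorr(x, k) ** 2 / (n - k) for k in range(1, lags + 1)
    )


# Toy residual series (hypothetical values)
residuals = [0.5, -1.2, 0.3, 2.1, -0.7, -0.4, 1.8, -2.0, 0.1, 0.9]
q_resid = ljung_box_q(residuals, lags=3)
q_squared = ljung_box_q([r * r for r in residuals], lags=3)
```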

To deal with the heteroscedasticity, it was necessary to also model the conditional variance of the dependent variable, either with autoregressive conditional heteroscedasticity (ARCH) terms only or with a combination of ARCH and generalized autoregressive conditional heteroscedasticity (GARCH) terms. The latter option was chosen because it has a considerably better model fit according to the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) (ΔAIC = 104.94, ΔBIC = 100.57). The results of this general model are shown in Table 2. The ARCH and GARCH terms are both statistically significant and positive. This indicates that innovations in the prior period increase the conditional variance in the next period. Periods of high volatility are thus likely to be grouped together in time. The autoregressive and moving average terms are also significant, meaning that the number of articles in De Telegraaf in a given week is partly determined by the number of articles in this newspaper one and two weeks before.
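The clustering of volatility implied by the ARCH and GARCH terms can be illustrated with the GARCH(1,1) variance recursion h_t = omega + alpha * e_{t-1}^2 + beta * h_{t-1}. The Python sketch below uses the ARCH (0.050) and GARCH (0.958) estimates from Table 2. The intercept `omega`, the starting variance `h0` and the shock values are hypothetical, since they are not reported in this paper.

```python
def garch_variance(shocks, omega, alpha, beta, h0):
    """Conditional variances following each shock, starting from variance h0."""
    h = [h0]
    for eps in shocks:
        h.append(omega + alpha * eps ** 2 + beta * h[-1])
    return h[1:]


alpha, beta = 0.050, 0.958  # ARCH and GARCH terms from Table 2

# omega, h0 and the shocks are assumed values for illustration only
calm = garch_variance([1.0] * 5, omega=0.1, alpha=alpha, beta=beta, h0=10.0)
shocked = garch_variance(
    [1.0, 20.0, 1.0, 1.0, 1.0], omega=0.1, alpha=alpha, beta=beta, h0=10.0
)
persistence = alpha + beta
```

In this example a single large shock in the second week keeps the conditional variance elevated in all following weeks, because the GARCH term is close to 1. With the rounded estimates the two terms sum to slightly more than 1 (0.050 + 0.958 = 1.008), so shocks to the variance fade very slowly, if at all.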

Table 2. GARCH models: number of Telegraaf articles about sustainable development

                                   General GARCH model    GARCH with independent variables
Constant                           2.502 (0.634)*         1.637 (0.638)*
Autoregressive (t - 1)             1.007 (0.001)*         1.007 (0.001)*
Moving average (t - 1)             -1.125 (0.045)*        -1.154 (0.046)*
Moving average (t - 2)             0.128 (0.045)*         0.156 (0.046)*
ARCH term                          0.050 (0.013)*         0.047 (0.012)*
GARCH term                         0.958 (0.010)*         0.961 (0.010)*
NRC Handelsblad (t - 1)                                   0.052 (0.030)a
Het Financieele Dagblad (t - 1)                           0.047 (0.023)*
Ljung-Box Q(20) residuals          25.92                  25.91
AIC                                3599.80                3590.38
BIC                                3630.34                3629.63

Note. Unstandardized coefficients. Standard errors in parentheses; * p < 0.05, a p = 0.091.

Now that a model had been built that properly accounts for its own past and for heteroscedasticity, the analysis could proceed to the next step: assessing the impact of the amount of news coverage about sustainable development in NRC Handelsblad and Het Financieele Dagblad on the coverage in De Telegraaf. The cross-correlation function (CCF) of the residuals of this GARCH model with the amount of coverage in NRC Handelsblad, and the CCF of the residuals with the coverage in Het Financieele Dagblad, both indicate that a strong association is present when coverage in these newspapers is lagged by one week. The results of the GARCH model that includes these two independent variables can also be found in Table 2.
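The cross-correlation check used to choose the lag can be sketched as follows. The function name, its sign convention (correlating x at time t with y at time t - lag) and the toy series are assumptions for this illustration; Stata's own routine may order and normalise the series differently.

```python
def cross_corr(x, y, lag):
    """Correlation between x_t and y_{t-lag} for lag >= 0, using full-sample moments."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((v - mx) ** 2 for v in x) ** 0.5
    sy = sum((v - my) ** 2 for v in y) ** 0.5
    num = sum((x[t] - mx) * (y[t - lag] - my) for t in range(lag, n))
    return num / (sx * sy)


# Toy data: x repeats y with a delay of one period
y = [0, 1, 0, 0, 2, 0, 0, 1, 0, 0]
x = [0, 0, 1, 0, 0, 2, 0, 0, 1, 0]
by_lag = {lag: cross_corr(x, y, lag) for lag in range(4)}
best_lag = max(by_lag, key=by_lag.get)
```

In the toy data the correlation peaks at a lag of one period, which is the kind of pattern that motivates entering the other newspapers' coverage with a one-week lag.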

The results indicate that news coverage about sustainable development in Het Financieele Dagblad (FD) had a significant positive effect on coverage of this issue in De Telegraaf. A one-article increase in FD resulted on average in a 0.05-article increase in the next week’s coverage in De Telegraaf. The effect of NRC Handelsblad is not significant in a two-tailed test at the 5% level. However, because this effect is expected to be positive, a one-tailed test can be used. The effect of NRC Handelsblad’s coverage about sustainable development then also becomes significant (χ2 = 2.85, one-tailed p = 0.0456). A one-article increase about sustainable development in NRC Handelsblad will on average also result in a 0.05-article increase in De Telegraaf. Both effects thus seem to be rather weak.
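The step from the two-tailed to the one-tailed p-value can be reproduced with a short calculation. The sketch below is a Python illustration with a hypothetical function name; it relies on the fact that a chi-square statistic with 1 degree of freedom is the square of a standard normal z-value.

```python
import math


def chi2_1df_pvalue(chi2):
    """Two-sided p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


p_two = chi2_1df_pvalue(2.85)  # statistic reported for NRC Handelsblad
p_one = p_two / 2              # only valid when the estimate has the expected sign
```

With the rounded statistic of 2.85 this gives a two-tailed p of roughly 0.091, in line with the 0.0912 used in the do-file, and halving it yields the one-tailed value reported above. Halving is appropriate here only because the direction of the effect was hypothesised in advance and the estimated coefficient is positive.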

The conditional variance

As noted before, the ARIMA model was highly volatile, meaning that the variance was not stable over time. This heteroscedasticity could not be reduced by adding more autoregressive or moving average terms; a GARCH model had to be used instead. Though this heteroscedasticity was a nuisance in the attempt to predict the number of articles published by De Telegraaf about sustainable development, it is also interesting information in itself. I therefore wanted to know whether the conditional variance was affected by developments in the number of articles published by the other newspapers. To this end, the squared residuals of the AR(1)-I(0.352)-MA(1,2) model were used as the dependent variable.

The time series of this variance variable was stationary according to augmented Dickey-Fuller tests. Following the same framework (Vliegenthart, n.d.) as above, an ARIMA(1,0,1) model was built that showed neither autocorrelation (Q = 28.90, p = 0.09) nor heteroscedasticity (Q = 1.29, p = 0.999) in the residuals of the predictions of the variance of the fractionally integrated number of articles in De Telegraaf. The next step was to insert the same independent variables as in the GARCH model, because the cross-correlation function indicates that the residuals of this model correlate most strongly with the fractionally integrated values of the two independent variables at a lag of one week. The results of this model can be found in Table 3.

Table 3. ARIMA model for the variance of the fractionally integrated number of articles in De Telegraaf

                                      ARIMA (1,0,1)
Constant                              3.740 (0.681)
Autoregressive (t - 1)                -0.737 (0.857)
Moving average (t - 1)                0.708 (0.944)
NRC Handelsblad (t - 1)               0.502 (1.404)
Het Financieele Dagblad (t - 1)       1.413 (0.681)*
Ljung-Box Q(20) residuals             28.79
Ljung-Box Q(20) squared residuals     1.97
AIC                                   7149.06
BIC                                   7175.23

Note. Unstandardized coefficients. Standard errors in parentheses; * p < 0.05


News coverage about sustainable development in NRC Handelsblad does not influence the variance of the fractionally integrated number of articles in De Telegraaf. Interestingly, this variance was significantly and positively affected by news coverage in Het Financieele Dagblad. This means that when the fractionally integrated number of articles about sustainable development in Het Financieele Dagblad increased, the variance in the number of articles in De Telegraaf also increased, and it thus became more difficult to predict these values precisely. The results of the ARIMA model make clear that the variance at a given moment is not strongly affected by previous variance, as both the AR and the MA term are insignificant.

Conclusion

This study found that changes in the number of articles about sustainable development in the scientifically oriented newspaper NRC Handelsblad and the business-oriented newspaper Het Financieele Dagblad both lead to changes in the same direction in the number of articles about this topic in the popular newspaper De Telegraaf. A GARCH model was used to analyse these data, as the volatility of the dependent variable was high at certain moments.

Because this heteroscedasticity was an interesting feature of the dependent variable, another model was developed to predict volatility. The ARIMA model built for this purpose showed that the variance could partly be predicted by the fractionally integrated number of articles about sustainable development in Het Financieele Dagblad, but not by the coverage in NRC Handelsblad.

Reference

Robinson, P. M. (1995). Log-periodogram regression of time series with long range dependence. Annals of Statistics, 23(3), 1048-1072.

Vliegenthart, R. (n.d.). Moving up. Applying aggregate level time series analysis in communication science. Unpublished manuscript.


Do File

tsset week, weekly

twoway (tsline FD, lcolor(red)) (tsline NRC, lcolor(green) lpattern(dash) lwidth(medthick)) (tsline Telegraaf, lcolor(blue) lpattern(dash) lwidth(medium))
twoway (tsline FD, lcolor(red))

*with drift
dfuller Telegraaf
*random walk
dfuller Telegraaf, noconstant
*trend
dfuller Telegraaf, trend

*with drift
dfuller FD
*random walk
dfuller FD, noconstant
*trend
dfuller FD, trend

*with drift
dfuller NRC
*random walk
dfuller NRC, noconstant
*trend
dfuller NRC, trend

search ARFIMA
roblpr Telegraaf
gen dfTelegraaf=Telegraaf-.3520017*l.Telegraaf
gen dfNRC=NRC-.3520017*l.NRC
gen dfFD=FD-.3520017*l.FD

*with drift
dfuller dfTelegraaf
*random walk
dfuller dfTelegraaf, noconstant
*trend
dfuller dfTelegraaf, trend

*with drift
dfuller dfFD
*random walk
dfuller dfFD, noconstant
*trend
dfuller dfFD, trend

*with drift
dfuller dfNRC
*random walk
dfuller dfNRC, noconstant
*trend
dfuller dfNRC, trend

ac dfTelegraaf
pac dfTelegraaf
corrgram dfTelegraaf


*The ACF graph shows an unclear pattern, while the PACF graph displays a declining pattern for the first lags.
*An ARIMA (0,fi,1) model thus seems the right choice

arima dfTelegraaf, ma(1)
estat ic
predict r, res
gen r_s= r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
ac r
pac r
drop r r_s
*Q(r) and Q(r2) significant, a peak at lag 2

arima dfTelegraaf, ma(1 2)
estat ic
predict r, res
gen r_s= r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
ac r
pac r
drop r r_s

arima dfTelegraaf, ar(1) ma(1 2)
estat ic
predict r, res
gen r_s= r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
ac r
pac r
drop r r_s
*Q(r) insignificant but Q(r2) significant

arch dfTelegraaf, ar(1) ma(1 2) arch(1)
estat ic

arch dfTelegraaf, ar(1) ma(1 2) arch(1) garch(1)
estat ic
di 3704.739 - 3599.802
di 3730.917 - 3630.344
predict r, res
gen r_s= r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
ac r
pac r
drop r r_s
*Q(r) stays insignificant. Now the model is ok!

*now see how the residuals of this model are best predicted by which lags for NRC and FD
arch dfTelegraaf, ar(1) ma(1 2) arch(1) garch(1)
estat ic
predict r, res

xcorr r dfNRC, lags(13)
xcorr r dfFD, lags(13)
*both strongly correlate at the first lag


drop r
arch dfTelegraaf l1.dfNRC l1.dfFD, ar(1) ma(1 2) arch(1) garch(1)
estat ic
predict r, res
gen r_s= r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
drop r r_s
test l1.dfNRC
di 0.0912/2

*****Explaining conditional variance****
*final arima model
arima dfTelegraaf, ar(1) ma(1 2)
predict r, res
gen sArima=r*r
drop r

twoway (tsline sArima, lcolor(black))
*with drift
dfuller sArima
*random walk
dfuller sArima, noconstant
*trend
dfuller sArima, trend

ac sArima
pac sArima
corrgram sArima

arima sArima
estat ic
predict r, res
gen r_s= r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
ac r
pac r
drop r r_s

arima sArima, ar(1) ma(1)
estat ic
predict r, res
gen r_s= r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
drop r r_s

arima sArima, ar(1) ma(1)
predict r, res
xcorr r dfNRC, lags(13)
xcorr r dfFD, lags(13)

arima sArima l1.dfNRC l1.dfFD, ar(1) ma(1)
estat ic
predict r, res
gen r_s= r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
drop r r_s
