Big data, news, and economics

Leif Anders Thorsrud

BI and Norges Bank

April 2019

Disclaimer: This work should not be reported as representing the views of Norges Bank. The views expressed are those of the authors and do not necessarily reflect those of Norges Bank.

Framing expectations

Today’s topic:

“Artificial intelligence/data science from A to Z”

What is Artificial Intelligence (AI)?

Combining potentially many Machine Learning (ML) algorithms to solve complex problems (Taddy (2018))

This presentation:

Will focus on the problems addressed in Norges Bank’s Unstructured Big Data Project, and why AI/ML was useful

Will not focus on the technicalities (ML/IT)

Core questions in (macro)economics

What is the state of the economy now?

What’s causing business cycles?

How are expectations formed?

Why and how is monetary policy communication important?

How to answer: Economics in one slide

y = E(x|I)

y = Outcomes

E = Expectations

x = “the future”

I = Information

In words: Outcomes today are a function of our expectations about the future given the information we have today
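As a purely illustrative sketch of y = E(x|I), the snippet below computes a conditional expectation from a small joint distribution over the future outcome x and a news signal. The probabilities, the signal labels, and the function name `expectation` are assumptions made for this example; they do not come from the presentation.

```python
# Hypothetical joint probabilities over (future outcome x, news signal).
joint = {
    (1.0, "good news"): 0.4,
    (1.0, "bad news"): 0.1,
    (-1.0, "good news"): 0.1,
    (-1.0, "bad news"): 0.4,
}

def expectation(joint, signal=None):
    """Return E(x | I), where I is the observed signal (None means no signal)."""
    # Keep only the (x, signal) pairs consistent with the information we have.
    consistent = {
        (x, s): p for (x, s), p in joint.items()
        if signal is None or s == signal
    }
    total = sum(consistent.values())
    # Probability-weighted average of x, renormalised on the consistent states.
    return sum(x * p for (x, _), p in consistent.items()) / total

y_no_signal = expectation(joint)           # expectation without any news
y_good = expectation(joint, "good news")   # expectation after good news
y_bad = expectation(joint, "bad news")     # expectation after bad news
print(y_no_signal, y_good, y_bad)
```

In this toy setting the same future x yields different outcomes y depending on which information is observed, which is the point of conditioning on I.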

Textbook economics

Event → Information flow → Choices and outcomes

Full Information (I) Rational Expectations (FIRE)

y = E(x|I)

Reality

Event → Information flow → Choices and outcomes

Textbook economics: Full Information (I) Rational Expectations (FIRE)

y = E(x|I)

Real life: The ether matters

People read the news to get information

y = E(x|I(media))

Challenges and opportunities

Our challenge: Quantify news

News and media data is often textual

This is Big Data: many words and articles, and highly unstructured

Need AI, but algorithms and data required are new to economists

Challenges and opportunities

Norges Bank solution: Unstructured Big Data Project (UBDP)

Broad mandate: Investigate potential usefulness of “Text as Data”

Identify/document its benefits over conventional economic data

Speak to economic theory and the academic literature

Managed and run by Norges Bank Research department

Started late 2016

In a nutshell: Do data sources like these...

[Figure: images of example data sources]

What is the state of the economy now?

What’s causing business cycles?

How are expectations formed?

Why and how is monetary policy communication important?

...contain useful information for answering our questions?

Lessons learned

What is the state of the economy now?

A Newsy Coincident Index (NCI) for Norway (Thorsrud (2018))

High-frequency indicator of the business cycle

Like a daily survey (but much cheaper)

Text as data: Benefits,...

Potentially available at a high frequency

Potentially reflecting the broader economy

Financial data is high frequency, but NOT reflecting the broader economy

Text as data: Benefits,...

Potentially available at a high frequency

Potentially reflecting the broader economy

Financial data is high frequency, but NOT reflecting the broader economy

Potentially capturing economically relevant concepts not measured by conventional hard economic data

E.g., politics, natural disasters, and uncertainty

Uncertainty and Brexit (Larsen (2017))

Topic: EU

What’s causing business cycles: Fluctuations in uncertainty?

(EU) Uncertainty shock and macro responses:

Text as data: Benefits,...

Potentially available at a high frequency

Potentially reflecting the broader economy

Financial data is high frequency, but NOT reflecting the broader economy

Potentially capturing economically relevant concepts not measured by conventional hard economic data

E.g., politics, natural disasters, and uncertainty

A number is a fact, but the media in which it is presented, discussed, or commented on adds to the information

I.e., there might be an independent (causal) media effect

How are expectations formed? (Larsen and Thorsrud (2017))

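Figure: average returns before the strike (1), during the strike period (2; no media, but news), and after the strike (3).

Average returns, all firms: the strike effect is $\Delta r_s = \bar{r}_{0,2} - \bar{r}_{0,1} \approx -0.62$.

Average returns by group, where $r_{2,\cdot}$ is the treatment group (“exposed to media”) and $r_{1,\cdot}$ is the control group (“not exposed to media”): the media effect is $\Delta r_m = \bar{r}_{2,2} - \bar{r}_{1,2} \approx -0.57$.

Formally:

$\Delta r_{i,d-ba} = \delta + \tau \Delta w_i + \Delta u_i$

where $\tau = \Delta r_m = \Delta r_{2,s} - \Delta r_{1,s}$, with $\Delta r_{2,s} = \bar{r}_{2,2} - \bar{r}_{2,1}$ and $\Delta r_{1,s} = \bar{r}_{1,2} - \bar{r}_{1,1}$. If $\bar{r}_{2,1} = \bar{r}_{1,1}$, then $\Delta r_m = \bar{r}_{2,2} - \bar{r}_{1,2}$.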
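The difference-in-differences logic behind the media effect can be sketched in a few lines of Python. This is a minimal illustration, not the estimation used by Larsen and Thorsrud (2017): the return series are made up, and the function names `period_change` and `media_effect` are hypothetical.

```python
from statistics import mean

def period_change(before, during):
    """Change in average returns from the pre-strike to the strike period."""
    return mean(during) - mean(before)

def media_effect(treated_before, treated_strike, control_before, control_strike):
    """Difference-in-differences: treated change minus control change."""
    return (period_change(treated_before, treated_strike)
            - period_change(control_before, control_strike))

# Hypothetical daily returns; NOT the data behind the slide.
treated_before = [0.10, 0.20, 0.00]
treated_strike = [-0.50, -0.40, -0.60]
control_before = [0.10, 0.10, 0.10]
control_strike = [0.00, 0.10, 0.05]

tau = media_effect(treated_before, treated_strike, control_before, control_strike)
print(f"estimated media effect: {tau:.2f}")
```

Because the two groups in this toy data have the same average return before the strike, the estimate equals the difference in strike-period averages, mirroring the special case in which the pre-strike means coincide.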

Text as data: Benefits, theory, and some UBDP output

Intuitive benefits:

Potentially available at a high frequency

Potentially reflecting the broader economy

Potentially capturing economically relevant concepts not measured by conventional hard economic data

A number is a fact, but the media in which it is presented, discussed, or commented on adds to the information

In theory:

News-driven/sentiment-driven business cycle view

Rational (in)attention theory and information rigidities

Narrative economics

UBDP output, Norwegian data:

“Words are the new numbers: A newsy coincident index of the business cycle” (Thorsrud (2018))

“Components of Uncertainty” (Larsen (2017))

“Asset returns, news topics, and media effects” (Larsen and Thorsrud (2017))

“The Value of News for Economic Developments” (Larsen and Thorsrud (2018b))

UBDP output, international data:

“News-driven inflation expectations and information rigidities” (Larsen et al. (2019))

“Business cycle narratives” (Larsen and Thorsrud (2018a))

How (success factors)?

Algorithms

Combined (close to) off-the-shelf Machine Learning algorithms from the Natural Language Processing literature with conventional tools used in econometrics

Latent Dirichlet Allocation, Dynamic Factor Models, Latent Threshold Models, ...

Keywords: Dimension reduction, sparsity, and non-linearity

IT

All computations done in a simple cloud environment. In-house hardware not adequate:

“Small” computers, firewall/security issues, software restrictions

Keywords: Flexible and low cost for R&D

Dissemination

Internal (and external) courses and presentations on using “Text as data”
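To give a feel for how Latent Dirichlet Allocation turns text into numbers, here is a minimal collapsed Gibbs sampler written with the Python standard library only. It is a sketch under stated assumptions, not the project's implementation: the toy documents, the hyperparameters `alpha` and `beta`, the iteration count, and the function name `lda_gibbs` are all chosen for illustration.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns per-document topic shares."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: document-topic, topic-word, and tokens per topic.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed topic shares per document (each row sums to one).
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]

# Toy "articles"; hypothetical text, already tokenised.
docs = [
    "oil price falls oil exports".split(),
    "oil production price rises".split(),
    "housing market prices housing loans".split(),
    "housing loans interest rates".split(),
]
shares = lda_gibbs(docs, n_topics=2)
for row in shares:
    print([round(s, 2) for s in row])
```

Each row of `shares` is one article's mix of topics. Averaged by publication date, such shares become time series that could feed conventional econometric tools such as a dynamic factor model.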

Difficulties

Algorithms and data

Economists often care more about the story than the outcome, i.e., they often look for a causal explanation rather than the best prediction. ML/AI is better at, or mostly used for, the latter. Work to combine the two is ongoing

Domain knowledge is important

Constructing the appropriate data sets might be difficult. Need to rely on external provider(s), which can be expensive, or construct them ourselves

Text data is abundant. (Macro)economic data is scarce (relative to text):

Becomes an issue for training algorithms

Supervised versus unsupervised learning

Difficulties cont’d

From research to production?

New techniques and data require different skills: Need to train staff, or hire people with the right skills

And an interest in economics. Domain knowledge is important!

Ownership: ML/AI is often a black box (and more so for those who have not done the development). Hard to get people to use tools they do not understand

This is often a good thing, but also relates to a preference for “causal understanding”

Data and model management: Need a well-functioning data management/science team/infrastructure

Conclusions and (potential) advice

Combining potentially many Machine Learning algorithms to solve complex problems works in economics too. Some general lessons/advice:

Focus on the questions, then the tools (algorithms) needed to solve the problem

Make sure the questions are of relevance for your business. “Need more of what we do not have rather than more of what we already have”

Start small

E.g., a pilot based on cloud infrastructure might be sensible when output is uncertain

Expect resistance

Successful (eventual) implementation in production requires that people outside the project group “understand” the work being done

Value domain knowledge

References

Larsen, V. H. (2017, April). Components of uncertainty. Working Paper 2017/5, Norges Bank.

Larsen, V. H. and L. A. Thorsrud (2017). Asset returns, news topics, and media effects. Working Paper 2017/17, Norges Bank.

Larsen, V. H. and L. A. Thorsrud (2018a). Business Cycle Narratives. Working Paper 2018/03, Norges Bank.

Larsen, V. H. and L. A. Thorsrud (2018b). The Value of News for Economic Developments. Journal of Econometrics (Forthcoming).

Larsen, V. H., L. A. Thorsrud, and J. Zhulanova (2019, February). News-driven inflation expectations and information rigidities. Working Paper 2019/5, Norges Bank.

Taddy, M. (2018, January). The Technological Elements of Artificial Intelligence. University of Chicago Press.

Thorsrud, L. A. (2018). Words are the new numbers: A newsy coincident index of the business cycle. Journal of Business & Economic Statistics (Forthcoming).
