36
Integrating economic statistics in monitoring the 2030 Agenda Integrating economic statistics in monitoring the 2030 Agenda Plenary session 1 Emerging techniques in using big data

Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 AgendaIntegrating economic statistics in monitoring the 2030 Agenda

Plenary session 1

Emerging techniques in using big data

Page 2: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Memorable quotes from yesterday:

• “it’s my camera”

• “glad to see the room is packed with Statisticians”

• “timely, accurate, and comprehensive”

• need to think about “the future of economic statistics”

• “regional cooperation”

• “revisit the trade-offs”

• “don’t break the time-series!”

• “communication with users”

Page 3: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Sample survey of APES 2019 participants:

“what is the best opportunity and the biggest challenge for your country to use big data in monitoring the 2030 Agenda?”

Category of points made:

Page 4: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Reference papersPaper title Authors Panel member

Lessons for effective Public-Private

Partnerships (PPPs) from the use of mobile

phone data in Indonesian tourism statistics

Titi Kanti Lestari, Siim Esko

(BPS Indonesia and

Positium)

Titi and Siim

Subjective Happiness Index based on Twitter

in Indonesia

Asita Sekar Asri, Siti

Mariyah

(BPS Indonesia – District

Office in Sulawesi)

Sita

Measuring Commuting Statistics in Indonesia

Using Mobile Positioning Data

Amanda Pratama Putra,

Ignatius Aditya Setyadi,

Siim Esko, Titi Kanti Lestari

(BPS Indonesia and

Positium)

Ignatius, Titi

and Siim

Development of a Big Data Analysis System

(Case Study: Unemployment Statistics)

Maftukhatul Qomariyah

Virati, Rachmi Agustiyani,

Siti Mariyah, Setia Pramana

(BPS Statistic, District

Office in Kalimantan)

Vira

Page 5: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Page 6: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Part III – SDG Data Sources and Gaps

Page 7: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

TRUST HOTSPOT

TRUST HOTSPOT

Page 8: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Discussion session (1): responding to user needs

• Policy-making needs (SDGs) in Indonesia driving big data developments in:

• MPD in commuting statistics

• MPD in tourism statistics

• Social media analysis for well-being/happiness

• Analysis system for unemployment statistics

• How do we plan/design methodologies and approaches for big data that meet user needs?

Page 9: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Methodology

Figure 1. Usual Environment (Saluveer, E)

Page 10: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Figure 2. Commuters algorithm

Page 11: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

GS

BP

M 5

.0

Page 12: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

GSBPM new areas for NSO in terms of big data

Sim

ilar –

NSO

has

exp

erti

se

Sim

ilar –

NSO

has

ex

per

tise

Dif

fere

nt –

Exp

erti

se li

es

in e

xter

nal

par

ties

Ver

y d

iffe

ren

t –

Exp

erti

se is

ext

ern

al

Ver

y d

iffe

ren

t –

MN

O

exp

erti

se

Ver

y d

iffe

ren

t –

Exp

erti

se is

ex

tern

al

Rel

ativ

ely

sim

ilar

–N

SO h

as

exp

erti

se

Rel

ativ

ely

sim

ilar –

NSO

has

ex

per

tise

Sim

ilar –

NSO

has

exp

erti

se

Sim

ilar –

NSO

has

ex

per

tise

Conceptually – NSO; Technically - external

Page 13: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

GSBPM for cross-border tourism processing

BP

S –

Stat

isti

cs In

do

nes

ia BP

S –

ou

tpu

ts

and

va

riab

les

Posi

tiu

m –

Co

llect

ion

an

d p

roce

ssin

g d

esig

n

Posi

tiu

m -

Bu

ild c

olle

ctio

n a

nd

pro

cess

ing

Telk

om

sel -

Co

llect

Telk

om

sel -

Pro

cess

BP

S w

ith

Po

siti

um

Wei

ght

and

ag

greg

ate

BP

S

BP

S

BP

S

Positium – QA and metadata tools

Page 14: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

User Needs in the 2nd and 3rd Level Government

Time

Publication

Once every several year

Coverage

Analysis

Till 2nd level government

Autonomy

Data

Sometimes manipulated (not by

BPS)

Subjectivity

Real Case

In every region

Page 15: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

DataWhat, How, Why?

Wu et al. (2015) & Mitchell et al. (2013)

Literature

Python 2.7 – Tweepy –Streaming API – JSON

Collecting data

NLP – LabMT DictionaryAnalysis

Dataset

⚫ April till July, 2M overall, 67.589 tourism tweets and 3.800 health tweets.

SDGs

Page 16: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Page 17: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

User Activity in the System

Working System Time Estimation Total Proposed System Time Estimation Total

Opening news site 30 seconds 30 seconds

Web scraping news site* @1 second30 news x 1 second = 30

seconds

Open the news link @30 seconds30 news X 30 seconds =

900 seconds

Copy Paste news into

excel@30 seconds

30 news X 30 seconds =

900 seconds

Compiling news 300 seconds 300 secondsDownload news that has

been collected*@5 seconds

6 sites X 5 seconds = 30

seconds

Skimming News to search

related news@120 seconds

30 news X 120 seconds =

360 secondsSearching news* 2 seconds 2 seconds

Opening jobs vacancy site 30 seconds 30 seconds

Web scraping job vacancy

site*@1 second

30 job vacancy x 1 second

= 30 seconds

Open the job vacancy link @30 seconds30 job vacancy X 30

seconds = 900 seconds

Copy Paste job vacancy

into excel@30 seconds

30 job vacancy X 30

seconds = 900 seconds

Compiling job vacancy 300 seconds 300 secondsDownload job vacancy

that has been collected*5 seconds 5 seconds

Opening & Searching

Index Google Trend site60 seconds 60 seconds

Monitoring Index Google

Trend*5 seconds 5 seconds

Opening & Searching

Tweet and Trending

Topic

60 seconds 60 seconds Monitoring Tweet* 5 seconds 5 seconds

Total Activity : 11Total : 3840 seconds ~

64 minutesTotal Activity : 7

Total : 107 seconds ~

1.783 minutes

Page 18: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Discussion session (1): responding to user needs

• Policy-making needs (SDGs) in Indonesia driving big data developments in:

• MPD in commuting statistics

• MPD in tourism statistics

• Social media analysis for well-being/happiness

• Analysis system for unemployment statistics

• How do we plan/design methodologies and approaches for big data that meet user needs?

Page 19: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Discussion session (2): data quality issues with new methodologies

Page 20: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Data Science Journal (May 2015): The Challenges of Data Quality and Data Quality Assessment in the Big Data; Era Li Cai and Yangyong Zhu, Fudan University, Shanghai, China

Page 21: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Data Science Journal (May 2015): The Challenges of Data Quality and Data Quality Assessment in the Big Data; Era Li Cai and Yangyong Zhu, Fudan University, Shanghai, China

Page 22: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Discussion session (2) - data quality issues with new methodologies

Methodology Main quality gains Main quality challenges

Happiness Index via Twitter

Timeliness; data re-use Processing; biases; cultural/language

Big Data Analysis System for unemployment statistics

Timeliness; accessibility Processing

Commuting statistics using MPD

Timeliness; relevance Reliability/ validity/ accessibility long term

Page 23: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Preprocessing

Tweet

URL + Symbol Removal

Formalizing

Cleaning after formalizing

Stemming

Stop Word Removal

Tweet

Emoji Extraction

Translating emoji to string

Cleaning text Final

Tweet

Life Satisfaction

Affections

Eudaimonia

Personally

Social

Page 24: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Processing

Final Tweet

Tokenization

Matching with labMT dictionary

The scoring :

T = containing N words in each areaf = frequency of ith wordh = happiness value of that wordpi = normalized frequency

2.068.232 tweet clean

Page 25: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

SHI vs Happiness Index in Indonesia

highest

highest

lowest

lowest

67.5275.86

BPS >> 2017SHI >> 2018

??

?

Festival

Daily Ranting

Page 26: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Demak Regency 5135 5.4622

Lombok Timur Regency 4719 5.4542

Karang Asem Regency 2821 5.4018

Cilegon City 2873 5.7582

Bangka Regency 2444 5.7164

Kapuas Regency 8593 5.6888

highest

lowest

SHI By 3rd Level Government

Page 27: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Questions and comments?

Page 28: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

<h1 class="title-large"> Balikpapan international airport sees 40% drop in passengers this Idul Fitri</h1>

<span class="day">Fri, June 7, 2019</span>

<p>The Sultan Aji Muhammad Sulaiman International Airport in Balikpapan, East Kalimantan saw a 40 percent drop in passengers this Idul Fitri holiday as travelers choose to depart from the newly developed Aji PangeranTumenggung Pranoto in Samarinda.</p>

Page 29: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Questions and comments?

Page 30: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Figure 4 Commuting Pattern from Household Survey and MPD

Survey MPD

Page 31: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Figure 5 Correlation of Household Survey and MPD

Page 32: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Questions and comments?

Page 33: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Discussion session (3): forming effective partnerships

Page 34: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Admin data providers

Official Statisticians

Data Scientists

Policymakers (SDGs)

Other?

Big data providers

Public consent/

trust

Eg Lawmakers, software providers, psychologists etc

The partnership web for big data in official statistics

PPP may be needed

Page 35: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

Wider international perspectives

• UNESCAP workshop on big data, Jakarta June 2019

• GWG on big data emerging tools

• UNDP international conference on big data, Kigali, May 2019

• ISI WSC, Kuala Lumpur, August 2019

Page 36: Plenary session 1 Emerging techniques in using big data · (SDGs) Other? Big data providers Public consent/ trust Eg Lawmakers, software providers, psychologists etc The partnership

Integrating economic statistics in monitoring the 2030 Agenda

You draw the conclusions

On post-it notes, for each discussion topic give:

• your main conclusions from the debate and/or

• where regional cooperation is needed

• matching big data opportunities to user needs/SDGs

• managing big data quality effectively in official statistics

• forming effective partnerships for using big data