31
Sofie De Broe, Johan van der Valk 17 December 2019 Preconditions for effective "data science" deployment Organisational and statistical challenges

Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

Sofie De Broe, Johan van der Valk

17 December 2019

Preconditions for effective "data science" deployment

Organisational and statistical challenges

Page 2: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

Mission and focus

Using state-of-the-art data science methods and techniques with new and combined datasources to

obtain faster, cheaper and better statistical information on burning and relevant policy questions and

challenges

Training Data scouting Ethics IT

PROJECTS IMPACT 1 NEW STATS

GRANTS

Research: BD methods and techniques

Health, energy, labour market, mobility, safety, economic indicators

PROJECTS IMPACT 2

PROJECTS IMPACT 3

Page 3: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

3

1. Multi-disciplinarity

Page 4: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

2. Eco-system

Page 5: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

5

3. Datascouting

Bron Data Toepassingen

Cargonaut Aerial cargo in NL Economy and trade

Coosto Social media posts Desire to move, social tension

Enelogic/Liander Smart meter data (testset) Energy consumption

Tennet Quantity of current Energy consumption

KNMI Solar radiation data Solar power production

Tomtom Floating Car Data Traffic intensity

T-mobile Signalling data Movement/displacements

Page 6: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

Example: Foreign persons in the Netherlands: Germans (left), Belgians (right) March-April 2018

Page 7: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

4. Funding: Deep Solaris Project

6

Page 8: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

5. Relevance: Urban Data Centres

Page 9: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

6. Communication

Methods: ML to detect solar panels

Social media and text mining: cyber security

URLs: small innovative companies

Visualisation techniques: dotmaps

Page 10: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

7. Research: combining data sources

10

Page 11: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

11

8. Privacy Preserved Data Sharing and Ethics

model

Local data

Local data

FAIR data station

FAIR data station

Page 12: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

- Example 1: Analysing combined data within a ‘thrusted secure environment’

- The set-up of the systems guarantees that both parties will get no insight in the underlying datasets, but are only able to see the results

Page 13: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

9. Implementation

1. Adjust statistical proces2. Software/IT infra3. Q and stability of the source4. Train staff5. ……

Page 14: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

Are we still making the right statistics?

Can we tackle policy questions nationally?

14

10. Questioning output and practises

Page 15: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

15

Are we still making the right statistics?

Page 16: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

Half of the EU-region consists of aninternational border

150 miljoen inhabitants representing40 procent of the EU territory

Page 17: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

Can we tackle policy questions nationally?

17

Page 18: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

Johan van der Valk

December 2019

Producing cross-border statistics

Organisational and statistical challenges

Page 19: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

19

Cross-border statistics: an example of new demands that impacts on how we work

Page 20: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

20

➢Need for comparable data on the border regions: low regional detail and harmonised

➢ EU data: Harmonised statistics of Member States with limited regional detail

➢National data: regional detail with specific methods

Cross-border statistics not available?

Page 21: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

21

Cross-border statistics are difficult to produce, because they …

- Require low regional detail,

- using international comparable methods and sources and

- measuring flows of persons & goods

➢ Strong incentive to explore new (BD) sources!

Production information is challenging

Page 22: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

22Borderdata portal

Data portal Grensdata for NL Border regions

Page 23: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

23

Open data to download

Page 24: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

24

Users: Diverse➢ In needs

➢ Different region levels

Producers: Organised

➢ NSI’s within ESS

➢ Eurostat, UNECE

Users vs Producers of cross-border data

Page 25: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

25

• Methods should be harmonised (inter)nationally

• Is efficient using existing data and expertise

• Equipped for setting up a sustainable system

• Cross-border data is relevant from national perspective

• Stop depicting a country as an island!

• Fits into mandate of the ESS

• Cannot rely on Eurostat because of lack of resources and expertise

SI’s should cover X-border stats

Page 26: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

26

• Make use of existing data and infrastructures

• Use strategy of constructive collaboration

• Start with small group of volunteering SI’s

• Cooperating with stakeholders and relevant organisations

• Aligned with other initiatives

• Meet to exchange practices

How should we develop x-border stats?

Page 27: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

27

New demands: change the way we work

Page 28: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

28

➢ Production✓ One source per Statistic-> Multiple sources

✓ Micro data available -> not available

✓ Few Models -> Many Models

➢ Dissemination➢ Tables & publications -> Open data & Tools & Dashboarding

➢ Different roles within ESS✓ ESS: SI’s have to take more responsibility

✓ Eurostat: Coordinating -> Facilitating

Change organisation of producing statistics

Page 29: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

29

➢ Different forms of cooperation is required with✓ Other SI’s, international organisation

✓ Academics to develop methods

✓ Users (local, regional, national, international)

✓ Data suppliers

✓ Private parties for dissemination

Product of statistics: cooperation

Page 30: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

30

➢New hard skills:

✓Data science

✓Visualisation

➢New soft skills

✓Communication

✓Relations

✓Collaboration rather than cooperation

Requiring new skills

Page 31: Preconditions for effective data science deployment · 2019. 12. 19. · Using state-of-the-art data science methods and techniques with new and combined datasources to ... • Equipped

31

Towards Collaboration