The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

In this talk we describe how the Fourth Paradigm for Data-Intensive Research provides a framework for developing tools, technologies and platforms to support actionable science. We discuss applications that take advantage of cloud computing, particularly Microsoft Azure, to realise the potential for turning data into decisions, knowledge and understanding. See http://www.fourthparadigm.org and http://www.azure4research.com

Page 3: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

$$\rho \frac{D\mathbf{v}}{Dt} = -\nabla p + \nabla \cdot \boldsymbol{T} + \mathbf{f}$$

Page 4: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Data-intensive Research

• Data acquisition & modelling

• Collaboration and visualisation

• Analysis & data mining

• Dissemination & sharing

• Archiving and preserving

fourthparadigm.org

Page 5: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

X-Info

• Data ingest

• Managing a petabyte

• Common schema

• How to organize it

• How to reorganize it

• How to share with others

• Query and Vis tools

• Building and executing models

• Integrating data and Literature

• Documenting experiments

• Curation and long-term preservation

The Generic Problems

Diagram: Experiments & Instruments, Simulations, Literature and Other Archives each contribute facts; Questions go in and Answers come out.

Page 6: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

All Scientific Data Online

• Many disciplines overlap and use data from other sciences.

• Internet can unify all literature and data

• Go from literature to computation to data back to literature.

• Information at your fingertips – For everyone, everywhere

• Increase Scientific Information Velocity

• Huge increase in Science Productivity

(From Jim Gray’s last talk)

Literature

Derived and recombined data

Raw data

Page 7: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Gartner: http://t.co/Co3EK1ERfN

Page 9: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Manual Measurement

Automated Measurement

Sample Collection

Historical Photographs

Counting

Ubiquitous Motes

Aircraft Surveys

Model Output

Typing

Page 10: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Monitoring

Collation

Quality assurance

Aggregation

Analysis

Reporting

Forecasting

Distribution

Done poorly, but a few notable counter-examples

Done poorly to moderately, not easy to find

Sometimes done well, generally discoverable and available, but could be improved

Integration

(I. Zaslavsky & CSIRO, BOM, WMO)

Page 12: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Web search: “open weather data azure”

Page 14: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Water depth map of London (~130 km²) for a 60-minute storm event with a 100-year return period

http://www.ncl.ac.uk/ceser/researchprogramme/informatics/citycaturbanfloodmodel/
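
A "100-year return period" is easy to misread as "happens once a century". As a rough worked example (not part of the talk), the sketch below converts a return period into the chance of seeing at least one such event over a planning horizon. It assumes that years are independent and that the climate is stationary; the function name and the chosen horizons are illustrative only.

```python
def exceedance_probability(return_period_years: float, horizon_years: int) -> float:
    """Chance of at least one event with the given return period within the horizon.

    Assumes each year is independent and has the same annual exceedance
    probability of 1 / return_period_years (a simplifying assumption).
    """
    if return_period_years < 1:
        raise ValueError("return period must be at least one year")
    annual = 1.0 / return_period_years
    # Probability that no event occurs in any year, subtracted from 1.
    return 1.0 - (1.0 - annual) ** horizon_years


if __name__ == "__main__":
    # Horizons are hypothetical planning periods, chosen for illustration.
    for horizon in (1, 30, 100):
        p = exceedance_probability(100, horizon)
        print(f"{horizon:>3} years: {p:.1%}")
```

Under these assumptions, a 100-year event has roughly a one-in-four chance of occurring during a 30-year period, which is one reason city-scale flood maps like the one above matter for planning.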

Page 16: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

http://www.fetchclimate.org/

Page 22: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Parker MacCready: Univ. of Washington

Rob Fatland, Wenming Ye, Nels Oscar: Microsoft Research

Page 24: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Numerical model of 3-D ocean currents and water properties

• salinity,

• temperature,

• biogeochemistry

Relies on external data sources:

• Bathymetry

• Wind and heating

• Open-ocean boundary conditions (BCs)

• Tides

• Rivers

Page 25: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Model Validation

Comparisons are done to an extensive suite of in-situ observations:

• Sea surface height: 12 NOAA tide gauges

• Salinity and temperature: over 2000 CTD casts from ECOHAB, RISE, DOE, NANOOS, Hood Canal, IOS, King County, and NOAA

• Velocity and moored S, T: 7 coastal ADCP / CTD moorings from the ECOHAB and RISE projects, 2 moorings from IOS
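
The slide does not say which skill metrics were used for these comparisons. As a minimal sketch, assuming the common choices of mean bias and root-mean-square error (RMSE), a model series can be scored against co-located observations as follows. The sample values are invented for illustration and are not LiveOcean or tide-gauge data.

```python
import math
from typing import Sequence


def _check(model: Sequence[float], obs: Sequence[float]) -> None:
    """Both series must be non-empty and paired sample-for-sample."""
    if len(model) != len(obs) or not model:
        raise ValueError("model and obs must be non-empty and the same length")


def bias(model: Sequence[float], obs: Sequence[float]) -> float:
    """Mean of (model - observation); positive means the model reads high."""
    _check(model, obs)
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def rmse(model: Sequence[float], obs: Sequence[float]) -> float:
    """Root-mean-square difference between model and observation."""
    _check(model, obs)
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


if __name__ == "__main__":
    # Hypothetical sea surface heights in metres, for illustration only.
    observed = [0.10, 0.50, 0.90, 0.40]
    modelled = [0.20, 0.40, 1.00, 0.50]
    print(f"bias = {bias(modelled, observed):+.3f} m")
    print(f"rmse = {rmse(modelled, observed):.3f} m")
```

Reporting both numbers is useful because a model can have near-zero bias while still showing a large RMSE when its errors cancel on average.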

Page 26: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Interactive 3-D Model Visualization using WorldWide Telescope, Narwhal and Layerscape

www.layerscape.org

Page 27: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

EH4 32 m

Figure from SA Siedlecki, UW/JISAO; Observations from Connolly et al., 2010

Validation: Dissolved Oxygen & Temperature

Page 28: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

LiveOcean: System Architecture

External data sources:

• Rivers: USGS

• Atmosphere: UW WRF

• Ocean: HYCOM

HPC: Linux, 150 cores

Forecast NetCDF files

LiveOcean Server:

• Post Processing

• Pre-make .png “views”

• Archive NetCDF files

• API for web sites

• Admin.js

• Client.js

Blob Storage: Forecast Copy

Azure Table: Log Info

Science User: python

Admin Website

Client Website: http://mappable.azurewebsites.net/liveocean/

Page 29: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

http://mappable.azurewebsites.net/liveocean

Page 31: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Cloud

Big data

Aggregation

Machine Learning

Analytics

Page 32: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

The Cloud democratizes access to scale & economies of scale

Page 33: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Commodity at Scale

Page 34: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

http://azure.microsoft.com/

Page 35: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

http://github.com/windowsazure

Page 39: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Research Cloud Ecosystem

Page 40: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

www.azure4research.com

Page 41: The Fourth Paradigm - Deltares Data Science Day, 31 October 2014

Use laptops & desktop computers

Overwhelmed by data

Finding analysis ever more difficult; sharing even harder

www.azure4research.com
