Accelerating Scientific Dataflows Sudarshan S. Chawathe Associate Professor of Computer Science & Cooperating Associate Professor of Climate Change Institute University of Maine


Accelerating Scientific Dataflows

Sudarshan S. Chawathe

Associate Professor of Computer Science & Cooperating Associate Professor of Climate Change Institute

University of Maine


Outline: A Data-Centric View; Project 301; P301dx Features; Tambora & SO4; Map: Icereader Data; Web App Challenges; RFDE; RFDE Client Upgrades; Web Mapping Service; WMS Parameters; Handheld Data; Summary

Sudarshan S. Chawathe Accelerating Scientific Dataflows – p. 2

A Data-Centric View

• What are the primary and supplemental datasets?
• How are different datasets acquired?
• What are the key transformations, interpretations, and visualizations?
• What may be automated? What requires human interpretation?
• What are effective and efficient modes of interaction with data?


Project 301

• Cyber-Infrastructure for Climate-Change Research.
• Goal: Accelerate scientific discoveries by enabling more effective management of large and diverse datasets.
• Approach: Develop domain-specific adaptations of data management methods. Implement and evaluate the methods on real data.
• Research topics (Computer Sci.):
    ◦ Data importation: “ETL” for scientific data.
    ◦ Data integration: instruments, documents, Web services, ...
    ◦ Interactive data exploration and visualization.
    ◦ Visual programming.
    ◦ Data mining.
    ◦ Provenance of data.
    ◦ Workflows.
    ◦ Systems issues: performance, scalability, reliability, ...


P301dx Features

• Integrated view of large, diverse datasets: ice-core data, volcanic records, data extracted from documents, ...
• Interactive data exploration based on charts plotting time-series and related data, maps, ...
• Palette of tools for data processing, plotting, and other manipulations. Built-in tools for resampling, smoothing, ...
• Tools that operate on, and produce, objects in the working-object store, simplifying multi-step data manipulation and plotting.
• Interactive generation of new tools by composition and other higher-level operations: tool-generating tools.
• Chart exportation in high-quality vector and raster formats.
• A door to the larger cyber-infrastructure effort, P301.
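
The idea of tools that consume and produce series, and of tool-generating tools built by composition, can be illustrated with a small sketch. This is not P301dx code: the names `resample`, `smooth`, and `compose`, the `(time, value)` series representation, and the choice of linear interpolation and a centered moving average are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: a time series as (time, value) pairs
# with strictly increasing times.
Series = List[Tuple[float, float]]
Tool = Callable[[Series], Series]


def resample(step: float) -> Tool:
    """Tool generator: linear interpolation onto a regular time grid."""
    def tool(series: Series) -> Series:
        if len(series) < 2:
            return list(series)
        t_start, t_end = series[0][0], series[-1][0]
        count = int((t_end - t_start) / step + 1e-9) + 1
        out: Series = []
        i = 0
        for k in range(count):
            t = t_start + k * step
            # Advance to the input segment that contains t.
            while i < len(series) - 2 and series[i + 1][0] < t:
                i += 1
            (t0, v0), (t1, v1) = series[i], series[i + 1]
            out.append((t, v0 + (t - t0) / (t1 - t0) * (v1 - v0)))
        return out
    return tool


def smooth(window: int) -> Tool:
    """Tool generator: centered moving average, truncated at the ends."""
    def tool(series: Series) -> Series:
        half = window // 2
        out: Series = []
        for k, (t, _) in enumerate(series):
            lo, hi = max(0, k - half), min(len(series), k + half + 1)
            values = [v for _, v in series[lo:hi]]
            out.append((t, sum(values) / len(values)))
        return out
    return tool


def compose(*tools: Tool) -> Tool:
    """Tool-generating tool: chain existing tools, applied left to right."""
    def tool(series: Series) -> Series:
        for f in tools:
            series = f(series)
        return series
    return tool


# A new tool built from two existing ones.
resample_then_smooth = compose(resample(1.0), smooth(3))
result = resample_then_smooth([(0.0, 0.0), (2.0, 4.0), (4.0, 0.0)])
```

Because every tool has the same input and output type, the composed tool can itself be stored and reused like a built-in one, which is the property the working-object store relies on in the description above.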


Tambora and SO4


Map: Icereader Data


Web Application Challenges

1. REST: Representational State Transfer.
    • Robust and scalable Web applications.
    • Standards-based, wide availability.
    • Broadly accessible.

2. Modern Web interfaces: JavaScript, HTML5, ...
    • High interactivity.
    • Client-side optimizations.
    • Glamor.

3. How to consolidate 1 and 2?


RFDE: Robust Web Applications

• REST Framework for Dynamic Environments


RFDE Client Upgrades


Web Mapping Service

[Diagram: Web Mapping Service architecture. Components: clients (desktop, Web, and mobile applications), load balancer, TMS servers, tile renderer, interpolation module, database servers. Data labels: x,y,z; grids; cached and static tiles.]

• Arbitrary geocoded point and grid data, backgrounds, ...
• Web interface similar to Google Maps; de-facto standard.
• REST-based design; easily re-targetable: Android, iOS, ...
• Challenges: 10^13 tiles, 10^4 Terabytes.
• Fast in-database dynamic tile generation from numeric data.
• Easy to replicate, map on to cloud services.
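
The x,y,z tile addressing used by Google-Maps-style interfaces can be sketched as follows. This assumes the common Web Mercator tiling scheme, in which zoom level z divides the world into 2^z by 2^z tiles; the slides do not state which projection or URL layout the service uses, so the function names and the `/{z}/{x}/{y}.png` path are hypothetical.

```python
import math
from typing import Tuple


def lonlat_to_tile(lon: float, lat: float, z: int) -> Tuple[int, int]:
    """Map a longitude/latitude in degrees to (x, y) tile indices at zoom z.

    Assumes Web Mercator tiling with y counted from the top (north) edge.
    """
    n = 2 ** z
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the map edge still land on a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def xyz_to_tms_y(y: int, z: int) -> int:
    """TMS counts rows from the bottom edge, so the y index is flipped."""
    return 2 ** z - 1 - y


def tiles_through_zoom(max_z: int) -> int:
    """Total tiles for one layer over zoom levels 0..max_z: sum of 4^z."""
    return (4 ** (max_z + 1) - 1) // 3


def tile_path(z: int, x: int, y: int) -> str:
    """Hypothetical REST-style resource path for one tile."""
    return f"/{z}/{x}/{y}.png"


x, y = lonlat_to_tile(0.0, 0.0, 1)
path = tile_path(1, x, y)
```

The tile count grows by a factor of four per zoom level, and again with each data parameter and time step, which is one way tile counts of the order quoted above can arise and why pre-rendering every tile is impractical.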


WMS Descriptive Parameters

Parameter                     Value
data parameters               115
period                        32 years
tiles                         23 × 10^12
rendered tile size            10,000 Terabytes
database size                 0.42 Terabytes
avg static response time      0.2 seconds
avg dynamic response time     0.5 seconds
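
A quick calculation on the figures above shows the scale of the gap between stored data and rendered output. It assumes that "rendered tile size" means the total size of all tiles if every one were rendered, and that a Terabyte is 10^12 bytes; neither is stated on the slide.

```python
# Figures taken from the parameter table; unit interpretation is an assumption.
BYTES_PER_TERABYTE = 10 ** 12           # assumed decimal terabytes

tiles = 23 * 10 ** 12                   # number of tiles
rendered_bytes = 10_000 * BYTES_PER_TERABYTE
database_bytes = 0.42 * BYTES_PER_TERABYTE

# Average size of one rendered tile, in bytes.
bytes_per_tile = rendered_bytes / tiles

# How much larger the fully rendered tile set is than the database.
expansion_ratio = rendered_bytes / database_bytes

print(f"average tile size: {bytes_per_tile:.0f} bytes")
print(f"rendered / database: {expansion_ratio:,.0f}x")
```

Under these assumptions the rendered tiles would be roughly four orders of magnitude larger than the database, which is consistent with generating tiles dynamically from the numeric data and caching only some of them.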


Handheld Data Analysis

• Test data; do not use!
• HCDX: handheld chronological data explorer.
• Android, iOS, Maemo, Web, ...
• Very high-level end-user programming.
• Interactive analysis of time-series datasets.
• In-field data collection and analysis.
• Handheld interfaces, functional programming, database optimizations, ...


Summary

• Scientific dataflows: from raw data to insights.
    ◦ Explication, documentation, optimization, ...
    ◦ Durability, traceability, analyses, visualizations, ...
    ◦ Platforms: desktop/laptop, Web, mobile, ...
    ◦ Bottleneck in the research process?
• Investments in improving dataflow have a multiplier effect on other research investments.
• Acknowledgments:
    ◦ Faculty: Shaleen Jain, Andrei Kurbatov, Paul Mayewski.
    ◦ Graduate students: Erik Albert, Mark Royer.
    ◦ Undergraduate students: Will Lamond, Joe Petrakovich.
    ◦ Project teams: P301, 10green, RFDE/SSI.
    ◦ Funding: NSF, U.Maine.
• Data management collaborations? [email protected]
