Why so many data systems? Dickerson – ppt


Why so many data systems?

• Dickerson – ppt

Information as a Resource

• Shared not exchanged • …

The Transformational Effect of Networking

“Networking has led to an unprecedented surge of productivity” Time Magazine, Person of the Year 2006, YOU

• These are opportunities to enable Earth Science through more networking

• But many resistances to networking exist that need to be overcome

• Information has become the main driver of progress

• Time and place are no longer barriers to participation and interaction

• The Web has become a medium of participation: the ‘Web 2.0’ phenomenon

Networking Multiplies Value Creation

[Diagram: one dataset feeding one application in an enclosed value-creating process, a ‘stovepipe’.]

1 user, stovepipe: Value = 1 (1 data x 1 program = 1)

[Diagram: one stovepipe dataset feeding five applications.]

1 user, stovepipe: Value = 1 (1 data x 1 program = 1)

5 uses of data: Value = 5 (1 data x 5 programs = 5)

Networking Multiplies Value Creation

Merging data may create new, unexpected opportunities

Not all data are equally valuable to all programs

Configuration | Data x Programs | Value
1 user, stovepipe | 1 x 1 | 1
5 uses of data | 1 x 5 | 5
Open network | 5 x 5 | 25

[Diagram: five datasets openly connected to five applications, shown next to a single stovepipe.]
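The arithmetic on these slides can be written out as a short sketch. It assumes, as the slides do, that value is simply the number of data-program pairings; the function and scenario names are illustrative and not from the original deck. Since not all data are equally valuable to all programs, the product is best read as an upper bound.

```python
def network_value(n_datasets: int, n_programs: int) -> int:
    """Count data-program pairings, the slides' simple measure of value."""
    return n_datasets * n_programs


# (datasets, programs) for the three configurations on the slides.
scenarios = {
    "stovepipe": (1, 1),
    "shared data": (1, 5),
    "open network": (5, 5),
}

values = {name: network_value(d, p) for name, (d, p) in scenarios.items()}

for name, value in values.items():
    print(f"{name}: value = {value}")
```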

Networking Multiplies Value Creation

The Network Effect: Less Cost, More Benefits through Data Multi-Use

[Diagram: the public sets up organizations, organizations develop programs, and programs ask for and get data.]

Data Re-Use | Network Effect
Pay only once | Richer content
Less program cost | More knowledge
Less social cost | More social benefit

Data are a costly resource and should be reused (recycled) for multiple applications

Data reuse saves programs money and allows richer knowledge creation

Data reuse, like recycling, takes some effort: labeling, organizing, distributing

Data repositories/Systems


Increasing the Size of the Pie


Single use: Cost = 1, Benefit = 1

5 uses: Cost = 1.5, Benefit = 5
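A minimal sketch of the comparison behind "increasing the size of the pie", using only the four numbers given on the slide. The benefit/cost ratio and the net benefit are illustrative measures chosen here; the slide itself does not name one.

```python
def benefit_cost_ratio(benefit: float, cost: float) -> float:
    """Benefit obtained per unit of cost."""
    return benefit / cost


# Figures from the slide: cost and benefit for one use and for five uses.
single_use = {"cost": 1.0, "benefit": 1.0}
five_uses = {"cost": 1.5, "benefit": 5.0}

ratio_single = benefit_cost_ratio(single_use["benefit"], single_use["cost"])
ratio_shared = benefit_cost_ratio(five_uses["benefit"], five_uses["cost"])

# Net benefit is benefit minus cost.
net_single = single_use["benefit"] - single_use["cost"]
net_shared = five_uses["benefit"] - five_uses["cost"]

print(f"single use: ratio {ratio_single:.2f}, net {net_single:.1f}")
print(f"five uses:  ratio {ratio_shared:.2f}, net {net_shared:.1f}")
```

The extra 0.5 of cost stands for the labeling, organizing and distributing effort that reuse requires.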

Data Re-Use and Synergy

• Data producers maintain their own workspace and resources (data, reports, comments).

• Part of the resources is shared by creating common virtual resources.

• Web-based integration of the resources can be across several dimensions:

– Spatial scale: local to global data sharing

– Data content: combination of data generated internally and externally

• The main benefits of sharing are data re-use, data complementing and synergy.

• The goal of the system is to have the benefits of sharing outweigh the costs.

[Diagram: users with local and global content each contribute the shared part of their resources to Virtual Shared Resources (data, knowledge, tools, methods).]

Federated Information System

• Data producers maintain their own workspace and resources (data, reports, comments).

• However, part of the resources are shared through a Federated Information System.

• Web-based integration of the shared resources can be across several dimensions:

Data sharing federations:

• Open GIS Consortium (GIS data layers)

• NASA SEEDS network (satellite data)

• NSF Digital Government

• EPA’s National Environmental Information Exchange Network

[Diagram: RPO Federated Data System. Each RPO divides its data, tools and methods into a private and a shared part. Labels shown: VIEWS, RPO, NASA, NAAPS, Other Federations. Applications: PM policy, regulation, mitigation.]


[Diagram: several portals, including the Unidata Portal and the ESIP Portal.]

Data are to be “dispersed” to multiple “portals”

This brings data closer to the user

Each portal can serve a different clientele

The condition is an open architecture, so that the resources can be reconfigured into many different “views” through the different portals

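User communities for a smoke event

[Diagram: a smoke event and the communities that use information about it. User labels: Public; EPA (NAAQS Exc. Events); States (AQ Warning); NOAA; Travel Advisories; AQ Forecasting; FAA (Flight Advisories); NASA (Earth Obs); Scientist (Science). Data labels: Sat MODIS, Sat TOMS, Sat GOES, Mod, Vis, PM25, Chem, DAACs.]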

• Current info systems are project/program oriented and provide end-to-end solutions

Data Providers | Info System | Info Users
AIRNow | AIRNow | Public
Model | Compliance | Manager

‘Stovepipe’ and Federated Usage Architectures Landscape

• Part of the data resources of any project can be shared for re-use through DataFed

• Through the Federation, the data are homogenized into multi-dimensional cubes

• Data processing and rendering can then be performed through web services

• Each project/program can be augmented by Federation data and services

• Applicable to:

– Model validation

– Delivering information to the public

– Tracking trends

– Accountability

• GEOSS

Data Acquisition and Usage Activities

Need similar generic pic for analysis

Staged Data Integration? Staged portal

[Diagram: several monitor-and-store chains each yield a dataset (Data 1, Data 2, Data n, Data m); these feed Integrated Data 1, Integrated Data 2, Integrated Data 3 and Virtual Int. Data.]

The system integrates forward, from the provider to the users, so that users can find and monitor content

Users can navigate backwards toward the provider

PoP – harvester

Oodle!

CNet

Agile Information System: Data Access, Processing and Products

[Diagram: data flow from providers to users through value-adding processes.]

Providers: NASA DAACs, EPA Model, EPA AIRNow, others

Users: Public, Manager, Scientist, other

Value-adding processes:

• Organizing: document, structure/format, interfacing

Agile Information System: Data Access, Processing and Products

[Diagram build: the same providers, users and Organizing process, with two additions.]

• Uniform Access

• Homogenizing: format profile, standard access, data as service

Agile Information System: Data Access, Processing and Products

[Diagram build: the same providers, users, Uniform Access, Organizing and Homogenizing, with two additions.]

• Data Processing Web Service Chain: DataFed, SciFlo, custom processing

• Characterizing: display/browse, compare/fuse, characterize

Agile Information System: Data Access, Processing and Products

[Diagram build: the same chain from providers through Uniform Access and the Data Processing Web Service Chain to users, with two additions.]

• Products: reports, forecast, compliance, science, other

• Analyzing: filter/integrate, aggregate/fuse, custom analysis

Value-Adding Processes

[Diagram build: the complete set of value-adding processes between providers and users.]

• Organizing: document, structure/format, interfacing

• Homogenizing: format profile, standard access, data as service

• Characterizing: display/browse, compare/fuse, characterize

• Analyzing: filter/integrate, aggregate/fuse, custom analysis

• Reporting: inclusiveness, iterative/agile, dynamic report

Information Value Chain

Agile Information System: Data Access, Processing and Products

[Diagram build: the same chain from providers through Uniform Access, the Data Processing Web Service Chain and products to users, now annotated with Data and Control flows.]

Seeking Information

Providing Information

Negotiating & Market Space

System of Systems

Global Earth Observation System of Systems (GEOSS)

Characteristics of System of Systems (SoS)

• Autonomous constituents managed/operated independently

• Independent evolution of each constituent

• SoS displays emergent behavior

Must recognize, manage, exploit the characteristics:

• No stakeholder has complete SoS insight

• Central control is limited; distributed control is essential

• Users must be involved throughout the life of a SoS

Let’s agree on a Space-Time-Parameter Data Access Query Protocol

Interoperability Stack: Key concept of the Web Connecting Machines and People

IP – Internet Protocol

Service Orientation Open Architecture

Data Standards

Amplify Individuals Connect Minds

System components have to be interoperable at each layer

[Background diagram: the Agile Information System value chain, with data acquisition from providers (NASA DAACs, EPA Model, EPA AIRNow, others), Uniform Access, the Data Processing Web Service Chain (DataFed, SciFlo, custom processing), products, users, and data and control flows.]

Standard Data Query Language:

Where? When? What? (Space-time query - WMS, WCS)

[Diagram: a client (front end) and a server (back end), each behind a standard interface. GetCapabilities returns the server’s capabilities (‘profile’). GetData asks Where? When? What? Which format? and returns the data.]

Query | GetData | Standards
Where? | BBOX | OGC, ISO
When? | Time | OGC, ISO
What? | Temperature | CF
Format | netCDF, HDF.. | CF, EOS, OGC

[Diagram labels: T1, T2]
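As an illustration of how the Where/When/What/Format questions map onto a standard request, here is a minimal sketch that assembles an OGC WCS 1.0.0 GetCoverage query string. The endpoint URL, the coverage name "temperature" and the format name "NetCDF" are assumptions for illustration only; a real server advertises its own coverage names, formats and supported time syntax in its GetCapabilities response.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, used only for illustration.
BASE_URL = "http://example.org/wcs"


def build_getcoverage_url(base_url, coverage, bbox, time, fmt,
                          width=256, height=256):
    """Assemble a WCS 1.0.0 GetCoverage request as a key-value-pair URL."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,                    # What?
        "CRS": "EPSG:4326",                      # lon/lat coordinates for BBOX
        "BBOX": ",".join(str(v) for v in bbox),  # Where? minx,miny,maxx,maxy
        "TIME": time,                            # When?
        "WIDTH": width,                          # output grid size
        "HEIGHT": height,
        "FORMAT": fmt,                           # Which format?
    }
    return f"{base_url}?{urlencode(params)}"


url = build_getcoverage_url(
    BASE_URL, "temperature", (-125.0, 24.0, -66.0, 50.0), "2004-08-27", "NetCDF"
)

# Parse the query back to show that the four questions survive the round trip.
query = parse_qs(urlsplit(url).query)
print(query["COVERAGE"][0], query["BBOX"][0], query["TIME"][0], query["FORMAT"][0])
```

Because the client only needs this query vocabulary, it stays loosely coupled to whatever storage the server uses behind its interface.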

Loosely Coupled Data Access through Standard Protocols

Standard Messaging

“What data do you have?” (GetCapabilities)

“Give me this data” (GetData)

[Background diagram: the Agile Information System value chain from providers to users.]

Web Services and Workflow for Loose Coupling

Service Chaining & Workflow

Workflow Software: Dynamic Linking

Software Mashup: Coarse-grain Linking

VIEW by Web Service Composition, with layers:

• SeaWiFS Satellite

• Aerosol Chemical

• Air Trajectory

• Map Border
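Service chaining can be sketched as plain function composition. In this toy example each "service" is a local stand-in that adds one named layer to a view; in a real system each step would be a call to a remote web service. All names here are hypothetical and are not the DataFed or SciFlo API.

```python
from functools import reduce
from typing import Callable, List

View = List[str]                    # a view is an ordered stack of layer names
Service = Callable[[View], View]    # a service takes a view and returns a new one


def layer_service(name: str) -> Service:
    """Return a stand-in service that adds one named layer to a view."""
    def add_layer(view: View) -> View:
        return view + [name]
    return add_layer


def chain(services: List[Service]) -> Service:
    """Compose services left to right into a single workflow."""
    def run(view: View) -> View:
        return reduce(lambda acc, service: service(acc), services, view)
    return run


workflow = chain([
    layer_service("SeaWiFS satellite"),
    layer_service("aerosol chemical"),
    layer_service("air trajectory"),
    layer_service("map border"),
])

view = workflow([])
print(view)
```

Because each service only agrees on the input and output type, services can be reordered or swapped without changing the others, which is the point of loose coupling.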

[Background diagram: the Agile Information System value chain from providers to users.]

Collaborative Reporting and Dynamic Delivery

Co Writing - Wiki

ScreenCast

Collaborative Analysis and Writing

Wiki, Blogs, Group Annotations

Dynamic Content Delivery:

GoogleEarth, Screencasting…

DataFed: 100+ Datasets Non-intrusively Federated

• Data are accessed from autonomous, distributed providers

• DataFed ‘wrappers’ provide uniform geo-time referencing

• Tools allow space/time overlay, comparisons and fusion
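The wrapper idea can be illustrated with a minimal sketch: two hypothetical providers deliver the same kind of measurement in different native layouts, and one small wrapper per provider maps each record onto a common space-time-parameter form. The record layouts, field names and values below are invented for illustration and do not describe DataFed's actual wrappers.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    """Common form: where, when, what, and the value."""
    lat: float
    lon: float
    time: datetime
    parameter: str
    value: float


def wrap_provider_a(record: dict) -> Observation:
    # Hypothetical provider A: named fields and an ISO-style UTC timestamp.
    time = datetime.strptime(record["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    return Observation(record["latitude"], record["longitude"],
                       time.replace(tzinfo=timezone.utc), "PM25", record["pm25"])


def wrap_provider_b(record: tuple) -> Observation:
    # Hypothetical provider B: (epoch seconds, lon, lat, value) tuples.
    epoch, lon, lat, value = record
    time = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return Observation(lat, lon, time, "PM25", value)


obs_a = wrap_provider_a({"latitude": 38.6, "longitude": -90.2,
                         "timestamp": "2004-08-27T12:00:00Z", "pm25": 41.0})
obs_b = wrap_provider_b((1093608000, -84.4, 33.7, 35.5))

# Once wrapped, records from both providers can be compared directly.
same_time = obs_a.time == obs_b.time
print(obs_a)
print(obs_b)
```

The providers keep their own formats and remain autonomous; only the wrappers know the native layouts, which is what makes the federation non-intrusive.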

Near Real Time Data Integration

Delayed Data Integration

Surface Air Quality
AIRNOW | O3, PM25
ASOS_STI | Visibility, 300 sites
METAR | Visibility, 1200 sites
VIEWS_OL | 40+ Aerosol Parameters

Satellite
MODIS_AOT | AOT, Idea Project
GASP | Reflectance, AOT
TOMS | Absorption Indx, Refl.
SEAW_US | Reflectance, AOT

Model Output
NAAPS | Dust, Smoke, Sulfate, AOT
WRF | Sulfate

Fire Data
HMS_Fire | Fire Pixels
MODIS_Fire | Fire Pixels

Surface Meteorology
RADAR | NEXRAD
SURF_MET | Temp, Dewp, Humidity…
SURF_WIND | Wind vectors
ATAD | Trajectory, VIEWS locs.

Sample of Federated Datasets

Sulfate in the Northeast

Sahara Dust in the Gulf

Fires in the Southeast

Time Series Console: Southeast

Analyst Console Applications:

Sulfate Episode: 8/27/04

A Sample of Datasets Accessible through ESIP Mediation

Near Real Time (~ day)

It has been demonstrated (project FASTNET) that these and other datasets can be accessed, repackaged and delivered by AIRNow through ‘Consoles’

• MODIS Reflectance

• MODIS AOT

• TOMS Index

• GOES AOT

• GOES 1km Reflectance

• NEXRAD Radar

• MODIS Fire Pixels

• NRL Model

• NWS Surface Wind, Bext

Summary

Grand Convergence: Will we make use of it?

• Third-party mediation can homogenize distributed ES data

• Agile SOA-based IS can deliver diverse info products to users

• Since 2005, one such IS, DataFed, has been used by EPA and in research

• However, more data need to be federated by the community

Parting thoughts

Think outside the stovepipe – Think networking

Divide and Conquer, NO! Connect and Enable, YES!

Thank you