Unidata 2008: Shaping the Future of Data Use in the Geosciences. Expanding Horizons: Using Environmental Data for Education, Research, and Decision Making. 23 June 2003, Boulder, CO. Mohan Ramamurthy, Unidata Program Center, UCAR Office of Programs, Boulder, CO

Unidata 2008: Shaping the Future of Data Use in the Geosciences


Page 1: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Unidata 2008: Shaping the Future of Data Use in the Geosciences

Expanding Horizons: Using Environmental Data for Education, Research, and Decision Making

23 June 2003
Boulder, CO

Mohan Ramamurthy
Unidata Program Center
UCAR Office of Programs
Boulder, CO

Page 2: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Thank you, Dave

I wish to take this opportunity to extend my sincerest gratitude to Dave Fulker, the founding director of Unidata, for his distinguished service to the Unidata Community for nearly 20 years.

Unidata would not be what it is today without his vision, leadership, energy and his many extraordinary qualities.

And thank you to Ben Domenico for his excellent stewardship during the transition.

Page 3: Unidata 2008: Shaping the Future of Data Use in the Geosciences

The Word of the Day for Jun 23 is:

bloviate \BLOH-vee-ayt\ verb

: to speak or write verbosely and windily

[Courtesy: Jo Hansen, Unidata Program Center]

Example sentence:

Mohan can bloviate on a par with the windiest of professors, but he's also capable of being concise and getting right to the point. (yeah, right)

Page 4: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Expanding Horizons

New Strategic Plan
New Director
New 5-year proposal
Many new and exciting initiatives
New logo!

Page 5: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Unidata

Mission Statement:

Provide data, tools, and community leadership for enhanced Earth-system education and research.

At the Unidata Program Center, we

• Facilitate Data Access
• Provide Tools
• Support Faculty and Staff
• Build and Advocate for a Community (user workshops come under this activity)

Page 6: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Technology Portfolio

1) McIDAS: A client/server analysis and display package, originally developed by U. Wisconsin/SSEC, that emphasizes image processing of data from satellite-borne sensors;

2) GEMPAK: An analysis, display, and product generation package for meteorological data;

3) Local Data Manager: Software for capturing, disseminating, and organizing data in near real time; it is the heart of the Internet Data Distribution (IDD) system;

4) NetCDF: A software interface for platform-independent access to self-describing datasets;

5) Integrated Data Viewer: Java-based, platform-independent data analysis and 3D visualization tools;

6) THREDDS: A project to facilitate remote access to thematic, distributed, interdisciplinary data servers.

Page 7: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Unidata as a Diverse Community

More than 150 sites are participating in the Unidata Internet Data Distribution (IDD) system
• 120 or so of those sites are in academia and the rest in government and research labs

The user community is interdisciplinary: two-thirds of sites have users outside the atmospheric sciences

Page 8: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Internet Data Distribution

[Diagram: data sources (radar, model, satellite) feeding a network of LDM nodes connected over the Internet]

Approximately 2 GB of data injected/hour from distributed sources;

Unidata IDD/LDM uses more of the Internet2 than any other advanced application;

Approx. 5 Terabytes of data transmitted each week. (Amount varies with weather)

By design, the system has no data center.
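
A rough consistency check on the two volume figures above, offered as a minimal sketch. It assumes the injection rate holds steady at 2 GB/hour and that 1 TB = 1000 GB; neither assumption is stated on the slide, and the function names are illustrative only. The ratio of weekly transmitted volume to weekly injected volume estimates how many times, on average, each injected byte is forwarded between sites.

```python
HOURS_PER_WEEK = 24 * 7
GB_PER_TB = 1000.0  # assumption: decimal terabytes


def weekly_injected_gb(injected_gb_per_hour):
    """Volume injected in one week at a constant hourly rate (assumed constant)."""
    return injected_gb_per_hour * HOURS_PER_WEEK


def average_relay_factor(injected_gb_per_hour, transmitted_tb_per_week):
    """Weekly transmitted volume divided by weekly injected volume."""
    transmitted_gb = transmitted_tb_per_week * GB_PER_TB
    return transmitted_gb / weekly_injected_gb(injected_gb_per_hour)


if __name__ == "__main__":
    print(weekly_injected_gb(2.0))                    # GB injected per week
    print(round(average_relay_factor(2.0, 5.0), 1))  # average forwards per byte
```

Under these assumptions about 336 GB is injected per week and each byte is forwarded roughly 15 times, which is consistent with a relay network that has no central data center.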

Page 9: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Proposed WSR-88D Data Flow (NWS Plans)

Page 10: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Education Drivers (a.k.a. A Community-Articulated Need)

Active, student-centered learning

Earth-system science or “holistic” approach to education

Learning science by doing science
• Observations (data)
• Tools (models, visualization)
• Discovery

Page 11: Unidata 2008: Shaping the Future of Data Use in the Geosciences

NSF Director Rita Colwell, 1998: "Interdisciplinary connections are absolutely fundamental. They are synapses in this new capability to look over and beyond the horizon. Interfaces of the sciences are where the excitement will be the most intense... ."

Science Drivers

Grand Challenges in Environmental Sciences (National Research Council)

Page 12: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Multidisciplinary Problems

Fire Danger determination requires taking into account past, present and future weather, fuel types, and the state of both live and dead fuel moisture.
• Dead fuel moisture
• Live fuel moisture (NDVI)
• Drought conditions
• Atmospheric stability
• Lightning maps
• Lightning ignition efficiency
• Airflow
• Recent rainfall
• Rainfall forecast

Page 13: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Dual-Polarization Radar Use in Fire Weather Management

The differential reflectivity (ZDR) values are noteworthy in the smoke signal

Many regions show ZDR > +6 dB.

Suggests flattened ash particles (like corn flakes)

Source: CHILL Radar Group, CSU
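
For a sense of scale, differential reflectivity is conventionally defined as ZDR = 10 log10(Zh/Zv), where Zh and Zv are the horizontally and vertically polarized reflectivities in linear units. That definition is standard in radar meteorology rather than something stated on the slide, and the function names below are illustrative. The sketch converts a ZDR value in dB back into the linear ratio Zh/Zv.

```python
import math


def zdr_db(z_h, z_v):
    """Differential reflectivity in dB from linear-unit reflectivities."""
    return 10.0 * math.log10(z_h / z_v)


def linear_ratio(zdr):
    """Linear ratio Zh/Zv corresponding to a ZDR value given in dB."""
    return 10.0 ** (zdr / 10.0)


if __name__ == "__main__":
    # ZDR of +6 dB, the threshold quoted on the slide
    print(round(linear_ratio(6.0), 2))
```

A ZDR of +6 dB corresponds to a horizontal return about four times the vertical return, which fits the slide's interpretation of flat, horizontally oriented ash particles.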

Page 14: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Flooding due to Tropical Storms

Tropical Storm Allison

Research studies and emergency management of hurricane-induced flooding involve integrating data from atmospheric sciences, oceanography, hydrology, geology, geography, and social sciences.

Page 15: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Multidisciplinary Synthesis

Requires integration of disparate datasets and databases from diverse sources that are distributed geographically and disciplinarily;

Needs integration of Scientific Information Systems with Geographic Information Systems

The integration poses numerous challenges;

However, such integration is critical to solving societal problems and advancing science.

Metadata is crucial to achieving integration

Page 16: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Remote Sensing & Data Explosion

In the next 10 years, about 100 new satellite instruments will be launched to monitor the environment

A five-order-of-magnitude increase in satellite data is expected during that period
• GIFTS (Geostationary Imaging Fourier Transform Spectrometer) will have about 1700 channels and a resolution of 4 km
• Each NPOESS satellite will generate one terabyte of data each day

Advances in radar technology
• 28-fold increase in WSR-88D data volume in 5 years
• Phased-array radars will generate a 100-fold increase in data

By 2004, NOAA will ingest more data in one year than was contained in the total archive in 1998.
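
The growth figures on this slide can be restated as annual rates. The sketch below assumes constant compound growth over each period, which the slide does not claim; it is only a way to compare the numbers on a common footing, and the function name is illustrative.

```python
def annual_growth_factor(total_factor, years):
    """Constant yearly multiplier that compounds to total_factor over the given years."""
    if total_factor <= 0 or years <= 0:
        raise ValueError("total_factor and years must be positive")
    return total_factor ** (1.0 / years)


if __name__ == "__main__":
    # 28-fold increase in WSR-88D data volume over 5 years
    print(round(annual_growth_factor(28.0, 5), 2))
    # Five orders of magnitude in satellite data over 10 years
    print(round(annual_growth_factor(1e5, 10), 2))
```

Under that assumption the WSR-88D volume would nearly double every year, and satellite data would grow by a factor of a little over three per year.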

Page 17: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Advances in Modeling

Shift from a purely deterministic to a more probabilistic approach, requiring the use of ensemble modeling techniques.

Growing emphasis on multidisciplinary studies, requiring coupled models:

• e.g., Hurricane landfall flooding problem: Atmospheric model (WRF/MM5), Ocean model (ROMS), Hydrologic model (HMS)
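
To illustrate the probabilistic approach mentioned above: one common way to turn an ensemble into a probability is to count the fraction of members that exceed a threshold. This is a minimal sketch of that idea, not a method described in the talk; the rainfall values and the 50 mm threshold are hypothetical.

```python
def exceedance_probability(member_values, threshold):
    """Fraction of ensemble members whose forecast value exceeds the threshold."""
    if not member_values:
        raise ValueError("ensemble must contain at least one member")
    hits = sum(1 for value in member_values if value > threshold)
    return hits / len(member_values)


# Hypothetical 24-hour rainfall totals (mm) from an 8-member ensemble
rainfall_mm = [12.0, 48.5, 75.2, 30.1, 102.4, 66.0, 9.8, 81.3]

if __name__ == "__main__":
    print(exceedance_probability(rainfall_mm, 50.0))
```

A single deterministic run would give only one of these eight values; the ensemble instead yields a probability of exceeding the threshold, at the cost of running the model several times.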

Page 18: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Local Modeling: A Notable Trend

Over 30 universities are now running mesoscale models locally.

One can think of this aggregation as a national forecasting instrument

However, only one or two groups are initializing their model runs with local observations

As the scale of these local model runs becomes finer, there is a natural desire to integrate their output with information from other sources (e.g., hydrology, infrastructure, societal datasets in GIS form)

Iowa St. Linux Cluster

Page 19: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Technology Drivers

Object-oriented programming
Open Standards, Interoperability and Open Source Movement
• Metcalfe's Law: the usefulness, or utility, of a network increases as the square of the number of users.
Web services (HTTP, Java, XML, SOAP, UDDI, …)
Digital libraries (Metadata, discovery, information services…)
Grid environments and distributed computing
Commodity microprocessors
Cluster computing
High bandwidth networks: 10GigE, Fast IP, …
Broadband access
Wireless networks: 802.11 networks, GPRS, 3G
IPv6: Next-generation internet protocol
Collaborative computing
Scientific data mining and knowledge discovery
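
Metcalfe's Law, as quoted in the list above, can be written as utility proportional to n squared. The sketch below compares two network sizes under that proportionality; the constant of proportionality cancels in the ratio, and the function name and the site counts in the example are illustrative.

```python
def relative_utility(users_before, users_after):
    """Ratio of network utility after growth to utility before, taking utility ~ n**2."""
    if users_before <= 0 or users_after <= 0:
        raise ValueError("user counts must be positive")
    return (users_after / users_before) ** 2


if __name__ == "__main__":
    # Illustrative: a network growing from 150 to 300 participating sites
    print(relative_utility(150, 300))
```

Under the law, doubling the number of participants quadruples the utility of the network.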

Page 20: Unidata 2008: Shaping the Future of Data Use in the Geosciences

[Diagram: Users; Collections (data, tools, educational materials); Metadata repository; Services]

Web services is a technology and process for discovery and connection.

The Extensible Markup Language, XML, is accepted as THE emerging standard for data interchange on the Web.

XML allows authors to create their own markup, which has led to the proliferation of “MyOwn Markup Language”
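
A small illustration of author-defined markup, using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`observation`, `temperature`, `station`, `units`) and the station identifier are invented for this example and do not come from any real schema. That is the point: any author can define a vocabulary like this, and a reader needs to know the vocabulary to use the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical author-defined vocabulary; none of these names come from a standard.
DOC = """<observation station="STN01" time="2003-06-23T12:00:00Z">
  <temperature units="degC">24.5</temperature>
</observation>"""


def read_temperature(xml_text):
    """Return (station, value, units) from the hypothetical observation markup."""
    root = ET.fromstring(xml_text)
    temperature = root.find("temperature")
    return root.get("station"), float(temperature.text), temperature.get("units")


if __name__ == "__main__":
    print(read_temperature(DOC))
```

A second author could encode the same observation with entirely different tag names, and this reader would fail to find the value, which is the interoperability problem the slide is describing.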

Web Services and the Wild and Wooly World of Markup Languages

Page 21: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Title: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Six endeavors are proposed, focusing on Community and Support Services and Data Services, Systems, and Tools

The proposed endeavors will enable the community to advance scientific exploration, education, and decision-making.

We are moving from an era of data provision to one in which data- and related web-services are emphasized

Five-year Core Funding NSF Proposal

“The unanimous finding of the panel is that the Unidata Program Center program be supported as fully as possible by NSF for the years 2003-2008.”

Page 22: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Proposed Endeavors

Endeavor 1. Responding to a broader and more diverse community.
• Respond to increased emphasis on Earth-system science (e.g., bring new data sets to the community)
• Establish new partnerships with related communities (e.g., with Hydrology via CUAHSI)
• Support new tools in technically less-sophisticated institutions (e.g., community colleges)

Endeavor 2. Comprehensive support services
• Deploy web-based training modules
• Simplify installation and maintenance for all supported packages
• Explore new technologies (e.g., Access Grid) to facilitate remote collaboration

Page 23: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Endeavor 3: Real-time, self-managing data flows

More flexibility and control
• Many more feed types for finer control over routing and subsetting
• Configurable product priorities

Self-managing data flows (automatic dynamic routing)
• Application-level multicast looks promising for hundreds of sites (IP multicast not suitable due to limitations)
• NLDM: data flooding via Usenet protocols may provide a practical routing solution (needs more testing)

Support for new standards
• Use of IP version 6 protocols
• Internet2, Grid and e-services standards (authentication, resource use, ...)
• Location-transparency for data
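
To make the application-level multicast idea concrete: instead of relying on routers, each site forwards data to a small number of downstream sites, so the sites themselves form the distribution tree. The sketch below is a simplified model, not the LDM or NLDM design. It assumes a complete tree in which every site relays to at most `fanout` others, numbers sites in breadth-first order with the source as index 0, and reports how many relay hops the farthest site is from the source.

```python
def relay_parent(index, fanout):
    """Upstream relay of a site in a complete fanout-ary tree; the source (index 0) has none."""
    if index == 0:
        return None
    return (index - 1) // fanout


def hops_from_source(index, fanout):
    """Number of relay hops between the source and the site at the given index."""
    hops = 0
    while index != 0:
        index = relay_parent(index, fanout)
        hops += 1
    return hops


def max_hops(num_sites, fanout):
    """Hops to the farthest site; the last breadth-first index is always on the deepest level."""
    if num_sites < 1 or fanout < 1:
        raise ValueError("num_sites and fanout must be at least 1")
    return hops_from_source(num_sites - 1, fanout)


if __name__ == "__main__":
    # Illustrative: 150 sites, each relaying to at most 4 downstream sites
    print(max_hops(150, 4))
```

In this idealized model, 150 sites with a fan-out of 4 are all within 4 hops of the source, which suggests why the approach scales to hundreds of sites; a real deployment would also have to handle uneven bandwidth and site failures.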

Page 24: Unidata 2008: Shaping the Future of Data Use in the Geosciences

LDM-5 vs. LDM-6 Latencies

CONDUIT Experience

Average delivery time: ~20 seconds to top-tier sites

Page 25: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Endeavor 4. Software to analyze and visualize geoscience data

Integrate diverse datasets

Support analysis and visualization of local and climate modeling efforts

Develop collaborative tools to make effective use of shared visualizations

Allow customized user experiences

Adapt to GIS frameworks

[Figure: Cloud water isosurface from COMMAS storm model data (courtesy Adam Houston and Dan Bramer, NCSA/UIUC)]

Page 26: Unidata 2008: Shaping the Future of Data Use in the Geosciences

THREDDS Middleware

[Diagram labels: People; Documents; Data; Catalog Generation Tools; Analysis and Visualization Tools; Data Services; Discovery and Publication Tools; Discovery and Publication Services; Data Catalog Services]

Page 27: Unidata 2008: Shaping the Future of Data Use in the Geosciences

THREDDS, GIS, DL Interoperability

[Diagram labels: GIS Client Applications; THREDDS Client Applications; OpenGIS Protocols: WMS, WFS, WCS; OGC or proprietary GIS protocols; OGC or OPeNDAP, ADDE, FTP… protocols; GIS Servers (demographic, infrastructure, societal impacts, … datasets); THREDDS Servers (satellite, radar, forecast model output, … datasets); Digital Library Discovery Systems; Metadata crosswalk; Open Archives Initiative (OAI) Metadata Harvesting]

Page 28: Unidata 2008: Shaping the Future of Data Use in the Geosciences

NetCDF-HDF Integration

[Diagram labels. Current Implementation: Application; netCDF; POSIX I/O; File. Proposed Implementation: Application; netCDF; HDF5 (serial and/or parallel); Stream (network or to/from another application); POSIX I/O (file); Split files (metadata, raw data); Custom (user-defined device); MPI-I/O (parallel file system)]

Extend netCDF to high-performance computing environment

Implement parallel I/O, large grids, etc.

Work will directly benefit WRF and CCSM communities

Endeavor 6: Improved data access infrastructure

Page 29: Unidata 2008: Shaping the Future of Data Use in the Geosciences

The Visual Geophysical Exploration Environment (VGEE)

The VGEE is an integrated framework in which students use visualization tools, data, and curricular materials to learn basic physical principles of atmospheric science

It includes:
• A learner interface to the IDV
• Java-based concept models to support physical insight
• A curriculum to guide inquiry
• A catalog of data (THREDDS)

Page 30: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Students notice that the Western Pacific is considerably warmer than the East.

Identify, Relate, Explain, Integrate

VGEE: An Integrated Framework

Concept Models, which are used to explore relations in an idealized context.

Page 31: Unidata 2008: Shaping the Future of Data Use in the Geosciences

Concluding Remarks

We live in an exciting moment in the history of the Earth sciences.

Workshops like this and the diversity of representation from academia are testimony to the vibrancy of the community and the program.

The portfolio of tools and technologies within Unidata, coupled with the energies of a creative and collaborative community, puts us in an ideal position to meet the important challenges facing the education and research communities in the atmospheric and related sciences.