The LEAD Effort at Unidata
The Unidata Seminar will start at 1:30 PM MST
Tom Baltzer, Brian Kelly, Doug Lindholm, Anne Wilson
December 14, 2005
LEAD is funded by the National Science Foundation under the following Cooperative Agreements:
ATM-0331594, ATM-0331591, ATM-0331574, ATM-0331480, ATM-0331579, ATM-0331586, ATM-0331587, ATM-0331578
Outline
1. Setting the Stage: Introduction to LEAD and Unidata’s LEAD Efforts: Anne
2. Application of current technology on the LEAD testbeds: Tom
3. The LEAD Hardware at Unidata: Brian
4. The THREDDS Data Repository: Doug
Setting the Stage: Introduction to LEAD and Unidata’s LEAD Efforts
Anne Wilson
Current IT Barriers to Mesoscale Weather Research and Education
• Data and tools useable mainly by experts
• Researchers and educators constrained by hardware limitations
• Rigid, brittle technology can’t accommodate mesoscale weather research requirements:
– Real time, on demand, dynamic data processing and sensor steering
A Solution: Linked Environments for Atmospheric Discovery (LEAD)
• Funded by NSF Large Information Technology Research (ITR) award
• Produce a web service based, scalable framework for handling meteorological data and model output:
– Identifying, accessing, preparing, assimilating, predicting, managing, analyzing, mining, visualizing
– Independent of data format and physical location
• Dynamically adaptive workflows and steering of sensors
The LEAD Vision
• Data access via querying and browsing
• Analysis and forecast tools that can be composed into workflows
• Workflows and sensors that respond to the weather
• Support users ranging from grade 6 to experienced researchers
LEAD Objectives
• Lower the barrier for entry and increase the sophistication of problems that can be addressed by complex end-to-end weather analysis and forecasting/simulation tools
• Improve our understanding of and ability to detect, analyze and predict mesoscale atmospheric phenomena by interacting with weather in a dynamically adaptive manner
• Result: Paradigm change in how experiments are conceived and performed
LEAD Challenges
Challenges and their requirements:
• Challenge: Disparate, high volume data sets
– Requirements: Efficient transmission; remote subsetting and aggregation; reliable, robust storage; format independence
• Challenge: Huge computational demands, e.g. ensemble forecasting
– Requirements: Distributed, load balanced computations
• Challenge: Use of existing complex numerical models and data assimilation systems
– Requirements: Make existing tools work in a web service environment
• Challenge: Lack of controlled vocabulary
– Requirements: Ontology, dictionary
• Challenge: Support for grades 6 – 12, college, graduate, and advanced research
– Requirements: Robust security, user aids, education modules, meaningful responses
Multidisciplinary Effort
• Meteorology
• Computer Science and Information Technology
• Education and Outreach
LEAD Institutions
> 100 scientists, students, technical staff
LEAD Thrust Groups
• Data*
• Orchestration
• Portal
• Meteorology
• Grid and Web Services Test Bed*
• Education and Outreach Test Bed
*Major Unidata areas
LEAD Data Subsystem
Query Service
Dictionary
Ontology Service
Resource Catalog
myLEAD Catalog
LEAD Data Repository (LDR)
Public Data (e.g. IDD data)
LEAD Portal
Unidata Technology Used in LEAD
• LDM/IDD Data Delivery: near real time data delivery
• THREDDS: catalogs of data and their associated metadata
• Common Data Model (CDM): single interface to multiple data formats
• THREDDS Data Server (TDS): integrated OPeNDAP and HTTP data access
• Integrated Data Viewer (IDV): visualization
• THREDDS Data Repository (TDR): data storage framework
• Decoders
Unidata and LEAD
• Unidata also brings:
– Experience with atmospheric data
– Community of users
– Robust, fielded software
Recent LEAD-Related Efforts
2. Application of current technology on our LEAD testbed: Tom
3. Structure of the LEAD testbed: Brian
4. THREDDS Data Repository: Doug
Goal: Support both LEAD and our community
Application of Current Technologies on the LEAD Testbed Systems
Tom Baltzer
Acronyms for LEAD Tools
ADAS – ARPS Data Assimilation System (Center for Analysis and Prediction of Storms at OU)
ADaM – Algorithm Development and Mining (University of Alabama at Huntsville)
IDV – Integrated Data Viewer (Unidata)
LDM/IDD – Local Data Manager/Internet Data Distribution (Unidata)
OPeNDAP – Open-source Project for a Network Data Access Protocol (OPeNDAP.org)
THREDDS – Thematic Real-time Environmental Distributed Data Services
TDS – THREDDS Data Server
TDR – THREDDS Data Repository (Unidata)
WRF – The Weather Research and Forecasting Model (ARW Core - NCAR)
Also: WS-Eta – Workstation Eta Model
LEAD Testbed Systems
• Testbed systems at several LEAD locations to provide:
– Data
  • Near Real-Time data ingest, storage and access
  • LEAD Data Product storage and access
– Data Processing
  • High Performance Computing
  • Grid and Web Services
• Allow each institution to develop methods by which their capabilities fit into the LEAD effort
• Single Web Portal system at Indiana Univ. to bring it all together and provide the User Interface
LEAD Grid
[Map of LEAD Grid sites: CSU, Unidata, OU, UI, IU, UAH, UNC, MU, HU. Legend categories: Core Academic Partner; Core Academic Partner + Grid Test Bed; Core Academic Partner + Education Test Bed; Core Academic Partner + Grid Test Bed + Education Test Bed]
Data Aspects of LEAD Testbeds
LEAD Testbed Systems
• UPC Technologies being leveraged to facilitate LEAD needs:
– LDM/IDD
– THREDDS
– IDV
– NetCDF Decoders
– OPeNDAP (Unidata supported)
Typical LEAD Testbed (Current Source Data Configuration)
[Diagram components]
• IDD data streams: Forecast Model Output, Weather station observations, Aircraft data, Radar data
• Testbed System: Decoders, THREDDS Catalog, GridFTP, OPeNDAP
• LEAD Grid System
Typical LEAD “Data” Testbed (Future Source Data Configuration)
[Diagram components]
• IDD data streams: Forecast Model Output, Weather station observations, Radar data, Aircraft data
• Testbed System: Decoders, THREDDS Catalog, GridFTP, TDS & TDR, OPeNDAP
• LEAD Grid System
Note: UPC plans ~ 6 month store
LEAD Processing on the Unidata Testbed System
UPC Processing Testbed (Current Configuration)
[Diagram components]
• NCEP NAM (Eta) Forecast
• Precipitation Locator
• Center Lat/Lon
• Unidata LEAD Test Bed: WS-Eta, WRF
• Initial and Boundary Conditions
• Regional Forecasts
• THREDDS Catalog
• OPeNDAP Access
- WRF being Steered by Chiz’s GEMPAK precipitation locator
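The steering idea on this slide, a precipitation locator that chooses the center latitude/longitude of the regional forecast domain, can be sketched roughly as follows. This is an illustration only and not the GEMPAK-based locator used on the testbed: the function names `locate_precip_center` and `regional_domain`, the precipitation-weighted centroid rule, the threshold, and the sample points are all assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical input: (lat, lon, precip_mm) points from a coarse forecast grid.
GridPoint = Tuple[float, float, float]


def locate_precip_center(points: List[GridPoint],
                         threshold_mm: float = 1.0) -> Tuple[float, float]:
    """Return the precipitation-weighted mean lat/lon of the wet points.

    Assumption: a weighted centroid stands in for the real locator's rule.
    The simple mean of longitudes is not valid across the date line.
    """
    wet = [(lat, lon, p) for lat, lon, p in points if p >= threshold_mm]
    if not wet:
        raise ValueError("no precipitation at or above the threshold")
    total = sum(p for _, _, p in wet)
    center_lat = sum(lat * p for lat, _, p in wet) / total
    center_lon = sum(lon * p for _, lon, p in wet) / total
    return center_lat, center_lon


def regional_domain(center: Tuple[float, float],
                    half_width_deg: float = 5.0) -> Dict[str, float]:
    """Build a square lat/lon box around the center for the regional model."""
    lat, lon = center
    return {
        "south": lat - half_width_deg,
        "north": lat + half_width_deg,
        "west": lon - half_width_deg,
        "east": lon + half_width_deg,
    }


# Invented sample data: two wet points and one below the threshold.
sample = [(40.0, -105.0, 2.0), (42.0, -103.0, 6.0), (35.0, -100.0, 0.2)]
center = locate_precip_center(sample)
domain = regional_domain(center)
```

In a setup like the one pictured, the resulting box would be handed to the regional model as its domain; how WRF or WS-Eta actually receive that value is not covered by this sketch.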
Next Steps
[Diagram components]
• NCEP NAM (Eta) Forecast
• Precipitation Locator
• Center Lat/Lon
• Unidata LEAD Test Bed: WS-Eta, WRF
• Boundary Conditions
• CAPS ADAS Assimilation
• Initial Conditions
• Millersville ADaM Precip Locator
• Regional Forecasts
• THREDDS Catalog
• OPeNDAP Access
Longer Term
[Diagram components]
• NCEP NAM (Eta) Forecast
• Precipitation Locator
• Center Lat/Lon
• Unidata LEAD Test Bed: WS-Eta, WRF, ADAS, ADaM
• Boundary Conditions
• IDD Datasets: Radar, Surface & Upper air, Satellite, NCEP NAM
• Regional Forecasts
• THREDDS Catalog
• OPeNDAP Access
Ultimately
[Diagram components]
• NCEP NAM (Eta) Forecast
• Precipitation Locator
• Center Lat/Lon
• Unidata LEAD Test Bed: WS-Eta, Web Service WRF, Web Service ADAS, Web Service ADaM
• Boundary Conditions
• IDD Datasets: Radar, Surface & Upper air, Satellite, NCEP NAM
• Regional Forecasts
• THREDDS Catalog
• OPeNDAP Access
• LEAD Grid System
Objectives for UPC Testbed
• Testing ground for integrating new UPC and LEAD technologies
• Determining ways to bring LEAD Technologies to the Unidata Community
• “Operational” environment for LEAD
• Processing cluster
• Data Storage
– ~6 months of IDD data
– LEAD product data
The LEAD Hardware at Unidata
Brian Kelly
Existing LEAD Infrastructure
• Lead1: GRID Server, Development Tools, NFS Server, Cluster Node
• Lead2: GRID Server, NFS Server, Cluster Node, Cluster Monitoring
• Lead3: HTTP Server, THREDDS Server, OPeNDAP Server, LDM Node, NFS Server, Cluster Node
• Lead4: TDS, LDM Node, NFS Server, Cluster Node
• LeadStor: 8 TB of Disk, NFS Server
UCAR/Unidata LEAD Infrastructure
• 40 TB Storage Cluster
• ~30 GFLOP Processing Cluster
• Portal Servers for Web, TDS, Grid and LDM Services
LEAD Portal Systems
• Processing Cluster Head Node
• HTTP, TDS and Grid Server
• LDM Server
• Test Server
• Gigabit Network for NFS Storage Access
• Storage Cluster Gateway
• Beowulf Cluster Connected by a Gigabit Fibre Network
LEAD Processing Cluster
● Each Node Contains Two Athlon 2400+ CPUs
● Cluster Uses OSCAR with the MPICH MPD
● Eight Nodes Provide ~30 GFLOPS
LEAD Storage Cluster
● LEAD Storage Head Node
● LEAD Storage Gigabit Network
● LEAD Storage Nodes
● One (1) Guanghsing GHI-583 5U Case
– 24 hot swappable SATA trays
– 1000W 2+2 power supply
● One (1) Tyan Thunder K8SD Pro Motherboard
– Dual Opteron CPUs
– Four 64-bit 133/100 MHz PCI-X Slots
– Two Gigabit Ethernet ports
● One (1) AMD Opteron 242 Processor
– 1.6 GHz CPU
● Three (3) Broadcom RAIDCore BC4853
– Eight SATA ports
– Controller spanning
– Advanced RAID
● Twenty-Four (24) Seagate Barracuda ST3400832AS
– 7200 RPM 400GB SATA Drives
LEAD Storage Node
● Twenty-Four (24) 400 GB Drives
● Divided into Two (2) Eleven Column RAID 5 Arrays and Two Hot Spares
● Form Two (2) 4 TB LUNs Using bcraid
● Each Node Publishes the Two LUNs over iSCSI
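The per-node figures on this slide follow from ordinary RAID 5 arithmetic, where one column's worth of space in each array goes to parity. A minimal sketch of that calculation, using decimal gigabytes and a hypothetical helper name `raid5_usable_gb`:

```python
def raid5_usable_gb(columns: int, drive_gb: int) -> int:
    """Usable space of a RAID 5 array: one column's worth of space holds parity."""
    if columns < 3:
        raise ValueError("RAID 5 needs at least three columns")
    return (columns - 1) * drive_gb


# Figures taken from the slide.
DRIVES_PER_NODE = 24
DRIVE_GB = 400
ARRAYS_PER_NODE = 2
COLUMNS_PER_ARRAY = 11

hot_spares = DRIVES_PER_NODE - ARRAYS_PER_NODE * COLUMNS_PER_ARRAY  # drives left over
lun_gb = raid5_usable_gb(COLUMNS_PER_ARRAY, DRIVE_GB)               # one LUN per array
node_usable_gb = ARRAYS_PER_NODE * lun_gb
node_raw_gb = DRIVES_PER_NODE * DRIVE_GB
```

With these inputs each array yields 4000 GB, which matches the two 4 TB LUNs, and the 24 drives leave exactly two hot spares.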
LEAD Storage Gateway
● Mounts Each Node's Two (2) 4 TB LUNs Published via iSCSI
● Builds Two (2) 20 TB 6 Column RAID 5 Meta-devices Using mdadm
● Divides Each Meta-device into Volumes Using LVM
● Each Volume is Formatted with an XFS Filesystem
● Each Filesystem is Published with NFS
Result: 40 TB of mid-performance double-redundant storage
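The 40 TB result and the "double-redundant" label can be checked with a short sketch. It assumes that "double-redundant" refers to RAID 5 being applied twice, once across the drives inside each storage node and once across the LUNs at the gateway; the function name `meta_device_survives` and the failure rule are illustrative assumptions, not a description of how mdadm or bcraid report state.

```python
# Sizes in decimal GB, taken from the slides.
LUN_GB = 4000        # each storage node publishes two 4 TB LUNs
COLUMNS = 6          # each gateway meta-device is a 6 column RAID 5
META_DEVICES = 2

# RAID 5 spends one column on parity, so 5 of the 6 LUNs hold data.
meta_device_gb = (COLUMNS - 1) * LUN_GB
total_gb = META_DEVICES * meta_device_gb


def meta_device_survives(failed_drives_per_lun):
    """Check one meta-device, given the failed-drive count behind each LUN.

    Assumptions: a LUN (itself a RAID 5 array) is lost once two of its drives
    have failed, hot-spare rebuilds are ignored, and the meta-device (also
    RAID 5) tolerates the loss of at most one LUN.
    """
    if len(failed_drives_per_lun) != COLUMNS:
        raise ValueError("expected one entry per LUN")
    lost_luns = sum(1 for failed in failed_drives_per_lun if failed >= 2)
    return lost_luns <= 1
```

Under these assumptions a meta-device keeps its data with one failed drive behind every LUN, or with one LUN lost entirely, but not with two LUNs lost at once.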
THREDDS Data Repository (TDR)
Doug Lindholm
LEAD Architecture: Data Storage Perspective
[Diagram components]
• User
• Portal
• Application Services: Data Mining (ADaM), Visualization (IDV), Data Assimilation (ADAS), Forecast Model (WRF)
• “Atomic” Capabilities: Cataloger (myLEAD), Storage Locator, Data Mover, ID Generator, Name Resolver, Metadata Generator, Metadata Crosswalk
• Data Repository
• LEAD Data Grid: Unidata, NCSA, IU, OU, UAH
THREDDS Data Repository: Component Architecture
THREDDS Data Repository interface: putData(), getData(), discoverData()
Components:
• Storage Locator: locate-Storage()
• Data Mover: move-Data()
• Unique ID Generator: generate-UniqueID()
• Name Resolver: mapID-ToURL()
• Metadata Generator: generate-Metadata()
• Metadata Crosswalk: translate-Metadata()
• Cataloger: catalog-Metadata()
Data Storage
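One way to picture how the repository's three public operations could sit on top of the pluggable components is the in-memory sketch below. It is not the TDR implementation: the class name `RepositorySketch`, the Python method names, the order of steps inside `put_data`, and the `mem://` location are all assumptions made for illustration.

```python
import itertools
from typing import Callable, Dict, List

Metadata = Dict[str, str]


class RepositorySketch:
    """Hypothetical in-memory stand-in for the component architecture."""

    def __init__(self,
                 locate_storage: Callable[[str], str],
                 generate_metadata: Callable[[str, bytes], Metadata],
                 translate_metadata: Callable[[Metadata], Metadata]) -> None:
        # Pluggable components, passed in so they can be swapped.
        self.locate_storage = locate_storage
        self.generate_metadata = generate_metadata
        self.translate_metadata = translate_metadata
        self._counter = itertools.count(1)
        self._storage: Dict[str, bytes] = {}      # URL -> payload (Data Storage)
        self._id_to_url: Dict[str, str] = {}      # Name Resolver table
        self._catalog: Dict[str, Metadata] = {}   # Cataloger contents

    def put_data(self, name: str, payload: bytes) -> str:
        base = self.locate_storage(name)          # Storage Locator
        url = f"{base}/{name}"
        self._storage[url] = payload              # Data Mover
        uid = f"id-{next(self._counter)}"         # Unique ID Generator
        self._id_to_url[uid] = url                # Name Resolver
        native = self.generate_metadata(name, payload)       # Metadata Generator
        self._catalog[uid] = self.translate_metadata(native)  # Crosswalk, Cataloger
        return uid

    def get_data(self, uid: str) -> bytes:
        return self._storage[self._id_to_url[uid]]

    def discover_data(self, key: str, value: str) -> List[str]:
        return [uid for uid, meta in self._catalog.items()
                if meta.get(key) == value]


# Usage with trivial stand-in components.
repo = RepositorySketch(
    locate_storage=lambda name: "mem://store-a",
    generate_metadata=lambda name, payload: {"name": name, "size": str(len(payload))},
    translate_metadata=lambda meta: {f"lead:{k}": v for k, v in meta.items()},
)
uid = repo.put_data("wrf_out.nc", b"\x00" * 16)
```

Passing the components in as callables mirrors the point of the slide: the repository logic stays the same while individual capabilities are replaced.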
THREDDS Data Repository: Component Architecture (LEAD Configuration)
THREDDS Data Repository interface: putData(), getData(), discoverData()
Components:
• Resource Broker: locate-Storage()
• trebuchet: move-Data()
• Unique ID Generator: generate-UniqueID()
• RLS: mapID-ToURL()
• THREDDS Metadata Generator: generate-Metadata()
• THREDDS to LEAD Crosswalk: translate-Metadata()
• myLEAD: catalog-Metadata()
Data Storage
THREDDS Data Repository: Component Architecture (Alternate Configuration)
THREDDS Data Repository interface: putData(), getData(), discoverData()
Components:
• Storage Locator: locate-Storage()
• Data Mover: move-Data()
• (no component shown): generate-UniqueID()
• (no component shown): mapID-ToURL()
• THREDDS Metadata Generator: generate-Metadata()
• (no component shown): translate-Metadata()
• THREDDS Catalog: catalog-Metadata()
Data Storage
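The LEAD and alternate configurations can be read as two different assignments of implementations to the same set of component roles. The sketch below records them as plain data. The implementation names come from the slides, but the dictionary layout, the helper `unfilled_roles`, and the reading that the alternate configuration leaves three roles without a named component are assumptions.

```python
from typing import Dict, List

# Component roles from the generic architecture slide.
ROLES: List[str] = [
    "Storage Locator",
    "Data Mover",
    "Unique ID Generator",
    "Name Resolver",
    "Metadata Generator",
    "Metadata Crosswalk",
    "Cataloger",
]

# Role -> implementation named on the LEAD Configuration slide.
LEAD_CONFIG: Dict[str, str] = {
    "Storage Locator": "Resource Broker",
    "Data Mover": "trebuchet",
    "Unique ID Generator": "Unique ID Generator",
    "Name Resolver": "RLS",
    "Metadata Generator": "THREDDS Metadata Generator",
    "Metadata Crosswalk": "THREDDS to LEAD Crosswalk",
    "Cataloger": "myLEAD",
}

# Role -> implementation named on the Alternate Configuration slide
# (roles with no named component are simply omitted here).
ALTERNATE_CONFIG: Dict[str, str] = {
    "Storage Locator": "Storage Locator",
    "Data Mover": "Data Mover",
    "Metadata Generator": "THREDDS Metadata Generator",
    "Cataloger": "THREDDS Catalog",
}


def unfilled_roles(config: Dict[str, str]) -> List[str]:
    """List the roles, in architecture order, that a configuration leaves empty."""
    return [role for role in ROLES if role not in config]
```

Calling `unfilled_roles(ALTERNATE_CONFIG)` shows which capabilities a site outside LEAD would run without, or would need to supply itself.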
Unidata Architecture
[Diagram components]
• Internet Data Distribution (IDD)
• Local Data Manager (LDM)
• Locally Generated Data
• Data Storage
• Common Data Model (CDM)
• THREDDS Catalog
• THREDDS Data Server (TDS)
• THREDDS Data Repository (TDR)
• THREDDS Client API
• Application (e.g. IDV)
• Service
• E-mail
• Labeled interactions: discover, access, store, notify
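The "discover" and "access" interactions in this architecture can be illustrated with a small catalog-reading sketch. A client reads a THREDDS catalog, finds the datasets that carry a `urlPath`, and combines each one with the service `base` to form an access URL. The sample catalog, dataset names, server name, and the function `discover` are invented for illustration; only the general catalog layout (a `service` element with a `base`, and `dataset` elements with a `urlPath`) follows the THREDDS catalog format.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Namespace of THREDDS InvCatalog 1.0 documents.
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

# Invented sample catalog with one OPeNDAP service and two datasets.
SAMPLE_CATALOG = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="Sample">
  <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="Model output">
    <dataset name="NAM run A" urlPath="model/nam/run_a.nc"/>
    <dataset name="NAM run B" urlPath="model/nam/run_b.nc"/>
  </dataset>
</catalog>
"""


def discover(catalog_xml: str, server: str) -> Dict[str, str]:
    """Map dataset names to access URLs (server + service base + urlPath).

    Simplification: assumes a single top-level service applies to every dataset.
    """
    root = ET.fromstring(catalog_xml.strip())
    service = root.find("t:service", NS)
    if service is None:
        raise ValueError("catalog defines no service")
    base = service.get("base", "")
    found: Dict[str, str] = {}
    for dataset in root.iter(f"{{{NS['t']}}}dataset"):
        url_path = dataset.get("urlPath")
        if url_path:  # container datasets have no urlPath and are skipped
            found[dataset.get("name", "")] = f"{server}{base}{url_path}"
    return found


# "example.invalid" is a placeholder host, not a real server.
urls = discover(SAMPLE_CATALOG, "http://example.invalid")
```

A real catalog can define several services and inherited metadata, so a production client would resolve the service per dataset rather than take the first one.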
Questions?