NOAA Environmental Data Management Committee
(EDMC) Report to DAARWG
Jeff de La Beaujardière, PhD
EDMC Chair
[email protected] +1 301-713-7175
Vision
All NOAA data will be:
• Discoverable
• Accessible
• Documented
• Preserved
for all types of users and applications.
Past: EDMC Milestones
o 4 years ago: Predecessor committee in existence (DMC)
  Focus on data preservation. Archive Procedure issued
  No comprehensive framework for Data Mgmt
  Reporting to CIO Council only
o 2 years ago: NOAA Admin. Order (NAO) 212-15 issued (2010 Nov 04)
  EDMC established, reporting to CIO Council & NOAA Obs Syst Council
o 1 year ago: 3 procedural directives issued (2011 Oct)
  • Data Management Planning
  • Data Documentation
  • Data Sharing by Grantees
  Active implementation & monitoring begins
o 7 months ago:
  Jeff DLB becomes EDMC chair
  DM Dashboard concept developed
  NOAA EDM Framework begun
Present: Status of EDMC Directives
Data Management Planning:
• 6 plans submitted to DM Plan Repository
  – https://geo-ide.noaa.gov/wiki/index.php?title=Category:Data_Management_Plans
• Positive feedback regarding usefulness of DM Plan template
• Working on metrics for scoring plan completeness
• EDMC prepared to assist ITRB/CIO in reviewing DM Plans
Data Sharing by NOAA Grantees:
• Grants Mgmt Div now requiring data sharing language on new FFOs
  – Recent: Joint Ctr for Satellite Data Assim, IOOS Regional Assoc.
Data Documentation:
• EDMC directive mentioned favorably in GAO Geospatial Info Report
• NGDC software analyzing metadata. Gradual improvement seen.
• NCDDC/NODC metadata training sessions well attended.
Archive Procedure:
• Issued 2008 but not actively promoted; now including in EDM Framework.
Data Citation: in preparation
Data Access: in preparation
Future: EDMC Work Plan for FY 2013
Finalize SAB Actions (due Q2)
Respond to SAB follow-up, if any
Perform EDM Assessment of Observing Systems of Record
Continue monitoring and implementation of approved PDs
Begin assigning persistent identifiers (DOIs) to datasets
Finish and issue Data Citation PD
Finish and issue Data Access PD
Implement prototype EDM Dashboard
Host the 2013 EDM Conference
Other:
  Review old NAO 216-101 “Ocean Data Acquisition” (1990)
  Support ITRB review of DM Plans as needed
  Augment liaison to EA Cmtee, GIS Cmtee, OSC
Response to SAB Action: External Data Usage Best Practice
Jeff de La Beaujardière, PhD
NOAA Data Management [email protected]
+1 301-713-7175
SAB Action 1: Develop policy on use of external data
o Action: Respond to SAB report which states, "A timely NOAA policy for the use of external data could improve NOAA’s data activities and serve as a model for wider collaborative adoption by partners."
o Status:
  Started from DAARWG White Paper (2011)
  Draft version 0.1 sent to EDMC in Aug 2012
    • 53 comments received
  Draft v0.2 circulated to EDMC and original reviewers in Sept
    • 31 new comments
  Draft v0.3 reviewed by NOSC & CIO Council in Oct
    • Minor comments
  Next step: NOAA Executive Council (NEC) review
  Final version to be presented to SAB in Mar 2013
External Data Usage Best Practice Summary
o Rationale
  If NOAA uses external data that are inaccurate, unreliable, or not within NOAA's legal right to use, then NOAA’s credibility or reliability may be damaged
o Scope
  Use of environmental or socio-economic data from non-NOAA sources
  …especially if:
    • NOAA imprimatur applied
    • May affect life, property, or highly influential scientific assessments
o Best Practice
  Answer questions in Worksheet
  …to satisfaction of responsible NOAA official
External Data Usage Best Practice Worksheet
o Worksheet: series of questions regarding
  Purpose
  Quality
  Reliability
  Terms of Use
  Life-Cycle Cost
  IT Systems & Security
  Metadata
  Accessibility
  Archiving
  Formal Agreements
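As an illustration only, the worksheet can be pictured as a checklist that stays open until the responsible NOAA official is satisfied on every topic. The sketch below uses the ten topic names from the slide; the class name `WorksheetReview`, its methods, and the dataset name are hypothetical and are not part of the Best Practice itself.

```python
from dataclasses import dataclass, field

# Topic names are taken from the worksheet slide; everything else is hypothetical.
WORKSHEET_TOPICS = [
    "Purpose", "Quality", "Reliability", "Terms of Use", "Life-Cycle Cost",
    "IT Systems & Security", "Metadata", "Accessibility", "Archiving",
    "Formal Agreements",
]


@dataclass
class WorksheetReview:
    """Tracks which worksheet topics have been answered satisfactorily."""
    dataset_name: str
    # topic -> True once the responsible official accepts the answer
    satisfied: dict = field(default_factory=dict)

    def mark_satisfied(self, topic: str) -> None:
        if topic not in WORKSHEET_TOPICS:
            raise ValueError(f"unknown worksheet topic: {topic}")
        self.satisfied[topic] = True

    def open_topics(self) -> list:
        """Topics still awaiting a satisfactory answer, in worksheet order."""
        return [t for t in WORKSHEET_TOPICS if not self.satisfied.get(t, False)]

    def is_complete(self) -> bool:
        return not self.open_topics()


review = WorksheetReview("Example external dataset")
review.mark_satisfied("Purpose")
review.mark_satisfied("Terms of Use")
print(review.open_topics())
```

A real review would record the answers themselves, not just a flag per topic; the flag is enough to show the "answer to the satisfaction of the responsible official" gate.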
Response to SAB Action: Environmental Data Management Framework
Jeff de La Beaujardière, PhD
NOAA Data Management [email protected]
+1 301-713-7175
SAB Action and Status - NOAA DM Framework
o Action: Respond to SAB report regarding the "urgent need to establish a NOAA-wide Environmental Data Management Framework"
  An EDM Framework means "the organization and governance structure (i.e., roles and responsibilities, policies, procedures), data management principles and practices, technical standards, etc. needed to manage the life-cycle of environmental data across the NOAA enterprise."
o Status:
  Draft v0.1 circulated for comment to EDMC in Sept.
    • 86 comments received
  Draft v0.2 now re-circulating at committee level: EDMC, OSC, EAC, GISC
  Will circulate v0.3 to NOSC & CIO Council
Data Management Framework
[Diagram: NOAA Environmental Data Management Framework. Components shown: Principles, Governance, Standards, Architecture, Assessment, Resources. Repeated label: Data Lifecycle]
o Purpose: To organize, guide and support NOAA environmental data management activities.
o Mandate: SAB recommendation to NOAA.
Data Management Framework
Principles
o Full and Open Access
  Except in special cases
o Data Preservation
o Information Quality
o Ease of Use
Data Management Framework
Governance
o Agency bodies
  EDMC, CIO Council, NOSC, SAB, DAARWG, EA, OSC, GIS
o NOAA policies & docs
  NGSP, NAO 212-15, EDMC PDs, etc.
o National policies & docs
o External coordination
o Enforcement
NOAA DM Governance
[Organization chart; boxes as labeled:]
NEC & NEP: NOAA Executive Council & Panel
CIO Council: Chief Information Officer Council
NOSC: NOAA Observing System Council
SAB: Science Advisory Board
DAARWG: Data Access & Archiving Requirements WG
EDMC: Environmental Data Management Committee
DMIT: Data Management Integration Team
GIS Committee
Enterprise Architecture Committee
Observing Systems Committee
EDMC Procedural Directives
Archive Procedure: What to archive, how to submit to archive.
Data Access: What on-line services to use so your data can be obtained.
Data Citation: Use unique identifiers to allow data to be referenced and tracked.
Data Sharing by NOAA Grantees: State in proposal how you will share data, and share within 2 years.
Data Documentation: How to apply ISO 19115 metadata for discovery, use & understanding.
Data Management Planning PD: Plan, in advance, how you will preserve, document and distribute your data.
Data Management Framework
Resources
• Personnel
  • Training, Recognition
• Budget
  • Project-specific, NOAA-wide
• Other Resources
  • Facilities, Teams, Software, etc.
Data Management Framework
Standards
• Services
• Formats
• Metadata
• Vocabularies
• Data Quality
• IT Security
Data Management Framework
Architecture
• Service-based approach
• Distributed systems approach
• Infrastructure
  • Observing systems
  • Ground systems
  • Archive centers
  • Legacy systems
  • Cloud computing
Service-based, distributed systems approach
[Diagram labels: data providers (Observing System or Forecast Model), each holding Data + Metadata behind an access service; internet; Community Catalog with discovery service; Visualization Service; Transformation Service; Thematic Portal]
Data Management Architecture - Services Viewpoint
[Diagram: Producers and Consumers connected through Data Management Services]
Data Access Services: Request, Subscription, Alert
Discovery Services: Service Registry, Data Catalog
Utility Services: Mapping & Visualization, Format Conversion, Coordinate Transformation, Product Generation
Producer and consumer labels shown: Observing Systems, Models, Scientific Software, Archives, Web Browsers, Workflows, External Catalogs, Web Portals, Data Collection & Processing Centers, Client Libraries, Dedicated Links, Sampling, Decision Support Tools, GIS, Surveys
Cloud Deployment Concepts
[Diagram labels: Master copy of Data; NOAA security boundary; One-way push; Public Cloud; Access services; Discovery services; Public users; Government Cloud; Processing Services; NOAA Internal customers]
Data Management Framework
Assessment
• Current state
  • Observing System DM assessment (planned)
• Progress measurements
  • EDMC Reporting
  • DM Dashboard (planned)
• Feedback from users & implementers (desired)
Observing System of Record EDM Assessment
o 100+ observing systems providing input to key NOAA products
o For each system (or each dataset):
  Archive location?
  Metadata completeness & format?
  Data access service & format?
  Dataset identifier(s) assigned?
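One way to picture the per-system questions is as a record with one field per question, plus a helper that reports which questions are still unanswered. This is a minimal sketch, not the assessment instrument itself: the class name `SystemAssessment`, its field names, the 0.0 to 1.0 completeness scale, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemAssessment:
    """Hypothetical record holding answers to the four assessment questions."""
    system_name: str
    archive_location: Optional[str] = None
    metadata_format: Optional[str] = None
    metadata_completeness: Optional[float] = None  # assumed scale: 0.0 to 1.0
    access_service: Optional[str] = None
    access_format: Optional[str] = None
    dataset_identifiers: tuple = ()

    def gaps(self) -> list:
        """Names of the questions that have no answer yet."""
        checks = {
            "archive location": self.archive_location,
            "metadata format": self.metadata_format,
            "metadata completeness": self.metadata_completeness,
            "data access service": self.access_service,
            "data access format": self.access_format,
            "dataset identifier": self.dataset_identifiers,
        }
        # None, empty string, and empty tuple all count as "not answered";
        # a completeness score of 0.0 is a valid answer and is not a gap.
        return [name for name, value in checks.items() if value in (None, "", ())]


example = SystemAssessment(
    "Example buoy network",          # placeholder name, not a real system
    archive_location="Example archive center",
    metadata_format="ISO 19115",
    metadata_completeness=0.5,
)
print(example.gaps())
```

Collecting one such record per observing system would give the raw material for the dashboard-style roll-ups discussed elsewhere in this deck.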
Mockup: DM Dashboard (Last Update: yyyy-mm-dd)
[Mockup figure; panels and labels as shown:]
Accessibility
  % Records with Data Access Service Link: NOAA 46%, Line Ofc 1 25%, Line Ofc 2 80%, Line Ofc 3 50%
  Data Access Service Types Offered (#): WMS, WCS, DAP, Esri, WFS, None
Documentation
  Metadata Dialects Used (% of Records): ISO, FGDC, OBIS, DC, Free Text, None
  Metadata Completeness Scores (Mean, σ, Min, Max): NOAA, Line Ofc 1, Line Ofc 2
Historic Trend: % versus T
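The two numeric panels of the mockup could be computed from a list of metadata records along the following lines. This is a sketch under stated assumptions: the record layout (a dict with an `access_url` key), the function names, and the use of the population standard deviation for σ are illustrative choices, not the dashboard's actual design.

```python
from statistics import mean, pstdev


def percent_with_access_link(records):
    """Percentage of metadata records that carry a data access service link."""
    if not records:
        return 0.0
    linked = sum(1 for r in records if r.get("access_url"))
    return 100.0 * linked / len(records)


def completeness_summary(scores):
    """Mean, sigma (population std. dev., an assumption), min and max."""
    return {
        "mean": mean(scores),
        "sigma": pstdev(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Made-up records for one hypothetical line office; URLs are placeholders.
records = [
    {"id": "rec-1", "access_url": "https://example.invalid/wms"},
    {"id": "rec-2", "access_url": None},
    {"id": "rec-3"},
    {"id": "rec-4", "access_url": ""},
]
print(percent_with_access_link(records))
print(completeness_summary([10, 20, 30]))
```

Running the same two functions per line office and once over all records would fill the "NOAA" and "Line Ofc" rows of the mockup.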
Data Lifecycle Overview
[Diagram: Data Lifecycle, comprising Planning and Production Activities, Data Management Activities, and Usage Activities]
Data Lifecycle Activities
Planning and Production Activities: Requirements Definition, Planning, Development, Deployment, Operations
Data Management Activities: Collection, Processing, Quality Control, Documentation, Cataloging, Dissemination, Preservation, Stewardship, Usage Tracking, Final Disposition
Usage Activities: Discovery, Reception, Understanding, Analysis, New Product Generation, User Feedback, Citation, Tagging, Gap Assessment
Applicability of EDMC Directives
[Diagram: the Data Lifecycle activity chart (Planning and Production, Data Management, and Usage Activities) overlaid with these directives: Data Documentation, DM Planning, Data Sharing by Grantees, Archive Procedure, Data Citation, Data Access, External Data Usage BP]
Backup Slides
Data Management Concept of Operations
[Diagram (not all activities illustrated).
Nodes: Data Producer, Data User, NOAA Leadership, Requirements, DM Plan*, Data, Documentation* (Metadata), Archive Procedure*, Archive, Archive ConOps, OAIS Reference Model, Data Access Service*, Catalog Service, ID*, Dashboard, Tools, Result (product, forecast, paper, decision, policy, response).
Arrow labels: generate, write, perform, preserve, guide, bind, publish, understand, get, find, register, measure, create, cite, assess, refine, use.]
*Subject of EDMC Procedural Directive
NOAA Data Discovery Vision
NOAA data cataloging
Find NOAA data by:
• Location
• Time
• Phenomenon
• Other characteristics
without knowing NOAA organization
[Diagram labels:]
Data Sources: Satellite, Radar, Buoy, Ship, Sonar, Fish, Gauge, Chart, Model, Enforcement, Marine Mammals, ROV/UAV, NOAA Labs, Surveys
Local Catalogs & Metadata Folders: NCDC, NODC, IOOS, UAF, NGDC, others...
Data Access Services
External Catalogs: GEO, data.gov, GCMD
Client Tools: Desktop GIS, Scientific Software, GeoPlatform Map Viewer
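To make "find by location, time, and phenomenon" concrete, the sketch below filters a small in-memory catalog on those three criteria. It is an assumption-laden illustration: `CatalogRecord`, `find`, the bounding-box convention, and the sample records are hypothetical, and a real catalog would sit behind a discovery service rather than a Python list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CatalogRecord:
    """Hypothetical minimal metadata record used for discovery."""
    title: str
    bbox: tuple      # (west, south, east, north) in degrees
    start: date
    end: date
    keywords: tuple


def bbox_overlaps(a, b):
    """True if two boxes intersect (ignores antimeridian crossing)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def find(records, bbox=None, start=None, end=None, phenomenon=None):
    """Return records matching every criterion that was supplied."""
    hits = []
    for r in records:
        if bbox is not None and not bbox_overlaps(r.bbox, bbox):
            continue
        if start is not None and r.end < start:
            continue
        if end is not None and r.start > end:
            continue
        if phenomenon is not None and phenomenon.lower() not in (
            k.lower() for k in r.keywords
        ):
            continue
        hits.append(r)
    return hits


# Made-up records; note that no NOAA organization name is needed to search.
catalog = [
    CatalogRecord("Example A", (-80, 25, -70, 35), date(2010, 1, 1),
                  date(2010, 12, 31), ("sea surface temperature",)),
    CatalogRecord("Example B", (100, 0, 110, 10), date(2012, 1, 1),
                  date(2012, 6, 30), ("wind speed",)),
]
for rec in find(catalog, bbox=(-75, 30, -60, 40), phenomenon="Sea Surface Temperature"):
    print(rec.title)
```

The search criteria describe the data itself, which is the point of the slide: the user never has to know which line office or data center holds the record.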
Data Lifecycle Activities
Feedback and Support Interactions
[Diagram: the Data Lifecycle activity chart (Planning and Production, Data Management, and Usage Activities) with feedback and support interactions among the activities]
Tagging Concept
[Diagram labels: Primary NOAA Data Catalog(s) containing metadata records; Tag Database with tags DWH, data.gov, GEOSS CORE, GEOSS StP, "next crisis", "and the next"; Topical Catalogs or Portals: DWH Response, data.gov, GEOSS Data CORE, "the next crisis"]
Tags are not inserted into metadata records by data providers. Instead, the Catalog adds tags to indicate datasets relevant to a particular purpose.
Datasets with a relevant tag are recorded by external catalogs.
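The separation between metadata records and tags can be sketched as two independent stores inside the catalog: providers register records, and the catalog alone maintains a tag index that points at record identifiers. The class `TaggingCatalog`, its method names, and the record identifiers below are hypothetical; only the tag names ("DWH", "data.gov") come from the slide.

```python
class TaggingCatalog:
    """Hypothetical catalog that keeps tags outside the metadata records."""

    def __init__(self):
        self._records = {}  # record_id -> metadata record (untouched by tagging)
        self._tags = {}     # tag -> set of record_ids (the separate tag database)

    def register(self, record_id, metadata):
        """Called on behalf of a data provider; stores a copy of the record."""
        self._records[record_id] = dict(metadata)

    def tag(self, record_id, tag):
        """Called by the catalog operator, not by the data provider."""
        if record_id not in self._records:
            raise KeyError(record_id)
        self._tags.setdefault(tag, set()).add(record_id)

    def records_for(self, tag):
        """Records a topical catalog or portal would pick up for this tag."""
        return [self._records[i] for i in sorted(self._tags.get(tag, ()))]


catalog = TaggingCatalog()
catalog.register("rec-1", {"title": "Example dataset A"})
catalog.register("rec-2", {"title": "Example dataset B"})
catalog.tag("rec-1", "DWH")
catalog.tag("rec-1", "data.gov")
catalog.tag("rec-2", "data.gov")
print(catalog.records_for("DWH"))
```

Because the tag index is separate, a new tag for "the next crisis" can be added or withdrawn without asking any provider to edit and resubmit metadata.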