25
1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright 2009 California Institute of Technology Government sponsorship acknowledged

1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Embed Size (px)

Citation preview

Page 1: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

1

- A View from the Field -

The Next Generation Data Standards For the PDS

- PDS4 -

ESIP Federation Meeting

July 8, 2009

J. Steven HughesJPL

Copyright 2009 California Institute of Technology

Government sponsorship acknowledged

Page 2: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS Mission and Vision Statement• Mission: The mission of the Planetary Data System is to

facilitate achievement of NASA’s planetary science goals by efficiently collecting, archiving, and making accessible digital data produced by or relevant to NASA’s planetary missions, research programs, and data analysis programs

• Vision:• To gather and preserve the data obtained from

exploration of the Solar System by the U.S. and other nations

• To facilitate new and exciting discoveries by providing access to and ensuring usability of those data to the worldwide community

• To inspire the public through availability and distribution of the body of knowledge reflected in the PDS data collection

• PDS is a federation of heterogeneous nodes including science and support nodes

Page 3: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

About the Planetary Data System• NASA’s official archive for Planetary Science Data

• Missions are required to archive data with PDS• Data is peer reviewed as part of the archiving of data• PDS works to support planetary science R&A

• Federation of nodes• Science discipline nodes provide scientific and data

management expertise• Central engineering (JPL) addresses PDS-wide software and

standards• Management node (GSFC) provides management support to

PDS• Governed by the PDS Management Council which is formed

from the node managers

• Standard archiving processes and tools• PDS supports diversity of data types but promotes a

homogeneous architecture for archives

Page 4: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS Engineering Challenges

• All data online and distributed across the PDS enterprise• Single point of access to all data

through an integrated infrastructure

• Substantial increases in data volumes

• International coordination and interoperability• Shared standards• Sharing of science products• Satisfying US data sharing

regulations

• Supporting diverse user needs

• Replacing aging technology, tools and processes

Archive Volume Growth

0

10

20

30

40

50

60

70

80

90

1990 1992 1994 1996 1998 2000 2002 2004 2006 2008

Year

TB

(A

ccu

m)

TBytes

Page 5: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Why is PDS4 Necessary?

• The PDS data standards were developed in the late 1980’s to define the concepts and terms needed for archiving science data in the planetary science domain.

• Even though the data standards were innovative for their time, ambiguity and many assumptions have crept in over almost two decades of use.

• This situation has caused significant problems for PDS operations, data providers, and end-users.

• In 2006 the International Planetary Data Alliance (IPDA) was formed.

• Charter stated that the PDS data standards, as the de-facto planetary standards, should reviewed and the core elements be adopted.

• The PDS4 task was initiated to formalize the PDS data standards and address the issues.

Page 6: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS4 Key Goals

• Enable a stable and usable long-term archive.

• Enable more efficient archive preparation for data providers.

• Enable services for the data consumer to find the specific data they need and provide the formats they require.

Page 7: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS4 Design Principles

• The data model:• is defined in a formal language• is independent of implementation• defines a few fundamental data structures that do

not evolve over time• is extensible enabling it to handle more complex

data formats

• The archive data formats is being designed independent of data provider and data consumer formats.

• The data architecture shall include a standard data dictionary model.

Page 8: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS 2010 Architecture

8

Page 9: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS4 Data Product Model Components

Registry Object &Web Resource

Product

Description Combinations

Data Object Description

Structure

Data

Classification

Page 10: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Data Product Concept Map

10

Web & Registry View

Basic I/O View

Programmer View

User/Designers View

Page 11: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

The Model Design Process

• Master model constrains data design and defines validation.

•Master model and data dictionary are tightly coupled.

•Document writer translates modeling information into target languages based on grammar and munging rules.

•Updates to master model are reflectedquickly in the documents.

Page 12: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS4 Data Standards Deliverables• PDS4 Information Model

• The Information Model defines PDS object classes. This includes data structures, formats, and products as well as data sets, documents, missions.

• PDS4 Data Dictionary Model• The Data Dictionary Model provides the schema for the PDS data

dictionary. The data dictionary documents the data elements used in the PDS4 Information Model.

• PDS Standards Reference V4.0• The PDS Standards Reference V4.0 will be written in the format

and tone of a standards reference document.

• Grammar Options• The Grammar is used to capture PDS archive metadata for product

labels.

12

Page 13: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Benefits of the PDS4 Data Model

• The data model is managed in a data modeling tool.• The model is formally defined.• The model can be validated and tested.

• Defines a few simple fundamental data structures.• Fundamental data structures may be extended and

combined to form more complex data formats• The overall architecture is model driven.

• Disentangles the model from its implementation.• Model can evolve over time as research domain changes.• Drives the generation of documentation, label schema, and

other model dependent artifacts.• The data dictionary uses a standard data dictionary

model.

Page 14: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Backup

14

Page 15: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS4 Product Label Creation

Label

15

Model

Generic Label

Schema

Design

PopulateExtract

SpecificLabel

Schema

Mission Specific

Data DictionaryIngest/

Update

IngestLabel

Product Labels

ModelingTool A

Design

Data Engineer

ModelingTool B

Load

Edit

Validate

Generate

Current using editorCurrent using Oxygen/Future Design ToolProposed

Ingest

DDWG

Page 16: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Topics

• The View from the Field• What are other disciplines and agencies

doing for preservation/stewardship?• How do they deal with databases,

collections of files, physical objects, ad-hoc services such as work flows?

• How do they deal with provenance?• Any lessons to be learned and incorporated

into earth science practice

Copyright 2009 California Institute of Technology

Government sponsorship acknowledged

Page 17: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS4 Data Design Accomplishments - General

• Project and Project Members Defined • Principles and Drivers updated• General Data Model (Draft)• Product Data Model (Draft)• Data Dictionary Model (Draft)• Grammar Options• PDS Standards Reference (Outline)• Data Dictionary Model (Final)• Grammar Decision • PDS Standards Reference (Draft)• PDS and community wide review• General Data Model (Final)• Product Data Model (Final)• PDS Standards Reference (Final)

17

Done

SubstantialProgress

Next 3 Months

Due 9/31/10

Page 18: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Proposed IPDA Data Standards Project

18

• Identify the core elements of the PDS4 data standards

• Develop a process for maintaining alignment between the IPDA and the PDS

UniqueNamespace

Page 19: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Charter for the IPDA Data Architecture Standards

19

“The data standards within the IPDA, including the data models and derived dictionaries, are based on the NASA Planetary Data System (PDS) standard that is the de-facto standard for all planetary data at the time of the IPDA founding”. Charter of the International Planetary Data Alliance, 3rd Draft, May 24, 2007

Page 20: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS4 Data Design Accomplishments - Details

20

• Four basic data structures•Homogeneous N-dimensional array of scalars – Array_Base•Heterogeneous repeating record structure of scalars – Table_Base•Unencoded byte stream•Encoded byte stream

• Identifiable• Digital Product

• Data Product• Product_Image_Grayscale• …Image_3D• …Movie• …Table_Character• …Table_Binary• …Table_Binary_Grouped

• Document Set• Software_Set

• Non-Digital Product• Mission• Instrument• Resource

• Data Type• Binary Data Type

• Decimal Integer• …

• Character Literals• Character Integer• …

• Array_Base• Array_2D

• Image_Grayscale• Spectrum_2D

• Array_3D• Image_3D• Movie

• Table_Base• Table_Character• Table_Binary• Table_Binary_Grouped

• Draft PDS Data Dictionary Model

• Grammar – ODL+, PVL, and XML Labels

Page 21: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Example Results

Image _Grayscale

Concept Map

UML Class Diagram

XML Schema

PVL Label Template

Class Definition Table

<!-- PDS4 XML/Schema for Product_Image_Grayscale --><xs:complexType name="Image_Grayscale_Type"> <xs:sequence> <xs:element name="axes_order" type="axes_order_Type"

/* ******* Label Template - Product_Image_Grayscale Object = Product_Image_Grayscale; Object = Image_Grayscale; local_identifier = ${local_identifier};

Page 22: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

PDS4 Product Label Creation

Label

22

Model

Generic Label

Schema

Design

PopulateExtract

SpecificLabel

Schema

Mission Specific

Data DictionaryIngest/

Update

IngestLabel

Product Labels

ModelingTool A

Design

Data Engineer

ModelingTool B

Load

Edit

Validate

Generate

Current using editorCurrent using Oxygen/Future Design ToolProposed

Ingest

DDWG

Page 23: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright

Query Model

• A formal model consisting of classes, attributes, and relations that are appropriate for use as search constraints

• Types of query models include• Data Set• Product

• Data Product• Document Product• Software Product

• Any PDS4 Identifiable • The query models are subsets of the archive model

• Can be augmented with external metadata (e.g. LDD)• Example query constraints

• general parameters - time, target, mission, instrument host, instrument

• any metadata defined in the archive model• any association between two classes

• e.g. documents associated with data sets• class type and hierarchy• geometry - (lat, lon), (az, el), (ra, dec), (x,y,z).

Page 24: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright
Page 25: 1 - A View from the Field - The Next Generation Data Standards For the PDS - PDS4 - ESIP Federation Meeting July 8, 2009 J. Steven Hughes JPL Copyright