The Earth System Curator
Metadata Representations
Prototype Portal in Collaboration with ESMF and ESG
Rocky Dunlap
Spencer Rugaber
Georgia Tech
Who we are
  Cecelia DeLuca, NCAR
  V. Balaji, GFDL/Princeton University
  Don Middleton, NCAR
  Chris Hill, MIT
  Serguei Nikonov, GFDL
  Sylvia Murphy, NCAR
  Luca Cinquini, NCAR
  Julien Chastang, NCAR
  Spencer Rugaber, Georgia Tech
  Leo Mark, Georgia Tech
  Rocky Dunlap, Georgia Tech
Plus other collaborators: NMM, Metafor, BFG2, and others
What is the Earth System Curator?
The goal of Curator is to link climate datasets with a detailed description of the model that produced them.
Transparent access to models and datasets
Use cases for climate model metadata
  Provenance (history of what happened)
  Archival and search (for models and datasets)
  Model inter-comparison
  Compatibility checking
  Generation of coupler components
Collaborations with Related Projects
Earth System Modeling Framework (ESMF)
  Software infrastructure to facilitate building numerical Earth System models
  Component-based model development
  Built-in tools for managing common modeling tasks (coupling fields, calendars, grid creation, etc.)
Earth System Grid (ESG)
  A large-scale distributed portal for hosting data produced by Earth System models
  Services such as dataset ingest, faceted search, dataset browsing, viewing metadata, and downloading datasets
UML: Unified Modeling Language
What it is
  A visual modeling language for representing software systems
Source
  OMG standard
Motivation
  Conceptual modeling, human-to-human communication of the model, object-oriented representation
  Of the 13 diagram types in UML 2.0, we are using one: the class diagram
    Static structure in terms of classes, attributes on classes, and relationships between classes
UML
Metamodel
  Access to the metamodel for creating UML Profiles: the ability to define a subset of UML used for building your own models
Tool support
  Enterprise Architect (recommended)
  Others: Rational Rose, Poseidon, ArgoUML, Microsoft Visio
Constraint and query language
  Object Constraint Language (OCL)
RDF/OWL
What it is
  “Semantic web” ontology language
  Primary modeling constructs are properties and classes
  Conceptual implementation language (not low level like XML)
  RDF: Resource Description Framework
    Based on {subject, predicate, object} triples
  OWL: Web Ontology Language (2.0 coming soon!)
    Strong theoretical basis in Description Logics
Source
  W3C standard
RDF/OWL
Motivations
  Now a widely accepted standard
  Simple data model, but OWL still allows complex class descriptions
  Very “web friendly” for use with external systems, semantic mediation, URIs, XML format for interchange
  “Non-experts” can build an ontology using Protégé
  Architectural considerations: faceted search interface
Tool support
  Protégé
  Sesame Triple Store, Jena Java API
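Example RDF Statements
“Balaji works at GFDL.”
[Figure: RDF graph with the following labeled edges]
  Curator meeting, hasLocation, GFDL
  Curator meeting, starts, “18 Oct 2007”
  Curator meeting, ends, “19 Oct 2007”
  Balaji, worksAt, GFDL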
RDF XML Representation
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:esc="http://www.earthsystemcurator.org">
  <rdf:Description rdf:about="http://....#OctCuratorMeeting">
    <esc:hasLocation rdf:resource="http://....#GFDL"/>
    <esc:starts>18 Oct 2007</esc:starts>
    <esc:ends>19 Oct 2007</esc:ends>
  </rdf:Description>
  <rdf:Description rdf:about="http://....#Balaji">
    <esc:worksAt rdf:resource="http://....#GFDL"/>
  </rdf:Description>
</rdf:RDF>
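The statements above can also be held as plain {subject, predicate, object} tuples. What follows is a minimal Python sketch under stated assumptions, not the Curator implementation: the short `esc:` names stand in for the elided full URIs, and `match` is a hypothetical helper that treats `None` as a wildcard.

```python
# Illustrative only: short "esc:" names stand in for the elided full URIs.
TRIPLES = [
    ("esc:OctCuratorMeeting", "esc:hasLocation", "esc:GFDL"),
    ("esc:OctCuratorMeeting", "esc:starts", "18 Oct 2007"),
    ("esc:OctCuratorMeeting", "esc:ends", "19 Oct 2007"),
    ("esc:Balaji", "esc:worksAt", "esc:GFDL"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every non-None field (None is a wildcard)."""
    pattern = (subject, predicate, obj)
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


# Everything that points at GFDL, whatever the predicate.
at_gfdl = match(TRIPLES, obj="esc:GFDL")
# Where does Balaji work?
employer = [o for _, _, o in match(TRIPLES, "esc:Balaji", "esc:worksAt")]
print(at_gfdl)
print(employer)
```

Pattern matching with wildcards is roughly the kind of lookup a triple store such as Sesame or Jena performs, at much larger scale and with full URIs.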
ESG Ontology with Curator Extensions
Protégé 4 beta: http://protege.stanford.edu/download/registered.html#p4
Updated Pizza Tutorial (HIGHLY RECOMMENDED): http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial-p4.0.pdf
XML/XML Schema
What it is
  Very widely accepted format for communication between applications; tag-based markup
Source
  W3C standards
Motivations
  A standard implementation that modeling groups can adhere to (most will not be comfortable with RDF/OWL)
  Can be output by modeling frameworks such as ESMF
  “Use profiles” are small chunks of XML for specific purposes (part of the egg white?)
XML/XML Schema
Tool support
  XMLSpy, oXygen, Notepad...
Query languages
  XQuery, XPath
  XSLT for transforming XML to other formats
SQL – Relational Databases (RDBMS)
ANSI standard
Motivations
  Very mature technology
  RDF/OWL and XML are likely NOT good solutions for long-term storage
  Fast querying
  Large-scale metadata storage
Representation Issues/Considerations
What kinds of constraints do we need to precisely model the domain?
  Structural constraints vs. dynamic constraints
What kinds of reasoning and query capabilities do the applications require?
What role will the meta-model play?
How do you keep consistency among several representations/notations?
What is the role of auto-generation?
Putting it all together...
A prototype application developed this summer at NCAR in collaboration with ESMF and ESG:
  ESMF modeling components become “self-describing”
  Metadata is exported from an ESMF component in a standardized XML format (multiple conventions allowed)
  The XML is ingested into ESG and exposed to the portal for users to search
Metadata Lifecycle
1. ESMF component exports XML metadata
2. The XML is validated and harvested into a Java object representation
3. The Java objects are persisted to a relational database (RDBMS)
4. Metadata in the RDBMS is then harvested into RDF, a Semantic Web ontology language
5. The RDF is accessed by the ESG web portal for faceted search of the metadata
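Steps 3 to 5 of the lifecycle can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the prototype's code: the prototype uses Java objects and its own database schema, whereas here a plain dictionary stands in for the harvested object, the `component_property` table is hypothetical, an in-memory SQLite database stands in for the RDBMS, and the `esc:` predicate names are made up for the example.

```python
import sqlite3
from collections import Counter

# Step 2 output (assumed): one already-validated component record.
component = {
    "name": "Finite Volume Dynamical Core",
    "discipline": "Atmosphere",
    "agency": "NASA",
    "coding_language": "Fortran 90",
}

# Step 3: persist the record to a relational database (in-memory SQLite here).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE component_property (component TEXT, property TEXT, value TEXT)"
)
db.executemany(
    "INSERT INTO component_property VALUES (?, ?, ?)",
    [(component["name"], key, value)
     for key, value in component.items() if key != "name"],
)

# Step 4: harvest each row into a (subject, predicate, object) triple.
triples = [
    (name, "esc:" + prop, value)
    for name, prop, value in db.execute(
        "SELECT component, property, value FROM component_property ORDER BY property"
    )
]

# Step 5: a faceted search counts components per (facet, value) and filters on one.
facet_counts = Counter((pred, obj) for _, pred, obj in triples)
atmosphere = [s for s, p, o in triples if (p, o) == ("esc:discipline", "Atmosphere")]
print(facet_counts)
print(atmosphere)
```

Keeping the relational store as the system of record and regenerating the triples from it matches the slides' position that RDF/OWL is better suited to search than to long-term storage.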
ESMF XML Output (example)
<model_component name="Finite Volume Dynamical Core">
  <discipline_set> <discipline name="Atmosphere" /> </discipline_set>
  <physical_domain_set> <physical_domain name="Earth system" /> </physical_domain_set>
  <agency_set> <agency name="NASA" /> </agency_set>
  <institution_set> <institution name="Global Modeling and Assimilation Office (GMAO)" /> </institution_set>
  ……
Viewed as a simple “use-profile”
ESMF XML Output (example, continued)
  <author_set> <author name="Max Suarez" /> </author_set>
  <coding_language_set> <coding_language name="Fortran 90" /> </coding_language_set>
  <model_component_framework_set> <model_component_framework name="ESMF (Earth System Modeling Framework)" /> </model_component_framework_set>
  <variable_set>
    <variable shortname="DPEDT" longname="Edge pressure tendency" units="Pa s-1" />
    <variable shortname="DUDT" longname="Eastward wind tendency" units="m s-2" />
    ……
  </variable_set>
</model_component>
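As a rough illustration of how a consumer might read this output, the Python sketch below parses an abridged copy of the sample with the standard library and groups it into search facets. It is not the ESG harvester: the abridged `SAMPLE` simply omits the elided parts, and the rule that every `<x>_set` wrapper becomes a facet named `x` is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample output; the elided parts are simply left out.
SAMPLE = """
<model_component name="Finite Volume Dynamical Core">
  <discipline_set> <discipline name="Atmosphere" /> </discipline_set>
  <agency_set> <agency name="NASA" /> </agency_set>
  <coding_language_set> <coding_language name="Fortran 90" /> </coding_language_set>
  <variable_set>
    <variable shortname="DPEDT" longname="Edge pressure tendency" units="Pa s-1" />
    <variable shortname="DUDT" longname="Eastward wind tendency" units="m s-2" />
  </variable_set>
</model_component>
"""

root = ET.fromstring(SAMPLE)

# Each "<x>_set" wrapper becomes a facet named "x" holding its children's names.
facets = {
    child.tag[: -len("_set")]: [item.get("name") for item in child]
    for child in root
    if child.tag.endswith("_set") and child.tag != "variable_set"
}

# Variables carry three attributes rather than a single name.
variables = {
    v.get("shortname"): (v.get("longname"), v.get("units"))
    for v in root.findall("./variable_set/variable")
}

print(root.get("name"))
print(facets)
print(variables["DUDT"])
```

The `findall` path is written in the limited XPath subset that `xml.etree.ElementTree` supports; a full XPath or XQuery engine would allow richer queries over the same document.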
Demo of Dycore Portal
http://dycore.ucar.edu/