Chad Berkley
National Center for Ecological Analysis and Synthesis (NCEAS),
University of California, Santa Barbara
http://kepler-project.org
February 13, 2007
Berkeley, California
The Kepler Actor Repository: Enabling Remote Storage, Query and Retrieval
of Workflow Components
Purpose of the Repository
Easy method for workflow authors to share components
A common archive for components
Enables strong versioning
Workflow components can become metadata for research papers
Helps with lineage tracking
[Diagram: Repository, Client, Component]
Functional Requirements
Kepler users should be able to easily locate and use components
Kepler users should be able to easily add components to the archive
Users should be able to restrict access to their components
Components should be browsable or searchable
Important Differences Between Kepler and Ptolemy
Kepler Object Manager (OM)
  Database of all objects registered with the system
  Objects are read in at startup
  OM organizes objects based on an ontology
Kepler Objects
  Each component has one or more semantic types
  Domain-specific ordering via semantic types/ontologies
  Each object has a unique LSID
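Since every object is identified by an LSID (Life Science Identifier), a small value type makes the identifier easier to handle. The sketch below assumes the general LSID form `urn:lsid:authority:namespace:object[:revision]`; the class name `Lsid` and the example identifier are illustrative, not taken from the Kepler code base.

```java
import java.util.Optional;

/** Minimal value type for an LSID: urn:lsid:authority:namespace:object[:revision]. */
final class Lsid {
    final String authority;
    final String namespace;
    final String object;
    final Optional<String> revision;

    private Lsid(String authority, String namespace, String object, Optional<String> revision) {
        this.authority = authority;
        this.namespace = namespace;
        this.object = object;
        this.revision = revision;
    }

    /** Parses the textual form; rejects anything that does not have 5 or 6 non-empty parts. */
    static Lsid parse(String text) {
        String[] parts = text.split(":", -1);
        boolean shapeOk = (parts.length == 5 || parts.length == 6)
                && parts[0].equalsIgnoreCase("urn")
                && parts[1].equalsIgnoreCase("lsid");
        if (!shapeOk) {
            throw new IllegalArgumentException("Not an LSID: " + text);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty LSID component in: " + text);
            }
        }
        Optional<String> revision =
                parts.length == 6 ? Optional.of(parts[5]) : Optional.empty();
        return new Lsid(parts[2], parts[3], parts[4], revision);
    }

    /** Same object, different revision: the basis for versioned components. */
    Lsid withRevision(String newRevision) {
        return new Lsid(authority, namespace, object, Optional.of(newRevision));
    }

    @Override
    public String toString() {
        String base = "urn:lsid:" + authority + ":" + namespace + ":" + object;
        return revision.map(r -> base + ":" + r).orElse(base);
    }
}
```

For example, `Lsid.parse("urn:lsid:example.org:actor:42:3").withRevision("4")` describes the next revision of the same object, which is what versioned storage would key on.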
Ontology: specification of a conceptualization within a knowledge domain
Kepler Archive (KAR) Files
  Used for component transfer and archiving
  OM can create or ingest KAR files
  Consists of actor metadata, a manifest and eventually class/jar files
  Each object has its LSID listed in the manifest
  The KAR itself has an LSID
  KAR files are used for transporting components between the client and server
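The manifest-plus-entries layout can be sketched with the standard `java.util.jar` classes. This assumes a KAR is laid out like a JAR, which the slides do not state; the attribute name `lsid`, the class `KarSketch` and the placeholder entry content are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class KarSketch {
    // Hypothetical manifest attribute; the real KAR attribute names may differ.
    static final Attributes.Name LSID = new Attributes.Name("lsid");

    /** Writes an archive whose manifest records the archive LSID and one LSID per entry. */
    static byte[] write(String karLsid, Map<String, String> lsidByEntry) {
        Manifest manifest = new Manifest();
        // Without a manifest version the manifest is written out empty.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        manifest.getMainAttributes().put(LSID, karLsid);
        for (Map.Entry<String, String> entry : lsidByEntry.entrySet()) {
            Attributes attributes = new Attributes();
            attributes.put(LSID, entry.getValue());
            manifest.getEntries().put(entry.getKey(), attributes);
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            for (String name : lsidByEntry.keySet()) {
                jar.putNextEntry(new JarEntry(name));
                // Placeholder standing in for the actor metadata document.
                jar.write("<entity/>".getBytes(StandardCharsets.UTF_8));
                jar.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the LSID of the archive itself from the main manifest section. */
    static String readArchiveLsid(byte[] kar) {
        return readManifest(kar).getMainAttributes().getValue(LSID);
    }

    /** Reads the per-entry LSIDs from the manifest without unpacking any entry. */
    static Map<String, String> readEntryLsids(byte[] kar) {
        Map<String, String> result = new LinkedHashMap<>();
        readManifest(kar).getEntries()
                .forEach((name, attributes) -> result.put(name, attributes.getValue(LSID)));
        return result;
    }

    private static Manifest readManifest(byte[] kar) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(kar))) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                throw new IllegalArgumentException("Archive has no manifest");
            }
            return manifest;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping the LSIDs in the manifest means a receiver can decide what an archive contains before it reads any of the entries.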
Architecture
The Repository
All services provided via the EarthGrid (formerly known as the EcoGrid)
Web Services: Get/Query, Put, Auth
Web interface allows users to search for and download components outside of Kepler
Component Storage
  KAR files
  Metadata file external to the KAR for indexing
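The Get/Query and Put services and the external metadata index can be pictured as one small interface. The following is an in-memory sketch only: `ComponentRepository` and `InMemoryRepository` are hypothetical names, the search is a plain substring match, and nothing here reflects the actual EarthGrid web service signatures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical shape of the repository operations named on the slide. */
interface ComponentRepository {
    void put(String lsid, byte[] kar, String metadata);

    List<String> query(String term);

    Optional<byte[]> get(String lsid);
}

final class InMemoryRepository implements ComponentRepository {
    private final Map<String, byte[]> kars = new LinkedHashMap<>();
    // Metadata is stored next to the KAR, not inside it, so queries never open archives.
    private final Map<String, String> index = new LinkedHashMap<>();

    @Override
    public void put(String lsid, byte[] kar, String metadata) {
        kars.put(lsid, kar.clone());
        index.put(lsid, metadata.toLowerCase(Locale.ROOT));
    }

    @Override
    public List<String> query(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        index.forEach((lsid, text) -> {
            if (text.contains(needle)) {
                hits.add(lsid);
            }
        });
        return hits;
    }

    @Override
    public Optional<byte[]> get(String lsid) {
        return Optional.ofNullable(kars.get(lsid)).map(bytes -> bytes.clone());
    }
}
```

A query returns LSIDs only; the caller fetches a KAR with `get` once it has chosen a result.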
Client Interface
Object Manager
  Handles local get/put/query
  Handles remote get/put/query through the EarthGrid interface
User right-clicks on a component to upload it
Remote search results are integrated into the actor library
User drags component from the search results to download it
Internal database of LSIDs is synced to the server via EarthGrid interfaces
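One way to picture the LSID sync is as two set differences between the local and remote identifier lists. The slides do not describe the actual sync protocol or its direction, so `LsidSync` and both method names are assumptions made for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

final class LsidSync {
    /** LSIDs the server knows about that the local database does not. */
    static Set<String> missingLocally(Set<String> local, Set<String> remote) {
        Set<String> result = new TreeSet<>(remote);
        result.removeAll(local);
        return result;
    }

    /** LSIDs held locally that the server has not seen yet. */
    static Set<String> missingRemotely(Set<String> local, Set<String> remote) {
        Set<String> result = new TreeSet<>(local);
        result.removeAll(remote);
        return result;
    }
}
```

Because the revision is part of the LSID, a new version of a component shows up as a new identifier in one of the two differences.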
Uploading and Searching
Downloading to the Client
Remote Components downloaded when dragged to the canvas
For initial display (in the results tree), only actor metadata is loaded
After initial download, KAR file is cached
Want and need dynamic class loading to make this more useful (back to that in a minute)
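The download-once behaviour can be sketched as a cache keyed by LSID. `KarCache` and its `downloader` function are hypothetical; in Kepler the download would go through the EarthGrid interface, which is not modelled here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class KarCache {
    private final Function<String, byte[]> downloader;
    private final Map<String, byte[]> cache = new HashMap<>();
    private int downloads = 0;

    /** The downloader stands in for the remote get call. */
    KarCache(Function<String, byte[]> downloader) {
        this.downloader = downloader;
    }

    /** Returns the KAR for an LSID, contacting the server only on first use. */
    byte[] fetch(String lsid) {
        byte[] kar = cache.get(lsid);
        if (kar == null) {
            kar = downloader.apply(lsid);
            downloads++;
            cache.put(lsid, kar);
        }
        return kar;
    }

    /** Number of real downloads so far, exposed to make the caching visible. */
    int downloadCount() {
        return downloads;
    }
}
```

The results tree would only need the actor metadata; `fetch` would be called when the component is dragged to the canvas.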
Authentication
Uses the EarthGrid interface
Backend is currently LDAP
Currently, a component can be either public or private, but control could be finer grained
Authentication interface in Kepler is extensible and provides for other authentication schemes, such as GAMA
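An extensible authentication interface combined with the public/private rule might look like the sketch below. `Authenticator`, `StaticAuthenticator` and `AccessPolicy` are hypothetical names, and the password map merely stands in for a real backend such as LDAP or GAMA.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical plug-in point: one implementation per authentication scheme. */
interface Authenticator {
    /** Returns the authenticated user name, or empty if the credentials are rejected. */
    Optional<String> authenticate(String user, String password);
}

/** Stand-in backend for the sketch; a real one would query LDAP or another service. */
final class StaticAuthenticator implements Authenticator {
    private final Map<String, String> passwords;

    StaticAuthenticator(Map<String, String> passwords) {
        this.passwords = passwords;
    }

    @Override
    public Optional<String> authenticate(String user, String password) {
        boolean ok = password != null && password.equals(passwords.get(user));
        return ok ? Optional.of(user) : Optional.empty();
    }
}

final class AccessPolicy {
    /** Public components are readable by anyone; private ones only by their owner. */
    static boolean canRead(boolean isPublic, String owner, Optional<String> caller) {
        return isPublic || caller.map(owner::equals).orElse(false);
    }
}
```

Finer-grained control would replace the `isPublic` flag with a per-component list of permitted users or groups, leaving the `Authenticator` interface unchanged.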
Documentation
Kepler uses the Ptolemy documentation system
For displaying docs remotely, Kepler uses a custom attribute and inserts the docs directly into the actor metadata on the server
Docs can then be transformed for viewing on the website
Viewing Documentation
Future Work
Use the repository as the main storage location for components instead of shipping Kepler with an extensive library
Make the web interface more usable
Dynamic class loading…
Dynamic Class Loading
Motivation: allow the use of multiple versions of the same class in one workflow execution.
Problems
  Loading classes isn’t too hard, but reloading classes requires removing the entire classloader.
  Two different actors may use two different versions of the same class.
  We don’t always want to use the class (with the same name) that is already loaded.
  Security issues with not always using the preloaded Java classes.
Potential Solutions
  Create a custom classloader that allows reloading/co-loading
  Create a new classloader for each loaded class (that can be removed if necessary)
  Suggestions?
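The second potential solution, one classloader per loaded class, can be sketched with a loader that defines a single named class itself and delegates everything else to its parent. This is an illustration of the idea, not Kepler code: `VersionedHelper`, `IsolatingLoader` and `CoLoadingDemo` are hypothetical, and for simplicity both copies are defined from the same class file, where a real system would read each version's bytes from a different KAR or jar.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

/** Stand-in for a class that two actors might need in different versions. */
class VersionedHelper {
    private static int calls = 0;

    public static int touch() {
        return ++calls;
    }
}

/** Defines one named class itself and delegates every other class to its parent. */
final class IsolatingLoader extends ClassLoader {
    private final String isolatedName;

    IsolatingLoader(String isolatedName, ClassLoader parent) {
        super(parent);
        this.isolatedName = isolatedName;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(isolatedName)) {
            return super.loadClass(name, resolve); // normal parent-first delegation
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                byte[] bytes = readClassBytes(name);
                loaded = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String path = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(path)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            return in.readAllBytes();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

final class CoLoadingDemo {
    /** Loads a private copy of VersionedHelper in a fresh classloader. */
    static Class<?> freshCopy() {
        try {
            String name = VersionedHelper.class.getName();
            ClassLoader parent = VersionedHelper.class.getClassLoader();
            return new IsolatingLoader(name, parent).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Calls the static touch() method of one particular copy. */
    static int touch(Class<?> copy) {
        try {
            Method method = copy.getDeclaredMethod("touch");
            method.setAccessible(true); // the copy lives in a different runtime package
            return (Integer) method.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Two calls to `CoLoadingDemo.freshCopy()` yield two distinct `Class` objects with the same name and independent static state, so they can coexist in one execution. A copy becomes eligible for unloading once its loader and all of its instances are unreachable. On the security point, `defineClass` refuses class names in the `java.` packages, so the core classes cannot be replaced this way.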
More info: http://kepler-project.org
This material is based upon work supported by the National Science Foundation under award 0225676 and others.
SEEK Partner Institutions
University of New Mexico
Napier University, Edinburgh Scotland
University of Kansas
University of Vermont
University of California, Santa Barbara
National Center for Ecological Analysis and Synthesis
University of California, Davis
Arizona State University
University of North Carolina
San Diego Supercomputer Center
Kepler Partner Projects
SEEK
Ptolemy
ROADNet
SDM/SPA
SDM/CPES
GEON
Resurgence