Chad Berkley
National Center for Ecological Analysis and Synthesis (NCEAS), University of California, Santa Barbara

http://kepler-project.org
February 13, 2007
Berkeley, California

The Kepler Actor Repository: Enabling Remote Storage, Query and Retrieval of Workflow Components

Purpose of the Repository

Easy method for workflow authors to share components

A common archive for components

Enables strong versioning

Workflow components can become metadata for research papers

Helps with lineage tracking

[Diagram labels: Repository, Client, Component]

Functional Requirements

Kepler users should be able to easily locate and use components

Kepler users should be able to easily add components to the archive

Users should be able to restrict access to their components

Components should be browsable or searchable

Important Differences Between Kepler and Ptolemy

Kepler Object Manager (OM)
    Database of all objects registered with the system
    Objects are read in at startup
    OM organizes objects based on an ontology

Kepler Objects
    Each component has one or more semantic types
    Domain-specific ordering via semantic types/ontologies
    Each object has a unique LSID (Life Science Identifier)

Ontology: specification of a conceptualization within a knowledge domain
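
As a rough illustration of what "each object has a unique LSID" means in practice, the sketch below splits an identifier of the general LSID form urn:lsid:authority:namespace:object[:revision] into its parts. The class name LsidSketch and the example identifier are invented for this illustration and are not taken from Kepler's code.

```java
import java.util.Optional;

/** Illustrative only: a small value class for LSIDs of the form
 *  urn:lsid:<authority>:<namespace>:<object>[:<revision>]. */
final class LsidSketch {
    final String authority;
    final String namespace;
    final String object;
    final Optional<String> revision;

    private LsidSketch(String authority, String namespace, String object,
                       Optional<String> revision) {
        this.authority = authority;
        this.namespace = namespace;
        this.object = object;
        this.revision = revision;
    }

    /** Parses an LSID string, rejecting anything that is not urn:lsid:... */
    static LsidSketch parse(String lsid) {
        String[] parts = lsid.split(":");
        if (parts.length < 5 || parts.length > 6
                || !parts[0].equalsIgnoreCase("urn")
                || !parts[1].equalsIgnoreCase("lsid")) {
            throw new IllegalArgumentException("Not an LSID: " + lsid);
        }
        Optional<String> rev =
                parts.length == 6 ? Optional.of(parts[5]) : Optional.empty();
        return new LsidSketch(parts[2], parts[3], parts[4], rev);
    }

    public static void main(String[] args) {
        // Hypothetical identifier, used only to exercise the parser.
        LsidSketch id = parse("urn:lsid:example.org:actor:42:1");
        System.out.println(id.authority + " / " + id.namespace + " / "
                + id.object + " / " + id.revision.orElse("(no revision)"));
    }
}
```

Keeping the revision as a separate, optional field is one way to support the versioning goal: two revisions of the same object share everything except the last part.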

Important Differences Between Kepler and Ptolemy

Kepler Archive (KAR) Files
    Used for component transfer and archiving
    OM can create or ingest KAR files
    Consists of actor metadata, a manifest and, eventually, class/jar files
    Each object has its LSID listed in the manifest
    The KAR itself has an LSID
    KAR files are used for transporting components between the client and server
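
The KAR layout described on this slide (actor metadata plus a manifest that lists an LSID per object and one for the archive itself) can be sketched with the standard java.util.jar classes. This is a minimal sketch that assumes a KAR is laid out like a JAR/ZIP archive. The manifest attribute names KAR-LSID and Entry-LSID, the entry names and the identifiers are all hypothetical and are not Kepler's actual format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class KarSketch {
    // Hypothetical attribute names, chosen for this illustration only.
    static final Attributes.Name KAR_LSID = new Attributes.Name("KAR-LSID");
    static final Attributes.Name ENTRY_LSID = new Attributes.Name("Entry-LSID");

    /** Builds an in-memory archive whose manifest records every LSID. */
    static byte[] writeKar(String karLsid, Map<String, String> lsidByEntry) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Without a manifest version the manifest is written out empty.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(KAR_LSID, karLsid);
        for (Map.Entry<String, String> e : lsidByEntry.entrySet()) {
            Attributes perEntry = new Attributes();
            perEntry.put(ENTRY_LSID, e.getValue());
            manifest.getEntries().put(e.getKey(), perEntry);
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            for (Map.Entry<String, String> e : lsidByEntry.entrySet()) {
                jar.putNextEntry(new JarEntry(e.getKey()));
                // Placeholder body; a real archive would hold actor metadata.
                jar.write(("placeholder metadata for " + e.getValue())
                        .getBytes(StandardCharsets.UTF_8));
                jar.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads back only the manifest, which is enough to list the LSIDs. */
    static Manifest readManifest(byte[] kar) {
        try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(kar))) {
            return in.getManifest();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("actor1.xml", "urn:lsid:example.org:actor:1:1");
        Manifest mf = readManifest(writeKar("urn:lsid:example.org:kar:7:1", entries));
        System.out.println(mf.getMainAttributes().getValue(KAR_LSID));
        System.out.println(mf.getAttributes("actor1.xml").getValue(ENTRY_LSID));
    }
}
```

Because the identifiers live in the manifest, a client or server can learn what an archive contains without unpacking any of its entries.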

Architecture

The Repository

All services provided via the EarthGrid (formerly known as the EcoGrid)

Web Services: Get/Query, Put, Auth

Web interface allows users to search for and download components outside of Kepler

Component Storage
    KAR files
    Metadata file external to the KAR for indexing
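
To make the division of labour on this slide concrete, here is a minimal in-memory sketch of a repository offering put, get and query. The interface ComponentRepository and its method signatures are assumptions made for illustration and do not reflect the actual EarthGrid web service definitions. The point it shows is that queries run against the metadata stored next to each KAR, so the archive itself never has to be opened to answer a search.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class RepositorySketch {
    /** Hypothetical shape of the Get/Query and Put operations. */
    interface ComponentRepository {
        void put(String lsid, String metadata, byte[] kar);
        Optional<byte[]> get(String lsid);
        List<String> query(String keyword);
    }

    /** In-memory stand-in for the server side. */
    static final class InMemoryRepository implements ComponentRepository {
        private final Map<String, byte[]> karByLsid = new LinkedHashMap<>();
        // Metadata is kept outside the KAR so it can be searched directly.
        private final Map<String, String> metadataByLsid = new LinkedHashMap<>();

        @Override
        public void put(String lsid, String metadata, byte[] kar) {
            karByLsid.put(lsid, kar.clone());
            metadataByLsid.put(lsid, metadata);
        }

        @Override
        public Optional<byte[]> get(String lsid) {
            byte[] kar = karByLsid.get(lsid);
            return kar == null ? Optional.empty() : Optional.of(kar.clone());
        }

        @Override
        public List<String> query(String keyword) {
            String needle = keyword.toLowerCase(Locale.ROOT);
            List<String> hits = new ArrayList<>();
            for (Map.Entry<String, String> e : metadataByLsid.entrySet()) {
                if (e.getValue().toLowerCase(Locale.ROOT).contains(needle)) {
                    hits.add(e.getKey());
                }
            }
            return hits;
        }
    }

    public static void main(String[] args) {
        ComponentRepository repo = new InMemoryRepository();
        repo.put("urn:lsid:example.org:actor:1:1", "Growth model actor", new byte[] {1});
        repo.put("urn:lsid:example.org:actor:2:1", "CSV reader actor", new byte[] {2});
        System.out.println(repo.query("growth"));
    }
}
```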

Client Interface

Object Manager
    Handles local get/put/query
    Handles remote get/put/query through the EarthGrid interface

User right-clicks on a component to upload it

Remote search results are integrated into the actor library

User drags a component from the search results to download it

Internal database of LSIDs is synced to the server via EarthGrid interfaces

Uploading and Searching

Downloading to the Client

Remote components are downloaded when dragged to the canvas

For initial display (in the results tree), only actor metadata is loaded

After initial download, KAR file is cached

Want and need dynamic class loading to make this more useful (back to that in a minute)
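
The download behaviour described on this slide (fetch the KAR only when the component is actually used, then serve it from a cache) can be sketched as follows. The class KarCacheSketch and the downloader function are hypothetical stand-ins. In Kepler the remote fetch would go through the EarthGrid interface, which is not modelled here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class KarCacheSketch {
    // Hypothetical remote fetch: maps an LSID to the bytes of its KAR file.
    private final Function<String, byte[]> downloader;
    private final Map<String, byte[]> cache = new HashMap<>();
    private int downloads = 0;

    KarCacheSketch(Function<String, byte[]> downloader) {
        this.downloader = downloader;
    }

    /** Called when a component is dragged to the canvas. */
    byte[] fetchKar(String lsid) {
        byte[] cached = cache.get(lsid);
        if (cached != null) {
            return cached; // already downloaded once, no remote call
        }
        byte[] kar = downloader.apply(lsid);
        downloads++;
        cache.put(lsid, kar);
        return kar;
    }

    /** Number of remote downloads actually performed. */
    int downloadCount() {
        return downloads;
    }

    public static void main(String[] args) {
        KarCacheSketch cache = new KarCacheSketch(lsid -> new byte[] {42});
        cache.fetchKar("urn:lsid:example.org:actor:1:1");
        cache.fetchKar("urn:lsid:example.org:actor:1:1");
        System.out.println("remote downloads: " + cache.downloadCount());
    }
}
```

Keying the cache by LSID fits the versioning goal if a new revision of a component gets a new LSID, since it would then be downloaded separately rather than overwriting the cached copy.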

Authentication

Uses the EarthGrid interface

Backend is currently LDAP

Currently, a component can either be public or private, but control could be finer grained

Authentication interface in Kepler is extensible and provides for other authentication schemes, such as GAMA
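
One way to picture an extensible authentication interface combined with public/private components is sketched below. Everything here is hypothetical: the Authenticator interface, the map-backed implementation standing in for an LDAP or GAMA backend, and the assumption that "private" means readable only by the authenticated owner.

```java
import java.util.Collections;
import java.util.Map;

final class AuthSketch {
    /** Hypothetical plug-in point; an LDAP or GAMA backend would implement it. */
    interface Authenticator {
        boolean authenticate(String user, String password);
    }

    /** Stand-in backend that checks credentials against an in-memory map. */
    static final class MapAuthenticator implements Authenticator {
        private final Map<String, String> passwordByUser;

        MapAuthenticator(Map<String, String> passwordByUser) {
            this.passwordByUser = passwordByUser;
        }

        @Override
        public boolean authenticate(String user, String password) {
            String expected = passwordByUser.get(user);
            return expected != null && expected.equals(password);
        }
    }

    enum Visibility { PUBLIC, PRIVATE }

    /** Assumption: private components are readable only by their owner. */
    static boolean canRead(Visibility visibility, String owner,
                           String user, boolean authenticated) {
        if (visibility == Visibility.PUBLIC) {
            return true;
        }
        return authenticated && owner.equals(user);
    }

    public static void main(String[] args) {
        Authenticator auth =
                new MapAuthenticator(Collections.singletonMap("alice", "secret"));
        boolean ok = auth.authenticate("alice", "secret");
        System.out.println(canRead(Visibility.PRIVATE, "alice", "alice", ok));
    }
}
```

Finer-grained control would replace the two-value Visibility with something like a per-component list of permitted users or groups, without changing the Authenticator side.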

Documentation

Kepler uses the Ptolemy documentation system

For displaying docs remotely, Kepler uses a custom attribute and inserts the docs directly into the actor metadata on the server

Docs can then be transformed for viewing on the website

Viewing Documentation

Future Work

Use the repository as the main storage location for components instead of shipping Kepler with an extensive library

Make the web interface more usable

Dynamic class loading…

Dynamic Class Loading

Motivation: allow the use of multiple versions of the same class in one workflow execution.

Problems
    Loading classes isn't too hard, but reloading classes requires removing the entire classloader.
    Two different actors may use two different versions of the same class.
    We don't always want to use the class (with the same name) that is already loaded.
    Security issues with not always using the preloaded Java classes.

Potential Solutions
    Create a custom classloader that allows reloading/coloading
    Create a new classloader for each loaded class (that can be removed if necessary)

Suggestions?
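
The "new classloader per loaded class" idea can be tried out with a small custom loader that defines one named class itself and delegates everything else to its parent. This is a sketch under stated assumptions, not a proposal for Kepler's implementation: IsolatingLoader and Helper are invented names, the same bytecode is loaded twice only to keep the example self-contained (in practice each loader would read a different version of the class), and it assumes Java 9 or later with compiled class files reachable as classpath resources.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassLoadingSketch {
    /** Stand-in for a class two actors might need in different versions. */
    public static class Helper { }

    /** Defines one named class itself; all other names go to the parent. */
    static final class IsolatingLoader extends ClassLoader {
        private final String isolatedName;
        private final byte[] bytecode;

        IsolatingLoader(ClassLoader parent, String isolatedName, byte[] bytecode) {
            super(parent);
            this.isolatedName = isolatedName;
            this.bytecode = bytecode;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Skip the parent so the already-loaded copy is not reused.
                    c = defineClass(name, bytecode, 0, bytecode.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Reads the compiled bytes of a class from the classpath. */
    static byte[] bytecodeOf(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("No class file for " + type);
            }
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads a fresh copy of the class in its own throwaway loader. */
    static Class<?> loadIsolated(Class<?> type) {
        try {
            return new IsolatingLoader(type.getClassLoader(), type.getName(),
                    bytecodeOf(type)).loadClass(type.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> a = loadIsolated(Helper.class);
        Class<?> b = loadIsolated(Helper.class);
        System.out.println("same name: " + a.getName().equals(b.getName()));
        System.out.println("same class object: " + (a == b));
    }
}
```

The two results share a name but are distinct classes to the JVM, which is what lets two versions coexist in one workflow execution. Each copy becomes eligible for unloading once its loader and all of its instances are unreachable, which matches the "can be removed if necessary" point. The security concern on the slide still applies, since a loader like this deliberately bypasses the usual parent-first lookup for the isolated name.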

More info: http://kepler-project.org

This material is based upon work supported by the National Science Foundation under award 0225676 and others.

SEEK Partner Institutions

University of New Mexico

Napier University, Edinburgh, Scotland

University of Kansas

University of Vermont

University of California, Santa Barbara

National Center for Ecological Analysis and Synthesis

University of California, Davis

Arizona State University

University of North Carolina

San Diego Supercomputer Center

Kepler Partner Projects

SEEK

Ptolemy

ROADNet

SDM/SPA

SDM/CPES

GEON

Resurgence