Brief Overview of Major Enhancements to PAWN


Producer – Archive Workflow Network (PAWN)

Distributed and secure ingestion of digital objects into the archive.

Use of web/grid technologies – platform independent

Ease of integration with data grids or digital libraries.

XML representation of metadata and bitstream

• Self-describing bitstream submissions

Accountability of transfer and guarantee of data integrity


Ingestion Workflow (PAWN)

1. Negotiate Submission Agreement.

2. Workflow Initialization and Submission Information Packet (SIP) creation.

3. Transfer of SIPs to receiving servers.

4. Validation of SIP transfer.

5. Organization of data into collections and transfer into the distributed archive.


Distributed Ingestion

[Figure: four Producers and the Distributed Archive]


Distributed Ingestion

Each Producer registers and arranges files locally prior to transport.

Multiple distributed archival receiving stations.

X.509-based authentication between sites.

Independent Certificate Authorities at each Producer.

Persistent archive is geographically distributed and managed by a data grid.


Producer

Provides data to an Archive based on a prior agreement.

Consists of a management/metadata server and an ingestion client.

Provides initial arrangement, context, and metadata.

[Figure: Producer data suppliers, Producer Management Interface, Management Server, Archive]


Enhancements to the Producer

Data submissions are organized through a logical hierarchy negotiated between the archive and the producer.

Clients no longer see the entire hierarchy, but rather attachment points.

Better state tracking and oversight of submissions.

METS documents are no longer merged together, but rather kept separate to support larger submissions.

Submission can be broken into multiple METS documents linked together through pointers.

Producer signed submissions to ensure integrity.
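One way to picture a submission split across several METS documents is a parent document whose structural map points at each child. The following is a minimal sketch, not PAWN's actual layout: it assumes the standard METS `mptr` element is used for the pointers, and the function names and example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_parent_mets(child_urls):
    """Build a parent METS document that points at separate child METS files.

    Assumption: each part of the submission lives in its own METS document
    and is referenced through an mptr element inside a structMap div.
    """
    root = ET.Element(f"{{{METS_NS}}}mets")
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap")
    top = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "submission"})
    for url in child_urls:
        part = ET.SubElement(top, f"{{{METS_NS}}}div", {"TYPE": "part"})
        ET.SubElement(
            part,
            f"{{{METS_NS}}}mptr",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url},
        )
    return root


def child_pointers(root):
    """Return the xlink:href of every mptr in document order."""
    return [m.get(f"{{{XLINK_NS}}}href") for m in root.iter(f"{{{METS_NS}}}mptr")]


# Hypothetical example URLs.
parent = build_parent_mets([
    "https://archive.example.org/sip/part-001.xml",
    "https://archive.example.org/sip/part-002.xml",
])
print(ET.tostring(parent, encoding="unicode"))
```

Keeping the children separate means no single document has to hold the metadata for the whole submission, which is the motivation the slide gives for dropping the merge step.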


Different administrator and client views

Manager / Record Manager

Administrator

• Views entire producer hierarchy

Producer / Record Creator

• View restricted to allowable submission points
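The two views can be illustrated with a small filter over the negotiated hierarchy. This is only a sketch under assumptions: the hierarchy is modelled as slash-separated paths, the administrator is modelled by passing no attachment points, and `visible_nodes` and the example paths are hypothetical names, not part of PAWN.

```python
from pathlib import PurePosixPath


def visible_nodes(hierarchy, attachment_points=None):
    """Return the hierarchy nodes a user may see.

    attachment_points=None models the administrator view (everything).
    Otherwise only the attachment points and their descendants are visible.
    """
    if attachment_points is None:
        return sorted(hierarchy)
    points = [PurePosixPath(p) for p in attachment_points]
    visible = []
    for node in hierarchy:
        path = PurePosixPath(node)
        if any(path == p or p in path.parents for p in points):
            visible.append(node)
    return sorted(visible)


hierarchy = [
    "/agency",
    "/agency/email",
    "/agency/records",
    "/agency/records/2005",
]
print(visible_nodes(hierarchy))                       # administrator view
print(visible_nodes(hierarchy, ["/agency/records"]))  # record creator view
```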


New Interactions Between Client and Receiving Servers

Ability of client to reserve resources before starting to transfer data into the archive.

Client creates a session with a receiving server and uploads metadata.

Clients upload bitstreams, and the receiving server validates checksums during transfer.

Client can resume or retransmit failed submissions.
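The checksum validation and retransmission described above might look roughly like the sketch below. It is an in-memory stand-in, not PAWN's receiving server: the `ReceivingSession` class, the manifest format, and the choice of SHA-256 are all assumptions, since the slides do not name a digest algorithm or an API.

```python
import hashlib


class ReceivingSession:
    """Hypothetical in-memory stand-in for a receiving-server session."""

    def __init__(self, manifest):
        # manifest maps bitstream name -> expected SHA-256 hex digest,
        # as it might be taken from the metadata uploaded at session start.
        self.manifest = dict(manifest)
        self.accepted = {}

    def upload(self, name, chunks):
        """Hash the bitstream while it arrives; accept it only on a match."""
        digest = hashlib.sha256()
        received = bytearray()
        for chunk in chunks:
            digest.update(chunk)
            received.extend(chunk)
        if digest.hexdigest() != self.manifest[name]:
            return False  # client should retransmit this bitstream
        self.accepted[name] = bytes(received)
        return True

    def pending(self):
        """Bitstreams still missing; a resuming client sends only these."""
        return [n for n in self.manifest if n not in self.accepted]


payload = b"hello archive"
session = ReceivingSession({"a.bin": hashlib.sha256(payload).hexdigest()})
print(session.upload("a.bin", [b"hello ", b"corrupt"]))  # damaged transfer
print(session.pending())                                  # what to resend
print(session.upload("a.bin", [b"hello ", b"archive"]))  # retransmission
```

Hashing chunk by chunk lets the server validate while the data is still in flight, instead of re-reading the bitstream after the transfer completes.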


Client-Receiver Interaction

1. Reservation Request*

2. Reservation Negotiation*

3. Reservation Information*

4. Open Session

[5. Signed METS Package and Acknowledgement]**

6...n Send Payload and Acknowledgement

Finished Transmitting

* Placeholder calls

** Only required once

[Figure participants: Client, Scheduler, Receiving Server]
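The ordering of the interaction can be expressed as a small transition table. This is a sketch of one reading of the diagram, not a specification: it assumes the bracketed METS step may be skipped once it has already been performed, and the `Step` names and `is_valid_sequence` helper are hypothetical.

```python
from enum import Enum, auto


class Step(Enum):
    RESERVATION_REQUEST = auto()
    RESERVATION_NEGOTIATION = auto()
    RESERVATION_INFORMATION = auto()
    OPEN_SESSION = auto()
    SIGNED_METS_PACKAGE = auto()
    SEND_PAYLOAD = auto()
    FINISHED_TRANSMITTING = auto()


# Which step may follow which. None stands for "nothing sent yet".
ALLOWED_NEXT = {
    None: {Step.RESERVATION_REQUEST},
    Step.RESERVATION_REQUEST: {Step.RESERVATION_NEGOTIATION},
    Step.RESERVATION_NEGOTIATION: {Step.RESERVATION_INFORMATION},
    Step.RESERVATION_INFORMATION: {Step.OPEN_SESSION},
    # Assumption: the signed METS package is optional after the first time.
    Step.OPEN_SESSION: {Step.SIGNED_METS_PACKAGE, Step.SEND_PAYLOAD},
    Step.SIGNED_METS_PACKAGE: {Step.SEND_PAYLOAD},
    Step.SEND_PAYLOAD: {Step.SEND_PAYLOAD, Step.FINISHED_TRANSMITTING},
    Step.FINISHED_TRANSMITTING: set(),
}


def is_valid_sequence(steps):
    """True if the steps follow the allowed order and end with 'finished'."""
    previous = None
    for step in steps:
        if step not in ALLOWED_NEXT[previous]:
            return False
        previous = step
    return previous is Step.FINISHED_TRANSMITTING


full_run = [
    Step.RESERVATION_REQUEST,
    Step.RESERVATION_NEGOTIATION,
    Step.RESERVATION_INFORMATION,
    Step.OPEN_SESSION,
    Step.SIGNED_METS_PACKAGE,
    Step.SEND_PAYLOAD,
    Step.SEND_PAYLOAD,
    Step.FINISHED_TRANSMITTING,
]
print(is_valid_sequence(full_run))
```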


Archive - receiving

Receives data from a Producer

Validates bitstreams and metadata, and sends acknowledgement to Producer.

Arranges into collections and specifies preservation policy.

Publishes bitstreams into a digital archive.

[Figure: Producer 1, Producer 2, ..., Producer n; Load Balancer; Bitstream Validation Service; Digital Archive]


New Features for Receiver

Validation Services

• Designed a standard API and test suite for rapid development of validation services.

• New classes of services can be easily developed.

Receiving Server

• Configurable endpoints into storage or metadata repositories

• Better handling of multiple producers


Scheduler

Allocates the processing of data streams from multiple clients to a cluster of receiving servers.

Clients are required to request a resource reservation.

Receiving server will acknowledge/deny the reservation.

Client will be informed about reservation/receiving server.

Currently, the receiving server has hooks for the scheduler.
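The reservation flow can be sketched as a scheduler that tracks free capacity per receiving server and either assigns a server or denies the request. This is a simplification under stated assumptions: in the slides the receiving server itself acknowledges or denies the reservation, whereas here the scheduler decides from its own bookkeeping, and the `Scheduler` class, capacity units, and most-free-space policy are hypothetical.

```python
class Scheduler:
    """Hypothetical scheduler assigning reservations to receiving servers."""

    def __init__(self, capacities):
        # capacities maps receiving-server name -> free capacity in bytes.
        self.free = dict(capacities)
        self.reservations = {}

    def reserve(self, client, size):
        """Reserve `size` bytes for `client`.

        Returns the chosen receiving server, or None if the reservation
        is denied because no server has enough free capacity.
        """
        candidates = [name for name, free in self.free.items() if free >= size]
        if not candidates:
            return None
        # Assumed policy: pick the server with the most free capacity.
        server = max(candidates, key=lambda name: self.free[name])
        self.free[server] -= size
        self.reservations[client] = (server, size)
        return server


scheduler = Scheduler({"rs1": 100, "rs2": 50})
print(scheduler.reserve("client-a", 60))  # assigned to a receiving server
print(scheduler.reserve("client-b", 45))
print(scheduler.reserve("client-c", 50))  # denied: nothing large enough left
```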