Data Acquisition and Data Publishing with eSciDoc Matthias Razum DataCite Summer Meeting Hannover June 7-8, 2010


Page 2

Agenda

eSciDoc „in a nutshell“

How to get the data into the repository

Data publication

Page 3

“Continuum of Data”

Idea / Exploration
Data Acquisition / Experiment
Aggregation / Analysis
Publication / Archival

eSciDoc Infrastructure

eSciDoc Solutions, Services, and existing Tools

Collaboration

eSciDoc Vision

Page 4

Detour 4: eSciDoc Services

User Account Handler
Role Handler
Group Handler
Security
Statistics Manager
Object Manager
Item Service
Container Service
Organizational Unit Service
Policy Enforcement Point
Context Service
Set Service
Content Model Service
Aggregation Definition Service
Statistics Data Service
Scope Service
Report Service
Report Definition Service
Admin Service
Technical Metadata Extraction
Search and Indexing
Digilib
PID Service
Duplicate Detection
Admin Tool
OAI-PMH
Control of Named Entities
Data Acquisition Service
Transformation
Citation Style Service
Deposit (SWORD)
Validation

Page 5

[Diagram: a Container grouping Items; labels shown are Item, Resource, Metadata, and the renditions Thumbnail, Web Resolution, and Original Resolution]

Page 6

Agenda

eSciDoc „in a nutshell“

How to get the data into the repository

Data publication

Page 7

Data in the Publication Process Today

Library

Publication

Manuscript

Private Files

Data Metadata

J. Helly, H. Staudigel, and A. Koppers, Scalable models of data sharing in Earth sciences, Geochem. Geophys. Geosyst., 4(1), 1010, doi:10.1029/2002GC000318, 2003

Research

Page 8

BW-eLabs

BW-eLabs gives access to virtual and remote experiments in the field of nanotechnology and holography.

Key concepts include:
reproducibility of experiments
discoverability of and access to primary data
storage and curation of all artifacts that emerge throughout the research process

Project partners:
Universität Stuttgart
HdM Stuttgart
Universität Freiburg
FIZ Karlsruhe

Page 9

BW-eLabs: Collecting Data throughout the Research Process

Preparation
Synthesis and Online Analysis
Offline Analysis and Visualization
Publication

Common Data Infrastructure

[Diagram labels, repeated: Experiment; Data Objects]

Page 10

BW-eLabs: Data Acquisition in the Lab

Office
Laboratory
Data Center

Infrastructure
eSync Daemon
Deposit Service
Metadata Extractor
eLab Solution
eSciDoc Items

Instrument (e.g. spectroscope)
Data files from online analysis
Monitored folder
Replicated data files
Data file for metadata extraction
Extracted metadata

Creates new experiment (Container)
POSTs new configuration to eSync daemon

[Diagram: workflow with steps numbered 1 to 9]
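The monitored-folder flow on this slide can be illustrated with a small polling routine. This is a minimal sketch, not the eSync daemon's actual implementation: `extract_metadata`, `scan_once`, and the `deposit` callback are hypothetical names, and the real Deposit Service interface is not shown on the slide, so the deposit step is represented by a plain Python callable.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Set


def extract_metadata(path: Path) -> Dict[str, str]:
    """Hypothetical stand-in for the Metadata Extractor: name, size, checksum."""
    data = path.read_bytes()
    return {
        "name": path.name,
        "size": str(len(data)),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def scan_once(folder: Path, seen: Set[str],
              deposit: Callable[[Path, Dict[str, str]], None]) -> int:
    """Deposit every file in the monitored folder that has not been seen yet.

    Returns the number of newly deposited files.
    """
    count = 0
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.name not in seen:
            deposit(path, extract_metadata(path))  # assumed deposit interface
            seen.add(path.name)
            count += 1
    return count


def demo() -> List[str]:
    """Run two scans over a temporary folder and return the deposited names."""
    deposited: List[str] = []
    seen: Set[str] = set()
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "spectrum_001.dat").write_text("1.0 2.0 3.0")
        scan_once(folder, seen, lambda p, md: deposited.append(md["name"]))
        (folder / "spectrum_002.dat").write_text("4.0 5.0")
        scan_once(folder, seen, lambda p, md: deposited.append(md["name"]))
    return deposited
```

Tracking already-seen file names keeps a second scan from depositing the same file twice; a production daemon would more likely react to file-system events and persist this state.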

Page 11

Core Scientific Metadata Model

B. Matthews et al. „Using a Core Scientific Metadata Model in Large-Scale Facilities.“ 5th International Digital Curation Conference. London, 2009.

Page 12

BW-eLabs: Metadata Model

Institution
Investigator
Study
Investigation
Dataset
Datafile
Related Datafile
Sample
Rig
Instrument

Page 13

Organizational Unit “Institut”
Organizational Unit “Work Group”
User Account

Context “Nano Lab”
Container “Study X”
Other Studies
Container “Investigation 1”
Other Investigations for Project X
Container “Dataset”
Other Datasets for Investigation 1
Item “Datafile”

Item “Sample”
Item “Configuration”

Context “Rigs”
Rig Descriptions

Context “Instruments”
Instrument Descriptions

Page 14

Organizational Unit “Institut”
Organizational Unit “Work Group”
User Account

Context “Nano Lab”
Container “Study X”
Other Studies
Container “Investigation 1”
Other Investigations for Project X
Container “Dataset”
Other Datasets for Investigation 1
Item “Datafile”

Item “Sample”
Item “Configuration”

Context “Rigs”
Rig Descriptions

Context “Instruments”
Instrument Descriptions

Relations:
isConfigurationOf
requires
consistsOf
isMeasuredWith
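The mapping of studies, investigations, datasets, and data files onto Containers and Items can be sketched with plain data classes. This is an illustrative sketch only: `Resource`, `relate`, and `paths` are hypothetical names, not eSciDoc API calls. The nesting Study, Investigation, Dataset, Datafile is read off the labels on the slide; the endpoints of the typed relations are not recoverable from the slide, so the one relation used in the example is a guess and is marked as such.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Resource:
    kind: str   # "Container" or "Item"
    label: str  # e.g. "Study X", "Datafile"
    members: List["Resource"] = field(default_factory=list)


# Relation names as they appear on the slide.
PREDICATES = {"isConfigurationOf", "requires", "consistsOf", "isMeasuredWith"}

Relation = Tuple[str, str, str]  # (subject label, predicate, object label)


def relate(subject: Resource, predicate: str, obj: Resource) -> Relation:
    """Build a typed link between two resources, rejecting unknown predicates."""
    if predicate not in PREDICATES:
        raise ValueError(f"unknown relation: {predicate}")
    return (subject.label, predicate, obj.label)


def paths(root: Resource, prefix: str = "") -> List[str]:
    """List every resource below root as a slash-separated containment path."""
    here = f"{prefix}/{root.label}" if prefix else root.label
    result = [here]
    for member in root.members:
        result.extend(paths(member, here))
    return result


def build_study() -> Resource:
    """Containment hierarchy assumed from the slide labels."""
    datafile = Resource("Item", "Datafile")
    dataset = Resource("Container", "Dataset", [datafile])
    investigation = Resource("Container", "Investigation 1", [dataset])
    return Resource("Container", "Study X", [investigation])


# Guessed endpoints, for illustration only: the slide does not say which
# resources isMeasuredWith connects.
example_link = relate(Resource("Container", "Dataset"),
                      "isMeasuredWith",
                      Resource("Item", "Instrument Description"))
```

Calling `paths(build_study())` walks the hierarchy from the study down to the individual data file.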

Page 15

Agenda

eSciDoc „in a nutshell“

How to get the data into the repository

Data publication

Page 16

Data Publication Support in eSciDoc

Object Lifecycle
Versions
Persistent Identifiers
Authentication and Authorization
Linking Resources

Page 17

Item/Container Lifecycle

States: pending, submitted, in-revision, released, withdrawn

pending:
Only the creator can access and modify the object
Collaborators may be invited
Object may be deleted

submitted:
Quality control / editorial process
Creator cannot modify the object any longer
Metadata may still be enriched

released:
Object publicly accessible
Version is fixed
Metadata can be harvested
Full-text indexed

withdrawn:
Access to object is restricted
Only metadata is publicly accessible
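The lifecycle states on this slide lend themselves to a small state machine. The sketch below is an assumption-laden illustration, not eSciDoc code: the slide names the five states, but the arrows between them did not survive extraction, so the transition table (including the action names `revise` and `withdraw`) is assumed, as is the rule that the creator may edit again while an object is in revision.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    SUBMITTED = "submitted"
    IN_REVISION = "in-revision"
    RELEASED = "released"
    WITHDRAWN = "withdrawn"


# Assumed transitions; only the state names come from the slide.
TRANSITIONS = {
    (Status.PENDING, "submit"): Status.SUBMITTED,
    (Status.SUBMITTED, "revise"): Status.IN_REVISION,
    (Status.IN_REVISION, "submit"): Status.SUBMITTED,
    (Status.SUBMITTED, "release"): Status.RELEASED,
    (Status.RELEASED, "withdraw"): Status.WITHDRAWN,
}


def transition(status: Status, action: str) -> Status:
    """Return the next status, or raise ValueError if the action is not allowed."""
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(f"cannot {action} an object that is {status.value}") from None


def creator_may_modify(status: Status) -> bool:
    """Pending objects are editable by the creator (from the slide);
    in-revision is assumed to behave the same way."""
    return status in {Status.PENDING, Status.IN_REVISION}
```

For example, `transition(Status.PENDING, "submit")` yields `Status.SUBMITTED`, while trying to release a pending object raises `ValueError` because it has not passed quality control.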

Page 18

Lifecycle, Versions, and Persistent Identifiers

[Diagram: timeline of one Resource across versions 1 to 4, showing version-status and public-status (pending, submitted, released) at each step]

Timeline labels:
Creator saves and continues to work on resource
submit()
release()
Creator continues to work on resource

Visible version for unprivileged users (role „default“)
Visible version for creator (role „depositor“)

Object PID 10.123/1
Version PID 10.123/2
Version PID 10.123/3
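The distinction between the version visible to unprivileged users and the version visible to the creator can be sketched as a lookup over the version history. This is a hedged illustration: `Version` and `visible_version` are hypothetical names, and the rule itself (role "default" sees the latest released version, role "depositor" sees the latest version of any status) is an assumption inferred from the slide labels, not a statement of how eSciDoc implements it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    number: int
    status: str  # "pending", "submitted", or "released"


def visible_version(versions: List[Version], role: str) -> Optional[int]:
    """Return the version number a given role would see, or None if nothing is visible."""
    if not versions:
        return None
    if role == "depositor":
        # Assumed: the creator always sees the newest working version.
        return max(v.number for v in versions)
    # Assumed: everyone else sees only the newest released version.
    released = [v.number for v in versions if v.status == "released"]
    return max(released) if released else None


# Illustrative history: version 3 was released, version 4 is being edited.
history = [
    Version(1, "released"),
    Version(2, "released"),
    Version(3, "released"),
    Version(4, "pending"),
]
```

With this history, the public keeps seeing version 3 while the creator works on version 4, which matches the idea that a released version is fixed.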

Page 19

Authentication and Authorization

User
Groups
Roles
Policies
Scope
Actions
Resource
Policy Decision Point
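How users, groups, roles, policies, scopes, and actions might combine in a Policy Decision Point can be sketched as follows. Everything here is hypothetical: the class names, the role and action names in the example, and the deny-by-default rule are assumptions for illustration, and the sketch does not reproduce the policy language eSciDoc actually uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Policy:
    action: str  # e.g. "retrieve" or "update" (illustrative action names)
    scope: str   # identifier of the scope the policy applies to, or "*" for any


@dataclass
class User:
    name: str
    roles: Set[str] = field(default_factory=set)
    groups: Set[str] = field(default_factory=set)


def decide(user: User, action: str, resource_scope: str,
           role_policies: Dict[str, List[Policy]],
           group_roles: Dict[str, Set[str]]) -> bool:
    """Permit the action if any role held directly or via a group has a matching policy."""
    roles = set(user.roles)
    for group in user.groups:
        roles |= group_roles.get(group, set())
    for role in roles:
        for policy in role_policies.get(role, []):
            if policy.action == action and policy.scope in ("*", resource_scope):
                return True
    return False  # deny by default (assumed)


# Illustrative configuration; names are not taken from the slide.
ROLE_POLICIES = {
    "default": [Policy("retrieve", "*")],
    "depositor": [Policy("update", "nano-lab")],
}
GROUP_ROLES = {"work-group": {"depositor"}}

alice = User("alice", roles={"default"}, groups={"work-group"})
guest = User("guest", roles={"default"})
```

Here `alice` inherits the depositor role through her group and may update resources in the `nano-lab` scope, while `guest` may only retrieve.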

Page 20

Discipline-specific Data Repositories

Page 21

Linking Resources

eSciDoc

PID Service
PID1, PID2, PID3, PID4

Other Repositories

Shibboleth, LDAP
OAI-PMH, OAI-ORE
OpenSearch, SRU/W

Page 22

Thank you

[email protected]

www.escidoc.org