Collaboration on Large Datasets using Globus
Rachana Ananthakrishnan
University of Chicago
Slide 2
Data sharing in collaborations
[Diagram labels: Registry, Staging Store, Ingest Store, Analysis Store, Community Store, Archive, Mirror]
Slide 3
Data Management User Stories
I need a good place to store / backup / archive my (big) research data.
I need to easily, quickly, and reliably move or mirror portions of my data to other places.
I need a way to easily and securely share my data with my colleagues at other institutions.
I want to publish my data.
I want to discover published data.
Slide 4
Exemplar: ISI-MIP
Inter-Sectoral Impact Model Intercomparison Project
Framework to collate climate impact data across scales and sectors
World-wide collaboration with data assets managed by the collaboration
Inputs from various climate models; output forms basis for model evaluation and improvement
Credits: Dr. Joshua Elliot, University of Chicago
Slide 5
ISI-MIP Use Cases
Share data with researchers across institutions world-wide
  Restricted sharing
  Multiple institutions
Accept data submissions
  Restricted writing to archive
Publish results
Move selected results to other locations
Track metadata
Discover data
Slide 6
What is Globus?
Big data publish*, transfer and sharing with Dropbox-like simplicity, directly from your own storage systems
* In pilot phase
Slide 7
Publish walk-through
[Diagram labels: Collaboration Archive; Univ. of Chicago, Argonne, IIT, UIUC; Scientist; Curator]
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
Slide 8
Login with Campus Identity
Slide 9
New submission
Slide 10
Assemble the Dataset
Slide 11
Move data to publish archive
Slide 12
Grant Submission License
Slide 13
Submission Complete
Slide 14
Curator Logs in
Slide 15
Curation Workflow Options
Slide 16
Verify Metadata & Files
Slide 17
Approve the Submission
Slide 18
Submission is now Published with DOI
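The publish walk-through on the preceding slides can be read as a small state machine: the scientist describes, assembles, and submits the dataset, and the curator approves it for publication. The following is a minimal sketch of that reading only. The stage names, the role names, and the `Submission` class are assumptions made for illustration; they are not Globus identifiers or the Globus API.

```python
from dataclasses import dataclass, field

# Hypothetical stages and roles, named after the slide titles (not Globus identifiers).
# Each entry maps a stage to (next stage, role allowed to perform the step).
TRANSITIONS = {
    "new": ("described", "scientist"),        # describe the submission
    "described": ("assembled", "scientist"),  # transfer data into the archive
    "assembled": ("submitted", "scientist"),  # grant the submission license
    "submitted": ("published", "curator"),    # verify metadata and files, approve
}


@dataclass
class Submission:
    title: str
    stage: str = "new"
    doi: str = ""
    history: list = field(default_factory=list)

    def advance(self, role: str, doi: str = "") -> str:
        """Move to the next stage if the given role is allowed to do so."""
        if self.stage not in TRANSITIONS:
            raise ValueError("submission is already published")
        next_stage, required_role = TRANSITIONS[self.stage]
        if role != required_role:
            raise PermissionError(f"{role} cannot move a submission out of '{self.stage}'")
        if next_stage == "published":
            if not doi:
                raise ValueError("a DOI must be supplied when publishing")
            self.doi = doi
        self.stage = next_stage
        self.history.append((role, next_stage))
        return next_stage
```

In this sketch a scientist calling `advance("scientist")` three times reaches the "submitted" stage, and only a call with the curator role and a DOI value moves the submission to "published".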
Slide 19
Discover walk-through
[Diagram labels: Collaboration Archive; Univ. of Chicago, Argonne, IIT, UIUC; Scientist; Curator]
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
5. Search
6. Download
Slide 20
Search Published Datasets
Slide 21
Discovering a Published Dataset
Slide 22
Download the Published Dataset
Slide 23
Select Download Destination
Slide 24
Globus Under the Covers
Identity, Group, Profile Management Services
Sharing Service
Transfer Service
Globus Toolkit
Globus APIs
Globus Connect
Slide 25
Reliable, secure, high-performance file transfer and synchronization
Fire-and-forget transfers
Automatic fault recovery
Seamless security integration
Powerful GUI and APIs
[Diagram labels: Data Source, Data Destination]
1. User initiates transfer request
2. Globus moves and syncs files
3. Globus notifies user
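The three steps above (request, move and sync with fault recovery, notify) can be illustrated with a local-filesystem stand-in. This is a sketch under stated assumptions, not the Globus transfer service or its API: the function name `sync_with_retry`, the `notify` callback, the `max_attempts` parameter, and the use of SHA-256 checksums are all choices made for this example.

```python
import hashlib
import shutil
from pathlib import Path


def checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sync_with_retry(src: Path, dst: Path, notify, max_attempts: int = 3,
                    copy=shutil.copyfile) -> None:
    """Copy files from src to dst, skip identical ones, retry failures, then notify.

    Hypothetical stand-in for a managed transfer; `copy` is injectable so that
    faults can be simulated.
    """
    copied, skipped, failed = [], [], []
    for src_file in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        # Sync: a file that already matches at the destination is not moved again.
        if dst_file.exists() and checksum(dst_file) == checksum(src_file):
            skipped.append(rel.as_posix())
            continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        # Fault recovery: retry the copy until it succeeds and verifies.
        for _ in range(max_attempts):
            try:
                copy(src_file, dst_file)
                if checksum(dst_file) == checksum(src_file):
                    copied.append(rel.as_posix())
                    break
            except OSError:
                continue
        else:
            failed.append(rel.as_posix())
    # Notification: the caller hears back once, when the whole request is done.
    notify({"copied": copied, "skipped": skipped, "failed": failed})
```

The caller hands over the request and a callback and does not supervise individual files, which is the fire-and-forget idea in miniature. Running the same request a second time copies nothing, because every file already matches.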
Slide 26
Simple, secure sharing off existing storage systems
[Diagram label: Data Source]
1. User A selects file(s) to share, selects user or group, and sets permissions
2. Globus tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus and accesses shared file
Easily share large data with any user or group
No cloud storage required
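The sharing steps above amount to keeping access rules next to data that stays where it is. The following is a minimal sketch of such a rule table, assuming a simple model of path prefixes, users, groups, and permission strings ("r", "rw"). The `SharedEndpoint` class and its methods are hypothetical and are not the Globus sharing service API.

```python
from dataclasses import dataclass, field


@dataclass
class SharedEndpoint:
    """Hypothetical record of access rules for files that stay on the owner's storage."""
    owner: str
    groups: dict = field(default_factory=dict)  # group name -> set of user names
    rules: list = field(default_factory=list)   # (principal, path prefix, permissions)

    def share(self, principal: str, path: str, permissions: str = "r") -> None:
        """Grant a user or group the given permissions on everything under path."""
        self.rules.append((principal, path.rstrip("/") + "/", permissions))

    def can(self, user: str, path: str, permission: str) -> bool:
        """Check whether user holds permission ('r' or 'w') on path."""
        if user == self.owner:
            return True
        for principal, prefix, perms in self.rules:
            is_member = principal == user or user in self.groups.get(principal, set())
            if is_member and (path + "/").startswith(prefix) and permission in perms:
                return True
        return False
```

Only the rule table changes when User A shares a folder; no file is copied. A read-only rule for a group and a read-write rule on a separate incoming folder would cover the "restricted sharing" and "restricted writing to archive" use cases listed earlier.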
Slide 27
Thank you
Sign up and use Globus to transfer and share: globus.org/signup
Sign up as an early adopter of publish: globus.org/data-publication
Support: [email protected]
Slide 28
Thank you to our sponsors!
U.S. DEPARTMENT OF ENERGY