Genetic Diversity and High‐Performance File Systems Archiving, CIUK 2016, David C Taylor, HPC Account Manager

Genetic Diversity and High Performance File Systems Archiving193.62.125.70/CIUK-2016/Spectra.pdf · Campaign Storage HSM (Spectra Logic Archive for Lustre) • Entirely new archive


Genetic Diversity and High‐Performance File Systems Archiving

CIUK 2016

David C Taylor HPC Account Manager


Spectra Logic, Boulder, Colorado


Society’s Genome

• Organizations are like organisms: they must preserve their DNA, which is their most important asset

• There’s no permanent storage medium available at this point

• Digital preservation (maintaining and moving this information forward) may be our most important mission


The Digital Universe: Risks and Threats

• New Forms of Cyber Attack

The New Nation‐State Attack… Sony
‐ Sony was thrown into the Dark Ages of compute
‐ Large quantities of Sony’s data were destroyed

The New Criminal Approach: Ransomware

The New Nation‐State Attack… Saudi Aramco
‐ 30,000 hard drives were scrapped and replaced
‐ Shift from cyber spying to destruction of operations and data

Malware infecting hard disk firmware
‐ Remained hidden for 15 years


Electromagnetic Pulse (EMP)

IT infrastructure can be destroyed by short, sharp pulses high in voltage but low in energy


Intentional Electromagnetic Interference (IEMI)

Figure labels: $2,000; $250; Nation‐State

“With the proliferation of cloud computing, more data is being placed in fewer baskets, and that reliance on failover sites has reduced physical security”

‐ George Baker, CEO of BAYCOR, Data Center World Conference, 2014


Traditional Functions of Storage Cloud

• Using the Cloud to improve process
  – Transcoding
  – Distribution / Convenience Copy

• Using the Cloud for long‐term storage
  – Diversity of Geography
  – Disaster Recovery


Private Cloud vs. Public Cloud

Private Cloud Solution    1 year CapEx cost per GB    Private system paid for after (vs. Amazon Glacier Storage)
10 PB                     $0.099                      10 months
2.4 PB                    $0.163                      1.5 years
300 TB                    $0.307                      2.5 years

How long will it take to pay off the CapEx if the same amount of data were kept in Amazon’s Glacier?
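
The payback periods on this slide can be checked with simple arithmetic. The sketch below assumes a Glacier price of $0.01 per GB per month. That rate is not stated on the slide; it is used here only because it approximately reproduces the quoted figures. Operating costs and retrieval fees are ignored, and the function name `payback_months` is illustrative.

```python
# Assumed Glacier price in USD per GB per month (NOT given on the slide).
ASSUMED_GLACIER_USD_PER_GB_MONTH = 0.01

# One-year CapEx cost per GB for each private system size, from the slide.
PRIVATE_CAPEX_USD_PER_GB = {
    "10 PB": 0.099,
    "2.4 PB": 0.163,
    "300 TB": 0.307,
}


def payback_months(capex_per_gb, cloud_per_gb_month=ASSUMED_GLACIER_USD_PER_GB_MONTH):
    """Months of cloud storage fees needed to equal the private CapEx per GB."""
    return capex_per_gb / cloud_per_gb_month


if __name__ == "__main__":
    for size, capex in PRIVATE_CAPEX_USD_PER_GB.items():
        months = payback_months(capex)
        print(f"{size}: {months:.1f} months ({months / 12:.1f} years)")
```

With the assumed rate this gives roughly 10, 16, and 31 months, in line with the slide's rounded 10 months, 1.5 years, and 2.5 years.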


Genetic Diversity serves as a way for populations to adapt to changing environments


Archival Storage Technologies for Genetic Diversity

Technology                 Capacity         Transfer rate   Bit error rate**
Flash SSD*                 400 GB           550 MB/sec      1 x 10^-17
Archive SMR SATA HDD       8 TB (Raw)       150 MB/sec      1 x 10^-14
Transactional SAS HDD      8 TB (Raw)       160 MB/sec      1 x 10^-15
IBM LTO7 Digital Tape      6 TB (Native)    320 MB/sec      1 x 10^-19
IBM TS1150 Digital Tape    10 TB (Native)   366 MB/sec      1 x 10^-20
Cloud                      Unlimited        ???             ???

*  Stats vary for SSD based on model
** Bit error rate: unrecoverable data error
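
To make the bit error rates easier to compare, the sketch below converts them into an expected number of unrecoverable errors per petabyte read. It assumes each rate means "expected unrecoverable errors per bit read", which is a common reading of such figures but is not spelled out on the slide. The dictionary keys and the function name `expected_errors` are illustrative.

```python
# Bit error rates from the slide, read as unrecoverable errors per bit read
# (this interpretation is an assumption).
BIT_ERROR_RATE = {
    "Flash SSD": 1e-17,
    "Archive SMR SATA HDD": 1e-14,
    "Transactional SAS HDD": 1e-15,
    "IBM LTO7 tape": 1e-19,
    "IBM TS1150 tape": 1e-20,
}

PETABYTE = 10**15  # bytes (decimal petabyte)


def expected_errors(bytes_read, bit_error_rate):
    """Expected number of unrecoverable bit errors when reading bytes_read bytes."""
    return bytes_read * 8 * bit_error_rate


if __name__ == "__main__":
    for name, ber in BIT_ERROR_RATE.items():
        errors = expected_errors(PETABYTE, ber)
        print(f"{name}: {errors:.4g} expected errors per PB read")
```

Under that reading, a full petabyte read from archive SMR disk would be expected to hit about 80 unrecoverable errors, against roughly 0.0008 for LTO7 tape.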


State of the Tape Storage Industry ‐ Tape Technology Roadmap

Roadmap timeline: 2013 to 2017

Enterprise TS
• IBM TS1150: 10 TB, 360 MB/s, GMR
• IBM TS1155: ~15 TB, 360 MB/s, TMR

LTO
• LTO-7: 6 TB, 300 MB/s, GMR
• LTO-8: ~12 TB, 360 MB/s, TMR

• TS11x0 & LTO now on a 2 year cadence
• 3 more generations already in the works, up to 80+ TB
• All other brands stalled


Tri‐media ‐ Three Different Tape Technologies in the same library

• Spectra removes all constraints
  – Performance
  – Technology Flexibility – X3
  – Universal HSM compatibility


TFinity Scalability

• 44 Frames
• Dual Robotics
• 53,000 Slots
• 384 Drives ~140GB/s
• >>1EB in a single machine
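
The aggregate throughput figure can be sanity-checked against the drive count. The sketch below assumes every drive streams at 360 MB/s and every slot holds a 10 TB native cartridge, the TS1150 figures from the roadmap slide; the slide itself does not say which drive or cartridge its numbers are based on. Both function names are illustrative.

```python
def aggregate_gb_per_s(drives, mb_per_s_per_drive):
    """Aggregate throughput in GB/s if every drive streams at its rated speed."""
    return drives * mb_per_s_per_drive / 1000


def native_capacity_pb(slots, tb_per_cartridge):
    """Native (uncompressed) capacity in PB if every slot holds one cartridge."""
    return slots * tb_per_cartridge / 1000


if __name__ == "__main__":
    # 384 drives at an assumed 360 MB/s each
    print(f"{aggregate_gb_per_s(384, 360):.2f} GB/s")
    # 53,000 slots of assumed 10 TB native cartridges
    print(f"{native_capacity_pb(53_000, 10):.0f} PB")
```

Under these assumptions the drives deliver about 138 GB/s, consistent with the slide's ~140GB/s. The native capacity comes to 530 PB, so the ">>1EB" figure presumably relies on compression or larger future cartridges; the slide does not say which.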


Achieving a Diversified Storage Platform

Diagram labels:
• Traditional Storage Model
• Traditional File Based Storage
• Object Storage Ecosystem
• Object Storage Gateway
• RESTful API
• HTTP/10 GigE
• Private Cloud (on premise)
• Nearline Disk (SAS)
• Nearline Archive Disk (SMR)
• SMR Drives
• Tape Libraries
• Off‐Site Storage


Spectra driving HPC expansion

Diagram labels:
• Black Pearl Data Sources
• Black Pearl Data Destinations
• Nearline Disk (SAS)
• Archive Disk (SATA/SMR)
• Spectra Tape (LTO/TS11x0)
• Black Pearl Replication
• Public Cloud


Current HSM Issues

– Expensive
– Limited object count (before scaling)
– Limited monitoring and control
– Command line control only
– Slow search
– Not conducive to end‐user access

Diagram labels:
• Lustre File System
• Lustre MDS, MDT, Coordinator, Changelog
• Lustre OSS
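
For context on the "command line control only" point: in stock Lustre, HSM actions are typically issued per file through the `lfs` utility. As a hedged illustration, the sketch below only builds the argument lists for the `lfs hsm_archive`, `lfs hsm_release`, `lfs hsm_restore`, and `lfs hsm_state` subcommands and does not execute them. The helper name `hsm_command` and the example path are hypothetical, and the slide does not say which commands it has in mind.

```python
import shlex

# Lustre HSM subcommands of the `lfs` utility covered by this sketch.
HSM_ACTIONS = ("hsm_archive", "hsm_release", "hsm_restore", "hsm_state")


def hsm_command(action, path):
    """Build the argv list for an `lfs hsm_*` call on one file (not executed)."""
    if action not in HSM_ACTIONS:
        raise ValueError(f"unsupported HSM action: {action}")
    return ["lfs", action, path]


if __name__ == "__main__":
    example = "/mnt/lustre/project/run42.dat"  # hypothetical path
    for action in HSM_ACTIONS:
        # Print the command line an administrator would type by hand.
        print(shlex.join(hsm_command(action, example)))
```

Driving an archive this way means one command per file or file list, which is part of why the slide lists monitoring, search, and end-user access as weak points.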


Campaign Storage HSM (Spectra Logic Archive for Lustre)

• Entirely new archive system
  – Fully scalable: speed and object count
  – Relies on well known tools and systems (Robinhood, Intel Lustre, ZFS)
  – Revolutionary new architecture for parallel file system archive

• Joint development between Campaign Storage and Spectra Logic
  – Partnering with Intel for Lustre/support and CEA for Robinhood

• Targeting Parallel, High‐Performance File Systems
  – First Implementation: Intel Lustre System

• Functional NOW


How does it work ‐ software modules

Diagram labels:
• Customer Lustre Cluster
• Lustre Client (x2)
• Lustre MDS & HSM coordinator
• Lustre OSS (x3)
• HSM request source:
  1. FS access (restore)
  2. Lustre HSM
  3. Robinhood Util
• Campaign Storage HSM
• Robinhood VM
• Robinhood policy DB service
• Campaign Storage VM
• Metadata Service / ZFS
• Subtree Search & Mgmt
• Mover ‐ Lustre Client, Agent / Copytool (x2)
• Parallel Spectra S3 Bulk API
• Black Pearl Archive
• Black Pearl OS
• DS3 Gateway (x3)
• Flash
• Disk or Flash
• SL Tape Library
• Web Browser UI
• Spectra / Intel / Campaign Storage / CEA
• Archive
• Spectra Logic Backend
• BlackPearl ‐ Online
• ArcticBlue ‐ Nearline
• Automated Tape Libraries
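
The diagram suggests a flow in which HSM requests from several sources reach the coordinator, which hands them to mover nodes running an agent / copytool. The toy in-memory model below sketches that flow only. All class names are hypothetical, the round-robin dispatch is an assumption, and nothing here reflects the actual Campaign Storage or Lustre implementation.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class HsmRequest:
    action: str  # "archive" or "restore"
    path: str
    source: str  # e.g. "FS access", "Lustre HSM", "Robinhood"


@dataclass
class CopytoolAgent:
    """Stands in for a mover node running an agent / copytool."""
    name: str
    handled: list = field(default_factory=list)

    def run(self, request):
        # A real copytool would move data to or from the archive here.
        self.handled.append(request)


class Coordinator:
    """Toy HSM coordinator: queues requests and hands them to agents in turn."""

    def __init__(self, agents):
        self._queue = deque()
        self._agents = cycle(agents)  # assumed round-robin dispatch

    def submit(self, request):
        self._queue.append(request)

    def dispatch_all(self):
        while self._queue:
            next(self._agents).run(self._queue.popleft())


if __name__ == "__main__":
    movers = [CopytoolAgent("mover-1"), CopytoolAgent("mover-2")]
    coordinator = Coordinator(movers)
    coordinator.submit(HsmRequest("restore", "/lustre/a.dat", "FS access"))
    coordinator.submit(HsmRequest("archive", "/lustre/b.dat", "Robinhood"))
    coordinator.submit(HsmRequest("archive", "/lustre/c.dat", "Lustre HSM"))
    coordinator.dispatch_all()
    for mover in movers:
        print(mover.name, [request.path for request in mover.handled])
```

Adding more movers to the list spreads the same request queue across more copytools, which is the scaling idea the diagram appears to convey with its two mover boxes.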


Value Added Feature Examples

• Sharding / striping / packing (Large Files, Small Files, Speed scaling)
• Multiple sources / multiple targets
• Full Job Management
• Subtree archive – instant searches
• Tape, Disk, Nearline automatically
• Data protection & Validation
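
The slide names sharding and packing without describing how they work. As a generic illustration only, the sketch below splits a large file into fixed-size shards (which could then be written in parallel) and greedily packs small files into larger containers. The function names, sizes, and the greedy strategy are assumptions and are not taken from the product.

```python
def shard_ranges(file_size, shard_size):
    """Split a large file into (offset, length) shards of at most shard_size bytes."""
    return [
        (offset, min(shard_size, file_size - offset))
        for offset in range(0, file_size, shard_size)
    ]


def pack_small_files(file_sizes, container_size):
    """Greedily pack (name, size) pairs, in the given order, into containers.

    A file larger than container_size still gets a container of its own.
    """
    containers, current, used = [], [], 0
    for name, size in file_sizes:
        # Close the current container when the next file would overflow it.
        if current and used + size > container_size:
            containers.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        containers.append(current)
    return containers


if __name__ == "__main__":
    print(shard_ranges(10, 4))
    print(pack_small_files([("a", 3), ("b", 3), ("c", 3)], 6))
```

Sharding lets one large file use several drives at once, while packing avoids writing many tiny objects to tape individually.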



Thank You!