Functional Assessment of Erasure Coded Storage Archive Blair Crossman Taylor Sanchez Josh Sackos LA-UR-13-25967 Computer Systems, Cluster, and Networking Summer Institute

Functional Assessment of Erasure Coded Storage Archive · 2019-11-07 · Functional Assessment of Erasure Coded Storage Archive Blair Crossman Taylor Sanchez Josh Sackos LA-UR-13-25967


Functional Assessment of Erasure Coded Storage Archive

Blair Crossman Taylor Sanchez Josh Sackos

LA-UR-13-25967

Computer Systems, Cluster, and Networking Summer Institute


Presentation Overview

•  Introduction

•  Caringo Testing

•  Scality Testing

•  Conclusions


Storage Mediums

•  Tape
    o  Priced for capacity, not bandwidth

•  Solid State Drives
    o  Priced for bandwidth, not capacity

•  Hard Disk
    o  Bandwidth scales with more drives


Object Storage: Flexible Containers

•  Files are stored in data containers

•  Metadata is kept outside of the file system

•  Key-value pairs

•  File system scales with machines

•  METADATA EXPLOSIONS!!


What is the problem?

•  RAID, replication, and tape systems were not designed for exascale computing and storage

•  Hard disk capacity continues to grow

•  Solution to multiple hard disk failures is needed


Erasure Coding : Reduce Rebuild Recalculate

Reduce! Rebuild! Recalculate!
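
To make the rebuild and recalculate idea concrete, here is a minimal sketch of the simplest possible erasure code: a single XOR parity fragment over equal-length data fragments. This is an illustrative toy only. The systems assessed in this talk use an n=3, k=3 configuration, which needs a stronger code than single parity, and the function names below are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(fragments: List[bytes]) -> bytes:
    """Compute one parity fragment over equal-length data fragments."""
    return reduce(xor_bytes, fragments)


def rebuild(fragments: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Recalculate a single lost fragment (marked None) from the survivors."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost fragment")
    repaired = list(fragments)
    if missing:
        survivors = [frag for frag in fragments if frag is not None]
        # parity XOR all surviving fragments yields the lost fragment
        repaired[missing[0]] = reduce(xor_bytes, survivors, parity)
    return repaired


data = [b"abc", b"def", b"ghi"]
parity = encode(data)
restored = rebuild([b"abc", None, b"ghi"], parity)
```

Losing two fragments at once defeats this toy code, which is why multi-failure protection requires more parity fragments.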


Project Description

•  Erasure coded object storage file system is a potential replacement for LANL’s tape archive system

•  Installed and configured two prototype archives
    o  Scality
    o  Caringo

•  Verified the functionality of systems


Functionality Not Performance

Caringo
    o  SuperMicro admin node
    o  1GigE interconnect
    o  10 IBM System x3755
        §  4 x 1TB HDD
    o  Erasure coding:
        §  n=3
        §  k=3

Scality
    o  SuperMicro admin node
    o  1GigE interconnect
    o  6 HP ProLiant (DL160 G6)
        §  4 x 1TB HDD
    o  Erasure coding:
        §  n=3
        §  k=3
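
As a rough illustration of what an n=3, k=3 configuration costs and buys, the sketch below works out the arithmetic. It assumes n counts data fragments and k counts parity fragments, which the slides do not state explicitly; `erasure_profile` is a hypothetical helper name.

```python
def erasure_profile(n: int, k: int) -> dict:
    """Summarize an erasure configuration.

    Assumption: n is the number of data fragments and k the number of
    parity fragments, so any k fragments can be lost without losing data.
    """
    total = n + k
    return {
        "fragments_per_object": total,
        "tolerated_losses": k,
        "raw_bytes_per_user_byte": total / n,
        "usable_fraction": n / total,
    }


profile = erasure_profile(3, 3)
for name, value in profile.items():
    print(f"{name}: {value}")
```

Under that assumption, each object is spread over six fragments, half of the raw capacity is usable, and up to three fragments can be lost.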


Project Testing Requirements

•  Data
    o  Ingest : Retrieval : Balance : Rebuild

•  Metadata
    o  Accessibility : Customization : Query

•  POSIX Gateway
    o  Read : Write : Delete : Performance overhead


How We Broke Data

•  Pulled out HDDs (on Scality: killed the daemon)

•  Turned off nodes

•  Uploaded files, downloaded files

•  Used md5sum to compare originals to downloaded copies
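
The integrity check above can be sketched in Python as follows. The tests in this talk used the `md5sum` command; this is an equivalent illustration using the standard `hashlib` module, and the function names `md5_of` and `verify_round_trip` are hypothetical.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_round_trip(original: str, downloaded: str) -> bool:
    """True when the downloaded copy matches the original byte for byte,
    as far as MD5 can tell."""
    return md5_of(original) == md5_of(downloaded)
```

A call such as `verify_round_trip("original.dat", "downloaded.dat")` (placeholder file names) returns `False` if a pulled disk or powered-off node corrupted the data on its way back.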


Caringo: The automated storage system

•  Warewulf/Perceus-like diskless (RAM) boot

•  Reconfigurable, but changes require a reboot

•  DHCP PXE boot provisioned

•  Little flexibility or customizability

•  http://www.caringo.com


No Node Specialization

•  Nodes "bid" for tasks
    o  Lowest latency wins
    o  Distributes the work

•  Each node performs all tasks
    o  Administrator : Compute : Storage

•  Automated power management
    o  Set a sleep timer
    o  Set an interval to check disks

•  Limited Administration Options
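
The "lowest latency wins" rule can be sketched in a few lines. This only illustrates the selection step; the slides do not describe Caringo's actual bidding protocol, and the node names and latency figures below are made up.

```python
from typing import Dict


def pick_winner(bids: Dict[str, float]) -> str:
    """Return the node whose bid (reported latency) is lowest."""
    if not bids:
        raise ValueError("no nodes submitted a bid")
    return min(bids, key=bids.get)


# Hypothetical bids in milliseconds, one per node.
bids = {"node-a": 4.2, "node-b": 1.3, "node-c": 2.8}
winner = pick_winner(bids)
```

Because a busy node tends to report higher latency, repeating this selection for every task spreads work toward idle nodes.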


Caringo Rebuilds Data As It Is Written

•  Balances data as written
    o  Primary Access Node
    o  Secondary Access Node

•  Automated
    o  New HDD/Node: auto balanced
    o  New drives format automatically
    o  Rebuilds constantly
    o  If any node goes down, rebuild starts immediately
    o  Volumes can go "stale"
    o  14 day limit on unused volumes


What’s a POSIX Gateway

•  Content File Server
    o  Fully Compliant POSIX object
    o  Performs system administration tasks
    o  Parallel writes

•  Was not available for testing


“Elastic” Metadata

•  Accessible

•  Query: key values
    o  By file size, date, etc.

•  Indexing requires an “Elastic Search” machine
    o  Can be the bottleneck in the system
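
Querying key-value metadata by size or date can be pictured with the small in-memory sketch below. It is not the Caringo or Elastic Search API; the record fields, object names, and the `query` helper are all hypothetical.

```python
from datetime import date
from typing import Callable, Dict, List

# Hypothetical per-object metadata stored as key-value pairs.
objects: List[Dict] = [
    {"name": "run-001.dat", "size_bytes": 512, "created": date(2013, 6, 3)},
    {"name": "run-002.dat", "size_bytes": 2048, "created": date(2013, 6, 20)},
    {"name": "run-003.dat", "size_bytes": 4096, "created": date(2013, 7, 15)},
]


def query(records: List[Dict], **criteria: Callable) -> List[Dict]:
    """Return the records for which every named predicate holds."""
    return [
        record
        for record in records
        if all(test(record[key]) for key, test in criteria.items())
    ]


# Objects larger than 1000 bytes created before July 2013.
matches = query(
    objects,
    size_bytes=lambda size: size > 1000,
    created=lambda day: day < date(2013, 7, 1),
)
```

A linear scan like this is what a dedicated indexer avoids, which is also why the indexing machine can become the bottleneck.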


Minimum Node Requirements

•  Needs a full n + k nodes to:
    o  rebuild
    o  write
    o  balance

•  Does not need a full n + k nodes to:
    o  read
    o  query metadata
    o  perform administration
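
The node requirements above can be expressed as a small check. The split between the two groups of operations comes from the slide; the exact minimums for the degraded operations are not given, so the thresholds used here (n nodes for a read, one node for metadata queries and administration) are assumptions, as is the function name.

```python
NEEDS_FULL_SET = {"rebuild", "write", "balance"}
WORKS_DEGRADED = {"read", "query_metadata", "administration"}


def is_allowed(operation: str, live_nodes: int, n: int = 3, k: int = 3) -> bool:
    """Decide whether an operation can proceed with the nodes currently up."""
    if operation in NEEDS_FULL_SET:
        return live_nodes >= n + k
    if operation == "read":
        # Assumption: n reachable fragments suffice to reconstruct an object.
        return live_nodes >= n
    if operation in WORKS_DEGRADED:
        # Assumption: any single live node can serve these requests.
        return live_nodes >= 1
    raise ValueError(f"unknown operation: {operation}")


status = {op: is_allowed(op, live_nodes=5) for op in ("write", "read")}
```

With five live nodes and n=3, k=3, this sketch refuses writes but still serves reads.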


Static Disk Install

•  Requires disk install

•  Static IP addresses

•  Optimizations require deeper knowledge

•  http://www.scality.com


Virtual Ring Resilience

•  Operations succeed until fewer virtual nodes are available than the n+k erasure configuration requires

•  Data stored to ‘ring’ via distributed hash table
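
A minimal sketch of placing fragments on a hash ring is shown below. It illustrates the general distributed-hash-table idea only and is not Scality's implementation: the SHA-1 based position function, the one-position-per-virtual-node layout, and all names are assumptions made for the example.

```python
import bisect
import hashlib
from typing import List


def ring_position(value: str) -> int:
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")


class Ring:
    def __init__(self, virtual_nodes: List[str]) -> None:
        # Each virtual node owns one position on the ring.
        self._points = sorted((ring_position(name), name) for name in virtual_nodes)
        self._positions = [position for position, _ in self._points]

    def place(self, object_key: str, fragments: int) -> List[str]:
        """Assign fragments to consecutive virtual nodes, clockwise from the key."""
        if fragments > len(self._points):
            raise RuntimeError("fewer virtual nodes than fragments required")
        start = bisect.bisect_right(self._positions, ring_position(object_key))
        return [
            self._points[(start + step) % len(self._points)][1]
            for step in range(fragments)
        ]


ring = Ring([f"vnode-{i}" for i in range(6)])
placement = ring.place("example-object", fragments=6)  # n + k = 6
```

With six virtual nodes every fragment lands on a different one; with only five, `place` raises, mirroring the point at which operations stop succeeding.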


Manual Rebuilds, But Flexible

•  Rebuilds on fewer than the required nodes
    o  Lacks full protection

•  Populates data back to additional node

•  New Node/HDD: Manually add node

•  Data is balanced during:
    o  Writing
    o  Rebuilding


Indexer Sold Separately

•  Query all erasure coding metadata per server

•  Per item metadata

•  User Definable

•  Did not test Scality’s ‘Mesa’ indexing service
    o  Extra software


FUSE Gives 50% Overhead, But Scalable


On the right path

•  Scality
    o  Static installation, flexible erasure coding
    o  Helpful
    o  Separate indexer
    o  500MB file limit ('Unlimited' update coming)

•  Caringo
    o  Variable installation, strict erasure coding
    o  Good documentation
    o  Indexer included
    o  4TB file limit (addressing bits limit)


Very Viable

•  Some early limitations

•  Changes needed on both products

•  Scality seems more ready to make those changes.


Questions?


Acknowledgements

Special thanks to:

Dane Gardner - NMC Instructor
Matthew Broomfield - NMC Teaching Assistant
HB Chen - HPC-5 - Mentor
Jeff Inman - HPC-1 - Mentor
Carolyn Connor - HPC-5, Deputy Director ISTI
Andree Jacobson - Computer & Information Systems Manager NMC
Josephine Olivas - Program Administrator ISTI

Los Alamos National Labs, New Mexico Consortium, and ISTI
