Improving Cloud Storage Cost and Data Resiliency with Erasure Codes Michael Penick

2012 Storage Developer Conference. © 2012 GoDaddy.com. All Rights Reserved.

Improving Cloud Storage Cost and Data Resiliency with Erasure Codes

Michael Penick

Commodity Storage

Hosting storage
FTP backup
Goals
    Inexpensive (use “commodity” hardware)
    Resilient to failures
    Highly available
    Customizable

MogileFS

Open source distributed filesystem
Written by Brad Fitzpatrick
No single point of failure
Automatic/Asynchronous file replication
Shared-Nothing design (disks)
Local filesystem agnostic
Flat namespace

[Diagram: Tracker, Storage Node, MetadataDB]

MogileFS

[Diagram: Clients, Tracker, Storage Node, MetadataDB]

“NebulaFS”

Large file support
Offsite Replication
Self-healing
Data retention
C++ client (PHP and Perl SWIG wrappers)
Metadata Sharding
Range GETs

“NebulaFS”

[Diagram labels: Clients; two combined Tracker / Storage Node machines, each with MySQL; additional Storage Nodes]

FTP Backup

[Diagram: FTP Presentation (Net::FTPServer), VFS DB, NebulaFSAPI, and NebulaFS (Metadata DB, Super Nodes, Storage Nodes)]

Widely Applicable

“Storage service” (REST) layer
New Product Integrations
    Online File Folder (videos and images)
    Website Builder/Photo Album
    Go Daddy Cloud Servers (snapshots)
    Email
    …

Object Storage

[Diagram: RESTful Presentation (S3, GDCS), VFS, VFS DB, User DB, NebulaFSAPI, and NebulaFS (Metadata DB, Super Nodes, Storage Nodes)]

Why?

Why?

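[Pie chart, ~3.25 PB total. Legend: Aries, FTP, WST/PA, OFF, VDC, Other. Slice values as extracted (slice-to-label mapping not recoverable): 5.39%, 8.51%, 83.89%, 1.20%, 1.01%]

[Pie chart, ~10.8 PB total. Legend: Aries, FTP, WST/PA, OFF, VDC, Email, Other. Slice values as extracted (slice-to-label mapping not recoverable): 1.80%, 2.56%, 38.44%, 1.44%, 55.44%, 0.30%]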

The Problem

NebulaFS = Inexpensive, resilient, highly available storage

Problem: Disk drives fail... a lot.
    F = mean time to failure of a single device
    In a system of n devices, the mean time to failure is F/n
Solution: Replicate the data
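
As a rough illustration of the F/n rule above, the following sketch plugs in example numbers. The per-drive mean time to failure is a hypothetical value chosen for the example, not a figure from the talk; the drive count of 180 matches the performance test setup described later in the deck.

```python
def system_mttf_hours(device_mttf_hours: float, n_devices: int) -> float:
    """Mean time to the next failure in a system of n devices: F / n."""
    if n_devices <= 0:
        raise ValueError("n_devices must be positive")
    return device_mttf_hours / n_devices


# Hypothetical per-drive mean time to failure (an assumption for illustration).
F = 1_000_000.0  # hours
n = 180          # drives

hours = system_mttf_hours(F, n)
print(f"Expect a drive failure roughly every {hours:.0f} hours "
      f"(about {hours / 24:.0f} days)")
```

The point of the arithmetic is that even very reliable individual drives produce frequent failures once enough of them are in one system.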

Replication

[Diagram: the data is duplicated and replicated as Copy 1, Copy 2, Copy 3, and Copy 4 across Disk 1, Disk 2, ..., Disk n. Caption: Success!]

Replication

Simple and effective
Durability:
    99.99999% over 1 year (about 0.1 of 1 million objects lost)
    99.99% over 3 years (about 100 of 1 million objects lost)
Problem: 100% overhead per copy
    +300% overhead for 3 onsite copies and 1 offsite copy
There has to be a better way.
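
The "objects per million" figures follow directly from the durability percentages. A minimal sketch of that conversion is shown here; the function name is made up for this example, and `Decimal` is used only to keep the many-nines percentages exact.

```python
from decimal import Decimal


def expected_losses(durability_percent: str, objects: int = 1_000_000) -> Decimal:
    """Expected number of objects lost, given durability as a percentage string."""
    loss_probability = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return loss_probability * objects


# Durability figures quoted on the slide for replication.
for label, durability in [("1 year", "99.99999"), ("3 years", "99.99")]:
    print(f"{label}: {expected_losses(durability)} objects lost per million")
```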

Erasure Codes

Forward error correction code
Add redundant data (codes) to a message so that it can be recovered
Where’s EC used?
    Optical media
    Media streaming
    File Systems (RAID-6, several distributed FS, …)

Erasure code (write)

[Diagram: the data is divided into k pieces and encoded to add m code pieces; the pieces are stored across Disk 1, Disk 2, ..., Disk n. Labels: Copy 1, Copy .75]

Erasure code (read)

[Diagram: k pieces are read from Disk 1, Disk 2, ..., Disk n, verified, and decoded into the original data]

Erasure codes

What?
    k – number of original pieces
    m – number of redundant pieces (codes)
How?
    k = 4, m = 3: only 75% overhead (tolerates 3 failures)
    k = 10, m = 6: only 60% overhead (tolerates 6 failures)
    k = 9, m = 3: only 33% overhead (tolerates 3 failures)
AKA: k = 10, m = 6 is written “10 of 16”
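
The overhead percentages above are simply m / k. The short sketch below reproduces them; the helper names are chosen for this example only.

```python
def overhead_percent(k: int, m: int) -> float:
    """Extra storage needed by a k-of-(k+m) erasure code, as a percentage."""
    return 100.0 * m / k


def scheme_name(k: int, m: int) -> str:
    """The 'k of n' shorthand, e.g. k = 10, m = 6 is '10 of 16'."""
    return f"{k} of {k + m}"


for k, m in [(4, 3), (10, 6), (9, 3)]:
    print(f"{scheme_name(k, m)}: {overhead_percent(k, m):.0f}% overhead, "
          f"tolerates {m} failures")
```

For comparison, each extra replica in plain replication costs 100% overhead while tolerating only one more failure.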

Trade-offs (positive)

Better resilience to failure
Durability for 10 of 16:
    99.9999999999% over 1 year (or 0.000001 of 1 million objects)
    99.99999% over 3 years (or 0.1 of 1 million objects)
Durability for 9 of 12:
    99.99999% over 1 year (or 0.1 of 1 million objects)
    99.99% over 3 years (or 100 of 1 million objects)
Significant savings (includes a full offsite copy)
    10 of 16: (4 – 2.60) / 4 = 35% savings (60% w/o offsite)
    9 of 12: (4 – 2.33) / 4 = 42% savings (67% w/o offsite)
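
The savings figures can be reproduced with a few lines of arithmetic. This sketch assumes, as the formulas on the slide suggest, that the baseline of 4 is the replication setup of 3 onsite copies plus 1 offsite copy, and that the erasure-coded total is (k + m) / k plus one full offsite copy. The function names are invented for the example.

```python
def storage_factor(k: int, m: int, offsite_copies: int = 0) -> float:
    """Raw bytes stored per byte of user data for a k-of-(k+m) code."""
    return (k + m) / k + offsite_copies


def savings_percent(k: int, m: int, offsite_copies: int,
                    replication_factor: float = 4.0) -> float:
    """Savings relative to storing `replication_factor` full copies."""
    factor = storage_factor(k, m, offsite_copies)
    return 100.0 * (replication_factor - factor) / replication_factor


for k, m in [(10, 6), (9, 3)]:
    with_offsite = savings_percent(k, m, offsite_copies=1)
    without_offsite = savings_percent(k, m, offsite_copies=0)
    print(f"{k} of {k + m}: {with_offsite:.0f}% savings "
          f"({without_offsite:.0f}% w/o offsite)")
```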

Trade-offs (negative)

Computationally expensive
Increased number of IOPS
Complexity (additional metadata)
More nodes and connections

Erasure Codes

Optimal erasure code
    Any k pieces of the message can recover the message
    Reed-Solomon (and Cauchy Reed-Solomon)
Libraries (Jerasure, Zfec, Luby, librs, …)
Stability/Performance Evaluation
    Paper – “A Performance Comparison of Open-Source Erasure Coding Libraries for Storage Applications”
    http://web.eecs.utk.edu/~plank/plank/papers/CS-08-625.pdf
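
To make the "any k pieces can recover the message" property concrete, here is a minimal sketch of the simplest such code: k data pieces plus a single XOR parity piece (m = 1). This is not Reed-Solomon and is not taken from any of the libraries listed; Reed-Solomon extends the same idea to m > 1 using finite-field arithmetic. All names below are made up for the example.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(pieces: list) -> list:
    """Return the k data pieces followed by one XOR parity piece."""
    return list(pieces) + [reduce(xor_bytes, pieces)]


def recover(stored: list) -> bytes:
    """Rebuild the single missing piece (marked None) from the k survivors."""
    survivors = [p for p in stored if p is not None]
    if len(survivors) != len(stored) - 1:
        raise ValueError("exactly one piece must be missing")
    return reduce(xor_bytes, survivors)


data = [b"abcd", b"efgh", b"ijkl"]   # k = 3 equal-length pieces
stored = encode(data)                # 4 pieces in total, m = 1

damaged = list(stored)
damaged[1] = None                    # simulate losing one disk
print(recover(damaged) == stored[1])
```

Any one of the four pieces can be lost, data or parity, and the remaining three are enough to rebuild it.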

EC Libraries - zfec

Reed Solomon
Written in C (Python and Haskell bindings)
Download: http://pypi.python.org/pypi/zfec
Documentation is the source code

EC Libraries – zfec Encoding

EC Libraries – zfec Decoding

EC Libraries – zfec Decoding contd.

EC Libraries – zfec Decoding contd.

EC Libraries – zfec Decoding contd.

k = 6, m = 2, erasures = { 0, 2, -1 }

index = { 6, 1, 7, 3, 4, 5 }

inpkts:  coding 0, data 1, coding 1, data 3, data 4, data 5
outpkts: data 0, data 2
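
The mapping from the erasure list to the index array can be sketched as follows. This is an illustration of the relationship shown on the slide, not zfec's own code or API: the function name is hypothetical, and it assumes that only data blocks are erased and that the erasure list is terminated by -1.

```python
def build_index(k: int, m: int, erasures: list) -> list:
    """For each of the k input slots, return the id of the block that fills it.

    Block ids 0..k-1 are data blocks; ids k..k+m-1 are coding blocks.
    """
    missing = []
    for e in erasures:
        if e == -1:              # -1 terminates the list, as on the slide
            break
        missing.append(e)
    if len(missing) > m:
        raise ValueError("more erasures than coding blocks")

    index = list(range(k))       # every surviving data block keeps its own slot
    for j, slot in enumerate(missing):
        index[slot] = k + j      # j-th missing data block is replaced by coding block j
    return index


print(build_index(6, 2, [0, 2, -1]))  # prints [6, 1, 7, 3, 4, 5]
```

Slots 0 and 2 are filled by coding blocks 0 and 1 (ids 6 and 7), which matches the inpkts layout above; the decoder then produces data 0 and data 2 as its output packets.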

EC Libraries - Jerasure

Reed Solomon, Cauchy Reed Solomon, and Minimal Density Codes
Written in C (no bindings)
Download: http://web.eecs.utk.edu/~plank/plank/papers/CS-08-627.html
Good documentation and examples

EC Libraries – Jerasure Encoding

EC Libraries – Jerasure Decoding

EC Libraries – Performance

[Chart: Encoding throughput in MB/s (axis 0 to 2500) for w = 8, w = 16, and w = 32. Series: Jerasure RS, Jerasure CRS, zfec]

EC Libraries – Performance

[Chart: Decoding throughput in MB/s (axis 0 to 900) for w = 8, w = 16, and w = 32. Series: Jerasure RS, Jerasure CRS, zfec]

Integration – EC Library

EC library (Phase I)
Read/Copy
Write
Repair

Integration – EC Library

Inputs/Outputs abstracted
boost::asio (HTTP)
PHP/Perl bindings
Random access reads (i.e. Range GET)
Data validated/corrected on-the-fly

Integration – EC Library Writes

[Diagram: the data is split into k pieces plus m code pieces and written to Disk 1, Disk 2, ..., Disk n]

Integration – EC Library Failures

Integration – EC Library Reads

[Diagram: k pieces are read from Disk 1, Disk 2, ..., Disk n and reassembled into the original data]

Integration – EC Library

Integration – EC Library Copy

[Diagram: pieces stored on Disk 1, Disk 2, ..., Disk n, with k of them copied to another set of disks (Disk 1, Disk 2, ...)]

Integration – EC Library

Integration – EC Library Repair

[Diagram: k pieces stored on Disk 1, Disk 2, ..., Disk n are used to repair missing pieces]

Integration – EC Library

Integration – Reads/Writes

Integration – Reads/Writes DB

Increased number of “file_device” entries
Decreased number of “file” entries
Change the meaning of “class”

Integration – Reads/Writes DB

Integration – Write

Integration – Read

Integration – Recovery

Lessons Learned

CRC32 can be slow
    Intel’s Slicing-by-8 Algorithm
Block size can limit your smallest file size
Lighttpd doesn’t support “Transfer-Encoding: chunked”
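
For readers unfamiliar with slicing-by-8, the sketch below shows the idea: eight lookup tables let the checksum consume eight input bytes per loop iteration instead of one. It is an illustrative Python version checked against `zlib.crc32`, not the implementation used in the talk; in Python it will not be faster than `zlib`, and the speedup only materializes in compiled code such as C or C++.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial (same CRC as zlib)


def _make_tables() -> list:
    # Table 0 is the classic byte-at-a-time CRC-32 table.
    base = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ POLY if c & 1 else c >> 1
        base.append(c)
    # Tables 1..7 extend it so that 8 bytes can be folded in per step.
    tables = [base]
    for _ in range(7):
        prev = tables[-1]
        tables.append([(v >> 8) ^ base[v & 0xFF] for v in prev])
    return tables


TABLES = _make_tables()


def crc32_slicing8(data: bytes) -> int:
    crc = 0xFFFFFFFF
    full = len(data) - len(data) % 8
    for i in range(0, full, 8):
        one = int.from_bytes(data[i:i + 4], "little") ^ crc
        two = int.from_bytes(data[i + 4:i + 8], "little")
        crc = (TABLES[7][one & 0xFF] ^ TABLES[6][(one >> 8) & 0xFF] ^
               TABLES[5][(one >> 16) & 0xFF] ^ TABLES[4][one >> 24] ^
               TABLES[3][two & 0xFF] ^ TABLES[2][(two >> 8) & 0xFF] ^
               TABLES[1][(two >> 16) & 0xFF] ^ TABLES[0][two >> 24])
    for b in data[full:]:  # leftover bytes, one at a time
        crc = (crc >> 8) ^ TABLES[0][(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


sample = b"Improving Cloud Storage Cost and Data Resiliency"
print(crc32_slicing8(sample) == zlib.crc32(sample))
```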

Performance Test Setup

6 super nodes (tracker and storage node)
180 drives
Drives not evenly distributed (i.e. not 30 drives per node)
EC strips maximally distributed

Performance Test Results

[Chart: Write throughput in MB/s (axis 0 to 80) by file size (1KB, 1MB, 16MB, 32MB, 64MB, 128MB, 256MB). Series: ec_1_of_2, ec_6_of_9, ec_9_of_12, ec_10_of_16, replication]

Performance Test Results

[Chart: Read throughput in MB/s (axis 0 to 200) by file size (1KB, 1MB, 16MB, 32MB, 64MB, 128MB, 256MB). Series: ec_1_of_2, ec_6_of_9, ec_9_of_12, ec_10_of_16, replication]

Migrations

[Diagram: RESTful Presentation (S3, GDCS), VFS, VFS DB, User DB, Migration Script, NebulaFSAPI, and NebulaFS (Metadata DB, Super Nodes, Storage Nodes)]

Future

Finish Phase III
    Repairs
    Offsite copy
Net new growth
Optimizations
Open source

Questions

Thank You!
