
Ceph vs Swift Performance Evaluation on a Small Cluster

eduPERT monthly call – July 24th, 2014

GÉANT eduPERT meeting

About me

• Vincenzo Pii
• Researcher @ the ZHAW ICCLab
• Leading the research initiative on Cloud Storage, under the IaaS theme
• More on the ICCLab: www.cloudcomp.ch

About this work

• Performance evaluation study on cloud storage
• Small installations
• Hardware resources hosted at the ZHAW ICCLab data center in Winterthur
  • Two OpenStack clouds (stable and experimental)
  • One cluster dedicated to storage research

INTRODUCTION


Cloud storage

• Cloud storage
  • Based on distributed, parallel, fault-tolerant file systems
  • Distributed resources exposed through a single homogeneous interface
  • Typical requirements:
    • Highly scalable
    • Replication management
    • Redundancy (no single point of failure)
    • Data distribution
    • …

• Object storage
  • A way to manage/access data in a storage system
  • Typical alternatives:
    • Block storage
    • File storage

Ceph and Swift

• Ceph (ceph.com)
  • Supported by Inktank, recently acquired by Red Hat (owners of GlusterFS)
  • Mostly developed in C++
  • Started as a PhD thesis project in 2006
  • Block, file, and object storage

• Swift (launchpad.net/swift)
  • OpenStack object storage
  • Completely written in Python
  • RESTful HTTP APIs

Objectives of the study

1. Performance evaluation of Ceph and Swift on a small cluster
  • Private storage
  • Storage backend for own apps with limited size requirements
  • Experimental environments

2. Evaluate Ceph maturity and stability
  • Swift is already widely deployed and industry-proven

3. Hands-on experience
  • Configuration
  • Tooling
  • …

CONFIGURATION AND PERFORMANCE OF SINGLE COMPONENTS


Network configuration

• Three servers on a dedicated VLAN
• 1 Gbps NICs
• 1000BASE-T cabling (gigabit line rate, confirmed by the iperf measurement below)

Subnet 10.0.5.0/24: Node 1 (.2), Node 2 (.3), Node 3 (.4)

Servers configuration

• Hardware: Lynx CALLEO Application Server 1240
  • 2x Intel® Xeon® E5620 (4 cores each)
  • 8x 8 GB DDR3 SDRAM, 1333 MHz, registered, ECC
  • 4x 1 TB Enterprise SATA-3 hard disks, 7200 RPM, 6 Gb/s (Seagate® ST1000NM0011)
  • 2x Gigabit Ethernet network interfaces
• Operating system: Ubuntu 14.04 Server Edition with kernel 3.13.0-24-generic

Disks performance

READ:

$ sudo hdparm -t --direct /dev/sdb1

/dev/sdb1:
 Timing O_DIRECT disk reads: 430 MB in 3.00 seconds = 143.17 MB/sec

WRITE:

$ dd if=/dev/zero of=anof bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 8.75321 s, 123 MB/s

Network performance

$ iperf -c ceph-osd0
------------------------------------------------------------
Client connecting to ceph-osd0, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.0.5.2 port 41012 connected with 10.0.5.3 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.10 GBytes  942 Mbits/sec

942 Mbit/s ≈ 117.5 MB/s
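This is slightly below the sequential rate of a single disk (123–143 MB/s), so the 1 Gbit/s network is the expected bottleneck for sequential transfers; the read results later saturate at roughly 120 MB/s, consistent with this limit.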

CLOUD STORAGE CONFIGURATION


Ceph OSDs

Disk layout (four disks per server):
• Monitor (mon0): HDD1 (OS), HDD2–HDD4 not used
• Storage node 0: HDD1 (OS), HDD2 osd0 (XFS), HDD3 osd1 (XFS), HDD4 journal
• Storage node 1: HDD1 (OS), HDD2 osd2 (XFS), HDD3 osd3 (XFS), HDD4 journal

cluster-admin@ceph-mon0:~$ ceph status
    cluster ff0baf2c-922c-4afc-8867-dee72b9325bb
     health HEALTH_OK
     monmap e1: 1 mons at {ceph-mon0=10.0.5.2:6789/0}, election epoch 1, quorum 0 ceph-mon0
     osdmap e139: 4 osds: 4 up, 4 in
      pgmap v17348: 632 pgs, 13 pools, 1834 bytes data, 52 objects
            199 MB used, 3724 GB / 3724 GB avail
                 632 active+clean

cluster-admin@ceph-mon0:~$ ceph osd tree

# id weight type name up/down reweight

-1 3.64 root default

-2 1.82 host ceph-osd0

0 0.91 osd.0 up 1

1 0.91 osd.1 up 1

-3 1.82 host ceph-osd1

2 0.91 osd.2 up 1

3 0.91 osd.3 up 1
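Not shown on the slides: pools and their replication factor are managed through the ceph CLI. A minimal sketch, with an assumed pool name and placement-group count:

$ ceph osd pool create testpool 128   # create a pool with 128 placement groups
$ ceph osd pool set testpool size 2   # keep two replicas of every object
$ ceph osd pool get testpool size     # verify the replication factor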

Swift devices

Building rings on the storage devices (accounts, containers, and objects share the same devices, with no separation):

export ZONE= # set the zone number for that storage device

export STORAGE_LOCAL_NET_IP= # and the IP address

export WEIGHT=100 # relative weight (higher for bigger/faster disks)

export DEVICE=

swift-ring-builder account.builder add z$ZONE-$STORAGE_LOCAL_NET_IP:6002/$DEVICE $WEIGHT

swift-ring-builder container.builder add z$ZONE-$STORAGE_LOCAL_NET_IP:6001/$DEVICE $WEIGHT

swift-ring-builder object.builder add z$ZONE-$STORAGE_LOCAL_NET_IP:6000/$DEVICE $WEIGHT
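The slide shows only the add step; in a full setup, each ring is first created and then rebalanced after all devices are added. A minimal sketch for the object ring, assuming 2^18 partitions, 3 replicas, and a 1-hour minimum between partition moves (these values are illustrative, not from the study):

$ swift-ring-builder object.builder create 18 3 1   # part_power, replicas, min_part_hours
$ # ... add each device with the commands above ...
$ swift-ring-builder object.builder rebalance       # assign partitions and write object.ring.gz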

Swift Proxy

Disk layout (four disks per server):
• Proxy: HDD1 (OS), HDD2–HDD4 not used
• Storage node 0: HDD1 (OS), HDD2 dev1 (XFS), HDD3 dev2 (XFS), HDD4 not used
• Storage node 1: HDD1 (OS), HDD2 dev3 (XFS), HDD3 dev4 (XFS), HDD4 not used

Highlighting a difference

• LibRados is used to access Ceph
  • Plain installation of a "Ceph storage cluster"
  • Non-RESTful interface
  • This is the fundamental access layer in Ceph
  • RadosGW (Swift/S3 APIs) is an additional component on top of LibRados (as are the block and file storage clients)

• RESTful APIs over HTTP are used to access Swift
  • Extra overhead in the communication
  • Out-of-the-box access method for Swift

• This is part of the differences to be benchmarked, even if…
  • … HTTP APIs for object storage are interesting for many use cases
  • This use case: an unconstrained, self-managed storage infrastructure for, e.g., own apps, with control over infrastructure and applications (see the sketch below)
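To make the difference concrete, a minimal sketch of storing one object through each interface (pool, container, account, token, and host names are illustrative, not from the study):

$ rados -p testpool put myobject ./myfile.bin   # Ceph: native RADOS call, no HTTP involved
$ curl -X PUT -T ./myfile.bin \
       -H "X-Auth-Token: $TOKEN" \
       http://swift-proxy:8080/v1/AUTH_test/mycontainer/myobject   # Swift: REST over HTTP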

WORKLOADS


Tools

• COSBench (v. 0.4.0.b2) – https://github.com/intel-cloud/cosbench
  • Developed by Intel
  • Benchmarking for cloud object storage
  • Supports both Swift and Ceph
  • Cool web interface to submit workloads and monitor current status
  • Workloads defined as XML files
  • Very good level of abstraction over object storage concepts

• Supported metrics
  • Op-Count (number of operations)
  • Byte-Count (number of bytes)
  • Response-Time (average response time of each successful request)
  • Processing-Time (average processing time of each successful request)
  • Throughput (operations per second)
  • Bandwidth (bytes per second)
  • Success-Ratio (ratio of successful operations)

• Outputs CSV data
  • Graphs generated with cosbench-plot – https://github.com/icclab/cosbench-plot
  • Inter-workload charts described in Python

Workloads gist

(The slide showed a sample COSBench workload definition in XML.)
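A minimal sketch of what such a definition might look like, using values from the workload matrix below (80/15/5 mix, 128 kB objects, 16 workers, 20 containers with 1000 objects each); the workload name and the elided storage/auth configuration strings are assumptions:

<workload name="sample" description="80/15/5 mix, 128 kB objects">
  <storage type="swift" config="..." />
  <workflow>
    <workstage name="main">
      <!-- 16 concurrent workers for a 600-second running time -->
      <work name="rwd" workers="16" runtime="600">
        <operation type="read"   ratio="80" config="containers=u(1,20);objects=u(1,1000)" />
        <operation type="write"  ratio="15" config="containers=u(1,20);objects=u(1,1000);sizes=c(128)KB" />
        <operation type="delete" ratio="5"  config="containers=u(1,20);objects=u(1,1000)" />
      </work>
    </workstage>
  </workflow>
</workload>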

COSBench web interface

(Screenshot shown on the slide.)

Workload matrix

All combinations of the following factors were tested:
• Containers: 1 or 20
• Object size: 4 kB, 128 kB, 512 kB, 1024 kB, 5 MB, 10 MB
• R/W/D distribution (%): 80/15/5, 100/0/0, 0/100/0
• Workers: 1, 16, 64, 128, 256, 512

(2 × 6 × 3 × 6 = 216 combinations.)

Workloads

• 216 workstages (all combinations of the values in the workload matrix)
• 12 minutes per workstage
  • 2 minutes warm-up
  • 10 minutes running time
• 1000 objects per container (pools in Ceph)
• Operations uniformly distributed over the available objects (1000 objects with 1 container, 20000 with 20 containers)
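(At 12 minutes per workstage, the full matrix amounts to 216 × 12 = 2592 minutes, roughly 43 hours of run time per system under test.)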

READING

Performance Results


Charts shown on the slides (images not reproduced here):

• Read throughput – workstage averages
• Read throughput – 1 container – 4 kB objects
• Read throughput – 1 container – 128 kB objects
• Read response time – 1 container – 128 kB objects
• Read bandwidth – workstage averages
• Read throughput – 20 containers – 1024 kB objects
• Read response time – 20 containers – 1024 kB objects

WRITING

Performance Results


Charts shown on the slides (images not reproduced here):

• Write throughput – workstage averages
• Write throughput – 1 container – 128 kB objects
• Ceph write throughput – 1 container – 128 kB objects – varying replica count
• Ceph write response time – 1 container – 128 kB objects – varying replica count
• Write bandwidth – workstage averages
• Write response time – 20 containers – 512 kB objects

READ/WRITE/DELETE

Performance Results


Charts shown on the slides (images not reproduced here):

• R/W/D throughput – workstage averages
• R/W/D response time (read operations)

CONCLUSIONS

General considerations and future work

Performance Analysis Recap

• Ceph performs better when reading, Swift when writing
  • Ceph → librados
  • Swift → REST APIs over HTTP
• The difference is more remarkable with small objects
  • Less overhead for Ceph (librados access, CRUSH algorithm: clients compute object placement themselves)
• Comparable performance with bigger objects
  • Network bottleneck at ~120 MB/s for read operations
• Response time
  • Swift: greedy behavior
  • Ceph: fairness

General considerations: challenges

• Equivalency
  • Comparing two similar systems that do not exactly overlap
  • Creating fair setups (e.g., journals on additional disks for Ceph)
  • Transposing corresponding concepts
• Configuration
  • Choosing the right/best settings for the context (e.g., number of Swift workers)
• Identifying bottlenecks
  • To be done in advance, in order to create meaningful workloads
• Workloads
  • Running many tests to identify saturating conditions
  • Huge decision space
• Keeping up the pace
  • Lots of development going on (new versions, new features)

General considerations: lessons learnt

• Publication medium (blog post)
  • Excellent feedback (e.g., from a Rackspace developer)
  • Immediate right of reply and "real" comments
• Most important principles
  • Openness
    • Share every bit of information
    • Clear intents, clear justifications
  • Neutrality
    • When analyzing the results
    • When drawing conclusions
• Very good suggestions coming from "you could", "you should", "you didn't" comments

Future work

• Performance evaluation remains necessary for cloud storage
• More object storage evaluations
  • Interesting because object storage is very close to the application level
• Block storage evaluations
  • Very appropriate for IaaS: providing storage resources to VMs
• Seagate Kinetic
  • Possible opportunity to work on a Kinetic setup

THANKS! QUESTIONS?

Vincenzo Pii: [email protected]
