Gluster – Overview & Future Directions Vijay Bellur GlusterFS Co-maintainer Red Hat

Page 1: Gluster overview & future directions vault 2015

Gluster – Overview & Future Directions

Vijay Bellur, GlusterFS Co-maintainer, Red Hat

Page 2: Gluster overview & future directions vault 2015

03/12/15

Agenda

● Overview

  ● Why Gluster?

  ● What is Gluster?

  ● Use Cases & Features

● Future Directions

● Q & A

Page 3: Gluster overview & future directions vault 2015

Why Gluster?

Page 4: Gluster overview & future directions vault 2015

Why Gluster?

● 2.5+ exabytes of data produced every day!

● 90% of the world's data was created in the last two years

● Data needs to be stored somewhere!

● Commoditization and democratization: the way to go

source: http://www-01.ibm.com/software/data/bigdata/what-is-big-data.html

Page 5: Gluster overview & future directions vault 2015

What is Gluster?

Page 6: Gluster overview & future directions vault 2015

What is Gluster?

● Scale-out distributed storage system.

● Aggregates storage exports over network interconnects to provide a unified namespace.

● File, Object and Block interfaces

● Layered on disk file systems that support extended attributes.

Page 7: Gluster overview & future directions vault 2015

Typical Gluster Deployment

Page 8: Gluster overview & future directions vault 2015

Gluster Architecture – Foundations

● Software only, runs on commodity hardware

● No external metadata servers

● Scale-out with Elasticity

● Extensible and modular

Page 9: Gluster overview & future directions vault 2015

Volumes in Gluster

● Logical collection of exports aka bricks.

● Identified by an administrative name.

● A volume, or a part of one, is used by clients for data CRUD operations.

● Multiple volume types supported currently

Page 10: Gluster overview & future directions vault 2015

Distributed Volume
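
As a rough illustration of how a distributed volume can place files without a metadata server, the sketch below hashes each file name into a 32-bit space and gives every brick an equal slice of that space. It is a toy model under stated assumptions: `zlib.crc32` stands in for the hash function GlusterFS actually uses, the equal split ignores per-directory layouts, and the brick names are made up.

```python
import zlib

def build_layout(bricks):
    """Split the 32-bit hash space into one contiguous range per brick."""
    span = 2**32 // len(bricks)
    layout = []
    for i, brick in enumerate(bricks):
        start = i * span
        # The last brick absorbs any remainder so the ranges cover the whole space.
        end = 2**32 - 1 if i == len(bricks) - 1 else start + span - 1
        layout.append((start, end, brick))
    return layout

def locate(filename, layout):
    """Return the brick whose range contains the hash of the file name."""
    h = zlib.crc32(filename.encode("utf-8"))  # stand-in hash, not Gluster's
    for start, end, brick in layout:
        if start <= h <= end:
            return brick
    raise ValueError("layout does not cover the hash space")

# Hypothetical bricks in host:/dir form.
bricks = ["server1:/export1", "server2:/export1", "server3:/export1"]
layout = build_layout(bricks)
for name in ["a.txt", "b.txt", "c.txt"]:
    print(name, "->", locate(name, layout))
```

Because placement is a pure function of the name and the layout, any client can find a file without asking a central server.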

Page 11: Gluster overview & future directions vault 2015

Replicated Volume

Page 12: Gluster overview & future directions vault 2015

Distributed Replicated Volume
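
A distributed replicated volume can be pictured as a list of replica sets, with files distributed across the sets and copied within each set. The sketch below groups an ordered brick list into sets, assuming consecutive bricks in the list become replicas of each other. The function name and brick names are hypothetical.

```python
def replica_sets(bricks, replica):
    """Group an ordered brick list into replica sets of size `replica`."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]

# Four hypothetical bricks with replica 2 give a 2 x 2 volume.
bricks = [
    "server1:/export1",
    "server2:/export1",
    "server3:/export1",
    "server4:/export1",
]
sets = replica_sets(bricks, replica=2)
for number, members in enumerate(sets):
    print("replica set", number, members)
```

Under this assumption the brick order matters: listing two bricks from the same server next to each other would put both copies of a file on one machine.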

Page 13: Gluster overview & future directions vault 2015

Dispersed Volume

● Introduced in GlusterFS 3.6

● Erasure Coding / RAID 5 over the network

● “Disperses” data on to various bricks

● Algorithm: Reed-Solomon

● Non–systematic erasure coding

● Encoding / decoding done on client side
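
To make "non-systematic erasure coding" concrete, here is a minimal sketch of a Reed-Solomon style code. The k data values are treated as coefficients of a polynomial, each fragment is the polynomial evaluated at a distinct point, and any k fragments recover the data. It is a teaching toy, not Gluster's implementation: it works over the prime field GF(257) for simplicity, so fragment values can be 256 and do not fit in a byte.

```python
P = 257  # smallest prime above 255, so every byte value is a field element

def encode(data, n):
    """Evaluate the polynomial with coefficients `data` at x = 1..n."""
    if not 0 < n < P:
        raise ValueError("n must be between 1 and P - 1")
    return [
        (x, sum(c * pow(x, i, P) for i, c in enumerate(data)) % P)
        for x in range(1, n + 1)
    ]

def decode(fragments, k):
    """Recover the k coefficients from any k (x, y) fragments."""
    if len(fragments) < k:
        raise ValueError("need at least k fragments")
    # Augmented Vandermonde matrix, solved by Gauss-Jordan elimination mod P.
    m = [[pow(x, i, P) for i in range(k)] + [y] for x, y in fragments[:k]]
    for col in range(k):
        pivot = next(r for r in range(col, k) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        inv = pow(m[col][col], P - 2, P)  # modular inverse via Fermat
        m[col] = [v * inv % P for v in m[col]]
        for r in range(k):
            if r != col and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % P for a, b in zip(m[r], m[col])]
    return [row[k] for row in m]

data = list(b"glus")            # k = 4 data values
fragments = encode(data, 6)     # 4 + 2: tolerates the loss of any 2 fragments
survivors = [fragments[5], fragments[0], fragments[2], fragments[3]]
print(bytes(decode(survivors, 4)))
```

No fragment holds a plain copy of the data, which is what makes the code non-systematic: every read has to decode, even when nothing has failed.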

Page 14: Gluster overview & future directions vault 2015

Access Mechanisms

Page 15: Gluster overview & future directions vault 2015

FUSE based native access

Page 16: Gluster overview & future directions vault 2015

NFSv3 access with Gluster NFS

Page 17: Gluster overview & future directions vault 2015

Object/ReST - SwiftonFile

[Diagram: one client sends HTTP requests (Swift REST API) to a proxy, while another client uses an NFS or GlusterFS mount. A Swift Account maps to a Volume, a Container to a Directory, and an Object to a File.]

● Unified File and object view.

● Entity mapping between file and object building blocks
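
The entity mapping can be sketched as a simple path translation. The example below assumes a Swift-style URL path of the form `/<version>/AUTH_<account>/<container>/<object>` and a hypothetical layout in which each volume is mounted under `/mnt/gluster/<volume>`. The function name, the mount root, and the handling of the `AUTH_` prefix are illustrative assumptions, not the SwiftOnFile code.

```python
def swift_to_file_path(url_path, mount_root="/mnt/gluster"):
    """Map account/container/object to volume/directory/file."""
    parts = url_path.lstrip("/").split("/", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError("expected /<version>/<account>/<container>/<object>")
    _version, account, container, obj = parts
    prefix = "AUTH_"
    # Account -> volume name (prefix stripped, by assumption).
    volume = account[len(prefix):] if account.startswith(prefix) else account
    # Container -> directory, object -> file below it.
    return "/".join([mount_root, volume, container, obj])

print(swift_to_file_path("/v1/AUTH_media/photos/2015/cat.jpg"))
# prints /mnt/gluster/media/photos/2015/cat.jpg
```

The same file is then visible to a client that mounts the volume over NFS or the native protocol, which is what gives the unified file and object view.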

Page 18: Gluster overview & future directions vault 2015

HDFS access

Page 19: Gluster overview & future directions vault 2015

libgfapi access

Page 20: Gluster overview & future directions vault 2015

Nfs-Ganesha with GlusterFS

Page 21: Gluster overview & future directions vault 2015

SMB with GlusterFS

Page 22: Gluster overview & future directions vault 2015

Block/iSCSI access

Page 23: Gluster overview & future directions vault 2015

Features

● Scale-out NAS

  ● Elasticity, quotas

● Data Protection and Recovery

  ● Volume and File Snapshots, User Serviceable Snapshots, Geographic/Asynchronous replication

● Archival

  ● Read-only, WORM

● Native CLI / API for management

Page 24: Gluster overview & future directions vault 2015

Features

● Isolation for multi-tenancy

  ● SSL for data/connection, Encryption at rest

● Performance

  ● Data, metadata and readdir caching

● Monitoring

  ● Built-in I/O statistics, /proc-like interface for introspection

● Provisioning

  ● Puppet-gluster, gluster-deploy

● More...

Page 25: Gluster overview & future directions vault 2015

Gluster & oVirt

Page 26: Gluster overview & future directions vault 2015

Gluster Monitoring with Nagios

http://www.ovirt.org/Features/Nagios_Integration

Page 27: Gluster overview & future directions vault 2015

How is it implemented?

Page 28: Gluster overview & future directions vault 2015

Translators in Gluster

● Translator = shared library

● Each translator is a self-contained functional unit.

● Translators can be stacked together for achieving desired functionality.

● Translators are deployment agnostic – write once use anywhere!
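
The stacking idea can be modeled in a few lines. Real translators are shared libraries written in C; the Python classes below are only a conceptual sketch, and every class and method name is hypothetical. Each unit handles an operation, then optionally hands it to its child.

```python
class Translator:
    """A functional unit that can pass an operation on to its child."""

    def __init__(self, child=None):
        self.child = child

    def write(self, path, data):
        return self.child.write(path, data) if self.child else None


class Storage(Translator):
    """Bottom of the stack: keeps data in a dict instead of on disk."""

    def __init__(self):
        super().__init__()
        self.files = {}

    def write(self, path, data):
        self.files[path] = data
        return len(data)


class IoStats(Translator):
    """Counts writes, then forwards them unchanged."""

    def __init__(self, child):
        super().__init__(child)
        self.writes = 0

    def write(self, path, data):
        self.writes += 1
        return self.child.write(path, data)


class ReadOnly(Translator):
    """Rejects writes before they reach the layers below."""

    def write(self, path, data):
        raise PermissionError("read-only volume")


storage = Storage()
stack = IoStats(storage)              # statistics on top of storage
stack.write("/a.txt", b"hello")
print(stack.writes, storage.files)

read_only_stack = IoStats(ReadOnly(storage))  # same units, different stack
```

Reordering or swapping the units changes the behavior of the volume without changing any unit's code, which is the point of the "write once, use anywhere" bullet.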

Page 29: Gluster overview & future directions vault 2015

Customizable Translator Stack

Page 30: Gluster overview & future directions vault 2015

Where is Gluster used?

Page 31: Gluster overview & future directions vault 2015

Gluster Use Cases

Source: 2014 GlusterFS user survey

Page 32: Gluster overview & future directions vault 2015

Future Directions

Page 33: Gluster overview & future directions vault 2015

Recent Gluster Releases

● 3.5 – April 2014

● 3.6 – Oct 2014

● 3.7 – April 2015

● Currently in development

Page 34: Gluster overview & future directions vault 2015

New Features in Gluster 3.7

Page 35: Gluster overview & future directions vault 2015

Data Tiering

● Policy based data movement across hot and cold tiers

● New translator for identifying candidates for promotion/demotion

● Enables better utilization of different classes of storage device/SSDs
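
A policy for promotion and demotion might look like the sketch below. It assumes that "heat" is a per-file access count over some recent window and that fixed thresholds decide movement. The function name, the thresholds, and the data shapes are hypothetical; the slides do not specify the actual policy.

```python
def plan_migrations(heat, hot_files, promote_at=10, demote_at=2):
    """Pick files to move between tiers from recent access counts.

    heat      -- dict mapping file name to accesses in the last window
    hot_files -- set of file names currently on the hot tier
    """
    # Frequently accessed files on the cold tier move up.
    promote = sorted(
        f for f, count in heat.items()
        if count >= promote_at and f not in hot_files
    )
    # Rarely accessed files on the hot tier move down.
    demote = sorted(f for f in hot_files if heat.get(f, 0) <= demote_at)
    return promote, demote

heat = {"a.db": 15, "b.log": 1, "c.img": 12}
hot_files = {"b.log", "c.img"}
promote, demote = plan_migrations(heat, hot_files)
print("promote:", promote, "demote:", demote)
```

Keeping a gap between the two thresholds avoids moving a file back and forth when its access count hovers around a single cut-off.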

Page 36: Gluster overview & future directions vault 2015

Data Tiering

[Diagram: a Tier xlator sits above a hot DHT and a cold DHT. On each tier, bricks stack the POSIX xlator, the CTR xlator and other server xlators over brick storage with a heat data store; a replication xlator is shown on the hot tier. Arrows mark promotion and demotion between the cold tier and the hot tier.]

Page 37: Gluster overview & future directions vault 2015

Bitrot detection

● Detection of at-rest data corruption

● Checksum associated with each file

  ● Asynchronous checksum signing

● Periodic data scrubbing

● Bitrot detection upon access
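
The sign-then-scrub cycle can be sketched as follows. The example keeps objects and signatures in dictionaries and uses SHA-256 from `hashlib`; the hash choice, the function names, and the in-memory storage are assumptions of this sketch rather than a description of the Gluster implementation.

```python
import hashlib

def sign(data):
    """Compute the checksum that is stored alongside an object."""
    return hashlib.sha256(data).hexdigest()

def scrub(objects, signatures):
    """Return the names of signed objects whose content no longer matches."""
    return sorted(
        name for name, data in objects.items()
        if name in signatures and sign(data) != signatures[name]
    )

objects = {"a.txt": b"hello", "b.txt": b"world"}
# Signing happens after the write completes, not in the write path.
signatures = {name: sign(data) for name, data in objects.items()}
clean_run = scrub(objects, signatures)

objects["b.txt"] = b"worle"  # simulate silent corruption on disk
corrupt_run = scrub(objects, signatures)
print(clean_run, corrupt_run)
```

Objects without a signature are skipped rather than flagged, since a file that has just been written may not have been signed yet.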

Page 38: Gluster overview & future directions vault 2015

Sharding

● Solves fragmentation in Gluster volumes

● Chunks and places data in any node that has space

● Suitable for large file workloads requiring parallelism
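
Chunking a file into shards comes down to offset arithmetic. The sketch below splits a byte range into per-shard pieces, assuming fixed-size shards; the function name and the 64 MiB default are hypothetical values chosen for illustration.

```python
def split_into_shards(offset, length, shard_size=64 * 1024 * 1024):
    """Return (shard_index, offset_in_shard, chunk_length) for a byte range."""
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // shard_size
        within = offset % shard_size
        # Stop at the shard boundary or at the end of the request.
        chunk = min(shard_size - within, end - offset)
        pieces.append((index, within, chunk))
        offset += chunk
    return pieces

# A 6-byte write at offset 3 with 4-byte shards touches three shards.
print(split_into_shards(3, 6, shard_size=4))
# prints [(0, 3, 1), (1, 0, 4), (2, 0, 1)]
```

Each piece can be sent to a different shard independently, which is where the parallelism for large files comes from.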

Page 39: Gluster overview & future directions vault 2015

Netgroups and Exports for NFS in 3.7

● More advanced configuration for authentication based on /etc/exports like syntax

● Support for netgroups

● Patches written at Facebook

● Forward ported from 3.4 to 3.7

Page 40: Gluster overview & future directions vault 2015

NFS Ganesha improvements

● Supports active-active NFSv4, NFSv4.1 with Kerberos

● pNFS support for Gluster

● New upcall infrastructure added in Gluster

● Gluster CLI to manage NFS Ganesha

● High-Availability based on Pacemaker and Corosync

Page 41: Gluster overview & future directions vault 2015

Performance enhancements

● Small file

  ● Multi-threaded epoll

  ● In-memory metadata caching on bricks

  ● Improvements for directory listing

● Rebalance

  ● Parallel rebalance

  ● More efficient disk crawling

● Data tiering

Page 42: Gluster overview & future directions vault 2015

TrashCan

● Protection from fat-finger deletions and truncations.

● Stored in a designated directory within the brick

● Captures deletions performed by maintenance operations like self-healing, rebalance etc.

Page 43: Gluster overview & future directions vault 2015

Arbiter Replication

● 2 data, 3 metadata replication

● Additional metadata copy used for arbitration

● Greatly reduces the possibility of split-brain

● Existing replica 2 volumes can be converted to arbiter replica volumes
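
Why a third, metadata-only copy helps can be seen from a simple majority rule. The sketch below is a deliberately simplified model: it allows writes only when a strict majority of the three bricks is reachable, and it ignores the finer rules a real arbiter setup applies. Brick names and the function name are hypothetical.

```python
BRICKS = ("data1", "data2", "arbiter")  # hypothetical brick names

def write_allowed(reachable, bricks=BRICKS):
    """Allow writes only when a strict majority of the bricks is reachable."""
    up = set(reachable) & set(bricks)
    return 2 * len(up) > len(bricks)

# A network partition separates data1 from the other two bricks.
side_a = {"data1"}
side_b = {"data2", "arbiter"}
print(write_allowed(side_a), write_allowed(side_b))
```

With only two bricks, each side of a partition holds half the votes and neither can claim a majority safely. With three votes, at most one side can, so the two data copies cannot diverge independently.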

Page 44: Gluster overview & future directions vault 2015

Split-brain Resolution

● Existing behavior: EIO

● Administrative policies to automatically resolve split-brain

● User can view split objects & resolve split-brain
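
An administrative policy is essentially a rule for choosing which copy wins. The sketch below picks a source copy by size or by modification time. The policy names, the record fields, and the function name are illustrative assumptions and should not be read as the exact options Gluster exposes.

```python
def pick_source(copies, policy):
    """Choose the brick whose copy should overwrite the others.

    copies -- list of dicts with "brick", "size" and "mtime" keys
    policy -- "bigger-file" or "latest-mtime" (illustrative names)
    """
    keys = {
        "bigger-file": lambda c: c["size"],
        "latest-mtime": lambda c: c["mtime"],
    }
    if policy not in keys:
        raise ValueError("unknown policy: " + policy)
    # On a tie, max() keeps the first copy in the list.
    return max(copies, key=keys[policy])["brick"]

copies = [
    {"brick": "server1:/export1", "size": 4096, "mtime": 1426150000},
    {"brick": "server2:/export1", "size": 1024, "mtime": 1426160000},
]
print(pick_source(copies, "bigger-file"))
print(pick_source(copies, "latest-mtime"))
```

The two policies disagree on this input, which shows why the choice is left to an administrator: no single rule is right for every workload.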

Page 45: Gluster overview & future directions vault 2015

Other major improvements

● Support for inode quotas

● Volume clone from snapshot

● Snapshot scheduling

● glusterfind: 'Needle in a haystack'

● Loads of bug fixes

Page 46: Gluster overview & future directions vault 2015

Features beyond GlusterFS 3.7

● HyperConvergence with oVirt

● Compression (at rest)

● De-duplication

● Overlay translator

● Multi-protocol support with NFS, FUSE and SMB

● Native ReST APIs for gluster management

● More integration with OpenStack, Containers

Page 47: Gluster overview & future directions vault 2015

Hyperconverged oVirt – Gluster

● Server nodes are used both for virtualization and storage

● Support for both scaling up, adding more disks, and scaling out, adding more hosts

[Diagram: hosts labeled "VMs and Storage" and "Engine" sit on top of a single GlusterFS volume, which is built from bricks on each host.]

Page 48: Gluster overview & future directions vault 2015

GlusterFS Native Driver – OpenStack Manila

● Supports Manila's certificate-based access type

● Provision shares that use the 'glusterfs' protocol

● Multi-tenant

● Separation using tenant specific certificates

● Supports certificate chaining and cipher lists

Page 49: Gluster overview & future directions vault 2015

GlusterFS Native Driver – OpenStack Manila

[Diagram: Manila orchestration over a Gluster pool. Three tenants, Admin (10.1.1.1-24), Tech (192.168.1.2) and HR (10.1.2.1-12), each reach their own share: Share: Admin (allow admin), Share: Tech (allow Tech), Share: HR (allow HR).]

Page 50: Gluster overview & future directions vault 2015

GlusterFS Ganesha Driver for OpenStack Manila

[Diagram: GlusterFS is the storage backend. Tenant 1 and Tenant 2 each have a service VM running an NFS-Ganesha server with the Gluster FSAL, and each tenant's Nova VMs connect through that service VM.]

Page 51: Gluster overview & future directions vault 2015

Gluster 4.0

Page 52: Gluster overview & future directions vault 2015

Gluster 4.0

● Address higher scale

  ● not just higher node count, also correctness and consistency at higher node count

  ● glusterd, DHT changes

● Support more heterogeneous environments

  ● multiple OSes, multiple storage types, multiple networks, NSR

● Increase deployment flexibility

  ● e.g. data classification, multiple replication/erasure types and levels

Page 53: Gluster overview & future directions vault 2015

New Style Replication

● Server-side replication

● Controlled by a designated “leader”, also known as sweeper.

● Advantages

  ● Bandwidth usage of client network optimized for direct (FUSE) mounts

  ● Avoidance of split-brain
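
The bandwidth advantage is simple arithmetic. The sketch below compares the bytes a client sends for one write under two models: client-side replication, where the client is assumed to send one copy per replica, and leader-based replication, where the client sends one copy and the leader forwards it. The function name and the traffic model are simplifying assumptions.

```python
def network_traffic(payload_bytes, replicas, server_side):
    """Return bytes sent by the client and by the leader for one write."""
    if server_side:
        # The client sends one copy; the leader forwards it to the other replicas.
        return {"client": payload_bytes, "leader": payload_bytes * (replicas - 1)}
    # Client-side replication: the client sends a copy to every replica itself.
    return {"client": payload_bytes * replicas, "leader": 0}

print(network_traffic(1_000_000, 3, server_side=False))
print(network_traffic(1_000_000, 3, server_side=True))
```

For a 1 MB write with three replicas, the client sends 3 MB in the first model and 1 MB in the second; the remaining 2 MB moves between servers instead.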

Page 54: Gluster overview & future directions vault 2015

New Style Replication

Page 55: Gluster overview & future directions vault 2015

DHTv2

● Improved scalability and performance for all directory-entry operations.

● High consistency and reliability for conflicting directory-entry operations, and for layout repair.

● Better performance for rebalance

Page 56: Gluster overview & future directions vault 2015

Thousand node glusterd

● Scale glusterd to manage more than 1000 nodes

● Paxos/Raft for membership and configuration management

Page 57: Gluster overview & future directions vault 2015

Gluster 4.0 – What's next?

● Code name for the release? Open to suggestions

● Submissions for feature proposals are still open!

● Implementation of key features in progress.

● Voting on feature proposals during design summit

● Tentatively planned for May 2016

Page 58: Gluster overview & future directions vault 2015

Resources

Mailing lists: [email protected]@nongnu.org

IRC: #gluster and #gluster-dev on freenode

Web: http://www.gluster.org

Page 59: Gluster overview & future directions vault 2015

Thank You!

vijay at gluster.org

twitter: @vbellur

Page 60: Gluster overview & future directions vault 2015

BACKUP

Page 61: Gluster overview & future directions vault 2015

Striped Volume

● Aggregation of chunks of files placed on various bricks.

● Normally recommended for workloads involving very large files and parallel access.

● WIP Sharding feature likely to supersede striped volumes.

Page 62: Gluster overview & future directions vault 2015

GlusterFS concepts – Trusted Storage Pool

● a.k.a. cluster

● glusterd uses a membership protocol to form the trusted storage pool.

● Trusted Storage Pool is invite only.

● Membership information used for determining quorum.

● Members can be dynamically added and removed from the pool.

Page 63: Gluster overview & future directions vault 2015

How does a distributed volume work?

Page 64: Gluster overview & future directions vault 2015

How does a distributed volume work?

Page 65: Gluster overview & future directions vault 2015

How does a distributed volume work?

Page 66: Gluster overview & future directions vault 2015

GlusterFS concepts - Bricks

A brick is the combination of a node and an export directory, e.g. hostname:/dir

Each brick inherits the limits of the underlying filesystem

No limit on the number of bricks per node

Data and metadata get stored on bricks

[Diagram: three storage nodes with 3, 5 and 3 bricks. Every node exports /export1 to /export3; the 5-brick node also exports /export4 and /export5.]