
GlusterFS Documentation
Release 3.8.0

Gluster Community

Aug 10, 2016

Contents

1 Quick Start Guide
  1.1 Single Node Cluster
  1.2 Multi Node Cluster

2 Overview and Concepts
  2.1 Volume Types
  2.2 FUSE
  2.3 Translators
  2.4 Geo-Replication
  2.5 Terminologies

3 Installation Guide
  3.1 Getting Started
  3.2 Configuration
  3.3 Installing Gluster
  3.4 Overview
  3.5 Quick Start Guide
  3.6 Setup Baremetal
  3.7 Deploying in AWS
  3.8 Setting up in virtual machines

4 Administrator Guide

5 Upgrade Guide

6 Contributors Guide
  6.1 Adding your blog
  6.2 Bug Lifecycle
  6.3 Bug Reporting Guidelines
  6.4 Bug Triage Guidelines

7 Changelog

8 Presentations


GlusterFS is a scalable network filesystem. Using common off-the-shelf hardware, you can create large, distributed storage solutions for media streaming, data analysis, and other data- and bandwidth-intensive tasks. GlusterFS is free and open source software.


CHAPTER 1

Quick Start Guide

For this tutorial, we will assume you are using Fedora 22 (or later) virtual machine(s). If you would like a more detailed walkthrough with instructions for installing using different methods (in local virtual machines, EC2 and baremetal) and different distributions, then have a look at the Install Guide.

1.1 Single Node Cluster

This is to demonstrate installation and setting up of GlusterFS in under five minutes. You would not want to do this in any real-world scenario.

Install glusterfs client and server packages:

# yum install glusterfs glusterfs-server glusterfs-fuse

Start glusterd service:

# service glusterd start

Create 4 loopback devices to be consumed as bricks. This exercise is to simulate 4 hard disks in 4 different nodes. Format the loopback devices with the XFS filesystem and mount them.

# truncate -s 1GB /srv/disk{1..4}
# for i in `seq 1 4`; do mkfs.xfs -i size=512 /srv/disk$i; done
# mkdir -p /export/brick{1..4}
# for i in `seq 1 4`; do echo "/srv/disk$i /export/brick$i xfs loop,inode64,noatime,nodiratime 0 0" >> /etc/fstab; done
# mount -a

Create a 2x2 Distributed-Replicated volume and start it:

# gluster volume create test replica 2 transport tcp `hostname`:/export/brick{1..4}/data force
# gluster volume start test

Mount the volume for consumption:

# mkdir /mnt/test
# mount -t glusterfs `hostname`:test /mnt/test

For illustration, create 10 empty files and see how they get distributed and replicated among the 4 bricks that make up the volume.


# touch /mnt/test/file{1..10}
# ls /mnt/test
# tree /export/

1.2 Multi Node Cluster

1.2.1 Step 1 – Have at least two nodes

• Fedora 22 (or later) on two nodes named “server1” and “server2”

• A working network connection

  • At least two virtual disks, one for the OS installation, and one to be used to serve GlusterFS storage (sdb). This will emulate a real-world deployment, where you would want to separate GlusterFS storage from the OS install.

1.2.2 Step 2 - Format and mount the bricks

(on both nodes): Note: These examples are going to assume the brick is going to reside on /dev/sdb1.

# mkfs.xfs -i size=512 /dev/sdb1
# mkdir -p /data/brick1
# echo '/dev/sdb1 /data/brick1 xfs defaults 1 2' >> /etc/fstab
# mount -a && mount

You should now see sdb1 mounted at /data/brick1

1.2.3 Step 3 - Installing GlusterFS

(on both servers) Install the software

# yum install glusterfs-server

Start the GlusterFS management daemon:

# service glusterd start
# service glusterd status

1.2.4 Step 4 - Configure the trusted pool

From “server1”

# gluster peer probe server2

Note: When using hostnames, the first server needs to be probed from one other server to set its hostname.

From “server2”

# gluster peer probe server1

Note: Once this pool has been established, only trusted members may probe new servers into the pool. A new server cannot probe the pool, it must be probed from the pool.


1.2.5 Step 5 - Set up a GlusterFS volume

From any single server:

# gluster volume create gv0 replica 2 server1:/data/brick1/gv0 server2:/data/brick1/gv0
# gluster volume start gv0

Confirm that the volume shows “Started”:

# gluster volume info

Note: If the volume is not started, clues as to what went wrong will be in log files under /var/log/glusterfs on one or both of the servers - usually in etc-glusterfs-glusterd.vol.log
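For example, a quick way to check the volume state and look for clues in the glusterd log (assuming the default log location mentioned above):

# gluster volume status gv0
# tail -n 50 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log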

1.2.6 Step 6 - Testing the GlusterFS volume

For this step, we will use one of the servers to mount the volume. Typically, you would do this from an external machine, known as a "client". Since using this method would require additional packages to be installed on the client machine, we will use one of the servers as a simple place to test first, as if it were that "client".

# mount -t glusterfs server1:/gv0 /mnt
# for i in `seq -w 1 100`; do cp -rp /var/log/messages /mnt/copy-test-$i; done

First, check the mount point:

# ls -lA /mnt | wc -l

You should see 100 files returned. Next, check the GlusterFS brick mount points on each server:

# ls -lA /data/brick1/gv0

You should see 100 files on each server using the method we listed here. Without replication, in a distribute-only volume (not detailed here), you should see about 50 files on each one.
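As a quick sanity check, you can also count the files on each brick directly (paths as used in this example); with replica 2, both servers should report the full set of 100 copies:

# ls /data/brick1/gv0 | wc -l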


CHAPTER 2

Overview and Concepts

2.1 Volume Types

A volume is a collection of bricks, and most of the Gluster file system operations happen on the volume. The Gluster file system supports different types of volumes based on the requirements. Some volumes are good for scaling storage size, some for improving performance and some for both.

Distributed Volume - This is the default glusterfs volume, i.e., while creating a volume, if you do not specify the type of the volume, the default option is to create a distributed volume. Here, files are distributed across various bricks in the volume. So file1 may be stored only in brick1 or brick2 but not on both. Hence there is no data redundancy. The purpose of such a storage volume is to easily and cheaply scale the volume size. However this also means that a brick failure will lead to complete loss of data and one must rely on the underlying hardware for data loss protection.

Create a Distributed Volume

gluster volume create NEW-VOLNAME [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...

For example, to create a distributed volume with four storage servers using TCP.

7

GlusterFS Documentation, Release 3.8.0

# gluster volume create test-volume server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4
Creation of test-volume has been successful
Please start the volume to access data

To display the volume info

# gluster volume info

Volume Name: test-volume
Type: Distribute
Status: Created
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: server1:/exp1
Brick2: server2:/exp2
Brick3: server3:/exp3
Brick4: server4:/exp4

Replicated Volume - In this volume we overcome the data loss problem faced in the distributed volume. Here exact copies of the data are maintained on all bricks. The number of replicas in the volume can be decided by the client while creating the volume. So we need to have at least two bricks to create a volume with 2 replicas or a minimum of three bricks to create a volume of 3 replicas. One major advantage of such a volume is that even if one brick fails the data can still be accessed from its replicated bricks. Such a volume is used for better reliability and data redundancy.

Create a Replicated Volume

gluster volume create NEW-VOLNAME [replica COUNT] [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...

For example, to create a replicated volume with two storage servers:

# gluster volume create test-volume replica 2 transport tcp server1:/exp1 server2:/exp2
Creation of test-volume has been successful
Please start the volume to access data


Distributed Replicated Volume - In this volume files are distributed across replicated sets of bricks. The number of bricks must be a multiple of the replica count. Also the order in which we specify the bricks matters since adjacent bricks become replicas of each other. This type of volume is used when high availability of data due to redundancy and scaling storage is required. So if there were eight bricks and replica count 2 then the first two bricks become replicas of each other, then the next two, and so on. This volume is denoted as 4x2. Similarly if there were eight bricks and replica count 4 then four bricks become replicas of each other and we denote this volume as a 2x4 volume.

Create the distributed replicated volume:

# gluster volume create NEW-VOLNAME [replica COUNT] [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...

For example, four node distributed (replicated) volume with a two-way mirror:

# gluster volume create test-volume replica 2 transport tcp server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4
Creation of test-volume has been successful
Please start the volume to access data
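To make the brick-ordering rule above concrete, here is a sketch of the eight-brick, replica-2 (4x2) case; the server names and export paths are illustrative. server1:/exp1 and server2:/exp2 form the first replica pair, server3:/exp3 and server4:/exp4 the second, and so on:

# gluster volume create dist-rep-vol replica 2 transport tcp server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4 server5:/exp5 server6:/exp6 server7:/exp7 server8:/exp8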

Striped Volume - Consider a large file being stored in a brick which is frequently accessed by many clients at the same time. This will cause too much load on a single brick and would reduce the performance. In a striped volume the data is stored in the bricks after dividing it into different stripes. So the large file will be divided into smaller chunks (equal to the number of bricks in the volume) and each chunk is stored in a brick. Now the load is distributed and the file can be fetched faster, but no data redundancy is provided.

Create a Striped Volume

# gluster volume create NEW-VOLNAME [stripe COUNT] [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...

For example, to create a striped volume across two storage servers:

# gluster volume create test-volume stripe 2 transport tcp server1:/exp1 server2:/exp2
Creation of test-volume has been successful
Please start the volume to access data


Distributed Striped Volume - This is similar to a striped Glusterfs volume except that the stripes can now be distributed across a greater number of bricks. However the number of bricks must be a multiple of the number of stripes. So if we want to increase volume size we must add bricks in multiples of the stripe count.

Create the distributed striped volume:

gluster volume create NEW-VOLNAME [stripe COUNT] [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...

For example, to create a distributed striped volume across eight storage servers:

# gluster volume create test-volume stripe 4 transport tcp server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4 server5:/exp5 server6:/exp6 server7:/exp7 server8:/exp8
Creation of test-volume has been successful
Please start the volume to access data.

2.2 FUSE

GlusterFS is a userspace filesystem. This was a decision made by the GlusterFS developers initially, as getting modules into the Linux kernel is a very long and difficult process.

Being a userspace filesystem, to interact with the kernel VFS, GlusterFS makes use of FUSE (Filesystem in Userspace). For a long time, implementation of a userspace filesystem was considered impossible. FUSE was developed as a solution for this. FUSE is a kernel module that supports interaction between the kernel VFS and non-privileged user applications, and it has an API that can be accessed from userspace. Using this API, any type of filesystem can be written using almost any language you prefer, as there are many bindings between FUSE and other languages.

Structural diagram of FUSE

This shows a filesystem "hello world" that is compiled to create a binary "hello". It is executed with a filesystem mount point /tmp/fuse. Then the user issues a command ls -l on the mount point /tmp/fuse. This command reaches VFS via glibc and since the mount /tmp/fuse corresponds to a FUSE based filesystem, VFS passes it over to the FUSE module. The FUSE kernel module contacts the actual filesystem binary "hello" after passing through glibc and the FUSE library in userspace (libfuse). The result is returned by the "hello" through the same path and reaches the ls -l command.


The communication between the FUSE kernel module and the FUSE library (libfuse) is via a special file descriptor which is obtained by opening /dev/fuse. This file can be opened multiple times, and the obtained file descriptor is passed to the mount syscall, to match up the descriptor with the mounted filesystem.
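On a typical Linux system you can verify the pieces involved, the FUSE kernel module and the /dev/fuse character device, with a couple of commands (a minimal check; command names as found on most distributions):

# modprobe fuse
# lsmod | grep fuse
# ls -l /dev/fuse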

• More about userspace filesystems

• FUSE reference

2.3 Translators

Translating “translators”:

  • A translator converts requests from users into requests for storage: one to one, one to many, one to zero (e.g. caching).

  • A translator can modify requests on the way through:

– convert one request type to another ( during the request transfer amongst the translators)

– modify paths, flags, even data (e.g. encryption)

• Translators can intercept or block the requests. (e.g. access control)

• Or spawn new requests (e.g. pre-fetch)

How Do Translators Work?

• Translators are implemented as shared objects

• Dynamically loaded according to ‘volfile’

    – dlopen/dlsym (set up pointers to parents/children)

– call init (constructor) (call IO functions through fops)

• There are conventions for validating/passing options, etc.

  • The configuration of translators (since GlusterFS 3.1) is managed through the gluster command line interface (CLI), so you don't need to know in what order to graph the translators together.
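For example, tuning a single translator option from the CLI regenerates the volfiles for you; the option below belongs to the io-cache performance translator, and the volume name is the one used elsewhere in this guide:

# gluster volume set test-volume performance.cache-size 256MB
# gluster volume info test-volume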

2.3.1 Types of Translators

List of known translators with their current status.

Translator Type    Functional Purpose

Storage            Lowest level translator, stores and accesses data from the local file system.
Debug              Provide interface and statistics for errors and debugging.
Cluster            Handle distribution and replication of data as it relates to writing to and reading from bricks and nodes.
Encryption         Extension translators for on-the-fly encryption/decryption of stored data.
Protocol           Extension translators for client/server communication protocols.
Performance        Tuning translators to adjust for workload and I/O profiles.
Bindings           Add extensibility, e.g. the Python interface written by Jeff Darcy to extend API interaction with GlusterFS.
System             System access translators, e.g. interfacing with file system access control.
Scheduler          I/O schedulers that determine how to distribute new write operations across clustered systems.
Features           Add additional features such as Quotas, Filters, Locks, etc.

The default/general hierarchy of translators in vol files :


All the translators hooked together to perform a function is called a graph. The left set of translators comprises the client stack. The right set of translators comprises the server stack.

The glusterfs translators can be sub-divided into many categories, but two important categories are Cluster and Performance translators:

One of the most important and the first translator the data/request has to go through is the fuse translator, which falls under the category of Mount Translators.

Cluster Translators:

• DHT(Distributed Hash Table)

• AFR(Automatic File Replication)

Performance Translators:

• io-cache

• io-threads

• md-cache

• open behind

• quick read

• read-ahead

• readdir-ahead

• write-behind

Other Feature Translators include:

• changelog

  • locks: provides internal locking operations called inodelk and entrylk which are used by AFR to achieve synchronization of operations on files or directories that conflict with each other.

• marker

• quota

Debug Translators

• trace

• io-stats

2.3.2 DHT(Distributed Hash Table) Translator

What is DHT?

DHT is the real core of how GlusterFS aggregates capacity and performance across multiple servers. Its responsibility is to place each file on exactly one of its subvolumes - unlike either replication (which places copies on all of its subvolumes) or striping (which places pieces onto all of its subvolumes). It's a routing function, not splitting or copying.

How DHT works?

The basic method used in DHT is consistent hashing. Each subvolume (brick) is assigned a range within a 32-bit hash space, covering the entire range with no holes or overlaps. Then each file is also assigned a value in that same space, by hashing its name. Exactly one brick will have an assigned range including the file's hash value, and so the file "should" be on that brick. However, there are many cases where that won't be the case, such as when the set of bricks (and therefore the assignment of ranges) has changed since the file was created, or when a brick is nearly full. Much of the complexity in DHT involves these special cases, which we'll discuss in a moment.

When you open() a file, the distribute translator is giving one piece of information to find your file, the file name. To determine where that file is, the translator runs the file name through a hashing algorithm in order to turn that file name into a number.

A few observations about DHT hash-value assignment:

1. The assignment of hash ranges to bricks is determined by extended attributes stored on directories, hence distribution is directory-specific (see the example after this list).

2. Consistent hashing is usually thought of as hashing around a circle, but in GlusterFS it's more linear. There's no need to "wrap around" at zero, because there's always a break (between one brick's range and another's) at zero.

3. If a brick is missing, there will be a hole in the hash space. Even worse, if hash ranges are reassigned while a brick is offline, some of the new ranges might overlap with the (now out of date) range stored on that brick, creating a bit of confusion about where files should be.
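The per-directory hash range is visible as a trusted extended attribute on each brick's copy of the directory. A quick way to inspect it, run as root on a server (brick path as used in the single-node example earlier; requires the getfattr tool from the attr package):

# getfattr -n trusted.glusterfs.dht -e hex /export/brick1/data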

2.3.3 AFR(Automatic File Replication) Translator

The Automatic File Replication (AFR) translator in GlusterFS makes use of the extended attributes to keep track of the file operations. It is responsible for replicating the data across the bricks.

Responsibilities of AFR

Its responsibilities include the following:

1. Maintain replication consistency (i.e. data on both the bricks should be the same, even in the cases where there are operations happening on the same file/directory in parallel from multiple applications/mount points, as long as all the bricks in the replica set are up).

2. Provide a way of recovering data in case of failures as long as there is at least one brick which has the correct data.

3. Serve fresh data for read/stat/readdir etc.

Overall working of GlusterFS

As soon as GlusterFS is installed in a server node, a gluster management daemon (glusterd) binary will be created. This daemon should be running in all participating nodes in the cluster. After starting glusterd, a trusted server pool (TSP) can be created consisting of all storage server nodes (a TSP can contain even a single node). Now bricks, which are the basic units of storage, can be created as export directories in these servers. Any number of bricks from this TSP can be clubbed together to form a volume.

Once a volume is created, a glusterfsd process starts running in each of the participating bricks. Along with this, configuration files known as vol files will be generated inside /var/lib/glusterd/vols/. There will be configuration files corresponding to each brick in the volume, containing all the details about that particular brick. The configuration file required by a client process will also be created. Now our filesystem is ready to use. We can mount this volume on a client machine very easily as follows and use it like we use a local storage:

# mount.glusterfs <IP or hostname>:<volume_name> <mount_point>


IP or hostname can be that of any node in the trusted server pool in which the required volume is created.

When we mount the volume in the client, the client glusterfs process communicates with the servers' glusterd process. The server glusterd process sends a configuration file (vol file) containing the list of client translators and another containing the information of each brick in the volume, with the help of which the client glusterfs process can now directly communicate with each brick's glusterfsd process. The setup is now complete and the volume is now ready for the client's service.
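You can see the generated vol files on any server in the pool; the exact file names vary with the volume name and transport, so the listing below is just a sketch for a volume called gv0:

# ls /var/lib/glusterd/vols/gv0/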

When a system call (file operation or fop) is issued by the client in the mounted filesystem, the VFS (identifying the type of filesystem to be glusterfs) will send the request to the FUSE kernel module. The FUSE kernel module will in turn send it to the GlusterFS process in the userspace of the client node via /dev/fuse (this has been described in the FUSE section). The GlusterFS process on the client consists of a stack of translators called the client translators, which are defined in the configuration file (vol file) sent by the storage server glusterd process. The first among these translators is the FUSE translator, which consists of the FUSE library (libfuse). Each translator has got functions corresponding to each file operation or fop supported by glusterfs. The request will hit the corresponding function in each of the translators. Main client translators include:

• FUSE translator

  • DHT translator - The DHT translator maps the request to the correct brick that contains the file or directory required.

  • AFR translator - It receives the request from the previous translator and if the volume type is replicate, it duplicates the request and passes it on to the Protocol client translators of the replicas.

  • Protocol Client translator - The Protocol Client translator is the last in the client translator stack. This translator is divided into multiple threads, one for each brick in the volume. This will directly communicate with the glusterfsd of each brick.

In the storage server node that contains the brick in need, the request again goes through a series of translators known as server translators, the main ones being:

• Protocol server translator

• POSIX translator

The request will finally reach VFS and then will communicate with the underlying native filesystem. The response will retrace the same path.


2.4 Geo-Replication

Geo-replication provides asynchronous replication of data across geographically distinct locations and was introduced in Glusterfs 3.2. It mainly works across WAN and is used to replicate the entire volume, unlike AFR which is intra-cluster replication. This is mainly useful for backup of entire data for disaster recovery.

Geo-replication uses a master-slave model, whereby replication occurs between the Master, a GlusterFS volume, and the Slave, which can be a local directory or a GlusterFS volume. The slave (local directory or volume) is accessed using an SSH tunnel.
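As a rough sketch of the typical workflow (the volume and host names are illustrative, and passwordless SSH from the master to the slave host is assumed to be set up already):

# gluster volume geo-replication mastervol slavehost::slavevol create push-pem
# gluster volume geo-replication mastervol slavehost::slavevol start
# gluster volume geo-replication mastervol slavehost::slavevol status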

Geo-replication provides an incremental replication service over Local Area Networks (LANs), Wide Area Network(WANs), and across the Internet.

Geo-replication over LAN

You can configure Geo-replication to mirror data over a Local Area Network.

Geo-replication over WAN

You can configure Geo-replication to replicate data over a Wide Area Network.

Geo-replication over Internet

You can configure Geo-replication to mirror data over the Internet.

Multi-site cascading Geo-replication

You can configure Geo-replication to mirror data in a cascading fashion across multiple sites.


There are mainly two aspects to asynchronously replicating data: change detection and replication.

Change detection - This covers the details of the file operations that need to be replicated. There are two methods to sync the detected changes, namely a) changelogs and b) xsync:

a. Changelogs - Changelog is a translator which records necessary details for the fops that occur. The changes can be written in binary format or ASCII. There are three categories, with each category represented by a specific changelog format. All three types of categories are recorded in a single changelog file.

Entry - create(), mkdir(), mknod(), symlink(), link(), rename(), unlink(), rmdir()

Data - write(), writev(), truncate(), ftruncate()

Meta - setattr(), fsetattr(), setxattr(), fsetxattr(), removexattr(), fremovexattr()

In order to record the type of operation and the entity that underwent it, a type identifier is used. Normally, the entity on which the operation is performed would be identified by the pathname, but we choose to use the GlusterFS internal file identifier (GFID) instead (as GlusterFS supports a GFID based backend and the pathname field may not always be valid, among other reasons which are out of scope of this document). Therefore, the format of the record for the three types of operation can be summarized as follows:

Entry - GFID + FOP + MODE + UID + GID + PARGFID/BNAME [PARGFID/BNAME]

Meta - GFID of the file

Data - GFID of the file

GFIDs are analogous to inodes. Data and Meta fops record the GFID of the entity on which the operation was performed, thereby recording that there was a data/metadata change on the inode. Entry fops record at the minimum a set of six or seven records (depending on the type of operation), which is sufficient to identify what type of operation the entity underwent. Normally this record includes the GFID of the entity, the type of file operation (which is an integer [an enumerated value which is used in Glusterfs]) and the parent GFID and the basename (analogous to parent inode and basename).

The changelog file is rolled over after a specific time interval. We then perform processing operations on the file like converting it to an understandable/human readable format, keeping a private copy of the changelog etc. The library then consumes these logs and serves application requests.
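If you want to see changelogs in action, the changelog translator can be toggled per volume, and the rolled-over files typically end up under the brick's .glusterfs directory; the volume name and brick path below are illustrative:

# gluster volume set test-volume changelog.changelog on
# ls /export/brick1/data/.glusterfs/changelogs/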

b. Xsync - The marker translator maintains an extended attribute "xtime" for each file and directory. Whenever any update happens it would update the xtime attribute of that file and all its ancestors. So the change is propagated from the node (where the change has occurred) all the way to the root.

Consider the above directory tree structure. At time T1 the master and slave were in sync with each other.

At time T2 a new file File2 was created. This will trigger the xtime marking (where xtime is the current timestamp) from File2 up to the root, i.e., the xtime of File2, Dir3, Dir1 and finally Dir0 will all be updated.

The geo-replication daemon crawls the file system based on the condition that xtime(master) > xtime(slave). Hence in our example it would crawl only the left part of the directory structure, since the right part of the directory structure still has an equal timestamp. Although the crawling algorithm is fast we still need to crawl a good part of the directory structure.
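The xtime values maintained by the marker translator live as trusted extended attributes on the brick's files and directories; you can list the gluster-related attributes with getfattr (the exact attribute name includes a volume UUID, and the path below is illustrative):

# getfattr -d -m 'trusted.glusterfs' -e hex /export/brick1/data/Dir1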

Replication - We use rsync for data replication. Rsync is an external utility which will calculate the diff of the two files and send this difference from the source to the slave.

2.5 Terminologies

Access Control Lists Access Control Lists (ACLs) allow you to assign different permissions for different users or groups even though they do not correspond to the original owner or the owning group.

Brick Brick is the basic unit of storage, represented by an export directory on a server in the trusted storage pool.


Cluster A cluster is a group of linked computers, working together closely, thus in many respects forming a single computer.

Distributed File System A file system that allows multiple clients to concurrently access data over a computer network.

FUSE Filesystem in Userspace (FUSE) is a loadable kernel module for Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code. This is achieved by running file system code in user space while the FUSE module provides only a "bridge" to the actual kernel interfaces.

glusterd Gluster management daemon that needs to run on all servers in the trusted storage pool.

Geo-Replication Geo-replication provides a continuous, asynchronous, and incremental replication service from one site to another over Local Area Networks (LANs), Wide Area Networks (WANs), and across the Internet.

Metadata Metadata is defined as data providing information about one or more other pieces of data. There is no special metadata storage concept in GlusterFS. The metadata is stored with the file data itself.

Namespace Namespace is an abstract container or environment created to hold a logical grouping of unique identifiers or symbols. Each Gluster volume exposes a single namespace as a POSIX mount point that contains every file in the cluster.

POSIX Portable Operating System Interface [for Unix] is the name of a family of related standards specified by the IEEE to define the application programming interface (API), along with shell and utilities interfaces, for software compatible with variants of the Unix operating system. Gluster exports a fully POSIX compliant file system.

RAID "Redundant Array of Inexpensive Disks" is a technology that provides increased storage reliability through redundancy, combining multiple low-cost, less-reliable disk drive components into a logical unit where all drives in the array are interdependent.

RRDNS Round Robin Domain Name Service (RRDNS) is a method to distribute load across application servers. It is implemented by creating multiple A records with the same name and different IP addresses in the zone file of a DNS server.

Trusted Storage Pool A storage pool is a trusted network of storage servers. When you start the first server, the storage pool consists of that server alone.

Userspace Applications running in user space don't directly interact with hardware, instead using the kernel to moderate access. Userspace applications are generally more portable than applications in kernel space. Gluster is a user space application.

Volume A volume is a logical collection of bricks. Most of the gluster management operations happen on the volume.

Vol file .vol files are configuration files used by the glusterfs process. Volfiles are usually located at /var/lib/glusterd/vols/volume-name/, e.g. vol-name-fuse.vol, export-brick-name.vol, etc. Sub-volumes in the .vol files are listed bottom-up; tracing them forms a tree structure in which the client volumes come last in the hierarchy.

Client The machine which mounts the volume (this may also be a server).

Server The machine which hosts the actual file system in which the data will be stored.

Replicate Replicate is generally done to make a redundancy of the storage for data availability.


CHAPTER 3

Installation Guide

3.1 Getting Started

This tutorial will cover different options for getting a Gluster cluster up and running. Here is a rundown of the steps we need to take.

To start, we will go over some common things you will need to know for setting up Gluster.

Next, choose the method you want to use to set up your first cluster:

  • Within a virtual machine

  • To bare metal servers

  • To EC2 instances in Amazon

Finally, we will install Gluster, create a few volumes, and test using them.

3.1.1 Common Setup Criteria

No matter where you will be installing Gluster, it helps to understand a few key concepts on what the moving parts are.

First, it is important to understand that GlusterFS isn't really a filesystem in and of itself. It concatenates existing filesystems into one (or more) big chunks so that data being written into or read out of Gluster gets distributed across multiple hosts simultaneously. This means that you can use space from any host that you have available. Typically, XFS is recommended but it can be used with other filesystems as well. Most commonly EXT4 is used when XFS isn't, but you can (and many, many people do) use another filesystem that suits you. Now that we understand that, we can define a few of the common terms used in Gluster.

• A trusted pool refers collectively to the hosts in a given Gluster Cluster.

  • A node or "server" refers to any server that is part of a trusted pool. In general, this assumes all nodes are in the same trusted pool.

• A brick is used to refer to any device (really this means filesystem) that is being used for Gluster storage.

• An export refers to the mount path of the brick(s) on a given server, for example, /export/brick1

• The term Global Namespace is a fancy way of saying a Gluster volume

  • A Gluster volume is a collection of one or more bricks (of course, typically this is two or more). This is analogous to /etc/exports entries for NFS.

  • GNFS and kNFS. GNFS is how we refer to our inline NFS server. kNFS stands for kernel NFS, or, as most people would say, just plain NFS. Most often, you will want kNFS services disabled on the Gluster nodes (see the example below). Gluster NFS doesn't take any additional configuration and works just like you would expect with NFSv3. It is possible to configure Gluster and NFS to live in harmony if you want to.
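A minimal sketch of disabling the kernel NFS service so it does not conflict with Gluster NFS; the service names depend on the distribution and init system.

On systemd-based distributions:

# systemctl disable --now nfs-server

On older SysV-style systems:

# service nfs stop && chkconfig nfs off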

Other notes:


  • For this test, if you do not have DNS set up, you can get away with using /etc/hosts entries for the two nodes (a sample is shown after these notes). However, when you move from this basic setup to using Gluster in production, correct DNS entries (forward and reverse) and NTP are essential.

  • When you install the Operating System, do not format the Gluster storage disks! We will use specific settings with the mkfs command later on when we set up Gluster. If you are testing with a single disk (not recommended), make sure to carve out a free partition or two to be used by Gluster later, so that you can format or reformat at will during your testing.

  • Firewalls are great, except when they aren't. For storage servers, being able to operate in a trusted environment without firewalls can mean huge gains in performance, and is recommended.
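For example, hypothetical /etc/hosts entries for the two test nodes might look like this (the addresses are placeholders; use your own):

192.168.1.101   server1
192.168.1.102   server2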

3.2 Configuration

3.2.1 Configure Firewall

For Gluster to communicate within a cluster, either the firewalls have to be turned off or communication has to be enabled for each server.

iptables -I INPUT -p all -s `<ip-address>` -j ACCEPT

3.2.2 Configure the trusted pool

Remember that the trusted pool is the term used to define a cluster of nodes in Gluster. Choose a server to be your "primary" server. This is just to keep things simple, you will generally want to run all commands from this tutorial. Keep in mind, running many Gluster specific commands (like 'gluster volume create') on one server in the cluster will execute the same command on all other servers.

gluster peer probe (hostname of the other server in the cluster, or IP address if you don't have DNS or /etc/hosts entries)

Notice that running ‘gluster peer status‘ from the second node shows that the first node has already been added.

3.2.3 Partition, Format and mount the bricks

Assuming you have a brick at /dev/sdb:

fdisk /dev/sdb and create a single partition

3.2.4 Format the partition

mkfs.xfs -i size=512 /dev/sdb1

3.2.5 Mount the partition as a Gluster “brick”

mkdir -p /export/sdb1 && mount /dev/sdb1 /export/sdb1 && mkdir -p /export/sdb1/brick


3.2.6 Add an entry to /etc/fstab

echo "/dev/sdb1 /export/sdb1 xfs defaults 0 0" >> /etc/fstab

3.2.7 Set up a Gluster volume

The most basic Gluster volume type is a "Distribute only" volume (also referred to as a "pure DHT" volume if you want to impress the folks at the water cooler). This type of volume simply distributes the data evenly across the available bricks in a volume. So, if I write 100 files, on average, fifty will end up on one server, and fifty will end up on another. This is faster than a "replicated" volume, but isn't as popular since it doesn't give you two of the most sought after features of Gluster - multiple copies of the data, and automatic failover if something goes wrong. To set up a replicated volume:

gluster volume create gv0 replica 2 node01.mydomain.net:/export/sdb1/brick node02.mydomain.net:/export/sdb1/brick

Breaking this down into pieces, the first part says to create a gluster volume named gv0 (the name is arbitrary, gv0 was chosen simply because it's less typing than gluster_volume_0). Next, we tell it to make the volume a replica volume, and to keep a copy of the data on at least 2 bricks at any given time. Since we only have two bricks total, this means each server will house a copy of the data. Lastly, we specify which nodes to use, and which bricks on those nodes. The order here is important when you have more bricks... it is possible (as of the most current release as of this writing, Gluster 3.3) to specify the bricks in such a way that you would make both copies of the data reside on a single node. This would make for an embarrassing explanation to your boss when your bulletproof, completely redundant, always on super cluster comes to a grinding halt when a single point of failure occurs.

Now, we can check to make sure things are working as expected:

gluster volume info

And you should see results similar to the following:

Volume Name: gv0
Type: Replicate
Volume ID: 8bc3e96b-a1b6-457d-8f7a-a91d1d4dc019
Status: Created
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node01.yourdomain.net:/export/sdb1/brick
Brick2: node02.yourdomain.net:/export/sdb1/brick

This shows us essentially what we just specified during the volume creation. The one thing to mention is the "Status". A status of "Created" means that the volume has been created, but hasn't yet been started, which would cause any attempt to mount the volume to fail.

Now, we should start the volume.

gluster volume start gv0

Find all documentation here


3.3 Installing Gluster

For RPM based distributions, if you will be using InfiniBand, add the glusterfs RDMA package to the installations. For RPM based systems, yum is used as the install method in order to satisfy external dependencies such as compat-readline5.

For Debian

Download the packages

wget -nd -nc -r -A.deb http://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/wheezy/

(Note from reader: The above does not work. Check http://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/ for the 3.5 version or use http://packages.debian.org/wheezy/glusterfs-server)

Install the Gluster packages (do this on both servers)

dpkg -i glusterfs_3.5.2-4_amd64.deb

For Ubuntu

Ubuntu 10 and 12: install python-software-properties:

sudo apt-get install python-software-properties

Ubuntu 14: install software-properties-common:

sudo apt-get install software-properties-common

Then add the community GlusterFS PPA:

sudo add-apt-repository ppa:gluster/glusterfs-3.5
sudo apt-get update

Finally, install the packages:

sudo apt-get install glusterfs-server

Note: Packages exist for Ubuntu 12.04 LTS, 12.10, 13.10, and 14.04 LTS

For Red Hat/CentOS

Download the packages (modify URL for your specific release and architecture).

wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/RHEL/glusterfs-epel.repo

Install the Gluster packages (do this on both servers)

yum install glusterfs-server

For Fedora

Install the Gluster packages (do this on both servers)

yum install glusterfs-server

Once you are finished installing, you can move on to the configuration section.


3.4 Overview

This document will get you up to speed with some hands-on experience with Gluster by guiding you through the steps of setting it up for the first time. If you are looking to get right into things, you are in the right place. If you want just the bare minimum steps, see the Quick Start Guide.

If you want some in-depth information on each of the steps, you are in the right place. Both the guides will get you to a working Gluster cluster, so it depends on you how much time you want to spend. The Quick Start Guide should have you up and running in ten minutes or less. This guide can easily be done in a lunch break, and still gives you time to have a quick bite to eat. The Getting Started guide can be done easily in a few hours, depending on how much testing you want to do.

After you deploy Gluster by following these steps, we recommend that you read the Gluster Admin Guide to learn how to administer Gluster and how to select a volume type that fits your needs. Also, be sure to enlist the help of the Gluster community via the IRC channel or Q&A section. We want you to be successful in as short a time as possible.

Overview:

Before we begin, let's talk about what Gluster is, dispel a few myths and misconceptions, and define a few terms. This will help you to avoid some of the common issues that others encounter most frequently.

3.4.1 What is Gluster

Gluster is a distributed scale out filesystem that allows rapid provisioning of additional storage based on your storage consumption needs. It incorporates automatic failover as a primary feature. All of this is accomplished without a centralized metadata server.

3.4.2 What is Gluster without making me learn an extra glossary of terminology?

• Gluster is an easy way to provision your own storage backend NAS using almost any hardware you choose.

• You can add as much as you want to start with, and if you need more later, adding more takes just a few steps.

  • You can configure failover automatically, so that if a server goes down, you don't lose access to the data. No manual steps are required for failover. When you fix the server that failed and bring it back online, you don't have to do anything to get the data back except wait. In the mean time, the most current copy of your data keeps getting served from the node that was still running.

  • You can build a clustered filesystem in a matter of minutes... it is trivially easy for basic setups.

  • It takes advantage of what we refer to as "commodity hardware", which means, we run on just about any hardware you can think of, from that stack of decomm's and gigabit switches in the corner no one can figure out what to do with (how many license servers do you really need, after all?), to that dream array you were speccing out online. Don't worry, I won't tell your boss.

  • It takes advantage of commodity software too. No need to mess with kernels or fine tune the OS to a tee. We run on top of most unix filesystems, with XFS and ext4 being the most popular choices. We do have some recommendations for more heavily utilized arrays, but these are simple to implement and you probably have some of these configured already anyway.

  • Gluster data can be accessed from just about anywhere - You can use traditional NFS, SMB/CIFS for Windows clients, or our own native GlusterFS (a few additional packages are needed on the client machines for this, but as you will see, they are quite small).

• There are even more advanced features than this, but for now we will focus on the basics.


  • It's not just a toy. Gluster is enterprise ready, and commercial support is available if you need it. It is used in some of the most taxing environments like media serving, natural resource exploration, medical imaging, and even as a filesystem for Big Data.

3.4.3 Is Gluster going to work for me and what I need it to do?

Most likely, yes. People use Gluster for all sorts of things. You are encouraged to ask around in our IRC channel or Q&A forums to see if anyone has tried something similar. That being said, there are a few places where Gluster is going to need more consideration than others.

  • Accessing Gluster from SMB/CIFS is often going to be slow by most people's standards. If you only have moderate access by users, then it most likely won't be an issue for you. On the other hand, adding enough Gluster servers into the mix, some people have seen better performance with us than other solutions due to the scale out nature of the technology.

  • Gluster does not support so called "structured data", meaning live, SQL databases. Of course, using Gluster to backup and restore the database would be fine.

  • Gluster is traditionally better when using file sizes of at least 16KB (with a sweet spot around 128KB or so).

3.4.4 How many billions of dollars is it going to cost to set up a cluster? Don't I need redundant networking, super fast SSD's, technology from Alpha Centauri delivered by men in black, etc...?

I have never seen anyone spend even close to a billion, unless they got the rust proof coating on the servers. You don't seem like the type that would get bamboozled like that, so have no fear. For the purpose of this tutorial, if your laptop can run two VM's with 1GB of memory each, you can get started testing and the only thing you are going to pay for is coffee (assuming the coffee shop doesn't make you pay them back for the electricity to power your laptop).

If you want to test on bare metal, since Gluster is built with commodity hardware in mind, and because there is no centralized meta-data server, a very simple cluster can be deployed with two basic servers (2 CPU's, 4GB of RAM each, 1 Gigabit network). This is sufficient to have a nice file share or a place to put some nightly backups. Gluster is deployed successfully on all kinds of disks, from the lowliest 5200 RPM SATA to the mightiest 1.21 gigawatt SSD's. The more performance you need, the more consideration you will want to put into how much hardware to buy, but the great thing about Gluster is that you can start small, and add on as your needs grow.

3.4.5 OK, but if I add servers on later, don’t they have to be exactly the same?

In a perfect world, sure. Having the hardware be the same means less troubleshooting when the fires start popping up. But plenty of people deploy Gluster on mix and match hardware, and successfully.

Get started by checking some Common Criteria

3.5 Quick Start Guide

Here are the bare minimum steps you need to get Gluster up and running. If this is your first time setting things up, it is recommended that you choose your desired setup method, i.e. AWS, virtual machines or baremetal, after step four, adding an entry to fstab. If you want to get right to it (and don't need more information), the steps below are all you need to get started.

You will need to have at least two nodes with a 64 bit OS and a working network connection. At least one gig of RAM is the bare minimum recommended for testing, and you will want at least 8GB in any system you plan on doing any real work on. A single cpu is fine for testing, as long as it is 64 bit.

Partition, Format and mount the bricks

Assuming you have a brick at /dev/sdb:


fdisk /dev/sdb and create a single partition

Format the partition

mkfs.xfs -i size=512 -n size=8192 /dev/sdb1

Mount the partition as a Gluster “brick”

mkdir -p /export/sdb1 && mount /dev/sdb1 /export/sdb1 && mkdir -p /export/sdb1/brick

Add an entry to /etc/fstab

echo "/dev/sdb1 /export/sdb1 xfs defaults 0 0" >> /etc/fstab

Install Gluster packages on both nodes

rpm -Uvh http://download.gluster.com/pub/gluster/glusterfs/LATEST/Fedora/glusterfs{,-server,-fuse,-geo-replication}-3.3.1-1.fc16.x86_64.rpm

Note: This example assumes Fedora 16

Run the gluster peer probe command

Note: From one node to the other nodes (do not peer probe the first node itself, repeat this command for all nodes thatshould be in the trusted pool)

gluster peer probe <ip or hostname of another host>

Configure your Gluster volume

gluster volume create testvol replica 2 transport tcp node01:/export/sdb1/brick node02:/export/sdb1/brick

Test using the volume

mkdir /mnt/gluster; mount -t glusterfs node01:/testvol /mnt/gluster; cp -r /var/log /mnt/gluster

3.6 Setup Baremetal

Note: You only need one of the three setup methods!

Setup, Method 2 - Setting up on physical servers

To set up Gluster on physical servers, I recommend two servers of very modest specifications (2 CPU's, 2GB of RAM, 1GBE). Since we are dealing with physical hardware here, keep in mind, what we are showing here is for testing purposes. In the end, remember that forces beyond your control (aka, your bosses' boss...) can force you to take that "just for a quick test" environment right into production, despite your kicking and screaming against it. To prevent this, it can be a good idea to deploy your test environment as much as possible the same way you would a production environment (in case it becomes one, as mentioned above). That being said, here is a reminder of some of the best practices we mentioned before:

• Make sure DNS and NTP are setup, correct, and working

  • If you have access to a backend storage network, use it! 10GBE or InfiniBand are great if you have access to them, but even a 1GBE backbone can help you get the most out of your deployment. Make sure that the interfaces you are going to use are also in DNS since we will be using the hostnames when we deploy Gluster.

  • When it comes to disks, the more the merrier. Although you could technically fake things out with a single disk, there would be performance issues as soon as you tried to do any real work on the servers.


With the explosion of commodity hardware, you don't need to be a hardware expert these days to deploy a server. Although this is generally a good thing, it also means that paying attention to some important, performance impacting BIOS settings is commonly ignored. A few things I have seen cause issues when people didn't know to look for them:

  • Most manufacturers enable power saving mode by default. This is a great idea for servers that do not have high performance requirements. For the average storage server, the performance impact of the power savings is not a reasonable trade off.

  • Newer motherboards and processors have lots of nifty features! Enhancements in virtualization, newer ways of doing predictive algorithms and NUMA are just a few to mention. To be safe, many manufacturers ship hardware with settings meant to work with as massive a variety of workloads and configurations as they have customers. One issue you could face is when you set up that blazing fast 10GBE card you were so thrilled about installing? In many cases, it would end up being crippled by a default 1x speed put in place on the PCI-E bus by the motherboard.

Thankfully, most manufacturers show all the BIOS settings, including the defaults, right in the manual. It only takes a few minutes to download, and you don't even have to power off the server unless you need to make changes. More and more boards include the functionality to make changes in the BIOS on the fly without even powering the box off. One word of caution of course, is don't go too crazy. Fretting over each tiny little detail and setting is usually not worth the time, and the more changes you make, the more you need to document and implement later. Try to find the happy balance between time spent managing the hardware (which ideally should be as close to zero after you set up initially) and the expected gains you get back from it.

Finally, remember that some hardware really is better than others. Without pointing fingers anywhere specifically, it is often true that onboard components are not as robust as add-ons. As a general rule, you can safely delegate the on-board hardware to things like the management network for the NIC's, and for installing the OS onto a SATA drive. At least twice a year you should check the manufacturer's website for bulletins about your hardware. Critical performance issues are often resolved with a simple driver or firmware update. As often as not, these updates affect the two most critical pieces of hardware on a machine you want to use for networked storage - the RAID controller and the NIC's.

Once you have setup the servers and installed the OS, you are ready to move on to the install section.

3.7 Deploying in AWS

Deploying in Amazon can be one of the fastest ways to get up and running with Gluster. Of course, most of what we cover here will work with other cloud platforms.

  • Deploy at least two instances. For testing, you can use micro instances (I even go as far as using spot instances in most cases). Debates rage on what size instance to use in production, and there is really no correct answer. As with most things, the real answer is "whatever works for you", where the trade-offs between cost and performance are balanced in a continual dance of trying to make your project successful while making sure there is enough money left over in the budget for you to get that sweet new ping pong table in the break room.

  • For cloud platforms, your data is wide open right from the start. As such, you shouldn't allow open access to all ports in your security groups if you plan to put a single piece of even the least valuable information on the test instances. By least valuable, I mean "Cash value of this coupon is 1/100th of 1 cent" kind of least valuable. Don't be the next one to end up as a breaking news flash on the latest inconsiderate company to allow their data to fall into the hands of the baddies. See Step 2 for the minimum ports you will need open to use Gluster.

  • You can use the free "ephemeral" storage for the Gluster bricks during testing, but make sure to use some form of protection against data loss when you move to production. Typically this means EBS backed volumes or using S3 to periodically back up your data bricks.

Other notes:

  • In production, it is recommended to replicate your VM's across multiple zones. For the purpose of this tutorial, it is overkill, but if anyone is interested in this please let us know since we are always looking to write articles on the most requested features and questions.

  • Using EBS volumes and Elastic IP's is also recommended in production. For testing, you can safely ignore these as long as you are aware that the data could be lost at any moment, so make sure your test deployment is just that, testing only.

  • Performance can fluctuate wildly in a cloud environment. If performance issues are seen, there are several possible strategies, but keep in mind that this is the perfect place to take advantage of the scale-out capability of Gluster. While it is not true in all cases that deploying more instances will necessarily result in a "faster" cluster, in general you will see that adding more nodes means more performance for the cluster overall.

• If a node reboots, you will typically need to do some extra work to get Gluster running again using the default EC2 configuration. If a node is shut down, it can mean absolute loss of the node (depending on how you set things up). This is well beyond the scope of this document, but is discussed in any number of AWS related forums and posts. Since I found out the hard way myself (oh, so you read the manual every time?!), I thought it worth at least mentioning.

Once you have both instances up, you can proceed to the install page.

3.8 Setting up in virtual machines

As we just mentioned, to set up Gluster using virtual machines, you will need at least two virtual machines with at least 1GB of RAM each. You may be able to test with less, but most users will find it too slow for their tastes. The particular virtualization product you use is a matter of choice. Platforms I have used to test on include Xen, VMware ESX and Workstation, VirtualBox, and KVM. For the purposes of this article, all steps assume KVM, but the concepts are expected to be simple to translate to other platforms as well. The article assumes you know the particulars of how to create a virtual machine and have installed a 64-bit Linux distribution already.

Create or clone two VMs, with the following setup on each (a command-line sketch using KVM tools follows these notes):

• 2 disks using the VirtIO driver, one for the base OS and one that we will use as a Gluster "brick". You can add more later to try testing some more advanced configurations, but for now let's keep it simple.

Note: If you have ample space available, consider allocating all the disk space at once.

• 2 NICs using the VirtIO driver. The second NIC is not strictly required, but can be used to demonstrate setting up a separate network for storage and management traffic.

Note: Attach each NIC to a separate network.

Other notes: If you clone the VM, make sure that Gluster has not already been installed. Gluster generates a UUID to "fingerprint" each system, so cloning a previously deployed system will result in errors later on.
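
A minimal sketch with qemu-img and virt-install, assuming KVM/libvirt and two existing libvirt networks named "default" and "storage" (names, paths, and sizes are illustrative; older virt-install versions use --ram instead of --memory):

# Create the OS disk and a separate brick disk, preallocating the space up front
$ qemu-img create -f qcow2 -o preallocation=full /var/lib/libvirt/images/node1-os.qcow2 20G
$ qemu-img create -f qcow2 -o preallocation=full /var/lib/libvirt/images/node1-brick1.qcow2 100G

# Define the VM with two VirtIO disks and two VirtIO NICs on separate networks
$ virt-install --name gluster-node1 --memory 1024 --vcpus 2 \
      --disk path=/var/lib/libvirt/images/node1-os.qcow2,bus=virtio \
      --disk path=/var/lib/libvirt/images/node1-brick1.qcow2,bus=virtio \
      --network network=default,model=virtio \
      --network network=storage,model=virtio \
      --cdrom /path/to/your-64bit-linux.iso

Repeat this (or clone the VM before installing Gluster) for the second machine.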

Once these are prepared, you are ready to move on to the install section.


CHAPTER 4

Administrator Guide


CHAPTER 5

Upgrade Guide


CHAPTER 6

Contributors Guide

6.1 Adding your blog

As a developer or user, you have blogged about Gluster and want to share the post with the Gluster community.

Ok, you can do that by editing this file. https://github.com/gluster/planet-gluster/blob/master/data/feeds.yml

Please find instructions mentioned in the file and send a pull request.

Once approved, all your Gluster-related posts will appear on planet.gluster.org.

6.2 Bug Lifecycle

This page describes the life of a bug report.

• When a bug is first reported, it is given the NEW status.

• Once a developer has started, or is planning to work on a bug, the status ASSIGNED is set. The "Assigned to" field should mention a specific developer.

• If an initial patch for a bug has been put into the Gerrit code review tool, the status POST should be set manually. The status POST should only be used when all patches for a specific bug have been posted for review.

• After a review of the patch, and passing any automated regression tests, the patch will get merged by one of the maintainers. When the patch has been merged into the git repository, a comment is added to the bug. Only when all needed patches have been merged should the assigned engineer change the status to MODIFIED.

• Once a package is available with a fix for the bug, the status should be moved to ON_QA.

– The Fixed in version field should get the name/release of the package that contains the fix. Packages for multiple distributions will usually become available within a few days after the make dist tarball was created.

– This tells the bug reporter that a package is available with a fix for the bug and that they should test the package.

– The release maintainer needs to make this change to the bug status; scripts are available (ask ndevos).

• The status VERIFIED is set when a QA tester or the reporter confirms, after the fix has been merged and a new build containing the fix is available, that the build resolves the issue.

• In case the version does not fix the reported bug, the status should be moved back to ASSIGNED with a clear note on what exactly failed.

• When a report has been solved it is given CLOSED status. This can mean:


– CLOSED/NEXTRELEASE when a backport of the code change that fixes the reported mainline problem has been merged.

– CLOSED/CURRENTRELEASE when the code change that fixes the reported bug has been merged in the corresponding release-branch and a version is released with the fix.

– CLOSED/WONTFIX when the reported problem or suggestion is valid, but any fix of the reported problem or implementation of the suggestion would be barred from approval by the project's Developers/Maintainers (or product managers, if any).

– CLOSED/WORKSFORME when the problem can not be reproduced, when missing information has not been provided, or when an acceptable workaround exists to achieve a similar outcome as requested.

– CLOSED/CANTFIX when the problem is not a bug, or when it is a change that is outside the power of GlusterFS development. For example, bugs proposing changes to third-party software can not be fixed in the GlusterFS project itself.

– CLOSED/DUPLICATE when the problem has been reported before, no matter if the previous report has been already resolved or not.

If a bug report was marked as CLOSED or VERIFIED and it turns out that this was incorrect, the bug can be changed to the status ASSIGNED or NEW.

6.3 Bug Reporting Guidelines

6.3.1 Before filing a bug

If you are seeing any issues, these preliminary checks are useful (a consolidated sketch of the first few checks appears at the end of this section):

• Is SELinux enabled? (you can use getenforce to check)

• Are iptables rules blocking any data traffic? (iptables -L can help check)

• Are all the nodes reachable from each other? [ Network problem ]

• Please search Bugzilla to see if the bug has already been reported

– Choose GlusterFS as the "product", and then type something relevant in the "words" box. If you are seeing a crash or abort, searching for part of the abort message might be effective. If you are feeling adventurous you can select the "Advanced search" tab; this gives a lot more control but isn't much better for finding existing bugs.

– If a bug has already been filed for a particular release and you found the bug in another release,

* please clone the existing bug for the release in which you found the issue.

* If the existing bug is against mainline and you found the issue in a release, then the cloned bug's 'depends on' field should be set to the BZ for the mainline bug.

Anyone can search in Bugzilla, you don't need an account. Searching requires some effort, but helps avoid duplicates, and you may find that your problem has already been solved.
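
A minimal sketch of the preliminary checks on each node (the hostname is a placeholder):

$ getenforce                      # SELinux mode: Enforcing, Permissive, or Disabled
$ iptables -L -n                  # look for rules that could block Gluster traffic
$ ping -c 3 server2.example.com   # verify the other nodes are reachable
$ gluster peer status             # confirm all peers are connected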

6.3.2 Reporting A Bug

• You should have a Bugzilla account

• Here is the link to file a bug: Bugzilla

• The template for filing a bug can be found *here*


Note: Please go through all the sections below to understand what information needs to go into a bug report. This will help the developer root-cause and fix the issue.

Required Information

You should gather the information below before creating the bug report.

Package Information

• Location from which the packages are used

• Package Info - version of glusterfs package installed

Cluster Information

• Number of nodes in the cluster

• Hostnames and IPs of the gluster Node [if it is not a security issue]

– Hostname / IP will help developers in understanding & correlating with the logs

• Output of gluster peer status

• Node IP, from which the “x” operation is done

– “x” here means any operation that causes the issue

Volume Information

• Number of volumes

• Volume Names

• Volume on which the particular issue is seen [ if applicable ]

• Type of volumes

• Volume options if available

• Output of gluster volume info

• Output of gluster volume status

• Get the statedump of the volume with the problem

$ gluster volume statedump <VOLNAME>

This dumps a statedump per brick process into /var/run/gluster.

NOTE: Collect the statedumps from one Gluster node into a directory. Repeat this on all nodes containing the bricks of the volume. All the collected directories can then be archived, compressed, and attached to the bug (see the sketch below).
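
A minimal collection sketch (the volume name and output paths are placeholders; statedump files land in /var/run/gluster as noted above):

$ mkdir -p /tmp/bug-data
$ gluster peer status               > /tmp/bug-data/peer-status.txt
$ gluster volume info <VOLNAME>     > /tmp/bug-data/volume-info.txt
$ gluster volume status <VOLNAME>   > /tmp/bug-data/volume-status.txt
$ gluster volume statedump <VOLNAME>
$ cp /var/run/gluster/*.dump.* /tmp/bug-data/
$ tar czf /tmp/bug-data-$(hostname -s).tar.gz -C /tmp bug-data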


Brick Information

• xfs options used when the brick partition was created

– This could be obtained with this command:

$ xfs_info /dev/mapper/vg1-brick

• Extended attributes on the bricks

– This could be obtained with this command:

$ getfattr -d -m. -ehex /rhs/brick1/b1

Client Information

• OS Type (Windows, RHEL)

• OS Version: in the case of a Linux distro, get the following:

$ uname -r

$ cat /etc/issue

• FUSE or NFS mount point on the client, with the output of the mount command

• Output of df -Th command

Tool Information

• If any tools are used for testing, provide info about them and their versions

• If any IO is simulated using a script, provide the script

Logs Information

• You can check the following logs for issues/warnings/errors:

– Self-heal logs

– Rebalance logs

– Glusterd logs

– Brick logs

– NFS logs (if applicable)

– Samba logs (if applicable)

– Client mount log

• Attach the entire logs to the bug if they are too large to paste as a comment


SOS report for CentOS/Fedora

• Get the sosreport from the involved Gluster node and client [in the case of CentOS/Fedora]

• Give the sosreport a meaningful name by adding the hostname/IP to the report's file name, for example:
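
A small sketch, assuming the sos tool writes its archive to /var/tmp (the exact path, file extension, and options can differ per distribution and sos version):

$ sosreport --batch
$ mv /var/tmp/sosreport-*.tar.xz /var/tmp/sosreport-$(hostname -s)-gluster-node.tar.xz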

6.4 Bug Triage Guidelines

• Triaging of bugs is an important task; when done correctly, it can reduce the time between reporting a bug and the availability of a fix enormously.

• Triagers should focus on new bugs, and try to define the problem in a way that is easily understandable and as accurate as possible. The goal of the triagers is to reduce the time that developers need to solve the bug report.

• A triager is like an assistant that helps with the information gathering and possibly the debugging of a new bug report. Because a triager helps prepare a bug before a developer gets involved, it can be a very nice role for new community members that are interested in technical aspects of the software.

• Triagers will stumble upon many different kinds of issues, ranging from reports about spelling mistakes or unclear log messages to memory leaks causing crashes or performance issues in environments with several hundred storage servers.

Nobody expects that triagers can prepare all bug reports. Therefore most developers will be able to assist the triagers, answer questions and suggest approaches to debug and data to gather. Over time, triagers get more experienced and will rely less on developers.

Bug triage can be summarised in the following points:

• Is there enough information in the bug description?

• Is it a duplicate bug?

• Is it assigned to correct component of GlusterFS?

• Are the Bugzilla fields correct?

• Is the bug summary correct?

• Assigning bugs or Adding people to the “CC” list

• Fix the Severity and Priority.

• What to do if the bug is present in multiple GlusterFS versions.

• Add appropriate Keywords to the bug.

A detailed discussion of the above points follows.

6.4.1 Weekly meeting about Bug Triaging

We try to meet every week in #gluster-meeting on Freenode. The meeting date and time for the next meeting is normally updated in the agenda.

6.4.2 Getting Started: Find reports to triage

There are many different techniques and approaches to find reports to triage. One easy way is to use these pre-defined Bugzilla reports (a report is completely structured in the URL and can be modified manually):

• New bugs that do not have the ‘Triaged’ keyword Bugzilla link


• New features that do not have the 'Triaged' keyword (identified by the FutureFeature keyword, probably of interest only to project leaders) Bugzilla link

• New glusterd bugs: Bugzilla link

• New Replication(afr) bugs: Bugzilla link

• New distribute (DHT) bugs: Bugzilla link

• New bugs against version 3.6: Bugzilla link

• Bugzilla tracker (can include already Triaged bugs)

• Untriaged NetBSD bugs

• Untriaged FreeBSD bugs

• Untriaged Mac OS bugs

In addition to manually checking Bugzilla for bugs to triage, it is also possible to receive emails when new bugs are filed or existing bugs get updated.

If at any point you feel like you do not know what to do with a certain report, please first ask on IRC or the mailing lists before changing something.

6.4.3 Is there enough information?

To make a report useful, the same rules apply as for bug reporting guidelines.

It's hard to generalize what makes a good report. For "average" reporters it is definitely often helpful to have good steps to reproduce, the GlusterFS software version, and information about the test/production environment (Linux/GNU distribution).

If the reporter is a developer, steps to reproduce can sometimes be omitted as the context is obvious. However, this can create a problem for contributors that need to find their way, hence it is strongly advised to list the steps to reproduce an issue.

Other tips:

• There should be only one issue per report. Try not to mix related or similar-looking bugs in one report.

• It should be possible to call the described problem fixed at some point. "Improve the documentation" or "It runs slow" could never be called fixed, while "Documentation should cover the topic Embedding" or "The page at http://en.wikipedia.org/wiki/Example should load in less than five seconds" would have a criterion. A good summary of the bug will also help others in finding existing bugs and prevent filing of duplicates.

• If the bug is a graphical problem, you may want to ask for a screenshot to attach to the bug report. Make sure to ask that the screenshot should not contain any confidential information.

6.4.4 Is it a duplicate?

Some reports in Bugzilla have already been reported before, so you can search for an already existing report. We do not recommend spending too much time on it; if a bug is filed twice, someone else will mark it as a duplicate later. If the bug is a duplicate, mark it as a duplicate in the resolution box below the comment field by setting the CLOSED DUPLICATE status, and briefly explain your action in a comment for the reporter. When marking a bug as a duplicate, it is required to reference the original bug.

If you think that you have found a duplicate but you are not totally sure, just add a comment like "This bug looks related to bug XXXXX" (and replace XXXXX by the bug number) so somebody else can take a look and help judging.


6.4.5 Is it assigned to correct component of GlusterFS?

Make sure the bug is assigned to the right component. Below is the list of GlusterFS components in Bugzilla.

• access control - Access control translator

• BDB - Berkeley DB backend storage

• booster - LD_PRELOAD’able access client

• build - Compiler, package management and platform specific warnings and errors

• cli - gluster command line

• core - Core features of the filesystem

• distribute - Distribute translator (previously DHT)

• errorgen - Error Gen Translator

• fuse - mount/fuse translator and patched fuse library

• georeplication - Gluster Geo-Replication

• glusterd - Management daemon

• HDFS - Hadoop application support over GlusterFS

• ib-verbs - Infiniband verbs transport

• io-cache - IO buffer caching translator

• io-threads - IO threads performance translator

• libglusterfsclient - API interface to access glusterfs volumes programmatically

• locks - POSIX and internal locks

• logging - Centralized logging, log messages, log rotation etc

• nfs - NFS component in GlusterFS

• nufa - Non-Uniform Filesystem Scheduler Translator

• object-storage - Object Storage

• porting - Porting GlusterFS to different operating systems and platforms

• posix - POSIX (API) based backend storage

• protocol - Client and Server protocol translators

• quick-read - Quick Read Translator

• quota - Volume & Directory quota translator

• rdma - RDMA transport

• read-ahead - Read ahead (file) performance translator

• replicate - Replication translator (previously AFR)

• rpc - RPC Layer

• scripts - Build scripts, mount scripts, etc.

• stat-prefetch - Stat prefetch translator

• stripe - Striping (RAID-0) cluster translator

• trace - Trace translator


• transport - Socket (IPv4, IPv6, unix, ib-sdp) and generic transport code

• unclassified - Unclassified - to be reclassified as other components

• unify - Unify translator and schedulers

• write-behind - Write behind performance translator

• libgfapi - APIs for GlusterFS

• tests - GlusterFS Test Framework

• gluster-hadoop - Hadoop support on GlusterFS

• gluster-hadoop-install - Automated Gluster volume configuration for Hadoop Environments

• gluster-smb - gluster smb

• puppet-gluster - A puppet module for GlusterFS

Tips for searching:

• As it is often hard for reporters to find the right place (product and component) where to file a report, also search for duplicates outside the product and component of the bug report you are triaging.

• Use common words and try several times with different combinations, as there could be several ways to describe the same problem. If you choose proper and common words, and you try several times with different combinations of those, you improve the chances of finding matching reports.

• Drop the ending of a verb (e.g. search for "delet" so you get reports for both "delete" and "deleting"), and also try similar words (e.g. search both for "delet" and "remov").

• Search using the date range delimiter: most of the bug reports are recent, so you can try to increase the search speed using date delimiters by going to "Search by Change History" on the search page. Example: search from "2011-01-01" or "-730d" (to cover the last two years) to "Now".

6.4.6 Are the fields correct?

Summary

Sometimes the summary does not summarize the bug itself well. You may want to update the bug summary to make the report distinguishable. A good title may contain:

• A brief explanation of the root cause (if it was found)

• Some of the symptoms people are experiencing

Adding people to the “CC” or changing the “Assigned to” field

Normally, developers and potential assignees of an area are already CC'ed by default, but sometimes reports describe general issues or are filed against common Bugzilla products. Only if you know developers who work in the area covered by the bug report, and if you know that these developers accept getting CC'ed or assigned to certain reports, can you add that person to the CC field or even assign the bug report to her/him.

To get an idea of who works in which area and to find component owners, you can check the "MAINTAINERS" file in the root of the glusterfs code directory, or query changes in Gerrit (see Simplified dev workflow).

Severity And Priority

Please see below for information on the available values and their meanings.


Severity

This field is a pull-down of the external weighting of the bug report’s importance and can have the following values:

• urgent - catastrophic issues which severely impact the mission-critical operations of an organization. This may mean that the operational servers, development systems or customer applications are down or not functioning and no procedural workaround exists.

• high - high-impact issues in which the customer's operation is disrupted, but there is some capacity to produce.

• medium - partial non-critical functionality loss, or issues which impair some operations but allow the customer to perform their critical tasks. This may be a minor issue with limited loss or no loss of functionality and limited impact to the customer's functionality.

• low - general usage questions, recommendations for product enhancement, or development work.

• unspecified - importance not specified.

Priority

This field is a pull-down of the internal weighting of the bug report’s importance and can have the following values:

• urgent - extremely important

• high - very important

• medium - average importance

• low - not very important

• unspecified - importance not specified

Bugs present in multiple Versions

During triaging you might come across a particular bug which is present across multiple versions of GlusterFS. Here is the course of action:

• We should have separate bugs for each release (We should clone bugs if required)

• Bugs in released versions should depend on the bug for mainline (master branch) if the bug is applicable to mainline.

– This will make sure that the fix gets merged in the master branch first, and the fix can then be ported to other stable releases.

Note: When a bug depends on other bugs, that means the bug cannot be fixed unless the other bugs are fixed ('depends on'), or this bug stops other bugs from being fixed ('blocks').

Here are some examples:

• A bug is raised for GlusterFS 3.5 and the same issue is present in mainline (master branch) and GlusterFS 3.6

– Clone the original bug for mainline.

– Clone another for 3.6.

– And have the GlusterFS 3.6 bug and GlusterFS 3.5 bug ‘depend on’ the ‘mainline’ bug

• A bug is already present for mainline, and the same issue is seen in GlusterFS 3.5.

– Clone the original bug for GlusterFS 3.5.


– And have the cloned bug (for 3.5) ‘depend on’ the ‘mainline’ bug.

Keywords

Many predefined searches for Bugzilla include keywords. One example is the set of searches used for triaging. If the bug is 'NEW' and 'Triaged' is not set, you (as a triager) can pick it and use this page to triage it. When the bug is 'NEW' and 'Triaged' is in the list of keywords, the bug is ready to be picked up by a developer.

Triaged Once you are done with triage, add the Triaged keyword to the bug, so that others will know the triaged state of the bug. The predefined search at the top of this page will then not list the Triaged bug anymore. Instead, the bug should have moved to this list.

EasyFix By adding the EasyFix keyword, the bug gets added to the list of bugs that should be simple to fix. Adding this keyword is encouraged for simple and well-defined bugs or feature enhancements.

Patch When a patch for the problem has been attached or included inline, add the Patch keyword so that it is clear that some preparation for the development has been done already. Of course, it would have been nicer if the patch was sent to Gerrit for review, but not everyone is ready to pass the Gerrit hurdle when they report a bug.

You can also add the Patch keyword when a bug has been fixed in mainline and the patch(es) have been identified. Add a link to the Gerrit change(s) so that backporting to a stable release is made simpler.

Documentation Add the Documentation keyword when a bug has been reported for the documentation. This helps editors and writers in finding the bugs that they can resolve.

Tracking This keyword is used for bugs which are used to track other bugs for a particular release. For example, the 3.6 tracker bug.

FutureFeature This keyword is used for bugs which request a feature enhancement (RFE - Requested Feature Enhancement) for future releases of GlusterFS. If you open a bug requesting a feature which you would like to see in future versions of GlusterFS, please report it with this keyword.

6.4.7 Add yourself to the CC list

By adding yourself to the CC list of bug reports that you change, you will receive follow-up emails with all comments and changes by anybody on that individual report. This helps you learn what further investigations others make. You can change the settings in Bugzilla for which actions you want to receive mail.

6.4.8 Bugs For Group Triage

If you come across a bug (or bugs) that you think should go through the bug triage group, please set NEEDINFO for [email protected] on the bug.

6.4.9 Resolving bug reports

See the Bug report life cycle for the meaning of the bug status and resolutions.

6.4.10 Example of Triaged Bugs

This Bugzilla filter will list NEW, Triaged Bugs


CHAPTER 7

Changelog


CHAPTER 8

Presentations

This is a collection of Gluster presentations from all over the world. We have a SlideShare account where most of these presentations are stored.

GlusterFS Meetup Bangalore @ Bangalore, India (June 4, 2016)

• GlusterFS Containers - Mohamed Ashiq Liyazudeen, Humble Devassy Chirammal

• gdeploy 2.0 - Sachidananda Urs, Nandaja Varma

OpenStack Days Istanbul @ Istanbul, Turkey (May 31, 2016)

• Building Clouds That Scale-Out with GlusterFS - Mustafa Resul CETINEL

NLUUG Voorjaarsconferentie @ Bunnik, The Netherlands (May 26, 2016)

• Replication Techniques in Gluster (.odp) (.pdf) - Niels de Vos

Vault @ Raleigh, NC, US (April 20-21, 2016)

• GlusterD 2.0 (presentation) - Atin Mukherjee

Incontro DevOps Italia 2016 @ Bologna, Italy (April 1, 2016)

• Gluster roadmap, recent improvements and upcoming features - (presentation) (video) - Niels de Vos

LinuxConfAU 2016 @ Geelong, Australia (February 3, 2016)

• GlusterD thread synchronization using Userspace Read Copy Update (URCU) - Atin Mukherjee

DevConf.CZ 2016 @ Brno, Czech Republic (February 5, 2016)

• Ceph, Gluster, Swift : Similarities and differences - Prashanth Pai, Thiago da Silva

FOSDEM 2016 @ Brussels, Belgium (January 30, 2016)

• Gluster roadmap: Recent improvements and upcoming features - Niels de Vos


T-DOSE 2015 @ Eindhoven, The Netherlands (November 28, 2015)

• Introduction into Scale-out Storage with Gluster - Niels de Vos

Usenix LISA 2015 @ Washington DC, USA (Nov 8, 2015)

• GlusterFS Tutorial - Architecture - Rajesh Joseph & Poornima Gurusiddaiah

• GlusterFS Tutorial - Hands-on - Rajesh Joseph & Poornima Gurusiddaiah

Open Source Backup Conference @ Cologne, Germany (September 30, 2015)

• Scale-Out backups with Bareos and Gluster - Niels de Vos

2015 Storage Developer Conference

• Achieving Coherent and Aggressive Client Caching in Gluster, a Distributed System - Poornima Gurusiddaiah, Soumya Koduri

• Introduction to Highly Available NFS Server on Scale-Out Storage Systems Based on GlusterFS - Soumya Koduri, Meghana Madhusudhan

Gluster Summit 2015 @ Barcelona, Spain

• Bug Triage in Gluster - Niels de Vos

• Responsibilities of Gluster Maintainers - Niels de Vos

• Leases and caching - Poornima Gurusiddaiah & Soumya Koduri

• Cache tiering in GlusterFS and future directions - Dan Lambright

• Yet Another Deduplication Library (YADL) - Dan Lambright

Gluster Conference @ NMAMIT, Nitte (April 11, 2015)

• Introduction to Open Source - Niels de Vos, Red Hat

• Software Defined Storage - Dan Lambright, Red Hat

• Introduction to GlusterFS - Kaleb S. KEITHLEY, Red Hat

• Data Deduplication - Joseph Fernandes, Red Hat

• Quality of Service - Karthik US & Sukumar Poojary, 4th SEM, MCA, NMAMIT, Nitte

Ceph & Gluster FS - Software Defined Storage Meetup (January 22, 2015)

• GlusterFS - Current Features & Roadmap - Niels de Vos, Red Hat

Open source storage for big data: Fifth Elephant event (June 21, 2014)

• GlusterFS_Hadoop_fifth-elephant.odp - Lalatendu Mohanty, Red Hat


Red Hat Summit 2014, San Francisco, California, USA (April 14-17, 2014)

• Red Hat Storage Server Administration Deep Dive - Dustin Black, Red Hat

– GlusterFS Stack Diagram

Gluster Community Night, Amsterdam, The Netherlands (March 4th, 2014)

• GlusterFS for System Administrators - Niels de Vos, Red Hat

Gluster Community Day, London, United Kingdom (October 29th, 2013)

• Developing_Apps_and_Integrating_with_GlusterFS_-_Libgfapi.odp - Justin Clift, Red Hat

Gluster Community Day / LinuxCon Europe 2013, Edinburgh, United Kingdom (October 22nd, 2013)

• GlusterFS Architecture & Roadmap - Vijay Bellur

• Integrating GlusterFS, QEMU and oVirt - Vijay Bellur

Gluster Community Day, Stockholm, Sweden (September 4th, 2013)

• Gluster related development - Niels de Vos, Red Hat

LOADays, Belgium (April 6th, 2013)

• GlusterFS for Sysadmins - Justin Clift.

CIALUG Des Moines, IA (March 21st, 2013)

• Converged infrastructure with oVirt, KVM, and Gluster - Theron Conrey, Red Hat

Gluster Community Summit, Bangalore (March 7 & 8, 2013)

• SMB-GlusterDevMar2013: (Video recording) - Chris Hertel, Red Hat

• kkeithley-UFONFS-GlusterSummit - Kaleb Keithley, Red Hat

• HDFS + GlusterFS integration - Jay Vyas Video

• Join_the_SuperColony_-_Feb2013.odp - JMW, Red Hat

• GlusterFS API Introduction - Jeff Darcy, Red Hat

Gluster Community Workshop at CERN in Geneva (February 26, 2013)

• Debugging GlusterFS with Wireshark (additional files), Niels de Vos

Gluster Community Workshop at LinuxCon Europe (November 8, 2012)


• Gluster for Sysadmins - Dustin Black, Red Hat

• On-demand_File_Caching_-_Gustavo_Brand - On-demand File Caching, Gustavo Brand, Scalus Project

• Gluster_Wireshark_Niels_de_Vos - Gluster and Wireshark Integration, Niels de Vos, Red Hat

• Accessing_Gluster_UFO_-_Eco_Willson Unified File and Object with GlusterFS, Eco Willson, Red Hat

• Disperse_Xlator_Ramon_Datalab.pdf - Disperse Translator, Ramon , Datalab

• State_of_the_Gluster_-_LCEU.pdf State of the Gluster Community, John Mark Walker, Red Hat

• QEMU_GlusterFS - QEMU integration with GlusterFS, Bharata Rao, IBM Linux Technology Center

Software Developers’ Conference (SNIA) (September 17, 2012)

• Challenges and Futures - Jeff Darcy, Red Hat

Gluster Workshop at LinuxCon North America (August 28, 2012)

• Translator tutorial - Jeff Darcy, Red Hat

• Translator example - Jeff Darcy, Red Hat
