11/7/13 Lightning fast synch with csync2 and lsyncd | Axivo Community

https://www.axivo.com/community/threads/lightning-fast-synch-with-csync2-and-lsyncd.121/ 1/11

Lightning fast synch with csync2 and lsyncd

Introduction

Credits: Axel Kittenberger, Gordan Bobic

Support: Ask your questions related to this tutorial in the Performance Drive forum.

If you find this tutorial useful, please post a link to it on your site.

We recently had a client who hired us to build his web site cluster from scratch. Among the many performance problems they were struggling with on their vBulletin-based site, one was file synchronization between the web nodes. The site does not contain any physical attachments, and all static files were manually uploaded by the owner to all 8 web servers. They had tried in the past to synchronize the data with the most popular solutions used by web hosts, but they encountered performance issues, and a SAN-based solution was not in their budget. As a result, they were forced to store the avatars in the MySQL database, creating unnecessary stress at the database level and affecting overall site performance.

So I had to come up with a solution that does not eat server resources like an elephant, while providing instant synchronization across an entire cluster. After some research, I quickly discarded similar solutions like drbd, rsync and unison, based on a variety of tests I performed. Their performance was not at the level I wanted to provide. The only client that performed flawlessly was csync2.

I selected csync2 as the prime synchronization candidate for the following reasons:

1. ease of configuration and a large variety of automated tasks
2. unlimited number of nodes to be synchronized
3. great flexibility in how the synchronization is performed
4. level of security used for external nodes
5. data storage of currently synced files on each node, for fast file comparison


Now that the synchronization part was covered, I had to find a solution that allows csync2 to trigger a sync event once a new file is present on any cluster node. Most Linux-based operating systems ship with inotify in their kernel. Basically, inotify is a kernel subsystem that watches the filesystem for changes and reports those changes to an application, which in our case will trigger csync2.

With that established, I had to find a robust C-based application that properly handles the inotify events and can also be daemonized, so I can run it as a Linux service. Again, I quickly discarded all Python or Perl based scripts, for the same reasons I mentioned earlier: speed and performance.

The main candidates I selected were inotify-tools and lsyncd. Both are written in C and have options to run as a daemon. Based on preliminary tests, I discovered inotify-tools would not satisfy my selection criteria, as its daemon features contained bugs. I even tried to daemonize inotifywait myself with a shell wrapper, but the performance was just not there. The server load generated was too high, so I was forced to discard it, and I notified Radu Voicila (the inotify-tools developer) about the issue.

I selected lsyncd as the prime inotify candidate for the following reasons:

1. runs as a daemon, so it can be started as a Linux service
2. does not hamper local filesystem performance like other similar solutions
3. aggregates and combines events in batches, avoiding unnecessary server load
4. uses Lua as its configuration scripting language

Once the solution logic was finished, it was time to turn everything into reality. First, I created all the rpm's needed to install csync2 and lsyncd on any cluster in a matter of seconds. CentOS 5.6 lacked several important packages, so I had to build everything from scratch, except for xinetd, which was available in the base repository. All packages are currently available in the Axivo repository. If you are a host provider, please contact us if you want to help us host our packages on your servers.

The dependencies I wrote/used for the csync2.x86_64 1.34 rpm are:

xinetd.x86_64 2:2.3.14+ (CentOS repo)
gnutls.x86_64 1.4.1+ (Axivo repo)
librsync.x86_64 0.9.7+ (Axivo repo)
libtasn1.x86_64 2.9+ (Axivo repo)
sqlite2.x86_64 2.8.17+ (Axivo repo)

Floren, May 16, 2011 #1

The dependencies I wrote/used for the lsyncd.x86_64 2.04 rpm are:

rsync.x86_64 2.6.8+ (CentOS repo)
lua.x86_64 5.1.4+ (Axivo repo)

Note: lsyncd was designed mainly for rsync, so I had to add rsync as a requirement in the Axivo rpm.

Next, it was time to install all packages and test how well they perform. For that, I set up a small 3-node cluster on an internal network and installed all of the rpm's listed above on each node:

# yum --enablerepo=axivo install csync2 lsyncd

I will explain the logic in detail, because it is important to understand how the actual setup works. For simplicity, I will call the 3 nodes apollo, chronos and hermes, the names of the test nodes I used in the cluster.

See below the detailed setup steps I used for csync2 and lsyncd.

Important: Please be aware that I refer to several file locations in the next steps. These locations are specific to the Axivo rpm builds and will probably not match the locations used by a source build. If you install the Axivo rpm's, you will not encounter any setup difficulties.

Csync2 Setup

Csync2 is very straightforward, but it required some precise steps once I had installed all Axivo rpm's. The csync2 rpm automates several tasks that you would have to perform manually if you installed the software from source. For example, it adds the default csync2 TCP port to the /etc/services file, creates the SSL certificates (in case you connect to an external node) and manages the sqlite database needed to store the sync information for each node.

To avoid frustration, it is important that you study the Csync2 documentation and understand how its process works before you proceed with the tutorial.

1) Generate the csync2.key on the apollo node. To do that, I simply issued the command:

# csync2 -k /etc/csync2/csync2.key

Generating the key properly requires some additional entropy. To provide it, I simply opened a new terminal window and ran the usual commands like:

# du -h /

If that is not enough, run top and wait until enough random data is created. You will know the process is complete once the shell prompt returns in the window where you started the csync2 key generation.
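If you would rather check the entropy pool than guess, Linux exposes its current size in /proc/sys/kernel/random/entropy_avail. The sketch below is an illustration only: the 256-bit threshold and the entropy_ok helper are assumptions made for this example, not values taken from csync2.

```shell
#!/bin/bash
# Sketch: report whether the kernel entropy pool looks large enough.
# ENTROPY_FILE is the standard Linux procfs location; MIN_BITS is an
# arbitrary illustrative threshold (an assumption, not a csync2 setting).
ENTROPY_FILE="${ENTROPY_FILE:-/proc/sys/kernel/random/entropy_avail}"
MIN_BITS="${MIN_BITS:-256}"

# entropy_ok FILE MIN
# Succeeds when the number stored in FILE is at least MIN.
entropy_ok() {
    local file="$1" min="$2" bits
    bits=$(cat "$file") || return 2
    [ "$bits" -ge "$min" ]
}

if entropy_ok "$ENTROPY_FILE" "$MIN_BITS"; then
    echo "entropy looks sufficient"
else
    echo "entropy is low; generate some disk or keyboard activity"
fi
```

Running it in the second terminal while csync2 -k is waiting shows whether more activity is still needed.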

2) Edit the default /etc/csync2/csync2.cfg configuration file and insert the following content:

Code:

nossl * *;

group web
{
    host apollo;
    host chronos;
    host hermes;
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}

I used this file for the initial synchronization step on all nodes, explained in the final setup steps.

Personally, I did not want the encryption overhead on an internal network, so I disabled SSL with the nossl directive you noticed in the csync2 configuration file above.

3) Create a custom configuration file called csync2_apollo.cfg for the apollo node. This part is the key to the success of our solution; I will explain the logic in detail at the end of this article.

Code:

nossl * *;

group apollo
{
    host apollo;
    host (chronos);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
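Each chained file lists the local node first and the next node of the ring in parentheses, which csync2 treats as a slave host that only receives updates. A minimal sketch of generating such a file follows, assuming the ring order apollo, chronos, hermes used in this tutorial; next_node and render_chained_cfg are hypothetical helper names, not part of csync2.

```shell
#!/bin/bash
# Sketch: print a chained csync2 configuration for one node of a ring.
# The node order mirrors this tutorial; adjust NODES for another cluster.
NODES=(apollo chronos hermes)

# Print the node that follows $1 in the ring (the last node wraps around).
next_node() {
    local i count=${#NODES[@]}
    for ((i = 0; i < count; i++)); do
        if [ "${NODES[i]}" = "$1" ]; then
            echo "${NODES[(i + 1) % count]}"
            return 0
        fi
    done
    return 1
}

# Print the chained configuration for node $1 on standard output.
render_chained_cfg() {
    local node="$1" peer
    peer=$(next_node "$node") || return 1
    cat <<EOF
nossl * *;

group ${node}
{
    host ${node};
    host (${peer});
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
EOF
}

render_chained_cfg "${1:-apollo}"
```

Redirecting the output, for example render_chained_cfg chronos > /etc/csync2/csync2_chronos.cfg, would produce the per-node file by script instead of by hand.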

4) On each additional node, copy all files present in the apollo:/etc/csync2 directory. To be safe, I ran the following commands on chronos and hermes:

# rm -f /etc/csync2/*
# scp root@apollo:/etc/csync2/* /etc/csync2/

I did that because I wanted the SSL certificates and csync2.key to be identical on all nodes. That is a condition required by csync2, in case you decide to use an SSL connection while syncing the files.

Once the file transfer completed on each node, rename the csync2 configuration files to reflect the name of each node. On the chronos node, the configuration file was renamed csync2_chronos.cfg and had the content:

Code:

nossl * *;

group chronos
{
    host chronos;
    host (hermes);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}

while on the hermes node, the configuration file was renamed csync2_hermes.cfg and had the content:

Code:

nossl * *;

group hermes
{
    host hermes;
    host (apollo);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}

As detailed in the Csync2 documentation, nodes can communicate with each other only if the chained configuration files are present on each node involved. For example, the csync2_apollo.cfg file should be present on both the apollo and chronos nodes.

You will probably say:

"Floren, what is this nonsense, simply use the global csync2 configuration file across the entire network!"

"Patience, Obiwan. I will explain the logic in detail at the end of this article."

5) On each node, edit the xinetd csync2 service file:

Code:

service csync2
{
    flags           = REUSE
    socket_type     = stream
    wait            = no
    user            = root
    group           = root
    server          = /usr/sbin/csync2
    server_args     = -i
    port            = 30865
    type            = UNLISTED
    #log_on_failure += USERID
    disable         = no
    #only_from      = 192.168.199.3 192.168.199.4
}

Basically, all you have to do is change the disable setting from yes to no. You could also run the command:

# chkconfig csync2 on

but I wanted you to become familiar with the options inside the service file, in case you want to customize some settings in the future.

Lsyncd Setup

On each node, edit the /etc/lsyncd/lsyncd.conf file and insert the following Lua script:

Code:

-----
-- User configuration file for lsyncd.
--
-- This example synchronizes one specific directory through multiple nodes,
-- by combining csync2 and lsyncd as monitoring tools.
-- It avoids any race conditions generated by lsyncd, while detecting and
-- processing multiple inotify events in batch, on each node monitored by
-- csync2 daemon.
--
-- @author Floren Munteanu
-- @link http://www.axivo.com/
-----
settings = {
    logident = "lsyncd",
    logfacility = "user",
    logfile = "/var/log/lsyncd/lsyncd.log",
    statusFile = "/var/log/lsyncd/status.log",
    statusInterval = 1
}

initSync = {
    delay = 1,
    maxProcesses = 1,
    action = function(inlet)
        local config = inlet.getConfig()
        local elist = inlet.getEvents(function(event)
            return event.etype ~= "Blanket"
        end)
        local directory = string.sub(config.source, 1, -2)
        local paths = elist.getPaths(function(etype, path)
            return "\t" .. config.syncid .. ":" .. directory .. path
        end)
        log("Normal", "Processing syncing list:\n", table.concat(paths, "\n"))
        spawn(elist, "/usr/sbin/csync2", "-C", config.syncid, "-x")
    end,
    collect = function(agent, exitcode)
        local config = agent.config
        if not agent.isList and agent.etype == "Blanket" then
            if exitcode == 0 then
                log("Normal", "Startup of '", config.syncid, "' instance finished.")
            elseif config.exitcodes and config.exitcodes[exitcode] == "again" then
                log("Normal", "Retrying startup of '", config.syncid, "' instance.")
                return "again"
            else
                log("Error", "Failure on startup of '", config.syncid, "' instance.")
                terminate(-1)
            end
            return
        end
        local rc = config.exitcodes and config.exitcodes[exitcode]
        if rc == "die" then
            return rc
        end
        if agent.isList then
            if rc == "again" then
                log("Normal", "Retrying events list on exitcode = ", exitcode)
            else
                log("Normal", "Finished events list = ", exitcode)
            end
        else
            if rc == "again" then
                log("Normal", "Retrying ", agent.etype, " on ", agent.sourcePath, " = ", exitcode)
            else
                log("Normal", "Finished ", agent.etype, " on ", agent.sourcePath, " = ", exitcode)
            end
        end
        return rc
    end,
    init = function(inlet)
        local config = inlet.getConfig()

Rename the source key and value to match your setup. You can specify several sources, using this format:

Code:

local sources = {
    ["/var/www/html/images"] = "apollo",
    ["/var/www/html/customavatars"] = "apollo"
}

The sources array allows you to create specific csync2 configurations for each directory you monitor. The key defines a specific directory to be synced, while the value is the csync2 configuration file id:

/etc/csync2/csync2_apollo.cfg

Obviously, you can specify whatever values you want. You can even define a specific syncid per directory, if your configuration needs specific actions processed only for a given source key.
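The relationship between a source directory, its syncid and the csync2 configuration file can also be sketched in shell. This is only an illustration, assuming bash 4 or later for associative arrays and the Axivo file layout described above; config_file_for and sync_command_for are hypothetical helper names, and the csync2 arguments are the ones the Lua script passes to spawn.

```shell
#!/bin/bash
# Sketch: map each monitored directory to a csync2 configuration id,
# mirroring the Lua "sources" table (requires bash 4+ for declare -A).
declare -A SOURCES=(
    ["/var/www/html/images"]="apollo"
    ["/var/www/html/customavatars"]="apollo"
)

# Print the configuration file that belongs to configuration id $1.
config_file_for() {
    echo "/etc/csync2/csync2_$1.cfg"
}

# Print the csync2 command line used for directory $1;
# fail when the directory is not a configured source.
sync_command_for() {
    local id="${SOURCES[$1]}"
    [ -n "$id" ] || return 1
    echo "/usr/sbin/csync2 -C $id -x"
}

# List every source with its configuration file and command.
for dir in "${!SOURCES[@]}"; do
    echo "$dir -> $(config_file_for "${SOURCES[$dir]}") : $(sync_command_for "$dir")"
done
```

Giving one directory a different id, for example "images", would make that directory use /etc/csync2/csync2_images.cfg, which is how a directory gets its own csync2 actions.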

Final Setup Steps

So, the hardcore stuff is done.

All we have left to do is start the daemons on each node and let the sync beast unleash its power throughout the entire network.

1) We need to prepare the csync2 sqlite database on each node. On apollo, run the csync2 initialization command. It will detect all new files, mark them as dirty and then sync them to the rest of the nodes:

# csync2 -xv

Depending on the size of your /var/www/html directory, csync2 will take from seconds to minutes to replicate the entire list of files. That includes not only the file content, but also its permissions. Don't worry, subsequent syncs will be very fast, as csync2 stores each file's information in its database and syncs only the files that changed, resulting in dramatic performance gains over a large cluster.

Once the sync completed on apollo, I ran the same command on every other node, in our case chronos and hermes. That resulted in a global sync, allowing lsyncd to start with fresh, perfectly synchronized data.

2) Start the daemons on each node:

# chkconfig xinetd on
# service xinetd start
# chkconfig lsyncd on
# service lsyncd start

You can check that everything is OK by consulting the /var/log/lsyncd/lsyncd.log file.

Final Test Results

I created a very simple shell script that generates 100 .html files in a test directory under /var/www/html. The goal was to see how fast the files are replicated through the entire network and how many resources are consumed:

Code:

#!/bin/bash

for ((i=0; i<100; i++)); do
    touch /var/www/html/test/${i}.html
done

I started top on all 3 nodes and opened another terminal window, where I manually created the test directory and then executed the shell script. Once the script ran, the new directory and the set of 100 html files were replicated in less than 1 second, generating a server load of only 0.08 on each node:

Sun May 15 10:57:16 2011 Normal: Recursive startup sync: apollo:/var/www/html/
Sun May 15 10:57:19 2011 Normal: Startup of 'apollo' instance finished.
Sun May 15 10:59:21 2011 Normal: Processing syncing list:
apollo:/var/www/html/test/
While syncing file /var/www/html/test:
ERROR from peer chronos: File is also marked dirty here!
Auto-resolving conflict: Won 'master/slave' test.
Sun May 15 10:59:22 2011 Normal: Finished events list = 0
Sun May 15 10:59:38 2011 Normal: Processing syncing list:
apollo:/var/www/html/test/0.html

Still, that was not enough for my taste. I really wanted to push the system to its limits and see how the nodes react under heavy sync processing. So I created a more advanced script that generates a batch of 1000 html files, repeated in a loop 10 times at random time intervals between 1.01 and 1.60 seconds:

Code:

#!/bin/bash

for ((i=0; i<10; i++)); do
    # random pause between 1.01 and 1.60 seconds, expressed in microseconds
    number=$((1010000 + $(od -An -N4 -tu4 /dev/urandom) % 590001))
    for ((j=0; j<1000; j++)); do
        touch /var/www/html/test/${i}-${j}.html
    done
    usleep ${number}
done

I chose a random time interval higher than 1 second because I wanted to avoid the lsyncd batch processing implemented with the new Lua script.

From a performance point of view, the only difference with the new script was the increase in server load. In the previous test, I experienced a 0.08 load. With the new script, the load went up to nearly 0.50 on all nodes, rapidly dropping once the shell script finished its execution. Yet the sync speed was unaffected; files were instantly synced over the entire network.

Dropping the entire /var/www/html/test directory was a little harsher on the load. csync2 had to remove 10,000 files from its database, which pushed the load to nearly 1 (0.97, to be precise). Still, the files were instantly deleted from all nodes.
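To reproduce the timing on a receiving node, a small polling helper can report how long it takes for the expected number of files to appear. This is a minimal sketch and not the method used for the figures above; now_ms and wait_for_files are hypothetical helpers, and the millisecond clock assumes GNU date with its %N field.

```shell
#!/bin/bash
# Current time in milliseconds (relies on GNU date's %N nanoseconds field).
now_ms() {
    echo $(( $(date +%s%N) / 1000000 ))
}

# wait_for_files DIR COUNT TIMEOUT_MS
# Poll DIR until it holds at least COUNT .html files or TIMEOUT_MS elapses.
wait_for_files() {
    local dir="$1" expected="$2" timeout_ms="$3"
    local start found elapsed
    start=$(now_ms)
    while :; do
        # Count only regular .html files directly inside DIR.
        found=$(find "$dir" -maxdepth 1 -type f -name '*.html' 2>/dev/null | wc -l)
        elapsed=$(( $(now_ms) - start ))
        if [ "$found" -ge "$expected" ]; then
            echo "found $found files after ${elapsed} ms"
            return 0
        fi
        if [ "$elapsed" -ge "$timeout_ms" ]; then
            echo "timed out with $found of $expected files" >&2
            return 1
        fi
        sleep 0.05
    done
}
```

For example, running wait_for_files /var/www/html/test 100 5000 on chronos just before starting the first script on apollo gives a rough replication time, limited by the 50 ms polling interval.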

As you can see for yourself, the final results are simply amazing.

However, things were not going so well when I started developing this project... see below.

The Dark Side

Originally, I started with two basic csync2 and lsyncd configurations, as explained in the manual. This particular setup created insane race conditions among nodes, generating a huge load and repeated delays in the global sync. It was not looking good at all. So I started analyzing how both csync2 and lsyncd work under the hood, by testing each product in different circumstances.

There was a lot of back-and-forth discussion with Axel Kittenberger (the lsyncd developer) and Gordan Bobic on the mailing lists (where my weekend was spent, not funny I know). Basically, the race conditions were caused by:

1. cyclic csync2 executions, based on "false" lsyncd alarms
2. lsyncd events processed on a file-by-file basis

While testing, I noticed initial race conditions among nodes, generated by csync2. In particular, they appeared when the update completed from hermes to apollo. At that time, I was using the global csync2 configuration (group web). So Gordan came up with the idea of using a "chained" configuration setup instead, forcing csync2 to execute only on specific nodes and in a specific order. I created the proper configuration files and started the tests again. The results were not promising at all: the races were still there, though at a lower intensity.

Then I noticed that, by default, lsyncd handles events on a file-by-file basis when the stock configuration files are used. In other words, when lsyncd finds that file /foo/bar has changed, it will initiate a csync2 sync of that specific file, rather than of everything. That was very bad. As a solution, I had to write a custom Lua-based configuration script and force lsyncd to process the event calls in batch, instead of the default per-file event. I am so glad that Axel decided to use Lua as the scripting base for configurations; you can do extraordinary things with it.
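The difference between the two modes is easy to see in a small shell sketch. It is only an illustration of the idea, not lsyncd code: run_sync is a hypothetical stand-in for the csync2 call and merely counts how often a sync would start.

```shell
#!/bin/bash
# Sketch: contrast per-file syncing with one batched sync per event window.
# run_sync stands in for "/usr/sbin/csync2 -C <id> -x"; here it only counts calls.
SYNC_CALLS=0
run_sync() {
    SYNC_CALLS=$((SYNC_CALLS + 1))
}

# Per-file handling: one sync run for every event path.
sync_per_file() {
    local path
    for path in "$@"; do
        run_sync "$path"
    done
}

# Batched handling: list the unique paths, then run a single sync for all of them.
sync_batched() {
    [ "$#" -gt 0 ] || return 0
    printf '%s\n' "$@" | sort -u
    run_sync
}

# Sample events collected within one delay window (hypothetical paths).
events=(/foo/bar /foo/baz /foo/bar /foo/qux)

SYNC_CALLS=0
sync_per_file "${events[@]}"
echo "per-file sync runs: $SYNC_CALLS"

SYNC_CALLS=0
sync_batched "${events[@]}" > /dev/null
echo "batched sync runs: $SYNC_CALLS"
```

With the four sample events, the per-file variant starts four sync runs while the batched variant starts one, which is the effect the custom Lua script aims for.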

As a bonus for my wasted weekend, I found out today that my script will be featured in the next lsyncd package, as a proper example of how to bind csync2 and lsyncd for an advanced cluster sync solution. Thank you Axel!

The result is an efficient sync process, constantly monitored by Linux services, that generates no race conditions and updates an entire cluster in an instant. Personally, I'm very satisfied with the results and I'm sure my client will enjoy the ease and improved manageability of files.


Floren, May 16, 2011 #4
