11/7/13 Lightning fast synch with csync2 and lsyncd | Axivo Community
https://www.axivo.com/community/threads/lightning-fast-synch-with-csync2-and-lsyncd.121/ 1/11
Lightning fast synch with csync2 and lsyncd
Introduction
Credits: Axel Kittenberger, Gordan Bobic
Support: Ask your questions related to this tutorial in the Performance Drive
forum.
If you find this tutorial useful, please post a link to it on your site.
We recently had a client who hired us to build their web site cluster from scratch.
Among the many performance problems they were struggling with on their
vBulletin based site, one was file synchronization between the web nodes.
The site does not contain any physical attachments, and all the static files
were manually uploaded by the owner to all 8 web servers. In the past, they
tried to synchronize the data with the most popular solutions used by web
hosts, but they encountered performance issues, and a SAN based solution was
not in their budget. As a result, they were forced to store the avatars in the
MySQL database, creating unnecessary stress at the database level and affecting
the overall site performance.
So I had to come up with a solution that does not eat server resources like an
elephant, while providing instant synchronization across an entire cluster.
After some research, I quickly discarded similar solutions like drbd, rsync
and unison, based on a variety of tests I performed: their performance was not
at the level I wanted to provide. The only client that performed flawlessly was
csync2.
I selected csync2 as the prime synchronization candidate for the following reasons:
1. ease of configuration and a large variety of automated tasks
2. unlimited number of nodes to be synchronized
3. great flexibility in how the synchronization is performed
4. level of security used for external nodes
5. each node stores data about its currently synced files, for fast file
comparison
Now that the synchronization part was covered, I had to find a solution that
allows csync2 to trigger a sync event as soon as a new file appears on any
cluster node. Most Linux based operating systems ship with inotify in their
kernel. Basically, inotify is a kernel subsystem that watches the filesystem
for changes and reports those changes to an application, which in our case
triggers csync2.
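As a rough illustration of what consuming inotify events looks like from the shell, the sketch below wraps inotifywait (from the inotify-tools package) in a function that prints one line per change. The function name watch_dir and the chosen event list are assumptions made for this example; it only shows the event flow and is not the setup used in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: print one line per filesystem change under a directory.
# Assumes inotifywait (inotify-tools) is installed. This is NOT the lsyncd
# setup described in this tutorial, only an illustration of inotify reporting.
watch_dir() {
    dir=$1
    if ! command -v inotifywait >/dev/null 2>&1; then
        echo "inotifywait not found" >&2
        return 1
    fi
    # -m keeps watching, -r recurses into subdirectories,
    # --format prints the watched directory (%w), file name (%f) and event (%e)
    inotifywait -m -r \
        -e create -e modify -e delete -e move \
        --format '%w%f %e' "$dir"
}

# Usage (blocks until interrupted):
#   watch_dir /var/www/html
```

Each printed line could then be handed to a sync command, which is the job lsyncd takes over in the rest of this tutorial.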
With that established, I had to find a robust C based application that properly
handles the inotify events and can also be daemonized, so I can run it as a Linux
service. Again, I quickly discarded all Python or Perl based scripts, for the same
reasons I mentioned earlier: speed and performance.
The main candidates I selected were inotify-tools and lsyncd. Both are
written in C and have options to run as a daemon. Based on preliminary tests, I
discovered that inotify-tools would not satisfy my selection criteria, as its
daemon features contained bugs. I even tried to daemonize inotifywait myself
with a shell wrapper, but the performance was just not there. The server load
generated was too high, so I was forced to discard it, and I notified Radu
Voicila (inotify-tools developer) about the issue.
I selected lsyncd as the prime inotify candidate for the following reasons:
1. runs as a daemon, so it can be started as a Linux service
2. does not hamper local filesystem performance like other similar solutions
3. aggregates and combines events in batches, avoiding unnecessary server
load
4. uses Lua as its configuration scripting language
Once the solution logic was finished, it was time to turn everything into
reality. First, I created all the necessary rpm's, allowing me to install csync2
and lsyncd on any cluster in a matter of seconds. CentOS 5.6 lacked several
important packages, so I had to build everything from scratch except for xinetd,
which was available in the base repository. All packages are currently available
in the Axivo repository. If you are a host provider, please contact us if you
want to help us host our packages on your servers.
The dependencies I wrote/used for csync2.x86_64 1.34 rpm are:
xinetd.x86_64 2:2.3.14+ (CentOS repo)
gnutls.x86_64 1.4.1+ (Axivo repo)
librsync.x86_64 0.9.7+ (Axivo repo)
libtasn1.x86_64 2.9+ (Axivo repo)
sqlite2.x86_64 2.8.17+ (Axivo repo)
The dependencies I wrote/used for lsyncd.x86_64 2.04 rpm are:
rsync.x86_64 2.6.8+ (CentOS repo)
lua.x86_64 5.1.4+ (Axivo repo)
Note: lsyncd was designed mainly for rsync, so I had to add it as a
requirement in the Axivo rpm.
Next, it was time to actually install all packages and test how well they perform.
For that, I set up a small 3 node cluster on an internal network and installed
all of the above listed rpm's on each node:
# yum --enablerepo=axivo install csync2 lsyncd
I will explain the logic in detail, because it is important to understand how the
actual setup works. For simplicity, I will call the 3 nodes apollo, chronos and
hermes, the names of the test nodes I used in the cluster.
See below the detailed setup steps I used for csync2 and lsyncd.
Important: Please be aware that I refer to several file locations in the next
steps. These locations are unique to the Axivo rpm builds and will probably not
match the locations of a source build. If you install the Axivo rpm's, you will
not encounter any setup difficulties.
Floren, May 16, 2011 #1
Csync2 Setup
Csync2 is very straightforward, but it requires some precise steps once all
the Axivo rpm's are installed. The csync2 rpm automates several tasks that
you have to perform manually when you install the software from source.
For example, it adds the default csync2 TCP port to the /etc/services file,
creates the SSL certificates (in case you connect to an external node) and
manages the sqlite database needed to store the sync information for each node.
To avoid frustration, it is important that you study the Csync2 documentation
and understand how its process works before you proceed with the tutorial.
1) Generate the csync2.key on the apollo node. To do that, I simply issued the
command:
# csync2 -k /etc/csync2/csync2.key
The command needs some additional entropy in order to properly generate
the key.
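To see how much entropy the kernel currently has available, a small helper like the one below can be used. It is a minimal sketch, assuming a Linux kernel that exposes /proc/sys/kernel/random/entropy_avail; the function name entropy_avail and its optional file argument are introduced here only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel's available entropy estimate.
# The optional argument overrides the default proc file (useful for testing).
entropy_avail() {
    file=${1:-/proc/sys/kernel/random/entropy_avail}
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}

# Usage: poll once per second in a second terminal while csync2 -k is running
#   while sleep 1; do entropy_avail; done
```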
To generate that entropy, I simply opened a new terminal window and ran the
usual commands like:
# du -h /
If that is not enough, run top and wait until enough random data is created. You
will notice that the process is complete once your shell prompt returns in
the window where you started the csync2 key generation.
2) Edit the default /etc/csync2/csync2.cfg configuration file and insert the
following content:
Code:
nossl * *;
group web
{
    host apollo;
    host chronos;
    host hermes;
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
I used this file for the initial synchronization step on all nodes, explained in
the final setup steps.
Personally, I did not want the encryption overhead on an internal network, so
I disabled SSL with the nossl directive you noticed in the above csync2
configuration file.
3) Create a custom configuration file called csync2_apollo.cfg for the apollo node.
This part is the key to the success of our solution; I will explain the logic in
detail at the end of this article.
Code:
nossl * *;
group apollo
{
    host apollo;
    host (chronos);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
4) On each additional node, copy all files present in the apollo:/etc/csync2
directory. To be safe, I ran the following commands on chronos and hermes:
# rm -f /etc/csync2/*
# scp root@apollo:/etc/csync2/* /etc/csync2/
I did that because I wanted to have the SSL certificates and csync2.key identical
on all nodes. That is the condition required by csync2, in case you decide to use
an SSL connection while syncing the files.
Once the file transfer completed on each node, rename the csync2
configuration files accordingly to reflect the name of each node. On the chronos
node, the configuration file was renamed csync2_chronos.cfg and had the content:
Code:
nossl * *;
group chronos
{
    host chronos;
    host (hermes);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
while on the hermes node, the configuration file was renamed csync2_hermes.cfg
and had the content:
Code:
nossl * *;
group hermes
{
    host hermes;
    host (apollo);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
As detailed in the Csync2 documentation, nodes will be able to communicate with
each other if the chained configuration files are present on each node. For
example, the csync2_apollo.cfg file should be present on both the apollo and
chronos nodes.
You will probably say:
"Floren, what is this nonsense, simply use the global csync2 configuration file
across the entire network!"
"Patience, Obiwan. I will explain in detail the logic at the end of this article."
5) On each node, edit the xinetd csync2 service file:
Code:
service csync2
{
    flags           = REUSE
    socket_type     = stream
    wait            = no
    user            = root
    group           = root
    server          = /usr/sbin/csync2
    server_args     = -i
    port            = 30865
    type            = UNLISTED
    #log_on_failure += USERID
    disable         = no
    #only_from      = 192.168.199.3 192.168.199.4
}
Basically, all you have to do is change the disable setting from yes to no. You
could also run the command:
# chkconfig csync2 on
but I wanted you to become familiar with the options inside the service file, in
case you want to customize some settings in the future.
Floren, May 16, 2011 #2
Lsyncd Setup
On each node, edit the /etc/lsyncd/lsyncd.conf file and insert the following Lua
script:
Code:
-----
-- User configuration file for lsyncd.
--
-- This example synchronizes one specific directory through multiple nodes,
-- by combining csync2 and lsyncd as monitoring tools.
-- It avoids any race conditions generated by lsyncd, while detecting and
-- processing multiple inotify events in batch, on each node monitored by
-- csync2 daemon.
--
-- @author Floren Munteanu
-- @link http://www.axivo.com/
-----
settings = {
    logident = "lsyncd",
    logfacility = "user",
    logfile = "/var/log/lsyncd/lsyncd.log",
    statusFile = "/var/log/lsyncd/status.log",
    statusInterval = 1
}

initSync = {
    delay = 1,
    maxProcesses = 1,
    action = function(inlet)
        local config = inlet.getConfig()
        local elist = inlet.getEvents(function(event)
            return event.etype ~= "Blanket"
        end)
        local directory = string.sub(config.source, 1, -2)
        local paths = elist.getPaths(function(etype, path)
            return "\t" .. config.syncid .. ":" .. directory .. path
        end)
        log("Normal", "Processing syncing list:\n", table.concat(paths, "\n"))
        spawn(elist, "/usr/sbin/csync2", "-C", config.syncid, "-x")
    end,
    collect = function(agent, exitcode)
        local config = agent.config
        if not agent.isList and agent.etype == "Blanket" then
            if exitcode == 0 then
                log("Normal", "Startup of '", config.syncid, "' instance finished.")
            elseif config.exitcodes and config.exitcodes[exitcode] == "again" then
                log("Normal", "Retrying startup of '", config.syncid, "' instance.")
                return "again"
            else
                log("Error", "Failure on startup of '", config.syncid, "' instance.")
                terminate(-1)
            end
            return
        end
        local rc = config.exitcodes and config.exitcodes[exitcode]
        if rc == "die" then
            return rc
        end
        if agent.isList then
            if rc == "again" then
                log("Normal", "Retrying events list on exitcode = ", exitcode)
            else
                log("Normal", "Finished events list = ", exitcode)
            end
        else
            if rc == "again" then
                log("Normal", "Retrying ", agent.etype, " on ", agent.sourcePath, " = ", exitcode)
            else
                log("Normal", "Finished ", agent.etype, " on ", agent.sourcePath, " = ", exitcode)
            end
        end
        return rc
    end,
    init = function(inlet)
        local config = inlet.getConfig()
Rename the corresponding source key and value accordingly. You can specify
several sources, using this format:
Code:
local sources = {
    ["/var/www/html/images"] = "apollo",
    ["/var/www/html/customavatars"] = "apollo"
}
The sources array allows you to create specific csync2 configurations for
each directory you monitor.
The key defines a specific directory to be synced, while the value is
the id of the csync2 configuration file:
/etc/csync2/csync2_apollo.cfg
Obviously, you can specify whatever values you want. You can even define a
specific syncid per directory, if your configuration needs to have specific
actions processed only for your defined source key.
Floren, May 16, 2011 #3
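The relation between a syncid and its csync2 configuration file can be sketched in shell as follows. This is only an illustration: the function names are hypothetical, and the /etc/csync2 directory is the location used by the Axivo rpm builds, so a source build may keep its files elsewhere.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the syncid -> configuration file mapping.
CSYNC2_DIR=/etc/csync2   # Axivo rpm location; source builds may differ

# Print the configuration file path that belongs to a given syncid.
csync2_cfg_path() {
    printf '%s/csync2_%s.cfg\n' "$CSYNC2_DIR" "$1"
}

# Print the csync2 command that the Lua script spawns for a given syncid.
csync2_sync_cmd() {
    printf '/usr/sbin/csync2 -C %s -x\n' "$1"
}

csync2_cfg_path apollo
csync2_sync_cmd apollo
```

With the sources table shown above, both monitored directories share the apollo syncid, so both resolve to the same configuration file.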
Final Setup Steps
So, the hardcore stuff is done.
All we have left to do is start the daemons on each node and let the sync beast
unleash its power throughout the entire network.
1) We need to prepare the csync2 sqlite database on each node. On apollo, run
the csync2 initialization command. It will detect all new files, mark them as
dirty and then sync them to the rest of the nodes:
# csync2 -xv
Depending on the size of your /var/www/html directory, it will take csync2 from
seconds to minutes to replicate the entire list of files. That includes not only
the file content, but also its permissions. Don't worry, subsequent syncs will be
very fast, as csync2 stores each file's information in its database and syncs
only the files that changed, resulting in dramatic performance gains over a
large cluster.
Once the sync was completed on apollo, I ran the same command on
each of the other nodes, in our case chronos and hermes. That resulted in a
global sync, allowing lsyncd to start with fresh, perfectly synchronized data.
2) Start the daemons on each node:
# chkconfig xinetd on
# service xinetd start
# chkconfig lsyncd on
# service lsyncd start
You can see if everything is OK by consulting the /var/log/lsyncd/lsyncd.log
file.
Final Test Results
I created a very simple shell script that generates 100 .html files in a /test
directory.
The goal was to see how fast the files are replicated through the entire network
and how many resources are consumed:
Code:
#!/bin/bash
# C-style for loops are a bash feature, hence the bash shebang
for ((i=0; i<100; i++)); do
    touch /var/www/html/test/${i}.html
done
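For repeated runs, the same idea can be wrapped in a small function that takes the target directory and the file count as arguments. This is a sketch rather than the script used for the results reported here: the function name make_test_files is made up for this example, it is written in plain POSIX sh, and unlike the script above it creates the directory itself.

```shell
#!/bin/sh
# Hypothetical helper: create COUNT empty .html files (0.html, 1.html, ...) in DIR.
make_test_files() {
    dir=$1
    count=$2
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        touch "$dir/$i.html"
        i=$((i + 1))
    done
}

# Usage on a cluster node:
#   make_test_files /var/www/html/test 100
```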
I started top on all 3 nodes and opened another terminal window where I
manually created the /test directory, then executed the shell script. Once the
script ran, the new directory and the set of 100 html files were replicated in
less than 1 second, generating only a 0.08 server load on each node:
Sun May 15 10:57:16 2011 Normal: Recursive startup sync: apollo:/var/www/html/
Sun May 15 10:57:19 2011 Normal: Startup of 'apollo' instance finished.
Sun May 15 10:59:21 2011 Normal: Processing syncing list:
apollo:/var/www/html/test/
While syncing file /var/www/html/test:
ERROR from peer chronos: File is also marked dirty here!
Auto-resolving conflict: Won 'master/slave' test.
Sun May 15 10:59:22 2011 Normal: Finished events list = 0
Sun May 15 10:59:38 2011 Normal: Processing syncing list:
apollo:/var/www/html/test/0.html
Still, that was not enough for my taste. I really wanted to push the system to its
limits and see how the nodes react under heavy sync processing. So I created a
more advanced script that generates a batch of 1000 html files, repeated in a
loop 10 times at random time intervals just above 1 second (as written, the
script sleeps between 1.000 and about 1.011 seconds):
Code:
#!/bin/bash
for ((i=0; i<10; i++)); do
    # 1000000 plus a random value below 11000, in microseconds
    number=$((1000000 + ($(od -An -N2 -i /dev/random)) % (10000 + 1000)))
    for ((j=0; j<1000; j++)); do
        touch /var/www/html/test/${i}-${j}.html
    done
    usleep ${number}
done
I chose a random time interval longer than 1 second because I wanted to
avoid the lsyncd batch processing implemented in the new Lua script, whose
delay is set to 1 second.
The only difference with the new script, from a performance point of view, was
the server load increase. In the previous test, I experienced a 0.08 load. With
the new script, the load went up to nearly 0.50 on all nodes, rapidly dropping
once the shell script finished its execution. Yet the sync speed was unaffected:
files were instantly synced over the entire network.
Dropping the entire /var/www/html/test directory was a little harsher on
the load. csync2 had to remove 10,000 files from its database, so that pushed
the load to nearly 1 (0.97, to be precise). Still, the files were instantly
deleted from all nodes.
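One way to double-check that the nodes really hold the same files after such a test is to compare a sorted file listing from each node. The sketch below is an illustration under stated assumptions, not part of the original test procedure: dir_manifest is a hypothetical helper, and the ssh invocation in the usage comment presumes root ssh access between the nodes.

```shell
#!/bin/sh
# Hypothetical helper: print a sorted list of regular files below DIR,
# with paths relative to DIR, so listings from two nodes can be diffed.
# LC_ALL=C keeps the sort order identical regardless of each node's locale.
dir_manifest() {
    ( cd "$1" && find . -type f | LC_ALL=C sort )
}

# Usage: compare apollo (local) against chronos (remote); no diff output
# means both nodes list the same files.
#   dir_manifest /var/www/html > /tmp/apollo.list
#   ssh root@chronos 'cd /var/www/html && find . -type f | LC_ALL=C sort' > /tmp/chronos.list
#   diff /tmp/apollo.list /tmp/chronos.list
```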
As you can see for yourself, the final results are simply amazing.
However, things were not going so well when I started developing this project...
see below.
The Dark Side
Originally, I started with two basic csync2 and lsyncd configurations, as
explained in the manuals. This particular setup created insane race conditions
among nodes, generating a huge load and repeated delays in the global sync. It
was not looking good at all. So I started analyzing how both csync2 and lsyncd
work under the covers, by testing each product in different circumstances.
There was a lot of back and forth discussion with Axel Kittenberger (lsyncd
developer) and Gordan Bobic on the mailing lists (where my weekend was
spent, not funny I know). Basically, the race conditions were caused by:
1. cyclic csync2 executions, based on "false" lsyncd alarms
2. lsyncd events processed on a file-by-file basis
While testing, I noticed initial race conditions among nodes, generated by
csync2. In particular, they occurred when the update from hermes to apollo
completed. At that time, I was using the global csync2 configuration (group web).
So Gordan came up with the idea to use a "chained" configuration setup instead,
forcing csync2 to execute only on specific nodes and in a specific order. I
created the proper configuration files and started the tests again. The results
were not promising at all: the races were still there, yet at a lower intensity.
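To make the chained layout concrete: each node gets its own group in which it is listed first and the next node in the ring is listed in parentheses, as in the csync2_apollo.cfg, csync2_chronos.cfg and csync2_hermes.cfg files from the setup steps. The following sketch prints that configuration for any node of a ring. The function names and the way the node list is passed are assumptions for this example; the key path and the include/exclude lines are copied from the configuration files in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: print the node that follows NODE in the ring "$@".
# The last node wraps around to the first one.
next_in_ring() {
    node=$1; shift
    first=$1
    found=0
    for n in "$@"; do
        if [ "$found" -eq 1 ]; then
            echo "$n"
            return 0
        fi
        if [ "$n" = "$node" ]; then
            found=1
        fi
    done
    if [ "$found" -eq 1 ]; then
        echo "$first"
        return 0
    fi
    return 1
}

# Hypothetical helper: print the chained csync2 configuration for NODE.
chained_cfg() {
    node=$1; shift
    slave=$(next_in_ring "$node" "$@") || return 1
    cat <<EOF
nossl * *;
group $node
{
    host $node;
    host ($slave);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
EOF
}

chained_cfg hermes apollo chronos hermes
```

The last line prints the hermes group with apollo in parentheses, closing the ring apollo, chronos, hermes, apollo.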
Then, I noticed that by default, lsyncd processes events on a file-by-file basis
when the stock configuration files are used. In other words, when lsyncd finds
that the file /foo/bar has changed, it will initiate a csync2 sync of that
specific file, rather than of everything that changed. That was very bad. As a
solution, I had to write a custom Lua based configuration script and force
lsyncd to process the event calls in batch, instead of the default per-file
events. I am so glad that Axel decided to use Lua as the scripting base for
configurations; you can do extraordinary things with it.
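The difference between per-file and batched processing can be illustrated with a small shell sketch. It is a dry run that only prints the csync2 commands it would execute; the function names are hypothetical, the per-file variant is a simplification of the stock behaviour described above, and the batched command line mirrors the spawn call in the Lua script.

```shell
#!/bin/sh
# Dry-run illustration: read changed paths from stdin, print csync2 commands.

# Per-file processing: one csync2 run for every reported path.
per_file_cmds() {
    syncid=$1
    while IFS= read -r path; do
        echo "csync2 -C $syncid -x $path"
    done
}

# Batched processing: any number of reported paths results in a single run.
batched_cmds() {
    syncid=$1
    count=$(wc -l | tr -d ' ')
    if [ "$count" -gt 0 ]; then
        echo "csync2 -C $syncid -x"
    fi
}

printf '%s\n' /var/www/html/test/0.html /var/www/html/test/1.html | per_file_cmds apollo
printf '%s\n' /var/www/html/test/0.html /var/www/html/test/1.html | batched_cmds apollo
```

For a burst of 1000 new files, the first variant would start csync2 1000 times, while the second starts it once and lets csync2 work out what changed.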
As a bonus for my wasted weekend, I found out today that my script will be
featured in the next lsyncd package, as a proper example of how to bind csync2
and lsyncd for an advanced cluster sync solution. Thank you Axel!
The result is an efficient sync process, constantly monitored by Linux services,
that generates no race conditions and updates an entire cluster in an
instant. Personally, I'm very satisfied with the results and I'm sure my client
will enjoy the ease and improved manageability of files.
Floren, May 16, 2011 #4