FOSDEM 2012: Practical implementation of promise theory in CFEngine

Embed Size (px)

Citation preview

Practical implementation of promise theory in CFEngine

Mikhail GusarovCFEngine AS

Welcome and thank you for coming.

My name is Mikhail Gusarov and I work for CFEngine AS, which is a company backing free software project CFEngine.

I expect it to be most interesting to system administrators and software engineers who tried CFEngine and were puzzled by the semantics of operations, and also who read the works on promise theory by Mark Burgess and are trying to figure out how do the theoretical foundation are mapped to real-world CFEngine implementation.

Contents

Promise theory

"reports"

"processes"

"packages"

"variables"

Summary

We will start with the quick recap of promise theory, and then proceed to dissect several promise types in CFEngine, trying to find how the fundamental principles are reflected in particular design.

Promise theory

Agent A promisesto agent B to fulfill C

Let's start with a quick question: please raise your hand if you have used CFEngine before. And who has read the Mark Burgess' papers on promise theory?

The promise theory states that in distributed (or multi-agent) systems you may not adequately describe reality by obliging actors to do something, and instead promise is the building block. Promise in computing is similar to the one in real life: it might be broken (voluntarily or forcedly), it might be fulfilled from time to time, so at the best we may only statistically say that the promise is kept.

Software Implementation

ConvergencyPromise essence is the desired state

Embracing errorsPromises may be kept occasionally

AutonomyUnable to rely on another agents

What does it mean from the software point of view?

We need some way do describe and test promises, and the most useful way to do it is to formulate promises in the form of desired state, with operators to test whether desired state is achieved, and to perform actions on the system to move closer to the desired state.

We need to embrace errors, as it's quite rare situtaion when the promise we are trying to fulfill may be achived by the agent itself, so we need to be able to handle unfulfilled promises by any other agents we communicate with, as well as any internal inconsistencies.

We need to be able to work autonomously, as we may not rely to another agents. At least we should be able to fail gracefully and wait until other agents start to fulfill their promises.

Real-world example: e-mail

Convergency

Delivery to address

Embracing errors

4xx, 5xx errors

Multiple MXes

Mail queues

Autonomy

Independent mail servers

Local routing decisions

Here's an example of a reliable system which uses all of those principles.E-mail promises are convergent, in the sense that single promise this system fulfills is to deliver mail to the given address, and every MTA knows how to verify whether this promise is fulfilled or not, and how to advance in achievement of this promise.E-mail embraces errors, and there is huge lot of error handling in all components of-email.E-mail is autonomous in the sense, that any MTA is able to work offline, and does not fail completely if there is no DNS service. Once external services is available, it fetches the necessary data, makes routing decision and contines with delivery.

Reports

body common control { bundlesequence => {"mybundle"};}bundle agent mybundle {reports: linux:: "Hello world!";}

Let's start with a simple policy.

Reports

body common control { bundlesequence => {"mybundle"};}bundle agent mybundle {reports: linux:: "Hello world!";}

First section declares what _bundles_ are to be taken into account during CFEngine run, and is not particularily interesting. I will skip this part of the policy for the rest of the talk.

Reports

bundle agent mybundle {reports: linux:: "Hello world!";}

Let's have a look at second one. This is a bundle, that is, a building block for CFEngine policy, best approximated as a module in other languages.

Reports

bundle agent mybundle {reports: linux:: "Hello world!";}

This module contains a single statement of type "report".

Reports

bundle agent mybundle {reports: linux:: "Hello world!";}

executed only in context "linux", which is the case on Linux machines.

Reports

bundle agent mybundle {reports: linux:: "Hello world!";}

This statement asks to report a string "Hello world!".

Reports (translated)

def mybundle() { if "linux" in global_contexts { report("Hello world!"); }}mybundle()

Let's use some imaginary imperative programming language to re-state the policy above. From the look of it, it looks similar to ... But does it behave this way?

Reports (run)

$ cf-agentR: Hello world!$

Let's run it. Fine nothing unexpected.

Reports (run)

$ cf-agentR: Hello world!$ cf-agent$

Let's run it again. That's interesting.

Reports (run)

$ cf-agentR: Hello world!$ cf-agent$ cf-agent$

Let's run it again. Hmm...

Reports

bundle agent mybundle {reports: linux:: "Hello world!"; "That's surprising";}

Let's add another report statement.

Reports (run)

$ cf-agentR: Hello world!$ cf-agent$ cf-agent$ cf-agentR: That's surprising$

And run it again.

Reports (run)

$ cf-agentR: Hello world!$ cf-agent$ cf-agent$ cf-agentR: That's surprising$ cf-agentR: Hello world!$

And one more time. So, the reports statement does not behave like a conventional print statement from imperative language.

def mybundle() { promises.add( report("Hello world!", activate = : "linux" in global_contexts))}mybundle()run_all_relevant_promises()

More correct restatement in imaginary language we have used before would be like the following.

Ensure that specified message is reported to sysadmin as soon as possible, but no more than once per 5 minutes to the sysadmin using the specified reporting mechanisms

If we would specify the reports promise from CFEngine language in English, it would sound similar to the following.

Ensure that specified message is reported to sysadmin as soon as possible, but no more than once per 5 minutes to the sysadmin using the specified reporting mechanisms

Convergence

This type of promises is convergent: we know the end state, and how to get closer to it.

Ensure that specified message is reported to sysadmin as soon as possible, but no more than once per 5 minutes to the sysadmin using the specified reporting mechanisms

Embracing errors

This type of promises embraces errors, as we know that reporting mechanisms are not always available.

Ensure that specified message is reported to sysadmin as soon as possible, but no more than once per 5 minutes to the sysadmin using the specified reporting mechanisms

Autonomy

And this this type of promise is autonomous in the sense that agent does not need any additional information to decide where to direct the report.

Processes

bundle agent myprocesses {processes: "nginx" match_range => irange("1","20"); restart_class => "restart_nginx";commands: restart_nginx:: "/etc/init.d/nginx start";}

Let's have a look at another promise.

Processes

bundle agent myprocesses {processes: "nginx" match_range => irange("1","20"); restart_class => "restart_nginx";commands: restart_nginx:: "/etc/init.d/nginx start";}

This declaration promises that agent will keep number of nginx processes in specified range.

Processes

bundle agent myprocesses {processes: "nginx" match_range => irange("1","inf"); restart_class => "restart_nginx";commands: restart_nginx:: "/etc/init.d/nginx start";}

And if it is not, extra processes will be killed, or the need to spawn new processes will be communicated to another promise.

def myprocesses() { nginx_cnt = count_proc("nginx") if nginx_cnt == 0 { promises.add( command( "/etc/init.d/nginx start")) } elif nginx_cnt > 20 { Promises.add(kill("nginx", nginx_cnt 20)) }}myprocesses()run_all_relevant_promises()

This is a simplistic rewrite of the previous policy in our imaginary imperative language, but it is not accurate in one significant aspect: this code only checks for amount of nginx processes once, while agent is able to re-check it again, if other parts of policy do a corrections, and spawn or kill monitored processes.

def myprocesses() { promises.add( processes("nginx", count: if count == 0 { define_context("restart_nginx")}) promises.add( commands( "/etc/init.d/nginx start", activate_if = : "restart_nginx" in global_contexts))}myprocesses()run_all_relevant_promises()

So the following rewrite would be more adequate, but as you easily see, it is just a restatement of policy in different syntax.

Ensure that amount of specified processes is kept in specified range, killing extra processes and communicating the need to restart if necessary

Let's try to formulate what does this promise type say.

Ensure that amount of specified processes is kept in specified range, killing extra processes and communicating the need to restart if necessary

Convergence

This promise type is convergent, as we know the target state, and how to reach it.

Ensure that amount of specified processes is kept in specified range, killing extra processes and communicating the need to restart if necessary

Embracing errors

The mere existance of this promise type is admittance of the fact processes may run out of the control.

Ensure that amount of specified processes is kept in specified range, killing extra processes and communicating the need to restart if necessary

Autonomy

And the promise type is obviously autonomous, as it does not need any external data to proceed.

Packages

bundle agent mypackages {packages: "libapache2-mod-wsgi" package_method => apt, package_policy => "addupdate", package_version => "3.3-4", package_select => ">=";}

Let's consider the next promise:

Packages

bundle agent mypackages {packages: "libapache2-mod-wsgi" package_method => apt, package_policy => "addupdate", package_version => "3.3-4", package_select => ">=";}

Agent promises to ensure that package libapache2-mod-wsgi from APT repository is installed and has version at least 3.3-4.

Packages (translated)

(Implementation is too large to fit in the marginslide)

I won't even try to translate this kind of promise, as it will be either very verbose, or just restate the policy language using DSL of some sort.

Ensure that specified package is installed or not installed according to the constraints specified. Do this by instructing local package manager to add, upgrade or remove package

The following description of package promise type is simplified (as it does not include large number of less-used features, like lack of local package manager, or rarely used operations), but nonetheless useful.

Ensure that specified package is installed or not installed according to the constraints specified. Do this by instructing local package manager to add, upgrade or remove package

Convergent

Packages promise is obviously convergent, as we know the end state and how to reach it.

Ensure that specified package is installed or not installed according to the constraints specified. Do this by instructing local package manager to add, upgrade or remove package

Embracing errors

It implicitly embraces errors, as implementation will patiently wait until local package manager is able to do what is requested (e.g. until network is available to fetch new packages)

Ensure that specified package is installed or not installed according to the constraints specified. Do this by instructing local package manager to add, upgrade or remove package

Autonomous

And the promise is autonomous in the sense that no extra information is needed to verify the state or to try to reach the promised state.

Variables

bundle agent myvars { vars: "a" string => "My $(string)";}

Variables are special kind of promises. They belong to the agent itself, and not to the environment.

Maintain in-agent binding of left-side name to the value of right-side expression

And hence their semantics is pretty similar to the one you might find in functional languages.

Convergent, autonomous

Maintain in-agent binding of left-side name to the value of right-side expression

Variables are trivially convergent and autonomous by definition.

Embracing errors

bundle agent myvars {vars: "myvar" string => execresult( "/usr/bin/testparm /etc/smb/smb.conf", "noshell");}

But still, there is error-handling component. CFEngine variables will try to evaluate itself until it suceeds. For example, in this variable definition, execresult function may fail if testparm binary is not available, and variable will be evaluated as frequently as deemed necessary to try to get it value, e.g. If testparm binary is installed by another part of policy.

Summary

CFEngine building blocks are promises

Higher-level

Convergent

Error-embracing

Autonomous

Not the things from conventional languages

So, we have got through three constructs in CFEngine language, and all of them shown the same unifying principles behind: convergency, embracing errors, and autonomy.This approach distinguishes CFEngine from configuration management tools, such as glorified parallel shell wrappers, where you need to think of convergence yourself, and the language does not help you to produce self-healing systems.But also this approach makes CFEngine language so different from conventional languages.

IRC: #cfengine @ FreeNodeWeb: http://cfengine.com/techML: https://cfengine.org/mailman/listinfo

Questions?

DSL vs. eDSL

Pro eDSL:

Familiar host language

Contra eDSL:

Radically different building blocks

Different control structure and evaluation

DSL

As you just saw, the building blocks of CFEngine language are quite differentfrom the functional or imperative languages. CFEngine agent does not executesequence of commands, and does not evaluate a function changing world as aside-effect, but keeps a number of promises.

There are roughly two ways of describing domain knowledge in machine-readableway:

* eDSL for general-purpose language * Separate DSL

Some promises may require a great deal of context to be bundled in order to beconvergent, and do not docempose to more fundamental entities which still havethe property of convergency. This makes describing policies as a eDSL prettycumbersome - resulting eDSL would still have to have all the CFEngine languageprimitives, including classes and variables, so it would just add another layerof syntactic clutter on top of the same semantics.

Considering this issue, CFEngine chose the approach of having a separate DSL.