
Weaver: Language and runtime for software defined environments

M. H. Kalantar, F. Rosenberg

J. Doran, T. Eilam

M. D. Elder, F. Oliveira

E. C. Snible, T. Roth

Continuous delivery of software and related infrastructure environments is a challenging proposition. Typical enterprise environments, comprising distributed software and its supporting infrastructure, exhibit non-obvious, often implicit dependencies and requirements. Further increasing this challenge is that knowledge about configuration is fragmented and informally recorded. Given this situation, we propose Weaver, a domain-specific language designed to formally specify blueprints, which are desired-state descriptions of environments. An associated runtime executes blueprints to create or modify environments through a set of target-specific platform providers that supply cloud-specific implementations. New and existing automation to implement and maintain the desired state can be associated with a blueprint specified in Weaver. Furthermore, Weaver supports the definition of conditions to validate a blueprint at design time and deployment time, as well as to continuously validate a deployed environment. We demonstrate the use of Weaver to deploy IBM Connections, an enterprise social software platform.

Introduction

Typical enterprise systems comprise a multitude of distributed software components that exhibit non-obvious, often implicit dependencies and infrastructure requirements. The deployment and operation of such complex systems, including their applications and infrastructure, are therefore challenging tasks. Even more challenging is to support continuous delivery [1], that is, to continuously deploy the environment in a test environment that is reasonably similar to the actual production environment as part of development and testing efforts and to promote it to production when appropriate. Aggravating these challenges is the fact that, typically, the knowledge of the configuration of running environments is fragmented and informally recorded.

We identify two types of configuration dependencies in these environments: (1) dependencies between different software components and (2) dependencies between those software components and the underlying infrastructure. In many cases, the configuration of the infrastructure depends on the requirements of the software. For example, network firewall configuration depends on the software communication requirements. The optimization of the placement of compute resources across data centers depends on bandwidth requirements and constraints on the availability of the running system. In addition, modifications to the hardware infrastructure may require software updates. For example, the addition of compute resources to increase capacity may require the reconfiguration of a load balancer.

We observe that enterprises often lack a holistic knowledge of the entire environment configuration. The system configuration knowledge is often distributed across organizational boundaries, where different teams know the details of a subset of the environment components. When recorded, configuration knowledge is kept in multiple, informal documents.

Thus, it is no surprise that environment development expectations and assumptions do not match operational reality. The barrier between development and operations teams impedes iterative environment development, increasing the risk of instability in the transition to production and in the update of the production environment.

© Copyright 2014 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied by any means or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.

Digital Object Identifier: 10.1147/JRD.2014.2304865

M. H. KALANTAR ET AL. 10:1 IBM J. RES. & DEV. VOL. 58 NO. 2/3 PAPER 10 MARCH/MAY 2014

0018-8646/14 © 2014 IBM

Given this situation, we propose Weaver, a language that allows one to specify a blueprint, a formal description of the desired state of an environment. By environment, we mean a distributed application and its supporting infrastructure. Weaver provides the language constructs that describe all desired aspects of an environment: compute, storage and network resources, services, and software. Automation artifacts are associated with the blueprint (or Weaver program) to implement and maintain the desired state. Blueprints specified as Weaver programs can be managed via source control; that is, they can be versioned and shared. The execution of a Weaver program validates the specified environment, and deploys or updates it.

The key goal associated with Weaver is to improve agility and reduce the risk associated with continuously delivering software. Agility is the ability to respond quickly to changing requirements by introducing new application functions or making structural changes in the infrastructure topology, such as adding a firewall for better isolation and security. With the Weaver approach, even such structural changes as inserting a virtual firewall are treated programmatically ("as code") by editing the Weaver blueprint and re-executing it. In addition, Weaver is designed to support effective collaboration between domain experts, modularity, and re-use of code. Weaver language constructs make it easier to map application components differently on different infrastructures. These design principles will be further explored in the section "Weaver language design."

The design of Weaver is motivated and influenced by the DevOps [2] discipline. DevOps is a methodology to enhance the collaboration between development and operations teams by applying development techniques, such as iterative development, automation, automated testing, and versioning, to both application code and deployment automation code. Weaver does not replace the need for low-level automation building blocks to install and configure individual software components. Existing scripting languages, including special-purpose configuration languages such as Chef [3] and Puppet [4], can be used to define automation on single nodes. The main objective associated with Weaver is to provide a programmable view of the entire environment, including software components that span systems, and the infrastructure elements that are needed to support them.

To validate the concepts of Weaver, we used the approach to automate, end to end, the deployment of a large and complex social software system, IBM Connections [5]. We use this system to exemplify the challenges and describe how Weaver successfully addresses them. The next section describes IBM Connections in more detail. The approach and the Weaver language are then described. Finally, we return to a discussion of the results of our experiment with IBM Connections and conclude.

Motivating example: IBM Connections

IBM Connections consists of a set of social software applications including, for example, community, wiki, personal profile, forum, and file sharing applications. This set of hosted applications, offered as a service to all IBM employees, is extensively used. In the instance supporting IBM, the number of visitors has increased by more than 110% in less than a year; the number of user profiles currently exceeds 650,000; more than 600,000 wikis and communities have been created; and file storage is growing at 8% per month.

Typical deployments of IBM Connections are large and complex. Figure 1 shows a deployment that is simplified yet sufficient to exemplify the complexity. The topology consists of two IBM HTTP (Hypertext Transfer Protocol) servers (IHS), 16 IBM WebSphere* Application Servers (WAS) grouped in four clusters that can be deployed in varying sizes, and 1 IBM WebSphere Deployment Manager (DMGR). All of the WAS nodes are connected to an external database (IBM DB2*) and an external network file system (NFS). The set of social applications is distributed among the four clusters; each cluster in the topology hosts a number of them.

In addition to the inherent complexity of the topology, multiple nonfunctional requirements must be addressed. As this is a critical business application, it must be highly available and massively scalable. In addition, there are strict network isolation and data privacy requirements. Figure 1 also illustrates the implications of the nonfunctional requirements on the topology. Firewalls must be present between the web and application tiers. The clusters must be configurable with varying sizes, and the IHS servers must be configured for high availability.

Like many other large distributed applications, IBM Connections presents several challenges in development, testing, deployment, and maintenance. First, system knowledge is fragmented among different teams of experts (e.g., WAS and DB2 configuration experts), all of whom must participate to implement changes and updates. Second, dependencies between different software components and between software and the supporting infrastructure are not well documented or verifiable. Figure 2 illustrates some of the temporal and data dependencies between the different steps required to install the IBM Connections stacks. Note that some of the data dependencies cut horizontally across systems, where the outcome of a step is required in order to properly complete a different step in a different stack. These data dependencies hence imply additional temporal dependencies and a need for coordination across the various systems. Finally, nonfunctional requirements such as availability and security pose additional requirements on the intersection between software and systems, for example, the presence of firewalls and their proper port configuration and the spreading of software across physical machines and racks.


As a result of these challenges, the IBM Connections team has not yet been successful in fully automating the deployment and update of the entire production environment. Consequently, the team is limited to infrequent release cycles (two every year) and requires a long manual planning phase to make any structural changes, such as improved isolation between tiers.

To validate the concept of configuration as "code," we fully automated the deployment of IBM Connections for the topology shown in Figure 1. This effort required several people-months of work, including consultations with various domain experts and authoring and testing the required automation building blocks (in Chef [3]) and the Weaver blueprints that reference them. We were able to achieve a reliable and repeatable deployment of IBM Connections and to demonstrate many of the goals for which Weaver is designed: modularity and reuse, agility, and reduced risk. The lessons we learned are described in the section "Discussion."

Approach

In principle, a custom script could be developed to provision infrastructure resources in a cloud and to install software on those resources. Such an approach works well only for small systems that are deployed only once; it is not a scalable solution. Such scripts are fragile and complex to implement on large systems. Minor modifications, such as the addition of a new cloud resource, may require changes in several places. Furthermore, allowing the concurrent execution of different installation and configuration operations on different nodes may be difficult to achieve in a large-scale system due to multiple data dependencies. Critically, such an approach does not enable collaboration and code reuse among different stakeholders.

We propose Weaver as a language that allows developers to express blueprints as code. A blueprint specifies the desired state of an environment in terms of its resources, services, and software. A blueprint also references automation scripts needed to implement and maintain the desired state. An environment, comprising an application and its infrastructure, is managed as a unit. Environments may be deployed in support of development and test activities, and for running a production system.

Using Weaver, a blueprint is expressed by a set of Weaver files that comprise a Weaver program. The core language concepts express all relevant resources for an environment, such as servers, storage, software components, and automation scripts. Weaver is an internal domain-specific language (DSL) [6] built in Ruby. An internal DSL builds upon the host language and is therefore tightly coupled to its syntax and semantics. The use of Ruby as the host language allows the use of a well-known type system, expression syntax, and semantics that have been adopted by popular infrastructure-as-code frameworks (e.g., Chef [3]). The Weaver language is described in detail in the next section.

Figure 1

Simplified physical topology for IBM Connections. Replicated IHSs support four WAS clusters, WAS_C1 through WAS_C4. Each cluster contains four nodes: Ci_M0 through Ci_M3. The clusters are managed by a DMGR and are supported by a database and shared file server. Lines represent valid communication paths, whereas the red boxes represent firewalls.

The Weaver runtime creates and modifies environments by executing a blueprint expressed as a Weaver program. The runtime coordinates the creation or modification of the resources described by the blueprint via a set of platform-specific providers, which implement the interactions with target clouds. With reference to Figure 3, the runtime first creates an in-memory model of the desired environment from the blueprint (Parsing and Transformation component). This model is used to validate the blueprint (Validation component) prior to deployment. The model is then analyzed to identify relationships between property values and dependencies. The dependencies are used to derive coordination requirements used during software configuration. Finally, the in-memory model is traversed to create or modify resources using a set of External Services and Platform Providers, which provide target-cloud-specific resource implementations. Weaver currently implements providers for the SDI (software defined infrastructure) Controller [7], OpenStack** [8], and Amazon Compute Cloud [9]. As virtual machine (VM) instances start, they execute a startup script to configure the software. The software configuration is coordinated between instances using a Coordinator, currently implemented on Apache ZooKeeper** [10]. The coordination ensures that property values propagated between instances are available when and where they are needed. After an environment has been deployed, the Persistence component serializes the in-memory model as the actual system state in the Datastore, currently implemented using Apache CouchDB [11].
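The phase ordering described above (build a model, validate it, and only then create resources and persist state) can be sketched in plain Ruby. This is an illustration under stated assumptions, not the Weaver runtime: the phase names, the `deploy` method, and the Hash-based model are all hypothetical, and dependency analysis and coordination are omitted.

```ruby
# Hypothetical phases; each takes and returns the in-memory model (a Hash).
# None of these names come from Weaver itself.
PHASES = [
  [:parse,     ->(m) { m.merge(resources: m[:blueprint].keys) }],
  [:validate,  ->(m) { m[:resources].empty? ? raise(ArgumentError, "empty blueprint") : m }],
  [:provision, ->(m) { m.merge(created: m[:resources].dup) }],
  [:persist,   ->(m) { m.merge(persisted: true) }]
].freeze

# Run the phases in order and record which ones completed.
# A validation failure raises before anything is provisioned.
def deploy(blueprint)
  PHASES.reduce({ blueprint: blueprint, log: [] }) do |model, (name, phase)|
    result = phase.call(model)
    result.merge(log: result[:log] + [name])
  end
end

MODEL = deploy({ "clust1" => {}, "ihs_server" => {} })
puts MODEL[:log].inspect
```

Because each phase receives the model produced by the previous one, a failed validation stops the run before any resource is created.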

Weaver language design

Desired state representation

A blueprint expressed in Weaver is a description of the desired or implicit state of an environment, not a set of instructions to create an environment. Use of desired state is a common approach in systems management [3, 4, 12–14]. A supporting runtime creates or modifies the environment defined by the blueprint. In order to express the desired state, Weaver provides basic resource types for compute, storage, and network resources. Further, Weaver provides ways to express software components and external services. All resources in Weaver have an identifier that must be unique within its scope. This identifier can be used to reference the resource. Resources may contain properties and child resources. Hereafter, we briefly describe some of the core Weaver resources.

Figure 2

Steps in an automated deployment of IBM Connections. The arrows represent data dependencies between the steps. The labels on the arrows describe the data dependency. (EAR: enterprise archive; IP: Internet protocol address; SSL: secure sockets layer.)


Compute resources

A homogeneous group of compute resources is represented in Weaver by a node. A multiplicity property (default value 1) identifies the desired number of compute resources in the group. In Weaver, the definition is a template that will be reused multiplicity times. In each instance, the (uneditable) attribute multiplicity_index will have a unique integer value assigned at runtime. It can be used to define unique property values for the compute resources in the group. An example node, used for the topology shown in Figure 1, is

    node("clust1") {
      multiplicity 4
      property :cluster_name => "ClusterA"
      name late_binding { "ConnsCluASrv#{multiplicity_index}" }
      key_pair_name 'weaver-sample'
      security_group_ids ['test']
    }

In this example, a cluster with id clust1 with four members is defined. It contains several predefined properties (multiplicity, name, key_pair_name, and security_group_ids) as well as a user-defined property (cluster_name). The name property is defined in terms of a uniquely assigned attribute multiplicity_index. More information about defining properties in terms of others is provided in the subsection "Property value assignments." Properties may be used at deployment time to create a VM (name, key_pair_name, and security_group_ids in this case), may be used as inputs to other properties (cluster_name in this example), or may be descriptive.

While not shown in the example above, all nodes contain ip_address and hostname properties. When these properties have values, the node refers to a specific (existing) compute resource. When they are undefined, as in the example above, the node models a desired resource that will be provisioned as part of blueprint execution. Additional node properties specific to a particular target cloud are made available by the platform provider referenced by the node (see the following subsections for more about providers).

Figure 3

System context for Weaver runtime. (SDN: software defined network; SDS: software defined storage; SDC: software defined compute.)
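To make the template semantics concrete, the following plain-Ruby sketch expands a node definition multiplicity times and assigns each copy its own index. It is an assumption-laden illustration rather than Weaver's implementation: `NodeTemplate`, `expand`, and the zero-based index are hypothetical, and "existing" is taken here to mean that both ip_address and hostname are set.

```ruby
# Hypothetical stand-in for a Weaver node template; not the real Weaver API.
NodeTemplate = Struct.new(:id, :multiplicity, :name_block, :ip_address, :hostname) do
  # Expand the template into one Hash per compute resource in the group.
  def expand
    (0...multiplicity).map do |index|
      {
        id: id,
        multiplicity_index: index,       # assigned by the runtime (0-based is an assumption)
        name: name_block.call(index),    # evaluated once per instance
        existing: !(ip_address.nil? || hostname.nil?) # both set => refers to an existing machine
      }
    end
  end
end

CLUST1 = NodeTemplate.new("clust1", 4, ->(i) { "ConnsCluASrv#{i}" }, nil, nil)
INSTANCES = CLUST1.expand
INSTANCES.each { |n| puts "#{n[:name]} existing=#{n[:existing]}" }
```

Each expanded instance gets a distinct name from the shared template, and none is marked as existing because neither ip_address nor hostname was supplied.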

Storage resources

Block storage volumes are defined in Weaver by the storage resource type. Storage instances have size and format properties. Platform providers may define additional properties. Storage resources are associated with nodes via a storage_link that identifies where and how the storage should be mounted. For example, a 21 GB volume of type ext3 is defined as

    storage(:db2_storage) {
      name 'database storage'
      description 'DB2 database storage'
      size 21 # 21 GB
      fstype 'ext3'
      delete_with_pattern true
      format true
      volume_type_config GOLD
    }

A storage_link may also be defined. The following one specifies that the volume should be attached to the node dB2_server as /dev/vdb and mounted at /storage/lcuser:

    storage_link(:db2_storage_link) {
      mount_point '/storage/lcuser'
      dev_name '/dev/vdb'
      storage_user 'lcuser'
      storage_group 'lcuser'
      storage_attach_timeout 240
      source dB2_server
      target db2_storage
    }

Network resources

Network resources are expressed indirectly by means of a network_link resource between nodes and/or services. Properties on the network_link determine the requirements on the physical network, including firewall requirements. Below are two examples of network links. The first one shows a requirement for a bandwidth of 500 Mbps, whereas the second one shows a requirement for an open firewall port.

    network_link("ihs_to_was_cluster1") {
      source ihs_server
      target "clust1"
      bandwidth 500
      bandwidth_unit 'mbps'
    }

    network_link("was_cluster1_to_db2") {
      source "clust1"
      target dB2_server
      firewall {
        allow_tcp :port => 50001
      }
    }
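One way to read the two links above is as input from which network requirements are derived. The sketch below is a plain-Ruby illustration under that assumption. The Hash layout and the helper names `firewall_rules` and `bandwidth_requirements` are hypothetical and are not part of Weaver.

```ruby
# Hypothetical link records mirroring the two network_link examples above.
LINKS = [
  { id: "ihs_to_was_cluster1", source: "ihs_server", target: "clust1", bandwidth_mbps: 500 },
  { id: "was_cluster1_to_db2", source: "clust1", target: "dB2_server", allow_tcp: [50001] }
].freeze

# Turn each allowed port into one firewall rule; links without ports yield none.
def firewall_rules(links)
  links.flat_map do |link|
    link.fetch(:allow_tcp, []).map do |port|
      { from: link[:source], to: link[:target], proto: :tcp, port: port }
    end
  end
end

# Collect bandwidth reservations keyed by the (source, target) pair.
def bandwidth_requirements(links)
  links.select { |link| link.key?(:bandwidth_mbps) }
       .map { |link| [[link[:source], link[:target]], link[:bandwidth_mbps]] }
       .to_h
end

puts firewall_rules(LINKS).inspect
puts bandwidth_requirements(LINKS).inspect
```

Keeping the requirement on the link rather than on either endpoint means a firewall inserted between the two nodes can be configured from the same data.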

Software resources

Weaver represents software components that are present or should be provisioned in a node using the component resource. Each component groups properties specific to a software product as a unit. The logic to install and/or configure one or more software components is described using an automation. An automation may be associated either with a component or directly with a node. An automation describes an interface, not actual logic; that is, it describes the input and output parameters of the logic. The values of input parameters may be defined in terms of other blueprint properties.
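The idea that an automation is an interface rather than logic can be illustrated with a small plain-Ruby structure. This is only a sketch: `AutomationInterface` and `missing_inputs` are hypothetical names, and treating ihs_node_name as an output of ihs_role is an assumption made for the example.

```ruby
# Hypothetical description of an automation's interface (no install logic here).
AutomationInterface = Struct.new(:name, :inputs, :outputs) do
  # Input parameters that have not been given a value yet.
  def missing_inputs(assigned)
    inputs - assigned.keys
  end
end

# ihs_role needs the DMGR hostname; its outputs are assumed for illustration.
IHS_ROLE = AutomationInterface.new(:ihs_role, [:dmgr_hostname], [:ihs_node_name])

puts IHS_ROLE.missing_inputs({}).inspect
puts IHS_ROLE.missing_inputs({ dmgr_hostname: "dmgr.example.test" }).inspect
```

A description of this kind lets a runtime decide whether an automation can start without knowing anything about the script behind it.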

Property value assignments

A key feature of the Weaver language is its ability to define property values in terms of others. This avoids the need to repeat values in multiple places, thereby reducing errors. The relationship between property values, which may be a computation, is explicitly described in Weaver using a late_binding expression. The Weaver runtime evaluates these expressions, which are Ruby code that may reference Weaver objects using a dotted notation, as late as possible. For example, to configure the IHS server for IBM Connections (using the ihs_role automation), the hostname of the DMGR is required. This can be expressed as

    ihs_role.dmgr_hostname late_binding { was_dmgr.hostname }

Evaluation may be delayed until after deployment has commenced. For example, if an expression depends on properties set at deployment time, such as the assignment of a hostname or an IP address, the Weaver runtime ensures that the expression is evaluated only after the property value is available.

In addition to an expression that depends on a single property, a late_binding expression may refer to multiple properties. In particular, Weaver supports an all() method that indicates that an array of values, one from each node in a node group, is desired. For example, to configure the DMGR node (using the dmgr_role), the node names of all of the IHS servers are required:

    dmgr_role.dmgr_ihs_nodenames late_binding { all(ihs_server.ihs_role.ihs_node_name) }
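The deferred evaluation described in this subsection can be approximated in plain Ruby by storing a block and calling it only when a value is requested. The sketch below is an assumption about how such a mechanism might look, not Weaver's code: `LateBinding`, `PropertyStore`, `Unresolved`, the string keys, and the host names are all hypothetical.

```ruby
# Raised when a late-bound value is requested before it exists.
class Unresolved < StandardError; end

# Hypothetical stand-in for a late_binding expression: a block kept unevaluated.
class LateBinding
  def initialize(&expr)
    @expr = expr
  end

  # Evaluate against the current property store; nil means "not ready yet".
  def resolve(props)
    @expr.call(props)
  rescue Unresolved
    nil
  end
end

# Property store whose reads fail until the runtime has assigned a value.
class PropertyStore
  def initialize
    @values = {}
  end

  def set(key, value)
    @values[key] = value
  end

  def get(key)
    @values.fetch(key) { raise Unresolved, key.to_s }
  end

  # Analogue of all(): one value per member of a node group.
  def all(group, count, prop)
    (0...count).map { |i| get("#{group}[#{i}].#{prop}") }
  end
end

PROPS = PropertyStore.new
DMGR_HOST = LateBinding.new { |p| p.get("was_dmgr.hostname") }
IHS_NAMES = LateBinding.new { |p| p.all("ihs_server", 2, "ihs_node_name") }

BEFORE = DMGR_HOST.resolve(PROPS) # hostname not assigned yet
PROPS.set("was_dmgr.hostname", "dmgr.example.test")
PROPS.set("ihs_server[0].ihs_node_name", "ihsNode0")
PROPS.set("ihs_server[1].ihs_node_name", "ihsNode1")
AFTER = DMGR_HOST.resolve(PROPS)
puts [BEFORE.inspect, AFTER, IHS_NAMES.resolve(PROPS).inspect].join(" | ")
```

The same expression yields nothing before the hostname is assigned and the hostname afterwards, which is the behavior a runtime needs in order to wait for values set at deployment time.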

Further, a late_binding expression may include any Ruby code. For example, to assign the name of a server, one can compose the name from three separate properties using Ruby string interpolation as follows:

    name late_binding { "Conns#{short_cluster_name}Srv#{multiplicity_index}#{config[:instance_tag]}" }

The Weaver runtime inspects late_binding expressions to identify dependencies between properties. It then deploys or updates an environment in an order that ensures that the dependencies are satisfied.
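Ordering work so that dependencies are satisfied is a topological sort. The sketch below uses Ruby's standard TSort module on a hand-written dependency map. The source does not say how the Weaver runtime computes its order, so this is a swapped-in illustration, and `PropertyGraph` and the contents of `DEPS` are hypothetical.

```ruby
require "tsort"

# Hypothetical dependency map: each property lists the properties that its
# late_binding expression reads. The entries are illustrative only.
class PropertyGraph
  include TSort

  def initialize(deps)
    @deps = deps
  end

  # TSort hook: enumerate every node in the graph.
  def tsort_each_node(&block)
    @deps.each_key(&block)
  end

  # TSort hook: enumerate the nodes that must come before `node`.
  def tsort_each_child(node, &block)
    @deps.fetch(node, []).each(&block)
  end
end

DEPS = {
  "ihs_role.dmgr_hostname"       => ["was_dmgr.hostname"],
  "dmgr_role.dmgr_ihs_nodenames" => ["ihs_role.ihs_node_name"],
  "was_dmgr.hostname"            => [],
  "ihs_role.ihs_node_name"       => []
}.freeze

# tsort lists every property after the properties it depends on.
ORDER = PropertyGraph.new(DEPS).tsort
puts ORDER
```

If two properties depend on each other, `tsort` raises `TSort::Cyclic`, which gives a natural point at which to reject a blueprint whose bindings cannot be ordered.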

Modularity

Modularity enables the specification of smaller building blocks as separate programs that can be composed into a complete blueprint. This enables collaboration, sharing, and reuse. In particular, it supports collaborative blueprint development and maintenance between development and operations teams. To support modularity, Weaver has several mechanisms, described below, that allow resources to be defined in separate files.

It is common in Weaver to define an application topology in one file, the underlying infrastructure in a second file, and the mapping between the two in a final file. For example, the left half of Figure 4 shows a portion of the IBM Connections application definition. Here, three applications (web, wikis, and blogs) are shown hosted on three nodes: frontend_node, wikis_node, and blogs_node, respectively. On the right of Figure 4 is a portion of the infrastructure model from Figure 1. Finally, the arrows show a mapping between the nodes in the application model and the nodes in the infrastructure model. Weaver supports such modular design by allowing files to be imported (cf. keyword import), by which means the definitions in the imported file are available to be used. In our IBM Connections environment, the application and infrastructure models are imported as follows:

    import 'connections-app.weaver'
    import 'connections-pattern.weaver'

Once imported, any resource definitions are available to be used, possibly multiple times, via the use keyword. As they are used, resources and attributes can be renamed for convenience. Finally, resources can be mapped to each other using the realizes keyword.

    realizes connections.front_end => pattern.ihsserver
    realizes connections.blogsapp => pattern.clust1
    realizes connections.wikisapp => pattern.clust1

Figure 4

Example realization mapping an application view of nodes (on the right) to infrastructure nodes (on the left). Mapping may be many-to-one.
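The many-to-one character of the mapping can be seen by inverting it. The following plain-Ruby sketch records the three realizes statements as a Hash and groups application nodes by the infrastructure node that hosts them. The Hash representation and the constant names are hypothetical and say nothing about how Weaver stores realizations.

```ruby
# Hypothetical record of the three realizes statements above:
# application node => infrastructure node.
REALIZATIONS = {
  "connections.front_end" => "pattern.ihsserver",
  "connections.blogsapp"  => "pattern.clust1",
  "connections.wikisapp"  => "pattern.clust1"
}.freeze

# Invert the mapping: infrastructure node => application nodes it realizes.
HOSTED = REALIZATIONS.group_by { |_app, infra| infra }
                     .transform_values { |pairs| pairs.map(&:first) }

HOSTED.each { |infra, apps| puts "#{infra} hosts #{apps.join(', ')}" }
```

Here pattern.clust1 ends up hosting both the blogs and wikis application nodes, while each application node still has exactly one target.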


Realization mappings are valid when the properties between the source and target are compatible [15] and may be many to one (as in Figure 4), indicating that several application servers may be realized by a single physical server.

In addition to importing files, Weaver also supports the notion of including (keyword include) files, whereby the contents of the included file are incorporated directly by value. For example, each automation can be defined in a separate "automation" file. This can then be included into the node on which it should be executed. For example, the automation :was_role is added to the node clust1 by including the file containing its definition:

    node("clust1") {
      include '../automation/connections_was_role.weaver'
    }

In this case, the automation connections_was_role defined in the included file can be referenced directly (there is no need to use or realize it).

Extensibility

Extensibility is a key feature of a flexible language and its runtime, allowing a blueprint to reference multiple target platform providers and automation approaches. As new platforms become available, they can be integrated immediately. Resources targeted for different platforms may support platform-specific extensions. For example, a network link could express a bandwidth reservation on a target cloud that supports it. Furthermore, the runtime should have well-defined plugin interfaces that can be used by new target platforms and automation providers, so they can extend the functionality of the runtime. Weaver supports two forms of extensibility: an ability to add support for additional cloud platforms (platform providers) and an ability to support external services.

Platform providers

Nodes in Weaver reference a provider. Referencing a provider within a Weaver resource makes the provider responsible for managing the creation, querying, and deletion of the resource and any contained Weaver resources. The provider also brings to the resource the provider's view of the resource's metadata; that is, provider-specific properties and validations are added to the resource. For example, a node that references the OpenStack provider is associated with an OpenStack image, a flavor (VM size), a security key pair, a list of security groups, and user data. On the other hand, a node referencing the SDI Controller [7] (used for IBM Connections) has additional properties for CPU count and memory requirements. Such properties need not be single-valued; for example, the SDI Controller provider supplies a provider-specific type definition for a redundancy constraint. This constraint can be used in a blueprint to specify how VMs should be distributed among physical machines and racks. The SDI Controller takes these requirements into account when creating VMs. For example, to specify that VMs should be on different physical machines and span at least two racks,

    sdi_redundancy_constraint(:llmn) {
      spread_across_at_least({ :rack => 2 })  # at least two racks
      all_different :compute_node             # each VM on a different compute node
    }
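What the two clauses of this constraint require of a placement can be spelled out as two predicates. The plain-Ruby sketch below checks a candidate placement against them. It illustrates the constraint's meaning as described in the text and is not the SDI Controller's placement logic: the placement data and all method names are hypothetical.

```ruby
# Hypothetical placement: VM name => where a scheduler put it.
PLACEMENT_OK = {
  "C1_M0" => { rack: "r1", compute_node: "h1" },
  "C1_M1" => { rack: "r1", compute_node: "h2" },
  "C1_M2" => { rack: "r2", compute_node: "h3" },
  "C1_M3" => { rack: "r2", compute_node: "h4" }
}.freeze

# spread_across_at_least(:rack => n): the VMs occupy at least n distinct racks.
def spread_across_at_least?(placement, key, minimum)
  placement.values.map { |loc| loc[key] }.uniq.size >= minimum
end

# all_different :compute_node: no two VMs share the same value for the key.
def all_different?(placement, key)
  values = placement.values.map { |loc| loc[key] }
  values.uniq.size == values.size
end

# Both clauses of the example constraint must hold.
def satisfies_redundancy?(placement)
  spread_across_at_least?(placement, :rack, 2) &&
    all_different?(placement, :compute_node)
end

puts satisfies_redundancy?(PLACEMENT_OK)
```

A placement that puts two VMs on the same compute node, or all VMs in one rack, fails the check.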

The Weaver runtime delegates creation, querying, and deletion of resources to the provider. To implement these, the provider declaration typically requires an endpoint and credentials. For example, for the SDI Controller:

    provider(:sdicontroller) {
      sdi_endpoint 'http://<endpoint server>:5006/'
      sdi_username 'test'
      sdi_password 'passw0rd'
    }

This approach allows the blueprint developer to write code with less attention to the target platform interfaces, confident that the provider-specific extensions and metadata are available if needed.
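The delegation of creation, querying, and deletion to a provider can be sketched as a small interface with an in-memory implementation. This is a hypothetical illustration, not Weaver's plugin API: the class `InMemoryProvider`, its method signatures, the example endpoint host, and the property names cpu_count and memory_mb are all assumptions.

```ruby
# Hypothetical provider that keeps resources in memory; a real provider
# would call the target cloud's API using the endpoint and credentials.
class InMemoryProvider
  def initialize(endpoint:, username:, password:)
    @endpoint = endpoint
    @credentials = [username, password]
    @resources = {}
  end

  # Create a resource and return its recorded state.
  def create(id, properties)
    @resources[id] = properties.merge(state: :active)
  end

  # Return the recorded state, or nil if the resource does not exist.
  def query(id)
    @resources[id]
  end

  # Delete a resource; true if something was removed.
  def delete(id)
    !@resources.delete(id).nil?
  end
end

SDI = InMemoryProvider.new(
  endpoint: "http://sdi.example.test:5006/", # placeholder host, not from the source
  username: "test",
  password: "passw0rd"
)

SDI.create("clust1-0", { cpu_count: 2, memory_mb: 4096 })
puts SDI.query("clust1-0").inspect
```

Because the runtime only needs these three operations, a provider for another cloud can be substituted without changing the blueprint.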

External services

External services can be described using the service resource type. As part of a service declaration, create and delete methods can be defined in Ruby that allow interaction with the service when the environment is created or destroyed. As with other resource types, they may be defined in separate files and included in a final blueprint definition.

Validations and rules

A blueprint should be able to describe that certain conditions and rules hold at different points of an environment's lifecycle, such as prior to deployment, immediately after deployment, or on an ongoing basis after deployment. Such conditions and rules allow a blueprint to be validated for correctness both before and after being applied. For example, a useful check to make prior to deploying a blueprint is whether the specified image is available in the target cloud environment. After deployment, one might check whether the desired firewall rules have been correctly applied. On an ongoing basis, one might want to verify that a certain set of processes are running on each deployed VM. Weaver supports such checks using validators (keyword validator) and rules (keyword rule).


Validators can be attached to Weaver types and properties. Weaver supplies basic validators, including checks for data type, numeric range, and set membership. Cloud providers can define additional validators using Ruby code. Blueprint authors can also define validations. For example, below is a validator to check, prior to deployment, if a TCP connection can be established to a node on a certain port within a given time:

    validator :networkconnectable => :attribute, :phase => :predeployment do |port, timeout|
      ip_addr = self.value
      begin
        # Fail if the connection cannot be established within the timeout
        Timeout::timeout(timeout) { TCPSocket.new(ip_addr, port).recvfrom(0) }
      rescue Exception => ex
        report(Logger::ERROR, "Cannot reach #{ip_addr}:#{port} -- #{ex.message}", self)
      end
    end
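The basic validators mentioned above (data type, numeric range, and set membership) and the notion of a validation phase can be sketched in plain Ruby. This is an illustration, not Weaver's validator API: the lambda table, the `CHECKS` list, the allowed ranges and values, and the `validate` method are hypothetical.

```ruby
# Hypothetical basic validators: each returns nil on success or an error string.
BASIC_VALIDATORS = {
  type:   ->(value, klass)   { value.is_a?(klass) ? nil : "expected #{klass}, got #{value.class}" },
  range:  ->(value, range)   { range.cover?(value) ? nil : "#{value} outside #{range}" },
  one_of: ->(value, allowed) { allowed.include?(value) ? nil : "#{value.inspect} not in #{allowed.inspect}" }
}.freeze

# Checks attached to properties, each tagged with the phase in which it runs.
# The limits and allowed values are made up for the example.
CHECKS = [
  { property: :multiplicity, phase: :predeployment, validator: :type,   arg: Integer },
  { property: :multiplicity, phase: :predeployment, validator: :range,  arg: 1..16 },
  { property: :fstype,       phase: :predeployment, validator: :one_of, arg: %w[ext3 ext4] }
].freeze

# Run every check registered for the given phase and collect error messages.
def validate(properties, phase)
  CHECKS.select { |check| check[:phase] == phase }.map do |check|
    error = BASIC_VALIDATORS.fetch(check[:validator]).call(properties[check[:property]], check[:arg])
    "#{check[:property]}: #{error}" if error
  end.compact
end

puts validate({ multiplicity: 4, fstype: "ext3" }, :predeployment).inspect
puts validate({ multiplicity: 0, fstype: "xfs" }, :predeployment).inspect
```

Tagging each check with a phase lets the same table serve predeployment checks and later ones, with the runtime selecting the subset to run at each point in the lifecycle.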

Rules are a second way to express validations. They evaluate a where clause to determine their scope and execute a body that implements the validation. Both the where clause and the body are expressed in Ruby.
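The source does not show rule syntax, so the following plain-Ruby sketch only illustrates the where-then-body evaluation order just described. `Rule`, `violations`, and the sample resources are hypothetical, and the security-group policy is invented for the example.

```ruby
# Hypothetical rule: a where clause selects resources, a body validates each one.
Rule = Struct.new(:name, :where, :body) do
  # Return the ids of in-scope resources for which the body does not hold.
  def violations(resources)
    resources.select { |r| where.call(r) }
             .reject { |r| body.call(r) }
             .map { |r| r[:id] }
  end
end

RESOURCES = [
  { id: "clust1",      type: :node,    security_group_ids: ["test"] },
  { id: "ihs_server",  type: :node,    security_group_ids: [] },
  { id: "db2_storage", type: :storage }
].freeze

# Scope: every node. Validation: it lists at least one security group.
NODES_NEED_GROUP = Rule.new(
  "every node has a security group",
  ->(r) { r[:type] == :node },
  ->(r) { !r[:security_group_ids].empty? }
)

puts NODES_NEED_GROUP.violations(RESOURCES).inspect
```

The storage resource is never passed to the body because the where clause excludes it, which is why the body can assume node-only properties.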

Discussion

Using Weaver, we developed a blueprint for IBM Connections (our motivating case study) matching the physical topology shown in Figure 1. Examples of the Weaver code implementing the blueprint were presented above. Before we started designing the blueprint, a silent installer (bash script) was already available for IBM Connections. However, since Weaver currently supports only Chef cookbooks [3] as the automation mechanism, we wrote a Chef cookbook to wrap the silent installer. Furthermore, since no automation code existed to form a WAS cell, to configure security, or to execute several IBM Connections post-install steps, we also wrote Chef cookbooks to automate these steps. We associated all of the Chef cookbooks with the blueprint defined in Weaver. The steps involved are depicted in Figure 2.

We demonstrated that the Weaver runtime was able to repeatedly and reliably deploy IBM Connections to an OpenStack cloud using the SDI Controller platform provider. This provider supports VM placement policies, allowing the Weaver blueprint to express anti-collocation constraints for the compute resources from different clusters, which is required for IBM Connections to achieve high availability.

Our successful case study involving a complex system clearly demonstrates the value of Weaver. In the process, we learned several lessons through our experience that further substantiate the benefits accrued by employing Weaver to realize software defined environments.

It became evident that supporting a blend of imperative and declarative code is critical for writing concise, yet expressive, Weaver blueprints. Describing systems with a large number of nodes where each node has a property that needs to be assigned a unique value can be easily achieved by using native Ruby constructs to loop through each node, assigning the property values as needed via, for instance, string interpolation.

Another important feature of Weaver is the ability to automatically detect dependencies between nodes by analyzing the assignment of values to properties of automations associated with nodes. In blueprints of large systems, specifying such dependencies manually would be tedious and error-prone. This is even more critical when a pair of nodes depend on each other at different stages of their automation (Chef recipes) execution. For example, the arrows in Figure 2 represent such data dependencies. Not only does Weaver detect such data dependencies, but its runtime also propagates parameter values from one node’s automation to another’s (via a coordinator) at execution time. In other words, synchronization between nodes is implicitly expressed by the blueprint and enforced by the runtime.

Supplementing its data dependency detection, Weaver also allows a blueprint to explicitly define control dependencies between nodes, between automations, or between nodes and automations. We encountered a few scenarios, reported by our users, where this feature was needed. Note that Weaver allows the parallel execution of automation sections that are not affected by the explicit dependencies.

Weaver’s support for modularity is crucial for large systems. Writing a blueprint for a large system can be accomplished by having different people write different portions of the blueprint and different automations according to their expertise. This allows a team to make rapid progress developing the initial blueprint and maintaining it. As described above, one pattern we encourage Weaver users to adopt is to split the blueprint into (at least) three distinct documents: application, infrastructure, and environment. The application definition can be reused across different infrastructures, whereas the application-infrastructure mappings are expressed in a specific environment document. In some cases, even the infrastructure can be reused across different environments, in which case only the node mappings or the used automation (and corresponding parameters) would differ.

Another area where Weaver turned out to be extremely valuable in the IBM Connections case was debugging. The existing automation code had never been tested when nodes join a WebSphere cell in parallel. This is not surprising, as such a test is tedious if done manually. Weaver helped us quickly detect serious automation bugs in this regard. In addition, Weaver allowed us to easily apply the debugging technique of starting two instances of the same application in two different environments (each with differing OpenStack networking code) so that we could observe the differences in behavior. Finally, Weaver validation is a powerful mechanism to detect problems early, before a lengthy deployment is attempted. For example, there is no point in starting a deployment if a needed NFS server has been shut down, especially when the failure would manifest itself 2 hours into the deployment. A simple validator that checks if required external components are operational can save time.
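
The kind of pre-deployment check described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not Weaver's own validation syntax, which is not reproduced here. The names `tcp_reachable?`, `unreachable_services`, and `REQUIRED_SERVICES`, as well as the host names and the service list, are hypothetical.

```ruby
require "socket"

# Hypothetical helper (not part of Weaver): true if a TCP connection to
# host:port can be opened within `timeout` seconds, false otherwise.
def tcp_reachable?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Returns the names of the required services that fail the probe.
# `probe` is injectable so the check can be exercised without a network.
def unreachable_services(services, probe: method(:tcp_reachable?))
  services.reject { |s| probe.call(s[:host], s[:port]) }
          .map { |s| s[:name] }
end

# Illustrative inventory; hosts are placeholders, ports are the
# conventional NFS and LDAP ports.
REQUIRED_SERVICES = [
  { name: "nfs",  host: "nfs.example.test",  port: 2049 },
  { name: "ldap", host: "ldap.example.test", port: 389 }
].freeze
```

A deployment driver could call `unreachable_services(REQUIRED_SERVICES)` first and stop if the returned list is not empty, so that a missing NFS server is reported in seconds rather than hours into the run.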

Related work

Many efforts have been made to improve the mechanisms used to reliably configure and deploy IT systems. We can separate these approaches into two classes: (1) graphical representation of the IT systems through a formal model and (2) infrastructure-as-code (IaC).

Rational Software Architect’s Deployment Model (RSA-DM) [14] represents an approach of the first class, wherein a graphical representation is created by and rendered from Extensible Markup Language (XML) substitution groups to represent the IT system under development. RSA-DM defines a detailed type system for representing configuration elements. The type system can be validated against a set of constraints and be used with a goal-based reasoning engine to generate the expected set of automated steps required to configure the running system.

While the model-driven, graphical representation approach was appealing to some people, system administrators have mostly gravitated toward IaC frameworks. Cfengine [12] was a pioneer in this arena, proposing a new language to facilitate the automation of common system administration tasks and an integrated environment for their specification and execution. Cfengine paved the way for modern IaC frameworks such as Chef [3] and Puppet [4], whose philosophy is to constantly make all managed servers converge to a desired state that is expressed in terms of their DSL and stored in a central repository. Typically, a deployment and configuration automation in Cfengine, Chef, and Puppet is expressed in terms of idempotent units of work. Similar in flavor to these IaC frameworks is LCFG (local configuration system) [16], which also proposes a language and runtime for automatic configuration and installation of servers.

Weaver deals with system deployment and configuration through a combination of model-driven and IaC approaches, defining what we refer to as environment-as-code. A Weaver environment binds, via the mechanism of realization (also used by RSA-DM), a description of an application to that of the infrastructure on which the application will run. Weaver further associates with the environment all automation responsible for installing and configuring the application onto the target infrastructure. Unlike IaC frameworks, the Weaver runtime orchestrates the deployment and configuration of distributed applications, cross-configuring software components and middleware by propagating parameter values across nodes and observing the temporal order of concurrent configuration operations.

Amazon CloudFormation (CFN) [17] and OpenStack Heat [18] are both approaches for orchestrating the deployment of cloud resources, with Heat being a re-implementation of Amazon’s CFN concepts for OpenStack. Both approaches take a declarative input template, either in JSON (JavaScript Object Notation) or, for Heat only, YAML, and provision a “stack” given the resources specified in the input. The input format is monolithic and does not allow the specification of software components (only hooks for arbitrary scripts). Modularity of the input can only be realized by leveraging nested stacks. A key feature of Weaver over CFN is that it provides “language” capabilities based on its DSL, which are relevant for modularity and for the agility to develop, validate, and debug blueprints effectively. Weaver’s modular input can then be compiled to CFN format for provisioning.

SmartFrog [19], Engage [20], and AESON (Activation Engine on Service Overlay Network) [21] are systems supporting distributed application deployment and lifecycle management; like Weaver, all three take into account multi-node dependencies. Also similar to Weaver, SmartFrog and Engage can validate descriptions of the managed environment. Unlike these systems, however, Weaver completely decouples deployment/configuration automation from the environment definition. This characteristic gives Weaver more agility when describing different types of environment, for instance, testing, integration, and production.

Given its inherent complexity, automation of deployment and configuration is still an active area of research. Some efforts explore configuration optimization and validation in the context of declarative system administration [13], mix declarative and workflow-based automation [22], underscore the importance of cross-node constraints in new DSLs [23], and propose systems for deployment and configuration of multi-node services and applications [14, 24, 25]. Weaver focuses on tying together all automation logic as part of an environment description, whereas CDE [26] avoids the automation problem by capturing an executable package containing the code, data, and environment needed by an application. Such a package is captured by monitoring the installation and configuration of the target application.

Conclusion

In this paper, we proposed and presented Weaver, a language for codifying environment blueprints. Through a case study involving a complex enterprise social computing platform, IBM Connections, we demonstrated that the Weaver language and its runtime can facilitate the deployment of complex systems. Weaver’s environment-as-code approach takes IaC a step further, by binding together all relevant building blocks that constitute an environment: application, infrastructure, and automation. Unifying the management of all these elements in such a systematic way is a prerequisite for a successful DevOps experience as it encourages tighter collaboration between development and operations teams.

*Trademark, service mark, or registered trademark of International Business Machines Corporation in the United States, other countries, or both.

**Trademark, service mark, or registered trademark of OpenStack Foundation or Apache Software Foundation in the United States, other countries, or both.

References

1. J. Humble and D. Farley, Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation. Reading, MA, USA: Addison-Wesley, 2010.

2. J. Humble and J. Molesky, “Why enterprises must adopt devops to enable continuous delivery,” Cutter IT J., vol. 24, no. 8, p. 6, Aug. 2011.

3. Opscode, Chef. [Online]. Available: http://www.opscode.com/chef/

4. Puppet Labs, Puppet. [Online]. Available: https://puppetlabs.com/

5. IBM Corporation, IBM Connections. [Online]. Available: http://www-03.ibm.com/software/products/us/en/conn

6. M. Fowler, Domain Specific Languages. Reading, MA, USA: Addison-Wesley, 2011.

7. W. C. Arnold, D. J. Arroyo, W. Segmuller, M. Spreitzer, M. Steinder, and A. N. Tantawi, “Workload orchestration and optimization for software defined environments,” IBM J. Res. Dev., vol. 58, no. 2/3, Paper 11, 2014 (this issue).

8. OpenStack. [Online]. Available: http://www.openstack.org/

9. Amazon Elastic Compute Cloud. [Online]. Available: http://aws.amazon.com/ec2

10. Apache, Zookeeper. [Online]. Available: http://zookeeper.apache.org

11. Apache CouchDB. [Online]. Available: http://couchdb.apache.org

12. M. Burgess, “Cfengine: A site configuration engine,” USENIX Comput. Syst., vol. 8, no. 3, pp. 309–337, 1995.

13. J. Hewson, P. Anderson, and A. Gordon, “A declarative approach to automated configuration,” in Proc. 26th USENIX LISA, San Diego, CA, USA, 2012, pp. 51–66.

14. T. Eilam, M. Elder, A. Konstantinou, and E. Snible, “Pattern-based composite application deployment,” in Proc. IFIP/IEEE Int. Symp. Integr., Netw. Manag. (IM), Dublin, Ireland, 2011, pp. 217–224.

15. W. Arnold, T. Eilam, M. Kalantar, A. Konstantinou, and A. Totok, “Automatic realization of SOA deployment patterns in distributed environments,” in Proc. ICSOC, 2008, pp. 162–179, LNCS 5364.

16. P. Anderson and A. Scobie, “LCFG: The next generation,” in Proc. UKUUG LISA Winter Conf., 2002, pp. 1–9.

17. Amazon CloudFormation. [Online]. Available: http://aws.amazon.com/cloudformation/

18. OpenStack Heat. [Online]. Available: https://wiki.openstack.org/wiki/Heat

19. P. Goldsack, J. Guijarro, S. Loughran, A. Coles, A. Farrell, A. Lain, P. Murray, and P. Toft, “The SmartFrog configuration management framework,” SIGOPS Oper. Syst. Rev., vol. 43, no. 1, pp. 16–25, Jan. 2009.

20. J. Fischer, R. Majumdar, and S. Esmaeilsabzali, “Engage: A deployment management system,” in Proc. PLDI, Beijing, China, 2012, pp. 263–274.

21. D. Jayasinghe, F. Oliveira, F. Rosenberg, and T. Eilam, “AESON: A model-driven and fault tolerant composite deployment runtime for IaaS clouds,” in Proc. IEEE Int. SCC, Santa Clara, CA, USA, 2013, pp. 575–582.

22. H. Herry, P. Anderson, and G. Wickler, “Automated planning for configuration changes,” in Proc. 25th USENIX LISA, Boston, MA, USA, 2011, pp. 1–12.

23. T. Delaet and W. Joosen, “PoDIM: A language for high-level configuration management,” in Proc. 21st USENIX LISA, Dallas, TX, USA, 2007, pp. 261–273.

24. X. Etchevers, T. Coupaye, F. Boyer, and N. de Palma, “Self-configuration of distributed applications in the cloud,” in Proc. IEEE CLOUD, Washington, DC, USA, 2011, pp. 668–675.

25. J. Kirschnick, J. Calero, P. Goldsack, A. Farrell, J. Guijarro, S. Loughran, N. Edwards, and L. Wilcock, “Towards an architecture for deploying elastic services in the cloud,” Softw.: Practice Experience, vol. 42, no. 4, pp. 395–408, Apr. 2012.

26. P. Guo, “CDE: Run any Linux application on-demand without installation,” in Proc. 25th USENIX LISA, Boston, MA, USA, 2011, p. 2.

Received August 15, 2013; accepted for publication September 18, 2013

Michael H. Kalantar IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA. Dr. Kalantar graduated from Cornell University, has taught at Shandong and Shiyou Universities, and now works at IBM Research. His research interests are system management and distributed systems.

Florian Rosenberg IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA. Dr. Rosenberg is a Research Staff Member and a manager of the Model-Driven Management Technologies Department. He received his Ph.D. degree in computer science from the Vienna University of Technology in 2009. His research interests include software engineering, systems management, and distributed systems.

Jim Doran IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA. Mr. Doran is a Distinguished Engineer at IBM Research.

Tamar Eilam IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA. Dr. Eilam is a Research Staff Member at IBM Research. She is the senior manager of the Virtualized and Cloud Infrastructure Management Group.

Michael D. Elder IBM Software Group, Durham, NC 27703 USA. Mr. Elder is an IBM Senior Technical Staff Member and an IBM Software Group Master Inventor focused on IBM products in the DevOps space. He holds a B.S. degree in computer science from Furman University and an M.S. degree in computer science from the University of North Carolina–Chapel Hill. His passions include solving customer problems to enable them to provide better experiences for their users.

Fabio Oliveira IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA. Dr. Oliveira is a Research Staff Member of the Model-Driven Management Technologies Department at IBM Research. He earned a Ph.D. degree in computer science from Rutgers University in 2010. His research interests include systems management, distributed systems, and operating systems.


Edward C. Snible IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA. Mr. Snible is a software engineer and member of the Model-Driven Management Technologies Department. His research interests include visualization of software deployments and the detection of errors in distributed systems.

Tova Roth IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA. Ms. Roth is a member of the Model-Driven Management Technologies Department at IBM Research. Her research interests include software engineering and model-based engineering.