CASE STUDY: BETFAIR'S APP CLOUD POWERED BY CLOUDSOFT'S AMP

"APP CLOUD HAD TO OFFER AN INCREDIBLY EASY-TO-USE SOFTWARE DEVELOPMENT LIFECYCLE, PROVIDE FOOL-PROOF SECURITY, A HIGHLY RELIABLE RUN-TIME ENVIRONMENT WITH CONSISTENT HIGH-PERFORMANCE, AND ALL AT ULTRA LOW-COST."

CAREL VOSLOO
BETFAIR APP CLOUD DIRECTOR

Organization: Betfair
Business: Online sports betting
Headquarters: London
Geographies: Global - UK and International

Betfair has used Cloudsoft's Application Management Platform (AMP) to provide the deployment, orchestration, and management capabilities required by their App Cloud service (cloud.betfair.com). App Cloud enables exciting new revenue opportunities for Betfair's customers, and AMP allows Betfair to provide this service with faster delivery, greater reliability, and lower costs.

THE ADVENT OF A NEXT GENERATION APPLICATION PLATFORM

Betfair's Betting Exchange has fundamentally changed sports betting by providing a venue where customers can come together to bet at mutually agreed prices. This eliminates the need for a traditional bookmaker and creates significantly better odds for customers.

The Exchange also facilitates efficient trading strategies that can lock in profits or cut losses throughout sporting events.

As a result of these innovations, Betfair has acquired the world's largest betting community and processes more than seven million transactions every day, more than all European stock exchanges combined.

Underpinning the Exchange is a unique combination of technologies that enable rapid innovation of highly secure applications which conform to all jurisdictional requirements. For example, Betfair's API (application programming interface) allows customers and software developers to securely interact with the Exchange, which has proven so successful that over 60,000 Betfair customers use non-Betfair interfaces to bet on the Exchange.

Building on this success, in 2012 Betfair launched the Betfair App Cloud.

BETFAIR’S APP CLOUD

App Cloud is a new service enabling Betfair’s customers to create their own unique sports-betting applications and host them on Betfair’s infrastructure. App Cloud is compelling for several reasons and particularly because it takes care of an application’s non-functional requirements, thereby making applications dramatically faster and less expensive to develop, deploy and manage.

Betfair’s Carel Vosloo, who created the App Cloud vision, set a clear design brief: “App Cloud had to offer an incredibly easy-to-use software development lifecycle, provide fool-proof security, a highly reliable run-time environment with consistent high-performance, and all at ultra low-cost”. The only realistic way to achieve all these objectives simultaneously and at scale was through dynamic automation. As Carel explains: “We needed App Cloud to be almost entirely self-managing and self-optimizing, which is why we chose Cloudsoft’s Application Management Platform”.

Additionally, App Cloud had to be portable across Betfair’s internal infrastructure and public cloud providers. This requirement was also met by Cloudsoft’s AMP.

CLOUDSOFT’S APPLICATION MANAGEMENT PLATFORM

AMP is open-source software that adds a policy-driven control-plane and cloud-portability to applications and application platforms.

In practice, AMP’s policy-driven control plane enables App Cloud and the applications it hosts to become self-managing and self-optimizing across machines, locations and clouds, adapting automatically to unpredictable changes while removing complexity for operators and users.

(For further details, please see the separate AMP whitepaper).

APP CLOUD TECHNOLOGY STACK

Security, efficiency, and reliability were the key drivers for the design decisions behind App Cloud, with AMP taking responsibility for the initial deployment and ongoing changes needed to meet these goals.

Within App Cloud each customer is managed as a tenant, and each tenant’s applications run on a pool of dedicated security-hardened app-servers. Each app-server runs in its own LXC container in order to ensure application isolation with minimum overhead. Multiple LXC containers are hosted in each Linux virtual machine in order to drive up application density and hence resource utilization.

Separate environments are maintained for development, staging and live, with AMP automating deployment to the appropriate environment. Users issue requests when a deliverable is ready to move between environments, and App Cloud automates the necessary checks and approvals.

A cluster of load-balancers runs in each environment and routes requests, round-robin-style, to the tenant’s app-servers based on inspection of the request URL.
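The routing behaviour described above can be illustrated with a minimal Python sketch. It is not taken from App Cloud: the source does not name the load-balancer product or its matching rules, so the `TenantRouter` class, the longest-prefix rule, the tenant prefixes and the server addresses are all assumptions made for illustration.

```python
from itertools import cycle


class TenantRouter:
    """Hypothetical router: picks a tenant's app-server by URL prefix, round-robin."""

    def __init__(self):
        # Maps a URL prefix (e.g. "/acme/") to an endless round-robin
        # iterator over that tenant's app-server addresses.
        self._pools = {}

    def set_pool(self, prefix, servers):
        # Register (or replace) the pool of app-servers behind a prefix.
        self._pools[prefix] = cycle(list(servers))

    def route(self, path):
        # Assumption: the longest matching prefix wins.
        matches = [p for p in self._pools if path.startswith(p)]
        if not matches:
            return None  # no tenant owns this URL
        return next(self._pools[max(matches, key=len)])


# Illustrative tenants and addresses only.
router = TenantRouter()
router.set_pool("/acme/", ["10.0.0.1:8080", "10.0.0.2:8080"])
router.set_pool("/zeta/", ["10.0.1.1:8080"])
```

Successive calls to `router.route("/acme/odds")` alternate between the two "acme" servers, while a path that matches no tenant prefix returns `None`.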

Host-based port forwarding isolates all containers from each other, and firewalls surround each environment as well as the entire App Cloud platform (with VPN access required for anything other than publicly available services).

Tenants interact with App Cloud via the Git version control system and a bespoke project management system. Applications are built using a farm of Jenkins CI servers, and a successful build is pushed to an app-server in the development environment.

If this is the first app for a tenant, a new LXC container is started for that tenant, the app-server is launched, and the application is uploaded and deployed. Otherwise the app is simply deployed to the tenant’s existing app-servers. In parallel the load-balancer configuration is automatically updated with an additional routing rule for requests matching this tenant’s URL prefix.
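The deployment decision in this paragraph can be sketched in a few lines of Python. This is an illustrative model rather than App Cloud or AMP code: the `Platform` class, the container naming scheme and the tenant and application names are hypothetical, and the routing update is performed sequentially here although the text describes it as happening in parallel.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    # tenant -> list of container ids running that tenant's app-server
    containers: dict = field(default_factory=dict)
    # URL prefix -> tenant, standing in for the load-balancer configuration
    routes: dict = field(default_factory=dict)
    # (container id, app name) pairs recording what was deployed where
    deployed: list = field(default_factory=list)

    def deploy(self, tenant, app):
        if tenant not in self.containers:
            # First app for this tenant: start a container and its app-server.
            self.containers[tenant] = [f"lxc-{tenant}-1"]
        # Deploy the app to every app-server the tenant already has.
        for container in self.containers[tenant]:
            self.deployed.append((container, app))
        # Ensure a routing rule exists for the tenant's URL prefix
        # (assumed idempotent: repeated deployments do not duplicate it).
        self.routes[f"/{tenant}/"] = tenant


platform = Platform()
platform.deploy("acme", "odds-viewer")  # starts a container, then deploys
platform.deploy("acme", "bet-bot")      # reuses the existing container
```

After both calls the tenant still has a single container, which now hosts both applications, and exactly one routing rule.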

[Figure: Build Farm and three App Servers, each in its own LXC Container]

Each software component is monitored for health and performance using JMXMP for comprehensive instrumentation at the app-server tier. These metrics are used to drive the autonomic management policies for auto-scaling and failure recovery, which are described in greater detail in the “Custom Policies” section below.

BETFAIR’S SELECTION PROCESS

Betfair had already created an alpha version of App Cloud when they discovered Cloudsoft. Initial discussions indicated that Cloudsoft’s AMP had the potential to dramatically increase automation and reduce costs. Betfair commissioned a short Proof-of-Concept to retrofit AMP to App Cloud.

The PoC proved that AMP could dynamically manage the App Cloud platform, the applications running on that platform, and the App Cloud software development lifecycle.

Building on the PoC, Betfair funded a Pilot implementation (70 man-days) which added more sophisticated scaling policies, added multi-tenancy management, added automated fault-fixing, and prepared the complete solution for production deployment.

BETFAIR’S EXPERIENCE WITH AMP: CUSTOM POLICIES

Betfair had to ensure that App Cloud achieved high levels of reliability and performance, with lean and efficient use of resources. AMP’s library of powerful ready-to-use policies, and the ability to customize these for the exact requirements of App Cloud, allowed Betfair to achieve these objectives very quickly with a production-quality solution.


Betfair’s highest priority for AMP was to enable the rapid and automatic scaling of multi-tenant applications. AMP’s auto-scaler policy was the starting point for this, with its built-in algorithms for analyzing workloads, calculating thresholds, and avoiding “thrashing”.

The auto-scaler policy was tuned to monitor only those metrics that Betfair specified as accurately representing workloads - Betfair liked the way AMP could simply re-use existing metrics and did not require agents to be installed. Scaling also involved the co-ordinated addition/removal of resources together with re-configuring load-balancers to accommodate changing application footprints. Betfair liked the way AMP executed the required changes via existing control mechanisms and therefore did not require any changes either to App Cloud or to the hosted applications.
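The source does not describe the algorithms inside AMP’s auto-scaler policy, so the following Python sketch shows only one common, generic way of combining thresholds with protection against “thrashing”: the pool is resized only after the metric has stayed outside its thresholds for several consecutive samples. The `AutoScaler` class, its parameters and the sample values are assumptions, not AMP’s API.

```python
class AutoScaler:
    """Hypothetical threshold-based scaler with a stabilization window."""

    def __init__(self, low, high, min_size, max_size, stable_samples=3):
        self.low, self.high = low, high
        self.min_size, self.max_size = min_size, max_size
        self.stable_samples = stable_samples
        self.size = min_size
        # Positive: consecutive samples above `high`;
        # negative: consecutive samples below `low`.
        self._streak = 0

    def observe(self, load_per_server):
        """Feed one metric sample; return the (possibly changed) pool size."""
        if load_per_server > self.high:
            self._streak = max(self._streak, 0) + 1
        elif load_per_server < self.low:
            self._streak = min(self._streak, 0) - 1
        else:
            self._streak = 0  # back in range: forget the streak
        if self._streak >= self.stable_samples and self.size < self.max_size:
            self.size += 1
            self._streak = 0
        elif self._streak <= -self.stable_samples and self.size > self.min_size:
            self.size -= 1
            self._streak = 0
        return self.size


scaler = AutoScaler(low=20, high=80, min_size=1, max_size=3)
sizes = [scaler.observe(load) for load in [90, 90, 50, 90, 90, 90]]
```

In this run the brief dip to 50 resets the streak, so the pool grows only after three uninterrupted high samples. A single spike or dip therefore never changes the pool size.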

AUTOMATIC FAILURE RECOVERY

Betfair used an additional policy to detect and correct various problems that App Cloud may encounter.

In this use-case the policy analyzes data that indicate the health of App Cloud components in order to recognize when and where components have failed - at which point the policy determines and executes the appropriate corrective action (e.g. restart a failed container and wire it back into the configuration).

Initially this has been implemented as an open-loop system whereby operator intervention is solicited, but as behaviors are better understood Betfair will progressively move towards a fully automated, closed-loop system.

Betfair also plans to progressively enhance policies to recognize where failures are likely or imminent so that pre-emptive corrections can be taken.
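The difference between the current operator-assisted mode and the planned fully automated mode can be sketched as a single flag on a recovery routine. This Python example is illustrative only: the `recover` function, its callback parameters and the component names are hypothetical and do not represent AMP’s policy interface.

```python
def recover(health, restart, ask_operator, closed_loop=False):
    """Restart failed components; unless closed_loop, ask an operator first.

    health: dict mapping component name -> True (healthy) / False (failed)
    restart: callable(name) that performs the corrective action
    ask_operator: callable(name) -> bool, the operator's approval
    Returns the list of components that were restarted.
    """
    restarted = []
    for name, healthy in sorted(health.items()):
        if healthy:
            continue
        # Fully automated mode acts immediately; otherwise approval is needed.
        if closed_loop or ask_operator(name):
            restart(name)
            restarted.append(name)
    return restarted


# Illustrative run: two failed containers, operator approves only one.
actions = []
health = {"lxc-a": True, "lxc-b": False, "lxc-c": False}
approved = recover(health, actions.append, lambda name: name == "lxc-b")
```

Moving to the fully automated mode then amounts to calling the same routine with `closed_loop=True` once the corrective actions are trusted.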

AUTOMATED SDLC MANAGEMENT

AMP integrates with App Cloud’s workflow engine to automatically migrate customer applications from development and test to staging, and ultimately to production - taking care of provisioning and configuration requirements along the way.
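As a rough illustration of gated promotion between environments, the following Python sketch advances a deliverable one stage at a time and only when every check passes. The environment names follow the development, staging and live environments mentioned earlier; the `promote` function and the form of the checks are assumptions, since the source does not describe the workflow engine’s interface.

```python
ENVIRONMENTS = ["development", "staging", "live"]


def promote(current, checks):
    """Return the next environment if every check passes, else stay put.

    current: name of the environment the deliverable is in now
    checks: iterable of zero-argument callables returning True on success
    """
    index = ENVIRONMENTS.index(current)
    if index == len(ENVIRONMENTS) - 1:
        return current  # already live; nothing further to promote to
    if all(check() for check in checks):
        return ENVIRONMENTS[index + 1]
    return current  # at least one check or approval failed


# Hypothetical checks standing in for automated tests and approvals.
next_env = promote("development", [lambda: True, lambda: True])
```

A failed check leaves the deliverable where it is, so a request can simply be re-issued once the problem is fixed.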


LOWER COSTS

• AMP helps automate the App Cloud software development lifecycle, simplifying and accelerating code deployment into test environments and subsequently migrating applications through staging and into production.

• AMP automates even complex operational tasks, thereby greatly reducing the manpower required to manage both App Cloud and the applications it hosts.

• By applying policies to workload management, AMP automatically implements just-in-time provisioning of resources for App Cloud applications (rather than just-in-case provisioning), thereby increasing average resource utilization and driving down costs.

• AMP optimizes the multi-tenancy capabilities architected into App Cloud, further driving up resource utilization and driving down costs.


• AMP helps App Cloud to decouple applications from many non-functional requirements, which makes applications significantly easier and faster to develop.

• AMP abstracts many aspects of an application’s behavior into policies so that, for example, the service level, QoS and cost characteristics of App Cloud applications can be easily adapted as needed - even while they are running.

• AMP makes it much easier to change infrastructure providers so that App Cloud is positioned to exploit lower cost opportunities, and to exploit new technology and new geography opportunities.

IMPROVED CUSTOMER EXPERIENCE

• Reduced “friction” in the software development lifecycle for App Cloud developers.

• App Cloud end-users experience reliable, consistent performance (irrespective of changing operating conditions such as fluctuating workloads).

• If a failure occurs, App Cloud recovers more quickly and can take steps to mitigate (and potentially eliminate) its impact.

• Maintaining a lower development and operational cost-base enables lower costs for customers.

REDUCED RISKS

• AMP is open-source, thereby enabling Betfair to maintain control of their App Cloud management framework.

• AMP uses open standards, making it easier for Betfair to find staff and suppliers with the necessary skills, and helping facilitate integration with other products and technologies used by Betfair, both now and in the future.

• Cloud-portability prevents being locked-in to a cloud vendor.

• The ability to span multiple infrastructures - including on-premise resources and public clouds - makes it easier to avoid shortages in computing resource, and facilitates additional options for high availability and disaster recovery.

• AMP can enforce compliance and governance requirements via policies, thereby reducing operational, reputational, and legal risks.

• AMP automatically sets up many of the security measures required by App Cloud (e.g. AMP acts as certificate authority, assigns certificates as needed, and ensures encryption across connections).

FUTURE PLANS

Although a lot has already been achieved, Betfair plans to do a lot more with AMP, including:

• Betfair will progressively enhance their existing policies to enable increasing numbers of error conditions to be automatically handled by AMP.

• The productivity improvements provided by App Cloud have persuaded Betfair’s internal development teams to also use it.

• Cloudsoft has recently released the AMP Service Catalog which greatly simplifies the way developers can select and compose reusable services, and which can therefore simplify the way developers consume App Cloud services.

• Currently App Cloud uses the Rackspace Cloud powered by OpenStack, but there are plans to also deploy it in-house on a traditional VMware environment. AMP will enable this to be achieved without changing App Cloud, or the applications running on it.

• Betfair plans to deploy App Cloud in multiple geographies. AMP already has capabilities for optimizing applications across wide-area networks, and Cloudsoft’s roadmap includes making AMP a federated control plane, which will allow for more sophisticated and scalable optimization strategies.

• A legacy from App Cloud’s original implementation is the use of BPEL (Business Process Execution Language) to control aspects of the software development lifecycle. As BPEL is less aligned to the way the App Cloud platform engineers work, Betfair is in the process of replacing BPEL with AMP policies, as these better support standard coding practices.





