MongoDB on Windows Azure

A 10gen White Paper


MongoDB on Windows Azure brings the power of the leading NoSQL database to Microsoft’s flexible, open, and scalable cloud.

MongoDB is an open source, document-oriented database designed with scalability and developer agility in mind. Windows Azure is the cloud services operating system that provides the development, service hosting, and service management environment for the Azure Services Platform. Together, MongoDB and Windows Azure give customers the tools to build highly scalable applications in the cloud.

This paper begins with an overview of MongoDB. Next, we describe the two primary deployment options available on Microsoft’s cloud platform: Windows Azure Virtual Machines and Windows Azure Cloud Services. Finally, to help readers who are evaluating MongoDB on Windows Azure, we outline the pros and cons of each option.


About MongoDB

MongoDB is an open source, document-oriented database. MongoDB bridges the gap between key-value stores, which are fast and scalable, and relational databases, which have rich functionality. Instead of storing data in tables and rows as one would with a relational database, MongoDB stores documents in a binary form of JSON (BSON, or ‘binary JSON’). An example of a document is shown in Figure 1.

The document serves as the fundamental unit within MongoDB (like a row in an RDBMS); one can add fields (like a column in an RDBMS), as well as nested fields and embedded documents. Rather than imposing a flat, rigid schema across an entire table, the schema is implicit in what fields are used in the documents. Thus, MongoDB allows developers to have variable schemas across documents and to adapt schemas as their applications evolve.

{
  "_id": ObjectId("504e4dd43796b3da50183991"),
  "text": "Study Implicates Immune System in Parkinson’s Disease Pathogenesis http://bit.ly/duhe4P",
  "source": "<a href=\"http://twitterfeed.com\" rel=\"nofollow\">twitterfeed</a>",
  "coordinates": null,
  "truncated": false,
  "entities": {
    "urls": [{ "indices": [67, 87], "url": "http://bit.ly/duhe4P", "expanded_url": null }],
    "hashtags": []
  },
  "retweeted": false,
  "place": null,
  "user": {
    "friends_count": 780,
    "created_at": "Fri Jan 08 17:40:11 +0000 2010",
    "description": "Latest medical news, articles, and features from Medscape Pathology.",
    "time_zone": "Eastern Time (US & Canada)",
    "url": "http://www.medscape.com/pathology",
    "screen_name": "MedscapePath",
    "utc_offset": -18000
  },
  "favorited": false,
  "in_reply_to_user_id": null,
  "id": NumberLong("22819397000")
}

Figure 1: Sample JSON Document

Unlike relational databases, MongoDB does not use SQL syntax. Rather, MongoDB has a query language based on JSON. It also has drivers for most modern languages, such as C#, Java, Ruby, Python, and many others. MongoDB’s flexible data model and support for modern programming languages simplify development and administration significantly.
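To make the idea of a JSON-based query concrete, the following is a minimal sketch in Python. It is a toy matcher written for illustration, not MongoDB’s query engine or any driver’s API. The function names (get_path, matches) and the second sample document are assumptions introduced here. The dotted field path ("user.screen_name") follows the convention MongoDB uses to reach into embedded documents.

```python
# Toy illustration only: this is NOT MongoDB's query engine or a driver API.
# It mimics the idea of a JSON-style query document being matched against
# documents whose fields vary from one document to the next.

def get_path(doc, dotted_path):
    """Follow a dotted path such as 'user.screen_name' into nested dicts."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(doc, query):
    """Return True if every field named in the query equals the document's value."""
    return all(get_path(doc, path) == expected for path, expected in query.items())


# Two documents in the same (hypothetical) collection with different fields:
# the second has a "place" value and no "friends_count".
tweets = [
    {"text": "Study Implicates Immune System", "retweeted": False,
     "user": {"screen_name": "MedscapePath", "friends_count": 780}},
    {"text": "Hello", "retweeted": True,
     "user": {"screen_name": "example_user"}, "place": "New York"},
]

# The query is itself a JSON-like document rather than a SQL string.
query = {"user.screen_name": "MedscapePath", "retweeted": False}
results = [t for t in tweets if matches(t, query)]
```

In this sketch the query is plain data, so an application can build and modify it with the same structures it uses for the documents themselves.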


MongoDB Architecture

MongoDB’s core capabilities deliver reliability, high availability, high performance, and scalability.

Replication through replica sets provides high availability and data safety. A replica set consists of one primary node and some number of secondary nodes (determined by the user). Figure 2 shows an example replica set with one primary and two secondaries (a common deployment model). By default, the primary node takes all reads and writes from the application; the secondaries replicate asynchronously in the background. If the primary node goes down for any reason, one of the secondaries is automatically promoted to primary status and begins to take all reads and writes. Replica sets help protect applications from hardware and data center-related downtime. Moreover, they make it easy for DBAs to conduct operational tasks, including software upgrades and hardware changes.
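The failover behavior described above can be sketched with a small Python model. This is a simplified illustration under stated assumptions: the ReplicaSet class, the node names, and the rule "promote the first healthy secondary" are hypothetical, and real MongoDB elections involve heartbeats and voting that are not modeled here.

```python
# Toy model only: real MongoDB elections use heartbeats, votes, and
# replication state that are not modeled here.

class ReplicaSet:
    def __init__(self, member_names):
        # The first member starts as primary; the rest are secondaries.
        self.healthy = {name: True for name in member_names}
        self.primary = member_names[0]

    def secondaries(self):
        """Return the healthy members that are not the current primary."""
        return [n for n, up in self.healthy.items() if up and n != self.primary]

    def fail(self, name):
        """Mark a member as down and promote a secondary if it was the primary."""
        self.healthy[name] = False
        if name == self.primary:
            candidates = self.secondaries()
            # Hypothetical rule: promote the first healthy secondary.
            self.primary = candidates[0] if candidates else None

    def route(self, operation):
        """By default, all reads and writes go to the current primary."""
        if self.primary is None:
            raise RuntimeError("no primary available")
        return (self.primary, operation)


rs = ReplicaSet(["node1", "node2", "node3"])
before = rs.route("write")   # handled by the original primary
rs.fail("node1")             # the primary goes down
after = rs.route("write")    # handled by the promoted secondary
```

The application code that calls route does not change when the primary fails, which is the property that lets replica sets absorb node outages and planned maintenance.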

Figure 2: Replica Sets with MongoDB (diagram: the application reads from and writes to the primary; two secondaries receive asynchronous replication; automatic leader election selects the primary)

Figure 3: Sharding with MongoDB (diagram: Shard A, 0...30; Shard B, 31...60; Shard C, 61...90; ...; Shard N, n...n+30; horizontally scalable)

Sharding enables users to scale horizontally as their data volumes grow and/or as demands on their data stores grow. A shard is a subset of the database, similar to a partition of the data. In Figure 3, Shard A contains documents 0-30; Shard B contains documents 31-60; and so on. One can choose any key on which to shard the collection (e.g., user name), and MongoDB will automatically shard the data store based on this key. One can keep scaling a sharded database by adding new nodes to the cluster. When a new node is added, MongoDB recognizes it and redistributes the data across the cluster. Because sharding distributes the data, and therefore the load (i.e., traffic), across nodes, it enables horizontal scalability as well as high performance.
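As a rough illustration of range-based sharding, the sketch below maps a shard key to a shard using the ranges labeled in Figure 3. The names UPPER_BOUNDS, SHARD_NAMES, shard_for, and distribute are assumptions made for this example; MongoDB’s actual chunk management and balancing are more involved and are not shown.

```python
import bisect

# Inclusive upper bound of each shard's key range, using the ranges
# labeled in Figure 3: Shard A holds 0..30, Shard B 31..60, Shard C 61..90.
UPPER_BOUNDS = [30, 60, 90]
SHARD_NAMES = ["Shard A", "Shard B", "Shard C"]


def shard_for(key):
    """Return the name of the shard whose range contains the key."""
    if key < 0 or key > UPPER_BOUNDS[-1]:
        raise ValueError(f"key {key} is outside all shard ranges")
    # bisect_left finds the first upper bound that is >= key.
    return SHARD_NAMES[bisect.bisect_left(UPPER_BOUNDS, key)]


def distribute(keys):
    """Group keys by shard to show how data, and therefore load, is spread."""
    placement = {name: [] for name in SHARD_NAMES}
    for key in keys:
        placement[shard_for(key)].append(key)
    return placement


placement = distribute([5, 30, 31, 75, 90])
```

Adding a shard in this sketch would mean appending a new upper bound and name; in MongoDB the corresponding redistribution of existing data happens automatically.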


An overview of the MongoDB architecture is shown in Figure 4. In a multi-shard environment, the application communicates with mongos, an intermediary router that directs reads and writes to the appropriate shard. Each shard is a replica set, providing scalability, availability, and performance to developers.

MongoDB was built for the cloud. Cloud services like Windows Azure are therefore a natural fit for MongoDB. By coupling MongoDB’s easy-to-scale architecture and Azure’s elastic cloud capacity, users can quickly and easily build, scale, and manage their applications.

About Windows Azure Services for MongoDB

Windows Azure is Microsoft’s suite of cloud services, providing developers on-demand compute and storage to create, host, and manage scalable and available web applications through Microsoft data centers. When deploying MongoDB to Windows Azure, users can choose from two deployment options:

» Windows Azure Virtual Machines. Windows Azure Virtual Machines (VMs) is Microsoft’s Infrastructure-as-a-Service (IaaS) offering. Similar to Amazon Web Services EC2, Azure VMs give users access to elastic, on-demand virtual servers. Users can install Windows or Linux on a VM and configure it based on their own preferences or their apps’ specific needs. Users manage the VMs themselves, including scaling, installing security patches, and ongoing performance monitoring and management. Azure VMs give users a significant degree of control over their environments but, by the same token, require users to take on VM management. Note: This service is currently in preview (beta).

» Windows Azure Cloud Services. Windows Azure Cloud Services (Worker Roles and Web Roles) is Microsoft’s Platform-as-a-Service (PaaS) offering. Similar to Heroku, Worker Roles provide users with prebuilt, preconfigured instances of compute power. In contrast with Azure VMs, users do not have to configure or manage Azure Worker Roles. Windows Azure handles the deployment details, from provisioning and load balancing to health monitoring for continuous availability. This can be helpful to users who prefer not to manage their applications at the infrastructure level, though it restricts the level of control users have over their environments.

Figure 4: MongoDB Architecture (diagram: the application connects through mongos to Replica Set A, 0...30; Replica Set B, 31...60; Replica Set C, 61...90; ...; Replica Set N, n...n+30; each replica set has one primary and two secondaries)


Understanding the Deployment Options

Given that MongoDB can be deployed on either Windows Azure Virtual Machines (IaaS) or Windows Azure Cloud Services (PaaS), it is important for users to consider the different capabilities and implementation details of each service to determine which deployment model makes the most sense for their applications.

Azure Virtual Machines

BASIC SETUP

After being granted access to the preview functionality for Azure Virtual Machines, users can launch an instance and install and configure MongoDB on it manually. Alternatively, users can use the recently released Windows Azure installer for MongoDB to set up a MongoDB replica set quickly and easily on Windows Azure VMs.

The installer is built on top of Windows PowerShell and the Windows Azure command line tool, and it contains a number of deployment scripts. The tool is designed to help users get single or multi-node MongoDB configurations up and running quickly. Installing and configuring a MongoDB replica set on Azure VMs takes only two steps: first, download the publish settings file; next, run the installer from the command prompt. Note: the installer is designed to run on a user’s local machine (i.e., not directly on an Azure VM) and to deploy its output to Windows Azure VMs.

With Azure Virtual Machines, users can create their own VMs or they can create a VM instance from one of several pre-installed operating system configurations. Both Windows and Linux are supported on Azure Virtual Machines. To deploy MongoDB on Linux, visit the MongoDB wiki (wiki.mongodb.org) for step-by-step instructions.

PROS AND CONS OF AZURE VIRTUAL MACHINES

The pros and cons of deploying MongoDB on Azure Virtual Machines are generally consistent with the considerations around using IaaS more broadly. Overall, Azure Virtual Machines allow users to fine-tune their deployments but by the same token require increased operational effort.

The advantages of using MongoDB on Azure Virtual Machines are as follows:

» Increased Control. Users have more control over their infrastructural configuration relative to Azure Cloud Services. For instance, they can install and configure services on the OS, define policies, etc. This consideration may be important for enterprises that have regimented policies and processes for IT security and compliance.

» OS Choice. Users can use Windows or Linux.

Azure Virtual Machines may not always be the right fit for the following reasons:

» Increased Operational Effort. The increased control that Azure Virtual Machines provide comes with increased effort as well. Users must define and implement their own security measures, apply patches, and locate instances for fault tolerance. This consideration may be important for developers who lack experience managing their own infrastructure or for companies that don’t have the operational bandwidth to devote to managing this component of the stack.

» Beta. The Azure Virtual Machines service is still in Preview (beta).

Azure Cloud Services

BASIC SETUP

Users can also deploy MongoDB on Azure Cloud Services. To do so, download the MongoDB Azure Worker Role package, which is a preconfigured Worker Role with MongoDB. When deployed, each replica set member runs as a separate Worker Role instance; MongoDB data files are stored in Azure Cloud Drives. For detailed instructions, visit the MongoDB wiki (wiki.mongodb.org).

PROS AND CONS OF AZURE CLOUD SERVICES

The pros and cons of running MongoDB on Azure Cloud Services are generally consistent with those of using PaaS in general, though there are some Azure-specific considerations. Overall, Windows Azure Cloud Services decreases the operational burden on users but affords them less control from an infrastructure configuration standpoint.

The advantages of using Azure Cloud Services are as follows:

» Lower Operational Effort. Microsoft manages OS updates and security, decreasing the operational burden on the users.

» Built-in Fault Tolerance. When deploying multiple MongoDB worker role instances, Windows Azure automatically deploys the instances across multiple fault and update domains to guarantee better uptime.

» Secure by Default. Microsoft takes measures to ensure that worker and web roles are secure. Endpoints on instances can be enabled for instance-to-instance communication without making them public. Thus, one can configure MongoDB to be secure by enabling it only for other roles in the same deployment.

By the same token, there are some aspects of Azure Cloud Services that may be considered drawbacks:

» Windows Only. Worker Roles can only be deployed with Windows; Linux is not an option.

» Fixed OS Configuration. Users cannot configure the OS, and must therefore develop applications that run on the pre-defined machine configurations available.

Table 1 summarizes the pros and cons of using Windows Azure Virtual Machines and Windows Azure Cloud Services (Worker Roles).

Table 1: Pros and Cons Summary - Windows Azure Virtual Machines and Windows Azure Cloud Services

IaaS – Windows Azure Virtual Machines
PROS
- Increased control
- OS choice
CONS
- Increased operational effort

PaaS – Windows Azure Cloud Services
PROS
- Lower operational effort
- Built-in fault tolerance
- Secure by default
CONS
- Windows only
- Fixed OS configuration

Summary

MongoDB was built for ease of use, scalability, availability, and performance, and it’s quickly becoming an attractive alternative to relational databases. Windows Azure provides a flexible cloud platform for hosting MongoDB, with two deployment models to choose from. Developers and enterprises looking at deploying MongoDB on Windows Azure should consider the pros and cons discussed here when evaluating which option is most appropriate for them. We hope that this paper helps customers better understand these solutions, how they work, and how to assess them.

To learn more about MongoDB and how to deploy it in the cloud, or to speak to a sales representative, please email [email protected].


New York 578 Broadway, New York, NY 10012 • London 5-25 Scrutton St., London EC2A 4HJ

[email protected] • US (866) 237-8815 • INTL +1 (650) 440-4474


Published by 10gen, Inc. October 2012