MongoDB and AWS Integrating with AWS Services Partner Technical Solutions, MongoDB Inc. Sandeep Parikh #mongodb

MongoDB and AWS: Integrations


DESCRIPTION

MongoDB is one of the fastest growing NoSQL workloads on AWS due to its simplicity and scalability, and recent product additions by the AWS team have only improved those traits. In this session, we’ll talk about various AWS offerings and how they fit together with MongoDB -- including CloudFormation, Elastic MapReduce, Route53, Elastic Beanstalk, Elastic Load Balancing, and more -- and how they can be leveraged to enhance your MongoDB experience.


Page 1: MongoDB and AWS: Integrations

MongoDB and AWS: Integrating with AWS Services

Partner Technical Solutions, MongoDB Inc.

Sandeep Parikh

#mongodb

Page 2: MongoDB and AWS: Integrations

Recap: Deployment and Availability

• MongoDB basics

• Deployment configurations

• Instance types

• Best practices

• Slides and recording: http://www.mongodb.com/presentations/mongodb-and-amazon-web-services-deploying-high-availability

Page 3: MongoDB and AWS: Integrations

Recap: Storage Configurations

• Storage options

• Simple recommendations

• Backup and restore

• Advanced configurations

• Slides and recording: http://www.mongodb.com/presentations/mongodb-and-amazon-web-services-storage-options-mongodb-deployments

Page 4: MongoDB and AWS: Integrations

Agenda

• Available Services

• Integrations

• Infrastructure

• Future Directions

• Questions

Page 5: MongoDB and AWS: Integrations

Available Services

Page 6: MongoDB and AWS: Integrations

AWS Services

• Compute

• Storage

• Persistent IPs

• DNS

• Hadoop

• Data Warehouse

• Stream processing

• App deployment

• Orchestration

• Provisioning

• App services

• Caching

Page 7: MongoDB and AWS: Integrations

AWS Services

• Compute

• Storage

• Persistent IPs

• DNS

• Hadoop

• Data Warehouse

• Stream processing

• App deployment

• Orchestration

• Provisioning

• Security

• Caching

Page 8: MongoDB and AWS: Integrations

Integrations

Page 9: MongoDB and AWS: Integrations

CloudFormation

• Simplify provisioning and deployment

• JSON-based templates

• Manage like source code

• Specify all manner of AWS components

• Bootstrap for other tools like Chef or Puppet

Page 10: MongoDB and AWS: Integrations

CloudFormation Sample

"Parameters" : {
  "KeyPairName" : {
    "Description" : "EC2 KeyPair to enable SSH access",
    "Type" : "String"
  },
  "SecurityGroupName" : {
    "Description" : "EC2 Security Group",
    "Type" : "String"
  },
  "InstanceType" : {
    "Type" : "String",
    "Default" : "m3.large",
    "AllowedValues" : ["m3.large", "m3.xlarge", "m3.2xlarge"],
    "Description" : "EC2 instance type"
  }
},

Page 11: MongoDB and AWS: Integrations

CloudFormation Sample

"Properties" : {
  "InstanceType" : { "Ref" : "InstanceType" },
  "ImageId" : { … },
  "SecurityGroups" : [{ "Ref" : "SecurityGroupName" }],
  "KeyName" : { "Ref" : "KeyPairName" },
  "EbsOptimized" : "true",
  "BlockDeviceMappings" : [{
    "DeviceName" : "/dev/xvdf",
    "Ebs" : {
      "VolumeSize" : "200",
      "Iops" : "1000",
      "VolumeType" : "io1",
      "DeleteOnTermination" : "false"
    }
  }]
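
Because the templates are plain JSON, they can be assembled and sanity-checked in code before upload. The following is a minimal sketch using only the Python standard library. The logical resource name `MongoDBInstance` and the helpers `check_parameter` and `unresolved_refs` are assumptions made for illustration and are not taken from the original templates. The `ImageId` value is elided on the slide, so it is left out here.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        "KeyPairName": {
            "Description": "EC2 KeyPair to enable SSH access",
            "Type": "String",
        },
        "SecurityGroupName": {
            "Description": "EC2 Security Group",
            "Type": "String",
        },
        "InstanceType": {
            "Type": "String",
            "Default": "m3.large",
            "AllowedValues": ["m3.large", "m3.xlarge", "m3.2xlarge"],
            "Description": "EC2 instance type",
        },
    },
    "Resources": {
        # "MongoDBInstance" is a hypothetical logical name.
        "MongoDBInstance": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                # ImageId is elided on the slide, so it is omitted here;
                # a deployable template must supply one.
                "InstanceType": {"Ref": "InstanceType"},
                "SecurityGroups": [{"Ref": "SecurityGroupName"}],
                "KeyName": {"Ref": "KeyPairName"},
                "EbsOptimized": "true",
                "BlockDeviceMappings": [{
                    "DeviceName": "/dev/xvdf",
                    "Ebs": {
                        "VolumeSize": "200",
                        "Iops": "1000",
                        "VolumeType": "io1",
                        "DeleteOnTermination": "false",
                    },
                }],
            },
        },
    },
}


def check_parameter(template, name, value):
    """Return True if value is acceptable for the named template parameter."""
    allowed = template["Parameters"][name].get("AllowedValues")
    return allowed is None or value in allowed


def unresolved_refs(node, known):
    """Collect {"Ref": name} targets that are not declared in `known`."""
    if isinstance(node, dict):
        if set(node) == {"Ref"}:
            return [] if node["Ref"] in known else [node["Ref"]]
        return [r for v in node.values() for r in unresolved_refs(v, known)]
    if isinstance(node, list):
        return [r for v in node for r in unresolved_refs(v, known)]
    return []


known = set(template["Parameters"]) | set(template["Resources"])
missing = unresolved_refs(template["Resources"], known)
template_json = json.dumps(template, indent=2)
print(template_json)
```

Keeping the template as a data structure makes it easy to catch a misspelled `Ref` or a disallowed instance type locally, which fits the "manage like source code" point above.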

Page 12: MongoDB and AWS: Integrations

CloudFormation Templates

• https://github.com/crcsmnky/aws-cfn-mongodb

• Templates to launch single-node MongoDB deployment

• Each one implements our best practices: EBS-optimized, PIOPS, ulimit, readahead

• Used to generate AWS Marketplace instances

Page 13: MongoDB and AWS: Integrations

CloudFormation Templates

• Clone the repo

• Upload the CF template

• Instance provisioning starts

• Instance clones repo

• Instance runs setup script

• Instance provisioned and deployed

Page 14: MongoDB and AWS: Integrations

CloudFormation Tools

• https://github.com/cloudtools/troposphere

• Python package to generate CF templates

• Next versions of our templates will leverage this

• Coming soon: Replica Sets

• Coming later: Sharded Cluster

Page 15: MongoDB and AWS: Integrations
Page 16: MongoDB and AWS: Integrations

Elastic MapReduce

• Quickly deploy and run Hadoop in AWS

• Tuned distributions to run on top of EC2

• Provision deployments with any number of nodes

• Supports Spot and Reserved pricing for savings

Page 17: MongoDB and AWS: Integrations

EMR and MongoDB

• https://github.com/mongodb/mongo-hadoop

• MongoDB-Hadoop connector: bi-directional access to/from MongoDB

• Supports MapReduce, Hive, Pig, Streaming

• Read/write from MongoDB deployments or BSON backup files
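
A BSON backup file, as produced by `mongodump`, is a sequence of documents laid end to end, each starting with a little-endian int32 giving its total size. The sketch below shows how such a file can be split into individual documents using only the Python standard library. It is an illustration of the file layout, not the connector's own code; `iter_raw_bson` and `make_int32_doc` are hypothetical helper names, and the sample data is generated in memory rather than read from a real dump.

```python
import io
import struct


def iter_raw_bson(stream):
    """Yield each raw BSON document (as bytes) from a binary stream."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of file
        if len(header) < 4:
            raise ValueError("truncated BSON length prefix")
        (size,) = struct.unpack("<i", header)
        if size < 5:  # 4-byte length plus the trailing NUL is the minimum
            raise ValueError("invalid BSON document size: %d" % size)
        body = stream.read(size - 4)
        if len(body) != size - 4 or body[-1:] != b"\x00":
            raise ValueError("truncated or malformed BSON document")
        yield header + body


def make_int32_doc(name, value):
    """Encode {name: value} as BSON; demonstration only, int32 values."""
    # Element layout: type byte 0x10 (int32), NUL-terminated key, 4-byte value.
    element = b"\x10" + name.encode("utf-8") + b"\x00" + struct.pack("<i", value)
    size = 4 + len(element) + 1
    return struct.pack("<i", size) + element + b"\x00"


# Two documents back to back, standing in for a small dump file.
dump = io.BytesIO(make_int32_doc("a", 1) + make_int32_doc("b", 2))
docs = list(iter_raw_bson(dump))
print(len(docs), [len(d) for d in docs])
```

Because every document carries its own length, a reader can skip from one document to the next without decoding the contents.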

Page 18: MongoDB and AWS: Integrations

EMR with MongoDB

[Diagram: MongoDB, BSON files in S3, and a cluster of EMR nodes]

Page 19: MongoDB and AWS: Integrations

EMR Workflow

• Bootstrap script: MongoDB-Hadoop, MongoDB Java driver

• Copy resources: bootstrap script, MapReduce job

• Launch EMR: instance type, instance count, arguments

• MapReduce output: MongoDB, BSON in S3

• EMR logs: written to S3

Page 20: MongoDB and AWS: Integrations

EMR Launch

$ elastic-mapreduce --create --jobflow ENRON000 \
    --instance-type m1.xlarge --num-instances 5 \
    --bootstrap-action s3://$S3_BUCKET/bootstrap.sh \
    --log-uri s3://$S3_BUCKET/enron_logs \
    --jar s3://$S3_BUCKET/enron-example.jar \
    --arg -D --arg mongo.job.input.format=com.mongodb.hadoop.BSONFileInputFormat \
    --arg -D --arg mapred.input.dir=s3n://mongo-test-data/messages.bson \
    --arg -D --arg mapred.output.dir=s3n://$S3_BUCKET/BSON_OUT \
    --arg -D --arg mongo.job.output.format=com.mongodb.hadoop.BSONFileOutputFormat
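
Each Hadoop property in the launch command is passed as a `-D` flag followed by a single `key=value` token, with both wrapped in `--arg`. A launch script might build that argument list programmatically, as in this minimal sketch. The helper names and the bucket name `example-bucket` are hypothetical; the flags and property values are the ones shown on the slide.

```python
import shlex


def hadoop_property_args(properties):
    """Expand {key: value} into repeated `--arg -D --arg key=value` groups."""
    args = []
    for key, value in properties.items():
        args += ["--arg", "-D", "--arg", "%s=%s" % (key, value)]
    return args


def emr_launch_command(bucket, properties):
    """Build the launch command as an argument list (no shell involved)."""
    base = "s3://%s" % bucket
    return [
        "elastic-mapreduce", "--create", "--jobflow", "ENRON000",
        "--instance-type", "m1.xlarge", "--num-instances", "5",
        "--bootstrap-action", base + "/bootstrap.sh",
        "--log-uri", base + "/enron_logs",
        "--jar", base + "/enron-example.jar",
    ] + hadoop_property_args(properties)


bucket = "example-bucket"  # hypothetical bucket name
command = emr_launch_command(bucket, {
    "mongo.job.input.format": "com.mongodb.hadoop.BSONFileInputFormat",
    "mapred.input.dir": "s3n://mongo-test-data/messages.bson",
    "mapred.output.dir": "s3n://%s/BSON_OUT" % bucket,
    "mongo.job.output.format": "com.mongodb.hadoop.BSONFileOutputFormat",
})

# Print a copy-pasteable form of the command.
print(" ".join(shlex.quote(part) for part in command))
```

Building the command as a list keeps every `key=value` pair in one token, which avoids the stray spaces around `=` that would otherwise split a property across arguments.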

Page 21: MongoDB and AWS: Integrations

Elastic Beanstalk

• Deploy and manage applications

• Handles provisioning, scaling, load balancing

• Built on EC2, S3, SNS, Auto Scaling

Page 22: MongoDB and AWS: Integrations

Elastic Beanstalk Architecture

App Server

App Server

App Server

Security Group

Elastic Load Balancer

Auto Scaling Group

Page 23: MongoDB and AWS: Integrations

Elastic Beanstalk with MongoDB

App Server

App Server

App Server

Security Group

Elastic Load Balancer

Auto Scaling Group

mongos

mongos

mongos

MongoDB

Page 24: MongoDB and AWS: Integrations

Elastic Beanstalk with MongoDB

• Customize and configure software that your app needs (e.g. mongos)

• Install packages

• Create files

• Execute commands (before or after app is setup)

• Control system services

• http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html
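
Elastic Beanstalk reads these customizations from YAML or JSON `.config` files in an `.ebextensions` directory of the application bundle. The sketch below generates one such file as JSON with the Python standard library, covering the `packages`, `files`, and `commands` keys. Everything specific in it is an assumption for illustration: the config server hostnames, the package name `mongodb-org-mongos` (which presumes a suitable yum repository is already configured), and the file paths.

```python
import json

# Hypothetical config server hostnames; replace with your own.
CONFIG_SERVERS = [
    "cfg1.example.internal:27019",
    "cfg2.example.internal:27019",
    "cfg3.example.internal:27019",
]

ebextension = {
    # Install packages (assumes a yum repository providing this package
    # is already configured on the instance).
    "packages": {"yum": {"mongodb-org-mongos": []}},
    # Create files.
    "files": {
        "/etc/mongos.conf": {
            "mode": "000644",
            "owner": "root",
            "group": "root",
            "content": "sharding:\n  configDB: %s\n" % ",".join(CONFIG_SERVERS),
        }
    },
    # Execute commands before the application is set up.
    "commands": {
        "01_start_mongos": {
            "command": "mongos --config /etc/mongos.conf --fork "
                       "--logpath /var/log/mongos.log",
        }
    },
}

# The output could be saved as, for example, .ebextensions/mongos.config.
config_text = json.dumps(ebextension, indent=2)
print(config_text)
```

This sketch does not handle repeat deployments, where `mongos` would already be running; a real configuration would need to guard the start command or manage `mongos` as a system service.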

Page 25: MongoDB and AWS: Integrations

Infrastructure

Page 26: MongoDB and AWS: Integrations

Elastic IPs

• EC2 instances use dynamic IP addresses

• EIPs are static addresses that can be assigned to individual EC2 instances

• Unfortunately you have a limited number

Page 27: MongoDB and AWS: Integrations

Route53

• Highly available and scalable DNS service in AWS

• Hostnames can be assigned to EC2 instances, ELB instances, or S3 buckets

• DNS load balancing with weighted-round-robin

• Supports hostnames for non-AWS infrastructure
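
Assigning a hostname to an instance comes down to submitting a change batch to Route53. The following minimal sketch builds such a batch as JSON using only the Python standard library. The helper `upsert_cname` and both hostnames are hypothetical, and the 60-second TTL is an arbitrary choice.

```python
import json


def upsert_cname(name, target, ttl=60):
    """Build a Route53 change batch that points `name` at `target`."""
    return {
        "Comment": "Point %s at %s" % (name, target),
        "Changes": [{
            "Action": "UPSERT",  # create the record, or update it if present
            "ResourceRecordSet": {
                "Name": name,
                "Type": "CNAME",
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        }],
    }


# Both hostnames below are hypothetical.
batch = upsert_cname(
    "db1.example.com.",
    "ec2-203-0-113-10.compute-1.amazonaws.com",
)
batch_json = json.dumps(batch, indent=2)
print(batch_json)
```

Saved to a file, the batch can be passed to `aws route53 change-resource-record-sets` together with the hosted zone ID via the `--change-batch` option.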

Page 28: MongoDB and AWS: Integrations

Route53 and MongoDB

• Short answer: use hostnames for all components

• With replica sets, hostnames can ease machine replacement

• With sharded clusters, hostnames can simplify config server maintenance
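
To make the hostname advice concrete, here is a minimal sketch that builds a replica set configuration document and a matching connection string entirely from hostnames. The set name `rs0`, the hostnames, and the helper functions are assumptions for illustration.

```python
from urllib.parse import urlencode


def replica_set_config(name, hosts):
    """Build the document passed to rs.initiate(), using hostnames only."""
    return {
        "_id": name,
        "members": [{"_id": i, "host": host} for i, host in enumerate(hosts)],
    }


def connection_uri(name, hosts):
    """Build a connection string that lists every member by hostname."""
    query = urlencode({"replicaSet": name})
    return "mongodb://%s/?%s" % (",".join(hosts), query)


# Hypothetical DNS names, one per replica set member.
hosts = [
    "db1.example.com:27017",
    "db2.example.com:27017",
    "db3.example.com:27017",
]
config = replica_set_config("rs0", hosts)
uri = connection_uri("rs0", hosts)
print(config)
print(uri)
```

If a member is replaced, only its DNS record needs to change; the replica set configuration and the application connection strings stay as they are.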

Page 29: MongoDB and AWS: Integrations

VPC

• Virtual Private Cloud lets you provision a logically isolated network inside AWS

• You manage all aspects of networking including
  – IP address ranges
  – Subnets
  – Routing tables and gateways

• Can be used as an extension to an offsite data center with Hardware VPN

Page 30: MongoDB and AWS: Integrations

VPC Public and Private

http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Introduction.html

• Private subnets are hidden from the outside world

• Internet Gateway and EIPs can be used to access

• Web tier in public subnet

• Data tier in private subnet
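
The public/private split can be planned on paper before any VPC is created. This minimal sketch uses Python's standard `ipaddress` module to carve a VPC address range into one public and one private subnet and to check which tier an address falls in. The `10.0.0.0/16` range, the `/24` subnet size, and the helper `tier_for` are assumptions chosen for illustration.

```python
import ipaddress

# Hypothetical VPC address range.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Take the first two /24 blocks: one for each tier.
subnets = vpc.subnets(new_prefix=24)
public_subnet = next(subnets)   # web tier
private_subnet = next(subnets)  # data tier (MongoDB)


def tier_for(address):
    """Return which tier an IP address belongs to in this layout."""
    ip = ipaddress.ip_address(address)
    if ip in public_subnet:
        return "web (public)"
    if ip in private_subnet:
        return "data (private)"
    return "unassigned"


print(public_subnet, private_subnet)
print(tier_for("10.0.0.10"), "|", tier_for("10.0.1.25"))
```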

Page 31: MongoDB and AWS: Integrations

ElastiCache

• Distributed in-memory cache

• Backed by Memcached or Redis

• Can be a drop-in replacement for existing cache deployments

• Supports auto-discovery and read-replicas

Page 32: MongoDB and AWS: Integrations

Future Directions

Page 33: MongoDB and AWS: Integrations

Redshift

• Fully-managed petabyte-scale data warehouse service

• MongoDB not natively supported as a data source

• … So how do you get your data in?

Page 34: MongoDB and AWS: Integrations

Data Pipeline

• Process and move data between different AWS compute and storage services

• Data Pipeline handles resources, failures, and dependencies

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/what-is-datapipeline.html

Page 35: MongoDB and AWS: Integrations

Data Pipeline with MongoDB

AWS Data Pipeline

MongoDB

S3

EMR or Redshift

Page 36: MongoDB and AWS: Integrations

OpsWorks

• Complete DevOps stack

• Model and manage apps, load balancers, databases

• Uses Chef recipes

• Load or time-based scaling

• Deploying MongoDB with OpsWorks: http://blogs.aws.amazon.com/application-management/post/Tx1RB65XDMNVLUA/Deploying-MongoDB-with-OpsWorks

Page 37: MongoDB and AWS: Integrations

CloudWatch

• Monitoring for AWS resources

• Supports custom metrics

http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/WhatIsCloudWatch.html

Page 38: MongoDB and AWS: Integrations

CloudWatch Custom Metrics

aws cloudwatch put-metric-data \
    --metric-name ResidentMemory \
    --namespace MongoDB \
    --timestamp 2014-02-14T20:30:00Z \
    --value 32 \
    --unit Gigabytes
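
A monitoring script could assemble the same `put-metric-data` call from a live measurement. The sketch below builds the argument list in Python without running it. It assumes the resident memory figure arrives in megabytes, as `serverStatus` reports it in `mem.resident`; the helper name `resident_memory_command` is hypothetical.

```python
from datetime import datetime, timezone


def resident_memory_command(resident_mb, when):
    """Build the AWS CLI call that publishes resident memory in gigabytes."""
    gigabytes = resident_mb / 1024.0
    return [
        "aws", "cloudwatch", "put-metric-data",
        "--metric-name", "ResidentMemory",
        "--namespace", "MongoDB",
        "--timestamp", when.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "--value", "%g" % gigabytes,
        "--unit", "Gigabytes",
    ]


# Sample values matching the slide: 32 GB at 2014-02-14 20:30 UTC.
when = datetime(2014, 2, 14, 20, 30, tzinfo=timezone.utc)
command = resident_memory_command(32768, when)
print(" ".join(command))
```

On a host with the AWS CLI installed and credentials configured, the list could be handed to `subprocess.run(command, check=True)` on a schedule.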

Page 39: MongoDB and AWS: Integrations

Questions?

Page 40: MongoDB and AWS: Integrations

MongoDB World

New York City, June 23-25

#MongoDBWorld

See what’s next in MongoDB, including

• MongoDB 2.6

• Sharding

• Replication

• Aggregation

http://world.mongodb.com

Save 25% with discount code 25SandeepParikh