© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

DAT303 - A Closer Look at Amazon RDS for Microsoft SQL Server: Deep Dive into Performance, Security, and Data Migration Best Practices

Sergei Sokolenko - Sr Product Manager, AWS

Allan Parsons - VP Operations, Viddy

November 13, 2013

Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013


DESCRIPTION

Come learn about architecting high-performance applications and production workloads using Amazon RDS for SQL Server. Understand how to migrate your data to an Amazon RDS instance, apply security best practices, and optimize your database instance and applications for high availability.



Page 2

Next Hour …

• Best Practices

– Security

– Performance

– Data Migration

– Data Durability

• Viddy’s Case

Page 3

Security Best Practices

Page 4

Control Access

[Diagram labels: Internet, IAM, VPC]

Page 5

Encrypt Your Data

• “In transit” with SSL

– Import the public Amazon RDS certificate into Windows: https://rds.amazonaws.com/doc/rds-ssl-ca-cert.pem

– Add "encrypt=true" to your connection string

• “At rest” with Transparent Data Encryption

– Encrypts data before writing to storage

– Decrypts when reading
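The second SSL step, adding "encrypt=true" to the connection string, can be sketched as a small helper. This is only an illustration: the function name `with_encryption`, the sample server and database names, and the assumption of a simple semicolon-separated `key=value` string (ADO.NET style, with no semicolons inside values) are not from the talk.

```python
def with_encryption(conn_str: str) -> str:
    """Return conn_str with encrypt=true set.

    Hypothetical helper: assumes a plain 'key=value;key=value' string
    and does not handle quoted values that contain semicolons.
    """
    # Split into non-empty key=value parts.
    parts = [p.strip() for p in conn_str.split(";") if p.strip()]
    # Drop any existing encrypt setting (case-insensitive key match).
    kept = [p for p in parts if p.split("=", 1)[0].strip().lower() != "encrypt"]
    # Append the setting named on the slide.
    kept.append("encrypt=true")
    return ";".join(kept)


if __name__ == "__main__":
    # Placeholder endpoint and database names, for illustration only.
    print(with_encryption("Server=myinstance.example.com,1433;Database=UserDB1"))
```

Replacing an existing `encrypt` key rather than appending a second one avoids depending on how a particular driver resolves duplicate keys.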

Page 6

Page 7

Performance Best Practices

Page 8

High Performance Relational Databases

[Diagram: Amazon RDS Configuration. Goals: Increase Throughput, Reduce Latency. Options: Push-Button Scaling, Provisioned IOPS, Database Shards]

Page 9

Push Button Scaling & Sharding

• Scale nodes vertically up or down

– m1.small (1 virtual core, 1.7GB)

– m2.4xlarge (8 virtual cores, 64GB)

• Scale out nodes horizontally

– Shard based on data or workload characteristics

Page 10

Production = Provisioned IOPS

Consistently fast performance

• 1 TB max instance size

• 10,000 Provisioned IOPS

• I/O-Optimized instances

• Check I/O blockers

– Database contention

– Locking

Page 11

Data Migration Best Practices

Page 12

Migrating Data to Amazon RDS

• Replication + Switchover

– Linked Servers

– SSIS

• Bulk Migration

– Import/Export Wizard

– BCP Bulk Load

Page 13

One-time Bulk Migration

[Diagram labels: On Premise, AWS]

Page 14

Migration Code Snippets

-- Run SSMS's "Generate and Publish Scripts" wizard with these settings:
--   Tables Only
--   Script USE DATABASE = False
--   Script Check Constraints = False
--   Script Foreign Keys = False
--   Script Primary Keys = False
--   Script Unique Keys = False

-- Generate a .BAT script of BCP export commands (one line per user table)

SELECT 'bcp ' + db_name() + '..' + name + ' out "C:\Data\' + name + '.txt" -E -n -S localhost -U usr -P pwd' FROM sysobjects WHERE type = 'U'

-- Each generated export line has this form:

bcp dbname..table out "C:\Data\table.txt" -E -n -S localhost -U usr -P pwd

-- Generate a .BAT script of BCP import commands (one line per user table)

SELECT 'bcp ' + db_name() + '..' + name + ' in "C:\Data\' + name + '.txt" -E -n -S RDSEndpoint -U usr -P pwd' FROM sysobjects WHERE type = 'U'

-- Each generated import line has this form:

bcp dbname..table in "C:\Data\table.txt" -E -n -S endpoint,port -U usr -P pwd

More Info: Data Import Guide for SQL Server
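The two SELECT statements on this slide emit one bcp command per user table. The same generation step can be sketched in Python for cases where the table list is already known. The function name `bcp_commands` and the sample database and table names are hypothetical; the bcp flags (`-E -n -S -U -P`), the `C:\Data` directory, and the `usr`/`pwd` placeholders are taken from the slide.

```python
def bcp_commands(db, tables, direction, server,
                 user="usr", password="pwd", data_dir=r"C:\Data"):
    """Build one bcp command line per table, mirroring the slide's format.

    direction is "out" (export from the source) or "in" (import into RDS).
    Hypothetical helper: it only builds strings and never runs bcp.
    """
    if direction not in ("out", "in"):
        raise ValueError("direction must be 'out' or 'in'")
    return [
        # -E keeps identity values, -n uses native format (flags from the slide).
        f'bcp {db}..{t} {direction} "{data_dir}\\{t}.txt" '
        f'-E -n -S {server} -U {user} -P {password}'
        for t in tables
    ]


if __name__ == "__main__":
    # Write the export lines to a .BAT file, as the slide suggests.
    lines = bcp_commands("UserDB1", ["Customers", "Orders"], "out", "localhost")
    with open("export.bat", "w") as f:
        f.write("\n".join(lines) + "\n")
```

For the import side, pass `"in"` and the RDS endpoint in `endpoint,port` form as the `server` argument.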

Page 15

Ongoing Replication with Switchover

[Diagram: SourceINST (On Premise) connected to TargetINST (AWS) through a Linked Server]

Page 16

On Target Instance (Amazon RDS)

USE master;

CREATE LOGIN [repl_login] WITH PASSWORD=N'password01', DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF;

USE UserDB1;

CREATE USER [repl_user] FOR LOGIN [repl_login];

-- Grant the replication user read and write access to UserDB1

EXEC sp_addrolemember N'db_datareader', N'repl_user';

EXEC sp_addrolemember N'db_datawriter', N'repl_user';

-- Assume the source DB has a table "Customers"; changed rows are staged here

CREATE TABLE StageCustomers ( CustomerID int, UpdatedDate datetime );

Page 17

On Source Instance (On-Premise)

USE master;

-- Register the RDS instance as a linked server (name is the network address, without brackets)

EXEC sp_addlinkedserver N'TargetINST.amazonaws.com,port', N'SQL Server';

CREATE LOGIN [repl_login] WITH PASSWORD=N'password02', DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF;

-- Map the local repl_login to the repl_login created on the target

EXEC sp_addlinkedsrvlogin

@rmtsrvname = N'TargetINST.amazonaws.com,port',

@useself = 'FALSE', @locallogin = N'repl_login',

@rmtuser = N'repl_login', @rmtpassword = N'password01';

USE UserDB1;

-- Push rows changed in the last two days to the staging table on the target

INSERT INTO [TargetINST.amazonaws.com,port].UserDB1.dbo.StageCustomers (CustomerID, UpdatedDate)

SELECT CustomerID, UpdatedDate FROM Customers WHERE UpdatedDate >= DATEADD(DD,-2,GETDATE());

Page 18

Data Durability Best Practices

Page 19

Backups and Disaster Recovery

• Automated Backups

– Nightly system snapshots + transaction log backups

– Enables point-in-time restore to any point in the retention period

– Max retention period = 35 days

• DB Snapshots

– User-driven snapshots of the database

– Kept until explicitly deleted

Page 20

Cross Region Snapshot Copy

[Diagram labels: Region 1 (AZ 1), Region 2 (AZ 1)]

Page 21

Viddy’s Case

Allan Parsons, Viddy

Scaling viddy.com on Amazon RDS for SQL Server

Page 22

Vision

To entertain and connect people around the world by empowering mobile users to easily capture, beautify and share amazing videos to those who matter most.

Page 23

Viddy By The Numbers

• Reach :: 41+ Million Registered Users

• Connections :: 250+ Million User Connections

• Media :: 6.0+ Million Unique Videos

• CDN Assets (encoded videos + images)

• Videos :: 30+ Million Video Files

• Images :: 2+ Billion Image Files

• Human Power

• Executives & Support Staff :: 4

• Software Engineers :: 6

• DevOps Engineers :: 1

• Database Administrators :: 0

Page 24

Page 25

What Powers Viddy

• Web / Front-End :: Windows / IIS (C# / .NET / MVC)

• Cache :: Linux / memcached (via Couchbase)

• Persistent Cache :: Linux / Redis (2x Master-Slave Environments)

• Source Control :: Team Foundation Server

• Continuous Integration & Build Automation :: Jenkins, PowerShell, MSBuild

• AWS & EC2 Tools

• VPCs :: 1 VPC/Environment (Production, QA, Dev)

• RDS :: 11 SQL Server Instances Housing 144 Databases (Production)

• SNS / SQS :: Used for Eventual Consistency

• Route53 & ELBs :: DNS and Load Balancing

• CloudWatch :: Monitoring & Trending

• CloudSearch :: Media, Tag, and User Searching

• S3 & CloudFront :: Asset Storage and Delivery

We’re a Technology Agnostic Stack & Team

Page 26

Early Technical Challenges

Wrong Cloud Ideology

• Inherited a PaaS Cloud Infrastructure

Difficulty in Caching Data

• Twitter-based Service Model

Underestimated Power of Facebook

• Open Graph drove 1MM+ User Registrations / 24H Period

Very Very Busy SQL Instance

• 1 Instance, 6 Databases

• Disabled Key Constraints to Improve Performance

• Too busy to get transactionally consistent backups

Inflexible Platform

• Adding machines would make inefficiencies worse

• On PaaS, more money != more scalability

Page 27

Moving to AWS

VPC

• Guaranteed affinity between Web, Cache, SQL

• Low Latency

• Better security

SQL

• Tremendous cleanup effort

• Created 144 empty RDS database shells & filled them via ETL

• Engineered Eventual Consistency to Move Deltas

Build Automation

• Build Scripts dual-deployed to PaaS and IaaS

• Developers could build & test multiple times per hour on 2 providers

DNS

• Moved all zones to Route53 & Lowered TTLs

• Updated DNS entries Christmas Eve 2012 (low traffic)

Goal: PaaS to IaaS with Zero Downtime

Page 28

RDS Eventual Consistency

[1] :: API Servers Push Messages to Amazon SNS Topic

[2] :: Amazon SNS Distributes Message to SQS Queue

[3] :: Windows Service Monitors Queues

[4] :: Windows Service Pushes Message to Shard

Advantages :: Can lose Windows Service, keep messages

:: Can lose DB Shard, keep messages

:: Easy to Scale!

+ more queues

+ more messages

= More Windows Services / EC2 Machines

Shards Based On UserID (GUID)
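The four steps above can be sketched as an in-memory model. This is not Viddy's implementation: the class `Pipeline`, the function `shard_for`, and the modulo mapping from a GUID to a shard number are assumptions made for illustration, and plain Python deques and lists stand in for Amazon SQS queues and database shards. The sketch only shows the property the slide calls out: if a shard is offline, its messages wait in the queue instead of being lost.

```python
import uuid
from collections import deque

SHARD_COUNT = 64  # the deck mentions 64 shards per shard family


def shard_for(user_id: uuid.UUID) -> int:
    """Map a UserID (GUID) to a shard number. The modulo rule is an assumption."""
    return user_id.int % SHARD_COUNT


class Pipeline:
    """In-memory stand-in for the SNS topic, SQS queues, worker, and shards."""

    def __init__(self):
        self.queues = {i: deque() for i in range(SHARD_COUNT)}  # one queue per shard
        self.shards = {i: [] for i in range(SHARD_COUNT)}       # rows applied per shard
        self.down = set()                                       # shards currently offline

    def publish(self, user_id: uuid.UUID, payload: str) -> None:
        # Steps 1-2: the API server publishes; the message lands in the shard's queue.
        self.queues[shard_for(user_id)].append((user_id, payload))

    def drain(self, shard: int) -> int:
        # Steps 3-4: the worker reads the queue and pushes messages to the shard.
        if shard in self.down:
            return 0  # shard offline: leave messages queued for a later attempt
        applied = 0
        queue = self.queues[shard]
        while queue:
            self.shards[shard].append(queue.popleft())
            applied += 1
        return applied


if __name__ == "__main__":
    p = Pipeline()
    uid = uuid.uuid4()
    p.publish(uid, "new video posted")
    print(p.drain(shard_for(uid)), "message(s) applied")
```

Keeping one queue per shard is what lets an outage on one shard leave the others unaffected, and adding workers only requires pointing more `drain` callers at the queues.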

Page 29

Provisioning On RDS

SQL Edition

• SQL Server 2012 Standard (BizSpark)

Storage Allocation

• We took the max (1TB)

• Changing Storage = downtime

IOPS

• Busiest Instance (ViddyDB) has 7,000 provisioned IOPS

• Shards have no provisioned IOPS

• Occasional hotspots when celebrities post content

• Changing IOPS = downtime

Instance Size

• Busiest Instance (ViddyDB) has largest size (m2.4xlarge)

• Shards run on m2.2xlarge

• Changing Instance Size = downtime

VPC Placement

• VPC guarantees node affinity (ours sit in private segment)

• Changing VPC Placement = downtime

Goal: As Hands Off As Possible (we don’t have a DBA)

Page 30

Designing for High Availability

Amazon RDS In VPCs

• At the time we provisioned (Nov-2012), no data replication across AZs

• Single point of failure is Availability Zone

• Running our own replication meant no RDS (and need a DBA)

• RDS didn’t force SQL Server’s AlwaysOn Technology

Sharded Model

• Each user lives in 1 of 64 Consumer Shards & 1 of 64 Producer Shards

• Database goes down: 1/64 of users affected (~1.6%)

• Instance goes down: 1/8 users affected (12.5%)
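The percentages in the sharded model follow from simple division, assuming users are spread evenly across shards and shards evenly across instances (an assumption; the slide does not state the distribution). A minimal check, with the hypothetical helper name `affected_share`:

```python
from fractions import Fraction


def affected_share(failed_units: int, total_units: int) -> float:
    """Percentage of users affected when failed_units of total_units go down.

    Assumes an even spread of users across units (not stated in the deck).
    """
    return float(Fraction(failed_units, total_units) * 100)


if __name__ == "__main__":
    print(affected_share(1, 64))  # one database (shard) out of 64
    print(affected_share(1, 8))   # one instance out of 8
```

One database out of 64 works out to 1.5625% of users and one instance out of 8 to 12.5%.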

Eventual Consistency

• Amazon SNS/SQS Guarantees Eventual Consistency

• Visibility Timeout gives us time to get DB or Instance back online

• Sharded Amazon SQS = won’t affect other shards during downtime

Snapshots

• Set it and forget it

• Reliably works

• Allows us to regularly refresh non-prod DBs via scripts.

Goal: Easily & Quickly Recover from Outage

Page 31

Security Considerations

The Basics

• Application config files use separate restricted accounts (not SA)

• DBs sit in private VPC segment

• Port restrictions done at Security Group Level

• Viddy HQ is whitelisted

• Developers can connect remotely over OpenVPN

• Support staff gets read-only DB access if they know SQL

The Facebook Security Model

• Every developer has access to everything (we’re a team of 7)

• Less friction, empowers developers

• With great privilege comes great responsibility

Page 32

Questions?

Page 33

Try Amazon RDS for SQL Server!

• Start using Transparent Data Encryption (TDE)

– See Amazon RDS for SQL Server documentation: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/

• Try Cross Region Snapshot Copy
