Transcript
Page 1: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

DAT303 - A Closer Look at Amazon RDS for Microsoft SQL Server: Deep Dive into Performance, Security, and Data Migration Best Practices

Sergei Sokolenko - Sr Product Manager, AWS

Allan Parsons - VP Operations, Viddy

November 13, 2013

Page 2: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Next Hour …

• Best Practices

– Security

– Performance

– Data Migration

– Data Durability

• Viddy’s Case

Page 3: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Security Best Practices

Page 4: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Control Access

Internet

IAM

VPC

Page 5: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Encrypt Your Data

• “In transit” with SSL

– Import public Amazon RDS certificate into Windows: https://rds.amazonaws.com/doc/rds-ssl-ca-cert.pem

– Add "encrypt=true" to your connection string

• “At rest” with Transparent Data Encryption

– Encrypts data before writing to storage

– Decrypts when reading
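
As a minimal sketch of the "in transit" step above, the helper below assembles an ADO.NET-style connection string with encryption switched on. The function name, endpoint, and credentials are hypothetical. Only the encrypt=true setting comes from the slide. TrustServerCertificate=false is an added assumption so that the client validates the server against the imported Amazon RDS certificate.

```python
def build_connection_string(endpoint, port, database, user, password):
    """Return an ADO.NET-style SQL Server connection string with SSL enabled.

    All argument values are placeholders supplied by the caller; nothing here
    contacts a server.
    """
    parts = [
        ("Server", f"{endpoint},{port}"),      # SQL Server uses "host,port"
        ("Database", database),
        ("User Id", user),
        ("Password", password),
        ("Encrypt", "true"),                   # the setting named on the slide
        ("TrustServerCertificate", "false"),   # assumption: validate the certificate
    ]
    return ";".join(f"{key}={value}" for key, value in parts) + ";"


if __name__ == "__main__":
    # Hypothetical endpoint and credentials, for illustration only.
    print(build_connection_string(
        "mydb.example.rds.amazonaws.com", 1433, "UserDB1", "usr", "pwd"))
```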

Page 7: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Performance Best Practices

Page 8: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

High Performance Relational Databases

Amazon RDS Configuration

Increase Throughput

Reduce Latency

Push-Button Scaling

DB Shards

Provisioned IOPS


Page 9: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Push Button Scaling & Sharding

• Scale nodes vertically up or down

– M1.small (1 virtual core, 1.7GB)

– M2.4XLarge (8 virtual cores, 64GB)

• Scale out nodes horizontally

– Shard based on data or workload characteristics

Page 10: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Production = Provisioned IOPS

Consistently fast performance

• 1 TB max instance size

• 10,000 Provisioned IOPS

• I/O-Optimized instances

• Check I/O blockers

– Database contention

– Locking

Page 11: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Data Migration Best Practices

Page 12: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Migrating Data to Amazon RDS

Replication + Switchover

Linked Servers

SSIS

Bulk Migration

Import/Export Wizard

BCP Bulk Load

Page 13: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

One-time Bulk Migration

On Premise AWS

Page 14: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Migration Code Snippets

-- Run SSMS's "Generate and Publish Scripts" Wizard

-- .BAT script for export BCP commands

SELECT 'bcp ' + db_name() + '..' + name + ' out "C:\Data\' + name + '.txt" -E -n -S localhost -U usr -P pwd' FROM sysobjects WHERE type = 'U'

bcp dbname..table out "C:\Data\table.txt" -E -n -S localhost -U usr -P pwd

-- .BAT script for import BCP commands

SELECT 'bcp ' + db_name() + '..' + name + ' in "C:\Data\' + name + '.txt" -E -n -S RDSEndpoint -U usr -P pwd' FROM sysobjects WHERE type = 'U'

bcp dbname..table in "C:\Data\table.txt" -E -n -S endpoint,port -U usr -P pwd

More Info: Data Import Guide for SQL Server

Tables Only

Script USE DATABASE = False

Script Check Constraints = False

Script Foreign Keys = False

Script Primary Keys = False

Script Unique Keys = False
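
The export and import bcp lines on this slide follow one pattern per table, so they can also be generated outside SQL Server. The sketch below is one way to do that, assuming the table names are already known. The function names, the default data directory, and the sample values are hypothetical. The flags (-E, -n, -S, -U, -P) are the ones shown on the slide.

```python
def bcp_command(database, table, direction, server, user, password,
                data_dir=r"C:\Data"):
    """Build one bcp command line in the shape shown on the slide.

    direction is "out" (export from the source) or "in" (import into RDS).
    """
    if direction not in ("out", "in"):
        raise ValueError("direction must be 'out' or 'in'")
    data_file = f"{data_dir}\\{table}.txt"
    return (f'bcp {database}..{table} {direction} "{data_file}" '
            f"-E -n -S {server} -U {user} -P {password}")


def bcp_script(database, tables, direction, server, user, password):
    """Return the body of a .BAT file: one bcp command per table."""
    return "\n".join(
        bcp_command(database, table, direction, server, user, password)
        for table in tables
    )


if __name__ == "__main__":
    # Hypothetical database, tables, and credentials.
    print(bcp_script("UserDB1", ["Customers", "Orders"], "out",
                     "localhost", "usr", "pwd"))
```

For the import side, the same call with direction "in" and the server given as "endpoint,port" yields the second form shown on the slide.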

Page 15: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Ongoing Replication with Switchover

[Diagram: SourceINST (On Premise) connected to TargetINST (AWS) through a Linked Server]

Page 16: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

On Target Instance (Amazon RDS)

USE master;

CREATE LOGIN [repl_login] WITH PASSWORD=N'password01', DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF;

USE UserDB1;

CREATE USER [repl_user] FOR LOGIN [repl_login];

EXEC sp_addrolemember 'db_datareader', [repl_user];

EXEC sp_addrolemember 'db_datawriter', [repl_user];

-- Assume Source DB has a table “Customers”

CREATE TABLE StageCustomers ( CustomerID int, UpdatedDate datetime );

Page 17: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

On Source Instance (On-Premise)

USE master;

EXEC sp_addlinkedserver N'TargetINST.amazonaws.com,port', N'SQL Server';

CREATE LOGIN [repl_login] WITH PASSWORD=N'password02', DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF;

EXEC sp_addlinkedsrvlogin
@rmtsrvname = N'TargetINST.amazonaws.com,port',
@useself = 'FALSE', @locallogin = N'repl_login',
@rmtuser = N'repl_login', @rmtpassword = N'password01';

USE UserDB1;

INSERT INTO [TargetINST.amazonaws.com,port].UserDB1.dbo.StageCustomers (CustomerID, UpdatedDate)

SELECT CustomerID,UpdatedDate FROM Customers WHERE UpdatedDate >= DATEADD(DD,-2,GETDATE());
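
The INSERT ... SELECT above moves only rows changed in the last two days into the staging table. The sketch below illustrates the same delta-copy idea using two in-memory SQLite databases as stand-ins for the source and target instances. SQLite, the function name, and the text date format are assumptions made so the example runs anywhere; it is not the linked-server mechanism itself.

```python
import sqlite3
from datetime import datetime, timedelta


def copy_recent_customers(source, target, now, days=2):
    """Copy rows updated within the last `days` days into StageCustomers.

    source and target are sqlite3 connections standing in for the on-premise
    and Amazon RDS instances. Dates are stored as 'YYYY-MM-DD HH:MM:SS' text,
    so string comparison matches chronological order.
    """
    cutoff = (now - timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S")
    rows = source.execute(
        "SELECT CustomerID, UpdatedDate FROM Customers WHERE UpdatedDate >= ?",
        (cutoff,),
    ).fetchall()
    target.executemany(
        "INSERT INTO StageCustomers (CustomerID, UpdatedDate) VALUES (?, ?)",
        rows,
    )
    target.commit()
    return len(rows)


if __name__ == "__main__":
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE Customers (CustomerID int, UpdatedDate text)")
    src.executemany("INSERT INTO Customers VALUES (?, ?)", [
        (1, "2013-11-13 09:00:00"),   # recent: copied
        (2, "2013-11-01 09:00:00"),   # older than two days: skipped
    ])
    tgt = sqlite3.connect(":memory:")
    tgt.execute("CREATE TABLE StageCustomers (CustomerID int, UpdatedDate text)")
    copied = copy_recent_customers(src, tgt, datetime(2013, 11, 13, 12, 0, 0))
    print("rows copied:", copied)
```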

Page 18: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Data Durability Best Practices

Page 19: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Backups and Disaster Recovery

• Automated Backups

– Nightly system snapshots + transaction backup

– Enables point-in-time restore to any point in retention period

– Max retention period = 35 days

• DB Snapshots

– User-driven snapshots of database

– Kept until explicitly deleted
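
To make the retention window concrete, the sketch below checks whether a requested restore time falls inside it. The function and constant names are hypothetical, and the check is a simplification: it treats the whole window up to the current time as restorable and ignores any lag between the newest backup and now.

```python
from datetime import datetime, timedelta

MAX_RETENTION_DAYS = 35  # maximum retention period stated on the slide


def is_restorable(target_time, now, retention_days):
    """Return True if target_time lies within the automated-backup window."""
    if not 0 < retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    earliest = now - timedelta(days=retention_days)
    return earliest <= target_time <= now


if __name__ == "__main__":
    now = datetime(2013, 11, 13, 12, 0, 0)
    print(is_restorable(datetime(2013, 11, 1), now, retention_days=35))
    print(is_restorable(datetime(2013, 9, 1), now, retention_days=35))
```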

Page 20: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Cross Region Snapshot Copy

[Diagram: Region 1 (AZ 1) and Region 2 (AZ 1)]

Page 21: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Viddy’s Case

Allan Parsons, Viddy

Scaling viddy.com on Amazon RDS for SQL Server

Page 22: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Vision

To entertain and connect people around the world by empowering mobile users to easily capture, beautify and share amazing videos to those who matter most.

Page 23: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Viddy By The Numbers

• Reach :: 41+ Million Registered Users

• Connections :: 250+ Million User Connections

• Media :: 6.0+ Million Unique Videos

• CDN Assets (encoded videos + images)

– Videos :: 30+ Million Video Files

– Images :: 2+ Billion Image Files

• Human Power

– Executives & Support Staff :: 4

– Software Engineers :: 6

– DevOps Engineers :: 1

– Database Administrators :: 0

Page 25: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

What Powers Viddy

• Web / Front-End :: Windows / IIS (C# / .NET / MVC)

• Cache :: Linux / memcached (via Couchbase)

• Persistent Cache :: Linux / Redis (2x Master-Slave Environments)

• Source Control :: Team Foundation Server

• Continuous Integration & Build Automation :: Jenkins, PowerShell, MSBuild

• AWS & EC2 Tools

– VPCs :: 1 VPC/Environment (Production, QA, Dev)

– RDS :: 11 SQL Server Instances Housing 144 Databases (Production)

– SNS / SQS :: Used for Eventual Consistency

– Route53 & ELBs :: DNS and Load Balancing

– CloudWatch :: Monitoring & Trending

– CloudSearch :: Media, Tag, and User Searching

– S3 & CloudFront :: Asset Storage and Delivery

We’re a Technology Agnostic Stack & Team

Page 26: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Early Technical Challenges

Wrong Cloud Ideology

• Inherited a PaaS Cloud Infrastructure

Difficulty in Caching Data

• Twitter-based Service Model

Underestimated Power of Facebook

• Open Graph drove 1MM+ User Registrations / 24H Period

Very Very Busy SQL Instance

• 1 Instance, 6 Databases

• Disabled Key Constraints to Improve Performance

• Too busy to get transactionally consistent backups

Inflexible Platform

• Adding machines would make inefficiencies worse

• On PaaS, more money != more scalability

Page 27: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Moving to AWS

VPC

• Guaranteed affinity between Web, Cache, SQL

• Low Latency

• Better security

SQL

• Tremendous cleanup effort

• Created 144 RDS shells & filled them via ETL

• Engineered Eventual Consistency to Move Deltas

Build Automation

• Build Scripts dual-deployed to PaaS and IaaS

• Developers could build & test multiple times per hour on 2 providers

DNS

• Moved all zones to Route53 & Lowered TTLs

• Updated DNS entries Christmas Eve 2012 (low traffic)

Goal: PaaS to IaaS with Zero Downtime

Page 28: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

RDS Eventual Consistency

[1] :: API Servers Push Messages to Amazon SNS Topic

[2] :: Amazon SNS Distributes Message to SQS Queue

[3] :: Windows Service Monitors Queues

[4] :: Windows Service Pushes Message to Shard

Advantages

:: Can lose Windows Service, keep messages

:: Can lose DB Shard, keep messages

:: Easy to Scale! More queues + more messages = more Windows Services / EC2 Machines

Shards Based On UserID (GUID)
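
The flow on this slide can be sketched in a few lines: a message is routed to a shard by its UserID, and a message whose shard is offline stays queued until the shard returns. This is an in-memory illustration only. The deque stands in for an Amazon SQS queue, the modulo mapping from GUID to shard is an assumption (the deck does not say how Viddy maps GUIDs to shards), and the shard count of 64 follows the figure given later in the deck.

```python
import uuid
from collections import deque

SHARD_COUNT = 64  # shard count mentioned later in the deck


def shard_for_user(user_id, shard_count=SHARD_COUNT):
    """Map a UserID (GUID) to a shard index. Modulo mapping is an assumption."""
    return user_id.int % shard_count


def drain(queue, shards, online):
    """Deliver queued (user_id, payload) messages to their shards.

    Messages whose shard is not in `online` are put back on the queue, which
    mirrors the "can lose DB shard, keep messages" advantage.
    """
    remaining = deque()
    while queue:
        user_id, payload = queue.popleft()
        shard = shard_for_user(user_id, len(shards))
        if shard in online:
            shards[shard].append(payload)
        else:
            remaining.append((user_id, payload))
    queue.extend(remaining)


if __name__ == "__main__":
    shards = [[] for _ in range(SHARD_COUNT)]
    queue = deque([(uuid.UUID(int=5), "follow"), (uuid.UUID(int=6), "like")])
    drain(queue, shards, online={6})        # shard 5 is down
    print(len(queue), "message(s) still queued")
    drain(queue, shards, online={5, 6})     # shard 5 is back
    print(len(queue), "message(s) still queued")
```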

Page 29: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Provisioning On RDS

SQL Edition

• SQL Server 2012 Standard (BizSpark)

Storage Allocation

• We took the max (1TB)

• Changing Storage = downtime

IOPS

• Busiest Instance (ViddyDB) has 7,000 provisioned IOPS

• Shards have no provisioned IOPS

• Occasional hotspots when celebrities post content

• Changing IOPS = downtime

Instance Size

• Busiest Instance (ViddyDB) has largest size (m2.4xlarge)

• Shards running (m2.2xlarge)

• Changing Instance Size = downtime

VPC Placement

• VPC guarantees node affinity (ours sit in private segment)

• Change VPC Placement = downtime

Goal: As Hands Off As Possible (we don’t have a DBA)

Page 30: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Designing for High Availability

Amazon RDS In VPCs

• At the time we provisioned (Nov-2012), no data replication across AZs

• Single point of failure is Availability Zone

• Running our own replication meant no RDS (and need a DBA)

• RDS didn’t force SQL Server’s AlwaysOn Technology

Sharded Model

• User exists in 1/64 Consumer Shards & 1/64 Producer Shards

• Database goes down: 1/64 users affected (1.5%)

• Instance goes down: 1/8 users affected (12.5%)
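
The percentages above are simple fractions of the user base, which the short sketch below reproduces, assuming users are spread evenly across shards and instances. The function names are hypothetical. Note that 1/64 works out to 1.5625%, which the slide states as 1.5%.

```python
from fractions import Fraction


def affected_share(failed_units, total_units):
    """Fraction of users affected when failed_units of total_units are down."""
    return Fraction(failed_units, total_units)


def as_percent(share):
    """Convert a Fraction to a percentage as a float."""
    return float(share) * 100


if __name__ == "__main__":
    print("one database down:", as_percent(affected_share(1, 64)), "%")
    print("one instance down:", as_percent(affected_share(1, 8)), "%")
```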

Eventual Consistency

• Amazon SNS/SQS Guarantees Eventual Consistency

• Visibility Timeout gives us time to get DB or Instance back online

• Sharded Amazon SQS = won’t affect other shards during downtime

Snapshots

• Set it and forget it

• Reliably works

• Allows us to regularly refresh non-prod DBs via scripts.

Goal: Easily & Quickly Recover from Outage

Page 31: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Security Considerations

The Basics

• Application config files use separate restricted accounts (not SA)

• DBs sit in private VPC segment

• Port restrictions done at Security Group Level

• Viddy HQ is whitelisted

• Developers can connect remotely over OpenVPN

• Support staff gets read-only DB access if they know SQL

The Facebook Security Model

• Every developer has access to everything (we’re a team of 7)

• Less friction, empowers developers

• With great privilege comes great responsibility

Page 32: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Questions?

Page 33: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Try Amazon RDS for SQL Server!

• Start using Transparent Data Encryption (TDE)

– See Amazon RDS for SQL Server documentation: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/

• Try Cross Region Snapshot Copy

Page 34: Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) | AWS re:Invent 2013

Please give us your feedback on this presentation

As a thank you, we will select prize winners daily for completed surveys!

DAT303

