Building Scalable SQL Applications Using NoSQL Paradigms Michael Rys Program Manager, SQL Server Engine Team ([email protected] , @SQLServerMike)

Page 1: Building Scalable SQL Applications Using NoSQL Paradigms

Building Scalable SQL Applications Using NoSQL Paradigms

Michael Rys, Program Manager, SQL Server Engine Team ([email protected], @SQLServerMike)

Page 2: Building Scalable SQL Applications Using NoSQL Paradigms

Agenda

• Scale-Out Application Pattern for SQL Server and SQL Azure:
  – MySpace.com
  – MSN Casual Games
• Applications do all the work
• Databases provide platform support:
  – Reliable messaging infrastructure
  – Sharding in the data platform
• Demo: SQL Azure Federations
• Roadmap

Page 3: Building Scalable SQL Applications Using NoSQL Paradigms

MySpace: the Business Problem

• 223M users
• 4.4 million concurrent users
• 900 terabytes of data
• Horizontally partitioned by user ID
• 450 SQL Servers
• Required (eventual) data consistency across databases
  – E.g. show your updated state in your friends’ profile pages

Page 4: Building Scalable SQL Applications Using NoSQL Paradigms

MySpace’s Data Consistency Problem

[Diagram: a web tier above a data tier of databases partitioned by user ID range: 1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-5000, 5001-6000.]

• I change my status (userId=1024)
• My DB gets updated
• How to reflect the change in my friends’ DBs?
  – Reliable
  – Scalable
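The partition labels in the diagram (1-1000, 1001-2000, ...) suggest fixed-width user ID ranges. A minimal sketch of routing a user ID to its partition under that assumption; the width of 1000 comes from the diagram labels, while the function name `partition_for_user` and the zero-based partition index are illustrative and not MySpace's actual code.

```python
RANGE_WIDTH = 1000  # assumption: every partition holds 1000 consecutive user IDs


def partition_for_user(user_id: int) -> tuple[int, int, int]:
    """Return (partition index, first user ID, last user ID) for a user ID."""
    if user_id < 1:
        raise ValueError("user IDs start at 1")
    index = (user_id - 1) // RANGE_WIDTH
    low = index * RANGE_WIDTH + 1
    high = low + RANGE_WIDTH - 1
    return index, low, high


index, low, high = partition_for_user(1024)
print(f"userId=1024 lives in partition {index} ({low}-{high})")
```

With these assumptions, the status change for userId=1024 lands in the 1001-2000 database, while friends in other ranges live in different databases, which is why the change has to be propagated.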

Page 5: Building Scalable SQL Applications Using NoSQL Paradigms

MySpace’s Solution

• Propagate data changes from one DB to other DBs using a reliable, asynchronous message service (Service Broker)
  – Managing routes from each DB to every other DB would be too complex
  – Global transactions would hinder scale and availability
• Also used to:
  – Clean up state (e.g. on account close)
  – Deploy business logic (stored procedures)

Page 6: Building Scalable SQL Applications Using NoSQL Paradigms

MySpace’s Service Dispatcher

• Coordination point between all SQL Servers
  – Centralizes route management
  – Avoids route explosion
• Load-balanced across 30 SQL Servers
  – Messages are sent randomly to these
• Enables multicast/broadcast functionality
  – Supports destination lists and wildcards, e.g. [DB1, DB3, DB4], DB%
• 18,000 ~2k msgs/sec per dispatcher SQL Server
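The destination lists and wildcards mentioned above can be illustrated with a small resolver. This is a sketch only: it assumes the wildcard follows SQL LIKE semantics (`%` matches any run of characters), and the registry `KNOWN_DATABASES` and the function names are hypothetical, not part of the dispatcher described in the talk.

```python
import re

# Hypothetical registry of database names known to the dispatcher.
KNOWN_DATABASES = ["DB1", "DB2", "DB3", "DB4", "Archive1"]


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE-style pattern ('%' = any run of characters) into a regex."""
    parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile("^" + ".*".join(parts) + "$")


def resolve_destinations(destination, known=KNOWN_DATABASES):
    """Expand an explicit list or a wildcard pattern into concrete database names."""
    if isinstance(destination, (list, tuple)):
        # Explicit destination list: keep only names the dispatcher knows.
        return [name for name in destination if name in known]
    regex = like_to_regex(destination)
    return [name for name in known if regex.match(name)]


print(resolve_destinations(["DB1", "DB3", "DB4"]))  # multicast to a list
print(resolve_destinations("DB%"))                  # broadcast by wildcard
```

Because every sender only needs a route to the dispatcher, the list of concrete targets is resolved in one place rather than in each database.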

Page 7: Building Scalable SQL Applications Using NoSQL Paradigms

MySpace Architecture

[Diagram: web tier and data tier with partitions 1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-5000 and 5001-6000. Labels: "I change my status" (userId=1024); "My DB gets updated" (TX1); Service Broker; Service Dispatcher; TX2, TX3, TX4, TX5.]

Page 8: Building Scalable SQL Applications Using NoSQL Paradigms

Many other customers using similar patterns

Online electronic stores (cannot give names)

Travel reservation systems (e.g. Choice International)

Page 9: Building Scalable SQL Applications Using NoSQL Paradigms
Page 10: Building Scalable SQL Applications Using NoSQL Paradigms

MSN Casual Games Architecture: Goals

• Provide an elastically scalable, highly available, agile online platform for an integrated, social gaming experience
  – Developed v1 to v3 with 7 devs in 3 months!
• Support multiple gaming platforms:
  – Windows Live Messenger Games
  – MSN Games
  – Bing Games
• Integrate into social environments
  – Favorite games, friends’ high scores, compete against friends, etc.
  – Social networks: MSN, Facebook, etc.

Page 11: Building Scalable SQL Applications Using NoSQL Paradigms

MSN Casual Games Architecture: Overview

[Diagram labels: MSN Games, WLM Games and Bing Games web portals; GameBar Host; Social Networks; Auth; STS; Ops; Management Services; Front Door Router Services; Data Backend Services; Azure Data Centers.]

Page 12: Building Scalable SQL Applications Using NoSQL Paradigms

Why SQL Azure?

• Faster than Table Storage
• Very low learning curve
  – Prototyped and written in less than 4 weeks by 1 dev
• Testability: easy to prepopulate with millions of records
• Compatible with SQL Server

Page 13: Building Scalable SQL Applications Using NoSQL Paradigms

How does it scale?

• ~2 million users at launch
• ~86 million service requests/day
• 135 Windows Azure Data Services hosting VMs
• Ca. 18K connections in connection pools; this could grow with traffic
• Ca. 1,200 SQL Azure requests/second spread across all partitions during peak load
• ~90% reads vs. 10% writes (this varies per storage type)
• ~200 bytes of storage per user
• ~20% of database storage is currently used, but expect this to grow

Page 14: Building Scalable SQL Applications Using NoSQL Paradigms

Partitioning Strategies

• Built to scale: functional and data partitioning

• 398 databases (10 GB each)

• 100 social user information + 298 leaderboard databases

• Social user information partitioned by UserId

• Leaderboard partitioned by GameId
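The two-level scheme above (first by function, then by key) can be sketched as a small routing function. The slides do not say how keys map to databases, so the modulo mapping, the integer keys and the database naming pattern below are all assumptions made for illustration.

```python
USER_DB_COUNT = 100         # social user information databases (from the slide)
LEADERBOARD_DB_COUNT = 298  # leaderboard databases (from the slide)


def database_for(store: str, key: int) -> str:
    """Pick a database name: first by function (store), then by data (key)."""
    if store == "user":
        # Assumed mapping: UserId modulo the number of user databases.
        return f"DBUser_{key % USER_DB_COUNT:03d}"
    if store == "leaderboard":
        # Assumed mapping: GameId modulo the number of leaderboard databases.
        return f"DBLeaderboard_{key % LEADERBOARD_DB_COUNT:03d}"
    raise ValueError(f"unknown store: {store}")


print(database_for("user", 123456))      # keyed by UserId
print(database_for("leaderboard", 777))  # keyed by GameId
```

A fixed modulo is simple but makes it hard to add databases later; range-based schemes such as the federations shown later in the deck are designed to allow online splits.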

Page 15: Building Scalable SQL Applications Using NoSQL Paradigms

Data Backend Services

[Diagram: Front Door Router Services and STS in front of the Data Backend Services (250 instances). Labels recoverable from the diagram:]

• Services: Social Services, Gamer Services, Game Ingestion, Game Catalog
• Operations shown: find friends’ profiles; get my profile; publish feed, read feed; last played; favorites; game preferences; social leaderboards; get friends’ high scores; write user-specific game infos; game binaries; game metadata; disable/enable games from accessing services
• DBUser: partitioned over 100 SQL Azure DBs
• DBLeaderboard: partitioned over 298 SQL Azure DBs

Page 16: Building Scalable SQL Applications Using NoSQL Paradigms

Common Data Services Features

• Fanout: Parallel calls to multiple database partitions

• Quorum: Able to tolerate a percentage of request failures during Fanout

• Retry: Retry database requests that fail with an error
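The three features above combine naturally. The following is a minimal sketch, not the MSN Casual Games implementation: the function names, the thread-pool approach, the 80% quorum and the fake `high_score` query are all assumptions chosen to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def with_retry(query, partition, attempts=3):
    """Retry: call query(partition) up to `attempts` times before giving up."""
    last_error = None
    for _ in range(attempts):
        try:
            return query(partition)
        except Exception as error:  # a real service would catch specific DB errors
            last_error = error
    raise last_error


def fanout(query, partitions, quorum=0.8, attempts=3):
    """Fanout: query all partitions in parallel.

    Quorum: succeed if at least `quorum` (a fraction) of the partitions answer.
    """
    results, failures = [], []
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        futures = {pool.submit(with_retry, query, p, attempts): p for p in partitions}
        for future, partition in futures.items():
            try:
                results.append(future.result())
            except Exception:
                failures.append(partition)
    if len(results) < quorum * len(partitions):
        raise RuntimeError(f"quorum not reached, failed partitions: {failures}")
    return results, failures


# Hypothetical query: partition 3 is always down, all others return a score.
def high_score(partition):
    if partition == 3:
        raise ConnectionError("partition unavailable")
    return partition * 10


scores, failed = fanout(high_score, partitions=list(range(10)), quorum=0.8)
print(sorted(scores), failed)
```

Here nine of ten partitions answer, which satisfies the 80% quorum, so the caller gets a partial but usable result together with the list of partitions that failed.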

Page 17: Building Scalable SQL Applications Using NoSQL Paradigms

Operational Games Dashboard: statistics per game

Page 18: Building Scalable SQL Applications Using NoSQL Paradigms

Lessons Learned from Both Scenarios

• Require high availability
• Be able to scale out:
  – Functional and data partitioning architecture
• Provide scale-out processing:
  – Function shipping
  – Fanout and Map/Reduce processing
• Be able to deal with failures:
  – Quorum
  – Retries
  – Eventual consistency (similar to read-consistent snapshot isolation)
• Be able to quickly grow and change:
  – Elastic scale
  – Flexible, open schema
  – Multi-version schema support

Move better support for these patterns into the data platform!

Page 19: Building Scalable SQL Applications Using NoSQL Paradigms

What is NoSQL?

• It is different things to different people
• But for all, it is about app availability, scalability, agility!
• Processing paradigms:
  – High availability (fast failover, DR/GeoDR)
  – Scale-out (sharding, Map-Reduce, BigData)
  – Performance (caching)
  – Eventual consistency
• Data model paradigms:
  – Data first: flexible open schema
  – Wide tables: for “relational” data (HyperTable, Windows Azure Tables, HBase, SQL Server Sparse Columns)
  – Graph stores: for “relationship” graphs, “semantics”
  – Key/document stores: for “object” and “hierarchical” data (JSON: MongoDB, CouchBase, Riak etc.; XML: MarkLogic etc.)
• Cost: cheap to build, cheap to operate

NoSQL is all about operational and developer agility!

Page 20: Building Scalable SQL Applications Using NoSQL Paradigms

Operational Agility

• You want:
  – Availability of service (scalability)
  – Global consistency
  – Network partition tolerance
• You can only get 2 of 3 (CAP theorem)
• In the brave new world:
  – Online businesses need availability
  – It is distributed because it is big, thus network partitioning is unavoidable
  – Hence global consistency must be relaxed

Page 21: Building Scalable SQL Applications Using NoSQL Paradigms

Operational Agility

• Performance and elastic scale on demand
• Automate management lifecycle (or fail)
• Simple deployment lifecycle
• No DB or OS admin telling me what to do

Page 22: Building Scalable SQL Applications Using NoSQL Paradigms

Developer Agility

• Code first and revise quickly
• Application model first (before database)
• Flexible open data models
  – You don’t know exactly what you are looking for
• Lower pain of adoption and maintenance
• No DB or OS admin telling me what to do

Page 23: Building Scalable SQL Applications Using NoSQL Paradigms

Introducing SQL Azure Federations

• Provides data partitioning/sharding at the data platform
• Enables applications to build elastic scale-out applications
• Provides non-blocking SPLIT/DROP for shards (MERGE to come later)
• Auto-connect to the right shard based on the sharding key value
• Provides SPLIT-resilient query mode

Page 24: Building Scalable SQL Applications Using NoSQL Paradigms

SQL Azure Federation Concepts

• Federation: represents the data being sharded
• Federation Root: database that logically houses federations, contains federation metadata
• Federation Key: value that determines the routing of a piece of data (defines a Federation Distribution)
• Federation Member (aka Shard): a physical container for a set of federated tables for a specific key range, plus reference tables
• Federated Table: table that contains only atomic units for the member’s key range
• Reference Table: non-sharded table
• Atomic Unit (AU): all rows with the same federation key value: always together!

[Diagram: a sharded application connects through the connection gateway to an Azure DB with the federation root (Federation Directories, Federation Users, Federation Distributions, …). Federation “Orders_Fed” (Federation Key: CustomerID) has three members:]

• Member: PK [min, 100), with atomic units PK=5, PK=25, PK=35
• Member: PK [100, 488), with atomic units PK=105, PK=235, PK=365
• Member: PK [488, max), with atomic units PK=555, PK=2545, PK=3565
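The member ranges in the diagram are half-open intervals, so routing a federation key to its member is a binary search over the split points. A minimal sketch, assuming the boundaries 100 and 488 from the diagram; the member names and the function `member_for_key` are hypothetical and only illustrate the lookup, not how the gateway implements it.

```python
import bisect

# Split points taken from the diagram: [min, 100), [100, 488), [488, max).
BOUNDARIES = [100, 488]
MEMBERS = ["member_0", "member_1", "member_2"]  # hypothetical member names


def member_for_key(customer_id: int) -> str:
    """Return the member whose half-open range [low, high) contains the key."""
    # bisect_right counts the boundaries <= key, so a key equal to a boundary
    # falls into the member that starts at that boundary.
    return MEMBERS[bisect.bisect_right(BOUNDARIES, customer_id)]


for key in (5, 35, 100, 365, 488, 3565):
    print(key, "->", member_for_key(key))
```

All rows of one atomic unit share a key value, so they always resolve to the same member.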

Page 25: Building Scalable SQL Applications Using NoSQL Paradigms

Demo: Map-Reduce scale-out over SQL Azure Federations

• Sharded GamesInfo table using SQL Azure Federations

• Use a C# library that implements a Map/Reduce processor on top of SQL Azure Federations

• Mapper and Reducer are specified using SQL
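The demo's C# library is not shown in the slides. As a rough illustration of the idea of a SQL-specified mapper and reducer, the sketch below uses in-memory SQLite databases as stand-ins for federation members. The GamesInfo columns (`game_id`, `plays`), the scratch table `mapped` and all function names are invented for this example.

```python
import sqlite3

# Assumption: each in-memory SQLite database stands in for one federation member.
# The GamesInfo columns (game_id, plays) are invented for this sketch.
def make_shard(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE GamesInfo (game_id INTEGER, plays INTEGER)")
    conn.executemany("INSERT INTO GamesInfo VALUES (?, ?)", rows)
    return conn


MAPPER_SQL = "SELECT game_id, SUM(plays) AS plays FROM GamesInfo GROUP BY game_id"
REDUCER_SQL = "SELECT game_id, SUM(plays) FROM mapped GROUP BY game_id ORDER BY game_id"


def map_reduce(shards, mapper_sql, reducer_sql):
    """Run the mapper on every shard, then the reducer over the combined output."""
    scratch = sqlite3.connect(":memory:")
    scratch.execute("CREATE TABLE mapped (game_id INTEGER, plays INTEGER)")
    for shard in shards:  # map phase: one partial result set per shard
        scratch.executemany("INSERT INTO mapped VALUES (?, ?)",
                            shard.execute(mapper_sql).fetchall())
    return scratch.execute(reducer_sql).fetchall()  # reduce phase


shards = [make_shard([(1, 10), (2, 5)]), make_shard([(1, 7), (3, 2)])]
print(map_reduce(shards, MAPPER_SQL, REDUCER_SQL))
```

The mapper pre-aggregates on each shard so that only small partial results travel to the reducer; in a real deployment the map phase would run in parallel against each member.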

Page 26: Building Scalable SQL Applications Using NoSQL Paradigms

SQL Azure: A Not Only SQL Data Platform

• SQL Azure is adding data platform support for NoSQL paradigms:
  – High availability (each DB has two replicas)
  – Sharding support with federations: the data platform provides online SPLIT/DROP; filtered connections provide a split-resilient programming model
  – Flexible data models: XML support, sparse columns/column sets
• More to come in the future…

Page 27: Building Scalable SQL Applications Using NoSQL Paradigms

Related Resources

• Windows Gaming Experience Case Study: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=4000008310
• Related Whitepapers:
  – CACM: Scalable SQL: http://cacm.acm.org/magazines/2011/6/108663-scalable-sql
  – NoSQL and the Windows Azure Platform: http://download.microsoft.com/download/9/E/9/9E9F240D-0EB6-472E-B4DE-6D9FCBB505DD/Windows%20Azure%20No%20SQL%20White%20Paper.pdf
• SQL Federation blog: http://blogs.msdn.com/b/cbiyikoglu/archive/2011/03/03/nosql-genes-in-sql-azure-federations.aspx
• Contact me: [email protected] @SQLServerMike http://sqlblog.com/blogs/michael_rys/default.aspx

Page 28: Building Scalable SQL Applications Using NoSQL Paradigms

Appendix

SQL Azure Federations Details

Page 29: Building Scalable SQL Applications Using NoSQL Paradigms

CREATE FEDERATION

[Diagram: a connection goes through the gateway to the existing database.]

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

CREATE FEDERATION sales (customer_id bigint RANGE)

Page 30: Building Scalable SQL Applications Using NoSQL Paradigms

Federation with a Single Shard

[Diagram: the gateway connects to the existing database, which holds the federation “sales”; the federation has one Federation Member with Range: Min...Max.]

Database root contains:
• Federation root = DB-level object containing the federation scheme
• Federation users
• Federation metadata incl. federation map

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

CREATE FEDERATION sales (customer_id bigint RANGE)

Page 31: Building Scalable SQL Applications Using NoSQL Paradigms

Introducing Two Connection Modes

• Filtered Connection
  – Guarantees that any queries or DML will produce the same results independent of changes to the physical layout of the federation members
  – Scoped to an “Atomic Unit”
• Unfiltered Connection
  – Scoped to a Federation Member
  – Management Connection

Page 32: Building Scalable SQL Applications Using NoSQL Paradigms

Create Schema on Member: Management Connection

[Diagram: the gateway connects to the existing database holding the federation “sales”; the Federation Member (Range: Min...Max) receives the tables Customer (federated), Order (federated) and Product (non-federated).]

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION sales (customer_id=0) WITH FILTERING=OFF, RESET;

CREATE TABLE …

© 2011 Microsoft Corporation. Microsoft Materials - Confidential. All rights reserved. CITA # MSFT101120_A

Page 33: Building Scalable SQL Applications Using NoSQL Paradigms

DDL

CREATE TABLE customer (
    c_id bigint PRIMARY KEY,
    …
) FEDERATED ON (customer_id=c_id);

CREATE TABLE order (
    item_num int,
    customer_id bigint,
    date_sold datetime2,
    …,
    CONSTRAINT PK_Order PRIMARY KEY (item_num, customer_id, date_sold),
    CONSTRAINT FK_Cust FOREIGN KEY (customer_id) REFERENCES customer (c_id)
) FEDERATED ON (customer_id=customer_id);

CREATE TABLE product (
    product_name varchar(100) NOT NULL,
    unit_price money,
    item_num int PRIMARY KEY,
    …
);

Page 34: Building Scalable SQL Applications Using NoSQL Paradigms

More Detail

• Supported data types for federation key: bigint, int, GUID, and varbinary(900)
  – Only range partitioning
• Federation key must be part of unique index
• Foreign key constraints only allowed between federated tables and from federated table to reference table
• Not all Azure programmability features supported
  – Sequence, timestamp
• Additional surface area restrictions
  – Indexed views, drop database (members)
• Schemas are allowed to diverge over time
  – Furthermore, in v1, schema updates to existing members must be done in each member (where the change is needed)
• USE FEDERATION “rewires a connection”
  – Connection is reestablished
  – All existing settings and context of the connection are lost (sp_reset_connection)
  – Must be in a batch by itself

Page 35: Building Scalable SQL Applications Using NoSQL Paradigms

Connect to Atomic Unit: Filtered

[Diagram: the gateway connects to the existing database holding the federation “sales”; the Federation Member (Range: Min...Max) contains the tables customer, order and product, with the rows for customer_id=3 highlighted.]

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION sales (customer_id=3) WITH FILTERING=ON, RESET;

When a connection is filtered to a specific key value, SELECT only returns records from federated tables that match that value. It still returns all records from non-federated tables. INSERTs and UPDATEs operating outside of the value will fail.

SELECT * from customer

SELECT * from order

SELECT * from product
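The behavior of a filtered connection can be imitated in a few lines. This is a simulation under assumptions, not the SQL Azure mechanism: SQLite stands in for a federation member, the table is called `orders` because ORDER is a reserved word in SQLite, and the helper names and sample rows are made up.

```python
import sqlite3

# Assumption: the federation column of each federated table; None marks a
# reference table. "orders" is used because ORDER is a reserved word in SQLite.
FEDERATION_COLUMN = {"customer": "c_id", "orders": "customer_id", "product": None}


def setup():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (c_id INTEGER PRIMARY KEY);
        CREATE TABLE orders (item_num INTEGER, customer_id INTEGER);
        CREATE TABLE product (item_num INTEGER PRIMARY KEY);
        INSERT INTO customer VALUES (3), (40), (58);
        INSERT INTO orders VALUES (1, 3), (2, 58), (1, 58);
        INSERT INTO product VALUES (1), (2);
    """)
    return conn


def filtered_select(conn, table, key):
    """SELECT * with the federation-key filter injected for federated tables."""
    column = FEDERATION_COLUMN[table]  # also restricts `table` to known names
    if column is None:
        return conn.execute(f"SELECT * FROM {table}").fetchall()
    return conn.execute(f"SELECT * FROM {table} WHERE {column} = ?", (key,)).fetchall()


def filtered_insert_order(conn, key, item_num, customer_id):
    """Reject writes outside the atomic unit the connection is scoped to."""
    if customer_id != key:
        raise PermissionError("row is outside the filtered key value")
    conn.execute("INSERT INTO orders VALUES (?, ?)", (item_num, customer_id))


conn = setup()
print(filtered_select(conn, "customer", 3))  # only customer 3
print(filtered_select(conn, "orders", 3))    # only orders of customer 3
print(filtered_select(conn, "product", 3))   # all products (reference table)
```

The application's query text stays the same (`SELECT * FROM ...`); the filter is supplied by the connection scope, which is what makes the queries independent of how members are later split.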

Page 36: Building Scalable SQL Applications Using NoSQL Paradigms

More on Connection Filtering

• Most operations behave differently in filtered vs. unfiltered connections
• Connection filtering is a property of the session
  – Filter injected dynamically at runtime
  – Cannot inspect source code to determine how it behaves
  – E.g., running a stored proc written for filtered mode on an unfiltered connection could lead to unintended results
• There are several operations that will not work in a filtered connection in v1
  – DDL, DML on reference tables, …
• Fan-out, bulk operations not efficient in filtered mode
  – For now, filter=off is our best offer

Page 37: Building Scalable SQL Applications Using NoSQL Paradigms

Support Matrix

Operation                                      Filtered  Unfiltered  Named (unfiltered)
Dynamic SELECT                                 Yes       Yes         Yes
DML* (federated tables)                        Yes       Yes         Yes
DML* (reference tables)                        No        Yes         Yes
DDL                                            No        Yes         Yes
Views (not indexed)                            Yes       Yes         Yes
UDF - activate                                 Yes       Yes         Yes
Stored Proc - activate                         Yes       Yes         Yes
Trigger (all modes) - activate                 Yes       Yes         Yes
CREATE/UPDATE Stats                            No        Yes         Yes
Bulk Ops (openrowset bulk, bcp, bulk insert)   No        Yes         Yes

* not including SELECT & modules

^ autostats will work on all connections

System stored procs, intrinsics will be unaffected (run unfiltered)

Page 38: Building Scalable SQL Applications Using NoSQL Paradigms

Splitting a Member

[Diagram: the gateway connects to the existing database holding the federation “sales”; the Federation Member (Range: Min...Max) contains the tables customer, order and product with rows for customer_id 3, 40 and 58.]

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION ROOT WITH RESET

ALTER FEDERATION sales SPLIT AT (customer_id=50)

USE FEDERATION ROOT moves the connection out of a member and back into the database that hosts the federation.

Page 39: Building Scalable SQL Applications Using NoSQL Paradigms

Two New Members

[Diagram: the gateway connects to the existing database holding the federation “sales”, which now has two Federation Members, each with the tables customer, order and product: one with Range: Min...50 and one with Range: 51...Max.]

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION ROOT WITH RESET

ALTER FEDERATION sales SPLIT AT (customer_id=50)
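The effect of the split on routing can be modeled as adding one split point to the federation map. A minimal sketch, assuming half-open ranges as in the [min, 100) notation on the concepts slide, so that a key equal to the split point routes to the upper member; the function names are hypothetical and the model ignores the data movement that the platform performs online.

```python
import bisect


def split_at(boundaries, split_point):
    """Return a new sorted boundary list with one more split point."""
    if split_point in boundaries:
        raise ValueError("a member already starts at this key")
    new_boundaries = list(boundaries)
    bisect.insort(new_boundaries, split_point)
    return new_boundaries


def member_index(boundaries, key):
    """Index of the member whose half-open range contains the key."""
    return bisect.bisect_right(boundaries, key)


before = []                   # a single member covering Min...Max
after = split_at(before, 50)  # ALTER FEDERATION sales SPLIT AT (customer_id=50)

for key in (3, 40, 58):
    print(key, member_index(before, key), "->", member_index(after, key))
```

Keys 3 and 40 stay in the lower member while 58 moves to the new upper member; an application that connects by key value does not need to know that the map changed.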

Page 40: Building Scalable SQL Applications Using NoSQL Paradigms

Two New Members

[Diagram: the gateway routes the connection for customer_id=40 to the member with Range: Min...50; the other member has Range: 51...Max and holds the rows for customer_id 58. Each member contains the tables customer, order and product.]

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION sales (customer_id=40) WITH FILTERING=ON, RESET;

SELECT * from customer

SELECT * from order