04-DocumentDB-SQL
Cloud Application Development
STORE DATA IN AZURE
s.l. dr. ing. Daniel Iercan
DocumentDB
What is DocumentDB?
schema free
+
non-trivial queries
+
transactional processing
•NoSQL document database
•JavaScript and JSON
•Rich query (SQL-like) and transactions over schema-free JSON data
  • JSON documents are indexed automatically
• can be queried
•Reliable and configurable performance
  • SSD storage
  • Tune and trade off consistency
    • strong
    • bounded staleness
    • session
    • eventual
•Data automatically replicated
•RESTful API
•Fully managed by Azure
•Elastically scale throughput and storage (database units)
•Open by design (JavaScript, JSON, RESTful HTTP)
Demo: configure DocumentDB in the Azure portal
DocumentDB resources
Database account and administrative quota
•100 databases
•500,000 users
•2,000,000 permissions
Elastic collections
•One database can contain any number of collections (limited by the number of capacity units bought)
•Transaction domain for documents
•Scope for document storage and query execution
What is a Capacity Unit (CU)
•a way to buy resources (CPU, RAM, IO, storage)
•1 CU = 3 elastic collections, 10 GB of SSD, 2000 request units
What is a request unit (RU)
• request unit = CPU + memory + IO
•measured as a rate per second
What is a request unit (RU)
document size 1 KB, 10 properties, session consistency, all documents indexed

DATABASE OPERATION | NUMBER OF OPERATIONS PER SECOND PER CU
Reading a single document | 2000
Inserting/Replacing/Deleting a single document | 500
Querying a collection with a simple predicate and returning a single document | 1000
Stored procedure with 50 document inserts | 20
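The table can be turned into a rough sizing calculation. The sketch below assumes that each rate in the table uses the full 2000 request units of one CU, so the implied cost of an operation is 2000 divided by its rate. That reading is an inference from the slide figures, not a published price list. The function names and the sample workload are hypothetical.

```python
import math

# Figures copied from the slides: 1 CU provides 2000 request units per second.
RU_PER_SECOND_PER_CU = 2000

# Operations per second per CU, from the table (1 KB documents, session consistency).
OPS_PER_SECOND_PER_CU = {
    "read": 2000,
    "write": 500,             # insert / replace / delete
    "query": 1000,            # simple predicate, single document returned
    "sproc_50_inserts": 20,   # stored procedure with 50 document inserts
}


def implied_ru_cost(operation):
    """RU consumed by one operation, assuming the table rate saturates one CU."""
    return RU_PER_SECOND_PER_CU / OPS_PER_SECOND_PER_CU[operation]


def capacity_units_needed(workload):
    """Estimate CUs for a workload given as {operation: operations per second}."""
    total_ru = sum(rate * implied_ru_cost(op) for op, rate in workload.items())
    return max(1, math.ceil(total_ru / RU_PER_SECOND_PER_CU))


# Hypothetical workload: 1500 reads/s, 300 writes/s, 200 queries/s.
sample = {"read": 1500, "write": 300, "query": 200}
print(capacity_units_needed(sample))  # 1500*1 + 300*4 + 200*2 = 3100 RU/s -> 2 CUs
```

Under this reading a read costs 1 RU, a write 4 RU, a simple query 2 RU, and the 50-insert stored procedure 100 RU.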
Developing Against Azure DocumentDB
•RESTful API
•.NET (LINQ provider)
•Node.js
•JavaScript
•Python
•Support for CRUD operations and SQL syntax
DEMO: Overview of the .NET client
DocumentDB resource model
Querying
SQL-like syntax (subset of ANSI SQL)
User-defined functions (JavaScript) can be used in queries
Transactions and JavaScript executions
•Stored procedures (SPs)
•Triggers
•User-defined functions (UDFs)
• JavaScript replaces T-SQL
• JavaScript logic executes in ACID transactions (snapshot isolation)
•The entire transaction is aborted if a JavaScript exception is thrown
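The all-or-nothing behaviour described above can be illustrated with a small simulation. This is a plain Python stand-in, not DocumentDB's JavaScript runtime. The `run_transaction` helper and the dictionary used as a "collection" are hypothetical names chosen for the sketch.

```python
import copy


def run_transaction(collection, script):
    """Run script against a private snapshot; commit only if it does not raise."""
    snapshot = copy.deepcopy(collection)  # the script sees an isolated copy
    result = script(snapshot)             # an exception here skips the commit
    collection.clear()
    collection.update(snapshot)           # commit all changes at once
    return result


def transfer(docs):
    """A script that fails halfway through, after modifying one document."""
    docs["alice"]["balance"] -= 50
    raise RuntimeError("simulated script exception")


accounts = {"alice": {"balance": 100}, "bob": {"balance": 0}}
try:
    run_transaction(accounts, transfer)
except RuntimeError:
    pass

print(accounts["alice"]["balance"])  # still 100: the partial change was discarded
```

Because the script only ever touches the copy, an exception leaves the stored documents exactly as they were before the call.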
Demo
- create SP
- create trigger
- create UDF
http://azure.microsoft.com/en-gb/documentation/articles/documentdb-resources/
Documents
• JSON objects
• Free schema
• Stored in collections
• Can be inserted, replaced, deleted, read, enumerated and queried
Attachments and Media
• binary blobs/media
• can be stored in DocumentDB or externally:
  • an attachment is a special JSON document that captures the metadata of the media stored in a remote media store
  • attachments stored in DocumentDB have a _media property that points to the resource URI
  • attachments in DocumentDB are garbage-collected automatically
  • media stored externally has to be managed by the developer
Users
• Logical names for grouping permissions
• Implement multi-tenancy (one user for each actual application user)
• Shard data:
  • each user maps to a database
  • each user maps to a collection
  • all documents for a user are stored in a collection
  • documents from different users are stored in various collections
Permissions
• administrative resources vs. application resources
• master key vs. resource key
  • master key – access to everything
  • resource key – granular access to specific resources
Optimistic concurrency
ETag and If-Match header attributes
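The ETag / If-Match pattern can be sketched with an in-memory store. The sketch does not call the DocumentDB API. `InMemoryStore` and `PreconditionFailed` are hypothetical names, and the exception stands in for the HTTP 412 (Precondition Failed) status that a server returns when the If-Match value no longer matches.

```python
import uuid


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 response to a stale If-Match header."""


class InMemoryStore:
    """Toy document store that versions every document with an ETag."""

    def __init__(self):
        self._docs = {}  # document id -> (etag, body)

    def create(self, doc_id, body):
        etag = uuid.uuid4().hex
        self._docs[doc_id] = (etag, dict(body))
        return etag

    def read(self, doc_id):
        etag, body = self._docs[doc_id]
        return etag, dict(body)

    def replace(self, doc_id, body, if_match):
        current_etag, _ = self._docs[doc_id]
        if if_match != current_etag:
            raise PreconditionFailed(doc_id)  # someone else wrote first
        new_etag = uuid.uuid4().hex
        self._docs[doc_id] = (new_etag, dict(body))
        return new_etag


store = InMemoryStore()
store.create("order-1", {"status": "new"})

# Two clients read the same version of the document.
etag_a, doc_a = store.read("order-1")
etag_b, doc_b = store.read("order-1")

store.replace("order-1", {"status": "paid"}, if_match=etag_a)  # first writer wins
try:
    store.replace("order-1", {"status": "cancelled"}, if_match=etag_b)
except PreconditionFailed:
    print("stale ETag: re-read the document and retry")
```

The second writer is not blocked by a lock. It finds out at write time that its copy is stale and has to re-read the document before trying again.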
DocumentDB tuning performance
Configuring Indexing Policy of a Collection
• Choose whether the collection automatically indexes all of the documents or not
• Choose whether to include or exclude specific paths or patterns in your documents from the index
• Choose between synchronous (consistent) and asynchronous (lazy) index updates
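The three choices above map naturally onto a small policy document. The sketch below builds one as a Python dictionary. The field names (`automatic`, `indexingMode`, `includedPaths`, `excludedPaths`) follow the indexing policy JSON of the later Azure Cosmos DB service as best recalled and are an assumption here. The exact casing and shape used by the DocumentDB version shown in these slides may differ. The helper name and the excluded path are hypothetical.

```python
import json


def build_indexing_policy(automatic=True, lazy=False, excluded_paths=()):
    """Assemble an indexing policy dict; field names are assumed, not verified."""
    return {
        "automatic": automatic,                            # index every document or not
        "indexingMode": "lazy" if lazy else "consistent",  # async vs. sync updates
        "includedPaths": [{"path": "/*"}],                 # index everything by default
        "excludedPaths": [{"path": p} for p in excluded_paths],
    }


# Hypothetical example: keep a large payload property out of the index.
policy = build_indexing_policy(excluded_paths=["/rawPayload/*"])
print(json.dumps(policy, indent=2))
```

Excluding paths that are never queried and choosing lazy updates both reduce the work done per write. The cost is that some queries become unsupported or see a stale index.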
Configure consistency
trade-offs between consistency, availability and latency
•Strong
•Bounded staleness – total ordering of writes and maximum staleness
•Session – read your own writes
•Eventual
References
http://azure.microsoft.com/en-us/services/documentdb/
http://azure.microsoft.com/en-us/documentation/services/documentdb/
http://azure.microsoft.com/en-us/documentation/articles/documentdb-introduction/
http://azure.microsoft.com/en-gb/documentation/articles/documentdb-resources/
http://azure.microsoft.com/en-gb/documentation/articles/documentdb-interactions-with-resources/
SQL in the Cloud
Need for scalability
As data grows, performance degrades
SQL Partitioning (horizontal scaling)
• Master/Slave: one (master) SQL server for write operations (CRUD), and one or more SQL servers for read operations
  • master can be a bottleneck
  • replication is near-real-time
  • master is a single point of failure
• Cluster Computing: multiple servers that act as nodes and use a centralized shared disk facility
  • all nodes can be used for reads, only one for writes; if the node that does the writes fails, another one takes its place (the shared disk can be a bottleneck, writes do not scale)
  • advanced clustering uses real-time memory replication so that all nodes can do writes (network traffic between nodes can be a bottleneck, in addition to the shared disk)
• Table Partitioning
  • data in a single large table can be split across multiple disks to improve I/O; partitioning can be done horizontally (by rows) as well as vertically (by columns); join operations become an issue
• Federated Tables
  • tables can be accessed across multiple servers (complex to administer; good for reporting but not for general read/write transactions); the federation key is very important
• Sharding (Shared-Nothing)
  • Independent servers (CPU, memory and disk)
  • Smaller databases are easier to manage and maintain, faster, and cheaper
  • Challenges: reliability, distributed queries, avoidance of cross-shard joins, auto-increment key management, support for multiple shard schemas (session-based, transaction-based, statement-based), determining the optimum method for sharding data (by primary key, by modulus of a key, or by maintaining a master shard index table)
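Two of the sharding methods named above, modulus of a key and a master shard index table, can be sketched in a few lines. This is an illustration under simple assumptions (integer keys, shards identified by number or name). `shard_by_modulus` and `MasterShardIndex` are hypothetical names.

```python
def shard_by_modulus(key, shard_count):
    """Pick a shard number by taking an integer key modulo the shard count."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return key % shard_count


class MasterShardIndex:
    """Explicit lookup table that records which shard holds each key."""

    def __init__(self):
        self._table = {}

    def assign(self, key, shard_name):
        self._table[key] = shard_name

    def lookup(self, key):
        return self._table[key]  # KeyError means the key was never placed


print(shard_by_modulus(1042, 4))  # 1042 % 4 = 2

index = MasterShardIndex()
index.assign("customer-17", "shard-eu-1")
print(index.lookup("customer-17"))  # shard-eu-1
```

The modulus approach needs no extra storage, but changing the shard count moves most keys. The index table allows keys to be moved one at a time, at the cost of an extra lookup and a table that must itself be kept available.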
Ways to use SQL in the Cloud
• SQL as a service
• dedicated SQL VM
SQL Azure
• Old – federation
• New – sharding
Multi-tenancy
• a single instance of the software runs on a server, serving multiple tenants
• has to ensure data separation
SQL Azure Database is
Get started quickly
Ready to get started?
Provision Your Server
Server defined
Service head that contains databases
Connect via automatically generated FQDN (xxx.database.windows.net)
Initially contains only a master database
Provision servers interactively
Log on to Windows Azure Management Portal
Create a SQL Azure server
Specify admin login credentials
Add firewall rules and enable service access
Automate server provisioning
Use Windows Azure Platform PowerShell cmdlets (or use REST API directly)
wappowershell.codeplex.com
Build Your Database
Use familiar technologies
Supports Transact-SQL
Supports popular languages
.NET Framework (C#, Visual Basic, F#) via ADO.NET
C / C++ via ODBC
Java via Microsoft JDBC provider
PHP via Microsoft PHP provider
Supports popular frameworks
OData (REST data access)
Entity Framework
WCF Data Services
NHibernate
Supports popular tools
SQL Server Management Studio (2008 R2 and later)
SQL Server command-line utilities (SQLCMD, BCP)
CA Erwin® Data Modeler
Embarcadero Technologies DBArtisan®
Differences in comparison to SQL Server
Focus on logical vs. physical administration
Database and log files automatically placed
Three high-availability replicas maintained for every database
Databases are fully contained
Tables require a clustered index
Maximum database size is 50 GB
Unsupported SQL Server features
BACKUP / RESTORE
USE command, linked servers, distributed transactions, distributed views, distributed queries, four-part names
Service Broker
Common Language Runtime (CLR)
SQL Agent
Database
Thin client database development
Rich client database development
Data-tier Application Framework (DAC Fx)
How to get the latest DAC Fx
Interactive approach for dacpac v1 and v2
Interactive approach for bacpac v2
Upgrading a dacpac or bacpac
Secure Your Database
Server identity and access control
SQL authentication supported
Integrated authentication not supported
Connect to master to administer logins and create / drop databases
The admin login (configured during service provisioning) is like sa
The admin login has full rights on the server (and all databases) and should only be used for administration
Manage logins with CREATE / ALTER / DROP LOGIN commands
Membership in the loginmanager server role grants CREATE / ALTER / DROP LOGIN privileges
Membership in the dbmanager server role grants CREATE / DROP DATABASE privileges
Database identity and access control
Logins must have an associated user account to connect to a database
The admin login is automatically associated with a special user known as dbo (database owner)
The dbo has full rights in the database and should only be used for administration
Manage users with CREATE / ALTER / DROP USER commands
Add users to system or user-defined database roles to grant privileges via sp_addrolemember
Organize database objects into schema containers based upon common access control requirements
Grant privileges to schema containers instead of individual objects for better productivity
Connect Your Application
Connecting to SQL Azure
TDS (Tabular Data Stream) protocol over TCP/IP supported
SSL required
Use firewall rules to connect from outside Microsoft data center
ASP.NET example:
<connectionStrings>
  <add name="AdventureWorks"
       connectionString="Data Source=[server].database.windows.net;Integrated Security=False;Initial Catalog=ProductsDb;User Id=[login];Password=[password];Encrypt=true;"
       providerName="System.Data.SqlClient"/>
</connectionStrings>
Special considerations
Legacy tools and providers may require a special format for the login: [login]@[server]
Idle connections are terminated after 30 minutes
Long-running transactions are terminated after 24 hours
DoS guard terminates suspect connections with no error message
Failover events terminate connections
Throttling may cause errors
Use connection pooling and implement retry logic to handle transient failures
Latency is introduced for updates due to HA replicas
No cross-database dependencies; result sets from different databases must be combined in the application tier
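The advice to implement retry logic for transient failures can be sketched as a generic wrapper with exponential backoff and jitter. The sketch is not tied to any database driver. `TransientError` is a hypothetical exception type that a real application would replace with whatever its client library raises for dropped connections or throttling, and the delay values are illustrative.

```python
import random
import time


class TransientError(Exception):
    """Placeholder for driver errors such as dropped connections or throttling."""


def with_retries(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on TransientError wait and retry, doubling the delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries


# Usage with a simulated operation that fails twice before succeeding.
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("connection terminated")
    return ["row-1", "row-2"]


waits = []
rows = with_retries(flaky_query, sleep=waits.append)  # record delays instead of sleeping
print(rows, len(waits))
```

Only errors known to be transient are retried. Anything else propagates immediately, so a genuine bug is not hidden behind repeated attempts.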
Demo: SQL db from laboratory enrol.
NoSQL vs. SQL
Performance
certain types of queries can be slow
for business applications, SQL is most likely the better choice
for fetching small pieces of information under high traffic and concurrency, NoSQL is better
Business Intelligence
works best with SQL
NoSQL (wide-column) works well with "big data"
References
http://codefutures.com/database-sharding/
http://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/