This is a deep-dive session on Windows Azure Table Storage given by Sundararajan S (http://sundars.net) at B.Net DevCon.
WINDOWS AZURE TABLE STORAGE – DEEP DIVE
SUNDARARAJAN SUBRAMANIAN
ASSOCIATE TECHNICAL ARCHITECT
AZURE - STORAGE
Tables – Provide structured storage. A Table is a set of entities, which contain a set of properties
Queues – Provide reliable storage and delivery of messages for an application
Blobs – Provide a simple interface for storing named files along with metadata for the file
Drives – Provide durable NTFS volumes for Windows Azure applications to use
WINDOWS AZURE TABLES
Provides Structured Storage
• Massively Scalable Tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly Available & Durable
  • Data is replicated several times
Familiar and Easy to use API
• ADO.NET Data Services – .NET 3.5 SP1
  • .NET classes and LINQ
• REST – with any platform or language
TABLE STORAGE CONCEPTS
Accounts > Tables > Entities
Account: sundars
• Table: BlogPosts
  • Entity: Blogtitle=…, Name=…
  • Entity: Blogtitle=…, Name=…
• Table: Users
  • Entity: Name=…, Id=…
  • Entity: Name=…, Id=…
TABLE DATA MODEL
Table
• A storage account can create many tables
• Table name is scoped by account
• Set of entities (i.e. rows)
Entity
• Set of properties (columns)
• Required properties
  • PartitionKey, RowKey and Timestamp
REQUIRED ENTITY PROPERTIES
PartitionKey & RowKey
• Together uniquely identify an entity
• Define the sort order
• Use them to scale your application
Timestamp
• Read only
• Used for optimistic concurrency
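The role of the required properties can be illustrated with a small Python sketch. It models entities as plain dictionaries and does not call any Azure SDK; the helper names `make_entity` and `sorted_entities` and the sample data are assumptions made for illustration only.

```python
from datetime import datetime, timezone


def make_entity(partition_key, row_key, **properties):
    """Build an entity dict carrying the three required properties."""
    entity = {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        # In the real service the server maintains Timestamp; clients only read it.
        "Timestamp": datetime.now(timezone.utc),
    }
    entity.update(properties)
    return entity


def sorted_entities(entities):
    """Order entities by PartitionKey first, then by RowKey."""
    return sorted(entities, key=lambda e: (e["PartitionKey"], e["RowKey"]))


posts = [
    make_entity("2010", "1001", Title="Blogtitle2"),
    make_entity("2009", "1003", Title="Blogtitle4"),
    make_entity("2010", "1000", Title="Blogtitle1"),
]
ordered = [(e["PartitionKey"], e["RowKey"]) for e in sorted_entities(posts)]
print(ordered)
```

The (PartitionKey, RowKey) pair acts as both the identity of an entity and its sort key, which is why the choice of these two values shapes query performance.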
PARTITIONKEY AND PARTITIONS
PartitionKey
• Used to group entities in the table into partitions
A table partition
• All entities with the same partition key value
• Unit of scale
• Control entity locality
• Row key provides uniqueness within a partition
PARTITIONING TABLES
PartitionKey  RowKey  Title       Partition
2010          1000    Blogtitle1  Partition 1
2010          1001    Blogtitle2  Partition 1
2010          1002    Blogtitle3  Partition 1
2009          1003    Blogtitle4  Partition 2
2009          1004    Blogtitle5  Partition 2
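As a rough illustration of how the PartitionKey splits this table into partitions, the following Python sketch groups the same five rows in memory. It is only a model of the grouping rule, not of how the storage service places data; `group_by_partition` is a hypothetical helper.

```python
from collections import defaultdict

# The example rows from the slide: (PartitionKey, RowKey, Title).
rows = [
    ("2010", "1000", "Blogtitle1"),
    ("2010", "1001", "Blogtitle2"),
    ("2010", "1002", "Blogtitle3"),
    ("2009", "1003", "Blogtitle4"),
    ("2009", "1004", "Blogtitle5"),
]


def group_by_partition(rows):
    """Group rows so that each PartitionKey value maps to one partition."""
    partitions = defaultdict(list)
    for partition_key, row_key, title in rows:
        partitions[partition_key].append((row_key, title))
    return dict(partitions)


partitions = group_by_partition(rows)
for partition_key, members in partitions.items():
    print(partition_key, members)
```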
PARTITIONING – WHY?
• Scalability
  • Partitions are distributed across multiple storage nodes
  • The system monitors partition usage and automatically balances partitions across the storage nodes
  • A partition, i.e. all entities with the same partition key, is served by a single node
• Entity Group Transactions
  • Allow the application to atomically perform multiple Create/Update/Delete operations on multiple entities of the same partition in a single batch request to the storage system
• Entity Locality
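The all-or-nothing behaviour of an entity group transaction can be sketched in Python against an in-memory dictionary. This is a simulation under stated assumptions, not the service's implementation: `execute_batch`, `BatchError` and the operation names are hypothetical, and the only rules modelled are "one PartitionKey per batch" and "either every operation applies or none does".

```python
import copy


class BatchError(Exception):
    """Raised when a batch cannot be applied as a whole."""


def execute_batch(table, operations):
    """Apply (action, entity) pairs to `table` atomically.

    `table` maps (PartitionKey, RowKey) to an entity dict.
    """
    partition_keys = {entity["PartitionKey"] for _, entity in operations}
    if len(partition_keys) > 1:
        raise BatchError("all entities in a batch must share one PartitionKey")

    working = copy.deepcopy(table)  # work on a copy so a failure changes nothing
    for action, entity in operations:
        key = (entity["PartitionKey"], entity["RowKey"])
        if action == "insert":
            if key in working:
                raise BatchError(f"entity {key} already exists")
            working[key] = entity
        elif action == "update":
            if key not in working:
                raise BatchError(f"entity {key} not found")
            working[key] = entity
        elif action == "delete":
            if key not in working:
                raise BatchError(f"entity {key} not found")
            del working[key]
        else:
            raise BatchError(f"unknown action {action!r}")

    # Every operation succeeded: publish the result in one step.
    table.clear()
    table.update(working)


table = {}
execute_batch(table, [
    ("insert", {"PartitionKey": "2010", "RowKey": "1000", "Title": "Blogtitle1"}),
    ("insert", {"PartitionKey": "2010", "RowKey": "1001", "Title": "Blogtitle2"}),
])
try:
    # The second insert collides with an existing entity, so the first is discarded too.
    execute_batch(table, [
        ("insert", {"PartitionKey": "2010", "RowKey": "1002", "Title": "Blogtitle3"}),
        ("insert", {"PartitionKey": "2010", "RowKey": "1000", "Title": "Duplicate"}),
    ])
except BatchError as error:
    print("batch rejected:", error)
print(len(table))
```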
CHOOSING PARTITION KEY
• Entity Group transactions
• Efficient queries
• Scalability
TABLE OPERATIONS
Table
• Create
• Query
• Delete
Entities
• Insert
• Update
  • Merge – Partial Update: SaveChanges()
  • Replace – Update entire entity: SaveChanges(SaveChangesOptions.ReplaceOnUpdate)
• Delete
• Query
• Entity Group Transaction
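The difference between the two update modes can be shown with a short Python sketch on dictionaries. It models only the semantics named on the slide (merge keeps unspecified properties, replace drops them); `merge_update` and `replace_update` are hypothetical names, not SDK calls.

```python
def merge_update(stored, changes):
    """Merge: properties in `changes` overwrite, all other stored properties stay."""
    result = dict(stored)
    result.update(changes)
    return result


def replace_update(stored, changes):
    """Replace: the entity keeps its keys but otherwise holds only `changes`."""
    result = {"PartitionKey": stored["PartitionKey"], "RowKey": stored["RowKey"]}
    result.update(changes)
    return result


stored = {"PartitionKey": "2010", "RowKey": "1000",
          "Title": "Blogtitle1", "Author": "sundars"}

merged = merge_update(stored, {"Title": "New title"})
replaced = replace_update(stored, {"Title": "New title"})
print(merged)
print(replaced)
```

After the merge the `Author` property is still present; after the replace it is gone, because the update did not supply it.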
DEMO
BLOG ENGINE – WINDOWS AZURE TABLE STORAGE
DATASERVICE CONTEXT – BEST PRACTICES
• Do not share a DataServiceContext object across threads
• Keep its lifetime short
• Use a separate DataServiceContext object for each operation
• If a DataServiceContext object is shared across multiple operations, an operation that failed will be retried on the subsequent SaveChanges call
• The entity class name and the table name should be the same for best performance
CONCURRENT UPDATES
• ETags are sent with each result set
• When the client updates a retrieved entity, it sends the ETag back to the server
• The server compares it with the ETag of the persisted entity before updating
• If there is a mismatch, the server rejects the update (HTTP 412 Precondition Failed) and the client sees an exception
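The ETag check described above can be simulated with a small in-memory table in Python. This is a sketch of the optimistic-concurrency idea only; `InMemoryTable`, `PreconditionFailed` and the `v1`, `v2` ETag format are assumptions and do not reflect the real ETag values or client library.

```python
import itertools


class PreconditionFailed(Exception):
    """Raised when the ETag sent by the client no longer matches the stored one."""


class InMemoryTable:
    def __init__(self):
        self._rows = {}
        self._counter = itertools.count(1)

    def _store(self, key, entity):
        etag = f"v{next(self._counter)}"  # a new ETag for every write
        self._rows[key] = (etag, dict(entity))
        return etag

    def insert(self, key, entity):
        return self._store(key, entity)

    def get(self, key):
        etag, entity = self._rows[key]
        return etag, dict(entity)

    def update(self, key, entity, if_match):
        current_etag, _ = self._rows[key]
        if if_match != current_etag:
            raise PreconditionFailed(f"expected {if_match}, found {current_etag}")
        return self._store(key, entity)


table = InMemoryTable()
key = ("2010", "1000")
table.insert(key, {"Title": "Blogtitle1"})

# Two clients read the same version of the entity.
etag_a, post_a = table.get(key)
etag_b, post_b = table.get(key)

post_a["Title"] = "Edited by A"
table.update(key, post_a, if_match=etag_a)  # succeeds and changes the ETag

post_b["Title"] = "Edited by B"
try:
    table.update(key, post_b, if_match=etag_b)  # stale ETag
    conflict = False
except PreconditionFailed:
    conflict = True
print(conflict)
```

A client that hits the conflict would typically re-read the entity, reapply its change and retry.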
QUERY SPEED
• FAST
  • Single PartitionKey and RowKey with equality
• MEDIUM
  • Single partition but a small range for RowKey
  • Entire partition or table that is small
• SLOW
  • Large single-partition scan
  • Large table scan
  • “OR” predicates on keys => no query optimization => results in a scan
• Expect Continuation Tokens
MAKE QUERIES FASTER
Large Scans
• Split the range and parallelize queries
• Create and maintain your own views that help queries
“Or” Predicates
• Execute the individual queries in parallel instead of using “OR”
User Interactive
• Cache the result to reduce scan frequency
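The advice to replace an “OR” predicate with parallel point queries can be sketched in Python with a thread pool. The table here is an in-memory dictionary and `point_query` is a stand-in for a PartitionKey-plus-RowKey equality query; both are assumptions for illustration, not calls into a storage client.

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-in for a table: (PartitionKey, RowKey) -> Title.
TABLE = {
    ("2010", "1000"): "Blogtitle1",
    ("2010", "1001"): "Blogtitle2",
    ("2009", "1003"): "Blogtitle4",
}


def point_query(key):
    """Stand-in for a fast query on exact PartitionKey and RowKey."""
    title = TABLE.get(key)
    return [] if title is None else [(key[0], key[1], title)]


def query_any_of(keys, max_workers=4):
    """Run one point query per key in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = list(pool.map(point_query, keys))  # map keeps the input order
    return [row for page in pages for row in page]


wanted = [("2010", "1000"), ("2009", "1003"), ("2009", "9999")]
found = query_any_of(wanted)
print(found)
```

Each individual lookup stays in the fast category, whereas a single query with “OR” on the keys would fall back to a scan.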
CONTINUATION TOKENS
• A response carries a continuation token when one of these limits is reached:
  • Maximum of 1000 rows in a response
  • The end of a partition range boundary
  • Maximum of 5 seconds to execute the query
• Always expect a continuation token
• If the query times out, the server returns a continuation token so that the client can issue another query
• When the scan crosses a partition boundary, a continuation token is returned
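A client therefore has to keep querying until no token comes back. The following Python sketch shows that loop against a simulated server; `query_page` and `query_all` are hypothetical helpers, and the token is modelled as a simple list offset rather than the real NextPartitionKey/NextRowKey pair.

```python
PAGE_LIMIT = 1000  # maximum rows per response, as stated on the slide


def query_page(rows, token=0, limit=PAGE_LIMIT):
    """Simulated server call: return one page and a continuation token (or None)."""
    page = rows[token:token + limit]
    next_token = token + len(page)
    return page, (next_token if next_token < len(rows) else None)


def query_all(rows, limit=PAGE_LIMIT):
    """Follow continuation tokens until the server stops returning one."""
    results = []
    pages = 0
    token = 0
    while token is not None:
        page, token = query_page(rows, token, limit)
        results.extend(page)  # a page may be short or even empty
        pages += 1
    return results, pages


rows = list(range(2500))
results, pages = query_all(rows)
print(len(results), pages)
```

The loop condition depends only on the token, not on the page size, because a page can be shorter than the limit and still be followed by more data.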
PAGINATION
• Use IQueryable<T>.Take(N) to fetch the top results
• Use continuation tokens
http://<serviceUri>/Blogs?<originalQuery>&NextPartitionKey=<someValue>&NextRowKey=<someOtherValue>
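Building the follow-up request from a continuation token can be sketched in Python as below. The base URL `http://example.invalid/Blogs` and the `next_page_url` helper are placeholders made up for this example; in the REST API the token values arrive in the `x-ms-continuation-NextPartitionKey` and `x-ms-continuation-NextRowKey` response headers.

```python
from urllib.parse import urlencode


def next_page_url(base_url, original_query, next_partition_key, next_row_key=None):
    """Append the continuation token to the original query string."""
    params = dict(original_query)
    params["NextPartitionKey"] = next_partition_key
    if next_row_key is not None:  # the row key part of the token may be absent
        params["NextRowKey"] = next_row_key
    return f"{base_url}?{urlencode(params)}"


url = next_page_url(
    "http://example.invalid/Blogs",  # hypothetical service URI
    {"$top": "10"},
    next_partition_key="2010",
    next_row_key="1002",
)
print(url)
```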
TIPS AND TRICKS
WINDOWS AZURE TABLE STORAGE
RETRIEVE LATEST ITEMS
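Use the following value as the row key, stored as a fixed-width, zero-padded string (row keys are strings and sort lexicographically), so that the newest entities sort first:
DateTime.MaxValue.Ticks - DateTime.UtcNow.Ticks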
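The effect of this key scheme can be checked with a Python sketch that reproduces .NET tick arithmetic (100-nanosecond units since 0001-01-01). The helper names are assumptions, and the 19-digit padding is one possible width chosen because `DateTime.MaxValue.Ticks` has 19 digits.

```python
from datetime import datetime

DOTNET_MAX_TICKS = 3155378975999999999  # DateTime.MaxValue.Ticks in .NET
_EPOCH = datetime(1, 1, 1)


def to_ticks(moment):
    """Convert a naive UTC datetime to .NET-style ticks."""
    delta = moment - _EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def reverse_tick_row_key(moment):
    """Row key that decreases as time advances, zero-padded to a fixed width."""
    return f"{DOTNET_MAX_TICKS - to_ticks(moment):019d}"


older = reverse_tick_row_key(datetime(2010, 1, 1))
newer = reverse_tick_row_key(datetime(2010, 6, 1))
print(sorted([older, newer])[0] == newer)
```

Because the table returns entities in ascending RowKey order, the most recent entity comes back first and a Take(N) query yields the latest N items.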
PREFIX BASED RETRIEVAL
• Use CompareTo with the ‘>’ and ‘<’ operators to retrieve entities by key prefix
• blog.PartitionKey.CompareTo("Mic") >= 0
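A prefix query is usually expressed as a half-open range: a lower bound equal to the prefix and an upper bound obtained by incrementing the prefix's last character. The Python sketch below illustrates the idea on an in-memory list; `prefix_range` and `keys_with_prefix` are hypothetical helpers, and the sketch assumes the last character is not already the highest possible code point.

```python
def prefix_range(prefix):
    """Return (lower, upper) such that lower <= key < upper matches the prefix."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)  # "Mic" -> "Mid"
    return prefix, upper


def keys_with_prefix(keys, prefix):
    """Filter keys with the same two comparisons a range query would use."""
    lower, upper = prefix_range(prefix)
    return [key for key in sorted(keys) if lower <= key < upper]


partition_keys = ["Microsoft", "Michael", "Mid", "Apple", "Mic"]
matches = keys_with_prefix(partition_keys, "Mic")
print(matches)
```

In a LINQ query the same range would combine the lower-bound comparison from the slide with a second CompareTo against the upper bound.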
Q&A
THANK YOU
HTTP://TINYURL.COM/CODESHELVE
HTTP://SUNDARS.NET
HTTP://TWITTER.COM/SUNDARARAJANS