Microsoft and Openness
NoSQL on Azure
Heriyadi Janwar
Platform Lead
We have changed as a company and have become more OPEN
Microsoft + Linux
• Linux runs as a first-class guest on Windows Server Hyper-V
• Develop apps for both Linux and Windows (CoApp)
• Supported on Windows Azure Virtual Machines (CentOS, openSUSE, SUSE, Ubuntu)
“Microsoft is playing quite nicely with Linux and other open source tools.”
-Robert McMillan, Wired Enterprise
Microsoft + Apache Hadoop
• Embracing the Big Data revolution with enterprise-ready Hadoop as part of Windows Azure HDInsight & Microsoft HDInsight for Windows Server
• Utilize Microsoft BI tools to unleash data insights from all your data, including those in Hadoop
“Microsoft's ongoing relationship in supporting the open source Hadoop technology continues apace as interoperability is being opened up for Windows Server and Windows Azure.”
-Kurt Mackie, Redmondmag.com
“Given the promising foundation of Windows Azure, we saw an opportunity to provide a cloud deployment and monitoring experience for our customers' existing Java-based enterprise applications...”
-Adi Paz, Executive VP for Marketing & Business Development at GigaSpaces
Microsoft + Java
• Great Java experience on Windows Server and Windows Azure
• Windows Azure plug-in for Eclipse with Java
“Between 2003 and 2012 we've seen the general opinion about Microsoft, Windows and PHP turn 180 degrees.”
-René de Haas, SoHosted CEO
Microsoft + PHP
• Impressive performance on Windows Server and Windows Azure
• Open source community development of PHP on Windows right alongside Linux
“HTML5 represents the chance for browsers to work together and find common ground.”
-Chris Blizzard in the article “Only Microsoft Gets Web Standards”
Microsoft + Firefox
• Well supported across Microsoft cloud services (Bing, Office 365, SkyDrive, Skype)
• Windows Media Player Firefox Plug-in
“We successfully moved our [Drupal] site to Windows Azure and the biggest traffic day for us went off with flying colors.”
-Erin Griffin, CIO of Screen Actors Guild
Microsoft + Drupal
• Improved interoperability with Drupal to better manage web content
“Microsoft announced it has completed its addition of Node.js support to Azure, meaning that any developer can launch a server-based JavaScript app from Microsoft's cloud in minutes.”
-Scott M. Fulton, ReadWriteWeb
Microsoft + Node.js
• Support for a new class of real-time applications
• Improved Windows and Linux experience
• Support for Cloud9 IDE
“A few years back, a patch submission from coders at Microsoft would have been amazing to the point of unthinkable, but the battles are mostly over and times have changed.”
-Chris Hertel, writing on the Samba Team blog
Microsoft + SAMBA
• Submitted a patch to the Samba code to improve interoperability, contributed under GPL2+
• Working with the Samba team to support SMB protocol
Microsoft + Open Source Momentum
• >1M Microsoft WebMatrix downloads
• 9 of the top 10 most downloaded OSS projects run on Windows
• >900 SUSE-Microsoft Alliance customers
• In just two years, CodePlex membership has tripled: 300,000 (2010) to 900,000+ (2012)
SourceForge (Nov ’12)
Recent Announcements
“MongoLab and Windows Azure represent two leading MongoDB cloud participants, and their co-operation greatly expands the options available to developers to run, operate, and enjoy MongoDB.”
-Ed Albanese, Vice President of Business Development at 10gen, the company behind MongoDB
• MongoLab offering MongoDB-as-a-Service through the Windows Azure Store
• MongoDB Installer for Windows Azure
[Diagram: Windows Azure service map. Categories: compute, data management, networking, CDN, caching, identity & security, business analytics, commerce, media, integration, HPC. Services: websites, cloud services, VMs, SQL database, NoSQL database, blob, connect, virtual network, traffic manager]
Windows Azure
Flexible & Open
• Choose from multiple runtimes and languages for your applications: .NET, Node.js, Java and PHP
• Run Linux images on Windows Azure Virtual Machines
• Support multiple frameworks and popular open source applications with Windows Azure Web Sites
• Utilize Hadoop services preview for Big Data needs
Support for multiple languages and frameworks (ASP.NET, PHP, Node.js)
Pick from popular OSS apps
Choose your database (SQL Azure, MySQL, MongoDB)
Select your tools (Visual Studio, Git, FTP, WebMatrix)
Build on any platform (Windows, Mac, Linux)
Windows Azure Web Sites
Microsoft WebMatrix
Build and deploy web sites quickly and easily with gallery of popular open source web applications
Installs PHP and MySQL for necessary applications
Utilizes NuGet to access community-driven ASP.NET resources
The Web 2.0 Business Architecture
Online Business Application
• Attract Individual Consumers:
  - Provide interesting service
  - Provide mobility
  - Provide social
• Monetize Individual:
  - Upsell service
  - VIP
  - Speed
  - Extra Capabilities
• Monetize the Social:
  - Improve individual experience
  - Re-sell Aggregate Data (e.g., Advertisers)
Social Networking: the Business Problem
• 100s of millions of users
• 10s of millions of users concurrently
• Terabytes to petabytes of data
  • Structured and unstructured
• Required (eventual) data consistency across users
  • E.g. show your updated state in your friends’ profile pages
Solution
• Shard/Partition user data across hundreds to thousands of SQL Databases
• Propagate data changes from one DB to other DBs using reliable, async Message Service
  • Managing routes from each DB to every other DB would be too complex
  • Global Transactions would hinder scale and availability
• Provide a caching layer for performance
• And also used for
  • Clean-up state (e.g. on account close)
  • Deploy business logic (stored procedures)
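The sharding and messaging pattern on this slide can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the architecture of any customer system: the `ShardedStore` class, its method names, and the use of a `deque` in place of a reliable asynchronous message service are all assumptions made for the example.

```python
import hashlib
from collections import deque


class ShardedStore:
    """Toy stand-in for a set of sharded databases (hypothetical names)."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]
        # Stands in for a reliable, asynchronous message service.
        self.queue = deque()

    def shard_for(self, user_id):
        # Stable hash, so the same user always maps to the same shard.
        digest = hashlib.sha256(user_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def update_status(self, user_id, status, friend_ids):
        # Write only to the owner's shard; no global transaction is used.
        self.shards[self.shard_for(user_id)][("status", user_id)] = status
        # Enqueue one message per friend; delivery happens later.
        for friend in friend_ids:
            self.queue.append((friend, user_id, status))

    def deliver_pending(self):
        # Drain the queue, applying each change on the friend's shard.
        while self.queue:
            friend, user_id, status = self.queue.popleft()
            shard = self.shards[self.shard_for(friend)]
            shard[("friend_status", friend, user_id)] = status

    def friend_view(self, friend, user_id):
        shard = self.shards[self.shard_for(friend)]
        return shard.get(("friend_status", friend, user_id))
```

Until `deliver_pending()` runs, a friend's shard still holds the old state, which is the eventual consistency the previous slide asks for.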
Many LARGE SCALE customers using similar patterns
• Patterns
  • Sharding and reliable messaging
  • Sharding and fan-out query layer
  • Caching layer
• Customer Examples
  • Social Networking: Facebook, MySpace, etc.
  • Online electronic stores (cannot give names)
  • Travel reservation systems (e.g. Choice International)
  • MSN Casual Gaming
  • etc.
Lessons Learned from THESE scenarios
• Require high availability
• Be able to scale out:
  • Functional and Data Partitioning Architecture
  • Provide scale-out processing:
    • Function shipping
    • Fanout and Map/Reduce processing
• Be able to deal with failures:
  • Quorum
  • Retries
  • Eventual Consistency (similar to Read-consistent Snapshot Isolation)
• Be able to quickly grow and change:
  • Elastic scale
  • Flexible, open schema
  • Multi-version schema support
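One way to picture the "Quorum" and "Retries" items is a quorum read/write over N replicas, where a write needs W acknowledgements and a read needs R answers with W + R > N. The sketch below is an assumption-laden toy: `Replica`, `quorum_write`, and `quorum_read` are hypothetical names, versions are supplied by the caller, and failures are simulated with an `up` flag.

```python
class Replica:
    """Hypothetical in-memory replica; `up` simulates a node failure."""

    def __init__(self):
        self.data = {}
        self.up = True

    def write(self, key, version, value):
        if not self.up:
            raise ConnectionError("replica down")
        current = self.data.get(key)
        # Keep only the newest version of each key.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        if not self.up:
            raise ConnectionError("replica down")
        return self.data.get(key)


def quorum_write(replicas, key, version, value, w, retries=2):
    """Return True once at least `w` replicas acknowledged the write."""
    acked = set()
    for _ in range(retries + 1):
        for i, replica in enumerate(replicas):
            if i in acked:
                continue
            try:
                replica.write(key, version, value)
                acked.add(i)
            except ConnectionError:
                pass  # retried on the next pass
        if len(acked) >= w:
            return True
    return False


def quorum_read(replicas, key, r):
    """Return the newest value seen, requiring at least `r` answers."""
    answers = []
    for replica in replicas:
        try:
            answers.append(replica.read(key))
        except ConnectionError:
            continue
    if len(answers) < r:
        raise RuntimeError("read quorum not reached")
    found = [a for a in answers if a is not None]
    return max(found, key=lambda a: a[0])[1] if found else None
```

With three replicas and w = r = 2, a write that reached any two replicas is visible to a read that reaches any two, even if a different node is down each time.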
Move better support for these patterns into the Data Platform!
What is NoSQL about?
• NoSQL = operational and developer agility at low CapEx and OpEx!
• Low Cost
  • Free Open Source Stores
  • Scale CapEx cost below customer growth rate
  • Web friendly developer model and tool chain
• Processing Paradigms
  • High Availability (scalable Replication, Fast Failover, DR/GeoDR, tunable latency)
  • Scale-out (Sharding, Map-Reduce, Elasticity)
  • Performance (tuned for specific workloads, Caching, co-located compute with partitioned state)
  • Tunable/Eventual Consistency
• Data Model Paradigms
  • Data first: Flexible Schema
  • Low-impedance mismatch between programming and data model:
    • Key-Documents and Objects (BLOBS, JSON, XML, POJO)
    • Key-Wide Sparse Column Sets
    • Graphs (e.g., RDF)
• Range from devices, over OLTP Web 2.0 applications to BigData Analytics
Data Models
Data Model | Example Stores (apologies to the ones I did not list)
Simple Key-Value Pairs | Memcache, Redis, Dynamo, Voldemort, LevelDB, Azure Caching
Wide Sparse Column Sets | HyperTable, Big Table, Cassandra, HBASE, Hyperbase, Amazon DynamoDB, Windows Azure Tables, SQL Server/Azure Sparse columns
BLOBs | Amazon S3, Oracle Berkeley NoSQL, Windows Azure Blob Store, SQL Server RBS/FileTable
JSON Documents | MongoDB, CouchBase, Riak, RavenDB
Graph | Neo4J, GraphDB, HypergraphDB, Stig, Intellidimension
Objects and XML Documents | Versant, Oracle Berkeley NoSQL, MarkLogic, existDB, EMC HiveDB, SQL Server/Azure, Oracle, IBM DB2
Extended Relational | Oracle, EMC SQLFire, IBM DB2, MySQL, Postgres, SQL Server/Azure
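To make the difference between some of these data models concrete, the sketch below shows the same kind of user record in four shapes using plain Python structures. The keys, field names, and values are invented for illustration and do not reflect the storage format of any store listed above.

```python
# Simple key-value pair: an opaque value addressed by a single key.
key_value = {"user:42": b"opaque serialized bytes"}

# JSON document: a nested, self-describing structure stored under a key.
document = {
    "user:42": {
        "name": "Ana",
        "tags": ["admin"],
        "address": {"city": "Jakarta"},
    }
}

# Wide sparse column set: each row stores only the columns it actually has.
wide_rows = {
    "user:42": {"name": "Ana", "city": "Jakarta"},
    "user:43": {"name": "Budi", "last_login": "2012-11-01"},
}

# Graph: nodes connected by labeled edges.
edges = [("user:42", "follows", "user:43")]


def columns_used(rows):
    """Return every column name that appears in at least one row."""
    return sorted({column for row in rows.values() for column in row})
```

`columns_used(wide_rows)` shows the sparse property: the table has three columns overall, but no row is required to fill all of them.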
Operational Agility
• You want:
  • Availability of service (scalability)
  • Global consistency
  • Network Partition Tolerance
• You can only get 2 of 3 (CAP Theorem)
• In Brave New World:
  • Online businesses need availability
  • It is distributed, because it is big
  • thus Network Partitioning is unavoidable
  • Hence global consistency must be relaxed
Operational Agility
• Performance and Elastic Scale on Demand
• Automate management lifecycle (or fail)
• Simple deployment lifecycle
• No DB or OS Admin telling me what to do
Developer Agility
• Code First and revise quickly
• Application-model first (before database)
• Flexible open data models
• You don’t know exactly what you are looking for
• Lower Pain of adoption and maintenance
• No DB or OS Admin telling me what to do
What Can SQL learn From NoSQL?
• Low CapEx, Low OpEx
• Built-in tunable High-Availability
• Data scale-out (Sharding)
• Processing scale-out (Map-Reduce, Fan-Out, tunable consistency)
• Flexible Data Models
  • JSON (& XML) support
  • Sparse columns/Column sets
• Integrate with BigData Analytics (e.g., Hadoop)
Many Relational Database Systems are incorporating these learnings!
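The "Processing scale-out (Map-Reduce, Fan-Out)" item can be illustrated with a small fan-out query: the same map step runs against every shard in parallel and a reduce step merges the partial results. This is a sketch under assumptions, with shards modeled as in-memory lists of rows and `map_shard` and `fan_out_count` as hypothetical names; a real query layer would ship the function to remote databases.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_shard(rows):
    """Map step: count rows per country inside a single shard."""
    return Counter(row["country"] for row in rows)


def fan_out_count(shards):
    """Fan the map step out to all shards, then reduce the partial counts."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(map_shard, shards))
    total = Counter()
    for partial in partials:
        total.update(partial)  # reduce step: add up per-shard counts
    return total
```

Each shard only sees its own rows, so the work scales with the number of shards while the reduce step stays small.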
Example: SQL Azure Federations
• Provides Data Partitioning/Sharding at the Data Platform
• Enables applications to build elastic scale-out applications
• Provides non-blocking SPLIT/DROP for shards (MERGE to come later)
• Auto-connect to right shard based on sharding key value
• Provides SPLIT resilient query mode
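The idea of routing by sharding key value and splitting a shard can be sketched with a range-partitioned map. This is not the SQL Azure Federations API: `RangeFederation` and its methods are hypothetical, keys are assumed to be non-negative integers, and the split here is a simple blocking copy, whereas the slide describes a non-blocking operation.

```python
import bisect


class RangeFederation:
    """Toy range-partitioned shard map (hypothetical, in-memory only)."""

    def __init__(self):
        # boundaries[i] is the inclusive lower bound of member i.
        self.boundaries = [0]
        self.members = [{}]

    def member_index(self, key):
        if key < 0:
            raise ValueError("keys are assumed to be non-negative")
        # Find the last boundary that is <= key.
        return bisect.bisect_right(self.boundaries, key) - 1

    def put(self, key, row):
        self.members[self.member_index(key)][key] = row

    def get(self, key):
        return self.members[self.member_index(key)].get(key)

    def split(self, at):
        """Split the member containing `at` into [low, at) and [at, high)."""
        i = self.member_index(at)
        if self.boundaries[i] == at:
            raise ValueError("a boundary already exists at this key")
        old = self.members[i]
        self.members[i] = {k: v for k, v in old.items() if k < at}
        self.boundaries.insert(i + 1, at)
        self.members.insert(i + 1, {k: v for k, v in old.items() if k >= at})
```

Callers keep using `put` and `get` with the key alone; after `split(100)`, key 150 is served by the new member without any change on the caller's side.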
What Can NoSQL learn From SQL?
• Flexible data is good, but:
  • Provide optional schema in data platform to help with constraints and optimizations
• Procedural Scale-Out processing is good, but:
  • Develop a declarative language suited for and across the data models (e.g., coSQL)
  • Standardize suitable abstractions and languages
• Eventual Consistency is good, but:
  • Provide users the choice
• Simple Queries are good, but:
  • Provide me with secondary indexes
  • It will be more efficient to join between two collections of JSON documents in the query engine than in the Application layer
Many NoSQL Database Systems are starting to incorporate these learnings!
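The points about secondary indexes and joins can be illustrated with a hash join over two collections of JSON-like documents. The sketch assumes documents are Python dicts; `build_index` and `hash_join` are hypothetical helpers showing the kind of work a query engine could do once instead of every application repeating it.

```python
from collections import defaultdict


def build_index(docs, field):
    """Secondary index: map each value of `field` to the docs that have it."""
    index = defaultdict(list)
    for doc in docs:
        if field in doc:
            index[doc[field]].append(doc)
    return index


def hash_join(left, right, left_field, right_field):
    """Join two document collections on left_field == right_field."""
    index = build_index(right, right_field)
    return [
        (l, r)
        for l in left
        if left_field in l
        for r in index.get(l[left_field], [])
    ]
```

The index is built in one pass over the right-hand collection, so the join costs roughly the size of both collections rather than their product, which is what a nested loop in application code would cost.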
Scale-Out Data PLATFORM Architecture
[Diagram: a SQL or NoSQL Store made of three Primary Shards, each with two Readable Replicas; arrows labeled Copy and Query connect the store to the workloads below]
• OLTP Workloads
  • Highly Available, High Scale, High Flexibility
  • mostly touching 1 to low number of shards
• Dynamic OLAP Workloads
  • 3Vs (Volume, Velocity, Variety), Exploratory
  • Scale-out queries, often using eventually consistent scale-out frameworks like Hadoop
• Traditional OLAP Workloads
  • known schema
  • Data warehouse, “Star joins”
Big Data requires an end-to-end approach
Call To Action
• Familiarize yourself with the NoSQL genes in the Microsoft Online Platform
• Free 3-Month Trial for Windows and SQL Azure: http://www.windowsazure.com
Presentation | Speaker | Date and Time
Do We Have the Tools We Need to Navigate the New World of Data? | Dave Campbell | 2/29 9:00am PST
Onsite Interview * | Tim O’Reilly, Dave Campbell | 2/29 10:15am PST
Unleash Insights on All Data With Microsoft Big Data | Alexander Stojanovic | 2/29 11:30am PST
Office Hours (Q&A session) | Dave Campbell | 2/29 1:30pm PST
Hadoop + Javascript: What We Learned | Asad Khan | 2/29 2:20pm PST
Democratizing BI at Microsoft: 40,000 Users and Counting | Kirkland Barrett | 3/1 10:40am PST
Data Marketplaces For Your Extended Enterprise | Piyush Lumba | 3/1 2:20pm PST
Related Resources
• NoSQL and the Windows Azure Platform
  • Whitepaper: http://download.microsoft.com/download/9/E/9/9E9F240D-0EB6-472E-B4DE-6D9FCBB505DD/Windows%20Azure%20No%20SQL%20White%20Paper.pdf
  • SQL Federation blog: http://blogs.msdn.com/b/cbiyikoglu/archive/2011/03/03/nosql-genes-in-sql-azure-federations.aspx
Open to Feedback:
Heriyadi.janwar@microsoft.com
web: microsoft.com/openness
twitter: @OpenForBizAPAC