New Data Stack Workshop: Building a Scalable Cloud Datacenter
Ping Li, Accel [email protected]
July 14, 2010, Stanford University
Accel Partners Confidential
Delivering Cloud Computing
• Cloud data centers will share infrastructure layers common to mainframes but redelivered for cloud capabilities
• “New Data Stack” will form foundation for cloud computing
• Elasticity
• Multi-app/user
• User-provisioned
• Portability
“Cloud Frame” layers and their mainframe counterparts:
• Monitoring – Security (RACF)
• Resource Scheduler (z/VM & OS 370)
• Monitoring – Performance (Mainview)
• Provisioning & Configuration Management
• Virtualization (z/VM)
• Performance Acceleration & dedicated processors (OS 370)
• Clustering, failover, and mirroring (OS 370 & purpose built hw & microcode)
• Backup and DR (Tivoli Storage Manager, Parallel Sysplex)
Private/Public
Data Explosion
Legacy Stack
New Data Stack
• 2,500 exabytes of new information in 2012 with Internet/web as primary driver
• “Digital universe” grew by 62% last year to 800K petabytes and will grow to 1.2 zettabytes this year
Source: An IDC White Paper - sponsored by EMC. As the Economy Contracts, the Digital Universe Expands. May 2009.
Cloud Application Data
Business Transaction Data
“New Data” Trends
Data is growing faster than processing power – leading to coping strategies like throwing away data or frequent archiving to tape
[Chart: Data, 61% CAGR; Transistors, 42% CAGR]
Circa 2010 – Cloud Data
• Application responsiveness/scale trumps immediate consistency
• Unstructured, complex data blobs (images, voice, logs, video) – doesn’t fit nicely into rows/columns
• Extremely large data sets (petabytes)
• 2,000 users = Tiny
Circa 1975 – Transaction Data
• Absolute consistency is the primary requirement – ACID transactions
• Highly structured, relatively small data records
• Smaller data sets (bytes)
• 2,000 users = Huge
Source: Gartner.
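To make the gap between the two growth rates concrete, the sketch below compounds the 61% and 42% CAGR figures from the slide over a few horizons. It assumes that 61% belongs to data and 42% to transistors, which matches the slide's claim that data outgrows processing power; the function names are illustrative only.

```python
def compound_growth(cagr: float, years: int) -> float:
    """Growth multiple after `years` at a constant annual growth rate."""
    return (1.0 + cagr) ** years


DATA_CAGR = 0.61        # assumed to be the "Data" series on the slide
TRANSISTOR_CAGR = 0.42  # assumed to be the "Transistors" series


def gap_after(years: int) -> float:
    """How many times larger data has grown than transistor counts."""
    return compound_growth(DATA_CAGR, years) / compound_growth(TRANSISTOR_CAGR, years)


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(
            f"{years:>2} years: data x{compound_growth(DATA_CAGR, years):.1f}, "
            f"transistors x{compound_growth(TRANSISTOR_CAGR, years):.1f}, "
            f"gap x{gap_after(years):.2f}"
        )
```

Under these assumptions the gap compounds at about 13% per year, so it roughly doubles in a little under six years.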
New Data Stack Technologies
Cloud
• Distributed computing layer (virtual machines, MapReduce, networked commodity servers)
• High speed networking is pervasive
• Non-relational/“no sql” data stores
• Distributed file systems
• Flash/SSD (high performance and abundant)
• Open platforms
• Internet/cloud scale
Legacy
• Centralized/monolithic computing layer
• Computer networking limited
• Relational databases
• FC SAN/NAS
• Disks/Tape (memory scarce/expensive)
• Proprietary/closed vendors
• Enterprise-scale
Agenda
1:15 pm NorthScale
• Sharon Barr, Vice President Engineering
• James Phillips, Founder, Chief Product Officer
• Dustin Sallings, Chief Architect
• Bob Wiederhold, President, CEO
2:15 pm Cloudera
• Amr Awadallah, CTO/Co-Founder
• Jeff Hammerbacher, Chief Scientist/Co-Founder
3:15 pm Facebook
• Bobby Johnson, Director, Software Engineering
• Mark Rabkin, Software Engineer
4:15 pm Fusion-io
• Robert Wipfel, Fellow
5:30 pm Cocktails!
Elastic Data Management Software for web applications and cloud computing environments
The opportunity.
“ Relational database technology has served us well for 40 years, and will likely continue to do so for the foreseeable future to support transactions requiring ACID guarantees. But a large, and increasingly dominant, class of software systems and data do not need those guarantees. Much of the data manipulated by Web applications have less strict transactional requirements but, for lack of a practical alternative, many IT teams continue to use relational technology, needlessly tolerating its cost and scalability limitations. For these applications and data, distributed key-value cache and database technologies such as NorthScale provide a promising alternative. ”
Carl Olofson, Research Vice President, Database Management Software Research, IDC
Modern interactive software architecture
To support more users …
… simply add more commodity web servers (or virtual machines) behind a load balancer …
… but you must get a bigger, more complex database server.
Application scales linearly, data hits a wall
Application Scales Out: Just add more commodity web servers
Database Scales Up: Get a bigger, more complex server
What’s driving the curves?
1. Transaction overhead.
Same hardware, over an order of magnitude difference in supportable user base.
• RDBMS: 750 OPS
• NorthScale: 15,000 OPS
2. Expensive hardware.
More costly to start with, and the cost differential widens with growth.
• At 750 OPS: RDBMS $7,500, NorthScale $2,500 (3x)
• At 15,000 OPS: RDBMS $125,000, NorthScale $12,500 (10x)
3. Complex administration.
RDBMS technology is extremely complex and expensive to administer.
• RDBMS: Schema committee, Add new table(s), Re-normalize, Create indices, Shard if needed, Tune performance, Update views, Insert and select.
• NorthScale: Set and get.
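The 3x and 10x multipliers on this slide follow directly from the quoted prices. The sketch below recomputes them. Pairing each dollar figure with RDBMS or NorthScale is a reading of the slide layout and should be treated as an assumption, as should the `Quote` type and the `cost_ratio` helper, which exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    system: str
    ops_per_sec: int
    cost_usd: int


# Figures as read from the slide; the system/price pairing is assumed.
QUOTES = [
    Quote("RDBMS", 750, 7_500),
    Quote("NorthScale", 750, 2_500),
    Quote("RDBMS", 15_000, 125_000),
    Quote("NorthScale", 15_000, 12_500),
]


def cost_ratio(ops_per_sec: int) -> float:
    """RDBMS cost divided by NorthScale cost at the same throughput."""
    by_system = {q.system: q.cost_usd for q in QUOTES if q.ops_per_sec == ops_per_sec}
    return by_system["RDBMS"] / by_system["NorthScale"]


if __name__ == "__main__":
    for ops in (750, 15_000):
        print(f"{ops:>6} OPS: RDBMS costs {cost_ratio(ops):.0f}x as much")
```

The ratio grows from 3 to 10 as throughput rises twentyfold, which is the widening differential the slide describes.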
Billions in data management savings available
RDBMS ideal for intended purpose, will continue to be appropriate for debit-credit data – costly overkill for most new data
Relational database technology ideal
Alternative database technology needed
Relational database technology was $18.8 billion market in 2007 (IDC)
Big leap from relational database to alternatives
Where do I start? What data should I move first? Which alternative database technology will “win”? This looks really complicated.
NorthScale solution.
“ I can’t tell you how many email requests I’ve received from our developers asking for something that is as simple and fast as memcached, but that promises data durability. Cassandra is just far too complex and heavyweight and we won’t be doing any more deployments. NorthScale is definitely on to something here. ”
Director of Engineering, Leading Social Network
Before: Where you are today
Relational database technology powers 99.999% of web applications.
Step 1: Cache relational data in memcached
Memcached is simple, fast and infinitely scalable. It is easy to adopt, and delivers immediate cost, performance and scalability benefits.
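One common way to cache relational data in memcached is the cache-aside pattern: read from the cache first, fall back to the database on a miss, and populate the cache with the result. The slide does not say which pattern NorthScale recommends, so the sketch below is an assumption. An in-process dictionary stands in for a memcached cluster, and `CacheAside` and `load_from_db` are hypothetical names.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Cache-aside reads in front of a slower relational store."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._cache: Dict[str, str] = {}   # stand-in for memcached
        self._load_from_db = load_from_db  # stand-in for a SQL query
        self.db_reads = 0                  # counts trips to the database

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:             # cache hit: no database work
            return self._cache[key]
        self.db_reads += 1                 # cache miss: query the database
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value       # populate the cache for next time
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry after the underlying row changes."""
        self._cache.pop(key, None)


if __name__ == "__main__":
    table = {"user:1": "alice"}
    cache = CacheAside(table.get)
    cache.get("user:1")
    cache.get("user:1")
    print("database reads:", cache.db_reads)
```

Repeated reads of the same key reach the database only once, which is where the cost and performance benefit claimed on the slide would come from.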
NorthScale Memcached Servers
Relational Database
Step 2: Gradually migrate data to membase
NorthScale Memcached Servers
Relational Database
NorthScale Membase Servers
After: Elastic compute and data layers
Data layer now scales with linear cost and constant performance.
Application Scales Out: Just add more commodity web servers
Database Scales Out: Just add more commodity data servers
Scaling out flattens the cost and performance curves.
An evolutionary path toward elastic data
NorthScale Membase Server
Membase is an elastic key-value database
Membase data servers
In the data center
Web application server
Application user
On the administrator console
Membase is Simple, Fast, Elastic
Five minutes or less to a working cluster
• Downloads for Linux and Windows
• Start with a single node
• One button press joins nodes to a cluster
Easy to develop against
• Just SET and GET – no schema required
• Drop it in. 10,000+ existing applications already “speak membase” (via memcached)
• Practically every language and application framework is supported, out of the box
Easy to manage
• One-click failover and cluster rebalancing
• Graphical and programmatic interfaces
• Configurable alerting
Membase is Simple, Fast, Elastic
Predictable
• “Never keep an application waiting”
• Quasi-deterministic latency and throughput
Low latency
• Auto-migration of hot data to lowest latency storage technology (RAM, SSD, Disk)
• Selectable write behavior – asynchronous, synchronous (on replication, persistence)
• Back-channel rebalancing [FUTURE]
High throughput
• Multi-threaded
• Low lock contention
• Asynchronous wherever possible
• Automatic write de-duplication
Membase is Simple, Fast, Elastic
Scale out
• Spread I/O and data across commodity servers (or VMs)
• Consistent performance with linear cost
• Dynamic rebalancing of a live cluster
All nodes are created equal
• No special case nodes
• Clone to grow
Extensible
• Filtered TAP interface provides hook points for external systems (e.g. full-text search, backup, warehouse)
• Data bucket – engine API for specialized container types
• Membase NodeCode [FUTURE]
vBucket mapping
Key → vBucket (hash function)
vBucket → Servers (table lookup)
All possible membase keys: Key1, Key2, Key3, Key4, Key5, Key6, Key7, Key8, Key9, Key10, … Keym
vBuckets: Host Server / Replica Servers
• vBucket1: Server1 / Server2, Server3
• vBucket2: Server1 / Server2, Server3
• vBucket3: Server2 / Server3, Server4
• vBucketn: Serverp / Serverq, Serverr
vBucket-Server Map - Example
vBuckets: Host Server / Replica Servers
• vBucket1: ServerA / ServerB, ServerC
• vBucket2: ServerA / ServerB, ServerC
• vBucket3: ServerB / ServerA, ServerC
• vBucket4: ServerB / ServerA, ServerC
• vBucket5: ServerC / ServerA, ServerB
• vBucket6: ServerC / ServerA, ServerB
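The two-step lookup on this slide (hash the key to a vBucket, then look the vBucket up in a table) can be sketched as follows. The table reuses the six-vBucket example map from the slide. The slide does not name the hash function, so CRC32 here is an assumption, and `vbucket_for` and `locate` are illustrative names rather than part of any Membase client API.

```python
import zlib
from typing import Dict, List, Tuple

# vBucket id -> (host server, replica servers), copied from the example map.
VBucketMap = Dict[int, Tuple[str, List[str]]]

VBUCKET_MAP: VBucketMap = {
    1: ("ServerA", ["ServerB", "ServerC"]),
    2: ("ServerA", ["ServerB", "ServerC"]),
    3: ("ServerB", ["ServerA", "ServerC"]),
    4: ("ServerB", ["ServerA", "ServerC"]),
    5: ("ServerC", ["ServerA", "ServerB"]),
    6: ("ServerC", ["ServerA", "ServerB"]),
}


def vbucket_for(key: str, num_vbuckets: int = 6) -> int:
    """Step 1: hash the key to a vBucket id in 1..num_vbuckets (CRC32 assumed)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets + 1


def locate(key: str, vbucket_map: VBucketMap = VBUCKET_MAP) -> Tuple[str, List[str]]:
    """Step 2: table lookup from vBucket id to host and replica servers."""
    return vbucket_map[vbucket_for(key, len(vbucket_map))]


if __name__ == "__main__":
    for key in ("Key1", "Key2", "Key3"):
        host, replicas = locate(key)
        print(f"{key}: vBucket{vbucket_for(key)} -> {host} / {', '.join(replicas)}")
```

Because keys never map to servers directly, moving a vBucket to another server only requires changing one table entry; the key-to-vBucket hash stays fixed. That indirection is plausibly what makes rebalancing a live cluster practical.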
Deployment options
[Diagram: application logic, a memcached client, and the server it talks to, shown for a plain OTC Memcached Server (server list, port 11211) and for three Membase Server deployments. Data operations and cluster operations use ports 11211 and 11210.]
• Deployment Option 1: Embedded proxy. OTC memcached client with a server list; the proxy and vBucket map sit in Membase Server.
• Deployment Option 2: Standalone proxy. OTC memcached client talks to a proxy on localhost that holds the vBucket map.
• Deployment Option 3: “vBucket-aware” client. NEW memcached client holds the vBucket map.
Membase “write” data flow – application view
1. User action results in the need to change the VALUE of KEY
2. Application updates key’s VALUE, performs SET operation
3. Membase (memcached) client hashes KEY, identifies KEY’s master server
4. SET request sent over network to master server
5. Membase replicates KEY-VALUE pair, caches it in memory and stores it to disk
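The write path described in these steps can be sketched as a toy in-process simulation. Everything here is an assumption made for illustration: the `Node` and `Cluster` classes, the CRC32 hash, hashing straight to a node instead of going through a vBucket table, choosing replicas as the next nodes in the list, and doing replication and persistence synchronously. The slides elsewhere say write behavior is selectable between asynchronous and synchronous.

```python
import zlib
from typing import Dict, List


class Node:
    """One data server with an in-memory cache and a simulated disk."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ram: Dict[str, str] = {}
        self.disk: Dict[str, str] = {}

    def store(self, key: str, value: str) -> None:
        self.ram[key] = value    # cache in memory
        self.disk[key] = value   # persist (synchronous here for simplicity)


class Cluster:
    def __init__(self, names: List[str], replicas: int = 1) -> None:
        if not 0 <= replicas < len(names):
            raise ValueError("replicas must be smaller than the node count")
        self.nodes = [Node(name) for name in names]
        self.replicas = replicas

    def master_index(self, key: str) -> int:
        """Client side: hash KEY to identify its master server."""
        return zlib.crc32(key.encode("utf-8")) % len(self.nodes)

    def set(self, key: str, value: str) -> str:
        """Send SET to the master, which stores and replicates the pair."""
        master = self.master_index(key)
        self.nodes[master].store(key, value)
        for offset in range(1, self.replicas + 1):
            replica = self.nodes[(master + offset) % len(self.nodes)]
            replica.store(key, value)
        return self.nodes[master].name  # acknowledgement names the master


if __name__ == "__main__":
    cluster = Cluster(["Server1", "Server2", "Server3"], replicas=2)
    print("master for KEY:", cluster.set("KEY", "VALUE"))
```

After a SET with two replicas, three nodes hold the pair in both memory and on disk, mirroring the master and two replica servers in the deck's diagrams.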
Membase data flow – under the hood
[Diagram: Master server for KEY, Replica Server 1 for KEY, and Replica Server 2 for KEY. Each server shows a Listener-Sender, RAM*, the membase storage engine, SSD, and Disk. Steps are numbered 1 to 5; steps 2 to 4 carry no caption.]
1. SET request arrives at KEY’s master server
5. SET acknowledgement returned to application
Membase Architecture
[Diagram, reconstructed from its labels]
Data Manager
• moxi
• memcached protocol listener/sender
• engine interface
• membase storage engine
• memcapable 1.0, memcapable 2.0
• Ports: 11211, 11210
Cluster Manager (Erlang/OTP)
• HTTP REST management API/Web UI
• On each node: Heartbeat, Process monitor, Global singleton supervisor, Configuration manager
• One per cluster: Rebalance orchestrator, Node health monitor, vBucket state and replication manager
• Ports: HTTP (8080), erlang port mapper (4369), distributed erlang (21100 – 21199)
Data buckets are secure membase “slices”
Membase data servers
In the data center
Web application server
Application user
On the administrator console
Bucket 1
Bucket 2
Aggregate Cluster Memory and Disk Capacity
NorthScale in production
Leading cloud service (PAAS) provider
• Over 65,000 hosted applications
• NorthScale Memcached Server serving over 1,200 Heroku customers (as of June 10, 2010)
Social game leader – FarmVille, Mafia Wars, Café World
• Over 230 million monthly users
• NorthScale Membase Server is the 500,000 ops-per-second database behind FarmVille and Café World
Wednesday, July 14, 2010
Evolving a New Analytical Platform
What Works and What’s Missing
Jeff Hammerbacher, Chief Scientist, Cloudera
July 14, 2010
My Background
Thanks for Asking
▪ [email protected]
▪ Studied Mathematics at Harvard
▪ Worked as a Quant on Wall Street
▪ Conceived, built, and led Data team at Facebook
▪ Nearly 30 amazing engineers and data scientists
▪ Several open source projects and research papers
▪ Founder of Cloudera
▪ Chief Scientist
▪ Also, check out the book “Beautiful Data”
Presentation Outline
▪ 1. Defining the Platform
▪ BI: Science for Profit
▪ Need tools for whole research cycle
▪ SQL Server 2008 R2: defining the platform
▪ 2. State of the Platform Ecosystem
▪ 3. Foundations for a New Implementation
▪ Hadoop
▪ Boiling the Frog
▪ 4. Future Developments
▪ Questions and Discussion
1. Defining the Platform
BI is looking more like science (for profit)
Jim Gray: Science entering Fourth Paradigm
“We have to do better at producing tools to support the whole research cycle”
RDBMS only a small part of this tool set
Example: SQL Server 2008 R2
▪ RDBMS: SQL Server
▪ ETL: SQL Server Integration Services
▪ Reporting: SQL Server Reporting Services
▪ Analysis: SQL Server Analysis Services
▪ Search: Full-Text Search
▪ CEP: StreamInsight
▪ OLAP: PowerPivot
▪ MDM: Master Data Services
▪ Collaboration: SharePoint
What do we call this unified suite?
For today: Analytical Data Platform
LAMP Stack for Analytical Data Management
2. The State of the Platform Ecosystem
Who makes up the platform ecosystem?
▪ Platform Providers
▪ Infrastructure Providers
▪ Application Developers
▪ End Users
▪ Content Providers
What is new about the ecosystem today?
Content Providers
1. > 95% of enterprise data is unstructured
2. Data volumes growing rapidly
Infrastructure Providers
1. Cloud
2. Warehouse-Scale Computers
Platform Providers
1. Open source
2. Driven by consumer web properties
Application Developers
1. Data Scientists
2. Diversity of languages
End Users
1. Browser is the client
2. Tell a story about the business
3. Foundations for a New Implementation
New foundations: HDFS and MapReduce
2005: Doug/Mike start project inside Nutch
2006: Doug joins Yahoo!
2007: Make Hadoop scale
▪ Jim Gray’s “Fourth Paradigm” lecture
▪ Yahoo! makes Pig open source
▪ Randy Bryant’s “DISC” lecture
▪ Powerset makes HBase open source
2008: Make Hadoop fast
▪ First Hadoop Summit
▪ Yahoo! wins Daytona terabyte sort benchmark
▪ Yahoo! builds production webmap with Hadoop
▪ Facebook makes Hive open source
▪ “MapReduce: A Major Step Backwards”
2009: Insert Hadoop into the enterprise
▪ Cloudera releases CDH
▪ First Hadoop World NYC
▪ Yahoo! sorts a petabyte with Hadoop
▪ Cloudera adds training, support, services
▪ “The Unreasonable Effectiveness of Data”
2010: Integrate Hadoop into the enterprise
▪ IBM announces InfoSphere BigInsights
▪ Yahoo! completes enterprise-class security
▪ Datameer and Karmasphere funded
▪ Quest, Talend, Netezza, and more integrate
▪ Hive adds JDBC and ODBC
Hadoop will be an Analytical Data Platform
4. Future Developments
▪ Capture: Log collection and CEP
▪ Curate: Workflow and Scheduling
▪ Curate: Secondary and Full-Text Indexing
▪ Curate: Learn Structure from Data
▪ Analyze: Mesos-enabled frameworks
▪ Analyze: Link working set and historical data
▪ All behind a single user interface
HUE: Making Many Computers Feel Like One
Cloudera’s Distribution for Hadoop, version 3, is the enterprise open source platform for complex data
▪ Integrated: all components & functions interoperate with one another through standard APIs
▪ Simplified: Cloudera manages required component versions & dependencies
▪ Open source: 100% Apache licensed
▪ Reliable: patched with fixes from future releases to improve stability
▪ Supported: Cloudera employs [percentage illegible] of the project founders and at least one committer for [percentage illegible] of these open source components.
(c) 2010 Cloudera, Inc. or its licensors. "Cloudera" is a registered trademark of Cloudera, Inc. All rights reserved.
ioMemory for Scale-out
Robert Wipfel, Fellow [email protected]
14th July, 2010, Accel Partners Panel Discussion
Factors impacting Scale-out
Balance • CPU • Disk • Network
Contention • Sharing • Locking
Throughput • IOPS • Bandwidth
Latency • Distributed • Dependencies
Graceful Recovery • No SPOFs • Fast Replay
Energy • Servers • RAM • Disks
Management and Monitoring
What’s *really* Needed…
DRAM
• Want: Really fast
• Don’t Want: Volatile, Expensive, Limited capacity
Disk
• Want: Non-volatile, Cheap, Large capacity
• Don’t Want: Really slow
Need
• Want: Non-volatile, Really fast, Large capacity, Reasonable price, Low energy
Solution: ioMemory
A disruption called ioMemory
• High speed like DRAM
• Persistence and capacity of disks
PCIe based NAND Flash Storage
• Very high IOPS
• Micro-second latency
• Very high data throughput
[Figure: access delay in time, on a scale from milliseconds (10E-3) to nanoseconds (10E-9). Tiers shown: SAN, NAS, RAIDed DAS; SSDs; ioMemory at 50µs (10E-6); DRAM; L3; L2; L1. Gaps between tiers are annotated as 6, 5, and 3 orders of magnitude.]
Why is it called ioMemory?
ioMemory Performance
Raw Storage Performance
[Chart: H2benchw 3.6, Interface Bandwidth MB/s]
Application Performance
[Chart: IOMeter Database Benchmark I/O, Average Throughput MB/s]
Devices compared in both charts:
• Fusion-io ioDrive Maximum Write: 24 GB, Flash, PCIe x4
• Fusion-io ioDrive Improved Write: 40 GB, Flash, PCIe x4
• Fusion-io ioDrive Maximum Capacity: 80 GB, Flash, PCIe x4
• SSD SATA Vendor A 3.0Gbps 2.5 RAID 0: 128 GB, Flash SATA/300
• SSD SATA Vendor B 3.0Gbps 2.5 RAID 0: 64 GB, Flash SATA/300
• SSD SATA Vendor C: 32 GB, Flash SATA/300
2x Faster Storage I/O
50x Faster Application I/O
ioMemory Reliability
• PCI bus protection
• Checksums
• Poison bit
• Strong ECC
• Wear leveling
• Bad block re-mapping
• Data labeling
• Parity-protected pipelines
• Flashback Chip protection
• Power cut protection
MTBF = 2 Million Hours +
ioMemory is not a Solid State Disk
[Diagram: numbered I/O steps between the Application CPU, a RAID Controller, and SSDs, compared with the numbered steps between the Application CPU and ioMemory.]
ioMemory is Green
[Chart: energy use in kilowatt-hours per year for 15,000 RPM FC HDD, Fusion-io ioDrive, and ZeusIOPS SSD. Values shown: 97 kWh/yr, 3,013 kWh/yr, 133,493 kWh/yr.]
Case Study
One of the world’s fastest growing Webmonsters
• Over 900% more database queries per second
• Dramatically improved server replication for most current data
• Over 800% improvement to disaster recovery back-up time
• Cut server footprint, power costs, and IT overhead by 75%
• Full and immediate ROI on repurposed servers with
• Continued ROI on operational cost saving
Case Study
Internet security company that protects over 1 billion inboxes
• 5x improvement to
  • Database replication performance
  • Data intensive query response
  • Analysis routines
• Eliminating 210 failure points from system
• Implemented full system redundancy
• Dramatically lowered power and cooling expenses
Case Study
Disruption
By deploying ioMemory… Cloudmark eliminated the need for this…
Other Customer Examples
• Department of Defense takes NASTRAN from 3-days to 6-hours
• Demos Dynamics NAV can get a 4x performance improvement
• HMO achieves a 200 HDD to 1 ioDrive reduction for their Data Warehouse
• Does a 30 to 1 box reduction for their reliable messaging system
• Shows a 35x performance increase of unstructured search at OracleWorld
• Stock exchange doubles the performance of their trading systems
ioMemory Products
160 GB • 116,046 (4k read packet size) • 93,199 (75/25 r/w mix 4k packet size)
320 GB • 71,256 (4k read packet size) • 67,659 (75/25 r/w mix 4k packet size)
640 GB • 122,601 (4k read packet size) • 121,008 (75/25 r/w mix 4k packet size)
320 GB • 185,022 (4k read packet size) • 129,699 (75/25 r/w mix 4k packet size)
80 GB • 119,790 (4k read packet size) • 89,549 (75/25 r/w mix 4k packet size)
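The slide does not label the unit of these figures. Assuming they are I/O operations per second and that "4k" means 4 KiB per operation, the sketch below converts each read figure into an approximate data rate. Both assumptions, and the `throughput_mb_per_s` helper, are illustrative only.

```python
from typing import List, Tuple

BLOCK_BYTES = 4 * 1024  # "4k" read as 4 KiB per operation (an assumption)

# (capacity in GB, 4k read figure, 75/25 r/w mix figure), in slide order.
PRODUCTS: List[Tuple[int, int, int]] = [
    (160, 116_046, 93_199),
    (320, 71_256, 67_659),
    (640, 122_601, 121_008),
    (320, 185_022, 129_699),
    (80, 119_790, 89_549),
]


def throughput_mb_per_s(iops: int, block_bytes: int = BLOCK_BYTES) -> float:
    """Data rate in MB/s (10**6 bytes) implied by an IOPS figure."""
    return iops * block_bytes / 1_000_000


if __name__ == "__main__":
    for capacity_gb, read_iops, mixed_iops in PRODUCTS:
        print(
            f"{capacity_gb:>3} GB: read ~{throughput_mb_per_s(read_iops):.0f} MB/s, "
            f"75/25 mix ~{throughput_mb_per_s(mixed_iops):.0f} MB/s"
        )
```

Small-block figures like these are a measure of operation rate more than bandwidth, so the derived MB/s values are only a rough cross-check.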
Confidential Information: Fusion-io
OEM Partners
Questions?
T H A N K Y O U