NVMe, Storage Class Memory, and Operational Databases: Real World Results
Brian Bulkowski, CTO and Founder
May 24, 2016
See all the presentations from the In-Memory Computing Summit at http://imcsummit.org
Who needs a fast operational database?
© 2016 Aerospike Inc. All rights reserved.
The Scalable Operational Database Problem
1. Identified where scale is mission critical: the AdTech industry
2. Built a database that can scale for the RTB use case, for these customers and more
3. Deployed at digital enterprises that need consumer-scale databases
4. Expanding into other use cases in BFSI, Telco, Retail, IoT
Architecture Overview
[Diagram labels: Business Transactions ( Web views, Payments, Mobile Queries, Recommendation, And More ); Your Application Framework; Decisioning Engine; High Performance NoSQL ( “Real-Time Big Data”, “Decisioning” ); XDR; Legacy Database ( Mainframe ); Legacy RDBMS; HDFS based; Data Warehouse / Data Lake]
500 business transactions per sec × 5,000 calculations per sec = 2.5 M database transactions per sec
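The multiplication on this slide can be checked directly. The sketch below reads the second figure as the number of database operations each business transaction triggers, which is an interpretation rather than something the slide states; the function name is hypothetical.

```python
def database_tps(business_tps: int, db_ops_per_business_txn: int) -> int:
    """Database transactions per second implied by a business load.

    Assumes every business transaction fans out into the same number
    of database operations (an illustrative simplification).
    """
    return business_tps * db_ops_per_business_txn


if __name__ == "__main__":
    # Figures from the slide: 500 business transactions/sec, 5,000 operations each.
    print(database_tps(500, 5000))
```

With the slide's figures this comes to 2.5 million database transactions per second.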
Business Challenge
• Meet payment SLA of 750 ms
• Differentiate between fraudulent and legitimate orders in real-time
• Stop loss of business due to latency
• Support thousands of DB reads/writes per credit card transaction
• Bring new algorithms and data online within weeks
Prevent Only Fraudulent Transactions
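To see why the 750 ms SLA and "thousands of DB reads/writes" pull against each other, a rough latency budget helps. The sketch below assumes the worst case where database operations run one after another; real applications would likely issue many of them in parallel. The function name and the overhead parameter are hypothetical.

```python
def per_operation_budget_ms(sla_ms: float, operations: int,
                            app_overhead_ms: float = 0.0) -> float:
    """Average time available per database operation, in milliseconds.

    Assumes operations are strictly sequential (a pessimistic, illustrative
    assumption) and that app_overhead_ms is spent outside the database.
    """
    if operations <= 0:
        raise ValueError("operations must be positive")
    remaining = sla_ms - app_overhead_ms
    if remaining <= 0:
        raise ValueError("overhead consumes the whole SLA")
    return remaining / operations


if __name__ == "__main__":
    # 750 ms SLA from the slide; 1,000 operations as an example of "thousands".
    print(per_operation_budget_ms(750, 1000))
```

Under these assumptions each operation has well under a millisecond on average, which is why predictable sub-millisecond reads matter for this workload.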
Bespoke Algorithms
• Blend Internet behavior
• Track every IP address
• Recent fraudulent cart items
• Machine learning generated filters
• Expand to new platforms
Not business process modeling
Easier to code and validate than SQL queries
[Diagram labels: Credit Card Processing System; Fraud Detection & Protection App; Rules ( Rule 1, Rule 2, Rule 3 ); Historical Data; Rule 1 - Passed, Rule 2 - Passed, Rule 3 - Failed; Account Behavior; Static Data; Account Statistics]
Prevent Only Fraudulent Transactions
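The rule pipeline in the diagram can be sketched as plain functions evaluated over a transaction record. This is only an illustration of why such rules are easy to code and validate: the rule contents, field names, and thresholds below are invented for the example and are not from the presentation.

```python
from typing import Any, Callable, Dict, List, Tuple

Transaction = Dict[str, Any]
Rule = Tuple[str, Callable[[Transaction], bool]]

# Hypothetical rules; each check returns True when the transaction passes.
RULES: List[Rule] = [
    ("Rule 1", lambda t: t["amount"] <= 5_000),            # order size limit
    ("Rule 2", lambda t: t["ip"] not in {"203.0.113.7"}),  # known-bad IP list
    ("Rule 3", lambda t: t["recent_fraud_items"] == 0),    # fraudulent cart items
]


def evaluate(txn: Transaction, rules: List[Rule] = RULES) -> List[Tuple[str, str]]:
    """Run every rule and report Passed/Failed per rule."""
    return [(name, "Passed" if check(txn) else "Failed") for name, check in rules]


def is_legitimate(txn: Transaction, rules: List[Rule] = RULES) -> bool:
    """A transaction is accepted only when every rule passes."""
    return all(status == "Passed" for _, status in evaluate(txn, rules))


if __name__ == "__main__":
    txn = {"amount": 120, "ip": "198.51.100.4", "recent_fraud_items": 2}
    for name, status in evaluate(txn):
        print(f"{name} - {status}")
    print("accept" if is_legitimate(txn) else "reject")
```

Each rule is an ordinary function, so it can be unit-tested in isolation, which is one way to read the slide's "easier to code and validate" claim.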
Intra-day System of Record
Challenge
• DB2 (RDBMS) stores positions for 10 million customers
• Must update stock prices, show balances on 300 positions, process 250M transactions, 2M updates/day
• Single view of position across Risk, Mobile, Fraud
• Enable hard-to-code Risk and Fraud algorithms
Intra-day System of Record
High performance NoSQL
• Immediate consistency, no data loss
• Predictable low latency at high throughput
• Flash provides durability, fast restarts
• Cross data center (XDR) for replication
Highly parallel account-to-trades data model
Easy-to-validate Java-based algorithms
[Diagram labels: IBM DB2 ( Mainframe ); Real-Time App; Record App; Finance App; Real-Time Data Feed; Start of the Day Data Loading; End of Day Reconciliation; Account Positions]
■ Internet cache replacement
■ Telecom integrated real-time billing and routing
■ Retail predictive analytics
■ Machine learning
  ■ Online learning updates models rapidly
■ Social messaging
■ Gaming and gambling
Similar use cases
Aerospike Architecture
■ Every node in a cluster is identical, handles both transactions and long running tasks
■ Direct attach storage for 1M+ TPS performance
■ Highly available clustering, integrated transaction processing
[Diagram label: OHIO Data Center]
■ Tests simultaneous read and write performance
  ■ Small block ( 1K ) random read latency
  ■ During continual large-block write load
  ■ 50-50 read-write load
  ■ Defragmentation ( large blocks )
  ■ 24 hours ( variance often found at 7 to 10 hours )
■ Operational databases need to accept reads and writes
  ■ Reads are not localized or predictable
Aerospike ACT certification tool for Flash
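Results of this kind of test are usually summarized as the share of reads slower than some latency threshold (the later slides quote figures such as "95% < 1ms"). The sketch below is not the ACT tool or its output format; it is a minimal illustration of that summary, with a hypothetical function name and arbitrary example thresholds.

```python
from typing import Dict, Iterable, Sequence


def percent_over(latencies_ms: Iterable[float],
                 thresholds_ms: Sequence[float] = (1, 8, 64)) -> Dict[float, float]:
    """Percentage of read latencies that exceed each threshold.

    The thresholds are example values chosen for illustration.
    """
    samples = list(latencies_ms)
    if not samples:
        raise ValueError("no latency samples")
    return {
        t: 100.0 * sum(1 for x in samples if x > t) / len(samples)
        for t in thresholds_ms
    }


if __name__ == "__main__":
    # Synthetic sample: 95 fast reads, 4 slightly slow reads, 1 outlier.
    sample = [0.2] * 95 + [1.5] * 4 + [9.0]
    for threshold, pct in percent_over(sample).items():
        print(f"> {threshold} ms: {pct:.1f}%")
```

Computing this per hour over a 24-hour run, rather than once at the end, is what would expose the kind of late-onset variance the slide mentions at 7 to 10 hours.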
Status of Wide SATA, PCIe, NVMe,
■ Early SATA ( 2009, 2010 )
  ■ Intel X25M
  ■ Samsung SS805
  ■ Devices provide 95% < 1 ms for about 2,000 IOPS
  ■ $3 / GB
■ FusionIO - 2010
  ■ PCIe, but custom driver
  ■ CPU, bus load
  ■ $8 / GB
Historical Perspective – Early Days
■ Micron proves fast PCIe possible
  ■ P320 ( SLC ) with low bus overhead, excellent driver
  ■ Over 200,000 IOPS with 99.7% < 1 ms
  ■ ( SFF-8639 hot-swap 2.5” PCIe drives )
■ “Wide SATA” generally used
  ■ 12 to 20 2.5” SATA drives per 2U chassis
  ■ Intel S3700, S3500; Samsung 843 favored
  ■ 8 drives per controller ( many issues )
  ■ 150,000 IOPS per chassis achievable
■ Violin, FusionIO troubled, DSSD sold to EMC
■ NVMe available but not practical
Historical Perspective – 2013, 2014
■ Linux, Windows drivers achieve performance
■ U.2 and M.2 form factors available
■ Intel P3700, P3600, P3500 available
  ■ 250k IOPS per card
■ Samsung PM1735 available
  ■ 120k IOPS per card
■ Micron 9100
■ HGST, Toshiba: 30k to 50k per card
■ SAS / SATA lingers
  ■ Samsung SM1635, PM1633; Intel S3700; Micron S600 still shipping
NVMe Arrives – 2015 to present
■ High, predictable performance
  ■ High transfer speed reduces jitter
  ■ 2.8 µs NVMe vs 5.0 µs SATA
  ■ Better controllers
  ■ Mature and tuned Linux driver
■ U.2 front panel hot swap available ( and 24-wide )
■ NVMe arrays available
  ■ Apeiron ADS1000 “direct scale-out flash”
  ■ EMC D5, Mangstor NX63020
■ All new Aerospike deployments on NVMe
NVMe’s crushing superiority
1 Million TPS on 1 Server
Options for storage on a database before Aerospike:
■ RAM, which was fast, but allowed very limited storage
■ Disk, which allowed for a lot of storage, but was limited in speed
Intel achieved 1M TPS using 4 Intel P3700 SSDs with 1.6 TB capacity on a single Aerospike server. The cost per GB is a fraction of the cost of RAM, while still delivering very high performance.
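As a sanity check on the 1M TPS figure, the load can be divided across the four devices. The sketch assumes the load is spread evenly and that each transaction costs roughly one device I/O, neither of which the slide states; the function name is hypothetical.

```python
def per_device_tps(total_tps: int, devices: int) -> float:
    """Transactions per second each device must sustain under an even spread."""
    if devices <= 0:
        raise ValueError("devices must be positive")
    return total_tps / devices


if __name__ == "__main__":
    # 1M TPS across 4 SSDs, as quoted on the slide.
    print(per_device_tps(1_000_000, 4))
```

Under those assumptions each SSD handles about 250,000 operations per second, in line with the per-card IOPS quoted for the Intel P3700 family earlier in the deck.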
■ Every public cloud provider has Flash
  ■ AWS / EC2 has sophisticated offerings
  ■ Google Compute is high performance
■ Private clouds manage Flash
  ■ Docker offers storage metadata
  ■ Pivotal manages Flash and traditional storage
Flash in the Public Cloud
Storage Class Memory ( and trends )
■ Diverging “Drive Writes Per Day”
■ Low-write devices
  ■ 1~2 DWPD
  ■ SanDisk InfiniFlash, Micron
  ■ Increase density, lower cost
  ■ Hadoop / Data lake “all flash” use
■ High-write devices
  ■ 10~15 DWPD
  ■ P3700, Hitachi, Samsung
  ■ Optane to the rescue
Trends in NAND
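"Drive Writes Per Day" translates directly into a write budget: the rating times the drive capacity gives the bytes that may be written per day. A minimal sketch follows; the 1.6 TB capacity and 5-year rating period are example values chosen for illustration, not figures from the slide, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def lifetime_writes_tb(dwpd: float, capacity_tb: float, years: float = 5.0) -> float:
    """Total terabytes that can be written over the rating period."""
    return dwpd * capacity_tb * 365 * years


def sustained_write_mb_per_s(dwpd: float, capacity_tb: float) -> float:
    """Average write rate (decimal MB/s) that exactly uses up the daily budget."""
    bytes_per_day = dwpd * capacity_tb * 1e12
    return bytes_per_day / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    for label, dwpd in (("low-write", 1), ("high-write", 10)):
        total = lifetime_writes_tb(dwpd, 1.6)
        rate = sustained_write_mb_per_s(dwpd, 1.6)
        print(f"{label}: {total:.0f} TB lifetime, {rate:.1f} MB/s sustained")
```

A write-heavy operational database can compare its average write rate against the sustained figure to judge which class of device it needs.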
■ Optane this year
  ■ 3D XPoint in 2.5” NVMe package
  ■ “7x faster”, limited by NVMe!
  ■ Very high write durability
  ■ Replaces SLC for some use cases
  ■ Unknown pricing
■ NVDIMM
  ■ Removes NVMe limit
  ■ Intel cagey on delivery: “uncommitted”
  ■ Competes with DRAM
  ■ Really changes the world
Intel’s 3D XPoint roadmap ( public info )
■ It’s not exactly like DRAM
  ■ When a system restarts, need to clear
  ■ Slower
  ■ Different kind of data structures required
■ It’s not exactly like storage
  ■ It’s on the memory bus
  ■ New instructions for persistence
■ Think of it more like DRAM
  ■ Lower power consumption
  ■ Much higher density ( 1T++ )
How to architect for DDR 3D XPoint
Thank You
Questions?