Accelerating Innovation Through Next Generation Storage Technologies
Rebecca Weekly, Sr. Principal Engineer & Sr. Director, Cloud Platforms Group
Source: April 2019 Raconteur
Data Gravity: Challenge or Opportunity?
Data Center | Cloud · Core · Access | Edge · Devices | Things
Storage for the Data Challenge
NVMe over Fabrics
Storage Technology Trends
Computational Storage
NVMe Storage
In Market
Data Storage Paradigms
File Storage
• Hierarchical tree structures
• Applications write to the file system over the network (NAS)
• File system tracks state (metadata) of objects (files)
• Active documents

Block Storage
• Enterprise databases
• Mission-critical core business applications

Object Storage
• Flat key/value structure
• Metadata provides context
• Application keeps track of object location ("URL")
• Unstructured big data and archives
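The contrast between these models can be sketched in a few lines of Python. This is a toy illustration only, not any product's API: an object store is essentially a flat dictionary keyed by an application-chosen name, with user metadata carried alongside each object and no directory tree to traverse.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectStore:
    """Toy flat-namespace store: the application keeps the key ('URL');
    the store keeps user-supplied metadata alongside the bytes."""
    _objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes, **metadata) -> None:
        self._objects[key] = (data, metadata)

    def get(self, key: str) -> bytes:
        return self._objects[key][0]

    def head(self, key: str) -> dict:
        """Return only the metadata -- context without moving the data."""
        return self._objects[key][1]

store = ObjectStore()
# The key looks like a path, but the store does no hierarchy bookkeeping.
store.put("videos/cat.mp4", b"\x00\x01", content_type="video/mp4", owner="rw")
print(store.head("videos/cat.mp4")["content_type"])  # video/mp4
```

Note that `"videos/cat.mp4"` is just a string: unlike a file system, nothing tracks a `videos/` directory, which is what lets object namespaces scale flat.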
Block Storage | File Storage | Object Storage
What Drives Storage Choice?
Unstructured Data Workloads
Capacity requirement in PB range
Data not coupled to application
Applications don’t require strong consistency
Concurrent/Distributed access to content
Granular security and multi-tenancy
Source: http://www.gartner.com/technology/reprints.do?id=1-1R78PJ9&ct=140226&st=sb00
Object Storage Workloads
Core Functions
REST/HTTP API access
High tolerance for latency
Support high concurrency
Eventual consistency
Specific Use Cases
VM Templates, ISO Images, etc.
Disk Volume Snapshots
Backup/Archive
Image/Video Repository
IO Workloads
Static Content, low change rate
Sequential R/W
Lower IOPS with high throughput
Modern Use Cases for Object Storage
Source: https://minio.io
Decoupling the data created from the application enables an entirely new paradigm for data management.
Source: https://minio.io
Disaggregated Data Lake
• Ingest: streaming data, events, logs, sensor data, social media, transactions
• Storage: erasure coding accelerated with Intel® AVX-512 (HDFS, MinIO, RGW)
• Access: applications, data processing, and machine learning (Intel® AVX-512, MKL-DNN) reach the data over S3A and S3 Select (SQL pushdown)
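The S3 Select idea can be illustrated with a hypothetical helper (this is not the AWS or MinIO API): the SQL-style filter runs next to the stored object, so only matching rows cross the network instead of the whole object.

```python
import csv, io

def select_from_object(csv_bytes: bytes, column: str, value: str) -> bytes:
    """Toy stand-in for S3 Select: roughly
    SELECT * FROM object WHERE column = value, evaluated storage-side."""
    rows = csv.DictReader(io.StringIO(csv_bytes.decode()))
    out = io.StringIO()
    writer = None
    for row in rows:
        if row[column] == value:
            if writer is None:  # emit the header once, lazily
                writer = csv.DictWriter(out, fieldnames=row.keys())
                writer.writeheader()
            writer.writerow(row)
    return out.getvalue().encode()

obj = b"event,level\nboot,info\ndisk_fail,error\nlogin,info\n"
hits = select_from_object(obj, "level", "error")
print(len(hits), "<", len(obj))  # far fewer bytes cross the network
```

The win grows with object size: filtering a multi-GB CSV down to a few matching rows at the storage layer is what makes the disaggregated data lake practical over ordinary network links.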
Distributed Asynchronous Object Storage
Benefits
▪ For persistent memory
▪ End-to-end OS bypass
▪ Low-latency I/O
▪ True zero-copy I/O
▪ Non-blocking
▪ Scalable communications & I/Os
3rd Party Applications
Rich Data Models
Storage Platform
Network
DAOS Storage Engine (open source, Apache 2.0 license)
POSIX I/O
Workflow
HDF5 | Apache Arrow
SQL …
Data Plane Control Plane
Libfabric
OPA | RoCE | GNI | InfiniBand | Sockets
TCP/IP TLS
iWARP
Learn about the architecture and features of Distributed Asynchronous Object Storage (DAOS). This open source object store is based on the Persistent Memory Development Kit (PMDK) for massively distributed non-volatile memory applications.
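DAOS persists data through PMDK's byte-addressable load/store model rather than a block I/O path. As a rough illustration only (an ordinary `mmap` on a file standing in for persistent memory, and `flush` standing in for CPU cache-line writeback instructions like CLWB), the access pattern looks like this:

```python
import mmap, os, tempfile

path = os.path.join(tempfile.mkdtemp(), "pmem_pool")
with open(path, "wb") as f:
    f.truncate(4096)  # a tiny stand-in "pool"

# Store path: update bytes in place through the mapping, then flush.
with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 4096)
    mm[0:5] = b"hello"   # plain memory store, no write() syscall
    mm.flush()           # stand-in for CLWB/SFENCE on real persistent memory
    mm.close()

# Load path: reopen and read back -- the data outlived the mapping.
with open(path, "rb") as f:
    print(f.read(5))  # b'hello'
```

Real PMDK avoids even the `msync`-style flush cost by issuing user-space cache flush instructions, which is what enables the "end-to-end OS bypass" and "true zero-copy I/O" bullets above.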
Persistent Storage for Cloud Apps
FaaS and microservice architectures are not stateless: data is created or read at many points in the architecture.
Image Source: https://microservices.io/patterns/microservices.html
Industry-Standard Server-Based Scale-out Storage Solution
Optimizing for Performance and Value
Scale-out
HTTP Proxy Gateways
Proxy Network
Replication/Cluster Network
Intel® Xeon® SP Processor
Intel® Ethernet
Intel® SSD
Storage Node1
Object Storage Server
Intel® Xeon® SP Processor
Intel® Ethernet
Intel® SSD
Storage Node2
Object Storage Server
Intel® Xeon® SP Processor
Intel® Ethernet
Intel® SSD
Storage Node3
Object Storage Server
Scalable Cluster Framework
Proxy Gateway Servers
Clients Clients Clients
Client Network
Authentication
Monitoring and Management
Intel Technology at all levels of the stack
Intel® Xeon® SP processor-based servers
Intel® SSD Data Center Family
10/25/40/50/100 Gigabit Intel® Ethernet
Network Adapters
Storage Performance Development Kit (SPDK)
Persistent Memory Development Kit (PMDK)
Intel® Optane™ DCPMM
Performance and Efficiency - Intel® Processors
Intel® AVX-512 is designed to improve both latency and throughput
Provides a 2X-3X performance boost over previous-generation processors for storage functions:
▪ High speed, high bandwidth, vector pipeline integer operations
▪ XOR operations from ISA-L
▪ Hashing
▪ Erasure Codes
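The XOR and erasure-code primitives from ISA-L reduce to a simple invariant, sketched here in plain Python (ISA-L performs the same XOR across wide AVX-512 vector registers, which is where the speedup comes from): parity is the XOR of the data blocks, so any single lost block is the XOR of the survivors.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together (what ISA-L's xor_gen
    accelerates with AVX-512)."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data = [b"\x01\x02", b"\x0f\x00", b"\xf0\xff"]
parity = xor_blocks(data)          # stored alongside the data (k+1 layout)

lost = 1                           # pretend block 1 failed
survivors = [b for i, b in enumerate(data) if i != lost] + [parity]
recovered = xor_blocks(survivors)  # XOR of survivors rebuilds the lost block
print(recovered == data[lost])     # True
```

A single XOR parity tolerates one failure; the Reed-Solomon codes in the chart generalize the same idea to multiple parity blocks (e.g. 10+4) at higher arithmetic cost, which is why vector acceleration matters.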
Chart: generational cycle/byte comparison (higher is better) for XOR Gen (16+1), PQ Gen (16+2), and Reed-Solomon EC (10+4).
Chart: generational comparison (higher is better) for multi-hash SHA-1, multi-hash SHA-1 Murmur, multi-hash SHA-256, and multi-buffer SHA-1, SHA-256, SHA-512, and MD5.
Processors compared: Intel® Xeon® Processor E5-2650v3, Intel® Xeon® Processor E5-2650v4, Intel® Xeon® Platinum 8180 Processor.
Performance results are based on testing as of September 2019 and may not reflect all publicly available security updates. See configuration disclosure on slide 27 for details. No product can be absolutely secure. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
Intel® Xeon® Processor E5-2600v3, E5-2650v3, 10C, 2.3 GHz, M1, Aztec City CRB, 4x8 GB DDR4 2133 MT/s ECC RDIMM. Intel® Xeon® Processor E5-2600v4, E5-2650v4, 12C, 2.2 GHz, M0, Aztec City CRB, 4x8 GB DDR4 2400 MT/s ECC RDIMM. Intel® Xeon® Processor Scalable Family, Platinum 8180 Processor, 28C, 2.5 GHz, H0, Neon City CRB, 6x16 GB DDR4 2666 MT/s ECC RDIMM. BIOS configuration: P-States: Disabled, Turbo: Disabled, Speed Step: Disabled, C-States: Disabled, Power Performance Tuning: Disabled, ENERGY_PERF_BIAS_CFG: PERF, Isochronous: Disabled, Memory Power Savings: Disabled. ISA-L 2.19.
Intel® Optane™ DC Persistent Memory
• Data persistence enabling memory-centric applications (App Direct)
• Increased memory size (Memory Mode) for performance
• Improved TCO
intel.com/optane
Performance and Scale: Intel Storage
Memory/storage hierarchy: DRAM (hot tier), persistent memory, Intel® 3D NAND SSDs, HDD/tape (cold tier)

Intel® Optane™ SSD DC D4800X (dual port)
• Performance + resiliency for critical enterprise IT apps
• Dual-port connections enable 24x7 data availability with redundant, hot-swappable data paths

Intel® SSD D5-P4326 (E1.L)
• Cost-optimized, enables greater warm storage
• E1.L form factor scalable to ~1PB in 1U
Compute Express Link (CXL) Enabled Computing
CXL enables a more fluid and flexible memory model: a single, common memory address space across processors and devices.
CPU CPU GPU FPGA AI NIC NIC
CPU-attached Memory (OS-managed)
Accelerator-attached Memory (runtime-managed cache)
Writeback Memory
Memory Load/Store
PCIe DMA
• Create shared memory pools
• Enhance movement of operands and results between accelerators and target devices
• Enable efficient resource sharing
• Significant latency reduction to enable disaggregated memory
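The shared-memory-pool idea can be approximated in software, with the caveat that CXL provides hardware cache coherence that no software analogy reproduces. A stdlib sketch using a named shared-memory segment (the names and sizes here are arbitrary illustration, not any CXL API):

```python
from multiprocessing import shared_memory

# Host "CPU" allocates a pool; a second handle (stand-in for an
# accelerator) attaches to the same physical pages by name --
# one address space, no copy of the operand.
pool = shared_memory.SharedMemory(create=True, size=1024)
device_view = shared_memory.SharedMemory(name=pool.name)
try:
    pool.buf[0:4] = b"\xde\xad\xbe\xef"  # CPU stores an operand in place
    seen = bytes(device_view.buf[0:4])   # "device" loads it directly
    print(seen)
finally:
    device_view.close()
    pool.close()
    pool.unlink()
```

In the software analogy the two handles must still coordinate explicitly; CXL's contribution is doing this coherently in hardware across CPUs, GPUs, FPGAs, and NICs, which is what makes disaggregated memory pools latency-practical.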
CXL Consortium: currently 83 companies and growing. Learn more at www.ComputeExpressLink.org.
Performance and Efficiency: Intel Tools
• SPDK (Storage Performance Development Kit): 5X FIO for NVMe¹
• OCF (Open Cache Acceleration Software Framework): 5X Ceph workload with Optane cache¹
• PMDK (Persistent Memory Development Kit): 8X Cassandra with native persistence¹
• Intel® VTune™ Amplifier: 2.2X Netflix GbE¹
Performance results are based on testing as of September 2019 and may not reflect all publicly available security updates. See configuration disclosure on slide 29-35 for details. No product can be absolutely secure. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
Rack
Top-of-Rack Switch (TOR)
End-of-Row Switch (EOR)
Spine Switch
Router
Inter-DC Links (DCI)
Optical Links (Interconnect)
Intel® Ethernet
Intel® Silicon Photonics
Programmable Switches
Accelerator Pools | Servers & Storage
Criticality of Connectivity for Storage
Performance | Scalability | Efficiency
All Ethernet NVMe-oF Protocols in a Single Adapter with Intel® Ethernet 800 Series
NVMe-oF* (Non-Volatile Memory Express over Fabrics)
*Other names and brands may be claimed as the property of others.
RDMA (Remote Direct Memory Access)
iWARP* | RoCE* v2 (RDMA over Converged Ethernet ver. 2)
Infiniband* Fibre Channel
NVMe*/TCP (Transmission Control Protocol)
Future Fabrics
= Supported in Intel® Ethernet 800 Series (“Columbiaville”)
Ethernet-based
Intel® Omni-Path Architecture(Intel® OPA)
Application Device Queues (ADQ)
Performance and Scale: Intel Networking
With ADQ: application traffic steered to a dedicated set of queues
Without ADQ: application traffic intermixed with other traffic types
Applying ADQ to NVMe/TCP
Adding the Intel® Ethernet 800 Series with ADQ to NVMe/TCP narrows the performance gaps with RDMA NVMe-oF solutions
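The effect ADQ targets can be shown with a toy queueing model (deliberately simplified; real ADQ steers packets to queue sets in the NIC, and the costs below are made-up units): when latency-sensitive requests share one queue with bulk transfers, each one waits behind a large transfer; a dedicated queue removes that head-of-line blocking and collapses tail latency.

```python
def completion_times(queue):
    """Serve a FIFO of (cost, tag) work items; return when each
    latency-sensitive 'app' item completes."""
    t, out = 0, []
    for cost, tag in queue:
        t += cost
        if tag == "app":
            out.append(t)
    return out

app  = [(1, "app")] * 4     # small, latency-sensitive requests
bulk = [(20, "bulk")] * 4   # large transfers sharing the wire

# Shared queue: every small request lands behind a bulk transfer.
mixed = [x for pair in zip(bulk, app) for x in pair]

worst_mixed     = max(completion_times(mixed))
worst_dedicated = max(completion_times(app))  # ADQ-style isolation
print(worst_dedicated, "<<", worst_mixed)
```

This is why the slide frames ADQ as a predictability technology: the mean may move modestly, but the worst-case (tail) latency of the isolated application drops sharply.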
Linux kernel updates: NVMe/TCP for ADQ released for comments
ADQ: an open technology designed to improve application predictability, latency, and throughput
(Lower is better)
ADQImprovement
Performance results are based on testing as of September 2019 and may not reflect all publicly available security updates. See configuration disclosure on slide 28 for details. No product can be absolutely secure. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
Example: Folded Clos Network Topology – Facebook
Tiered structure results in more E-W bandwidth congestion. Streaming inference tends to be done in the same rack as the web-tier nodes to avoid congestion, often with batch sizes of 1-2 because the folded topology prevents taking advantage of scale.
Source: http://firstclassfunc.com/facebook-fabric-networking
Example: Fully Routed Clos – Google, Azure, IaaS trend
Very high E-W BW and uniform latencies across any two nodes within a much larger zone. Facilitates distributed functions for inference, e.g., delegating through an RPC call to a TPU node.
Scale can allow streaming requests to be batched on Inference node for increased efficiency
Standards bring calm to the chaos (NVMe-oF, NVMe over RDMA, NVMe over TCP, etc.)
SNIA Board of Directors: Jim Pappas, Vice Chairman and Executive Committee
Technical Council: Alan Bumgarner
Solid State Storage Initiative (SSSI): Jenni Dietz, Co-Chair
Solid State Drive Special Interest Group (SSD SIG): Jonmichael Hands, Co-Chair
PM/NVDIMM Special Interest Group: Jim Pappas & Jenni Dietz
Networking Storage Forum (NSF): Christine McMonigal
NVM Programming TWG: Alan Bumgarner, Co-Chair
Computational Storage TWG: Nick Adams, Co-Chair
SFF TA TWG: Anthony Constantine
Swordfish: Barry Kittner
Data Center | Cloud · Core · Access | Edge · Devices | Things
Architecting Storage for the Data Challenge
Visit other Intel sessions:
9/23 8:30 am | Nick Adams, J Metz (Cisco) | Cypress | NVM Express Specifications: Mastering Today’s Architecture and Preparing for Tomorrow’s
9/23 9:30 am | Jim Harris, Paul Luse | Cypress | Squeezing Compression into SPDK
9/23 10:35 am | Piotr Wysocki | Stevens Creek | Scalable Storage Management with NVMe and NVMe-oF
9/23 2:30 pm | Benjamin Walker | Cypress | 10 Million I/Ops From a Single Thread
9/24 1:00 pm | Changpeng Liu, Xiaodong Liu | Lafayette/San Tomas | Introduction of SPDK vhost FUSE Target to Accelerate File Access in VM and Containers
9/24 2:00 pm | Alan Bumgarner, Tom Talpey (Microsoft) | Stevens Creek | Nonvolatile Memory Programming TWG - Remote Persistent Memory
9/24 3:05 pm | Haodong Tang | Stevens Creek | Spark-PMoF: Accelerating big data analytics with Persistent Memory over Fabric
9/24 4:05 pm | Lisa Li, Tushar Gohad | Lafayette/San Tomas | A Crash-consistent Client-side Cache for Ceph
9/24 7:00 pm | Fred Zhang | Winchester | BOF - Considerations in NVMe-oF Storage Transport Protocols
9/25 9:00 am | Peter Onufryk | Santa Clara Ballroom | NVMe State of the Union
9/25 2:00 pm | Usha Upadhyayula | Stevens Creek | Volatile Use of Persistent Memory
9/25 4:05 pm | Nick Adams | Cypress | What Happens when Compute Meets Storage? – Computational Storage TWG
9/26 8:30 am | Michael Strassmaier | Stevens Creek | Intel® Optane™ DC Persistent Memory Performance Review
9/26 8:30 am | Andrzej Jakowski, Adrian Pearson | Lafayette/San Tomas | Data-At-Rest Protection at Data Center Scale with NVMe* and Opal*
9/26 9:30 am | Dave Minturn, Anil Vasudevan | Winchester | Selecting an NVMe over Fabrics Ethernet Transport, RDMA or TCP
9/26 9:30 am | Andy Rudoff | Stevens Creek | Persistent Memory Programming Made Easy with pmemkv
9/26 11:35 am | Ziye Yang | Winchester | SPDK based user space NVMe over TCP Transport Solution
9/26 3:35 pm | Vishal Verma, John Kariuki | Stevens Creek | Improved Storage Performance Using the New Linux Kernel I/O Interface
Join Birds-of-a-Feather NVMe-oF session Today 7:00 PM in Winchester Room and attend “Selecting NVMe-oF Ethernet Transport RDMA or TCP” presentation Thurs 9:30 AM in Winchester Room to learn more
NOTICES & DISCLAIMERSIntel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration.
Performance results are based on testing as of September 2019 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. For more complete information about performance and benchmark results, visit http://www.intel.com/benchmarks .
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/benchmarks .
Intel® Advanced Vector Extensions (Intel® AVX)* provides higher throughput to certain processor operations. Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to operate at less than the rated frequency and b) some parts with Intel® Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies. Performance varies depending on hardware, software, and system configuration and you can learn more at http://www.intel.com/go/turbo.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.
© 2019 Intel Corporation.
Intel, the Intel logo, and Intel Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as property of others.
NVME/TCP WITH ADQ ACCELERATION TESTING CONFIGURATION
SUT (Host) Client (Initiator)
Test by Intel Intel
Test date 09/17/19 09/17/19
Platform Dell R740XD Dell R740XD
# Nodes 1 1
# Sockets 2 2
CPU Intel® Xeon® Platinum 8168 (33M cache, 2.70GHz) | Intel® Xeon® Platinum 8168 (33M cache, 2.70GHz)
Cores/socket, Threads/socket 48 cores/socket, 2 threads/socket | 48 cores/socket, 2 threads/socket
Microcode 0x200005a 0x200005a
HT Enabled Enabled
Turbo Enabled Enabled
BIOS version Dell 2.1.8 Dell 2.1.8
System DDR Mem Config: slots / cap / run-speed 4 slots / 32GB / 2666 MT/s 8 slots / 16GB / 2666 MT/s
System DCPMM Config: slots / cap / run-speed N/A N/A
Total Memory/Node (DDR+DCPMM) 128GB DDR4-2666 RDIMM 128GB DDR4-2666 RDIMM
Storage - boot 128GB SATA3 SSD 128GB SATA3 SSD
Storage - application drives 6x Intel® Optane SSD DC P4800X Series (375GB, 2.5in PCIe 3.1) N/A
NIC Intel E810-C Intel E810-C
Platform Chipset Intel Corporation C620 Series Chipset Family Intel Corporation C620 Series Chipset Family
Other HW (Accelerator) N/A N/A
OS Red Hat Enterprise Linux 7.6 Red Hat Enterprise Linux 7.6
Kernel 5.2.1 5.2.1
IBRS (0=disable, 1=enable) 1 1
eIBRS (0=disable, 1=enable) 0 0
Retpoline (0=disable, 1=enable) 1 1
IBPB (0=disable, 1=enable) 1 1
PTI (0=disable, 1=enable) 1 1
Mitigation variants (1,2,3,3a,4, L1TF) 1,2,3,L1TF 1,2,3,L1TF
Workload & version Fio-3-7 Fio-3-7
Compiler
NIC Driver (SUT): RDMA driver: ice-0.12.0_rc3 (irdma-0.12.113), firmware-version: 0x800018f7; TCP driver: ice-0.12.0_rc3, firmware-version: 0x800018f7; TCP(ADQ) driver: ice-0.11.2_rc3_adq_isv, firmware-version: 0x80001563
NIC Driver (Client): RDMA driver: ice-0.12.0_rc3 (irdma-0.12.113), firmware-version: 0x800018f7; TCP driver: ice-0.12.0_rc3, firmware-version: 0x800018f7; TCP(ADQ) driver: ice-0.11.2_rc3_adq_isv, firmware-version: 0x80001563
Hardware Configuration for System-Level Performance
Component: Single DIMM Config
Test by Intel
Test date 02/20/2019
Platform NeonCity
Chipset LBG B1
CPU CLX B0 28 Core (QDF QQYZ)
DDR Speed 2666 MT/s
AEP QS Tranche3, 256GB, 18W
Memory Config: 32GB DDR4 (per socket), 128GB AEP (per socket)
AEP FW 5336
BIOS 573.D10
BKC version WW08 BKC
Linux OS 4.20.4-200.fc29
Spectre/Meltdown Patched (1,2,3,3a)
Performance Tuning QoS Disabled, IODC=5(AD)
SSDs
Intel-tested: Measured using FIO 3.1. Common Configuration - Intel 2U Server System, OS CentOS 7.5, kernel 4.17.6-1.el7.x86_64, CPU 2 x Intel® Xeon® 6154 Gold @ 3.0GHz (18 cores), RAM 256GB DDR4 @ 2666MHz. Configuration – Intel® Optane™ SSD DC P4800X 375GB and Intel® SSD DC P4610 3.2TB. Intel Microcode: 0x2000043; System BIOS: 00.01.0013; ME Firmware: 04.00.04.294; BMC Firmware: 1.43.91f76955; FRUSDR: 1.43.
Intel® Optane™ DC Persistent Memory
The benchmark results may need to be revised as additional testing is conducted. Performance results are based on testing as of November 15, 2018 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.
SPDK SYSTEM CONFIGURATION
Performance results are based on testing by Intel as of 2/26/2019 and may not reflect all publicly available security updates. See configuration disclosure for details. No product or component can be absolutely secure. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks. Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz + P4610: Tested by Intel on 4/12/2019, S2600WFT Platform with 12 x 16GB 2666MHz DDR4 (total 192GB), Storage: Intel® SSD DC S3700 800GB, Storage drives: 20x Intel® SSD DC P4610 (2TB), SPDK: (16x P4610s), URING: (4x P4610s), AIO: (2x P4610s), Bios: SE5C620.86B.0D.01.0250.112320180145, ucode: 0x4000010 (HT=ON, Turbo=ON), OS: Fedora 29, Kernel: 5.0.0-rc6+, Benchmark: bdevperf, QD= 32 (for SPDK), QD= 64 (for URING), QD=128 (for AIO), runtime = 300s, SPDK commit: b62dca930, SPDK compiled with LTO, PGO gcc compiler options, for URING (tuning: echo 0 > /sys/block/$dev/queue/iostats, echo 0 > /sys/block/$dev/queue/rq_affinity, echo 2 > /sys/block/$dev/queue/nomerge, echo 0 > /sys/block/$dev/queue/io_poll_delay)Results: 4K 100% Random Reads (100%) SPDK = 8.15M IOPSResults: 4K 100% Random Reads (100%) URING = 1.56M IOPSResults: 4K 100% Random Reads (100%) AIO = 0.614M IOPS
29
OCF FOOTNOTES/SYSTEM CONFIGURATIONS
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks. Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.
System Configuration for slides titled “CAS + Intel® Optane™ SSD Accelerating MySQL” (pages 27-28) and for performance claim “MySQL up to 5.1X as fast w/CAS + Intel® Optane™ SSD” (pages 6, 8, 15) and for performance claim “MySQL* accelerated 5.11X” (pages 10, 26)System configuration –Red Hat Enterprise Linux 7.3, Kernal 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016, Purley Silver Wolf Pass S2600WFQ, BIOS Version: SE5C620.86B.0X.01.0107.122220170349, BIOS Release Date: 12/22/2017, Skylake H0 (2 Processors)(24 cores each processor, hyper-threading is enabled in BIOS so thread count per processor is 48) Intel® Xeon® Platinum 8160T CPU @ 2.10GHz, Intel(R) Rapid Storage Technology enterprise PreOS Version : 5.3.0.1052, 256GB Physical RAM installed but set to 128GB in the grub2 configuration, Intel 82574L Gigabit Ethernet Adapter, VMD enabled in BIOS and VROC HW key (Premium) installed and activated., Package C-State set to C6(non retention state) and Processor C6 set to enabled in BIOS, P-States set to default in BIOS and SpeedStep and Turbo are enabled, BMC version: 1.43.33e8d6b4 ME version: 4.00.04.309 SDR Package version: 1.43, fio version: fio-3.5-86-gcefd2, (VROC) mdadm - v4.0 - 2017-09-22 Intel build: RSTe_5.3_WW38.5, kmod-md-rste-5.3-514_4.el7_3.x86_64
System Configuration for slides “Accelerating Ceph* using HDD Backing Store” (page 33 - 34), performance claims “Ceph* Reads up to 4.9 X Faster with CAS + Intel® Optane™ SSD” and “Ceph* Writes up to 4.8 X Faster with CAS + Intel® Optane™ SSD” (pages 6, 8, 15) and “Ceph* reads 4.9X faster, Ceph writes 4.8X faster” (pages 12, 31)Baseline 4-Node Cluster: HDD OSD Drives with Journals on Intel S4600 SSD’s: 3x OSD 1x Mon/RGW Nodes: Server Intel S2600GZ (Grizzly Pass), CPUs 2x Intel® Xeon® Ivy Bridge E5-2660v2 @ 2.20GHz, 64GB Mem, SATA Boot SSD 1 x 800GB Intel® SSD DC S3700, OSD HDD 7 x 4TB WD* WDC_WD4003FZEX (excl. Mon/RGW), SATA Journal SSD 1 x 2TB Intel® SSD DC S4600, Network 2 x Intel® X540-AT2 10Gbe NICs; Ceph journal size: 10GB x 7. Value 4-Node Cluster: HDD OSD Drives with Journals on Optane, with/without CAS: Same as Baseline except NVMe Journal and cache 2 x 375GB Intel P4800x Optane; Ceph Journal size: 10GB x 7, Cache Size: 320GB x 2. Software: Ceph Luminous v12.2.3, RHEL 7.4 Updated, COSBench 0.4.2.c4, Intel CAS 3.5.1 (Value)
30
CONFIGURATION
SUMMARY
Parameter NVMe DCPMM
Test by Intel/Java Performance Team Intel/Java Performance Team
Test date 22/02/2019 22/02/2019
Platform S2600WFD S2600WFD
# Nodes 1 1
# Sockets 2 2
CPU 8280L 8280L
Cores/socket, Threads/socket 28/56 28/56
ucode 0x4000013 0x4000013
HT On On
Turbo On On
BIOS version SE5C620.86B.0D.01.0286.011120190816 SE5C620.86B.0D.01.0286.011120190816
DCPMM BKC version NA WW52 -2018
DCPMM FW version NA 5318
System DDR Mem Config: slots / cap / run-speed 12 slots / 16GB / 2666 12 slots / 16GB / 2666
System DCPMM Config: slots / cap / run-speed - 12 slots / 512GB
Total Memory/Node (DDR, DCPMM) 192GB, 0 192GB, 6TB
Storage - boot 1x Intel 800GB SSD OS Drive 1x Intel 800GB SSD OS Drive
Storage - application drives 4x P4610 1.6TB NVMe 12x512GB DCPMM
NIC 1x Intel X722 1x Intel X722
Software
OS Red Hat Enterprise Linux Server 7.6 Red Hat Enterprise Linux Server 7.6
Kernel 4.19.0 (64bit) 4.19.0 (64bit)
Mitigation log attached Yes Yes
DCPMM mode NA App Direct, Persistent Memory
Run Method 5 minute warm up post boot, then start performance recording (both configurations)
Iterations and result choice 3 iterations, median 3 iterations, median
Dataset size Two 1.5 Billion Partitions (Insanity schema) Two 1.5 Billion Partitions (Insanity schema)
Workload & version Read Only, Mix 80% Read/20% Updates, Updates Only (both configurations)
Compiler ANT 1.9.4 compiler for Cassandra ANT 1.9.4 compiler for Cassandra
Libraries NA PMDK 1.5, LLPL (latest as of 2/20/2019)
Other SW (Frameworks, Topologies…) NA NA
PMDK - Hardware Configuration Diagram
Diagram: two clients running cassandra-stress connect over 10Gb networks through a 10Gbit switch to a CLX server. Each server socket (S0, S1) is populated with DRAM and Optane PM modules, and the server also contains 4x NVMe P4610 drives. (Intel Confidential - CNDA Required)
PMDK - Software Configuration Diagram
Diagram: Client 1 runs cassandra-stress 1 and 2; Client 2 runs cassandra-stress 3 and 4. Server 1 runs Cassandra App 1 (Database 1) on Socket 1 and Cassandra App 2 (Database 2) on Socket 2, each backed by a persistent memory namespace or 2 NVMe drives. (Intel Confidential - CNDA Required)
ISA-L FOOTNOTES/SYSTEM CONFIGURATIONS
CLX:
Intel(R) Xeon(R) Platinum 8280L, 28C, 2.7 GHz, H0, Neon City CRB, 12x16 GB DDR4 2933 MT/s ECC RDIMM, Micron MTA18ASF2G72PDZ-2G9E1TG, NUMA Memory Configuration, Red Hat Enterprise Linux Server 7.5 64-bit OS, kernel 3.10.0-957.1.3.el7.x86_64, BIOS ENERGY_PERF_BIAS_CFG: PERF, Disabled: P-States, Turbo, Speed Step, C-States, Power Performance Tuning, Isochronous, Memory Power Savings, ISA-L 2.25
CLX:
Intel(R) Xeon(R) Gold 6230, 20C, 2.1 GHz, H0, Neon City CRB, 12x16 GB DDR4 2933 MT/s ECC RDIMM, Micron MTA18ASF2G72PDZ-2G9E1TG, NUMA Memory Configuration, Red Hat Enterprise Linux Server 7.5 64-bit OS, kernel 3.10.0-957.1.3.el7.x86_64, BIOS ENERGY_PERF_BIAS_CFG: PERF, Disabled: P-States, Turbo, Speed Step, C-States, Power Performance Tuning, Isochronous, Memory Power Savings, ISA-L 2.25
SKX:
Intel(R) Xeon(R) Gold 6126, 12C, 2.6 GHz, H0, Neon City CRB, 12x16 GB DDR4 2666 MT/s ECC RDIMM, Micron MTA36ASF2G72PZ-2G6B1QI, NUMA Memory Configuration, Red Hat Enterprise Linux Server 7.4 64-bit OS, kernel 3.10.0-693.21.1.el7.x86_64, BIOS ENERGY_PERF_BIAS_CFG: PERF, Disabled: P-States, Turbo, Speed Step, C-States, Power Performance Tuning, Isochronous, Memory Power Savings, ISA-L 2.23 vs ISA-L 2.25
BDX:
Intel(R) Xeon(R) E5-2650v4, 12C, 2.2 GHz, B0, Aztec City CRB, 8x8 GB DDR4 2400 MT/s ECC RDIMM, Samsung M393A1G43DB0, NUMA Memory Configuration, Red Hat Enterprise Linux Server 7.4 64-bit OS, kernel 3.10.0-693.21.1.el7.x86_64, BIOS ENERGY_PERF_BIAS_CFG: PERF, Disabled: P-States, Turbo, Speed Step, C-States, Power Performance Tuning, Isochronous, Memory Power Savings, ISA-L 2.23
See page 21, “Notices and Disclaimers.” Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance.