Accelerating innovation through next generation storage technologies REBECCA WEEKLY SR. PRINCIPAL ENGINEER & SR. DIRECTOR CLOUD PLATFORMS GROUP


Page 1: Accelerating innovation through next generation storage ... · • File system tracks state (metadata) of objects (files) • Active Documents • Enterprise databases ... Open Source

Accelerating innovation through next generation storage technologies

REBECCA WEEKLY, SR. PRINCIPAL ENGINEER & SR. DIRECTOR, CLOUD PLATFORMS GROUP

Page 2

Source: April 2019 Raconteur

Page 3

Data Gravity: Challenge or Opportunity?

Page 4

Data Center | Cloud · Core · Access | Edge · Devices | Things

Storage for the Data Challenge

Page 5

Storage Technology Trends

NVMe over Fabrics

Computational Storage

NVMe Storage

In Market

Page 6

Data Storage Paradigms

File Storage
• Hierarchical tree structures
• Applications write to file system over network (NAS)
• File system tracks state (metadata) of objects (files)
• Active Documents

Block Storage
• Enterprise databases
• Mission critical core business applications

Object Storage
• Flat key / value structure
• Metadata provides context
• Application keeps track of object location (“URL”)
• Unstructured Big Data and Archives
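As a rough illustration of the file and object paradigms on this slide, the sketch below contrasts a hierarchical namespace that is walked one path component at a time with a flat key/value store in which metadata travels with each object. The class names, keys, and metadata fields are hypothetical and are not taken from any product mentioned in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """File storage: a hierarchical tree that the file system itself tracks."""
    entries: dict = field(default_factory=dict)  # name -> Directory or bytes

    def write(self, path: str, data: bytes) -> None:
        *dirs, name = path.strip("/").split("/")
        node = self
        for d in dirs:
            # Create intermediate directories on demand.
            node = node.entries.setdefault(d, Directory())
        node.entries[name] = data

    def read(self, path: str) -> bytes:
        node = self
        for part in path.strip("/").split("/"):
            node = node.entries[part]
        return node


@dataclass
class ObjectStore:
    """Object storage: a flat key/value map with metadata kept per object."""
    objects: dict = field(default_factory=dict)  # key -> (data, metadata)

    def put(self, key: str, data: bytes, **metadata) -> str:
        self.objects[key] = (data, metadata)
        return key  # the application keeps track of this key (its "URL")

    def get(self, key: str):
        return self.objects[key]


fs = Directory()
fs.write("/reports/2019/q3.txt", b"draft")

store = ObjectStore()
key = store.put("reports-2019-q3", b"draft", content_type="text/plain", owner="finance")
```

In the tree, the location of the data is implied by the path and maintained by the file system; in the flat store, the application holds the key and the metadata supplies the context.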

Page 7

What Drives Storage Choice?

Unstructured Data Workloads

Capacity requirement in PB range

Data not coupled to application

Applications don’t require strong consistency

Concurrent/Distributed access to content

Granular security and multi-tenancy

Source: http://www.gartner.com/technology/reprints.do?id=1-1R78PJ9&ct=140226&st=sb00

Page 8

Object Storage Workloads

Core Functions

REST/HTTP API access

High tolerance for latency

Support high concurrency

Eventual consistency

Specific Use Cases

VM Templates, ISO Images, etc.

Disk Volume Snapshots

Backup/Archive

Image/Video Repository

IO Workloads

Static Content, low change rate

Sequential R/W

Lower IOPS w/High throughput
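The "lower IOPS with high throughput" profile follows from throughput being IOPS multiplied by I/O size. A minimal sketch of that arithmetic; the IOPS figures and I/O sizes are assumptions chosen for illustration, not measurements from this deck.

```python
def throughput_mib_s(iops: float, io_size_kib: float) -> float:
    """Throughput in MiB/s for a given IOPS rate and I/O size in KiB."""
    return iops * io_size_kib / 1024


# Small random I/O: many operations, little data moved per operation.
random_small = throughput_mib_s(iops=100_000, io_size_kib=4)

# Large sequential I/O: few operations, a lot of data moved per operation.
sequential_large = throughput_mib_s(iops=2_000, io_size_kib=4096)

print(f"4 KiB random:     {random_small:.1f} MiB/s")
print(f"4 MiB sequential: {sequential_large:.1f} MiB/s")
```

With these assumed numbers the sequential workload issues 50 times fewer operations yet moves roughly 20 times more data per second.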

Page 9

Modern Use Cases for Object Storage

Source: https://minio.io

Decoupling the data created from the application enables an entirely new paradigm for data management.

Diagram: disaggregated data lake

Data sources: Streaming Data, Events, Logs, Sensor Data, Social Media, Transactions

Storage: Erasure Coding (HDFS, MinIO, RGW) with Intel® AVX-512

Consumers: Applications, Data Processing, Machine Learning (Intel® AVX-512, MKL-DNN)

Access paths: S3 Select, S3A, SQL

Page 10

Distributed Asynchronous Object Storage

Benefits

▪ For persistent memory

▪ End-to-end OS bypass

▪ Low-latency I/O

▪ True zero-copy I/O

▪ Non-blocking

▪ Scalable communications & I/Os

3rd Party Applications

Rich Data Models

Storage Platform

Network

DAOS Storage Engine (Open Source, Apache 2.0 License)

POSIX I/O

Workflow

HDF5, Apache Arrow

SQL …

Data Plane | Control Plane

Libfabric

OPA, RoCE, GNI, InfiniBand, Sockets, iWARP

TCP/IP, TLS

Learn about the architecture and features of Distributed Asynchronous Object Storage (DAOS). This open source object store is based on the Persistent Memory Development Kit (PMDK) for massively distributed non-volatile memory applications.

Page 11

Persistent Storage for Cloud Apps

FaaS or microservice architectures are not stateless: data is created or read in at many points in the architecture.

Image Source: https://microservices.io/patterns/microservices.html

Page 12

Industry-Standard Server-Based Scale-out Storage Solution

Optimizing for Performance and Value

Scale-out

HTTP Proxy Gateways

Proxy Network

Replication/Cluster Network

Storage Node1: Object Storage Server (Intel® Xeon® SP Processor, Intel® Ethernet, Intel® SSD)

Storage Node2: Object Storage Server (Intel® Xeon® SP Processor, Intel® Ethernet, Intel® SSD)

Storage Node2: Object Storage Server (Intel® Xeon® SP Processor, Intel® Ethernet, Intel® SSD)

Scalable Cluster Framework

Proxy Gateway Servers

Clients Clients Clients

Client Network

Authentication

Monitoring and Management

Intel Technology at all levels of the stack

Intel® Xeon® SP processor-based servers

Intel® SSD Data Center Family

10/25/40/50/100 Gigabit Intel® Ethernet Network Adapters

Storage Performance Development Kit (SPDK)

Persistent Memory Development Kit (PMDK)

Intel® Optane™ DCPMM

Page 13

Performance and Efficiency - Intel® Processors

Intel® AVX-512 is designed to improve both latency and throughput

Provides over 2X-3X performance boost over previous generation processors for storage functions:

▪ High speed, high bandwidth, vector pipeline integer operations

▪ XOR operations from ISA-L

▪ Hashing

▪ Erasure Codes
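To make the XOR and erasure-code items concrete, here is a minimal pure-Python sketch of single-parity XOR protection, the idea behind an N+1 layout: one parity block is the XOR of N data blocks, and any single lost block can be rebuilt from the survivors. It is an illustration only; it does not call ISA-L or use AVX-512, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Compute the single parity block for an N+1 layout."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# 16 data blocks of 8 bytes each, mirroring the 16+1 shape.
data = [bytes([i] * 8) for i in range(16)]
parity = make_parity(data)

# Simulate losing one block, then rebuild it from the other 15 plus parity.
lost_index = 5
survivors = data[:lost_index] + data[lost_index + 1:]
recovered = rebuild_missing(survivors, parity)
```

A production library performs the same XOR over wide vector registers, which is where the vector pipeline integer throughput listed above matters.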

Chart: Generational Cycle/Byte Comparison (higher is better)

Panel 1 (axis 0% to 250%): XOR Gen (16+1), PQ Gen (16+2), Reed Solomon EC (10+4)

Panel 2 (axis 0% to 350%): Multihash SHA-1, Multihash SHA-1 Murmur, Multihash SHA-256, Multibuffer SHA-1, Multibuffer SHA-256, Multibuffer SHA-512, Multibuffer MD5

Series: Intel® Xeon® Processor E5-2650v3, Intel® Xeon® Processor E5-2650v4, Intel® Xeon® Platinum 8180 Processor
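The (k+m) labels on the chart describe k data blocks protected by m parity blocks. A short sketch of the raw-capacity overhead each layout implies, using only the k and m values named on the slide; the function name is hypothetical.

```python
from fractions import Fraction


def raw_per_usable(k: int, m: int) -> Fraction:
    """Raw capacity consumed per unit of usable data for a k+m layout."""
    return Fraction(k + m, k)


layouts = {
    "XOR Gen (16+1)": (16, 1),
    "PQ Gen (16+2)": (16, 2),
    "Reed Solomon EC (10+4)": (10, 4),
}

overheads = {name: float(raw_per_usable(k, m)) for name, (k, m) in layouts.items()}

for name, factor in overheads.items():
    print(f"{name}: {factor:.4f}x raw capacity per usable byte")
```

For comparison, three-way replication stores 3 bytes of raw capacity per usable byte, which is why wide erasure codes are attractive once the encode and decode cost is low enough.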

Performance results are based on testing as of September 2019 and may not reflect all publicly available security updates. See configuration disclosure on slide 27 for details. No product can be absolutely secure. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.

Intel® Xeon® Processor E5-2600v3, E5-2650v3, 10C, 2.3 GHz, M1, Aztec City CRB, 4x8 GB DDR4 2133 MT/s ECC RDIMM.

Intel® Xeon® Processor E5-2600v4, E5-2650v4, 12C, 2.2 GHz, M0, Aztec City CRB, 4x8 GB DDR4 2400 MT/s ECC RDIMM.

Intel® Xeon® Processor Scalable Family, Platinum 8180 Processor, 28C, 2.5 GHz, H0, Neon City CRB, 6x16 GB DDR4 2666 MT/s ECC RDIMM.

BIOS Configuration: P-States: Disabled, Turbo: Disabled, Speed Step: Disabled, C-States: Disabled, Power Performance Tuning: Disabled, ENERGY_PERF_BIAS_CFG: PERF, Isochronous: Disabled, Memory Power Savings: Disabled.

ISA-L 2.19

Page 14

Performance and Scale: Intel Storage

Tiers, hot to cold: MEMORY (DRAM, HOT TIER), PERSISTENT MEMORY, STORAGE (Intel® 3D NAND SSDs), HDD / TAPE (COLD TIER)

INTEL® OPTANE™ DC PERSISTENT MEMORY (INTEL.COM/OPTANE)
• DATA PERSISTENCE ENABLING MEMORY-CENTRIC APPLICATIONS (APP-DIRECT)
• INCREASED MEMORY SIZE (MEMORY MODE) FOR PERFORMANCE
• IMPROVED TCO

INTEL® OPTANE™ SSD DC D4800X DUAL-PORT
• PERFORMANCE + RESILIENCY FOR CRITICAL ENTERPRISE IT APPS
• DUAL PORT CONNECTIONS ENABLE 24X7 DATA AVAILABILITY WITH REDUNDANT, HOT SWAPPABLE DATA PATHS

INTEL® SSD D5-P4326 E1.L
• COST-OPTIMIZED, ENABLES GREATER WARM STORAGE
• E1.L FORM FACTOR SCALABLE TO ~1PB (IN 1U)
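The "~1PB in 1U" figure can be sanity-checked with simple arithmetic. The slot count and per-drive capacity below are assumptions made for illustration (32 E1.L slots in a 1U chassis, 30.72 TB per drive); neither number appears on the slide.

```python
def chassis_capacity_tb(drive_slots: int, drive_capacity_tb: float) -> float:
    """Raw capacity of a chassis in decimal terabytes."""
    return drive_slots * drive_capacity_tb


# Assumed values, not taken from the slide.
ASSUMED_SLOTS_PER_1U = 32
ASSUMED_DRIVE_TB = 30.72

total_tb = chassis_capacity_tb(ASSUMED_SLOTS_PER_1U, ASSUMED_DRIVE_TB)
total_pb = total_tb / 1000

print(f"{total_tb:.2f} TB raw, about {total_pb:.2f} PB in 1U")
```

Under those assumptions the raw capacity lands just under one petabyte, which is consistent with the approximate figure quoted above.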

Page 15

Compute Express Link (CXL) Enabled Computing

CXL enables a more fluid and flexible memory model: a single, common memory address space across processors and devices.

Diagram: CPU, CPU, GPU, FPGA, AI, NIC, NIC. CPU-attached Memory (OS managed); Accelerator-Attached Memory (runtime managed cache). Path labels: Writeback Memory, Memory Load/Store, PCIe DMA.

• Create shared memory pools
• Enhance movement of operands and results between accelerators and target devices
• Enable efficient resource sharing
• Significant latency reduction to enable disaggregated memory

CXL consortium: currently 83 companies and growing. Learn more at www.ComputeExpressLink.org

Page 16

Performance and Efficiency: Intel Tools

SPDK (Storage Performance Development Kit): 5X, FIO for NVMe¹

OCF (Open Cache Acceleration Software Framework): 5X, Ceph workload, OCF with Optane cache¹

PMDK (Persistent Memory Development Kit): 8X, Cassandra with native persistence¹

Intel® VTune™ Amplifier: 2.2X, Netflix GBE¹

Performance results are based on testing as of September 2019 and may not reflect all publicly available security updates. See configuration disclosure on slide 29-35 for details. No product can be absolutely secure. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.

Page 17

Criticality of Connectivity for Storage

PERFORMANCE · SCALABILITY · EFFICIENCY

Diagram, from rack outward: Rack (Servers & Storage, Accelerator Pools) → Top-of-Rack Switch (TOR) → End-of-Row Switch (EOR) → Spine Switch → Router → Inter-DC Links (DCI), joined by Optical Links (Interconnect)

Technologies shown: Intel® Ethernet, Intel® Silicon Photonics, Programmable Switches

Page 18

All Ethernet NVMe-oF Protocols in a Single Adapter with Intel® Ethernet 800 Series

NVMe-oF* (Non-Volatile Memory Express over Fabrics)

*Other names and brands may be claimed as the property of others.

RDMA (Remote Direct Memory Access)

iWARP*, RoCE* v2 (RDMA over Converged Ethernet ver. 2), InfiniBand*, Intel® Omni-Path Architecture (Intel® OPA)

Fibre Channel

NVMe*/TCP (Transmission Control Protocol)

Future Fabrics

Ethernet-based: iWARP*, RoCE* v2, NVMe*/TCP

Legend: highlighted transports are supported in Intel® Ethernet 800 Series (“Columbiaville”)

Page 19

Performance and Scale: Intel Networking

Application Device Queues (ADQ)

An open technology designed to improve application predictability, latency and throughput

Without ADQ: application traffic intermixed with other traffic types

With ADQ: application traffic to a dedicated set of queues

Applying ADQ to NVMe/TCP

Adding the Intel® Ethernet 800 Series with ADQ to NVMe/TCP narrows the performance gaps with RDMA NVMe-oF solutions

Linux kernel updates for NVMe/TCP released for comments

Chart: ADQ improvement (lower is better)

Performance results are based on testing as of September 2019 and may not reflect all publicly available security updates. See configuration disclosure on slide 28 for details. No product can be absolutely secure. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.

Page 20

Example: Folded Clos Network Topology – Facebook

The tiered structure results in more east-west (E-W) bandwidth congestion. Streaming inference tends to be done in the same rack as the web-tier nodes to avoid congestion, often with batch sizes of 1-2 because the folded topology prevents taking advantage of scale.

Source: http://firstclassfunc.com/facebook-fabric-networking

Page 21

Example: Fully Routed Clos – Google, Azure, IaaS trend

Very high E-W bandwidth and uniform latencies across any two nodes within a much larger zone. This facilitates distributed functions for inference, for example delegating through an RPC call to a TPU node.

Scale can allow streaming requests to be batched on the inference node for increased efficiency.

Page 22

Standards bring calm to the chaos (NVMEOF, NVME OVER RDMA, NVME OVER TCP, ETC)

SNIA Board of Directors: Jim Pappas, Vice Chairman and Executive Committee

Technical Council: Alan Bumgarner

Solid State Storage Initiative (SSSI): Jenni Dietz, Co-Chair

Solid State Drive Special Interest Group (SSD SIG): Jonmichael Hands, Co-Chair

PM/NVDIMM Special Interest Group: Jim Pappas & Jenni Dietz

Networking Storage Forum (NSF): Christine McMonigal

NVM Programming TWG: Alan Bumgarner, Co-Chair

Computational Storage TWG: Nick Adams, Co-Chair

SFF TA TWG: Anthony Constantine

Swordfish: Barry Kittner

Page 23

Data Center | Cloud · Core · Access | Edge · Devices | Things

Architecting Storage for the Data Challenge

Page 24

Visit other Intel Sessions

Date | Time | Speaker Name | Room | Session Title
9/23 | 8:30 am | Nick Adams, J Metz (Cisco) | Cypress | NVM Express Specifications: Mastering Today’s Architecture and Preparing for Tomorrow’s
9/23 | 9:30 am | Jim Harris, Paul Luse | Cypress | Squeezing Compression into SPDK
9/23 | 10:35 am | Piotr Wysocki | Stevens Creek | Scalable Storage Management with NVMe and NVMe-oF
9/23 | 2:30 pm | Benjamin Walker | Cypress | 10 Million I/Ops From a Single Thread
9/24 | 1:00 pm | Changpeng Liu, Xiaodong Liu | Lafayette/San Tomas | Introduction of SPDK vhost FUSE Target to Accelerate File Access in VM and Containers
9/24 | 2:00 pm | Alan Bumgarner, Tom Talpey (Microsoft) | Stevens Creek | Nonvolatile Memory Programming TWG - Remote Persistent Memory
9/24 | 3:05 pm | Haodong Tang | Stevens Creek | Spark-PMoF: Accelerating big data analytics with Persistent Memory over Fabric
9/24 | 4:05 pm | Lisa Li, Tushar Gohad | Lafayette/San Tomas | A Crash-consistent Client-side Cache for Ceph
9/24 | 7:00 pm | Fred Zhang | Winchester | BOF - Considerations in NVMe-oF Storage Transport Protocols
9/25 | 9:00 am | Peter Onufryk | Santa Clara Ballroom | NVMe State of the Union
9/25 | 2:00 pm | Usha Upadhyayula | Stevens Creek | Volatile Use of Persistent Memory
9/25 | 4:05 pm | Nick Adams | Cypress | What Happens when Compute Meets Storage? – Computational Storage TWG
9/26 | 8:30 am | Michael Strassmaier | Stevens Creek | Intel® Optane™ DC Persistent Memory Performance Review
9/26 | 8:30 am | Andrzej Jakowski, Adrian Pearson | Lafayette/San Tomas | Data-At-Rest Protection at Data Center Scale with NVMe* and Opal*
9/26 | 9:30 am | Dave Minturn, Anil Vasudevan | Winchester | Selecting an NVMe over Fabrics Ethernet Transport, RDMA or TCP
9/26 | 9:30 am | Andy Rudoff | Stevens Creek | Persistent Memory Programming Made Easy with pmemkv
9/26 | 11:35 am | Ziye Yang | Winchester | SPDK based user space NVMe over TCP Transport Solution
9/26 | 3:35 pm | Vishal Verma, John Kariuki | Stevens Creek | Improved Storage Performance Using the New Linux Kernel I/O Interface

Join the Birds-of-a-Feather NVMe-oF session today at 7:00 PM in the Winchester Room, and attend the “Selecting NVMe-oF Ethernet Transport RDMA or TCP” presentation on Thursday at 9:30 AM in the Winchester Room to learn more.

Page 25
Page 26

NOTICES & DISCLAIMERS

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration.

Performance results are based on testing as of September 2019 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. For more complete information about performance and benchmark results, visit http://www.intel.com/benchmarks .

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/benchmarks .

Intel® Advanced Vector Extensions (Intel® AVX)* provides higher throughput to certain processor operations. Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to operate at less than the rated frequency and b) some parts with Intel® Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies. Performance varies depending on hardware, software, and system configuration and you can learn more at http://www.intel.com/go/turbo.

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

© 2019 Intel Corporation.

Intel, the Intel logo, and Intel Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as property of others.


Page 27

NVME/TCP WITH ADQ ACCELERATION TESTING CONFIGURATION

Parameter | SUT (Host) | Client (Initiator)
Test by | Intel | Intel
Test date | 09/17/19 | 09/17/19
Platform | Dell R740XD | Dell R740XD
# Nodes | 1 | 1
# Sockets | 2 | 2
CPU | Intel® Xeon® Platinum 8168 (33M cache 2.70GHz) | Intel® Xeon® Platinum 8168 (33M cache 2.70GHz)
Cores/socket, Threads/socket | 48 cores/socket 2 threads/socket | 48 cores/socket 2 threads/socket
Microcode | 0x200005a | 0x200005a
HT | Enabled | Enabled
Turbo | Enabled | Enabled
BIOS version | Dell 2.1.8 | Dell 2.1.8
System DDR Mem Config: slots / cap / run-speed | 4 slots / 32GB / 2666 MT/s | 8 slots / 16GB / 2666 MT/s
System DCPMM Config: slots / cap / run-speed | N/A | N/A
Total Memory/Node (DDR+DCPMM) | 128GB DDR4-2666 RDIMM | 128GB DDR4-2666 RDIMM
Storage - boot | 128GB SATA3 SSD | 128GB SATA3 SSD
Storage - application drives | 6x Intel® Optane SSD DC P4800X Series (375GB, 2.5in PCIe 3.1) | N/A
NIC | Intel E810-C | Intel E810-C
Platform Chipset | Intel Corporation C620 Series Chipset Family | Intel Corporation C620 Series Chipset Family
Other HW (Accelerator) | N/A | N/A
OS | Red Hat Enterprise Linux 7.6 | Red Hat Enterprise Linux 7.6
Kernel | 5.2.1 | 5.2.1
IBRS (0=disable, 1=enable) | 1 | 1
eIBRS (0=disable, 1=enable) | 0 | 0
Retpoline (0=disable, 1=enable) | 1 | 1
IBPB (0=disable, 1=enable) | 1 | 1
PTI (0=disable, 1=enable) | 1 | 1
Mitigation variants (1,2,3,3a,4, L1TF) | 1,2,3,L1TF | 1,2,3,L1TF
Workload & version | Fio-3-7 | Fio-3-7
Compiler | |

NIC Driver (same on SUT and Client):
RDMA driver: ice-0.12.0_rc3 (irdma-0.12.113), firmware-version: 0x800018f7
TCP driver: ice-0.12.0_rc3, firmware-version: 0x800018f7
TCP(ADQ) driver: ice-0.11.2_rc3_adq_isv, firmware-version: 0x80001563

Page 28

HARDWARE CONFIGURATION FOR SYSTEM LEVEL PERFORMANCE

Intel® Optane™ DC Persistent Memory

Component | Single DIMM Config
Test by | Intel
Test date | 02/20/2019
Platform | NeonCity
Chipset | LBG B1
CPU | CLX B0 28 Core (QDF QQYZ)
DDR Speed | 2666 MT/s
AEP | QS Tranche3, 256GB, 18W
Memory Config | 32GB DDR4 (per socket), 128GB AEP (per socket)
AEP FW | 5336
BIOS | 573.D10
BKC version | WW08 BKC
Linux OS | 4.20.4-200.fc29
Spectre/Meltdown | Patched (1,2,3, 3a)
Performance Tuning | QoS Disabled, IODC=5(AD)

SSDs

Intel-tested: Measured using FIO 3.1. Common Configuration - Intel 2U Server System, OS CentOS 7.5, kernel 4.17.6-1.el7.x86_64, CPU 2 x Intel® Xeon® 6154 Gold @ 3.0GHz (18 cores), RAM 256GB DDR4 @ 2666MHz. Configuration – Intel® Optane™ SSD DC P4800X 375GB and Intel® SSD DC P4610 3.2TB. Intel Microcode: 0x2000043; System BIOS: 00.01.0013; ME Firmware: 04.00.04.294; BMC Firmware: 1.43.91f76955; FRUSDR: 1.43.

The benchmark results may need to be revised as additional testing is conducted. Performance results are based on testing as of November 15, 2018 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.

Page 29

SPDK SYSTEM CONFIGURATION

Performance results are based on testing by Intel as of 2/26/2019 and may not reflect all publicly available security updates. See configuration disclosure for details. No product or component can be absolutely secure. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks.

Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz + P4610: Tested by Intel on 4/12/2019, S2600WFT Platform with 12 x 16GB 2666MHz DDR4 (total 192GB), Storage: Intel® SSD DC S3700 800GB, Storage drives: 20x Intel® SSD DC P4610 (2TB), SPDK: (16x P4610s), URING: (4x P4610s), AIO: (2x P4610s), Bios: SE5C620.86B.0D.01.0250.112320180145, ucode: 0x4000010 (HT=ON, Turbo=ON), OS: Fedora 29, Kernel: 5.0.0-rc6+, Benchmark: bdevperf, QD= 32 (for SPDK), QD= 64 (for URING), QD=128 (for AIO), runtime = 300s, SPDK commit: b62dca930, SPDK compiled with LTO, PGO gcc compiler options, for URING (tuning: echo 0 > /sys/block/$dev/queue/iostats, echo 0 > /sys/block/$dev/queue/rq_affinity, echo 2 > /sys/block/$dev/queue/nomerge, echo 0 > /sys/block/$dev/queue/io_poll_delay)

Results: 4K 100% Random Reads (100%) SPDK = 8.15M IOPS
Results: 4K 100% Random Reads (100%) URING = 1.56M IOPS
Results: 4K 100% Random Reads (100%) AIO = 0.614M IOPS
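The three results above can be compared directly. The sketch below computes the SPDK-to-URING and SPDK-to-AIO ratios from the reported IOPS, plus a per-drive figure using the drive counts listed in the configuration. The runs used different drive counts and queue depths, so these ratios describe the specific configurations tested rather than a like-for-like software comparison; the variable names are illustrative.

```python
# Reported 4K random read results, in millions of IOPS.
results_miops = {"SPDK": 8.15, "URING": 1.56, "AIO": 0.614}

# Number of P4610 drives used by each run, from the configuration text.
drives = {"SPDK": 16, "URING": 4, "AIO": 2}


def speedup(fast: str, slow: str) -> float:
    """Ratio of reported IOPS between two I/O paths."""
    return results_miops[fast] / results_miops[slow]


spdk_vs_uring = speedup("SPDK", "URING")
spdk_vs_aio = speedup("SPDK", "AIO")
per_drive_miops = {name: results_miops[name] / drives[name] for name in results_miops}

print(f"SPDK vs URING: {spdk_vs_uring:.2f}x")
print(f"SPDK vs AIO:   {spdk_vs_aio:.2f}x")
for name, value in per_drive_miops.items():
    print(f"{name}: {value:.3f}M IOPS per drive")
```

The SPDK-to-URING ratio comes out a little above 5, which lines up with the "5X" headline figure attached to SPDK earlier in the deck.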


Page 30

OCF FOOTNOTES/SYSTEM CONFIGURATIONS

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks. Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

System Configuration for slides titled “CAS + Intel® Optane™ SSD Accelerating MySQL” (pages 27-28) and for performance claim “MySQL up to 5.1X as fast w/CAS + Intel® Optane™ SSD” (pages 6, 8, 15) and for performance claim “MySQL* accelerated 5.11X” (pages 10, 26)

System configuration – Red Hat Enterprise Linux 7.3, Kernel 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016, Purley Silver Wolf Pass S2600WFQ, BIOS Version: SE5C620.86B.0X.01.0107.122220170349, BIOS Release Date: 12/22/2017, Skylake H0 (2 Processors)(24 cores each processor, hyper-threading is enabled in BIOS so thread count per processor is 48) Intel® Xeon® Platinum 8160T CPU @ 2.10GHz, Intel(R) Rapid Storage Technology enterprise PreOS Version : 5.3.0.1052, 256GB Physical RAM installed but set to 128GB in the grub2 configuration, Intel 82574L Gigabit Ethernet Adapter, VMD enabled in BIOS and VROC HW key (Premium) installed and activated, Package C-State set to C6(non retention state) and Processor C6 set to enabled in BIOS, P-States set to default in BIOS and SpeedStep and Turbo are enabled, BMC version: 1.43.33e8d6b4 ME version: 4.00.04.309 SDR Package version: 1.43, fio version: fio-3.5-86-gcefd2, (VROC) mdadm - v4.0 - 2017-09-22 Intel build: RSTe_5.3_WW38.5, kmod-md-rste-5.3-514_4.el7_3.x86_64

System Configuration for slides “Accelerating Ceph* using HDD Backing Store” (page 33 - 34), performance claims “Ceph* Reads up to 4.9 X Faster with CAS + Intel® Optane™ SSD” and “Ceph* Writes up to 4.8 X Faster with CAS + Intel® Optane™ SSD” (pages 6, 8, 15) and “Ceph* reads 4.9X faster, Ceph writes 4.8X faster” (pages 12, 31)

Baseline 4-Node Cluster: HDD OSD Drives with Journals on Intel S4600 SSDs: 3x OSD 1x Mon/RGW Nodes: Server Intel S2600GZ (Grizzly Pass), CPUs 2x Intel® Xeon® Ivy Bridge E5-2660v2 @ 2.20GHz, 64GB Mem, SATA Boot SSD 1 x 800GB Intel® SSD DC S3700, OSD HDD 7 x 4TB WD* WDC_WD4003FZEX (excl. Mon/RGW), SATA Journal SSD 1 x 2TB Intel® SSD DC S4600, Network 2 x Intel® X540-AT2 10GbE NICs; Ceph journal size: 10GB x 7.

Value 4-Node Cluster: HDD OSD Drives with Journals on Optane, with/without CAS: Same as Baseline except NVMe Journal and cache 2 x 375GB Intel P4800x Optane; Ceph Journal size: 10GB x 7, Cache Size: 320GB x 2.

Software: Ceph Luminous v12.2.3, RHEL 7.4 Updated, COSBench 0.4.2.c4, Intel CAS 3.5.1 (Value)
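As a consistency check on the value-cluster layout, the journal and cache sizes can be added up against the Optane capacity in each node. This assumes the seven 10GB journals and the two 320GB caches all live on the two 375GB drives, which is how the configuration line reads but is not stated explicitly.

```python
# Sizes in GB, taken from the value-cluster configuration line.
JOURNAL_GB, JOURNAL_COUNT = 10, 7
CACHE_GB, CACHE_COUNT = 320, 2
OPTANE_GB, OPTANE_COUNT = 375, 2

journal_total = JOURNAL_GB * JOURNAL_COUNT
cache_total = CACHE_GB * CACHE_COUNT
optane_total = OPTANE_GB * OPTANE_COUNT

used = journal_total + cache_total
headroom = optane_total - used

print(f"journals {journal_total} GB + cache {cache_total} GB = {used} GB "
      f"of {optane_total} GB ({headroom} GB spare)")
```

Under that assumption the journals and caches fit within the two drives with a small amount of capacity to spare.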


Page 31

CONFIGURATION SUMMARY

| Parameter | NVMe | DCPMM |
| --- | --- | --- |
| Test by | Intel/Java Performance Team | Intel/Java Performance Team |
| Test date | 22/02/2019 | 22/02/2019 |
| Platform | S2600WFD | S2600WFD |
| # Nodes | 1 | 1 |
| # Sockets | 2 | 2 |
| CPU | 8280L | 8280L |
| Cores/socket, Threads/socket | 28/56 | 28/56 |
| ucode | 0x4000013 | 0x4000013 |
| HT | On | On |
| Turbo | On | On |
| BIOS version | SE5C620.86B.0D.01.0286.011120190816 | SE5C620.86B.0D.01.0286.011120190816 |
| DCPMM BKC version | NA | WW52-2018 |
| DCPMM FW version | NA | 5318 |
| System DDR Mem Config: slots / cap / run-speed | 12 slots / 16GB / 2666 | 12 slots / 16GB / 2666 |
| System DCPMM Config: slots / cap / run-speed | - | 12 slots / 512GB |
| Total Memory/Node (DDR, DCPMM) | 192GB, 0 | 192GB, 6TB |
| Storage - boot | 1x Intel 800GB SSD OS Drive | 1x Intel 800GB SSD OS Drive |
| Storage - application drives | 4x P4610 1.6TB NVMe | 12x 512GB DCPMM |
| NIC | 1x Intel X722 | 1x Intel X722 |

Software

| Parameter | NVMe | DCPMM |
| --- | --- | --- |
| OS | Red Hat Enterprise Linux Server 7.6 | Red Hat Enterprise Linux Server 7.6 |
| Kernel | 4.19.0 (64bit) | 4.19.0 (64bit) |
| Mitigation log attached | Yes | Yes |
| DCPMM mode | NA | App Direct, Persistent Memory |
| Run Method | 5 minute warm up post boot, then start performance recording | 5 minute warm up post boot, then start performance recording |
| Iterations and result choice | 3 iterations, median | 3 iterations, median |
| Dataset size | Two 1.5 Billion Partitions (Insanity schema) | Two 1.5 Billion Partitions (Insanity schema) |
| Workload & version | Read Only, Mix 80% Read/20% Updates, Updates only | Read Only, Mix 80% Read/20% Updates, Updates only |
| Compiler | ANT 1.9.4 compiler for Cassandra | ANT 1.9.4 compiler for Cassandra |
| Libraries | NA | PMDK 1.5, LLPL (latest as of 2/20/2019) |
| Other SW (Frameworks, Topologies…) | NA | NA |
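The "Total Memory/Node" row follows directly from the slot counts and module capacities listed in the table. A minimal sketch of that arithmetic, assuming binary units (1 TB = 1024 GB) and using a hypothetical helper name:

```python
def total_gb(slots, gb_per_module):
    """Total capacity in GB for a fully populated set of identical modules."""
    return slots * gb_per_module


# Figures from the configuration table: 12 slots of each memory type.
ddr_gb = total_gb(12, 16)      # DDR4: 12 x 16GB
dcpmm_gb = total_gb(12, 512)   # DCPMM: 12 x 512GB
dcpmm_tb = dcpmm_gb / 1024     # assumption: 1 TB = 1024 GB

print(f"DDR: {ddr_gb} GB, DCPMM: {dcpmm_gb} GB ({dcpmm_tb:.0f} TB)")
```

With these inputs the totals agree with the 192GB and 6TB figures reported for the DCPMM configuration.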


PMDK - HARDWARE CONFIGURATION DIAGRAM

[Diagram: two clients running cassandra-stress, a 10Gbit switch, and a CLX server, joined by 10Gb network links. The server has two sockets (S0 and S1), each shown with six DRAM and six Optane PM modules, plus four NVMe P4610 drives.]


PMDK - SOFTWARE CONFIGURATION DIAGRAM

[Diagram: Client 1 runs cassandra-stress 1 and 2; Client 2 runs cassandra-stress 3 and 4. Server 1 hosts Cassandra App 1 with Database 1 and Cassandra App 2 with Database 2 across Socket 1 and Socket 2; each database is backed by a persistent memory namespace or 2 NVMe drives.]


ISA-L FOOTNOTES/SYSTEM CONFIGURATIONS

CLX:

Intel(R) Xeon(R) Platinum 8280L, 28C, 2.7 GHz, H0, Neon City CRB, 12x16 GB DDR4 2933 MT/s ECC RDIMM, Micron MTA18ASF2G72PDZ-2G9E1TG, NUMA Memory Configuration, Red Hat Enterprise Linux Server 7.5 64-bit OS, kernel 3.10.0-957.1.3.el7.x86_64, BIOS ENERGY_PERF_BIAS_CFG: PERF, Disabled: P-States, Turbo, Speed Step, C-States, Power Performance Tuning, Isochronous, Memory Power Savings, ISA-L 2.25

CLX:

Intel(R) Xeon(R) Gold 6230, 20C, 2.1 GHz, H0, Neon City CRB, 12x16 GB DDR4 2933 MT/s ECC RDIMM, Micron MTA18ASF2G72PDZ-2G9E1TG, NUMA Memory Configuration, Red Hat Enterprise Linux Server 7.5 64-bit OS, kernel 3.10.0-957.1.3.el7.x86_64, BIOS ENERGY_PERF_BIAS_CFG: PERF, Disabled: P-States, Turbo, Speed Step, C-States, Power Performance Tuning, Isochronous, Memory Power Savings, ISA-L 2.25

SKX:

Intel(R) Xeon(R) Gold 6126, 12C, 2.6 GHz, H0, Neon City CRB, 12x16 GB DDR4 2666 MT/s ECC RDIMM, Micron MTA36ASF2G72PZ-2G6B1QI, NUMA Memory Configuration, Red Hat Enterprise Linux Server 7.4 64-bit OS, kernel 3.10.0-693.21.1.el7.x86_64, BIOS ENERGY_PERF_BIAS_CFG: PERF, Disabled: P-States, Turbo, Speed Step, C-States, Power Performance Tuning, Isochronous, Memory Power Savings, ISA-L 2.23 vs ISA-L 2.25

BDX:

Intel(R) Xeon(R) E5-2650v4, 12C, 2.2 GHz, B0, Aztec City CRB, 8x8 GB DDR4 2400 MT/s ECC RDIMM, Samsung M393A1G43DB0, NUMA Memory Configuration, Red Hat Enterprise Linux Server 7.4 64-bit OS, kernel 3.10.0-693.21.1.el7.x86_64, BIOS ENERGY_PERF_BIAS_CFG: PERF, Disabled: P-States, Turbo, Speed Step, C-States, Power Performance Tuning, Isochronous, Memory Power Savings, ISA-L 2.23


See page 21 in “Notices and Disclaimers”. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance.