PM in the Wild: VMware Experience and Future Expectations
Richard A. Brunner
Principal Engineer & CTO Server Platform Technologies,


Agenda and Introduction

• Experiences Thus Far
  • Introduction
  • Are We Ready? With What?
  • vSphere Support
  • Customer Reaction

• Future Expectations
  • Data Protection
  • Server FW & HW Support
  • Persistence Domain & Power Fail
  • Future Topologies
  • Future Technologies

© 2020 SNIA Persistent Memory Summit. All Rights Reserved.

“It's been a long road, getting from there to here. It's been a long time, but my time is finally near.”


What is PM (Persistent Memory)?


PM Inside a VM: Different Access Models


NVMe SSD: vSCSI to NVMe SSD; Legacy & New OS
• fsync flushes page cache to virtual SCSI disk (vSCSI)
• vSCSI write executes the guest OS (GOS) SCSI stack
• Hypervisor intercepts and writes through the physical SCSI stack

vPMemDisk: vSCSI to PMEM; Legacy & New OS
• fsync flushes page cache to virtual SCSI disk (vSCSI)
• vSCSI write executes the GOS SCSI stack
• Hypervisor intercepts and writes to physical PMEM

vNVDIMM (block access): PMEM mapped into New OS
• fsync flushes page cache to PMEM

vNVDIMM-DAX (Direct Access)
• Requires New GOS and New Guest Apps
• File read/write directly to PMEM pages in GOS
• PMEM pages directly mapped to Guest App
• No need for fsync
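
The difference between the fsync-based models and the directly mapped model can be sketched from the guest application's side. The Python sketch below is an illustration only, not VMware or guest OS code. It runs against an ordinary temporary file, the function names are made up for this example, and `mmap.flush()` (an msync on Linux) merely stands in for the persistence step; a real DAX application would more likely rely on CPU cache flush instructions issued through a PM library.

```python
import mmap
import os
import tempfile

PAGE = 4096  # size of the scratch file; an arbitrary choice for this sketch


def write_block_style(path, payload):
    """Block-style update: write() lands in the page cache, fsync() pushes it down."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.pwrite(fd, payload, 0)
        os.fsync(fd)
    finally:
        os.close(fd)


def write_mapped_style(path, payload, offset):
    """Mapped update: store straight into the mapping, then flush the mapping."""
    fd = os.open(path, os.O_RDWR)
    try:
        with mmap.mmap(fd, 0) as mm:
            mm[offset:offset + len(payload)] = payload
            mm.flush()  # placeholder for the DAX-side persistence step (assumption)
    finally:
        os.close(fd)


def demo():
    """Apply both update styles to one scratch file and read the results back."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * PAGE)
        path = f.name
    try:
        write_block_style(path, b"block")
        write_mapped_style(path, b"mapped", 16)
        with open(path, "rb") as f:
            data = f.read()
        return data[0:5], data[16:22]
    finally:
        os.unlink(path)
```

In the mapped style the application never issues a file write or an fsync; it only stores into memory, which is the property the vNVDIMM-DAX model exposes to new guest applications.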


It Finally Launched in April 2019!!!

Intel “Cascade Lake”


Are We Ready? - Basics


Are We Ready? - Advanced


But we need more Applications!


2017

VMware vSphere Certified Servers with PM


VMware vSphere 6.7 Working with PM

Legacy OS & Application Usage

• Native:
  • PMEM as block storage device with special driver.
• Virtualized: PMEM as block dev
  • Use special Guest driver; or
  • Use vPMemDisk with no new driver
  • Map Guest Storage to PMEM outside of VM by admin.


Native OS vSphere 6.7


VMware vSphere 6.7 Working with PM

New OS & Application Usage

• Native & Virtualized
  • Can use a direct load/store model with little OS overhead
• All the benefits of vSphere can be made available now or in the future:
  • Multiple workloads using PMEM
  • Live VM Migration across servers
  • Check-pointing
  • Boost for Legacy VMs/Workloads
  • And More …


https://kb.vmware.com/s/article/54444
https://kb.vmware.com/s/article/54445
https://kb.vmware.com/s/article/67645


vSphere Support For Persistent Memory

[Figure labels: vCenter & DRS; PMem DS; NVDIMMs]


vSphere Support For Persistent Memory (2)

[Figure labels: vCenter & DRS; PMem DS; NVDIMMs]

Enter maintenance mode (vacate powered off VMs also)


Yes, We Are There!


Customer Reaction?

Future Expectations …


Data Protection – Near Term


Host Running VMs

Schedule Maintenance

DRS/HA restart of VM

VM crashes
Memory Poison

Maintenance Needed
Perf Degradation

ADL @ Power Loss
ADL @ Off-lining
WPL @ Off-lining

WPL @ Power Loss
WPL @ Starting Now

ACPI 6.3: Get Current Health (NCH)
WPL: Write Persistency Loss
ADL: All Data Loss
NIH: Can inject errors for testing!

Auto Shutdown Affected VMs

Admin Operation

Maintenance Mode

Start Migration

Reboot

Shutdown All VMs


Data Protection – Longer Term


Host Running VMs

Auto Migration Now

Auto Migration Soon

Schedule Maintenance

VM Auto Suspend or Shutdown

VM Auto Suspend or Shutdown

DRS/HA restart of VM

VM crashes
Memory Poison

Maintenance Needed
Perf Degradation

ADL @ Power Loss
ADL @ Off-lining
WPL @ Off-lining

WPL @ Power Loss
WPL @ Starting Now

Pray
ADL @ Starting Now

Checkpointing & Replication of PM in the background

Admin Operation

Maintenance Mode

Start Migration

Reboot

Shutdown All VMs


Server HW & FW support for PM

• 2017-2019 was a firestorm of different, non-conforming firmware interfaces!
  • UEFI/ACPI/FW specs struggling against HW schedules.
    • With last minute changes to Error Detection & Remediation
  • Too many iterations with too many OEMs.
  • Non-std provisioning and encryption
• This shouldn’t happen in 2020.
  • UEFI/ACPI/FW interfaces are more mature and stable.
  • Intel “Barlow Pass” does not break these ☺
  • OEMs have now broadly deployed solutions.
  • VMware has better defense:
    • Full “PMEM Certification” kit for OEMs.


• Correct and tested UEFI/ACPI interfaces are critical to OS & Hypervisors.
• Well-tested error and power-fail handling are critical.
• VMware will only certify a PM solution in combination with the platform that supports it.

https://funnyjunk.com/channel/funny/Calling+all+tech+support/


[Figure labels: NVDIMM-N; DRAM Controller; DRAM; NAND]

x86 Persistence Domain & Power Fail

• RTFM: SW needs to flush to persistence domain, accept no substitutes!
  • CLFLUSH* & CLWB + SFENCE
• Key: Trigger Memory Controller to Flush WPQ to PM internal buffers or DRAM.
  • Auto-triggered on Power Fail by Asynchronous DRAM Refresh (ADR).
    • All Servers with PM support ADR.
  • Triggered by SW by WPQ Flush Cmd.
• Extended ADR: ADR + CPU Cache Flush
  • BIOS flushes cache on each CPU.
  • Requires Large Backup Energy or UPS.
  • Exotic for Traditional Server.
• NVDIMM-N needs Backup Energy to flush to NAND.
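
The practical consequence of the ADR versus Extended ADR distinction above can be illustrated with a toy model. The sketch below is not a description of real hardware behavior in any detail: the class, its three dictionaries (dirty cache lines, WPQ, media), and the `flush_line` helper standing in for CLWB or CLFLUSH* followed by SFENCE are all invented for illustration. It only shows which stores would survive a power failure under each assumption.

```python
class ToyPlatform:
    """Toy model of the store path: CPU cache -> WPQ -> PM media (illustrative only)."""

    def __init__(self, extended_adr=False):
        self.extended_adr = extended_adr
        self.dirty_cache = {}  # stores still sitting in CPU caches
        self.wpq = {}          # stores accepted into the memory controller's WPQ
        self.media = {}        # stores that have reached the PM media

    def store(self, addr, value):
        """A plain CPU store: the data lands in the cache only."""
        self.dirty_cache[addr] = value

    def flush_line(self, addr):
        """Stand-in for CLWB (or CLFLUSH*) followed by SFENCE."""
        if addr in self.dirty_cache:
            self.wpq[addr] = self.dirty_cache.pop(addr)

    def power_fail(self):
        """Return what is on the media after power is lost."""
        if self.extended_adr:
            # Extended ADR: CPU caches are flushed as well.
            self.wpq.update(self.dirty_cache)
        self.dirty_cache.clear()
        # ADR: the WPQ is drained to the PM device.
        self.media.update(self.wpq)
        self.wpq.clear()
        return dict(self.media)


def run(extended_adr):
    """Store two values, flush only the first, then lose power."""
    p = ToyPlatform(extended_adr)
    p.store("a", 1)
    p.flush_line("a")
    p.store("b", 2)  # never flushed by software
    return p.power_fail()
```

In this model, `run(False)` keeps only the explicitly flushed store, while `run(True)` keeps both, which is why software on ADR-only servers has to flush to the persistence domain itself.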


[Figure labels: Processor Pkg; Core; Cache; IMC; Write Pending Queue (WPQ); CLFLUSH*, CLWB; ADR or WPQ Flush; Intel Optane DCPMM; Media Controller; Media; SuperCap / Battery]


[Figure labels: Media Controller; PM (x3)]

Hierarchical PM Topology & Pooling

• Current SW support for “local” PM Memory Affinity.
  • But more levels are coming.
  • Requires coherent access to be effective.
• Future Coherent I/O interface allows expanding PM store.
  • Could be CXL or Gen-Z.
  • Latency vs Remote Capacity & Remote Failover.
  • Replication can be off-lined.
• Many challenges:
  • Switch HW is non-trivial
  • Error Propagation & Recovery
  • Sophisticated SW Memory Tiering & Hot-page Migration.
  • Fine-grain encryption.


[Figure labels: Server 1; Server 2; RDMA; PM (x4); (CXL or Gen-Z)]


Future Capacity & Latency

• PM is not on the same density curve as DRAM,
  • But, PM device capacity could be 2x to 4x of current (2 TiB?) in the next 5 years.
• Could we see a 2-socket server with 32 TiB of PM in the next 5 years?
  • (2 TiB/channel) x (8 channels/socket) x (2 sockets) = 32 TiB of PM
• Future Technologies:
  • Carbon Nanotube (CNT), MRAM, or NAND + DRAM (NVDIMM-P)
  • Next Generation Intel Optane, Next Gen ReRAM
  • Some could have near-DRAM latencies
• Regardless, *1* std OS/SW interface provided by HW, FW, SNIA, etc …
  • But sadly, advanced error recovery & provisioning likely device-specific for OS/SW
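
The 32 TiB estimate above is a straight product of three factors. A minimal sketch of that arithmetic follows; the function name and parameters are chosen for this example, and how a channel would actually reach 2 TiB is left open, as it is on the slide.

```python
def pm_capacity_tib(tib_per_channel, channels_per_socket, sockets):
    """Total PM capacity in TiB for a server, as a simple product."""
    return tib_per_channel * channels_per_socket * sockets


# The slide's scenario: 2 TiB per channel, 8 channels per socket, 2 sockets.
slide_estimate = pm_capacity_tib(2, 8, 2)
```

Changing any one factor scales the total linearly, so halving the per-channel capacity to 1 TiB would give 16 TiB for the same 2-socket server.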


Summary


Experiences Thus Far:
• We are ready enough, but we will learn much in 2020.
• We need more Applications!
• Customer Reaction is slow so far.

Future Expectations:
• Much collaborative work remains for
  • Data Protection
  • Server FW & HW Support
  • Persistence Domain & Power Fail
• HW Innovation (with help from SW) continues:
  • Future Topologies
  • Future Technologies


Thank you
Please visit www.snia.org/pmsummit for presentations
