19
COLO: COarse-grain LOck- stepping Virtual Machine for Non-stop Service Li Zhijian <[email protected]> Fujitsu Limited.

COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Embed Size (px)

Citation preview

Page 1: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service

Li Zhijian <[email protected]>

Fujitsu Limited.

Page 2: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Agenda

Background COarse-grain LOck-stepping Summary

Page 3: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Non-Stop Service with VM Replication

Typical Non-stop Service Requires

Expensive hardware for redundancy

Extensive software customization

VM Replication: Cheap Application-agnostic Solution

Page 4: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Existing VM Replication Approaches

4

Replication Per Instruction: Lock-steppingExecute in parallel for deterministic instructions

Lock and step for un-deterministic instructions

Replication Per Epoch: Continuous CheckpointSecondary VM is synchronized with Primary

VM per epoch

Output is buffered within an epoch

Page 5: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Problems

5

Lock-steppingExcessive replication overhead

memory access in an MP-guest is un-deterministic

Continuous CheckpointExtra network latency

Excessive VM checkpoint overhead

Page 6: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Agenda

6

Background

COarse-grain LOck-stepping

Summary

Page 7: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Why COarse-grain LOck-stepping (COLO)

7

VM Replication is an overly strong conditionWhy we care about the VM state ?

The client care about response only

Can the control failover without ”precise VM state

replication”?

Coarse-grain lock-stepping VMsSecondary VM is a replica, as if it can generate

same response with primary so farBe able to failover without service stop

Non-stop service focus on server response, not internal machine state!Non-stop service focus on server response, not internal machine state!Non-stop service focus on server response, not internal machine state!Non-stop service focus on server response, not internal machine state!

Page 8: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Architecture of COLO

8

Pnode: primary node; PVM: primary VM; Snode: secondary node; SVM: secondary VMPnode: primary node; PVM: primary VM; Snode: secondary node; SVM: secondary VM

COarse-grain LOck-stepping Virtual Machine for Non-stop ServiceCOarse-grain LOck-stepping Virtual Machine for Non-stop ServiceCOarse-grain LOck-stepping Virtual Machine for Non-stop ServiceCOarse-grain LOck-stepping Virtual Machine for Non-stop Service

Internal Network

Primary Node

Primary VM

QEMU

Heartbeat

COLO Disk Manager

KVM

Kernel

Storage External Network

VM Checkpoint

Proxy module

Disk IO

Net IO

Failover

Secondary Node

Secondary VM

QEMU

Heartbeat

COLO Disk Manager

KVM

Kernel

StorageExternal Network

Failover

VM Checkpoint

Proxy module

Page 9: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Network topology of COLO

9

[eth0] : client and vm communication

[eth1] : migration/checkpoint, storage replication and proxy

Pnode: primary node; PVM: primary VM; Snode: secondary node; SVM: secondary VMPnode: primary node; PVM: primary VM; Snode: secondary node; SVM: secondary VM

Page 10: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Network Process

10

Guest-RX Pnode

Receive a packet from client

Copy the packet and send to Snode

Send the packet to PVM

Snode

Receive the packet from Pnode

Adjust packet’s ack_seq number

Send the packet to SVM

Guest-TX Snode

Receive the packet from SVM

Adjust packet’s seq number

Send the SVM packet to Pnode

Pnode

Receive the packet from PVM

Receive the packet from Snode

Compare PVM/SVM packet

Same: release the packet to clientSame: release the packet to clientDifferent: trigger checkpoint and release packet to clientDifferent: trigger checkpoint and release packet to client

Base on Qemu’s netfilter and SLIRPBase on Qemu’s netfilter and SLIRP

Page 11: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Storage Process

11

Write

Pnode

Send the write request to Snode

Write the write request to storage

Snode

Receive PVM write request

Read original data to SVM cache & write PVM write request to storage(Copy On Write)

Write SVM write request to SVM cache

Read

Snode

Read from SVM cache, or storage (SVM cache miss)

Pnode

Read form storage

Checkpoint Drop SVM cache

Failover Write SVM cache to storage

Base on qemu’s quorum,nbd,backup-driver,backingfile

Page 12: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Memory Sync Process

12

Internal Network

Transfer Dirty Pages

Primary Node

Primary VM Qemu

VM Checkpoint

KVM

Kernel

Get Dirty Pages

Secondary Node

Secondary VMQemuVM Checkpoint

KVM

Kernel

PVM Memory Cache Update SVM state

• PNode– Track PVM dirty pages, send them to Snode periodically

• Snode– Receive the PVM dirty pages, save them to PVM Memory

Cache– On checkpoint, update SVM memory with PVM Memory Cache

Page 13: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Checkpoint Process

13

Need modify migration process in Qemu to support Need modify migration process in Qemu to support checkpoint checkpoint

Page 14: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Why Better

14

Comparing with Continuous VM checkpointNo buffering-introduced latency

Less checkpoint frequencyOn demand vs. periodic

Comparing with lock-steppingEliminate excessive overhead of un-

deterministic instruction execution due to MP-guest memory access

Page 15: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Agenda

15

Background

COarse-grain LOck-stepping

Summary

Page 16: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Summary

16

Performance

Page 17: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Summary

17

COLO status colo frame: patchset v9 had been post(by

[email protected])

colo-block: most of patch is reviewed (by [email protected])

colo-proxy: (by [email protected])

netfilter related is reviewed

packet compare is developing

Page 18: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

Summary

18

Next steps:Redesign based on feedbacks

Develop and send out for review

Optimize performance

Page 19: COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited

19