Disk Performance Comparison: Xen vs. KVM


1. Disk I/O Performance Comparison: Xen vs. KVM (2013/09/20)

2. Agenda
  1. About This Research
  2. Benchmark Configuration
  3. Test Result with HDD
  4. Test Result with SSD
  5. Consideration

3. 1. About This Research

4. Goal
Get enough information on the I/O performance characteristics of Xen and KVM so that we can rightly judge which should be adopted for a given use case.

5. Premise
This document is written on the premise that you know what Xen and KVM are. The explanation of them is skipped, mainly because of the author's laziness.

6. 2. Benchmark Configuration

7. Benchmarked Virtualization Software
Two well-known open-source virtualization packages are tested: KVM and Xen 4.2.2. XenServer 6.2 could not be tested because the installer did not find any disk to install to...

8. VM Host Server
VM host server specification:
  CPU model: Core i5 2500K, 3.3GHz, 2 cores / 4 threads
  CPU settings: Hyper-Threading enabled, Turbo Core disabled, power saving disabled
  Memory: 16GB DDR3 1,333MHz, dual channel
  Disk 1: 80GB HDD, 3.5 inch, 7,200rpm, Hitachi HDS721680PLA380
  Disk 2: 128GB SSD, CFD CSSD-S6T128MHG5Q (Toshiba HG5Q)
  OS: CentOS 6.4
  Filesystem: ext4

9. VM Guest Server
VM guest server common specification:
  VCPU: 2
  Memory: 512MB
  Disk size: 10GB
  Thin provisioning: no
  OS: CentOS 6.4
  Disk driver: para-virtualized driver
  Filesystem: ext4

10. Benchmark Tool
The flexible I/O benchmark tool "fio" is used. http://freecode.com/projects/fio
Setting item examples: read/write, sequential/random, direct false/true (use the file cache / do not use it)...
Available data examples: bandwidth, IOPS, latency distribution, CPU load, I/O %util...

11. Benchmark Configurations
Tested configurations (a fio invocation for one row is sketched after this table):

  I/O size (byte)   Load type          Direct   Test limit
  11                random read        false    1GB or 180sec
  11                random write       false    1GB or 180sec
  512               random read        true     1GB or 180sec
  512               random write       true     1GB or 180sec
  4k                random read        true     3GB or 180sec
  4k                random write       true     3GB or 180sec
  32k               random read        true     3GB or 180sec
  32k               random write       true     3GB or 180sec
  512k              random read        true     3GB or 180sec
  512k              random write       true     3GB or 180sec
  1m                sequential read    true     3GB or 180sec
  1m                sequential write   true     3GB or 180sec
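The exact job definitions live in the scripts repository linked on the Benchmark Scripts slide; the sketch below is only an illustration of how one row of the table (4 kbyte random read, direct=true, limited to 3GB or 180 seconds) could be expressed as a fio invocation. The target file path and job name are placeholders, not values taken from the original scripts.

```python
import json
import subprocess

# One of the tested configurations from the table above:
# 4 kbyte random read, direct=true, limited to 3GB or 180 seconds.
# TARGET is a placeholder path on the benchmarked disk.
TARGET = "/mnt/bench/fio.testfile"

cmd = [
    "fio",
    "--name=randread-4k",      # job name (arbitrary placeholder)
    "--filename=" + TARGET,
    "--rw=randread",           # random read load
    "--bs=4k",                 # I/O size
    "--direct=1",              # bypass the page cache (direct=true)
    "--size=3G",               # test limit: 3GB ...
    "--runtime=180",           # ... or 180 seconds, whichever comes first
    "--ioengine=libaio",
    "--output-format=json",
]

result = subprocess.run(cmd, capture_output=True, text=True, check=True)
read_stats = json.loads(result.stdout)["jobs"][0]["read"]
print("IOPS:", read_stats["iops"])
print("bandwidth (KiB/s):", read_stats["bw"])
```

The 11 byte rows differ only in --bs=11, --direct=0 and --size=1G, and the 1m rows use --rw=read or --rw=write for the sequential load.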
12. KVM Disk Cache
Two disk cache configurations are benchmarked (see the sketch after the next slide for where the setting goes):
  writethrough (default): the default setting chosen by virt-manager on Ubuntu 12.10
  none: the default setting chosen by virt-manager on CentOS 6.4
http://infoliser.com/a-guide-to-kvm-guest-disk-cache/

13. KVM Disk Cache
Other disk cache configurations, not benchmarked in this test:
  writeback
  directsync
http://infoliser.com/a-guide-to-kvm-guest-disk-cache/
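The slides do not show how the cache mode was applied; as an assumption (the guests were presumably defined through virt-manager/libvirt), the sketch below only illustrates where the cache= setting would appear if the same guest disk were attached on a qemu-kvm command line. Paths and the helper name are placeholders.

```python
# Hedged sketch: how the two benchmarked cache modes could be passed to
# qemu-kvm on the command line. The image path is a placeholder; the original
# guests were presumably defined through virt-manager/libvirt instead.
import subprocess

def guest_drive_arg(image_path: str, cache_mode: str) -> str:
    """Build a -drive argument with the given cache mode ('writethrough' or 'none')."""
    return f"file={image_path},format=raw,if=virtio,cache={cache_mode}"

cmd = [
    "qemu-kvm",
    "-m", "512",   # 512MB guest memory, as in the guest spec
    "-smp", "2",   # 2 VCPUs
    "-drive", guest_drive_arg("/var/lib/libvirt/images/guest1.img", "writethrough"),
]
print(" ".join(cmd))   # inspect the command; subprocess.run(cmd) would start the guest
```

With libvirt, the equivalent knob is the cache attribute on the disk's <driver> element in the domain XML.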
14. Criteria
  IOPS: higher is better.
  Bandwidth: higher is better (I/O size * IOPS).
  Latency: lower is better; lower variance is better.
  CPU usage: lower is better. The emulation cost appears in KVM as host CPU plus guest CPU, and in Xen as Domain0 CPU(?).
  I/O %util.

15. Benchmark Scripts
KVM/Xen installation and the benchmarks are done by scripts published here:
https://github.com/nknytk/disk-performance-xen-kvm

16. 3. Test Result with HDD

17. Tested Machines
  Host
  KVM guest1: disk cache = writethrough
  KVM guest2: disk cache = none
  Xen guest
In all machines, the disk scheduler is "cfq".

18. Summary
KVM with cache="writethrough" performs well only in small-size reading. Xen generally performs best, even better than the host; Xen's unique latency distribution seems to result from some optimization. The performance of KVM with cache=none is slightly worse than the host's, but the delay is within 10%, and its latency distribution is very similar to the host's.

19. Detailed Results
Only typical data is presented here. You can get the full data from the repository below:
https://github.com/nknytk/disk-performance-xen-kvm/tree/master/result_hdd_examples

20. IOPS, iosize = 11 byte [charts: 11 byte, direct=false, random read and random write IOPS for host, kvm1, kvm2, xen]
21. Latency Distribution, iosize = 11 byte [charts: random read and random write latency distribution, msec vs %]
22. IOPS, iosize = 4 kbyte [charts: 4 kbyte, direct=true, random read and random write IOPS]
23. Latency Distribution, iosize = 4 kbyte [charts: random read and random write latency distribution]
24. IOPS, iosize = 512 kbyte [charts: 512 kbyte, direct=true, random read and random write IOPS]
25. Latency Distribution, iosize = 512 kbyte [charts: random read and random write latency distribution]
26. IOPS, iosize = 1 mbyte [charts: 1 mbyte, direct=true, sequential read and sequential write IOPS]
27. Latency Distribution, iosize = 1 mbyte [charts: sequential read and sequential write latency distribution]
28. Read Bandwidth Comparison [chart: read bandwidth relative to host (host=1) per I/O size]
29. Write Bandwidth Comparison [chart: write bandwidth relative to host (host=1) per I/O size]

30. 4. Test Result with SSD

31. Tested Machines
  Host
  KVM guest1: disk cache = "writethrough"; another process consumes so much memory on the host that only 200MB is left for the page cache
  KVM guest2: disk cache = "none"
  Xen guest
In all machines, the disk scheduler is "noop" (a sketch of how a scheduler is selected follows below).
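The slides do not show how "cfq" and "noop" were selected either; a minimal sketch, assuming the scheduler is switched through sysfs on the host (the device name is a placeholder, and this is not taken from the original benchmark scripts):

```python
# Hedged sketch: select the I/O scheduler for a block device through sysfs.
# DEVICE is a placeholder; run as root.
DEVICE = "sda"  # e.g. the benchmarked HDD or SSD

sysfs_path = f"/sys/block/{DEVICE}/queue/scheduler"

with open(sysfs_path) as f:
    print("available/current:", f.read().strip())   # e.g. "noop deadline [cfq]"

with open(sysfs_path, "w") as f:
    f.write("noop")                                  # "cfq" was used for the HDD tests

with open(sysfs_path) as f:
    print("now:", f.read().strip())
```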
32. Summary
All VMs are delayed to a non-negligible degree in small-size I/O. Both KVM and Xen cost nearly 100% of a CPU core for emulation in the worst case. KVM's performance with cache=writethrough is lower than KVM with cache=none by up to 90%. KVM with cache=none has a latency distribution similar to the host's.

33. Detailed Results
Only typical data is presented here. You can get the full data from the repository below:
https://github.com/nknytk/disk-performance-xen-kvm/tree/master/result_ssd_examples

34. IOPS, iosize = 11 byte [charts: 11 byte, direct=false, random read and random write IOPS for host, kvm1, kvm2, xen]
35. CPU Usage, iosize = 11 byte [charts: random read and random write CPU usage (100% = 1 core), test server vs host]
36. IO %util, iosize = 11 byte [charts: random read and random write IO %util, test server vs host]
37. Latency Distribution, iosize = 11 byte [charts: random read and random write latency distribution, msec vs %]
38. IOPS, iosize = 4 kbyte [charts: 4 kbyte, direct=true, random read and random write IOPS]
39. CPU Usage, iosize = 4 kbyte [charts: random read and random write CPU usage]
40. IO %util, iosize = 4 kbyte [charts: random read and random write IO %util]
41. Latency Distribution, iosize = 4 kbyte [charts: random read and random write latency distribution]
42. IOPS, iosize = 512 kbyte [charts: 512 kbyte, direct=true, random read and random write IOPS]
43. CPU Usage, iosize = 512 kbyte [charts: random read and random write CPU usage]
44. IO %util, iosize = 512 kbyte [charts: random read and random write IO %util]
45. Latency Distribution, iosize = 512 kbyte [charts: random read and random write latency distribution]
46. IOPS, iosize = 1 mbyte [charts: 1 mbyte, direct=true, sequential read and sequential write IOPS]
47. CPU Usage, iosize = 1 mbyte [charts: sequential read and sequential write CPU usage]
48. IO %util, iosize = 1 mbyte [charts: sequential read and sequential write IO %util]
49. Latency Distribution, iosize = 1 mbyte [charts: sequential read and sequential write latency distribution]
50. Read Bandwidth Comparison [chart: read bandwidth relative to host (host=1) per I/O size]
51. Write Bandwidth Comparison [chart: write bandwidth relative to host (host=1) per I/O size; the normalization is sketched below]
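The host=1 charts are presumably computed as in the Criteria slide: bandwidth = I/O size * IOPS, then divided by the host's bandwidth for the same test, so the I/O size cancels out. A worked sketch with made-up IOPS figures, not measured results from this benchmark:

```python
# Hedged sketch of the "host=1" normalization used in the bandwidth comparison
# charts. The IOPS figures below are illustrative placeholders only.
io_size = 4 * 1024  # 4 kbyte, in bytes

iops = {"host": 10000, "kvm1": 600, "kvm2": 6000, "xen": 11000}

bandwidth = {name: io_size * v for name, v in iops.items()}            # bytes/sec
relative = {name: bw / bandwidth["host"] for name, bw in bandwidth.items()}

for name in ("host", "kvm1", "kvm2", "xen"):
    print(f"{name}: {bandwidth[name] / 1024 / 1024:.1f} MB/s, {relative[name]:.2f}x host")
```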
52. 5. Consideration

53. Use Cases
If your VM host server's disks are slow, Xen will offer better I/O performance than KVM. KVM with cache="none" is relatively appropriate for performance simulation of a physical server, because of the similarity of its latency distribution. Neither Xen nor KVM is suitable for a high random I/O load on a very fast device, e.g. an OLTP DB server with ioDrive.

54. Questions
In this test the I/O load is single-threaded. What about parallel I/O load from many guests (a possible extension is sketched below)? Is Xen's data safe? It seems to me that more optimization means a longer time in memory, and therefore a larger data loss on a server fault.
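For the first open question, one way to generate parallel I/O load inside a single guest, not part of the original single-threaded benchmark, would be fio's numjobs and iodepth options, for example:

```python
# Hedged sketch: rerun the 4k random write test with parallel load inside one
# guest by using several fio jobs and a deeper queue. Path and job name are
# placeholders; load from many guests at once is a separate concern.
import subprocess

cmd = [
    "fio",
    "--name=randwrite-4k-parallel",
    "--filename=/mnt/bench/fio.testfile",
    "--rw=randwrite",
    "--bs=4k",
    "--direct=1",
    "--size=3G",
    "--runtime=180",
    "--ioengine=libaio",
    "--numjobs=4",        # four concurrent jobs instead of one
    "--iodepth=16",       # queue depth per job
    "--group_reporting",  # aggregate the jobs into one result
]
subprocess.run(cmd, check=True)
```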