Document Revision 1.0, December 2014

Intel® Open Network Platform Server (Release 1.2)
Benchmark Performance Test Report


Revision History

Revision 1.0 (December 15, 2014): Initial document for release of Intel® Open Network Platform Server 1.2.


Contents

1.0 Introduction
2.0 Ingredient Specifications
   2.1 Hardware Versions
   2.2 Software Versions
3.0 Test Cases
4.0 Test Descriptions
   4.1 Performance Metrics
   4.2 Test Methodology
      4.2.1 Layer 2 and Layer 3 Throughput Tests
      4.2.2 Latency Tests
5.0 Platform Configuration
   5.1 Linux Operating System Configuration
   5.2 BIOS Configuration
   5.3 Core Usage for Intel DPDK Accelerated vSwitch
   5.4 Core Usage for OVS
6.0 Test Results
   6.1 Host Performance with DPDK
      6.1.1 Test Case 0 - Host L3 Forwarding
      6.1.2 Test Case 1 - Host L2 Forwarding
   6.2 Virtual Switching Performance
      6.2.1 Test Case 2 - OVS with DPDK-netdev L3 Forwarding (Throughput)
      6.2.2 Test Case 3 - OVS with DPDK-netdev L3 Forwarding (Latency)
      6.2.3 Test Case 4 - OVDK vSwitch L3 Forwarding (Throughput)
   6.3 VM Throughput Performance without a vSwitch
      6.3.1 Test Case 5 - VM L2/L3 Forwarding with PCI Pass-through
      6.3.2 Test Case 6 - DPDK L3 Forwarding: SR-IOV Pass-through
   6.4 VM Throughput Performance with vSwitch
      6.4.1 Test Case 7 - Throughput Performance with One VM (OVS with DPDK-netdev, L3 Forwarding)
      6.4.2 Test Case 8 - Throughput Performance with Two VMs in Series (OVS with DPDK-netdev, L3 Forwarding)
      6.4.3 Test Case 9 - Throughput Performance with Two VMs in Parallel (OVS with DPDK-netdev, L3 Forwarding)
   6.5 Encrypt / Decrypt Performance
      6.5.1 Test Case 10 - VM-VM IPSec Tunnel Performance Using Quick Assist Technology
Appendix A DPDK L2 and L3 Forwarding Applications
   A.1 Building DPDK and L3/L2 Forwarding Applications
   A.2 Running the DPDK L3 Forwarding Application
   A.3 Running the DPDK L2 Forwarding Application
Appendix B Building DPDK, Open vSwitch, and Intel DPDK Accelerated vSwitch
   B.1 Getting DPDK
      B.1.1 Getting DPDK Git Source Repository
      B.1.2 Getting DPDK Source tar from DPDK.ORG
   B.2 Building DPDK for OVS
      B.2.1 Getting OVS Git Source Repository
      B.2.2 Getting OVS DPDK-netdev User Space vHost Patch
      B.2.3 Applying OVS DPDK-netdev User Space vHost Patch
      B.2.4 Building OVS with DPDK-netdev
   B.3 Building DPDK for Intel DPDK Accelerated vSwitch
      B.3.1 Building Intel DPDK Accelerated vSwitch
   B.4 Host Kernel Boot Configuration
      B.4.1 Kernel Boot Hugepage Configuration
      B.4.2 Kernel Boot CPU Isolation Configuration
      B.4.3 IOMMU Configuration
      B.4.4 Generating GRUB Boot File
      B.4.5 Verifying Kernel Boot Configuration
      B.4.6 Configuring System Variables (Host System Configuration)
   B.5 Setting Up Tests
      B.5.1 OVS Throughput Tests (PHY-PHY without a VM)
      B.5.2 Intel DPDK Accelerated vSwitch Throughput Tests (PHY-PHY without a VM)
      B.5.3 Intel DPDK Accelerated vSwitch with User Space vHost
      B.5.4 OVS with User Space vHost
      B.5.5 SR-IOV VM Test
      B.5.6 Affinitization and Performance Tuning
   B.6 PCI Passthrough with Intel® QuickAssist
      B.6.1 VM Installation
      B.6.2 Installing strongSwan IPSec Software
      B.6.3 Configuring strongSwan IPsec Software
Appendix C Glossary
Appendix D Definitions
   D.1 Packet Throughput
   D.2 RFC 2544
Appendix E References


1.0 Introduction

The primary audiences for this document are architects and engineers implementing the Intel® Open Network Platform Server. Software ingredients of this architecture include:

• Fedora 20*

• Data Plane Development Kit (DPDK)

• Open vSwitch (OVS)*

• OpenStack*

• OpenDaylight*

An understanding of system performance is required to develop solutions that meet the demanding requirements of the telecom industry and transform telecom networks. This document provides a guide for performance characterization using the Intel® Open Network Platform Server (Intel ONP Server). Ingredient versions, integration procedures, configuration parameters, and test methodologies all influence performance.

This document provides a suite of Intel ONP Server performance tests, and includes test cases, configuration details, performance data, and learnings to help with optimizing system configurations. The test methodologies and data provide a baseline for performance characterization of a Network Function Virtualization (NFV) compute node using Commercial Off-the-Shelf (COTS) hardware with the Intel® Xeon® E5-2697 v3 processor. This data does not represent optimal performance but should be easy to reproduce, helping architects evaluate their specific performance requirements. The tests described are not comprehensive and do not cover any conformance aspects. Providing a baseline configuration of well-tested procedures can help achieve optimal system performance when developing an NFV/SDN solution.

Benchmarking is not a trivial task, as it takes equipment, time, and resources. Engineers doing NFV performance characterization need strong domain knowledge (telco, virtualization, etc.), good understanding of the hardware (compute platforms, networking technologies), familiarity with network protocols, and hands-on experience working with relevant open-source software (Linux virtualization, vSwitch, software acceleration technologies like DPDK, system optimization/tuning, etc.).



2.0 Ingredient Specifications

2.1 Hardware Versions

Table 2-1 Hardware Ingredients

Platform: Intel® Server Board S2600WTT. Intel® Xeon® DP-based server (2 CPU sockets); 120 GB SSD 2.5in SATA 6 Gb/s Intel Wolfsville SSDSC2BB120G4.

Processors: Intel® Xeon® E5-2697 v3 processor (formerly code-named Haswell). 14 cores, 2.60 GHz, 145 W, 35 MB cache, 9.6 GT/s QPI, DDR4-1600/1866/2133. Tested with DDR4-1067.

Cores: 14 physical cores per CPU (i.e., per socket); 28 hyper-threaded cores per CPU (i.e., per socket) for 56 total cores per platform (2 sockets).

Memory: 8 GB DDR4 RDIMM (Crucial); server capacity = 64 GB RAM (8 x 8 GB). Tested with 32 GB of memory.

NICs (Niantic): 2x Intel® 82599 10 Gigabit Ethernet Controller. NICs are on socket zero (3 PCIe slots available on socket 0).

BIOS: Revision GRNDSDP1.86B.0038.R01.1409040644, release date 09/04/2014. Intel® Virtualization Technology for Directed I/O (Intel® VT-d) enabled; hyper-threading disabled.

Quick Assist Technology: Intel® QuickAssist Adapter 8950-SCCP (formerly code-named Walnut Hill) with Intel® Communications Chipset 8950 (formerly code-named Coleto Creek). PCIe acceleration card with Intel® Communications Chipset 8950; capabilities include RSA, SSL/IPsec, Wireless Crypto, Security Crypto, and Compression.

2.2 Software Versions

Table 2-2 Software Versions

Fedora 20 x86_64 (Host OS): 3.15.6-200.fc20.x86_64
Fedora 20 x86_64 (VM): 3.15.6-200.fc20.x86_64
Qemu-kvm (virtualization technology): 1.6.2-9.fc20.x86_64
Data Plane Development Kit (DPDK) (network stack bypass): DPDK 1.7.1, git commit 99213f3827bad956d74e2259d06844012ba287a4
Intel® DPDK Accelerated vSwitch (OVDK)1 (vSwitch): v1.1.0.61, git commit bd8e8ee7565ca7e843a43204ee24a7e1e2bf9c6f
Open vSwitch (OVS) (vSwitch): Open vSwitch v2.3.90, git commit e9bbe84b6b51eb9671451504b79c7b79b7250c3b
DPDK-netdev2 (vSwitch patch): Currently an experimental feature of OVS that needs to be applied to the OVS version specified above. http://patchwork.openvswitch.org/patch/6280/
Intel® QuickAssist Technology Driver (QAT) (crypto accelerator): Intel® Communications Chipset 89xx Series Software for Linux* - Intel® QuickAssist Technology Driver (L.2.2.0-30). Latest drivers posted at: https://01.org/packet-processing/intel%C2%AE-quickassist-technology-drivers-and-patches
strongSwan (IPSec stack): strongSwan v4.5.3, http://download.strongswan.org/strongswan-4.5.3.tar.gz
PktGen (software network packet generator): v2.7.7

1. Intel has stopped further investment in OVDK (available on GitHub). Intel OVDK demonstrated that the Data Plane Development Kit (DPDK) accelerates vSwitch performance. Intel has also contributed to the Open vSwitch project, which has adopted DPDK, and it is therefore not desirable to have two different code bases that use DPDK for accelerated vSwitch performance. As a result, Intel has decided to increase investment in the Open vSwitch community project, with a focus on using DPDK and advancing hardware acceleration.

2. Keeping up with line rate at 10 Gb/s in a flow of 64-byte packets requires processing more than 14.8 million packets per second, whereas the current standard OVS kernel data path forwards more like 1.1 million to 1.3 million packets per second, and OVS latency in processing small packets becomes intolerable as well. The new mainstream OVS code goes under the name OVS with DPDK-netdev and has been on GitHub since March. It is available as an "experimental feature" in OVS version 2.3. Intel hopes to include the code officially in OVS 2.4, which OVS committers are hoping to release early next year.


3.0 Test Cases

Network throughput, latency, and jitter are measured for L2/L3 forwarding using the standard RFC 2544 test methodology.

L3 forwarding on the host and port-to-port switching through the vSwitch also serve as verification tests after installing the compute node ingredients. The table below summarizes the performance test cases (a subset of the many possible test configurations).

Table 3-1 Test Case Summary

Host-Only Performance

Test Case 0: L3 forwarding on host. Configuration: DPDK. Workload/Metrics: L3 Fwd (DPDK test application); no packet modification; Throughput1; Latency (avg, max, min).

Test Case 1: L2 forwarding on host. Configuration: DPDK. Workload/Metrics: L2 Fwd (DPDK application); packet MAC modification; Throughput1; Latency (avg, max, min).

Virtual Switching Performance

Test Case 2: L3 forwarding by vSwitch (throughput). Configuration: OVS with DPDK-netdev2. Workload/Metrics: switch L3 Fwd; Throughput1; Latency (avg, max, min).

Test Case 3: L3 forwarding by vSwitch (latency). Configuration: OVS with DPDK-netdev2. Workload/Metrics: switch L3 Fwd; Latency distribution (percentile ranges)3.

Test Case 4: L3 forwarding by vSwitch. Configuration: Intel® DPDK Accelerated vSwitch. Workload/Metrics: switch L3 Fwd; Throughput1; Latency (avg, max, min).

Test Case 5: L2 forwarding by single VM. Configuration: PCI pass-through. Workload/Metrics: L2 Fwd in VM; packet MAC modification; Throughput1; Latency (avg, max, min).

Test Case 6: L3 forwarding by single VM. Configuration: SR-IOV. Workload/Metrics: L2 Fwd in VM; packet MAC modification; Throughput1; Latency (avg, max, min).

Test Case 7: L3 forwarding by single VM. Configuration: OVS with DPDK-netdev2, user space vhost. Workload/Metrics: L3 table lookup in the switch with port forwarding (testpmd application) in the VM, without packet modification; Throughput1; Latency (avg, max, min).

Test Case 8: L3 forwarding by two VMs in series. Configuration: OVS with DPDK-netdev2, user space vhost. Workload/Metrics: L3 table lookup in the switch with port forwarding (testpmd application) in the VMs, without packet modification; Throughput1; Latency (avg, max, min).

Test Case 9: L3 forwarding by two VMs in parallel. Configuration: OVS with DPDK-netdev2, user space vhost. Workload/Metrics: L3 table lookup in the switch with port forwarding (testpmd application) in the VMs, without packet modification; Throughput1; Latency (avg, max, min).

Security

Test Case 10: Encrypt/decrypt with Intel® Quick Assist Technology. Configuration: VM-VM IPSec tunnel performance. Workload/Metrics: Throughput1.

1. Tests run for 60 seconds per load target.
2. OVS with DPDK-netdev test cases were performed using OVS with DPDK-netdev, which is available as an "experimental feature" in OVS version 2.3. This shows the performance improvements possible today with standard OVS code when using DPDK. Refer to Section 2.2 for details of software versions with commit IDs.
3. Tests run for 1 hour per load target.


4.0 Test Descriptions

4.1 Performance Metrics

The Metro Ethernet Forum (http://metroethernetforum.org/) defines the following Ethernet network performance parameters that affect service quality:

• Frame delay (latency)

• Frame delay variation (jitter)

• Frame loss

• Availability

Furthermore, the following metrics can characterize a network-connected device (i.e., a compute node):

• Throughput1 — The maximum frame rate that can be received and transmitted by the node without any error for “N” flows.

• Latency — Introduced by the node that will be cumulative to all other end-to-end network delays.

• Jitter2 — Introduced by the node infrastructure.

• Frame loss3 — While overall network losses do occur, all packets should be accounted for within a node.

• Burst behavior4 — This measures the ability of the node to buffer packets.

• Metrics5 — Used to characterize the availability and capacity of the compute node. These include start-up performance metrics as well as fault handling and recovery metrics. Metrics measured when the network service is fully “up” and connected include:

— Power consumption of the CPU in various power states.

— Number of cores available/used.

— Number of NIC interfaces supported.

— Headroom available for processing workloads (for example, available for VNF processing).

1. See Appendix D for the definition of packet throughput.
2. Jitter data is not included here. While "round-trip" jitter is fairly easy to measure, ideally it should be measured from the perspective of the application that has to deal with the jitter (e.g., a VNF running on the node).
3. All data provided is with zero packet-loss conditions.
4. Burst behavior data is not included here.
5. Availability and capacity data is not included here.


4.2 Test Methodology

4.2.1 Layer 2 and Layer 3 Throughput Tests

These tests determine the maximum Layer 2 or Layer 3 forwarding rate without traffic loss, and the average latency for different packet sizes, for the Device Under Test (DUT). While packets do get lost in telco networks, communications equipment is generally expected NOT to lose any packets. Therefore, it is important to run a zero packet-loss test of sufficient duration. Test results are zero packet-loss unless otherwise noted.

Tests are usually performed full duplex with traffic transmitting in both directions. The tests here were conducted with a variety of flows, for example: a single flow (unidirectional traffic), two flows (bidirectional traffic with one flow in each direction), and 30,000 flows (bidirectional with 15k flows in each direction).

For Layer 2 forwarding, the DUT must perform packet parsing and Layer 2 address look-ups on the ingress port and then modify the MAC header before forwarding the packet on the egress port.

For Layer 3 forwarding, the DUT must perform packet parsing and Layer 3 address look-ups on the ingress port and then forward the packet on the egress port.

Test duration refers to the measurement period for any particular packet size with an offered load, and assumes the system has reached a steady state. Using the RFC 2544 test methodology, this is specified as at least 60 seconds. To detect outlier data, it is desirable to run longer-duration "soak" tests once the maximum zero-loss throughput is determined. This is particularly relevant for latency tests. In this document, throughput tests used 60-second load iterations, and latency tests were run for one hour (per load target).

4.2.2 Latency Tests

Minimum, maximum, and average latency numbers were collected with the throughput tests. This test aims to understand the latency distribution for different packet sizes over an extended test run, in order to uncover outliers. This means that latency tests should run for extended periods (at least 1 hour and ideally 24 hours). Practical considerations dictate picking the highest throughput that has demonstrated zero packet loss (for a particular packet size), as determined by the throughput tests above.

Note: Latency through the system depends on various parameters configured by the software, such as the ring size in the Ethernet adapters, Ethernet tail pointer update rates, platform characteristics, etc.


5.0 Platform Configuration

The configuration parameters described in this section are based on experience running various tests and may not be optimal.

5.1 Linux Operating System Configuration

Table 5-1 Linux Configuration

Firewall: Disabled.
NetworkManager: Disabled (as set by install).
irqbalance: Disabled.
ssh: Enabled.
Address Space Layout Randomization (ASLR): Disabled.
IPv4 forwarding: Disabled.
Host kernel boot parameters (regenerate grub.conf): default_hugepagesz=1G hugepagesz=1G hugepages=32 intel_iommu=off isolcpus=1,2,3,4,5,6,7,8,9..n-1 for n-core systems. Isolates the DPDK cores on the host.
VM kernel boot parameters (regenerate grub.conf): default_hugepagesz=1GB hugepagesz=1GB hugepages=1 isolcpus=1,2. Isolates the DPDK cores in the VM.
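The table above maps to host settings roughly as follows (a minimal sketch for Fedora 20; the service names and proc paths are standard, but verify them on your system before relying on this):

# systemctl stop firewalld && systemctl disable firewalld
# systemctl stop NetworkManager && systemctl disable NetworkManager
# systemctl stop irqbalance && systemctl disable irqbalance
# echo 0 > /proc/sys/kernel/randomize_va_space   # disable ASLR on the running system
# echo 0 > /proc/sys/net/ipv4/ip_forward         # disable IPv4 forwarding

The kernel boot parameters in the last two rows are covered in detail in Appendix B.4.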

5.2 BIOS Configuration

The configuration parameters described in this section are based on experience running the tests described and may not be optimal.

Table 5-2 BIOS Configuration

Power Management Settings
CPU Power and Performance Policy: Traditional. This is the equivalent of a "performance" setting.
Intel® Turbo Boost: Enabled.
Processor C3: Disabled. Prevents a low-power state.
Processor C6: Enabled. Not relevant when C3 is disabled.

Table 5-2 BIOS Configuration (Continued)

Processor Configuration
Intel® Hyper-Threading Technology (HTT): Disabled.
MLC Streamer: Enabled (Hardware Prefetcher: Enabled).
MLC Spatial Prefetcher: Enabled (Adjacent Cache Prefetch: Enabled).
DCU Data Prefetcher: Enabled (DCU Streamer Prefetcher: Enabled).
DCU Instruction Prefetcher: Enabled (DCU IP Prefetcher: Enabled).

Virtualization Configuration
Intel® Virtualization Technology for Directed I/O (VT-d): Enabled. This is disabled by the kernel when iommu=off (the default).

Memory Configuration
Memory RAS and Performance Configuration -> NUMA Optimized: Auto.

I/O Configuration
DDIO (Direct Cache Access): Auto.

Other Configuration
HPET (High Precision Timer): Disabled. Requires a kernel rebuild to utilize (Fedora 20 sets this to disabled by default).

5.3 Core Usage for Intel DPDK Accelerated vSwitch

Performance results in this section do not include VM tests with the Intel DPDK Accelerated vSwitch (OVDK). The core usage information here is provided as a reference for doing additional tests with OVDK.

Table 5-3 Core Usage for Intel DPDK Accelerated vSwitch

OVS_DPDK Switching Cores
OVS_DPDK: cores 0, 1, 2, 3. Affinity is set on the `ovs_dpdk' command line; tests used core mask 0x0F (i.e., cores 0, 1, 2, and 3).
vswitchd: core 8. taskset -a <pid_of_vswitchd_process>. This is only relevant when setting up flows.

OS/VM Cores
QEMU* VM1 VCPU0: core 4. taskset -p 10 <pid_of_vm1_qemu_vcpu0_process>
QEMU* VM1 VCPU1: core 5. taskset -p 20 <pid_of_vm1_qemu_vcpu1_process>
QEMU* VM2 VCPU0: core 6. taskset -p 40 <pid_of_vm2_qemu_vcpu0_process>
QEMU* VM2 VCPU1: core 7. taskset -p 80 <pid_of_vm2_qemu_vcpu1_process>
Kernel: core 0 plus the CPU socket 1 cores. All other CPUs are isolated (`isolcpus' boot parameter).

Page 15: Intel Open Network Platform Server (Release 1.2) · 2019-06-27 · understanding of the hardware (compute platforms, networking technologies), familiarity with network protocols,

15

Intel® Open Network Platform ServerBenchmark Performance Test Report

5.4 Core Usage for OVS

All Linux threads for the OVS vSwitch daemon can be listed with:

$ top -p `pidof ovs-vswitchd` -H -d1

Using taskset, the current task thread affinity can be checked and changed to other CPU cores as needed.
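For example, a minimal sketch of checking and changing thread affinity is shown below (thread IDs vary per run, and the core mask 0x10 for core 4 is illustrative only):

# for tid in $(ls /proc/$(pidof ovs-vswitchd)/task); do taskset -p $tid; done
# taskset -p 0x10 <tid_of_target_thread>   # pin the chosen thread to core 4 (mask 0x10)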

Table 5-4 Core Usage for OVS

OVS switchd Cores
ovs-vswitchd (all tasks except pmd): core 3. OVS user space processes affinitized to core mask 0x04:
# ./vswitchd/ovs-vswitchd --dpdk -c 0x4 -n 4 --socket-mem 4096,0 -- unix:$DB_SOCK --pidfile
ovs-vswitchd pmd task: core 4. Sets the PMD core mask to 2 for CPU core 1 affinitization (set in the OVS database; persistent):
# ./ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=2



6.0 Test Results

Note: In this section, all performance testing was done using the RFC 2544 test methodology (zero loss); refer to Appendix D for more information.

6.1 Host Performance with DPDK

Configuration:

1. No VMs are configured and no vSwitch is started or used.

2. DPDK-based port forwarding sample application (L2/L3fwd).

Data Path (Numbers Matching Red Circles):

1. The packet generator creates flows based on RFC 2544.

2. The DPDK L2 or L3 forwarding application forwards the traffic from the first physical port to the second.

3. The traffic flows back to the packet generator and is measured per RFC 2544 throughput test.

Figure 6-1 Host Performance with DPDK


6.1.1 Test Case 0 - Host L3 Forwarding

Test Configuration Notes:

• Traffic is unidirectional (1 flow) or bidirectional (2 flows).

• Packets are not modified.

The average latency data in Table 6-1 implies that latency increases with packet size; however, this is because throughput is approaching its maximum (where latency increases dramatically). Therefore, these latency figures may not represent realistic conditions (i.e., at lower data rates).

Table 6-1 shows performance with bidirectional traffic. Note that performance is limited by the network card.

Figure 6-2 Host L3 Forwarding Performance – Throughput

Table 6-1 Host L3 Forwarding Performance – Average Latency

Packet Size (Bytes)    Average Latency (µs)    Bandwidth (% of 10 GbE)
64                     9.77                    78
128                    12.33                   97
256                    13.17                   99
512                    20.24                   99
1024                   33.02                   99
1280                   39.82                   99
1518                   46.61                   99


6.1.2 Test Case 1 - Host L2 Forwarding

Test Configuration Notes:

• Traffic is bidirectional (2 flows).

• Packets are modified (Ethernet header).

Figure 6-3 Host L2 Forwarding Performance – Throughput


6.2 Virtual Switching Performance

Configuration:

1. No VMs are configured (i.e. PHY-to-PHY).

2. Intel® DPDK Accelerated vSwitch or OVS with DPDK-netdev, port-to-port.

Data Path (Numbers Matching Red Circles):

1. The packet generator creates flows based on RFC 2544.

2. The Intel® DPDK Accelerated vSwitch or OVS network stack forwards the traffic from the first physical port to the second.

3. The traffic flows back to the packet generator.

Figure 6-4 Virtual Switching Performance
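For reference, a minimal sketch of the PHY-PHY switch setup with OVS and DPDK-netdev is shown below. The bridge name, port names, and OpenFlow port numbers are assumptions (physical DPDK ports are typically named dpdkN in this OVS version); the full procedure used for these tests is given in Appendix B.5.1.

# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
# ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
# ovs-ofctl add-flow br0 in_port=1,action=output:2
# ovs-ofctl add-flow br0 in_port=2,action=output:1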


6.2.1 Test Case 2 - OVS with DPDK-netdev L3 Forwarding (Throughput)

Test Configuration Notes:

• Traffic is bidirectional, except for the one-flow case (unidirectional). The number of flows in the chart represents the total flows in both directions.

• Packets are not modified.

• These tests are “PHY-PHY” (physical port to physical port without a VM).

Figure 6-5 OVS with DPDK-netdev Switching Performance - Throughput

Figure 6-6 OVS with DPDK-netdev Switching Performance - Average Latency1

1. Latency is measured by the traffic generator on a per packet basis.


6.2.2 Test Case 3 - OVS with DPDK-netdev L3 Forwarding (Latency)

Test Configuration Notes:

• Traffic is bidirectional.

• Packets are not modified.

• Test run for 1 hour per target load.

• Only 64-byte results are shown in the chart.

• These tests are “PHY-PHY” (physical port to physical port without a VM).

The chart in Figure 6-7 shows the latency distribution of packets for a range of target loads. For example, with a target load of 10% (i.e., 1 Gb/s) almost all packets have a latency of between 2.5 and 5.1 µs, whereas at a target load of 50% (i.e., 5 Gb/s) almost all packets have a latency of between 10.2 and 20.5 µs.

Figure 6-7 OVS with DPDK-netdev Switching Performance – Latency Distribution for 64B Packets


6.2.3 Test Case 4 - OVDK vSwitch L3 Forwarding (Throughput)

Test Configuration Notes:

• Traffic is bidirectional.

• Packets are not modified.

• Results for three different flow settings are shown (2, 10k, 30k).

• These tests are “PHY-PHY” (physical port to physical port without a VM).

Figure 6-8 OVDK Switching Performance - Throughput

Figure 6-9 OVDK Switching Performance - Average Latency


6.3 VM Throughput Performance without a vSwitch

Configuration (Done Manually):

1. One VM gets brought up and connected to the NIC with PCI Passthrough or SR-IOV.

2. The IP addresses of the VM get configured.
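A minimal sketch of step 1 using VFIO-based PCI pass-through is shown below. The PCI addresses, memory size, and disk image path are examples only, and the exact QEMU invocation used for these tests is not reproduced here; IOMMU support must also be enabled on the kernel boot line (see Appendix B.4.3).

# modprobe vfio-pci
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=vfio-pci 06:00.0
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=vfio-pci 06:00.1
# qemu-system-x86_64 -enable-kvm -cpu host -m 4096 -smp 2 \
      -drive file=/vm-images/fedora20-vm.img \
      -device vfio-pci,host=0000:06:00.0 \
      -device vfio-pci,host=0000:06:00.1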

Data Path (Numbers Matching Red Circles):

1. The packet generator creates flows based on RFC 2544.

2. The flows arrive at the first vPort of the VM.

3. The VM receives the flows and forwards them out through its second vPort.

4. The flow is sent back to the packet generator.

Figure 6-10 VM Throughput Performance without a vSwitch


6.3.1 Test Case 5 - VM L2/L3 Forwarding with PCI Pass-through

Test Configuration Notes for L3 Forwarding:

• Traffic is bidirectional.

• Packets not modified.

Test Configuration Notes for L2 Forwarding:

• Traffic is bidirectional.

• Packets are modified (Ethernet Header).

Figure 6-11 VM L3 Forwarding Performance with PCI Pass-through - Throughput

Figure 6-12 VM L2 Forwarding Performance with PCI Pass-through - Throughput


6.3.2 Test Case 6 - DPDK L3 Forwarding: SR-IOV Pass-through

Test Configuration Notes for L3 Forwarding:

• Traffic is bidirectional.

• Packets not modified.

Figure 6-13 VM L3 Forwarding Performance with SR-IOV - Throughput


6.4 VM Throughput Performance with vSwitch

6.4.1 Test Case 7 - Throughput Performance with One VM (OVS with DPDK-netdev, L3 Forwarding)

Configuration (Done Manually):

1. One VM gets brought up and connected to the vSwitch.

2. IP address of the VM gets configured.

3. Flows get programmed to the vSwitch.
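A minimal sketch of step 3 (programming the flows into the vSwitch) is shown below; the bridge name and the OpenFlow port numbers for the physical ports and the VM's vhost ports are assumptions and must match the actual port assignments on your system.

# ovs-ofctl add-flow br0 in_port=1,action=output:3   # physical port 1 -> VM vPort 1
# ovs-ofctl add-flow br0 in_port=4,action=output:2   # VM vPort 2 -> physical port 2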

Data Path (Numbers Matching Red Circles):

1. The packet generator creates flows based on RFC 2544.

2. The vSwitch forwards the flows to the first vPort of the VM.

3. The VM receives the flows and forwards them out through its second vPort.

4. The vSwitch forwards the flows back to the packet generator.

Test Configuration Notes:

• Traffic is bidirectional.

• Packets not modified.

Figure 6-14 VM Throughput Performance with vSwitch - One VM


6.4.2 Test Case 8 - Throughput Performance with Two VMs in Series (OVS with DPDK-netdev, L3 Forwarding)

Figure 6-15 VM Throughput Performance - One VM (OVS with DPDK-netdev)

Figure 6-16 VM Throughput Performance with vSwitch - Two VMs in Series


Configuration (Done Manually):

1. Set up two VMs and connect them to the vSwitch.

2. IP addresses of VMs are configured.

3. Flows get programmed to the vSwitch.

Data Path (Numbers Matching Red Circles):

1. The packet generator creates flows based on RFC 2544.

2. The vSwitch forwards the flows to the first vPort of VM #1.

3. VM #1 receives the flows and forwards them to VM #2 through its second vPort (via the vSwitch).

4. The vSwitch forwards the flows to the first vPort of VM #2.

5. VM #2 receives the flows and forwards them back to the vSwitch through its second vPort.

6. The vSwitch forwards the flows back to the packet generator.

Test Configuration Notes:

• Traffic is bidirectional

• Packets not modified

Figure 6-17 VM Throughput Performance - Two VMs in Series (OVS with DPDK-netdev)


6.4.3 Test Case 9 - Throughput Performance with Two VMs in Parallel (OVS with DPDK-netdev, L3 Forwarding)

Configuration (Done Manually):

1. Set up two VMs and connect them to the vSwitch.

2. IP addresses of VMs are configured.

3. Flows get programmed to the vSwitch.

Data Path (Numbers Matching Red Circles):

1. The packet generator creates flows based on RFC 2544.

2. The vSwitch forwards the flows to the first vPort of VM #1 and first vPort of VM #2.

3. VM #1 receives the flows and forwards them out through its second vPort. VM #2 receives the flows and forwards them out through its second vPort.

4. The vSwitch forwards the flows back to the packet generator.

Test Configuration Notes:

• Traffic is bidirectional

• Packets not modified

Figure 6-18 VM Throughput Performance with vSwitch - Two VMs in Parallel


Figure 6-19 VM Throughput Performance - Two VMs in Parallel (OVS with DPDK-netdev)


6.5 Encrypt / Decrypt Performance

6.5.1 Test Case 10 - VM-VM IPSec Tunnel Performance Using Quick Assist Technology

An IPSec tunnel is set up between two tunnel endpoints, one on each virtual machine, and each VM uses Intel® Quick Assist Technology to offload encryption and decryption.

Throughput tests send UDP data through the IPsec tunnel; encapsulation/decapsulation in the VM is handled by strongSwan (an open-source IPsec-based VPN solution, https://www.strongswan.org/). Refer to Appendix B.6 for setting up a VM with the QAT drivers, the netkeyshim module, and the strongSwan IPsec application.

Configuration:

• Two Quick Assist Engines are required (one for each VM).

• PCI pass-through is used to attach the Quick Assist PCI accelerator to a VM.

• The virtual network interface of the VMs is virtio with vhost, i.e., as used by standard Open vSwitch.

• Netperf UDP stream test is used to generate traffic in the VM.
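A minimal sketch of the traffic generation step, run from one VM toward the far tunnel endpoint, is shown below (the peer IP address, test length, and message size are examples only):

# netperf -H 192.168.1.2 -t UDP_STREAM -l 60 -- -m 64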

The chart below compares performance with and without Quick Assist; it indicates an improvement of more than twice the throughput for 64-byte packets, with larger gains for larger packets.

Figure 6-20 VM-VM IPSec Tunnel Performance Using QAT


Note: Results were obtained using an Intel Grizzly Pass Xeon® DP server with 64 GB RAM (8x 8 GB), an Intel® Xeon® processor E5-2680 v2 (LGA2011, 2.8 GHz, 25 MB cache, 115 W, 10 cores), and a 240 GB SSD 2.5in SATA (see the Intel® Open Network Platform for Servers Reference Architecture, Version 1.1); they are provided here for reference. Similar results are expected for the ingredients specified in Section 2.0, "Ingredient Specifications".

Figure 6-21 VM-to-VM IPsec Tunnel Performance Using Intel® Quick Assist Technology


Appendix A DPDK L2 and L3 Forwarding Applications

A.1 Building DPDK and L3/L2 Forwarding Applications

The DPDK is installed with IVSHMEM support. It can also be installed without IVSHMEM (see the DPDK documentation).

1. If previously built for OVS or OVDK, DPDK needs to be uninstalled first.

# make uninstall
# make install T=x86_64-ivshmem-linuxapp-gcc

2. Set the DPDK environment variables (paths assume the DPDK git repository; change the DPDK paths for a tar-file install).

# export RTE_SDK=/usr/src/dpdk
# export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc

3. Build l3fwd.

# cd /usr/src/dpdk/examples/l3fwd/
# make

4. Build l2fwd.

# cd /usr/src/dpdk/examples/l2fwd/
# make

A.2 Running the DPDK L3 Forwarding Application

Build DPDK and l3fwd as described in Appendix A.1. Set up the host boot line with 1 GB and 2 MB hugepage memory and isolcpus (see Appendix B.4).

1. Initialize host hugepage memory after reboot.

# mount -t hugetlbfs nodev /dev/hugepages
# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB

2. Initialize host UIO driver.

# modprobe uio
# insmod $RTE_SDK/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko


3. Locate the target NICs to be used for the test (assuming 06:00.0 and 06:00.1 were found as targets), attach them to the UIO driver, and check the binding status.

# lspci -nn | grep Ethernet
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.0
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.1
# $RTE_SDK/tools/dpdk_nic_bind.py --status

4. To run the DPDK L3 forwarding application on the compute node host use the following command line:

# cd $RTE_SDK/examples/l3fwd
# ./build/l3fwd -c 0x6 -n 4 --socket-mem 1024,0 -- -p 0x3 --config="(0,0,1),(1,0,2)"

Note: The -c option enables cores (1,2) to run the application. The --socket-mem option just allocates 1 GB of the hugepage memory from target NUMA node 0 and zero memory from NUMA node 1. The -p option is the hexadecimal bit mask of the ports to configure. The --config (port, queue, lcore) determines which queues from which ports are mapped to which cores.

Additionally, to run the standard Linux network stack L3 forwarding without DPDK, enable the kernel forwarding parameter as below and start the traffic.

# echo 1 > /proc/sys/net/ipv4/ip_forward

To set up the test for verifying the devstack installation on the compute node, and to contrast DPDK host L3 forwarding versus Linux kernel forwarding, refer to Section 7.1. That section also details the performance data for both test cases.

A.3 Running the DPDK L2 Forwarding Application

The L2 forwarding application modifies the packet data by changing the Ethernet address in the outgoing buffer, which might impact performance. The l3fwd program forwards the packet without modifying it.

Build DPDK and l2fwd as described in Appendix A.1. Set up the host boot line with 1 GB and 2 MB hugepage memory and isolcpus (see Appendix B.4).

1. Initialize host hugepage memory after reboot.

# mount -t hugetlbfs nodev /dev/hugepages
# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB

2. Initialize host UIO driver.

# modprobe uio
# insmod $RTE_SDK/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

3. Locate the target NICs to be used for the test (assuming 06:00.0 and 06:00.1 were found as targets), attach them to the UIO driver, and check the binding status.

# lspci -nn | grep Ethernet
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.0
# $RTE_SDK/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.1
# $RTE_SDK/tools/dpdk_nic_bind.py --status


4. To run the DPDK L2 forwarding application on the compute node host, use the following command line:

# cd $RTE_SDK/examples/l2fwd
# ./build/l2fwd -c 0x6 -n 4 --socket-mem 1024,0 -- -p 0x3

Note: The -c option enables cores (1,2) to run the application. The --socket-mem option allocates 1 GB of hugepage memory from target NUMA node 0 and zero memory from NUMA node 1. The -p option is the hexadecimal bit mask of the ports to configure.


Appendix B Building DPDK, Open vSwitch, and Intel DPDK Accelerated vSwitch

B.1 Getting DPDK

The DPDK needs to be built differently for 1) the various test programs (e.g. DPDK L2 and L3 test programs), 2) OVS with DPDK-netdev, and 3) Intel DPDK Accelerated vSwitch. The DPDK should be built in the same way for the host machine and VM. There are two ways of obtaining DPDK source as listed in Appendix B.1.1 and Appendix B.1.2.

B.1.1 Getting DPDK Git Source Repository

The git repository was the method used to obtain the code used for tests in this document.

The DPDK git source repository was used because the latest available tar file changes often and older tar files are more difficult to obtain, whereas the target git code can be obtained by simply checking out the correct commit or tag.

# cd /usr/src
# git clone git://dpdk.org/dpdk
# cd /usr/src/dpdk

Check out the target DPDK version:

# git checkout -b test_v1.7.1 v1.7.1

The export directory for DPDK git repository is:

# export RTE_SDK=/usr/src/dpdk

B.1.2 Getting DPDK Source tar from DPDK.ORG

This is an alternative way to get the DPDK source and is provided here as additional information.

Get DPDK release 1.7.1 and untar it. Download DPDK 1.7.1 at: http://www.dpdk.org/download

Move to install location, move DPDK tar file to destination, and untar:

# cd /usr/src
# mv ~/dpdk-1.7.1.tar.gz .
# tar -xf dpdk-1.7.1.tar.gz
# cd /usr/src/dpdk-1.7.1


The specific DPDK version directory must be used for the code builds. The git repository DPDK directory is shown in the documentation below and would need to be changed for the DPDK 1.7.1 target directory.

# export RTE_SDK=/usr/src/dpdk-1.7.1

B.2 Building DPDK for OVS

CONFIG_RTE_BUILD_COMBINE_LIBS=y needs to be set in the config/common_linuxapp file. It can be changed using a text editor or by applying the following patch:

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 9047975..6478520 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -81,7 +81,7 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 #
 # Combine to one single library
 #
-CONFIG_RTE_BUILD_COMBINE_LIBS=n
+CONFIG_RTE_BUILD_COMBINE_LIBS=y
 CONFIG_RTE_LIBNAME="intel_dpdk"
 #

The patch can be applied by copying it into a file in the upper-level directory just above the DPDK directory (/usr/src). If the patch file is called patch_dpdk_single_lib, the following is an example of applying it:

cd /usr/src/dpdk
patch -p1 < ../patch_dpdk_single_lib
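Alternatively, the same flag can be flipped with a one-line sed command instead of a patch (a sketch equivalent to editing the file by hand):

# cd /usr/src/dpdk
# sed -i 's/^CONFIG_RTE_BUILD_COMBINE_LIBS=n/CONFIG_RTE_BUILD_COMBINE_LIBS=y/' config/common_linuxapp

With the flag set, rebuild DPDK: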

# make install T=x86_64-ivshmem-linuxapp-gcc

B.2.1 Getting OVS Git Source Repository

Make sure DPDK is currently built for the OVS build; if not, change the configuration as documented above and rebuild DPDK.

Get the OVS Git Repository:

# cd /usr/src
# git clone https://github.com/openvswitch/ovs.git
# cd /usr/src/ovs

Assume the latest commit is the target commit. Create a working branch off the current branch for testing and for applying the user space vhost patch:

# git checkout -b test_branch

Or check out the specific commit used in testing:

# git checkout -b test_branch e9bbe84b6b51eb9671451504b79c7b79b7250c3b

The user vhost interface patch needs to be applied for User vHost Tests.


B.2.2 Getting OVS DPDK-netdev User Space vHost Patch

Currently (10/31/14), not all of the user space vhost DPDK patches have been upstreamed into the OVS source code.

The current patches can be obtained from:

Sept. 29, 2014: http://patchwork.openvswitch.org/patch/6280/

[ovs-dev,v5,1/1] netdev-dpdk: add dpdk vhost portsovs-dev-v5-1-1-netdev-dpdk-add-dpdk-vhost-ports.patch

B.2.3 Applying OVS DPDK-netdev User Space vHost Patch

Before applying the patch, it is desirable to switch to your own git branch to prevent source tree switching issues. The following creates and switches to a new git branch test_srt, but any unused branch name can be used:

git checkout -b test_srt
Switched to a new branch 'test_srt'

Assuming patch has been placed in /usr/src directory, the following applies the patch:

cd /usr/src/ovs
patch -p1 < ../ovs-dev-v5-1-1-netdev-dpdk-add-dpdk-vhost-ports.patch

Assuming the patch applies correctly, the changes can be committed to the test branch of the git repository, allowing switching between different OVS commit versions. At a later time, the patch can be applied to a later point in the code repository.

B.2.4 Building OVS with DPDK-netdev

DPDK needs to be built for OVS as described above. The following assumes DPDK is from the git repository; adjust the DPDK path if using a different DPDK build directory:

# cd /usr/src/ovs
# export DPDK_BUILD=/usr/src/dpdk/x86_64-ivshmem-linuxapp-gcc
# ./boot.sh
# ./configure --with-dpdk=$DPDK_BUILD
# make
# make eventfd

Note: The "ivshmem" build target adds support for IVSHMEM and also supports the other interface options (user space vhost). Using this option keeps the build consistent across all test configurations.


B.3 Building DPDK for Intel DPDK Accelerated vSwitch

The DPDK must be built as part of the Intel DPDK Accelerated vSwitch master build, or it may not build correctly. It must not have the CONFIG_RTE_BUILD_COMBINE_LIBS=y flag set.

B.3.1 Building Intel DPDK Accelerated vSwitch

The OVDK master build is the best way to build OVDK. Make sure DPDK is the base release code without the CONFIG_RTE_BUILD_COMBINE_LIBS=y flag set. The following check on the git repository can be used as long as changes were not checked into the repository:

# cd /usr/src/dpdk
# git diff -M -C

If there are no changes in the code, no output is generated. If the CONFIG_RTE_BUILD_COMBINE_LIBS=y flag is set, the output will be similar to the OVS DPDK patch above. The following discards the change:

# git checkout -- config/common_linuxapp

Applying the DPDK changes as a patch allows easy switching back and forth between DPDK builds. The other method is to create two branches, with and without the CONFIG_RTE_BUILD_COMBINE_LIBS=y flag set, and jump between them.

Getting OVDK git source repository:

# cd /usr/src/
# git clone git://github.com/01org/dpdk-ovs

Check out code for testing (recommended to use more recent code):

# cd /usr/src/dpdk-ovs
# git checkout -b test_my_xx

However, for testing OVDK equivalent to data in this document:

# cd /usr/src/dpdk-ovs
# git checkout -b test_v1.2.0-pre1 bd8e8ee7565ca7e843a43204ee24a7e1e2bf9c6f

The OVDK master build:

# cd /usr/src/dpdk
# export RTE_SDK=$(pwd)
# cd ../dpdk-ovs
# make config && make

1. Commit ID shown here has been used to collect data in this document.


B.4 Host Kernel Boot Configuration

The host system needs several boot parameters added or changed for optimal operation.

B.4.1 Kernel Boot Hugepage Configuration

Hugepages make large memory pages available to applications, improving performance by reducing the processor overhead of updating Translation Lookaside Buffer (TLB) entries. The 1 GB hugepages must be configured on the kernel boot command line and cannot be changed at runtime. The 2 MB hugepage count, however, can be adjusted at runtime. Configuring both on the kernel boot line allows the system to support 1 GB hugepages and 2 MB hugepages at the same time.

The hugepage configuration can be added to the default configuration file /etc/default/grub by appending it to GRUB_CMDLINE_LINUX; the grub configuration file is then regenerated to produce an updated Linux boot configuration.

# vim /etc/default/grub // edit file

. . .
GRUB_CMDLINE_LINUX_DEFAULT="... default_hugepagesz=1GB hugepagesz=1GB hugepages=16 hugepagesz=2M hugepages=2048 ..."
. . .

This sets up sixteen 1 GB hugepages (16 GB of 1 GB hugepage memory) and 2048 2 MB hugepages (4 GB of 2 MB hugepage memory). After boot the number of 1 GB pages cannot be changed, but the number of 2 MB pages can be adjusted at runtime. Half of each hugepage set is allocated from each NUMA node.

This target setup is for OVDK or OVS using two 1 GB Node 0 hugepages, plus two VMs using three 1 GB Node 0 hugepages each, for a total of eight 1 GB Node 0 hugepages used for testing. This requires sixteen 1 GB hugepages to be allocated on the kernel boot line: eight 1 GB Node 0 hugepages and eight 1 GB Node 1 hugepages. The per-node allocation can be checked as shown below.
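After reboot, the per-node 1 GB hugepage allocation can be checked through the standard sysfs paths; with the configuration above, each node is expected to report 8 pages:

# cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
8
# cat /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
8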

B.4.2 Kernel Boot CPU Isolation Configuration

CPU isolation removes CPU cores from the general scheduler so that a core targeted for specific processing is not assigned general processing tasks.

You need to know which CPUs are available on the system; one way to review the CPU topology is shown below. The following configuration isolates the 9 CPU cores used in testing.
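A sketch of reviewing the available CPUs before choosing the isolated cores (the exact output depends on the platform):

# lscpu
# cat /sys/devices/system/cpu/online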

# vim /etc/default/grub // edit file

. . .
GRUB_CMDLINE_LINUX_DEFAULT="... isolcpus=1,2,3,4,5,6,7,8,9 ..."
. . .


B.4.3 IOMMU Configuration

The IOMMU is needed for the PCI passthrough and SR-IOV tests and should be disabled for all other tests. The IOMMU is enabled by setting the kernel boot IOMMU parameters:

# vim /etc/default/grub // edit file
. . .
GRUB_CMDLINE_LINUX_DEFAULT="... iommu=pt intel_iommu=on ..."
. . .

B.4.4 Generating GRUB Boot File

After editing, the grub.cfg boot file needs to be regenerated to set the new kernel boot parameters:

# grub2-mkconfig -o /boot/grub2/grub.cfg

The system needs to be rebooted for the kernel boot changes to take effect.

B.4.5 Verifying Kernel Boot Configuration

Verify that the system booted with the correct kernel parameters by reviewing the kernel boot messages using dmesg:

# dmesg | grep command

[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.11.10-301.fc20.x86_64 default_hugepagesz=1G hugepagesz=1G hugepages=32 root=UUID=978575a5-45f3-4675-9e4b- e17f3fd0a03 ro vconsole.font=latarcyrheb-sun16 rhgb quiet LANG=en_US.UTF-8

This can be used to verify that the kernel boot parameters are set correctly. The dmesg buffer holds only the most recent kernel messages, so if the kernel boot command line is no longer present, search the /var/log/messages file instead, as shown below.
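For example, a sketch of searching the system log for the boot command line (the log location assumes a Fedora-style /var/log/messages):

# grep "Kernel command line" /var/log/messages | tail -1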

B.4.6 Configuring System Variables (Host System Configuration)

# echo "# Disable Address Space Layout Randomization (ASLR)" > /etc/sysctl.d/aslr.conf
# echo "kernel.randomize_va_space=0" >> /etc/sysctl.d/aslr.conf
# echo "# Enable IPv4 Forwarding" > /etc/sysctl.d/ip_forward.conf
# echo "net.ipv4.ip_forward=1" >> /etc/sysctl.d/ip_forward.conf
# systemctl restart systemd-sysctl.service
# cat /proc/sys/kernel/randomize_va_space
0
# cat /proc/sys/net/ipv4/ip_forward
1


B.5 Setting Up Tests

B.5.1 OVS Throughput Tests (PHY-PHY without a VM)

Build OVS as described in the previous section, reboot the system, and check the kernel boot line for the 1 GB hugepage settings, the isolcpus setting for the target cores (1,2,3,4,5,6,7,8,9), and that the IOMMU is not enabled.

1. Install the OVS kernel module and check that it is present.

# modprobe openvswitch
# lsmod |grep open
openvswitch 70953 0
gre 13535 1 openvswitch
vxlan 37334 1 openvswitch
libcrc32c 12603 1 openvswitch

2. Install kernel UIO driver and DPDK UIO driver.

# modprobe uio
# insmod /usr/src/dpdk/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

3. Find target Ethernet interfaces.

# lspci -nn |grep Ethernet
03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
03:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
06:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
06:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)

4. Bind the interfaces to the UIO driver (the interfaces must not be configured; bring them down with ifconfig xxxx down if they are).

# /usr/src/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.0
# /usr/src/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.1
# /usr/src/dpdk/tools/dpdk_nic_bind.py --status

5. Mount hugepage file systems.

# mount -t hugetlbfs nodev /dev/hugepages
# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB
# cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
# echo 2048 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

6. Terminate OVS, clear any previous OVS or OVDK database, and set up for a new database.

# pkill -9 ovs
# rm -rf /usr/local/var/run/openvswitch/
# rm -rf /usr/local/etc/openvswitch/
# rm -f /tmp/conf.db

# mkdir -p /usr/local/etc/openvswitch
# mkdir -p /usr/local/var/run/openvswitch

7. Initialize new OVS database.

# cd /usr/src/ovs
# ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db ./vswitchd/vswitch.ovsschema

8. Start OVS database server.

# cd /usr/src/ovs
# ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach


9. Initialize OVS database.

# cd /usr/src/ovs
# ./utilities/ovs-vsctl --no-wait init

10. Start OVS with the DPDK portion using 1 GB of Node 0 memory.

# cd /usr/src/ovs
# ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 -- unix:/usr/local/var/run/openvswitch/db.sock --pidfile --detach

11. Create OVS DPDK Bridge and add the two physical NICs.

# cd /usr/src/ovs
# ./utilities/ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ./utilities/ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
# ./utilities/ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk

# ./utilities/ovs-vsctl show

12. Set the OVS port-to-port flows for a NIC 0 endpoint IP address of 1.1.1.1 and a NIC 1 endpoint IP address of 5.1.1.1 using a script.

#! /bin/sh
# Move to command directory
cd /usr/src/ovs/utilities/
# Clear current flows
./ovs-ofctl del-flows br0
# Add Flow for port 0 to port 1 and port 1 to port 0
./ovs-ofctl add-flow br0 in_port=1,dl_type=0x800,nw_src=1.1.1.1,nw_dst=5.1.1.1,idle_timeout=0,action=output:2
./ovs-ofctl add-flow br0 in_port=2,dl_type=0x800,nw_src=5.1.1.1,nw_dst=1.1.1.1,idle_timeout=0,action=output:1

# ./ovs_flow_ports.sh

13. Check the CPU load to verify that the OVS PMD is operating.

# top (1)
top - 18:29:24 up 37 min, 4 users, load average: 0.86, 0.72, 0.63
Tasks: 275 total, 1 running, 274 sleeping, 0 stopped, 0 zombie
%Cpu0 :  0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 :100.0 us, 0.0 sy, 0.0 ni,  0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 :  0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

14. Affinitize the DPDK PMD task to CPU core 2.

# cd /usr/src/ovs/utilities
# ./ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=2

Example of checking OVS thread affinitization:

# top -p `pidof ovs-vswitchd` -H -d1

… PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND1724 root 20 0 5035380 6100 4384 R 99.7 0.0 7:37.19 pmd621658 root 20 0 5035380 6100 4384 S 11.0 0.0 1:40.35 urcu21656 root 20 0 5035380 6100 4384 S 0.0 0.0 0:01.37 ovs-vswitchd1657 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 dpdk_watchdog11659 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 cuse_thread31693 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler601694 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler611695 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler391696 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler381697 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler371698 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler361699 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler331700 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler341701 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler351702 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler401703 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler461704 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler47


1705 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler481706 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler451707 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler501708 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler491709 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler411710 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler441711 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler421712 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 handler431713 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.02 revalidator551714 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.01 revalidator511715 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 revalidator561716 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 revalidator581717 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 revalidator521718 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.01 revalidator571719 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.00 revalidator541720 root 20 0 5035380 6100 4384 S 0.0 0.0 0:00.01 revalidator53

# taskset -p 1724pid 1724's current affinity mask: 2# taskset -p 1658pid 1658's current affinity mask: 4# taskset -p 1656pid 1656's current affinity mask: 4# taskset -p 1657pid 1657's current affinity mask: 4# taskset -p 1659pid 1659's current affinity mask: 4# taskset -p 1693pid 1693's current affinity mask: 4# taskset -p 1694pid 1694's current affinity mask: 4# taskset -p 1695pid 1695's current affinity mask: 4# taskset -p 1696pid 1696's current affinity mask: 4# taskset -p 1697pid 1697's current affinity mask: 4# taskset -p 1698pid 1698's current affinity mask: 4# taskset -p 1699pid 1699's current affinity mask: 4# taskset -p 1700pid 1700's current affinity mask: 4# taskset -p 1701pid 1701's current affinity mask: 4# taskset -p 1702pid 1702's current affinity mask: 4# taskset -p 1703pid 1703's current affinity mask: 4# taskset -p 1704pid 1704's current affinity mask: 4# taskset -p 1705pid 1705's current affinity mask: 4# taskset -p 1706pid 1706's current affinity mask: 4# taskset -p 1707pid 1707's current affinity mask: 4# taskset -p 1708pid 1708's current affinity mask: 4# taskset -p 1709pid 1709's current affinity mask: 4# taskset -p 1710pid 1710's current affinity mask: 4# taskset -p 1711pid 1711's current affinity mask: 4# taskset -p 1712pid 1712's current affinity mask: 4


# taskset -p 1713pid 1713's current affinity mask: 4# taskset -p 1714pid 1714's current affinity mask: 4# taskset -p 1715pid 1715's current affinity mask: 4# taskset -p 1716pid 1716's current affinity mask: 4# taskset -p 1717pid 1717's current affinity mask: 4# taskset -p 1718pid 1718's current affinity mask: 4# taskset -p 1719pid 1719's current affinity mask: 4# taskset -p 1720pid 1720's current affinity mask: 4

Note: OVS task affinitization behavior may change in future releases.

B.5.2 Intel DPDK Accelerated vSwitch Throughput Tests (PHY-PHY without a VM)

1. Start with a clean system.

# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# pkill -9 ovs
# rm -rf /usr/local/var/run/openvswitch/
# rm -rf /usr/local/etc/openvswitch/
# mkdir -p /usr/local/var/run/openvswitch/
# mkdir -p /usr/local/etc/openvswitch/
# rm -f /tmp/conf.db
# rmmod vhost-net
# rm -rf /dev/vhost-net
# rmmod ixgbe
# rmmod uio
# rmmod igb_uio
# umount /sys/fs/cgroup/hugetlb
# umount /dev/hugepages
# umount /mnt/huge

2. Mount hugepage filesystem on host.

# mount -t hugetlbfs nodev /dev/hugepages
# mount |grep huge

3. Install the DPDK kernel modules.

# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# modprobe uio
# insmod <DPDK_INSTALL_DIR>/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

4. Bind the physical interfaces to the igb_uio driver. For example, if the PCI addresses on the setup are 08:00.0 and 08:00.1, the commands below bind them to the DPDK driver. Replace the PCI addresses according to the setup.

# ./dpdk*/tools/dpdk_nic_bind.py --status
# ./dpdk*/tools/dpdk_nic_bind.py --bind=igb_uio 08:00.0
# ./dpdk*/tools/dpdk_nic_bind.py --bind=igb_uio 08:00.1

5. Create an Intel® DPDK Accelerated vSwitch database file.

# ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db <OVS_INSTALL_DIR>/openvswitch/vswitchd/vswitch.ovsschema


6. Run the ovsdb-server.

# ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options &

7. Create Intel® DPDK Accelerated vSwitch bridge, add physical interfaces (type: dpdkphy) to the bridge:

# ./utilities/ovs-vsctl --no-wait add-br br0 -- set Bridge br0 datapath_type=dpdk
# ./utilities/ovs-vsctl --no-wait add-port br0 port1 -- set Interface port1 type=dpdkphy ofport_request=1 option:port=0
# ./utilities/ovs-vsctl --no-wait add-port br0 port2 -- set Interface port2 type=dpdkphy ofport_request=2 option:port=1

The output should be:

----------------------------------------------
00000000-0000-0000-0000-000000000000
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "port1"
            Interface "port1"
                type: dpdkphy
                options: {port="0"}
        Port "port2"
            Interface "port2"
                type: dpdkphy
                options: {port="1"}
----------------------------------------------

8. Start the Intel® DPDK Accelerated vSwitch for the PHY-PHY throughput test.

# ./datapath/dpdk/ovs-dpdk -c 0x0F -n 4 --proc-type primary --socket-mem 4096 -- -p 0x03 --stats_core 0 --stats_int 5

Exit the process using CTRL-C. Copy the valid address on the system from the stdout output of the above command. For example, EAL: virtual area found at 0x7F1740000000 (size = 0x80000000) and use it for the base-virtaddr parameter of the following command:

# ./datapath/dpdk/ovs-dpdk -c 0x0F -n 4 --proc-type primary --base-virtaddr=0x7f9a40000000 --socket-mem 4096 -- -p 0x03 --stats_core 0 --stats_int 5

9. Run the vswitchd daemon.

# ./vswitchd/ovs-vswitchd -c 0x100 --proc-type=secondary -- --pidfile=/tmp/vswitchd.pid

10. Using the ovs-ofctl utility, add flows for bidirectional test. An example below shows how to add flows when the source and destination IPs are 1.1.1.1 and 6.6.6.2 for a bidirectional use case.

# ./utilities/ovs-ofctl del-flows br0
# ./utilities/ovs-ofctl add-flow br0 in_port=1,dl_type=0x0800,nw_src=1.1.1.1,nw_dst=6.6.6.2,idle_timeout=0,action=output:2
# ./utilities/ovs-ofctl add-flow br0 in_port=2,dl_type=0x0800,nw_src=6.6.6.2,nw_dst=1.1.1.1,idle_timeout=0,action=output:1
# ./utilities/ovs-ofctl dump-flows br0
# ./utilities/ovs-vsctl --no-wait -- set Open_vSwitch . other_config:n-handler-threads=1


B.5.3 Intel DPDK Accelerated vSwitch with User Space vHost

B.5.3.1 Host Configuration for Running a VM with User Space vHost

1. Start with a clean system.

# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# pkill -9 ovs
# rm -rf /usr/local/var/run/openvswitch/
# rm -rf /usr/local/etc/openvswitch/
# mkdir -p /usr/local/var/run/openvswitch/
# mkdir -p /usr/local/etc/openvswitch/
# rm -f /tmp/conf.db
# rmmod vhost-net
# rm -rf /dev/vhost-net
# rmmod ixgbe
# rmmod uio
# rmmod igb_uio
# umount /sys/fs/cgroup/hugetlb
# umount /dev/hugepages
# umount /mnt/huge

2. Mount hugepage filesystem on host.

# mount -t hugetlbfs nodev /dev/hugepages
# mount |grep huge

3. Install the DPDK, cuse and eventfd kernel modules.

# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# modprobe cuse
# insmod ./openvswitch/datapath/dpdk/fd_link/fd_link.ko
# modprobe uio
# insmod <DPDK_INSTALL_DIR>/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

4. Bind/Unbind to/from the igb_uio module.

# <DPDK_INSTALL_DIR>/tools/dpdk_nic_bind.py --status

Network devices using IGB_UIO driver
====================================
<none>

Network devices using kernel driver
===================================
0000:01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=p1p1 drv=ixgbe unused=igb_uio
0000:01:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=p1p2 drv=ixgbe unused=igb_uio
0000:08:00.0 'I350 Gigabit Network Connection' if=em0 drv=igb unused=igb_uio *Active*
0000:08:00.1 'I350 Gigabit Network Connection' if=enp8s0f1 drv=igb unused=igb_uio
0000:08:00.2 'I350 Gigabit Network Connection' if=enp8s0f2 drv=igb unused=igb_uio
0000:08:00.3 'I350 Gigabit Network Connection' if=enp8s0f3 drv=igb unused=igb_uio

Other network devices
=====================
<none>

To bind devices p1p1 and p1p2 (01:00.0 and 01:00.1) to the igb_uio driver:

# <DPDK_INSTALL_DIR>/tools/dpdk_nic_bind.py --bind=igb_uio 01:00.0
# <DPDK_INSTALL_DIR>/tools/dpdk_nic_bind.py --bind=igb_uio 01:00.1
# <DPDK_INSTALL_DIR>/tools/dpdk_nic_bind.py --status

Network devices using IGB_UIO driver

====================================
0000:01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv=igb_uio unused=
0000:01:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv=igb_uio unused=

Network devices using kernel driver


===================================
0000:08:00.0 'I350 Gigabit Network Connection' if=em0 drv=igb unused=igb_uio *Active*
0000:08:00.1 'I350 Gigabit Network Connection' if=enp8s0f1 drv=igb unused=igb_uio
0000:08:00.2 'I350 Gigabit Network Connection' if=enp8s0f2 drv=igb unused=igb_uio
0000:08:00.3 'I350 Gigabit Network Connection' if=enp8s0f3 drv=igb unused=igb_uio

Other network devices
=====================
<none>

5. Create an Intel® DPDK Accelerated vSwitch database file.

# ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db <OVS_INSTALL_DIR>/openvswitch/vswitchd/vswitch.ovsschema

6. Run the ovsdb-server.

# ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options &

7. Create Intel® DPDK Accelerated vSwitch bridge, add physical interfaces (type: dpdkphy) and vHost interfaces (type: dpdkvhost) to the bridge.

# ./utilities/ovs-vsctl --no-wait add-br br0 -- set Bridge br0 datapath_type=dpdk
# ./utilities/ovs-vsctl --no-wait add-port br0 port1 -- set Interface port1 type=dpdkphy ofport_request=1 option:port=0
# ./utilities/ovs-vsctl --no-wait add-port br0 port2 -- set Interface port2 type=dpdkphy ofport_request=2 option:port=1
# ./utilities/ovs-vsctl --no-wait add-port br0 port3 -- set Interface port3 type=dpdkvhost ofport_request=3
# ./utilities/ovs-vsctl --no-wait add-port br0 port4 -- set Interface port4 type=dpdkvhost ofport_request=4
# ./utilities/ovs-vsctl show

The output should be:

----------------------------------------------
00000000-0000-0000-0000-000000000000
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "port16"
            Interface "port1"
                type: dpdkphy
                options: {port="1"}
        Port "port17"
            Interface "port2"
                type: dpdkphy
                options: {port="2"}
        Port "port80"
            Interface "port3"
                type: dpdkvhost
        Port "port81"
            Interface "port4"
                type: dpdkvhost
----------------------------------------------

8. Start the Intel® DPDK Accelerated vSwitch for 1 VM with two user space vHost interfaces.

# ./datapath/dpdk/ovs-dpdk -c 0x0F -n 4 --proc-type primary --socket-mem 4096 -- -p 0x03 --stats_core 0 --stats_int 5


Exit the process using CTRL-C. Copy the valid address on the system from the stdout output of the above command. For example, EAL: virtual area found at 0x7F1740000000 (size = 0x80000000). Use it for the base-virtaddr parameter of the following command:

# ./datapath/dpdk/ovs-dpdk -c 0x0F -n 4 --proc-type primary --base-virtaddr=0x7f9a40000000 --socket-mem 4096 -- -p 0x03 --stats_core 0 --stats_int 5

9. Run the vswitchd daemon.

# ./vswitchd/ovs-vswitchd -c 0x100 --proc-type=secondary -- --pidfile=/tmp/vswitchd.pid

10. Using the ovs-ofctl utility, add flow entries to switch packets from port 2 to port 4 and from port 3 to port 1 on the ingress and egress paths. The example below shows how to add flows where the source and destination IPs are 1.1.1.1 and 6.6.6.2 for a bidirectional case.

# ./utilities/ovs-ofctl del-flows br0
# ./utilities/ovs-ofctl add-flow br0 in_port=2,dl_type=0x0800,nw_src=1.1.1.1,nw_dst=6.6.6.2,idle_timeout=0,action=output:4
# ./utilities/ovs-ofctl add-flow br0 in_port=3,dl_type=0x0800,nw_src=1.1.1.1,nw_dst=6.6.6.2,idle_timeout=0,action=output:1
# ./utilities/ovs-ofctl add-flow br0 in_port=4,dl_type=0x0800,nw_src=6.6.6.2,nw_dst=1.1.1.1,idle_timeout=0,action=output:2
# ./utilities/ovs-ofctl add-flow br0 in_port=1,dl_type=0x0800,nw_src=6.6.6.2,nw_dst=1.1.1.1,idle_timeout=0,action=output:3
# ./utilities/ovs-ofctl dump-flows br0

11. Copy DPDK source to a shared directory to be passed to the VM.

# rm -rf /tmp/qemu_share
# mkdir -p /tmp/qemu_share
# mkdir -p /tmp/qemu_share/DPDK
# chmod 777 /tmp/qemu_share
# cp -aL <DPDK_INSTALL_DIR>/* /tmp/qemu_share/DPDK

12. Start the VM using qemu command.

# cd <Intel_DPDK_Accelerated_vSwitch_INSTALL_DIR>
# taskset 0x30 ./qemu/x86_64-softmmu/qemu-system-x86_64 -cpu host -boot c -hda <VM_IMAGE_LOCATION> -m 4096 -smp 2 --enable-kvm -name 'client 1' -nographic -vnc :12 -pidfile /tmp/vm1.pid -drive file=fat:rw:/tmp/qemu_share -monitor unix:/tmp/vm1monitor,server,nowait -net none -no-reboot -mem-path /dev/hugepages -mem-prealloc -netdev type=tap,id=net1,script=no,downscript=no,ifname=port80,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port81,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off

The VM can be accessed using a VNC client at the port mentioned in the qemu startup command, which is port 12 in this instance.
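For example, a sketch of connecting from another machine (any VNC client works; <host_ip> is the address of the host running QEMU):

# vncviewer <host_ip>:12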

B.5.3.2 Guest Configuration to Run DPDK Sample Forwarding Application with User Space vHost

1. Install DPDK on the VM using the source from the shared drive.

# mkdir -p /mnt/vhost
# mkdir -p /root/vhost
# mount -o iocharset=utf8 /dev/sdb1 /mnt/vhost
# cp -a /mnt/vhost/* /root/vhost
# export RTE_SDK=/root/vhost/DPDK
# export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
# cd /root/vhost/DPDK
# make uninstall
# make install T=x86_64-ivshmem-linuxapp-gcc


2. Build the DPDK vHost application testpmd.

# cd /root/vhost/DPDK/app/test-pmd
# make clean
# make

3. Run the testpmd application after installing the UIO drivers.

# modprobe uio
# echo 1280 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
# insmod /root/vhost/DPDK/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko
# /root/vhost/DPDK/tools/dpdk_nic_bind.py --bind igb_uio 0000:00:03.0 0000:00:04.0
# ./testpmd -c 0x3 -n 4 --socket-mem 128 -- --burst=64 -i

At the prompt enter the following to start the application:

testpmd> set fwd mac_retry
testpmd> start

B.5.4 OVS with User Space vHost

Build OVS as described in the previous section, reboot the system, and check the kernel boot line for the 1 GB hugepage settings, the isolcpus setting for the target cores (1,2,3,4,5,6,7,8,9), and that the IOMMU is not enabled.

1. Check that DPDK is built correctly, and build OVS with DPDK and the eventfd module if not already built.

2. Check system boot for valid Hugepage and isolcpus settings.

# dmesg |grep command
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.15.6-200.fc20.x86_64 root=UUID=27da0816-bbbd-4b28-acaf-993939cfa758 ro default_hugepagesz=1GB hugepagesz=1GB hugepages=8 hugepagesz=2M hugepages=2048 isolcpus=1,2,3,4,5,6,7,8,9 vconsole.font=latarcyrheb-sun16 rhgb quiet

3. Remove vhost-net module and device directory.

# rmmod vhost-net
# rm /dev/vhost-net

4. Mount hugepage file systems.

# mount -t hugetlbfs nodev /dev/hugepages

# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB

# cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
# echo 2048 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

5. Install OVS kernel module.

# modprobe openvswitch
# lsmod |grep open
openvswitch 70953 0
gre 13535 1 openvswitch
vxlan 37334 1 openvswitch
libcrc32c 12603 1 openvswitch

6. Install cuse and fuse kernel modules.

# modprobe cuse
# modprobe fuse

7. Install eventfd kernel module.

# insmod /usr/src/ovs/utilities/eventfd_link/eventfd_link.ko

8. Install UIO and DPDK UIO kernel modules.

# modprobe uio
# insmod /usr/src/dpdk/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko


9. Find target Ethernet interfaces.

# lspci -nn |grep Ethernet
03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
03:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
06:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
06:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)

10. Bind the interfaces to the UIO driver (the interfaces must not be configured; bring them down with ifconfig xxxx down if they are).

# /usr/src/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.0
# /usr/src/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 06:00.1
# /usr/src/dpdk/tools/dpdk_nic_bind.py --status

11. Terminate OVS, clear previous OVS or OVDK database and setup for new database.

# pkill -9 ovs
# rm -rf /usr/local/var/run/openvswitch/
# rm -rf /usr/local/etc/openvswitch/
# rm -f /tmp/conf.db

# mkdir -p /usr/local/etc/openvswitch
# mkdir -p /usr/local/var/run/openvswitch

12. Initialize new OVS database.

# cd /usr/src/ovs
# ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db ./vswitchd/vswitch.ovsschema

13. Start OVS database server.

# cd /usr/src/ovs
# ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach

14. Initialize OVS database.

# cd /usr/src/ovs
# ./utilities/ovs-vsctl --no-wait init

15. Start OVS with the DPDK portion using 1 GB or 2 GB of Node 0 memory.

# cd /usr/src/ovs
# ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 2048,0 -- unix:/usr/local/var/run/openvswitch/db.sock --pidfile --detach

16. Set the DPDK PMD thread affinity (persistent if the OVS database is not cleared).

# cd /usr/src/ovs
# ./utilities/ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=2

17. Create the OVS DPDK bridge and add the two physical NICs and the user space vHost interfaces.

# cd /usr/src/ovs
# ./utilities/ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ./utilities/ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
# ./utilities/ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
# ./utilities/ovs-vsctl add-port br0 dpdkvhost0 -- set Interface dpdkvhost0 type=dpdkvhost
# ./utilities/ovs-vsctl add-port br0 dpdkvhost1 -- set Interface dpdkvhost1 type=dpdkvhost


The ovs-vswitchd must be running, and each port must be added in this order after the database has been cleared, so that dpdk0 becomes port 1, dpdk1 port 2, dpdkvhost0 port 3, and dpdkvhost1 port 4, which the flow example entries assume. If the database was not cleared, the previous port ID assignments remain and the port IDs are often not in the expected order. In that case the flow commands need to be adjusted, or OVS must be terminated, the OVS database cleared, OVS restarted, and the bridge created in the correct order. The port numbering can be verified as shown below.
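A quick way to confirm the OpenFlow port numbering before applying the flows (output abbreviated; each port line shows its number):

# cd /usr/src/ovs
# ./utilities/ovs-ofctl show br0 | grep addr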

18. Check OVS DPDK bridge

# ./utilities/ovs-vsctl show
c2669cc3-55ac-4f62-854c-b88bf5b66c10
    Bridge "br0"
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
        Port "dpdkvhost0"
            Interface "dpdkvhost0"
                type: dpdkvhost
        Port "dpdk1"
            Interface "dpdk1"
                type: dpdk
        Port "br0"
            Interface "br0"
                type: internal
        Port "dpdkvhost1"
            Interface "dpdkvhost1"
                type: dpdkvhost

Note: The UUID (c2669cc3-55ac-4f62-854c-b88bf5b66c10 in this instance) is different for each instance.

19. Apply the flow entries that switch traffic between the physical ports and the VM vHost ports.

# ./ovs_flow_vm_vhost.sh
# cat ovs_flow_vm_vhost.sh
#! /bin/sh
# Move to command directory
cd /usr/src/ovs/utilities/
# Clear current flows
./ovs-ofctl del-flows br0
# Add flows between the physical ports and the vHost ports
./ovs-ofctl add-flow br0 in_port=2,dl_type=0x800,nw_dst=5.1.1.1,idle_timeout=0,action=output:4
./ovs-ofctl add-flow br0 in_port=1,dl_type=0x800,nw_src=5.1.1.1,idle_timeout=0,action=output:3
./ovs-ofctl add-flow br0 in_port=4,dl_type=0x800,nw_src=5.1.1.1,idle_timeout=0,action=output:2
./ovs-ofctl add-flow br0 in_port=3,dl_type=0x800,nw_dst=5.1.1.1,idle_timeout=0,action=output:1

In this case, the port 2 endpoint has the fixed IP address 5.1.1.1, while the port 1 endpoint has multiple IP addresses starting at 1.1.1.1 and incrementing up to half the bidirectional flow count of the test. This allows a large number of flows to be covered by a small number of inexact (wildcard) flow entries. The nw_src and nw_dst matches can also be removed, resulting in the same forwarding behavior but without any IP matches in the inexact flow entries, as sketched below.
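A sketch of the same forwarding with the IP matches removed (equivalent behavior under the assumptions above):

./ovs-ofctl add-flow br0 in_port=2,dl_type=0x800,idle_timeout=0,action=output:4
./ovs-ofctl add-flow br0 in_port=1,dl_type=0x800,idle_timeout=0,action=output:3
./ovs-ofctl add-flow br0 in_port=4,dl_type=0x800,idle_timeout=0,action=output:2
./ovs-ofctl add-flow br0 in_port=3,dl_type=0x800,idle_timeout=0,action=output:1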


B.5.4.1 VM Startup for (OVS User Space vHost) Throughput Test

The following sample script manually starts a VM for the vHost throughput test, with one standard Linux network interface for control and test interaction and two user space vHost compatible network interfaces attached to DPDK bridge ports 3 and 4. This assumes the Linux network bridge “br-mgt” has been created on the management network. The VM is started with 3 GB of memory backed by 1 GB hugepages; three 1 GB pages is the minimum amount of 1 GB hugepage memory that leaves one 1 GB hugepage available for use inside the VM.

# cat vm_vhost_start.sh

#!/bin/sh
vm=/vm/Fed20-mp.qcow2
vm_name="vhost-test"

vnc=10

n1=tap46
bra=br-mgt
dn_scrp_a=/vm/vm_ctl/br-mgt-ifdown
mac1=00:1f:33:16:64:44

if [ ! -f $vm ]; then
    echo "VM $vm not found!"
else
    echo "VM $vm started! VNC: $vnc, management network: $n1"
    tunctl -t $n1
    brctl addif $bra $n1
    ifconfig $n1 0.0.0.0 up

    taskset 0x30 qemu-system-x86_64 -cpu host -hda $vm -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=$mac1 -netdev tap,ifname=$n1,id=eth0,script=no,downscript=$dn_scrp_a -netdev type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name $vm_name -vnc :$vnc &
fi

Shutdown support script:

# cat br-mgt-ifdown

#!/bin/sh
bridge='br-mgt'
/sbin/ifconfig $1 0.0.0.0 down
brctl delif ${bridge} $1


B.5.4.2 Operating VM for (OVS User Space vHost) Throughput Test

1. The VM kernel bootline needs to have 1 GB hugepage memory configured and vCPU 1 isolated. Check for correct bootline parameters, change if needed and reboot.

# dmesg |grep command
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.15.6-200.fc20.x86_64 root=UUID=6635b8e3-0312-4892-92e7-ccff453ec06d ro default_hugepagesz=1GB hugepagesz=1G hugepages=1 isolcpus=1 vconsole.font=latarcyrheb-sun16 rhgb quiet LANG=en_US.UTF-8

2. Get DPDK package (same as host).

# cd /root
# git clone git://dpdk.org/dpdk
# cd /root/dpdk

3. Check out the target DPDK version.

# git checkout -b test_v1.7.1 v1.7.1

4. CONFIG_RTE_BUILD_COMBINE_LIBS=y needs to be set in the config/common_linuxapp file. It can be changed using a text editor or by applying a patch, as discussed earlier in this document; a minimal edit is sketched below. Then build DPDK:
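A minimal sketch of making the change with sed (this assumes the option is present and set to n in the stock config file; a text editor works equally well):

# cd /root/dpdk
# sed -i 's/CONFIG_RTE_BUILD_COMBINE_LIBS=n/CONFIG_RTE_BUILD_COMBINE_LIBS=y/' config/common_linuxapp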

# make install T=x86_64-ivshmem-linuxapp-gcc

5. Build the testpmd application used for the vHost test.

# cd /root/dpdk
# export RTE_SDK=$(pwd)
# export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
# cd app/test-pmd/
# make
  CC testpmd.o
  CC parameters.o
  CC cmdline.o
  CC config.o
  CC iofwd.o
  CC macfwd.o
  CC macfwd-retry.o
  CC macswap.o
  CC flowgen.o
  CC rxonly.o
  CC txonly.o
  CC csumonly.o
  CC icmpecho.o
  CC mempool_anon.o
  LD testpmd
  INSTALL-APP testpmd
  INSTALL-MAP testpmd.map

6. Setup hugepage filesystem.

# mkdir -p /mnt/hugepages
# mount -t hugetlbfs hugetlbfs /mnt/hugepages

7. Load the UIO kernel modules.

# modprobe uio
# insmod /root/dpdk/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

8. Find the passed-through user space vHost network devices. They appear in the order listed in the QEMU startup command line: the first network device is the management network, and the second and third network devices are the user space vHost devices.

# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]


00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
00:05.0 Ethernet controller: Red Hat, Inc Virtio network device

9. Bind the user space vHost devices to the igb_uio device driver.

# /root/dpdk/tools/dpdk_nic_bind.py -b igb_uio 0000:00:04.0
# /root/dpdk/tools/dpdk_nic_bind.py -b igb_uio 0000:00:05.0

10. Check the bind status to make sure PCI devices 0000:00:04.0 and 0000:00:05.0 are bound to the igb_uio device driver.

# /root/dpdk/tools/dpdk_nic_bind.py --status

Network devices using DPDK-compatible driver
============================================
0000:00:04.0 'Virtio network device' drv=igb_uio unused=virtio_pci
0000:00:05.0 'Virtio network device' drv=igb_uio unused=virtio_pci

Network devices using kernel driver
===================================
0000:00:03.0 'Virtio network device' if= drv=virtio-pci unused=virtio_pci,igb_uio

Other network devices
=====================
<none>

11. Run test-pmd application.

# cd /root/dpdk/app/test-pmd/# ./testpmd -c 0x3 -n 4 --socket-mem 128 -- --burst=64 -i --txqflags=0xf00EAL: Detected lcore 0 as core 0 on socket 0EAL: Detected lcore 1 as core 0 on socket 0EAL: Support maximum 64 logical core(s) by configuration.EAL: Detected 2 lcore(s)EAL: unsupported IOMMU type!EAL: VFIO support could not be initializedEAL: Searching for IVSHMEM devices...EAL: No IVSHMEM configuration found!EAL: Setting up memory...EAL: Ask a virtual area of 0x40000000 bytesEAL: Virtual area found at 0x7fc980000000 (size = 0x40000000)EAL: Requesting 1 pages of size 1024MB from socket 0EAL: TSC frequency is ~2593994 KHzEAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !EAL: Master core 0 is ready (tid=7f723880)EAL: Core 1 is ready (tid=7eee5700)EAL: PCI device 0000:00:03.0 on NUMA socket -1EAL: probe driver: 1af4:1000 rte_virtio_pmdEAL: 0000:00:03.0 not managed by UIO driver, skippingEAL: PCI device 0000:00:04.0 on NUMA socket -1EAL: probe driver: 1af4:1000 rte_virtio_pmdEAL: PCI memory mapped at 0x7fca7f72c000EAL: PCI device 0000:00:05.0 on NUMA socket -1EAL: probe driver: 1af4:1000 rte_virtio_pmdEAL: PCI memory mapped at 0x7fca7f72b000Interactive-mode selectedConfiguring Port 0 (socket 0)Port 0: 00:00:00:00:00:01Configuring Port 1 (socket 0)Port 1: 00:00:00:00:00:02Checking link statuses...Port 0 Link Up - speed 10000 Mbps - full-duplexPort 1 Link Up - speed 10000 Mbps - full-duplexDonetestpmd>


12. Enter the set fwd mac_retry command to set up the forwarding operation.

testpmd> set fwd mac_retry
Set mac_retry packet forwarding mode

13. Start forwarding operation.

testpmd> start
  mac_retry packet forwarding - CRC stripping disabled - packets/burst=64
  nb forwarding cores=1 - nb forwarding ports=2
  RX queues=1 - RX desc=128 - RX free threshold=0
  RX threshold registers: pthresh=8 hthresh=8 wthresh=0
  TX queues=1 - TX desc=512 - TX free threshold=0
  TX threshold registers: pthresh=32 hthresh=0 wthresh=0
  TX RS bit threshold=0 - TXQ flags=0xf00
testpmd>

14. Back on the host, affinitize the vCPU to an available target CPU core. Review the QEMU tasks.

# ps -eLF|grep qemuroot 1658 1 1658 0 4 948308 37956 4 13:29 pts/2 00:00:01 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -netdev type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name vhost-test -vnc :10root 1658 1 1665 1 4 948308 37956 4 13:29 pts/2 00:00:20 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -netdev type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name vhost-test -vnc :10root 1658 1 1666 89 4 948308 37956 4 13:29 pts/2 00:18:59 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -netdev type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name vhost-test -vnc :10root 1658 1 1668 0 4 948308 37956 4 13:29 pts/2 00:00:00 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 2 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -netdev


type=tap,id=net1,script=no,downscript=no,ifname=port3,vhost=on -device virtio-net-pci,netdev=net1,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -netdev type=tap,id=net2,script=no,downscript=no,ifname=port4,vhost=on -device virtio-net-pci,netdev=net2,mac=00:00:00:00:00:02,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -name vhost-test -vnc :10

15. In this case, we see that task 1666 is accumulating CPU run time. We change the affinitization to the target core.

# taskset -p 1666
pid 1666's current affinity mask: 30
# taskset -p 40 1666
pid 1666's current affinity mask: 30
pid 1666's new affinity mask: 40

16. Check CPU core load to make sure real-time task running on correct CPU.

# top (1)top - 13:51:26 up 1:37, 3 users, load average: 2.16, 2.12, 1.93Tasks: 272 total, 1 running, 271 sleeping, 0 stopped, 0 zombie%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu1 : 96.7 us, 3.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu2 : 4.5 us, 7.6 sy, 0.0 ni, 87.6 id, 0.0 wa, 0.3 hi, 0.0 si, 0.0 st%Cpu3 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu4 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu5 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu6 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu7 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu8 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu9 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu10 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu11 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu12 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu13 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu14 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu15 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu16 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu17 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu18 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu19 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu20 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu21 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu22 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu23 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu24 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu25 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu26 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st%Cpu27 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 stKiB Mem: 32813400 total, 17556504 used, 15256896 free, 18756 buffersKiB Swap: 8191996 total, 0 used, 8191996 free, 301248 cached

  PID USER  PR  NI    VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
 1581 root  20   0 10.802g   6268  4412 S 116.3  0.0 31:02.42 ovs-vswitchd
 1658 root  20   0 3793232  37956 20400 S 100.3  0.1 20:08.86 qemu-system-x86
    1 root  20   0   47528   5080  3620 S   0.0  0.0  0:02.72 systemd
    2 root  20   0       0      0     0 S   0.0  0.0  0:00.00 kthreadd
    3 root  20   0       0      0     0 S   0.0  0.0  0:00.01 ksoftirqd/0
    4 root  20   0       0      0     0 S   0.0  0.0  0:02.23 kworker/0:0
    5 root   0 -20       0      0     0 S   0.0  0.0  0:00.00 kworker/0:0H
    6 root  20   0       0      0     0 S   0.0  0.0  0:00.00 kworker/u288:0


17. Set the OVS flows.

# ./ovs_flow_vm_vhost.sh

# cat ovs_flow_vm_vhost.sh
#! /bin/sh
# Move to command directory
cd /usr/src/ovs/utilities/
# Clear current flows
./ovs-ofctl del-flows br0
# Add flows between the physical ports and the vHost ports
./ovs-ofctl add-flow br0 in_port=1,dl_type=0x800,nw_dst=5.1.1.1,idle_timeout=0,action=output:3
./ovs-ofctl add-flow br0 in_port=2,dl_type=0x800,nw_src=5.1.1.1,idle_timeout=0,action=output:4
./ovs-ofctl add-flow br0 in_port=3,dl_type=0x800,nw_src=5.1.1.1,idle_timeout=0,action=output:1
./ovs-ofctl add-flow br0 in_port=4,dl_type=0x800,nw_dst=5.1.1.1,idle_timeout=0,action=output:2

B.5.5 SR-IOV VM Test

The SR-IOV test passes Niantic 10 GbE SR-IOV Virtual Function interfaces to a VM to run a network throughput test of the VM using the SR-IOV network interfaces.

1. Make sure the host has boot line configuration for 1 GB hugepage memory, CPU isolation, and the Intel IOMMU is on and running in PT mode.

# dmesg |grep command
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.15.6-200.fc20.x86_64 root=UUID=27da0816-bbbd-4b28-acaf-993939cfa758 ro default_hugepagesz=1GB hugepagesz=1GB hugepages=16 hugepagesz=2M hugepages=2048 isolcpus=1,2,3,4,5,6,7,8,9 iommu=pt intel_iommu=on vconsole.font=latarcyrheb-sun16 rhgb quiet

2. Initialize the hugepage file systems.

# mount -t hugetlbfs nodev /dev/hugepages
# mkdir /dev/hugepages_2mb
# mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB

3. Blacklist the ixgbe virtual function device driver, by adding it to /etc/modprobe.d/blacklist.conf, to prevent the host from loading it and configuring the VF NICs.

# cat /etc/modprobe.d/blacklist.conf
. . .
# Intel ixgbe sr-iov vf (virtual driver)
blacklist ixgbevf

4. Unload and reload the ixgbe 10 GbE Niantic NIC device driver to enable SR-IOV and create the SR-IOV interfaces.

# modprobe -r ixgbe
# modprobe ixgbe max_vfs=1
# service network restart

In this case only one SR-IOV VF (Virtual Function) was created per PF (Physical Function) using max_vfs=1, but often more than one is desired, as sketched below. The network needs to be restarted afterward.
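For example, a sketch of creating two VFs per PF instead (the max_vfs value is illustrative):

# modprobe -r ixgbe
# modprobe ixgbe max_vfs=2
# service network restart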


5. Bring up and configure the Niantic PF interfaces to allow the SR-IOV interfaces to be used. Leaving the PF unconfigured makes the SR-IOV interfaces unusable for the test. In this example the Niantic 10 GbE interfaces have the default labels p786p1 and p786p2.

# ifconfig p786p1 11.1.1.225/24 up
# ifconfig p786p2 12.1.1.225/24 up

6. Find the target SR-IOV interfaces to use.

# lspci -nn |grep Ethernet
03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
03:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
03:10.0 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
03:10.1 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
06:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
06:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
06:10.0 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
06:10.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)

The two Niantic (82599) Virtual Functions, one for each physical device, are PCI devices 06:10.0 and 06:10.1 with PCI device type [8086:10ed].

7. Make sure the pci-stub device driver is loaded, since it is used to transfer physical devices to VMs. It was observed that pci-stub is loaded by default on Fedora 20 but not on Ubuntu 14.04. The following checks that pci-stub is present:

# ls /sys/bus/pci/drivers |grep pci-stub
pci-stub

If not present, load it:

# modprobe pci-stub

8. The Niantic (82599) Virtual Function interfaces are the target interfaces which have a type ID of 8086:10ed. Load the VF devices into pci-stub.

# echo "8086 10ed" > /sys/bus/pci/drivers/pci-stub/new_id

9. List the devices owned by the pci-stub driver.

# ls /sys/bus/pci/drivers/pci-stub
0000:06:10.0  0000:06:10.1  bind  new_id  remove_id  uevent  unbind

10. If the target Virtual Function devices are not listed, they are probably attached to an IXGBEVF driver on the host. The following might be used in that case.

# echo 0000:06:10.0 > /sys/bus/pci/devices/0000:06:10.0/driver/unbind
# echo 0000:06:10.1 > /sys/bus/pci/devices/0000:06:10.1/driver/unbind
# echo "8086 10ed" > /sys/bus/pci/drivers/pci-stub/new_id

11. Verify devices present for transfer to VM.

# ls /sys/bus/pci/drivers/pci-stub
0000:06:10.0  0000:06:10.1  bind  new_id  remove_id  uevent  unbind


B.5.5.1 SR-IOV VM Startup

The VM is started with the SR-IOV physical devices passed to the VM on the QEMU command line. The VM is started with 3 GB of preallocated 1 GB hugepage memory (a minimum of 3 GB for one 1 GB page to be available in the VM) from /dev/hugepages, using 3 vCPUs of host type, with the first network interface device being virtio used for the bridged management network, the second network device being SR-IOV 06:10.0, and the third being SR-IOV 06:10.1. The following script was used to start the VM for the test:

# cat vm_sr-iov_start.sh
#!/bin/sh
vm=/vm/Fed20-mp.qcow2
vm_name="SRIOVtest"

vnc=10

n1=tap46
bra=br-mgt
dn_scrp_a=/vm/vm_ctl/br-mgt-ifdown
mac1=00:1f:33:16:64:44

if [ ! -f $vm ]; then
    echo "VM $vm not found!"
else
    echo "VM $vm started! VNC: $vnc, management network: $n1"
    tunctl -t $n1
    brctl addif $bra $n1
    ifconfig $n1 0.0.0.0 up

    taskset 0x70 qemu-system-x86_64 -cpu host -hda $vm -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=$mac1 -netdev tap,ifname=$n1,id=eth0,script=no,downscript=$dn_scrp_a -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name $vm_name -vnc :$vnc &
fi

Script for VM shutdown support:

# cat br-mgt-ifdown
#!/bin/sh

bridge='br-mgt'
/sbin/ifconfig $1 0.0.0.0 down
brctl delif ${bridge} $1

Start the VM:

# ./vm_sr-iov_start.sh
VM /vm/Fed20-mp.qcow2 started! VNC: 10, management network: tap46
Set 'tap46' persistent and owned by uid 0

B.5.5.2 Operating VM for SR-IOV Test

In the VM, the first network is configured for management network access and used for test control.

1. Verify that the kernel has bootline parameters for 1 GB hugepage support with one 1 GB page and that vCPU 1 and vCPU 2 have been isolated from the VM task scheduler.

# dmesg |grep command
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.15.6-200.fc20.x86_64 root=UUID=6635b8e3-0312-4892-92e7-ccff453ec06d ro default_hugepagesz=1GB hugepagesz=1G hugepages=1 isolcpus=1,2 vconsole.font=latarcyrheb-sun16 rhgb quiet


If not configured, edit the grub defaults, generate a new grub file, and reboot the VM.

# vim /etc/default/grub // edit file

. . .
GRUB_CMDLINE_LINUX="default_hugepagesz=1GB hugepagesz=1G hugepages=1 isolcpus=1,2 ..."
. . .

# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.15.6-200.fc20.x86_64
Found initrd image: /boot/initramfs-3.15.6-200.fc20.x86_64.img
Found linux image: /boot/vmlinuz-3.11.10-301.fc20.x86_64
Found initrd image: /boot/initramfs-3.11.10-301.fc20.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-92606197a3dd4d3e8a2b95312bca842f
Found initrd image: /boot/initramfs-0-rescue-92606197a3dd4d3e8a2b95312bca842f.img
Done

# reboot

2. Verify 1 GB memory page available.

# cat /proc/meminfo
...
HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
DirectMap4k:       45048 kB
DirectMap2M:     2052096 kB
DirectMap1G:     1048576 kB

3. Check that the CPU is host type and 3 vCPUs are available.

# cat /proc/cpuinfo... processor : 2vendor_id : GenuineIntelcpu family : 6model : 63model name : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHzstepping : 2microcode : 0x1cpu MHz : 2593.992cache size : 4096 KBphysical id : 2siblings : 1core id : 0cpu cores : 1apicid : 2initial apicid : 2fpu : yesfpu_exception : yescpuid level : 13wp : yesflags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm xsaveopt fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcidbogomips : 5187.98clflush size : 64cache_alignment : 64address sizes : 40 bits physical, 48 bits virtualpower management:

4. Install DPDK.

# cd /root
# git clone git://dpdk.org/dpdk
# cd /root/dpdk
# git checkout -b test_v1.7.1 v1.7.1
# make install T=x86_64-ivshmem-linuxapp-gcc

5. Build l3fwd example program.

# cd examples/l3fwd
# export RTE_SDK=/root/dpdk
# export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
# make
  CC main.o
  LD l3fwd
  INSTALL-APP l3fwd
  INSTALL-MAP l3fwd.map

6. Mount the hugepage file system.

# mount -t hugetlbfs nodev /dev/hugepages

7. Install UIO driver in VM.

# modprobe uio
# insmod /root/dpdk/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko

8. Find the SR-IOV devices.

# lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02)
00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] [8086:7000]
00:01.1 IDE interface [0101]: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] [8086:7010]
00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI [8086:7113] (rev 03)
00:02.0 VGA compatible controller [0300]: Cirrus Logic GD 5446 [1013:00b8]
00:03.0 Ethernet controller [0200]: Red Hat, Inc Virtio network device [1af4:1000]
00:04.0 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
00:05.0 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function

The networks appear in the order of the QEMU command line. The first network interface, 00:03.0, is the bridged control network. The second network interface, 00:04.0, is the first SR-IOV PCI device passed on the QEMU command line, and the third network interface, 00:05.0, is the second SR-IOV PCI device passed.

9. Bind the SR-IOV devices to the igb_uio device driver.

# /root/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 00:04.0
# /root/dpdk/tools/dpdk_nic_bind.py --bind=igb_uio 00:05.0

10. Run the l3fwd example program.

# cd /root/dpdk/examples/l3fwd
# ./build/l3fwd -c 6 -n 4 --socket-mem 1024 -- -p 0x3 --config="(0,0,1),(1,0,2)" &

The -c 6 and --config parameters affinitize the l3fwd task to the correct vCPUs in the VM. However, because the CPU cores on the host are isolated and the QEMU task was started on isolated cores, both vCPUs run on the same host CPU core and the real-time l3fwd task thrashes. The CPU affinitization must therefore be set correctly on the host, or network performance will be poor.

11. Locate the vCPU QEMU thread IDs on the host.

# ps -eLF |grep qemu
root 1718 1 1718 0 6 982701 44368 4 17:41 pts/0 00:00:05 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10

root 1718 1 1720 2 6 982701 44368 4 17:41 pts/0 00:00:32 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10

root 1718 1 1721 30 6 982701 44368 4 17:41 pts/0 00:06:07 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10

root 1718 1 1722 30 6 982701 44368 4 17:41 pts/0 00:06:06 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10

root 1718 1 1724 0 6 982701 44368 4 17:41 pts/0 00:00:00 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10

root 1718 1 1815 0 6 982701 44368 4 18:01 pts/0 00:00:00 qemu-system-x86_64 -cpu host -hda /vm/Fed20-mp.qcow2 -m 3072 -boot c -smp 3 -pidfile /tmp/vm1.pid -monitor unix:/tmp/vm1monitor,server,nowait -mem-path /dev/hugepages -mem-prealloc -enable-kvm -net nic,model=virtio,netdev=eth0,macaddr=00:1f:33:16:64:44 -netdev tap,ifname=tap46,id=eth0,script=no,downscript=/vm/vm_ctl/br-mgt-ifdown -device pci-assign,host=06:10.0 -device pci-assign,host=06:10.1 -name SRIOVtest -vnc :10

We see that QEMU thread task IDs 1721 and 1722 are accumulating process time (6:07 and 6:06 respectively), compared to the other low-usage threads. We know that the two 100% load DPDK threads are running on vCPU1 and vCPU2 respectively, so this indicates that vCPU1 is task ID 1721 and vCPU2 is task ID 1722. Since QEMU creates vCPU task threads sequentially, vCPU0 must be QEMU thread task 1720.
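As a quicker check, the per-thread IDs, the core each thread last ran on, and the accumulated CPU time can be listed for just the QEMU process. This is a sketch, assuming the pidfile /tmp/vm1.pid written by the start script above; tid, psr, time, and comm are standard ps output format specifiers:

# ps -Lo tid,psr,time,comm -p $(cat /tmp/vm1.pid)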

12. With CPU cores 8 and 9 available on this processor, we move vCPU1 to CPU core 8 and vCPU2 to CPU core 9 for the test.

# taskset -p 100 1721
pid 1721's current affinity mask: 70
pid 1721's new affinity mask: 100

# taskset -p 200 1722
pid 1722's current affinity mask: 70
pid 1722's new affinity mask: 200

13. Using top, we verify that tasks are running on target CPU cores.

# top (1)
top - 18:03:21 up 35 min, 2 users, load average: 1.93, 1.89, 1.24
Tasks: 279 total, 1 running, 278 sleeping, 0 stopped, 0 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

%Cpu4  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu13 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu14 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu15 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu16 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu17 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu18 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu19 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu20 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu21 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu22 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu23 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu24 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu25 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu26 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu27 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  32879352 total, 21912988 used, 10966364 free,    16680 buffers
KiB Swap:  8191996 total,        0 used,  8191996 free,   519040 cached

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1718 root      20   0 3930804  44368  20712 S 199.9  0.1  15:10.12 qemu-system-x86
  218 root      20   0       0      0      0 S   0.3  0.0   0:00.31 kworker/10:1
    1 root      20   0   47536   5288   3744 S   0.0  0.0   0:02.94 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd

This shows that the DPDK l3fwd CPU 100% loads are on the target CPU cores 8 and 9.

14. Check that the loads are on the correct vCPUs in the VM.

# top (1)
top - 15:03:45 up 22 min, 3 users, load average: 2.00, 1.88, 1.24
Tasks: 86 total, 2 running, 84 sleeping, 0 stopped, 0 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 99.7 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.3 hi,  0.0 si,  0.0 st
KiB Mem:  3082324 total, 1465384 used, 1616940 free,   23304 buffers
KiB Swap: 2097148 total,       0 used, 2097148 free,  298540 cached

  PID USER      PR  NI    VIRT   RES   SHR S  %CPU %MEM    TIME+ COMMAND
 7388 root      20   0 1681192  3144  2884 R 200.1  0.1  2:11.14 l3fwd
 7221 root      20   0       0     0     0 S   0.3  0.0  0:00.14 kworker/0:0
    1 root      20   0   47496  5052  3692 S   0.0  0.2  0:00.33 systemd
    2 root      20   0       0     0     0 S   0.0  0.0  0:00.00 kthreadd

In the VM, the top utility shows that the DPDK l3fwd task loads are on vCPU1 and vCPU2. Before the host affinitization was corrected, both vCPU loads ran on the same host core; because CPU isolation disables load scheduling on that core, top did not show user task loads close to 100% in the VM. The VM is now ready for SR-IOV throughput tests.

In a production system, the VM startup and affinitization would be done in a different manner to avoid disrupting any real-time processes already running on the host or in other VMs.

B.5.6 Affinitization and Performance Tuning

To maximize network throughput, individual cores must be affinitized to particular tasks. This can be achieved by using the taskset command on the host, by passing a core mask parameter to the application in the VM, or both.

The VM starts with two cores: vCPU0 and vCPU1. The Linux operating system and related tasks must use only vCPU0. vCPU1 is reserved to run the DPDK processes.
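One way to reserve vCPU1 inside the guest is the same grub procedure used for the SR-IOV VM in Appendix B.5.5.2, isolating only vCPU1. This is a sketch; the existing GRUB_CMDLINE_LINUX contents of the guest image are represented by "..." and must be preserved:

# vim /etc/default/grub
GRUB_CMDLINE_LINUX="... isolcpus=1"
# grub2-mkconfig -o /boot/grub2/grub.cfg
# reboot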

B.5.6.1 Affinitization Using the Core Mask Parameter in the qemu and test-pmd Startup Commands

The qemu and test-pmd startup commands offer a core mask parameter that can be set with a hex mask to ensure the tasks use specific cores.

Use the core mask -c 0x1 for the test-pmd command. This ensures that the DPDK application in the VM uses vCPU1. Verify with the top command that vCPU1 is used at 100%.

However, with the qemu command, even though the core mask is set to use two host cores for the VM's vCPU0 and vCPU1, QEMU places the vCPU0 and vCPU1 tasks on a single host core, usually the first core of the specified core mask. Hence, the qemu task needs to be re-affinitized.

B.5.6.2 Affinitized Host Cores for VMs vCPU0 and vCPU1

Use the taskset command to pin specific processes to a core.

# taskset -p <core_mask> <pid>

Ensure that the VM's vCPU0 and vCPU1 are assigned to two separate host cores. For example:

• Intel® DPDK Accelerated vSwitch uses cores 0, 1, 2 and 3 (-c 0x0F)

• QEMU task for VM's vCPU0 uses core 4 (-c 0x30)

• QEMU task for VM's vCPU1 uses core 5 (-c 0x30; taskset -p 20 <pid_vcpu1>)
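Applied to the example above, the host-side re-affinitization could look as follows. This is a sketch: <pid_vcpu0> and <pid_vcpu1> stand for the QEMU vCPU thread IDs located with ps -eLF as in Appendix B.5.5.2, and the masks 0x10 and 0x20 select cores 4 and 5 respectively.

# taskset -p 10 <pid_vcpu0>     (vCPU0 thread to core 4, mask 0x10)
# taskset -p 20 <pid_vcpu1>     (vCPU1 thread to core 5, mask 0x20)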

B.6 PCI Passthrough with Intel® QuickAssist

When setting up a QAT PCI device for passthrough to a VM, make sure that VT-d is enabled in the BIOS and that "intel_iommu=on iommu=pt" is present in the grub.cfg file so the OS boots with the IOMMU enabled. The VM accesses the QAT PCI device through PCI passthrough. Make sure the host has two QAT cards, since an IPSec tunnel is set up between two VMs, with each VM using QAT to accelerate the tunnel. For networking, the VMs use standard Open vSwitch with the virtio/standard vhost I/O virtualization method.
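A minimal sketch of that host kernel command-line change, following the same grub workflow used earlier in this appendix (the existing GRUB_CMDLINE_LINUX contents are represented by "..." and the grub.cfg path may differ per distribution):

# vim /etc/default/grub
GRUB_CMDLINE_LINUX="... intel_iommu=on iommu=pt"
# grub2-mkconfig -o /boot/grub2/grub.cfg
# reboot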

1. Run the following command to verify that the host has two QAT devices provisioned in it.

# lspci -nn |grep 043
0c:00.0 Co-processor [0b40]: Intel Corporation Coleto Creek PCIe Endpoint [8086:0435]
85:00.0 Co-processor [0b40]: Intel Corporation Coleto Creek PCIe Endpoint [8086:0435]

2. Use the following commands to detach PCI devices from the host.

# echo 8086 0435 > /sys/bus/pci/drivers/pci-stub/new_id
# echo 0000:85:00.0 > /sys/bus/pci/devices/0000\:85\:00.0/driver/unbind
# echo 0000:0c:00.0 > /sys/bus/pci/devices/0000\:0c\:00.0/driver/unbind

Note: You may need to use the specific PCI bus ID per your system setup.

3. After detaching the acceleration complex from the host operating system, bind the appropriate bus/device/function to the pci-stub driver.

# echo 0000:85:00.0 > /sys/bus/pci/drivers/pci-stub/bind
# echo 0000:0c:00.0 > /sys/bus/pci/drivers/pci-stub/bind

4. Verify that the devices are bound to pci-stub.

# lspci -vv |grep pci-stub

5. On a separate compute node that uses standard Open vSwitch for networking, add an OVS bridge called br0. The tap devices tap1, tap2, tap3, and tap4 are used as data network vNICs for the two VMs. Each of the two 10 GbE ports on the host is bridged to the OVS bridge br0 as follows:

# ovs-vsctl add-br br0
# tunctl -t tap1
# tunctl -t tap2
# tunctl -t tap3
# tunctl -t tap4
# ovs-vsctl add-port br0 tap1
# ovs-vsctl add-port br0 tap2
# ovs-vsctl add-port br0 tap3
# ovs-vsctl add-port br0 tap4
# ovs-vsctl add-port br0 p786p1
# ovs-vsctl add-port br0 p786p2

6. Bring up the tap devices and ports added to the bridge.

# ip link set br0 up
# ip link set tap1 up
# ip link set tap2 up
# ip link set tap3 up
# ip link set tap4 up
# ip link set p786p1 up
# ip link set p786p2 up

7. Disable Linux Kernel forwarding on the host.

# echo 0 > /proc/sys/net/ipv4/ip_forward

8. The two VMs can now be started with the QAT devices as PCI passthrough. Following are the qemu commands to pass 85:00.0 and 0c:00.0 to the VMs:

# qemu-system-x86_64 -cpu host -enable-kvm -hda <VM1_image_path> -m 8192 -smp 4 -net nic,model=virtio,netdev=eth0,macaddr=00:00:00:00:00:01 -netdev tap,ifname=tap1,id=eth0,vhost=on,script=no,downscript=no -net nic,model=virtio,netdev=eth1,macaddr=00:00:00:00:00:02 -netdev tap,ifname=tap2,id=eth1,vhost=on,script=no,downscript=no -vnc :15 -name vm1 -device pci-assign,host=85:00.0 &

# qemu-system-x86_64 -cpu host -enable-kvm -hda <VM2_image_path> -m 8192 -smp 4 -net nic,model=virtio,netdev=eth0,macaddr=00:00:00:00:00:03 -netdev tap,ifname=tap3,id=eth0,vhost=on,script=no,downscript=no -net nic,model=virtio,netdev=eth1,macaddr=00:00:00:00:00:04 -netdev tap,ifname=tap4,id=eth1,vhost=on,script=no,downscript=no -vnc :16 -name vm2 -device pci-assign,host=0c:00.0 &

Refer to Appendix B.6.1 to set up a VM with the QAT drivers, the netkeyshim module, and the strongSwan IPSec application.

B.6.1 VM Installation

B.6.1.1 Creating VM Image and VM Configuration

Refer to Appendix B.4.6.

B.6.1.2 Verifying Pass-through

Once the guest starts, run the following command within guest:

# lspci -nn

Pass-through PCI devices should appear with the same description as the host originally showed. For example, if the following was shown on the host:

0c:00.0 Co-processor [0b40]: Intel Corporation Coleto Creek PCIe Endpoint [8086:0435]

It should show up on the guest as:

Ethernet controller [0200]: Intel Corporation Device [8086:0435]

B.6.1.3 Installing Intel® Communications Chipset Software in KVM Guest

The instructions in this solutions guide assume that you have super user privileges. The QAT build directory used in this section is /QAT.

# su
# mkdir /QAT
# cd /QAT

1. Transfer the tarball to the /QAT directory using any preferred method (for example, USB memory stick, CD-ROM, or network transfer), and unpack it:

# tar -zxof <QAT_tarball_name>

2. Launch the script using the following command:

# ./installer.sh

3. Choose option 2 to build and install the acceleration software. Choose option 6 to build the sample LKCF code.

4. Start/Stop acceleration software.

# service qat_service start/stop

5. Set up the environment to install the Linux kernel crypto framework driver.

# export ICP_ROOT=/QAT
# export KERNEL_SOURCE_ROOT=/usr/src/kernels/`uname -r`

6. Unpack the Linux kernel crypto driver.

# mkdir -p $ICP_ROOT/quickassist/shims/netkey
# cd $ICP_ROOT/quickassist/shims/netkey
# tar xzof <path_to>/icp_qat_netkey.L.<version>.tar.gz

7. Build the Linux kernel crypto driver.

# cd $ICP_ROOT/quickassist/shims/netkey/icp_netkey
# make

8. Install the Linux kernel crypto driver.

# cd $ICP_ROOT/quickassist/shims/netkey/icp_netkey
# insmod ./icp_qat_netkey.ko

9. Verify that the module has been installed.

# lsmod | grep icp

The following is the expected output:

icp_qat_netkey         21868  0
icp_qa_al            1551435  2 icp_qat_netkey

B.6.2 Installing strongSwan IPSec Software

Install strongSwan IPsec software on both VM1 and VM2.

1. Download the original strongswan-4.5.3 software package from the following link:

http://download.strongswan.org/strongswan-4.5.3.tar.gz

2. Navigate to the shims directory.

# cd $ICP_ROOT/quickassist/shims

3. Extract the source files from the strongSwan software package to the shims directory.

# tar xzof <path_to>/strongswan-4.5.3.tar.gz

4. Navigate to the strongSwan-4.5.3 directory.

# cd $ICP_ROOT/quickassist/shims/strongswan-4.5.3

5. Configure strongSwan using the following command:

# ./configure --prefix=/usr --sysconfdir=/etc

6. Build strongSwan using the following command:

# make

7. Install strongSwan using the following command:

# make install

Repeat the installation of strongSwan IPsec software on the other VM.

B.6.3 Configuring strongSwan IPsec Software

1. Add the following line to the configuration file /etc/strongswan.conf on the VM1 and VM2 platforms after the charon { line:

load = curl aes des sha1 sha2 md5 pem pkcs1 gmp random x509 revocation hmac xcbc stroke kernel-netlink socket-raw updown

After adding the line, the section looks like:

# strongswan.conf - strongSwan configuration file
charon {
        load = curl aes des sha1 sha2 md5 pem pkcs1 gmp random x509 revocation hmac xcbc stroke kernel-netlink socket-raw updown
        # number of worker threads in charon
        threads = 16

Note: When the /etc/strongswan.conf text files are created, the line that starts with load and ends with updown must be on the same line, despite the appearance of it being on two separate lines in this documentation.
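If hand-editing risks wrapping the line, the entry can instead be appended programmatically so that it stays on one line. This is a sketch assuming GNU sed and the default /etc/strongswan.conf location:

# sed -i '/^charon {/a load = curl aes des sha1 sha2 md5 pem pkcs1 gmp random x509 revocation hmac xcbc stroke kernel-netlink socket-raw updown' /etc/strongswan.conf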

2. Update the strongSwan configuration files on the VM1 platform:

a. Edit /etc/ipsec.conf.

# ipsec.conf - strongSwan IPsec configuration file

# basic configuration

config setup
        plutodebug=none
        crlcheckinterval=180
        strictcrlpolicy=no
        nat_traversal=no
        charonstart=no
        plutostart=yes

conn %default
        ikelifetime=60m
        keylife=1m
        rekeymargin=3m
        keyingtries=1
        keyexchange=ikev1
        ike=aes128-sha-modp2048!
        esp=aes128-sha1!

conn host-host
        left=192.168.99.2
        leftfirewall=no
        right=192.168.99.3
        auto=start
        authby=secret

b. Edit /etc/ipsec.secrets (this file might not exist and might need to be created).

# /etc/ipsec.secrets - strongSwan IPsec secrets file
192.168.99.2 192.168.99.3 : PSK "shared key"

3. Update the strongSwan configuration files on the VM2 platform:

a. Edit /etc/ipsec.conf.

# /etc/ipsec.conf - strongSwan IPsec configuration file
config setup

        crlcheckinterval=180
        strictcrlpolicy=no
        plutostart=yes
        plutodebug=none

        charonstart=no
        nat_traversal=no

conn %default
        ikelifetime=60m
        keylife=1m
        rekeymargin=3m
        keyingtries=1
        keyexchange=ikev1
        ike=aes128-sha1-modp2048!
        esp=aes128-sha1!

conn host-host
        left=192.168.99.3
        leftfirewall=no
        right=192.168.99.2
        auto=start
        authby=secret

b. Edit /etc/ipsec.secrets (this file might not exist and might need to be created).

# /etc/ipsec.secrets - strongSwan IPsec secrets file
192.168.99.3 192.168.99.2 : PSK "shared key"

B.6.3.1 Starting strongSwan IPsec Software on the VM

The IPSec tunnel is set up between the two VMs as shown in Figure 6-20 on page 32. The tunnel covers the network interfaces on the subnet 192.168.99.0/24. The data networks for VM1 and VM2 are 1.1.1.0/24 and 6.6.6.0/24 respectively.

1. VM1 configuration settings:

a. Network configuration:

# cat /etc/sysconfig/network-scripts/ifcfg-eth1
TYPE=Ethernet
NAME=eth1
BOOTPROTO=none
ONBOOT=yes
NETMASK=255.255.255.0
IPADDR=1.1.1.2
HWADDR=00:00:00:00:00:01

# cat /etc/sysconfig/network-scripts/ifcfg-eth2
TYPE=Ethernet
NAME=eth2
BOOTPROTO=none
ONBOOT=yes
NETMASK=255.255.255.0
IPADDR=192.168.99.2
HWADDR=00:00:00:00:00:02

b. Add a route to the routing table for the VM2 network.

# ip route add 6.6.6.0/24 dev eth2
# route -n

c. Enable Linux Kernel forwarding and stop the firewall daemon.

# echo 1 > /proc/sys/net/ipv4/ip_forward
# service firewalld stop

d. Check QAT service status and insert the netkeyshim module.

# export ICP_ROOT=/qat/QAT1.6
# service qat_service status
# insmod $ICP_ROOT/quickassist/shims/netkey/icp_netkey/icp_qat_netkey.ko

e. Start the IPSec process.

# ipsec start

2. VM2 configuration settings:

a. Network configuration:

# cat /etc/sysconfig/network-scripts/ifcfg-eth1
TYPE=Ethernet
NAME=eth1
BOOTPROTO=none
ONBOOT=yes
NETMASK=255.255.255.0
IPADDR=6.6.6.6
HWADDR=00:00:00:00:00:03

# cat /etc/sysconfig/network-scripts/ifcfg-eth2
TYPE=Ethernet
NAME=eth2
BOOTPROTO=none
ONBOOT=yes
NETMASK=255.255.255.0
IPADDR=192.168.99.3
HWADDR=00:00:00:00:00:04

b. Add a route to the routing table for the VM1 network.

# ip route add 1.1.1.0/24 dev eth2
# route -n

c. Enable Linux Kernel forwarding and stop the firewall daemon.

# echo 1 > /proc/sys/net/ipv4/ip_forward
# service firewalld stop

d. Check QAT service status and insert the netkeyshim module.

# export ICP_ROOT=/qat/QAT1.6
# service qat_service status
# insmod $ICP_ROOT/quickassist/shims/netkey/icp_netkey/icp_qat_netkey.ko

e. Start the IPSec process.

# ipsec start

3. Enable the IPsec tunnel “host-host” on both VM1 and VM2:

# ipsec up host-host

4. To test the VM-VM IPSec tunnel performance, use Netperf.
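A minimal sketch of such a run, assuming netperf/netserver are installed in both VMs and using the tunnel endpoint address 192.168.99.3 configured above (ipsec statusall confirms the tunnel is established before measuring):

On VM2:
# netserver

On VM1:
# ipsec statusall
# netperf -H 192.168.99.3 -l 60 -t TCP_STREAM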

Appendix C Glossary

Acronym Description

COTS Commercial Off‐The-Shelf

DPI Deep Packet Inspection

IOMMU Input/Output Memory Management Unit

Kpps Kilo packets per second

KVM Kernel-based Virtual Machine

Mpps Million packets per second

NIC Network Interface Card

pps Packets per second

QAT Quick Assist Technology

RA Reference Architecture

RSS Receive Side Scaling

SP Service Provider

SR-IOV Single root I/O Virtualization


Appendix D Definitions

D.1 Packet Throughput

There is a difference between an Ethernet frame, an IP packet, and a UDP datagram. In the seven-layer OSI model of computer networking, packet refers to a data unit at layer 3 (network layer). The correct term for a data unit at layer 2 (data link layer) is a frame, and at layer 4 (transport layer) it is a segment or datagram.

Important concepts related to 10GbE performance are frame rate and throughput. The MAC bit rate of 10GbE, defined in the IEEE standard 802.3ae, is 10 billion bits per second. Frame rate is based on the bit rate and frame format definitions. Throughput, defined in IETF RFC 1242, is the highest rate at which the system under test can forward the offered load, without loss.

The frame rate for 10GbE is determined by a formula that divides the 10 billion bits per second by the preamble + frame length + inter-frame gap.

The maximum frame rate is calculated using the minimum values of the following parameters, as described in the IEEE 802.3ae standard:

• Preamble: 8 bytes * 8 = 64 bits

• Frame length: 64 bytes (minimum) * 8 = 512 bits

• Inter-frame gap: 12 bytes (minimum) * 8 = 96 bits

Therefore, Maximum Frame Rate (64-byte packets)

= MAC Transmit Bit Rate / (Preamble + Frame Length + Inter-frame Gap)

= 10,000,000,000 / (64 + 512 + 96)

= 10,000,000,000 / 672

= 14,880,952.38 frames per second (fps)

Table D-1 Maximum Throughput

IP Packet Size (Bytes)   Theoretical Line Rate (fps), Full Duplex   Theoretical Line Rate (fps), Half Duplex
64                       29,761,904                                 14,880,952
128                      16,891,891                                  8,445,946
256                       9,057,971                                  4,528,986
512                       4,699,248                                  2,349,624
1024                      2,394,636                                  1,197,318
1280                      1,923,076                                    961,538
1518                      1,625,487                                    812,744
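The half-duplex column of Table D-1 can be reproduced from the formula above. The following is a sketch using awk (each size is treated as the Ethernet frame length, with the 8-byte preamble and 12-byte minimum inter-frame gap added before converting to bits):

# for s in 64 128 256 512 1024 1280 1518; do awk -v s=$s 'BEGIN { printf "%5d bytes: %10.0f fps\n", s, 10e9 / ((s + 8 + 12) * 8) }'; done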

D.2 RFC 2544

RFC 2544 is an Internet Engineering Task Force (IETF) RFC that outlines a benchmarking methodology for network Interconnect Devices. The methodology results in performance metrics such as latency, frame loss percentage, and maximum data throughput.

In this document network “throughput” (measured in millions of frames per second) is based on RFC 2544, unless otherwise noted. Frame size refers to Ethernet frames ranging from smallest frames of 64 bytes to largest frames of 1518 bytes.

Types of tests are:

• Throughput test defines the maximum number of frames per second that can be transmitted without any error. Test time during which frames are transmitted must be at least 60 seconds.

• Latency test measures the time required for a frame to travel from the originating device through the network to the destination device.

• Frame loss test measures the network’s response in overload conditions—a critical indicator of the network’s ability to support real-time applications in which a large amount of frame loss will rapidly degrade service quality.

• Burst test assesses the buffering capability of a switch. It measures the maximum number of frames received at full line rate before a frame is lost. In carrier Ethernet networks, this measurement validates the excess information rate (EIR) as defined in many SLAs.

• System recovery to characterize speed of recovery from an overload condition

• Reset to characterize speed of recovery from device or software reset

Although not included in the defined RFC 2544 standard, another crucial measurement in Ethernet networking is packet jitter.

Appendix E References

Document Name Source

Internet Protocol version 4 http://www.ietf.org/rfc/rfc791.txt

Internet Protocol version 6 http://www.faqs.org/rfc/rfc2460.txt

Intel® 82599 10 Gigabit Ethernet Controller Datasheet http://www.intel.com/content/www/us/en/ethernet-controllers/82599-10-gbe-controller-datasheet.html

Intel DDIO https://www-ssl.intel.com/content/www/us/en/io/direct-data-i-o.html?

Bandwidth Sharing Fairness http://www.intel.com/content/www/us/en/network-adapters/10-gigabit-network-adapters/10-gbe-ethernet-flexible-port-partitioning-brief.html

Design Considerations for efficient network applications with Intel® multi-core processor-based systems on Linux http://download.intel.com/design/intarch/papers/324176.pdf

OpenFlow with Intel 82599 http://ftp.sunet.se/pub/Linux/distributions/bifrost/seminars/workshop-2011-03-31/Openflow_1103031.pdf

Wu, W., DeMar, P. & Crawford, M. (2012). A Transport-Friendly NIC for Multicore/Multiprocessor Systems. IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 4, April 2012. http://lss.fnal.gov/archive/2010/pub/fermilab-pub-10-327-cd.pdf

Why Does Flow Director Cause Packet Reordering? http://arxiv.org/ftp/arxiv/papers/1106/1106.0443.pdf

IA packet processing http://www.intel.com/p/en_US/embedded/hwsw/technology/packet- processing

High Performance Packet Processing on Cloud Platforms using Linux* with Intel® Architecture http://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf

Packet Processing Performance of Virtualized Platforms with Linux* and Intel® Architecture http://networkbuilders.intel.com/docs/network_builders_RA_NFV.pdf

DPDK http://www.intel.com/go/dpdk

Intel® DPDK Accelerated vSwitch https://01.org/packet-processing

RFC 1242 (Benchmarking Terminology for Network Interconnection Devices) http://www.ietf.org/rfc/rfc1242.txt

RFC 2544 (Benchmarking Methodology for Network Interconnect Devices) http://www.ietf.org/rfc/rfc2544.txt

RFC 6349 (Framework for TCP Throughput Testing) http://www.faqs.org/rfc/rfc6349.txt

RFC 3393 (IP Packet Delay Variation Metric for IP Performance Metrics) https://www.ietf.org/rfc/rfc3393.txt

MEF end-end measurement metrics http://metroethernetforum.org/Assets/White_Papers/Metro-Ethernet-Services.pdf

DPDK Sample Application User Guide http://dpdk.org/doc/intel/dpdk-sample-apps-1.7.0.pdf

Benchmarking Methodology for Virtualization Network Performance https://tools.ietf.org/html/draft-liu-bmwg-virtual-network-benchmark-00

Benchmarking Virtual Network Functions and Their Infrastructure https://tools.ietf.org/html/draft-morton-bmwg-virtual-net-01

LEGAL

By using this document, in addition to any agreements you have with Intel, you accept the terms set forth below.

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.

Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Intel technologies may require enabled hardware, specific software, or services activation. Check with your system manufacturer or retailer. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.

All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice. Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.

No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses.

Intel does not control or audit third-party web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights.

© 2014 Intel Corporation. All rights reserved. Intel, the Intel logo, Core, Xeon and others are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.