Low network performance between instances on different compute nodes using VXLAN tunnels in Red Hat OpenStack Platform
Issue
Network performance between instances on different compute nodes in Red Hat OpenStack Platform is slow. The network uses VXLAN tunnels. The environment uses 10Gb network interface cards, yet iperf between instances on different compute nodes reaches only between a few hundred Mbit/s and a few Gbit/s. Both instances are in the same tenant network.
- Note: in this knowledge base entry, interface vxlan is the interface with the VXLAN tunnel endpoint IP address.
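As a sketch of how to confirm which interface carries the tunnel endpoint, the local_ip configured for the neutron Open vSwitch agent can be compared against the addresses present on the compute node (the configuration path below is an assumption and varies between releases):
# Tunnel endpoint IP configured for the OVS agent (example path, may differ per release)
grep -r local_ip /etc/neutron/plugins/ml2/
# Compare with the addresses on the host to identify the VTEP interface
ip -o -4 a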
[centos@test-2 ~]$ iperf --server
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 5] local 10.0.1.11 port 5001 connected with 10.0.1.10 port 35841
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.0 sec 625 MBytes 524 Mbits/sec
[centos@test-1 ~]$ iperf -c 10.0.1.11
------------------------------------------------------------
Client connecting to 10.0.1.11, TCP port 5001
TCP window size: 22.1 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.1.10 port 35841 connected with 10.0.1.11 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 625 MBytes 524 Mbits/sec
iperf between both compute nodes (hypervisors) across the VXLAN tunnel endpoints (not within the VXLAN tunnels themselves) gets close to line speed:
[root@compute-9 ~]# iperf3 -c 10.123.123.36
Connecting to host 10.123.123.36, port 5201
[ 4] local 10.123.123.40 port 37005 connected to 10.123.123.36 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 1.14 GBytes 9.81 Gbits/sec 0 944 KBytes
[ 4] 1.00-2.00 sec 1.11 GBytes 9.57 Gbits/sec 0 1.05 MBytes
[ 4] 2.00-3.00 sec 1.13 GBytes 9.69 Gbits/sec 0 1.05 MBytes
[ 4] 3.00-4.00 sec 1.14 GBytes 9.82 Gbits/sec 0 1.05 MBytes
[ 4] 4.00-5.00 sec 1.15 GBytes 9.84 Gbits/sec 0 1.19 MBytes
[ 4] 5.00-6.00 sec 1.13 GBytes 9.67 Gbits/sec 0 1.19 MBytes
[ 4] 6.00-7.00 sec 1.10 GBytes 9.43 Gbits/sec 0 1.19 MBytes
[ 4] 7.00-8.00 sec 1.15 GBytes 9.85 Gbits/sec 0 1.19 MBytes
[ 4] 8.00-9.00 sec 1.14 GBytes 9.75 Gbits/sec 0 1.19 MBytes
[ 4] 9.00-10.00 sec 1.14 GBytes 9.77 Gbits/sec 0 1.83 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 11.3 GBytes 9.72 Gbits/sec 0 sender
[ 4] 0.00-10.00 sec 11.3 GBytes 9.72 Gbits/sec receiver
====
[root@compute-5 ~]# ip -o -4 a
(...)
3: vxlan inet 10.123.123.36/24 brd 10.123.123.255 scope global vxlan\ valid_lft forever preferred_lft forever
(...)
[root@compute-5 ~]# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.123.123.40, port 37004
[ 5] local 10.123.123.36 port 5201 connected to 10.123.123.40 port 37005
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-1.00 sec 1.10 GBytes 9.42 Gbits/sec
[ 5] 1.00-2.00 sec 1.11 GBytes 9.57 Gbits/sec
[ 5] 2.00-3.00 sec 1.13 GBytes 9.69 Gbits/sec
[ 5] 3.00-4.00 sec 1.14 GBytes 9.82 Gbits/sec
[ 5] 4.00-5.00 sec 1.15 GBytes 9.84 Gbits/sec
[ 5] 5.00-6.00 sec 1.13 GBytes 9.66 Gbits/sec
[ 5] 6.00-7.00 sec 1.10 GBytes 9.43 Gbits/sec
[ 5] 7.00-8.00 sec 1.15 GBytes 9.85 Gbits/sec
[ 5] 8.00-9.00 sec 1.14 GBytes 9.75 Gbits/sec
[ 5] 9.00-10.00 sec 1.14 GBytes 9.77 Gbits/sec
[ 5] 10.00-10.04 sec 44.6 MBytes 9.84 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-10.04 sec 0.00 Bytes 0.00 bits/sec sender
[ 5] 0.00-10.04 sec 11.3 GBytes 9.68 Gbits/sec receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated
In this case, the hypervisors run on the following hardware:
Vendor: Cisco Systems, Inc.
Version: B200M4.3.1.1a.0.121720151230
The hypervisors use Red Hat's enic driver. According to our network support, this driver does not support VXLAN hardware offloading:
filename: /lib/modules/3.10.0-327.13.1.el7.x86_64/kernel/drivers/net/ethernet/cisco/enic/enic.ko
version: 2.1.1.83
license: GPL
author: Scott Feldman <scofeldm@cisco.com>
description: Cisco VIC Ethernet NIC Driver
rhelversion: 7.2
srcversion: E3ADA231AA76168CA78577A
alias: pci:v00001137d00000071sv*sd*bc*sc*i*
alias: pci:v00001137d00000044sv*sd*bc*sc*i*
alias: pci:v00001137d00000043sv*sd*bc*sc*i*
depends:
intree: Y
vermagic: 3.10.0-327.13.1.el7.x86_64 SMP mod_unload modversions
signer: Red Hat Enterprise Linux kernel signing key
sig_key: 4B:01:2C:B9:7A:31:97:ED:10:8D:C7:0A:A0:22:A9:29:B3:23:05:E0
sig_hashalgo: sha256
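As a quick check (a sketch, with eth0 as a placeholder interface name), ethtool can show which driver backs an interface and whether VXLAN (UDP tunnel) segmentation offload is exposed; on hardware/driver combinations without support it is typically reported as "off [fixed]":
# Which driver backs the interface
ethtool -i eth0
# VXLAN-related offload features; "off [fixed]" means the driver cannot enable them
ethtool -k eth0 | grep udp_tnl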
Enabling/disabling GRO (generic receive offload) with ethtool does not improve performance.
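For reference, GRO can be checked and toggled per interface with ethtool as in the sketch below (eth0 is a placeholder for the interface carrying the VXLAN traffic):
# Show the current GRO setting
ethtool -k eth0 | grep generic-receive-offload
# Disable and re-enable GRO between test runs
ethtool -K eth0 gro off
ethtool -K eth0 gro on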
Tests showed that with VXLAN traffic, rx interrupts increase and drive 100% softirq (SI) load on the single CPU that handles the single rx queue. Without VXLAN, far fewer interrupts and softirqs are generated, so a single CPU can handle the load.
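A sketch of how such a softirq bottleneck can be observed while an iperf test is running (interrupt names and CPU numbers depend on the system):
# Per-CPU utilization; a single CPU at ~100% %soft indicates the bottleneck
mpstat -P ALL 1
# NET_RX softirq counters increasing on only one CPU
watch -d -n1 'grep -E "CPU|NET_RX" /proc/softirqs'
# Interrupt distribution for the NIC rx queue(s) (IRQ names vary by driver)
cat /proc/interrupts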
Environment
Red Hat Enterprise Linux OpenStack Platform 7.0
Red Hat OpenStack Platform 8.0
Red Hat OpenStack Platform 9.0
Red Hat Enterprise Linux 7.2