Slow performance on pod-to-pod communication over VXLAN in openshift-sdn

Solution Verified - Updated -

Issue

  • TCP and UDP iperf3 tests between pods on different nodes over openshift-sdn are very slow, even when the nodes are on the same hypervisor and an iperf3 test between the node IPs is fast:
[  5]   0.00-1.00   sec  5.31 MBytes  44.5 Mbits/sec   18    620 KBytes
[  5]   1.00-2.00   sec  4.12 MBytes  34.5 Mbits/sec    0    625 KBytes
[  5]   2.00-3.00   sec  4.02 MBytes  33.7 Mbits/sec    0    628 KBytes
[  5]   3.00-4.00   sec  4.13 MBytes  34.6 Mbits/sec    0    640 KBytes
[  5]   4.00-5.00   sec  4.15 MBytes  34.8 Mbits/sec    0    665 KBytes
[  5]   5.00-6.00   sec  3.95 MBytes  33.1 Mbits/sec    7    673 KBytes
[  5]   6.00-7.00   sec  4.03 MBytes  33.8 Mbits/sec    3    675 KBytes
  • After adding the iptables rules from the Resolution section, the same iperf3 tests improve from about 35 Mbits/sec to about 3.3 Gbits/sec:
[  5] 490.00-491.00 sec   382 MBytes  3.20 Gbits/sec    4   1.02 MBytes
[  5] 491.00-492.00 sec   403 MBytes  3.38 Gbits/sec    4    957 KBytes
[  5] 492.00-493.00 sec   404 MBytes  3.39 Gbits/sec   12    869 KBytes
[  5] 493.00-494.00 sec   398 MBytes  3.34 Gbits/sec    0   1.10 MBytes
[  5] 494.00-495.00 sec   384 MBytes  3.23 Gbits/sec   11   1.02 MBytes
  • Conntrack shows many UNREPLIED entries for the VXLAN port (UDP 4789). In this example, all 232 entries with dport=4789 are UNREPLIED:
$ cat /proc/net/nf_conntrack | egrep udp | egrep dport=4789 | egrep UNREPLIED | wc -l
232
$ cat /proc/net/nf_conntrack | egrep udp | egrep dport=4789 | wc -l
232
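
The two counts above can be collected in a single pass. The following is a minimal sketch, not part of the original solution: the function name vxlan_conntrack_summary is hypothetical, and it assumes the usual space-separated /proc/net/nf_conntrack line layout. It matches the dport=4789 field exactly, so an unrelated port such as 47890 is not counted, which a plain substring match would include.

```shell
# Hypothetical helper: summarise conntrack entries for the VXLAN port (UDP 4789).
# Reads nf_conntrack-format lines on stdin and prints "<total> <unreplied>".
vxlan_conntrack_summary() {
    awk '
        {
            is_udp = 0; is_vxlan = 0; unreplied = 0
            # Compare whole fields so that dport=47890 does not match dport=4789.
            for (i = 1; i <= NF; i++) {
                if ($i == "udp")         is_udp = 1
                if ($i == "dport=4789")  is_vxlan = 1
                if ($i == "[UNREPLIED]") unreplied = 1
            }
            if (is_udp && is_vxlan) {
                total++
                if (unreplied) stuck++
            }
        }
        END { printf "%d %d\n", total, stuck }
    '
}
```

On a node it could be run as `vxlan_conntrack_summary < /proc/net/nf_conntrack`. When both numbers are equal, as in the output shown above, every tracked VXLAN flow is in the UNREPLIED state.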

Environment

  • Red Hat OpenShift Container Platform
  • Red Hat CoreOS
    • kernel-4.18.0-193.41.1.el8_2.x86_64
    • iptables-nft
