RHEL7 HOST VLAN bridge slow network performance / ~5 MB/s over Gigabit interface

Dear RHEL and KVM Experts,

I have a condition whose cause I cannot determine, and its analysis could be helpful to others.

For reference, my system is as follows:

1) An HPE Synergy blade HOST server installed as a virtualization environment with RHEL 7.8 and KVM/QEMU (not yet in production, so it carries no traffic)
2) The network is configured with bond interfaces in Active/Standby mode, which is the mode recommended for virtualization
3) The Blade Frame 12000 has two external 10 Gbps interfaces, which run in Active/Active mode; they are set up that way because they are in trunk mode with 4 VLANs, and the external switch is responsible for the L2/L3 interconnection
4) The HOST uses bridge interfaces with VLANs, and the HOST IP is on the bridge VLAN interface:
br0 -> bond0 -> int1+int2 (backup access port interface)
br1823-|
br1824-| -> br1 -> bond1 -> int3+int4 (application on trunk port interface)
5) NetworkManager is disabled for all network configuration, which is handled only by the network service scripts
6) The bridges have no bridge_options set, just the default config. Everything works (no connectivity problems, just some strange speed changes over time)

The HOST has IPs on br0, br1823 and br1824. It is done this way so that the VLANs can be used on the HOST. Guests can transfer at higher speeds than the HOST. Guests use virtio and vnet configurations.

vhost_net 22693 0
vhost 48851 1 vhost_net
macvtap 22757 1 vhost_net
tun 36164 2 vhost_net

Since the bond interface on the HOST is in Mode 1 (active-backup), I expect a throughput of around 1 Gbps, or around 100 MB/s.
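As a rough cross-check of that expectation, the sketch below converts a link rate into a best-case TCP payload rate. It assumes a 1500-byte MTU (the value shown for br1823 in this post), IPv4 and TCP headers without options, and standard Ethernet framing overhead. The function name is only illustrative.

```python
def tcp_goodput_mb_per_s(link_gbps, mtu=1500, vlan_tagged=False):
    """Best-case TCP payload rate in MB/s (1 MB = 10**6 bytes)."""
    # Payload per frame: MTU minus IPv4 (20) and TCP (20) headers, no options.
    payload = mtu - 20 - 20
    # Bytes on the wire per frame: preamble+SFD 8, Ethernet header 14,
    # FCS 4, inter-frame gap 12.
    wire = mtu + 8 + 14 + 4 + 12
    if vlan_tagged:
        wire += 4  # 802.1Q tag on the trunk
    raw_mb_per_s = link_gbps * 1000 / 8  # 1 Gbps = 125 MB/s
    return raw_mb_per_s * payload / wire


print(round(tcp_goodput_mb_per_s(1), 1))
print(round(tcp_goodput_mb_per_s(1, vlan_tagged=True), 1))
```

Under these assumptions a 1 Gbps path tops out a little under 120 MB/s, so a transfer of around 100 MB/s is close to line rate, while 5 MB/s is far below it.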

On the HOST, just after the network is started, transferring large files for testing reaches this speed.

But after some time the speed drops considerably. See below:

From the HOST to an external server:
large.file 1% 24MB 5.0MB/s 07:40 ETA

From an external server connecting to the HOST and downloading:
large.file 17% 116MB 24.0MB/s 00:22 ETA

But if I just restart the network on the HOST, I get the expected speed again, around 100 MB/s.

After some time the speed drops again to the values above. This is very odd, and I cannot see any reason for it.
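One way to pin down when the slowdown starts is to sample the kernel's per-interface counters at each layer (bridge, bond, slave NICs) while a test transfer is running. The following is a minimal sketch, assuming Python 3 is available on the HOST. The interface names are the ones used in this post, and the helper names are hypothetical.

```python
import time
from pathlib import Path


def read_counter(iface, name):
    """Read one kernel counter (for example tx_bytes) from sysfs."""
    return int(Path(f"/sys/class/net/{iface}/statistics/{name}").read_text())


def rate_mb_per_s(before, after, seconds):
    """Convert a byte-counter delta into MB/s (1 MB = 10**6 bytes)."""
    return (after - before) / seconds / 1e6


def sample(iface, interval=5.0):
    """Return (tx MB/s, rx MB/s, new tx drops, new rx drops) for one interval."""
    names = ("tx_bytes", "rx_bytes", "tx_dropped", "rx_dropped")
    first = [read_counter(iface, n) for n in names]
    time.sleep(interval)
    second = [read_counter(iface, n) for n in names]
    return (
        rate_mb_per_s(first[0], second[0], interval),
        rate_mb_per_s(first[1], second[1], interval),
        second[2] - first[2],
        second[3] - first[3],
    )


if __name__ == "__main__":
    # Interface names from the post; adjust to the layer being inspected.
    for iface in ("br1823", "bond1"):
        try:
            print(iface, time.strftime("%H:%M:%S"), sample(iface, interval=1.0))
        except FileNotFoundError:
            print(iface, "not present on this machine")
```

Running this periodically (for example from cron) and keeping the output would show whether drop counters start increasing at the same moment the transfer rate falls, and on which layer.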

Some technical information regarding VLAN 823:

port no  mac addr           is local?  ageing timer
1        00:1d:70:c4:f4:e6  no           0.97
1        08:f1:ea:6f:d6:c9  no         250.45
1        08:f1:ea:70:39:f1  no           0.00
1        12:d2:a6:f0:00:1f  no         195.99
1        12:d2:a6:f0:00:25  no         216.92
1        16:1e:b1:80:00:12  no          65.19
1        16:1e:b1:80:00:13  no          59.18
1        16:1e:b1:80:00:1e  no          36.31
1        16:1e:b1:80:00:24  yes          0.00
1        16:1e:b1:80:00:24  yes          0.00
1        2c:76:8a:55:e8:c5  no         244.12
1        2c:76:8a:56:40:15  no         120.21
1        b4:99:ba:06:71:7c  no         147.64
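To compare the forwarding table before and after the slowdown, the listing can be parsed into records and saved at both points in time. This is only a sketch: it assumes the table above is `brctl showmacs` output (the post does not name the command), and the embedded sample reuses three rows from it.

```python
def parse_showmacs(text):
    """Parse a bridge MAC table listing into a list of dicts."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields and start with a numeric port number;
        # the header line and blank lines are skipped.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        entries.append({
            "port": int(fields[0]),
            "mac": fields[1],
            "local": fields[2] == "yes",
            "ageing": float(fields[3]),
        })
    return entries


sample = """\
port no mac addr is local? ageing timer
1 00:1d:70:c4:f4:e6 no 0.97
1 08:f1:ea:6f:d6:c9 no 250.45
1 16:1e:b1:80:00:24 yes 0.00
"""

entries = parse_showmacs(sample)
learned = [e for e in entries if not e["local"]]
oldest = max(learned, key=lambda e: e["ageing"])
print(len(entries), "entries; oldest learned:", oldest["mac"], oldest["ageing"])
```

If the MAC of the external test server shows a different port or an unusually high ageing value once the transfer is slow, that would point toward a learning or flooding problem rather than a NIC problem.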

33: br1823: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 16:1e:b1:80:00:24 brd ff:ff:ff:ff:ff:ff
    inet 192.168.23.180/24 brd 192.168.23.255 scope global br1823
       valid_lft forever preferred_lft forever

I am not willing to set ageing to 0 on the bridge, since I believe the bridge should avoid broadcasting.

Does anyone have a recommendation on what to check to find out why this is happening? So far I have never left a Guest running for a long time, but its speed on the same VLAN is always as expected, around 100 MB/s.

Just to reinforce: I do not have any connectivity issue; only the HOST speed is below what is expected for this type of connectivity.

Responses