SR-IOV NIC drivers can allocate up to 500 MB of memory for ring buffers per PF
Environment
- Red Hat OpenShift Container Platform 4.16 with Kubernetes v1.31.8 (note: OpenShift is not required to experience this issue)
- Red Hat Enterprise Linux 9.4 and above
- Intel ice, i40e, and ixgbe (PF) drivers
- Intel iavf and ixgbevf (VF) drivers
- Not limited to Intel drivers; this affects all drivers that allocate memory up front for ring buffers
- 16 PFs
Issue
- The ice driver consumes 15 GB of memory, with the call trace below, when VFs are created.
- Each time one VF is created on a NIC port, memory usage increases by approximately 500 MB.
- Removing the VF returns memory usage to normal, so this is not a memory leak but an unexpected allocation of 500 MB per VF, which seems unreasonable; a maximum of roughly 40 MB would be expected.
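The increase can be observed outside OpenShift with nothing more than sysfs and /proc/meminfo. A minimal reproduction sketch (the PF name ens17f0 is an assumption about the test host, and root privileges are required):

```shell
# Observe the per-VF memory cost directly; ens17f0 is an assumed PF name.
PF=ens17f0

grep MemFree /proc/meminfo                         # baseline free memory

echo 1 > /sys/class/net/$PF/device/sriov_numvfs    # create one VF
grep MemFree /proc/meminfo                         # roughly 500 MB less free

echo 0 > /sys/class/net/$PF/device/sriov_numvfs    # remove the VF
grep MemFree /proc/meminfo                         # back to the baseline
```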
Resolution
- If possible, set the MTU to 9000 only on the VFs and leave the PF at 1500.
- Otherwise, consider decreasing the number of NIC queues, e.g. ethtool -L <PF> combined 8, and, if possible, reduce the number of RX descriptors on the PFs, e.g. ethtool -G <PF> rx 512.
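Putting the two ethtool adjustments together (a sketch; the PF name ens17f0 is an assumption, and each change briefly resets the link):

```shell
# Reduce ring-buffer memory on a PF; ens17f0 is an assumed interface name.
PF=ens17f0

ethtool -l $PF              # show current channel (queue) counts
ethtool -L $PF combined 8   # reduce the number of NIC queues

ethtool -g $PF              # show current ring sizes
ethtool -G $PF rx 512       # reduce the number of RX descriptors
```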
Root Cause
- This issue is triggered when a VF is provisioned. OpenShift is one common environment where this configuration occurs, but the issue is not limited to OpenShift.
- The behavior is expected: the SR-IOV operator assigns VFs to multiple PFs and raises the PF MTU to 9000 at the same time. The bulk of the memory increase comes from the MTU 9000 setting on the PF, which requires roughly 6x the memory of the default MTU of 1500. Because 2048 descriptors of 9000 bytes each are allocated for every one of the 64 NIC queues per PF, raising the MTU on multiple PFs increases memory usage dramatically.
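The 6x factor follows directly from the figures above. A back-of-the-envelope calculation in shell arithmetic (this assumes one MTU-sized buffer per descriptor; real drivers round buffers up to page-sized chunks, so actual usage is somewhat higher):

```shell
# Per-PF RX buffer memory: queues x descriptors x buffer size.
# Assumption: one MTU-sized buffer per descriptor (drivers actually
# allocate page-sized chunks, so real usage is higher than this).
queues=64
descs=2048

echo "MTU 1500: $(( queues * descs * 1500 / 1024 / 1024 )) MiB"   # 187 MiB
echo "MTU 9000: $(( queues * descs * 9000 / 1024 / 1024 )) MiB"   # 1125 MiB
echo "ratio:    $(( 9000 / 1500 ))x"                              # 6x
```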
Diagnostic Steps
- Page owner data, collected as described in solution https://access.redhat.com/solutions/5609521, shows:
Call trace:
+++
7.00 GB
917056 times, 1834112 pages, allocated by OTHERS :
Page allocated via order 1, mask 0x162a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_COMP|__GFP_MEMALLOC|__GFP_HARDWALL), pid 6992, tgid 6837 (sriov-network-c), ts ns
get_page_from_freelist+0x387/0x530
__alloc_pages+0xf2/0x250
ice_alloc_rx_bufs+0xcc/0x1c0 [ice]
ice_vsi_cfg_rxq+0x108/0x290 [ice]
ice_vsi_cfg_rxqs+0x6b/0xa0 [ice]
ice_down_up+0x2e/0x60 [ice]
ice_change_mtu+0xbd/0x140 [ice]
dev_set_mtu_ext+0xed/0x200
do_setlink+0x1a6/0xc00
rtnl_setlink+0xe5/0x180
rtnetlink_rcv_msg+0x159/0x3d0
netlink_rcv_skb+0x54/0x100
netlink_unicast+0x23b/0x360
netlink_sendmsg+0x24c/0x4c0
__sys_sendto+0x1dc/0x1f0
__x64_sys_sendto+0x20/0x30
4.00 GB
524032 times, 1048064 pages, allocated by OTHERS :
Page allocated via order 1, mask 0x162a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_COMP|__GFP_MEMALLOC|__GFP_HARDWALL), pid 6985, tgid 6837 (sriov-network-c), ts ns
get_page_from_freelist+0x387/0x530
__alloc_pages+0xf2/0x250
ice_alloc_rx_bufs+0xcc/0x1c0 [ice]
ice_vsi_cfg_rxq+0x108/0x290 [ice]
ice_vsi_cfg_rxqs+0x6b/0xa0 [ice]
ice_down_up+0x2e/0x60 [ice]
ice_change_mtu+0xbd/0x140 [ice]
dev_set_mtu_ext+0xed/0x200
do_setlink+0x1a6/0xc00
rtnl_setlink+0xe5/0x180
rtnetlink_rcv_msg+0x159/0x3d0
netlink_rcv_skb+0x54/0x100
netlink_unicast+0x23b/0x360
netlink_sendmsg+0x24c/0x4c0
__sys_sendto+0x1dc/0x1f0
__x64_sys_sendto+0x20/0x30
smem -twk will show that 500 MB of memory is consumed for every VF when added to a different PF:
Test result
=============
+++
# 0 VFs
Area Used Cache Noncache
firmware/hardware 0 0 0
kernel image 0 0 0
kernel dynamic memory 33.0G 11.4G 21.6G
userspace memory 15.7G 5.6G 10.1G
free memory 13.7G 13.7G 0
----------------------------------------------------------
62.5G 30.8G 31.7G
# 1 VF (1 new VF on ens17f0)
Area Used Cache Noncache
firmware/hardware 0 0 0
kernel image 0 0 0
kernel dynamic memory 33.0G 11.3G 21.6G
userspace memory 15.8G 5.7G 10.1G
free memory 13.7G 13.7G 0
----------------------------------------------------------
62.5G 30.7G 31.7G
# 2 VF (1 new VF on ens17f1)
Area Used Cache Noncache
firmware/hardware 0 0 0
kernel image 0 0 0
kernel dynamic memory 33.5G 11.4G 22.1G
userspace memory 15.8G 5.7G 10.1G
free memory 13.2G 13.2G 0
----------------------------------------------------------
62.5G 30.2G 32.3G
# 3 VF (1 new VF on ens17f2)
Area Used Cache Noncache
firmware/hardware 0 0 0
kernel image 0 0 0
kernel dynamic memory 34.0G 11.4G 22.6G
userspace memory 15.8G 5.7G 10.1G
free memory 12.7G 12.7G 0
----------------------------------------------------------
62.5G 29.7G 32.7G
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.