The kernel's inbox irdma driver in RHEL with Intel Ethernet Controller E810 causes extremely high memory usage due to excessive dynamic allocations during initialization

Solution Unverified - Updated -

Issue

  • The system experiences significant memory consumption on boot (approximately 40 GiB), leading to resource exhaustion and potential performance degradation.
  • The page_owner output shows that approximately 40 GiB of memory is allocated dynamically during irdma driver initialization, all of it through irdma_add_sd_table_entry(). Inefficient handling of IOMMU mappings and HMC object creation in the inbox driver is suspected:
10483712 times, 10483712 pages, allocated by OTHERS :
Page allocated via order 0, mask 0x12dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_NORETRY|__GFP_ZERO), pid 1527, tgid 1527 (systemd-udevd), ts  ns
 get_page_from_freelist+0x387/0x530
 __alloc_pages+0xf2/0x250
 __iommu_dma_alloc_pages.isra.0+0xf9/0x1c0
 __iommu_dma_alloc_noncontiguous.constprop.0+0xa6/0x220
 iommu_dma_alloc+0x118/0x1c0
 irdma_add_sd_table_entry+0x78/0x180 [irdma]
 irdma_sc_create_hmc_obj+0x105/0x380 [irdma]
 irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]
 irdma_ctrl_init_hw+0x266/0x5b0 [irdma]
 irdma_probe+0x1db/0x340 [irdma]
 auxiliary_bus_probe+0x42/0x80
 really_probe+0xde/0x390
 __driver_probe_device+0xd6/0x130
 driver_probe_device+0x1e/0x90
 __driver_attach+0xd2/0x1c0
 bus_for_each_dev+0x75/0xd0
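The mask value in the page_owner record above can be cross-checked against its decoded flag list with a short sketch. The bit values below are an assumption taken from include/linux/gfp.h in 5.14-era kernels; they are not a stable ABI and may differ in other kernel versions. decode_gfp() is a hypothetical helper written for this illustration, not part of any kernel tooling.

```python
# Assumed bit values (5.14-era include/linux/gfp.h); verify against the
# kernel source of the release being analyzed before relying on them.
GFP_BITS = {
    0x02: "__GFP_HIGHMEM",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x100: "__GFP_ZERO",
    0x400: "__GFP_DIRECT_RECLAIM",
    0x800: "__GFP_KSWAPD_RECLAIM",
    0x2000: "__GFP_NOWARN",
    0x10000: "__GFP_NORETRY",
}
# GFP_KERNEL is the combination of reclaim, IO and FS bits.
GFP_KERNEL = 0x400 | 0x800 | 0x40 | 0x80


def decode_gfp(mask):
    """Return a '|'-joined list of flag names for a page_owner gfp mask."""
    names = []
    rest = mask
    # Collapse the four GFP_KERNEL bits into the composite name first.
    if rest & GFP_KERNEL == GFP_KERNEL:
        names.append("GFP_KERNEL")
        rest &= ~GFP_KERNEL
    for bit in sorted(GFP_BITS):
        if rest & bit:
            names.append(GFP_BITS[bit])
            rest &= ~bit
    # Any bits not in the table are reported as raw hex.
    if rest:
        names.append(hex(rest))
    return "|".join(names)


print(decode_gfp(0x12dc2))
print(decode_gfp(0xcc0))
```

Under these assumed bit values, 0x12dc2 decodes to the same five flags that page_owner prints (the order of the names may differ), and 0xcc0 is plain GFP_KERNEL.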
  • The vmalloc usage is explicitly shown in /proc/vmallocinfo. The total amount of memory allocated by dma_common_pages_remap() is approximately 42.3593 GiB.
$ awk '{total += $2} /dma_common_pages_remap/ {dma_total += $2} END {print "Grand Total:", total, "bytes (", total/2^30, "GiB)"; print "dma_common_pages_remap Total:", dma_total, "bytes (", dma_total/2^30, "GiB)"}' proc/vmallocinfo
Grand Total: 46710464512 bytes ( 43.5025 GiB)
dma_common_pages_remap Total: 45482975232 bytes ( 42.3593 GiB)
  • This aligns with the observation of over 40 GiB memory being allocated for the driver dynamically.
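The same per-caller summation can be sketched in Python, which may be easier to extend than the awk one-liner. This is a minimal sketch, assuming the usual vmallocinfo layout where the second field is the area size in bytes and the third field is the caller symbol; vmalloc_bytes_by_caller() and the sample lines are hypothetical and only illustrate the parsing. Unlike awk, which prints an empty string for a variable that was never assigned, the sketch reports 0 bytes when a caller does not appear at all.

```python
from collections import defaultdict


def vmalloc_bytes_by_caller(lines):
    """Sum the size column (second field, in bytes) per caller symbol."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Expected layout: "<start>-<end> <size> <caller+off/len> ..."
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        caller = fields[2].split("+")[0]
        totals[caller] += int(fields[1])
    return totals


# Hypothetical sample lines; addresses, sizes and offsets are made up.
sample = [
    "0xffffa00000000000-0xffffa00000201000 2101248 dma_common_pages_remap+0x10/0x20 vmap",
    "0xffffa00000201000-0xffffa00000402000 2101248 dma_common_pages_remap+0x10/0x20 vmap",
    "0xffffa00000402000-0xffffa00000407000   20480 some_other_caller+0x10/0x20 vmap",
]

totals = vmalloc_bytes_by_caller(sample)
grand_total = sum(totals.values())
dma_total = totals["dma_common_pages_remap"]  # 0 if the caller never appears
print(f"Grand Total: {grand_total} bytes ({grand_total / 2**30:.4f} GiB)")
print(f"dma_common_pages_remap Total: {dma_total} bytes ({dma_total / 2**30:.4f} GiB)")
```

To run it against real data, pass an open file object for the collected proc/vmallocinfo file instead of the sample list.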

  • A test was carried out with the RHEL 9.6 brew kernel (5.14.0-570.el9), which contains a far more recent upstream irdma driver.

    • The vmalloc footprint improves: dma_common_pages_remap() usage drops to 0 GiB and the vmalloc grand total drops to approximately 1.09 GiB:
$ awk '{total += $2} /dma_common_pages_remap/ {dma_total += $2} END {print "Grand Total:", total, "bytes (", total/2^30, "GiB)"; print "dma_common_pages_remap Total:", dma_total, "bytes (", dma_total/2^30, "GiB)"}' proc/vmallocinfo
Grand Total: 1169719296 bytes ( 1.08939 GiB)
dma_common_pages_remap Total:  bytes ( 0 GiB)
  • Although the backported irdma driver in 9.6 no longer consumes vmalloc space, it still has severe memory inefficiencies: page_owner attributes approximately 40 GiB of boot-time allocations to the same call path:
        ...
 irdma_add_sd_table_entry+0x78/0x180 [irdma]
 irdma_sc_create_hmc_obj+0x105/0x380 [irdma]
 irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]
 irdma_ctrl_init_hw+0x266/0x5b0 [irdma]
 irdma_probe+0x1df/0x440 [irdma]
        ...
  • The 9.6 inbox driver requests order-9 allocations but the total memory consumption is still ~40 GiB, just like before when order-0 allocations were dominant in 9.4.
9.4 – 5.14.0-427.13.1.el9_4 – inbox irdma driver
================================================
10483712 times, 10483712 pages, allocated by OTHERS :
Page allocated via order 0, mask 0x12dc2, pid 1527, tgid 1527 (systemd-udevd), ts  ns
The mask details: GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_NORETRY|__GFP_ZERO
 get_page_from_freelist+0x387/0x530
 __alloc_pages+0xf2/0x250
 __iommu_dma_alloc_pages.isra.0+0xf9/0x1c0
 __iommu_dma_alloc_noncontiguous.constprop.0+0xa6/0x220
 iommu_dma_alloc+0x118/0x1c0
 irdma_add_sd_table_entry+0x78/0x180 [irdma]
 irdma_sc_create_hmc_obj+0x105/0x380 [irdma]
 irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]
 irdma_ctrl_init_hw+0x266/0x5b0 [irdma]
 irdma_probe+0x1db/0x340 [irdma]
 auxiliary_bus_probe+0x42/0x80
 really_probe+0xde/0x390
 __driver_probe_device+0xd6/0x130
 driver_probe_device+0x1e/0x90
 __driver_attach+0xd2/0x1c0
 bus_for_each_dev+0x75/0xd0
Total: 39.9921875 GiB
9.6 – 5.14.0-570.el9 – inbox irdma driver
=========================================
20476 times, 10483712 pages, allocated by OTHERS :
Page allocated via order 9, mask 0xcc0, pid 1581, tgid 1581 (systemd-udevd), ts  ns
The mask details: GFP_KERNEL
 get_page_from_freelist+0x401/0x590
 __alloc_pages+0xf2/0x250
 __dma_direct_alloc_pages.constprop.0+0x1bc/0x230
 dma_direct_alloc+0x73/0x2a0
 dma_alloc_attrs+0x71/0x140
 irdma_add_sd_table_entry+0x78/0x180 [irdma]
 irdma_sc_create_hmc_obj+0x105/0x380 [irdma]
 irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]
 irdma_ctrl_init_hw+0x266/0x5b0 [irdma]
 irdma_probe+0x1df/0x440 [irdma]
 auxiliary_bus_probe+0x42/0x80
 really_probe+0xde/0x390
 __driver_probe_device+0xd6/0x130
 driver_probe_device+0x1e/0x90
 __driver_attach+0xd2/0x1c0
 bus_for_each_dev+0x75/0xd0
Total: 39.9921875 GiB
  • The core issue is therefore not the allocation order. The total amount of memory is identical in both kernels (10,483,712 pages), which indicates that the driver requests an excessive amount of memory for HMC objects and DMA buffers regardless of the page order.
Description                 RHEL 9.4 – 5.14.0-427.13.1.el9_4 – inbox irdma driver           RHEL 9.6 – 5.14.0-570.el9 – inbox irdma driver
Allocation count and pages  10,483,712 times, 10,483,712 pages, allocated by OTHERS         20,476 times, 10,483,712 pages, allocated by OTHERS
Page allocation details     order 0, mask 0x12dc2, pid 1527, tgid 1527 (systemd-udevd)      order 9, mask 0xcc0, pid 1581, tgid 1581 (systemd-udevd)
GFP mask details            GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_NORETRY|__GFP_ZERO  GFP_KERNEL
Stack trace line 1          get_page_from_freelist+0x387/0x530                              get_page_from_freelist+0x401/0x590
Stack trace line 2          __alloc_pages+0xf2/0x250                                        __alloc_pages+0xf2/0x250
Stack trace line 3          __iommu_dma_alloc_pages.isra.0+0xf9/0x1c0                       __dma_direct_alloc_pages.constprop.0+0x1bc/0x230
Stack trace line 4          __iommu_dma_alloc_noncontiguous.constprop.0+0xa6/0x220          dma_direct_alloc+0x73/0x2a0
Stack trace line 5          iommu_dma_alloc+0x118/0x1c0                                     dma_alloc_attrs+0x71/0x140
Stack trace line 6          irdma_add_sd_table_entry+0x78/0x180 [irdma]                     irdma_add_sd_table_entry+0x78/0x180 [irdma]
Stack trace line 7          irdma_sc_create_hmc_obj+0x105/0x380 [irdma]                     irdma_sc_create_hmc_obj+0x105/0x380 [irdma]
Stack trace line 8          irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]            irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]
Stack trace line 9          irdma_ctrl_init_hw+0x266/0x5b0 [irdma]                          irdma_ctrl_init_hw+0x266/0x5b0 [irdma]
Stack trace line 10         irdma_probe+0x1db/0x340 [irdma]                                 irdma_probe+0x1df/0x440 [irdma]
Stack trace line 11         auxiliary_bus_probe+0x42/0x80                                   auxiliary_bus_probe+0x42/0x80
Stack trace line 12         really_probe+0xde/0x390                                         really_probe+0xde/0x390
Stack trace line 13         __driver_probe_device+0xd6/0x130                                __driver_probe_device+0xd6/0x130
Stack trace line 14         driver_probe_device+0x1e/0x90                                   driver_probe_device+0x1e/0x90
Stack trace line 15         __driver_attach+0xd2/0x1c0                                      __driver_attach+0xd2/0x1c0
Stack trace line 16         bus_for_each_dev+0x75/0xd0                                      bus_for_each_dev+0x75/0xd0
Total memory usage          39.9921875 GiB                                                  39.9921875 GiB
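The identical totals follow directly from the allocation counts and orders in the two page_owner records. The arithmetic can be checked with a short sketch, assuming a 4 KiB base page size (the usual value on x86_64; the page size is not stated in the records). The helper names pages() and gib() are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def pages(count, order):
    """Number of base pages covered by `count` allocations of a given order."""
    return count << order  # each order-N allocation spans 2**N pages


def gib(count, order, page_size=PAGE_SIZE):
    """Total size of the allocations in GiB."""
    return pages(count, order) * page_size / 2**30


# (allocation count, order) as reported by page_owner for each kernel
records = {
    "RHEL 9.4 (order 0)": (10483712, 0),
    "RHEL 9.6 (order 9)": (20476, 9),
}

for name, (count, order) in records.items():
    print(f"{name}: {pages(count, order)} pages, {gib(count, order)} GiB")
```

With 4 KiB pages, each order-9 allocation is 512 pages (2 MiB), so 20,476 such allocations cover the same 10,483,712 pages that the 9.4 kernel obtains one page at a time.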

Environment

  • Red Hat Enterprise Linux 8.10
    • 4.18.0-553.75.1.el8_10 with inbox irdma driver
  • Red Hat Enterprise Linux 9.4
    • 5.14.0-427.13.1.el9_4 with inbox irdma driver
  • Further diagnostics revealed that the far newer RHEL 9.6 brew kernel (5.14.0-570.el9), which contains the upstream v6.1 irdma driver, is still affected by this issue.
