The kernel's inbox irdma driver in RHEL with Intel Ethernet Controller E810 causes extremely high memory usage due to excessive dynamic allocations during initialization
Issue
- The system experiences significant memory consumption on boot (approximately 40 GiB), leading to resource exhaustion and potential performance degradation.
- The page_owner output highlights that 40 GiB of memory is being consumed dynamically during irdma driver initialization. Inefficient handling of IOMMU mappings and HMC object creation in the inbox driver is suspected.
10483712 times, 10483712 pages, allocated by OTHERS :
Page allocated via order 0, mask 0x12dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_NORETRY|__GFP_ZERO), pid 1527, tgid 1527 (systemd-udevd), ts ns
get_page_from_freelist+0x387/0x530
__alloc_pages+0xf2/0x250
__iommu_dma_alloc_pages.isra.0+0xf9/0x1c0
__iommu_dma_alloc_noncontiguous.constprop.0+0xa6/0x220
iommu_dma_alloc+0x118/0x1c0
irdma_add_sd_table_entry+0x78/0x180 [irdma]
irdma_sc_create_hmc_obj+0x105/0x380 [irdma]
irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]
irdma_ctrl_init_hw+0x266/0x5b0 [irdma]
irdma_probe+0x1db/0x340 [irdma]
auxiliary_bus_probe+0x42/0x80
really_probe+0xde/0x390
__driver_probe_device+0xd6/0x130
driver_probe_device+0x1e/0x90
__driver_attach+0xd2/0x1c0
bus_for_each_dev+0x75/0xd0
- The vmalloc usage is explicitly shown in /proc/vmallocinfo. The total amount of memory allocated by dma_common_pages_remap() is approximately 42.3593 GiB.
$ awk '{total += $2} /dma_common_pages_remap/ {dma_total += $2} END {print "Grand Total:", total, "bytes (", total/2^30, "GiB)"; print "dma_common_pages_remap Total:", dma_total, "bytes (", dma_total/2^30, "GiB)"}' proc/vmallocinfo
Grand Total: 46710464512 bytes ( 43.5025 GiB)
dma_common_pages_remap Total: 45482975232 bytes ( 42.3593 GiB)
- This aligns with the observation that over 40 GiB of memory is allocated dynamically for the driver.
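The awk one-liner above sums the second field of every vmallocinfo line (the mapping size in bytes) and, separately, the sizes of lines that mention dma_common_pages_remap. A minimal Python sketch of the same calculation follows. The function names and the sample lines are illustrative assumptions, not output from the affected system.

```python
def sum_vmalloc_sizes(lines, needle="dma_common_pages_remap"):
    """Return (grand_total_bytes, needle_total_bytes) for vmallocinfo-style lines.

    Mirrors the awk logic: field 2 of each line is the size in bytes.
    """
    total = 0
    matched = 0
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines instead of failing on them.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        size = int(fields[1])
        total += size
        if needle in line:
            matched += size
    return total, matched


def to_gib(num_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


# Hypothetical sample lines in the usual vmallocinfo layout:
# "<start>-<end> <size> <caller> ..."
SAMPLE_LINES = [
    "0xffffb0a1c0000000-0xffffb0a1c0005000   20480 dma_common_pages_remap+0x2a/0x50 vmap",
    "0xffffb0a1c0008000-0xffffb0a1c000a000    8192 hpet_enable+0x31/0x2d0 ioremap",
    "0xffffb0a1c0200000-0xffffb0a1c0401000 2101248 dma_common_pages_remap+0x2a/0x50 vmap",
]

grand_total, dma_total = sum_vmalloc_sizes(SAMPLE_LINES)
print("Grand Total:", grand_total, "bytes (", to_gib(grand_total), "GiB)")
print("dma_common_pages_remap Total:", dma_total, "bytes (", to_gib(dma_total), "GiB)")
```

To run this against real data, pass the lines of a saved copy of the file (for example `open("proc/vmallocinfo")` from a collected report) instead of `SAMPLE_LINES`. Reading the live /proc/vmallocinfo usually requires root privileges.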
- A test was carried out with the 9.6 brew kernel (5.14.0-570.el9), which contains a far more recent upstream irdma driver.
- The result shows slight improvements. dma_common_pages_remap() usage drops to 0 GiB:
$ awk '{total += $2} /dma_common_pages_remap/ {dma_total += $2} END {print "Grand Total:", total, "bytes (", total/2^30, "GiB)"; print "dma_common_pages_remap Total:", dma_total, "bytes (", dma_total/2^30, "GiB)"}' proc/vmallocinfo
Grand Total: 1169719296 bytes ( 1.08939 GiB)
dma_common_pages_remap Total: bytes ( 0 GiB)
- Although the backported irdma driver in 9.6 is likely slightly better, it remains severely memory-inefficient: allocations requested through the following call path still add up to ~40 GiB on boot:
...
irdma_add_sd_table_entry+0x78/0x180 [irdma]
irdma_sc_create_hmc_obj+0x105/0x380 [irdma]
irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]
irdma_ctrl_init_hw+0x266/0x5b0 [irdma]
irdma_probe+0x1df/0x440 [irdma]
...
- The 9.6 inbox driver requests order-9 allocations but the total memory consumption is still ~40 GiB, just like before when order-0 allocations were dominant in 9.4.
9.4 – 5.14.0-427.13.1.el9_4 – inbox irdma driver
================================================
10483712 times, 10483712 pages, allocated by OTHERS :
Page allocated via order 0, mask 0x12dc2, pid 1527, tgid 1527 (systemd-udevd), ts ns
The mask details: GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_NORETRY|__GFP_ZERO
get_page_from_freelist+0x387/0x530
__alloc_pages+0xf2/0x250
__iommu_dma_alloc_pages.isra.0+0xf9/0x1c0
__iommu_dma_alloc_noncontiguous.constprop.0+0xa6/0x220
iommu_dma_alloc+0x118/0x1c0
irdma_add_sd_table_entry+0x78/0x180 [irdma]
irdma_sc_create_hmc_obj+0x105/0x380 [irdma]
irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]
irdma_ctrl_init_hw+0x266/0x5b0 [irdma]
irdma_probe+0x1db/0x340 [irdma]
auxiliary_bus_probe+0x42/0x80
really_probe+0xde/0x390
__driver_probe_device+0xd6/0x130
driver_probe_device+0x1e/0x90
__driver_attach+0xd2/0x1c0
bus_for_each_dev+0x75/0xd0
Total: 39.9921875 GiB
9.6 – 5.14.0-570.el9 – inbox irdma driver
=========================================
20476 times, 10483712 pages, allocated by OTHERS :
Page allocated via order 9, mask 0xcc0, pid 1581, tgid 1581 (systemd-udevd), ts ns
The mask details: GFP_KERNEL
get_page_from_freelist+0x401/0x590
__alloc_pages+0xf2/0x250
__dma_direct_alloc_pages.constprop.0+0x1bc/0x230
dma_direct_alloc+0x73/0x2a0
dma_alloc_attrs+0x71/0x140
irdma_add_sd_table_entry+0x78/0x180 [irdma]
irdma_sc_create_hmc_obj+0x105/0x380 [irdma]
irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma]
irdma_ctrl_init_hw+0x266/0x5b0 [irdma]
irdma_probe+0x1df/0x440 [irdma]
auxiliary_bus_probe+0x42/0x80
really_probe+0xde/0x390
__driver_probe_device+0xd6/0x130
driver_probe_device+0x1e/0x90
__driver_attach+0xd2/0x1c0
bus_for_each_dev+0x75/0xd0
Total: 39.9921875 GiB
- The core issue therefore does not appear to be the allocation order. The total amount of memory allocated is unchanged: the driver requests excessive memory for HMC objects and DMA buffers regardless of the page order.
| Description | RHEL 9.4 – 5.14.0-427.13.1.el9_4 – inbox irdma driver | RHEL 9.6 – 5.14.0-570.el9 – inbox irdma driver |
|---|---|---|
| Allocation count and pages | 10,483,712 times, 10,483,712 pages, allocated by OTHERS | 20,476 times, 10,483,712 pages, allocated by OTHERS |
| Page allocation details | order 0, mask 0x12dc2, pid 1527, tgid 1527 (systemd-udevd) | order 9, mask 0xcc0, pid 1581, tgid 1581 (systemd-udevd) |
| GFP mask details | GFP_KERNEL\|__GFP_HIGHMEM\|__GFP_NOWARN\|__GFP_NORETRY\|__GFP_ZERO | GFP_KERNEL |
| Stack trace line 1 | get_page_from_freelist+0x387/0x530 | get_page_from_freelist+0x401/0x590 |
| Stack trace line 2 | __alloc_pages+0xf2/0x250 | __alloc_pages+0xf2/0x250 |
| Stack trace line 3 | __iommu_dma_alloc_pages.isra.0+0xf9/0x1c0 | __dma_direct_alloc_pages.constprop.0+0x1bc/0x230 |
| Stack trace line 4 | __iommu_dma_alloc_noncontiguous.constprop.0+0xa6/0x220 | dma_direct_alloc+0x73/0x2a0 |
| Stack trace line 5 | iommu_dma_alloc+0x118/0x1c0 | dma_alloc_attrs+0x71/0x140 |
| Stack trace line 6 | irdma_add_sd_table_entry+0x78/0x180 [irdma] | irdma_add_sd_table_entry+0x78/0x180 [irdma] |
| Stack trace line 7 | irdma_sc_create_hmc_obj+0x105/0x380 [irdma] | irdma_sc_create_hmc_obj+0x105/0x380 [irdma] |
| Stack trace line 8 | irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma] | irdma_create_hmc_objs.constprop.0+0xf6/0x1d0 [irdma] |
| Stack trace line 9 | irdma_ctrl_init_hw+0x266/0x5b0 [irdma] | irdma_ctrl_init_hw+0x266/0x5b0 [irdma] |
| Stack trace line 10 | irdma_probe+0x1db/0x340 [irdma] | irdma_probe+0x1df/0x440 [irdma] |
| Stack trace line 11 | auxiliary_bus_probe+0x42/0x80 | auxiliary_bus_probe+0x42/0x80 |
| Stack trace line 12 | really_probe+0xde/0x390 | really_probe+0xde/0x390 |
| Stack trace line 13 | __driver_probe_device+0xd6/0x130 | __driver_probe_device+0xd6/0x130 |
| Stack trace line 14 | driver_probe_device+0x1e/0x90 | driver_probe_device+0x1e/0x90 |
| Stack trace line 15 | __driver_attach+0xd2/0x1c0 | __driver_attach+0xd2/0x1c0 |
| Stack trace line 16 | bus_for_each_dev+0x75/0xd0 | bus_for_each_dev+0x75/0xd0 |
| Total memory usage | 39.9921875 GiB | 39.9921875 GiB |
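
The totals in the comparison can be cross-checked from the page_owner summary lines alone: the number of allocations multiplied by 2^order should equal the page count, and the page count multiplied by the page size gives the total. The sketch below does this for both kernels. It assumes 4 KiB base pages, which is consistent with the 39.9921875 GiB total reported for 10483712 pages; the helper name and regular expressions are illustrative.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB base pages

SUMMARY_RE = re.compile(r"(\d+) times, (\d+) pages")
ORDER_RE = re.compile(r"order (\d+)")


def summarize(summary_line, alloc_line, page_size=PAGE_SIZE):
    """Extract counts from two page_owner lines and compute the total in GiB."""
    times, pages = map(int, SUMMARY_RE.search(summary_line).groups())
    order = int(ORDER_RE.search(alloc_line).group(1))
    return {
        "times": times,
        "pages": pages,
        "order": order,
        # Each order-N allocation covers 2**N pages.
        "consistent": times * 2**order == pages,
        "gib": pages * page_size / 2**30,
    }


# Summary lines copied from the 9.4 and 9.6 page_owner output.
rhel94 = summarize(
    "10483712 times, 10483712 pages, allocated by OTHERS :",
    "Page allocated via order 0, mask 0x12dc2, pid 1527, tgid 1527 (systemd-udevd), ts ns",
)
rhel96 = summarize(
    "20476 times, 10483712 pages, allocated by OTHERS :",
    "Page allocated via order 9, mask 0xcc0, pid 1581, tgid 1581 (systemd-udevd), ts ns",
)

for name, result in (("9.4", rhel94), ("9.6", rhel96)):
    print(name, result["order"], result["consistent"], result["gib"], "GiB")
# 9.4 0 True 39.9921875 GiB
# 9.6 9 True 39.9921875 GiB
```

Both kernels end up with the same 10483712 pages: 9.6 issues 512 times fewer allocation calls (20476 order-9 requests), but the memory footprint is identical.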
Environment
- Red Hat Enterprise Linux 8.10
4.18.0-553.75.1.el8_10 with inbox irdma driver
- Red Hat Enterprise Linux 9.4
5.14.0-427.13.1.el9_4 with inbox irdma driver
- Further diagnostics revealed that the far newer 9.6 brew kernel containing upstream v6.1 irdma driver is still affected by this issue.