i40e panic on link loss or after changing NIC ring buffer count
Issue
- A system with an Intel 700 series i40e NIC suffered an error:
[3408318.640794] i40e 0000:03:00.0 ens64: NIC Link is Down
[3408319.378069] i40e 0000:03:00.0 ens64: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[3408332.303305] i40e 0000:03:00.0 ens64: NIC Link is Down
[3408338.343124] i40e 0000:03:00.0 ens64: Changing Tx descriptor count from 8160 to 512.
[3408338.344477] i40e 0000:03:00.0 ens64: Changing Rx descriptor count from 8160 to 512
[3408761.504733] i40e 0000:03:00.0 ens64: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[3408761.543703] i40e 0000:03:00.0 ens64: Changing Tx descriptor count from 512 to 8160.
[3408761.554146] i40e 0000:03:00.0 ens64: Changing Rx descriptor count from 512 to 8160
[3408761.583181] NetworkManager: page allocation failure: order:6, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO)
[3408761.583595] CPU: 7 PID: 1428 Comm: NetworkManager Kdump: loaded Not tainted 5.14.0-570.28.1.el9_6.x86_64 #1
[3408761.584360] Call Trace:
[3408761.584734] <TASK>
[3408761.585105] dump_stack_lvl+0x34/0x48
[3408761.585481] warn_alloc+0x129/0x150
[3408761.585854] ? __alloc_pages_direct_compact+0xa7/0x210
[3408761.586220] __alloc_pages_slowpath.constprop.0+0xa73/0xb20
[3408761.586580] ? get_page_from_freelist+0x2aa/0x590
[3408761.586936] __alloc_pages+0x21d/0x250
[3408761.587287] __kmalloc_large_node+0x79/0x110
[3408761.587636] __kmalloc+0x322/0x440
[3408761.587982] ? i40e_setup_rx_descriptors+0x91/0xb0 [i40e]
[3408761.588363] ? i40e_setup_rx_descriptors+0x91/0xb0 [i40e]
[3408761.588717] i40e_setup_rx_descriptors+0x91/0xb0 [i40e]
[3408761.589075] i40e_set_ringparam.cold+0x11d/0x56e [i40e]
[3408761.589425] __dev_ethtool+0x8fc/0x1b40
[3408761.589752] ? __sk_destruct+0x155/0x230
[3408761.590073] ? kmem_cache_free+0x3f1/0x420
[3408761.590390] ? kmalloc_trace+0x176/0x330
[3408761.590704] dev_ethtool+0xa8/0x170
[3408761.591013] dev_ioctl+0x1b5/0x580
[3408761.591317] ? sk_ioctl+0x4a/0x110
[3408761.591616] sock_do_ioctl+0xab/0xf0
[3408761.591919] sock_ioctl+0x1ce/0x2e0
[3408761.592210] ? auditd_test_task+0x3c/0x50
[3408761.592498] ? __audit_syscall_entry+0xef/0x140
[3408761.592782] __x64_sys_ioctl+0x87/0xc0
[3408761.593062] do_syscall_64+0x5c/0xe0
[3408761.593336] ? __irq_exit_rcu+0x46/0xc0
[3408761.593608] ? common_interrupt+0x43/0xa0
[3408761.593873] entry_SYSCALL_64_after_hwframe+0x78/0x80
- The system panicked shortly after:
[3408762.143799] BUG: kernel NULL pointer dereference, address: 0000000000000008
[3408762.143802] #PF: supervisor write access in kernel mode
[3408762.143803] #PF: error_code(0x0002) - not-present page
[3408762.143804] PGD 0 P4D 0
[3408762.143806] Oops: 0002 [#1] PREEMPT SMP NOPTI
[3408762.143811] RIP: 0010:i40e_xmit_frame_ring+0xff/0x500 [i40e]
[3408762.143878] Call Trace:
[3408762.143880] <TASK>
[3408762.143881] ? show_trace_log_lvl+0x1c4/0x2df
[3408762.143885] ? show_trace_log_lvl+0x1c4/0x2df
[3408762.143887] ? dev_hard_start_xmit+0x85/0x1d0
[3408762.143890] ? __die_body.cold+0x8/0xd
[3408762.143892] ? page_fault_oops+0x134/0x170
[3408762.143895] ? _copy_to_iter+0x61/0x570
[3408762.143900] ? exc_page_fault+0x62/0x150
[3408762.143903] ? asm_exc_page_fault+0x22/0x30
[3408762.143907] ? i40e_xmit_frame_ring+0xff/0x500 [i40e]
[3408762.143928] dev_hard_start_xmit+0x85/0x1d0
[3408762.143930] sch_direct_xmit+0x9b/0x360
[3408762.143934] __dev_xmit_skb+0x22a/0x570
[3408762.143937] __dev_queue_xmit+0x2c2/0x6b0
[3408762.143939] ? packet_parse_headers+0x107/0x220
[3408762.143942] ? packet_parse_headers+0x107/0x220
[3408762.143944] packet_snd+0x382/0x760
[3408762.143946] __sys_sendto+0x1dc/0x1f0
[3408762.143950] ? syscall_exit_to_user_mode+0x19/0x40
[3408762.143953] ? auditd_test_task+0x3c/0x50
[3408762.143956] ? __audit_syscall_entry+0xef/0x140
[3408762.143959] __x64_sys_sendto+0x20/0x30
[3408762.143961] do_syscall_64+0x5c/0xe0
[3408762.143963] ? audit_reset_context.part.0.constprop.0+0x273/0x2e0
[3408762.143965] ? syscall_exit_work+0x103/0x130
[3408762.143967] ? syscall_exit_to_user_mode+0x19/0x40
[3408762.143970] ? do_syscall_64+0x6b/0xe0
[3408762.143971] ? syscall_exit_work+0x103/0x130
[3408762.143972] ? syscall_exit_to_user_mode+0x19/0x40
[3408762.143975] ? do_syscall_64+0x6b/0xe0
[3408762.143976] entry_SYSCALL_64_after_hwframe+0x78/0x80
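The page allocation failure in the first trace reports order:6, which means the kernel was asked for 2^6 physically contiguous pages in a single request. The following Python sketch works through that arithmetic. It assumes the usual 4096-byte base page size on x86_64, and ASSUMED_ENTRY_BYTES is a hypothetical per-descriptor bookkeeping size chosen only for illustration; it is not taken from the driver source.

```python
PAGE_SIZE = 4096  # bytes; assumed base page size on x86_64


def order_to_bytes(order: int, page_size: int = PAGE_SIZE) -> int:
    """Return the size in bytes of a block of 2**order contiguous pages."""
    return (1 << order) * page_size


def bytes_to_order(size: int, page_size: int = PAGE_SIZE) -> int:
    """Return the smallest order whose block can hold `size` bytes."""
    pages = -(-size // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


if __name__ == "__main__":
    # order:6 from the log corresponds to 64 pages, i.e. 256 KiB.
    print(order_to_bytes(6) // 1024, "KiB")

    # Hypothetical entry size, for illustration only (an assumption).
    ASSUMED_ENTRY_BYTES = 32
    for count in (512, 8160):
        size = count * ASSUMED_ENTRY_BYTES
        print(count, "descriptors ->", size, "bytes -> order", bytes_to_order(size))
```

Under that assumed entry size, a 512-descriptor ring needs only a small block, while an 8160-descriptor ring needs an order 6 block. A request of that size can fail on a long-running system whose free memory is fragmented, even when plenty of memory is free in total.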
Environment
- Red Hat Enterprise Linux 9
- Intel 700 series NIC with i40e driver
- NetworkManager managing NIC ring buffer with ethtool.ring-rx and ethtool.ring-tx connection properties
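To check whether a system matches this environment, it can help to see whether a NetworkManager connection profile sets the ring properties. The following Python sketch reads them through nmcli in terse mode. The profile name ens64 is an assumption borrowed from the interface name in the logs above (the profile may be named differently), and the parser assumes terse output of the form property:value, one per line.

```python
import subprocess

RING_PROPS = ("ethtool.ring-rx", "ethtool.ring-tx")


def parse_ring_settings(terse_output: str) -> dict:
    """Parse 'property:value' lines and keep only the ring properties."""
    settings = {}
    for line in terse_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in RING_PROPS:
            settings[key] = value.strip()
    return settings


def ring_settings(profile: str) -> dict:
    """Query a connection profile with nmcli and return its ring settings."""
    result = subprocess.run(
        ["nmcli", "-t", "-f", ",".join(RING_PROPS),
         "connection", "show", profile],
        check=True, capture_output=True, text=True,
    )
    return parse_ring_settings(result.stdout)


if __name__ == "__main__":
    # "ens64" is an assumed profile name; substitute the real one.
    print(ring_settings("ens64"))
```

The parsing is kept separate from the nmcli call so it can be checked against captured output without touching a live system.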