Network interface hangs or disappears with "WARNING: at lib/list_debug.c" or "list_add corruption" on realtime kernel
Issue
- Network interface hangs or disappears with "WARNING: at lib/list_debug.c" or "list_add corruption" on realtime kernel
- Physical interface goes down after RT-kernel update
- ethtool shows "Cannot get device settings: No such device" for a NIC which existed at boot time
- A WARN-level message is logged about list corruption, such as:
WARNING: at lib/list_debug.c:29 __list_add+0x77/0xd0()
list_add corruption. next->prev should be prev (ffff883e13896900), but was ffff88803f955b90. (next=ffff88803f955b90).
Call Trace:
[<ffffffff815f08cd>] dump_stack+0x19/0x1c
[<ffffffff8105cd22>] warn_slowpath_common+0x82/0xc0
[<ffffffff8105ce16>] warn_slowpath_fmt+0x46/0x50
[<ffffffff812d3757>] __list_add+0x77/0xd0
[<ffffffff815183cd>] ? __napi_schedule_irqoff+0x1d/0x40
[<ffffffff815183d6>] __napi_schedule_irqoff+0x26/0x40
[<ffffffffa03d8185>] mlx4_en_rx_irq+0x45/0x60 [mlx4_en]
[<ffffffffa0386102>] mlx4_cq_completion+0x42/0x90 [mlx4_core]
[<ffffffffa0387988>] mlx4_eq_int+0x578/0xe50 [mlx4_core]
[<ffffffff810a5b2c>] ? pull_rt_task+0x29c/0x3b0
[<ffffffff810a6937>] ? dequeue_task_rt+0x57/0x70
[<ffffffffa0388274>] mlx4_msi_x_interrupt+0x14/0x20 [mlx4_core]
[<ffffffff810fd15e>] irq_forced_thread_fn+0x2e/0x70
[<ffffffff810fe22f>] irq_thread+0x13f/0x1c0
[<ffffffff810fd130>] ? irq_thread_fn+0x50/0x50
[<ffffffff810fd000>] ? irq_finalize_oneshot+0xf0/0xf0
[<ffffffff810fe0f0>] ? irq_thread_check_affinity+0xb0/0xb0
[<ffffffff810fe0f0>] ? irq_thread_check_affinity+0xb0/0xb0
[<ffffffff8108870e>] kthread+0xbe/0xd0
WARNING: at lib/list_debug.c:33 __list_add+0xbe/0xd0()
list_add corruption. prev->next should be next (ffff880c4fa35b90), but was dead000000100100. (prev=ffff880c12cf8a08).
Call Trace:
[<ffffffff815f078d>] dump_stack+0x19/0x1c
[<ffffffff8105cd12>] warn_slowpath_common+0x82/0xc0
[<ffffffff8105ce06>] warn_slowpath_fmt+0x46/0x50
[<ffffffff812d371e>] __list_add+0xbe/0xd0
[<ffffffff8151837e>] __napi_schedule+0x2e/0x70
[<ffffffffa03ff9fd>] efx_farch_msi_interrupt+0x5d/0x90 [sfc]
[<ffffffff810fcf5e>] irq_forced_thread_fn+0x2e/0x70
[<ffffffff810fe02f>] irq_thread+0x13f/0x1c0
[<ffffffff810fcf30>] ? irq_thread_fn+0x50/0x50
[<ffffffff810fce00>] ? irq_finalize_oneshot+0xf0/0xf0
[<ffffffff810fdef0>] ? irq_thread_check_affinity+0xb0/0xb0
[<ffffffff810fdef0>] ? irq_thread_check_affinity+0xb0/0xb0
[<ffffffff810886fe>] kthread+0xbe/0xd0
- Followed by a net device watchdog hang:
WARNING: at net/sched/sch_generic.c:297 dev_watchdog+0x27a/0x290()
NETDEV WATCHDOG: eth6 (mlx4_core): transmit queue 30 timed out
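The symptom sequence above (a list_add corruption warning followed by a NETDEV WATCHDOG timeout) can be searched for in a saved kernel log. The following is a minimal sketch in Python, not a Red Hat tool: the function name scan_log and the regular expressions are assumptions modelled only on the two message formats quoted in this article, and other kernels may word these messages differently.

```python
import re

# Patterns modelled on the messages quoted in this article (assumed formats).
LIST_CORRUPTION = re.compile(r"list_add corruption\.")
WATCHDOG = re.compile(
    r"NETDEV WATCHDOG: (\S+) \(([^)]+)\): transmit queue (\d+) timed out"
)


def scan_log(lines):
    """Return (interface, driver, queue) for each watchdog timeout that is
    logged after at least one list_add corruption warning."""
    corruption_seen = False
    hits = []
    for line in lines:
        if LIST_CORRUPTION.search(line):
            corruption_seen = True
            continue
        match = WATCHDOG.search(line)
        if match and corruption_seen:
            hits.append((match.group(1), match.group(2), int(match.group(3))))
    return hits
```

For example, the lines could come from a saved copy of the dmesg output or from /var/log/messages, read with open(path).readlines(). An empty result only means this pattern pair was not found; it does not rule out the issue.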
Environment
- Red Hat Enterprise Linux 6 with MRG Realtime
  - Any kernel earlier than kernel-rt-3.10.0-514.rt56.210.el6rt
- Red Hat Enterprise Linux 7 for Real Time
  - 7.3.z kernel earlier than kernel-rt-3.10.0-514.6.1.rt56.429.el7
  - 7.4 kernel earlier than kernel-rt-3.10.0-529.rt56.436.el7
- Network interface with high traffic rate
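
Whether a running realtime kernel is older than the builds listed above can be checked with a short script. The sketch below is an illustration under stated assumptions, not an official check: the helper names (segments, compare, is_affected) are hypothetical, the comparison is a simplified version of RPM-style ordering (numeric segments compared as integers, numeric segments newer than alphabetic ones, no tilde or caret handling), and it assumes that an el7 release beginning with 514 belongs to the 7.3.z stream while any other el7 release is compared against the 7.4 build.

```python
import platform
import re

# Fixed kernel-rt builds quoted in the Environment list of this article.
FIXED_EL6RT = "3.10.0-514.rt56.210.el6rt"
FIXED_EL7_73Z = "3.10.0-514.6.1.rt56.429.el7"
FIXED_EL7_74 = "3.10.0-529.rt56.436.el7"


def segments(version):
    """Split a version string into runs of digits or letters."""
    return re.findall(r"\d+|[A-Za-z]+", version)


def compare(a, b):
    """Simplified RPM-style comparison: return -1, 0 or 1."""
    sa, sb = segments(a), segments(b)
    for x, y in zip(sa, sb):
        if x.isdigit() and y.isdigit():
            xi, yi = int(x), int(y)
            if xi != yi:
                return -1 if xi < yi else 1
        elif x.isdigit() != y.isdigit():
            # A numeric segment is treated as newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        elif x != y:
            return -1 if x < y else 1
    if len(sa) == len(sb):
        return 0
    return -1 if len(sa) < len(sb) else 1


def is_affected(uname_r):
    """True/False for a recognised realtime kernel, None otherwise."""
    release = re.sub(r"\.(x86_64|i686)$", "", uname_r.strip())
    if "-" not in release or ".rt" not in release:
        return None  # not a kernel-rt release string
    if release.endswith(".el6rt"):
        fixed = FIXED_EL6RT
    elif release.endswith(".el7"):
        # Assumption: a release starting with 514 is on the 7.3.z stream.
        build = release.split("-", 1)[1].split(".", 1)[0]
        fixed = FIXED_EL7_73Z if build == "514" else FIXED_EL7_74
    else:
        return None
    return compare(release, fixed) < 0


if __name__ == "__main__":
    # platform.release() returns the same string as "uname -r".
    print(platform.release(), is_affected(platform.release()))
```

A result of None means the string was not recognised as a RHEL 6 MRG Realtime or RHEL 7 for Real Time kernel. The authoritative comparison is the installed kernel-rt package version as reported by the package manager.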