Hard lockup deadlock after tlb_choose_channel when using ALB/TLB bonding
Issue
- Hard lockup deadlock after bond_alb_xmit and tlb_choose_channel when using ALB/TLB bonding
- Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu X with calltrace like:
[exception RIP: _spin_lock+0x21]
RIP: ffffffff8153b801 RSP: ffff88089c403830 RFLAGS: 00000097
RAX: 000000000000c50f RBX: ffff88107374f6e0 RCX: ffff880eecd77a20
RDX: 000000000000c50e RSI: 000000000000006d RDI: ffff88107374f728
RBP: ffff88089c403830 R8: 0000000000000000 R9: ffff880eecd77a68
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000006d
R13: ffff88107374f728 R14: 000000000000006b R15: ffff8809f2bc5bc0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#13 [ffff88089c403830] _spin_lock at ffffffff8153b801
#14 [ffff88089c403838] tlb_choose_channel at ffffffffa0392ec5 [bonding]
#15 [ffff88089c403868] bond_alb_xmit at ffffffffa039409e [bonding]
#16 [ffff88089c4038b8] bond_start_xmit at ffffffffa038a48b [bonding]
#17 [ffff88089c4038f8] netpoll_send_skb_on_dev at ffffffff81484491
#18 [ffff88089c403958] netpoll_send_udp at ffffffff81484774
#19 [ffff88089c4039a8] write_msg at ffffffffa00a131b [netconsole]
#20 [ffff88089c403a08] __call_console_drivers at ffffffff81077625
#21 [ffff88089c403a38] _call_console_drivers at ffffffff8107768a
#22 [ffff88089c403a58] release_console_sem at ffffffff81077cd8
#23 [ffff88089c403a98] vprintk at ffffffff810783d8
#24 [ffff88089c403b38] printk at ffffffff81537d5d
#25 [ffff88089c403b98] __netdev_printk at ffffffff81467c11
#26 [ffff88089c403ba8] netdev_err at ffffffff81467de3
#27 [ffff88089c403c18] bond_alb_xmit at ffffffffa03942f5 [bonding]
#28 [ffff88089c403c68] bond_start_xmit at ffffffffa038a48b [bonding]
#29 [ffff88089c403ca8] dev_hard_start_xmit at ffffffff8146fa54
#30 [ffff88089c403d08] dev_queue_xmit at ffffffff8146fefd
#31 [ffff88089c403d48] arp_xmit at ffffffff814d19b8
#32 [ffff88089c403d78] arp_send at ffffffff814d1ef3
#33 [ffff88089c403d98] arp_solicit at ffffffff814d201f
#34 [ffff88089c403e08] neigh_timer_handler at ffffffff814796f8
#35 [ffff88089c403e48] run_timer_softirq at ffffffff8108a4d7
#36 [ffff88089c403ed8] __do_softirq at ffffffff8107ffd1
#37 [ffff88089c403f48] call_softirq at ffffffff8100c38c
#38 [ffff88089c403f60] do_softirq at ffffffff8100fbd5
#39 [ffff88089c403f80] irq_exit at ffffffff8107fe85
#40 [ffff88089c403f90] smp_apic_timer_interrupt at ffffffff815425aa
#41 [ffff88089c403fb0] apic_timer_interrupt at ffffffff8100bc13
--- <IRQ stack> ---
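Read bottom-up, the trace enters bond_alb_xmit twice on the same CPU. Frame #27 calls netdev_err, the message is routed through netconsole (frames #20 to #17) back into bond_start_xmit and bond_alb_xmit, and frame #14 then spins in _spin_lock inside tlb_choose_channel. The following is a minimal user-space sketch of that re-entry pattern, not the kernel code. It assumes, as one reading consistent with the trace, that the outer bond_alb_xmit still holds the lock that tlb_choose_channel tries to take. All Python names (`tx_lock`, `choose_channel`, `xmit`, `netconsole_log`) are hypothetical stand-ins.

```python
import threading

# Non-reentrant lock standing in for the spinlock taken in tlb_choose_channel.
tx_lock = threading.Lock()

def choose_channel():
    # A kernel spinlock would spin here forever if the same CPU already
    # holds it. A non-blocking acquire lets the sketch terminate and report.
    if not tx_lock.acquire(blocking=False):
        return "would deadlock"
    try:
        return "ok"
    finally:
        tx_lock.release()

def xmit(report_error, log):
    # Normal path: pick a slave under the lock, then release it.
    result = choose_channel()
    if report_error:
        # Assumption: the error message is emitted while the lock is held.
        with tx_lock:
            result = log("xmit error")
    return result

def netconsole_log(msg):
    # The log message is sent over the same bond, so logging re-enters xmit.
    return xmit(False, netconsole_log)
```

Calling `xmit(True, netconsole_log)` returns `"would deadlock"` because the nested transmit finds the lock already held by its own caller. With a real spinlock and interrupts disabled, that nested acquire never returns, which is what the hard lockup watchdog reports.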
Environment
- Red Hat Enterprise Linux 6.7
- Bonding in balance-tlb (Mode 5) or balance-alb (Mode 6)
- netconsole service sending log messages over bond
- Bonding slave removed or gone down at any point in the past