e1000 driver deadlocks in the e1000_watchdog worker thread after a network transmit timeout and reset
Issue
- A network interface that uses the e1000 driver stops responding, and no further communication is possible through it. The hang typically follows a network transmit timeout and the resulting adapter reset, and the kernel log shows messages such as:
NETDEV WATCHDOG: eth0 (e1000): transmit queue 0 timed out
...
e1000 0000:02:00.0: eth0: Reset adapter
INFO: task events/0:19 blocked for more than 120 seconds.
Tainted: G W --------------- 2.6.32-504.30.3.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
events/0 D 0000000000000000 0 19 2 0x00000000
ffff88043a2b5d40 0000000000000046 ffff88043a2b5d60 ffffffff81063c63
ffff880028215928 ffff88043a2b3558 ffff88043a2b5ce0 ffffffff8106cc43
ffff88043a2b5d60 ffff88043a2b3558 ffff88043a2b3ad8 ffff88043a2b5fd8
Call Trace:
[<ffffffff81063c63>] ? perf_event_task_sched_out+0x33/0x70
[<ffffffff8106cc43>] ? dequeue_entity+0x113/0x2e0
[<ffffffff8152b486>] __mutex_lock_slowpath+0x96/0x210
[<ffffffff8152afab>] mutex_lock+0x2b/0x50
[<ffffffffa0265123>] e1000_watchdog+0x73/0x550 [e1000]
[<ffffffff8109ef4e>] ? prepare_to_wait+0x4e/0x80
[<ffffffffa02650b0>] ? e1000_watchdog+0x0/0x550 [e1000]
[<ffffffff81098100>] worker_thread+0x170/0x2a0
[<ffffffff8109ec20>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81097f90>] ? worker_thread+0x0/0x2a0
[<ffffffff8109e78e>] kthread+0x9e/0xc0
[<ffffffff8100c28a>] child_rip+0xa/0x20
[<ffffffff8109e6f0>] ? kthread+0x0/0xc0
[<ffffffff8100c280>] ? child_rip+0x0/0x20
INFO: task events/1:20 blocked for more than 120 seconds.
Tainted: G W --------------- 2.6.32-504.30.3.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
events/1 D 0000000000000001 0 20 2 0x00000000
ffff88043a2b9ba0 0000000000000046 000000000000000a ffff8800000bd0a0
ffff88043a2b9b00 ffffffff8153022a ffff88043a2b7680 ffff880028295928
0000000000005ae7 ffff88043a2b2ab0 ffff88043a2b3068 ffff88043a2b9fd8
Call Trace:
[<ffffffff8153022a>] ? atomic_notifier_call_chain+0x1a/0x20
[<ffffffff8106cc43>] ? dequeue_entity+0x113/0x2e0
[<ffffffff8152aa05>] schedule_timeout+0x215/0x2e0
[<ffffffff8152a683>] wait_for_common+0x123/0x180
[<ffffffff81064c00>] ? default_wake_function+0x0/0x20
[<ffffffff8152a79d>] wait_for_completion+0x1d/0x20
[<ffffffff81098c33>] __cancel_work_timer+0x1b3/0x1e0
[<ffffffff81098580>] ? wq_barrier_func+0x0/0x20
[<ffffffff81098c72>] cancel_delayed_work_sync+0x12/0x20
[<ffffffffa0263c7a>] e1000_down_and_stop+0x3a/0x60 [e1000]
[<ffffffffa0269425>] e1000_down+0x155/0x200 [e1000]
[<ffffffffa0269a00>] ? e1000_reset_task+0x0/0xb0 [e1000]
[<ffffffffa0269a66>] e1000_reset_task+0x66/0xb0 [e1000]
[<ffffffff81098100>] worker_thread+0x170/0x2a0
[<ffffffff8109ec20>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81097f90>] ? worker_thread+0x0/0x2a0
[<ffffffff8109e78e>] kthread+0x9e/0xc0
[<ffffffff8100c28a>] child_rip+0xa/0x20
[<ffffffff8109e6f0>] ? kthread+0x0/0xc0
[<ffffffff8100c280>] ? child_rip+0x0/0x20
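The two traces above show one worker blocked in mutex_lock called from e1000_watchdog, and another blocked in cancel_delayed_work_sync called from e1000_reset_task. Read together they suggest a circular wait, in which the reset path waits for the watchdog work to finish while the watchdog waits for a mutex; that reading is an inference from the traces, not something the log states. A minimal sketch for spotting the same pair of traces in a saved kernel log follows. The helper names (hung_task_frames, looks_like_e1000_deadlock) are hypothetical, and the parsing assumes the RHEL 6 hung-task message format shown above.

```python
import re

# Function names taken from the two call traces in this article.
WATCHDOG_FRAMES = ("e1000_watchdog", "mutex_lock")
RESET_FRAMES = ("e1000_reset_task", "cancel_delayed_work_sync")

# Header printed by the hung-task detector, e.g.
# "INFO: task events/0:19 blocked for more than 120 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than \d+ seconds")
# Stack frame line, e.g. "[<ffffffff8152afab>] mutex_lock+0x2b/0x50".
# Group 1 is set when the frame carries the "?" (unreliable) marker.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x")


def hung_task_frames(log_text):
    """Map each hung task name to the set of reliable frames in its trace."""
    tasks = {}
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            current = header.group(1)
            tasks[current] = set()
            continue
        frame = FRAME_RE.search(line)
        # Skip frames marked "?", which the kernel could not verify.
        if frame and current is not None and not frame.group(1):
            tasks[current].add(frame.group(2))
    return tasks


def looks_like_e1000_deadlock(log_text):
    """True if one hung task matches each of the two trace signatures."""
    traces = hung_task_frames(log_text).values()
    in_watchdog = any(all(f in t for f in WATCHDOG_FRAMES) for t in traces)
    in_reset = any(all(f in t for f in RESET_FRAMES) for t in traces)
    return in_watchdog and in_reset
```

For example, passing the contents of a saved /var/log/messages file to looks_like_e1000_deadlock returns True only when both signatures appear in hung-task reports. A match is a hint that the log resembles this issue, not a confirmation of the root cause.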
Environment
- Red Hat Enterprise Linux 6.3, 6.4, 6.5, 6.6
- Network interface using the e1000 driver
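To check whether a given interface matches this environment, the bound driver can be read from sysfs, where /sys/class/net/<interface>/device/driver is a symbolic link to the driver's directory. The sketch below assumes that layout; the function name interface_driver and the sysfs_root parameter are hypothetical and exist only to keep the example self-contained.

```python
import os


def interface_driver(ifname, sysfs_root="/sys/class/net"):
    """Return the kernel driver name bound to ifname, or None if unknown."""
    link = os.path.join(sysfs_root, ifname, "device", "driver")
    try:
        # The link target ends in the driver name, e.g. ".../drivers/e1000".
        return os.path.basename(os.readlink(link))
    except OSError:
        # No such interface, or a virtual device without a backing driver.
        return None
```

Calling interface_driver("eth0") on an affected system would be expected to return "e1000"; the command `ethtool -i eth0` reports the same information.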