Why is the qib driver unable to handle a job?
Issue
- The issue is not exhibited when running with the Intel driver. Below are the call traces:
Shutting down interface ib0: INFO: task ipoib:1091 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ipoib D 0000000000000002 0 1091 2 0x00000000
ffff8802160f7d20 0000000000000046 0000000000000000 ffffffff8150e2ed
ffff8802160f7cd0 ffffffff81092b38 0000000000000000 ffffffff81092ac0
ffff88021776f098 ffff8802160f7fd8 000000000000fb88 ffff88021776f098
Call Trace:
[<ffffffff8150e2ed>] ? wait_for_completion+0x1d/0x20
[<ffffffff81092b38>] ? synchronize_sched+0x58/0x60
[<ffffffff81092ac0>] ? wakeme_after_rcu+0x0/0x20
[<ffffffffa0131145>] qib_destroy_qp+0x1e5/0x2a0 [ib_qib]
[<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa00f2511>] ib_destroy_qp+0xe1/0x120 [ib_core]
[<ffffffffa029e712>] ipoib_cm_tx_reap+0x1c2/0x510 [ib_ipoib]
[<ffffffffa029e550>] ? ipoib_cm_tx_reap+0x0/0x510 [ib_ipoib]
[<ffffffff81090ac0>] worker_thread+0x170/0x2a0
[<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81090950>] ? worker_thread+0x0/0x2a0
[<ffffffff81096916>] kthread+0x96/0xa0
[<ffffffff8100c0ca>] child_rip+0xa/0x20
[<ffffffff81096880>] ? kthread+0x0/0xa0
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20
- Kernel panic with RIP: remove_qp+272
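
When reading a hung-task trace like the one above, it can help to separate the frames the kernel reports as reliable from those prefixed with "?", which the stack unwinder could not verify. The following is a minimal sketch, not part of the original report: the function name `reliable_frames` and the `SAMPLE` string are hypothetical, and the sample lines are copied from the trace above. It assumes the RHEL 6 style frame format `[<address>] symbol+0xoff/0xsize [module]`.

```python
import re

# Matches frames such as "[<ffffffffa0131145>] qib_destroy_qp+0x1e5/0x2a0 [ib_qib]".
# A leading "?" marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(trace):
    """Return (symbol, module) pairs for frames not marked with '?'.

    module is None for symbols that belong to the core kernel.
    """
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group("unreliable"):
            frames.append((match.group("symbol"), match.group("module")))
    return frames


# Hypothetical sample: a few lines copied from the call trace in this article.
SAMPLE = """\
[<ffffffff8150e2ed>] ? wait_for_completion+0x1d/0x20
[<ffffffffa0131145>] qib_destroy_qp+0x1e5/0x2a0 [ib_qib]
[<ffffffffa00f2511>] ib_destroy_qp+0xe1/0x120 [ib_core]
[<ffffffffa029e712>] ipoib_cm_tx_reap+0x1c2/0x510 [ib_ipoib]
[<ffffffff81090ac0>] worker_thread+0x170/0x2a0
"""

if __name__ == "__main__":
    for symbol, module in reliable_frames(SAMPLE):
        print(f"{symbol} [{module or 'kernel'}]")
```

Applied to the trace above, this kind of filtering leaves the chain from worker_thread through ipoib_cm_tx_reap and ib_destroy_qp to qib_destroy_qp, which is where the ipoib task is blocked.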
Environment
- Red Hat Enterprise Linux 6
- InfiniBand fabric
- Intel QLogic InfiniBand cards