System hang due to slow IO to DRBD on RHEL

Solution Verified - Updated -

Issue

  • The system hangs intermittently during backups. The console reports hung XFS tasks on the DRBD device, as shown below, and a reboot is required to recover.
[506710.164547] INFO: task xfsaild/drbd0:4353 blocked for more than 240 seconds.
[506710.166234] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[506710.167770] xfsaild/drbd0   D ffffffffffffffff     0  4353      2 0x00000080
[506710.169311]  ffff88103d653b20 0000000000000046 ffff88102fac3f40 ffff88103d653fd8
[506710.170870]  ffff88103d653fd8 ffff88103d653fd8 ffff88102fac3f40 ffff88103d653c78
[506710.172434]  7fffffffffffffff ffff88103d653c70 ffff88102fac3f40 ffffffffffffffff
[506710.173980] Call Trace:
[506710.175512]  [<ffffffff816a94c9>] schedule+0x29/0x70
[506710.177058]  [<ffffffff816a6fd9>] schedule_timeout+0x239/0x2c0
[506710.178596]  [<ffffffff810c1309>] ? ttwu_do_wakeup+0x19/0xd0
[506710.180128]  [<ffffffff810c149d>] ? ttwu_do_activate.constprop.92+0x5d/0x70
[506710.181666]  [<ffffffff810c4593>] ? try_to_wake_up+0x183/0x340
[506710.183199]  [<ffffffff816a987d>] wait_for_completion+0xfd/0x140
[506710.184731]  [<ffffffff810c4810>] ? wake_up_state+0x20/0x20
[506710.186268]  [<ffffffff810a987d>] flush_work+0xfd/0x190
[506710.187790]  [<ffffffff810a5df0>] ? move_linked_works+0x90/0x90
[506710.189389]  [<ffffffffc03ebafa>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[506710.190924]  [<ffffffff8109927e>] ? try_to_del_timer_sync+0x5e/0x90
[506710.192500]  [<ffffffffc03e9bb5>] _xfs_log_force+0x85/0x2c0 [xfs]
[506710.194017]  [<ffffffff81098b20>] ? internal_add_timer+0x70/0x70
[506710.195553]  [<ffffffffc03f5b7c>] ? xfsaild+0x16c/0x6f0 [xfs]
[506710.197068]  [<ffffffffc03e9e1c>] xfs_log_force+0x2c/0x70 [xfs]
[506710.198573]  [<ffffffffc03f5a10>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[506710.200089]  [<ffffffffc03f5b7c>] xfsaild+0x16c/0x6f0 [xfs]
[506710.201591]  [<ffffffffc03f5a10>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[506710.203082]  [<ffffffff810b098f>] kthread+0xcf/0xe0
[506710.204570]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[506710.206060]  [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
[506710.207531]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[506710.208997] sending NMI to all CPUs:
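
When reviewing a long console or dmesg capture, it can help to pull out just the hung-task warnings. The following is a minimal Python sketch, not part of the original solution. It assumes the warning format seen in the trace above ("INFO: task NAME:PID blocked for more than N seconds."); the names `HUNG_TASK_RE` and `find_hung_tasks` are hypothetical and chosen only for illustration.

```python
import re

# Pattern for the kernel hung-task warning, based on the line seen in the
# trace above, e.g.:
# "[506710.164547] INFO: task xfsaild/drbd0:4353 blocked for more than 240 seconds."
# Assumption: other kernel versions may word this message differently.
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return one dict per hung-task warning found in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append({
                "uptime_s": float(match.group("ts")),    # seconds since boot
                "task": match.group("comm"),             # kernel thread or process name
                "pid": int(match.group("pid")),
                "blocked_s": int(match.group("secs")),   # hung-task threshold reported
            })
    return hits


if __name__ == "__main__":
    sample = (
        "[506710.164547] INFO: task xfsaild/drbd0:4353 "
        "blocked for more than 240 seconds."
    )
    for hit in find_hung_tasks(sample):
        print(hit)
```

Feeding the saved console output to `find_hung_tasks` gives a quick list of which tasks were blocked and when, which can then be compared against the backup schedule.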

Environment

  • Red Hat Enterprise Linux 7
  • DRBD (Distributed Replicated Block Device) v9.0.8-1
