RHEL7.5 - md workqueue deadlock with stacked md devices

Solution Verified - Updated -

Issue

  • The system hung with the following call trace.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
md              D ffff8df19b0bdee0     0    95      2 0x00000000    
Workqueue: md submit_flushes
Call Trace:
 [<ffffffff91f13f79>] schedule+0x29/0x70
 [<ffffffff91d4d6f6>] md_flush_request+0x96/0x150
 [<ffffffff918bc150>] ? wake_up_atomic_t+0x30/0x30
 [<ffffffffc0753c46>] raid0_make_request+0x126/0x1e0 [raid0]
 [<ffffffff91d48040>] md_handle_request+0xd0/0x150
 [<ffffffff91d4819a>] md_make_request+0x6a/0x180
 [<ffffffff91b1aafb>] generic_make_request+0x10b/0x320
 [<ffffffff91b1ad80>] submit_bio+0x70/0x150
 [<ffffffff91d4db5c>] submit_flushes+0xec/0x190
 [<ffffffff918b312f>] process_one_work+0x17f/0x440
 [<ffffffff918b3684>] rescuer_thread+0x294/0x3c0
 [<ffffffff918b33f0>] ? process_one_work+0x440/0x440
 [<ffffffff918bb161>] kthread+0xd1/0xe0
 [<ffffffff918bb090>] ? insert_kthread_work+0x40/0x40
 [<ffffffff91f20677>] ret_from_fork_nospec_begin+0x21/0x21
 [<ffffffff918bb090>] ? insert_kthread_work+0x40/0x40

Environment

  • Red Hat Enterprise Linux 7.5
  • Kernel 3.10.0-862.3.2.el7.x86_64
  • NVMe storage
  • RAID1 on top of RAID0 (stacked md devices)
            RAID1 (md0)
                   |
       ---------------------
       |                   |
      md1                 md2
    (RAID0)             (RAID0)
       |                   |
    nvme1n1             nvme2n1
    nvme0n1             nvme3n1
