System crashes with 'RIP: dm_softirq_done+0xc4/0x190 [dm_mod]' after I/O errors on a disk connected to an HP Smart Array controller

Solution Unverified

Issue

  • The system crashed after I/O errors on an internal disk connected to an HP Smart Array controller:

    sd 0:0:0:0: [sda]  Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
    sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 12 4b 80 48 00 00 08 00
    end_request: I/O error, dev sda, sector 306937928
    sd 0:0:0:0: [sda]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
    sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 0a a0 20 08 00 00 08 00
    end_request: I/O error, dev sda, sector 178266120
    [...]
    BUG: unable to handle kernel paging request at 00000000130593b0
    IP: [<ffffffffa0003714>] dm_softirq_done+0xc4/0x190 [dm_mod]
    PGD 0 
    Oops: 0000 [#1] SMP 
    last sysfs file: /sys/devices/virtual/block/dm-49/dev
    CPU 40 
    [...]
    Process swapper (pid: 0, threadinfo ffff8840268b0000, task ffff8840268ad520)
    Stack:
     ffff8801a5b43f30 ffff8840268b3fd8 ffffc90051d1801c ffff8801a5b43e90
    <d> ffffffff81a850a0 0000000000000020 0000000000000100 0000000000000004
    <d> ffff8801a5b43eb0 ffffffff81285c45 ffff8801a5b43e90 ffff8801a5b43e90
    Call Trace:
     <IRQ> 
     [<ffffffff81285c45>] blk_done_softirq+0x85/0xa0
     [<ffffffff81085245>] __do_softirq+0xe5/0x230
     [<ffffffffa007af93>] ? do_hpdsa_hw_intr+0x33/0x50 [hpdsa]
     [<ffffffff8100c38c>] call_softirq+0x1c/0x30
     [<ffffffff8100fc95>] do_softirq+0x65/0xa0
     [<ffffffff810850d5>] irq_exit+0x85/0x90
     [<ffffffff81554935>] do_IRQ+0x75/0xf0
     [<ffffffff8100ba53>] ret_from_intr+0x0/0x11
     <EOI> 
     [<ffffffff812fd8ce>] ? intel_idle+0xfe/0x1b0
     [<ffffffff812fd8b1>] ? intel_idle+0xe1/0x1b0
     [<ffffffff8144384a>] cpuidle_idle_call+0x7a/0xe0
     [<ffffffff81009fe6>] cpu_idle+0xb6/0x110
     [<ffffffff81543c39>] start_secondary+0x2c0/0x316
    Code: de ff d0 41 89 c4 45 85 e4 7f b0 48 8b 83 48 01 00 00 44 8b 7b 40 4c 8b 68 10 4c 8b 30 41 83 7d 44 02 74 43 49 8d 96 30 02 00 00 <49> 39 96 30 02 00 00 75 7b 48 89 df e8 fb e7 ff ff 44 89 e6 4c 
    RIP  [<ffffffffa0003714>] dm_softirq_done+0xc4/0x190 [dm_mod]
     RSP <ffff8801a5b43e40>
    CR2: 00000000130593b0
    
  • A second crash was observed with the following call trace:

    sd 0:0:0:0: rejecting I/O to offline device
    sd 0:0:0:0: rejecting I/O to offline device
    EXT4-fs error (device dm-6): ext4_find_entry: reading directory #393272 offset 0
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000044
    IP: [<ffffffff813a4062>] scsi_softirq_done+0x32/0x170
    PGD 8027af9067 PUD 8023a27067 PMD 0 
    [...]
    Call Trace:
     <IRQ> 
     [<ffffffff81285c45>] blk_done_softirq+0x85/0xa0
     [<ffffffff81015533>] ? native_sched_clock+0x13/0x80
     [<ffffffff81085245>] __do_softirq+0xe5/0x230
     [<ffffffff810add3d>] ? sched_clock_cpu+0xcd/0x110
     [<ffffffff8100c38c>] call_softirq+0x1c/0x30
     [<ffffffff8100fc95>] do_softirq+0x65/0xa0
     [<ffffffff810850d5>] irq_exit+0x85/0x90
     [<ffffffff81037695>] smp_call_function_single_interrupt+0x35/0x40
     [<ffffffff8100be33>] call_function_single_interrupt+0x13/0x20
     <EOI> 
     [<ffffffff8154dba7>] ? _spin_unlock_irqrestore+0x17/0x20
     [<ffffffff8106c2ee>] try_to_wake_up+0x24e/0x3e0
     [<ffffffff8106c4b0>] wake_up_state+0x10/0x20
     [<ffffffff810ba60c>] wake_futex+0x3c/0x60
     [<ffffffff810bd453>] do_futex+0x5e3/0xae0
     [<ffffffff8154a416>] ? schedule+0x176/0xb70
     [<ffffffff8100bcce>] ? invalidate_interrupt2+0xe/0x20
     [<ffffffff810bd9cb>] sys_futex+0x7b/0x170
     [<ffffffff810ee5d7>] ? audit_syscall_entry+0x1d7/0x200
     [<ffffffff810ee3ce>] ? __audit_syscall_exit+0x25e/0x290
     [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
    Code: 20 48 89 1c 24 4c 89 64 24 08 4c 89 6c 24 10 4c 89 74 24 18 0f 1f 44 00 00 48 8b 9f d8 00 00 00 44 8b af 38 01 00 00 48 8d 43 18 <44> 8b 73 44 48 c7 43 30 00 00 00 00 48 89 43 18 48 89 43 20 48 
    RIP  [<ffffffff813a4062>] scsi_softirq_done+0x32/0x170
     RSP <ffff884161403e80>
    CR2: 0000000000000044
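
The failed writes in the first log excerpt can be cross-checked by decoding the Write(10) CDB bytes: bytes 2-5 hold the big-endian logical block address and bytes 7-8 the transfer length in blocks. The following is a minimal sketch, not part of the original article; the helper name `decode_write10` is hypothetical, and the comparison with the `end_request` sector number assumes a 512-byte logical block size.

```python
def decode_write10(cdb_hex: str) -> dict:
    """Decode a SCSI WRITE(10) CDB given as a hex string (hypothetical helper)."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    # WRITE(10) is a 10-byte command with opcode 0x2a.
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return {"lba": lba, "blocks": blocks}


# CDBs copied from the log lines in the first excerpt.
for cdb in ("2a 00 12 4b 80 48 00 00 08 00", "2a 00 0a a0 20 08 00 00 08 00"):
    print(decode_write10(cdb))
# {'lba': 306937928, 'blocks': 8}
# {'lba': 178266120, 'blocks': 8}
```

The decoded addresses match the sector numbers reported in the `end_request: I/O error` lines (306937928 and 178266120), and each request covers 8 blocks, which is 4 KiB under the 512-byte assumption.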
    

Environment
