System crashes with 'RIP: dm_softirq_done+0xc4/0x190 [dm_mod]' after I/O errors on a disk connected to an HP Smart Array controller

Solution Unverified - Updated -

Issue

  • The system crashed after I/O errors on an internal disk connected to an HP Smart Array controller:

    sd 0:0:0:0: [sda]  Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
    sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 12 4b 80 48 00 00 08 00
    end_request: I/O error, dev sda, sector 306937928
    sd 0:0:0:0: [sda]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
    sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 0a a0 20 08 00 00 08 00
    end_request: I/O error, dev sda, sector 178266120
    [...]
    BUG: unable to handle kernel paging request at 00000000130593b0
    IP: [<ffffffffa0003714>] dm_softirq_done+0xc4/0x190 [dm_mod]
    PGD 0 
    Oops: 0000 [#1] SMP 
    last sysfs file: /sys/devices/virtual/block/dm-49/dev
    CPU 40 
    [...]
    Process swapper (pid: 0, threadinfo ffff8840268b0000, task ffff8840268ad520)
    Stack:
     ffff8801a5b43f30 ffff8840268b3fd8 ffffc90051d1801c ffff8801a5b43e90
    <d> ffffffff81a850a0 0000000000000020 0000000000000100 0000000000000004
    <d> ffff8801a5b43eb0 ffffffff81285c45 ffff8801a5b43e90 ffff8801a5b43e90
    Call Trace:
     <IRQ> 
     [<ffffffff81285c45>] blk_done_softirq+0x85/0xa0
     [<ffffffff81085245>] __do_softirq+0xe5/0x230
     [<ffffffffa007af93>] ? do_hpdsa_hw_intr+0x33/0x50 [hpdsa]
     [<ffffffff8100c38c>] call_softirq+0x1c/0x30
     [<ffffffff8100fc95>] do_softirq+0x65/0xa0
     [<ffffffff810850d5>] irq_exit+0x85/0x90
     [<ffffffff81554935>] do_IRQ+0x75/0xf0
     [<ffffffff8100ba53>] ret_from_intr+0x0/0x11
     <EOI> 
     [<ffffffff812fd8ce>] ? intel_idle+0xfe/0x1b0
     [<ffffffff812fd8b1>] ? intel_idle+0xe1/0x1b0
     [<ffffffff8144384a>] cpuidle_idle_call+0x7a/0xe0
     [<ffffffff81009fe6>] cpu_idle+0xb6/0x110
     [<ffffffff81543c39>] start_secondary+0x2c0/0x316
    Code: de ff d0 41 89 c4 45 85 e4 7f b0 48 8b 83 48 01 00 00 44 8b 7b 40 4c 8b 68 10 4c 8b 30 41 83 7d 44 02 74 43 49 8d 96 30 02 00 00 <49> 39 96 30 02 00 00 75 7b 48 89 df e8 fb e7 ff ff 44 89 e6 4c 
    RIP  [<ffffffffa0003714>] dm_softirq_done+0xc4/0x190 [dm_mod]
     RSP <ffff8801a5b43e40>
    CR2: 00000000130593b0
    
  • A second crash was observed with the following call trace:

    sd 0:0:0:0: rejecting I/O to offline device
    sd 0:0:0:0: rejecting I/O to offline device
    EXT4-fs error (device dm-6): ext4_find_entry: reading directory #393272 offset 0
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000044
    IP: [<ffffffff813a4062>] scsi_softirq_done+0x32/0x170
    PGD 8027af9067 PUD 8023a27067 PMD 0 
    [...]
    Call Trace:
     <IRQ> 
     [<ffffffff81285c45>] blk_done_softirq+0x85/0xa0
     [<ffffffff81015533>] ? native_sched_clock+0x13/0x80
     [<ffffffff81085245>] __do_softirq+0xe5/0x230
     [<ffffffff810add3d>] ? sched_clock_cpu+0xcd/0x110
     [<ffffffff8100c38c>] call_softirq+0x1c/0x30
     [<ffffffff8100fc95>] do_softirq+0x65/0xa0
     [<ffffffff810850d5>] irq_exit+0x85/0x90
     [<ffffffff81037695>] smp_call_function_single_interrupt+0x35/0x40
     [<ffffffff8100be33>] call_function_single_interrupt+0x13/0x20
     <EOI> 
     [<ffffffff8154dba7>] ? _spin_unlock_irqrestore+0x17/0x20
     [<ffffffff8106c2ee>] try_to_wake_up+0x24e/0x3e0
     [<ffffffff8106c4b0>] wake_up_state+0x10/0x20
     [<ffffffff810ba60c>] wake_futex+0x3c/0x60
     [<ffffffff810bd453>] do_futex+0x5e3/0xae0
     [<ffffffff8154a416>] ? schedule+0x176/0xb70
     [<ffffffff8100bcce>] ? invalidate_interrupt2+0xe/0x20
     [<ffffffff810bd9cb>] sys_futex+0x7b/0x170
     [<ffffffff810ee5d7>] ? audit_syscall_entry+0x1d7/0x200
     [<ffffffff810ee3ce>] ? __audit_syscall_exit+0x25e/0x290
     [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
    Code: 20 48 89 1c 24 4c 89 64 24 08 4c 89 6c 24 10 4c 89 74 24 18 0f 1f 44 00 00 48 8b 9f d8 00 00 00 44 8b af 38 01 00 00 48 8d 43 18 <44> 8b 73 44 48 c7 43 30 00 00 00 00 48 89 43 18 48 89 43 20 48 
    RIP  [<ffffffff813a4062>] scsi_softirq_done+0x32/0x170
     RSP <ffff884161403e80>
    CR2: 0000000000000044
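
The sector numbers in the end_request lines of the first excerpt can be cross-checked against the CDB bytes logged just before them. The following is a minimal sketch, assuming the standard SCSI WRITE(10) layout (opcode 0x2a, a 4-byte big-endian logical block address in bytes 2-5, a 2-byte transfer length in bytes 7-8) and a 512-byte logical block size, so that the LBA and the kernel's sector number coincide. The helper name decode_write10 is hypothetical and not part of any tool mentioned in this article.

```python
def decode_write10(cdb_hex: str) -> dict:
    """Decode the LBA and transfer length from a WRITE(10) CDB hex dump.

    Assumes the standard 10-byte layout: byte 0 is the opcode (0x2a),
    bytes 2-5 hold the big-endian LBA, bytes 7-8 the block count.
    """
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is accepted
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return {"lba": lba, "blocks": blocks}


# CDBs copied from the first log excerpt in this article.
first = decode_write10("2a 00 12 4b 80 48 00 00 08 00")
second = decode_write10("2a 00 0a a0 20 08 00 00 08 00")
print(first)   # {'lba': 306937928, 'blocks': 8}
print(second)  # {'lba': 178266120, 'blocks': 8}
```

Both decoded LBAs match the sectors reported by the corresponding end_request lines (306937928 and 178266120), and each request covers 8 blocks, which would be 4 KiB under the 512-byte block assumption.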
    

Environment

  • Red Hat Enterprise Linux 6.9
  • kernel-2.6.32-696.el6
  • HP Smart Array controller managed by the vendor-provided hpdsa module
