System crashed during SAN storage update

Solution Verified - Updated -

Issue

  • Several systems crashed, or at least lost paths to the SAN storage (HPE 3PAR), during a 3PAR firmware update.
    The issue was seen only on systems running RHEL 7.6; systems running a RHEL 7.4 kernel were not affected.

  • The following errors and call trace were logged during the crash:

    sd 4:0:1:3: [sdbo] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
    sd 4:0:1:3: [sdbo] CDB: Write(10) 2a 00 00 38 11 1b 00 00 73 00
    blk_update_request: I/O error, dev sdbo, sector 3674395
    device-mapper: multipath: Failing path 68:16.
    device-mapper: multipath: Failing path 68:32.
    sd 4:0:1:4: rejecting I/O to offline device
    device-mapper: multipath: Failing path 71:112.
    qla2xxx [0000:0b:00.0]-8030:3: TM IOCB failed (9).
    qla2xxx [0000:0b:00.0]-800c:3: do_reset failed for cmd=ffff93c6953e2300.
    qla2xxx [0000:0b:00.0]-800f:3: DEVICE RESET FAILED: Task management failed nexus=3:0:1 cmd=ffff93c6953e2300.
    qla2xxx [0000:0b:00.0]-8009:3: DEVICE RESET ISSUED nexus=3:0:3 cmd=ffff93f4c63cda40.
    qla2xxx [0000:0b:00.0]-8030:3: TM IOCB failed (5).
    qla2xxx [0000:0b:00.0]-800c:3: do_reset failed for cmd=ffff93f4c63cda40.
    qla2xxx [0000:0b:00.0]-800f:3: DEVICE RESET FAILED: Task management failed nexus=3:0:3 cmd=ffff93f4c63cda40.
    qla2xxx [0000:0b:00.0]-8009:3: TARGET RESET ISSUED nexus=3:0:0 cmd=ffff93f4c67b3100.
    qla2xxx [0000:0b:00.0]-8030:3: TM IOCB failed (9).
    qla2xxx [0000:0b:00.0]-800c:3: do_reset failed for cmd=ffff93f4c67b3100.
    qla2xxx [0000:0b:00.0]-800f:3: TARGET RESET FAILED: Task management failed nexus=3:0:0 cmd=ffff93f4c67b3100.
    qla2xxx [0000:0b:00.0]-8012:3: BUS RESET ISSUED nexus=3:0:1.
    BUG: unable to handle kernel NULL pointer dereference at           (null)
    IP: [<ffffffffc02a974a>] qla2x00_eh_wait_on_command+0x1a/0xa0 [qla2xxx]
    PGD 0 
    Oops: 0000 [#1] SMP 
    Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 [...] qla2xxx nvme_fc nvme_fabrics
    CPU: 22 PID: 4286 Comm: scsi_eh_3 Kdump: loaded Tainted: P          IOE  ------------   3.10.0-957.5.1.el7.x86_64 #1
    Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
    task: ffff93fe087b4100 ti: ffff93fd4e894000 task.ti: ffff93fd4e894000
    RIP: 0010:[<ffffffffc02a974a>]  [<ffffffffc02a974a>] qla2x00_eh_wait_on_command+0x1a/0xa0 [qla2xxx]
    RSP: 0018:ffff93fd4e897ce8  EFLAGS: 00010286
    RAX: 0000000000000000 RBX: ffff93d56e8db9c0 RCX: ffff93cddf4a1740
    RDX: 0000000000000140 RSI: ffff93d56e8da680 RDI: ffff93d56e8db9c0
    RBP: ffff93fd4e897cf0 R08: ffff93d56e8d9c00 R09: 0000000000004000
    R10: ffff9400b1433ec0 R11: 0000000000000800 R12: 0000000000000000
    R13: ffff93cddf4a1740 R14: 0000000000000000 R15: ffff93d56e8db9c0
    FS:  0000000000000000(0000) GS:ffff93d6e7cc0000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000005931210000 CR4: 00000000000207e0
    Call Trace:
     [<ffffffffc02ad50a>] qla2x00_eh_wait_for_pending_commands+0xda/0x140 [qla2xxx]
     [<ffffffffc02b5a4a>] qla2xxx_eh_bus_reset+0x17a/0x1c0 [qla2xxx]
     [<ffffffffa0ad33c6>] scsi_try_bus_reset+0x46/0x100
     [<ffffffffa0ad52b1>] scsi_eh_ready_devs+0x771/0xc60
     [<ffffffffa0ad6a8c>] scsi_error_handler+0x56c/0x8b0
     [<ffffffffa0ad6520>] ? scsi_eh_get_sense+0x250/0x250
     [<ffffffffa06c1c71>] kthread+0xd1/0xe0
     [<ffffffffa06c1ba0>] ? insert_kthread_work+0x40/0x40
     [<ffffffffa0d74c37>] ret_from_fork_nospec_begin+0x21/0x21
     [<ffffffffa06c1ba0>] ? insert_kthread_work+0x40/0x40
    Code: e8 2c e2 69 e0 eb 97 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 53 48 8b 07 48 89 fb 48 8b 30 48 8b 86 f0 0b 00 00 <48> 8b 10 83 ba 90 00 00 00 01 75 4a 48 8b 40 10 a9 00 00 04 00 
    RIP  [<ffffffffc02a974a>] qla2x00_eh_wait_on_command+0x1a/0xa0 [qla2xxx]
     RSP <ffff93fd4e897ce8>
    CR2: 0000000000000000
    
  • A second crash, with a different call trace, was observed during the 3PAR storage array firmware upgrade:

    CPU: 0 PID: 421 Comm: scsi_eh_1 Kdump: loaded Tainted: G               ------------ T 3.10.0-957.el7.debugscsi.x86_64 #1
    Hardware name: HP ProLiant BL460c Gen9, BIOS I36 02/17/2017
    task: ffffa0626cf2a080 ti: ffffa0626ad18000 task.ti: ffffa0626ad18000
    RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
    RSP: 0018:ffffa0626ad1bbf8  EFLAGS: 00010082
    qla2xxx [0000:09:00.1]-801c:2: Abort command issued nexus=2:1:2 --  0 2003.
    RAX: 0000000000000000 RBX: ffffa05f02bdc540 RCX: 000000010024000d
    RDX: 000000010024000e RSI: ffffe63c240af700 RDI: ffffa05f02bdda40
    RBP: ffffa0626ad1bc10 R08: ffffa05f02bdc540 R09: 000000010024000d
    R10: 0000000002bdd801 R11: ffffe63c240af700 R12: ffffa05f02bdda40
    R13: ffffa05e58fc8740 R14: ffffa05e5d84b380 R15: 00000000000005e1
    FS:  0000000000000000(0000) GS:ffffa05e5fc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 00000001df610000 CR4: 00000000001607f0
    Call Trace:
     [<ffffffffc0451da4>] ? qla2x00_sp_compl+0x54/0xb0 [qla2xxx]
     [<ffffffffc0451bb2>] __qla2x00_abort_all_cmds+0xc2/0x260 [qla2xxx]
     [<ffffffffc0455bd7>] qla2x00_abort_all_cmds+0x27/0x70 [qla2xxx]
     [<ffffffffc046be13>] qla2x00_abort_isp_cleanup+0x2a3/0x330 [qla2xxx]
     [<ffffffffc046bf9d>] qla2x00_abort_isp+0xfd/0x6d0 [qla2xxx]
     [<ffffffffc04557f5>] qla2xxx_eh_host_reset+0x285/0x2c0 [qla2xxx]
     [<ffffffffc045da1a>] ? qla2xxx_eh_bus_reset+0x14a/0x1c0 [qla2xxx]
     [<ffffffffbecd2e46>] scsi_try_host_reset+0x46/0x100
     [<ffffffffbecd4cf6>] scsi_eh_ready_devs+0x876/0xc60
     [<ffffffffbecd63cc>] scsi_error_handler+0x56c/0x8b0
     [<ffffffffbecd5e60>] ? scsi_eh_get_sense+0x250/0x250
     [<ffffffffbe8c1c31>] kthread+0xd1/0xe0
     [<ffffffffbe8c1b60>] ? insert_kthread_work+0x40/0x40
     [<ffffffffbef74c37>] ret_from_fork_nospec_begin+0x21/0x21
     [<ffffffffbe8c1b60>] ? insert_kthread_work+0x40/0x40
    Code:  Bad RIP value.
    RIP  [<          (null)>]           (null)
     RSP <ffffa0626ad1bbf8>
    CR2: 0000000000000000
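
When triaging a captured console log or vmcore-dmesg for this pattern, it can help to summarize the qla2xxx reset escalation, the failed multipath paths, and the SCSI error-handler thread that crashed. The following is a minimal sketch rather than a supported tool: the function names `summarize` and `sd_name` are hypothetical, the regular expressions are modelled only on the message formats shown in the excerpts above, and the major:minor decoding assumes the standard Linux `sd` numbering (majors 8, 65-71 and 128-135, 16 minors per disk).

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this article (assumption:
# other driver versions may word these messages differently).
RESET_RE = re.compile(
    r"qla2xxx \[[0-9a-f:.]+\]-[0-9a-f]{4}:\d+: "
    r"(DEVICE|TARGET|BUS) RESET (ISSUED|FAILED)"
)
PATH_RE = re.compile(r"device-mapper: multipath: Failing path (\d+):(\d+)")
EH_THREAD_RE = re.compile(r"Comm: (scsi_eh_\d+)")

# Block-device majors used by the sd driver; each major covers 16 disks.
SD_MAJORS = [8, 65, 66, 67, 68, 69, 70, 71,
             128, 129, 130, 131, 132, 133, 134, 135]


def sd_name(major, minor):
    """Translate an sd major:minor pair into its device name."""
    index = SD_MAJORS.index(major) * 16 + minor // 16
    letters = ""
    n = index + 1
    while n > 0:  # bijective base-26: a..z, aa..zz, ...
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("a") + rem) + letters
    partition = minor % 16
    return "sd" + letters + (str(partition) if partition else "")


def summarize(log_text):
    """Count reset attempts, list failed paths and crashed EH threads."""
    resets = Counter()
    failed_paths = []
    eh_threads = set()
    for line in log_text.splitlines():
        m = RESET_RE.search(line)
        if m:
            resets[(m.group(1), m.group(2))] += 1
            continue
        m = PATH_RE.search(line)
        if m:
            failed_paths.append(sd_name(int(m.group(1)), int(m.group(2))))
            continue
        m = EH_THREAD_RE.search(line)
        if m:
            eh_threads.add(m.group(1))
    return {
        "resets": dict(resets),
        "failed_paths": failed_paths,
        "eh_threads": sorted(eh_threads),
    }
```

As a consistency check against the first excerpt, `sd_name(68, 32)` decodes to `sdbo`, which is the device reported in the `DID_NO_CONNECT` failure just before path 68:32 is failed. A run of `FAILED` device and target resets followed by a bus or host reset and a `scsi_eh_N` thread in the oops header is the escalation pattern shown in both traces.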
    

Environment

  • Red Hat Enterprise Linux 7.6
  • Kernel versions from 3.10.0-957.el7 up to, but not including, 3.10.0-957.10.1.el7
  • 3PAR Storage array configured with persistent port mode
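
To check whether a host's running kernel falls inside the range listed above, the release string reported by `uname -r` can be compared numerically. This is a minimal sketch under stated assumptions: the helper names `release_tuple` and `in_affected_range` are hypothetical, the bounds are taken only from the Environment list in this article, and the check looks at the version string alone (it says nothing about whether a given build actually carries a fix).

```python
import platform
import re

# Bounds copied from the Environment list: >= 3.10.0-957.el7 and
# < 3.10.0-957.10.1.el7 (assumption: only RHEL 7 "3.10.0-*.el7" kernels apply).
AFFECTED_MIN = (957,)
FIRST_UNAFFECTED = (957, 10, 1)


def release_tuple(uname_r):
    """Return the numeric release part of a RHEL 7 kernel string, or None."""
    m = re.match(r"3\.10\.0-(\d+(?:\.\d+)*)\.el7", uname_r)
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def in_affected_range(uname_r):
    """True if the kernel release lies in the range listed in Environment."""
    rel = release_tuple(uname_r)
    return rel is not None and AFFECTED_MIN <= rel < FIRST_UNAFFECTED


if __name__ == "__main__":
    running = platform.release()
    print(running, "in affected range:", in_affected_range(running))
```

Comparing tuples of integers rather than raw strings avoids ordering `957.10.1` before `957.5.1`, which a plain string comparison would do. The kernel from the first trace, `3.10.0-957.5.1.el7.x86_64`, falls inside the range under this check.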
