RHEL6: kernel crashes during EMC SAN upgrade in scsi_finish_command due to use-after-free of struct scsi_disk memory in size-1024 slab obtained from gendisk.private_data

Solution In Progress - Updated -

Issue

  • Each server has multiple drives (the count depends on the server), and each device has 32 active paths. During the upgrade, the SAN takes 16 of those paths offline, upgrades a component on them, and then brings the 16 paths back online. The hosts failed when the paths came back online.
  • Servers rebooted during an EMC SAN upgrade that was intended to be non-disruptive. Of the more than 30 systems involved, 4 crashed with the following kernel message:
BUG: unable to handle kernel paging request at 0000000000010079
IP: [<ffffffff8137f2fc>] scsi_finish_command+0xac/0x130
PGD 8d47c4067 PUD 533cc8067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:08:00.1/host2/rport-2:0-16/target2:0:14/2:0:14:0/block/sdsf/queue/read_ahead_kb
CPU 0
Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) mptctl mptbase nfsd exportfs oracleasm(U) autofs4 nfs fscache auth_rpcgss nfs_acl lockd sunrpc pcc_cpufreq bonding ipv6 ext3 jbd emcpdm(P)(U) emcpgpx(P)(U) emcpmpx(P)(U) emcp(P)(U) iTCO_wdt iTCO_vendor_support microcode serio_raw sb_edac edac_core i2c_i801 lpc_ich mfd_core tg3 hpilo hpwdt power_meter acpi_ipmi ipmi_si ipmi_msghandler igb i2c_algo_bit i2c_core ixgbe dca ptp pps_core mdio sg ext4 jbd2 mbcache sd_mod xhci_hcd lpfc(U) scsi_transport_fc scsi_tgt crc_t10dif hpsa wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: emcpioc]

Pid: 0, comm: swapper Tainted: P        W  ---------------    2.6.32-504.23.4.el6.x86_64 #1 HP ProLiant DL380 Gen9
RIP: 0010:[<ffffffff8137f2fc>]  [<ffffffff8137f2fc>] scsi_finish_command+0xac/0x130
RSP: 0018:ffff880028203e40  EFLAGS: 00010286
RAX: 0000000000010001 RBX: ffff88106dd7b580 RCX: 000000000000721b
RDX: ffff88402005c800 RSI: 0000000000000286 RDI: 0000000000000286
RBP: ffff880028203e60 R08: ffff884021a8d288 R09: ffff8840269fd090
R10: ffff880028203d08 R11: 0000000000000001 R12: 0000000000004000
R13: ffff88401f46d800 R14: ffff8840219d2000 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000010079 CR3: 0000000ecc6a0000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
Stack:
 ffff88106dd7b580 0000000000002002 0000000000007530 0000000000000005
<d> ffff880028203e90 ffffffff81389637 ffff880028203ea0 ffffffff81a830a0
<d> 0000000000000020 0000000000000100 ffff880028203ec0 ffffffff812787f5
Call Trace:
 <IRQ>
 [<ffffffff81389637>] scsi_softirq_done+0x147/0x170
 [<ffffffff812787f5>] blk_done_softirq+0x85/0xa0
 [<ffffffff8107d901>] __do_softirq+0xc1/0x1e0
 [<ffffffff810eac70>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff8100c38c>] call_softirq+0x1c/0x30
 [<ffffffff8100fbd5>] do_softirq+0x65/0xa0
 [<ffffffff8107d7b5>] irq_exit+0x85/0x90
 [<ffffffff81533ba5>] do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ret_from_intr+0x0/0x11
 <EOI>
 [<ffffffff812eab5e>] ? intel_idle+0xde/0x170
 [<ffffffff812eab41>] ? intel_idle+0xc1/0x170
 [<ffffffff81426167>] cpuidle_idle_call+0xa7/0x140
 [<ffffffff81009fc6>] cpu_idle+0xb6/0x110
 [<ffffffff8151067a>] rest_init+0x7a/0x80
 [<ffffffff81c29f8f>] start_kernel+0x424/0x430
 [<ffffffff81c2933a>] x86_64_start_reservations+0x125/0x129
 [<ffffffff81c29453>] x86_64_start_kernel+0x115/0x124
Code: 77 7c 48 8b 83 80 00 00 00 44 8b 63 68 83 78 44 02 74 2e 48 8b 90 b0 00 00 00 31 c0 48 85 d2 74 0a 48 8b 82 c8 02 00 00 48 8b 00 <48> 8b 40 78 48 85 c0 74 33 48 89 df ff d0 44 39 e0 74 29 41 89
RIP  [<ffffffff8137f2fc>] scsi_finish_command+0xac/0x130
 RSP <ffff880028203e40>
CR2: 0000000000010079
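
The register dump and the Code: line above can be cross-checked against each other. On x86-64, the bytes marked as the faulting instruction, <48> 8b 40 78, decode to mov 0x78(%rax),%rax, so the faulting address should be RAX + 0x78. The following is a minimal Python sketch of that check; the values are copied from the oops, and the variable names are illustrative only.

```python
# Values copied from the oops output above.
rax = 0x0000000000010001   # RAX at the time of the fault
cr2 = 0x0000000000010079   # CR2: the address the CPU failed to access
rip = 0xffffffff8137f2fc   # RIP: reported as scsi_finish_command+0xac

# Faulting instruction bytes "48 8b 40 78" encode mov 0x78(%rax),%rax:
# REX.W prefix 0x48, opcode 0x8b, ModRM 0x40 (base rax, 8-bit displacement),
# followed by the displacement byte 0x78.
faulting_insn = bytes.fromhex("488b4078")
displacement = faulting_insn[3]

# The address the instruction tried to read.
fault_address = rax + displacement

# Start of scsi_finish_command, from the +0xac offset in the RIP line.
function_start = rip - 0xac

print(hex(fault_address))
print(fault_address == cr2)
print(hex(function_start))
```

RAX holds 0x10001, which is not a kernel virtual address, so the load through it faults. This is consistent with the title's description of a pointer read from memory that had already been freed and reused, although the oops alone does not prove which structure was stale.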

Environment

  • Red Hat Enterprise Linux 6
    • kernel 2.6.32-504.23.4.el6
    • 3rd party modules: lpfc(U) emcpgpx(P)(U) emcp(P)(U) emcpmpx(P)(U) emcpdm(P)(U) oracleasm(U) oracleoks(P)(U) oracleadvm(P)(U) oracleacfs(P)(U)
  • EMC PowerPath
    • EMCpower.LINUX-6.0.0.00.00-158.el6.x86_64
  • EMC SAN upgrade (from the host's point of view, equivalent to a port up/down test)
  • Emulex FC HBA Detail
08:00.0 Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03)
08:00.1 Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03)
  • Emulex FC Driver version (non-Red Hat)
Emulex LightPulse Fibre Channel SCSI driver 10.2.477.17
Copyright(c) 2004-2013 Emulex.  All rights reserved.
scsi1 : Emulex LPe12000 PCIe Fibre Channel Adapter  on PCI bus 08 device 00 irq 16
scsi2 : Emulex LPe12000 PCIe Fibre Channel Adapter  on PCI bus 08 device 01 irq 17
  • Oracle ASM
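
The 3rd party module list above can be reproduced from the "Modules linked in:" line of the oops by keeping only the modules that carry a parenthesised flag such as (P) or (U). The following is a minimal Python sketch under that assumption; it only collects the flags as printed and does not interpret them, and the names modules_line and flagged are illustrative.

```python
import re

# "Modules linked in:" line copied from the oops, without the
# trailing "[last unloaded: ...]" note.
modules_line = (
    "oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) mptctl mptbase nfsd "
    "exportfs oracleasm(U) autofs4 nfs fscache auth_rpcgss nfs_acl lockd "
    "sunrpc pcc_cpufreq bonding ipv6 ext3 jbd emcpdm(P)(U) emcpgpx(P)(U) "
    "emcpmpx(P)(U) emcp(P)(U) iTCO_wdt iTCO_vendor_support microcode "
    "serio_raw sb_edac edac_core i2c_i801 lpc_ich mfd_core tg3 hpilo hpwdt "
    "power_meter acpi_ipmi ipmi_si ipmi_msghandler igb i2c_algo_bit i2c_core "
    "ixgbe dca ptp pps_core mdio sg ext4 jbd2 mbcache sd_mod xhci_hcd lpfc(U) "
    "scsi_transport_fc scsi_tgt crc_t10dif hpsa wmi dm_mirror dm_region_hash "
    "dm_log dm_mod"
)

# Map each flagged module name to the list of flags printed after it.
flagged = {}
for token in modules_line.split():
    match = re.fullmatch(r"(\w+)((?:\([A-Z]\))+)", token)
    if match:
        flagged[match.group(1)] = re.findall(r"\(([A-Z])\)", match.group(2))

for name in sorted(flagged):
    print(name, "".join(f"({flag})" for flag in flagged[name]))
```

Run against the line from this oops, the sketch yields the same nine modules that are listed under 3rd party modules above.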
