System crashed with message - 'WARNING: at lib/list_debug.c:48 list_del+0x6e/0xa0() (Not tainted)'

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux (RHEL) 6, 7
  • Broadcom Corporation BCM57840 NetXtreme II 10 Gigabit Ethernet (for FCoE)

Issue

  • System crashed with following errors in Red Hat Enterprise Linux 6 :

    hrtimer: interrupt took 20589 ns
    ------------[ cut here ]------------
    WARNING: at lib/list_debug.c:48 list_del+0x6e/0xa0() (Not tainted)
    Hardware name: PowerEdge R820
    list_del corruption. prev->next should be ffff882035627d90, but was ffff884069541588
    [...]
    Pid: 40267, comm: perl Not tainted 2.6.32-642.3.1.el6.x86_64 #1
    Call Trace:
     [<ffffffff8107c691>] ? warn_slowpath_common+0x91/0xe0
     [<ffffffff8107c796>] ? warn_slowpath_fmt+0x46/0x60
     [<ffffffff812ad16e>] ? list_del+0x6e/0xa0
     [<ffffffff81547eed>] ? wait_for_common+0x14d/0x180
     [<ffffffff8106c4a0>] ? default_wake_function+0x0/0x20
     [<ffffffff81547fd3>] ? wait_for_completion_timeout+0x13/0x20
     [<ffffffffa05410b1>] ? bnx2fc_get_host_stats+0xa1/0x280 [bnx2fc]
     [<ffffffffa04cf630>] ? fc_stat_show+0x90/0xc0 [scsi_transport_fc]
     [<ffffffffa04cf8b6>] ? show_fcstat_tx_frames+0x16/0x20 [scsi_transport_fc]
     [<ffffffff8137c647>] ? dev_attr_show+0x27/0x50
     [<ffffffff8113b9be>] ? __get_free_pages+0xe/0x50
     [<ffffffff812170e1>] ? sysfs_read_file+0x111/0x200
     [<ffffffff8119a305>] ? vfs_read+0xb5/0x1a0
     [<ffffffff8119b0b6>] ? fget_light_pos+0x16/0x50
     [<ffffffff8119a651>] ? sys_read+0x51/0xb0
     [<ffffffff810ee1fe>] ? __audit_syscall_exit+0x25e/0x290
     [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
    ---[ end trace 606cd09771e7a6a7 ]---
    ------------[ cut here ]------------
    [...]
    bnx2fc: eh_abort: io_req (xid = 0x363) already in abts processing
    Watchdog detected hard LOCKUP on cpu 35
    Pid: 40760, comm: perl Tainted: G        W  -- ------------    2.6.32-642.3.1.el6.x86_64 #1
    Call Trace:
     <NMI>  [<ffffffff810f29c1>] ? watchdog_overflow_callback+0xf1/0x110
     [<ffffffff8112b7a7>] ? __perf_event_overflow+0xa7/0x240
     [<ffffffff8101dee6>] ? x86_perf_event_set_period+0xf6/0x180
     [<ffffffff8112be04>] ? perf_event_overflow+0x14/0x20
     [<ffffffff810252bc>] ? intel_pmu_handle_irq+0x21c/0x480
     [<ffffffff8154bfc9>] ? perf_event_nmi_handler+0x39/0xb0
     [<ffffffff8154dac5>] ? notifier_call_chain+0x55/0x80
     [<ffffffff8154db2a>] ? atomic_notifier_call_chain+0x1a/0x20
     [<ffffffff810acc2e>] ? notify_die+0x2e/0x30
     [<ffffffff8154b743>] ? do_nmi+0x1c3/0x350
     [<ffffffff8154b003>] ? nmi+0x83/0x90
     [<ffffffff8154a738>] ? _spin_lock_irq+0x28/0x40
     <<EOE>>  [<ffffffff81547ddc>] ? wait_for_common+0x3c/0x180
     [<ffffffff81547fd3>] ? wait_for_completion_timeout+0x13/0x20
     [<ffffffffa05410b1>] ? bnx2fc_get_host_stats+0xa1/0x280 [bnx2fc]
     [<ffffffffa04cf630>] ? fc_stat_show+0x90/0xc0 [scsi_transport_fc]
     [<ffffffffa04cf8b6>] ? show_fcstat_tx_frames+0x16/0x20 [scsi_transport_fc]
     [<ffffffff8137c647>] ? dev_attr_show+0x27/0x50
     [<ffffffff8113b9be>] ? __get_free_pages+0xe/0x50
     [<ffffffff812170e1>] ? sysfs_read_file+0x111/0x200
     [<ffffffff8119a305>] ? vfs_read+0xb5/0x1a0
     [<ffffffff8119b0b6>] ? fget_light_pos+0x16/0x50
     [<ffffffff8119a651>] ? sys_read+0x51/0xb0
     [<ffffffff810ee1fe>] ? __audit_syscall_exit+0x25e/0x290
     [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
    Kernel panic - not syncing: Hard LOCKUP
    
  • System crashed with following errors in Red Hat Enterprise Linux 7 :

    [4678539.802432] ------------[ cut here ]------------
    [4678539.802449] WARNING: at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
    [4678539.802452] list_del corruption. prev->next should be ffff887f1533fde0, but was ffff883f19e81180
    [4678539.802455] Modules linked in: arc4 md4 nls_utf8 cifs iptable_filter fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 bridge bonding nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dm_service_time ses enclosure bnx2fc(OE) 8021q cnic(OE) uio garp stp mrp netconsole llc fcoe libfcoe libfc scsi_transport_fc scsi_tgt iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg hpwdt ipmi_ssif pcspkr hpilo i2c_i801 lpc_ich mfd_core sb_edac edac_core ipmi_si shpchp ipmi_msghandler pcc_cpufreq acpi_power_meter wmi dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables xfs usb_storage sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common
    [4678539.802548]  mgag200 crc32c_intel syscopyarea sysfillrect serio_raw sysimgblt i2c_algo_bit drm_kms_helper ttm bnx2x(OE) mdio drm vxlan ip6_udp_tunnel udp_tunnel i2c_core ptp pps_core libcrc32c hpsa dm_mirror dm_region_hash dm_log dm_mod
    [4678539.802564] CPU: 37 PID: 53261 Comm: hpsensor Tainted: G        W  OE  ------------   3.10.0-327.el7.x86_64 #1
    [4678539.802565] Hardware name: HP Synergy 480 Gen9/Synergy 480 Gen9 Compute Module, BIOS I37 02/17/2017
    [4678539.802567]  ffff887f1533fd38 00000000d5990cbd ffff887f1533fcf0 ffffffff816351f1
    [4678539.802572]  ffff887f1533fd28 ffffffff8107b200 ffff887f1533fde0 0000000000000000
    [4678539.802575]  ffff883f19e81178 0000000000000000 ffff883f19e81170 ffff887f1533fd90
    [4678539.802580] Call Trace:
    [4678539.802589]  [<ffffffff816351f1>] dump_stack+0x19/0x1b
    [4678539.802596]  [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
    [4678539.802598]  [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
    [4678539.802601]  [<ffffffff8130c3f1>] __list_del_entry+0xa1/0xd0
    [4678539.802603]  [<ffffffff8130c42d>] list_del+0xd/0x30
    [4678539.802609]  [<ffffffff8163b167>] wait_for_completion_timeout+0x147/0x180
    [4678539.802617]  [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
    [4678539.802624]  [<ffffffffa04f1219>] bnx2fc_get_host_stats+0x89/0x2b0 [bnx2fc]
    [4678539.802630]  [<ffffffffa0410fca>] fc_stat_show.isra.17+0x6a/0xc0 [scsi_transport_fc]
    [4678539.802633]  [<ffffffffa04112d9>] show_fcstat_error_frames+0x19/0x20 [scsi_transport_fc]
    [4678539.802639]  [<ffffffff813f1c20>] dev_attr_show+0x20/0x60
    [4678539.802645]  [<ffffffff8125928a>] sysfs_read_file+0x9a/0x1a0
    [4678539.802650]  [<ffffffff811de43c>] vfs_read+0x9c/0x170
    [4678539.802652]  [<ffffffff811def8f>] SyS_read+0x7f/0xe0
    [4678539.802658]  [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
    [4678539.802659] ---[ end trace 35022e1ea5c3728a ]---
    [...]
    [4678636.673551] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 37
    [4678636.674199] CPU: 37 PID: 53261 Comm: hpsensor Tainted: G        W  OE  ------------   3.10.0-327.el7.x86_64 #1
    [4678636.674810] Hardware name: HP Synergy 480 Gen9/Synergy 480 Gen9 Compute Module, BIOS I37 02/17/2017
    [4678636.675460]  ffffffff818674b8 00000000d5990cbd ffff883f7fce5af0 ffffffff816351f1
    [4678636.676109]  ffff883f7fce5b70 ffffffff8162ea6c 0000000000000010 ffff883f7fce5b80
    [4678636.676736]  ffff883f7fce5b20 00000000d5990cbd 0000000000000000 0000000000000025
    [4678636.677365] Call Trace:
    [4678636.677980]  <NMI>  [<ffffffff816351f1>] dump_stack+0x19/0x1b
    [4678636.678608]  [<ffffffff8162ea6c>] panic+0xd8/0x1e7
    [4678636.679238]  [<ffffffff8111b450>] ? restart_watchdog_hrtimer+0x50/0x50
    [4678636.679867]  [<ffffffff8111b512>] watchdog_overflow_callback+0xc2/0xd0
    [4678636.680566]  [<ffffffff8115ed71>] __perf_event_overflow+0xa1/0x250
    [4678636.681212]  [<ffffffff8115f844>] perf_event_overflow+0x14/0x20
    [4678636.681842]  [<ffffffff810325a8>] intel_pmu_handle_irq+0x1e8/0x470
    [4678636.682462]  [<ffffffff812f4b11>] ? ioremap_page_range+0x241/0x320
    [4678636.683092]  [<ffffffff811a59b1>] ? unmap_kernel_range_noflush+0x11/0x20
    [4678636.683704]  [<ffffffff81395b94>] ? ghes_copy_tofrom_phys+0x124/0x210
    [4678636.684320]  [<ffffffff81395d20>] ? ghes_read_estatus+0xa0/0x190
    [4678636.684934]  [<ffffffff8163ebeb>] perf_event_nmi_handler+0x2b/0x50
    [4678636.685547]  [<ffffffff8163e339>] nmi_handle.isra.0+0x69/0xb0
    [4678636.686147]  [<ffffffff8163e4e9>] do_nmi+0x169/0x340
    [4678636.686735]  [<ffffffff8163d771>] end_repeat_nmi+0x1e/0x2e
    [4678636.687304]  [<ffffffff8163ccba>] ? _raw_spin_lock_irq+0x3a/0x60
    [4678636.687868]  [<ffffffff8163ccba>] ? _raw_spin_lock_irq+0x3a/0x60
    [4678636.688417]  [<ffffffff8163ccba>] ? _raw_spin_lock_irq+0x3a/0x60
    [4678636.688946]  <<EOE>>  [<ffffffff8163b05a>] wait_for_completion_timeout+0x3a/0x180
    [4678636.689486]  [<ffffffffa047a5d0>] ? cnic_submit_bnx2x_kwqes+0x40/0x80 [cnic]
    [4678636.690019]  [<ffffffffa04f6e1c>] ? bnx2fc_send_stat_req+0x7c/0xa0 [bnx2fc]
    [4678636.690594]  [<ffffffffa04f1219>] bnx2fc_get_host_stats+0x89/0x2b0 [bnx2fc]
    [4678636.691104]  [<ffffffffa0410fca>] fc_stat_show.isra.17+0x6a/0xc0 [scsi_transport_fc]
    [4678636.691616]  [<ffffffffa0411399>] show_fcstat_tx_frames+0x19/0x20 [scsi_transport_fc]
    [4678636.692119]  [<ffffffff813f1c20>] dev_attr_show+0x20/0x60
    [4678636.692613]  [<ffffffff8125928a>] sysfs_read_file+0x9a/0x1a0
    [4678636.693102]  [<ffffffff811de43c>] vfs_read+0x9c/0x170
    [4678636.693581]  [<ffffffff811def8f>] SyS_read+0x7f/0xe0
    [4678636.694053]  [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
    

Resolution

Red Hat Enterprise Linux 6.9.z: upgrade to kernel-2.6.32-696.10.1.el6 from RHBA-2017:2504 or later.
Red Hat Enterprise Linux 6.10: upgrade to kernel-2.6.32-754.el6 from RHSA-2018:1854 or later.
Red Hat Enterprise Linux 7: upgrade to kernel-3.10.0-862.el7.x86_64.rpm from RHSA-2018:1062 or later.
NOTE: If the issue happens with bnx2fc of 3rd party module, please contact a vendor if there is an updated version to resolve it.

Root Cause

If multiple tasks attempt to read the stats, it may happen
that the start_req_done completion is re-initialized while
still being used by another task, causing a list corruption.

This patch fixes the bug by adding a mutex to serialize the
calls to bnx2fc_get_host_stats().

WARNING: at lib/list_debug.c:48 list_del+0x6e/0xa0() (Not tainted)
Hardware name: PowerEdge R820
list_del corruption. prev->next should be ffff882035627d90, but was ffff884069541588

Pid: 40267, comm: perl Not tainted 2.6.32-642.3.1.el6.x86_64 #1
Call Trace:
 [<ffffffff8107c691>] ? warn_slowpath_common+0x91/0xe0
 [<ffffffff8107c796>] ? warn_slowpath_fmt+0x46/0x60
 [<ffffffff812ad16e>] ? list_del+0x6e/0xa0
 [<ffffffff81547eed>] ? wait_for_common+0x14d/0x180
 [<ffffffff8106c4a0>] ? default_wake_function+0x0/0x20
 [<ffffffff81547fd3>] ? wait_for_completion_timeout+0x13/0x20
 [<ffffffffa05410b1>] ? bnx2fc_get_host_stats+0xa1/0x280 [bnx2fc]
 [<ffffffffa04cf630>] ? fc_stat_show+0x90/0xc0 [scsi_transport_fc]
 [<ffffffffa04cf8b6>] ? show_fcstat_tx_frames+0x16/0x20 [scsi_transport_fc]
 [<ffffffff8137c647>] ? dev_attr_show+0x27/0x50
 [<ffffffff8113b9be>] ? __get_free_pages+0xe/0x50
 [<ffffffff812170e1>] ? sysfs_read_file+0x111/0x200
 [<ffffffff8119a305>] ? vfs_read+0xb5/0x1a0
 [<ffffffff8119b0b6>] ? fget_light_pos+0x16/0x50
 [<ffffffff8119a651>] ? sys_read+0x51/0xb0
 [<ffffffff810ee1fe>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b

Diagnostic Steps

Red Hat Enterprise Linux 6

  • The backtrace shows that there was a custom perl script running on server (PID: 40760), this script is configured to run through crontab and collect statistics for FCoE HBA:

     PID: 1      TASK: ffff881069fff520  CPU: 8   COMMAND: "init"
      PID: 22267  TASK: ffff883069a1c040  CPU: 0   COMMAND: "crond"
       PID: 40758  TASK: ffff881058b67520  CPU: 37  COMMAND: "crond"    <---------- crond triggering a perl script 
        PID: 40759  TASK: ffff8820613f6040  CPU: 39  COMMAND: "sh"              which gathers Qlogic FCoE HBA
         PID: 40760  TASK: ffff884067489520  CPU: 35  COMMAND: "perl"   <---------- statistics
    
  • This process was running on CPU 35:

    crash> bt
    PID: 40760  TASK: ffff884067489520  CPU: 35  COMMAND: "perl"
    [...]
    #11 [ffff8830b8906f50] nmi at ffffffff8154b003
        [exception RIP: _spin_lock_irq+0x28]
        RIP: ffffffff8154a738  RSP: ffff884062a03d48  RFLAGS: 00000006
        RAX: 0000000000000000  RBX: 00000000000007d0  RCX: 0000000000000001
        RDX: 0000000000000003  RSI: 00000000000007d0  RDI: ffff884069541580
        RBP: ffff884062a03d48   R8: 0000000000000004   R9: 0000000000000000
        R10: ffffffffa021046c  R11: 0000000000000246  R12: ffff884069541578
        R13: ffff884069541580  R14: 0000000000000002  R15: 0000000001a0bc90
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    --- <NMI exception stack> ---
    #12 [ffff884062a03d48] _spin_lock_irq at ffffffff8154a738
    #13 [ffff884062a03d50] wait_for_common at ffffffff81547ddc
    #14 [ffff884062a03de0] wait_for_completion_timeout at ffffffff81547fd3
    #15 [ffff884062a03df0] bnx2fc_get_host_stats at ffffffffa05410b1 [bnx2fc]
    #16 [ffff884062a03e20] fc_stat_show at ffffffffa04cf630 [scsi_transport_fc]
    #17 [ffff884062a03e50] show_fcstat_tx_frames at ffffffffa04cf8b6 [scsi_transport_fc]
    #18 [ffff884062a03e60] dev_attr_show at ffffffff8137c647
    #19 [ffff884062a03e90] sysfs_read_file at ffffffff812170e1
    #20 [ffff884062a03ef0] vfs_read at ffffffff8119a305
    #21 [ffff884062a03f30] sys_read at ffffffff8119a651
    [...]
    
  • While collecting the HBA statistics, we had acquired a spin lock in following block in bnx2fc_get_host_stats routine, and then encountered a corruption with wait queue list which was used for interrupt handling:

    drivers/scsi/bnx2fc/bnx2fc_fcoe.c
     632 static struct fc_host_statistics *bnx2fc_get_host_stats(struct Scsi_Host *shost)
     633 {
    [...]
     641 
     642         fw_stats = (struct fcoe_statistics_params *)hba->stats_buffer;
     643         if (!fw_stats)
     644                 return NULL;
     645 
     646         bnx2fc_stats = fc_get_host_stats(shost);
     647 
     648         init_completion(&hba->stat_req_done);
     649         if (bnx2fc_send_stat_req(hba))
     650                 return bnx2fc_stats;
     651         rc = wait_for_completion_timeout(&hba->stat_req_done, (2 * HZ));   <<---------- acquired the spin lock here and disabled the interrupts.
    [...]
    
    
    RFLAGS: 00000006    <<---------- interrupts on CPU 35 were disabled
    
  • There are also following list corruption messages in vmcore:

    ------------[ cut here ]------------
    WARNING: at lib/list_debug.c:48 list_del+0x6e/0xa0() (Not tainted)
    Hardware name: PowerEdge R820
    list_del corruption. prev->next should be ffff882035627d90, but was ffff884069541588
    [...]
    
    ------------[ cut here ]------------
    WARNING: at lib/list_debug.c:51 list_del+0x8d/0xa0() (Tainted: G        W  -- ------------   )
    Hardware name: PowerEdge R820
    list_del corruption. next->prev should be ffff882035627d90, but was ffff884069541588
    [...]
    
  • It looks that a hard lockup was occurred due to a corrupt wait queue list.

Red Hat Enterprise Linux 7

  • In this example, system is using 3rd party bnx2fc driver.
crash> mod -t
NAME    TAINTS
bnx2x   OE
cnic    OE
bnx2fc  OE

crash> module.version ffffffffa05116a0
  version = 0xffff883f24980bd0 "2.11.6.1"

crash> ps -p 53261
PID: 0      TASK: ffffffff81951440  CPU: 0   COMMAND: "swapper/0"
 PID: 1      TASK: ffff883f27228000  CPU: 27  COMMAND: "systemd"
  PID: 52874  TASK: ffff880236618b80  CPU: 16  COMMAND: "ovcd"
   PID: 53261  TASK: ffff883d9a682e00  CPU: 37  COMMAND: "hpsensor"


[4674081.338281] WARNING: at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
[4674081.338285] list_del corruption. prev->next should be ffff887f1533fde0, but was ffff883f19e81180
[4674081.338287] Modules linked in: arc4 md4 nls_utf8 cifs iptable_filter fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 bridge bonding nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dm_service_time ses enclosure bnx2fc(OE) 8021q cnic(OE) uio garp stp mrp netconsole llc fcoe libfcoe libfc scsi_transport_fc scsi_tgt iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg hpwdt ipmi_ssif pcspkr hpilo i2c_i801 lpc_ich mfd_core sb_edac edac_core ipmi_si shpchp ipmi_msghandler pcc_cpufreq acpi_power_meter wmi dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables xfs usb_storage sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common
[4674081.338381]  mgag200 crc32c_intel syscopyarea sysfillrect serio_raw sysimgblt i2c_algo_bit drm_kms_helper ttm bnx2x(OE) mdio drm vxlan ip6_udp_tunnel udp_tunnel i2c_core ptp pps_core libcrc32c hpsa dm_mirror dm_region_hash dm_log dm_mod
[4674081.338408] CPU: 28 PID: 53261 Comm: hpsensor Tainted: G           OE  ------------   3.10.0-327.el7.x86_64 #1
[4674081.338411] Hardware name: HP Synergy 480 Gen9/Synergy 480 Gen9 Compute Module, BIOS I37 02/17/2017
[4674081.338413]  ffff887f1533fd38 00000000d5990cbd ffff887f1533fcf0 ffffffff816351f1
[4674081.338423]  ffff887f1533fd28 ffffffff8107b200 ffff887f1533fde0 0000000000000000
[4674081.338431]  ffff883f19e81178 0000000000000000 ffff883f19e81170 ffff887f1533fd90
[4674081.338439] Call Trace:
[4674081.338452]  [<ffffffff816351f1>] dump_stack+0x19/0x1b
[4674081.338462]  [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
[4674081.338467]  [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
[4674081.338473]  [<ffffffff8130c3f1>] __list_del_entry+0xa1/0xd0
[4674081.338479]  [<ffffffff8130c42d>] list_del+0xd/0x30
[4674081.338489]  [<ffffffff8163b167>] wait_for_completion_timeout+0x147/0x180
[4674081.338501]  [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[4674081.338512]  [<ffffffffa04f1219>] bnx2fc_get_host_stats+0x89/0x2b0 [bnx2fc]
[4674081.338523]  [<ffffffffa0410fca>] fc_stat_show.isra.17+0x6a/0xc0 [scsi_transport_fc]
[4674081.338531]  [<ffffffffa0411239>] show_fcstat_prim_seq_protocol_err_count+0x19/0x20 [scsi_transport_fc]
[4674081.338540]  [<ffffffff813f1c20>] dev_attr_show+0x20/0x60
[4674081.338563]  [<ffffffff8125928a>] sysfs_read_file+0x9a/0x1a0
[4674081.338568]  [<ffffffff811de43c>] vfs_read+0x9c/0x170
[4674081.338570]  [<ffffffff811def8f>] SyS_read+0x7f/0xe0
[4674081.338576]  [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
[4674081.338578] ---[ end trace 35022e1ea5c371e0 ]---

[4678636.673551] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 37
[4678636.674199] CPU: 37 PID: 53261 Comm: hpsensor Tainted: G        W  OE  ------------   3.10.0-327.el7.x86_64 #1
[4678636.674810] Hardware name: HP Synergy 480 Gen9/Synergy 480 Gen9 Compute Module, BIOS I37 02/17/2017
[4678636.675460]  ffffffff818674b8 00000000d5990cbd ffff883f7fce5af0 ffffffff816351f1
[4678636.676109]  ffff883f7fce5b70 ffffffff8162ea6c 0000000000000010 ffff883f7fce5b80
[4678636.676736]  ffff883f7fce5b20 00000000d5990cbd 0000000000000000 0000000000000025
[4678636.677365] Call Trace:
[4678636.677980]  <NMI>  [<ffffffff816351f1>] dump_stack+0x19/0x1b
[4678636.678608]  [<ffffffff8162ea6c>] panic+0xd8/0x1e7
[4678636.679238]  [<ffffffff8111b450>] ? restart_watchdog_hrtimer+0x50/0x50
[4678636.679867]  [<ffffffff8111b512>] watchdog_overflow_callback+0xc2/0xd0
[4678636.680566]  [<ffffffff8115ed71>] __perf_event_overflow+0xa1/0x250
[4678636.681212]  [<ffffffff8115f844>] perf_event_overflow+0x14/0x20
[4678636.681842]  [<ffffffff810325a8>] intel_pmu_handle_irq+0x1e8/0x470
[4678636.682462]  [<ffffffff812f4b11>] ? ioremap_page_range+0x241/0x320
[4678636.683092]  [<ffffffff811a59b1>] ? unmap_kernel_range_noflush+0x11/0x20
[4678636.683704]  [<ffffffff81395b94>] ? ghes_copy_tofrom_phys+0x124/0x210
[4678636.684320]  [<ffffffff81395d20>] ? ghes_read_estatus+0xa0/0x190
[4678636.684934]  [<ffffffff8163ebeb>] perf_event_nmi_handler+0x2b/0x50
[4678636.685547]  [<ffffffff8163e339>] nmi_handle.isra.0+0x69/0xb0
[4678636.686147]  [<ffffffff8163e4e9>] do_nmi+0x169/0x340
[4678636.686735]  [<ffffffff8163d771>] end_repeat_nmi+0x1e/0x2e
[4678636.687304]  [<ffffffff8163ccba>] ? _raw_spin_lock_irq+0x3a/0x60
[4678636.687868]  [<ffffffff8163ccba>] ? _raw_spin_lock_irq+0x3a/0x60
[4678636.688417]  [<ffffffff8163ccba>] ? _raw_spin_lock_irq+0x3a/0x60
[4678636.688946]  <<EOE>>  [<ffffffff8163b05a>] wait_for_completion_timeout+0x3a/0x180
[4678636.689486]  [<ffffffffa047a5d0>] ? cnic_submit_bnx2x_kwqes+0x40/0x80 [cnic]
[4678636.690019]  [<ffffffffa04f6e1c>] ? bnx2fc_send_stat_req+0x7c/0xa0 [bnx2fc]
[4678636.690594]  [<ffffffffa04f1219>] bnx2fc_get_host_stats+0x89/0x2b0 [bnx2fc]
[4678636.691104]  [<ffffffffa0410fca>] fc_stat_show.isra.17+0x6a/0xc0 [scsi_transport_fc]
[4678636.691616]  [<ffffffffa0411399>] show_fcstat_tx_frames+0x19/0x20 [scsi_transport_fc]
[4678636.692119]  [<ffffffff813f1c20>] dev_attr_show+0x20/0x60
[4678636.692613]  [<ffffffff8125928a>] sysfs_read_file+0x9a/0x1a0
[4678636.693102]  [<ffffffff811de43c>] vfs_read+0x9c/0x170
[4678636.693581]  [<ffffffff811def8f>] SyS_read+0x7f/0xe0
[4678636.694053]  [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
  • The backtrace shows that there was a hpsensor process running on server (PID: 53261), this process is configured to run through ovcd and collect statistics for FCoE HBA:
crash> ps -p 53261
PID: 0      TASK: ffffffff81951440  CPU: 0   COMMAND: "swapper/0"
 PID: 1      TASK: ffff883f27228000  CPU: 27  COMMAND: "systemd"
  PID: 52874  TASK: ffff880236618b80  CPU: 16  COMMAND: "ovcd"
   PID: 53261  TASK: ffff883d9a682e00  CPU: 37  COMMAND: "hpsensor"
  • This process was running on CPU 37:
crash> bt
PID: 53261  TASK: ffff883d9a682e00  CPU: 37  COMMAND: "hpsensor"
 #0 [ffff883f7fce59c8] machine_kexec at ffffffff81051beb
 #1 [ffff883f7fce5a28] crash_kexec at ffffffff810f2542
 #2 [ffff883f7fce5af8] panic at ffffffff8162ea73
 #3 [ffff883f7fce5b78] watchdog_overflow_callback at ffffffff8111b512
 #4 [ffff883f7fce5b88] __perf_event_overflow at ffffffff8115ed71
 #5 [ffff883f7fce5c00] perf_event_overflow at ffffffff8115f844
 #6 [ffff883f7fce5c10] intel_pmu_handle_irq at ffffffff810325a8
 #7 [ffff883f7fce5e60] perf_event_nmi_handler at ffffffff8163ebeb
 #8 [ffff883f7fce5e80] nmi_handle at ffffffff8163e339
 #9 [ffff883f7fce5ec8] do_nmi at ffffffff8163e4e9
#10 [ffff883f7fce5ef0] end_repeat_nmi at ffffffff8163d771
    [exception RIP: _raw_spin_lock_irq+58]
    RIP: ffffffff8163ccba  RSP: ffff887f1533fdb8  RFLAGS: 00000097
    RAX: 0000000000005ee9  RBX: ffff883f19e81170  RCX: 0000000000000004
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff883f19e81178
    RBP: ffff887f1533fdb8   R8: 0000000000000004   R9: 000002041202001b
    R10: 0000003ece7ff000  R11: 0000000000000000  R12: 00000000000007d0
    R13: ffff883f19e81178  R14: ffff883f22512000  R15: ffff883f19e81170
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
#11 [ffff887f1533fdb8] _raw_spin_lock_irq at ffffffff8163ccba
#12 [ffff887f1533fdc0] wait_for_completion_timeout at ffffffff8163b05a
#13 [ffff887f1533fe20] bnx2fc_get_host_stats at ffffffffa04f1219 [bnx2fc]
#14 [ffff887f1533fe58] fc_stat_show at ffffffffa0410fca [scsi_transport_fc]
#15 [ffff887f1533fe88] show_fcstat_tx_frames at ffffffffa0411399 [scsi_transport_fc]
#16 [ffff887f1533fe98] dev_attr_show at ffffffff813f1c20
#17 [ffff887f1533feb8] sysfs_read_file at ffffffff8125928a
#18 [ffff887f1533ff08] vfs_read at ffffffff811de43c
#19 [ffff887f1533ff38] sys_read at ffffffff811def8f
#20 [ffff887f1533ff80] system_call_fastpath at ffffffff81645909
    RIP: 00007f642672e9ed  RSP: 00007f64197c3af8  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffffffff81645909  RCX: ffffffffffffffff
    RDX: 0000000000001000  RSI: 00007f6427bb9000  RDI: 000000000000000c
    RBP: 000000000000000a   R8: 00000000ffffffff   R9: 0000000000000000
    R10: 0000000000000022  R11: 0000000000000293  R12: 00007f63e8051580
    R13: 0000000000000000  R14: 0000000000000000  R15: 00007f63e8051580
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
  • While collecting the HBA statistics, we had acquired a spin lock in following block in bnx2fc_get_host_stats routine, and then encountered a corruption with wait queue list which was used for interrupt handling:
 630 static struct fc_host_statistics *bnx2fc_get_host_stats(struct Scsi_Host *shost)
 631 {
 632         struct fc_host_statistics *bnx2fc_stats;
 633         struct fc_lport *lport = shost_priv(shost);
 634         struct fcoe_port *port = lport_priv(lport);
 635         struct bnx2fc_interface *interface = port->priv;
 636         struct bnx2fc_hba *hba = interface->hba;
 637         struct fcoe_statistics_params *fw_stats;
 638         int rc = 0;
 639 
 640         fw_stats = (struct fcoe_statistics_params *)hba->stats_buffer;
 641         if (!fw_stats)
 642                 return NULL;
 643 
 644         bnx2fc_stats = fc_get_host_stats(shost);
 645 
 646         init_completion(&hba->stat_req_done);
 647         if (bnx2fc_send_stat_req(hba))
 648                 return bnx2fc_stats;
 649         rc = wait_for_completion_timeout(&hba->stat_req_done, (2 * HZ));  <<---------- acquired the spin lock here and disabled the interrupts.

/usr/src/debug/kernel-3.10.0-327.el7/linux-3.10.0-327.el7.x86_64/drivers/scsi/bnx2fc/bnx2fc_els.c: 769
0xffffffffa04f1204 <bnx2fc_get_host_stats+0x74>:        test   %eax,%eax
/usr/src/debug/kernel-3.10.0-327.el7/linux-3.10.0-327.el7.x86_64/include/scsi/fc_frame.h: 47
0xffffffffa04f1206 <bnx2fc_get_host_stats+0x76>:        jne    0xffffffffa04f13e4 <bnx2fc_get_host_stats+0x254>
/usr/src/debug/kernel-3.10.0-327.el7/linux-3.10.0-327.el7.x86_64/include/scsi/fc_encode.h: 65
0xffffffffa04f120c <bnx2fc_get_host_stats+0x7c>:        mov    $0x7d0,%esi
/usr/src/debug/kernel-3.10.0-327.el7/linux-3.10.0-327.el7.x86_64/include/scsi/fc_encode.h: 67
0xffffffffa04f1211 <bnx2fc_get_host_stats+0x81>:        mov    %r15,%rdi
0xffffffffa04f1214 <bnx2fc_get_host_stats+0x84>:        callq  0xffffffff8163b020 <wait_for_completion_timeout>

R15: ffff883f19e81170

crash> completion ffff883f19e81170
struct completion {
  done = 0x0, 
  wait = {
    lock = {
      {
        rlock = {
          raw_lock = {
            {
              head_tail = 0x40004, 
              tickets = {
                head = 0x4, 
                tail = 0x4
              }
            }
          }
        }
      }
    }, 
    task_list = {
      next = 0xffff883f19e81180, 
      prev = 0xffff883f19e81180
    }
  }
}

  • There are also following list corruption messages in vmcore:
[4674081.338281] WARNING: at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
[4674081.338285] list_del corruption. prev->next should be ffff887f1533fde0, but was ffff883f19e81180
  • Therefore, it is obviously a hard lockup was occurred due to a corrupt wait queue list.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

Comments