Kernel tainted with WARNING on RIP pci_irq_get_affinity+0x3b/0x80

Solution Verified

Environment

  • Red Hat Enterprise Linux 8
  • kernel-4.18.0-80.el8.x86_64

Issue

  • Hardware certification fails with a "Kernel tainted" message.
  • WARNING: CPU: X PID: XXXX at drivers/pci/msi.c:1274 pci_irq_get_affinity+0x3b/0x80
  • RIP in function pci_irq_get_affinity+0x3b/0x80

Resolution

  • The issue is tracked in internal Bugzilla ID 1699135, and an upstream patch has been proposed to fix it.
  • This fix will be available in a future update to Red Hat Enterprise Linux 8.
  • Until then this warning can be safely ignored.

Root Cause

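  • The warning is triggered when blk-mq is enabled and the hardware does not support multiqueue (MQ). This results in the driver requesting, via the PCI IRQ affinity infrastructure, a number of MSI-X vectors that is equal to or less than pre_desc. blk_mq_pci_map_queues() then calls pci_irq_get_affinity() with a vector index that has no matching MSI-X entry, so the lookup falls through to WARN_ON_ONCE().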

Diagnostic Steps

  • System Information

    $ cat etc/redhat-release 
    Red Hat Enterprise Linux release 8.0 (Ootpa)
    
    $ awk '{print $3}' uname 
    4.18.0-80.el8.x86_64
    
  • Kernel is tainted

    $ cat proc/sys/kernel/tainted 
    512
    
    $ perl -e 'printf "%08b\n", 512'
    1000000000
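
The taint value is a bitmask, and 512 is bit 9, the "W" flag, which is set when the kernel has issued a warning. As a rough sketch, the decoding can be done as follows. The `TAINT_FLAGS` table and `decode_taint` helper are illustrative names, not part of any Red Hat tool; the bit meanings follow the upstream kernel's tainted-kernels documentation, and a given RHEL kernel may define additional bits.

```python
# Taint bits as described in the upstream kernel documentation.
# Assumption: only bits 0-15 are listed here; newer kernels define more.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    2: ("S", "kernel running on an out-of-specification system"),
    3: ("R", "module was force unloaded"),
    4: ("M", "processor reported a machine check exception"),
    5: ("B", "bad page referenced or unexpected page flags"),
    6: ("U", "taint requested by userspace"),
    7: ("D", "kernel died recently (OOPS or BUG)"),
    8: ("A", "ACPI table overridden by user"),
    9: ("W", "kernel issued a warning"),
    10: ("C", "staging driver was loaded"),
    11: ("I", "workaround for a platform firmware bug applied"),
    12: ("O", "out-of-tree module was loaded"),
    13: ("E", "unsigned module was loaded"),
    14: ("L", "soft lockup occurred"),
    15: ("K", "kernel has been live patched"),
}


def decode_taint(value):
    """Return (bit, letter, description) for every known bit set in value."""
    return [
        (bit, letter, desc)
        for bit, (letter, desc) in sorted(TAINT_FLAGS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    for bit, letter, desc in decode_taint(512):
        print(f"bit {bit} ({letter}): {desc}")
```

For the value seen here, only the warning flag is set, which matches the single WARNING in the boot log.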
    
  • Warning messages are captured in the /var/log/dmesg file.

    [    7.386131] WARNING: CPU: 2 PID: 502 at drivers/pci/msi.c:1274 pci_irq_get_affinity+0x3b/0x80
    [    7.386132] Modules linked in: joydev ipmi_ssif dcdbas amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul qla2xxx(+) sg ghash_clmulni_intel nvme_fc pcspkr nvme_fabrics nvme_core sp5100_tco ipmi_si scsi_transport_fc ipmi_devintf ccp i2c_piix4 ipmi_msghandler k10temp acpi_power_meter acpi_cpufreq xfs libcrc32c sr_mod cdrom sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm libahci libata crc32c_intel megaraid_sas tg3 dm_mirror dm_region_hash dm_log dm_mod
    [    7.386157] CPU: 2 PID: 502 Comm: kworker/2:1 Not tainted 4.18.0-80.el8.x86_64 #1
    [    7.386157] Hardware name: Dell Inc. PowerEdge R7415/065PKD, BIOS 1.8.4 02/22/2019
    [    7.386161] Workqueue: events work_for_cpu_fn
    [    7.386163] RIP: 0010:pci_irq_get_affinity+0x3b/0x80
    [    7.386164] Code: 48 8b 87 08 03 00 00 48 81 c7 08 03 00 00 48 39 f8 74 17 85 f6 74 4e 31 d2 eb 04 39 d6 74 46 48 8b 00 83 c2 01 48 39 f8 75 f1 <0f> 0b 31 c0 c3 83 e2 02 48 c7 c0 e0 0e 60 a6 74 29 48 8b 97 08 03
    [    7.386165] RSP: 0018:ffffba2d0e793d08 EFLAGS: 00010246
    [    7.386166] RAX: ffff9af504790308 RBX: 0000000000000000 RCX: ffff9af4be826000
    [    7.386167] RDX: 0000000000000002 RSI: 0000000000000002 RDI: ffff9af504790308
    [    7.386167] RBP: 0000000000000001 R08: ffff9afcdde28180 R09: ffff9af4cf50e000
    [    7.386168] R10: ffff9aed87c0e800 R11: 0000000000000001 R12: 0000000000000002
    [    7.386168] R13: ffff9af504790000 R14: 00000000ffffffff R15: ffff9af50460e0a8
    [    7.386169] FS:  0000000000000000(0000) GS:ffff9afcdde00000(0000) knlGS:0000000000000000
    [    7.386170] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    7.386171] CR2: 000055a2593293d0 CR3: 000000057c80a000 CR4: 00000000003406e0
    [    7.386171] Call Trace:
    [    7.386176]  blk_mq_pci_map_queues+0x37/0xd0
    [    7.386181]  blk_mq_alloc_tag_set+0x121/0x2e0
    [    7.386185]  scsi_add_host_with_dma+0x7e/0x300
    [    7.386200]  qla2x00_probe_one+0x101f/0x2270 [qla2xxx]
    [    7.386204]  ? __switch_to_asm+0x34/0x70
    [    7.386205]  ? __switch_to_asm+0x40/0x70
    [    7.386206]  ? __switch_to_asm+0x34/0x70
    [    7.386207]  ? __switch_to_asm+0x40/0x70
    [    7.386208]  ? __switch_to_asm+0x34/0x70
    [    7.386208]  ? __switch_to_asm+0x40/0x70
    [    7.386209]  ? __switch_to_asm+0x34/0x70
    [    7.386210]  ? __switch_to_asm+0x40/0x70
    [    7.386213]  local_pci_probe+0x41/0x90
    [    7.386214]  work_for_cpu_fn+0x16/0x20
    [    7.386216]  process_one_work+0x1a7/0x360
    [    7.386217]  worker_thread+0x1cf/0x390
    [    7.386219]  ? pwq_unbound_release_workfn+0xd0/0xd0
    [    7.386221]  kthread+0x112/0x130
    [    7.386222]  ? kthread_bind+0x30/0x30
    [    7.386223]  ret_from_fork+0x22/0x40
    [    7.386226] ---[ end trace d79e26f5c45cb39e ]---
    [    7.388590] qla2xxx [0000:84:00.0]-00fb:7: QLogic QLE2562 - PCI-Express Dual Channel 8Gb Fibre Channel HBA.
    [    7.388599] qla2xxx [0000:84:00.0]-00fc:7: ISP2532: PCIe (5.0GT/s x8) @ 0000:84:00.0 hdma+ host#=7 fw=8.07.00 (90d5).
    [    7.388895] qla2xxx [0000:84:00.1]-001b: : BAR 3 not enabled.
    [    7.388897] qla2xxx [0000:84:00.1]-001d: : Found an ISP2532 irq 174 iobase 0x000000000bd1fcac.
    [    7.798120] scsi host8: qla2xxx
    [    7.800705] qla2xxx [0000:84:00.1]-00fb:8: QLogic QLE2562 - PCI-Express Dual Channel 8Gb Fibre Channel HBA.
    [    7.800714] qla2xxx [0000:84:00.1]-00fc:8: ISP2532: PCIe (5.0GT/s x8) @ 0000:84:00.1 hdma+ host#=8 fw=8.07.00 (90d5).
    
    
    All code
    ========
      0:    48 8b 87 08 03 00 00    mov    0x308(%rdi),%rax
      7:    48 81 c7 08 03 00 00    add    $0x308,%rdi
      e:    48 39 f8                cmp    %rdi,%rax
     11:    74 17                   je     0x2a
     13:    85 f6                   test   %esi,%esi
     15:    74 4e                   je     0x65
     17:    31 d2                   xor    %edx,%edx
     19:    eb 04                   jmp    0x1f
     1b:    39 d6                   cmp    %edx,%esi
     1d:    74 46                   je     0x65
     1f:    48 8b 00                mov    (%rax),%rax
     22:    83 c2 01                add    $0x1,%edx
     25:    48 39 f8                cmp    %rdi,%rax
     28:    75 f1                   jne    0x1b
     2a:*   0f 0b                   ud2         <-- trapping instruction
     2c:    31 c0                   xor    %eax,%eax
     2e:    c3                      retq   
     2f:    83 e2 02                and    $0x2,%edx
     32:    48 c7 c0 e0 0e 60 a6    mov    $0xffffffffa6600ee0,%rax
     39:    74 29                   je     0x64
     3b:    48                      rex.W
     3c:    8b                      .byte 0x8b
     3d:    97                      xchg   %eax,%edi
     3e:    08 03                   or     %al,(%rbx)
    
    Code starting with the faulting instruction
    ===========================================
      0:    0f 0b                   ud2    
      2:    31 c0                   xor    %eax,%eax
      4:    c3                      retq   
      5:    83 e2 02                and    $0x2,%edx
      8:    48 c7 c0 e0 0e 60 a6    mov    $0xffffffffa6600ee0,%rax
      f:    74 29                   je     0x3a
     11:    48                      rex.W
     12:    8b                      .byte 0x8b
     13:    97                      xchg   %eax,%edi
     14:    08 03                   or     %al,(%rbx)
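
When checking several systems for the same signature, the WARNING header line can be picked out of a saved dmesg log with a regular expression. The following is a minimal sketch; `WARNING_RE` and `find_warnings` are hypothetical helper names, and the pattern is only assumed to match the header format shown in the log above.

```python
import re

# Matches lines such as:
#   WARNING: CPU: 2 PID: 502 at drivers/pci/msi.c:1274 pci_irq_get_affinity+0x3b/0x80
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def find_warnings(log_text):
    """Return one dict per WARNING header found in the given log text."""
    return [m.groupdict() for m in WARNING_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "[    7.386131] WARNING: CPU: 2 PID: 502 at "
        "drivers/pci/msi.c:1274 pci_irq_get_affinity+0x3b/0x80\n"
    )
    for hit in find_warnings(sample):
        print(hit["file"], hit["line"], hit["func"])
```

A hit whose file is drivers/pci/msi.c and whose function is pci_irq_get_affinity corresponds to the signature described in this article.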
    
    
  • Corresponding kernel source code is below.

    ....
    1258 /**
    1259  * pci_irq_get_affinity - return the affinity of a particular msi vector
    1260  * @dev:        PCI device to operate on
    1261  * @nr:         device-relative interrupt vector index (0-based).
    1262  */
    1263 const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
    1264 {
    1265         if (dev->msix_enabled) {
    1266                 struct msi_desc *entry;
    1267                 int i = 0;
    1268 
    1269                 for_each_pci_msi_entry(entry, dev) {
    1270                         if (i == nr)
    1271                                 return entry->affinity;
    1272                         i++;
    1273                 }
    1274                 WARN_ON_ONCE(1);
    ....
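
The loop in the source excerpt returns an affinity mask only if the device's MSI entry list contains an entry at index `nr`; otherwise it falls through to the `WARN_ON_ONCE(1)` at line 1274. In the disassembly, the `ud2` is followed by `xor %eax,%eax` and `retq`, which suggests the function then returns NULL. The Python model below is a simplified sketch of that control flow, not kernel code; the function name and the string stand-ins for cpumasks are made up for illustration.

```python
def pci_irq_get_affinity_model(msi_entries, nr):
    """Toy model of the MSI-X branch of pci_irq_get_affinity().

    msi_entries: list of affinity values, one per allocated MSI-X entry
    nr:          0-based vector index requested by the caller
    Returns (affinity, warned), where warned is True when the real code
    would reach WARN_ON_ONCE(1).
    """
    i = 0
    for affinity in msi_entries:      # for_each_pci_msi_entry(entry, dev)
        if i == nr:
            return affinity, False    # return entry->affinity
        i += 1
    # nr >= number of entries: the walk ended without a match.
    return None, True


if __name__ == "__main__":
    entries = ["mask for vector 0", "mask for vector 1"]
    print(pci_irq_get_affinity_model(entries, 1))
    print(pci_irq_get_affinity_model(entries, 2))
```

In the register dump, RSI and RDX both hold 2 and RAX equals RDI, which would be consistent with a two-entry list being asked for index 2. That reading of the registers is an inference from the disassembly rather than something stated in the log itself.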
    

