Excessive spinlock contention with QLogic Fibre Channel HBA Driver: 10.02.01.00.a14-k1

Solution Unverified

Issue

The system performs poorly with QLogic Fibre Channel cards driven by the qla2xxx driver, version 10.02.01.00.a14-k1.

The performance impact may be severe enough that soft lockups and/or hard lockups are reported.
In cluster environments, the affected node may appear hung or unresponsive, or the cluster timers may expire, in which case the node may crash when cluster management deliberately brings it down.

The common feature is that the affected CPUs are running a kernel stack similar to this:

[4017175.362426] Call Trace:
[4017175.362582]  <IRQ>  [<ffffffff9db865c9>] dump_stack+0x19/0x1b
[4017175.362751]  [<ffffffff9db802d1>] panic+0xe8/0x21f
[4017175.362913]  [<ffffffffc0463210>] ? check_user_ids.part.0+0x30/0x30 [deadman]
[4017175.363219]  [<ffffffffc046326f>] deadman_timer_fire+0x5f/0x60 [deadman]
[4017175.363384]  [<ffffffff9d4abf48>] call_timer_fn+0x38/0x110
[4017175.363544]  [<ffffffffc0463210>] ? check_user_ids.part.0+0x30/0x30 [deadman]
[4017175.363852]  [<ffffffff9d4ae53d>] run_timer_softirq+0x25d/0x340
[4017175.364014]  [<ffffffff9d4a4d85>] __do_softirq+0xf5/0x280
[4017175.364174]  [<ffffffff9db9d4ec>] call_softirq+0x1c/0x30
[4017175.364335]  [<ffffffff9d42f715>] do_softirq+0x65/0xa0
[4017175.364495]  [<ffffffff9d4a5105>] irq_exit+0x105/0x110
[4017175.364654]  [<ffffffff9db9ea28>] smp_apic_timer_interrupt+0x48/0x60
[4017175.364814]  [<ffffffff9db9afba>] apic_timer_interrupt+0x16a/0x170
[4017175.364973]  <EOI>  [<ffffffff9db8ea15>] ? _raw_spin_unlock_irqrestore+0x15/0x20
[4017175.365310]  [<ffffffffc061d4d5>] qla24xx_tgt_dif_start_scsi+0x245/0xaa0 [qla2xxx]
[4017175.365626]  [<ffffffffc062030e>] qla2xxx_dif_start_scsi_mq+0x16be/0x16f0 [qla2xxx]
[4017175.365934]  [<ffffffff9d7b2636>] ? swiotlb_unmap_sg_attrs+0x46/0x60
[4017175.366094]  [<ffffffff9d8ef211>] ? scsi_dma_unmap+0x61/0x80
[4017175.366259]  [<ffffffffc05ed819>] qla2xxx_mqueuecommand+0x2f9/0x4e0 [qla2xxx]
[4017175.366572]  [<ffffffffc05edc24>] qla2xxx_queuecommand+0x224/0x630 [qla2xxx]
[4017175.366881]  [<ffffffff9d8e4680>] scsi_dispatch_cmd+0xb0/0x240
[4017175.367039]  [<ffffffff9d8eddac>] scsi_request_fn+0x4ac/0x680
[4017175.367200]  [<ffffffff9d4bc29b>] ? __queue_delayed_work+0x8b/0x1a0
[4017175.367361]  [<ffffffff9d753d39>] __blk_run_queue+0x39/0x50
[4017175.367520]  [<ffffffff9d753db6>] blk_run_queue+0x26/0x40
[4017175.367678]  [<ffffffff9d8ec308>] scsi_run_queue+0x258/0x2f0
[4017175.367838]  [<ffffffff9d8edf95>] scsi_requeue_run_queue+0x15/0x20
[4017175.367999]  [<ffffffff9d4bdfbf>] process_one_work+0x17f/0x440
[4017175.368160]  [<ffffffff9d4bf0d6>] worker_thread+0x126/0x3c0
[4017175.368320]  [<ffffffff9d4befb0>] ? manage_workers.isra.26+0x2a0/0x2a0
[4017175.368482]  [<ffffffff9d4c5f91>] kthread+0xd1/0xe0
[4017175.368641]  [<ffffffff9d4c5ec0>] ? insert_kthread_work+0x40/0x40
[4017175.368803]  [<ffffffff9db99df7>] ret_from_fork_nospec_begin+0x21/0x21
[4017175.368964]  [<ffffffff9d4c5ec0>] ? insert_kthread_work+0x40/0x40
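
To compare a captured trace against the one above, the frames that belong to the qla2xxx module can be pulled out programmatically. The following is a minimal sketch, not a supported tool: the helper name module_frames and the regular expression are illustrative assumptions, and the pattern only covers the frame layout shown in this article.

```python
import re

# Matches one frame of a call trace in the layout shown above, e.g.
#   [<ffffffffc061d4d5>] qla24xx_tgt_dif_start_scsi+0x245/0xaa0 [qla2xxx]
# A leading "?" marks a frame the unwinder considers unreliable;
# the trailing [module] tag is absent for core kernel symbols.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def module_frames(trace_text, module="qla2xxx"):
    """Return the reliable symbols in trace_text that belong to `module`.

    Hypothetical helper for illustration; frames prefixed with "?" are skipped.
    """
    symbols = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if not match or match.group("unreliable"):
            continue
        if match.group("module") == module:
            symbols.append(match.group("symbol"))
    return symbols
```

Run against the trace above, this should list qla24xx_tgt_dif_start_scsi and the other qla2xxx submission-path functions. A trace with no qla2xxx frames is unlikely to match this issue.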

Environment

Red Hat Enterprise Linux 7.9
Fibre Channel interfaces driven by the qla2xxx driver, version 10.02.01.00.a14-k1

System startup messages will show:

[   timestamp  ] qla2xxx: module verification failed: signature and/or required key missing - tainting kernel
[   timestamp  ] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.02.01.00.a14-k1.

Please note: This driver is NOT supplied by Red Hat.
Loading it taints the kernel.
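
Whether a system matches this environment can be checked by scanning saved boot messages for the two lines shown above. The sketch below assumes the log text has already been collected (for example from dmesg output). The function names and the AFFECTED_VERSION constant are illustrative, not part of any Red Hat tooling.

```python
import re

# Driver version named in this article.
AFFECTED_VERSION = "10.02.01.00.a14-k1"

# Captures the version token that follows the driver banner text.
BANNER_RE = re.compile(r"QLogic Fibre Channel HBA Driver:\s*(\S+)")


def qla2xxx_versions(log_text):
    """Return every qla2xxx driver version announced in log_text."""
    versions = []
    for line in log_text.splitlines():
        match = BANNER_RE.search(line)
        if match:
            # The banner ends with a full stop that is not part of the version.
            versions.append(match.group(1).rstrip("."))
    return versions


def reports_module_taint(log_text):
    """Return True if the log shows qla2xxx failing module verification."""
    return any(
        "qla2xxx: module verification failed" in line
        for line in log_text.splitlines()
    )


def matches_environment(log_text):
    """Return True if the log announces the driver version from this article."""
    return AFFECTED_VERSION in qla2xxx_versions(log_text)
```

A log for which matches_environment() returns True was produced by the driver version described here. The taint check is a separate hint that the loaded module was not signed with a key the kernel trusts.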
