System crashes from a double-submit of a block request when the mq_ops->queue_rq() function stalls

Solution Verified

Issue

  • The system crashes from a double-submit of a block request when the mq_ops->queue_rq() function stalls. The crash can present itself in many different ways. Several vmcores are shown below, each linked to its analysis in the diagnostic section.

  • Vmcore 1

Unable to handle kernel paging request for data at address 0x00000038
Faulting instruction address: 0xc00000000074aa08
....
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2048 NUMA PowerNV
....
CPU: 177 PID: 108367 Comm: multipathd Kdump: loaded Tainted: G           OE  ------------ T 3.10.0-1127.18.2.el7.ppc64le #1
task: c000005f69ff6d80 ti: c000006dd3174000 task.ti: c000006dd3174000
NIP: c00000000074aa08 LR: c000000000756308 CTR: 0000000000000000
REGS: c000006dd31773d0 TRAP: 0300   Tainted: G           OE  ------------ T  (3.10.0-1127.18.2.el7.ppc64le)
MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24002888  XER: 00000000
....
NIP [c00000000074aa08] scsi_dispatch_cmd+0xb8/0x2e0
LR [c000000000756308] scsi_request_fn+0x5c8/0x9c0
Call Trace:
[c000006dd3177650] [c0000000005478c8] blk_peek_request+0x1c8/0x320 (unreliable)
[c000006dd31776d0] [c000000000756308] scsi_request_fn+0x5c8/0x9c0
[c000006dd3177800] [c00000000053db9c] __blk_run_queue+0x5c/0x80
[c000006dd3177830] [c00000000053c9f4] __elv_add_request+0x124/0x3f0
[c000006dd31778c0] [c000000000551df0] blk_execute_rq_nowait+0xf0/0x1b0
[c000006dd3177910] [c000000000551f34] blk_execute_rq+0x84/0x180
[c000006dd31779f0] [c00000000056bf88] sg_io+0x328/0x540
[c000006dd3177ab0] [c00000000056cbf8] scsi_cmd_ioctl+0x5b8/0x640
[c000006dd3177be0] [d000000036c32130] sd_ioctl+0x120/0x1a0 [sd_mod]
[c000006dd3177c80] [c0000000005661e8] blkdev_ioctl+0x2e8/0x1080
[c000006dd3177cf0] [c00000000040e2b0] block_ioctl+0x50/0xa0
[c000006dd3177d10] [c0000000003bf388] do_vfs_ioctl+0x438/0x870
[c000006dd3177dd0] [c0000000003bf894] SyS_ioctl+0xd4/0xf0
[c000006dd3177e30] [c00000000000a288] system_call+0x3c/0x100
Instruction dump:
e9290000 a0e90108 7f863840 419d009c 813d0188 2f890004 41de0194 60000000 
e93d00c0 7fa3eb78 7fe4fb78 3bc00000 <e9290038> f8410018 7d2903a6 7d2c4b78 
---[ end trace a036f879f3727ba3 ]--- 
Sending IPI to other CPUs
IPI complete

  • Vmcore 2

[1786158.704772] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[1786158.709039] Oops: 0000 [#1] SMP NOPTI
[1786158.709041] CPU: 19 PID: 2126625 Comm: kworker/u96:1 Kdump: loaded Tainted: G        W  O L   --------- -  - 4.18.0-372.19.1.el8_6.x86_64 #1
[1786158.710482] Hardware name: HPE Synergy 480 Gen10/Synergy 480 Gen10 Compute Module, BIOS I42 09/03/2021
[1786158.712108] Workqueue: fc_rport_eq fc_rport_work [libfc]
[1786158.716273] RIP: 0010:dma_direct_unmap_sg+0x41/0x1a0
....
[1786158.731294] PKRU: 55555554
[1786158.731897] Call Trace:
[1786158.731900]  ? __next_timer_interrupt+0xf0/0xf0
[1786158.733024]  qedf_unmap_sg_list.isra.11+0x44/0x50 [qedf]
[1786158.733988]  qedf_scsi_done+0xb4/0x310 [qedf]
[1786158.734963]  qedf_initiate_cleanup+0x1da/0x3d0 [qedf]
[1786158.734969]  qedf_flush_active_ios+0x6c7/0xb10 [qedf]
[1786158.734974]  qedf_cleanup_fcport+0x57/0x1c0 [qedf]
[1786158.734978]  ? vprintk_emit+0x125/0x250
[1786158.734982]  qedf_rport_event_handler+0x633/0x7e0 [qedf]
[1786158.734986]  ? printk+0x58/0x6f
[1786158.734990]  fc_rport_work+0xf5/0x4f0 [libfc]
[1786158.734996]  process_one_work+0x1a7/0x360
[1786158.734999]  ? create_worker+0x1a0/0x1a0
[1786158.735001]  worker_thread+0x30/0x390
[1786158.735003]  ? create_worker+0x1a0/0x1a0
[1786158.735005]  kthread+0x10a/0x120
[1786158.735009]  ? set_kthread_struct+0x40/0x40
[1786158.735012]  ret_from_fork+0x1f/0x40
....
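
Because the crash can surface in different functions, it can help to pull the faulting function out of a console log or vmcore-dmesg.txt and compare it with the examples above. The following is a minimal sketch of that check, not a diagnostic tool. The helper names (`faulting_symbols`, `matches_known_signature`) and the `KNOWN_SYMBOLS` set are hypothetical, and the set contains only the two faulting functions shown in this article.

```python
import re

# Patterns for the faulting-instruction line on the two architectures shown above:
#   ppc64le: "NIP [c00000000074aa08] scsi_dispatch_cmd+0xb8/0x2e0"
#   x86_64:  "RIP: 0010:dma_direct_unmap_sg+0x41/0x1a0"
FAULT_PATTERNS = [
    re.compile(r"\bNIP \[[0-9a-f]+\] (?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"),
    re.compile(r"\bRIP: [0-9a-f]{4}:(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"),
]

# Assumption: only the faulting functions from the two example vmcores.
# Crashes caused by the same issue may fault in other functions.
KNOWN_SYMBOLS = {"scsi_dispatch_cmd", "dma_direct_unmap_sg"}


def faulting_symbols(log_text):
    """Return the faulting function names found in a kernel log, in order."""
    found = []
    for line in log_text.splitlines():
        for pattern in FAULT_PATTERNS:
            match = pattern.search(line)
            if match:
                found.append(match.group("sym"))
                break
    return found


def matches_known_signature(log_text):
    """Return True if any faulting function matches one of the examples."""
    return any(sym in KNOWN_SYMBOLS for sym in faulting_symbols(log_text))
```

A match is only a hint that the crash resembles one of the examples. It does not confirm the root cause, which still requires analysis of the vmcore itself.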

Environment

  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 8
  • Red Hat Enterprise Linux 9
