fio stress test on nvme device results in hard lockup errors and system hang

Solution Verified

Issue

  • Running an fio stress test against an NVMe device triggers hard lockup errors and the system hangs:

    [  566.574910] NMI watchdog: Watchdog detected hard LOCKUP on cpu 45
    [...]
    [  566.574950] CPU: 45 PID: 0 Comm: swapper/45 Kdump: loaded Not tainted 5.14.0-70.22.1.el9_0.x86_64 #1
    [  566.574953] RIP: 0010:native_queued_spin_lock_slowpath.part.0+0x16e/0x190
    [  566.574959] Code: 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 80 98 02 00 48 03 04 f5 e0 4a a0 96 48 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 48 8b 31 48 85 f6 74 87 0f 0d 0e eb 82 bf 01 00 00 00
    [  566.574960] RSP: 0018:ff6607400118ce60 EFLAGS: 00000046
    [  566.574961] RAX: 0000000000000000 RBX: ff1af464b4584d00 RCX: ff1af5204b169880
    [  566.574962] RDX: ff1af4872e962210 RSI: 0000000000000030 RDI: ff1af4872e962210
    [  566.574962] RBP: ff1af4872e962200 R08: 0000000000b80000 R09: 0000000000000000
    [  566.574963] R10: 0000000000000081 R11: 0000000000000001 R12: 0000000000000096
    [  566.574964] R13: ff1af464ad55c000 R14: ff1af4872e962210 R15: ff1af4641a7ef400
    [  566.574964] FS:  0000000000000000(0000) GS:ff1af5204b140000(0000) knlGS:0000000000000000
    [  566.574965] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  566.574966] CR2: 00007f60dd530bd8 CR3: 00000002b367e006 CR4: 0000000000771ee0
    [  566.574966] PKRU: 55555554
    [  566.574967] Call Trace:
    [  566.574968]  <IRQ>
    [  566.574969]  _raw_spin_lock_irqsave+0x2c/0x40
    [  566.574975]  dma_pool_free+0x29/0xc0
    [  566.574978]  nvme_free_prps+0x64/0x80 [nvme]
    [  566.574982]  nvme_unmap_data+0xc8/0xe0 [nvme]
    [  566.574984]  nvme_pci_complete_rq+0x38/0x90 [nvme]
    [  566.574986]  nvme_process_cq+0x160/0x250 [nvme]
    [  566.574988]  nvme_irq+0xd/0x20 [nvme]
    [  566.574989]  __handle_irq_event_percpu+0x3d/0x180
    [  566.574991]  handle_irq_event+0x58/0xb0
    [  566.574992]  handle_edge_irq+0x93/0x240
    [  566.574994]  __common_interrupt+0x41/0xa0
    [  566.574996]  common_interrupt+0x7e/0xa0
    [  566.574999]  </IRQ>
    [  566.574999]  asm_common_interrupt+0x1e/0x40
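
As a rough aid for triaging logs like the one above, the following Python sketch pulls out the CPU reported by the NMI watchdog and the function names in the call trace. It is only an illustration: the function name `summarize_lockup`, the `SAMPLE` excerpt, and the regular expressions are assumptions made for this example and are not part of any Red Hat or kernel tooling.

```python
import re

# Matches the NMI watchdog line and captures the CPU number.
LOCKUP_RE = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")

# Matches a stack frame such as "dma_pool_free+0x29/0xc0", with an
# optional trailing "[module]" annotation (hypothetical pattern).
FRAME_RE = re.compile(
    r"\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)

# Short excerpt copied from the log in this article.
SAMPLE = "\n".join([
    "[  566.574910] NMI watchdog: Watchdog detected hard LOCKUP on cpu 45",
    "[  566.574953] RIP: 0010:native_queued_spin_lock_slowpath.part.0+0x16e/0x190",
    "[  566.574967] Call Trace:",
    "[  566.574968]  <IRQ>",
    "[  566.574969]  _raw_spin_lock_irqsave+0x2c/0x40",
    "[  566.574975]  dma_pool_free+0x29/0xc0",
    "[  566.574978]  nvme_free_prps+0x64/0x80 [nvme]",
    "[  566.574999]  </IRQ>",
])


def summarize_lockup(log_text):
    """Return (cpu, frames) where frames is a list of (function, module) tuples.

    cpu is None if no hard LOCKUP line is found; module is None for
    frames that belong to the core kernel rather than a loadable module.
    """
    cpu = None
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        match = LOCKUP_RE.search(line)
        if match:
            cpu = int(match.group(1))
            continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            frame = FRAME_RE.search(line)
            if frame:
                frames.append((frame.group(1), frame.group(2)))
    return cpu, frames


if __name__ == "__main__":
    cpu, frames = summarize_lockup(SAMPLE)
    print("hard LOCKUP on cpu", cpu)
    for function, module in frames:
        print(" ", function, "[%s]" % module if module else "")
```

Applied to the full trace, a summary of this kind makes it easier to see that the CPU was spinning in `_raw_spin_lock_irqsave`, called from `dma_pool_free` on the nvme completion path in interrupt context.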
    

Environment

  • Red Hat Enterprise Linux 9.0
  • kernel-5.14.0-70.22.1.el9_0
  • NVMe device
