The kernel panicked due to a hard lockup caused by ABBA deadlock behaviour in the rdmavt driver.
Environment
- Red Hat Enterprise Linux 9.2.
- Red Hat Enterprise Linux 9.3.
- Red Hat Enterprise Linux 9.4.
- Red Hat Enterprise Linux 9.5.
Issue
- An ABBA deadlock in the rdmavt driver caused a hard lockup, triggering a kernel panic.
Resolution
- This issue is currently being tracked in a private JIRA: 83851.
- Open a case with Red Hat support if assistance is needed with this issue.
Workaround:
- Boot the server with RHEL 9.0/9.1.
Root Cause
- This behaviour has been reported upstream in [1] and was introduced into RHEL with the patch "IB/rdmavt: add missing locks in rvt_ruc_loopback".
[1] Deadlock: https://lkml.indiana.edu/hypermail/linux/kernel/2210.2/07062.html
Disclaimer: The above link is shared for informational purposes only. Red Hat support neither maintains nor takes responsibility for its content.

The change was introduced with this patch:
$ git show 436f65f0c17e
commit 436f65f0c17e340ae003c23c3d6aeb24d9356c90
Author: Kamal Heib <kheib@redhat.com>
Date:   Wed Aug 31 08:15:01 2022 -0400

    IB/rdmavt: add missing locks in rvt_ruc_loopback

    Bugzilla: https://bugzilla.redhat.com/2120662

    commit 22cbc6c2681a0a4fe76150270426e763d52353a4
    Author: Niels Dossche <dossche.niels@gmail.com>
    Date:   Mon Feb 28 20:51:44 2022 +0100

        IB/rdmavt: add missing locks in rvt_ruc_loopback
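In an ABBA deadlock, two contexts each hold one lock and wait for the lock the other context holds. The snippet below is a minimal, self-contained userspace sketch of that pattern, not the driver code: the lock and thread names are hypothetical, and pthread mutexes stand in for the kernel spinlocks. In the kernel, where the contended locks are spinlocks taken with interrupts disabled, the same situation shows up as the hard lockup described above.

/* Minimal ABBA deadlock sketch (hypothetical names, userspace only).
 * Thread A takes lock_a then lock_b; thread B takes lock_b then lock_a.
 * With the sleeps widening the race window, the process hangs instead of
 * printing. Compile with: gcc abba.c -pthread
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

static void *thread_a(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock_a);     /* A: holds lock_a ...          */
    usleep(1000);                    /* widen the race window        */
    pthread_mutex_lock(&lock_b);     /* ... and waits for lock_b     */
    puts("thread_a got both locks");
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return NULL;
}

static void *thread_b(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock_b);     /* B: holds lock_b ...          */
    usleep(1000);
    pthread_mutex_lock(&lock_a);     /* ... and waits for lock_a -> deadlock */
    puts("thread_b got both locks");
    pthread_mutex_unlock(&lock_a);
    pthread_mutex_unlock(&lock_b);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, thread_a, NULL);
    pthread_create(&b, NULL, thread_b, NULL);
    pthread_join(a, NULL);           /* typically never returns */
    pthread_join(b, NULL);
    return 0;
}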
Diagnostic Steps
- The panic occurred as a result of a hard lockup condition detected on CPU 6:
crash> log | grep LOCK
[ 760.109604] NMI watchdog: Watchdog detected hard LOCKUP on CPU 6
[ 760.109949] Kernel panic - not syncing: Hard LOCKUP
- Backtrace of the panic task:
- On CPU 6, there was a kworker vying for a spinlock from within the rdmavt driver.
crash> bt
PID: 58     TASK: ffff8df9041b9c80  CPU: 6    COMMAND: "kworker/6:0H"
 #0 [fffffe549e08ea58] machine_kexec at ffffffffa967a897
 #1 [fffffe549e08eab0] __crash_kexec at ffffffffa97faeba
 #2 [fffffe549e08eb70] panic at ffffffffaa275ce7
 #3 [fffffe549e08ebf8] watchdog_overflow_callback.cold at ffffffffaa2822f9
 #4 [fffffe549e08ec08] __perf_event_overflow at ffffffffa99228f5
 #5 [fffffe549e08ec38] handle_pmi_common at ffffffffa9614208
 #6 [fffffe549e08ee08] intel_pmu_handle_irq at ffffffffa9614d23
 #7 [fffffe549e08ee48] perf_event_nmi_handler at ffffffffa9605fc8
 #8 [fffffe549e08ee68] nmi_handle at ffffffffa96324be
 #9 [fffffe549e08eeb0] default_do_nmi at ffffffffaa2d0100
#10 [fffffe549e08eed0] exc_nmi at ffffffffaa2d0300
#11 [fffffe549e08eef0] end_repeat_nmi at ffffffffaa40163c
    [exception RIP: native_queued_spin_lock_slowpath+0x76]
    RIP: ffffffffaa2e5706  RSP: ffffb93d8660bd00  RFLAGS: 00000002
    RAX: 0000000000000001  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff8df955283280
    RBP: ffff8df955283280   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000007  R11: 0000000000000007  R12: ffff8df96ee03000
    R13: 0000000000000286  R14: ffff8df955283280  R15: ffff8df955283000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
--- <NMI exception stack> ---
#12 [ffffb93d8660bd00] native_queued_spin_lock_slowpath at ffffffffaa2e5706
#13 [ffffb93d8660bd20] _raw_spin_lock at ffffffffaa2e54c5
#14 [ffffb93d8660bd28] rvt_ruc_loopback at ffffffffc0aca034 [rdmavt]
#15 [ffffb93d8660be08] hfi1_do_send at ffffffffc0be573c [hfi1]
#16 [ffffb93d8660be88] process_one_work at ffffffffa972fb57
#17 [ffffb93d8660bec8] worker_thread at ffffffffa973071e
#18 [ffffb93d8660bf18] kthread at ffffffffa9738ac0
#19 [ffffb93d8660bf50] ret_from_fork at ffffffffa9603e8c
- The spinlock address can be found in register RDI in the backtrace above. It is set up in the assembly below from register R14, which was previously derived from the address in register R15 plus offset 0x280. Per the x86-64 calling convention, the first argument to a routine is passed in RDI before the call.

crash> dis -rl ffffffffc0aca034 | tail -n 20
...
0xffffffffc0ac9ff8 <rvt_ruc_loopback+0x578>:    lea    0x280(%r15),%r14
...
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/./include/linux/spinlock.h: 351
0xffffffffc0aca00d <rvt_ruc_loopback+0x58d>:    mov    %r14,%rdi
...
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/./include/linux/spinlock.h: 351
0xffffffffc0aca02f <rvt_ruc_loopback+0x5af>:    call   0xffffffffaa2e54a0 <_raw_spin_lock>
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/drivers/infiniband/sw/rdmavt/qp.c: 2784
0xffffffffc0aca034 <rvt_ruc_loopback+0x5b4>:    movzbl 0x1d7(%r15),%eax
- Adding the 0x280 offset to the address in R15 gives the r_lock within the rvt_qp structure.

crash> struct rvt_qp.r_lock -o
struct rvt_qp {
  [0x280] spinlock_t r_lock;
}

2902 void rvt_ruc_loopback(struct rvt_qp *sqp)
2903 {
2904         struct rvt_ibport *rvp = NULL;
2905         struct rvt_dev_info *rdi = ib_to_rvt(sqp->ibqp.device);
2906         struct rvt_qp *qp;
2907         struct rvt_swqe *wqe;
2908         struct rvt_sge *sge;
2909         unsigned long flags;
2910         struct ib_wc wc;
2911         u64 sdata;
2912         atomic64_t *maddr;
2913         enum ib_wc_status send_status;
2914         bool release;
2915         int ret;
2916         bool copy_last = false;
2917         int local_ops = 0;
...
3133 flush_send:
3134         sqp->s_rnr_retry = sqp->s_rnr_retry_cnt;
3135         spin_lock(&sqp->r_lock);    <<<<<<
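As a sanity check on that arithmetic, here is a minimal sketch showing that a member's address is simply the structure base plus the member offset, which is why R15 (the rvt_qp base) plus 0x280 lands on &qp->r_lock. The struct below is hypothetical; the real layout and the 0x280 offset come from the kernel debuginfo as shown by the crash output above.

/* Minimal sketch of the base + offset arithmetic (hypothetical struct). */
#include <stddef.h>
#include <stdio.h>

struct fake_qp {
    char pad[0x280];          /* stands in for the members preceding r_lock */
    int  r_lock;              /* stands in for spinlock_t r_lock            */
};

int main(void)
{
    static struct fake_qp qp;

    printf("offsetof(r_lock) = %#zx\n", offsetof(struct fake_qp, r_lock));
    printf("base             = %p\n", (void *)&qp);
    printf("base + 0x280     = %p\n", (void *)((char *)&qp + 0x280));
    printf("&qp.r_lock       = %p\n", (void *)&qp.r_lock);
    return 0;
}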
- The address in R15 was 0xffff8df955283000, and the calculated address of the lock matches the address in RDI in the backtrace above.

crash> kmem ffff8df955283000
CACHE             OBJSIZE  ALLOCATED  TOTAL  SLABS  SSIZE  NAME
ffff8df900042c00     2048       3339   4256    266    32k  kmalloc-2k
  SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
  ffffe74e8554a000  ffff8df955280000     0     16          5    11
  FREE / [ALLOCATED]
  [ffff8df955283000]

      PAGE       PHYSICAL      MAPPING          INDEX  CNT  FLAGS
ffffe74e8554a0c0  155283000  dead000000000400       0    0  17ffffc0000000

crash> struct rvt_qp.r_lock ffff8df955283000 -o
struct rvt_qp {
  [ffff8df955283280] spinlock_t r_lock;
}

RDI: ffff8df955283280
- Searching kernel space memory, we see that the address of the lock is also referenced in the kernel stack of the namd3 PID that is actively running on CPU 11.

crash> search -t ffff8df955283280
PID: 25490  TASK: ffff8e08847a8000  CPU: 11   COMMAND: "namd3"
ffffb93d880d3590: ffff8df955283280

crash> runq -c 11
CPU 11 RUNQUEUE: ffff8e08400f39c0
  CURRENT: PID: 25490  TASK: ffff8e08847a8000  COMMAND: "namd3"
- We see that the address is in CPU 11's register R15.

crash> bt -c 11
PID: 25490  TASK: ffff8e08847a8000  CPU: 11   COMMAND: "namd3"
 #0 [fffffe24b735de60] crash_nmi_callback at ffffffffa966c121
 #1 [fffffe24b735de68] nmi_handle at ffffffffa96324be
 #2 [fffffe24b735deb0] default_do_nmi at ffffffffaa2d0100
 #3 [fffffe24b735ded0] exc_nmi at ffffffffaa2d0300
 #4 [fffffe24b735def0] end_repeat_nmi at ffffffffaa40163c
    [exception RIP: native_queued_spin_lock_slowpath+0x76]
    RIP: ffffffffaa2e5706  RSP: ffffb93d880d3828  RFLAGS: 00000002
    RAX: 0000000000000001  RBX: 000000000aa8a900  RCX: 0000000000000065
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff8df9552833c0
    RBP: ffff8df9552833c0   R8: ffffb93d880d38c0   R9: ffff8df930200000
    R10: 0000000000000002  R11: ffff8df930d1a000  R12: ffff8df9553456e8
    R13: ffff8df930200000  R14: 0000000000000000  R15: ffff8df955283280  <<<<
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #5 [ffffb93d880d3828] native_queued_spin_lock_slowpath at ffffffffaa2e5706
 #6 [ffffb93d880d3848] _raw_spin_lock at ffffffffaa2e54c5
 #7 [ffffb93d880d3850] rvt_qp_mr_clean at ffffffffc0acad5b [rdmavt]
 #8 [ffffb93d880d38b8] rvt_qp_iter at ffffffffc0ac7647 [rdmavt]
 #9 [ffffb93d880d38f8] rvt_dereg_mr at ffffffffc0ac6f46 [rdmavt]
#10 [ffffb93d880d3938] ib_dereg_mr_user at ffffffffc079ab80 [ib_core]
#11 [ffffb93d880d3968] destroy_hw_idr_uobject at ffffffffc0ab7c7e [ib_uverbs]
#12 [ffffb93d880d3988] uverbs_destroy_uobject at ffffffffc0ab8367 [ib_uverbs]
#13 [ffffb93d880d39b8] uobj_destroy at ffffffffc0ab87bc [ib_uverbs]
#14 [ffffb93d880d39d8] ib_uverbs_run_method at ffffffffc0aba7f3 [ib_uverbs]
#15 [ffffb93d880d3a28] ib_uverbs_cmd_verbs at ffffffffc0abaa92 [ib_uverbs]
#16 [ffffb93d880d3c60] ib_uverbs_ioctl at ffffffffc0ababf4 [ib_uverbs]
#17 [ffffb93d880d3ca0] __x64_sys_ioctl at ffffffffa9a649ea
#18 [ffffb93d880d3cd0] do_syscall_64 at ffffffffaa2ce45f
#19 [ffffb93d880d3f50] entry_SYSCALL_64_after_hwframe at ffffffffaa400130
    RIP: 0000152f9e1aa67b  RSP: 0000152f8e3fa0d8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000152f8e3fa1f8  RCX: 0000152f9e1aa67b
    RDX: 0000152f8e3fa1e0  RSI: 00000000c0181b01  RDI: 0000000000000008
    RBP: 0000152f8e3fa1c0   R8: 000000000172e990   R9: 0000152f60119ce0
    R10: 0000000001718310  R11: 0000000000000246  R12: 0000152f8e3fa1b0
    R13: 000000000000001c  R14: 0000152f8052b140  R15: 0000152f60119ce0
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
- The address was placed in R15 in routine rvt_qp_mr_clean() as it took ownership of the r_lock derived from the rvt_qp structure passed into the routine. So we see that CPU 11 does own the lock.

crash> dis -rl ffffffffc0acad5b
...
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/./include/linux/spinlock.h: 376
0xffffffffc0acad20 <rvt_qp_mr_clean+0x30>:      lea    0x280(%rdi),%r15
0xffffffffc0acad27 <rvt_qp_mr_clean+0x37>:      mov    %rdi,%rbp    <<<< ARG 1 `struct rvt_qp *qp` copied to rbp
0xffffffffc0acad2a <rvt_qp_mr_clean+0x3a>:      mov    %esi,%ebx
0xffffffffc0acad2c <rvt_qp_mr_clean+0x3c>:      mov    %r15,%rdi    <<<< ffff8df955283280
0xffffffffc0acad2f <rvt_qp_mr_clean+0x3f>:      call   0xffffffffaa2e5520 <_raw_spin_lock_irq>

688 void rvt_qp_mr_clean(struct rvt_qp *qp, u32 lkey)
689 {
690         bool lastwqe = false;
691
692         if (qp->ibqp.qp_type == IB_QPT_SMI ||
693             qp->ibqp.qp_type == IB_QPT_GSI)
694                 /* avoid special QPs */
695                 return;
696         spin_lock_irq(&qp->r_lock);    <<< here
697         spin_lock(&qp->s_hlock);
698         spin_lock(&qp->s_lock);
- CPU 11 continued on and took the qp->s_hlock, but is now spinning, vying for ownership of the qp->s_lock.

/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/./include/linux/spinlock.h: 351
0xffffffffc0acad34 <rvt_qp_mr_clean+0x44>:      lea    0x380(%rbp),%rax

crash> struct rvt_qp.s_hlock -o
struct rvt_qp {
  [0x380] spinlock_t s_hlock;
}

0xffffffffc0acad3b <rvt_qp_mr_clean+0x4b>:      mov    %rax,%rdi
0xffffffffc0acad3e <rvt_qp_mr_clean+0x4e>:      mov    %rax,0x8(%rsp)
0xffffffffc0acad43 <rvt_qp_mr_clean+0x53>:      call   0xffffffffaa2e54a0 <_raw_spin_lock>
0xffffffffc0acad48 <rvt_qp_mr_clean+0x58>:      lea    0x3c0(%rbp),%rax

crash> struct rvt_qp.s_lock -o
struct rvt_qp {
  [0x3c0] spinlock_t s_lock;
}

0xffffffffc0acad4f <rvt_qp_mr_clean+0x5f>:      mov    %rax,%rdi
0xffffffffc0acad52 <rvt_qp_mr_clean+0x62>:      mov    %rax,(%rsp)
0xffffffffc0acad56 <rvt_qp_mr_clean+0x66>:      call   0xffffffffaa2e54a0 <_raw_spin_lock>
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/drivers/infiniband/sw/rdmavt/qp.c: 700
0xffffffffc0acad5b <rvt_qp_mr_clean+0x6b>:      movzbl 0x1d7(%rbp),%eax
- The s_lock exists at offset 0x3c0 of the rvt_qp structure passed into the routine. We can find the lock address passed in RDI before vying for the lock based on the backtrace.

RDI: ffff8df9552833c0    <<<

crash> struct rvt_qp.s_lock ffff8df955283000 -o
struct rvt_qp {
  [ffff8df9552833c0] spinlock_t s_lock;    <<<
}
- Since this namd3 task is spinning, vying for the lock, we need to find the owner. Searching kernel space memory for the lock address, we see it is referenced in the stack of PID 58, which is actively running on CPU 6.

crash> search -t ffff8df9552833c0 | grep PID
PID: 58     TASK: ffff8df9041b9c80  CPU: 6    COMMAND: "kworker/6:0H"
PID: 25490  TASK: ffff8e08847a8000  CPU: 11   COMMAND: "namd3"
- The kworker is vying for the r_lock, but we see the address of the s_lock referenced in the stack frame marked below.

crash> bt 58
PID: 58     TASK: ffff8df9041b9c80  CPU: 6    COMMAND: "kworker/6:0H"
 #0 [fffffe549e08ea58] machine_kexec at ffffffffa967a897
 #1 [fffffe549e08eab0] __crash_kexec at ffffffffa97faeba
 #2 [fffffe549e08eb70] panic at ffffffffaa275ce7
 #3 [fffffe549e08ebf8] watchdog_overflow_callback.cold at ffffffffaa2822f9
 #4 [fffffe549e08ec08] __perf_event_overflow at ffffffffa99228f5
 #5 [fffffe549e08ec38] handle_pmi_common at ffffffffa9614208
 #6 [fffffe549e08ee08] intel_pmu_handle_irq at ffffffffa9614d23
 #7 [fffffe549e08ee48] perf_event_nmi_handler at ffffffffa9605fc8
 #8 [fffffe549e08ee68] nmi_handle at ffffffffa96324be
 #9 [fffffe549e08eeb0] default_do_nmi at ffffffffaa2d0100
#10 [fffffe549e08eed0] exc_nmi at ffffffffaa2d0300
#11 [fffffe549e08eef0] end_repeat_nmi at ffffffffaa40163c
    [exception RIP: native_queued_spin_lock_slowpath+0x76]
    RIP: ffffffffaa2e5706  RSP: ffffb93d8660bd00  RFLAGS: 00000002
    RAX: 0000000000000001  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff8df955283280
    RBP: ffff8df955283280   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000007  R11: 0000000000000007  R12: ffff8df96ee03000
    R13: 0000000000000286  R14: ffff8df955283280  R15: ffff8df955283000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
--- <NMI exception stack> ---
#12 [ffffb93d8660bd00] native_queued_spin_lock_slowpath at ffffffffaa2e5706
#13 [ffffb93d8660bd20] _raw_spin_lock at ffffffffaa2e54c5
#14 [ffffb93d8660bd28] rvt_ruc_loopback at ffffffffc0aca034 [rdmavt]
#15 [ffffb93d8660be08] hfi1_do_send at ffffffffc0be573c [hfi1]
#16 [ffffb93d8660be88] process_one_work at ffffffffa972fb57
#17 [ffffb93d8660bec8] worker_thread at ffffffffa973071e
#18 [ffffb93d8660bf18] kthread at ffffffffa9738ac0
#19 [ffffb93d8660bf50] ret_from_fork at ffffffffa9603e8c

crash> bt 58 -FFls
...
#14 [ffffb93d8660bd28] rvt_ruc_loopback+0x5b4 at ffffffffc0aca034 [rdmavt]
    /usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/drivers/infiniband/sw/rdmavt/qp.c: 2784
    ffffb93d8660bd30: [ffff8df9552833c0:kmalloc-2k] [ffff8df96ee03280:kmalloc-2k]
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ffffb93d8660bd40: 0000000000000286 0000000000000000
    ffffb93d8660bd50: ffff8df930202090 [ffff8e08a22178a8:kmalloc-512]
    ffffb93d8660bd60: 00ff8e0800000000 ffffb93d86eda200
    ffffb93d8660bd70: ffff8e083ffb3a80 ffff8e083ffb3a40
    ffffb93d8660bd80: [ffff8df99e563900:task_struct] 0000000000000000
    ffffb93d8660bd90: 0000000000000000 0000000000000000
    ffffb93d8660bda0: 0000000000000000 0000000000000000
    ffffb93d8660bdb0: 0000000000000000 0000000000000000
    ffffb93d8660bdc0: 0000000000000000 0000000000000000
    ffffb93d8660bdd0: 70bd0ffcb033d000 [ffff8df900bfe540:kmalloc-192]
    ffffb93d8660bde0: [ffff8df955283000:kmalloc-2k] [ffff8df930e67800:pool_workqueue]
    ffffb93d8660bdf0: [ffff8df9202d1400:kmalloc-1k] [ffff8df930e67805:pool_workqueue]
    ffffb93d8660be00: 0000000000000000 hfi1_do_send+0x38c
#15 [ffffb93d8660be08] hfi1_do_send+0x38c at ffffffffc0be573c [hfi1]
...
- The address of the s_lock was placed there in routine rvt_ruc_loopback(), and was derived from the rvt_qp structure passed into the routine.

crash> dis -rl ffffffffc0aca034 | less
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/drivers/infiniband/sw/rdmavt/qp.c: 2903
0xffffffffc0ac9a80 <rvt_ruc_loopback>:          nopl   0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffffc0ac9a85 <rvt_ruc_loopback+0x5>:      push   %r15
0xffffffffc0ac9a87 <rvt_ruc_loopback+0x7>:      mov    %rdi,%r15    <<< rvt_qp copied from RDI to R15.
...
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/./include/linux/spinlock.h: 326
0xffffffffc0ac9b24 <rvt_ruc_loopback+0xa4>:     lea    0x3c0(%r15),%rax    <<< deriving s_lock address from rvt_qp struct.

crash> struct rvt_qp.s_lock
struct rvt_qp {
  [0x3c0] spinlock_t s_lock;
}

/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/drivers/infiniband/sw/rdmavt/qp.c: 2930
0xffffffffc0ac9b2b <rvt_ruc_loopback+0xab>:     mov    %rax,%rdi    <<< copy s_lock to RDI
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/./include/linux/spinlock.h: 326
0xffffffffc0ac9b2e <rvt_ruc_loopback+0xae>:     mov    %rax,(%rsp)  <<< copy s_lock address to the top of the stack frame.
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/drivers/infiniband/sw/rdmavt/qp.c: 2930
0xffffffffc0ac9b32 <rvt_ruc_loopback+0xb2>:     call   0xffffffffaa2e4e60 <_raw_spin_lock_irqsave>
...
- The routine then continues on to unlock the r_lock of a different rvt_qp structure; that lock's address is stored in the stack frame at offset 0x8 of RSP. It then locks the s_lock whose address was saved on the stack. Finally, it tries to take the r_lock of the same rvt_qp (sqp) and spins, as discussed in the steps above.

crash> dis -rl ffffffffc0aca034 | less
...
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/./include/linux/spinlock.h: 406
0xffffffffc0ac9fea <rvt_ruc_loopback+0x56a>:    mov    0x10(%rsp),%rsi
0xffffffffc0ac9fef <rvt_ruc_loopback+0x56f>:    mov    0x8(%rsp),%rdi
0xffffffffc0ac9ff4 <rvt_ruc_loopback+0x574>:    mov    %edx,0x30(%rsp)
0xffffffffc0ac9ff8 <rvt_ruc_loopback+0x578>:    lea    0x280(%r15),%r14    <<<<

crash> struct rvt_qp.r_lock -o
struct rvt_qp {
  [0x280] spinlock_t r_lock;
}

0xffffffffc0ac9fff <rvt_ruc_loopback+0x57f>:    call   0xffffffffaa2e4f10 <_raw_spin_unlock_irqrestore>
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/drivers/infiniband/sw/rdmavt/qp.c: 3131
0xffffffffc0aca004 <rvt_ruc_loopback+0x584>:    mov    (%rsp),%rdi                                       <<< lock the s_lock that exists
0xffffffffc0aca008 <rvt_ruc_loopback+0x588>:    call   0xffffffffaa2e4e60 <_raw_spin_lock_irqsave>       <<< on the stack frame at offset 0x0.
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/./include/linux/spinlock.h: 351
0xffffffffc0aca00d <rvt_ruc_loopback+0x58d>:    mov    %r14,%rdi                                         <<< then get r_lock address
...
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/./include/linux/spinlock.h: 351
0xffffffffc0aca02f <rvt_ruc_loopback+0x5af>:    call   0xffffffffaa2e54a0 <_raw_spin_lock>               <<< take the r_lock address.
/usr/src/debug/kernel-5.14.0-503.29.1.el9_5/linux-5.14.0-503.29.1.el9_5.x86_64/drivers/infiniband/sw/rdmavt/qp.c: 2784
0xffffffffc0aca034 <rvt_ruc_loopback+0x5b4>:    movzbl 0x1d7(%r15),%eax

2892 /**
2893  * rvt_ruc_loopback - handle UC and RC loopback requests
2894  * @sqp: the sending QP
2895  *
2896  * This is called from rvt_do_send() to forward a WQE addressed to the same HFI
2897  * Note that although we are single threaded due to the send engine, we still
2898  * have to protect against post_send(). We don't have to worry about
2899  * receive interrupts since this is a connected protocol and all packets
2900  * will pass through here.
2901  */
2902 void rvt_ruc_loopback(struct rvt_qp *sqp)
2903 {
2904         struct rvt_ibport *rvp = NULL;
2905         struct rvt_dev_info *rdi = ib_to_rvt(sqp->ibqp.device);
2906         struct rvt_qp *qp;
2907         struct rvt_swqe *wqe;
2908         struct rvt_sge *sge;
2909         unsigned long flags;
2910         struct ib_wc wc;
2911         u64 sdata;
2912         atomic64_t *maddr;
2913         enum ib_wc_status send_status;
2914         bool release;
2915         int ret;
2916         bool copy_last = false;
2917         int local_ops = 0;
....
3129 send_comp:
3130         spin_unlock_irqrestore(&qp->r_lock, flags);    <<< crash> px ((struct rvt_qp *)0xffff8df96ee03000)->r_lock->rlock->raw_lock->val.counter = 0x0, so free!
3131         spin_lock_irqsave(&sqp->s_lock, flags);        <<< take ownership of this lock, exists at 0x0 of rsp.
3132         rvp->n_loop_pkts++;
3133 flush_send:
3134         sqp->s_rnr_retry = sqp->s_rnr_retry_cnt;
3135         spin_lock(&sqp->r_lock);                       <<< stuck spinning here.
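Putting the two paths together: the namd3 task on CPU 11 (rvt_qp_mr_clean) holds the qp->r_lock and qp->s_hlock and spins on qp->s_lock, while the kworker on CPU 6 (rvt_ruc_loopback) holds sqp->s_lock and spins on sqp->r_lock of the same rvt_qp, so neither side can make progress. The sketch below is a simplified userspace model of just that lock ordering, with pthread mutexes standing in for the driver's spinlocks; it illustrates the ordering observed in this vmcore and is not the driver code.

/* Simplified model of the two lock orderings seen in this vmcore.
 * Only the ordering is modeled; everything else is omitted.
 * Compile with: gcc rdmavt_abba_model.c -pthread
 */
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t r_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t s_hlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t s_lock  = PTHREAD_MUTEX_INITIALIZER;

/* Path taken by rvt_qp_mr_clean() (qp.c:696-698 in the listing above). */
static void *mr_clean_path(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&r_lock);    /* spin_lock_irq(&qp->r_lock)               */
    pthread_mutex_lock(&s_hlock);   /* spin_lock(&qp->s_hlock)                  */
    usleep(1000);                   /* widen the race window                    */
    pthread_mutex_lock(&s_lock);    /* spin_lock(&qp->s_lock) <- CPU 11 spins here */
    pthread_mutex_unlock(&s_lock);
    pthread_mutex_unlock(&s_hlock);
    pthread_mutex_unlock(&r_lock);
    return NULL;
}

/* Path taken by rvt_ruc_loopback() at send_comp/flush_send (qp.c:3131-3135). */
static void *loopback_path(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&s_lock);    /* spin_lock_irqsave(&sqp->s_lock, flags)   */
    usleep(1000);
    pthread_mutex_lock(&r_lock);    /* spin_lock(&sqp->r_lock) <- CPU 6 spins here */
    pthread_mutex_unlock(&r_lock);
    pthread_mutex_unlock(&s_lock);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, mr_clean_path, NULL);
    pthread_create(&t2, NULL, loopback_path, NULL);
    pthread_join(t1, NULL);         /* with this interleaving, never returns */
    pthread_join(t2, NULL);
    return 0;
}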