Mellanox ConnectX-4 (mlx5) used for SCSI RDMA Protocol (SRP) is exposed to lock contention in srp_queuecommand(), srp_recv_completion(), mlx5_ib_poll_cq()
Issue
- While using a Mellanox ConnectX-4 adapter (mlx5 driver) to access SCSI RDMA Protocol (SRP) devices, hard lockups have been seen under I/O load. The logged call trace varies, but the most commonly reported paths include srp_recv_completion(), srp_queuecommand(), and srp_send_completion().
- The same issue has also surfaced in __list_add(), with the following in the logs:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
    IP: [<ffffffff8130c2c3>] __list_add+0x33/0xc0
    Oops: 0002 [#1] SMP
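
The function names above can serve as a quick signature when checking a saved kernel log for this issue. The following is a minimal sketch, not a supported tool: the helper name `find_signature_hits` and the regular expression are assumptions made for illustration, and the pattern only assumes the usual `symbol+0xoffset/0xsize` form that appears in the trace line quoted above. A match means only that one of the listed functions appears in a trace; it does not confirm the root cause.

```python
import re

# Function names taken from the title and the call traces described above.
SRP_TRACE_FUNCTIONS = (
    "srp_recv_completion",
    "srp_queuecommand",
    "srp_send_completion",
    "mlx5_ib_poll_cq",
    "__list_add",
)

# Matches a listed symbol in trace form, e.g. "__list_add+0x33/0xc0".
# The lookbehind rejects longer symbols that merely end with a listed name.
_SYMBOL_RE = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(re.escape(name) for name in SRP_TRACE_FUNCTIONS)
    + r")\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def find_signature_hits(log_text):
    """Return the sorted, de-duplicated listed function names found in log_text."""
    return sorted({match.group(1) for match in _SYMBOL_RE.finditer(log_text)})


# Log lines quoted from the Issue section above.
sample = (
    "BUG: unable to handle kernel NULL pointer dereference at 0000000000000002\n"
    "IP: [<ffffffff8130c2c3>] __list_add+0x33/0xc0\n"
    "Oops: 0002 [#1] SMP\n"
)
print(find_signature_hits(sample))  # ['__list_add']
```

To check a real system, the text of a saved log or vmcore-dmesg file could be read into a string and passed to the same function.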
Environment
- Red Hat Enterprise Linux (RHEL) 7.2
- Red Hat Enterprise Linux (RHEL) 6.8
- mlx5 driver