The kernel crashes with a general protection fault (GPF) in mutex_spin_on_owner(). A possible cause is a known RDMA/cma bug that was introduced by a patch taken from upstream commit 722c7b2bfead.
Issue
- The kernel crashes with a general protection fault (GPF) in mutex_spin_on_owner().
[56846.416132] general protection fault: 0000 [#1] SMP
[56846.417749] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ptlrpc(OE) ko2iblnd(OE) obdclass(OE) lnet(OE) libcfs(OE) bonding ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp rpcrdma ib_umad rdma_ucm ib_iser ib_ipoib rdma_cm iw_cm sunrpc libiscsi scsi_transport_iscsi ib_cm mlx4_ib ib_uverbs ib_core iTCO_wdt iTCO_vendor_support mxm_wmi sb_edac intel_powerclamp dm_round_robin coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr cdc_ether usbnet mii mei_me lpc_ich i2c_i801 mei sg ipmi_si ipmi_devintf
[56846.427958] ipmi_msghandler wmi acpi_power_meter acpi_pad dm_multipath binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_en mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlx4_core drm crc32c_intel bnx2x tg3 drm_panel_orientation_quirks devlink mdio libcrc32c ptp pps_core sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common qla2xxx nvme_fc nvme_fabrics nvme_core scsi_transport_fc scsi_tgt ata_piix libata megaraid_sas dm_mirror dm_region_hash dm_log dm_mod fuse
[56846.437415] CPU: 30 PID: 50120 Comm: kworker/u112:1 Kdump: loaded Tainted: G OE ------------ 3.10.0-1160.53.1.el7.x86_64 #1
[56846.441387] Hardware name: LENOVO System x3650 M5 -[8871AC1]-/01DC328, BIOS -[TCE150C-3.40]- 01/18/2021
[56846.443467] Workqueue: rdma_cm cma_work_handler [rdma_cm]
[56846.445530] task: ffff8955b3615280 ti: ffff8956043d4000 task.ti: ffff8956043d4000
[56846.447612] RIP: 0010:[<ffffffffaeec9602>] [<ffffffffaeec9602>] mutex_spin_on_owner+0x12/0x50
[56846.449742] RSP: 0018:ffff8956043d7d80 EFLAGS: 00010246
[56846.451864] RAX: 5a5a5a5a5a5a5a5a RBX: ffff897636c8ee78 RCX: ffff8956043d7fd8
[56846.454001] RDX: 0000000000000000 RSI: 5a5a5a5a5a5a5a5a RDI: ffff897636c8ee78
[56846.456135] RBP: ffff8956043d7d80 R08: ffff897564f74888 R09: da29f59842f74880
[56846.458284] R10: da29f59842f74880 R11: 0000000000000000 R12: 5a5a5a5a5a5a5a5a
[56846.460446] R13: ffff8955b3615280 R14: ffff8956043d7fd8 R15: ffff897636c8ee98
[56846.462604] FS: 0000000000000000(0000) GS:ffff895640000000(0000) knlGS:0000000000000000
[56846.464780] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[56846.466956] CR2: 00007f7d092efd70 CR3: 0000001fd0c62000 CR4: 00000000003607e0
[56846.469159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[56846.471346] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[56846.473510] Call Trace:
[56846.475669] [<ffffffffaf5882c6>] __mutex_lock_slowpath+0x166/0x1d0
[56846.477864] [<ffffffffaee2b59e>] ? __switch_to+0xce/0x580
[56846.480039] [<ffffffffaf5875ff>] mutex_lock+0x1f/0x2f
[56846.482203] [<ffffffffc07ddac5>] cma_work_handler+0x25/0xa0 [rdma_cm]
[56846.484379] [<ffffffffaeebde8f>] process_one_work+0x17f/0x440
[56846.486556] [<ffffffffaeebefa6>] worker_thread+0x126/0x3c0
[56846.488715] [<ffffffffaeebee80>] ? manage_workers.isra.26+0x2a0/0x2a0
[56846.490871] [<ffffffffaeec5e61>] kthread+0xd1/0xe0
[56846.493020] [<ffffffffaeec5d90>] ? insert_kthread_work+0x40/0x40
[56846.495183] [<ffffffffaf595df7>] ret_from_fork_nospec_begin+0x21/0x21
[56846.497334] [<ffffffffaeec5d90>] ? insert_kthread_work+0x40/0x40
[56846.499470] Code: b8 01 00 00 00 75 f3 48 8b 86 a0 07 00 00 5d 48 c1 e8 08 83 e0 01 c3 66 90 0f 1f 44 00 00 55 48 89 e5 48 8b 47 18 48 39 f0 75 31 <8b> 48 28 85 c9 74 2a 65 48 8b 0c 25 b8 0e 01 00 eb 0b 0f 1f 40
[56846.504135] RIP [<ffffffffaeec9602>] mutex_spin_on_owner+0x12/0x50
[56846.506372] RSP <ffff8956043d7d80>
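In the register dump above, RAX, RSI, and R12 all hold 5a5a5a5a5a5a5a5a, a single byte repeated eight times. Values like this are often memory-poison fill patterns rather than valid pointers (0x5a is, for instance, the kernel's POISON_INUSE slab debug byte). That would be consistent with the mutex owner field having been read from memory that is no longer valid, but this interpretation is an assumption, not something the log alone proves. As a triage aid, the following minimal Python sketch scans oops text for such registers. The function name repeated_byte_registers and the regular expression are hypothetical helpers, not part of any kernel or crash tooling.

```python
import re

# Matches "NAME: <16 hex digits>" pairs as printed in an x86_64 oops register dump.
# Lines such as "RSP: 0018:ffff..." are skipped because the value is not 16 hex digits.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def repeated_byte_registers(oops_text):
    """Return {register: byte} for registers whose 8 bytes are identical and nonzero."""
    hits = {}
    for name, value in REG_RE.findall(oops_text):
        first = value[:2]
        if first != "00" and value == first * 8:
            hits[name] = int(first, 16)
    return hits


# Register lines copied from the oops above (timestamps omitted).
SAMPLE = """\
RAX: 5a5a5a5a5a5a5a5a RBX: ffff897636c8ee78 RCX: ffff8956043d7fd8
RDX: 0000000000000000 RSI: 5a5a5a5a5a5a5a5a RDI: ffff897636c8ee78
RBP: ffff8956043d7d80 R08: ffff897564f74888 R09: da29f59842f74880
R10: da29f59842f74880 R11: 0000000000000000 R12: 5a5a5a5a5a5a5a5a
"""

if __name__ == "__main__":
    for reg, byte in repeated_byte_registers(SAMPLE).items():
        print(f"{reg}: filled with 0x{byte:02x}")
```

The same function can be run over a saved vmcore-dmesg.txt or console log. A hit only suggests where to look first in the vmcore and does not by itself identify the component that freed or overwrote the memory.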
Environment
Red Hat Enterprise Linux 7.9.z - kernel-3.10.0-1160.53.1.el7.x86_64