The kernel crashes with a general protection fault (GPF) in mutex_spin_on_owner(). A possible cause is a known RDMA/cma bug introduced by a patch from upstream commit 722c7b2bfead.

Solution Verified - Updated -

Issue

  • The kernel crashes with a general protection fault (GPF) in mutex_spin_on_owner():
[56846.416132] general protection fault: 0000 [#1] SMP 
[56846.417749] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ptlrpc(OE) ko2iblnd(OE) obdclass(OE) lnet(OE) libcfs(OE) bonding ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp rpcrdma ib_umad rdma_ucm ib_iser ib_ipoib rdma_cm iw_cm sunrpc libiscsi scsi_transport_iscsi ib_cm mlx4_ib ib_uverbs ib_core iTCO_wdt iTCO_vendor_support mxm_wmi sb_edac intel_powerclamp dm_round_robin coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr cdc_ether usbnet mii mei_me lpc_ich i2c_i801 mei sg ipmi_si ipmi_devintf
[56846.427958]  ipmi_msghandler wmi acpi_power_meter acpi_pad dm_multipath binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_en mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlx4_core drm crc32c_intel bnx2x tg3 drm_panel_orientation_quirks devlink mdio libcrc32c ptp pps_core sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common qla2xxx nvme_fc nvme_fabrics nvme_core scsi_transport_fc scsi_tgt ata_piix libata megaraid_sas dm_mirror dm_region_hash dm_log dm_mod fuse
[56846.437415] CPU: 30 PID: 50120 Comm: kworker/u112:1 Kdump: loaded Tainted: G           OE  ------------   3.10.0-1160.53.1.el7.x86_64 #1
[56846.441387] Hardware name: LENOVO System x3650 M5 -[8871AC1]-/01DC328, BIOS -[TCE150C-3.40]- 01/18/2021
[56846.443467] Workqueue: rdma_cm cma_work_handler [rdma_cm]
[56846.445530] task: ffff8955b3615280 ti: ffff8956043d4000 task.ti: ffff8956043d4000
[56846.447612] RIP: 0010:[<ffffffffaeec9602>]  [<ffffffffaeec9602>] mutex_spin_on_owner+0x12/0x50
[56846.449742] RSP: 0018:ffff8956043d7d80  EFLAGS: 00010246
[56846.451864] RAX: 5a5a5a5a5a5a5a5a RBX: ffff897636c8ee78 RCX: ffff8956043d7fd8
[56846.454001] RDX: 0000000000000000 RSI: 5a5a5a5a5a5a5a5a RDI: ffff897636c8ee78
[56846.456135] RBP: ffff8956043d7d80 R08: ffff897564f74888 R09: da29f59842f74880
[56846.458284] R10: da29f59842f74880 R11: 0000000000000000 R12: 5a5a5a5a5a5a5a5a
[56846.460446] R13: ffff8955b3615280 R14: ffff8956043d7fd8 R15: ffff897636c8ee98
[56846.462604] FS:  0000000000000000(0000) GS:ffff895640000000(0000) knlGS:0000000000000000
[56846.464780] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[56846.466956] CR2: 00007f7d092efd70 CR3: 0000001fd0c62000 CR4: 00000000003607e0
[56846.469159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[56846.471346] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[56846.473510] Call Trace:
[56846.475669]  [<ffffffffaf5882c6>] __mutex_lock_slowpath+0x166/0x1d0
[56846.477864]  [<ffffffffaee2b59e>] ? __switch_to+0xce/0x580
[56846.480039]  [<ffffffffaf5875ff>] mutex_lock+0x1f/0x2f
[56846.482203]  [<ffffffffc07ddac5>] cma_work_handler+0x25/0xa0 [rdma_cm]
[56846.484379]  [<ffffffffaeebde8f>] process_one_work+0x17f/0x440
[56846.486556]  [<ffffffffaeebefa6>] worker_thread+0x126/0x3c0
[56846.488715]  [<ffffffffaeebee80>] ? manage_workers.isra.26+0x2a0/0x2a0
[56846.490871]  [<ffffffffaeec5e61>] kthread+0xd1/0xe0
[56846.493020]  [<ffffffffaeec5d90>] ? insert_kthread_work+0x40/0x40
[56846.495183]  [<ffffffffaf595df7>] ret_from_fork_nospec_begin+0x21/0x21
[56846.497334]  [<ffffffffaeec5d90>] ? insert_kthread_work+0x40/0x40
[56846.499470] Code: b8 01 00 00 00 75 f3 48 8b 86 a0 07 00 00 5d 48 c1 e8 08 83 e0 01 c3 66 90 0f 1f 44 00 00 55 48 89 e5 48 8b 47 18 48 39 f0 75 31 <8b> 48 28 85 c9 74 2a 65 48 8b 0c 25 b8 0e 01 00 eb 0b 0f 1f 40 
[56846.504135] RIP  [<ffffffffaeec9602>] mutex_spin_on_owner+0x12/0x50
[56846.506372]  RSP <ffff8956043d7d80>
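
The register dump above can be triaged with a short script. The following is a minimal Python sketch, not a Red Hat tool: the helper names (parse_registers, is_canonical, is_repeated_byte) are hypothetical, and it assumes 48-bit (4-level paging) x86_64 virtual addressing. It flags register values that are non-canonical addresses or that consist of a single repeated byte. The Code: bytes appear to decode to a load of RAX from 0x18(%rdi) followed by the faulting read from 0x28(%rax), so a non-canonical RAX would be consistent with a GPF at this point. A repeated 0x5a byte is commonly a memory-poison fill pattern, but which component wrote it cannot be determined from this log alone.

```python
import re

# Register lines copied from the oops above (timestamps dropped).
OOPS_REGISTERS = """\
RAX: 5a5a5a5a5a5a5a5a RBX: ffff897636c8ee78 RCX: ffff8956043d7fd8
RDX: 0000000000000000 RSI: 5a5a5a5a5a5a5a5a RDI: ffff897636c8ee78
RBP: ffff8956043d7d80 R08: ffff897564f74888 R09: da29f59842f74880
R10: da29f59842f74880 R11: 0000000000000000 R12: 5a5a5a5a5a5a5a5a
R13: ffff8955b3615280 R14: ffff8956043d7fd8 R15: ffff897636c8ee98
"""

# Matches "RAX: <16 hex digits>", "R08: <16 hex digits>", and so on.
REG_RE = re.compile(r"\b(R[A-Z0-9]{2}):\s*([0-9a-f]{16})\b")


def parse_registers(text):
    """Return a dict mapping register name to its integer value."""
    return {name: int(value, 16) for name, value in REG_RE.findall(text)}


def is_canonical(addr):
    """True if bits 63..47 are all equal (assumes 48-bit virtual addresses)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


def is_repeated_byte(value):
    """True if a non-zero 64-bit value is one byte repeated eight times."""
    return value != 0 and len(set(value.to_bytes(8, "little"))) == 1


if __name__ == "__main__":
    for name, value in sorted(parse_registers(OOPS_REGISTERS).items()):
        notes = []
        if not is_canonical(value):
            notes.append("non-canonical")
        if is_repeated_byte(value):
            notes.append("repeated-byte pattern")
        if notes:
            print(f"{name} = {value:016x}: {', '.join(notes)}")
```

A non-canonical value only matters if the faulting instruction dereferences that register, so the output is a starting point for analysis with a vmcore, not a diagnosis.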

Environment

Red Hat Enterprise Linux 7.9.z - kernel-3.10.0-1160.53.1.el7.x86_64
