Kernel panic observed during module unload with "kernel BUG at kernel/cpu.c:1955!" error message
Issue
- The system panics on kernel-4.18.0-372.36.1.el8_6.x86_64 while unloading the VxFS kernel module, with the following message:
PANIC: "kernel BUG at kernel/cpu.c:1955!"
- During post-module-load processing, VxFS calls cpuhp_setup_state(), which returns a very large value:
vx_cpu_state_key = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
"vxfs:pvec_init",
delayed_pvec_online_call,
delayed_pvec_offline_call);
- During module unload, the returned value is passed to cpuhp_remove_state(), and the system panics on the BUG_ON() in __cpuhp_remove_state_cpuslocked() with the following stack trace:
crash> bt
PID: 26211 TASK: ffff95f77e3dc000 CPU: 7 COMMAND: "rmmod"
#0 [ffffaa0506447bf8] machine_kexec at ffffffff89867e8e
#1 [ffffaa0506447c50] __crash_kexec at ffffffff899ae65a
#2 [ffffaa0506447d10] crash_kexec at ffffffff899af591
#3 [ffffaa0506447d28] oops_end at ffffffff898274f1
#4 [ffffaa0506447d48] do_trap at ffffffff898239a7
#5 [ffffaa0506447d90] do_invalid_op at ffffffff898244b6
#6 [ffffaa0506447db0] invalid_op at ffffffff8a200d64
[exception RIP: __cpuhp_remove_state_cpuslocked+246]
RIP: ffffffff898f3906 RSP: ffffaa0506447e60 RFLAGS: 00010286
RAX: 00000000ffffffef RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000fffffff0
RBP: fffffffffffffd80 R8: 00000000000095bb R9: 00000000000095bb
R10: 0000000000000005 R11: 000000b6673b7e00 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffffaa0506447e80] __cpuhp_remove_state at ffffffff898f393e
#8 [ffffaa0506447e98] vx_delayed_pvec_deinit_v2 at ffffffffc13fc0cd [vxfs]
#9 [ffffaa0506447ea8] vx_osdep_deinit at ffffffffc13a71fa [vxfs]
#10 [ffffaa0506447eb8] cleanup_module at ffffffffc1498999 [vxfs]
#11 [ffffaa0506447ee0] __x64_sys_delete_module at ffffffff899a8d2d
#12 [ffffaa0506447f38] do_syscall_64 at ffffffff898043ab
#13 [ffffaa0506447f50] entry_SYSCALL_64_after_hwframe at ffffffff8a2000a9
RIP: 00007f8dba1b20ab RSP: 00007fff91f6ea18 RFLAGS: 00000206
RAX: ffffffffffffffda RBX: 00005589689637c0 RCX: 00007f8dba1b20ab
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000558968963828
RBP: 0000000000000000 R8: 00007fff91f6d991 R9: 0000000000000000
R10: 00007f8dba2e9460 R11: 0000000000000206 R12: 00007fff91f6ec40
R13: 00007fff91f6ef11 R14: 00005589689632a0 R15: 00005589689637c0
ORIG_RAX: 00000000000000b0 CS: 0033 SS: 002b
- Value of vx_cpu_state_key:
crash> rd vx_cpu_state_key
ffffffffc160d560: 00000000fffffff0 ........
crash> rd -d vx_cpu_state_key
ffffffffc160d560: 4294967280
- The BUG_ON() is hit because cpuhp_cb_check() returns -EINVAL for this value:
static int cpuhp_cb_check(enum cpuhp_state state)
{
	if (state <= CPUHP_OFFLINE || state >= CPUHP_ONLINE)
		return -EINVAL;
	return 0;
}
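The value in vx_cpu_state_key can be read more easily as a signed integer: 4294967280 (0xfffffff0) is -16, which is numerically equal to -EBUSY on Linux. That would be consistent with the registration call having returned an error code that was then stored as the state key, although the article does not confirm this. The following user-space sketch reproduces the arithmetic and the range test from cpuhp_cb_check(). All demo_* names are hypothetical, and DEMO_CPUHP_ONLINE is only a placeholder, since the real CPUHP_ONLINE value depends on the kernel build. The last helper shows one possible guard a module could apply before handing a stored key to cpuhp_remove_state(); it is an illustration, not the vendor's fix.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the bounds of the kernel's enum cpuhp_state.
 * CPUHP_OFFLINE is assumed to be 0; 200 is a placeholder for CPUHP_ONLINE,
 * whose real value differs between kernel builds. */
enum demo_cpuhp_state {
	DEMO_CPUHP_OFFLINE = 0,
	DEMO_CPUHP_ONLINE  = 200
};

/* Reinterpret the raw 32-bit word printed by "rd" as a signed int. */
int demo_key_as_int(uint32_t raw)
{
	int value;

	assert(sizeof(value) == sizeof(raw)); /* the sketch assumes a 32-bit int */
	memcpy(&value, &raw, sizeof(value));
	return value;
}

/* Same range test as cpuhp_cb_check(), using the placeholder bounds above. */
int demo_cpuhp_cb_check(int state)
{
	if (state <= DEMO_CPUHP_OFFLINE || state >= DEMO_CPUHP_ONLINE)
		return -EINVAL;
	return 0;
}

/* Possible unload-time guard: returns 1 only if the stored key looks like a
 * valid dynamic state, 0 if it is an error code or otherwise out of range. */
int demo_key_is_removable(int key)
{
	return key > 0 && demo_cpuhp_cb_check(key) == 0;
}
```

With these definitions, demo_key_as_int(0xfffffff0u) yields -16, demo_cpuhp_cb_check(-16) yields -EINVAL (the condition that triggers the BUG_ON()), and demo_key_is_removable(-16) yields 0, so a module applying such a guard would skip the removal call instead of panicking.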
Environment
- Red Hat Enterprise Linux
- RHEL 8.6 kernel versions from 4.18.0-372.36.1.el8_6 to 4.18.0-372.43.1.el8_6
- RHEL 8.7 kernel versions from GA up to 4.18.0-425.13.1.el8_7
- A vxfs module built against one of the broken kernels above works correctly with those kernels, but breaks with fixed kernels as well as with older kernels that were not yet broken
- Any other third-party module that uses the kernel's CPU hotplug features; the issue is not limited to vxfs