RHEL7: System crashes after a CPU is added online (hotplug / hot add) in Red Hat Enterprise Linux 7
Issue
- System crashes after a CPU is added online (hotplug).
- The following log messages and backtrace are observed in the kernel ring buffer (dmesg) at the time of the issue:
[1960173.260232] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[1960173.262269] IP: [<ffffffff810124da>] rapl_cpu_init+0x6a/0x80
[1960173.262619] PGD 0
[1960173.262974] Oops: 0002 [#1] SMP
[1960173.263324] Modules linked in: cts ip6table_filter ip6_tables iptable_filter nfsv3 8021q garp mrp stp llc rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache vmw_vsock_vmci_transport vsock ext4 mbcache jbd2 intel_powerclamp coretemp iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw ppdev gf128mul glue_helper ablk_helper cryptd vmw_balloon pcspkr sg shpchp i2c_piix4 vmw_vmci parport_pc parport nfsd nfs_acl lockd grace binfmt_misc auth_rpcgss sunrpc ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel serio_raw vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm vmxnet3 ata_piix drm vmw_pvscsi i2c_core libata floppy fjes dm_mirror dm_region_hash dm_log dm_mod
[1960173.265723] CPU: 0 PID: 27552 Comm: systemd-udevd Not tainted 3.10.0-514.6.1.el7.x86_64 #1
[1960173.266251] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/17/2015
[1960173.266784] task: ffff8801812eaf10 ti: ffff880097eac000 task.ti: ffff880097eac000
[1960173.267362] RIP: 0010:[<ffffffff810124da>] [<ffffffff810124da>] rapl_cpu_init+0x6a/0x80
[1960173.267899] RSP: 0018:ffff880097eafd18 EFLAGS: 00010246
[1960173.268426] RAX: 0000000000000020 RBX: 0000000000000001 RCX: 0000000000000020
[1960173.268987] RDX: 00000000ffffffff RSI: 0000000000000020 RDI: 0000000000000020
[1960173.269552] RBP: ffff880097eafd28 R08: ffffffff81ca5bc0 R09: 0000000000000000
[1960173.270168] R10: ffff880236e19b80 R11: ffffea0001f46400 R12: 0000000000000000
[1960173.270723] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
[1960173.271293] FS: 00007fe0199fa8c0(0000) GS:ffff880236e00000(0000) knlGS:0000000000000000
[1960173.271872] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[1960173.272461] CR2: 0000000000000008 CR3: 0000000089c90000 CR4: 00000000000407f0
[1960173.273054] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1960173.273614] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[1960173.274201] Stack:
[1960173.274797] 0000000000000001 ffffffff819fb210 ffff880097eafd48 ffffffff8101257d
[1960173.275454] 00000000fffffff9 ffffffff819fb210 ffff880097eafd80 ffffffff816921bc
[1960173.276091] 0000000000000001 0000000000000000 0000000000000000 ffff880184e81f60
[1960173.276703] Call Trace:
[1960173.277255] [<ffffffff8101257d>] rapl_cpu_notifier+0x8d/0x100
[1960173.277887] [<ffffffff816921bc>] notifier_call_chain+0x4c/0x70
[1960173.278523] [<ffffffff810b67ae>] __raw_notifier_call_chain+0xe/0x10
[1960173.279153] [<ffffffff81089883>] cpu_notify+0x23/0x50
[1960173.279773] [<ffffffff81089a6d>] _cpu_up+0x17d/0x1a0
[1960173.280412] [<ffffffff81089b41>] cpu_up+0xb1/0x120
[1960173.281043] [<ffffffff81678e4c>] cpu_subsys_online+0x3c/0x90
[1960173.281693] [<ffffffff8142a445>] device_online+0x65/0x90
[1960173.282348] [<ffffffff8142a505>] store_online+0x95/0xa0
[1960173.282980] [<ffffffff814270a8>] dev_attr_store+0x18/0x30
[1960173.283644] [<ffffffff8127b736>] sysfs_write_file+0xc6/0x140
[1960173.284288] [<ffffffff811fe27d>] vfs_write+0xbd/0x1e0
[1960173.284970] [<ffffffff811fed9f>] SyS_write+0x7f/0xe0
[1960173.285623] [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b
[1960173.286255] Code: c0 08 a0 00 00 48 8b 14 d5 80 57 ad 81 bf ff ff ff ff 48 8b 14 10 e8 16 56 30 00 3b 05 a4 12 ad 00 7c 0d f0 0f ab 1d e6 36 c9 00 <41> 89 5c 24 08 5b 41 5c 5d c3 66 90 66 2e 0f 1f 84 00 00 00 00
[1960173.287673] RIP [<ffffffff810124da>] rapl_cpu_init+0x6a/0x80
[1960173.288392] RSP <ffff880097eafd18>
[1960173.289131] CR2: 0000000000000008
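To compare another system's log against this signature, the key fields of the oops can be extracted with a short script. The following is only a minimal sketch: the function name `summarize_oops` and the regular expressions are illustrative assumptions that match the line formats shown in the log above, not a Red Hat tool. The error-code decoding follows the standard x86 page-fault bits (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Patterns for the three oops lines of interest, in the format shown in the log above.
BUG_RE = re.compile(r"BUG: unable to handle kernel NULL pointer dereference at ([0-9a-f]+)")
IP_RE = re.compile(r"IP: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")
OOPS_RE = re.compile(r"Oops: ([0-9a-f]{4}) \[#\d+\]")


def summarize_oops(log_text):
    """Return a dict describing the first oops found in log_text, or None.

    Hypothetical helper for illustration; it only understands the
    three line formats matched by the patterns above.
    """
    bug = BUG_RE.search(log_text)
    ip = IP_RE.search(log_text)
    oops = OOPS_RE.search(log_text)
    if not (bug and ip and oops):
        return None
    code = int(oops.group(1), 16)
    return {
        "fault_address": int(bug.group(1), 16),
        "function": ip.group(2),
        "offset": int(ip.group(3), 16),
        # x86 page-fault error code bits
        "page_present": bool(code & 0x1),
        "write_access": bool(code & 0x2),
        "user_mode": bool(code & 0x4),
    }


if __name__ == "__main__":
    import sys

    # Usage (assumed): dmesg | python summarize_oops.py
    print(summarize_oops(sys.stdin.read()))
```

Applied to the log above, this reports a kernel-mode write to a not-present page at address 0x8 inside `rapl_cpu_init`, which matches the `CR2` value and the `Oops: 0002` code.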
Environment
- Red Hat Enterprise Linux 7 (kernel-3.10.0-514.6.1.el7)
- RHEL guest running on VMware ESXi (theoretically, the issue may also occur on a bare-metal server)
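The call trace shows the crash being reached through a write to a CPU's sysfs `online` attribute (`store_online` → `device_online` → `cpu_up`). Before onlining a hot-added CPU, it may help to check whether the system runs the kernel build listed above and which CPUs are currently offline. The following is a rough, read-only sketch: the helper names `is_listed_kernel` and `offline_cpus` are hypothetical, and the check covers only the one kernel build named in this article, so it says nothing about whether other builds are affected.

```python
import glob
import os
import platform
import re

# Kernel build named in the Environment section; other builds are not covered by this check.
LISTED_KERNEL = "3.10.0-514.6.1.el7"


def is_listed_kernel(release=None):
    """True if the kernel release string starts with the build listed above."""
    release = platform.release() if release is None else release
    return release.startswith(LISTED_KERNEL)


def offline_cpus(sysfs_root="/sys/devices/system/cpu"):
    """Return sorted CPU numbers whose sysfs 'online' attribute reads 0.

    CPUs without an 'online' file (often cpu0) are skipped.
    This function only reads sysfs; it never writes to 'online'.
    """
    result = []
    for path in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*", "online")):
        match = re.search(r"cpu(\d+)$", os.path.dirname(path))
        if not match:
            continue
        with open(path) as handle:
            if handle.read().strip() == "0":
                result.append(int(match.group(1)))
    return sorted(result)


if __name__ == "__main__":
    print("listed kernel:", is_listed_kernel())
    print("offline CPUs:", offline_cpus())
```

The sketch deliberately avoids writing `1` to any `online` file, since that write is the operation that precedes the crash in the trace above.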