RHEL7: System crashes after adding a CPU online (hotplug / hot add) in Red Hat Enterprise Linux 7

Solution Verified

Issue

  • The system crashes after a CPU is added online (hotplug).

  • The following log messages and backtrace are observed in the kernel ring buffer (dmesg) at the time of the issue:

[1960173.260232] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[1960173.262269] IP: [<ffffffff810124da>] rapl_cpu_init+0x6a/0x80
[1960173.262619] PGD 0 
[1960173.262974] Oops: 0002 [#1] SMP 
[1960173.263324] Modules linked in: cts ip6table_filter ip6_tables iptable_filter nfsv3 8021q garp mrp stp llc rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache vmw_vsock_vmci_transport vsock ext4 mbcache jbd2 intel_powerclamp coretemp iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw ppdev gf128mul glue_helper ablk_helper cryptd vmw_balloon pcspkr sg shpchp i2c_piix4 vmw_vmci parport_pc parport nfsd nfs_acl lockd grace binfmt_misc auth_rpcgss sunrpc ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel serio_raw vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm vmxnet3 ata_piix drm vmw_pvscsi i2c_core libata floppy fjes dm_mirror dm_region_hash dm_log dm_mod
[1960173.265723] CPU: 0 PID: 27552 Comm: systemd-udevd Not tainted 3.10.0-514.6.1.el7.x86_64 #1
[1960173.266251] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/17/2015
[1960173.266784] task: ffff8801812eaf10 ti: ffff880097eac000 task.ti: ffff880097eac000
[1960173.267362] RIP: 0010:[<ffffffff810124da>]  [<ffffffff810124da>] rapl_cpu_init+0x6a/0x80
[1960173.267899] RSP: 0018:ffff880097eafd18  EFLAGS: 00010246
[1960173.268426] RAX: 0000000000000020 RBX: 0000000000000001 RCX: 0000000000000020
[1960173.268987] RDX: 00000000ffffffff RSI: 0000000000000020 RDI: 0000000000000020
[1960173.269552] RBP: ffff880097eafd28 R08: ffffffff81ca5bc0 R09: 0000000000000000
[1960173.270168] R10: ffff880236e19b80 R11: ffffea0001f46400 R12: 0000000000000000
[1960173.270723] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
[1960173.271293] FS:  00007fe0199fa8c0(0000) GS:ffff880236e00000(0000) knlGS:0000000000000000
[1960173.271872] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[1960173.272461] CR2: 0000000000000008 CR3: 0000000089c90000 CR4: 00000000000407f0
[1960173.273054] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1960173.273614] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[1960173.274201] Stack:
[1960173.274797]  0000000000000001 ffffffff819fb210 ffff880097eafd48 ffffffff8101257d
[1960173.275454]  00000000fffffff9 ffffffff819fb210 ffff880097eafd80 ffffffff816921bc
[1960173.276091]  0000000000000001 0000000000000000 0000000000000000 ffff880184e81f60
[1960173.276703] Call Trace:
[1960173.277255]  [<ffffffff8101257d>] rapl_cpu_notifier+0x8d/0x100
[1960173.277887]  [<ffffffff816921bc>] notifier_call_chain+0x4c/0x70
[1960173.278523]  [<ffffffff810b67ae>] __raw_notifier_call_chain+0xe/0x10
[1960173.279153]  [<ffffffff81089883>] cpu_notify+0x23/0x50
[1960173.279773]  [<ffffffff81089a6d>] _cpu_up+0x17d/0x1a0
[1960173.280412]  [<ffffffff81089b41>] cpu_up+0xb1/0x120
[1960173.281043]  [<ffffffff81678e4c>] cpu_subsys_online+0x3c/0x90
[1960173.281693]  [<ffffffff8142a445>] device_online+0x65/0x90
[1960173.282348]  [<ffffffff8142a505>] store_online+0x95/0xa0
[1960173.282980]  [<ffffffff814270a8>] dev_attr_store+0x18/0x30
[1960173.283644]  [<ffffffff8127b736>] sysfs_write_file+0xc6/0x140
[1960173.284288]  [<ffffffff811fe27d>] vfs_write+0xbd/0x1e0
[1960173.284970]  [<ffffffff811fed9f>] SyS_write+0x7f/0xe0
[1960173.285623]  [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b
[1960173.286255] Code: c0 08 a0 00 00 48 8b 14 d5 80 57 ad 81 bf ff ff ff ff 48 8b 14 10 e8 16 56 30 00 3b 05 a4 12 ad 00 7c 0d f0 0f ab 1d e6 36 c9 00 <41> 89 5c 24 08 5b 41 5c 5d c3 66 90 66 2e 0f 1f 84 00 00 00 00 
[1960173.287673] RIP  [<ffffffff810124da>] rapl_cpu_init+0x6a/0x80
[1960173.288392]  RSP <ffff880097eafd18>
[1960173.289131] CR2: 0000000000000008
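
When comparing a crash against the log above, it can help to reduce the oops to the fault type, the faulting address, the function named on the IP line, and the call trace. The following is a minimal sketch, assuming the oops text uses the same line shapes as the log shown here; the function name summarize_oops and the regular expressions are illustrative assumptions, not part of any Red Hat tool.

```python
import re

# Line shapes are taken from the log above; other kernels may format an oops differently.
_PREFIX = r"^\[\s*\d+\.\d+\]\s+"
_FAULT = re.compile(_PREFIX + r"BUG: unable to handle kernel (.+?) at ([0-9a-f]+)\s*$")
_IP = re.compile(_PREFIX + r"IP: \[<[0-9a-f]+>\] (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+")
_TRACE_START = re.compile(_PREFIX + r"Call Trace:\s*$")
_FRAME = re.compile(_PREFIX + r"\[<[0-9a-f]+>\] (?:\? )?(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$")


def summarize_oops(text):
    """Return the fault type, faulting address, IP function and call trace of an oops."""
    summary = {"fault": None, "address": None, "function": None, "trace": []}
    in_trace = False
    for line in text.splitlines():
        if in_trace:
            frame = _FRAME.match(line)
            if frame:
                summary["trace"].append(frame.group(1))
                continue
            in_trace = False  # first non-frame line ends the call trace
        if _TRACE_START.match(line):
            in_trace = True
            continue
        fault = _FAULT.match(line)
        if fault:
            summary["fault"] = fault.group(1)
            summary["address"] = int(fault.group(2), 16)
            continue
        ip = _IP.match(line)
        if ip:
            summary["function"] = ip.group(1)
    return summary
```

Fed the output of dmesg saved to a file, the sketch returns a dictionary; for the log above, the function field would be rapl_cpu_init and the first trace entry rapl_cpu_notifier, which is the path taken when the CPU is set online through sysfs.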

Environment

  • Red Hat Enterprise Linux 7 (kernel-3.10.0-514.6.1.el7)
  • RHEL guest running on VMware ESXi (theoretically, the issue may also occur on bare-metal servers)
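
To check whether a host runs the kernel release listed above, the running release string can be compared against it. This is a minimal sketch: the helper name matches_listed_kernel is hypothetical, and the check covers only the single release named in this article. It makes no claim about whether other kernel releases are affected.

```python
import platform

# The only kernel release named in the Environment section of this article.
LISTED_KERNEL = "3.10.0-514.6.1.el7"


def matches_listed_kernel(release):
    """Return True if a `uname -r` style string refers to the listed kernel release."""
    # `uname -r` appends the architecture, e.g. "3.10.0-514.6.1.el7.x86_64".
    return release == LISTED_KERNEL or release.startswith(LISTED_KERNEL + ".")


if __name__ == "__main__":
    running = platform.release()
    state = "matches" if matches_listed_kernel(running) else "does not match"
    print(f"Running kernel {running} {state} {LISTED_KERNEL}")
```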
