kernel BUG at kernel/smpboot.c:134! on systems with more than 1 NUMA node, after moving tasks to a given cpuset cgroup with kernel-rt

Solution Verified - Updated -

Issue

The system crashes after an attempt to move all tasks from the root cpuset cgroup to another cpuset. For example, given a cpuset named "0", move all tasks into cpuset 0:

cat /cgroup/cpuset/tasks | xargs -n 1 /bin/echo >> /cgroup/cpuset/0/tasks

After issuing the command above, I get this kernel crash:

------------[ cut here ]------------
kernel BUG at kernel/smpboot.c:134!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: autofs4 ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6table_filter ip6_tables ipv6 ppdev parport_pc parport joydev sg microcode snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core pcspkr e1000 ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom sym53c8xx scsi_transport_spi pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf]
CPU: 9 PID: 70 Comm: ksoftirqd/8 Not tainted 3.10.33-rt32.34.el6rt.x86_64 #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff880077af4800 ti: ffff880077afa000 task.ti: ffff880077afa000
RIP: 0010:[<ffffffff8107ff06>]  [<ffffffff8107ff06>] smpboot_thread_fn+0x2a6/0x310
RSP: 0018:ffff880077afbe48  EFLAGS: 00010297
RAX: 0000000000000009 RBX: ffff880077a9e030 RCX: 0000000000000000
RDX: ffff880077af4800 RSI: ffff880077af4800 RDI: 0000000000000008
RBP: ffff880077afbeb8 R08: ffff880077afa000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880077afa000
R13: ffffffff81a35d20 R14: ffff880077afa010 R15: ffff880077afa010
FS:  0000000000000000(0000) GS:ffff880079e40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f5e03c29000 CR3: 0000000001a0f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff880077af4800 ffff880077afa000 ffff880077afa010 ffff880077afa000
 ffff880077afa000 ffff880077af4800 ffff880077af4800 0000000000000000
 ffff880077afbeb8 ffff88003a6c3d08 ffff880077afbec8 ffff880077a9e030
Call Trace:
 [<ffffffff8107fc60>] ? smpboot_park_threads+0x90/0x90
 [<ffffffff81076cbe>] kthread+0xbe/0xd0
 [<ffffffff81076c00>] ? kthreadd+0x1e0/0x1e0
 [<ffffffff8157e2ec>] ret_from_fork+0x7c/0xb0
 [<ffffffff81076c00>] ? kthreadd+0x1e0/0x1e0
Code: 65 58 00 0f a3 3a 19 d2 31 f6 85 d2 40 0f 95 c6 ff d0 48 89 df e8 5b 0e 0f 00 48 83 c4 48 31 c0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 <0f> 0b eb fe e8 61 3f 4f 00 90 e9 33 ff ff ff e8 56 3f 4f 00 66
RIP  [<ffffffff8107ff06>] smpboot_thread_fn+0x2a6/0x310
 RSP <ffff880077afbe48>
---[ end trace 0000000000000002 ]---
note: ksoftirqd/8[70] exited with preempt_count 1
------------[ cut here ]------------

The system then goes down or becomes unresponsive.

Environment

  • MRG Realtime 2.5 or Red Hat Enterprise Linux 6 with MRG Realtime Kernel (confirmed only on kernel-rt 3.10 series)
  • System with more than 1 NUMA node enabled
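To check whether a machine matches the environment listed above, something like the following sketch may help. It counts the node directories under the standard sysfs location /sys/devices/system/node and compares the running kernel release against a 3.10 realtime pattern. The function names count_numa_nodes and is_rt_310 are hypothetical helpers introduced for illustration, and the release pattern is an assumption based only on the version string in the trace above (3.10.33-rt32.34.el6rt.x86_64).

```shell
#!/bin/sh
# Hypothetical helper: count NUMA node directories under a sysfs-style path.
# Defaults to the standard Linux location; a different path can be passed in.
count_numa_nodes() {
    node_dir="${1:-/sys/devices/system/node}"
    count=0
    for d in "$node_dir"/node[0-9]*; do
        # An unmatched glob stays literal, so confirm the entry is a directory.
        if [ -d "$d" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Hypothetical helper: succeed if a kernel release string looks like a
# 3.10 realtime build. The pattern is an assumption derived from the
# release shown in the trace, not an official naming rule.
is_rt_310() {
    release="${1:-$(uname -r)}"
    case "$release" in
        3.10.*rt*) return 0 ;;
        *)         return 1 ;;
    esac
}

nodes=$(count_numa_nodes)
if [ "$nodes" -gt 1 ] && is_rt_310; then
    echo "Matches the affected environment: $nodes NUMA nodes, kernel $(uname -r)"
else
    echo "Does not match: $nodes NUMA node(s), kernel $(uname -r)"
fi
```

A match only means the system fits the description in this section. It does not confirm that the crash will occur.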
