kernel BUG at kernel/smpboot.c:134! with kernel-rt on systems with more than one NUMA node, after moving tasks to a cpuset cgroup
Issue
The system crashes when all tasks are moved from the root cpuset cgroup to another cpuset. For example, given a cpuset named "0", the following command moves every task into it:
cat /cgroup/cpuset/tasks | xargs -n 1 /bin/echo >> /cgroup/cpuset/0/tasks
After issuing the command above, the kernel crashes with the following output:
------------[ cut here ]------------
kernel BUG at kernel/smpboot.c:134!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: autofs4 ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6table_filter ip6_tables ipv6 ppdev parport_pc parport joydev sg microcode snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core pcspkr e1000 ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom sym53c8xx scsi_transport_spi pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf]
CPU: 9 PID: 70 Comm: ksoftirqd/8 Not tainted 3.10.33-rt32.34.el6rt.x86_64 #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff880077af4800 ti: ffff880077afa000 task.ti: ffff880077afa000
RIP: 0010:[<ffffffff8107ff06>] [<ffffffff8107ff06>] smpboot_thread_fn+0x2a6/0x310
RSP: 0018:ffff880077afbe48 EFLAGS: 00010297
RAX: 0000000000000009 RBX: ffff880077a9e030 RCX: 0000000000000000
RDX: ffff880077af4800 RSI: ffff880077af4800 RDI: 0000000000000008
RBP: ffff880077afbeb8 R08: ffff880077afa000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880077afa000
R13: ffffffff81a35d20 R14: ffff880077afa010 R15: ffff880077afa010
FS: 0000000000000000(0000) GS:ffff880079e40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f5e03c29000 CR3: 0000000001a0f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff880077af4800 ffff880077afa000 ffff880077afa010 ffff880077afa000
ffff880077afa000 ffff880077af4800 ffff880077af4800 0000000000000000
ffff880077afbeb8 ffff88003a6c3d08 ffff880077afbec8 ffff880077a9e030
Call Trace:
[<ffffffff8107fc60>] ? smpboot_park_threads+0x90/0x90
[<ffffffff81076cbe>] kthread+0xbe/0xd0
[<ffffffff81076c00>] ? kthreadd+0x1e0/0x1e0
[<ffffffff8157e2ec>] ret_from_fork+0x7c/0xb0
[<ffffffff81076c00>] ? kthreadd+0x1e0/0x1e0
Code: 65 58 00 0f a3 3a 19 d2 31 f6 85 d2 40 0f 95 c6 ff d0 48 89 df e8 5b 0e 0f 00 48 83 c4 48 31 c0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 <0f> 0b eb fe e8 61 3f 4f 00 90 e9 33 ff ff ff e8 56 3f 4f 00 66
RIP [<ffffffff8107ff06>] smpboot_thread_fn+0x2a6/0x310
RSP <ffff880077afbe48>
---[ end trace 0000000000000002 ]---
note: ksoftirqd/8[70] exited with preempt_count 1
------------[ cut here ]------------
The system then goes down or becomes unresponsive. Note that the trace reports the per-CPU thread ksoftirqd/8 running on CPU 9.
Environment
- MRG Realtime 2.5 or Red Hat Enterprise Linux 6 with MRG Realtime Kernel (confirmed only on kernel-rt 3.10 series)
- System with more than 1 NUMA node enabled
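To check whether a host matches the environment above, the running kernel version and the NUMA node count can be read from the system. The following is a minimal sketch, not an official diagnostic: it assumes NUMA nodes are exposed as nodeN directories under /sys/devices/system/node, and the function names and the optional path argument are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Count NUMA nodes by counting nodeN directories under sysfs.
# The optional argument is a hypothetical override of the sysfs path,
# useful for trying the function against a test directory.
count_numa_nodes() {
    node_root="${1:-/sys/devices/system/node}"
    count=0
    for d in "$node_root"/node[0-9]*; do
        # An unmatched glob stays literal and fails the -d check.
        if [ -d "$d" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Print the kernel release and whether more than one NUMA node is present.
report_environment() {
    nodes=$(count_numa_nodes "$@")
    echo "kernel: $(uname -r)"
    echo "NUMA nodes: $nodes"
    if [ "$nodes" -gt 1 ]; then
        echo "more than 1 NUMA node present"
    else
        echo "single NUMA node (or NUMA information unavailable)"
    fi
}

report_environment
```

A kernel release in the 3.10 "rt" series combined with a node count greater than 1 corresponds to the configuration on which the crash was confirmed.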