RHEL6: race condition in hugetlb code in kernels newer than kernel-2.6.32-504.23.4
Issue
The race condition can manifest itself in any of the following ways:
- System crash in region_* functions when HugePages are enabled
crash> log | grep -e ^BUG -e ^IP:
BUG: unable to handle kernel NULL pointer dereference at 000000000000000c
IP: [<ffffffff81166dd4>] region_chg+0xe4/0x100
crash> bt
PID: 20463 TASK: ffff881566e4f520 CPU: 5 COMMAND: "java"
#0 [ffff881566e7b7d0] machine_kexec at ffffffff8103b60b
#1 [ffff881566e7b830] crash_kexec at ffffffff810c99e2
#2 [ffff881566e7b900] oops_end at ffffffff8152e1c0
#3 [ffff881566e7b930] no_context at ffffffff8104c80b
#4 [ffff881566e7b980] __bad_area_nosemaphore at ffffffff8104ca95
#5 [ffff881566e7b9d0] bad_area at ffffffff8104cbbe
#6 [ffff881566e7ba00] __do_page_fault at ffffffff8104d3c3
#7 [ffff881566e7bb20] do_page_fault at ffffffff8153010e
#8 [ffff881566e7bb50] page_fault at ffffffff8152d4b5
[exception RIP: region_chg+228]
RIP: ffffffff8116a0a4 RSP: ffff8803c31b7ca8 RFLAGS: 00010282
RAX: fffffffffffffffe RBX: fffffffffffffffe RCX: 00000000000001c0
RDX: dead000000100100 RSI: 00000000000001bd RDI: ffff8803f1cffc48
RBP: ffff8803c31b7cc8 R8: ffff8803f1cffc41 R9: 00000006979ffff0
R10: ffff881073b05480 R11: 0000000000000000 R12: 00000000000001c0
R13: 00000000000001bd R14: ffffffff81fd19e0 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#6 [ffff8803c31b7ca0] anon_vma_prepare at ffffffff8115c1a0
#7 [ffff8803c31b7ce0] hugetlb_fault at ffffffff8116bc23
#8 [ffff8803c31b7d90] handle_mm_fault at ffffffff81153285
#9 [ffff8803c31b7e00] __do_page_fault at ffffffff8104f156
#10 [ffff8803c31b7f20] do_page_fault at ffffffff8153eb7e
#11 [ffff8803c31b7f50] page_fault at ffffffff8153bf25
- List corruption in region_* functions
[ 3043.345741] ------------[ cut here ]------------
[ 3043.345762] WARNING: at lib/list_debug.c:51 list_del+0x8d/0xa0() (Not tainted)
[ 3043.345766] Hardware name: PRIMERGY RX600 S5
[ 3043.345769] list_del corruption. next->prev should be ffff88153b50e460, but was ffff88153b50e7c0
[ 3043.345772] Modules linked in: mptctl mptbase autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc smbus(U) cpufreq_ondemand acpi_cpufreq freq_table mperf bonding ipv6 iTCO_wdt iTCO_vendor_support microcode ipmi_devintf power_meter acpi_ipmi ipmi_si ipmi_msghandler i2c_i801 lpc_ich mfd_core ioatdma i7core_edac edac_core e1000e ixgbe mdio sg igb dca i2c_algo_bit i2c_core ptp pps_core ext3 jbd mbcache sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
[ 3043.345843] Pid: 32950, comm: java Not tainted 2.6.32-504.30.3.el6.x86_64 #1
[ 3043.345847] Call Trace:
[ 3043.345860] [<ffffffff81074e47>] ? warn_slowpath_common+0x87/0xc0
[ 3043.345865] [<ffffffff81074f36>] ? warn_slowpath_fmt+0x46/0x50
[ 3043.345870] [<ffffffff8129edbd>] ? list_del+0x8d/0xa0
[ 3043.345878] [<ffffffff8116688a>] ? region_add+0x9a/0xe0
[ 3043.345882] [<ffffffff81167c9d>] ? alloc_huge_page+0x29d/0x3c0
[ 3043.345888] [<ffffffff811691bb>] ? hugetlb_fault+0x43b/0x7b0
[ 3043.345896] [<ffffffff8105872d>] ? check_preempt_curr+0x6d/0x90
[ 3043.345906] [<ffffffff81150115>] ? handle_mm_fault+0x395/0x3d0
[ 3043.345914] [<ffffffff81063c63>] ? perf_event_task_sched_out+0x33/0x70
[ 3043.345920] [<ffffffff8104d096>] ? __do_page_fault+0x146/0x500
[ 3043.345930] [<ffffffff81529afe>] ? thread_return+0x4e/0x7d0
[ 3043.345937] [<ffffffff8153010e>] ? do_page_fault+0x3e/0xa0
[ 3043.345941] [<ffffffff8152d4b5>] ? page_fault+0x25/0x30
[ 3043.345945] ---[ end trace 3171fe47b71fad99 ]---
- System crash at shm_close+0xd6/0xe0
------------[ cut here ]------------
kernel BUG at ipc/shm.c:232!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:08.0/0000:0a:00.1/host2/rport-2:0-4/target2:0:4/2:0:4:118/state
CPU 1
Modules linked in: mptctl mptbase ipmi_devintf cpufreq_ondemand acpi_cpufreq mperf 8021q garp stp llc bonding ipv6 ipt_REJECT iptable_filter ip_tables dm_round_robin dm_multipath cpufreq_stats freq_table joydev sg iTCO_wdt iTCO_vendor_support serio_raw hpilo hpwdt ses enclosure lpc_ich mfd_core i7core_edac edac_core power_meter acpi_ipmi ipmi_si ipmi_msghandler qlcnic shpchp ext4 jbd2 mbcache sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 9797, comm: httpd Not tainted 2.6.32-573.8.1.el6.x86_64 #1 HP ProLiant DL380 G6
RIP: 0010:[<ffffffff81224336>] [<ffffffff81224336>] shm_close+0xd6/0xe0
RSP: 0018:ffff880176a43e98 EFLAGS: 00010202
RAX: ffffffffffffffea RBX: ffffffff81aec340 RCX: 0000000000000006
RDX: ffffffffffffffea RSI: 0000000000000040 RDI: 0000000000000000
RBP: ffff880176a43eb8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: ffffffff81aec3e0
R13: ffffffffffffffea R14: ffff8800e6bb0188 R15: 00007fffe9031000
FS: 00007ffff7f8a7e0(0000) GS:ffff880c42600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffe9775612 CR3: 0000000176a39000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process httpd (pid: 9797, threadinfo ffff880176a40000, task ffff880bfff5a040)
Stack:
0000000000000000 0000000000000000 ffff8800e6bb0188 ffff880d7345de90
<d> ffff880176a43ed8 ffffffff811562e3 0000000000000000 ffff880bff459a00
<d> ffff880176a43f38 ffffffff811588d7 00007fffffffe0a0 ffff8800e6bb0188
Call Trace:
[<ffffffff811562e3>] remove_vma+0x33/0x90
[<ffffffff811588d7>] do_munmap+0x317/0x3b0
[<ffffffff8122327e>] sys_shmdt+0xce/0x170
[<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
Code: 0f 1f 44 00 00 41 f6 45 21 02 75 e6 4c 89 e8 66 ff 00 66 66 90 4c 89 e7 e8 a8 27 e8 ff 48 8b 5d e8 4c 8b 65 f0 4c 8b 6d f8 c9 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 48 89 5d
RIP [<ffffffff81224336>] shm_close+0xd6/0xe0
RSP <ffff880176a43e98>
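The three signatures above can be searched for in a saved kernel log. The following is a minimal sketch in Python, not a Red Hat tool: the function name `find_symptoms` and the symptom labels are hypothetical, and the patterns are taken only from the excerpts shown in this article. A match suggests that the log resembles one of these symptoms; it does not confirm the root cause.

```python
import re

# A stack frame such as "region_chg+0xe4/0x100" or "region_add+0x9a/0xe0".
REGION_FRAME = re.compile(r"\bregion_\w+\+0x[0-9a-f]+/0x[0-9a-f]+")


def find_symptoms(log_text):
    """Return labels (hypothetical names) for the symptoms seen in log_text."""
    found = []
    region_seen = REGION_FRAME.search(log_text) is not None
    # Symptom 1: NULL pointer dereference with a region_* function in the log.
    if region_seen and "NULL pointer dereference" in log_text:
        found.append("region_oops")
    # Symptom 2: list_del corruption warning with a region_* function in the log.
    if region_seen and "list_del corruption" in log_text:
        found.append("region_list_corruption")
    # Symptom 3: BUG at ipc/shm.c:232 raised from shm_close.
    if "kernel BUG at ipc/shm.c:232!" in log_text and "shm_close+" in log_text:
        found.append("shm_close_bug")
    return found
```

The text passed in can come from `log` output in crash or from a saved dmesg or console capture. The check is deliberately coarse: it looks at the whole text rather than at individual traces, so a log containing several unrelated events may produce a false match.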
Environment
- Red Hat Enterprise Linux 6.7
- Red Hat Enterprise Linux 6.6
- HugePages activated and in use.
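Whether a host matches this environment can be checked roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, "in use" is interpreted as at least one huge page being allocated and not free according to /proc/meminfo, and the kernel comparison is a simple numeric ordering rather than a full RPM version comparison. The baseline comes from the title of this article; no fixed kernel version is stated here, so the check has no upper bound.

```python
import re

BASELINE = "2.6.32-504.23.4"  # from the article title


def release_key(release):
    """Turn '2.6.32-504.30.3.el6.x86_64' into a comparable tuple of ints."""
    version, _, build = release.partition("-")
    nums = [int(n) for n in version.split(".")]
    for part in build.split("."):
        if not part.isdigit():
            break  # stop at the first non-numeric field, e.g. 'el6'
        nums.append(int(part))
    return tuple(nums)


def hugepages_in_use(meminfo_text):
    """True if /proc/meminfo text shows huge pages configured and not all free."""
    fields = dict(re.findall(r"^(HugePages_\w+):\s+(\d+)", meminfo_text, re.M))
    total = int(fields.get("HugePages_Total", 0))
    free = int(fields.get("HugePages_Free", 0))
    return total > 0 and free < total


def may_be_affected(release, meminfo_text):
    """Rough check: RHEL 6 kernel newer than BASELINE with huge pages in use."""
    return (
        ".el6" in release
        and release_key(release) > release_key(BASELINE)
        and hugepages_in_use(meminfo_text)
    )
```

On a live system, `release` would typically be the output of `uname -r` (or `platform.release()` in Python) and `meminfo_text` the contents of /proc/meminfo.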