Race condition in the hugetlb code on RHEL 6 kernels newer than kernel-2.6.32-504.23.4
Issue
The following race conditions occur:
- The system crashes in the region_* functions when HugePages is enabled.
crash> log | grep -e ^BUG -e ^IP:
BUG: unable to handle kernel NULL pointer dereference at 000000000000000c
IP:[<ffffffff81166dd4>] region_chg+0xe4/0x100
crash> bt
PID:20463 TASK: ffff881566e4f520 CPU:5 COMMAND:"java"
#0 [ffff881566e7b7d0] machine_kexec at ffffffff8103b60b
#1 [ffff881566e7b830] crash_kexec at ffffffff810c99e2
#2 [ffff881566e7b900] oops_end at ffffffff8152e1c0
#3 [ffff881566e7b930] no_context at ffffffff8104c80b
#4 [ffff881566e7b980] __bad_area_nosemaphore at ffffffff8104ca95
#5 [ffff881566e7b9d0] bad_area at ffffffff8104cbbe
#6 [ffff881566e7ba00] __do_page_fault at ffffffff8104d3c3
#7 [ffff881566e7bb20] do_page_fault at ffffffff8153010e
#8 [ffff881566e7bb50] page_fault at ffffffff8152d4b5
[exception RIP: region_chg+228]
RIP: ffffffff8116a0a4 RSP: ffff8803c31b7ca8 RFLAGS:00010282
RAX: fffffffffffffffe RBX: fffffffffffffffe RCX:00000000000001c0
RDX: dead000000100100 RSI:00000000000001bd RDI: ffff8803f1cffc48
RBP: ffff8803c31b7cc8 R8: ffff8803f1cffc41 R9:00000006979ffff0
R10: ffff881073b05480 R11:0000000000000000 R12: 00000000000001c0
R13:00000000000001bd R14: ffffffff81fd19e0 R15:0000000000000000
ORIG_RAX: ffffffffffffffff CS:0010 SS:0000
#6 [ffff8803c31b7ca0] anon_vma_prepare at ffffffff8115c1a0
#7 [ffff8803c31b7ce0] hugetlb_fault at ffffffff8116bc23
#8 [ffff8803c31b7d90] handle_mm_fault at ffffffff81153285
#9 [ffff8803c31b7e00] __do_page_fault at ffffffff8104f156
#10 [ffff8803c31b7f20] do_page_fault at ffffffff8153eb7e
#11 [ffff8803c31b7f50] page_fault at ffffffff8153bf25
- List corruption is reported in the region_* functions.
[ 3043.345741] ------------[ cut here ]------------
[ 3043.345762] WARNING: at lib/list_debug.c:51 list_del+0x8d/0xa0() (Not tainted)
[ 3043.345766] Hardware name:PRIMERGY RX600 S5
[ 3043.345769] list_del corruption. next->prev should be ffff88153b50e460, but was ffff88153b50e7c0
[ 3043.345772] Modules linked in: mptctl mptbase autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc smbus(U) cpufreq_ondemand acpi_cpufreq freq_table mperf bonding ipv6 iTCO_wdt iTCO_vendor_support microcode ipmi_devintf power_meter acpi_ipmi ipmi_si ipmi_msghandler i2c_i801 lpc_ich mfd_core ioatdma i7core_edac edac_core e1000e ixgbe mdio sg igb dca i2c_algo_bit i2c_core ptp pps_core ext3 jbd mbcache sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
[ 3043.345843] Pid:32950, comm: java Not tainted 2.6.32-504.30.3.el6.x86_64 #1
[ 3043.345847] Call Trace:
[ 3043.345860] [<ffffffff81074e47>] ? warn_slowpath_common+0x87/0xc0
[ 3043.345865] [<ffffffff81074f36>] ? warn_slowpath_fmt+0x46/0x50
[ 3043.345870] [<ffffffff8129edbd>] ? list_del+0x8d/0xa0
[ 3043.345878] [<ffffffff8116688a>] ? region_add+0x9a/0xe0
[ 3043.345882] [<ffffffff81167c9d>] ? alloc_huge_page+0x29d/0x3c0
[ 3043.345888] [<ffffffff811691bb>] ? hugetlb_fault+0x43b/0x7b0
[ 3043.345896] [<ffffffff8105872d>] ? check_preempt_curr+0x6d/0x90
[ 3043.345906] [<ffffffff81150115>] ? handle_mm_fault+0x395/0x3d0
[ 3043.345914] [<ffffffff81063c63>] ? perf_event_task_sched_out+0x33/0x70
[ 3043.345920] [<ffffffff8104d096>] ? __do_page_fault+0x146/0x500
[ 3043.345930] [<ffffffff81529afe>] ? thread_return+0x4e/0x7d0
[ 3043.345937] [<ffffffff8153010e>] ? do_page_fault+0x3e/0xa0
[ 3043.345941] [<ffffffff8152d4b5>] ? page_fault+0x25/0x30
[ 3043.345945] ---[ end trace 3171fe47b71fad99 ]---
- The system crashes at shm_close+0xd6/0xe0.
------------[ cut here ]------------
kernel BUG at ipc/shm.c:232!
invalid opcode:0000 [#1] SMP
last sysfs file:/sys/devices/pci0000:00/0000:00:08.0/0000:0a:00.1/host2/rport-2:0-4/target2:0:4/2:0:4:118/state
CPU 1
Modules linked in: mptctl mptbase ipmi_devintf cpufreq_ondemand acpi_cpufreq mperf 8021q garp stp llc bonding ipv6 ipt_REJECT iptable_filter ip_tables dm_round_robin dm_multipath cpufreq_stats freq_table joydev sg iTCO_wdt iTCO_vendor_support serio_raw hpilo hpwdt ses enclosure lpc_ich mfd_core i7core_edac edac_core power_meter acpi_ipmi ipmi_si ipmi_msghandler qlcnic shpchp ext4 jbd2 mbcache sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid:9797, comm: httpd Not tainted 2.6.32-573.8.1.el6.x86_64 #1 HP ProLiant DL380 G6
RIP:0010:[<ffffffff81224336>] [<ffffffff81224336>] shm_close+0xd6/0xe0
RSP:0018:ffff880176a43e98 EFLAGS:00010202
RAX: ffffffffffffffea RBX: ffffffff81aec340 RCX:0000000000000006
RDX: ffffffffffffffea RSI:0000000000000040 RDI:0000000000000000
RBP: ffff880176a43eb8 R08:0000000000000000 R09:0000000000000000
R10:0000000000000000 R11:0000000000000202 R12: ffffffff81aec3e0
R13: ffffffffffffffea R14: ffff8800e6bb0188 R15:00007fffe9031000
FS:00007ffff7f8a7e0(0000) GS:ffff880c42600000(0000) knlGS:0000000000000000
CS:0010 DS:0000 ES:0000 CR0:0000000080050033
CR2:00007fffe9775612 CR3:0000000176a39000 CR4:00000000000007e0
DR0:0000000000000000 DR1:0000000000000000 DR2:0000000000000000
DR3:0000000000000000 DR6:00000000ffff0ff0 DR7:0000000000000400
Process httpd (pid:9797, threadinfo ffff880176a40000, task ffff880bfff5a040)
Stack:
0000000000000000 0000000000000000 ffff8800e6bb0188 ffff880d7345de90
<d> ffff880176a43ed8 ffffffff811562e3 0000000000000000 ffff880bff459a00
<d> ffff880176a43f38 ffffffff811588d7 00007fffffffe0a0 ffff8800e6bb0188
Call Trace:
[<ffffffff811562e3>] remove_vma+0x33/0x90
[<ffffffff811588d7>] do_munmap+0x317/0x3b0
[<ffffffff8122327e>] sys_shmdt+0xce/0x170
[<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
Code:0f 1f 44 00 00 41 f6 45 21 02 75 e6 4c 89 e8 66 ff 00 66 66 90 4c 89 e7 e8 a8 27 e8 ff 48 8b 5d e8 4c 8b 65 f0 4c 8b 6d f8 c9 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 48 89 5d
RIP [<ffffffff81224336>] shm_close+0xd6/0xe0
RSP <ffff880176a43e98>
Environment
- Red Hat Enterprise Linux 6.7
- Red Hat Enterprise Linux 6.6
- HugePages is enabled and in use
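
The environment conditions above can be checked with a short script. The following is a minimal sketch, not an official diagnostic: the function names are hypothetical, the baseline is only the lower bound given in the title (the release that contains a fix is not stated here), the numeric comparison is a simplification of full RPM version ordering, and "in use" is approximated as HugePages_Free being lower than HugePages_Total in /proc/meminfo.

```python
import re

# Lower bound taken from the article title. The fixed release is not
# stated in this article, so no upper bound is checked (assumption).
BASELINE = (2, 6, 32, 504, 23, 4)


def kernel_tuple(release):
    """Turn '2.6.32-504.30.3.el6.x86_64' into (2, 6, 32, 504, 30, 3)."""
    numeric = release.split(".el")[0]  # drop the '.el6.x86_64' suffix
    return tuple(int(part) for part in re.split(r"[.-]", numeric))


def newer_than_baseline(release):
    """True when the release sorts strictly after the baseline kernel."""
    return kernel_tuple(release) > BASELINE


def hugepages_in_use(meminfo_text):
    """True when huge pages are configured and at least one is not free."""
    fields = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(HugePages_\w+):\s+(\d+)", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    total = fields.get("HugePages_Total", 0)
    free = fields.get("HugePages_Free", 0)
    return total > 0 and free < total
```

On a live system, the inputs would typically come from `platform.release()` and from reading `/proc/meminfo`; both functions take plain strings so they can also be run against values copied out of a vmcore or sosreport.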
