Server kernel panics at split_huge_page+0x7a4/0x7e0 because of a soft-offlined page
Issue
- On RHEL 6.4 with kernel version 2.6.32-358.6.2.el6.x86_64, the system panics at split_huge_page+0x7a4/0x7e0. The kernel ring buffer retrieved from the vmcore contains the following:
soft_offline:0x95dff7: unknown non LRU page type c0000000008000
------------[ cut here ]------------
kernel BUG at mm/huge_memory.c:1197!
invalid opcode:0000 [#1] SMP
last sysfs file:/sys/devices/system/cpu/cpu23/topology/thread_siblings
CPU 1
Modules linked in: vxodm(P)(U) gab(P)(U) llt(P)(U) autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc dmpjbod(P)(U) dmpap(P)(U) dmpaa(P)(U) vxspec(P)(U) vxio(P)(U) vxdmp(P)(U) cpufreq_ondemand freq_table pcc_cpufreq bonding ipv6 8021q garp stp llc vxportal(P)(U) fdd(P)(U) vxfs(P)(U) exportfs ext3 jbd power_meter sg be2net microcode serio_raw iTCO_wdt iTCO_vendor_support hpilo hpwdt bnx2 i7core_edac edac_core shpchp ext4 mbcache jbd2 qla2xxx scsi_transport_fc scsi_tgt sd_mod crc_t10dif hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid:6095, comm: perfd Tainted:P --------------- 2.6.32-358.6.2.el6.x86_64 #1 HP ProLiant DL380 G7
RIP:0010:[<ffffffff81179404>] [<ffffffff81179404>] split_huge_page+0x7a4/0x7e0
RSP:0018:ffff880bfd139ca8 EFLAGS:00010086
RAX:00000000ffffffff RBX: ffffea0020c8fe08 RCX: ffff880c1e9a92b0
RDX: ffffea0020c89028 RSI: ffffea0020c8fea0 RDI: ffffea0020c8fe68
RBP: ffff880bfd139d78 R08: ffffea0020c89028 R09:0000000000000000
R10:0000000000000018 R11:0000000000000206 R12:00000000000001f7
R13:0000000000000000 R14: ffffea0020c89000 R15: ffff880628010dc0
FS:00007f47e0e95720(0000) GS:ffff88063d400000(0000) knlGS:0000000000000000
CS:0010 DS:0000 ES:0000 CR0:0000000080050033
CR2:0000000002c78000 CR3:0000000bfd3f2000 CR4:00000000000007e0
DR0:0000000000000000 DR1:0000000000000000 DR2:0000000000000000
DR3:0000000000000000 DR6:00000000ffff0ff0 DR7:0000000000000400
Process perfd (pid:6095, threadinfo ffff880bfd138000, task ffff880bfd141500)
Stack:
ffff880bfd139ce8 000000000e976291 ffff880c1e99a880 ffff880c1bbcd340
<d> ffffffff817ca730 ffff88061ce9c420 ffff880bfd139db8 ffff880a4ef47ffe
<d> ffff88061ce9c438 ffffffff81281200 0000002100000009 0000000100000008
Call Trace:
[<ffffffff81281200>] ? vsnprintf+0x450/0x5e0
[<ffffffff811794c1>] __split_huge_page_pmd+0x81/0xc0
[<ffffffff8117959c>] split_huge_page_address+0x9c/0xa0
[<ffffffff81179643>] __vma_adjust_trans_huge+0xa3/0xf0
[<ffffffff81148416>] vma_adjust+0x556/0x5e0
[<ffffffff8101a40e>] ? c_start+0x6e/0x80
[<ffffffff811486ab>] __split_vma+0x20b/0x280
[<ffffffff811498a7>] do_munmap+0x187/0x3a0
[<ffffffff811817a5>] ? vfs_read+0xb5/0x1a0
[<ffffffff81149fb1>] sys_brk+0x121/0x130
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Code:1f 84 00 00 00 00 00 eb f6 48 8b 43 10 e9 9a fc ff ff 0f 0b 0f 1f 00 eb fb 49 8b 06 a9 00 00 00 02 0f 84 36 fa ff ff f3 90 eb ee <0f> 0b eb fe 0f 0b 66 0f 1f 44 00 00 eb f8 0f 0b eb fe 0f 0b 0f
RIP [<ffffffff81179404>] split_huge_page+0x7a4/0x7e0
RSP <ffff880bfd139ca8>
Environment
- Red Hat Enterprise Linux 6.4
- Transparent Huge Pages (THP) enabled
- Faulty hardware
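
To check whether a system matches the Transparent Huge Pages condition above, the active THP mode can be read from sysfs. The following is a minimal sketch, not part of the original article. It assumes the RHEL 6 location /sys/kernel/mm/redhat_transparent_hugepage/enabled, with the upstream location /sys/kernel/mm/transparent_hugepage/enabled as a fallback, and it assumes the file marks the active mode with square brackets (for example "[always] madvise never"). The function names parse_thp_mode and current_thp_mode are hypothetical helpers chosen for illustration.

```python
import re
from pathlib import Path

# Assumed sysfs locations: RHEL 6 kernels use the redhat_ prefixed directory,
# upstream kernels use transparent_hugepage.
THP_PATHS = (
    "/sys/kernel/mm/redhat_transparent_hugepage/enabled",
    "/sys/kernel/mm/transparent_hugepage/enabled",
)


def parse_thp_mode(text):
    """Return the word enclosed in square brackets, or None if there is none."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def current_thp_mode(paths=THP_PATHS):
    """Return the active mode from the first existing sysfs file, else None."""
    for candidate in paths:
        path = Path(candidate)
        if path.is_file():
            return parse_thp_mode(path.read_text())
    return None
```

Calling current_thp_mode() on an affected host would be expected to return "always" when THP is enabled system-wide; a result of None means neither assumed path exists on that kernel.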