Kernel BUG at mm/slab.c:605 when using sisips module
Environment
- Red Hat Enterprise Linux 5
Issue
The system panicked with the following message:
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/slab.c:605
invalid opcode: 0000 [1] SMP
last sysfs file: /devices/system/cpu/cpu23/topology/thread_siblings
CPU 11
Modules linked in: lp parport_pc parport ipmi_si edd st ide_cd mpt2sas mptctl ipmi_devintf ipmi_msghandler dell_rbu autofs4 nfs nfs_acl lockd sunrpc bonding be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi sisips(PU) dm_round_robin dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac joydev tpm_tis sr_mod cdrom i7core_edac tpm edac_mc sg tpm_bios serio_raw pcspkr bnx2 dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage qla2xxx scsi_transport_fc ata_piix libata shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 8268, comm: mdServer Tainted: P ---- 2.6.18-308.4.1.el5 #1
RIP: 0010:[<ffffffff800df83c>] [<ffffffff800df83c>] free_block+0x94/0x145
RSP: 0018:ffff8105a8467db8 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8103483e7000 RCX: 0000000000000069
RDX: ffff81010b7cda88 RSI: ffff8106300c6348 RDI: 0000000000000000
RBP: ffff810115aca140 R08: ffff81062fe3ec04 R09: 0000000000000246
R10: 0000000000000000 R11: 0000000000000246 R12: ffff810645167100
R13: ffff8103483e7c00 R14: 000000000000003b R15: 0000000000000001
FS: 000000004238f940(0063) GS:ffff81062fe3ecc0(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002abcad4373c0 CR3: 0000000335302000 CR4: 00000000000006a0
Process mdServer (pid: 8268, threadinfo ffff8105a8466000, task ffff81033b9ce7a0)
Stack: 0000000000000000 0000003c88550e36 ffff81062fe46418 000000000000003c
ffff81062fe46400 ffff810115aca140 0000000000000001 ffff810645167100
000000004238ee00 ffffffff800dfa21 ffff81062fe46400 0000000000000001
Call Trace:
[<ffffffff800dfa21>] cache_flusharray+0x74/0xa3
[<ffffffff8000b63a>] kfree+0x1c0/0x1dc
[<ffffffff8854ee36>] :sisips:_ZN8UserSids5clearEv+0x26/0x40
[<ffffffff8853c1af>] :sisips:_Z12ProcessNewIdv+0x2f/0xb0
[<ffffffff8853dcb2>] :sisips:hook_setresuid+0x102/0x190
[<ffffffff8005d28d>] tracesys+0xd5/0xe0
Code: 0f 0b 68 35 0b 2c 80 c2 5d 02 eb fe 48 8b 5a 30 49 63 c7 49
RIP [<ffffffff800df83c>] free_block+0x94/0x145
RSP <ffff8105a8467db8>
Resolution
Contact the vendor of the sisips module.
Root Cause
The slab corruption appears to be caused by the third-party sisips module.
Diagnostic Steps
The panic is triggered by a BUG_ON() assertion that checks slab consistency: page_get_slab() requires the page backing the object being freed to be marked as a slab page.
0602 static inline struct slab *page_get_slab(struct page *page)
0603 {
0604         page = compound_head(page);
0605         BUG_ON(!PageSlab(page));
0606         return (struct slab *)page->lru.prev;
0607 }
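To make the check at line 605 concrete, the following is a minimal user-space sketch of the same idea, not kernel code. Every name in it (toy_page, toy_mem_map, toy_set_page_slab, toy_virt_to_page, toy_free_would_bug) is hypothetical, and addresses are modelled as plain offsets into a four-page toy address space. It only illustrates that freeing an address whose backing page is not marked as a slab page is the condition the BUG_ON() rejects.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12   /* 4 KiB pages, as on x86_64 */
#define TOY_NPAGES     4    /* size of the toy address space, in pages */

/* Hypothetical stand-in for struct page: only the "is slab" flag is modelled. */
struct toy_page {
    bool slab;
};

struct toy_page toy_mem_map[TOY_NPAGES];

/* Mark one page as owned by the slab allocator (what PageSlab() would report). */
void toy_set_page_slab(size_t pfn)
{
    if (pfn < TOY_NPAGES)
        toy_mem_map[pfn].slab = true;
}

/* Toy virt_to_page(): map an offset in the toy address space to its page. */
struct toy_page *toy_virt_to_page(size_t addr)
{
    size_t pfn = addr >> TOY_PAGE_SHIFT;

    return pfn < TOY_NPAGES ? &toy_mem_map[pfn] : NULL;
}

/* Returns true where the real check, BUG_ON(!PageSlab(page)), would fire. */
bool toy_free_would_bug(size_t addr)
{
    struct toy_page *page = toy_virt_to_page(addr);

    return page == NULL || !page->slab;
}
```

In this model, after toy_set_page_slab(1), an address inside page 1 passes the check, while an address in any other page is reported as one that would trigger the BUG.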
Checking the slab caches:
crash> kmem -s
[..]
ffff8106300c21c0 size-256 256 2407 2745 183 4k
ffff8106300c1180 size-128(DMA) 128 0 0 0 4k
ffff8106300c0140 size-64(DMA) 64 0 0 0 4k
kmem: size-64: partial list: slab: ffff810be89c4000 bad inuse counter: 4294967236 <------------
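The size-64 cache is corrupted: the inuse counter of slab ffff810be89c4000 reads 4294967236, which is -60 when interpreted as a signed 32-bit value, so the counter appears to have underflowed.
Next, look at the backtrace: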
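The bad inuse value reported by kmem -s is easier to read as a signed number. The sketch below assumes the counter is a 32-bit unsigned integer; the helper names counter_as_signed and counter_after_frees are hypothetical. It shows that 4294967236 corresponds to -60, and that an unsigned counter at 0 decremented 60 times wraps to exactly that value. This is only arithmetic: the source does not state how the counter actually reached this state.

```c
#include <stdint.h>

/*
 * Interpret a 32-bit unsigned counter value as a signed quantity,
 * without relying on an implementation-defined narrowing conversion.
 */
int64_t counter_as_signed(uint32_t raw)
{
    if (raw <= (uint32_t)INT32_MAX)
        return (int64_t)raw;
    return -(int64_t)(UINT32_MAX - raw) - 1;
}

/*
 * Unsigned arithmetic wraps modulo 2^32, so subtracting more than the
 * current count yields a very large value rather than a negative one.
 */
uint32_t counter_after_frees(uint32_t inuse, uint32_t frees)
{
    return inuse - frees;
}
```

For example, counter_as_signed(4294967236u) evaluates to -60, and counter_after_frees(0u, 60u) evaluates to 4294967236.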
#5 [ffff8105a8467e00] cache_flusharray at ffffffff800dfa21
#6 [ffff8105a8467e30] kfree at ffffffff8000b63a
#7 [ffff8105a8467e70] _ZN8UserSids5clearEv at ffffffff8854ee36 [sisips]
#8 [ffff8105a8467e80] _Z12ProcessNewIdv at ffffffff8853c1af [sisips]
#9 [ffff8105a8467ea0] hook_setresuid at ffffffff8853dcb2 [sisips]
#10 [ffff8105a8467f80] tracesys at ffffffff8005d28d (via system_call)
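The panicking code path runs through the sisips module: hook_setresuid() leads to UserSids::clear() (mangled as _ZN8UserSids5clearEv), which calls kfree().
Checking the stack contents of the kfree frame: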
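The sisips symbols in the backtrace are mangled C++ names. As a reading aid, the sketch below decodes only the two simple shapes that appear here: a plain or nested name followed by an empty parameter list (the trailing "v"). It is a hypothetical helper named demangle_simple, not a general demangler, and it returns -1 for anything else. A tool such as c++filt handles the full mangling scheme.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Decode "_Z<len><name>v" and "_ZN<len><name>...Ev" into "name()" or
 * "Outer::name()". Returns 0 on success, -1 for unsupported input or
 * when the output buffer is too small.
 */
int demangle_simple(const char *sym, char *out, size_t outsz)
{
    size_t used = 0;
    int nested = 0, parts = 0, n;

    if (outsz == 0 || strncmp(sym, "_Z", 2) != 0)
        return -1;
    sym += 2;
    if (*sym == 'N') {          /* 'N' opens a nested (qualified) name */
        nested = 1;
        sym++;
    }

    while (isdigit((unsigned char)*sym)) {
        size_t len = 0;

        /* Each component is a decimal length followed by that many characters. */
        while (isdigit((unsigned char)*sym))
            len = len * 10 + (size_t)(*sym++ - '0');
        if (len == 0 || strlen(sym) < len)
            return -1;
        n = snprintf(out + used, outsz - used, "%s%.*s",
                     parts ? "::" : "", (int)len, sym);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;
        used += (size_t)n;
        sym += len;
        parts++;
        if (!nested)
            break;              /* an unqualified name has one component */
    }
    if (parts == 0)
        return -1;
    if (nested) {
        if (*sym != 'E')        /* 'E' closes the nested name */
            return -1;
        sym++;
    }
    if (strcmp(sym, "v") != 0)  /* only the empty parameter list is handled */
        return -1;
    n = snprintf(out + used, outsz - used, "()");
    if (n < 0 || (size_t)n >= outsz - used)
        return -1;
    return 0;
}
```

Applied to the trace, _ZN8UserSids5clearEv decodes to UserSids::clear() and _Z12ProcessNewIdv decodes to ProcessNewId(); hook_setresuid is not mangled and is rejected with -1.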
#6 [ffff8105a8467e30] kfree at ffffffff8000b63a
ffff8105a8467e38: 0000000000000000 ffff8103483e7c00
ffff8105a8467e48: ffff81031d89c4f8 0000000000000000
ffff8105a8467e58: ffff81031d89c400 0000000000000000
ffff8105a8467e68: 000000004238ee01 ffffffff8854ee36
It looks like sisips is passing an invalid address to kfree(), that is, a pointer that does not belong to a slab page.