Kernel BUG at mm/slab.c:605 when using sisips module

Environment

  • Red Hat Enterprise Linux 5

Issue

The system panicked with the following message:

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/slab.c:605
invalid opcode: 0000 [1] SMP 
last sysfs file: /devices/system/cpu/cpu23/topology/thread_siblings
CPU 11 
Modules linked in: lp parport_pc parport ipmi_si edd st ide_cd mpt2sas mptctl ipmi_devintf ipmi_msghandler dell_rbu autofs4 nfs nfs_acl lockd sunrpc bonding be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi sisips(PU) dm_round_robin dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac joydev tpm_tis sr_mod cdrom i7core_edac tpm edac_mc sg tpm_bios serio_raw pcspkr bnx2 dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage qla2xxx scsi_transport_fc ata_piix libata shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 8268, comm: mdServer Tainted: P     ---- 2.6.18-308.4.1.el5 #1
RIP: 0010:[<ffffffff800df83c>]  [<ffffffff800df83c>] free_block+0x94/0x145
RSP: 0018:ffff8105a8467db8  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8103483e7000 RCX: 0000000000000069
RDX: ffff81010b7cda88 RSI: ffff8106300c6348 RDI: 0000000000000000
RBP: ffff810115aca140 R08: ffff81062fe3ec04 R09: 0000000000000246
R10: 0000000000000000 R11: 0000000000000246 R12: ffff810645167100
R13: ffff8103483e7c00 R14: 000000000000003b R15: 0000000000000001
FS:  000000004238f940(0063) GS:ffff81062fe3ecc0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002abcad4373c0 CR3: 0000000335302000 CR4: 00000000000006a0
Process mdServer (pid: 8268, threadinfo ffff8105a8466000, task ffff81033b9ce7a0)
Stack:  0000000000000000 0000003c88550e36 ffff81062fe46418 000000000000003c
 ffff81062fe46400 ffff810115aca140 0000000000000001 ffff810645167100
 000000004238ee00 ffffffff800dfa21 ffff81062fe46400 0000000000000001
Call Trace:
 [<ffffffff800dfa21>] cache_flusharray+0x74/0xa3
 [<ffffffff8000b63a>] kfree+0x1c0/0x1dc
 [<ffffffff8854ee36>] :sisips:_ZN8UserSids5clearEv+0x26/0x40
 [<ffffffff8853c1af>] :sisips:_Z12ProcessNewIdv+0x2f/0xb0
 [<ffffffff8853dcb2>] :sisips:hook_setresuid+0x102/0x190
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 0f 0b 68 35 0b 2c 80 c2 5d 02 eb fe 48 8b 5a 30 49 63 c7 49 
RIP  [<ffffffff800df83c>] free_block+0x94/0x145
 RSP <ffff8105a8467db8>

Resolution

Contact the vendor of the third-party sisips module.

Root Cause

The slab corruption appears to be caused by the third-party module sisips.

Diagnostic Steps

The panic is triggered by a BUG_ON() assertion at mm/slab.c:605 that checks slab consistency: the page backing the object being freed must belong to the slab allocator.

   0602 static inline struct slab *page_get_slab(struct page *page)
   0603 {
   0604     page = compound_head(page);
   0605     BUG_ON(!PageSlab(page));
   0606     return (struct slab *)page->lru.prev;
   0607 }
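The check can be illustrated with a minimal userspace sketch. The `model_*` types, the `MODEL_PG_SLAB` flag, and both functions are hypothetical stand-ins for the kernel's struct page, PG_slab flag, and page_get_slab(); they are not the kernel implementation. Where the kernel hits BUG_ON(), the sketch returns NULL so the failing case can be observed without crashing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's page and slab types. */
#define MODEL_PG_SLAB (1UL << 0)   /* models the PG_slab page flag */

struct model_slab {
    unsigned int inuse;            /* number of objects currently allocated */
};

struct model_page {
    unsigned long flags;           /* only MODEL_PG_SLAB is used in this sketch */
    struct model_slab *slab;       /* models page->lru.prev, which holds the slab pointer */
};

/* Returns true if the page is currently owned by the slab allocator. */
static bool model_page_is_slab(const struct model_page *page)
{
    return (page->flags & MODEL_PG_SLAB) != 0;
}

/*
 * Models page_get_slab(). The kernel would panic here via
 * BUG_ON(!PageSlab(page)); this sketch returns NULL instead.
 */
static struct model_slab *model_page_get_slab(const struct model_page *page)
{
    if (!model_page_is_slab(page))
        return NULL;               /* kernel: BUG_ON() fires at mm/slab.c:605 */
    return page->slab;
}
```

In this model, an address passed to kfree() that does not point into a slab page corresponds to the NULL case: its page lacks the slab flag, so the lookup fails before any slab metadata is touched.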

Checking the slab caches:

 crash> kmem -s
[..]
 ffff8106300c21c0 size-256                 256       2407      2745    183     4k
 ffff8106300c1180 size-128(DMA)            128          0         0      0     4k
 ffff8106300c0140 size-64(DMA)              64          0         0      0     4k
 kmem: size-64: partial list: slab: ffff810be89c4000  bad inuse counter: 4294967236 <------------

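The inuse counter of a slab on the size-64 partial list is far larger than the number of objects a slab can hold, so the slab metadata is corrupted.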
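One way to read the bad counter is to treat it as a value that wrapped below zero. The sketch below assumes that crash prints inuse as an unsigned 32-bit quantity; the helper name `inuse_as_signed` is hypothetical. Under that assumption, 4294967236 is 2^32 - 60, that is, -60.

```c
#include <stdint.h>

/*
 * Reinterprets an unsigned 32-bit counter as a signed value.
 * Assumption: values above INT32_MAX did not come from real object
 * counts but from the counter being decremented below zero.
 */
static int64_t inuse_as_signed(uint32_t raw)
{
    if (raw > (uint32_t)INT32_MAX)
        return (int64_t)raw - ((int64_t)1 << 32);   /* undo the wraparound */
    return (int64_t)raw;
}
```

Calling `inuse_as_signed(4294967236u)` yields -60. A negative in-use count would be consistent with more objects having been freed to this slab than were allocated from it, for example through repeated or invalid kfree() calls, although the counter alone does not prove which.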

Looking at the backtrace:

 #5 [ffff8105a8467e00] cache_flusharray at ffffffff800dfa21
 #6 [ffff8105a8467e30] kfree at ffffffff8000b63a
 #7 [ffff8105a8467e70] _ZN8UserSids5clearEv at ffffffff8854ee36 [sisips]
 #8 [ffff8105a8467e80] _Z12ProcessNewIdv at ffffffff8853c1af [sisips]
 #9 [ffff8105a8467ea0] hook_setresuid at ffffffff8853dcb2 [sisips]
 #10 [ffff8105a8467f80] tracesys at ffffffff8005d28d (via system_call)

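The path that leads to the panic runs through sisips code: hook_setresuid calls ProcessNewId(), which calls UserSids::clear(), which in turn calls kfree().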
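The sisips symbols in the backtrace are C++ mangled names. As a rough sketch of how they map to readable names, the hypothetical helper below decodes only the two simple shapes seen here: a plain or nested name followed by `v` (no parameters). It is not a general demangler; in practice a tool such as c++filt does this job.

```c
#include <ctype.h>
#include <string.h>

/*
 * Appends one length-prefixed component (e.g. "8UserSids") to out.
 * Returns a pointer just past the component, or NULL on malformed
 * input or if out is too small.
 */
static const char *append_component(const char *p, char *out, size_t outsz)
{
    size_t len = 0;

    if (!isdigit((unsigned char)*p))
        return NULL;
    while (isdigit((unsigned char)*p))
        len = len * 10 + (size_t)(*p++ - '0');
    if (strlen(p) < len || strlen(out) + len + 1 > outsz)
        return NULL;
    strncat(out, p, len);
    return p + len;
}

/* Returns 0 and fills out on success, -1 if the symbol is not handled. */
static int demangle_simple(const char *sym, char *out, size_t outsz)
{
    const char *p = sym;
    int nested;

    if (outsz == 0 || strncmp(p, "_Z", 2) != 0)
        return -1;
    out[0] = '\0';
    p += 2;

    nested = (*p == 'N');          /* 'N' ... 'E' wraps a nested name */
    if (nested)
        p++;

    /* First component, then any further "::"-separated components. */
    p = append_component(p, out, outsz);
    while (p && nested && *p != 'E') {
        if (strlen(out) + 3 > outsz)
            return -1;
        strcat(out, "::");
        p = append_component(p, out, outsz);
    }
    if (!p)
        return -1;
    if (nested)
        p++;                       /* skip the closing 'E' */

    /* Only the "no parameters" signature ("v") is handled. */
    if (strcmp(p, "v") != 0 || strlen(out) + 3 > outsz)
        return -1;
    strcat(out, "()");
    return 0;
}
```

With a 64-byte buffer, `_ZN8UserSids5clearEv` decodes to `UserSids::clear()` and `_Z12ProcessNewIdv` to `ProcessNewId()`, while the unmangled C symbol `hook_setresuid` is rejected.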

Checking the stack of the kfree frame:

 #6 [ffff8105a8467e30] kfree at ffffffff8000b63a
     ffff8105a8467e38: 0000000000000000 ffff8103483e7c00 
     ffff8105a8467e48: ffff81031d89c4f8 0000000000000000 
     ffff8105a8467e58: ffff81031d89c400 0000000000000000 
     ffff8105a8467e68: 000000004238ee01 ffffffff8854ee36 

It looks like sisips is passing invalid addresses to kfree().

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
