Kernel panic at locks_remove_flock: Kernel BUG at locks:1798
Environment
- Red Hat Enterprise Linux 4
- kernel-2.6.9-42.0.8.ELsmp
- Red Hat Enterprise Linux 5
- kernel-2.6.18-53.el5
Issue
- The server crashed with the following message:
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at locks:1798
invalid operand: 0000 [1] SMP
CPU 0
Modules linked in: cpqci(U) mptctl sg autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc ide_dump cciss_dump scsi_dump diskdump zlib_deflate lpfcdfc lpfc scsi_transport_fc vxspec(U) vxio(U) vxdmp(U) fdd(U) vxportal(U) vxfs(U) dm_mod button battery ac ohci_hcd hw_random tg3 e1000 floppy ext3 jbd mptsas cciss mptspi mptscsi mptbase sd_mod scsi_mod
Pid: 6875, comm: java Tainted: PF 2.6.9-42.0.8.ELsmp
RIP: 0010:[<ffffffff8018e598>] <ffffffff8018e598>{locks_remove_flock+201}
RSP: 0000:000001010ff71e38 EFLAGS: 00010246
RAX: 000001017b8a34c0 RBX: 000001003e5afec8 RCX: 000000000000000a
RDX: 0000000000000000 RSI: 000000000000007c RDI: ffffffff804e6600
RBP: 000001003e5afdb8 R08: ffffffffa0551c88 R09: 0000000300000000
R10: ffffffffffffbefc R11: 00000100e42120c0 R12: 00000100e42120c0
R13: 00000100ded8a408 R14: 00000000fffffff0 R15: 0000000000000152
FS: 0000000048f94960(005b) GS:ffffffff804e5900(0000) knlGS:00000000e093fbb0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002a9e3d2070 CR3: 0000000000101000 CR4: 00000000000006e0
Process java (pid: 6875, threadinfo 000001010ff70000, task 0000010014756030)
Stack: 000001010ff71e68 ffffffffa055cb79 00000100e42120c0 00000100e42120c0
000001003e5afec8 ffffffff8018e4c5 00000000002c9518 00000000523c4066
00000000098433f8 00000000523c4066
Call Trace:<ffffffffa055cb79>{:lockd:nlmclnt_locks_release_private+33}
<ffffffff8018e4c5>{locks_remove_posix+374} <ffffffff8017a141>{__fput+73}
<ffffffff80178d6c>{filp_close+103} <ffffffff8018a18a>{sys_dup2+284}
<ffffffff8011026a>{system_call+126}
Code: 0f 0b 94 93 32 80 ff ff ff ff 06 07 48 89 c3 48 8b 03 eb ba
RIP <ffffffff8018e598>{locks_remove_flock+201} RSP <000001010ff71e38>
Resolution
Red Hat Enterprise Linux 4:
- Upgrade to kernel-smp-2.6.9-78.0.22.EL.x86_64 from RHSA-2009:0459 or later.
Red Hat Enterprise Linux 5:
- Upgrade to kernel-2.6.18-128.1.10.el5.x86_64 from RHSA-2009:0473 or later.
Root Cause
- Race condition in the do_setlk function in fs/nfs/file.c in the Linux kernel before 2.6.26 allows local users to cause a denial of service (crash) via vectors resulting in an interrupted RPC call that leads to a stray FL_POSIX lock, related to improper handling of a race between fcntl and close in the EINTR case.
Diagnostic Steps
Kernel Ring Buffer:
crash> log
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at locks:1798
invalid operand: 0000 [1] SMP
CPU 0
Modules linked in: cpqci(U) mptctl sg autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc ide_dump cciss_dump scsi_dump diskdump zlib_deflate lpfcdfc lpfc scsi_transport_fc vxspec(U) vxio(U) vxdmp(U) fdd(U) vxportal(U) vxfs(U) dm_mod button battery ac ohci_hcd hw_random tg3 e1000 floppy ext3 jbd mptsas cciss mptspi mptscsi mptbase sd_mod scsi_mod
Pid: 6875, comm: java Tainted: PF 2.6.9-42.0.8.ELsmp
RIP: 0010:[<ffffffff8018e598>] <ffffffff8018e598>{locks_remove_flock+201}
RSP: 0000:000001010ff71e38 EFLAGS: 00010246
RAX: 000001017b8a34c0 RBX: 000001003e5afec8 RCX: 000000000000000a
RDX: 0000000000000000 RSI: 000000000000007c RDI: ffffffff804e6600
RBP: 000001003e5afdb8 R08: ffffffffa0551c88 R09: 0000000300000000
R10: ffffffffffffbefc R11: 00000100e42120c0 R12: 00000100e42120c0
R13: 00000100ded8a408 R14: 00000000fffffff0 R15: 0000000000000152
FS: 0000000048f94960(005b) GS:ffffffff804e5900(0000) knlGS:00000000e093fbb0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002a9e3d2070 CR3: 0000000000101000 CR4: 00000000000006e0
Process java (pid: 6875, threadinfo 000001010ff70000, task 0000010014756030)
Stack: 000001010ff71e68 ffffffffa055cb79 00000100e42120c0 00000100e42120c0
000001003e5afec8 ffffffff8018e4c5 00000000002c9518 00000000523c4066
00000000098433f8 00000000523c4066
Call Trace:<ffffffffa055cb79>{:lockd:nlmclnt_locks_release_private+33}
<ffffffff8018e4c5>{locks_remove_posix+374} <ffffffff8017a141>{__fput+73}
<ffffffff80178d6c>{filp_close+103} <ffffffff8018a18a>{sys_dup2+284}
<ffffffff8011026a>{system_call+126}
Code: 0f 0b 94 93 32 80 ff ff ff ff 06 07 48 89 c3 48 8b 03 eb ba
RIP <ffffffff8018e598>{locks_remove_flock+201} RSP <000001010ff71e38>
Backtrace:
crash> bt
PID: 6875 TASK: 10014756030 CPU: 0 COMMAND: "java"
#0 [1010ff71c60] start_disk_dump at ffffffffa051736d [diskdump]
#1 [1010ff71c90] try_crashdump at ffffffff8014bd05
#2 [1010ff71ca0] die at ffffffff80111c00
#3 [1010ff71cc0] do_invalid_op at ffffffff80111fc8
#4 [1010ff71d80] error_exit at ffffffff80110d91
[exception RIP: locks_remove_flock+201]
RIP: ffffffff8018e598 RSP: 000001010ff71e38 RFLAGS: 00010246
RAX: 000001017b8a34c0 RBX: 000001003e5afec8 RCX: 000000000000000a
RDX: 0000000000000000 RSI: 000000000000007c RDI: ffffffff804e6600
RBP: 000001003e5afdb8 R8: ffffffffa0551c88 R9: 0000000300000000
R10: ffffffffffffbefc R11: 00000100e42120c0 R12: 00000100e42120c0
R13: 00000100ded8a408 R14: 00000000fffffff0 R15: 0000000000000152
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#5 [1010ff71ef0] __fput at ffffffff8017a141
#6 [1010ff71f20] filp_close at ffffffff80178d6c
#7 [1010ff71f40] sys_dup2 at ffffffff8018a18a
#8 [1010ff71f80] system_call at ffffffff8011026a
RIP: 00000038999b98d9 RSP: 0000000048f93220 RFLAGS: 00010202
RAX: 0000000000000021 RBX: ffffffff8011026a RCX: 0000000000000000
RDX: 0000002a964b9000 RSI: 0000000000000152 RDI: 00000000000000f1
RBP: 0000002aa2a75800 R8: 0000000000000ffc R9: 0000002a961092f8
R10: 0000002a9610d158 R11: 0000000000000202 R12: 0000000048f93428
R13: 00000006f9c6dad0 R14: 0000000000000000 R15: 0000000048f93408
ORIG_RAX: 0000000000000021 CS: 0033 SS: 002b
Disassembly of exception pointer (RIP):
crash> dis -rl locks_remove_flock+201 |tail -n 4
/builddir/build/BUILD/kernel-2.6.9/linux-2.6.9/fs/locks.c: 1795
0xffffffff8018e596 <locks_remove_flock+199>: jmp 0xffffffff8018e5a7 <locks_remove_flock+216>
/builddir/build/BUILD/kernel-2.6.9/linux-2.6.9/fs/locks.c: 1798
0xffffffff8018e598 <locks_remove_flock+201>: ud2
Corresponding kernel code of exception pointer (RIP):
/*
 * This function is called on the last close of an open file.
 */
void locks_remove_flock(struct file *filp)
{
[...]
	while ((fl = *before) != NULL) {
		if (fl->fl_file == filp) {
			if (IS_FLOCK(fl)) {
				locks_delete_lock(before);
				continue;
			}
			if (IS_LEASE(fl)) {
				lease_modify(before, F_UNLCK);
				continue;
			}
			if (IS_POSIX(fl))
				continue;
			/* What? */
			BUG();	/* <<<-----[ Panic triggered ] */
		}
		before = &fl->fl_next;
	}
	unlock_kernel();
}
Mount point and FL_POSIX Lock:
crash> px locks_remove_flock
locks_remove_flock = $2 =
{void (struct file *)} 0xffffffff8018e4cf <locks_remove_flock>
crash> dis -rl locks_remove_flock+201 |head -n 5
/builddir/build/BUILD/kernel-2.6.9/linux-2.6.9/fs/locks.c: 1770
0xffffffff8018e4cf <locks_remove_flock>: push %r12
0xffffffff8018e4d1 <locks_remove_flock+2>: mov %rdi,%r12
0xffffffff8018e4d4 <locks_remove_flock+5>: push %rbp
0xffffffff8018e4d5 <locks_remove_flock+6>: push %rbx
crash> bt -f |grep -A 6 locks_remove_flock+201
[exception RIP: locks_remove_flock+201]
RIP: ffffffff8018e598 RSP: 000001010ff71e38 RFLAGS: 00010246
RAX: 000001017b8a34c0 RBX: 000001003e5afec8 RCX: 000000000000000a
RDX: 0000000000000000 RSI: 000000000000007c RDI: ffffffff804e6600
RBP: 000001003e5afdb8 R8: ffffffffa0551c88 R9: 0000000300000000
R10: ffffffffffffbefc R11: 00000100e42120c0 R12: 00000100e42120c0
R13: 00000100ded8a408 R14: 00000000fffffff0 R15: 0000000000000152
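R12 holds the struct file pointer (filp), because the function prologue copies the first argument from %rdi into %r12:
R12: 00000100e42120c0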
crash> struct file.f_dentry 00000100e42120c0
f_dentry = 0x100ded8a408
crash> files -d 0x100ded8a408
DENTRY INODE SUPERBLK TYPE PATH
100ded8a408 1003e5afdb8 100f568b800 REG /tlm/tlmdev/oms/DEFAULT/cache-EMEA-DEV/BDB/BdbConvertsUnderlier/je.info.0.lck
crash> mount |grep 100f568b800
10001135980 100f568b800 nfs server1.example.com:/vol/volgrp02/dev_2502 /tlm
crash> inode.i_flock 1003e5afdb8
i_flock = 0x1017b8a34c0
crash> file_lock.fl_flags 0x1017b8a34c0
fl_flags = 129 '\201'
fl_flags is 129 (0x81), which is FL_POSIX (1) + FL_SLEEP (128):
#define FL_POSIX 1
#define FL_FLOCK 2
#define FL_ACCESS 8 /* not trying to lock, just looking */
#define FL_LOCKD 16 /* lock held by rpc.lockd */
#define FL_LEASE 32 /* lease held on this file */
#define FL_SLEEP 128 /* A blocking lock */
