RHEL5: System hangs frequently, running process seen in ext3_find_entry
Issue
- System hang or soft lockup with a running task seen inside ext3_find_entry:
Mar 22 08:57:48 foo kernel: BUG: soft lockup - CPU#4 stuck for 10s! [db2stop2:7748]
Mar 22 08:57:48 foo kernel: CPU 4:
Mar 22 08:57:48 foo kernel: Modules linked in: nfsd exportfs lockd nfs_acl auth_rpcgss sunrpc dm_multipath scsi_dh video hwmon backlight sbs i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev sg pcspkr ide_cd serio_raw bnx2 cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc ata_piix libata shpchp megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Mar 22 08:57:48 foo kernel: Pid: 7748, comm: db2stop2 Not tainted 2.6.18-128.el5 #1
Mar 22 08:57:48 foo kernel: RIP: 0010:[<ffffffff88052e45>] [<ffffffff88052e45>] :ext3:ext3_find_entry+0x1a1/0x592
...
Mar 22 08:57:48 foo kernel:
Mar 22 08:57:48 foo kernel: Call Trace:
Mar 22 08:57:48 foo kernel: [<ffffffff88052e1a>] :ext3:ext3_find_entry+0x176/0x592
Mar 22 08:57:48 foo kernel: [<ffffffff8005dde9>] error_exit+0x0/0x84
Mar 22 08:57:48 foo kernel: [<ffffffff8805c0e2>] :ext3:ext3_get_acl+0x63/0x30d
Mar 22 08:57:48 foo kernel: [<ffffffff80013688>] find_lock_page+0x97/0xa1
Mar 22 08:57:48 foo kernel: [<ffffffff880548ee>] :ext3:ext3_lookup+0x33/0x161
Mar 22 08:57:48 foo kernel: [<ffffffff8000cc36>] do_lookup+0xe5/0x1e6
Mar 22 08:57:48 foo kernel: [<ffffffff80009fc7>] __link_path_walk+0xa01/0xf42
Mar 22 08:57:48 foo kernel: [<ffffffff8000e80a>] link_path_walk+0x5c/0xe5
Mar 22 08:57:48 foo kernel: [<ffffffff800b4628>] audit_syscall_entry+0x16e/0x1a1
Mar 22 08:57:48 foo kernel: [<ffffffff8002c77e>] mntput_no_expire+0x19/0x89
Mar 22 08:57:48 foo kernel: [<ffffffff8000c9d5>] do_path_lookup+0x270/0x2e8
Mar 22 08:57:48 foo kernel: [<ffffffff800123ef>] getname+0x15b/0x1c1
Mar 22 08:57:48 foo kernel: [<ffffffff80023298>] __user_walk_fd+0x37/0x4c
Mar 22 08:57:48 foo kernel: [<ffffffff8003204f>] sys_faccessat+0xe4/0x18d
- Red Hat Enterprise Linux 5.7 VMware guest hangs frequently with the following messages:
INFO: task crond:4667 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
crond D ffff81000a53eaa0 0 4667 3704 (NOTLB)
ffff810061c1de58 0000000000000086 ffff8100622947e0 ffff8100622947e0
000004169e2af62f 0000000000000007 ffff8100622947e0 ffff81007fe210c0
000004169e2af6e8 00000000000000b9 ffff8100622949c8 00000001801abaee
Call Trace:
[<ffffffff80063161>] wait_for_completion+0x79/0xa2
[<ffffffff8008e857>] default_wake_function+0x0/0xe
[<ffffffff8012778a>] __key_instantiate_and_link+0x8f/0xc5
[<ffffffff800a11c5>] synchronize_rcu+0x30/0x36
[<ffffffff800a0d01>] wakeme_after_rcu+0x0/0x9
[<ffffffff8012a1aa>] install_session_keyring+0xc0/0xd3
[<ffffffff8012a6d8>] join_session_keyring+0x25/0xcb
[<ffffffff80129b95>] keyctl_join_session_keyring+0x2d/0x40
[<ffffffff8005d28d>] tracesys+0xd5/0xe0
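To check whether a log contains the first signature above (a soft lockup whose RIP is inside ext3_find_entry), a small scan along these lines may help. This is a minimal sketch, not a Red Hat tool: the function name `find_lockups` and both regular expressions are illustrative assumptions based only on the message layout shown in this article, and other kernels may format these lines differently.

```python
import re

# Header of a soft lockup report, e.g.
# "BUG: soft lockup - CPU#4 stuck for 10s! [db2stop2:7748]"
LOCKUP_RE = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]")

# RIP line; captures the symbol name that precedes "+0xOFFSET/0xSIZE",
# e.g. "ext3_find_entry" in ":ext3:ext3_find_entry+0x1a1/0x592"
RIP_RE = re.compile(r"RIP: .*?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def find_lockups(lines, symbol="ext3_find_entry"):
    """Return (cpu, seconds, comm, pid) for each soft lockup whose RIP is in `symbol`.

    Hypothetical helper: pairs each lockup header with the first RIP line
    that follows it, as in the log excerpt in this article.
    """
    hits = []
    current = None
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            current = (int(m.group(1)), int(m.group(2)), m.group(3), int(m.group(4)))
            continue
        m = RIP_RE.search(line)
        if m and current is not None:
            if m.group(1) == symbol:
                hits.append(current)
            current = None  # this report has been classified
    return hits


if __name__ == "__main__":
    # Lines copied from the soft lockup report in this article.
    sample = [
        "Mar 22 08:57:48 foo kernel: BUG: soft lockup - CPU#4 stuck for 10s! [db2stop2:7748]",
        "Mar 22 08:57:48 foo kernel: RIP: 0010:[<ffffffff88052e45>] [<ffffffff88052e45>] :ext3:ext3_find_entry+0x1a1/0x592",
    ]
    print(find_lockups(sample))
```

In practice the input would be the lines of the system log (for example an open `/var/log/messages` file object). The sketch only covers the soft lockup report; the "blocked for more than 120 seconds" message in the second signature has a different layout and is not matched.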
Environment
- Red Hat Enterprise Linux 5
- Seen on kernels 2.6.18-128.el5 and 2.6.18-274.el5
- All kernels up to 2.6.18-371.el5 are likely affected
- ext3 filesystem
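The kernel range above can be turned into a quick check of a release string. The sketch below is only an illustration under a stated assumption: it reads "up to 2.6.18-371.el5" as "build number less than or equal to 371", so an update build such as 2.6.18-371.1.2.el5 is treated as later than the listed range. The names `parse_el5_release` and `likely_affected` are hypothetical, and a match means only "likely affected" in the sense used in this article.

```python
import re

# Assumption: the last kernel named in this article, read as an upper bound.
LAST_LISTED = (371,)


def parse_el5_release(release):
    """Split e.g. '2.6.18-274.el5' into its numeric build tuple (274,).

    Returns None if the string is not a RHEL 5 (2.6.18-*.el5) kernel release.
    """
    m = re.match(r"2\.6\.18-(\d+(?:\.\d+)*)\.el5", release)
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def likely_affected(release):
    """True if the release falls at or below the last kernel listed above."""
    build = parse_el5_release(release)
    return build is not None and build <= LAST_LISTED
```

The release string of the running kernel is what `uname -r` prints; in Python the same value is available from `platform.release()`. The check says nothing about whether the filesystem in use is ext3, which is the other condition listed above.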