The kernel panics with many tasks blocked waiting on mmap_sem or i_mmap_mutex when most system memory is consumed by file-backed page cache pages
Issue
- The kernel panics after the hung task detector reports many blocked tasks stuck waiting on mmap_sem or i_mmap_mutex, at a time when most system memory is consumed by file-backed page cache pages.
...
[1740555.363829] INFO: task BESClient:6750 blocked for more than 69 seconds.
[1740555.363894] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1740555.363953] BESClient D ffff88bb91455860 0 6750 1 0x00000080
[1740555.363959] Call Trace:
[1740555.363976] [<ffffffff93b8c3f9>] schedule+0x29/0x70
[1740555.363982] [<ffffffff93b8dde5>] rwsem_down_read_failed+0x105/0x1c0
[1740555.363989] [<ffffffff93797fa8>] call_rwsem_down_read_failed+0x18/0x30
[1740555.363994] [<ffffffff93b8b8c0>] down_read+0x20/0x40
[1740555.364000] [<ffffffff936cae02>] proc_pid_cmdline_read+0xb2/0x5d0
[1740555.364005] [<ffffffff937091cc>] ? security_file_permission+0x8c/0xa0
[1740555.364013] [<ffffffff9364e33f>] vfs_read+0x9f/0x170
[1740555.364018] [<ffffffff9364f185>] SyS_read+0x55/0xd0
[1740555.364023] [<ffffffff93b99f92>] system_call_fastpath+0x25/0x2a
...
[1740555.366589] Kernel panic - not syncing: hung_task: blocked tasks
[1740555.366641] CPU: 31 PID: 173 Comm: khungtaskd Kdump: loaded Tainted: P OE ------------ 3.10.0-1160.71.1.el7.x86_64 #1
[1740555.366711] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/21/2019
[1740555.366753] Call Trace:
[1740555.366778] [<ffffffff93b865c9>] dump_stack+0x19/0x1b
[1740555.366814] [<ffffffff93b802d1>] panic+0xe8/0x21f
[1740555.366849] [<ffffffff9354eabe>] watchdog+0x26e/0x2c0
[1740555.366883] [<ffffffff9354e850>] ? reset_hung_task_detector+0x20/0x20
[1740555.366926] [<ffffffff934c5f91>] kthread+0xd1/0xe0
[1740555.366959] [<ffffffff934c5ec0>] ? insert_kthread_work+0x40/0x40
[1740555.367002] [<ffffffff93b99df7>] ret_from_fork_nospec_begin+0x21/0x21
[1740555.367049] [<ffffffff934c5ec0>] ? insert_kthread_work+0x40/0x40
Environment
- Red Hat Enterprise Linux 7.9.z - kernel-3.10.0-1160.71.1.el7
- Physical machine (HPE ProLiant BL460c Gen9)
- More than 50 Oracle Database instances are running on a single server
- Roughly 40% of system RAM is reserved for hugepages (for the Oracle Database SGA)
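To check whether a system resembles this environment, the hugepage and page cache shares of RAM can be estimated from /proc/meminfo. The following is a rough sketch, not a diagnostic tool: it uses the Cached field as an approximation of file-backed page cache (that field also counts shared memory), and the helper names and the numbers in SAMPLE are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers (kB or page counts)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


def memory_shares(info):
    """Return (hugepage_share, cached_share) as fractions of MemTotal."""
    total_kb = info["MemTotal"]
    # HugePages_Total is a page count; Hugepagesize is the size of one page in kB.
    huge_kb = info.get("HugePages_Total", 0) * info.get("Hugepagesize", 0)
    # Cached approximates file-backed page cache; it also includes shared memory.
    cached_kb = info.get("Cached", 0)
    return huge_kb / total_kb, cached_kb / total_kb


# Illustrative values only, not taken from the affected system.
SAMPLE = """\
MemTotal:        1000000 kB
Cached:           500000 kB
HugePages_Total:     200
Hugepagesize:       2048 kB
"""

if __name__ == "__main__":
    huge, cached = memory_shares(parse_meminfo(SAMPLE))
    print(f"hugepages: {huge:.1%} of RAM, cached: {cached:.1%} of RAM")
```

On a live system, pass the contents of /proc/meminfo to parse_meminfo in place of SAMPLE.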