Processes were hung in aio context when one of the paths to the LUN was lost.

Solution Verified - Updated -

Issue

  • Processes hang in the asynchronous I/O (aio) context when one of the paths to the LUN is lost. On database cluster nodes, this can lead to one of the nodes being evicted. The kernel log shows messages similar to the following:
[31714012.691250] INFO: task ora_lgwr_oexsdb:12586 blocked for more than 120 seconds.
[31714012.691254]       Tainted: P           OE    --------- -  - 4.18.0-193.el8.x86_64 #1
[31714012.691254] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[31714012.691255] ora_lgwr_oexsdb D    0 12586      1 0x80000000
[31714012.691257] Call Trace:
[31714012.691267]  ? __schedule+0x24f/0x650
[31714012.691269]  ? __switch_to_asm+0x35/0x70
[31714012.691270]  schedule+0x2f/0xa0
[31714012.691271]  schedule_timeout+0x20d/0x310
[31714012.691273]  ? __switch_to_asm+0x41/0x70
[31714012.691278]  ? __kfifo_from_user_r+0xb0/0xb0
[31714012.691279]  ? __percpu_ref_switch_mode+0xd4/0x180
[31714012.691285]  ? __wake_up_common_lock+0x89/0xc0
[31714012.691286]  wait_for_completion+0x11f/0x190
[31714012.691290]  ? wake_up_q+0x70/0x70
[31714012.691293]  exit_aio+0xdc/0xf0
[31714012.691297]  mmput+0x28/0x130
[31714012.691301]  do_exit+0x287/0xb40
[31714012.691305]  ? __do_page_fault+0x24c/0x4e0
[31714012.691307]  do_group_exit+0x3a/0xa0
[31714012.691308]  __x64_sys_exit_group+0x14/0x20
[31714012.691313]  do_syscall_64+0x5b/0x1a0
[31714012.691315]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[31714012.691316] RIP: 0033:0x7ffff3927506

[31714246.156805] device-mapper: multipath: Failing path 71:0.
[31714246.322353] sysrq: SysRq : Trigger a crash
[31714246.322360] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[31714246.322361] PGD 0 P4D 0 
[31714246.322363] Oops: 0002 [#1] SMP NOPTI
[31714246.322365] CPU: 3 PID: 8424 Comm: cssdagent Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-193.el8.x86_64 #1
[31714246.322367] Hardware name: HPE ProLiant DL580 Gen10/ProLiant DL580 Gen10, BIOS U34 05/24/2021
[31714246.322371] RIP: 0010:sysrq_handle_crash+0x12/0x20
[31714246.322373] Code: 54 f1 c4 ff 48 89 df e8 7c fb ff ff e9 9c fe ff ff 90 90 90 90 90 90 90 0f 1f 44 00 00 c7 05 fd 02 d2 00 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 c3 0f 1f 44 00 00 0f 1f 44 00 00 bf 01 00
[31714246.322374] sysrq: SysRq : Trigger a crash
[31714246.322376] RSP: 0018:ffffaba16c50fe78 EFLAGS: 00010246
[31714246.322377] RAX: ffffffff9ad24300 RBX: 0000000000000063 RCX: 0000000000000000
[31714246.322378] RDX: 0000000000000000 RSI: ffff9661ff4d6a08 RDI: 0000000000000063
[31714246.322379] RBP: 0000000000000004 R08: 0000000000002681 R09: 0000000000000030
[31714246.322380] R10: 0000000000000000 R11: ffffaba16c50fd30 R12: 0000000000000000
[31714246.322380] R13: 0000000000000000 R14: ffffffff9bb3c0a0 R15: 0000000000000000
[31714246.322382] FS:  00007fffed03e700(0000) GS:ffff9661ff4c0000(0000) knlGS:0000000000000000
[31714246.322383] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[31714246.322383] CR2: 0000000000000000 CR3: 00000020e07a6001 CR4: 00000000007606e0
[31714246.322384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[31714246.322385] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[31714246.322386] PKRU: 55555554
[31714246.322386] Call Trace:
[31714246.322389]  __handle_sysrq.cold.9+0x48/0xfb
[31714246.322392]  write_sysrq_trigger+0x2b/0x30
[31714246.322398]  proc_reg_write+0x3c/0x60
[31714246.322403]  vfs_write+0xa5/0x1a0
[31714246.322405]  ksys_write+0x4f/0xb0
[31714246.322410]  do_syscall_64+0x5b/0x1a0
[31714246.322415]  entry_SYSCALL_64_after_hwframe+0x65/0xca
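
To check whether a saved kernel log shows the same signature as the hung-task report above (a task blocked for longer than the timeout, with exit_aio in its call trace), the log can be scanned with a short script. The following is only a minimal sketch: the function name find_aio_exit_hangs and both regular expressions are hypothetical helpers written for this illustration, and they assume dmesg-style text with bracketed timestamps and a task name without spaces. It only detects the pattern and does not identify the underlying cause.

```python
import re

# Hung-task header as printed by the kernel, e.g.
# "INFO: task ora_lgwr_oexsdb:12586 blocked for more than 120 seconds."
# Assumption: the task name contains no whitespace.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# One call-trace frame, e.g. "]  exit_aio+0xdc/0xf0" or "]  ? wake_up_q+0x70/0x70".
FRAME_RE = re.compile(r"\]\s+(?:\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def find_aio_exit_hangs(log_text):
    """Return (comm, pid, seconds) for each hung task whose trace contains exit_aio."""
    reports = []
    state = "idle"  # "idle" -> "header" (report seen) -> "trace" (reading frames)
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            reports.append({
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "secs": int(header.group("secs")),
                "frames": [],
            })
            state = "header"
        elif state == "header" and "Call Trace:" in line:
            state = "trace"
        elif state == "trace":
            frame = FRAME_RE.search(line)
            if frame:
                reports[-1]["frames"].append(frame.group("func"))
            else:
                state = "idle"  # first non-frame line ends this trace
    return [
        (r["comm"], r["pid"], r["secs"])
        for r in reports
        if "exit_aio" in r["frames"]
    ]


# Shortened excerpt of the hung-task report from this article.
SAMPLE = """\
[31714012.691250] INFO: task ora_lgwr_oexsdb:12586 blocked for more than 120 seconds.
[31714012.691257] Call Trace:
[31714012.691286]  wait_for_completion+0x11f/0x190
[31714012.691290]  ? wake_up_q+0x70/0x70
[31714012.691293]  exit_aio+0xdc/0xf0
[31714012.691297]  mmput+0x28/0x130
"""

if __name__ == "__main__":
    print(find_aio_exit_hangs(SAMPLE))  # [('ora_lgwr_oexsdb', 12586, 120)]
```

The trace is read only between the "Call Trace:" line that follows a hung-task header and the first line that is not a frame, so frames from a later, unrelated trace (such as the SysRq-triggered crash) are not attributed to the hung task.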

Environment

  • Red Hat Enterprise Linux (RHEL) 8
