The current commit thread of the ext4 journal calls cond_resched(), which gives the scheduler a chance to run a higher-priority task on the runqueue before the journal commit has finished. An Oracle RAC node eviction occurred as a result.
Issue
- The system hangs.
- Tasks are stuck waiting for the ext4 journal to be committed by that journal's current commit thread.
PID: 12471 TASK: ffff90baf6618000 CPU: 10 COMMAND: "java"
#0 [ffff90ba932b7d88] __schedule at ffffffffa1169a72
#1 [ffff90ba932b7e10] schedule at ffffffffa1169f19
#2 [ffff90ba932b7e20] jbd2_log_wait_commit at ffffffffc06a77c5 [jbd2]
#3 [ffff90ba932b7e98] jbd2_complete_transaction at ffffffffc06a8e52 [jbd2]
#4 [ffff90ba932b7eb8] ext4_sync_file at ffffffffc060e782 [ext4]
...
- The current commit thread of the ext4 journal calls cond_resched(), which gives the scheduler a chance to run a higher-priority task on the runqueue before the journal commit has finished. An Oracle RAC node eviction occurred as a result.
PID: 6081 TASK: ffff90bb10b04100 CPU: 1 COMMAND: "jbd2/dm-6-8"
#0 [ffff90bb1eb6fa58] __schedule at ffffffffa1169a72
#1 [ffff90bb1eb6fae0] __cond_resched at ffffffffa0ad4646
#2 [ffff90bb1eb6faf8] _cond_resched at ffffffffa116a1ba
#3 [ffff90bb1eb6fb08] tag_pages_for_writeback at ffffffffa0bc2b36
#4 [ffff90bb1eb6fb40] write_cache_pages at ffffffffa0bc350c
#5 [ffff90bb1eb6fc48] generic_writepages at ffffffffa0bc392d
#6 [ffff90bb1eb6fca8] jbd2_journal_commit_transaction at ffffffffc06a150e [jbd2]
...
PID: 28810 TASK: ffff90baf4a5e180 CPU: 4 COMMAND: "cssdmonitor"
...
[exception RIP: sysrq_handle_crash+22]
RIP: ffffffffa0e64106 RSP: ffff90ba2beafe58 RFLAGS: 00010246
RAX: ffffffffa0e640f0 RBX: ffffffffa16e4f40 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000282 RDI: 0000000000000063
RBP: ffff90ba2beafe58 R8: 00000000a06022b0 R9: ffffffffa19f9667
R10: 00000000000ea274 R11: 0000000000100000 R12: 0000000000000063
R13: 0000000000000000 R14: 0000000000000007 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#11 [ffff90ba2beafe60] __handle_sysrq at ffffffffa0e6492d
#12 [ffff90ba2beafe90] write_sysrq_trigger at ffffffffa0e64d98
...
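- CPU runtime starvation is possible because of the highly CPU-bound workload of the Oracle real-time application threads. On CPU 1, the real-time task cssdagent is running while the journal commit thread jbd2/dm-6-8 waits in the CFS queue.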
crash> runq -c 1
CPU 1 RUNQUEUE: ffff90bb2325ab80
CURRENT: PID: 28782 TASK: ffff90bb1beda080 COMMAND: "cssdagent" <<------------
RT PRIO_ARRAY: ffff90bb2325ad20
[ 0] PID: 28782 TASK: ffff90bb1beda080 COMMAND: "cssdagent" <<------------
CFS RB_ROOT: ffff90bb2325ac28
[120] PID: 2020 TASK: ffff90a60beb9040 COMMAND: "kworker/1:11"
[120] PID: 27266 TASK: ffff90a74f7f8000 COMMAND: "ora_q004_tolss1"
[120] PID: 24997 TASK: ffff90a59ab430c0 COMMAND: "ora_tt03_arcdb1"
[120] PID: 28485 TASK: ffff90ba25b50000 COMMAND: "gipcd.bin"
[120] PID: 6081 TASK: ffff90bb10b04100 COMMAND: "jbd2/dm-6-8" <<------------
[120] PID: 19318 TASK: ffff90b75eec2080 COMMAND: "ora_lmd0_wwsm1"
[120] PID: 19740 TASK: ffff90b65ff45140 COMMAND: "ora_rs00_dwh1"
[120] PID: 29158 TASK: ffff90ba502a0000 COMMAND: "crsd.bin"
[120] PID: 21628 TASK: ffff90ac41b22080 COMMAND: "ora_ctwr_upp1"
crash> ps -y RR | awk '$1~/>/'
> 19037 1 5 ffff90b7dbf9e180 RU 0.0 11220908 25256 ora_lmhb_ccr1
> 28540 1 0 ffff90ba07e030c0 RU 0.1 1663924 176868 osysmond.bin
> 28752 1 7 ffff90bb1b928000 RU 0.2 3164060 279688 ocssd.bin
> 28753 1 10 ffff90bb1b92d140 RU 0.2 3164060 279688 ocssd.bin
> 28765 1 2 ffff90b90129a080 RU 0.2 3164060 279688 ocssd.bin
> 28767 1 11 ffff90b901298000 RU 0.2 3164060 279688 ocssd.bin
> 28782 1 1 ffff90bb1beda080 RU 0.1 1186164 156112 cssdagent
> 28810 1 4 ffff90baf4a5e180 RU 0.1 1182784 154088 cssdmonitor
> 28830 1 6 ffff90baf661b0c0 RU 0.2 3164060 279688 ocssd.bin
> 28831 1 9 ffff90baf661c100 RU 0.2 3164060 279688 ocssd.bin
> 31742 1 13 ffff90bb11e65140 RU 0.0 4794312 37076 asm_lms0_+asm1
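The runqueue check above can be repeated across CPUs or vmcores with a small script. The following is a minimal sketch, assuming the `runq -c` output has been saved as text in the layout shown above. The helper names `parse_runq` and `starved_journal_threads` are hypothetical and are not part of the crash utility. The sample is trimmed to three CFS entries from the CPU 1 output.

```python
import re

# Sample copied from the `runq -c 1` output above (CFS list trimmed).
RUNQ_SAMPLE = """\
CPU 1 RUNQUEUE: ffff90bb2325ab80
CURRENT: PID: 28782 TASK: ffff90bb1beda080 COMMAND: "cssdagent" <<------------
RT PRIO_ARRAY: ffff90bb2325ad20
[ 0] PID: 28782 TASK: ffff90bb1beda080 COMMAND: "cssdagent" <<------------
CFS RB_ROOT: ffff90bb2325ac28
[120] PID: 2020 TASK: ffff90a60beb9040 COMMAND: "kworker/1:11"
[120] PID: 6081 TASK: ffff90bb10b04100 COMMAND: "jbd2/dm-6-8" <<------------
[120] PID: 29158 TASK: ffff90ba502a0000 COMMAND: "crsd.bin"
"""

# One queued task: "[prio] PID: n TASK: addr COMMAND: "name""
ENTRY = re.compile(
    r'\[\s*(\d+)\]\s+PID:\s+(\d+)\s+TASK:\s+(\w+)\s+COMMAND:\s+"([^"]+)"'
)
# The task currently on the CPU.
CURRENT = re.compile(
    r'CURRENT:\s+PID:\s+(\d+)\s+TASK:\s+\w+\s+COMMAND:\s+"([^"]+)"'
)


def parse_runq(text):
    """Split one CPU's runq output into the current task, RT list and CFS list."""
    result = {"current": None, "rt": [], "cfs": []}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        m = CURRENT.match(line)
        if m:
            result["current"] = (int(m.group(1)), m.group(2))
            continue
        if line.startswith("RT PRIO_ARRAY"):
            section = "rt"
            continue
        if line.startswith("CFS RB_ROOT"):
            section = "cfs"
            continue
        m = ENTRY.match(line)
        if m and section:
            # Store (priority as printed, PID, command name).
            result[section].append((int(m.group(1)), int(m.group(2)), m.group(4)))
    return result


def starved_journal_threads(parsed):
    """Return jbd2 threads waiting in CFS while an RT task holds the CPU."""
    rt_pids = {pid for _, pid, _ in parsed["rt"]}
    current = parsed["current"]
    if current is None or current[0] not in rt_pids:
        return []
    return [(pid, comm) for _, pid, comm in parsed["cfs"] if comm.startswith("jbd2/")]


if __name__ == "__main__":
    parsed = parse_runq(RUNQ_SAMPLE)
    for pid, comm in starved_journal_threads(parsed):
        print(f"{comm} (PID {pid}) is queued behind RT task {parsed['current'][1]}")
```

The script only flags a pattern worth investigating. It does not prove starvation, because a single runqueue snapshot does not show how long the journal thread has been waiting.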
Environment
- Red Hat Enterprise Linux 7.6 kernel-3.10.0-957.27.2.el7
- RHEL guest running on a KVM hypervisor
- Oracle RAC