Pacemaker pengine reports resource errors or failures every 15 minutes in a High Availability cluster
Issue
- The pacemaker cluster on Red Hat Enterprise Linux 7 logs every 15 minutes that it is forcing a resource away from a node:
  Aug 28 16:00:00 node1 pengine: warning: common_apply_stickiness: Forcing <resource>-start away from node2-priv after 1000000 failures (max=1000000)
- The pacemaker cluster on Red Hat Enterprise Linux 8 logs a previous (historical) resource monitor failure every 15 minutes:
  Aug 28 16:00:00 node1 pacemaker-schedulerd: warning: Unexpected result (not running) was recorded for monitor of <resource> on node1 at Aug 13 15:00:00 2025
- pengine reports "failed op monitor" warnings for the same resource every 15 minutes, which causes our monitoring software to fire alerts.
- Is it normal for pengine to repeatedly report errors saying a resource is not running after I've disabled it with pcs?
- The cluster log repeatedly shows the following message. It began after we manually stopped the application process with its start/stop script or ran the "pcs resource disable" command:
  Processing failed op monitor for <resource> on <node>: not running (7)
Environment
- Red Hat Enterprise Linux (RHEL) 6, 7, 8, 9, or 10 with the High Availability Add-On
- pacemaker