A pacemaker resource's `fail-count` is incremented when another resource's monitor operation fails
Issue
- A pacemaker resource's `fail-count` is incremented when another resource's monitor operation fails. `pengine` detects a failure for a master/slave resource immediately after another resource or stonith device fails.
- In the logs, an `IPaddr2` resource fails, causing another resource's `fail-count` to increment.
Jun 15 12:55:27 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: info: get_failcount_full: ip-pgsql has failed 1 times on cs-rh7-4-clust.examplerh.com
Jun 15 12:56:32 [4454] cs-rh7-4.gsslab.rdu2.redhat.com crmd: info: update_failcount: Updating failcount for ip-pgsql on cs-rh7-4-clust.examplerh.com after failed monitor: rc=7 (update=value++, time=1529081792)
Jun 15 12:56:32 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: warning: unpack_rsc_op_failure: Processing failed op monitor for ip-pgsql on cs-rh7-4-clust.examplerh.com: not running (7)
Jun 15 12:56:32 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: info: get_failcount_full: msPostgresql has failed 1 times on cs-rh7-4-clust.examplerh.com
Jun 15 12:56:32 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: info: get_failcount_full: msPostgresql has failed 1 times on cs-rh7-4-clust.examplerh.com
Jun 15 12:56:32 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: info: get_failcount_full: msPostgresql has failed 1 times on cs-rh7-4-clust.examplerh.com
Jun 15 12:56:32 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: info: get_failcount_full: msPostgresql has failed 1 times on cs-rh7-4-clust.examplerh.com
Jun 15 12:56:32 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: info: get_failcount_full: ip-pgsql has failed 1 times on cs-rh7-4-clust.examplerh.com
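
To check a longer log for the same pattern, the excerpt above can be summarized per resource. The following is a minimal sketch, not a supported tool: the function names (`summarize_failcounts`, `reported_without_update`) are hypothetical, and the regular expressions assume only the two message shapes visible in the excerpt (`get_failcount_full` from `pengine` and `update_failcount` from `crmd`). It lists resources for which `pengine` reports a non-zero fail count although no `crmd` fail-count update for that resource appears in the lines examined.

```python
import re
from collections import defaultdict

# Assumed message shapes, taken from the log excerpt in this article.
REPORTED = re.compile(r"get_failcount_full:\s+(\S+) has failed (\d+) times on (\S+)")
UPDATED = re.compile(r"update_failcount:\s+Updating failcount for (\S+) on (\S+) after failed (\w+)")


def summarize_failcounts(lines):
    """Map (resource, node) to the highest fail count pengine reported
    and the number of crmd fail-count updates seen for that pair."""
    summary = defaultdict(lambda: {"reported": 0, "updates": 0})
    for line in lines:
        match = REPORTED.search(line)
        if match:
            resource, count, node = match.group(1), int(match.group(2)), match.group(3)
            entry = summary[(resource, node)]
            entry["reported"] = max(entry["reported"], count)
            continue
        match = UPDATED.search(line)
        if match:
            summary[(match.group(1), match.group(2))]["updates"] += 1
    return dict(summary)


def reported_without_update(summary):
    """Resources with a non-zero reported fail count but no crmd update in the log."""
    return sorted(
        resource
        for (resource, _node), entry in summary.items()
        if entry["reported"] > 0 and entry["updates"] == 0
    )


# A few lines copied from the excerpt above, used as sample input.
SAMPLE = [
    "Jun 15 12:55:27 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: info: get_failcount_full: ip-pgsql has failed 1 times on cs-rh7-4-clust.examplerh.com",
    "Jun 15 12:56:32 [4454] cs-rh7-4.gsslab.rdu2.redhat.com crmd: info: update_failcount: Updating failcount for ip-pgsql on cs-rh7-4-clust.examplerh.com after failed monitor: rc=7 (update=value++, time=1529081792)",
    "Jun 15 12:56:32 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: info: get_failcount_full: msPostgresql has failed 1 times on cs-rh7-4-clust.examplerh.com",
    "Jun 15 12:56:32 [4453] cs-rh7-4.gsslab.rdu2.redhat.com pengine: info: get_failcount_full: ip-pgsql has failed 1 times on cs-rh7-4-clust.examplerh.com",
]

if __name__ == "__main__":
    print(reported_without_update(summarize_failcounts(SAMPLE)))
```

A resource flagged this way is only a candidate: its failure may have been recorded before the start of the log window being examined, so the result should be compared against the full log before drawing a conclusion.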
Environment
- Red Hat Enterprise Linux Server 6, 7 (with the High Availability Add On)
- `pacemaker-1.1.16-12.el7_4.8` or earlier