Double increment of failcount upon a single failure of a start operation
Issue
When a resource fails to start, its failcount sometimes increases twice instead of once. In the following example, a single start failure raises the failcount from 2 to 3 and then from 3 to 4:
Feb 27 12:33:04 [13688] vm2 stonith-ng: ( xml.c:2089 ) debug: xml_patch_version_check: Can apply patch 0.594.9 to 0.594.8
Feb 27 12:33:04 [13690] vm2 attrd: ( commands.c:757 ) info: attrd_peer_update: Setting fail-count-myapache2[vm2]: 2 -> 3 from vm1
Feb 27 12:33:04 [13690] vm2 attrd: ( commands.c:757 ) info: attrd_peer_update: Setting last-failure-myapache2[vm2]: 1519702379 -> 1519702383 from vm1
Feb 27 12:33:04 [13690] vm2 attrd: ( commands.c:757 ) info: attrd_peer_update: Setting fail-count-myapache2[vm2]: 3 -> 4 from vm1
Feb 27 12:33:04 [13687] vm2 cib: ( cib_utils.c:285 ) debug: cib_acl_enabled: CIB ACL is disabled
Feb 27 12:33:04 [13687] vm2 cib: ( cib_ops.c:378 ) debug: cib_process_modify: Destroying /cib/status/node_state[2]/transient_attributes/instance_attributes/nvpair[2]
The problem is intermittent and does not happen on every start failure. It has been observed only when the fail count and the last-failure time are updated in separate events. When both are updated in the same event, the failcount is incremented by 1 as expected. The problem does not appear to be specific to a resource type, as it has been observed with various resources.
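To check whether a node is affected, the Pacemaker log can be scanned for more than one fail-count update for the same resource and node within the same second. The following Python sketch assumes the log lines have the same layout as the attrd messages shown above; the function name `find_repeated_increments` and the regular expression are illustrative and not part of Pacemaker. Updates whose previous value is not numeric are ignored by this sketch.

```python
import re
from collections import defaultdict

# Assumed line layout (taken from the excerpt in this article):
#   Feb 27 12:33:04 [13690] vm2 attrd: ( commands.c:757 ) info:
#   attrd_peer_update: Setting fail-count-myapache2[vm2]: 2 -> 3 from vm1
FAILCOUNT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s.*attrd_peer_update:\s+"
    r"Setting fail-count-(?P<rsc>[^\[\s]+)\[(?P<node>[^\]]+)\]:\s+"
    r"(?P<old>\d+)\s+->\s+(?P<new>\d+)"
)


def find_repeated_increments(lines):
    """Return fail-count updates that occur more than once per second.

    The result maps (timestamp, resource, node) to the list of
    (old, new) value pairs logged at that timestamp.
    """
    updates = defaultdict(list)
    for line in lines:
        match = FAILCOUNT_RE.search(line.strip())
        if match:
            key = (match["ts"], match["rsc"], match["node"])
            updates[key].append((int(match["old"]), int(match["new"])))
    # Keep only the keys with two or more updates in the same second.
    return {key: vals for key, vals in updates.items() if len(vals) > 1}
```

The function accepts any iterable of lines, so an open log file can be passed to it directly. A non-empty result only points at candidate occurrences: two genuine failures logged within the same second would be reported as well, so each match still needs to be compared against the actual operation history.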
Environment
- Red Hat Enterprise Linux 7
- pacemaker-1.1.15-11.el7_3.4.x86_64
