Pacemaker fails to stop haproxy container resource and it causes fencing in Red Hat OpenStack Platform 16.1
Issue
- Pacemaker fails to stop the haproxy-bundle-podman container with the error "given PIDs did not die within timeout".
</var/log/messages>
pacemaker-execd[1000]: warning: haproxy-bundle-podman-0_monitor_60000 process (PID 1002) timed out
pacemaker-execd[1000]: warning: haproxy-bundle-podman-0_monitor_60000:1002 - timed out after 120000ms
pacemaker-controld[1000]: error: Result of monitor operation for haproxy-bundle-podman-0 on controller-0: Timed Out
pacemaker-controld[1000]: notice: Transition 17516 action 173 (haproxy-bundle-podman-0_monitor_60000 on controller-0): expected 'ok' but got 'error'
==> The monitor operation times out on the haproxy-bundle-podman-X resource.
pacemaker-controld[1000]: notice: Initiating stop operation haproxy-bundle-podman-0_stop_0 locally on controller-0
==> Because of the timeout, Pacemaker attempts to stop the haproxy container.
podman(haproxy-bundle-podman-0)[1001]: ERROR: Error: given PIDs did not die within timeout
podman(haproxy-bundle-podman-0)[1001]: ERROR: Failed to stop container, haproxy-bundle-podman-0, based on image, cluster.common.tag/openstack-haproxy:pcmklatest.
pacemaker-execd[1000]: notice: haproxy-bundle-podman-0_stop_0:1003:stderr [ ocf-exit-reason:Failed to stop container, haproxy-bundle-podman-0, based on image, cluster.common.tag/openstack-haproxy:pcmklatest. ]
pacemaker-controld[1000]: notice: Result of stop operation for haproxy-bundle-podman-0 on controller-0: 1 (error)
==> However, Pacemaker fails to stop the haproxy container due to "given PIDs did not die within timeout".
==> This causes fencing of the controller node where the failed haproxy container is running.
- This issue causes fencing of the controller node where the failed haproxy container is running.
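
To check whether a controller's logs show the same sequence (monitor timeout, stop attempt, stop failure), the messages can be scanned for the three pacemaker-controld lines from the log excerpt above. The following is a minimal sketch, not a supported tool: the function names, event labels, and regular expressions are illustrative assumptions derived only from that excerpt, and real log lines may carry additional prefixes such as timestamps and hostnames, which `search` tolerates.

```python
import re

# Patterns derived from the log excerpt in this article.
# The event labels on the left are hypothetical names chosen for this sketch.
PATTERNS = [
    ("monitor_timeout", re.compile(
        r"Result of monitor operation for (haproxy-bundle-podman-\d+) on (\S+): Timed Out")),
    ("stop_initiated", re.compile(
        r"Initiating stop operation (haproxy-bundle-podman-\d+)_stop_0 locally on (\S+)")),
    ("stop_failed", re.compile(
        r"Result of stop operation for (haproxy-bundle-podman-\d+) on (\S+): 1 \(error\)")),
]

# Sample lines copied from the excerpt, used here instead of a real log file.
SAMPLE = [
    "pacemaker-controld[1000]: error: Result of monitor operation for haproxy-bundle-podman-0 on controller-0: Timed Out",
    "pacemaker-controld[1000]: notice: Initiating stop operation haproxy-bundle-podman-0_stop_0 locally on controller-0",
    "podman(haproxy-bundle-podman-0)[1001]: ERROR: Error: given PIDs did not die within timeout",
    "pacemaker-controld[1000]: notice: Result of stop operation for haproxy-bundle-podman-0 on controller-0: 1 (error)",
]


def find_failure_chain(lines):
    """Return (event, resource, node) tuples in the order they appear."""
    events = []
    for line in lines:
        for name, pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                events.append((name, match.group(1), match.group(2)))
                break
    return events


def stop_failed_after_monitor_timeout(events):
    """True if a monitor timeout is later followed by a failed stop
    for the same resource on the same node."""
    timed_out = set()
    for name, resource, node in events:
        if name == "monitor_timeout":
            timed_out.add((resource, node))
        elif name == "stop_failed" and (resource, node) in timed_out:
            return True
    return False


if __name__ == "__main__":
    for event in find_failure_chain(SAMPLE):
        print(event)
    print(stop_failed_after_monitor_timeout(find_failure_chain(SAMPLE)))
```

To run this against a real controller, the `SAMPLE` list could be replaced with the lines read from `/var/log/messages`. A match only indicates that the log resembles the sequence described here; it does not by itself confirm the cause.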
Environment
- Red Hat OpenStack Platform 16.1