Resolving pcs resources stuck on an OpenStack controller
Issue
On Red Hat OpenStack 7, 8 and 9, the OpenStack services are started as pcs resources, with many constraints between them.
After some network failures, a service can fail on one controller, and a resource cleanup does not resolve the issue:
Failed Actions:
* rabbitmq_monitor_10000 on overcloud-controller-0 'not running' (7): call=63, status=complete, exitreason='none',
last-rc-change='Fri Jan 26 17:40:45 2018', queued=0ms, exec=0ms
* openstack-heat-engine_monitor_60000 on overcloud-controller-0 'not running' (7): call=324, status=complete, exitreason='none',
last-rc-change='Fri Jan 26 17:39:12 2018', queued=0ms, exec=0ms
Attempting to clean up all resources:
pcs resource cleanup
Error: Cleaning up all resources on all nodes will execute more than 100 operations in the cluster, which may negatively impact the responsiveness of the cluster. Consider specifying resource and/or node, use --force to override
pcs resource cleanup --force
Even after the cleanup is forced, some resources stay stopped, for example:
Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
Started: [ overcloud-controller-1 overcloud-controller-2 ]
Stopped: [ overcloud-controller-0 ]
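The error message above suggests specifying a resource and/or node instead of using --force. The following is a minimal sketch of that approach, not a confirmed fix: it reads the "Failed Actions" lines from the status output and prints one targeted cleanup command per failed resource. The function name print_cleanup_cmds is hypothetical, and the sketch assumes a pcs version that accepts `pcs resource cleanup <resource id> --node <node>`. It only prints the commands and does not run them; whether a targeted cleanup is enough to restart the stopped clone instance is not established here.

```shell
#!/bin/sh
# Hypothetical helper (not part of pcs): turn "Failed Actions" lines
# into targeted cleanup commands. Dry run: commands are printed only.
print_cleanup_cmds() {
    # A failed action line looks like:
    #   * rabbitmq_monitor_10000 on overcloud-controller-0 'not running' (7): ...
    awk '$1 == "*" && $3 == "on" {
        res = $2
        # Drop the trailing _<operation>_<interval> suffix to get the resource id.
        sub(/_[a-z]+_[0-9]+$/, "", res)
        # Assumption: this pcs version supports the --node option.
        print "pcs resource cleanup " res " --node " $4
    }'
}

# Example (on a controller): pcs status | print_cleanup_cmds
```

With the failed actions shown above as input, this would print a cleanup command for rabbitmq and one for openstack-heat-engine, both limited to overcloud-controller-0. Review the printed commands before running any of them.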
Environment
- Red Hat OpenStack 7
- Red Hat OpenStack 8
- Red Hat OpenStack 9