RabbitMQ fails to start on all 3 OpenStack controllers
Issue
The RabbitMQ cluster fails to start and every node is in the FAILED or Stopped state. The failed start action reports an empty exitreason ('') and an exec time of roughly 37 seconds (37525ms in this case). DNS resolution for the RabbitMQ hostnames is failing.
[root@controller-0 ~]# pcs status
[...]
* Container bundle set: rabbitmq-bundle [cluster.common.tag/ntvs-osp16_containers-rabbitmq:pcmklatest]:
* rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): FAILED controller-0
* rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Stopped controller-1
* rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Stopped controller-2
[...]
Failed Resource Actions:
* rabbitmq_start_0 on rabbitmq-bundle-0 'error' (1): call=13, status='complete', exitreason='', last-rc-change='2021-05-18 18:34:28Z', queued=1ms, exec=37525ms
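To compare another cluster against this signature, the relevant fields can be pulled out of the Failed Resource Actions lines. The following is only a minimal sketch: the function name `failed_action_summary` is hypothetical, and the pattern assumes the line layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `pcs status` text on stdin and print, for each
# failed action line, the action name, the resource, the exitreason and the
# exec time. Lines that do not match the assumed layout are ignored.
failed_action_summary() {
    sed -n "s/^[[:space:]]*\* \([^ ]*\) on \([^ ]*\) .*exitreason='\([^']*\)'.*exec=\([0-9]*\)ms.*/\1 \2 exitreason='\3' exec=\4ms/p"
}
```

Typical use would be `pcs status | failed_action_summary`. An empty exitreason together with an exec time near 37 seconds matches the symptom described in this article.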
The journal (journalctl) on controller-2 shows pacemaker-remoted logging that the stop operation could not reach node 'rabbit@controller-2':
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ Error: unable to perform an operation on node 'rabbit@controller-2'. Please see diagnostics information and suggestions below. ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ Most common reasons for this are: ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues) ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server) ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * Target node is not running ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ In addition to the diagnostics info below: ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * Consult server logs on node rabbit@controller-2 ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * If target node is configured to use long node names, don't forget to use --longnames with CLI tools ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ DIAGNOSTICS ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ =========== ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ attempted to contact: ['rabbit@controller-2'] ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ rabbit@controller-2: ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * connected to epmd (port 4369) on controller-2 ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * epmd reports: node 'rabbit' not running at all ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ no other nodes on controller-2 ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * suggestion: start the node ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ Current node details: ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * node name: 'rabbitmqcli-8099-rabbit@controller-2' ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * effective user's home directory: /var/lib/rabbitmq ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ * Erlang cookie hash: NQheT9ndizt04UcVWmXWsg== ]
May 18 19:54:53 controller-2 pacemaker-remoted[417644]: notice: rabbitmq_stop_0:7614:stderr [ ]
May 18 19:54:53 controller-2 pacemaker-controld[3181]: notice: Result of stop operation for rabbitmq on rabbitmq-bundle-2: 0 (ok)
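The pacemaker-remoted prefix on every line makes the underlying RabbitMQ CLI message hard to read. As a rough sketch (the function name `remoted_stderr` is hypothetical, and the pattern assumes the `stderr [ ... ]` wrapping seen in the log above), the payload can be extracted like this:

```shell
#!/bin/sh
# Hypothetical helper: read journal text on stdin and print only the text
# wrapped in "stderr [ ... ]" by pacemaker-remoted. Lines that do not match
# the pattern, including the empty "[ ]" lines as shown above, are dropped.
remoted_stderr() {
    sed -n 's/.*stderr \[ \(.*\) \]$/\1/p'
}
```

For example, `journalctl --since today | remoted_stderr` would print lines such as `* epmd reports: node 'rabbit' not running at all` without the timestamp and process prefix.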
DNS resolution for the RabbitMQ hostname fails with SERVFAIL:
$ host controller-2
Host controller-2 not found: 2(SERVFAIL)
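The same check can be repeated for every controller hostname in one pass. The sketch below is an illustration, not part of the original procedure: the function name `check_hosts` is hypothetical, and it uses `getent hosts`, which consults the sources configured in /etc/nsswitch.conf (including /etc/hosts), so its result can differ from `host`, which queries DNS only.

```shell
#!/bin/sh
# Hypothetical helper: report whether each given hostname resolves through
# the system resolver (getent hosts). Returns non-zero if any lookup fails.
check_hosts() {
    rc=0
    for name in "$@"; do
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "OK $name"
        else
            echo "FAIL $name"
            rc=1
        fi
    done
    return "$rc"
}

# When run as a script, check the hostnames passed on the command line.
if [ "$#" -gt 0 ]; then
    check_hosts "$@"
fi
```

Saved as a script, it could be run as `sh check_hosts.sh controller-0 controller-1 controller-2` on each controller; a FAIL line for any name points to the resolution problem described here.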
Environment
- Red Hat OpenStack Platform 16