Overcloud deployment failed due to HAProxy when we enabled ceph-dashboard

Solution In Progress - Updated -

Issue

  • We ran the overcloud deployment with --stack-only.

  • We enabled ceph-dashboard:

./openstack_log.sh openstack overcloud deploy \
--disable-validations \
--debug \
--templates \
-e /home/stack/templates/00-global-config.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dvr.yaml \
-e /home/stack/templates/01-hostname-map.yaml \
-e /home/stack/templates/02-ctlplane-assignments.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/templates/03-network-environment.yaml \
-e /home/stack/templates/04-cloudname.yaml \
-e /home/stack/templates/31-enable-tls.yaml \
-e /home/stack/templates/32-inject-trust-anchor-hiera.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-dns.yaml \
-e /home/stack/templates/51-storage-environment.yaml \
-e /home/stack/templates/52-ceph-ansible.yaml \
-e /home/stack/templates/53-disable-swift.yaml \
-e /home/stack/templates/98-disable-services.yaml \
-e /home/stack/templates/containers-prepare-parameter.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-dashboard.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml \
-e /home/stack/templates/54-custom_container_fc.yaml \
-r /home/stack/templates/99-deployed-server-roles-data.yaml \
--overcloud-ssh-user heat-admin \
--overcloud-ssh-key ~/.ssh/id_rsa \
--stack-only \
--log-file overcloudDeploy_${DATE}.log

  • The installation fails because HAProxy does not start:
...
        "TASK [ceph-dashboard : set or update dashboard admin username and password] ****",
        "Friday 03 July 2020  12:38:40 +0200 (0:00:02.039)       0:13:03.198 *********** ",
        "FAILED - RETRYING: set or update dashboard admin username and password (6 retries left).",
        "FAILED - RETRYING: set or update dashboard admin username and password (5 retries left).",
        "FAILED - RETRYING: set or update dashboard admin username and password (4 retries left).",
        "FAILED - RETRYING: set or update dashboard admin username and password (3 retries left).",
        "FAILED - RETRYING: set or update dashboard admin username and password (2 retries left).",
        "FAILED - RETRYING: set or update dashboard admin username and password (1 retries left).",
        "fatal: [controller01 -> 10.128.8.171]: FAILED! => changed=false ",
        "  attempts: 6",
        "    if podman exec ceph-mon-controller01 ceph --cluster ceph dashboard ac-user-show admin; then",
        "      podman exec ceph-mon-controller01 ceph --cluster ceph dashboard ac-user-set-password admin XXXXXXXXXXXXX",
        "    else",
        "      podman exec ceph-mon-controller01 ceph --cluster ceph dashboard ac-user-create admin XXXXXXXXXXXX administrator",
        "    fi",
        "  delta: '0:00:01.490360'",
        "  end: '2020-07-03 12:39:30.300089'",
        "  rc: 5",
        "  start: '2020-07-03 12:39:28.809729'",
        "    Error EIO: Module 'dashboard' has experienced an error and cannot handle commands: OSError(\"No socket could be created -- (('::', 8444, 0, 0): [Errno 98] Address already in us
e)\",)",
        "    Error: non zero exit code: 5: OCI runtime error",
...
  • The cause is that the ceph-mgr container binds the dashboard port 8444 on all IP addresses (0.0.0.0).

  • As a result, HAProxy cannot bind to port 8444 on the provisioning network IP.

  • During the installation, the ceph-mgr container starts first and binds to 0.0.0.0:8444. When HAProxy later tries to bind to port 8444, the port is already in use and HAProxy fails to start.

  • As a workaround, we exec into the ceph-mgr container, configure the dashboard to bind to a specific IP address, and restart the dashboard module:

# ceph config set mgr mgr/dashboard/server_addr 10.10.10.10
# ceph mgr module disable dashboard
# ceph mgr module enable dashboard

After this change the mgr starts properly, and when we rerun the deployment playbook everything works. After a reboot of the controller, however, HAProxy starts fine but ceph-mgr now fails to start.
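
The port conflict described above can be checked on a controller by looking for listeners bound to a wildcard address on the dashboard port. The following is a minimal sketch, not part of the original report: `wildcard_listeners` is a hypothetical helper name, and it assumes the column layout printed by `ss -tln` (state in the first column, local address and port in the fourth).

```shell
# Hypothetical helper (not from the original report): reads `ss -tln`
# output on stdin and prints every listening socket that is bound to a
# wildcard address (0.0.0.0, [::] or *) on the given port.
wildcard_listeners() {
    port="$1"
    awk -v port="$port" '
        # Assumes `ss -tln` layout: $1 = state, $4 = local address:port
        $1 == "LISTEN" {
            addr = $4
            if (addr == ("0.0.0.0:" port) ||
                addr == ("[::]:" port)   ||
                addr == ("*:" port))
                print addr
        }'
}
```

On a controller this could be run as `ss -tln | wildcard_listeners 8444`. Any output means some process holds port 8444 on all addresses, so a second service cannot bind that port on a specific IP. Empty output means no wildcard listener was found on that port.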

Environment

  • Red Hat OpenStack Platform 16.0 (RHOSP)
