RHOSP13z9 failed to deploy overcloud at Deployment_Step4

Solution In Progress

Issue

  • The overcloud deployment fails at Deployment_Step4 with an error similar to the following:
overcloud.AllNodesDeploySteps.ComputeHCIOvsDpdkDeployment_Step4.1:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: efd2d54a-4d87-49f9-9449-2ed8bee09929
  status: CREATE_FAILED
  status_reason: |
    Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
            "DEBUG:novaclient.v2.client:RESP: [200] Date: Mon, 09 Dec 2019 05:35:13 GMT Server: Apache OpenStack-API-Version: compute 2.11 X-OpenStack-Nova-API-Version: 2.11 Vary: OpenStack-API-Version,X-OpenStack-Nova-API-Version x-openstack-request-id: req-32f26991-bcc0-4d0e-82db-86b2929bee1f x-compute-request-id: req-32f26991-bcc0-4d0e-82db-86b2929bee1f Content-Length: 16 Content-Type: application/json ",
            "DEBUG:novaclient.v2.client:GET call to compute for http://10.10.10.10:8774/v2.1/os-services?binary=nova-compute used request id req-32f26991-bcc0-4d0e-82db-86b2929bee1f",
            "stdout: ca2bcbf224a4846e6eaff2c6b7027a5ad1a00e91f2d2cf390827dfae6ef58208"
        ]
    }
        to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/5d8e2245-7c8d-4e6b-b4bf-15ced3e4d753_playbook.retry

    PLAY RECAP *********************************************************************
    localhost                  : ok=12   changed=8    unreachable=0    failed=1

    (truncated, view all with --long)
  deploy_stderr: |
  • The deployment command is:
(undercloud) [stack@undercloud ~]$ cat ./deploy_overcloud-hci-Redhat.sh
time openstack overcloud deploy  --templates \
  -p ~/templates/plan-environment-derived-params.yaml \
  -r ~/templates/roles_data_hci.yaml \
  -e ~/templates/overcloud_images.yaml \
  -e ~/templates/network-isolation.yaml \
  -e ~/templates/resume-guests.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-ovs-dpdk.yaml \
  -e ~/templates/storage-config.yaml \
  -e ~/templates/ovs-dpdk-permissions.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e ~/templates/network-environment-er703-hci.yaml \
  -e ~/templates/ip-allocation-map-er703-hci.yaml \
  -e ~/templates/fernet.yaml \
  -e ~/templates/environments/deployment-artifacts.yaml \
  -e ~/templates/rhel-registration/rhel-registration-resource-registry.yaml \
  -e ~/templates/rhel-registration/environment-rhel-registration.yaml \
  -e ~/templates/fencing-er703-hci.yaml \
  -t 120 \
  --ntp-server 10.10.10.1 \
  --libvirt-type kvm
  • One of the errors reported is the failure of the nova_wait_for_compute_service container:
  "Error running ['docker', 'run', '--name', 'nova_wait_for_compute_service', '--label', 'config_id=tripleo_step4', '--label', 'container_name=nova_wait_for_compute_service', '--label', 'managed_by=paunch', '--label', 'config_data={\"start_order\": 4, \"command\": \"/docker-config-scripts/nova_wait_for_compute_service.py\", \"user\": \"nova\", \"volumes\": [\"/etc/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/log\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/var/lib/config-data/nova_libvirt/etc/my.cnf.d/:/etc/my.cnf.d/:ro\", \"/var/lib/config-data/nova_libvirt/etc/nova/:/etc/nova/:ro\", \"/var/log/containers/nova:/var/log/nova\", \"/var/lib/docker-config-scripts/:/docker-config-scripts/\"], \"image\": \"sat.localhost.localdomain:5000/osp-osp13_containers-nova-compute:13.0-115.1574774353\", \"detach\": false, \"net\": \"host\"}', '--net=host', '--user=nova', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/etc/puppet:/etc/puppet:ro', '--volume=/var/lib/config-data/nova_libvirt/etc/my.cnf.d/:/etc/my.cnf.d/:ro', 
'--volume=/var/lib/config-data/nova_libvirt/etc/nova/:/etc/nova/:ro', '--volume=/var/log/containers/nova:/var/log/nova', '--volume=/var/lib/docker-config-scripts/:/docker-config-scripts/', 'sat.localhost.localdomain:5000/osp-osp13_containers-nova-compute:13.0-115.1574774353', '/docker-config-scripts/nova_wait_for_compute_service.py']. [1]", 
  • All services fail to connect to the AMQP service and log messages similar to these (the final line carries the 403 ACCESS_REFUSED error):
Dec 11 18:20:00 overcloud-controller-0 docker: debug 2019-12-11 18:20:00.401504 7f4454871700  0 log_channel(audit) log [DBG] : from='client.? 172.18.1.13:0/723200108' entity='client.admin' cmd=[{"prefix": "fs ls"}]: dispatch
Dec 11 18:20:00 overcloud-controller-0 journal: Process Process-321143:
Dec 11 18:20:00 overcloud-controller-0 journal: Traceback (most recent call last):
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Dec 11 18:20:00 overcloud-controller-0 journal:    self.run()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
Dec 11 18:20:00 overcloud-controller-0 journal:    self._target(*self._args, **self._kwargs)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/cotyledon/_utils.py", line 52, in _bootstrap_process
Dec 11 18:20:00 overcloud-controller-0 journal:    target(*args, **kwargs)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/cotyledon/_service.py", line 161, in create_and_wait
Dec 11 18:20:00 overcloud-controller-0 journal:    sw = cls(*args, **kwargs)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/cotyledon/_service.py", line 175, in __init__
Dec 11 18:20:00 overcloud-controller-0 journal:    self.service = config.service(worker_id, *args, **kwargs)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/aodh/event.py", line 66, in __init__
Dec 11 18:20:00 overcloud-controller-0 journal:    self.listener.start()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 270, in wrapper
Dec 11 18:20:00 overcloud-controller-0 journal:    log_after, timeout_timer)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 190, in run_once
Dec 11 18:20:00 overcloud-controller-0 journal:    post_fn = fn()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 269, in <lambda>
Dec 11 18:20:00 overcloud-controller-0 journal:    states[state].run_once(lambda: fn(self, *args, **kwargs),
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 416, in start
Dec 11 18:20:00 overcloud-controller-0 journal:    self.listener = self._create_listener()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/notify/listener.py", line 160, in _create_listener
Dec 11 18:20:00 overcloud-controller-0 journal:    self._batch_timeout
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 156, in _listen_for_notifications
Dec 11 18:20:00 overcloud-controller-0 journal:    targets_and_priorities, pool, batch_size, batch_timeout
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 650, in listen_for_notifications
Dec 11 18:20:00 overcloud-controller-0 journal:    conn = self._get_connection(rpc_common.PURPOSE_LISTEN)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 541, in _get_connection
Dec 11 18:20:00 overcloud-controller-0 journal:    purpose=purpose)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/common.py", line 402, in __init__
Dec 11 18:20:00 overcloud-controller-0 journal:    self.connection = connection_pool.create(purpose)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/pool.py", line 144, in create
Dec 11 18:20:00 overcloud-controller-0 journal:    return self.connection_cls(self.conf, self.url, purpose)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 635, in __init__
Dec 11 18:20:00 overcloud-controller-0 journal:    self.ensure_connection()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 734, in ensure_connection
Dec 11 18:20:00 overcloud-controller-0 journal:    self.ensure(method=self.connection.connect)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 845, in ensure
Dec 11 18:20:00 overcloud-controller-0 journal:    raise exceptions.MessageDeliveryFailure(msg)
Dec 11 18:20:00 overcloud-controller-0 journal: MessageDeliveryFailure: Unable to connect to AMQP server on overcloud-controller-0.internalapi.localdomain:5672 after None tries: (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism AMQPLAIN. For details see the broker logfile.
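
To check whether another node shows the same signature, the log can be searched for the ACCESS_REFUSED message seen in the traceback above. The following is a minimal sketch only: the function name amqp_refused_endpoints is hypothetical, and the log file path passed to it is an assumption that depends on where the node writes its journal output.

```shell
#!/bin/bash
# Hypothetical helper (not part of RHOSP): print each AMQP endpoint that
# refused a login, based on the MessageDeliveryFailure text shown above.
# Usage: amqp_refused_endpoints <logfile>
amqp_refused_endpoints() {
    local logfile="$1"
    # Keep only lines carrying the ACCESS_REFUSED signature, then extract
    # the "host:port" that follows "Unable to connect to AMQP server on".
    grep 'ACCESS_REFUSED' "$logfile" \
        | sed -n 's/.*Unable to connect to AMQP server on \([^ ]*\) after.*/\1/p' \
        | sort -u
}
```

For example, amqp_refused_endpoints /var/log/messages (path assumed) would list each distinct broker address that rejected a login. An empty result means the signature was not found in that file.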

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
