RHOSP 13z9 overcloud deployment fails at Deployment_Step4

Solution In Progress

Issue

  • Deployment fails with errors similar to this:
overcloud.AllNodesDeploySteps.ComputeHCIOvsDpdkDeployment_Step4.1:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: efd2d54a-4d87-49f9-9449-2ed8bee09929
  status: CREATE_FAILED
  status_reason: |
    Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
            "DEBUG:novaclient.v2.client:RESP: [200] Date: Mon, 09 Dec 2019 05:35:13 GMT Server: Apache OpenStack-API-Version: compute 2.11 X-OpenStack-Nova-API-Version: 2.11 Vary: OpenStack-API-Version,X-OpenStack-Nova-API-Version x-openstack-request-id: req-32f26991-bcc0-4d0e-82db-86b2929bee1f x-compute-request-id: req-32f26991-bcc0-4d0e-82db-86b2929bee1f Content-Length: 16 Content-Type: application/json ",
            "DEBUG:novaclient.v2.client:GET call to compute for http://10.10.10.10:8774/v2.1/os-services?binary=nova-compute used request id req-32f26991-bcc0-4d0e-82db-86b2929bee1f",
            "stdout: ca2bcbf224a4846e6eaff2c6b7027a5ad1a00e91f2d2cf390827dfae6ef58208"
        ]
    }
        to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/5d8e2245-7c8d-4e6b-b4bf-15ced3e4d753_playbook.retry

    PLAY RECAP *********************************************************************
    localhost                  : ok=12   changed=8    unreachable=0    failed=1

    (truncated, view all with --long)
  deploy_stderr: |
  • The deployment command is:
(undercloud) [stack@undercloud ~]$ cat ./deploy_overcloud-hci-Redhat.sh
time openstack overcloud deploy  --templates \
  -p ~/templates/plan-environment-derived-params.yaml \
  -r ~/templates/roles_data_hci.yaml \
  -e ~/templates/overcloud_images.yaml \
  -e ~/templates/network-isolation.yaml \
  -e ~/templates/resume-guests.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-ovs-dpdk.yaml \
  -e ~/templates/storage-config.yaml \
  -e ~/templates/ovs-dpdk-permissions.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e ~/templates/network-environment-er703-hci.yaml \
  -e ~/templates/ip-allocation-map-er703-hci.yaml \
  -e ~/templates/fernet.yaml \
  -e ~/templates/environments/deployment-artifacts.yaml \
  -e ~/templates/rhel-registration/rhel-registration-resource-registry.yaml \
  -e ~/templates/rhel-registration/environment-rhel-registration.yaml \
  -e ~/templates/fencing-er703-hci.yaml \
  -t 120 \
  --ntp-server 10.10.10.1 \
  --libvirt-type kvm
  • One of the errors is:
  "Error running ['docker', 'run', '--name', 'nova_wait_for_compute_service', '--label', 'config_id=tripleo_step4', '--label', 'container_name=nova_wait_for_compute_service', '--label', 'managed_by=paunch', '--label', 'config_data={\"start_order\": 4, \"command\": \"/docker-config-scripts/nova_wait_for_compute_service.py\", \"user\": \"nova\", \"volumes\": [\"/etc/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/log\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/var/lib/config-data/nova_libvirt/etc/my.cnf.d/:/etc/my.cnf.d/:ro\", \"/var/lib/config-data/nova_libvirt/etc/nova/:/etc/nova/:ro\", \"/var/log/containers/nova:/var/log/nova\", \"/var/lib/docker-config-scripts/:/docker-config-scripts/\"], \"image\": \"sat.localhost.localdomain:5000/osp-osp13_containers-nova-compute:13.0-115.1574774353\", \"detach\": false, \"net\": \"host\"}', '--net=host', '--user=nova', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/etc/puppet:/etc/puppet:ro', '--volume=/var/lib/config-data/nova_libvirt/etc/my.cnf.d/:/etc/my.cnf.d/:ro', 
'--volume=/var/lib/config-data/nova_libvirt/etc/nova/:/etc/nova/:ro', '--volume=/var/log/containers/nova:/var/log/nova', '--volume=/var/lib/docker-config-scripts/:/docker-config-scripts/', 'sat.localhost.localdomain:5000/osp-osp13_containers-nova-compute:13.0-115.1574774353', '/docker-config-scripts/nova_wait_for_compute_service.py']. [1]", 
  • All services report failures connecting to the AMQP service, with messages similar to these:
Dec 11 18:20:00 overcloud-controller-0 docker: debug 2019-12-11 18:20:00.401504 7f4454871700  0 log_channel(audit) log [DBG] : from='client.? 172.18.1.13:0/723200108' entity='client.admin' cmd=[{"prefix": "fs ls"}]: dispatch
Dec 11 18:20:00 overcloud-controller-0 journal: Process Process-321143:
Dec 11 18:20:00 overcloud-controller-0 journal: Traceback (most recent call last):
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Dec 11 18:20:00 overcloud-controller-0 journal:    self.run()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
Dec 11 18:20:00 overcloud-controller-0 journal:    self._target(*self._args, **self._kwargs)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/cotyledon/_utils.py", line 52, in _bootstrap_process
Dec 11 18:20:00 overcloud-controller-0 journal:    target(*args, **kwargs)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/cotyledon/_service.py", line 161, in create_and_wait
Dec 11 18:20:00 overcloud-controller-0 journal:    sw = cls(*args, **kwargs)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/cotyledon/_service.py", line 175, in __init__
Dec 11 18:20:00 overcloud-controller-0 journal:    self.service = config.service(worker_id, *args, **kwargs)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/aodh/event.py", line 66, in __init__
Dec 11 18:20:00 overcloud-controller-0 journal:    self.listener.start()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 270, in wrapper
Dec 11 18:20:00 overcloud-controller-0 journal:    log_after, timeout_timer)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 190, in run_once
Dec 11 18:20:00 overcloud-controller-0 journal:    post_fn = fn()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 269, in <lambda>
Dec 11 18:20:00 overcloud-controller-0 journal:    states[state].run_once(lambda: fn(self, *args, **kwargs),
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 416, in start
Dec 11 18:20:00 overcloud-controller-0 journal:    self.listener = self._create_listener()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/notify/listener.py", line 160, in _create_listener
Dec 11 18:20:00 overcloud-controller-0 journal:    self._batch_timeout
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 156, in _listen_for_notifications
Dec 11 18:20:00 overcloud-controller-0 journal:    targets_and_priorities, pool, batch_size, batch_timeout
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 650, in listen_for_notifications
Dec 11 18:20:00 overcloud-controller-0 journal:    conn = self._get_connection(rpc_common.PURPOSE_LISTEN)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 541, in _get_connection
Dec 11 18:20:00 overcloud-controller-0 journal:    purpose=purpose)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/common.py", line 402, in __init__
Dec 11 18:20:00 overcloud-controller-0 journal:    self.connection = connection_pool.create(purpose)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/pool.py", line 144, in create
Dec 11 18:20:00 overcloud-controller-0 journal:    return self.connection_cls(self.conf, self.url, purpose)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 635, in __init__
Dec 11 18:20:00 overcloud-controller-0 journal:    self.ensure_connection()
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 734, in ensure_connection
Dec 11 18:20:00 overcloud-controller-0 journal:    self.ensure(method=self.connection.connect)
Dec 11 18:20:00 overcloud-controller-0 journal:  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 845, in ensure
Dec 11 18:20:00 overcloud-controller-0 journal:    raise exceptions.MessageDeliveryFailure(msg)
Dec 11 18:20:00 overcloud-controller-0 journal: MessageDeliveryFailure: Unable to connect to AMQP server on overcloud-controller-0.internalapi.localdomain:5672 after None tries: (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism AMQPLAIN. For details see the broker logfile.
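
The final line of the traceback carries the useful detail: the broker endpoint and the AMQP reply code. A 403 ACCESS_REFUSED reply means the broker rejected the credentials the client presented. When many services log the same failure, it can help to tally these lines per endpoint. The following is a minimal sketch, not Red Hat tooling; the function name summarize_amqp_failures and the regular expression are illustrative assumptions modeled only on the log line quoted above.

```python
import re
from collections import Counter

# Pattern modeled on the MessageDeliveryFailure line quoted above (an assumption,
# not an official log format definition). It skips the "(0, 0)" tuple and picks
# up the three-digit AMQP reply code that follows it.
AMQP_FAILURE = re.compile(
    r"Unable to connect to AMQP server on (?P<host>[^\s:]+):(?P<port>\d+)"
    r".*?\((?P<code>\d{3})\) (?P<reason>[A-Z_]+)"
)


def summarize_amqp_failures(lines):
    """Count AMQP connection failures per (host, port, reply code, reason)."""
    counts = Counter()
    for line in lines:
        match = AMQP_FAILURE.search(line)
        if match:
            key = (
                match["host"],
                int(match["port"]),
                int(match["code"]),
                match["reason"],
            )
            counts[key] += 1
    return counts


# Sample input: the failure line from this article plus one unrelated traceback line.
sample = [
    "Dec 11 18:20:00 overcloud-controller-0 journal: MessageDeliveryFailure: "
    "Unable to connect to AMQP server on "
    "overcloud-controller-0.internalapi.localdomain:5672 after None tries: "
    "(0, 0): (403) ACCESS_REFUSED - Login was refused using authentication "
    "mechanism AMQPLAIN. For details see the broker logfile.",
    "Dec 11 18:20:00 overcloud-controller-0 journal:    self.ensure_connection()",
]

for key, count in summarize_amqp_failures(sample).items():
    print(key, count)
```

The function accepts any iterable of lines, so an open file handle for a saved copy of the controller's system log can be passed in directly. If every entry shows reply code 403 against the same endpoint, that points at a credential mismatch between the services and the broker rather than a network problem.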

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
