OpenStack overcloud run fails due to facter running for more than 5 minutes in Red Hat OpenStack Platform


Issue

The overcloud deployment may fail in WorkflowTasks_Step2_Execution, within the ceph_base_ansible_workflow workflow, at the enable_ssh_admin task:

2020-02-19 04:19:10Z [overcloud-AllNodesDeploySteps-hdpp3w3ojlco.WorkflowTasks_Step2_Execution]: UPDATE_IN_PROGRESS  state changed
2020-02-19 04:24:32Z [overcloud-AllNodesDeploySteps-hdpp3w3ojlco.WorkflowTasks_Step2_Execution]: UPDATE_FAILED  resources.WorkflowTasks_Step2_Execution: Failure caused by error in tasks: ceph_base_ansible_workflow

  ceph_base_ansible_workflow [task_ex_id=a025f701-de16-481d-8e3e-8f8d6c04bb03] -> Failure caused by error in tasks: enable_ssh_admin

  enable_ssh_admin
2020-02-19 04:24:32Z [overcloud-AllNodesDeploySteps-hdpp3w3ojlco]: UPDATE_FAILED  Resource UPDATE failed: resources.WorkflowTasks_Step2_Execution: Failure caused by error in tasks: ceph_base_ansible_workflow

  ceph_base_ansible_workflow [task_ex_id=a025f701-de16-481d-8e3e-8f8d6c04bb03] -> Failure caused by error in tasks: enable_ssh_a
2020-02-19 04:24:33Z [AllNodesDeploySteps]: UPDATE_FAILED  resources.WorkflowTasks_Step2_Execution: resources.AllNodesDeploySteps.Failure caused by error in tasks: ceph_base_ansible_workflow

  ceph_base_ansible_workflow [task_ex_id=a025f701-de16-481d-8e3e-8f8d6c04bb03] -> Failure caused by error in tasks: enable
2020-02-19 04:24:33Z [overcloud]: UPDATE_FAILED  Resource UPDATE failed: resources.WorkflowTasks_Step2_Execution: resources.AllNodesDeploySteps.Failure caused by error in tasks: ceph_base_ansible_workflow

  ceph_base_ansible_workflow [task_ex_id=a025f701-de16-481d-8e3e-8f8d6c04bb03] -> Failure caused b

 Stack overcloud UPDATE_FAILED

overcloud.AllNodesDeploySteps.WorkflowTasks_Step2_Execution:
  resource_type: OS::TripleO::WorkflowSteps
  physical_resource_id: 4e2eda2d-2b99-4a38-b4ad-b076ec4da82a
  status: UPDATE_FAILED
  status_reason: |
    ...


        [wf_ex_id=14101f6e-760c-45c3-b72f-c6a2119d9c30, idx=0]: Failure caused by error in tasks: create_admin

      create_admin [task_ex_id=6042f936-0570-4158-9652-35acc22cd431] -> One or more actions had failed.
        [wf_ex_id=6bc7ef3b-5d87-4675-b475-240d3db4dbdb, idx=7]: None
        [wf_ex_id=9d5be15b-d0d4-41b6-a14b-67ce58a6a702, idx=10]: None
        [wf_ex_id=e0234903-e865-4c7f-be18-3e628721802b, idx=17]: None

The symptoms therefore look similar to another known issue. However, in this specific case, the undercloud does not use SSL.

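Instead, the os-collect-config logs on the controller, together with the Swift object-server logs on the director, show that the task does not complete while the signaling URL is still valid. The object behind the URL is created at 01:19:20 and deleted at 01:24:26, roughly five minutes later, so the controller's PUT at 01:26:20 returns a 404: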

Feb 19 01:19:50 controller-prd03 os-collect-config[460653]: [2020-02-19 01:19:50,108] (heat-config) [DEBUG] Running /usr/libexec/heat-config/hooks/ansible < /var/lib/heat-config/deployed/a5784c21-03c3-4731-be53-ee2797cf7c3d.json
Feb 19 01:19:51 controller-prd03 ansible-setup[48601]: Invoked with filter=* gather_subset=['all'] fact_path=/etc/ansible/facts.d gather_timeout=10
Feb 19 01:26:12 controller-prd03 ansible-user[102143]: Invoked with comment=None ssh_key_bits=0 update_password=always non_unique=False force=False ssh_key_type=rsa create_home=True password_lock=None ssh_key_passphrase=NOT_LOGGING_PARAMETER uid=None home=None append=False skeleton=None ssh_key_comment=ansible-generated on controller-prd03 group=None system=False state=present hidden=None local=None shell=None expires=None ssh_key_file=None groups=None move_home=False password=NOT_LOGGING_PARAMETER name=tripleo-admin seuser=None remove=False login_class=None generate_ssh_key=None
Feb 19 01:26:20 controller-prd03 os-collect-config[460653]: [2020-02-19 01:26:20,849] (heat-config) [DEBUG] [2020-02-19 01:26:20,785] (heat-config-notify) [DEBUG] Signaling to http://10.51.110.10:8080/v1/AUTH_2123ca4387294ee19b88ab8deafe395c/create_admin-5cffb359-cb0d-4148-b03c-16787a959b31/8c20319e-d41a-4e1b-ba5d-5b671eafc6a4?temp_url_sig=c43c3c9c8e16c85e148d7c4763151983922a090d&temp_url_expires=1582103960 via PUT
Feb 19 01:26:20 controller-prd03 os-collect-config[460653]: [2020-02-19 01:26:20,800] (heat-config-notify) [DEBUG] Response <Response [404]>
Feb 19 01:19:20 director.example.com object-server[5408]: 10.51.110.10 - - [19/Feb/2020:04:19:20 +0000] "PUT /1/952/AUTH_2123ca4387294ee19b88ab8deafe395c/create_admin-5cffb359-cb0d-4148-b03c-16787a959b31/8c20319e-d41a-4e1b-ba5d-5b671eafc6a4" 201 - "PUT http://10.51.110.10:8080/v1/AUTH_2123ca4387294ee19b88ab8deafe395c/create_admin-5cffb359-cb0d-4148-b03c-16787a959b31/8c20319e-d41a-4e1b-ba5d-5b671eafc6a4" "tx061bd764d0a546a7a70a2-005e4cb748" "proxy-server 5876" 0.0057 "-" 5408 0
Feb 19 01:24:26 director.example.com object-server[5413]: 10.51.110.10 - - [19/Feb/2020:04:24:26 +0000] "DELETE /1/952/AUTH_2123ca4387294ee19b88ab8deafe395c/create_admin-5cffb359-cb0d-4148-b03c-16787a959b31/8c20319e-d41a-4e1b-ba5d-5b671eafc6a4" 204 - "DELETE http://10.51.110.10:8080/v1/AUTH_2123ca4387294ee19b88ab8deafe395c/create_admin-5cffb359-cb0d-4148-b03c-16787a959b31/8c20319e-d41a-4e1b-ba5d-5b671eafc6a4" "tx65c2273ebb9b4bef89b7a-005e4cb87a" "proxy-server 5876" 0.0106 "-" 5413 0
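The timeline in the log excerpt above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes the controller and director clocks are synchronized and log in the same time zone, and the event names are labels chosen here for illustration. The timestamps themselves are copied from the log lines.

```python
from datetime import datetime

# Syslog lines carry no year; 2020 is taken from the bracketed timestamps
# that appear in the same log lines.
FMT = "%Y %b %d %H:%M:%S"


def parse(ts: str, year: int = 2020) -> datetime:
    """Parse a syslog-style timestamp such as 'Feb 19 01:19:20'."""
    return datetime.strptime(f"{year} {ts}", FMT)


# Event names are illustrative labels, not identifiers from the logs.
events = {
    "object_created": parse("Feb 19 01:19:20"),   # director: PUT ... 201
    "setup_started": parse("Feb 19 01:19:51"),    # controller: ansible-setup
    "object_deleted": parse("Feb 19 01:24:26"),   # director: DELETE ... 204
    "user_module_ran": parse("Feb 19 01:26:12"),  # controller: ansible-user
    "signal_sent": parse("Feb 19 01:26:20"),      # controller: PUT -> 404
}

# Upper bound on fact gathering: time between ansible-setup being invoked
# and the next module (ansible-user) being invoked.
fact_gathering = events["user_module_ran"] - events["setup_started"]
# How long the signaling object existed on the director.
object_lifetime = events["object_deleted"] - events["object_created"]
# How long after object creation the controller tried to signal.
signal_delay = events["signal_sent"] - events["object_created"]

print(f"fact gathering (upper bound): {fact_gathering}")
print(f"object lifetime:              {object_lifetime}")
print(f"signal sent after:            {signal_delay}")
print(f"signal arrived too late:      {signal_delay > object_lifetime}")
```

Under these assumptions, the controller signals well after the object has been deleted, which is consistent with the 404 response.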

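Most of this time is spent in Ansible fact gathering: ansible-setup[48601] is invoked at 01:19:51 with filter=* gather_subset=['all'] fact_path=/etc/ansible/facts.d gather_timeout=10, and the next module, ansible-user[102143], does not run until 01:26:12, more than six minutes later.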
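To check whether fact collection on its own is slow on a given node, one option is to time the command directly. The sketch below is an illustration only: the helper names `time_command` and `exceeds_limit` are hypothetical, and the 300-second limit is an assumption taken from the "more than 5 minutes" in the title, not a documented timeout.

```python
import subprocess
import time
from typing import Sequence

# Assumption: five minutes, as stated in the article title. This is not a
# value read from any TripleO, Mistral, or Heat configuration.
LIMIT_SECONDS = 300.0


def time_command(argv: Sequence[str]) -> float:
    """Run argv, discard its output, and return elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(
        list(argv),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # only the duration matters here, not the exit status
    )
    return time.monotonic() - start


def exceeds_limit(elapsed: float, limit: float = LIMIT_SECONDS) -> bool:
    """Return True if the measured duration is longer than the limit."""
    return elapsed > limit
```

On an affected node this could be used as `exceeds_limit(time_command(["facter"]))`, assuming the `facter` binary is installed and on the PATH; if it is not, `subprocess.run` raises `FileNotFoundError`.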

Environment

Red Hat OpenStack Platform 13
