OpenStack overcloud update fails waiting for Ceph clean pgs
Issue
The openstack overcloud update command fails with an error similar to the following:
2020-12-04 13:11:55Z [overcloud-AllNodesDeploySteps-2d3a5xmhthpb.WorkflowTasks_Step2]: UPDATE_COMPLETE state changed
2020-12-04 13:11:56Z [overcloud-AllNodesDeploySteps-2d3a5xmhthpb.WorkflowTasks_Step2_Execution]: CREATE_IN_PROGRESS state changed
2020-12-04 13:47:53Z [overcloud-AllNodesDeploySteps-2d3a5xmhthpb.WorkflowTasks_Step2_Execution]: CREATE_FAILED resources.WorkflowTasks_Step2_Execution: Failure caused by error in tasks: ceph_base_ansible_workflow
ceph_base_ansible_workflow [task_ex_id=207c6837-efcb-467b-b882-c4b5a9a3abcd] -> Failure caused by error in tasks: ceph_install
ceph_install [task_e
2020-12-04 13:47:53Z [overcloud-AllNodesDeploySteps-2d3a5xmhabcd]: UPDATE_FAILED Resource CREATE failed: resources.WorkflowTasks_Step2_Execution: Failure caused by error in tasks: ceph_base_ansible_workflow
ceph_base_ansible_workflow [task_ex_id=207c6837-efcb-467b-b882-c4b5a9a3abcd] -> Failure caused by error in tasks: ceph_install
2020-12-04 13:47:53Z [AllNodesDeploySteps]: UPDATE_FAILED resources.AllNodesDeploySteps: Resource CREATE failed: resources.WorkflowTasks_Step2_Execution: Failure caused by error in tasks: ceph_base_ansible_workflow
ceph_base_ansible_workflow [task_ex_id=207c6837-efcb-467b-b882-c4b5a9a3abcd] -> Failure caused
2020-12-04 13:47:53Z [overcloud]: UPDATE_FAILED Resource UPDATE failed: resources.AllNodesDeploySteps: Resource CREATE failed: resources.WorkflowTasks_Step2_Execution: Failure caused by error in tasks: ceph_base_ansible_workflow
ceph_base_ansible_workflow [task_ex_id=207c6837-efcb-467b-b882-c4b5abcd
Stack overcloud UPDATE_FAILED
overcloud.AllNodesDeploySteps.WorkflowTasks_Step2_Execution:
resource_type: OS::TripleO::WorkflowSteps
physical_resource_id: 846bfbc3-641a-49ac-9380-73554cdaabcd
status: CREATE_FAILED
status_reason: |
resources.WorkflowTasks_Step2_Execution: Failure caused by error in tasks: ceph_base_ansible_workflow
ceph_base_ansible_workflow [task_ex_id=207c6837-efcb-467b-b882-c4b5a9a3abcd] -> Failure caused by error in tasks: ceph_install
ceph_install [task_ex_id=80d7b0f8-8657-44a2-86bf-026d7c12abcd] -> One or more actions had failed.
The ceph-install-workflow.log file will show messages similar to these:
2020-12-04 08:46:47,171 p=2691 u=mistral | FAILED - RETRYING: waiting for clean pgs... (2 retries left).
2020-12-04 08:47:17,908 p=2691 u=mistral | FAILED - RETRYING: waiting for clean pgs... (1 retries left).
2020-12-04 08:47:48,635 p=2691 u=mistral | fatal: [192.168.132.46 -> 192.168.132.28]: FAILED! => {"attempts": 40, "changed": true, "cmd".........
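To confirm that a given failure matches this pattern, the log can be scanned for the "waiting for clean pgs" retries followed by a fatal task result. The following is a minimal sketch, not part of the product tooling: the function name summarize_clean_pg_wait and the regular expressions are illustrative assumptions based only on the line format shown in the excerpt above.

```python
import re

# Assumption: log lines follow the format shown in the excerpt above
# ("<date> <time>,<ms> p=<pid> u=<user> | <message>").
RETRY_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ .*"
    r"FAILED - RETRYING: waiting for clean pgs\.\.\. "
    r"\((?P<left>\d+) retries left\)"
)
FATAL_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ .*"
    r"fatal: \[(?P<host>[^\]]+)\]: FAILED!"
)


def summarize_clean_pg_wait(lines):
    """Count 'waiting for clean pgs' retries and find the fatal line after them.

    Hypothetical helper: returns a dict with the number of retry lines,
    the last 'retries left' value seen, and the (timestamp, host) of the
    first fatal result that follows at least one retry, or None.
    """
    retries = []
    fatal = None
    for line in lines:
        match = RETRY_RE.match(line)
        if match:
            retries.append((match.group("ts"), int(match.group("left"))))
            continue
        match = FATAL_RE.match(line)
        if match and retries and fatal is None:
            fatal = (match.group("ts"), match.group("host"))
    return {
        "retry_count": len(retries),
        "last_retries_left": retries[-1][1] if retries else None,
        "fatal": fatal,
    }
```

Passing the lines of ceph-install-workflow.log to this function (for example, from an open file object) yields a small summary. A non-zero retry count together with a fatal entry suggests the task exhausted its retries while the placement groups were still not clean.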
Environment
- Red Hat OpenStack Platform 13