Bring OSP Environment Back to Health
Issue
- An incorrect (similarly named) script was run on the pod while attempting an overcloud upgrade. The contents of the script were:
"#!/bin/bash
# Was getting JSON errors on overcloud deploy command
# so RedHat support suggested simplifying.
. ~/stackrc
time openstack overcloud deploy --templates /home/stack/openstack-tripleo-heat-templates \
-e /home/stack/prod-environments/lab/east/network-environment.yaml \
--control-scale 3 \
--compute-scale 3 \
--ceph-storage-scale 0 \
--swift-storage-scale 0 \
--block-storage-scale 0 \
--compute-flavor baremetal \
--control-flavor baremetal \
--ceph-storage-flavor ceph-storage \
--swift-storage-flavor swift-storage \
--block-storage-flavor block-storage \
--ntp-server 10.10.10.10 \
--no-cleanup \
--libvirt-type kvm
#-r /home/stack/prod-environments/lab/east/roles_data-network_node.yaml \
#-e /home/stack/prod-environments/lab/east/network-environment.yaml \
#-e /home/stack/prod-environments/lab/east/scheduler-hints.yaml \
#-e /home/stack/prod-environments/lab/east/ips-from-pool-all.yaml \
#-e /home/stack/prod-environments/lab/east/linux-bond-with-vlans.yaml \
#-e /home/stack/prod-environments/lab/east/enable-tls.yaml \
#-e /home/stack/prod-environments/lab/east/inject-trust-anchor.yaml \
#-e /home/stack/prod-environments/lab/east/cloudname.yaml \
#-e /home/stack/prod-environments/lab/east/tls-endpoints-public-dns.yaml \
#-e /home/stack/prod-environments/lab/east/cinder-custom-backends.yaml \
#-e /home/stack/prod-environments/lab/east/node-count-flavor.yaml \
"
- The obvious problems are that the environment templates commented out at the end of the script were excluded from the deployment, and that the compute count (3) is lower than the number of compute nodes that already exist. The deployment failed.
- Overcloud commands now hang.
- At the same time, the Pacemaker (PCS) cluster on the controllers is broken: every service is stopped and in an unmanaged state.
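One of the problems described above, environment files left out of the deploy command, can be spotted before a script is run. The following is a minimal sketch of such a pre-flight check, not part of the original script or of any Red Hat tooling. The function name list_disabled_env_files is hypothetical, and the sketch assumes the disabled options appear one per line in the form "#-e <path>" or "#-r <path>", as they do in the script quoted above.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration): print the path of every
# -e / -r option that has been commented out in a deploy script.
# Usage: list_disabled_env_files <path-to-deploy-script>
list_disabled_env_files() {
    # Match lines such as "#-e /path/to/file.yaml \" and print only the path.
    # Ordinary comments ("# some text") and the shebang line do not match.
    sed -n -E 's/^[[:space:]]*#[[:space:]]*-[er][[:space:]]+([^[:space:]]+).*/\1/p' "$1"
}
```

Run against the script quoted above, the function would print the eleven commented-out paths (the roles_data file and the ten environment files), which makes it easier to see that the command being run is not the full production deployment. It only inspects the text of the script; it does not contact the undercloud or change anything.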
Environment
- Red Hat OpenStack Platform 10.0 (RHOSP)