Chapter 15. Performing basic overcloud administration tasks

This chapter contains information about basic tasks you might need to perform during the lifecycle of your overcloud.

15.1. Accessing overcloud nodes through SSH

You can access each overcloud node through the SSH protocol.

  • Each overcloud node contains a tripleo-admin user, formerly known as the heat-admin user.
  • The stack user on the undercloud has key-based SSH access to the tripleo-admin user on each overcloud node.
  • All overcloud nodes have a short hostname that the undercloud resolves to an IP address on the control plane network. Each short hostname uses a .ctlplane suffix. For example, the short name for overcloud-controller-0 is overcloud-controller-0.ctlplane

Prerequisites

  • A deployed overcloud with a working control plane network.

Procedure

  1. Log in to the undercloud as the stack user.
  2. Find the name of the node that you want to access:

    (undercloud)$ metalsmith list
  3. Connect to the node as the tripleo-admin user:

    (undercloud)$ ssh tripleo-admin@overcloud-controller-0

15.2. Managing containerized services

Red Hat OpenStack Platform (RHOSP) runs services in containers on the undercloud and overcloud nodes. In certain situations, you might need to control the individual services on a host. This section contains information about some common commands you can run on a node to manage containerized services.

Listing containers and images

To list running containers, run the following command:

$ sudo podman ps

To include stopped or failed containers in the command output, add the --all option to the command:

$ sudo podman ps --all

To list container images, run the following command:

$ sudo podman images

Inspecting container properties

To view the properties of a container or container images, use the podman inspect command. For example, to inspect the keystone container, run the following command:

$ sudo podman inspect keystone

Managing containers with Systemd services

Previous versions of OpenStack Platform managed containers with Docker and its daemon. Now, the Systemd services interface manages the lifecycle of the containers. Each container is a service and you run Systemd commands to perform specific operations for each container.

Note

It is not recommended to use the Podman CLI to stop, start, and restart containers because Systemd applies a restart policy. Use Systemd service commands instead.

To check a container status, run the systemctl status command:

$ sudo systemctl status tripleo_keystone
● tripleo_keystone.service - keystone container
   Loaded: loaded (/etc/systemd/system/tripleo_keystone.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2019-02-15 23:53:18 UTC; 2 days ago
 Main PID: 29012 (podman)
   CGroup: /system.slice/tripleo_keystone.service
           └─29012 /usr/bin/podman start -a keystone

To stop a container, run the systemctl stop command:

$ sudo systemctl stop tripleo_keystone

To start a container, run the systemctl start command:

$ sudo systemctl start tripleo_keystone

To restart a container, run the systemctl restart command:

$ sudo systemctl restart tripleo_keystone

Because no daemon monitors the containers status, Systemd automatically restarts most containers in these situations:

  • Clean exit code or signal, such as running podman stop command.
  • Unclean exit code, such as the podman container crashing after a start.
  • Unclean signals.
  • Timeout if the container takes more than 1m 30s to start.

For more information about Systemd services, see the systemd.service documentation.

Note

Any changes to the service configuration files within the container revert after restarting the container. This is because the container regenerates the service configuration based on files on the local file system of the node in /var/lib/config-data/puppet-generated/. For example, if you edit /etc/keystone/keystone.conf within the keystone container and restart the container, the container regenerates the configuration using /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf on the local file system of the node, which overwrites any the changes that were made within the container before the restart.

Monitoring podman containers with Systemd timers

The Systemd timers interface manages container health checks. Each container has a timer that runs a service unit that executes health check scripts.

To list all OpenStack Platform containers timers, run the systemctl list-timers command and limit the output to lines containing tripleo:

$ sudo systemctl list-timers | grep tripleo
Mon 2019-02-18 20:18:30 UTC  1s left       Mon 2019-02-18 20:17:26 UTC  1min 2s ago  tripleo_nova_metadata_healthcheck.timer            tripleo_nova_metadata_healthcheck.service
Mon 2019-02-18 20:18:34 UTC  5s left       Mon 2019-02-18 20:17:23 UTC  1min 5s ago  tripleo_keystone_healthcheck.timer                 tripleo_keystone_healthcheck.service
Mon 2019-02-18 20:18:35 UTC  6s left       Mon 2019-02-18 20:17:13 UTC  1min 15s ago tripleo_memcached_healthcheck.timer                tripleo_memcached_healthcheck.service
(...)

To check the status of a specific container timer, run the systemctl status command for the healthcheck service:

$ sudo systemctl status tripleo_keystone_healthcheck.service
● tripleo_keystone_healthcheck.service - keystone healthcheck
   Loaded: loaded (/etc/systemd/system/tripleo_keystone_healthcheck.service; disabled; vendor preset: disabled)
   Active: inactive (dead) since Mon 2019-02-18 20:22:46 UTC; 22s ago
  Process: 115581 ExecStart=/usr/bin/podman exec keystone /openstack/healthcheck (code=exited, status=0/SUCCESS)
 Main PID: 115581 (code=exited, status=0/SUCCESS)

Feb 18 20:22:46 undercloud.localdomain systemd[1]: Starting keystone healthcheck...
Feb 18 20:22:46 undercloud.localdomain podman[115581]: {"versions": {"values": [{"status": "stable", "updated": "2019-01-22T00:00:00Z", "..."}]}]}}
Feb 18 20:22:46 undercloud.localdomain podman[115581]: 300 192.168.24.1:35357 0.012 seconds
Feb 18 20:22:46 undercloud.localdomain systemd[1]: Started keystone healthcheck.

To stop, start, restart, and show the status of a container timer, run the relevant systemctl command against the .timer Systemd resource. For example, to check the status of the tripleo_keystone_healthcheck.timer resource, run the following command:

$ sudo systemctl status tripleo_keystone_healthcheck.timer
● tripleo_keystone_healthcheck.timer - keystone container healthcheck
   Loaded: loaded (/etc/systemd/system/tripleo_keystone_healthcheck.timer; enabled; vendor preset: disabled)
   Active: active (waiting) since Fri 2019-02-15 23:53:18 UTC; 2 days ago

If the healthcheck service is disabled but the timer for that service is present and enabled, it means that the check is currently timed out, but will be run according to timer. You can also start the check manually.

Note

The podman ps command does not show the container health status.

Checking container logs

Red Hat OpenStack Platform 17.1 logs all standard output (stdout) from all containers, and standard errors (stderr) consolidated inone single file for each container in /var/log/containers/stdout.

The host also applies log rotation to this directory, which prevents huge files and disk space issues.

In case a container is replaced, the new container outputs to the same log file, because podman uses the container name instead of container ID.

You can also check the logs for a containerized service with the podman logs command. For example, to view the logs for the keystone container, run the following command:

$ sudo podman logs keystone

Accessing containers

To enter the shell for a containerized service, use the podman exec command to launch /bin/bash. For example, to enter the shell for the keystone container, run the following command:

$ sudo podman exec -it keystone /bin/bash

To enter the shell for the keystone container as the root user, run the following command:

$ sudo podman exec --user 0 -it <NAME OR ID> /bin/bash

To exit the container, run the following command:

# exit

15.3. Modifying the overcloud environment

You can modify the overcloud to add additional features or alter existing operations.

Procedure

  1. To modify the overcloud, make modifications to your custom environment files and heat templates, then rerun the openstack overcloud deploy command from your initial overcloud creation. For example, if you created an overcloud using Section 11.3, “Configuring and deploying the overcloud”, rerun the following command:

    $ source ~/stackrc
    (undercloud) $ openstack overcloud deploy --templates \
      -e ~/templates/overcloud-baremetal-deployed.yaml \
      -e ~/templates/network-environment.yaml \
      -e ~/templates/storage-environment.yaml \
      --ntp-server pool.ntp.org

    Director checks the overcloud stack in heat, and then updates each item in the stack with the environment files and heat templates. Director does not recreate the overcloud, but rather changes the existing overcloud.

    Important

    Removing parameters from custom environment files does not revert the parameter value to the default configuration. You must identify the default value from the core heat template collection in /usr/share/openstack-tripleo-heat-templates and set the value in your custom environment file manually.

  2. If you want to include a new environment file, add it to the openstack overcloud deploy command with the`-e` option. For example:

    $ source ~/stackrc
    (undercloud) $ openstack overcloud deploy --templates \
      -e ~/templates/new-environment.yaml \
      -e ~/templates/network-environment.yaml \
      -e ~/templates/storage-environment.yaml \
      -e ~/templates/overcloud-baremetal-deployed.yaml \
      --ntp-server pool.ntp.org

    This command includes the new parameters and resources from the environment file into the stack.

    Important

    It is not advisable to make manual modifications to the overcloud configuration because director might overwrite these modifications later.

15.4. Importing virtual machines into the overcloud

You can migrate virtual machines from an existing OpenStack environment to your Red Hat OpenStack Platform (RHOSP) environment.

Procedure

  1. On the existing OpenStack environment, create a new image by taking a snapshot of a running server and download the image:

    $ openstack server image create --name <image_name> <instance_name>
    $ openstack image save --file <exported_vm.qcow2> <image_name>
    • Replace <instance_name> with the name of the instance.
    • Replace <image_name> with the name of the new image.
    • Replace <exported_vm.qcow2> with the name of the exported virtual machine.
  2. Copy the exported image to the undercloud node:

    $ scp exported_vm.qcow2 stack@192.168.0.2:~/.
  3. Log in to the undercloud as the stack user.
  4. Source the overcloudrc credentials file:

    $ source ~/overcloudrc
  5. Upload the exported image into the overcloud:

    (overcloud) $ openstack image create --disk-format qcow2  -file <exported_vm.qcow2> --container-format bare <image_name>
  6. Launch a new instance:

    (overcloud) $ openstack server create  --key-name default --flavor m1.demo --image imported_image --nic net-id=net_id <instance_name>
Important

You can use these commands to copy each virtual machine disk from the existing OpenStack environment to the new Red Hat OpenStack Platform. QCOW snapshots lose their original layering system.

15.5. Launching the ephemeral heat process

In previous versions of Red Hat OpenStack Platform (RHOSP) a system-installed Heat process was used to install the overcloud. Now, we use ephermal Heat to install the overcloud meaning that the heat-api and heat-engine processes are started on demand by the deployment, update, and upgrade commands.

Previously, you used the openstack stack command to create and manage stacks. This command is no longer available by default. For troubleshooting and debugging purposes, for example if the stack should fail, you must first launch the ephemeral Heat process to use the openstack stack commands.

Use the openstack overcloud tripleo launch heat command to enable ephemeral heat outside of a deployment.

Procedure

  1. Launch the ephemeral Heat process:

    (undercloud)$ openstack tripleo launch heat --heat-dir /home/stack/overcloud-deploy/<overcloud>/heat-launcher --restore-db
    • Replace <overcloud> with the name of your overcloud stack.
    Note

    The command exits after launching the Heat process, and the Heat process continues to run in the background as a Podman pod.

  2. Verify that the ephemeral-heat process is running:

    (undercloud)$ sudo podman pod ps
    POD ID        NAME            STATUS      CREATED        INFRA ID      # OF CONTAINERS
    958b141609b2  ephemeral-heat  Running     2 minutes ago  44447995dbcf  3
  3. Export the OS_CLOUD environment:

    (undercloud)$ export OS_CLOUD=heat
  4. List the installed stacks:

    (undercloud)$ openstack stack list
    +--------------------------------------+------------+---------+-----------------+----------------------+--------------+
    | ID                                   | Stack Name | Project | Stack Status    | Creation Time        | Updated Time |
    +--------------------------------------+------------+---------+-----------------+----------------------+--------------+
    | 761e2a54-c6f9-4e0f-abe6-c8e0ad51a76c | overcloud  | admin   | CREATE_COMPLETE | 2022-08-29T20:48:37Z | None         |
    +--------------------------------------+------------+---------+-----------------+----------------------+--------------+

    You can debug with commands such as openstack stack environment show and openstack stack resource list.

  5. After you have finished debugging, stop the ephemeral Heat process:

    (undercloud)$ openstack tripleo launch heat --kill
Note

Sometimes, exporting the heat environment fails. This can happen when other credentials, such as overcloudrc, are in use. In this case unset the existing environment and source the heat environment.

(overcloud)$ unset OS_CLOUD
(overcloud)$ unset OS_PROJECT_NAME
(overcloud)$ unset OS_PROJECT_DOMAIN_NAME
(overcloud)$ unset OS_USER_DOMAIN_NAME
(overcloud)$ OS_AUTH_TYPE=none
(overcloud)$ OS_ENDPOINT=http://127.0.0.1:8006/v1/admin
(overcloud)$ export OS_CLOUD=heat

15.6. Running the dynamic inventory script

You can run Ansible-based automation in your Red Hat OpenStack Platform (RHOSP) environment. Use the tripleo-ansible-inventory.yaml inventory file located in the /home/stack/overcloud-deploy/<stack> directory to run ansible plays or ad-hoc commands.

Note

If you want to run an Ansible playbook or an Ansible ad-hoc command on the undercloud, you must use the /home/stack/tripleo-deploy/undercloud/tripleo-ansible-inventory.yaml inventory file.

Procedure

  1. To view your inventory of nodes, run the following Ansible ad-hoc command:

    (undercloud) $ ansible -i ./overcloud-deploy/<stack>/tripleo-ansible-inventory.yaml all --list
    Note

    Replace stack with the name of your deployed overcloud stack.

  2. To execute Ansible playbooks on your environment, run the ansible command and include the full path to inventory file using the -i option. For example:

    (undercloud) $ ansible <hosts> -i ./overcloud-deploy/tripleo-ansible-inventory.yaml <playbook> <options>
    • Replace <hosts> with the type of hosts that you want to use to use:

      • controller for all Controller nodes
      • compute for all Compute nodes
      • overcloud for all overcloud child nodes. For example, controller and compute nodes
      • "*" for all nodes
    • Replace <options> with additional Ansible options.

      • Use the --ssh-extra-args='-o StrictHostKeyChecking=no' option to bypass confirmation on host key checking.
      • Use the -u [USER] option to change the SSH user that executes the Ansible automation. The default SSH user for the overcloud is automatically defined using the ansible_ssh_user parameter in the dynamic inventory. The -u option overrides this parameter.
      • Use the -m [MODULE] option to use a specific Ansible module. The default is command, which executes Linux commands.
      • Use the -a [MODULE_ARGS] option to define arguments for the chosen module.
Important

Custom Ansible automation on the overcloud is not part of the standard overcloud stack. Subsequent execution of the openstack overcloud deploy command might override Ansible-based configuration for OpenStack Platform services on overcloud nodes.

15.7. Removing an overcloud stack

You can delete an overcloud stack and unprovision all the stack nodes.

Note

Deleting your overcloud stack does not erase all the overcloud data. If you need to erase all the overcloud data, contact Red Hat support.

Procedure

  1. Log in to the undercloud host as the stack user.
  2. Source the stackrc undercloud credentials file:

    $ source ~/stackrc
  3. Retrieve a list of all the nodes in your stack and their current status:

    (undercloud)$ openstack baremetal node list
    +--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
    | UUID                                 | Name         | Instance UUID                        | Power State | Provisioning State | Maintenance |
    +--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
    | 92ae71b0-3c31-4ebb-b467-6b5f6b0caac7 | compute-0    | 059fb1a1-53ea-4060-9a47-09813de28ea1 | power on    | active             | False       |
    | 9d6f955e-3d98-4d1a-9611-468761cebabf | compute-1    | e73a4b50-9579-4fe1-bd1a-556a2c8b504f | power on    | active             | False       |
    | 8a686fc1-1381-4238-9bf3-3fb16eaec6ab | controller-0 | 6d69e48d-10b4-45dd-9776-155a9b8ad575 | power on    | active             | False       |
    | eb8083cc-5f8f-405f-9b0c-14b772ce4534 | controller-1 | 1f836ac0-a70d-4025-88a3-bbe0583b4b8e | power on    | active             | False       |
    | a6750f1f-8901-41d6-b9f1-f5d6a10a76c7 | controller-2 | e2edd028-cea6-4a98-955e-5c392d91ed46 | power on    | active             | False       |
    +--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
  4. Delete the overcloud stack and unprovision the nodes and networks:

    (undercloud)$ openstack overcloud delete -b <node_definition_file> \
     --networks-file <networks_definition_file> --network-ports <stack>
    • Replace <node_definition_file> with the name of your node definition file, for example, overcloud-baremetal-deploy.yaml.
    • Replace <networks_definition_file> with the name of your networks definition file, for example, network_data_v2.yaml.
    • Replace <stack> with the name of the stack that you want to delete. If not specified, the default stack is overcloud.
  5. Confirm that you want to delete the overcloud:

    Are you sure you want to delete this overcloud [y/N]?
  6. Wait for the overcloud to delete and the nodes and networks to unprovision.
  7. Confirm that the bare-metal nodes have been unprovisioned:

    (undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list
    +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
    | UUID                                 | Name         | Instance UUID | Power State | Provisioning State | Maintenance |
    +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
    | 92ae71b0-3c31-4ebb-b467-6b5f6b0caac7 | compute-0    | None          | power off   | available          | False       |
    | 9d6f955e-3d98-4d1a-9611-468761cebabf | compute-1    | None          | power off   | available          | False       |
    | 8a686fc1-1381-4238-9bf3-3fb16eaec6ab | controller-0 | None          | power off   | available          | False       |
    | eb8083cc-5f8f-405f-9b0c-14b772ce4534 | controller-1 | None          | power off   | available          | False       |
    | a6750f1f-8901-41d6-b9f1-f5d6a10a76c7 | controller-2 | None          | power off   | available          | False       |
    +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
  8. Remove the stack directories:

    $ rm -rf ~/overcloud-deploy/<stack>
    $ rm -rf ~/config-download/<stack>
    Note

    The directory paths for your stack might be different from the default if you used the --output-dir and --working-dir options when deploying the overcloud with the openstack overcloud deploy command.

15.8. Managing local disk partition sizes

If your local disk partitions continue to fill up after you have optimized the configuration of your partition sizes, then perform one of the following tasks:

  • Manually delete files from the affected partitions.
  • Add a new physical disk and add it to the LVM volume group. For more information, see Configuring and managing logical volumes.
  • Overprovision the partition to use the remaining spare disk space. This option is possible because the default whole disk overcloud image, overcloud-hardened-uefi-full.qcow2, is backed by a thin pool. For more information on thin-provisioned logical volumes, see Creating and managing thin provisioned volumes (thin volumes) in the RHEL Configuring and managing local volumes guide.

    Warning

    Only use overprovisioning when it is not possible to manually delete files or add a new physical disk. Overprovisioning can fail during the write operation if there is insufficient free physical space.

Note

Adding a new disk and overprovisioning the partition require a support exception. Contact the Red Hat Customer Experience and Engagement team to discuss a support exception, if applicable, or other options.