Resolving attachment and connection inconsistencies between the Compute service (nova) database and the Block Storage service (cinder) database


Environment

Red Hat OpenStack Platform 16+

Issue

Operations performed on the storage back end without using the Compute service (nova) API or the Block Storage service (cinder) API can change the volume attachment and connection information, such as the IP addresses of the Ceph nodes. Because these changes bypass the APIs, the attachment and connection information recorded in the Compute service database and the Block Storage service database can become out of sync. Such inconsistencies can affect the operation and management of the virtual machine instances that use the affected volumes.
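
For example, to record the instance and volume IDs of a suspect attachment before you begin the resolution procedure, you can inspect the volume from the overcloud. The volume name myVolume is illustrative:
(overcloud) [stack@undercloud-0 ~]$ openstack volume show myVolume -c id -c status -c attachments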

Resolution

Use the following procedure to resolve inconsistencies when the connection and attachment information in the Compute service database and the Block Storage service database is out of sync. The procedure regenerates the connection information, cleans and recreates the volume attachment between the instance and the volume, and restarts the instance. Repeat the procedure on each Compute node where the inconsistency affects the operation and management of the hosted instances.

TIP: Check the /var/log/nova-manage.log log to troubleshoot errors.

Prerequisites
* The volume must be attached to the instance.
* The instance must have a vm_state of stopped and must not be locked. To check whether the instance is locked, run openstack server show myInstance -c locked -f value. If the instance is locked, unlock it by running openstack server unlock myInstance.
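
For example, using the instance name myInstance from the commands above, stop the instance and confirm that it is stopped and unlocked. A status of SHUTOFF corresponds to a vm_state of stopped; if locked is True, unlock the instance as described above:
(overcloud) [stack@undercloud-0 ~]$ openstack server stop myInstance
(overcloud) [stack@undercloud-0 ~]$ openstack server show myInstance -c status -c locked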

Procedure

  1. Collect the Block Storage service (cinder) information from the nova.conf file on the host Compute node and save it to a temporary configuration file, for example, nova-config.conf:
(overcloud) [stack@undercloud-0 ~]$ ssh compute-0.ctlplane.redhat.local sudo crudini --get --format=ini /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf cinder > nova-config.conf
  2. Collect the service_user configuration from the nova.conf file on the Controller node and save it to your temporary configuration file:
(overcloud) [stack@undercloud-0 ~]$ ssh controller-0.ctlplane.redhat.local sudo crudini --get --format=ini /var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf service_user >> nova-config.conf
  3. Collect the api_database and DEFAULT configuration from the nova.conf file on the Controller node and save it to your temporary configuration file:
(overcloud) [stack@undercloud-0 ~]$ ssh controller-0.ctlplane.redhat.local sudo crudini --get --format=ini /var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf api_database >> nova-config.conf
(overcloud) [stack@undercloud-0 ~]$ ssh controller-0.ctlplane.redhat.local sudo crudini --get --format=ini /var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf DEFAULT >> nova-config.conf
  4. Determine the Nova Cell that the Compute node is a part of:
(overcloud) [stack@undercloud-0 ~]$ ssh tripleo-admin@controller-0.ctlplane.redhat.local sudo podman exec -u root -ti nova_conductor nova-manage cell_v2 list_hosts
+-----------+--------------------------------------+------------------------------+
| Cell Name |              Cell UUID               |           Hostname           |
+-----------+--------------------------------------+------------------------------+
|   cell1   | e492943f-55a4-4387-bfa0-db330a6db34a | cell1-compute-0.redhat.local |
|   cell1   | e492943f-55a4-4387-bfa0-db330a6db34a | cell1-compute-1.redhat.local |
|  default  | ff04f76f-bdd4-457d-83f1-96cefc7c0e77 |    compute-0.redhat.local    |
+-----------+--------------------------------------+------------------------------+
  5. Collect the database configuration from the nova.conf file on the Cell Controller node from the previous step and save it to your temporary configuration file:
(overcloud) [stack@undercloud-0 ~]$ ssh cell1-controller-0.ctlplane.redhat.local sudo crudini --get --format=ini /var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf database >> nova-config.conf
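At this point, nova-config.conf contains the configuration sections that the nova-manage volume_attachment refresh command later in this procedure requires. The following heavily truncated example is illustrative only; the actual keys and values depend on your deployment:
[cinder]
auth_type = v3password
auth_url = https://<internal_api_vip>:13000
...
[service_user]
username = nova
...
[api_database]
connection = mysql+pymysql://nova_api:<password>@<database_vip>/nova_api
[DEFAULT]
...
[database]
connection = mysql+pymysql://nova:<password>@<cell_database_vip>/nova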
  6. Collect the Compute connection information from the host Compute node and save it to a temporary connection information file, for example, compute_connection_info.json:
(overcloud) [stack@undercloud-0 ~]$ ssh compute-0.ctlplane.redhat.local sudo podman exec -it -u root nova_compute nova-manage volume_attachment get_connector --json > compute_connection_info.json
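The connector file describes how this Compute node currently connects to volumes. Its exact contents depend on the storage back end, the os-brick version, and the node configuration; an illustrative example might resemble the following:
{"platform": "x86_64", "os_type": "linux", "ip": "172.17.3.50", "host": "compute-0.redhat.local", "multipath": false, "do_local_attach": false}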
  7. Copy the configuration and connection information files to the Controller node:
(overcloud) [stack@undercloud-0 ~]$ scp nova-config.conf controller-0.ctlplane.redhat.local:~/
(overcloud) [stack@undercloud-0 ~]$ scp compute_connection_info.json controller-0.ctlplane.redhat.local:~/
  8. Copy the configuration and connection information files into a temporary directory in the nova_api container:
(overcloud) [stack@undercloud-0 ~]$ ssh controller-0.ctlplane.redhat.local
[heat-admin@controller-0 ~]$ sudo podman cp compute_connection_info.json nova_api:/tmp/                     
[heat-admin@controller-0 ~]$ sudo podman cp nova-config.conf nova_api:/tmp/
  9. Open a bash terminal to interact with the Compute API container:
[heat-admin@controller-0 ~]$ sudo podman exec -it -u root nova_api /bin/bash
  10. Change to the temporary directory where you stored the copies of the connection data:
[root@controller-0 /]# cd /tmp/
  11. Refresh the connection data associated with the volume attachment for each instance with an attached volume:
[root@controller-0 tmp]# nova-manage --config-file nova-config.conf volume_attachment refresh <server_uuid> <volume_id> compute_connection_info.json
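For example, with illustrative instance and volume IDs:
[root@controller-0 tmp]# nova-manage --config-file nova-config.conf volume_attachment refresh 8f523524-9f25-4e7b-91e7-4a1b53f9a2b3 0f9ae6ec-2b24-4e2a-9e4d-7c0d2c8d1a4e compute_connection_info.json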

NOTE
If the refresh fails, the instance may be moved to the locked state. You must unlock it manually. This issue is being addressed in BZ2178500.

  12. Delete the temporary configuration and connection information files:
[root@controller-0 tmp]# rm compute_connection_info.json
[root@controller-0 tmp]# rm nova-config.conf
  13. Exit the container.
  14. Restart the instances hosted on the Compute node.
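For example, assuming the instance was stopped as described in the prerequisites, start it from the undercloud and confirm that it becomes ACTIVE:
(overcloud) [stack@undercloud-0 ~]$ openstack server start myInstance
(overcloud) [stack@undercloud-0 ~]$ openstack server show myInstance -c status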

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
