How to configure instance HA using tags


Environment

  • Red Hat OpenStack Platform 10 (Newton)

Issue

  • Instance HA works correctly when no tags are configured on flavors or images.
  • Instance HA works correctly when all instances are deployed with a flavor tagged evacuable=true.
  • Instance HA does not work correctly when some instances use a flavor tagged evacuable=true and others use a flavor without the tag.
  • Instance HA does not work correctly when some instances use a flavor tagged evacuable=true and others use a flavor tagged evacuable=false.

Resolution

Red Hat Enterprise Linux 7
  • The issue (BZ #1600600) has been resolved with errata RHBA-2018:3031 with the following package(s): fence-agents-4.2.1-11.el7 or later.

Workaround

Follow the official documentation to configure Instance HA. It explains where to download the Ansible playbooks and how to use them to configure Instance HA.

When tags are defined, every instance tagged evacuable=true is evacuated when a compute node fails, provided enough resources are available.
In this case, instances with no tag or with the tag evacuable=false are not evacuated.
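
The tagging rule above can be written down as a small decision function. This is only a sketch of the rule as described in this article, not the code used by the fence agent. The function name is_evacuable and its two arguments are hypothetical. It also assumes that when no tags are in use at all, every instance is a candidate for evacuation; the article only states that the untagged case works correctly.

```shell
#!/bin/bash
# is_evacuable PROPERTIES TAGS_IN_USE
#   PROPERTIES   flavor/image properties string, e.g. "evacuable='true'"
#   TAGS_IN_USE  "yes" if any flavor or image carries an evacuable tag
# Returns 0 when the instance would be evacuated under the rule above.
is_evacuable() {
    local properties="$1" tags_in_use="$2"

    # Assumption: with no tags anywhere, all instances are candidates.
    if [ "$tags_in_use" != "yes" ]; then
        return 0
    fi

    # With tags in use, only an explicit evacuable=true qualifies.
    case "$properties" in
        *"evacuable='true'"*|*"evacuable=true"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_evacuable "evacuable='false'" yes returns non-zero, and so does is_evacuable "" yes, which matches the untagged instances left in SHUTOFF state in the test results.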

The Instance HA playbooks are placed in /home/stack/ansible-instanceha. That directory contains an install.sh script, which deploys Instance HA.
A problem was detected during installation: some stonith devices were not created, and the installation failed.

The following error causes the installation to fail:

failed: [undercloud -> controller-2] (item=compute1) => {"changed": true, "cmd": "pcs stonith show ipmilan-compute1", "delta": "0:00:00.263657", "end": "2018-07-06 10:40:29.698450", "failed": true, "item": "compute1", "msg": "non-zero return code", "rc": 1, "start": "2018-07-06 10:40:29.434793", "stderr": "Error: unable to find resource 'ipmilan-compute1'", "stderr_lines": ["Error: unable to find resource 'ipmilan-compute1'"], "stdout": "", "stdout_lines": []}

To resolve this, the missing device was created manually after the failure:

[root@controller0 ~]# pcs stonith create ipmilan-compute1 fence_ipmilan pcmk_host_list="compute1" ipaddr=192.168.1.6 action="reboot" login="admin" passwd="password" delay=20 op monitor interval=60s
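
Because the installer fails on the same pcs stonith show check, it can help to create a device only when that check fails. The following is a minimal sketch under that assumption. The helper name ensure_stonith_device is hypothetical, and the device options must be taken from your own environment.

```shell
#!/bin/bash
# ensure_stonith_device NAME AGENT [OPTIONS...]
# Creates the stonith device only if "pcs stonith show NAME" cannot find it,
# which is the same check that fails in the installer output above.
ensure_stonith_device() {
    local name="$1"
    shift
    if pcs stonith show "$name" >/dev/null 2>&1; then
        echo "exists: $name"
    else
        pcs stonith create "$name" "$@" && echo "created: $name"
    fi
}

# Example call (values copied from the manual command above):
# ensure_stonith_device ipmilan-compute1 fence_ipmilan pcmk_host_list="compute1" \
#     ipaddr=192.168.1.6 action="reboot" login="admin" passwd="password" delay=20 \
#     op monitor interval=60s
```

Running it a second time prints exists: ipmilan-compute1 instead of failing, so it is safe to repeat for every compute node.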

To avoid a new failure the nova-evacuate cluster resource must be deleted:

[root@controller0 ~]# pcs resource delete nova-evacuate

If this resource is not deleted, the following error occurs when the installation is run again:

TASK [instance-ha : Create resource nova-evacuate (no_shared_storage)] ********************************************************************************************************************************************
fatal: [undercloud -> controller-2]: FAILED! => {"changed": true, "cmd": "pcs resource create nova-evacuate ocf:openstack:NovaEvacuate auth_url=$OS_AUTH_URL username=$OS_USERNAME password=$OS_PASSWORD tenant_name=$OS_TENANT_NAME no_shared_storage=1", "delta": "0:00:00.274869", "end": "2018-07-06 10:53:47.636635", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2018-07-06 10:53:47.361766", "stderr": "Error: 'nova-evacuate' already exists", "stderr_lines": ["Error: 'nova-evacuate' already exists"], "stdout": "", "stdout_lines": []}
        to retry, use: --limit @/home/stack/ansible-instanceha/playbooks/overcloud-instance-ha.retry

After deleting the resource, run the Instance HA installation again.
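
The delete step can be guarded so that it does nothing when the resource is already gone. This is a sketch only; the helper name delete_resource_if_present is hypothetical, and it relies on pcs resource show returning a non-zero exit code for an unknown resource.

```shell
#!/bin/bash
# delete_resource_if_present NAME
# Deletes the cluster resource only when pcs reports that it exists,
# so the step can be repeated before each installer retry.
delete_resource_if_present() {
    local name="$1"
    if pcs resource show "$name" >/dev/null 2>&1; then
        pcs resource delete "$name" && echo "deleted: $name"
    else
        echo "absent: $name"
    fi
}

# Example call before re-running the installation:
# delete_resource_if_present nova-evacuate
```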
After the installation, controllers may be shown as computes. In the constraint listing below, the constraints whose rule is osprole eq controller should not be present:

  Resource: nova-compute-checkevacuate-clone
    Constraint: location-nova-compute-checkevacuate-clone (resource-discovery=exclusive)
    Rule: score=0
        Expression: osprole eq compute
    Constraint: location-nova-compute-checkevacuate-clone-1 (resource-discovery=exclusive)
    Rule: score=0
        Expression: osprole eq controller
  Resource: nova-compute-clone
    Constraint: location-nova-compute-clone (resource-discovery=exclusive)
    Rule: score=0
        Expression: osprole eq compute
    Constraint: location-nova-compute-clone-1 (resource-discovery=exclusive)
    Rule: score=0
        Expression: osprole eq controller

If they are present, delete them:

[root@controller0 ~]# pcs constraint remove location-nova-compute-checkevacuate-clone-1
[root@controller0 ~]# pcs constraint remove location-nova-compute-clone-1
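
To find the unwanted constraints without reading the listing by eye, the listing can be filtered for rules that match controllers. The sketch below assumes input in the same layout as the listing above (a Constraint: line followed by its Expression: line); the function name controller_constraints is hypothetical.

```shell
#!/bin/bash
# controller_constraints
# Reads a constraint listing on stdin and prints the id of every constraint
# whose rule expression is "osprole eq controller".
controller_constraints() {
    awk '
        /Constraint:/ { id = $2 }    # remember the most recent constraint id
        /Expression: osprole eq controller/ && id != "" { print id }
    '
}

# Example use, assuming the listing was saved to a file first:
# controller_constraints < constraints.txt | while read -r id; do
#     pcs constraint remove "$id"
# done
```

Applied to the listing above, this prints location-nova-compute-checkevacuate-clone-1 and location-nova-compute-clone-1, the two constraints removed manually.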

These constraints do not appear to be added by the Ansible playbook; this is being tracked by Red Hat.
Stonith must be enabled:

[root@controller0 ~]# pcs property set stonith-enabled=true

Because of an issue in fence_evacuate, the following errata must be applied:

  • The issue (BZ #1600602) has been resolved with errata RHBA-2018:2459 with the following package(s): fence-agents-4.0.11-86.el7_5.3, fence-agents-all-4.0.11-86.el7_5.3, fence-agents-common-4.0.11-86.el7_5.3 or later.
  • The issue (BZ #1600600) has been resolved with errata RHBA-2018:2416 with the following package(s): fence-agents-4.0.11-66.el7_4.8, fence-agents-all-4.0.11-66.el7_4.8, fence-agents-common-4.0.11-66.el7_4.8 or later for RHEL 7.4.z releases.

[stack@undercloud ~]$ cd ansible-instanceha
[stack@undercloud ansible-instanceha]$ ansible-playbook -i hosts copy_patched_fence_evacuate_to_controllers.yaml
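
To check whether a controller already has a fixed fence-agents build, the installed version can be compared with the minimum version from the errata. The sketch below uses sort -V from GNU coreutils, which approximates but does not exactly reproduce RPM version ordering, so treat the result as a hint. The function name version_ge is hypothetical.

```shell
#!/bin/bash
# version_ge INSTALLED MINIMUM
# Returns 0 when INSTALLED sorts at or after MINIMUM under "sort -V".
# Assumption: sort -V ordering is close enough to RPM ordering for these
# version-release strings; it is not a full RPM comparison.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use on a RHEL 7.5 controller:
# installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}' fence-agents-common)
# if version_ge "$installed" 4.0.11-86.el7_5.3; then
#     echo "fence-agents already contains the fix"
# fi
```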

Diagnostic Steps

To simulate a compute crash the following command was used:

[root@compute ~]# echo c > /proc/sysrq-trigger

After the crash, Instance HA reboots the compute node. Once it boots, the following cluster resources are stopped on that compute node:

# nova-compute-checkevacuate-clone
# nova-compute-clone

This is the expected behaviour. After checking that the compute node is in good shape to run workloads, clean up the resources:

[root@controller ~]# pcs resource cleanup nova-compute-checkevacuate-clone
[root@controller ~]# pcs resource cleanup nova-compute-clone
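
Before running the cleanup, it can be useful to confirm which nova-compute clones are stopped on the recovered node. The sketch below parses pcs status text in the layout shown under Details in this article; the function name clones_stopped_on is hypothetical, and other pcs versions may format the output differently.

```shell
#!/bin/bash
# clones_stopped_on NODE
# Reads "pcs status" output on stdin and prints each nova-compute clone set
# that lists NODE on its "Stopped: [ ... ]" line.
clones_stopped_on() {
    local node="$1"
    awk -v node="$node" '
        /Clone Set:/ { clone = $3; next }    # e.g. nova-compute-clone
        /Set:/       { clone = ""; next }    # other set types, e.g. Master/Slave
        /Stopped: \[/ && clone ~ /^nova-compute/ {
            # Fields 3..NF-1 are the node names between the brackets.
            for (i = 3; i < NF; i++) if ($i == node) print clone
        }
    '
}

# Example use:
# pcs status | clones_stopped_on compute-0 | while read -r res; do
#     pcs resource cleanup "$res"
# done
```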

Testing steps verification
- Environment: RHOSP10, RHEL 7.5, fence-agents-4.2.1-2.el7, IHA set up using the tripleo-ha-utils repo.
If RHOSP10 is using RHEL 7.4, make sure that the SELinux RPMs are updated in the overcloud.
- Test results: All instances with the flavor property evacuable=true were evacuated. Instances with no property were not evacuated and were left in SHUTOFF state. Instances with the property evacuable=false were not evacuated.
- Procedure: Created 8 instances, 4 with the evacuable=true flavor and 4 without, then hard rebooted one compute. Created 8 instances, 4 with the evacuable=true flavor and 4 with evacuable=false, then hard rebooted one compute.
- Details:

pcs status:

controller-0 | SUCCESS | rc=0 >>
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-0 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Jul 12 21:50:45 2018
Last change: Thu Jul 12 13:31:02 2018 by hacluster via crmd on controller-1

5 nodes configured
46 resources configured

Online: [ controller-0 controller-1 controller-2 ]
RemoteOnline: [ compute-0 compute-1 ]

Full list of resources:

 ip-192.168.24.10   (ocf::heartbeat:IPaddr2):   Started controller-2
 ip-172.17.4.11 (ocf::heartbeat:IPaddr2):   Started controller-0
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
     Stopped: [ compute-0 compute-1 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]
     Stopped: [ compute-0 compute-1 ]
 ip-172.17.1.19 (ocf::heartbeat:IPaddr2):   Started controller-1
 ip-10.0.0.107  (ocf::heartbeat:IPaddr2):   Started controller-2
 ip-172.17.3.15 (ocf::heartbeat:IPaddr2):   Started controller-0
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ controller-0 controller-1 controller-2 ]
     Stopped: [ compute-0 compute-1 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ controller-1 ]
     Slaves: [ controller-0 controller-2 ]
     Stopped: [ compute-0 compute-1 ]
 ip-172.17.1.14 (ocf::heartbeat:IPaddr2):   Started controller-1
 openstack-cinder-volume    (systemd:openstack-cinder-volume):  Started controller-2
 ipmilan-controller-2   (stonith:fence_ipmilan):    Stopped
 ipmilan-controller-1   (stonith:fence_ipmilan):    Stopped
 ipmilan-controller-0   (stonith:fence_ipmilan):    Stopped
 ipmilan-compute-0  (stonith:fence_ipmilan):    Stopped
 ipmilan-compute-1  (stonith:fence_ipmilan):    Stopped
 nova-evacuate  (ocf::openstack:NovaEvacuate):  Started controller-0
 Clone Set: nova-compute-checkevacuate-clone [nova-compute-checkevacuate]
     Started: [ compute-0 compute-1 ]
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: nova-compute-clone [nova-compute]
     Started: [ compute-0 compute-1 ]
     Stopped: [ controller-0 controller-1 controller-2 ]
 fence-nova (stonith:fence_compute):    Started controller-1
 compute-1  (ocf::pacemaker:remote):    Started controller-0
 compute-0  (ocf::pacemaker:remote):    Started controller-1

[stack@undercloud-0 ~]$ openstack flavor show m1.tiny
+----------------------------+---------+
| Field                      | Value   |
+----------------------------+---------+
| OS-FLV-DISABLED:disabled   | False   |
| OS-FLV-EXT-DATA:ephemeral  | 0       |
| access_project_ids         | None    |
| disk                       | 1       |
| id                         | 0       |
| name                       | m1.tiny |
| os-flavor-access:is_public | True    |
| properties                 |         |
| ram                        | 64      |
| rxtx_factor                | 1.0     |
| swap                       |         |
| vcpus                      | 1       |
+----------------------------+---------+

[stack@undercloud-0 ~]$ openstack flavor create --id 1 --vcpus 1 --ram 64 --disk 1 m1.tiny-evac
+----------------------------+--------------+
| Field                      | Value        |
+----------------------------+--------------+
| OS-FLV-DISABLED:disabled   | False        |
| OS-FLV-EXT-DATA:ephemeral  | 0            |
| disk                       | 1            |
| id                         | 1            |
| name                       | m1.tiny-evac |
| os-flavor-access:is_public | True         |
| properties                 |              |
| ram                        | 64           |
| rxtx_factor                | 1.0          |
| swap                       |              |
| vcpus                      | 1            |
+----------------------------+--------------+
[stack@undercloud-0 ~]$ openstack flavor set  m1.tiny-evac --property evacuable=true
[stack@undercloud-0 ~]$ openstack flavor show m1.tiny-evac
+----------------------------+------------------+
| Field                      | Value            |
+----------------------------+------------------+
| OS-FLV-DISABLED:disabled   | False            |
| OS-FLV-EXT-DATA:ephemeral  | 0                |
| access_project_ids         | None             |
| disk                       | 1                |
| id                         | 1                |
| name                       | m1.tiny-evac     |
| os-flavor-access:is_public | True             |
| properties                 | evacuable='true' |
| ram                        | 64               |
| rxtx_factor                | 1.0              |
| swap                       |                  |
| vcpus                      | 1                |
+----------------------------+------------------+

[stack@undercloud-0 ~]$ date;for i in `openstack server list -cID -fvalue`;do openstack server show $i |grep -w 'name \|flavor\|id\|OS-EXT-SRV-ATTR:host\|status';echo'';done
Thu Jul 12 17:48:48 EDT 2018
| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 907657b9-9e02-4972-a642-8c1148b72469                     |
| name                                 | osvm-evac-4                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 45c627f2-bcf9-4826-aa35-9cbe7d328327                     |
| name                                 | osvm-evac-3                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | e4238c23-9fd9-4867-a72d-a34e328430b0                     |
| name                                 | osvm-evac-2                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 728cd54f-0fd1-4c7f-beff-d8abae178a7d                     |
| name                                 | osvm-evac-1                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 03e7bd0e-23b6-441c-88e7-36cbff593280                     |
| name                                 | osvm-4                                                   |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 33d9653e-3e8c-4e38-a5e2-d58d51fdcbeb                     |
| name                                 | osvm-3                                                   |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | ce67f123-1803-4766-ad62-98eada6f1e01                     |
| name                                 | osvm-2                                                   |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 7ce7e96f-5ba1-484c-9b5d-15b11ab9e65b                     |
| name                                 | osvm-1                                                   |
| status                               | ACTIVE                                                   |


kill compute-0 using echo b > /proc/sysrq-trigger
...



[stack@undercloud-0 ~]$ date;for i in `openstack server list -cID -fvalue`;do openstack server show $i |grep -w 'name \|flavor\|id\|OS-EXT-SRV-ATTR:host\|status';echo'';done
Thu Jul 12 18:03:02 EDT 2018
| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 907657b9-9e02-4972-a642-8c1148b72469                     |
| name                                 | osvm-evac-4                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 45c627f2-bcf9-4826-aa35-9cbe7d328327                     |
| name                                 | osvm-evac-3                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | e4238c23-9fd9-4867-a72d-a34e328430b0                     |
| name                                 | osvm-evac-2                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 728cd54f-0fd1-4c7f-beff-d8abae178a7d                     |
| name                                 | osvm-evac-1                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 03e7bd0e-23b6-441c-88e7-36cbff593280                     |
| name                                 | osvm-4                                                   |
| status                               | SHUTOFF                                                  |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 33d9653e-3e8c-4e38-a5e2-d58d51fdcbeb                     |
| name                                 | osvm-3                                                   |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | ce67f123-1803-4766-ad62-98eada6f1e01                     |
| name                                 | osvm-2                                                   |
| status                               | SHUTOFF                                                  |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 7ce7e96f-5ba1-484c-9b5d-15b11ab9e65b                     |
| name                                 | osvm-1                                                   |
| status                               | ACTIVE                                                   |



[stack@undercloud-0 ~]$ openstack flavor set  m1.tiny --property evacuable=false
[stack@undercloud-0 ~]$ date;for i in `openstack server list -cID -fvalue`;do openstack server show $i |grep -w 'name \|flavor\|id\|OS-EXT-SRV-ATTR:host\|status';echo'';done
Thu Jul 12 18:25:24 EDT 2018
| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 907657b9-9e02-4972-a642-8c1148b72469                     |
| name                                 | osvm-evac-4                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 45c627f2-bcf9-4826-aa35-9cbe7d328327                     |
| name                                 | osvm-evac-3                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | e4238c23-9fd9-4867-a72d-a34e328430b0                     |
| name                                 | osvm-evac-2                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 728cd54f-0fd1-4c7f-beff-d8abae178a7d                     |
| name                                 | osvm-evac-1                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 03e7bd0e-23b6-441c-88e7-36cbff593280                     |
| name                                 | osvm-4                                                   |
| status                               | SHUTOFF                                                  |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 33d9653e-3e8c-4e38-a5e2-d58d51fdcbeb                     |
| name                                 | osvm-3                                                   |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | ce67f123-1803-4766-ad62-98eada6f1e01                     |
| name                                 | osvm-2                                                   |
| status                               | SHUTOFF                                                  |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 7ce7e96f-5ba1-484c-9b5d-15b11ab9e65b                     |
| name                                 | osvm-1                                                   |
| status                               | ACTIVE                                                   |

kill compute-1 via echo b>/proc/sysrq-trigger
...


[stack@undercloud-0 ~]$ date;for i in `openstack server list -cID -fvalue`;do openstack server show $i |grep -w 'name \|flavor\|id\|OS-EXT-SRV-ATTR:host\|status';echo'';done
Thu Jul 12 18:30:06 EDT 2018
| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 907657b9-9e02-4972-a642-8c1148b72469                     |
| name                                 | osvm-evac-4                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 45c627f2-bcf9-4826-aa35-9cbe7d328327                     |
| name                                 | osvm-evac-3                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | e4238c23-9fd9-4867-a72d-a34e328430b0                     |
| name                                 | osvm-evac-2                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny-evac (1)                                         |
| id                                   | 728cd54f-0fd1-4c7f-beff-d8abae178a7d                     |
| name                                 | osvm-evac-1                                              |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 03e7bd0e-23b6-441c-88e7-36cbff593280                     |
| name                                 | osvm-4                                                   |
| status                               | SHUTOFF                                                  |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 33d9653e-3e8c-4e38-a5e2-d58d51fdcbeb                     |
| name                                 | osvm-3                                                   |
| status                               | ACTIVE                                                   |

| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | ce67f123-1803-4766-ad62-98eada6f1e01                     |
| name                                 | osvm-2                                                   |
| status                               | SHUTOFF                                                  |

| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| flavor                               | m1.tiny (0)                                              |
| id                                   | 7ce7e96f-5ba1-484c-9b5d-15b11ab9e65b                     |
| name                                 | osvm-1                                                   |
| status                               | ACTIVE                                                   |
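
The per-server listings above are easier to compare before and after a crash when each server is reduced to one line. The following sketch does that for text in the same layout as the listings; the function name summarize_servers is hypothetical, and it assumes the status row is the last row of each server block, as it is here.

```shell
#!/bin/bash
# summarize_servers
# Reads the "| field | value |" rows produced by the loop above on stdin and
# prints one line per server: name, flavor, host, status.
summarize_servers() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        NF >= 3 {
            key = trim($2); val = trim($3)
            if (key == "OS-EXT-SRV-ATTR:host") host = val
            else if (key == "flavor") { sub(/ \(.*\)$/, "", val); flavor = val }
            else if (key == "name") name = val
            else if (key == "status") print name, flavor, host, val
        }
    '
}

# Example use, with the listing saved to a file before and after the crash:
# summarize_servers < before.txt > before.sum
# summarize_servers < after.txt  > after.sum
# diff before.sum after.sum
```

Lines that change only in the host column are evacuated instances; lines that change to SHUTOFF on the failed host are instances that were not evacuated.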
