Chapter 8. Operational Considerations

8.1. Configuration Updates

The procedure for applying OpenStack configuration changes to the nodes described in this reference architecture does not differ from the procedure for non-hyper-converged nodes deployed by Red Hat OpenStack Platform director. To apply an OpenStack configuration change, follow the procedure described in section 7.7, Modifying the Overcloud Environment, of the Director Installation and Usage documentation. As stated in that documentation, the same Heat templates used for the initial deployment must be passed as arguments to the openstack overcloud deploy command.
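
For example, after editing one of the custom Heat environment files, the change is applied by re-running the deployment command used throughout this reference architecture (the same command appears again in Section 8.2.1 below):

# Re-running the deploy command with the same templates and environment
# files starts a stack update that applies the configuration change.
openstack overcloud deploy --templates \
-r ~/custom-templates/custom-roles.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
-e ~/custom-templates/network.yaml \
-e ~/custom-templates/ceph.yaml \
-e ~/custom-templates/layout.yaml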

8.2. Adding Compute/Red Hat Ceph Storage Nodes

This section describes how to add additional hyper-converged nodes to an existing hyper-converged deployment that was configured as described earlier in this reference architecture.

8.2.1. Use Red Hat OpenStack Platform director to add a new Nova Compute / Ceph OSD Node

  1. Create a new JSON file

Create a new JSON file describing the new nodes to be added. For example, if adding a server in a rack in slot U35, then a file like u35.json may contain the following:

 {
  "nodes": [
     {
         "pm_password": "PASSWORD",
         "name": "r730xd_u35",
         "pm_user": "root",
         "pm_addr": "10.19.136.28",
         "pm_type": "pxe_ipmitool",
         "mac": [
             "ec:f4:bb:ed:6f:e4"
         ],
         "arch": "x86_64",
	 "capabilities": "node:osd-compute-3,boot_option:local"
     }
  ]
}
  2. Import the new JSON file into Ironic
 openstack baremetal import u35.json
  3. Observe that the new node was added

For example, the server in U35 was assigned the ID 7250678a-a575-4159-840a-e7214e697165.

[stack@hci-director scale]$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provision State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| a94b75e3-369f-4b2d-b8cc-8ab272e23e89 | None | 629f3b1f-319f-4df7-8df1-0a9828f2f2f8 | power on    | active          | False       |
| 7ace7b2b-b549-414f-b83e-5f90299b4af3 | None | 4b354355-336d-44f2-9def-27c54cbcc4f5 | power on    | active          | False       |
| 8be1d83c-19cb-4605-b91d-928df163b513 | None | 29124fbb-ee1d-4322-a504-a1a190022f4e | power on    | active          | False       |
| e8411659-bc2b-4178-b66f-87098a1e6920 | None | 93199972-51ff-4405-979c-3c4aabdee7ce | power on    | active          | False       |
| 04679897-12e9-4637-9998-af8bee30b414 | None | e7578d80-0376-4df5-bbff-d4ac02eb1254 | power on    | active          | False       |
| 48b4987d-e778-48e1-ba74-88a08edf7719 | None | 586a5ef3-d530-47de-8ec0-8c98b30f880c | power on    | active          | False       |
| 7250678a-a575-4159-840a-e7214e697165 | None | None                                 | None        | available       | False       |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
[stack@hci-director scale]$
  4. Set the new server in maintenance mode

Maintenance mode prevents the server from being claimed for another purpose, for example by another cloud operator adding an additional node at the same time.

 ironic node-set-maintenance 7250678a-a575-4159-840a-e7214e697165 true
  5. Introspect the new hardware
 openstack baremetal introspection start 7250678a-a575-4159-840a-e7214e697165
  6. Verify that introspection is complete

The previous step takes time. The following command shows the status of the introspection.

[stack@hci-director ~]$  openstack baremetal introspection bulk status
+--------------------------------------+----------+-------+
| Node UUID                            | Finished | Error |
+--------------------------------------+----------+-------+
| a94b75e3-369f-4b2d-b8cc-8ab272e23e89 | True     | None  |
| 7ace7b2b-b549-414f-b83e-5f90299b4af3 | True     | None  |
| 8be1d83c-19cb-4605-b91d-928df163b513 | True     | None  |
| e8411659-bc2b-4178-b66f-87098a1e6920 | True     | None  |
| 04679897-12e9-4637-9998-af8bee30b414 | True     | None  |
| 48b4987d-e778-48e1-ba74-88a08edf7719 | True     | None  |
| 7250678a-a575-4159-840a-e7214e697165 | True     | None  |
+--------------------------------------+----------+-------+
[stack@hci-director ~]$
  7. Remove the new server from maintenance mode

This step is necessary so that the Red Hat OpenStack Platform director Nova scheduler can select the new node when scaling the number of Compute nodes.

 ironic node-set-maintenance 7250678a-a575-4159-840a-e7214e697165 false
  8. Assign the kernel and ramdisk of the full overcloud image to the new node
 openstack baremetal configure boot

The IDs of the kernel and ramdisk that were assigned to the new node can be seen with the following command:

[stack@hci-director ~]$ ironic node-show 7250678a-a575-4159-840a-e7214e697165 | grep deploy_
| driver_info            | {u'deploy_kernel': u'e03c5677-2216-4120-95ad-b4354554a590',              |
|                        | u'ipmi_password': u'******', u'deploy_ramdisk': u'2c5957bd-              |
|                        | u'deploy_key': u'H3O1D1ETXCSSBDUMJY5YCCUFG12DJN0G', u'configdrive': u'H4 |
[stack@hci-director ~]$

The deploy_kernel and deploy_ramdisk IDs can be checked against the images in Glance. In the following example, the bm-deploy-kernel and bm-deploy-ramdisk images are the ones assigned from the Glance database.

[stack@hci-director ~]$ openstack image list
+--------------------------------------+------------------------+--------+
| ID                                   | Name                   | Status |
+--------------------------------------+------------------------+--------+
| f7dce3db-3bbf-4670-8296-fa59492276c5 | bm-deploy-ramdisk      | active |
| 9b73446a-2c31-4672-a3e7-b189e105b2f9 | bm-deploy-kernel       | active |
| 653f9c4c-8afc-4320-b185-5eb1f5ecb7aa | overcloud-full         | active |
| 714b5f55-e64b-4968-a307-ff609cbcce6c | overcloud-full-initrd  | active |
| b9b62ec3-bfdb-43f7-887f-79fb79dcacc0 | overcloud-full-vmlinuz | active |
+--------------------------------------+------------------------+--------+
[stack@hci-director ~]$
  9. Update the appropriate Heat template to scale the OsdCompute node

Update ~/custom-templates/layout.yaml: change OsdComputeCount from 3 to 4 and add a new IP address on each isolated network for the new OsdCompute node (a quick verification sketch appears at the end of this step). For example, change the following:

  OsdComputeIPs:
    internal_api:
      - 192.168.2.203
      - 192.168.2.204
      - 192.168.2.205
    tenant:
      - 192.168.3.203
      - 192.168.3.204
      - 192.168.3.205
    storage:
      - 172.16.1.203
      - 172.16.1.204
      - 172.16.1.205
    storage_mgmt:
      - 172.16.2.203
      - 172.16.2.204
      - 172.16.2.205

so that a .206 IP address is added on each network, as in the following:

  OsdComputeIPs:
    internal_api:
      - 192.168.2.203
      - 192.168.2.204
      - 192.168.2.205
      - 192.168.2.206
    tenant:
      - 192.168.3.203
      - 192.168.3.204
      - 192.168.3.205
      - 192.168.3.206
    storage:
      - 172.16.1.203
      - 172.16.1.204
      - 172.16.1.205
      - 172.16.1.206
    storage_mgmt:
      - 172.16.2.203
      - 172.16.2.204
      - 172.16.2.205
      - 172.16.2.206

See Section 5.5.3, “Configure scheduler hints to control node placement and IP assignment” for more information about the ~/custom-templates/layout.yaml file.
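
Before re-running the deployment, a quick check that both edits are in place can prevent a failed stack update; a minimal sketch, assuming the file path and address plan shown above:

# Expect OsdComputeCount to be 4 and each of the four isolated networks to
# list a new .206 address (so the second command should print 4).
grep -n 'OsdComputeCount' ~/custom-templates/layout.yaml
grep -c '\.206$' ~/custom-templates/layout.yaml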

  10. Apply the overcloud update

Use the same command that was used to deploy the overcloud to update the overcloud so that the changes made in the previous step are applied.

openstack overcloud deploy --templates \
-r ~/custom-templates/custom-roles.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
-e ~/custom-templates/network.yaml \
-e ~/custom-templates/ceph.yaml \
-e ~/custom-templates/layout.yaml
  11. Verify that the new OsdCompute node was added correctly

Use openstack server list to verify that the new OsdCompute node was added and is available. In the example below, the new node, overcloud-osd-compute-3, is listed as ACTIVE.

[stack@hci-director ~]$ openstack server list
+--------------------------------------+-------------------------+--------+-----------------------+----------------+
| ID                                   | Name                    | Status | Networks              | Image Name     |
+--------------------------------------+-------------------------+--------+-----------------------+----------------+
| fc8686c1-a675-4c89-a508-cc1b34d5d220 | overcloud-controller-2  | ACTIVE | ctlplane=192.168.1.37 | overcloud-full |
| 7c6ae5f3-7e18-4aa2-a1f8-53145647a3de | overcloud-osd-compute-2 | ACTIVE | ctlplane=192.168.1.30 | overcloud-full |
| 851f76db-427c-42b3-8e0b-e8b4b19770f8 | overcloud-controller-0  | ACTIVE | ctlplane=192.168.1.33 | overcloud-full |
| e2906507-6a06-4c4d-bd15-9f7de455e91d | overcloud-controller-1  | ACTIVE | ctlplane=192.168.1.29 | overcloud-full |
| 0f93a712-b9eb-4f42-bc05-f2c8c2edfd81 | overcloud-osd-compute-0 | ACTIVE | ctlplane=192.168.1.32 | overcloud-full |
| 8f266c17-ff39-422e-a935-effb219c7782 | overcloud-osd-compute-1 | ACTIVE | ctlplane=192.168.1.24 | overcloud-full |
| 5fa641cf-b290-4a2a-b15e-494ab9d10d8a | overcloud-osd-compute-3 | ACTIVE | ctlplane=192.168.1.21 | overcloud-full |
+--------------------------------------+-------------------------+--------+-----------------------+----------------+
[stack@hci-director ~]$

The new Compute/Ceph Storage Node has been added to the overcloud.

8.3. Removing Compute/Red Hat Ceph Storage Nodes

This section describes how to remove an OsdCompute node from an existing hyper-converged deployment that was configured as described earlier in this reference architecture.

Before reducing the compute and storage resources of a hyper-converged overcloud, verify that the remaining nodes will still have enough CPU and RAM to service the compute workloads, and migrate the compute workloads off the node to be removed. Also verify that the Ceph cluster has the reserve storage capacity necessary to maintain a health status of HEALTH_OK without the Red Hat Ceph Storage node to be removed.
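
A rough way to confirm both points is sketched below; run ceph df from one of the Controller / Ceph Monitor nodes and nova hypervisor-stats with overcloud credentials:

# Free vCPUs and RAM across the overcloud; note that the totals still
# include the node that is about to be removed.
source ~/overcloudrc
nova hypervisor-stats
# Raw and per-pool Ceph usage, to judge whether the remaining OSDs can
# absorb the data from the node being removed.
ceph df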

8.3.1. Remove the Ceph Storage Node

At the time of writing, Red Hat OpenStack Platform director does not support the automated removal of a Red Hat Ceph Storage node, so the steps in this section must be performed manually from one of the OpenStack Controller / Ceph Monitor nodes, unless otherwise indicated.

  1. Verify that the ceph health command does not produce any "near full" warnings
[root@overcloud-controller-0 ~]# ceph health
HEALTH_OK
[root@overcloud-controller-0 ~]#
Warning

If the ceph health command reports that the cluster is near full, as in the example below, then removing the OSDs could cause the cluster to reach or exceed its full ratio, which could result in data loss. If this is the case, contact Red Hat before proceeding to discuss options for removing the Red Hat Ceph Storage node without data loss.

HEALTH_WARN 1 nearfull osds
osd.2 is near full at 85%
  2. Determine the OSD numbers of the OsdCompute node to be removed

In the example below, overcloud-osd-compute-3 will be removed, and the ceph osd tree command shows that its OSD numbers are 0 through 44 counting by fours.

[root@overcloud-controller-0 ~]# ceph osd tree
ID WEIGHT   TYPE NAME                        UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 52.37256 root default
-2 13.09314     host overcloud-osd-compute-3
 0  1.09109         osd.0                         up  1.00000          1.00000
 4  1.09109         osd.4                         up  1.00000          1.00000
 8  1.09109         osd.8                         up  1.00000          1.00000
12  1.09109         osd.12                        up  1.00000          1.00000
16  1.09109         osd.16                        up  1.00000          1.00000
20  1.09109         osd.20                        up  1.00000          1.00000
24  1.09109         osd.24                        up  1.00000          1.00000
28  1.09109         osd.28                        up  1.00000          1.00000
32  1.09109         osd.32                        up  1.00000          1.00000
36  1.09109         osd.36                        up  1.00000          1.00000
40  1.09109         osd.40                        up  1.00000          1.00000
44  1.09109         osd.44                        up  1.00000          1.00000
...
  3. Start a process to monitor the Ceph cluster

In a separate terminal, run the ceph -w command. This command is used to monitor the health of the Ceph cluster during OSD removal. Once started, its output is similar to the following:

[root@overcloud-controller-0 ~]# ceph -w
    cluster eb2bb192-b1c9-11e6-9205-525400330666
     health HEALTH_OK
     monmap e2: 3 mons at {overcloud-controller-0=172.16.1.200:6789/0,overcloud-controller-1=172.16.1.201:6789/0,overcloud-controller-2=172.16.1.202:6789/0}
            election epoch 8, quorum 0,1,2 overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
     osdmap e139: 48 osds: 48 up, 48 in
            flags sortbitwise
      pgmap v106106: 1344 pgs, 6 pools, 11080 MB data, 4140 objects
            35416 MB used, 53594 GB / 53628 GB avail
                1344 active+clean

2016-11-29 02:13:17.058468 mon.0 [INF] pgmap v106106: 1344 pgs: 1344 active+clean; 11080 MB data, 35416 MB used, 53594 GB / 53628 GB avail
2016-11-29 02:15:03.674380 mon.0 [INF] pgmap v106107: 1344 pgs: 1344 active+clean; 11080 MB data, 35416 MB used, 53594 GB / 53628 GB avail
...
  4. Mark OSDs of the node to be removed as out

Use the ceph osd out <NUM> command to mark all twelve OSDs hosted by the overcloud-osd-compute-3 node as out of the Ceph cluster. Allow time between each OSD removal so that the cluster can finish the previous action before proceeding; this may be achieved by using a sleep statement. A script like the following, which uses seq to count from 0 to 44 by fours, may be used:

for i in $(seq 0 4 44); do
    ceph osd out $i;
    sleep 10;
done

Before running the above script, note the output of ceph osd stat with all OSDs up and in.

[root@overcloud-controller-0 ~]# ceph osd stat
     osdmap e173: 48 osds: 48 up, 48 in
            flags sortbitwise
[root@overcloud-controller-0 ~]#

The results of running the script above should look as follows:

[root@overcloud-controller-0 ~]# for i in $(seq 0 4 44); do ceph osd out $i; sleep 10; done
marked out osd.0.
marked out osd.4.
marked out osd.8.
marked out osd.12.
marked out osd.16.
marked out osd.20.
marked out osd.24.
marked out osd.28.
marked out osd.32.
marked out osd.36.
marked out osd.40.
marked out osd.44.
[root@overcloud-controller-0 ~]#

After the OSDs are marked as out, the output of the ceph osd stat command should show that twelve of the OSDs are no longer in but still up.

[root@overcloud-controller-0 ~]# ceph osd stat
     osdmap e217: 48 osds: 48 up, 36 in
            flags sortbitwise
[root@overcloud-controller-0 ~]#
  5. Wait for all of the placement groups to become active and clean

The removal of the OSDs causes Ceph to rebalance the cluster by migrating placement groups to other OSDs. The ceph -w command started in step 3 shows the placement group states as they change from active+clean, through states with degraded or misplaced objects, and finally back to active+clean when the migration completes.

An example of the output of the ceph -w command started in step 3 as it changes looks like the following:

2016-11-29 02:16:06.372846 mon.2 [INF] from='client.? 172.16.1.200:0/1977099347' entity='client.admin' cmd=[{"prefix": "osd out", "ids": ["0"]}]: dispatch
...
2016-11-29 02:16:07.624668 mon.0 [INF] osdmap e141: 48 osds: 48 up, 47 in
2016-11-29 02:16:07.714072 mon.0 [INF] pgmap v106111: 1344 pgs: 8 remapped+peering, 1336 active+clean; 11080 MB data, 34629 MB used, 52477 GB / 52511 GB avail
2016-11-29 02:16:07.624952 osd.46 [INF] 1.8e starting backfill to osd.2 from (0'0,0'0] MAX to 139'24162
2016-11-29 02:16:07.625000 osd.2 [INF] 1.ef starting backfill to osd.16 from (0'0,0'0] MAX to 139'17958
2016-11-29 02:16:07.625226 osd.46 [INF] 1.76 starting backfill to osd.25 from (0'0,0'0] MAX to 139'37918
2016-11-29 02:16:07.626074 osd.46 [INF] 1.8e starting backfill to osd.15 from (0'0,0'0] MAX to 139'24162
2016-11-29 02:16:07.626550 osd.21 [INF] 1.ff starting backfill to osd.46 from (0'0,0'0] MAX to 139'21304
2016-11-29 02:16:07.627698 osd.46 [INF] 1.32 starting backfill to osd.33 from (0'0,0'0] MAX to 139'24962
2016-11-29 02:16:08.682724 osd.45 [INF] 1.60 starting backfill to osd.16 from (0'0,0'0] MAX to 139'8346
2016-11-29 02:16:08.696306 mon.0 [INF] osdmap e142: 48 osds: 48 up, 47 in
2016-11-29 02:16:08.738872 mon.0 [INF] pgmap v106112: 1344 pgs: 6 peering, 9 remapped+peering, 1329 active+clean; 11080 MB data, 34629 MB used, 52477 GB / 52511 GB avail
2016-11-29 02:16:09.850909 mon.0 [INF] osdmap e143: 48 osds: 48 up, 47 in
...
2016-11-29 02:18:10.838365 mon.0 [INF] pgmap v106256: 1344 pgs: 7 activating, 1 active+recovering+degraded, 7 activating+degraded, 9 active+degraded, 70 peering, 1223 active+clean, 8 active+remapped, 19 remapped+peering; 11080 MB data, 33187 MB used, 40189 GB / 40221 GB avail; 167/12590 objects degraded (1.326%); 80/12590 objects misplaced (0.635%); 11031 kB/s, 249 objects/s recovering
...

Output like the above continues as the Ceph cluster rebalances data, and the cluster eventually returns to a health status of HEALTH_OK.
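
If watching ceph -w is not practical, a simple polling loop may be used instead; a minimal sketch, with an arbitrary 30-second interval:

# Poll until the cluster reports HEALTH_OK again after the rebalance.
while ! ceph health | grep -q HEALTH_OK; do
    echo "$(date): $(ceph health)"
    sleep 30
done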

  6. Verify that the cluster has returned to health status HEALTH_OK
[root@overcloud-controller-0 ~]# ceph -s
    cluster eb2bb192-b1c9-11e6-9205-525400330666
     health HEALTH_OK
     monmap e2: 3 mons at {overcloud-controller-0=172.16.1.200:6789/0,overcloud-controller-1=172.16.1.201:6789/0,overcloud-controller-2=172.16.1.202:6789/0}
            election epoch 8, quorum 0,1,2 overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
     osdmap e217: 48 osds: 48 up, 36 in
            flags sortbitwise
      pgmap v106587: 1344 pgs, 6 pools, 11080 MB data, 4140 objects
            35093 MB used, 40187 GB / 40221 GB avail
                1344 active+clean
[root@overcloud-controller-0 ~]#
  7. Stop the OSD Daemons on the node being removed

From the Red Hat OpenStack Platform director server, ssh into the node that is being removed and run systemctl stop ceph-osd.target to stop all OSDs.
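
A minimal sketch of doing this in one step from the director node, assuming the default heat-admin overcloud user and substituting the node's ctlplane IP from openstack server list:

# Stop all OSD daemons on the node being removed (run as the stack user).
ssh heat-admin@<osd-compute-3-ctlplane-IP> "sudo systemctl stop ceph-osd.target"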

Note how the output of ceph osd stat changes after the systemctl command is run; the number of up OSDs changes from 48 to 36.

[root@overcloud-osd-compute-3 ~]# ceph osd stat
     osdmap e217: 48 osds: 48 up, 36 in
            flags sortbitwise
[root@overcloud-osd-compute-3 ~]# systemctl stop ceph-osd.target
[root@overcloud-osd-compute-3 ~]# ceph osd stat
     osdmap e218: 48 osds: 36 up, 36 in
            flags sortbitwise
[root@overcloud-osd-compute-3 ~]#

Be sure to run systemctl stop ceph-osd.target on the node that hosts the OSDs being removed; in this case, the OSDs from overcloud-osd-compute-3 are being removed, so the command is run on overcloud-osd-compute-3.

  8. Remove the OSDs

The script below does the following for each OSD:

  • Removes the OSD from the CRUSH map so that it no longer receives data
  • Removes the OSD authentication key
  • Removes the OSD
for i in $(seq 0 4 44); do
    ceph osd crush remove osd.$i
    sleep 10
    ceph auth del osd.$i
    sleep 10
    ceph osd rm $i
    sleep 10
done

Before removing the OSDs, note that they are in the CRUSH map for the Ceph storage node to be removed.

[root@overcloud-controller-0 ~]# ceph osd crush tree | grep overcloud-osd-compute-3 -A 20
                "name": "overcloud-osd-compute-3",
                "type": "host",
                "type_id": 1,
                "items": [
                    {
                        "id": 0,
                        "name": "osd.0",
                        "type": "osd",
                        "type_id": 0,
                        "crush_weight": 1.091095,
                        "depth": 2
                    },
                    {
                        "id": 4,
                        "name": "osd.4",
                        "type": "osd",
                        "type_id": 0,
                        "crush_weight": 1.091095,
                        "depth": 2
                    },
                    {
[root@overcloud-controller-0 ~]#

When the script above is executed, the output looks like the following:

[root@overcloud-osd-compute-3 ~]# for i in $(seq 0 4 44); do
>     ceph osd crush remove osd.$i
>     sleep 10
>     ceph auth del osd.$i
>     sleep 10
>     ceph osd rm $i
>     sleep 10
> done
removed item id 0 name 'osd.0' from crush map
updated
removed osd.0
removed item id 4 name 'osd.4' from crush map
updated
removed osd.4
removed item id 8 name 'osd.8' from crush map
updated
removed osd.8
removed item id 12 name 'osd.12' from crush map
updated
removed osd.12
removed item id 16 name 'osd.16' from crush map
updated
removed osd.16
removed item id 20 name 'osd.20' from crush map
updated
removed osd.20
removed item id 24 name 'osd.24' from crush map
updated
removed osd.24
removed item id 28 name 'osd.28' from crush map
updated
removed osd.28
removed item id 32 name 'osd.32' from crush map
updated
removed osd.32
removed item id 36 name 'osd.36' from crush map
updated
removed osd.36
removed item id 40 name 'osd.40' from crush map
updated
removed osd.40
removed item id 44 name 'osd.44' from crush map
updated
removed osd.44
[root@overcloud-osd-compute-3 ~]#

The ceph osd stat command should now report that there are only 36 OSDs.

[root@overcloud-controller-0 ~]# ceph osd stat
     osdmap e300: 36 osds: 36 up, 36 in
            flags sortbitwise
[root@overcloud-controller-0 ~]#

When an OSD is removed from the CRUSH map, CRUSH recomputes which OSDs get the placement groups, and data re-balances accordingly. The CRUSH map may be checked after the OSDs are removed to verify that the update completed.

Observe that overcloud-osd-compute-3 has no OSDs:

[root@overcloud-controller-0 ~]# ceph osd crush tree | grep overcloud-osd-compute-3 -A 5
                "name": "overcloud-osd-compute-3",
                "type": "host",
                "type_id": 1,
                "items": []
            },
            {
[root@overcloud-controller-0 ~]#

8.3.2. Remove the Node from the Overcloud

Though the OSDs on overcloud-osd-compute-3 are no longer members of the Ceph cluster, its Nova compute services are still functioning and will be removed in this subsection. The hardware will be shut off, and the overcloud Heat stack will no longer keep track of the node. All of the steps should be carried out as the stack user on the Red Hat OpenStack Platform director system unless otherwise noted.

Before following this procedure, migrate any instances running on the compute node that will be removed to another compute node.
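
A minimal sketch of evacuating the node with live migration is shown below; whether live migration is possible depends on the workload and storage configuration, and the host name is the one used in this example:

# Live-migrate every instance off the node being removed.
source ~/overcloudrc
nova host-evacuate-live overcloud-osd-compute-3.localdomain
# Alternatively, migrate instances one at a time:
# nova live-migration <instance-uuid>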

  1. Authenticate to the overcloud
 source ~/overcloudrc
  2. Check the status of the compute node that is going to be removed

For example, overcloud-osd-compute-3 will be removed:

[stack@hci-director ~]$ nova service-list | grep compute-3
| 145 | nova-compute     | overcloud-osd-compute-3.localdomain | nova     | enabled | up    | 2016-11-29T03:40:32.000000 | -               |
[stack@hci-director ~]$
  3. Disable the compute node’s service so that no new instances are scheduled on it
[stack@hci-director ~]$ nova service-disable overcloud-osd-compute-3.localdomain  nova-compute
+-------------------------------------+--------------+----------+
| Host                                | Binary       | Status   |
+-------------------------------------+--------------+----------+
| overcloud-osd-compute-3.localdomain | nova-compute | disabled |
+-------------------------------------+--------------+----------+
[stack@hci-director ~]$
  4. Authenticate to the undercloud
 source ~/stackrc
  5. Identify the Nova ID of the OsdCompute node to be removed
[stack@hci-director ~]$ openstack server list | grep osd-compute-3
| 6b2a2e71-f9c8-4d5b-aaf8-dada97c90821 | overcloud-osd-compute-3 | ACTIVE | ctlplane=192.168.1.27 | overcloud-full |
[stack@hci-director ~]$

In the following example, the Nova ID is extracted with awk and egrep and set to the variable $nova_id:

[stack@hci-director ~]$ nova_id=$(openstack server list | grep compute-3 | awk {'print $2'} | egrep -vi 'id|^$')
[stack@hci-director ~]$ echo $nova_id
6b2a2e71-f9c8-4d5b-aaf8-dada97c90821
[stack@hci-director ~]$
  6. Start a Mistral workflow to delete the node by UUID from the stack by name
[stack@hci-director ~]$ time openstack overcloud node delete --stack overcloud $nova_id
deleting nodes [u'6b2a2e71-f9c8-4d5b-aaf8-dada97c90821'] from stack overcloud
Started Mistral Workflow. Execution ID: 396f123d-df5b-4f37-b137-83d33969b52b

real    1m50.662s
user    0m0.563s
sys     0m0.099s
[stack@hci-director ~]$

In the above example, the stack from which to delete the node must be identified by name, "overcloud", instead of by its UUID. However, it will be possible to supply either the UUID or the name after Red Hat Bugzilla 1399429 is resolved. When deleting a node, it is no longer necessary to pass the Heat environment files with the -e option.

As shown by the time command output, the request to delete the node is accepted quickly. However, the Mistral workflow and Heat stack update continue to run in the background while the compute node is removed.

[stack@hci-director ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+--------------------+----------------------+----------------------+
| id                                   | stack_name | stack_status       | creation_time        | updated_time         |
+--------------------------------------+------------+--------------------+----------------------+----------------------+
| 23e7c364-7303-4af6-b54d-cfbf1b737680 | overcloud  | UPDATE_IN_PROGRESS | 2016-11-24T03:24:56Z | 2016-11-30T17:16:48Z |
+--------------------------------------+------------+--------------------+----------------------+----------------------+
[stack@hci-director ~]$

Confirm that Heat has finished updating the overcloud.

[stack@hci-director ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+----------------------+
| id                                   | stack_name | stack_status    | creation_time        | updated_time         |
+--------------------------------------+------------+-----------------+----------------------+----------------------+
| 23e7c364-7303-4af6-b54d-cfbf1b737680 | overcloud  | UPDATE_COMPLETE | 2016-11-24T03:24:56Z | 2016-11-30T17:16:48Z |
+--------------------------------------+------------+-----------------+----------------------+----------------------+
[stack@hci-director ~]$
  7. Observe that the node was deleted as desired.

In the example below, overcloud-osd-compute-3 is not included in the openstack server list output.

[stack@hci-director ~]$ openstack server list
+-------------------------+-------------------------+--------+-----------------------+----------------+
| ID                      | Name                    | Status | Networks              | Image Name     |
+-------------------------+-------------------------+--------+-----------------------+----------------+
| fc8686c1-a675-4c89-a508 | overcloud-controller-2  | ACTIVE | ctlplane=192.168.1.37 | overcloud-full |
| -cc1b34d5d220           |                         |        |                       |                |
| 7c6ae5f3-7e18-4aa2-a1f8 | overcloud-osd-compute-2 | ACTIVE | ctlplane=192.168.1.30 | overcloud-full |
| -53145647a3de           |                         |        |                       |                |
| 851f76db-427c-42b3      | overcloud-controller-0  | ACTIVE | ctlplane=192.168.1.33 | overcloud-full |
| -8e0b-e8b4b19770f8      |                         |        |                       |                |
| e2906507-6a06-4c4d-     | overcloud-controller-1  | ACTIVE | ctlplane=192.168.1.29 | overcloud-full |
| bd15-9f7de455e91d       |                         |        |                       |                |
| 0f93a712-b9eb-          | overcloud-osd-compute-0 | ACTIVE | ctlplane=192.168.1.32 | overcloud-full |
| 4f42-bc05-f2c8c2edfd81  |                         |        |                       |                |
| 8f266c17-ff39-422e-a935 | overcloud-osd-compute-1 | ACTIVE | ctlplane=192.168.1.24 | overcloud-full |
| -effb219c7782           |                         |        |                       |                |
+-------------------------+-------------------------+--------+-----------------------+----------------+
[stack@hci-director ~]$
  8. Confirm that Ironic has turned off the hardware that ran the converged Compute/OSD services, and that it is available for other purposes.
[stack@hci-director ~]$ openstack baremetal node list
+-------------------+-------------+-------------------+-------------+--------------------+-------------+
| UUID              | Name        | Instance UUID     | Power State | Provisioning State | Maintenance |
+-------------------+-------------+-------------------+-------------+--------------------+-------------+
| c6498849-d8d8-404 | m630_slot13 | 851f76db-427c-    | power on    | active             | False       |
| 2-aa1c-           |             | 42b3-8e0b-        |             |                    |             |
| aa62ec2df17e      |             | e8b4b19770f8      |             |                    |             |
| a8b2e3b9-c62b-496 | m630_slot14 | e2906507-6a06     | power on    | active             | False       |
| 5-8a3d-           |             | -4c4d-            |             |                    |             |
| c4e7743ae78b      |             | bd15-9f7de455e91d |             |                    |             |
| f2d30a3a-8c74     | m630_slot15 | fc8686c1-a675-4c8 | power on    | active             | False       |
| -4fbf-afaa-       |             | 9-a508-cc1b34d5d2 |             |                    |             |
| fb666af55dfc      |             | 20                |             |                    |             |
| 8357d7b0-bd62-4b7 | r730xd_u29  | 0f93a712-b9eb-4f4 | power on    | active             | False       |
| 9-91f9-52c2a50985 |             | 2-bc05-f2c8c2edfd |             |                    |             |
| d9                |             | 81                |             |                    |             |
| fc6efdcb-ae5f-    | r730xd_u31  | 8f266c17-ff39-422 | power on    | active             | False       |
| 431d-             |             | e-a935-effb219c77 |             |                    |             |
| adf1-4dd034b4a0d3 |             | 82                |             |                    |             |
| 73d19120-6c93     | r730xd_u33  | 7c6ae5f3-7e18-4aa | power on    | active             | False       |
| -4f1b-ad1f-       |             | 2-a1f8-53145647a3 |             |                    |             |
| 4cce5913ba76      |             | de                |             |                    |             |
| a0b8b537-0975-406 | r730xd_u35  | None              | power off   | available          | False       |
| b-a346-e361464fd1 |             |                   |             |                    |             |
| e3                |             |                   |             |                    |             |
+-------------------+-------------+-------------------+-------------+--------------------+-------------+
[stack@hci-director ~]$

In the above, the server r730xd_u35 is powered off and available.

  9. Check the status of the removed node’s compute service in the overcloud

Authenticate back to the overcloud and observe the state of the nova-compute service offered by overcloud-osd-compute-3:

[stack@hci-director ~]$ source ~/overcloudrc
[stack@hci-director ~]$ nova service-list | grep osd-compute-3
| 145 | nova-compute     | overcloud-osd-compute-3.localdomain | nova     | disabled | down  | 2016-11-29T04:49:23.000000 | -               |
[stack@hci-director ~]$

In the above example, the overcloud still has a nova-compute service entry for the overcloud-osd-compute-3 host, but it is currently marked as disabled and down.

  10. Remove the node’s compute service from the overcloud Nova scheduler

Use nova service-delete 145 to remove the nova-compute service offered by overcloud-osd-compute-3, where 145 is the service ID reported by nova service-list in the previous step.
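
For example, with overcloud credentials loaded:

source ~/overcloudrc
# 145 is the service ID shown for overcloud-osd-compute-3 by nova service-list.
nova service-delete 145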

The Compute/Ceph Storage Node has been fully removed.