Chapter 8. Operational Considerations
8.1. Configuration Updates
The procedure to apply OpenStack configuration changes for the nodes described in this reference architecture does not differ from the procedure for non-hyper-converged nodes deployed by Red Hat OpenStack Platform director. Thus, to apply an OpenStack configuration change, follow the procedure described in section 7.7, Modifying the Overcloud Environment, of the Director Installation and Usage documentation. As stated in the documentation, the same Heat templates must be passed as arguments to the openstack overcloud command.
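For example, a configuration change made in one of the custom environment files can be applied by re-running the deploy command used for this reference architecture (the same command is shown again in Section 8.2.1):

# Sketch: re-apply the overcloud configuration after editing an environment file
source ~/stackrc
openstack overcloud deploy --templates \
  -r ~/custom-templates/custom-roles.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
  -e ~/custom-templates/network.yaml \
  -e ~/custom-templates/ceph.yaml \
  -e ~/custom-templates/layout.yaml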
8.2. Adding Compute/Red Hat Ceph Storage Nodes
This section describes how to add additional hyper-converged nodes to an existing hyper-converged deployment that was configured as described earlier in this reference architecture.
8.2.1. Use Red Hat OpenStack Platform director to add a new Nova Compute / Ceph OSD Node
- Create a new JSON file
Create a new JSON file describing the new nodes to be added. For example, if adding a server in a rack in slot U35, then a file like u35.json may contain the following:
{ "nodes": [ { "pm_password": "PASSWORD", "name": "r730xd_u35", "pm_user": "root", "pm_addr": "10.19.136.28", "pm_type": "pxe_ipmitool", "mac": [ "ec:f4:bb:ed:6f:e4" ], "arch": "x86_64", "capabilities": "node:osd-compute-3,boot_option:local" } ] }
- Import the new JSON file into Ironic
openstack baremetal import u35.json
- Observe that the new node was added
For example, the server in U35 was assigned the ID 7250678a-a575-4159-840a-e7214e697165.
[stack@hci-director scale]$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provision State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| a94b75e3-369f-4b2d-b8cc-8ab272e23e89 | None | 629f3b1f-319f-4df7-8df1-0a9828f2f2f8 | power on    | active          | False       |
| 7ace7b2b-b549-414f-b83e-5f90299b4af3 | None | 4b354355-336d-44f2-9def-27c54cbcc4f5 | power on    | active          | False       |
| 8be1d83c-19cb-4605-b91d-928df163b513 | None | 29124fbb-ee1d-4322-a504-a1a190022f4e | power on    | active          | False       |
| e8411659-bc2b-4178-b66f-87098a1e6920 | None | 93199972-51ff-4405-979c-3c4aabdee7ce | power on    | active          | False       |
| 04679897-12e9-4637-9998-af8bee30b414 | None | e7578d80-0376-4df5-bbff-d4ac02eb1254 | power on    | active          | False       |
| 48b4987d-e778-48e1-ba74-88a08edf7719 | None | 586a5ef3-d530-47de-8ec0-8c98b30f880c | power on    | active          | False       |
| 7250678a-a575-4159-840a-e7214e697165 | None | None                                 | None        | available       | False       |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
[stack@hci-director scale]$
- Set the new server in maintenance mode
Maintenance mode prevents the server from being used for another purpose, for example, by another cloud operator adding an additional node at the same time.
ironic node-set-maintenance 7250678a-a575-4159-840a-e7214e697165 true
- Introspect the new hardware
openstack baremetal introspection start 7250678a-a575-4159-840a-e7214e697165
- Verify that introspection is complete
The previous step takes time. The following command shows the status of the introspection.
[stack@hci-director ~]$ openstack baremetal introspection bulk status
+--------------------------------------+----------+-------+
| Node UUID                            | Finished | Error |
+--------------------------------------+----------+-------+
| a94b75e3-369f-4b2d-b8cc-8ab272e23e89 | True     | None  |
| 7ace7b2b-b549-414f-b83e-5f90299b4af3 | True     | None  |
| 8be1d83c-19cb-4605-b91d-928df163b513 | True     | None  |
| e8411659-bc2b-4178-b66f-87098a1e6920 | True     | None  |
| 04679897-12e9-4637-9998-af8bee30b414 | True     | None  |
| 48b4987d-e778-48e1-ba74-88a08edf7719 | True     | None  |
| 7250678a-a575-4159-840a-e7214e697165 | True     | None  |
+--------------------------------------+----------+-------+
[stack@hci-director ~]$
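Rather than re-running the command by hand, the status may be polled with a short loop. The following is a sketch only; it assumes the Finished column reports False until introspection completes:

# Poll until no node reports Finished=False; adjust the sleep interval as needed
while openstack baremetal introspection bulk status | grep -q ' False '; do
  sleep 30
done
echo "introspection finished for all nodes"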
- Remove the new server from maintenance mode
This step is necessary in order for the Red Hat OpenStack Platform director Nova Scheduler to select the new node when scaling the number of computes.
ironic node-set-maintenance 7250678a-a575-4159-840a-e7214e697165 false
- Assign the kernel and ramdisk of the full overcloud image to the new node
openstack baremetal configure boot
The IDs of the kernel and ramdisk that were assigned to the new node are seen with the following command:
[stack@hci-director ~]$ ironic node-show 7250678a-a575-4159-840a-e7214e697165 | grep deploy_
| driver_info   | {u'deploy_kernel': u'e03c5677-2216-4120-95ad-b4354554a590',              |
|               | u'ipmi_password': u'******', u'deploy_ramdisk': u'2c5957bd-              |
|               | u'deploy_key': u'H3O1D1ETXCSSBDUMJY5YCCUFG12DJN0G', u'configdrive': u'H4 |
[stack@hci-director ~]$
The deploy_kernel and deploy_ramdisk IDs can be checked against the images stored in Glance. In the following example, the names bm-deploy-kernel and bm-deploy-ramdisk were assigned from the Glance database.
[stack@hci-director ~]$ openstack image list
+--------------------------------------+------------------------+--------+
| ID                                   | Name                   | Status |
+--------------------------------------+------------------------+--------+
| f7dce3db-3bbf-4670-8296-fa59492276c5 | bm-deploy-ramdisk      | active |
| 9b73446a-2c31-4672-a3e7-b189e105b2f9 | bm-deploy-kernel       | active |
| 653f9c4c-8afc-4320-b185-5eb1f5ecb7aa | overcloud-full         | active |
| 714b5f55-e64b-4968-a307-ff609cbcce6c | overcloud-full-initrd  | active |
| b9b62ec3-bfdb-43f7-887f-79fb79dcacc0 | overcloud-full-vmlinuz | active |
+--------------------------------------+------------------------+--------+
[stack@hci-director ~]$
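The comparison can also be scripted. The following sketch assumes the image names bm-deploy-kernel and bm-deploy-ramdisk shown above:

# Print the Glance image IDs and the node's deploy_ IDs for comparison
openstack image show bm-deploy-kernel -f value -c id
openstack image show bm-deploy-ramdisk -f value -c id
ironic node-show 7250678a-a575-4159-840a-e7214e697165 | grep deploy_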
- Update the appropriate Heat template to scale the OsdCompute node
Update ~/custom-templates/layout.yaml: change the OsdComputeCount from 3 to 4, and add a new IP address in each isolated network for the new OsdCompute node. For example, change the following:
OsdComputeIPs:
  internal_api:
    - 192.168.2.203
    - 192.168.2.204
    - 192.168.2.205
  tenant:
    - 192.168.3.203
    - 192.168.3.204
    - 192.168.3.205
  storage:
    - 172.16.1.203
    - 172.16.1.204
    - 172.16.1.205
  storage_mgmt:
    - 172.16.2.203
    - 172.16.2.204
    - 172.16.2.205
so that a .206 IP address is added in each network, as in the following:
OsdComputeIPs:
  internal_api:
    - 192.168.2.203
    - 192.168.2.204
    - 192.168.2.205
    - 192.168.2.206
  tenant:
    - 192.168.3.203
    - 192.168.3.204
    - 192.168.3.205
    - 192.168.3.206
  storage:
    - 172.16.1.203
    - 172.16.1.204
    - 172.16.1.205
    - 172.16.1.206
  storage_mgmt:
    - 172.16.2.203
    - 172.16.2.204
    - 172.16.2.205
    - 172.16.2.206
See Section 5.5.3, “Configure scheduler hints to control node placement and IP assignment” for more information about the ~/custom-templates/layout.yaml file.
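A quick sanity check of the edited file may be run before deploying. The sketch below assumes layout.yaml matches the example above:

# Confirm the new count and the added .206 addresses are present
grep OsdComputeCount ~/custom-templates/layout.yaml
grep -c '\.206$' ~/custom-templates/layout.yaml   # expect 4, one per isolated network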
- Apply the overcloud update
Use the same command that was used to deploy the overcloud to update the overcloud so that the changes made in the previous step are applied.
openstack overcloud deploy --templates \
  -r ~/custom-templates/custom-roles.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
  -e ~/custom-templates/network.yaml \
  -e ~/custom-templates/ceph.yaml \
  -e ~/custom-templates/layout.yaml
- Verify that the new OsdCompute node was added correctly
Use openstack server list to verify that the new OsdCompute node was added and is available. In the example below, the new node, overcloud-osd-compute-3, is listed as ACTIVE.
[stack@hci-director ~]$ openstack server list
+--------------------------------------+-------------------------+--------+-----------------------+----------------+
| ID                                   | Name                    | Status | Networks              | Image Name     |
+--------------------------------------+-------------------------+--------+-----------------------+----------------+
| fc8686c1-a675-4c89-a508-cc1b34d5d220 | overcloud-controller-2  | ACTIVE | ctlplane=192.168.1.37 | overcloud-full |
| 7c6ae5f3-7e18-4aa2-a1f8-53145647a3de | overcloud-osd-compute-2 | ACTIVE | ctlplane=192.168.1.30 | overcloud-full |
| 851f76db-427c-42b3-8e0b-e8b4b19770f8 | overcloud-controller-0  | ACTIVE | ctlplane=192.168.1.33 | overcloud-full |
| e2906507-6a06-4c4d-bd15-9f7de455e91d | overcloud-controller-1  | ACTIVE | ctlplane=192.168.1.29 | overcloud-full |
| 0f93a712-b9eb-4f42-bc05-f2c8c2edfd81 | overcloud-osd-compute-0 | ACTIVE | ctlplane=192.168.1.32 | overcloud-full |
| 8f266c17-ff39-422e-a935-effb219c7782 | overcloud-osd-compute-1 | ACTIVE | ctlplane=192.168.1.24 | overcloud-full |
| 5fa641cf-b290-4a2a-b15e-494ab9d10d8a | overcloud-osd-compute-3 | ACTIVE | ctlplane=192.168.1.21 | overcloud-full |
+--------------------------------------+-------------------------+--------+-----------------------+----------------+
[stack@hci-director ~]$
The new Compute/Ceph Storage Node has been added to the overcloud.
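Optionally, the new node's compute and storage services can be checked directly. The following sketch assumes overcloud credentials in ~/overcloudrc and Ceph admin access on a Controller / Ceph Monitor node:

# From the director node: confirm the new hypervisor is registered
source ~/overcloudrc
openstack hypervisor list | grep osd-compute-3

# From a Controller / Ceph Monitor node: confirm the new OSDs are up and in
ceph osd tree | grep -A 12 overcloud-osd-compute-3
ceph osd stat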
8.3. Removing Compute/Red Hat Ceph Storage Nodes
This section describes how to remove an OsdCompute node from an existing hyper-converged deployment that was configured as described earlier in this reference architecture.
Before reducing the compute and storage resources of a hyper-converged overcloud, verify that there will still be enough CPU and RAM to service the compute workloads, and migrate the compute workloads off the node to be removed. Also verify that the Ceph cluster has the reserve storage capacity necessary to maintain a health status of HEALTH_OK without the Red Hat Ceph Storage node to be removed.
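For example, a sketch of evacuating the node before removal, assuming live migration is configured for the overcloud and the overcloud credentials are loaded; nova host-evacuate-live live-migrates every instance off the named host:

# Live-migrate all instances off the node being removed, then confirm it is empty
source ~/overcloudrc
nova host-evacuate-live overcloud-osd-compute-3.localdomain
nova list --all-tenants --host overcloud-osd-compute-3.localdomain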
8.3.1. Remove the Ceph Storage Node
At the time of writing, Red Hat OpenStack Platform director does not support the automated removal of a Red Hat Ceph Storage node, so the steps in this section must be performed manually from one of the OpenStack Controller / Ceph Monitor nodes, unless otherwise indicated.
- Verify that the ceph health command does not produce any "near full" warnings
[root@overcloud-controller-0 ~]# ceph health
HEALTH_OK
[root@overcloud-controller-0 ~]#
If the ceph health command reports that the cluster is near full, as in the example below, then removing the OSDs could cause the cluster to reach or exceed its full ratio, which could result in data loss. If this is the case, contact Red Hat before proceeding to discuss options for removing the Red Hat Ceph Storage node without data loss.
HEALTH_WARN 1 nearfull osds
osd.2 is near full at 85%
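Cluster-wide and per-OSD utilization can also be inspected before deciding how to proceed, for example:

# Run on a Controller / Ceph Monitor node: overall pool usage and per-OSD utilization
ceph df
ceph osd df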
- Determine the OSD numbers of the OsdCompute node to be removed
In the example below, overcloud-osd-compute-3 will be removed, and the ceph osd tree command shows that its OSD numbers are 0 through 44, counting by fours.
[root@overcloud-controller-0 ~]# ceph osd tree
ID WEIGHT   TYPE NAME                         UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 52.37256 root default
-2 13.09314     host overcloud-osd-compute-3
 0  1.09109         osd.0                          up  1.00000          1.00000
 4  1.09109         osd.4                          up  1.00000          1.00000
 8  1.09109         osd.8                          up  1.00000          1.00000
12  1.09109         osd.12                         up  1.00000          1.00000
16  1.09109         osd.16                         up  1.00000          1.00000
20  1.09109         osd.20                         up  1.00000          1.00000
24  1.09109         osd.24                         up  1.00000          1.00000
28  1.09109         osd.28                         up  1.00000          1.00000
32  1.09109         osd.32                         up  1.00000          1.00000
36  1.09109         osd.36                         up  1.00000          1.00000
40  1.09109         osd.40                         up  1.00000          1.00000
44  1.09109         osd.44                         up  1.00000          1.00000
...
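The OSD IDs can also be extracted programmatically. The following awk sketch assumes the ceph osd tree output format shown above, which may vary between Ceph versions:

# List the OSD IDs that belong to overcloud-osd-compute-3
ceph osd tree | awk '
  $3 == "host" { in_host = ($4 == "overcloud-osd-compute-3") }   # enter or leave the host block
  in_host && $3 ~ /^osd\./ { print $1 }                          # print the OSD ID column
'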
- Start a process to monitor the Ceph cluster
In a separate terminal, run the ceph -w command. This command is used to monitor the health of the Ceph cluster during OSD removal. The output of this command, once started, is similar to the following:
[root@overcloud-controller-0 ~]# ceph -w
    cluster eb2bb192-b1c9-11e6-9205-525400330666
     health HEALTH_OK
     monmap e2: 3 mons at {overcloud-controller-0=172.16.1.200:6789/0,overcloud-controller-1=172.16.1.201:6789/0,overcloud-controller-2=172.16.1.202:6789/0}
            election epoch 8, quorum 0,1,2 overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
     osdmap e139: 48 osds: 48 up, 48 in
            flags sortbitwise
      pgmap v106106: 1344 pgs, 6 pools, 11080 MB data, 4140 objects
            35416 MB used, 53594 GB / 53628 GB avail
                1344 active+clean

2016-11-29 02:13:17.058468 mon.0 [INF] pgmap v106106: 1344 pgs: 1344 active+clean; 11080 MB data, 35416 MB used, 53594 GB / 53628 GB avail
2016-11-29 02:15:03.674380 mon.0 [INF] pgmap v106107: 1344 pgs: 1344 active+clean; 11080 MB data, 35416 MB used, 53594 GB / 53628 GB avail
...
- Mark OSDs of the node to be removed as out
Use the ceph osd out <NUM> command to mark all twelve OSDs hosted on the overcloud-osd-compute-3 node out of the Ceph cluster. Allow time between each operation so that the cluster can complete the previous action before proceeding; this may be achieved by using a sleep statement. A script like the following, which uses seq to count from 0 to 44 by fours, may be used:
for i in $(seq 0 4 44); do ceph osd out $i; sleep 10; done
Before running the above script, note the output of ceph osd stat with all OSDs up and in.
[root@overcloud-controller-0 ~]# ceph osd stat
     osdmap e173: 48 osds: 48 up, 48 in
            flags sortbitwise
[root@overcloud-controller-0 ~]#
The results of running the script above should look as follows:
[root@overcloud-controller-0 ~]# for i in $(seq 0 4 44); do ceph osd out $i; sleep 10; done
marked out osd.0.
marked out osd.4.
marked out osd.8.
marked out osd.12.
marked out osd.16.
marked out osd.20.
marked out osd.24.
marked out osd.28.
marked out osd.32.
marked out osd.36.
marked out osd.40.
marked out osd.44.
[root@overcloud-controller-0 ~]#
After the OSDs are marked as out, the output of the ceph osd stat command should show that twelve of the OSDs are no longer in but still up.
[root@overcloud-controller-0 ~]# ceph osd stat
     osdmap e217: 48 osds: 48 up, 36 in
            flags sortbitwise
[root@overcloud-controller-0 ~]#
- Wait for all of the placement groups to become active and clean
The removal of the OSDs causes Ceph to rebalance the cluster by migrating placement groups to other OSDs. The ceph -w command started in step 3 should show the placement group states as they change from active+clean, through states with degraded objects, and back to active+clean when the migration completes.
Example output of the ceph -w command started in step 3, as the states change, looks like the following:
2016-11-29 02:16:06.372846 mon.2 [INF] from='client.? 172.16.1.200:0/1977099347' entity='client.admin' cmd=[{"prefix": "osd out", "ids": ["0"]}]: dispatch
...
2016-11-29 02:16:07.624668 mon.0 [INF] osdmap e141: 48 osds: 48 up, 47 in
2016-11-29 02:16:07.714072 mon.0 [INF] pgmap v106111: 1344 pgs: 8 remapped+peering, 1336 active+clean; 11080 MB data, 34629 MB used, 52477 GB / 52511 GB avail
2016-11-29 02:16:07.624952 osd.46 [INF] 1.8e starting backfill to osd.2 from (0'0,0'0] MAX to 139'24162
2016-11-29 02:16:07.625000 osd.2 [INF] 1.ef starting backfill to osd.16 from (0'0,0'0] MAX to 139'17958
2016-11-29 02:16:07.625226 osd.46 [INF] 1.76 starting backfill to osd.25 from (0'0,0'0] MAX to 139'37918
2016-11-29 02:16:07.626074 osd.46 [INF] 1.8e starting backfill to osd.15 from (0'0,0'0] MAX to 139'24162
2016-11-29 02:16:07.626550 osd.21 [INF] 1.ff starting backfill to osd.46 from (0'0,0'0] MAX to 139'21304
2016-11-29 02:16:07.627698 osd.46 [INF] 1.32 starting backfill to osd.33 from (0'0,0'0] MAX to 139'24962
2016-11-29 02:16:08.682724 osd.45 [INF] 1.60 starting backfill to osd.16 from (0'0,0'0] MAX to 139'8346
2016-11-29 02:16:08.696306 mon.0 [INF] osdmap e142: 48 osds: 48 up, 47 in
2016-11-29 02:16:08.738872 mon.0 [INF] pgmap v106112: 1344 pgs: 6 peering, 9 remapped+peering, 1329 active+clean; 11080 MB data, 34629 MB used, 52477 GB / 52511 GB avail
2016-11-29 02:16:09.850909 mon.0 [INF] osdmap e143: 48 osds: 48 up, 47 in
...
2016-11-29 02:18:10.838365 mon.0 [INF] pgmap v106256: 1344 pgs: 7 activating, 1 active+recovering+degraded, 7 activating+degraded, 9 active+degraded, 70 peering, 1223 active+clean, 8 active+remapped, 19 remapped+peering; 11080 MB data, 33187 MB used, 40189 GB / 40221 GB avail; 167/12590 objects degraded (1.326%); 80/12590 objects misplaced (0.635%); 11031 kB/s, 249 objects/s recovering
...
Output like the above should continue as the Ceph cluster rebalances data, and eventually the cluster returns to a health status of HEALTH_OK.
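Rather than watching the output manually, the health status may be polled. A minimal sketch, run on a Controller / Ceph Monitor node:

# Poll until the cluster reports HEALTH_OK again; adjust the interval as needed
until ceph health | grep -q HEALTH_OK; do
  sleep 60
done
echo "cluster has returned to HEALTH_OK"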
- Verify that the cluster has returned to a health status of HEALTH_OK
[root@overcloud-controller-0 ~]# ceph -s
    cluster eb2bb192-b1c9-11e6-9205-525400330666
     health HEALTH_OK
     monmap e2: 3 mons at {overcloud-controller-0=172.16.1.200:6789/0,overcloud-controller-1=172.16.1.201:6789/0,overcloud-controller-2=172.16.1.202:6789/0}
            election epoch 8, quorum 0,1,2 overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
     osdmap e217: 48 osds: 48 up, 36 in
            flags sortbitwise
      pgmap v106587: 1344 pgs, 6 pools, 11080 MB data, 4140 objects
            35093 MB used, 40187 GB / 40221 GB avail
                1344 active+clean
[root@overcloud-controller-0 ~]#
- Stop the OSD Daemons on the node being removed
From the Red Hat OpenStack Platform director server, ssh into the node that is being removed and run systemctl stop ceph-osd.target to stop all OSDs.
Note how the output of ceph osd stat changes after the systemctl command is run; the number of up OSDs changes from 48 to 36.
[root@overcloud-osd-compute-3 ~]# ceph osd stat
     osdmap e217: 48 osds: 48 up, 36 in
            flags sortbitwise
[root@overcloud-osd-compute-3 ~]# systemctl stop ceph-osd.target
[root@overcloud-osd-compute-3 ~]# ceph osd stat
     osdmap e218: 48 osds: 36 up, 36 in
            flags sortbitwise
[root@overcloud-osd-compute-3 ~]#
Be sure to run systemctl stop ceph-osd.target on the same node which hosts the OSDs; in this case, the OSDs from overcloud-osd-compute-3 are being removed, so the command is run on overcloud-osd-compute-3.
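To confirm on the node itself that the OSD daemons have stopped, something like the following may be used:

# On overcloud-osd-compute-3: confirm the OSD daemons are no longer running
systemctl is-active ceph-osd.target            # expected to report inactive
pgrep -l ceph-osd || echo "no ceph-osd processes running"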
- Remove the OSDs
The script below does the following for each OSD:
- Removes the OSD from the CRUSH map so that it no longer receives data
- Removes the OSD authentication key
- Removes the OSD
for i in $(seq 0 4 44); do
  ceph osd crush remove osd.$i
  sleep 10
  ceph auth del osd.$i
  sleep 10
  ceph osd rm $i
  sleep 10
done
Before removing the OSDs, note that they are in the CRUSH map for the Ceph storage node to be removed.
[root@overcloud-controller-0 ~]# ceph osd crush tree | grep overcloud-osd-compute-3 -A 20
        "name": "overcloud-osd-compute-3",
        "type": "host",
        "type_id": 1,
        "items": [
            {
                "id": 0,
                "name": "osd.0",
                "type": "osd",
                "type_id": 0,
                "crush_weight": 1.091095,
                "depth": 2
            },
            {
                "id": 4,
                "name": "osd.4",
                "type": "osd",
                "type_id": 0,
                "crush_weight": 1.091095,
                "depth": 2
            },
            {
[root@overcloud-controller-0 ~]#
When the script above is executed, it looks like the following:
[root@overcloud-osd-compute-3 ~]# for i in $(seq 0 4 44); do
> ceph osd crush remove osd.$i
> sleep 10
> ceph auth del osd.$i
> sleep 10
> ceph osd rm $i
> sleep 10
> done
removed item id 0 name 'osd.0' from crush map
updated
removed osd.0
removed item id 4 name 'osd.4' from crush map
updated
removed osd.4
removed item id 8 name 'osd.8' from crush map
updated
removed osd.8
removed item id 12 name 'osd.12' from crush map
updated
removed osd.12
removed item id 16 name 'osd.16' from crush map
updated
removed osd.16
removed item id 20 name 'osd.20' from crush map
updated
removed osd.20
removed item id 24 name 'osd.24' from crush map
updated
removed osd.24
removed item id 28 name 'osd.28' from crush map
updated
removed osd.28
removed item id 32 name 'osd.32' from crush map
updated
removed osd.32
removed item id 36 name 'osd.36' from crush map
updated
removed osd.36
removed item id 40 name 'osd.40' from crush map
updated
removed osd.40
removed item id 44 name 'osd.44' from crush map
updated
removed osd.44
[root@overcloud-osd-compute-3 ~]#
The ceph osd stat command should now report that there are only 36 OSDs.
[root@overcloud-controller-0 ~]# ceph osd stat
     osdmap e300: 36 osds: 36 up, 36 in
            flags sortbitwise
[root@overcloud-controller-0 ~]#
When an OSD is removed from the CRUSH map, CRUSH recomputes which OSDs get the placement groups, and data re-balances accordingly. The CRUSH map may be checked after the OSDs are removed to verify that the update completed.
Observe that overcloud-osd-compute-3 has no OSDs:
[root@overcloud-controller-0 ~]# ceph osd crush tree | grep overcloud-osd-compute-3 -A 5
        "name": "overcloud-osd-compute-3",
        "type": "host",
        "type_id": 1,
        "items": []
    },
    {
[root@overcloud-controller-0 ~]#
8.3.2. Remove the Node from the Overcloud
Though the OSDs on overcloud-osd-compute-3 are no longer members of the Ceph cluster, its Nova compute services are still functioning; they will be removed in this subsection. The hardware will be shut off, and the overcloud Heat stack will no longer keep track of the node. All of the steps to do this should be carried out as the stack user on the Red Hat OpenStack Platform director system unless otherwise noted.
Before following this procedure, migrate any instances running on the compute node that will be removed to another compute node.
- Authenticate to the overcloud
source ~/overcloudrc
- Check the status of the compute node that is going to be removed
For example, overcloud-osd-compute-3 will be removed:
[stack@hci-director ~]$ nova service-list | grep compute-3
| 145 | nova-compute | overcloud-osd-compute-3.localdomain | nova | enabled | up | 2016-11-29T03:40:32.000000 | - |
[stack@hci-director ~]$
- Disable the compute node’s service so that no new instances are scheduled on it
[stack@hci-director ~]$ nova service-disable overcloud-osd-compute-3.localdomain nova-compute
+-------------------------------------+--------------+----------+
| Host                                | Binary       | Status   |
+-------------------------------------+--------------+----------+
| overcloud-osd-compute-3.localdomain | nova-compute | disabled |
+-------------------------------------+--------------+----------+
[stack@hci-director ~]$
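Optionally, the disable operation can record a reason for future operators; the sketch below assumes the --reason option is available in this version of the nova client:

# Record why the service was disabled (optional)
nova service-disable --reason "removing hyper-converged node" \
    overcloud-osd-compute-3.localdomain nova-compute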
- Authenticate to the undercloud
source ~/stackrc
- Identify the Nova ID of the OsdCompute node to be removed
[stack@hci-director ~]$ openstack server list | grep osd-compute-3
| 6b2a2e71-f9c8-4d5b-aaf8-dada97c90821 | overcloud-osd-compute-3 | ACTIVE | ctlplane=192.168.1.27 | overcloud-full |
[stack@hci-director ~]$
In the following example, the Nova ID is extracted with awk and egrep and assigned to the variable $nova_id:
[stack@hci-director ~]$ nova_id=$(openstack server list | grep compute-3 | awk {'print $2'} | egrep -vi 'id|^$')
[stack@hci-director ~]$ echo $nova_id
6b2a2e71-f9c8-4d5b-aaf8-dada97c90821
[stack@hci-director ~]$
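Alternatively, the openstack client's value output format can avoid most of the grep and awk parsing; a sketch:

# Extract the ID of the node to be removed using the value formatter
nova_id=$(openstack server list -f value -c ID -c Name | awk '/osd-compute-3/ {print $1}')
echo $nova_id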
- Start a Mistral workflow to delete the node by UUID from the stack by name
[stack@hci-director ~]$ time openstack overcloud node delete --stack overcloud $nova_id
deleting nodes [u'6b2a2e71-f9c8-4d5b-aaf8-dada97c90821'] from stack overcloud
Started Mistral Workflow. Execution ID: 396f123d-df5b-4f37-b137-83d33969b52b

real    1m50.662s
user    0m0.563s
sys     0m0.099s
[stack@hci-director ~]$
In the above example, the stack to delete the node from needs to be identified by name, "overcloud", instead of by its UUID. However, it will be possible to supply either the UUID or name after Red Hat Bugzilla 1399429 is resolved. It is no longer necessary when deleting a node to pass the Heat environment files with the -e option.
As shown by the time command output, the request to delete the node is accepted quickly. However, the Mistral workflow and Heat stack update continue to run in the background as the compute node is removed.
[stack@hci-director ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+--------------------+----------------------+----------------------+
| id                                   | stack_name | stack_status       | creation_time        | updated_time         |
+--------------------------------------+------------+--------------------+----------------------+----------------------+
| 23e7c364-7303-4af6-b54d-cfbf1b737680 | overcloud  | UPDATE_IN_PROGRESS | 2016-11-24T03:24:56Z | 2016-11-30T17:16:48Z |
+--------------------------------------+------------+--------------------+----------------------+----------------------+
[stack@hci-director ~]$
Confirm that Heat has finished updating the overcloud.
[stack@hci-director ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+----------------------+
| id                                   | stack_name | stack_status    | creation_time        | updated_time         |
+--------------------------------------+------------+-----------------+----------------------+----------------------+
| 23e7c364-7303-4af6-b54d-cfbf1b737680 | overcloud  | UPDATE_COMPLETE | 2016-11-24T03:24:56Z | 2016-11-30T17:16:48Z |
+--------------------------------------+------------+-----------------+----------------------+----------------------+
[stack@hci-director ~]$
- Observe that the node was deleted as desired.
In the example below, overcloud-osd-compute-3 is not included in the openstack server list output.
[stack@hci-director ~]$ openstack server list
+--------------------------------------+-------------------------+--------+-----------------------+----------------+
| ID                                   | Name                    | Status | Networks              | Image Name     |
+--------------------------------------+-------------------------+--------+-----------------------+----------------+
| fc8686c1-a675-4c89-a508-cc1b34d5d220 | overcloud-controller-2  | ACTIVE | ctlplane=192.168.1.37 | overcloud-full |
| 7c6ae5f3-7e18-4aa2-a1f8-53145647a3de | overcloud-osd-compute-2 | ACTIVE | ctlplane=192.168.1.30 | overcloud-full |
| 851f76db-427c-42b3-8e0b-e8b4b19770f8 | overcloud-controller-0  | ACTIVE | ctlplane=192.168.1.33 | overcloud-full |
| e2906507-6a06-4c4d-bd15-9f7de455e91d | overcloud-controller-1  | ACTIVE | ctlplane=192.168.1.29 | overcloud-full |
| 0f93a712-b9eb-4f42-bc05-f2c8c2edfd81 | overcloud-osd-compute-0 | ACTIVE | ctlplane=192.168.1.32 | overcloud-full |
| 8f266c17-ff39-422e-a935-effb219c7782 | overcloud-osd-compute-1 | ACTIVE | ctlplane=192.168.1.24 | overcloud-full |
+--------------------------------------+-------------------------+--------+-----------------------+----------------+
[stack@hci-director ~]$
- Confirm that Ironic has turned off the hardware that ran the converged Compute/OSD services, and that it is available for other purposes.
[stack@hci-director ~]$ openstack baremetal node list
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name        | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+
| c6498849-d8d8-4042-aa1c-aa62ec2df17e | m630_slot13 | 851f76db-427c-42b3-8e0b-e8b4b19770f8 | power on    | active             | False       |
| a8b2e3b9-c62b-4965-8a3d-c4e7743ae78b | m630_slot14 | e2906507-6a06-4c4d-bd15-9f7de455e91d | power on    | active             | False       |
| f2d30a3a-8c74-4fbf-afaa-fb666af55dfc | m630_slot15 | fc8686c1-a675-4c89-a508-cc1b34d5d220 | power on    | active             | False       |
| 8357d7b0-bd62-4b79-91f9-52c2a50985d9 | r730xd_u29  | 0f93a712-b9eb-4f42-bc05-f2c8c2edfd81 | power on    | active             | False       |
| fc6efdcb-ae5f-431d-adf1-4dd034b4a0d3 | r730xd_u31  | 8f266c17-ff39-422e-a935-effb219c7782 | power on    | active             | False       |
| 73d19120-6c93-4f1b-ad1f-4cce5913ba76 | r730xd_u33  | 7c6ae5f3-7e18-4aa2-a1f8-53145647a3de | power on    | active             | False       |
| a0b8b537-0975-406b-a346-e361464fd1e3 | r730xd_u35  | None                                 | power off   | available          | False       |
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+
[stack@hci-director ~]$
In the above, the server r730xd_u35 is powered off and available.
- Check the status of the compute service that was removed in the overcloud
Authenticate back to the overcloud and observe the state of the nova-compute service offered by overcloud-osd-compute-3:
[stack@hci-director ~]$ source ~/overcloudrc
[stack@hci-director ~]$ nova service-list | grep osd-compute-3
| 145 | nova-compute | overcloud-osd-compute-3.localdomain | nova | disabled | down | 2016-11-29T04:49:23.000000 | - |
[stack@hci-director ~]$
In the above example, the overcloud has a nova-compute service on the overcloud-osd-compute-3 host, but it is currently marked as disabled and down.
- Remove the node’s compute service from the overcloud Nova scheduler
Use nova service-delete 145 to remove the nova-compute service offered by overcloud-osd-compute-3.
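After the deletion, the service should no longer appear in the service list, for example:

# Confirm the nova-compute service entry is gone
nova service-list | grep osd-compute-3 || echo "nova-compute service removed"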
The Compute/Ceph Storage Node has been fully removed.