cinder-volume is down

Solution In Progress

Issue

  • The openstack-cinder-volume resource won't come up, as shown in the pcs status output below:
[root@overcloud-controller2 ~]# pcs resource restart openstack-cinder-volume;sleep 120; pcs status

Cluster name: tripleo_cluster
Cluster Summary:
  * Stack: corosync
  * Current DC: overcloud-controller1 (version 2.0.5-9.el8_4.3-ba59be7122) - partition with quorum
  * Last updated: Tue Mar 15 14:56:29 2022
  * Last change:  Tue Mar 15 14:52:39 2022 by root via crm_resource on overcloud-controller1
  * 12 nodes configured
  * 37 resource instances configured

Node List:
  * Online: [ overcloud-controller1 overcloud-controller2 overcloud-controller3 ]
  * GuestOnline: [ galera-bundle-0@overcloud-controller2 galera-bundle-1@overcloud-controller3 galera-bundle-2@overcloud-controller1 rabbitmq-bundle-0@overcloud-controller1 rabbitmq-bundle-1@overcloud-controller2 rabbitmq-bundle-2@overcloud-controller3 redis-bundle-0@overcloud-controller1 redis-bundle-1@overcloud-controller2 redis-bundle-2@overcloud-controller3 ]

Full List of Resources:
  * ip-10.252.155.14    (ocf::heartbeat:IPaddr2):        Started overcloud-controller1
  * ip-114.110.20.135   (ocf::heartbeat:IPaddr2):        Started overcloud-controller1
  * ip-10.254.155.9     (ocf::heartbeat:IPaddr2):        Started overcloud-controller1
  * ip-10.254.155.13    (ocf::heartbeat:IPaddr2):        Started overcloud-controller1
  * ip-10.254.154.13    (ocf::heartbeat:IPaddr2):        Started overcloud-controller1
  * ip-10.154.156.213   (ocf::heartbeat:IPaddr2):        Started overcloud-controller1
  * Container bundle set: haproxy-bundle [cluster.common.tag/openstack-haproxy:pcmklatest]:
    * haproxy-bundle-podman-0   (ocf::heartbeat:podman):         Started overcloud-controller1
    * haproxy-bundle-podman-1   (ocf::heartbeat:podman):         Started overcloud-controller2
    * haproxy-bundle-podman-2   (ocf::heartbeat:podman):         Started overcloud-controller3
  * Container bundle set: galera-bundle [cluster.common.tag/openstack-mariadb:pcmklatest]:
    * galera-bundle-0   (ocf::heartbeat:galera):         Master overcloud-controller2
    * galera-bundle-1   (ocf::heartbeat:galera):         Master overcloud-controller3
    * galera-bundle-2   (ocf::heartbeat:galera):         Master overcloud-controller1
  * Container bundle set: redis-bundle [cluster.common.tag/openstack-redis:pcmklatest]:
    * redis-bundle-0    (ocf::heartbeat:redis):  Master overcloud-controller1
    * redis-bundle-1    (ocf::heartbeat:redis):  Slave overcloud-controller2
    * redis-bundle-2    (ocf::heartbeat:redis):  Slave overcloud-controller3
  * Container bundle set: rabbitmq-bundle [cluster.common.tag/openstack-rabbitmq:pcmklatest]:
    * rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster):       Started overcloud-controller1
    * rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster):       Started overcloud-controller2
    * rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster):       Started overcloud-controller3
  * Container bundle: openstack-cinder-volume [cluster.common.tag/openstack-cinder-volume:pcmklatest]:
    * openstack-cinder-volume-podman-0  (ocf::heartbeat:podman):         Started overcloud-controller2

Failed Resource Actions:
  * openstack-cinder-volume-podman-0_monitor_60000 on overcloud-controller2 'not running' (7): call=368, status='complete', exitreason='', last-rc-change='2022-03-15 11:56:31 +07:00', queued=0ms, exec=0ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
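
Once the underlying driver problem is resolved, the failed monitor action recorded above still has to be cleared before Pacemaker will report the resource as healthy. A minimal sketch, assuming the resource name from the pcs status output above:

# Clear the failed action history so Pacemaker re-probes the resource
[root@overcloud-controller2 ~]# pcs resource cleanup openstack-cinder-volume
# Confirm the failed action is gone and the bundle is started
[root@overcloud-controller2 ~]# pcs status | grep -A 1 openstack-cinder-volume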
  • The following errors are seen in the logs:
[root@overcloud-controller2 cinder]# tail -n 50 /var/log/containers/cinder/cinder-volume.log
2022-03-15 14:56:55.020 117 WARNING cinder.volume.manager [req-2c8e6074-3698-4cd5-9acc-98e61533f21b - - - - -] Update driver status failed: (config name Tier3) is uninitialized.
[root@overcloud-controller2 cinder]# tail -n 100 cinder-volume.log
2022-03-15 14:52:49.876 117 ERROR cinder.volume.manager     return f(*args, **kwargs)
2022-03-15 14:52:49.876 117 ERROR cinder.volume.manager   File "/usr/lib/python3.6/site-packages/cinder/volume/drivers/hitachi/hbsd_fc.py", line 170, in do_setup
2022-03-15 14:52:49.876 117 ERROR cinder.volume.manager     self.common.do_setup(context)
2022-03-15 14:52:49.876 117 ERROR cinder.volume.manager   File "/usr/lib/python3.6/site-packages/cinder/volume/drivers/hitachi/hbsd_common.py", line 598, in do_setup
2022-03-15 14:52:49.876 117 ERROR cinder.volume.manager     self.check_param()
2022-03-15 14:52:49.876 117 ERROR cinder.volume.manager   File "/usr/lib/python3.6/site-packages/cinder/volume/drivers/hitachi/hbsd_rest.py", line 395, in check_param
2022-03-15 14:52:49.876 117 ERROR cinder.volume.manager     raise utils.HBSDError(msg)
2022-03-15 14:52:49.876 117 ERROR cinder.volume.manager cinder.volume.drivers.hitachi.hbsd_utils.HBSDError: HBSD error occurred. A parameter is invalid. (san_password)
2022-03-15 14:52:49.876 117 ERROR cinder.volume.manager
2022-03-15 14:52:49.922 116 INFO cinder.volume.manager [req-e7de3044-c282-43ad-93ed-799bc34dcdd6 - - - - -] Initializing RPC dependent components of volume driver HBSDFCDriver (2.0.0)
2022-03-15 14:52:49.923 116 ERROR cinder.utils [req-e7de3044-c282-43ad-93ed-799bc34dcdd6 - - - - -] Volume driver HBSDFCDriver not initialized
2022-03-15 14:52:49.923 116 ERROR cinder.volume.manager [req-e7de3044-c282-43ad-93ed-799bc34dcdd6 - - - - -] Cannot complete RPC initialization because driver isn't initialized properly.: cinder.exception.DriverNotInitialized: Volume driver not ready.
2022-03-15 14:52:50.003 117 INFO cinder.volume.manager [req-499c2a5a-c4fd-450a-aab7-90212adfd23d - - - - -] Initializing RPC dependent components of volume driver HBSDFCDriver (2.0.0)
2022-03-15 14:52:50.004 117 ERROR cinder.utils [req-499c2a5a-c4fd-450a-aab7-90212adfd23d - - - - -] Volume driver HBSDFCDriver not initialized
2022-03-15 14:52:50.004 117 ERROR cinder.volume.manager [req-499c2a5a-c4fd-450a-aab7-90212adfd23d - - - - -] Cannot complete RPC initialization because driver isn't initialized properly.: cinder.exception.DriverNotInitialized: Volume driver not ready.
2022-03-15 14:52:59.860 115 ERROR cinder.service [-] Manager for service cinder-volume hostgroup@Tier1 is reporting problems, not sending heartbeat. Service will appear "down".
2022-03-15 14:52:59.924 116 ERROR cinder.service [-] Manager for service cinder-volume hostgroup@Tier2 is reporting problems, not sending heartbeat. Service will appear "down".
2022-03-15 14:53:00.005 117 ERROR cinder.service [-] Manager for service cinder-volume hostgroup@Tier3 is reporting problems, not sending heartbeat. Service will appear "down".
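
The traceback ends in check_param() rejecting san_password, so the first thing to verify is what the Hitachi HBSD driver actually reads from its backend stanza. A sketch of one way to inspect it, assuming the container name from the pcs status output above, the standard RHOSP config path, and that the backend section is named after the tier shown in the warning ([Tier3]):

# Dump the backend stanza from inside the running container and check
# that the REST API credentials are present and non-empty
[root@overcloud-controller2 ~]# podman exec openstack-cinder-volume-podman-0 \
    grep -A 30 '^\[Tier3\]' /etc/cinder/cinder.conf | grep -E 'san_(ip|login|password)'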
  • The cinder-volume backends for all tiers stay down:
(overcloud) [stack@overcloud templates]$ openstack volume service list
+------------------+-----------------------+------+---------+-------+----------------------------+
| Binary           | Host                  | Zone | Status  | State | Updated At                 |
+------------------+-----------------------+------+---------+-------+----------------------------+
| cinder-scheduler | overcloud-controller1 | nova | enabled | up    | 2022-03-15T07:58:00.000000 |
| cinder-scheduler | overcloud-controller2 | nova | enabled | up    | 2022-03-15T07:58:06.000000 |
| cinder-scheduler | overcloud-controller3 | nova | enabled | up    | 2022-03-15T07:58:00.000000 |
| cinder-volume    | hostgroup@Tier1       | nova | enabled | down  | 2022-03-15T07:52:42.000000 |
| cinder-volume    | hostgroup@Tier2       | nova | enabled | down  | 2022-03-15T07:52:42.000000 |
| cinder-volume    | hostgroup@Tier3       | nova | enabled | down  | 2022-03-15T07:52:42.000000 |
+------------------+-----------------------+------+---------+-------+----------------------------+
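
Cinder derives the State column from the service heartbeat: a service is reported down once its Updated At timestamp is older than service_down_time (60 seconds by default), which is why all three hostgroup@Tier* rows stopped updating at the moment the driver failed to initialize. To watch whether the heartbeats resume after a fix, something like the following works in the same overcloud session:

# Re-list the cinder-volume services every 10 seconds and watch Updated At
(overcloud) [stack@overcloud templates]$ watch -n 10 "openstack volume service list --service cinder-volume"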
  • The following errors are also seen in /var/log/containers/cinder/cinder-volume.log:
2022-03-15 17:23:16.328 117 INFO cinder.volume.manager [req-e965ded1-82d9-4371-9802-0f19e9c3e2f0 - - - - -] Starting volume driver HBSDFCDriver (2.0.0)
2022-03-15 17:23:16.331 117 ERROR cinder.volume.drivers.hitachi.hbsd_utils [req-e965ded1-82d9-4371-9802-0f19e9c3e2f0 - - - - -] MSGID0731-E: Failed to communicate with the REST API server. (exception: <class 'requests.exceptions.ConnectionError'>, message: HTTPSConnectionPool(host='10.10.10.10', port=443): Max retries exceeded with url: /ConfigurationManager/v1/objects/storages/834000412623/sessions (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f475db39908>: Failed to establish a new connection: [Errno 113] EHOSTUNREACH',)), method: POST, url: https://10.10.10.10:443/ConfigurationManager/v1/objects/storages/834000412623/sessions, params: None, body: None): requests.exceptions.ConnectionError: HTTPSConnectionPool(host='10.10.10.10', port=443): Max retries exceeded with url: /ConfigurationManager/v1/objects/storages/834000412623/sessions (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f475db39908>: Failed to establish a new connection: [Errno 113] EHOSTUNREACH',))
2022-03-15 17:23:16.332 117 ERROR cinder.volume.manager [req-e965ded1-82d9-4371-9802-0f19e9c3e2f0 - - - - -] Failed to initialize driver.: cinder.volume.drivers.hitachi.hbsd_utils.HBSDError: HBSD error occurred. Failed to communicate with the REST API server. (exception: <class 'requests.exceptions.ConnectionError'>, message: HTTPSConnectionPool(host='10.10.10.10', port=443): Max retries exceeded with url: /ConfigurationManager/v1/objects/storages/834000412623/sessions (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f475db39908>: Failed to establish a new connection: [Errno 113] EHOSTUNREACH',)), method: POST, url: https://10.10.10.10:443/ConfigurationManager/v1/objects/storages/834000412623/sessions, params: None, body: None)
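
EHOSTUNREACH means the controller could not reach the Hitachi Configuration Manager REST API host at all, which points to a network, routing, or firewall problem rather than a cinder one. A basic reachability check from the controller that runs the container, using the address and port from the error above:

# Check ICMP reachability and the route taken towards the REST API host
[root@overcloud-controller2 ~]# ping -c 3 10.10.10.10
[root@overcloud-controller2 ~]# ip route get 10.10.10.10
# Check whether the TCP port answers; any HTTP response means the path is open
[root@overcloud-controller2 ~]# curl -kv --max-time 10 https://10.10.10.10:443/ConfigurationManager/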

Environment

  • Red Hat OpenStack Platform 16.2 (RHOSP)
