Redis slave nodes fail to start during OSP minor update

Solution In Progress - Updated -

Issue

  • When running openstack overcloud update run --node Controller, the redis resource in the redis-bundle on the slave nodes times out while trying to connect to the master. The pcs resource fails and is reported as stopped shortly afterwards.

  • Running docker ps | grep redis on the slave nodes shows that the redis containers are actually running and seem to be functional.

  • Running pcs resource cleanup resolves the issue; however, all pacemaker resources are expected to restart successfully after an update without manual intervention.

  • pcs status shows the slaves as stopped:

 Docker container set: redis-bundle [registry.gpslab.cbr.redhat.com/rhosp13/openstack-redis:pcmklatest]
   Replica[0]
      redis-bundle-docker-0     (ocf::heartbeat:docker):        Started overcloud-controller-1
      redis-bundle-0    (ocf::pacemaker:remote):        Started overcloud-controller-1
      redis     (ocf::heartbeat:redis): Master redis-bundle-0
   Replica[1]
      redis-bundle-docker-1     (ocf::heartbeat:docker):        Started overcloud-controller-2
      redis-bundle-1    (ocf::pacemaker:remote):        Started overcloud-controller-2
      redis     (ocf::heartbeat:redis): Stopped
   Replica[2]
      redis-bundle-docker-2     (ocf::heartbeat:docker):        Started overcloud-controller-3
      redis-bundle-2    (ocf::pacemaker:remote):        Started overcloud-controller-3
      redis     (ocf::heartbeat:redis): Stopped
...
* Node redis-bundle-1@overcloud-controller-2:
   redis: migration-threshold=1000000 fail-count=1000000 last-failure='Mon Feb 17 11:11:04 2020'
* Node redis-bundle-2@overcloud-controller-3:
   redis: migration-threshold=1000000 fail-count=1000000 last-failure='Mon Feb 17 10:51:02 2020'
...

Failed Resource Actions:
* redis_start_0 on redis-bundle-1 'unknown error' (1): call=8, status=Timed Out, exitreason='',
    last-rc-change='Mon Feb 17 11:07:44 2020', queued=0ms, exec=200002ms
* redis_start_0 on redis-bundle-2 'unknown error' (1): call=8, status=Timed Out, exitreason='',
    last-rc-change='Mon Feb 17 10:47:42 2020', queued=0ms, exec=200001ms
...
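
The checks described above can be sketched as a small shell helper. This is only an illustrative sketch, not part of the original solution: the function name redis_failed_nodes is hypothetical, and it assumes the fail-count lines appear in the same layout as the output shown above (for example from pcs status --full). It lists the bundle nodes whose redis resource has a nonzero fail-count.

```shell
# Hypothetical helper: print bundle nodes whose redis resource has a
# nonzero fail-count. Reads pcs status text on stdin and assumes the
# "* Node <name>:" / "redis: ... fail-count=N" layout shown above.
redis_failed_nodes() {
  awk '
    # Remember the most recent node header, minus the trailing colon.
    /^ *\* Node / { node = $3; sub(/:$/, "", node) }
    # On a redis fail-count line, extract the number after "fail-count=".
    /redis: .*fail-count=/ {
      if (match($0, /fail-count=[0-9]+/)) {
        count = substr($0, RSTART + 11, RLENGTH - 11)
        if (count + 0 > 0) print node
      }
    }
  '
}

# Possible usage on a controller (commands taken from the Issue section;
# the redis-bundle argument to cleanup is an assumption):
#   pcs status --full | redis_failed_nodes
#   docker ps | grep redis
#   pcs resource cleanup redis-bundle
```

With the output shown above, the helper would be expected to list redis-bundle-1@overcloud-controller-2 and redis-bundle-2@overcloud-controller-3. Clearing the fail-count with pcs resource cleanup remains a workaround rather than a fix for the start timeout.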

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
