All Ceph containers stay down after an unexpected reboot


Issue

  • The Ceph node went down unexpectedly and, after it rebooted, none of the Ceph containers came back up. The following errors appear in the messages log:
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph OSD.
Oct 28 21:04:04 ceph01 podman[38527]: Error: error creating container storage: the container name "ceph-osd-1" is already in use by "2a581d08ced179692b5fea7fa412197a56482bfef418805f69d13ea0ee48089d". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@1.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@1.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph OSD.
Oct 28 21:04:04 ceph01 podman[38588]: Error: error creating container storage: the container name "ceph-osd-212" is already in use by "d510f024cc64bf23884bc74eafe8d8b8f3cfd7296a0a9b7ade2d865834a0422c". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@212.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@212.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph OSD.
Oct 28 21:04:04 ceph01 podman[38679]: Error: error creating container storage: the container name "ceph-mgr-ceph01" is already in use by "4d54d65bd2df714d3543668b5b04185e8ed7c2aa8826f0d3544174f5fec59c69". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 systemd[1]: ceph-mgr@ceph01.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-mgr@ceph01.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph Manager.
Oct 28 21:04:04 ceph01 podman[38603]: Error: error creating container storage: the container name "ceph-osd-157" is already in use by "15f24f3c6281797fe219aede163a57e56731d1d2d3764feaa490386dfce34c6b". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 podman[38740]: Error: error creating container storage: the container name "ceph-rgw-ceph01-rgw0" is already in use by "14aec340bf3ce381ab96686f56b5ce659dcc9be9df6df96f943c8711428e1bb0". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@157.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@157.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph OSD.
Oct 28 21:04:04 ceph01 systemd[1]: ceph-radosgw@rgw.ceph01.rgw0.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-radosgw@rgw.ceph01.rgw0.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph RGW.
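
The podman errors above indicate that stale container entries left over from before the reboot still occupy the container names, so systemd cannot recreate the containers when the units start. As a minimal sketch (an assumption about the cleanup, not necessarily the verified resolution), the stale entries can be listed and removed so that the systemd units can start again; the name ceph-osd-1 and the unit ceph-osd@1.service below are taken from the log above, and the same steps would be repeated for each failing unit:

# List all Ceph containers, including stopped ones, to find the stale entries
podman ps -a --filter name=ceph

# Remove the stale container entry that is holding the name
# (name taken from the podman error in the log above)
podman rm -f ceph-osd-1

# Restart the corresponding systemd unit so it recreates the container
systemctl restart ceph-osd@1.service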

Environment

  • Red Hat Ceph Storage 4.1z1 - 4.1.1
  • ceph version 14.2.8-81.el8cp
  • ceph-ansible-4.0.25-1
