All Ceph containers stay down after an unexpected reboot
Issue
- After a Ceph node goes down unexpectedly for any reason, none of the Ceph containers come back up when the node reboots. The following errors appear in the messages log:
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph OSD.
Oct 28 21:04:04 ceph01 podman[38527]: Error: error creating container storage: the container name "ceph-osd-1" is already in use by "2a581d08ced179692b5fea7fa412197a56482bfef418805f69d13ea0ee48089d". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@1.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@1.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph OSD.
Oct 28 21:04:04 ceph01 podman[38588]: Error: error creating container storage: the container name "ceph-osd-212" is already in use by "d510f024cc64bf23884bc74eafe8d8b8f3cfd7296a0a9b7ade2d865834a0422c". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@212.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@212.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph OSD.
Oct 28 21:04:04 ceph01 podman[38679]: Error: error creating container storage: the container name "ceph-mgr-ceph01" is already in use by "4d54d65bd2df714d3543668b5b04185e8ed7c2aa8826f0d3544174f5fec59c69". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 systemd[1]: ceph-mgr@ceph01.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-mgr@ceph01.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph Manager.
Oct 28 21:04:04 ceph01 podman[38603]: Error: error creating container storage: the container name "ceph-osd-157" is already in use by "15f24f3c6281797fe219aede163a57e56731d1d2d3764feaa490386dfce34c6b". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 podman[38740]: Error: error creating container storage: the container name "ceph-rgw-ceph01-rgw0" is already in use by "14aec340bf3ce381ab96686f56b5ce659dcc9be9df6df96f943c8711428e1bb0". You have to remove that container to be able to reuse that name.: that name is already in use
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@157.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-osd@157.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph OSD.
Oct 28 21:04:04 ceph01 systemd[1]: ceph-radosgw@rgw.ceph01.rgw0.service: Control process exited, code=exited status=125
Oct 28 21:04:04 ceph01 systemd[1]: ceph-radosgw@rgw.ceph01.rgw0.service: Failed with result 'exit-code'.
Oct 28 21:04:04 ceph01 systemd[1]: Failed to start Ceph RGW.
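The podman error itself points at the cause: stale container records survived the crash in container storage, so systemd cannot recreate containers under the same names. A minimal cleanup sketch, assuming the container names reported in the log above and a podman version that supports `podman rm --storage` (available in the podman shipped with RHEL 8; verify on your system before relying on it):

```shell
# List all containers, including stale records left over from before the crash
podman ps -a

# Remove the stale record so the systemd unit can recreate the container
# under the same name. "--storage" removes a container that exists only in
# container storage (i.e. one podman no longer fully tracks).
podman rm --storage ceph-osd-1

# Repeat for each container named in the errors, then restart its unit:
systemctl restart ceph-osd@1.service
```

Removing by the container ID quoted in the error message (e.g. `podman rm --storage 2a581d08ced1...`) works the same way if the name lookup fails.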
Environment
- Red Hat Ceph Storage 4.1z1 - 4.1.1
- ceph version 14.2.8-81.el8cp
- ceph-ansible-4.0.25-1