etcd error "dataDir has been destroyed and must be removed from the cluster" in an OpenShift Container Platform 4 cluster
Issue
- While replacing an unhealthy master node, etcd is degraded with the following error:
etcd pod logs:
2022-06-06T04:55:19.695291547Z #### attempt 0
2022-06-06T04:55:19.696317283Z member={name="master02.ocp-dev.example.com", peerURLs=[https://192.168.0.2:2380}, clientURLs=[https://192.168.0.2:2379]
2022-06-06T04:55:19.696317283Z member={name="master03.ocp-dev.example.com", peerURLs=[https://192.168.0.3:2380}, clientURLs=[https://192.168.0.3:2379]
2022-06-06T04:55:19.696317283Z member={name="master01.ocp-dev.example.com", peerURLs=[https://192.168.0.1:2380}, clientURLs=[https://192.168.0.1:2379]
2022-06-06T04:55:19.696335095Z target={name="master02.ocp-dev.example.com", peerURLs=[https://192.168.0.2:2380}, clientURLs=[https://192.168.0.2:2379]
2022-06-06T04:55:19.696368271Z member "https://192.168.0.2:2380" dataDir has been destroyed and must be removed from the cluster
- The etcd pod on the replaced node fails to start and is restarted repeatedly:
etcd-master01.ocp-dev.example.com 4/4 Running 0 5d
etcd-master02.ocp-dev.example.com 3/4 Running 553 1d <---- node which has been replaced recently.
etcd-master03.ocp-dev.example.com 4/4 Running 0 5d
etcd-quorum-guard-c97444699-kqw72 0/1 Running 0 1d
etcd-quorum-guard-c97444699-lkjmm 1/1 Running 0 1d
etcd-quorum-guard-c97444699-zsq4c 1/1 Running 0 1d
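When the logs are long, it can help to pull out which member etcd is reporting as destroyed. The following is a minimal sketch, not part of any Red Hat tooling: the function name `find_destroyed_members` is hypothetical, and the pattern assumes the message wording matches the log excerpt above exactly.

```python
import re

# Assumption: the message text is identical to the excerpt shown in the Issue section.
DESTROYED_RE = re.compile(
    r'member "(?P<peer_url>https?://[^"]+)" dataDir has been destroyed '
    r"and must be removed from the cluster"
)


def find_destroyed_members(log_text):
    """Return peer URLs reported as destroyed, in order of first appearance."""
    seen = []
    for line in log_text.splitlines():
        match = DESTROYED_RE.search(line)
        if match and match.group("peer_url") not in seen:
            seen.append(match.group("peer_url"))
    return seen


# Two lines copied from the log excerpt above, used as sample input.
sample = (
    "2022-06-06T04:55:19.695291547Z #### attempt 0\n"
    '2022-06-06T04:55:19.696368271Z member "https://192.168.0.2:2380" '
    "dataDir has been destroyed and must be removed from the cluster\n"
)

if __name__ == "__main__":
    for url in find_destroyed_members(sample):
        print(url)
```

The reported peer URL can then be compared with the `peerURLs` values in the `member={...}` lines to identify the affected node, here `master02.ocp-dev.example.com`.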
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4.x