OpenShift 4.9-4.10 etcd 3.5.0-3.5.2 data inconsistency
Issue
- The issue occurs when the etcd process is shut down in an uncontrolled manner while it is operating under high load.
- Canonical examples are the kill -9 command and Out Of Memory (OOM) kills, as well as OpenShift Container Platform control plane nodes running out of available memory or disk space.
- After updating to OCP 4.9.28 or 4.10.9 (or later), one etcd pod fails to start and the etcd operator is in a degraded state.
- The failing etcd pod reports "found data inconsistency with peers".
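As a rough aid for triage, the symptoms above can be checked mechanically. The following is a minimal sketch, not a Red Hat tool: the helper names are hypothetical, the affected range (3.5.0 through 3.5.2) is taken from the title of this article, and the message text is taken from the symptom listed above. How the etcd version string and the pod log text are obtained is left to the reader.

```python
import re

# Message text taken from the symptom described in this article.
INCONSISTENCY_MESSAGE = "found data inconsistency with peers"

# Affected etcd range from the article title: 3.5.0 through 3.5.2, inclusive.
AFFECTED_MIN = (3, 5, 0)
AFFECTED_MAX = (3, 5, 2)


def parse_version(version: str) -> tuple:
    """Parse 'X.Y.Z' (optionally prefixed with 'v') into a tuple of ints."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected_etcd_version(version: str) -> bool:
    """Return True if the version lies in the affected 3.5.0-3.5.2 range."""
    return AFFECTED_MIN <= parse_version(version) <= AFFECTED_MAX


def find_inconsistency_lines(log_text: str) -> list:
    """Return every log line that contains the inconsistency message."""
    return [line for line in log_text.splitlines()
            if INCONSISTENCY_MESSAGE in line]
```

For example, `is_affected_etcd_version("3.5.2")` is true and `is_affected_etcd_version("3.5.3")` is false. An empty result from `find_inconsistency_lines` only means the message was not present in the text supplied; it does not prove the member is healthy.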
Environment
Red Hat OpenShift Container Platform 4.9-4.10