OpenShift image data stored in etcd results in a very large database
Issue
- Our etcd database has grown beyond a manageable size. We have over 2,000 projects and are now at a 1.3 GB snapshot size, even after pruning. How can we keep the etcd database manageable at this scale?
- The atomic-openshift-master-controllers service was restarting repeatedly on all masters. Further checking revealed that etcd was restarting repeatedly as well. The health checks were failing too:

    # etcdctl -C https://openshift.example.com:2379 --ca-file=/etc/origin/master/master.etcd-ca.crt --cert-file=/etc/origin/master/master.etcd-client.crt --key-file=/etc/origin/master/master.etcd-client.key cluster-health
    failed to check the health of member 3a78b19a3ba02203 on https://openshift.example.com:2379: Get https://openshift.example.com:2379/health: net/http: TLS handshake timeout
    member 3a78b19a3ba02203 is unreachable: [https://openshift.example.com:2379] are all unreachable

Environment
- Red Hat OpenShift Enterprise 3.1, 3.2, 3.3