OpenShift image data stored in etcd results in a very large database

Issue

Our etcd db has grown beyond a manageable size. We have over 2,000 projects and are now at a 1.3GB snapshot size even with a pruned version of etcd. How can we solve the etcd db at scale?

The atomic-openshift-master-controllers service was restarting repeatedly on all masters. Further checking revealed that etcd is restarting repeatedly as well. The health checks were failing too:

# etcdctl -C https://openshift.example.com:2379 --ca-file=/etc/origin/master/master.etcd-ca.crt --cert-file=/etc/origin/master/master.etcd-client.crt --key-file=/etc/origin/master/master.etcd-client.key cluster-health
failed to check the health of member 3a78b19a3ba02203 on https://openshift.example.com:2379: Get https://openshift.example.com:2379/health: net/http: TLS handshake timeout
member 3a78b19a3ba02203 is unreachable: [https://openshift.example.com:2379] are all unreachable

Environment

Red Hat OpenShift Enterprise 3.1, 3.2, 3.3

Subscriber exclusive content

A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.

Select Your Language

OpenShift image data stored in etcd results in a very large database

Issue

Environment

Subscriber exclusive content

Current Customers and Partners

New to Red Hat?

Using a Red Hat product through a public cloud?

Quick Links

Help

Site Info

Related Sites

About

Red Hat legal and privacy links

Red Hat legal and privacy links

Issue

Environment

Subscriber exclusive content

Current Customers and Partners

New to Red Hat?

Using a Red Hat product through a public cloud?

Quick Links

Help

Site Info

Related Sites

Systems Status

About

Red Hat legal and privacy links

Red Hat legal and privacy links