How to compact and defrag etcd to decrease database size in OpenShift 4

Solution Verified

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • etcd

Issue

  • What are the steps to compact and defrag the etcd database in OCP 4?
  • When does etcd need defragmentation?

Resolution

For OCP 3, refer to How to defrag Etcd to decrease DB size in OpenShift 3.

In Red Hat OpenShift 4, etcd is compacted every 5 minutes, and starting with OpenShift 4.9, etcd data is automatically defragmented by the etcd Operator as explained in automatic defragmentation for etcd data. Refer to the "Diagnostic Steps" section to check whether the automatic compaction and defragmentation are being executed.

Manual etcd compaction and defragmentation

If a manual etcd compaction and defragmentation is required, follow the steps below, paying special attention to the Prerequisites and the Additional Notes.

Prerequisites

Important: etcd defragmentation can impact cluster performance. Before compacting or defragmenting, check that dbSize and dbSizeInUse differ by more than 40-50% in the output of the etcdctl endpoint status --cluster -w json command. Also check which member is the etcd leader in the output of the commands below, because the leader must be the last member the commands are run against.

$ oc get pods -n openshift-etcd -l etcd
[...]
$ oc rsh -n openshift-etcd [etcd-pod] etcdctl endpoint status --cluster -w table
[...]
$ oc rsh -n openshift-etcd [etcd-pod] etcdctl endpoint status --cluster -w json
[...]

NOTE: If the above does not work, the compact operation can be performed first, followed by the defrag.

If the oc command does not work, the commands can be executed directly in the container on the master node. If you have SSH access to the master node, you can connect to the container using the command below.

# crictl exec -it $(crictl ps | grep etcdctl | awk '{print $1}') bash
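
The 40-50% check from the Important note above can be scripted instead of read off by eye. The following is a minimal sketch and not part of the documented procedure: fragmentation_report is a hypothetical helper name, and it assumes that the JSON output carries one dbSize and one dbSizeInUse field per member. It only parses the status output and changes nothing on the cluster.

```shell
# Hypothetical helper: reads the JSON printed by
#   etcdctl endpoint status --cluster -w json
# on stdin and prints one line per member:
#   <dbSize> <dbSizeInUse> <percentage of dbSize that is not in use>
# Members appear in the same order as in the JSON output.
fragmentation_report() {
  grep -oE '"(dbSize|dbSizeInUse)":[0-9]+' |
  LC_ALL=C awk -F: '
    $1 == "\"dbSize\""      { size = $2; have_size = 1 }
    $1 == "\"dbSizeInUse\"" { used = $2; have_used = 1 }
    have_size && have_used {
      # Sizes are printed as strings to avoid integer overflow in older awks.
      if (size + 0 > 0)
        printf "%s %s %.1f\n", size, used, (size - used) * 100 / size
      have_size = 0; have_used = 0
    }
  '
}

# Usage (wherever etcdctl is available, for example inside the etcd pod):
#   etcdctl endpoint status --cluster -w json | fragmentation_report
```

A value in the third column above roughly 40-50 corresponds to the threshold mentioned in the Prerequisites.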

Etcd compact operation

# Get the current revision number
$ etcdctl endpoint status --write-out json | grep -Eo '"revision":[0-9]*' | grep -Eo '[0-9]*'
440161
440161
440161

# Compact the database
$ etcdctl --command-timeout=600s compact <rev_number>
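
The two steps above can be chained so that the revision is not copied by hand. This is a sketch under assumptions: lowest_revision is a hypothetical helper name, and picking the lowest revision reported by any member is a conservative choice made for this example, not a requirement stated in the etcd documentation.

```shell
# Hypothetical helper: reads `etcdctl endpoint status -w json` output on stdin
# and prints the lowest revision reported by any member.
lowest_revision() {
  grep -oE '"revision":[0-9]+' | cut -d: -f2 | sort -n | head -n 1
}

# Usage (inside the etcd container):
#   rev=$(etcdctl endpoint status --cluster --write-out json | lowest_revision)
#   etcdctl --command-timeout=600s compact "$rev"
```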

Defrag

To defragment the etcd database, refer to the "Manual defragmentation" section in the OpenShift 4 documentation: Defragmenting etcd data.

To verify the defragmentation, check the "Diagnostic Steps" section.

Additional Notes

  • Make sure that the leader is the last member the command is run against.
  • If a timeout occurs, increase the --command-timeout value until the command succeeds.
  • It is important to note that the defrag action is blocking. The member will not respond until the defrag is complete. For this reason, defrag should be a rolling action.
  • Allow 30 seconds to 1 minute in between defrag actions on each of the etcd pods for the cluster to recover.
  • Note that the defrag operation needs to be run against each member individually. Make sure to unset ETCDCTL_ENDPOINTS before the defrag and run the command with --endpoints=https://localhost:2379, as per the documentation.
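
The notes above (leader last, one member at a time, a pause between members) can be sketched as two small shell functions. Both names, leader_last and rolling_defrag, are hypothetical. The sketch assumes that each entry in the JSON status output lists Endpoint, then member_id, then leader, that the container in the etcd pod is named etcdctl, and that 30s is an acceptable starting timeout. Treat it as an illustration of the ordering, not as a replacement for the documented manual procedure.

```shell
# Hypothetical helper: reads `etcdctl endpoint status --cluster -w json` on
# stdin and prints the member endpoints, one per line, with the leader last.
# Assumes each entry lists "Endpoint", then "member_id", then "leader".
leader_last() {
  grep -oE '"Endpoint":"[^"]+"|"member_id":[0-9]+|"leader":[0-9]+' |
  awk '
    /^"Endpoint"/  { ep = $0; sub(/^"Endpoint":"/, "", ep); sub(/"$/, "", ep) }
    /^"member_id"/ { id = $0; sub(/^"member_id":/, "", id) }
    /^"leader"/    { ld = $0; sub(/^"leader":/, "", ld)
                     # Compare as strings: the IDs are 64-bit and would lose
                     # precision if compared as floating-point numbers.
                     if ((id "") == (ld "")) leader_ep = ep; else print ep }
    END { if (leader_ep != "") print leader_ep }
  '
}

# Hypothetical helper: defragments the given etcd pods one at a time, in the
# order given (pass the leader pod last), pausing between members.
# The container name "etcdctl" and the 30s timeout are assumptions.
rolling_defrag() {
  for pod in "$@"; do
    oc exec -n openshift-etcd "$pod" -c etcdctl -- sh -c \
      'unset ETCDCTL_ENDPOINTS; etcdctl --command-timeout=30s --endpoints=https://localhost:2379 defrag' \
      || return 1
    sleep 60   # let the cluster recover before moving to the next member
  done
}

# Usage:
#   etcdctl endpoint status --cluster -w json | leader_last
#   rolling_defrag [follower-pod-1] [follower-pod-2] [leader-pod]
```

The endpoint addresses printed by leader_last can be matched to pod names with oc get pods -n openshift-etcd -l etcd -o wide.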

Root Cause

Compacting old revisions internally fragments etcd by leaving gaps in the backend database. Fragmented space is available for use by etcd but unavailable to the host filesystem. In other words, deleting application data does not reclaim the space on the disk.
The process of defragmentation releases this storage space back to the file system. Defragmentation is issued for each etcd member individually, so that cluster-wide latency spikes may be avoided as the defrag action is blocking.

Diagnostic Steps

  • Ensure that there is not an etcd-disable-defrag configmap in the openshift-etcd-operator namespace:

    $ oc get cm etcd-disable-defrag -n openshift-etcd-operator
    Error from server (NotFound): configmaps "etcd-disable-defrag" not found
    
  • Check for defragmentation and compaction logs in the etcd pods:

    $ oc get pods -n openshift-etcd -l etcd
    [...]
    $ oc logs -n openshift-etcd [pod_name] | grep defrag
    [...]
    {"level":"info","ts":"2024-01-01T00:00:00.000000Z","caller":"v3rpc/maintenance.go:90","msg":"starting defragment"}
    {"level":"info","ts":"2024-01-01T00:00:00.000000Z","caller":"backend/backend.go:497","msg":"defragmenting","path":"/var/lib/etcd/member/snap/db","current-db-size-bytes":108052480,"current-db-size":"108 MB","current-db-size-in-use-bytes":58818560,"current-db-size-in-use":"59 MB"}
    {"level":"info","ts":"2024-01-01T00:00:00.000000Z","caller":"backend/backend.go:549","msg":"finished defragmenting directory","path":"/var/lib/etcd/member/snap/db","current-db-size-bytes-diff":-50335744,"current-db-size-bytes":57716736,"current-db-size":"58 MB","current-db-size-in-use-bytes-diff":-1110016,"current-db-size-in-use-bytes":57708544,"current-db-size-in-use":"58 MB","took":"465.693089ms"}
    {"level":"info","ts":"2024-01-01T00:00:00.000000Z","caller":"v3rpc/maintenance.go:96","msg":"finished defragment"}
    [...]
    
    $ oc logs -n openshift-etcd [pod_name] | grep compaction
    [...]
    {"level":"info","ts":"2024-01-01T00:00:00.000000Z","caller":"mvcc/kvstore_compaction.go:66","msg":"finished scheduled compaction","compact-revision":3196394,"took":"341.818892ms","hash":3764523612}
    {"level":"info","ts":"2024-01-01T00:05:00.000000Z","caller":"mvcc/kvstore_compaction.go:66","msg":"finished scheduled compaction","compact-revision":3201824,"took":"216.814266ms","hash":143910628}
    {"level":"info","ts":"2024-01-01T00:10:00.000000Z","caller":"mvcc/kvstore_compaction.go:66","msg":"finished scheduled compaction","compact-revision":3206943,"took":"224.977711ms","hash":3635866851}
    [...]
    
  • If manual compaction and defragmentation were performed, verify that they successfully reduced the database size on all the etcd instances:

    $ oc get pods -n openshift-etcd -l etcd
    [...]
    $ oc rsh -n openshift-etcd [etcd-pod]
    sh-4.2# etcdctl endpoint status -w table --cluster
    +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    |         ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
    +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    | https://10.0.000.000:2379 | 2be524b7dda9390d |   3.4.9 |   70 MB |     false |      false |         7 |     657422 |             657422 |        |
    | https://10.0.001.010:2379 | b2d7a157e701c9e5 |   3.4.9 |   70 MB |      true |      false |         7 |     657422 |             657422 |        |
    |  https://10.0.002.02:2379 | d5ae36565eff47a1 |   3.4.9 |   70 MB |     false |      false |         7 |     657422 |             657422 |        |
    +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
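
The log checks above can be condensed into one line per event. This is a minimal sketch: defrag_summary and compaction_times are hypothetical helper names, and the field names are taken from the sample log lines shown in this section, so other etcd versions may log different fields.

```shell
# Hypothetical helper: reads etcd pod logs on stdin and prints, for every
# completed defragmentation:
#   <timestamp> <size change in bytes> <size after in bytes> <duration>
defrag_summary() {
  grep '"msg":"finished defragmenting directory"' |
  sed -E 's/.*"ts":"([^"]+)".*"current-db-size-bytes-diff":(-?[0-9]+).*"current-db-size-bytes":([0-9]+).*"took":"([^"]+)".*/\1 \2 \3 \4/'
}

# Hypothetical helper: reads etcd pod logs on stdin and prints, for every
# scheduled compaction:
#   <timestamp> <compacted revision>
# Consecutive timestamps should be about 5 minutes apart.
compaction_times() {
  grep '"msg":"finished scheduled compaction"' |
  sed -E 's/.*"ts":"([^"]+)".*"compact-revision":([0-9]+).*/\1 \2/'
}

# Usage:
#   oc logs -n openshift-etcd [pod_name] | defrag_summary
#   oc logs -n openshift-etcd [pod_name] | compaction_times
```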
    

