OpenShift GitOps creating a large amount of "OperationCompleted" events in RHOCP 4

Solution Verified - Updated -

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4

Issue

  • The following sequence of events is being repeated thousands of times in the GitOps project:

    3h19m       Normal    ResourceUpdated      application/<application_name>                                                   Updated sync status: OutOfSync -> Synced
    3h19m       Normal    ResourceUpdated      application/<application_name>                                                   Updated sync status: Synced -> OutOfSync
    3h19m       Normal    ResourceUpdated      application/<application_name>                                                   Updated sync status: Synced -> OutOfSync
    3h19m       Normal    OperationStarted     application/<application_name>                                                   Initiated automated sync to '486cd42e19fecf07146a538b3526823715272ccc'
    3h19m       Normal    OperationCompleted   application/<application_name>                                                   Partial sync operation to 486cd42e19fecf07146a538b3526823715272ccc succee
    
  • As a consequence, the etcd database grows and the load on the control plane (master) nodes increases. For details, see the "ETCD performance troubleshooting guide for OpenShift Container Platform", in particular the "ETCD cleanup" section, which states: "Any number of CRDs (secrets, deployments, etc..) above 8k could cause performance issues on storage with not enough IOPS".
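
To gauge how often the sequence repeats for each application, the event list can be grouped by object. The following is a minimal sketch, assuming the default `oc get event` column layout (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE). The helper name `count_completed_by_object` and the sample application names are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Count OperationCompleted events per object from `oc get event` output on stdin.
# Assumes the default column layout: LAST SEEN, TYPE, REASON, OBJECT, MESSAGE.
count_completed_by_object() {
    awk '$3 == "OperationCompleted" { count[$4]++ }
         END { for (obj in count) print count[obj], obj }' | sort -rn
}

# Demonstration on sample lines shaped like the events listed above
# (app-a and app-b are hypothetical application names).
count_completed_by_object <<'EOF'
3h19m   Normal   ResourceUpdated      application/app-a   Updated sync status: OutOfSync -> Synced
3h19m   Normal   OperationStarted     application/app-a   Initiated automated sync
3h19m   Normal   OperationCompleted   application/app-a   Partial sync operation succeeded
3h18m   Normal   OperationCompleted   application/app-a   Partial sync operation succeeded
3h18m   Normal   OperationCompleted   application/app-b   Partial sync operation succeeded
EOF

# Against a live cluster (requires a logged-in oc session):
#   oc get event -n openshift-gitops --no-headers | count_completed_by_object
```

Applications at the top of the output are the most likely candidates for a reconciliation loop and are a reasonable place to start the investigation.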

Resolution

  • This behavior was reported in GITOPS-4193. However, at least in the case analyzed, it was not caused by a bug. The root cause found in that cluster was the following:

    • External tools were automatically editing (reconciling) some objects on the cluster side. As a result, the following two steps repeated in an endless loop, with each iteration producing the events shown in the Issue section:
      1. Red Hat OpenShift GitOps edited the objects in the cluster to match their definitions in the Git repository.
      2. The external automation overwrote those changes again.
  • Contact Red Hat Support if you need help or if you suspect that the root cause of the problem in your cluster is different.

Diagnostic Steps

  • Count these events in the project by running the following command:

    oc get event -n openshift-gitops | grep -F 'OperationCompleted' | wc -l

  • Review the changes made to the affected objects in the OpenShift project and determine whether something is overwriting the changes applied by Red Hat OpenShift GitOps.
    • The changes can be inspected with the diff feature, either from the command line or from the graphical user interface.
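
One possible way to identify what is overwriting an object is to list the field managers recorded in its `metadata.managedFields`, newest first. This is only a sketch and rests on the assumption that the external tool writes through the Kubernetes API and therefore appears as a separate manager. The helper name `latest_field_managers` and the manager names in the sample data are hypothetical; `<kind>`, `<name>` and `<namespace>` are placeholders for the affected object.

```shell
#!/bin/sh
# Sort "manager<TAB>operation<TAB>time" lines from stdin, newest first.
# RFC 3339 UTC timestamps sort correctly as plain strings.
latest_field_managers() {
    sort -t "$(printf '\t')" -k3,3r
}

# Demonstration with hypothetical manager names and timestamps.
printf '%s\t%s\t%s\n' \
    'gitops-controller' 'Update' '2024-01-01T10:00:00Z' \
    'external-tool'     'Update' '2024-01-01T10:00:05Z' |
    latest_field_managers

# Against a live cluster (requires a logged-in oc session):
#   oc get <kind> <name> -n <namespace> --show-managed-fields \
#     -o jsonpath='{range .metadata.managedFields[*]}{.manager}{"\t"}{.operation}{"\t"}{.time}{"\n"}{end}' \
#     | latest_field_managers
#
# If the argocd CLI is available and logged in, the diff mentioned above
# can be shown with:
#   argocd app diff <application_name>
```

If a manager other than the GitOps controller keeps appearing at the top with a recent timestamp, that writer is a likely candidate for the automation that undoes the synchronized state.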

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
