OpenShift Container Platform 4 is failing to terminate or create pods and many Cluster Operators are failing to work as expected
Issue
- Pods are unable to terminate and creation of pods is failing
- Workloads that are already running keep working, but creating, stopping, or restarting pods, operators, and controllers fails, which impacts OpenShift functionality
- Suddenly, no pods can be created or terminated, and the OpenShift API returns many HTTP 429 status codes
- Pod creation and termination are failing, and "grpc: trying to send message larger than max (2169698338 vs. 2147483647)" messages are found in the kube-apiserver logs:

    I0821 23:03:24.662977 17 trace.go:205] Trace[419549052]: "List(recursive=true) etcd3" key:/secrets,resourceVersion:,resourceVersionMatch:, limit:10000,continue: (21-Aug-2023 23:03:23.767) (total time: 895 ms):
    Trace[419549052]: [895.681318ms] [895.681318ms] END
    W0821 23:03:24.662996 17 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (2169698338 vs. 2147483647)
    E0821 23:03:24.663007 17 cacher.go:425] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (2169698338 vs. 2147483647); reinitializing...
- Creating or removing pods on OpenShift is failing, and the following messages are found in the etcd logs:

    {"level":"warn","ts":"2023-08-17T23:03:24.662Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00357ae00/10.1.1.3:2379","attempt":0,"error":"rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (2169698338 vs. 2147483647)"}
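
The two numbers in the error are the size of the message being sent and the limit it exceeded, both in bytes. The limit, 2147483647, equals 2^31 - 1, the largest signed 32-bit integer. As a rough illustration only, the following Python sketch extracts both values from a log line and reports by how much the limit was exceeded. The function name parse_grpc_overflow and the regular expression are assumptions made for this example and are not part of any OpenShift or etcd tooling.

```python
import re

# Hypothetical pattern for the gRPC "message larger than max" error
# shown in the kube-apiserver and etcd log excerpts.
GRPC_SIZE_RE = re.compile(
    r"trying to send message larger than max \((\d+) vs\. (\d+)\)"
)


def parse_grpc_overflow(line):
    """Return (message_bytes, limit_bytes, excess_bytes), or None if no match."""
    match = GRPC_SIZE_RE.search(line)
    if match is None:
        return None
    message_bytes = int(match.group(1))
    limit_bytes = int(match.group(2))
    return message_bytes, limit_bytes, message_bytes - limit_bytes


# Error text copied from the log excerpts in this article.
sample = (
    "rpc error: code = ResourceExhausted desc = grpc: trying to send "
    "message larger than max (2169698338 vs. 2147483647)"
)

if __name__ == "__main__":
    size, limit, excess = parse_grpc_overflow(sample)
    print(f"message: {size} bytes ({size / 2**30:.2f} GiB)")
    print(f"limit:   {limit} bytes (equals 2**31 - 1: {limit == 2**31 - 1})")
    print(f"excess:  {excess} bytes ({excess / 2**20:.1f} MiB)")
```

In the kube-apiserver excerpt the failing request is a recursive list of the /secrets key, so the oversized message appears to be the combined list of Secret objects being returned in a single response.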
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4