After upgrade to v5.4.2, collector pods keep restarting with OOMKilled
Environment
- Red Hat OpenShift Service on AWS
- OpenShift Container Platform 4.9+
- Logging 5.4.2
Issue
- After upgrading to v5.4.2, collector pods keep restarting with OOMKilled if CloudWatch forwarding is configured.
- After upgrading to v5.4.2, collector pod memory usage spikes well above its previous level if CloudWatch forwarding is configured. A quick way to check whether a cluster forwards to CloudWatch follows this list.
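As a quick check for whether a cluster is affected, you can look for a CloudWatch output in the forwarder spec. This is a minimal sketch assuming the default ClusterLogForwarder instance name ("instance") in the openshift-logging namespace:

# Print any CloudWatch outputs defined in the ClusterLogForwarder
$ oc get clusterlogforwarder instance -n openshift-logging -o yaml | grep -B1 -A3 'type: cloudwatch'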
Resolution
- Upgrade the Logging subsystem to v5.4.3 or later to fix the issue. One way to check the installed version and approve the update is sketched below.
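A minimal sketch, assuming the Logging operator is managed by OLM in the openshift-logging namespace; resource names, channels, and the approval mode may differ in your cluster:

# Check the currently installed Logging version
$ oc get csv -n openshift-logging

# If the Subscription uses Manual approval, approve the pending
# InstallPlan that carries the 5.4.3+ update (name is cluster-specific)
$ oc get installplan -n openshift-logging
$ oc patch installplan <installplan-name> -n openshift-logging --type merge -p '{"spec":{"approved":true}}'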
Root Cause
- As of v5.4.2, the collector (fluentd) ran with chunk backup enabled: the "disable_chunk_backup" buffer option was no longer set, so fluentd writes chunks it cannot deliver to backup files. Handling the backup file I/O and the buffers around it drives up memory usage. Chunk backup is disabled again in v5.4.3, which fixes the issue.
- This typically affects environments with a CloudWatch output configured in ClusterLogForwarder: CloudWatch enforces a hard limit of 256 KB per log event, so every event above that limit is treated as unrecoverable and written to a backup chunk while chunk backup is enabled. A way to check whether "disable_chunk_backup" is set in a running collector is sketched after this list.
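A minimal sketch for inspecting the rendered fluentd configuration inside a running collector pod; it assumes the config is mounted under /etc/fluent/, the pods carry the component=collector label, and the container is named collector, any of which may vary by release:

# Pick any collector pod
$ oc get pods -n openshift-logging -l component=collector

# Look for disable_chunk_backup in the rendered fluentd config;
# no match means chunk backup is enabled and bad chunks are
# written out to the backup directory
$ oc exec -n openshift-logging <collector-pod> -c collector -- grep -r disable_chunk_backup /etc/fluent/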
Diagnostic Steps
- With a CloudWatch forwarding configuration, and while chunk backup is enabled, the collector pod logs the following messages for any log event larger than 256 KB. These messages indicate that oversized events are being written to chunk backup files, which is what drives the increased memory usage. One way to check for them is sketched after the log excerpt.
[warn]: got unrecoverable error in primary and no secondary error_class=Fluent::Plugin::CloudwatchLogsOutput::TooLargeEventError error="Log event in /dxpf/cont/rosa/audit.audit is discarded because it is too large: 12345 bytes exceeds limit of 262144"
:
[warn]: bad chunk is moved to /tmp/fluent/backup/worker0/object_xxxxx.log
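The following commands confirm the symptom; they assume collector pods labeled component=collector and a container named collector, both of which may vary by release:

# Check for restarts and OOMKilled terminations
$ oc get pods -n openshift-logging -l component=collector
$ oc describe pod <collector-pod> -n openshift-logging | grep -A3 'Last State'

# Search the collector logs for the warnings shown above
$ oc logs <collector-pod> -n openshift-logging -c collector | grep -E 'TooLargeEventError|bad chunk'

# Compare current memory usage against the pod's limits
$ oc adm top pods -n openshift-logging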