Limits on Red Hat OpenShift Container Platform (RHOCP) 4.x core services

Solution In Progress - Updated -

Environment

Red Hat OpenShift Container Platform (RHOCP) 4.x

Issue

  • Are there any recommendations to impose limits on OpenShift core services and infrastructure?
  • Why don't OpenShift core services have limits configured?

Resolution

Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

As defined by Kubernetes, Pods can declare their CPU and memory resource requirements in advance using requests and limits. Requests ensure that a minimum amount of resources is provided and influence scheduling, while limits prevent Pods from consuming resources excessively.
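As a minimal sketch of how requests and limits appear in a Pod specification, the Python snippet below builds a manifest as a dictionary and prints it as JSON. The Pod name, image, and resource values are hypothetical placeholders chosen for illustration; only the `resources.requests` and `resources.limits` fields come from the Kubernetes Pod API.

```python
import json

# Hypothetical user-workload Pod; the name, image, and values are illustrative only.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "registry.example.com/example-app:latest",
                "resources": {
                    # Requests: the minimum resources reserved for the container;
                    # the scheduler uses them to pick a node.
                    "requests": {"cpu": "250m", "memory": "256Mi"},
                    # Limits: the ceiling enforced on the container's consumption.
                    "limits": {"cpu": "500m", "memory": "512Mi"},
                },
            }
        ]
    },
}

if __name__ == "__main__":
    # Print the manifest as JSON for inspection.
    print(json.dumps(pod_manifest, indent=2))
```

This shape is appropriate for user workloads. As the rest of this article explains, cluster components are treated differently.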

Unlike with user workloads, setting limits for cluster components is problematic for several reasons:

  • Components cannot anticipate how they scale in usage in all customer environments, so setting one-size-fits-all limits is not possible.
  • Setting static limits prevents administrators from responding to changes in their clusters’ needs, such as by resizing control plane nodes to provide more resources.
  • We do not want cluster components to be restarted based on their resource consumption (for example, being killed due to an out-of-memory condition). We need to detect and handle those cases more gracefully, without degrading cluster performance.

Therefore, cluster components SHOULD NOT be configured with resource limits.
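One way to check whether a given Pod follows this convention is to inspect its containers for a `resources.limits` stanza. The following Python sketch assumes the Pod definition has already been loaded as a dictionary (for example, from the JSON output of `oc get pod <name> -n <namespace> -o json`). The function name and the sample Pod are hypothetical and are not part of any Red Hat tooling.

```python
from typing import Any, Dict, List


def containers_with_limits(pod: Dict[str, Any]) -> List[str]:
    """Return the names of containers in a Pod that declare resource limits."""
    spec = pod.get("spec") or {}
    # Look at both regular and init containers.
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    return [
        c.get("name", "<unnamed>")
        for c in containers
        if (c.get("resources") or {}).get("limits")
    ]


# Hypothetical sample: one container with requests only, one with a memory limit.
sample_pod = {
    "kind": "Pod",
    "metadata": {"name": "example-component"},
    "spec": {
        "containers": [
            {
                "name": "main",
                "resources": {"requests": {"cpu": "10m", "memory": "50Mi"}},
            },
            {
                "name": "sidecar",
                "resources": {
                    "requests": {"cpu": "10m", "memory": "20Mi"},
                    "limits": {"memory": "100Mi"},
                },
            },
        ]
    },
}

if __name__ == "__main__":
    # Lists the containers that would not follow the "no limits" convention.
    print(containers_with_limits(sample_pod))
```

The check is read-only: it only reports which containers declare limits and does not modify any cluster resource.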

There is currently one exception: Prometheus, part of the OpenShift Monitoring stack, can have its limits configured. See the KB article "How to set up the resource limit and request for Prometheus in OpenShift ?" for additional details.

The engineering team provides additional explanation in the OpenShift Conventions document.

Customers and partners must be aware that RHOCP clusters must be properly sized according to their own business and application workload requirements, keeping in mind that control plane nodes are responsible for running the Red Hat OpenShift core services (cluster operators), which form a crucial and sensitive stack for the cluster.

Take the following documentation into consideration during your capacity planning study:

  • Recommended host practices
  • Other specific application storage recommendations
  • Backend Performance Requirements for OpenShift etcd
  • ETCD performance troubleshooting guide for OpenShift Container Platform

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
