Monitoring your OpenShift cluster health with Insights Advisor

OpenShift Cluster Manager 2023

Using Insights Advisor to monitor your OpenShift cluster health

Red Hat Customer Content Services

Chapter 1. About Red Hat Insights Advisor for OpenShift Container Platform

Use Red Hat Insights Advisor for OpenShift Container Platform to identify and solve issues with your clusters.

1.1. About Red Hat Insights Advisor for OpenShift Container Platform

You can use Insights Advisor to assess and monitor the health of your OpenShift Container Platform clusters. Whether you are concerned about individual clusters, or with your whole infrastructure, it is important to be aware of your exposure to issues that can affect service availability, fault tolerance, performance, or security.

Insights repeatedly analyzes the data that Insights Operator sends, comparing it against a database of recommendations, which are sets of conditions that can leave your OpenShift Container Platform clusters at risk. Your data is then uploaded to the Insights Advisor service on Red Hat Hybrid Cloud Console, where you can perform the following actions:

  • See clusters impacted by a specific recommendation.
  • Use robust filtering capabilities to narrow your results to the recommendations that interest you.
  • Learn more about individual recommendations, details about the risks they present, and get resolutions tailored to your individual clusters.
  • Share results with other stakeholders.

To use Insights Advisor, your cluster must be registered to OpenShift Cluster Manager. To register a disconnected cluster, see Registering OpenShift Container Platform clusters to OpenShift Cluster Manager.

1.2. Understanding Insights Advisor recommendations

Insights Advisor bundles information about various cluster states and component configurations that can negatively affect the service availability, fault tolerance, performance, or security of your clusters. This information set is called a recommendation in Insights Advisor and includes the following information:

  • Name: A concise description of the recommendation
  • Added: When the recommendation was published to the Insights Advisor archive
  • Category: Whether the issue has the potential to negatively affect service availability, fault tolerance, performance, or security
  • Total risk: A value derived from the likelihood that the condition will negatively affect your infrastructure, and the impact on operations if that were to happen
  • Clusters: A list of clusters on which a recommendation is detected
  • Link to associated topics: More information from Red Hat about the issue
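As a rough illustration of how these fields fit together, the following Python sketch models a recommendation as a plain data record and looks up the clusters that a given recommendation affects. The class, field names, and sample values are hypothetical; they do not reflect the Insights Advisor API or its storage format.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Recommendation:
    """Hypothetical record mirroring the fields listed above."""
    name: str            # concise description of the recommendation
    added: date          # when it was published to the archive
    category: str        # service availability, fault tolerance, performance, or security
    total_risk: str      # for example: low, moderate, important, or critical
    clusters: list[str] = field(default_factory=list)  # clusters where it is detected
    more_info: str = ""  # link to associated topics, if any


def impacted_clusters(recs, name):
    """Return the clusters on which the named recommendation is detected."""
    for rec in recs:
        if rec.name == name:
            return list(rec.clusters)
    return []


# Sample data, invented for illustration only.
recs = [
    Recommendation("Example: node under memory pressure", date(2023, 1, 10),
                   "performance", "moderate", ["cluster-a", "cluster-b"]),
    Recommendation("Example: deprecated API in use", date(2023, 3, 2),
                   "service availability", "important", ["cluster-b"]),
]

print(impacted_clusters(recs, "Example: deprecated API in use"))
```

The lookup returns an empty list for a recommendation that is not detected on any cluster.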

Chapter 2. Using Red Hat Insights Advisor for OpenShift Container Platform

Insights Advisor repeatedly analyzes the data Insights Operator sends. You can view and manage reports showing Insights Advisor data for your OpenShift Container Platform cluster from the Insights Advisor service on Red Hat Hybrid Cloud Console.

2.1. Displaying potential issues with your cluster

This section describes how to display the Insights report in Insights Advisor on Red Hat Hybrid Cloud Console.

Note that Insights repeatedly analyzes your cluster and shows the latest results. These results can change, for example, if you fix an issue or a new issue has been detected.

Prerequisites

Procedure

  1. Navigate to Advisor > Recommendations on Red Hat Hybrid Cloud Console.

    Depending on the result, Insights Advisor displays one of the following:

    • No matching recommendations found, if Insights did not identify any issues.
    • A list of issues Insights has detected, grouped by risk (low, moderate, important, and critical).
    • No clusters yet, if Insights has not yet analyzed the cluster. The analysis starts shortly after the cluster has been installed, registered, and connected to the internet.
  2. If any issues are displayed, click the > icon in front of the entry for more details.

    Depending on the issue, the details can also contain a link to more information from Red Hat about the issue.
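The grouping by risk described in this procedure can be sketched in a few lines of Python. This is only an illustration with invented issue names; it does not call any Red Hat service and is not how Insights Advisor is implemented.

```python
from collections import defaultdict

# Risk levels named in this section, ordered from most to least severe.
RISK_ORDER = ["critical", "important", "moderate", "low"]


def group_by_risk(issues):
    """Group (name, risk) pairs by risk level, most severe first."""
    groups = defaultdict(list)
    for name, risk in issues:
        groups[risk].append(name)
    # Keep only the known levels that actually occur, in severity order.
    return {risk: groups[risk] for risk in RISK_ORDER if risk in groups}


def summary(issues):
    """Return a one-line summary similar in spirit to the report."""
    if not issues:
        return "No matching recommendations found"
    grouped = group_by_risk(issues)
    return ", ".join(f"{risk}: {len(names)}" for risk, names in grouped.items())


# Invented sample issues.
issues = [("issue-1", "low"), ("issue-2", "critical"), ("issue-3", "low")]
print(summary(issues))  # critical: 1, low: 2
```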

2.2. Displaying all Insights Advisor recommendations

The Recommendations view, by default, only displays the recommendations that are detected on your clusters. However, you can view all of the recommendations in the Insights Advisor archive.

Prerequisites

Procedure

  1. Navigate to Advisor > Recommendations on Red Hat Hybrid Cloud Console.
  2. Click the X icons next to the Clusters Impacted and Status filters.

    You can now browse through all of the potential recommendations for your cluster.

2.3. Disabling Insights Advisor recommendations

You can disable specific recommendations that affect your clusters, so that they no longer appear in your reports. It is possible to disable a recommendation for a single cluster or all of your clusters.

Note

Disabling a recommendation for all of your clusters also applies to any future clusters.

Prerequisites

Procedure

  1. Navigate to Advisor > Recommendations on Red Hat Hybrid Cloud Console.
  2. To disable the recommendation for a single cluster:

    1. Click the name of the recommendation to disable. You are directed to the single recommendation page.
    2. Click the Options menu (more options icon) for that cluster, and then click Disable recommendation for cluster.
    3. Enter a justification note and click Save.
  3. To disable the recommendation for all of your clusters:

    1. Click the name of the recommendation to disable. You are directed to the single recommendation page.
    2. Click Actions > Disable recommendation.
    3. Enter a justification note and click Save.
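To make the difference between the two scopes concrete, the following Python sketch models them with two small lookup tables. The class and method names are hypothetical and the model is a simplification; it only shows why a recommendation that is disabled for all clusters also stays hidden on clusters that are added later.

```python
class DisabledRecommendations:
    """Hypothetical model of the two disable scopes described above."""

    def __init__(self):
        self.for_all_clusters = {}  # recommendation -> justification note
        self.per_cluster = {}       # (recommendation, cluster) -> justification note

    def disable_for_cluster(self, rec, cluster, note):
        self.per_cluster[(rec, cluster)] = note

    def disable_for_all(self, rec, note):
        self.for_all_clusters[rec] = note

    def is_reported(self, rec, cluster):
        """True if the recommendation would still appear for this cluster."""
        if rec in self.for_all_clusters:
            return False  # applies to every cluster, including future ones
        return (rec, cluster) not in self.per_cluster


disabled = DisabledRecommendations()
disabled.disable_for_cluster("rec-a", "cluster-1", "Accepted risk on test cluster")
disabled.disable_for_all("rec-b", "Not relevant to our environment")

print(disabled.is_reported("rec-a", "cluster-1"))           # False
print(disabled.is_reported("rec-a", "cluster-2"))           # True
print(disabled.is_reported("rec-b", "cluster-added-later")) # False
```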

2.4. Enabling a previously disabled Insights Advisor recommendation

When a recommendation is disabled for all clusters, it no longer appears in Insights Advisor. You can enable the recommendation again so that it reappears in your reports.

Prerequisites

Procedure

  1. Navigate to Advisor > Recommendations on Red Hat Hybrid Cloud Console.
  2. Filter the recommendations by Status > Disabled.
  3. Locate the recommendation to enable.
  4. Click the Options menu (more options icon), and then click Enable recommendation.

2.5. Displaying the Insights Advisor status in the web console

Insights Advisor repeatedly analyzes your cluster, and you can display the status of the potential issues it identifies in the OpenShift Container Platform web console. This status shows the number of issues in the different categories and links to the reports in Red Hat Hybrid Cloud Console for further details.

Prerequisites

  • Your cluster is registered with OpenShift Cluster Manager.
  • Remote health reporting is enabled, which is the default.
  • You are logged in to the OpenShift Container Platform web console.

Procedure

  1. Navigate to Home > Overview in the OpenShift Container Platform web console.
  2. Click Insights on the Status card.

    The pop-up window lists potential issues grouped by risk. Click the individual categories or View all recommendations in Insights Advisor to display more details.
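If the Status card shows no Insights data, it can help to confirm that the Insights Operator is reporting. As a minimal sketch, the following Python functions summarize the status conditions of the `insights` ClusterOperator, for example from the output of `oc get clusteroperator insights -o json`. The sample document is invented and trimmed, and the interpretation that a `Disabled` condition with status `True` means reporting is turned off is an assumption to verify against your cluster version.

```python
import json


def condition_statuses(cluster_operator):
    """Map each status condition type to its status string."""
    conditions = cluster_operator.get("status", {}).get("conditions", [])
    return {c["type"]: c["status"] for c in conditions}


def reporting_looks_enabled(cluster_operator):
    """Assumption: reporting is off when a Disabled condition is True."""
    statuses = condition_statuses(cluster_operator)
    return statuses.get("Available") == "True" and statuses.get("Disabled") != "True"


# Invented, trimmed sample of a ClusterOperator document.
sample = json.loads("""
{
  "kind": "ClusterOperator",
  "metadata": {"name": "insights"},
  "status": {"conditions": [
    {"type": "Available", "status": "True"},
    {"type": "Degraded", "status": "False"},
    {"type": "Disabled", "status": "False"}
  ]}
}
""")

print(condition_statuses(sample))
print(reporting_looks_enabled(sample))
```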

2.6. Using update-risk assessment to identify and mitigate cluster-update risks

The Red Hat Insights advisor service assesses the risk of cluster-update failure. Update-risk assessment uses machine learning developed in collaboration with IBM Research to compare the recent state of the cluster with conditions known to cause updates to fail.

Managing updates in complex, production Kubernetes environments is a challenging task. More than 60 independently working components usually form the infrastructure of such environments. Each component has a different operational state and configuration, which can cause minor and major version updates to fail.

Update-risk assessment shows you a list of risks present in your cluster, including failing operator conditions, alerts, and other metrics. The assessment also provides links to specific information about each issue. You can use the update-risk feature to generate a checklist of issues to fix before beginning a cluster update.

Prerequisites

Procedure

  1. Navigate to https://console.redhat.com/openshift/insights/advisor/clusters.
  2. Click on a cluster to view cluster details.

    The cluster details page opens and shows one of the following banners:

    • If update risks exist for the selected cluster, a “Resolve update risks” banner is visible.
    • If no risks exist for the cluster, a banner displays the message, "No known update risks identified for this cluster."
    • If the cluster has not checked in for more than two hours, the banner message says "Warning alert: Update risks are not currently available. This cluster has gone more than two hours without sending metrics. Check the cluster’s web console if you think that this is incorrect."
  3. Click the Update risks tab.
  4. If update risks are detected, view alerts or cluster operator risks for the cluster.
  5. Click an alert to open the in-cluster Alert details page for that alert in the Red Hat OpenShift web console.
  6. Click a cluster operator to open the in-cluster ClusterOperator details page in the Red Hat OpenShift web console.

For more information about alerts, see OpenShift documentation, Getting information about alerts, silences, and alerting rules.
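The three banner cases in the procedure above, and the idea of turning detected risks into a pre-update checklist, can be sketched as follows. This is an illustrative Python model only: the function names and the sample alert and operator names are hypothetical, and the code does not query the Insights Advisor service.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=2)  # threshold stated in the procedure above


def update_risk_banner(last_checkin, risks, now):
    """Pick the banner text for a cluster, following the three cases above."""
    if now - last_checkin > STALE_AFTER:
        return "Update risks are not currently available."
    if risks:
        return "Resolve update risks"
    return "No known update risks identified for this cluster."


def checklist(alerts, operator_conditions):
    """Build a pre-update checklist from alerts and failing operator conditions."""
    items = [f"Resolve alert: {a}" for a in alerts]
    items += [f"Fix cluster operator: {o}" for o in operator_conditions]
    return items


# Invented sample values.
now = datetime(2023, 6, 1, 12, 0)
alerts = ["ExampleAlertFiring"]
operators = ["example-operator is Degraded"]

print(update_risk_banner(now - timedelta(minutes=30), alerts + operators, now))
for item in checklist(alerts, operators):
    print("-", item)
```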

2.7. Using the Deployment Validation Operator in your Red Hat Insights for OpenShift workflow

The Deployment Validation Operator (DVO) validates on-premises and managed clusters against a curated collection of KubeLinter checks. These checks implement best practices for Kubernetes-native workloads, helping to ensure that applications are optimized for the operational stability of the cluster.

The Insights Operator gathers the DVO check results every two hours by default and presents the data in the Insights Advisor service on Red Hat Hybrid Cloud Console. If the DVO detects issues, cluster administrators can view resolutions in the Insights Advisor service. If the DVO detects no issues, no results are visible in the Insights Advisor service.

Curated KubeLinter checks

The Deployment Validation Operator (DVO) checks a curated, limited collection of all of the available KubeLinter checks. The DVO does not execute the whole list of available KubeLinter checks. Insights Advisor service recommendations do not exist for all available KubeLinter checks.

Supported OpenShift Container Platform versions

All Red Hat-supported versions of OpenShift Container Platform, OpenShift Dedicated, and Red Hat OpenShift Service on AWS support the Deployment Validation Operator (DVO).

Important

Managed OpenShift clusters have the Deployment Validation Operator (DVO) installed and operational by default. Administrators of on-premises clusters must install the DVO from OperatorHub and can modify the default list of checks.

2.7.1. The Deployment Validation Operator (DVO) on managed OpenShift clusters

The Deployment Validation Operator (DVO) is already installed and operational on managed OpenShift clusters. This includes clusters on OpenShift Dedicated and Red Hat OpenShift Service on AWS.

DVO Configuration

The DVO for managed clusters comes preconfigured, by default. The DVO configuration file contains the default, curated set of KubeLinter checks and is not editable.

DVO Updates

On managed clusters, the DVO updates automatically.

2.7.2. The Deployment Validation Operator (DVO) on on-premises OpenShift clusters

Administrators of on-premises clusters must install the Deployment Validation Operator (DVO) from OperatorHub in the OpenShift web console. On-premises cluster administrators can also configure the default set of KubeLinter checks.

Note

The Insights Advisor service does not have recommendations for all of the checks that KubeLinter has available.

2.7.2.1. Installing the Deployment Validation Operator (DVO) on on-premises OpenShift clusters

You can find the DVO in OperatorHub and install it from there.

Prerequisites

  • You are logged in to the Red Hat OpenShift web console as a cluster administrator.

Procedure

  1. Navigate to Red Hat OpenShift web console > Operators > OperatorHub.
  2. In the Search box, start typing "deployment-validation-operator".
  3. Click the DVO card when it appears.
  4. When the pop-up window appears, click Continue to proceed.
  5. The Deployment Validation Operator card displays information about capabilities, configuration, version, and GitHub source files. When you are ready to install the DVO, click Install.
  6. Choose the namespace or use the default.
  7. Click Install and the Operator installs. You can confirm the installation in the Installed Operators view. The DVO is also visible in the Pods and Deployments views for the cluster in the corresponding namespace.
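For administrators who prefer to script the installation, the web console steps above roughly correspond to creating an Operator Lifecycle Manager Subscription. The following Python sketch builds such a manifest as a dictionary and prints it as JSON. The package name is taken from the search term above; the channel, catalog source, and namespace values are placeholders that you must replace with the values shown on the Operator card in your cluster. A scripted installation typically also requires an OperatorGroup in the target namespace, which is not shown.

```python
import json


def dvo_subscription(namespace, channel, source,
                     source_namespace="openshift-marketplace",
                     automatic_updates=True):
    """Build an OLM Subscription manifest for the DVO as a plain dict."""
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": "deployment-validation-operator", "namespace": namespace},
        "spec": {
            "name": "deployment-validation-operator",  # package name searched for above
            "channel": channel,                        # placeholder, check the Operator card
            "source": source,                          # placeholder catalog source
            "sourceNamespace": source_namespace,
            # Automatic approval lets the Operator update without manual review.
            "installPlanApproval": "Automatic" if automatic_updates else "Manual",
        },
    }


# Placeholder values, not verified against any specific catalog.
manifest = dvo_subscription(namespace="deployment-validation-operator",
                            channel="alpha", source="community-operators")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and reviewed before being applied with a command such as `oc apply -f`.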

2.7.2.2. Configuring the list of Deployment Validation Operator (DVO) checks on on-premises clusters

Administrators of on-premises OpenShift clusters can change the default list of DVO checks to focus on specific best practices of interest. Refer to the section, Configuring Checks, in the DVO technical documentation in GitHub.
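As a simplified illustration of how a customized check list resolves, the following Python sketch models the `checks` options of a KubeLinter-style configuration (`addAllBuiltIn`, `doNotAutoAddDefaults`, `include`, and `exclude`). The default and built-in lists here are small invented samples, not the DVO defaults, and the way the DVO loads its configuration is described in its GitHub documentation and is not modeled here. Treat the precedence shown (exclusions applied last) as an assumption to confirm there.

```python
def enabled_checks(config, defaults, all_builtin):
    """Resolve the effective set of checks from a KubeLinter-style config dict."""
    checks = config.get("checks", {})
    if checks.get("addAllBuiltIn"):
        enabled = set(all_builtin)   # start from every built-in check
    elif checks.get("doNotAutoAddDefaults"):
        enabled = set()              # start from nothing
    else:
        enabled = set(defaults)      # start from the default checks
    enabled |= set(checks.get("include", []))
    enabled -= set(checks.get("exclude", []))  # exclusions are applied last
    return sorted(enabled)


# Small sample lists for illustration; not the DVO default list.
defaults = ["unset-cpu-requirements", "unset-memory-requirements"]
all_builtin = defaults + ["minimum-three-replicas", "no-anti-affinity"]

config = {"checks": {"include": ["minimum-three-replicas"],
                     "exclude": ["unset-cpu-requirements"]}}

print(enabled_checks(config, defaults, all_builtin))
```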

2.7.2.3. Updating the Deployment Validation Operator (DVO) on on-premises clusters

During the installation from OperatorHub, you can set the DVO to update automatically.

2.7.3. Viewing Deployment Validation Operator (DVO) results in the Insights Advisor service

If the DVO detects issues, the Overview page for the cluster in the OpenShift web console shows an Insights link with the number of detected issues in the Status information block. Click the link to learn more about the issue and how to resolve it in the Insights Advisor service in the Red Hat Hybrid Cloud Console.

Note

If the latest DVO check results contain no issues, you will not see any DVO issues in Insights.

Prerequisites

  • You are logged in to the Red Hat Hybrid Cloud Console.

Procedure

  1. Navigate to the Overview page for the cluster in the OpenShift web console.
  2. Look for the Status block, an Insights link, and the number of detected issues.
  3. If one or more issues exist, click the Insights link to open the Insights Advisor service in the Hybrid Cloud Console.

    The link takes you to Insights > Advisor > Clusters, to the cluster information page.

  4. In the Recommendations tab, you can see the issues detected on the cluster. Click the arrow to view complete information about the issue, including the necessary actions to resolve it.

2.7.4. Viewing the default list of Deployment Validation Operator (DVO) checks

The Deployment Validation Operator (DVO) checks your cluster against a default list of checks. You can view the list of DVO checks in GitHub.

Important

Administrators of managed clusters cannot modify the list of default checks, but can view the list in GitHub.

Legal Notice

Copyright © 2023 Red Hat, Inc.
The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, the Red Hat logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.