Chapter 23. Monitoring and observability

This chapter describes several ways to monitor your Red Hat Virtualization system and to obtain metrics and logs from it. These methods include:

  • Using Data Warehouse and Grafana to monitor RHV
  • Sending metrics to a remote instance of Elasticsearch
  • Deploying Insights in Red Hat Virtualization Manager

23.1. Using Data Warehouse and Grafana to monitor RHV

23.1.1. Grafana

Grafana is a web-based UI tool that displays reports based on data collected in the oVirt Data Warehouse PostgreSQL database, which is named ovirt_engine_history. For the available reports, see Grafana dashboards.
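To see what Grafana has to work with, you can list the views in the ovirt_engine_history database. The following is a minimal sketch, not a supported procedure. It assumes that PostgreSQL runs locally on the machine that hosts the Data Warehouse database, that you are logged in as root, and that the reporting views are in the public schema. The function names dwh_list_views_sql and dwh_list_views are illustrative only.

```shell
# Print a query that lists the views in the "public" schema.
# Assumption: the Data Warehouse reporting views live in that schema.
dwh_list_views_sql() {
    printf '%s\n' "SELECT table_name FROM information_schema.views WHERE table_schema = 'public' ORDER BY table_name;"
}

# Run the query as the postgres user and print one view name per line.
# The database name defaults to ovirt_engine_history.
dwh_list_views() {
    db="${1:-ovirt_engine_history}"
    dwh_list_views_sql | su - postgres -c "psql -d '$db' -At"
}
```

After defining the functions, run dwh_list_views on the Data Warehouse machine. The query only reads catalog metadata and does not modify the database.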

Data from the Manager is aggregated hourly and daily. The data is retained according to the scale setting (Basic or Full) defined in the Data Warehouse configuration during engine-setup:

  • Basic (default) - sample data saved for 24 hours, hourly data saved for 1 month, no daily aggregations saved.
  • Full (recommended) - sample data saved for 24 hours, hourly data saved for 2 months, daily aggregations saved for 5 years.

Full sample scaling may require migrating the Data Warehouse to a separate virtual machine.

Note

Red Hat supports only installations in which the Data Warehouse database, the Data Warehouse service, and Grafana all run on the same machine, even though it is technically possible to install each component on a separate machine.

23.1.2. Configuring Grafana

Installation

For standalone Manager installations, Grafana integration is installed and enabled by default when you run engine-setup. For self-hosted engine installations, you must enable Grafana integration manually when you run engine-setup.

To enable Grafana integration in the self-hosted engine:

  1. Put the environment in global maintenance mode:

    # hosted-engine --set-maintenance --mode=global
  2. Log in to the machine where you want to install Grafana.
  3. Run the engine-setup command as follows:

    # engine-setup --reconfigure-optional-components
  4. Answer Yes to install Grafana on this machine:

    Configure Grafana on this host (Yes, No) [Yes]:
  5. Disable global maintenance mode:

    # hosted-engine --set-maintenance --mode=none
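Before running engine-setup in step 3, it can help to confirm that step 1 took effect. The following is a minimal sketch that inspects the output of hosted-engine --vm-status on a self-hosted engine host. It assumes that the status output contains the phrase "GLOBAL MAINTENANCE" while global maintenance mode is active. The function names are illustrative only.

```shell
# Succeed if the status text on standard input reports global maintenance.
# Assumption: the status output contains the phrase "GLOBAL MAINTENANCE".
is_global_maintenance() {
    grep -q 'GLOBAL MAINTENANCE'
}

# Report whether the environment is in global maintenance mode.
# Run this on a self-hosted engine host, not on the Manager virtual machine.
require_global_maintenance() {
    if hosted-engine --vm-status | is_global_maintenance; then
        echo "Global maintenance is active."
    else
        echo "Global maintenance is not active." >&2
        return 1
    fi
}
```

Call require_global_maintenance after step 1 and continue only if it succeeds. After step 5, the same call is expected to fail, which indicates that global maintenance mode has been disabled.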

To access the Grafana dashboards:

  • Go to https://<engine FQDN or IP address>/ovirt-engine-grafana

or

  • Click Monitoring Portal in the web administration welcome page for the Administration Portal.
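To check from a command line that the Grafana URL responds, you can use a sketch like the following. It assumes that curl is installed. The function names and the host name engine.example.com are placeholders, and the -k option skips certificate verification, so remove it if your client trusts the Manager CA.

```shell
# Build the Grafana URL for a given Manager FQDN or IP address.
grafana_url() {
    printf 'https://%s/ovirt-engine-grafana\n' "$1"
}

# Print the HTTP status code returned for the Grafana URL.
# -k skips certificate verification; drop it if the Manager CA is trusted.
grafana_status() {
    curl -k -s -o /dev/null -w '%{http_code}\n' "$(grafana_url "$1")"
}
```

For example, grafana_status engine.example.com prints the HTTP status code of the response. A redirect code is plausible here, because Grafana may forward unauthenticated requests to a login page.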

Configuring Grafana for Single Sign-on

When you run engine-setup on the Manager, it automatically configures Grafana so that existing Manager users can log in with SSO from the Administration Portal, but it does not create the corresponding Grafana users. You must invite each new user (Invite in the Grafana UI) and confirm the new account before that user can log in.

  1. Set an email address for the user in the Manager, if it is not already defined.
  2. Log in to Grafana with an existing admin user (the initially configured admin).
  3. Go to Configuration → Users and select Invite.
  4. Input the email address and name, and select a Role.
  5. Send the invitation using one of these options:

    • Select Send invite mail and click Submit. For this option, you need an operational local mail server configured on the Grafana machine.

      or

    • Select Pending Invites

      • Locate the entry you want
      • Select Copy invite
      • Copy and use this link to create the account by pasting it directly into a browser address bar, or by sending it to another user.

If you use the Pending Invites option, no email is sent, and the email address does not need to exist. Any valid-looking address works, as long as it is configured as the email address of a Manager user.

To log in with this account:

  1. Log in to the Red Hat Virtualization web administration welcome page using the account that has this email address.
  2. Select Monitoring Portal to open the Grafana dashboard.
  3. Select Sign in with oVirt Engine Auth.

23.1.3. Grafana dashboards

Built-in Grafana dashboards

The following dashboards are available in the initial Grafana setup to report Data Center, Cluster, Host, and Virtual Machine data:

Table 23.1. Built-in Grafana dashboards

Dashboard type | Content

Executive dashboards

  • Cluster dashboard - resource usage, peaks, over-commit, and up-time for hosts and virtual machines in a selected cluster, according to the latest configurations.
  • Data Center dashboard - resource usage, peaks, and up-time for clusters, hosts, and storage domains in a selected data center, according to the latest configurations.
  • Executive dashboard - user resource usage and number of operating systems for hosts and virtual machines in selected clusters over a selected period.
  • Host dashboard - latest and historical configuration details and resource usage metrics of a selected host over a selected period.
  • System dashboard - resource usage and up-time for hosts and storage domains in the system, according to the latest configurations.
  • Virtual Machine dashboard - latest and historical configuration details and resource usage metrics of a selected virtual machine over a selected period.

Trend dashboards

  • Hosts Resource Usage dashboard - daily and hourly resource usage (number of virtual machines, CPU, memory, network Tx/Rx) for selected hosts in a selected period.
  • Hosts Trend dashboard - resource usage (number of virtual machines, CPU, memory, and network Tx/Rx) for selected hosts over a selected period.
  • Trend dashboard - usage rates for the 5 most and least utilized virtual machines and hosts by memory and by CPU in selected clusters over a selected period.
  • Virtual Machines Resource Usage dashboard - daily and hourly resource usage (CPU, memory, network Tx/Rx, disk I/O) for selected virtual machines in a selected period.
  • Virtual Machines Trend dashboard - resource usage (CPU, memory, network Tx/Rx, disk I/O) for selected virtual machines over a selected period.

Service Level dashboards

  • Cluster Quality of Service

    • Hosts dashboard - the time selected hosts have performed above and below the CPU and memory threshold in a selected period.
    • Virtual Machines dashboard - the time selected virtual machines have performed above and below the CPU and memory threshold in a selected period.
  • Hosts Uptime dashboard - the uptime, planned downtime, and unplanned downtime for selected hosts in a selected period.
  • Uptime dashboard - planned downtime, unplanned downtime, and total time for the hosts, high availability virtual machines, and all virtual machines in selected clusters in a selected period.
  • Virtual Machines Uptime dashboard - the uptime, planned downtime, and unplanned downtime for selected virtual machines in a selected period.

Inventory dashboards

  • Hosts Inventory dashboard - FQDN, VDSM version, operating system, CPU model, CPU cores, memory size, create date, delete date, and hardware details for selected hosts, according to the latest configurations.
  • Inventory dashboard - number of hosts, virtual machines, and running virtual machines, plus resource usage and over-commit rates for selected data centers, according to the latest configurations.
  • Storage Domains Inventory dashboard - domain type, storage type, available disk size, used disk size, total disk size, creation date, and delete date for selected storage domains over a selected period.
  • Virtual Machines Inventory dashboard - template name, operating system, CPU cores, memory size, create date, and delete date for selected virtual machines, according to the latest configurations.

Customized Grafana dashboards

You can create customized dashboards or copy and modify existing dashboards according to your reporting needs.

Note

Built-in dashboards cannot be customized.