Chapter 8. Management of monitoring stack using the Ceph Orchestrator

As a storage administrator, you can use the Ceph Orchestrator with Cephadm in the backend to deploy monitoring and alerting stack. The monitoring stack consists of Prometheus, Prometheus exporters, Prometheus Alertmanager, and Grafana. Users need to either define these services with Cephadm in a YAML configuration file, or they can use the command line interface to deploy them. When multiple services of the same type are deployed, a highly-available setup is deployed. The node exporter is an exception to this rule.

Note

Red Hat Ceph Storage 5.0 does not support custom images for deploying monitoring services such as Prometheus, Grafana, Alertmanager, and node-exporter.

The following monitoring services can be deployed with Cephadm:

  • Prometheus is the monitoring and alerting toolkit. It collects the data provided by Prometheus exporters and fires preconfigured alerts if predefined thresholds have been reached. The Prometheus manager module provides a Prometheus exporter to pass on Ceph performance counters from the collection point in ceph-mgr.

    The Prometheus configuration, including scrape targets,such as metrics providing daemons, is set up automatically by Cephadm. Cephadm also deploys a list of default alerts, for example, health error, 10% OSDs down, or pgs inactive.

  • Alertmanager handles alerts sent by the Prometheus server. It deduplicates, groups, and routes the alerts to the correct receiver. By default, the Ceph dashboard is automatically configured as the receiver. The Alertmanager handles alerts sent by the Prometheus server. Alerts can be silenced using the Alertmanager, but silences can also be managed using the Ceph Dashboard.
  • Grafana is the visualization and alerting software. The alerting functionality of Grafana is not used by this monitoring stack. For alerting, the Alertmanager is used.

    By default, traffic to Grafana is encrypted with TLS. You can either supply your own TLS certificate or use a self-signed one. If no custom certificate has been configured before Grafana has been deployed, then a self-signed certificate is automatically created and configured for Grafana. Custom certificates for Grafana can be configured using the following commands:

    Syntax

     ceph config-key set mgr/cephadm/grafana_key -i PRESENT_WORKING_DIRECTORY/key.pem
     ceph config-key set mgr/cephadm/grafana_crt -i PRESENT_WORKING_DIRECTORY/certificate.pem

Node exporter is an exporter for Prometheus which provides data about the node on which it is installed. It is recommended to install the node exporter on all nodes. This can be done using the `monitoring.yml file with the node-exporter service type.

8.1. Deploying the monitoring stack using the Ceph Orchestrator

The monitoring stack consists of Prometheus, Prometheus exporters, Prometheus Alertmanager, and Grafana. Ceph Dashboard makes use of these components to store and visualize detailed metrics on cluster usage and performance.

You can deploy the monitoring stack using the service specification in YAML file format. All the monitoring services can have the network and port they bind to configured in the yml file.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the nodes.

Procedure

  1. Enable the prometheus module in the Ceph Manager daemon. This exposes the internal Ceph metrics so that Prometheus can read them:

    Example

    [ceph: root@host01 /]# ceph mgr module enable prometheus

    Important

    Ensure this command is run before Prometheus is deployed. If the command was not run before the deployment, you must redeploy Prometheus to update the configuration:

    ceph orch redeploy prometheus
  2. Navigate to the following directory:

    Syntax

    cd /var/lib/ceph/DAEMON_PATH/

    Example

    [ceph: root@host01 mds/]# cd /var/lib/ceph/monitoring/

    Note

    If the directory monitoring does not exist, create it.

  3. Create the monitoring.yml file:

    Example

    [ceph: root@host01 monitoring]# touch monitoring.yml

  4. Edit the specification file with a content similar to the following example:

    Example

    service_type: prometheus
    service_name: prometheus
    placement:
      hosts:
      - host01
    networks:
    - 192.169.142.0/24
    ---
    service_type: node-exporter
    ---
    service_type: alertmanager
    service_name: alertmanager
    placement:
      hosts:
      - host03
    networks:
    - 192.169.142.03/24
    ---
    service_type: grafana
    service_name: grafana
    placement:
      hosts:
      - host02
    networks:
    - 192.169.142.02/24

  5. Apply monitoring services:

    Example

    [ceph: root@host01 monitoring]# ceph orch apply -i monitoring.yml

Verification

  • List the service:

    Example

    [ceph: root@host01 /]# ceph orch ls

  • List the hosts, daemons, and processes:

    Syntax

    ceph orch ps --service_name=SERVICE_NAME

    Example

    [ceph: root@host01 /]# ceph orch ps --service_name=prometheus

Important

Prometheus, Grafana, and the Ceph dashboard are all automatically configured to talk to each other, resulting in a fully functional Grafana integration in the Ceph dashboard.

8.2. Removing the monitoring stack using the Ceph Orchestrator

You can remove the monitoring stack using the ceph orch rm command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.

Procedure

  1. Log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  2. Use the ceph orch rm command to remove the monitoring stack:

    Syntax

    ceph orch rm SERVICE_NAME --force

    Example

    [ceph: root@host01 /]# ceph orch rm grafana
    [ceph: root@host01 /]# ceph orch rm prometheus
    [ceph: root@host01 /]# ceph orch rm node-exporter
    [ceph: root@host01 /]# ceph orch rm alertmanager
    [ceph: root@host01 /]# ceph mgr module disable prometheus

  3. Check the status of the process:

    Example

    [ceph: root@host01 /]# ceph orch status

Verification

  • List the service:

    Example

    [ceph: root@host01 /]# ceph orch ls

  • List the hosts, daemons, and processes:

    Syntax

    ceph orch ps

    Example

    [ceph: root@host01 /]# ceph orch ps

Additional Resources