Chapter 5. Configure Ceph Plug-ins

Open source Ceph plug-ins for Nagios are available at https://github.com/valerytschopp/ceph-nagios-plugins. They include:

  • check_ceph_df: This plug-in outputs messages related to ceph df for the entire cluster or for individual pools. This plug-in only needs to run on Ceph monitor hosts. Multiple instances may be configured to monitor individual pools.
  • check_ceph_health: This plug-in outputs the result of ceph health. This plug-in only needs to run on Ceph monitor hosts.
  • check_ceph_mon: This plug-in checks a single monitor and returns OK if the monitor is up and running or WARN if it is down or missing. This plug-in only needs to run on Ceph monitor hosts.
  • check_ceph_osd: This plug-in checks an OSD host or a single OSD and returns OK if the OSD is up and running or WARN if it is down. This plug-in only needs to run on Ceph OSD hosts.
  • check_ceph_rgw: This plug-in checks a single Ceph Object Gateway and returns OK and the buckets and data usage if it is up and running or WARN if it is down or missing. This plug-in only needs to run on Ceph Object Gateway hosts.
  • check_ceph_mds: This plug-in checks a single metadata server and returns OK if it is up and running, WARN if it is laggy, and Error if it is down or missing. This plug-in only needs to run on Ceph metadata server hosts.

These plug-ins get installed on the appropriate Ceph hosts. The following sections describe how to configure the check_ceph_health plug-in on a monitor host.

5.1. Create Keyring and Key

Log in to the monitor server and create a Ceph key and keyring for Nagios.

[user@nagios]# ssh mon
[user@mon]# cd /etc/ceph
[user@mon]# ceph auth get-or-create client.nagios mon 'allow r' > client.nagios.keyring

Each plug-in will require authentication. Repeat this procedure for each host that contains a plug-in.
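
Before moving on, it can help to confirm that the keyring file was actually written. The following is a minimal sketch, not part of Ceph or the plug-in package: keyring_looks_valid is a hypothetical helper, and it assumes the usual keyring layout of a [client.nagios] section header followed by a "key = ..." line.

```shell
# Hypothetical helper (not shipped with Ceph or the plug-ins).
# Returns 0 when the file is readable and contains a [client.nagios]
# section header and a non-empty "key = ..." line; returns non-zero otherwise.
keyring_looks_valid() {
  keyring="$1"
  [ -r "$keyring" ] || return 1
  grep -q '^\[client\.nagios\]' "$keyring" || return 1
  grep -Eq '^[[:space:]]*key[[:space:]]*=[[:space:]]*[^[:space:]]' "$keyring"
}

# Example use on the monitor host:
#   keyring_looks_valid /etc/ceph/client.nagios.keyring && echo "keyring looks usable"
```

An empty file usually means the redirect ran but the ceph auth command itself failed, for example because the shell user lacks an admin keyring.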

5.2. Test the Ceph Plug-in Installation

Before proceeding with additional configuration, ensure that the plug-ins are working. For example:

[user@mon]# /usr/lib/nagios/plugins/check_ceph_health --id nagios --keyring /etc/ceph/client.nagios.keyring

The check_ceph_health plug-in performs the equivalent of:

[user@mon]# ceph health
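
Nagios plug-ins report their result through the exit status as well as the printed message. By convention, 0 means OK, 1 means WARNING, 2 means CRITICAL, and 3 means UNKNOWN. As a small sketch for testing by hand, the hypothetical helper below (nagios_status is an illustrative name, not part of the plug-ins) translates an exit status into its label.

```shell
# Hypothetical helper: map a Nagios plug-in exit status to its conventional label.
nagios_status() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    *) echo "UNKNOWN" ;;   # 3 and anything unexpected
  esac
}

# Example use on the monitor host, right after running the plug-in:
#   /usr/lib/nagios/plugins/check_ceph_health --id nagios --keyring /etc/ceph/client.nagios.keyring
#   nagios_status $?
```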

5.3. Add a Command for the Ceph Plug-in

Add a command for the check_ceph_health plug-in.

[user@mon]# vim /usr/local/nagios/etc/nrpe.cfg

For example:

command[check_ceph_health]=/usr/lib/nagios/plugins/check_ceph_health --id nagios --keyring /etc/ceph/client.nagios.keyring

Save the file and restart NRPE.

[user@mon]# systemctl restart nrpe

Repeat this procedure for each Ceph plug-in applicable to the host. See https://github.com/valerytschopp/ceph-nagios-plugins for usage.
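
When several plug-ins run on the same host, the nrpe.cfg entries follow the same pattern as the check_ceph_health example. The sketch below prints such an entry. nrpe_command_entry is a hypothetical helper, and it assumes that the plug-in accepts the same --id and --keyring options as check_ceph_health. Any plug-in specific arguments must be taken from that plug-in's usage text; none are supplied here.

```shell
# Hypothetical helper: print an nrpe.cfg command entry for one plug-in.
# Assumption: the plug-in takes the same --id/--keyring options as
# check_ceph_health. Extra arguments are appended unchanged.
nrpe_command_entry() {
  plugin="$1"
  shift
  entry="command[$plugin]=/usr/lib/nagios/plugins/$plugin --id nagios --keyring /etc/ceph/client.nagios.keyring"
  # Append plug-in specific arguments only if some were given.
  if [ "$#" -gt 0 ]; then
    entry="$entry $*"
  fi
  printf '%s\n' "$entry"
}

# Prints the same entry shown above for check_ceph_health.
nrpe_command_entry check_ceph_health
```

The output is meant to be reviewed and pasted into nrpe.cfg by hand rather than appended automatically.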

5.4. Define the check_nrpe Command

Return to the Nagios server and define a check_nrpe command for the NRPE plug-in.

[user@nagios]# cd /usr/local/nagios/etc/objects
[user@nagios]# vi commands.cfg
define command{
 command_name check_nrpe
 command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

5.5. Define a Service for the Plug-in

On the Nagios server, edit the configuration file for the host and add a service for the Ceph plug-in. For example:

[user@nagios]# vim /usr/local/nagios/etc/objects/mon.cfg
define service {
  use                   generic-service
  host_name             mon
  service_description   Ceph Health Check
  check_command         check_nrpe!check_ceph_health
}

Note that the check_command setting places check_nrpe! before the Ceph plug-in name. Nagios passes check_ceph_health to check_nrpe as $ARG1$, and NRPE then executes the check_ceph_health command on the remote host.
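
To make the macro substitution concrete, the sketch below mimics how the check_nrpe command_line is filled in for this service. expand_check_nrpe is a hypothetical helper written only for illustration. The host address 192.0.2.10 and the plug-in directory /usr/local/nagios/libexec are assumed values standing in for $HOSTADDRESS$ and $USER1$.

```shell
# Hypothetical helper: show the command line that results from a
# check_command value, following the check_nrpe definition
#   $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
expand_check_nrpe() {
  check_command="$1"   # e.g. check_nrpe!check_ceph_health
  host_address="$2"    # stands in for $HOSTADDRESS$ (assumed value)
  plugin_dir="$3"      # stands in for $USER1$ (assumed value)
  rest="${check_command#*!}"   # drop the command name and the first "!"
  arg1="${rest%%!*}"           # keep only the first argument as $ARG1$
  printf '%s/check_nrpe -H %s -c %s\n' "$plugin_dir" "$host_address" "$arg1"
}

expand_check_nrpe 'check_nrpe!check_ceph_health' 192.0.2.10 /usr/local/nagios/libexec
# prints: /usr/local/nagios/libexec/check_nrpe -H 192.0.2.10 -c check_ceph_health
```

Running the printed command by hand on the Nagios server is a quick way to confirm that NRPE on the monitor host answers before restarting Nagios.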

Repeat this procedure for each plug-in applicable to the host.

Then, restart the Nagios server.

[user@nagios]# systemctl restart nagios