Monitoring Ceph with Nagios Guide

Red Hat Ceph Storage 6

Monitoring Ceph with Nagios Core.

Red Hat Ceph Storage Documentation Team

Abstract

This document provides instructions for installing and configuring Nagios to monitor a Red Hat Ceph Storage cluster.
Red Hat is committed to replacing problematic language in our code, documentation, and web properties. We are beginning with these four terms: master, slave, blacklist, and whitelist. Because of the enormity of this endeavor, these changes will be implemented gradually over several upcoming releases. For more details, see our CTO Chris Wright's message.

Chapter 1. Nagios and Ceph

Nagios Core is an open-source solution for monitoring nodes. Large Red Hat Ceph Storage clusters benefit from distributed monitoring systems such as Nagios Core. Nagios Core checks each node in a cluster, including the health of the underlying operating system and the health of the Red Hat Ceph Storage cluster daemons.

Deploying Nagios Core with Ceph requires:

  • A running Red Hat Ceph Storage cluster.

You can substitute the more feature-rich commercial version, Nagios XI, for Nagios Core.

Important

Red Hat does not provide the Nagios packages.

Important

Red Hat works with our technology partners to provide this documentation as a service to our customers. However, Red Hat does not provide support for this product. If you need technical assistance with this product, then contact Nagios for support.

Chapter 2. Nagios Core installation and configuration

As a storage administrator, you can install Nagios Core by downloading the Nagios Core source code, and then configuring, compiling, and installing it on the node that will run the Nagios Core instance.

2.1. Installing and configuring the Nagios Core server from source

No Red Hat Enterprise Linux package exists for the Nagios Core software, so you must compile Nagios Core from source.

Prerequisites

  • Internet access.
  • Root-level access to the Nagios Core host.

Procedure

  1. Install the prerequisites:

    Example

    [root@nagios ~]# dnf install -y httpd php php-cli gcc glibc glibc-common gd gd-devel net-snmp openssl openssl-devel wget unzip make

  2. If you are using a firewall, open port 80 for httpd:

    Example

    [root@nagios ~]# firewall-cmd --zone=public --add-port=80/tcp
    [root@nagios ~]# firewall-cmd --zone=public --add-port=80/tcp --permanent

  3. Create a user and group for Nagios Core:

    Example

    [root@nagios ~]# useradd nagios
    [root@nagios ~]# passwd nagios
    [root@nagios ~]# groupadd nagcmd
    [root@nagios ~]# usermod -a -G nagcmd nagios
    [root@nagios ~]# usermod -a -G nagcmd apache

  4. Download the latest version of Nagios Core and Plug-ins:

    Example

    [root@nagios ~]# wget --inet4-only https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.4.5.tar.gz
    [root@nagios ~]# wget --inet4-only http://www.nagios-plugins.org/download/nagios-plugins-2.3.3.tar.gz
    [root@nagios ~]# tar zxf nagios-4.4.5.tar.gz
    [root@nagios ~]# tar zxf nagios-plugins-2.3.3.tar.gz
    [root@nagios ~]# cd nagios-4.4.5

  5. Run ./configure:

    Example

    [root@nagios nagios-4.4.5]# ./configure --with-command-group=nagcmd

  6. Compile the Nagios Core source code:

    Example

    [root@nagios nagios-4.4.5]# make all

  7. Install Nagios Core, its init script, sample configuration, command mode, and web configuration:

    Example

    [root@nagios nagios-4.4.5]# make install
    [root@nagios nagios-4.4.5]# make install-init
    [root@nagios nagios-4.4.5]# make install-config
    [root@nagios nagios-4.4.5]# make install-commandmode
    [root@nagios nagios-4.4.5]# make install-webconf

  8. Copy the event handlers and change their ownership:

    Example

    [root@nagios nagios-4.4.5]# cp -R contrib/eventhandlers/ /usr/local/nagios/libexec/
    [root@nagios nagios-4.4.5]# chown -R nagios:nagios /usr/local/nagios/libexec/eventhandlers

  9. Run the pre-flight check:

    Example

    [root@nagios nagios-4.4.5]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

  10. Make and install the Nagios Core plug-ins:

    Example

    [root@nagios nagios-4.4.5]# cd ../nagios-plugins-2.3.3
    [root@nagios nagios-plugins-2.3.3]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    [root@nagios nagios-plugins-2.3.3]# make
    [root@nagios nagios-plugins-2.3.3]# make install

  11. Create a user for the Nagios Core user interface:

    Example

    [root@nagios nagios-plugins-2.3.3]# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

    Important

    If adding a user other than nagiosadmin, ensure the /usr/local/nagios/etc/cgi.cfg file gets updated with the user name too.

  12. Modify the /usr/local/nagios/etc/objects/contacts.cfg file with the user name, full name, and email address as needed.
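
The note in step 11 about updating cgi.cfg can be handled with a small helper. The following is a minimal sketch, not part of the Nagios distribution. It assumes the stock cgi.cfg layout, in which access is granted through authorized_for_* directives that each take a comma-separated list of user names. The function name add_cgi_user and the user name cephops are hypothetical.

```shell
#!/bin/sh
# Sketch: append a web UI user to every authorized_for_* directive in cgi.cfg.
# add_cgi_user is a hypothetical helper name, not a Nagios command.
add_cgi_user() {
    cfg=$1
    user=$2
    tmp=$(mktemp) || return 1
    if awk -v u="$user" '
        /^authorized_for_[a-z_]+=/ {
            eq = index($0, "=")
            value = substr($0, eq + 1)
            n = split(value, names, ",")
            found = 0
            # Only append the user if it is not already in the list.
            for (i = 1; i <= n; i++) if (names[i] == u) found = 1
            if (!found) $0 = $0 (value == "" ? "" : ",") u
        }
        { print }
    ' "$cfg" > "$tmp"; then
        # Overwrite in place so the file keeps its owner and permissions.
        cat "$tmp" > "$cfg"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

For example, after creating the user with htpasswd, you might run add_cgi_user /usr/local/nagios/etc/cgi.cfg cephops. The helper is safe to run more than once because it skips directives that already list the user.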

2.2. Starting the Nagios Core service

Start the Nagios Core service to monitor the Red Hat Ceph Storage cluster health.

Prerequisites

  • Root-level access to the Nagios Core host.

Procedure

  1. Enable the Nagios Core and Apache services so that they start at boot:

    Example

    [root@nagios ~]# systemctl enable nagios
    [root@nagios ~]# systemctl enable httpd

  2. Start the Nagios Core daemon and Apache:

    Example

    [root@nagios ~]# systemctl start nagios
    [root@nagios ~]# systemctl start httpd

2.3. Logging into the Nagios Core server

Log in to the Nagios Core server to view the health status of the Red Hat Ceph Storage cluster.

Prerequisites

  • User name and password for the Nagios dashboard.

Procedure

  • With Nagios up and running, browse to the dashboard and log in with the credentials of the default Nagios Core user:

    Syntax

    http://IP_ADDRESS/nagios

    Replace IP_ADDRESS with the IP address of your Nagios Core server.

Chapter 3. Nagios remote plug-in executor installation

As a storage administrator, you can monitor the Ceph storage cluster hosts by installing the Nagios plug-ins, the Ceph plug-ins, and the Nagios remote plug-in executor (NRPE) add-on on each of the Ceph hosts.

For demonstration purposes, this section adds NRPE to a Ceph Monitor host with the hostname host01. Repeat the remaining procedures on all Ceph hosts that Nagios should monitor.
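
One way to repeat a step on several hosts is a small loop over SSH. The following is a minimal sketch under stated assumptions: the host list is a plain file with one hostname per line, root can log in over SSH with key-based authentication, and the names run_on_hosts and DRY_RUN are hypothetical. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: run one command on every host listed in a file (one host per line).
# run_on_hosts and DRY_RUN are hypothetical names used for illustration.
run_on_hosts() {
    hosts_file=$1
    shift
    while IFS= read -r host; do
        # Skip blank lines and comment lines.
        case $host in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'ssh root@%s %s\n' "$host" "$*"
        else
            # -n keeps ssh from consuming the rest of the host list on stdin.
            ssh -n "root@$host" "$@"
        fi
    done < "$hosts_file"
}
```

For example, with a file ceph-hosts.txt containing host01 and host02, run_on_hosts ceph-hosts.txt systemctl restart nrpe restarts NRPE on both hosts in turn.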

3.1. Installing and configuring Nagios Remote Plug-In Executor

Install the Nagios Remote Plug-in Executor (NRPE) and configure it to communicate with the Nagios Core server.

Prerequisites

  • Root-level access to Ceph Monitor host.

Procedure

  1. Install these packages on the host:

    Example

    [root@host01 ~]# dnf install openssl openssl-devel gcc make git

  2. NRPE installation requires a Nagios user. Create the user first:

    Example

    [root@host01 ~]# useradd nagios
    [root@host01 ~]# passwd nagios

  3. Download the latest version of the Nagios plug-ins. Then, make and install them:

    Example

    [root@host01 ~]# wget http://nagios-plugins.org/download/nagios-plugins-2.3.3.tar.gz
    [root@host01 ~]# tar zxf nagios-plugins-2.3.3.tar.gz
    [root@host01 ~]# cd nagios-plugins-2.3.3
    [root@host01 nagios-plugins-2.3.3]# ./configure
    [root@host01 nagios-plugins-2.3.3]# make
    [root@host01 nagios-plugins-2.3.3]# make install

  4. Download the latest version of the Ceph plug-ins:

    Example

    [root@host01 nagios-plugins-2.3.3]# cd ~
    [root@host01 ~]# git clone --recursive https://github.com/ceph/ceph-nagios-plugins.git
    [root@host01 ~]# cd ceph-nagios-plugins
    [root@host01 ceph-nagios-plugins]# make dist
    [root@host01 ceph-nagios-plugins]# make install

  5. Download, make, and install Nagios NRPE:

    Example

    [root@host01 ceph-nagios-plugins]# cd ~
    [root@host01 ~]# wget https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-4.0.3/nrpe-4.0.3.tar.gz
    [root@host01 ~]# tar xvfz nrpe-4.0.3.tar.gz
    [root@host01 ~]# cd nrpe-4.0.3
    [root@host01 nrpe-4.0.3]# ./configure
    [root@host01 nrpe-4.0.3]# make all
    [root@host01 nrpe-4.0.3]# make install-groups-users
    [root@host01 nrpe-4.0.3]# make install
    [root@host01 nrpe-4.0.3]# make install-config
    [root@host01 nrpe-4.0.3]# make install-init

  6. If you are using a firewall, open port 5666 to allow communication with NRPE:

    Example

    [root@host01 ~]# firewall-cmd --zone=public --add-port=5666/tcp
    [root@host01 ~]# firewall-cmd --zone=public --add-port=5666/tcp --permanent

3.2. Starting the Nagios Remote Plug-in Executor service

Start the Nagios Remote Plug-in Executor (NRPE) service to collect data and report it back to the Nagios Core server.

Prerequisites

  • Root-level access to the Ceph Monitor host.

Procedure

  • Enable and start the NRPE service:

    Example

    [root@host01 ~]# systemctl enable nrpe
    [root@host01 ~]# systemctl start nrpe

3.3. Configuring Nagios Core server access to remote nodes

For the Nagios Core server to access the Nagios Remote Plug-in Executor (NRPE) on a remote machine, the remote machine’s NRPE configuration must be updated with the IP address of the Nagios Core server.

Prerequisites

  • Root-level access to the Nagios Core server.
  • Internet access.
  • Access to the Nagios Remote Plugin Executor.

Procedure

  1. Edit the NRPE configuration with the Nagios server’s IP address:

    Example

    [root@host01 ~]# vi /usr/local/nagios/etc/nrpe.cfg

  2. Add the IP address of the Nagios Core server to the allowed_hosts setting.

    Syntax

    allowed_hosts=127.0.0.1,IP_ADDRESS_OF_NAGIOS_CORE_SERVER

    Replace IP_ADDRESS_OF_NAGIOS_CORE_SERVER with the IP address of your Nagios Core server.

  3. Restart nrpe:

    Example

    [root@host01 ~]# systemctl restart nrpe

Verification

  • Test the installation:

    Example

    [root@host01 ~]# /usr/local/nagios/libexec/check_nrpe -H localhost

    The check should echo NRPE v4.0.3 if it is working correctly.
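
The allowed_hosts edit in step 2 can also be scripted. The following is a minimal sketch, assuming nrpe.cfg already contains a single allowed_hosts line with at least one entry, as in the syntax shown above. The function name allow_nagios_server and the address 10.10.128.70 are hypothetical. NRPE still needs to be restarted afterward, as in step 3.

```shell
#!/bin/sh
# Sketch: add the Nagios Core server address to allowed_hosts in nrpe.cfg.
# allow_nagios_server is a hypothetical helper name, not an NRPE command.
allow_nagios_server() {
    cfg=$1
    ip=$2
    # Fail if the file has no allowed_hosts line to extend.
    grep -q '^allowed_hosts=' "$cfg" || return 1
    current=$(sed -n 's/^allowed_hosts=//p' "$cfg")
    # Leave the file alone when the address is already listed.
    case ",$current," in
        *",$ip,"*) return 0 ;;
    esac
    tmp=$(mktemp) || return 1
    # "|" is the delimiter so that a value containing "/" does not break sed.
    sed "s|^allowed_hosts=.*|&,$ip|" "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example: allow_nagios_server /usr/local/nagios/etc/nrpe.cfg 10.10.128.70, followed by systemctl restart nrpe.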

Chapter 4. Configuring the remote node on the Nagios Core server

Configure the Nagios Core server to be aware of the remote hosts.

Prerequisites

  • Root-level access to the Nagios Core server.
  • Internet access.

Procedure

  1. Install the check_nrpe plug-in:

    Example

    [root@nagios ~]# cd ~
    [root@nagios ~]# wget https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-4.0.3/nrpe-4.0.3.tar.gz
    [root@nagios ~]# tar xvfz nrpe-4.0.3.tar.gz
    [root@nagios ~]# cd nrpe-4.0.3
    [root@nagios nrpe-4.0.3]# ./configure
    [root@nagios nrpe-4.0.3]# make check_nrpe
    [root@nagios nrpe-4.0.3]# make install-plugin

  2. Create a configuration for the remote host:

    Syntax

    cd /usr/local/nagios/etc/objects
    cp localhost.cfg HOST_TO_ADD.cfg

    Example

    [root@nagios nrpe-4.0.3]# cd /usr/local/nagios/etc/objects
    [root@nagios objects]# cp localhost.cfg host01.cfg

  3. Edit the configuration file and update the settings for the remote host:

    Syntax

    vi HOST_TO_ADD.cfg

    Example

    [root@nagios objects]# vi host01.cfg

    Syntax

    # Define a host for the local machine
    
    define host {
    
        use     linux-server    ; Name of host template to use
                                ; This host definition will inherit all variables that are defined
                                ; in (or inherited by) the linux-server host template definition.
        host_name               LOCALHOST
        alias                   LOCALHOST
        address                 127.0.0.1
    }

    Replace LOCALHOST with the hostname of the remote host, and 127.0.0.1 with the IP address of the Ceph Monitor host.

    Example

    # Define a host for the local machine
    
    define host {
        use     linux-server   ; Name of host template to use
                               ; This host definition will inherit all variables that are defined
                               ; in (or inherited by) the linux-server host template definition.
        host_name               host01
        alias                   host01
        address                 10.10.128.69
    }

  4. Delete or comment out the Host Group definition:

    Example

    [root@nagios objects]# vi host01.cfg

    #define hostgroup {
    #
    #    hostgroup_name          linux-servers           ; The name of the hostgroup
    #    alias                   Linux Servers           ; Long name of the group
    #    members                 localhost               ; Comma separated list of hosts that belong to this group
    #}

  5. Change the file ownership to Nagios:

    Example

    [root@nagios objects]# chown nagios:nagios host01.cfg

  6. Add a cfg_file= reference to the host01.cfg file in /usr/local/nagios/etc/nagios.cfg:

    Example

    [root@nagios objects]# vi /usr/local/nagios/etc/nagios.cfg

    cfg_file=/usr/local/nagios/etc/objects/host01.cfg

  7. Restart the Nagios server:

    Example

    [root@nagios objects]# systemctl restart nagios

  8. Ensure that the make and install procedures worked and that there is connectivity between the Nagios Core server and the remote host containing NRPE:

    Syntax

    /usr/local/nagios/libexec/check_nrpe -H HOSTNAME_OF_REMOTE_HOST

    Replace HOSTNAME_OF_REMOTE_HOST with the hostname or IP address of the Ceph host to monitor.

    Example

    [root@nagios objects]# /usr/local/nagios/libexec/check_nrpe -H host01

Verification

  • The check should echo NRPE v4.0.3 if it is working correctly.
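
Because an invalid object file can keep Nagios Core from starting, it can help to run the pre-flight check from Chapter 2 before each restart. The following is a minimal sketch, assuming the default source-install paths used in this guide. The function name safe_restart and the override variables NAGIOS_BIN, NAGIOS_CFG, and RESTART_CMD are hypothetical.

```shell
#!/bin/sh
# Sketch: restart Nagios Core only if the configuration passes the pre-flight check.
# NAGIOS_BIN, NAGIOS_CFG, and RESTART_CMD are hypothetical override variables.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
RESTART_CMD=${RESTART_CMD:-systemctl restart nagios}

safe_restart() {
    # nagios -v exits with a non-zero status when the configuration has errors.
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" > /dev/null; then
        # Word splitting of RESTART_CMD is intentional here.
        $RESTART_CMD
    else
        echo "Configuration check failed; Nagios was not restarted." >&2
        return 1
    fi
}
```

Calling safe_restart after editing host01.cfg or nagios.cfg leaves the running service untouched if the new configuration does not validate.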

Chapter 5. Configuring the Nagios Plugins for Ceph

Configure the Nagios plug-ins for the Red Hat Ceph Storage cluster.

Prerequisites

  • Root-level access to the Ceph Monitor host and Nagios Core Server.
  • A running Red Hat Ceph Storage cluster.

Procedure

  1. Log in to the Ceph Monitor host and create a Ceph key and keyring for Nagios:

    Example

    [root@nagios ~]# ssh user@host01
    [user@host01 ~]$ sudo su -
    [root@host01 ~]# cd /etc/ceph
    [root@host01 ceph]# ceph auth get-or-create client.nagios mon 'allow r' > client.nagios.keyring

    Each plug-in will require authentication. Repeat this procedure for each host that contains a plug-in.

  2. Add a command for the check_ceph_health plug-in:

    Example

    [root@host01 ~]# vi /usr/local/nagios/etc/nrpe.cfg

    command[check_ceph_health]=/usr/lib/nagios/plugins/check_ceph_health --id nagios --keyring /etc/ceph/client.nagios.keyring

  3. Enable and restart the nrpe service:

    Example

    [root@host01 ~]# systemctl enable nrpe
    [root@host01 ~]# systemctl restart nrpe

    Repeat this procedure for each Ceph plug-in applicable to the host.

  4. Return to the Nagios Core server and define a check_nrpe command for the NRPE plug-in:

    Example

    [root@nagios ~]# cd /usr/local/nagios/etc/objects
    [root@nagios objects]# vi commands.cfg

    Syntax

    define command{
     command_name check_nrpe
     command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }

  5. On the Nagios Core server, edit the configuration file for the node and add a service for the Ceph plug-in.

    Example

    [root@nagios objects]# vi /usr/local/nagios/etc/objects/host01.cfg

    Syntax

    define service {
      use                   generic-service
      host_name             HOSTNAME
      service_description   Ceph Health Check
      check_command         check_nrpe!check_ceph_health
    }

    Replace HOSTNAME with the hostname of the Ceph host you want to monitor.

    Example

    define service {
      use                   generic-service
      host_name             host01
      service_description   Ceph Health Check
      check_command         check_nrpe!check_ceph_health
    }

    Note

    The check_command setting uses check_nrpe! before the Ceph plug-in name. This tells NRPE to execute the check_ceph_health command on the remote node.

  6. Repeat this procedure for each plug-in applicable to the host.
  7. Restart the Nagios Core server:

    Example

    [root@nagios ~]# systemctl restart nagios

  8. Before proceeding with additional configuration, ensure that the plug-ins are working on the Ceph host:

    Syntax

    /usr/lib/nagios/plugins/check_ceph_health --id NAGIOS_USER --keyring /etc/ceph/client.nagios.keyring

    Example

    [root@host01 ~]# /usr/lib/nagios/plugins/check_ceph_health --id nagios --keyring /etc/ceph/client.nagios.keyring
    HEALTH OK

    Note

    The check_ceph_health plug-in performs the equivalent of the ceph health command.
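
    Nagios interprets a plug-in's exit code as the service state: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The following is a minimal sketch of how a ceph health status word could be mapped onto that convention. It is an illustration only, not the code of the check_ceph_health plug-in, and the function name health_to_exit_code is hypothetical.

```shell
#!/bin/sh
# Sketch: map a `ceph health` status word to a Nagios plug-in exit code.
# health_to_exit_code is a hypothetical name; this is not the plug-in's code.
health_to_exit_code() {
    case $1 in
        HEALTH_OK*)   echo "OK";       return 0 ;;
        HEALTH_WARN*) echo "WARNING";  return 1 ;;
        HEALTH_ERR*)  echo "CRITICAL"; return 2 ;;
        *)            echo "UNKNOWN";  return 3 ;;
    esac
}
```

    For example, health_to_exit_code HEALTH_WARN prints WARNING and returns 1, which Nagios would display as a warning state for the service.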

Legal Notice

Copyright © 2024 Red Hat, Inc.
The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, the Red Hat logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.