Chapter 15. Monitoring performance by using the metrics RHEL system role

As a system administrator, you can use the metrics RHEL system role to monitor the performance of a system.

15.1. Introduction to the metrics system role

RHEL system roles is a collection of Ansible roles and modules that provide a consistent configuration interface to remotely manage multiple RHEL systems. The metrics system role configures performance analysis services for the local system and, optionally, includes a list of remote systems to be monitored by the local system. The metrics system role enables you to use pcp to monitor your system's performance without having to configure pcp separately, because the playbook handles the setup and deployment of pcp.

Table 15.1. metrics system role variables

Role variable | Description | Example usage

metrics_monitored_hosts

List of remote hosts to be analyzed by the target host. These hosts will have metrics recorded on the target host, so ensure enough disk space exists below /var/log for each host.

metrics_monitored_hosts: ["webserver.example.com", "database.example.com"]

metrics_retention_days

Configures the number of days for performance data retention before deletion.

metrics_retention_days: 14

metrics_graph_service

A boolean flag that enables the host to be set up with services for performance data visualization by using pcp and Grafana. Set to false by default.

metrics_graph_service: no

metrics_query_service

A boolean flag that enables the host to be set up with time series query services for querying recorded pcp metrics by using Redis. Set to false by default.

metrics_query_service: no

metrics_provider

Specifies which metrics collector to use to provide metrics. Currently, pcp is the only supported metrics provider.

metrics_provider: "pcp"

metrics_manage_firewall

Uses the firewall role to manage port access directly from the metrics role. Set to false by default.

metrics_manage_firewall: true

metrics_manage_selinux

Uses the selinux role to manage port access directly from the metrics role. Set to false by default.

metrics_manage_selinux: true

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.metrics/README.md file
  • /usr/share/doc/rhel-system-roles/metrics/ directory

15.2. Using the metrics system role to monitor your local system with visualization

This procedure describes how to use the metrics RHEL system role to monitor your local system while simultaneously provisioning data visualization via Grafana.

Prerequisites

  • You have prepared the control node and the managed nodes.
  • You are logged in to the control node as a user who can run playbooks on the managed nodes.
  • The account you use to connect to the managed nodes has sudo permissions on them.
  • localhost is configured in the inventory file on the control node:

    localhost ansible_connection=local
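
The same inventory entry can also be written in the YAML inventory format. The following is a minimal sketch, assuming a hypothetical inventory file such as ~/inventory.yml that is not part of the original procedure:

    ---
    # Hypothetical YAML inventory, equivalent to the INI-style line above
    all:
      hosts:
        localhost:
          # Run tasks directly on the control node instead of over SSH
          ansible_connection: local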

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Manage metrics
      hosts: localhost
      roles:
        - rhel-system-roles.metrics
      vars:
        metrics_graph_service: yes
        metrics_manage_firewall: true
        metrics_manage_selinux: true

    Because the metrics_graph_service boolean is set to yes, Grafana is automatically installed and provisioned with pcp added as a data source. Because metrics_manage_firewall and metrics_manage_selinux are both set to true, the metrics role uses the firewall and selinux system roles to manage the ports used by the metrics role.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Verification

  • To view a visualization of the metrics being collected on your machine, access the Grafana web interface as described in Accessing the Grafana web UI.
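
As an additional check before opening the web interface, you can confirm that the relevant services are running. The following playbook is a sketch only: it assumes that the role starts the pmcd.service and grafana-server.service systemd units. These unit names are not stated in this procedure, so adjust them to match your system.

    ---
    - name: Check that the metrics services are running
      hosts: localhost
      tasks:
        - name: Gather service facts
          ansible.builtin.service_facts:

        # Assumed unit names; the play fails if either unit is missing or stopped
        - name: Assert that pmcd and Grafana are running
          ansible.builtin.assert:
            that:
              - ansible_facts.services['pmcd.service'].state == 'running'
              - ansible_facts.services['grafana-server.service'].state == 'running'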

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.metrics/README.md file
  • /usr/share/doc/rhel-system-roles/metrics/ directory

15.3. Using the metrics system role to set up a fleet of individual systems to monitor themselves

This procedure describes how to use the metrics system role to set up a fleet of machines to monitor themselves.

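Prerequisites

  • You have prepared the control node and the managed nodes.
  • You are logged in to the control node as a user who can run playbooks on the managed nodes.
  • The account you use to connect to the managed nodes has sudo permissions on them.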

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Configure a fleet of machines to monitor themselves
      hosts: managed-node-01.example.com
      roles:
        - rhel-system-roles.metrics
      vars:
        metrics_retention_days: 0
        metrics_manage_firewall: true
        metrics_manage_selinux: true

    Because metrics_manage_firewall and metrics_manage_selinux are both set to true, the metrics role uses the firewall and selinux roles to manage the ports used by the metrics role.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml
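
The example playbook targets a single managed node. To apply the same configuration to several machines in one run, the hosts key can name an inventory group instead. The following sketch assumes a hypothetical group called metrics_fleet that is defined in your inventory; the group name is illustrative only.

    ---
    - name: Configure a fleet of machines to monitor themselves
      # metrics_fleet is a hypothetical inventory group that lists every machine in the fleet
      hosts: metrics_fleet
      roles:
        - rhel-system-roles.metrics
      vars:
        metrics_retention_days: 0
        metrics_manage_firewall: true
        metrics_manage_selinux: true

Each host in the group then runs the role against itself, so every machine records its own metrics locally.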

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.metrics/README.md file
  • /usr/share/doc/rhel-system-roles/metrics/ directory

15.4. Using the metrics system role to monitor a fleet of machines centrally via your local machine

This procedure describes how to use the metrics system role to set up your local machine to centrally monitor a fleet of machines while also provisioning visualization of the data by using Grafana and querying of the data by using Redis.

Prerequisites

  • You have prepared the control node and the managed nodes.
  • You are logged in to the control node as a user who can run playbooks on the managed nodes.
  • The account you use to connect to the managed nodes has sudo permissions on them.
  • localhost is configured in the inventory file on the control node:

    localhost ansible_connection=local

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Set up your local machine to centrally monitor a fleet of machines
      hosts: localhost
      roles:
        - rhel-system-roles.metrics
      vars:
        metrics_graph_service: yes
        metrics_query_service: yes
        metrics_retention_days: 10
        metrics_monitored_hosts: ["database.example.com", "webserver.example.com"]
        metrics_manage_firewall: true
        metrics_manage_selinux: true

    Because the metrics_graph_service and metrics_query_service booleans are set to yes, Grafana is automatically installed and provisioned with pcp added as a data source, and the pcp data recording is indexed into Redis, which allows the pcp querying language to be used for complex querying of the data. Because metrics_manage_firewall and metrics_manage_selinux are both set to true, the metrics role uses the firewall and selinux roles to manage the ports used by the metrics role.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml
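
As the list of monitored hosts grows, the metrics_monitored_hosts variable may be easier to maintain as a YAML block sequence. The fragment below is a sketch of the same vars entry with the same example host names; it is equivalent to the inline list used in the playbook.

      vars:
        # One monitored host per line; same meaning as the inline ["...", "..."] form
        metrics_monitored_hosts:
          - database.example.com
          - webserver.example.com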

Verification

  • To view a graphical representation of the metrics being collected centrally by your machine and to query the data, access the Grafana web interface as described in Accessing the Grafana web UI.

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.metrics/README.md file
  • /usr/share/doc/rhel-system-roles/metrics/ directory

15.5. Setting up authentication while monitoring a system by using the metrics system role

PCP supports the scram-sha-256 authentication mechanism through the Simple Authentication Security Layer (SASL) framework. The metrics RHEL system role automates the steps to set up authentication by using the scram-sha-256 authentication mechanism. This procedure describes how to set up authentication by using the metrics RHEL system role.

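Prerequisites

  • You have prepared the control node and the managed nodes.
  • You are logged in to the control node as a user who can run playbooks on the managed nodes.
  • The account you use to connect to the managed nodes has sudo permissions on them.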

Procedure

  1. Edit an existing playbook file, for example ~/playbook.yml, and add the authentication-related variables:

    ---
    - name: Set up authentication by using the scram-sha-256 authentication mechanism
      hosts: managed-node-01.example.com
      roles:
        - rhel-system-roles.metrics
      vars:
        metrics_retention_days: 0
        metrics_manage_firewall: true
        metrics_manage_selinux: true
        metrics_username: <username>
        metrics_password: <password>

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml
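
Storing the password in plain text in the playbook might not be desirable. One common Ansible approach, shown here as a sketch rather than a requirement of the metrics role, is to keep the secret in a file encrypted with Ansible Vault. The file name vault.yml and the variable name vault_metrics_password are hypothetical.

    ---
    - name: Set up authentication by using the scram-sha-256 authentication mechanism
      hosts: managed-node-01.example.com
      vars_files:
        # Hypothetical file encrypted with ansible-vault; it defines vault_metrics_password
        - vault.yml
      roles:
        - rhel-system-roles.metrics
      vars:
        metrics_retention_days: 0
        metrics_manage_firewall: true
        metrics_manage_selinux: true
        metrics_username: <username>
        # The password is read from the encrypted file instead of being written here
        metrics_password: "{{ vault_metrics_password }}"

With this layout, run the playbook with the --ask-vault-pass option of ansible-playbook so that the vault file can be decrypted.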

Verification

  • Verify the SASL configuration:

    # pminfo -f -h "pcp://managed-node-01.example.com?username=<username>" disk.dev.read
    Password: <password>
    disk.dev.read
    inst [0 or "sda"] value 19540

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.metrics/README.md file
  • /usr/share/doc/rhel-system-roles/metrics/ directory

15.6. Using the metrics system role to configure and enable metrics collection for SQL Server

This procedure describes how to use the metrics RHEL system role to automate the configuration and enabling of metrics collection for Microsoft SQL Server via pcp on your local system.

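Prerequisites

  • You have prepared the control node and the managed nodes.
  • You are logged in to the control node as a user who can run playbooks on the managed nodes.
  • The account you use to connect to the managed nodes has sudo permissions on them.
  • localhost is configured in the inventory file on the control node:

    localhost ansible_connection=local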

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Configure and enable metrics collection for Microsoft SQL Server
      hosts: localhost
      roles:
        - rhel-system-roles.metrics
      vars:
        metrics_from_mssql: true
        metrics_manage_firewall: true
        metrics_manage_selinux: true

    Because metrics_manage_firewall and metrics_manage_selinux are both set to true, the metrics role uses the firewall and selinux roles to manage the ports used by the metrics role.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Verification

  • Use the pcp command to verify that the SQL Server PMDA agent (mssql) is loaded and running:

    # pcp
    platform: Linux sqlserver.example.com 4.18.0-167.el8.x86_64 #1 SMP Sun Dec 15 01:24:23 UTC 2019 x86_64
     hardware: 2 cpus, 1 disk, 1 node, 2770MB RAM
     timezone: PDT+7
     services: pmcd pmproxy
         pmcd: Version 5.0.2-1, 12 agents, 4 clients
         pmda: root pmcd proc pmproxy xfs linux nfsclient mmv kvm mssql
               jbd2 dm
     pmlogger: primary logger: /var/log/pcp/pmlogger/sqlserver.example.com/20200326.16.31
         pmie: primary engine: /var/log/pcp/pmie/sqlserver.example.com/pmie.log
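
The same check can be automated from the control node. The following playbook is a minimal sketch, not part of the metrics role: it runs the pcp command and performs a plain text match for mssql in the output, so it assumes that the string does not also appear elsewhere in the summary, for example in the host name.

    ---
    - name: Check that the mssql PMDA is listed by pcp
      hosts: localhost
      tasks:
        - name: Collect the pcp summary
          ansible.builtin.command: pcp
          register: pcp_summary
          # Reading the summary does not change the system
          changed_when: false

        - name: Assert that the mssql agent appears in the summary
          ansible.builtin.assert:
            that:
              - "'mssql' in pcp_summary.stdout"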

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.metrics/README.md file
  • /usr/share/doc/rhel-system-roles/metrics/ directory