Monitoring Red Hat Satellite
Collecting metrics from Red Hat Satellite 6
Chapter 1. Overview
Obtaining metrics from Satellite is useful for troubleshooting a current issue and for capacity planning. This guide describes how to collect live metrics and archive them for a fixed period of time. If you need to raise a support case with Red Hat to resolve a performance issue, the archived data provides valuable insight. Note that Red Hat Support can only access the archived data if you upload it to a Support Case.
You can collect the following metrics from Satellite:
- Basic statistics from Red Hat Enterprise Linux, including system load, memory utilization, and input/output operations;
- Process statistics, including memory and CPU utilization;
- Apache HTTP Server activity statistics;
- PostgreSQL activity statistics;
- Satellite application statistics.
Use Performance Co-Pilot (PCP) to collect and archive Satellite metrics.
Chapter 2. Performance Co-Pilot
Performance Co-Pilot (PCP) is a suite of tools and libraries for acquiring, storing, and analyzing system-level performance measurements. PCP can be used to analyze live and historical metrics. Metrics can be retrieved and presented via the CLI, or a web UI.
2.1. Performance Metric Domain Agents
A Performance Metric Domain Agent (PMDA) is a PCP add-on which enables access to metrics of an application or service. To gather all metrics relevant to Satellite, you must install PMDAs for Apache HTTP Server and PostgreSQL.
Chapter 3. Installing PCP Packages
This procedure describes how to install the PCP packages.
Ensure you have a minimum of 20 GB space available in the
The default PCP data retention policy is to retain only data collected during the past 14 days. Daily data storage is typically between 100 MB and 500 MB of disk space, but may reach several gigabytes.
- Ensure that the base system on which Satellite Server is running is Red Hat Enterprise Linux 7.6 or later. The minimum supported version for the PCP packages is PCP version 4.1.
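As a rough sizing check, the retention period and daily volume quoted in the prerequisites above can be multiplied out. The following is a minimal sketch; the function name estimate_pcp_storage_mb is hypothetical, and the inputs are the estimates from this guide rather than measurements.

```shell
#!/bin/bash
# Estimate PCP archive storage in MB: retention days x MB logged per day.
# The function name is illustrative; inputs are whole numbers.
estimate_pcp_storage_mb() {
    local days=$1 mb_per_day=$2
    echo $(( days * mb_per_day ))
}

# Default 14-day retention at the low and high daily estimates.
echo "low:  $(estimate_pcp_storage_mb 14 100) MB"
echo "high: $(estimate_pcp_storage_mb 14 500) MB"
```

At the upper estimate this comes to about 7 GB, which leaves headroom within a 20 GB allocation for days that log several gigabytes.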
Install the PCP packages:
# yum install pcp \
  pcp-pmda-apache \
  pcp-pmda-postgresql \
  pcp-system-tools
Enable and start the Performance Metrics Collector daemon, and the Performance Metrics Logger daemon:
# systemctl enable pmcd pmlogger
# systemctl start pmcd pmlogger
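After enabling and starting the daemons, it can help to confirm that both are running. The following is a small sketch that wraps systemctl is-active; the function name check_services is hypothetical.

```shell
#!/bin/bash
# Report whether each named systemd service is active.
# Returns non-zero if any service is not active.
check_services() {
    local svc rc=0
    for svc in "$@"; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            rc=1
        fi
    done
    return $rc
}

# Example call (run on the Satellite Server):
#   check_services pmcd pmlogger
```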
3.1. Configuring PCP Data Collection
This procedure describes how to configure PCP to collect metrics about processes, Satellite, Apache HTTP Server, and PostgreSQL.
Configure PCP to collect data about important Satellite processes.
By default, PCP collects basic system metrics. This step enables detailed metrics about the following Satellite processes:
# cat >/var/lib/pcp/pmdas/proc/hotproc.conf <<EOF
#pmdahotproc
Version 1.0

fname == "java" ||
fname ~ /(qdrouterd|qpidd)/ ||
(fname == "postgres" && psargs ~ /-D/) ||
fname == "mongod" ||
fname ~ /^dynflow/ ||
psargs ~ /Passenger RackApp/ ||
fname ~ /^wsgi:pulp/ ||
psargs ~ /celery (beat|worker)/ ||
psargs ~ /pulp_streamer/ ||
psargs ~ /smart-proxy/ ||
psargs ~ /squid.conf/
EOF
Configure PCP to log the process metrics being collected.
# mkdir -p /var/lib/pcp/config/pmlogconf/foreman-hotproc
# cat >/var/lib/pcp/config/pmlogconf/foreman-hotproc/summary <<EOF
#pmlogconf-setup 2.0
ident   foreman hotproc metrics
probe   hotproc.control.config != "" ? include : exclude
        hotproc.psinfo.psargs
        hotproc.psinfo.cnswap
        hotproc.psinfo.nswap
        hotproc.psinfo.rss
        hotproc.psinfo.vsize
        hotproc.psinfo.cstime
        hotproc.psinfo.cutime
        hotproc.psinfo.stime
        hotproc.psinfo.utime
        hotproc.io.write_bytes
        hotproc.io.read_bytes
        hotproc.schedstat.cpu_time
        hotproc.fd.count
EOF
Install the process monitoring PMDA.
# cd /var/lib/pcp/pmdas/proc
# ./Install
Configure PCP to collect metrics from Apache HTTP Server.
Enable the Apache HTTP Server extended status module.
# cat >/etc/httpd/conf.d/01-status.conf <<EOF
ExtendedStatus On
LoadModule status_module modules/mod_status.so

<Location "/server-status">
  PassengerEnabled off
  SetHandler server-status
  Order deny,allow
  Deny from all
  Allow from localhost
</Location>
EOF
Enable the Apache HTTP Server PMDA.
# cd /var/lib/pcp/pmdas/apache
# ./Install
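Once httpd has been restarted with the extended status module enabled, the status page can also be read in the machine-readable form that mod_status provides through the ?auto query string. The following sketch extracts a single field from that output. The function name status_field is hypothetical, the sample values are made up, and the URL in the comment assumes that the status page is reachable over plain HTTP on localhost.

```shell
#!/bin/bash
# Extract one field from mod_status machine-readable output (?auto format),
# which consists of "Key: value" lines. Reads the status text on stdin.
status_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Illustrative sample; on a live server, pipe in real output instead:
#   curl -s 'http://localhost/server-status?auto' | status_field BusyWorkers
sample='Total Accesses: 1520
BusyWorkers: 3
IdleWorkers: 7'

echo "busy workers: $(printf '%s\n' "$sample" | status_field BusyWorkers)"
```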
Prevent the Satellite installer from overwriting the extended status module’s configuration file.
Add the following line to the
Configure PCP to collect metrics from PostgreSQL.
Change to the
# cd /var/lib/pcp/pmdas/postgresql
Run the installer.
Configure the PCP database interface to permit access to the PostgreSQL database.
Edit the /etc/pcpdbi.conf configuration file, inserting the following lines:
$database = "dbi:Pg:dbname=foreman;host=localhost";
$username = "foreman";
$password = "6qXfN9m5nii5iEcbz8nuiJBNsyjjdRHA";
$os_user = "foreman";
- The value for $password is stored in
Change the SELinux pcp_pmcd_t domain permission to permit PCP access to the PostgreSQL database.
# semanage permissive -a pcp_pmcd_t
Verify the PostgreSQL PMDA is able to connect to PostgreSQL.
Check the /var/log/pcp/pmcd/postgresql.log file to confirm the connection is established. Without a successful database connection, the PostgreSQL PMDA remains active, but cannot provide any metrics.
[Tue Aug 14 09:21:06] pmdapostgresql(25056) Info: PostgreSQL connection established
If you find errors in /var/log/pcp/pmcd/postgresql.log, restart the pmcd service.
# systemctl restart pmcd
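The log check described above can be scripted. The following is a minimal sketch that searches the PMDA log for the connection message shown in this guide; the function name pg_pmda_connected is hypothetical.

```shell
#!/bin/bash
# Return success if the given PMDA log contains the connection message
# shown in this guide.
pg_pmda_connected() {
    grep -q 'PostgreSQL connection established' "$1"
}

# Example call:
#   pg_pmda_connected /var/log/pcp/pmcd/postgresql.log && echo connected
```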
Enable telemetry functionality in Satellite.
To enable collection of metrics from Satellite, you must send metrics via the statsd protocol into the pcp-mmvstatsd daemon. The metrics are aggregated and available via the PCP MMV API.
Install the Foreman Telemetry and
# yum install foreman-telemetry pcp-mmvstatsd
Enable and start the
# systemctl enable pcp-mmvstatsd
# systemctl start pcp-mmvstatsd
Enable the Satellite telemetry functionality.
Add the following lines to
:telemetry:
  :prefix: 'fm_rails'
  :statsd:
    :enabled: true
    :host: '127.0.0.1:8125'
    :protocol: 'statsd'
  :prometheus:
    :enabled: false
  :logger:
    :enabled: false
    :level: 'INFO'
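To illustrate what travels over the statsd protocol, the sketch below formats a single statsd datagram and, optionally, sends it over UDP to the host and port given in the telemetry settings. The function names and the metric name example_test_counter are hypothetical. Whether pcp-mmvstatsd exposes such an ad hoc metric is not confirmed here, so treat the send step as a connectivity experiment only.

```shell
#!/bin/bash
# Format one statsd datagram: <name>:<value>|<type>
# where type is c (counter), g (gauge), or ms (timer).
statsd_line() {
    printf '%s:%s|%s' "$1" "$2" "$3"
}

# Send a datagram over UDP using bash's /dev/udp pseudo-device.
# Host and port default to the values in the telemetry settings.
statsd_send() {
    local host=${STATSD_HOST:-127.0.0.1} port=${STATSD_PORT:-8125}
    statsd_line "$@" > "/dev/udp/$host/$port"
}

# "example_test_counter" is a made-up metric name for illustration.
statsd_line example_test_counter 1 c; echo
# To actually send it:
#   statsd_send example_test_counter 1 c
```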
Schedule daily storage of metrics in archive files:
# cat >/etc/cron.daily/refresh_mmv <<EOF
#!/bin/bash
echo "log mandatory on 1 minute mmv" | /usr/bin/pmlc -P
EOF
# chmod +x /etc/cron.daily/refresh_mmv
Restart the Apache HTTP Server and PCP to begin data collection:
# systemctl restart httpd pmcd pmlogger
3.2. Enabling Access to Metrics via the Web UI
This procedure describes how to access metrics collected by PCP, via the web UI.
Enable the Red Hat Enterprise Linux
# subscription-manager repos --enable rhel-7-server-optional-rpms
Install the PCP web API and applications:
# yum install pcp-webapi pcp-webapp-grafana pcp-webapp-vector
Start and enable the PCP web service:
# systemctl start pmwebd
# systemctl enable pmwebd
Open the firewall port to allow access to the PCP web service:
# firewall-cmd --add-port=44323/tcp
# firewall-cmd --permanent --add-port=44323/tcp
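A quick way to confirm that the web service is listening on the port opened above is to attempt a TCP connection. The following sketch uses the /dev/tcp pseudo-device provided by bash; the function name port_open is hypothetical, and the check only proves that something accepts connections on the port, not that it is pmwebd.

```shell
#!/bin/bash
# Return success if a TCP connection to host:port can be opened.
# Uses bash's /dev/tcp pseudo-device, so no extra tools are needed.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_open localhost 44323; then
    echo "port 44323 is reachable"
else
    echo "port 44323 is not reachable"
fi
```

To test the firewall rule rather than the local service, run the same check from another host and replace localhost with the Satellite Server host name.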
3.3. Verifying PCP Configuration
To verify PCP is configured correctly, and services are active, run the following command:
This outputs a summary of the active PCP configuration.
Example output from the
Performance Co-Pilot configuration on satellite.example.com:

 platform: Linux satellite.example.com 3.10.0-862.3.3.el7.x86_64 #1 SMP Wed Jun 13 05:44:23 EDT 2018 x86_64
 hardware: 8 cpus, 4 disks, 1 node, 23380MB RAM
 timezone: AEST-10
 services: pmcd pmwebd
     pmcd: Version 3.12.2-1, 9 agents, 1 client
     pmda: root pmcd proc xfs linux apache mmv postgresql jbd2
 pmlogger: primary logger: /var/log/pcp/pmlogger/satellite.example.com/20180802.00.10
In this example, both the Performance Metrics Collector Daemon (pmcd) and the Performance Metrics Web Daemon (pmwebd) services are running. The output also confirms which PMDAs are collecting metrics. Finally, it lists the currently active archive file, in which pmlogger is storing metrics.
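The PMDA list in the summary can be checked automatically against the agents this guide configures. The following is a minimal sketch that reads the summary on standard input and prints any expected PMDA that is missing; the function name missing_pmdas is hypothetical, and it assumes the PMDA names appear on a single line beginning with "pmda:", as in the example output.

```shell
#!/bin/bash
# Read the PCP configuration summary on stdin and print any expected PMDA
# names that are absent from the "pmda:" line. Prints nothing if all are
# present.
missing_pmdas() {
    local pmda_line name
    pmda_line=$(grep -E '^[[:space:]]*pmda:' | head -n 1)
    for name in "$@"; do
        case " ${pmda_line#*pmda:} " in
            *" $name "*) ;;
            *) echo "$name" ;;
        esac
    done
}

# Example call on a configured server:
#   pcp | missing_pmdas proc apache mmv postgresql
```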
Chapter 4. PCP Metrics
Metrics are stored in a tree-like structure. For example, all network metrics are stored in a node named network. Each metric may be a single value, or a list of values, known as instances. For example, kernel load has three instances: a 1-minute, a 5-minute, and a 15-minute average.
For every metric entry, PCP stores both its data and metadata. The metadata includes the metric's description, data type, units, and dimensions. For example, the metadata enables PCP to output multiple metrics with different dimensions.
The value of a counter metric only increases. For example, a count of disk write operations on a specific device only increases. When you query the value of a counter metric, PCP converts this into a rate value by default.
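The counter-to-rate conversion amounts to dividing the change in the counter by the time between two samples. The following sketch shows the arithmetic only; the function name counter_rate and the sample values are hypothetical, and PCP's own implementation is not reproduced here.

```shell
#!/bin/bash
# Convert two samples of a counter into a per-second rate:
#   rate = (later_value - earlier_value) / seconds_between_samples
counter_rate() {
    awk -v a="$1" -v b="$2" -v dt="$3" 'BEGIN { printf "%.1f\n", (b - a) / dt }'
}

# Hypothetical samples: 1000 writes, then 1120 writes 60 seconds later.
counter_rate 1000 1120 60   # prints 2.0
```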
In addition to system metrics such as CPU, memory, kernel, XFS, disk, and network, the following metrics are configured:
Basic metrics of key Satellite processes
Apache HTTP Server metrics
Basic PostgreSQL statistics
4.1. Identifying Available Metrics
To list all metrics available via PCP, enter the following command:
To list all Satellite metrics and their descriptions, enter the following command:
# foreman-rake telemetry:metrics
To list the archived metrics, enter the following command:
# less /var/log/pcp/pmlogger/$(hostname)/pmlogger.log
The pmlogger daemon archives data as it is received, according to its configuration. To confirm the active archive file, enter the following command:
# pcp | grep logger
The output includes the file name of the active archive file, for example:
Chapter 5. Retrieving Metrics
You can retrieve metrics from PCP using the CLI or the web UI. A number of CLI tools are provided with PCP, which can output either live data or data from archived sources. The web UI is provided by the Grafana and Vector web applications. Vector connects directly to the PCP daemon, and can only display live data. Grafana reads from PCP archive files and can display data up to 1 year old.
5.1. Retrieving Metrics via the CLI
Using the CLI tools provided with PCP, you can retrieve metrics either live, or from an archive file.
5.1.1. Retrieving Live Metrics using CLI
To output metrics on disk partition write instances, enter the following command:
# pmval -f 1 disk.partitions.write
In this example, PCP converts the number of writes to disk partitions from a counter value to a rate value. The -f 1 option specifies that values are displayed with one decimal place.
metric:    disk.partitions.write
host:      satellite.example.com
semantics: cumulative counter (converting to rate)
units:     count (converting to count / sec)
samples:   all

     vda1      vda2       sr0
      0.0      12.0       0.0
      0.0       1.0       0.0
      0.0       1.0       0.0
      0.0       2.0       0.0
To monitor system metrics at a two-second interval:
# pmstat -t 2sec
5.1.2. Retrieving Archived Metrics using CLI
You can use the PCP CLI tools to retrieve metrics from an archive file. To do that, add the --archive parameter and specify the archive file.
To list all metrics which were enabled when the archive file was created, enter the following command:
# pminfo --archive archive_file
To confirm the host and time period covered by an archive file, enter the following command:
# pmdumplog -l archive_file
To list disk writes for each partition, over the time period covered by the archive file:
# pmval --archive /var/log/pcp/pmlogger/satellite.example.com/20180816.00.10 \
  -f 1 disk.partitions.write
To list disk write operations per partition, at a two-second interval, between 14:00 and 14:15:
# pmval --archive /var/log/pcp/pmlogger/satellite.example.com/20180816.00.10 \
  -d -t 2sec \
  -f 3 disk.partitions.write \
  -S @14:00 -T @14:15
To list, in tabular format, the average values of performance metrics between 14:00 and 14:30, together with the minimum and maximum values and the times at which they occurred:
# pmlogsummary /var/log/pcp/pmlogger/satellite.example.com/20180816.00.10 \
  -HlfiImM \
  -S @14:00 \
  -T @14:30 \
  disk.partitions.write \
  mem.freemem
To list system metrics stored in an archive, starting from 14:00. The metrics are displayed in a format similar to the
# pcp --archive /var/log/pcp/pmlogger/satellite.example.com/20180816.00.10 \
  -S @14:00 \
  atop
5.2. Retrieving Metrics via the Web UI
To access the web UI for PCP metrics, open the URL of either of the following web applications:
Both applications provide a dashboard-style view, with default widgets displaying the values of metrics. You can add and remove metrics to suit your requirements. Also, you can select the time span shown for each widget. Only Grafana provides the option of selecting a custom time range from the archived metrics.
Figure 5.1. Example Grafana dashboard
Figure 5.2. Example Vector dashboard
Chapter 6. Metrics Data Retention
The storage capacity required by PCP data logging is determined by the following factors:
- the metrics being logged,
- the logging interval,
- and the retention policy.
The default logging (sampling) interval is 60 seconds.
The default retention policy is to keep archives for the last 14 days, compressing archives older than one day. PCP archive logs are stored in the
6.1. Changing Default Logging Interval
This procedure describes how to change the default logging interval.
Edit the LOCALHOSTNAME line and append -t XXs, where XX is the desired time interval, measured in seconds.
6.2. Changing Data Retention Policy
This procedure describes how to change the data retention policy.
Find the line containing
Change the value for parameter -x to the desired number of days after which data is compressed.
Add the parameter -k, and add a value for the number of days after which data is deleted.
For example, the parameters -x 4 -k 7 specify that data will be compressed after 4 days, and deleted after 7 days.
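The effect of a retention policy on an archive of a given age can be sketched as a simple comparison. The function name archive_state is hypothetical, and the treatment of the boundary days is an assumption: here an archive changes state only once its age in days exceeds the threshold. The exact cut-off that PCP applies is not confirmed here.

```shell
#!/bin/bash
# Classify an archive by age (in days) under a retention policy with a
# compress-after threshold (-x) and a delete-after threshold (-k).
# Assumption: state changes once the age exceeds the threshold.
archive_state() {
    local age=$1 compress_after=$2 delete_after=$3
    if [ "$age" -gt "$delete_after" ]; then
        echo deleted
    elif [ "$age" -gt "$compress_after" ]; then
        echo compressed
    else
        echo uncompressed
    fi
}

# Policy from the example: -x 4 -k 7
for age in 1 5 8; do
    echo "day $age: $(archive_state "$age" 4 7)"
done
```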
6.3. Confirming Data Storage Usage
To confirm data storage usage, enter the following command:
# less /var/log/pcp/pmlogger/$(hostname)/pmlogger.log
This lists all available metrics, grouped by the frequency at which they are logged. For each group it also lists the storage required to store the listed metrics, per day.
Example storage statistics
logged every 60 sec: 61752 bytes or 84.80 Mbytes/day
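The per-day figure follows from the bytes written per sample and the logging interval. The sketch below reproduces the arithmetic for the example line, assuming that "Mbytes" means 2^20 bytes, which is consistent with the figures shown; the function name mbytes_per_day is hypothetical.

```shell
#!/bin/bash
# Daily storage in Mbytes (2^20 bytes) for a group of metrics:
#   bytes_per_sample * (86400 / interval_seconds) / 1048576
mbytes_per_day() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.2f\n", b * (86400 / s) / 1048576 }'
}

# Values from the example: 61752 bytes logged every 60 seconds.
mbytes_per_day 61752 60   # prints 84.80
```

Because the sample count scales inversely with the interval, halving the logging interval roughly doubles the daily storage for the same set of metrics.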