Red Hat Training

A Red Hat training course is available for Red Hat Enterprise Linux

Chapter 21. System Monitoring Tools

In order to configure the system, system administrators often need to determine the amount of free memory, how much free disk space is available, how the hard drive is partitioned, or what processes are running.

21.1. Viewing System Processes

21.1.1. Using the ps Command

The ps command allows you to display information about running processes. It produces a static list, that is, a snapshot of what is running when you execute the command. If you want a constantly updated list of running processes, use the top command or the System Monitor application instead.

To list all processes that are currently running on the system including processes owned by other users, type the following at a shell prompt:

ps ax

For each listed process, the ps ax command displays the process ID (PID), the terminal that is associated with it (TTY), the current status (STAT), the cumulated CPU time (TIME), and the name of the executable file (COMMAND). For example:

~]$ ps ax
PID TTY   STAT  TIME COMMAND
 1 ?    Ss   0:01 /usr/lib/systemd/systemd --switched-root --system --deserialize 23
 2 ?    S   0:00 [kthreadd]
 3 ?    S   0:00 [ksoftirqd/0]
 5 ?    S>   0:00 [kworker/0:0H]
[output truncated]

To display the owner alongside each process, use the following command:

ps aux

Apart from the information provided by the ps ax command, ps aux displays the effective user name of the process owner (USER), the percentage of the CPU (%CPU) and memory (%MEM) usage, the virtual memory size in kilobytes (VSZ), the non-swapped physical memory size in kilobytes (RSS), and the time or date the process was started. For example:

~]$ ps aux
USER PID %CPU %MEM  VSZ  RSS TTY  STAT START  TIME COMMAND
root  1 0.3 0.3 134776 6840 ?   Ss  09:28  0:01 /usr/lib/systemd/systemd --switched-root --system --d
root  2 0.0 0.0   0   0 ?   S  09:28  0:00 [kthreadd]
root  3 0.0 0.0   0   0 ?   S  09:28  0:00 [ksoftirqd/0]
root  5 0.0 0.0   0   0 ?   S>  09:28  0:00 [kworker/0:0H]
[output truncated]

You can also use the ps command in a combination with grep to see if a particular process is running. For example, to determine if Emacs is running, type:

~]$ ps ax | grep emacs
12056 pts/3  S+   0:00 emacs
12060 pts/2  S+   0:00 grep --color=auto emacs

For a complete list of available command line options, see the ps(1) manual page.

21.1.2. Using the top Command

The top command displays a real-time list of processes that are running on the system. It also displays additional information about the system uptime, current CPU and memory usage, or total number of running processes, and allows you to perform actions such as sorting the list or killing a process.

To run the top command, type the following at a shell prompt:

top

For each listed process, the top command displays the process ID (PID), the effective user name of the process owner (USER), the priority (PR), the nice value (NI), the amount of virtual memory the process uses (VIRT), the amount of non-swapped physical memory the process uses (RES), the amount of shared memory the process uses (SHR), the process status field S), the percentage of the CPU (%CPU) and memory (%MEM) usage, the cumulated CPU time (TIME+), and the name of the executable file (COMMAND). For example:

~]$ top
top - 16:42:12 up 13 min, 2 users, load average: 0.67, 0.31, 0.19
Tasks: 165 total,  2 running, 163 sleeping,  0 stopped,  0 zombie
%Cpu(s): 37.5 us, 3.0 sy, 0.0 ni, 59.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1016800 total,  77368 free,  728936 used,  210496 buff/cache
KiB Swap:  839676 total,  776796 free,  62880 used.  122628 avail Mem

 PID USER   PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 3168 sjw    20  0 1454628 143240 15016 S 20.3 14.1  0:22.53 gnome-shell
 4006 sjw    20  0 1367832 298876 27856 S 13.0 29.4  0:15.58 firefox
 1683 root   20  0 242204 50464  4268 S 6.0 5.0  0:07.76 Xorg
 4125 sjw    20  0 555148 19820 12644 S 1.3 1.9  0:00.48 gnome-terminal-
  10 root   20  0    0   0   0 S 0.3 0.0  0:00.39 rcu_sched
 3091 sjw    20  0  37000  1468  904 S 0.3 0.1  0:00.31 dbus-daemon
 3096 sjw    20  0 129688  2164  1492 S 0.3 0.2  0:00.14 at-spi2-registr
 3925 root   20  0    0   0   0 S 0.3 0.0  0:00.05 kworker/0:0
  1 root   20  0 126568  3884  1052 S 0.0 0.4  0:01.61 systemd
  2 root   20  0    0   0   0 S 0.0 0.0  0:00.00 kthreadd
  3 root   20  0    0   0   0 S 0.0 0.0  0:00.00 ksoftirqd/0
  6 root   20  0    0   0   0 S 0.0 0.0  0:00.07 kworker/u2:0
[output truncated]

Table 21.1, “Interactive top commands” contains useful interactive commands that you can use with top. For more information, see the top(1) manual page.

Table 21.1. Interactive top commands

CommandDescription

Enter, Space

Immediately refreshes the display.

h

Displays a help screen for interactive commands.

h, ?

Displays a help screen for windows and field groups.

k

Kills a process. You are prompted for the process ID and the signal to send to it.

n

Changes the number of displayed processes. You are prompted to enter the number.

u

Sorts the list by user.

M

Sorts the list by memory usage.

P

Sorts the list by CPU usage.

q

Terminates the utility and returns to the shell prompt.

21.1.3. Using the System Monitor Tool

The Processes tab of the System Monitor tool allows you to view, search for, change the priority of, and kill processes from the graphical user interface.

To start the System Monitor tool from the command line, type gnome-system-monitor at a shell prompt. The System Monitor tool appears. Alternatively, if using the GNOME desktop, press the Super key to enter the Activities Overview, type System Monitor and then press Enter. The System Monitor tool appears. The Super key appears in a variety of guises, depending on the keyboard and other hardware, but often as either the Windows or Command key, and typically to the left of the Spacebar.

Click the Processes tab to view the list of running processes.

Figure 21.1. System Monitor — Processes

The Processes tab of the System Monitor application.

For each listed process, the System Monitor tool displays its name (Process Name), current status (Status), percentage of the CPU usage (% CPU), nice value (Nice), process ID (ID), memory usage (Memory), the channel the process is waiting in (Waiting Channel), and additional details about the session (Session). To sort the information by a specific column in ascending order, click the name of that column. Click the name of the column again to toggle the sort between ascending and descending order.

By default, the System Monitor tool displays a list of processes that are owned by the current user. Selecting various options from the View menu allows you to:

  • view only active processes,
  • view all processes,
  • view your processes,
  • view process dependencies,

Additionally, two buttons enable you to:

  • refresh the list of processes,
  • end a process by selecting it from the list and then clicking the End Process button.

21.2. Viewing Memory Usage

21.2.1. Using the free Command

The free command allows you to display the amount of free and used memory on the system. To do so, type the following at a shell prompt:

free

The free command provides information about both the physical memory (Mem) and swap space (Swap). It displays the total amount of memory (total), as well as the amount of memory that is in use (used), free (free), shared (shared), sum of buffers and cached (buff/cache), and available (available). For example:

~]$ free
       total    used    free   shared buff/cache  available
Mem:    1016800   727300    84684    3500   204816   124068
Swap:    839676    66920   772756

By default, free displays the values in kilobytes. To display the values in megabytes, supply the -m command line option:

free -m

For instance:

~]$ free -m
       total    used    free   shared buff/cache  available
Mem:      992     711     81      3     200     120
Swap:      819     65     754

For a complete list of available command line options, see the free(1) manual page.

21.2.2. Using the System Monitor Tool

The Resources tab of the System Monitor tool allows you to view the amount of free and used memory on the system.

To start the System Monitor tool from the command line, type gnome-system-monitor at a shell prompt. The System Monitor tool appears. Alternatively, if using the GNOME desktop, press the Super key to enter the Activities Overview, type System Monitor and then press Enter. The System Monitor tool appears. The Super key appears in a variety of guises, depending on the keyboard and other hardware, but often as either the Windows or Command key, and typically to the left of the Spacebar.

Click the Resources tab to view the system’s memory usage.

Figure 21.2. System Monitor — Resources

The Resources tab of the System Monitor application.

In the Memory and Swap History section, the System Monitor tool displays a graphical representation of the memory and swap usage history, as well as the total amount of the physical memory (Memory) and swap space (Swap) and how much of it is in use.

21.3. Viewing CPU Usage

21.3.1. Using the System Monitor Tool

The Resources tab of the System Monitor tool allows you to view the current CPU usage on the system.

To start the System Monitor tool from the command line, type gnome-system-monitor at a shell prompt. The System Monitor tool appears. Alternatively, if using the GNOME desktop, press the Super key to enter the Activities Overview, type System Monitor and then press Enter. The System Monitor tool appears. The Super key appears in a variety of guises, depending on the keyboard and other hardware, but often as either the Windows or Command key, and typically to the left of the Spacebar.

Click the Resources tab to view the system’s CPU usage.

In the CPU History section, the System Monitor tool displays a graphical representation of the CPU usage history and shows the percentage of how much CPU is currently in use.

21.4. Viewing Block Devices and File Systems

21.4.1. Using the lsblk Command

The lsblk command allows you to display a list of available block devices. It provides more information and better control on output formatting than the blkid command. It reads information from udev, therefore it is usable by non-root users. To display a list of block devices, type the following at a shell prompt:

lsblk

For each listed block device, the lsblk command displays the device name (NAME), major and minor device number (MAJ:MIN), if the device is removable (RM), its size (SIZE), if the device is read-only (RO), what type it is (TYPE), and where the device is mounted (MOUNTPOINT). For example:

~]$ lsblk
NAME           MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0            11:0  1 1024M 0 rom
vda            252:0  0  20G 0 rom
|-vda1          252:1  0  500M 0 part /boot
`-vda2          252:2  0 19.5G 0 part
 |-vg_kvm-lv_root (dm-0) 253:0  0  18G 0 lvm /
 `-vg_kvm-lv_swap (dm-1) 253:1  0  1.5G 0 lvm [SWAP]

By default, lsblk lists block devices in a tree-like format. To display the information as an ordinary list, add the -l command line option:

lsblk -l

For instance:

~]$ lsblk -l
NAME         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0          11:0  1 1024M 0 rom
vda          252:0  0  20G 0 rom
vda1         252:1  0  500M 0 part /boot
vda2         252:2  0 19.5G 0 part
vg_kvm-lv_root (dm-0) 253:0  0  18G 0 lvm /
vg_kvm-lv_swap (dm-1) 253:1  0  1.5G 0 lvm [SWAP]

For a complete list of available command line options, see the lsblk(8) manual page.

21.4.2. Using the blkid Command

The blkid command allows you to display low-level information about available block devices. It requires root privileges, therefore non-root users should use the lsblk command. To do so, type the following at a shell prompt as root:

blkid

For each listed block device, the blkid command displays available attributes such as its universally unique identifier (UUID), file system type (TYPE), or volume label (LABEL). For example:

~]# blkid
/dev/vda1: UUID="7fa9c421-0054-4555-b0ca-b470a97a3d84" TYPE="ext4"
/dev/vda2: UUID="7IvYzk-TnnK-oPjf-ipdD-cofz-DXaJ-gPdgBW" TYPE="LVM2_member"
/dev/mapper/vg_kvm-lv_root: UUID="a07b967c-71a0-4925-ab02-aebcad2ae824" TYPE="ext4"
/dev/mapper/vg_kvm-lv_swap: UUID="d7ef54ca-9c41-4de4-ac1b-4193b0c1ddb6" TYPE="swap"

By default, the blkid command lists all available block devices. To display information about a particular device only, specify the device name on the command line:

blkid device_name

For instance, to display information about /dev/vda1, type as root:

~]# blkid /dev/vda1
/dev/vda1: UUID="7fa9c421-0054-4555-b0ca-b470a97a3d84" TYPE="ext4"

You can also use the above command with the -p and -o udev command line options to obtain more detailed information. Note that root privileges are required to run this command:

blkid -po udev device_name

For example:

~]# blkid -po udev /dev/vda1
ID_FS_UUID=7fa9c421-0054-4555-b0ca-b470a97a3d84
ID_FS_UUID_ENC=7fa9c421-0054-4555-b0ca-b470a97a3d84
ID_FS_VERSION=1.0
ID_FS_TYPE=ext4
ID_FS_USAGE=filesystem

For a complete list of available command line options, see the blkid(8) manual page.

21.4.3. Using the findmnt Command

The findmnt command allows you to display a list of currently mounted file systems. To do so, type the following at a shell prompt:

findmnt

For each listed file system, the findmnt command displays the target mount point (TARGET), source device (SOURCE), file system type (FSTYPE), and relevant mount options (OPTIONS). For example:

~]$ findmnt
TARGET              SOURCE       FSTYPE     OPTIONS
/                /dev/mapper/rhel-root
                          xfs       rw,relatime,seclabel,attr2,inode64,noquota
├─/proc             proc        proc      rw,nosuid,nodev,noexec,relatime
│ ├─/proc/sys/fs/binfmt_misc   systemd-1     autofs     rw,relatime,fd=32,pgrp=1,timeout=300,minproto=5,maxproto=5,direct
│ └─/proc/fs/nfsd        sunrpc       nfsd      rw,relatime
├─/sys              sysfs       sysfs      rw,nosuid,nodev,noexec,relatime,seclabel
│ ├─/sys/kernel/security     securityfs     securityfs   rw,nosuid,nodev,noexec,relatime
│ ├─/sys/fs/cgroup        tmpfs       tmpfs      rw,nosuid,nodev,noexec,seclabel,mode=755
[output truncated]

By default, findmnt lists file systems in a tree-like format. To display the information as an ordinary list, add the -l command line option:

findmnt -l

For instance:

~]$ findmnt -l
TARGET           SOURCE        FSTYPE     OPTIONS
/proc           proc         proc      rw,nosuid,nodev,noexec,relatime
/sys            sysfs         sysfs      rw,nosuid,nodev,noexec,relatime,seclabel
/dev            devtmpfs       devtmpfs    rw,nosuid,seclabel,size=933372k,nr_inodes=233343,mode=755
/sys/kernel/security    securityfs      securityfs   rw,nosuid,nodev,noexec,relatime
/dev/shm          tmpfs         tmpfs      rw,nosuid,nodev,seclabel
/dev/pts          devpts        devpts     rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000
/run            tmpfs         tmpfs      rw,nosuid,nodev,seclabel,mode=755
/sys/fs/cgroup       tmpfs         tmpfs      rw,nosuid,nodev,noexec,seclabel,mode=755
[output truncated]

You can also choose to list only file systems of a particular type. To do so, add the -t command line option followed by a file system type:

findmnt -t type

For example, to all list xfs file systems, type:

~]$ findmnt -t xfs
TARGET SOURCE        FSTYPE OPTIONS
/    /dev/mapper/rhel-root xfs  rw,relatime,seclabel,attr2,inode64,noquota
└─/boot /dev/vda1       xfs  rw,relatime,seclabel,attr2,inode64,noquota

For a complete list of available command line options, see the findmnt(8) manual page.

21.4.4. Using the df Command

The df command allows you to display a detailed report on the system’s disk space usage. To do so, type the following at a shell prompt:

df

For each listed file system, the df command displays its name (Filesystem), size (1K-blocks or Size), how much space is used (Used), how much space is still available (Available), the percentage of space usage (Use%), and where is the file system mounted (Mounted on). For example:

~]$ df
Filesystem         1K-blocks   Used Available Use% Mounted on
/dev/mapper/vg_kvm-lv_root 18618236  4357360 13315112 25% /
tmpfs             380376    288  380088  1% /dev/shm
/dev/vda1           495844   77029  393215 17% /boot

By default, the df command shows the partition size in 1 kilobyte blocks and the amount of used and available disk space in kilobytes. To view the information in megabytes and gigabytes, supply the -h command line option, which causes df to display the values in a human-readable format:

df -h

For instance:

~]$ df -h
Filesystem         Size Used Avail Use% Mounted on
/dev/mapper/vg_kvm-lv_root  18G 4.2G  13G 25% /
tmpfs            372M 288K 372M  1% /dev/shm
/dev/vda1          485M  76M 384M 17% /boot

For a complete list of available command line options, see the df(1) manual page.

21.4.5. Using the du Command

The du command allows you to displays the amount of space that is being used by files in a directory. To display the disk usage for each of the subdirectories in the current working directory, run the command with no additional command line options:

du

For example:

~]$ du
14972  ./Downloads
4    ./.mozilla/extensions
4    ./.mozilla/plugins
12   ./.mozilla
15004  .

By default, the du command displays the disk usage in kilobytes. To view the information in megabytes and gigabytes, supply the -h command line option, which causes the utility to display the values in a human-readable format:

du -h

For instance:

~]$ du -h
15M   ./Downloads
4.0K  ./.mozilla/extensions
4.0K  ./.mozilla/plugins
12K   ./.mozilla
15M   .

At the end of the list, the du command always shows the grand total for the current directory. To display only this information, supply the -s command line option:

du -sh

For example:

~]$ du -sh
15M   .

For a complete list of available command line options, see the du(1) manual page.

21.4.6. Using the System Monitor Tool

The File Systems tab of the System Monitor tool allows you to view file systems and disk space usage in the graphical user interface.

To start the System Monitor tool from the command line, type gnome-system-monitor at a shell prompt. The System Monitor tool appears. Alternatively, if using the GNOME desktop, press the Super key to enter the Activities Overview, type System Monitor and then press Enter. The System Monitor tool appears. The Super key appears in a variety of guises, depending on the keyboard and other hardware, but often as either the Windows or Command key, and typically to the left of the Spacebar.

Click the File Systems tab to view a list of file systems.

Figure 21.3. System Monitor — File Systems

The File Systems tab of the System Monitor application.

For each listed file system, the System Monitor tool displays the source device (Device), target mount point (Directory), and file system type (Type), as well as its size (Total), and how much space is available (Available), and used (Used).

21.5. Viewing Hardware Information

21.5.1. Using the lspci Command

The lspci command allows you to display information about PCI buses and devices that are attached to them. To list all PCI devices that are in the system, type the following at a shell prompt:

lspci

This displays a simple list of devices, for example:

~]$ lspci
00:00.0 Host bridge: Intel Corporation 82X38/X48 Express DRAM Controller
00:01.0 PCI bridge: Intel Corporation 82X38/X48 Express Host-Primary PCI Express Bridge
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
[output truncated]

You can also use the -v command line option to display more verbose output, or -vv for very verbose output:

lspci -v|-vv

For instance, to determine the manufacturer, model, and memory size of a system’s video card, type:

~]$ lspci -v
[output truncated]

01:00.0 VGA compatible controller: nVidia Corporation G84 [Quadro FX 370] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: nVidia Corporation Device 0491
    Physical Slot: 2
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Memory at f2000000 (32-bit, non-prefetchable) [size=16M]
    Memory at e0000000 (64-bit, prefetchable) [size=256M]
    Memory at f0000000 (64-bit, non-prefetchable) [size=32M]
    I/O ports at 1100 [size=128]
    Expansion ROM at <unassigned> [disabled]
    Capabilities: <access denied>
    Kernel driver in use: nouveau
    Kernel modules: nouveau, nvidiafb

[output truncated]

For a complete list of available command line options, see the lspci(8) manual page.

21.5.2. Using the lsusb Command

The lsusb command allows you to display information about USB buses and devices that are attached to them. To list all USB devices that are in the system, type the following at a shell prompt:

lsusb

This displays a simple list of devices, for example:

~]$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
[output truncated]
Bus 001 Device 002: ID 0bda:0151 Realtek Semiconductor Corp. Mass Storage Device (Multicard Reader)
Bus 008 Device 002: ID 03f0:2c24 Hewlett-Packard Logitech M-UAL-96 Mouse
Bus 008 Device 003: ID 04b3:3025 IBM Corp.

You can also use the -v command line option to display more verbose output:

lsusb -v

For instance:

~]$ lsusb -v
[output truncated]

Bus 008 Device 002: ID 03f0:2c24 Hewlett-Packard Logitech M-UAL-96 Mouse
Device Descriptor:
 bLength        18
 bDescriptorType     1
 bcdUSB        2.00
 bDeviceClass      0 (Defined at Interface level)
 bDeviceSubClass     0
 bDeviceProtocol     0
 bMaxPacketSize0     8
 idVendor      0x03f0 Hewlett-Packard
 idProduct     0x2c24 Logitech M-UAL-96 Mouse
 bcdDevice      31.00
 iManufacturer      1
 iProduct        2
 iSerial         0
 bNumConfigurations   1
 Configuration Descriptor:
  bLength         9
  bDescriptorType     2
[output truncated]

For a complete list of available command line options, see the lsusb(8) manual page.

21.5.3. Using the lscpu Command

The lscpu command allows you to list information about CPUs that are present in the system, including the number of CPUs, their architecture, vendor, family, model, CPU caches, etc. To do so, type the following at a shell prompt:

lscpu

For example:

~]$ lscpu
Architecture:     x86_64
CPU op-mode(s):    32-bit, 64-bit
Byte Order:      Little Endian
CPU(s):        4
On-line CPU(s) list:  0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):       1
NUMA node(s):     1
Vendor ID:       GenuineIntel
CPU family:      6
Model:         23
Stepping:       7
CPU MHz:        1998.000
BogoMIPS:       4999.98
Virtualization:    VT-x
L1d cache:       32K
L1i cache:       32K
L2 cache:       3072K
NUMA node0 CPU(s):   0-3

For a complete list of available command line options, see the lscpu(1) manual page.

21.6. Checking for Hardware Errors

Red Hat Enterprise Linux 7 introduced the new hardware event report mechanism (HERM.) This mechanism gathers system-reported memory errors as well as errors reported by the error detection and correction (EDAC) mechanism for dual in-line memory modules (DIMMs) and reports them to user space. The user-space daemon rasdaemon, catches and handles all reliability, availability, and serviceability (RAS) error events that come from the kernel tracing mechanism, and logs them. The functions previously provided by edac-utils are now replaced by rasdaemon.

To install rasdaemon, enter the following command as root:

~]# yum install rasdaemon

Start the service as follows:

~]# systemctl start rasdaemon

To make the service run at system start, enter the following command:

~]# systemctl enable rasdaemon

The ras-mc-ctl utility provides a means to work with EDAC drivers. Enter the following command to see a list of command options:

~]$ ras-mc-ctl --help
Usage: ras-mc-ctl [OPTIONS...]
 --quiet      Quiet operation.
 --mainboard    Print mainboard vendor and model for this hardware.
 --status      Print status of EDAC drivers.
output truncated

To view a summary of memory controller events, run as root:

~]# ras-mc-ctl --summary
Memory controller events summary:
    Corrected on DIMM Label(s): 'CPU_SrcID#0_Ha#0_Chan#0_DIMM#0' location: 0:0:0:-1 errors: 1

No PCIe AER errors.

No Extlog errors.
MCE records summary:
    1 MEMORY CONTROLLER RD_CHANNEL0_ERR Transaction: Memory read error errors
    2 No Error errors

To view a list of errors reported by the memory controller, run as root:

~]# ras-mc-ctl --errors
Memory controller events:
1 3172-02-17 00:47:01 -0500 1 Corrected error(s): memory read error at CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 location: 0:0:0:-1, addr 65928, grain 7, syndrome 0 area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0

No PCIe AER errors.

No Extlog errors.

MCE events:
1 3171-11-09 06:20:21 -0500 error: MEMORY CONTROLLER RD_CHANNEL0_ERR Transaction: Memory read error, mcg mcgstatus=0, mci Corrected_error, n_errors=1, mcgcap=0x01000c16, status=0x8c00004000010090, addr=0x1018893000, misc=0x15020a086, walltime=0x57e96780, cpuid=0x00050663, bank=0x00000007
2 3205-06-22 00:13:41 -0400 error: No Error, mcg mcgstatus=0, mci Corrected_error Error_enabled, mcgcap=0x01000c16, status=0x9400000000000000, addr=0x0000abcd, walltime=0x57e967ea, cpuid=0x00050663, bank=0x00000001
3 3205-06-22 00:13:41 -0400 error: No Error, mcg mcgstatus=0, mci Corrected_error Error_enabled, mcgcap=0x01000c16, status=0x9400000000000000, addr=0x00001234, walltime=0x57e967ea, cpu=0x00000001, cpuid=0x00050663, apicid=0x00000002, bank=0x00000002

These commands are also described in the ras-mc-ctl(8) manual page.

21.7. Monitoring Performance with Net-SNMP

Red Hat Enterprise Linux 7 includes the Net-SNMP software suite, which includes a flexible and extensible simple network management protocol (SNMP) agent. This agent and its associated utilities can be used to provide performance data from a large number of systems to a variety of tools which support polling over the SNMP protocol.

This section provides information on configuring the Net-SNMP agent to securely provide performance data over the network, retrieving the data using the SNMP protocol, and extending the SNMP agent to provide custom performance metrics.

21.7.1. Installing Net-SNMP

The Net-SNMP software suite is available as a set of RPM packages in the Red Hat Enterprise Linux software distribution. Table 21.2, “Available Net-SNMP packages” summarizes each of the packages and their contents.

Table 21.2. Available Net-SNMP packages

PackageProvides

net-snmp

The SNMP Agent Daemon and documentation. This package is required for exporting performance data.

net-snmp-libs

The netsnmp library and the bundled management information bases (MIBs). This package is required for exporting performance data.

net-snmp-utils

SNMP clients such as snmpget and snmpwalk. This package is required in order to query a system’s performance data over SNMP.

net-snmp-perl

The mib2c utility and the NetSNMP Perl module. Note that this package is provided by the Optional channel. See Section 9.5.7, “Adding the Optional and Supplementary Repositories” for more information on Red Hat additional channels.

net-snmp-python

An SNMP client library for Python. Note that this package is provided by the Optional channel. See Section 9.5.7, “Adding the Optional and Supplementary Repositories” for more information on Red Hat additional channels.

To install any of these packages, use the yum command in the following form:

yum install package&hellip;

For example, to install the SNMP Agent Daemon and SNMP clients used in the rest of this section, type the following at a shell prompt as root:

~]# yum install net-snmp net-snmp-libs net-snmp-utils

For more information on how to install new packages in Red Hat Enterprise Linux, see Section 9.2.4, “Installing Packages”.

21.7.2. Running the Net-SNMP Daemon

The net-snmp package contains snmpd, the SNMP Agent Daemon. This section provides information on how to start, stop, and restart the snmpd service. For more information on managing system services in Red Hat Enterprise Linux 7, see Chapter 10, Managing Services with systemd.

21.7.2.1. Starting the Service

To run the snmpd service in the current session, type the following at a shell prompt as root:

systemctl start snmpd.service

To configure the service to be automatically started at boot time, use the following command:

systemctl enable snmpd.service

21.7.2.2. Stopping the Service

To stop the running snmpd service, type the following at a shell prompt as root:

systemctl stop snmpd.service

To disable starting the service at boot time, use the following command:

systemctl disable snmpd.service

21.7.2.3. Restarting the Service

To restart the running snmpd service, type the following at a shell prompt:

systemctl restart snmpd.service

This command stops the service and starts it again in quick succession. To only reload the configuration without stopping the service, run the following command instead:

systemctl reload snmpd.service

This causes the running snmpd service to reload its configuration.

21.7.3. Configuring Net-SNMP

To change the Net-SNMP Agent Daemon configuration, edit the /etc/snmp/snmpd.conf configuration file. The default snmpd.conf file included with Red Hat Enterprise Linux 7 is heavily commented and serves as a good starting point for agent configuration.

This section focuses on two common tasks: setting system information and configuring authentication. For more information about available configuration directives, see the snmpd.conf(5) manual page. Additionally, there is a utility in the net-snmp package named snmpconf which can be used to interactively generate a valid agent configuration.

Note that the net-snmp-utils package must be installed in order to use the snmpwalk utility described in this section.

Note

For any changes to the configuration file to take effect, force the snmpd service to re-read the configuration by running the following command as root:

systemctl reload snmpd.service

21.7.3.1. Setting System Information

Net-SNMP provides some rudimentary system information via the system tree. For example, the following snmpwalk command shows the system tree with a default agent configuration.

~]# snmpwalk -v2c -c public localhost system
SNMPv2-MIB::sysDescr.0 = STRING: Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Mon May 5 11:16:57 EDT 2014 x86_64
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (464) 0:00:04.64
SNMPv2-MIB::sysContact.0 = STRING: Root <root@localhost> (configure /etc/snmp/snmp.local.conf)
[output truncated]

By default, the sysName object is set to the host name. The sysLocation and sysContact objects can be configured in the /etc/snmp/snmpd.conf file by changing the value of the syslocation and syscontact directives, for example:

syslocation Datacenter, Row 4, Rack 3
syscontact UNIX Admin <admin@example.com>

After making changes to the configuration file, reload the configuration and test it by running the snmpwalk command again:

~]# systemctl reload snmp.service
~]# snmpwalk -v2c -c public localhost system
SNMPv2-MIB::sysDescr.0 = STRING: Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Mon May 5 11:16:57 EDT 2014 x86_64
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (35424) 0:05:54.24
SNMPv2-MIB::sysContact.0 = STRING: UNIX Admin <admin@example.com>
SNMPv2-MIB::sysName.0 = STRING: localhost.localdomain
SNMPv2-MIB::sysLocation.0 = STRING: Datacenter, Row 4, Rack 3
[output truncated]

21.7.3.2. Configuring Authentication

The Net-SNMP Agent Daemon supports all three versions of the SNMP protocol. The first two versions (1 and 2c) provide for simple authentication using a community string. This string is a shared secret between the agent and any client utilities. The string is passed in clear text over the network however and is not considered secure. Version 3 of the SNMP protocol supports user authentication and message encryption using a variety of protocols. The Net-SNMP agent also supports tunneling over SSH, and TLS authentication with X.509 certificates.

Configuring SNMP Version 2c Community

To configure an SNMP version 2c community, use either the rocommunity or rwcommunity directive in the /etc/snmp/snmpd.conf configuration file. The format of the directives is as follows:

directive community source OID

… where community is the community string to use, source is an IP address or subnet, and OID is the SNMP tree to provide access to. For example, the following directive provides read-only access to the system tree to a client using the community string "redhat" on the local machine:

rocommunity redhat 127.0.0.1 .1.3.6.1.2.1.1

To test the configuration, use the snmpwalk command with the -v and -c options.

~]# snmpwalk -v2c -c redhat localhost system
SNMPv2-MIB::sysDescr.0 = STRING: Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Mon May 5 11:16:57 EDT 2014 x86_64
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (101376) 0:16:53.76
SNMPv2-MIB::sysContact.0 = STRING: UNIX Admin <admin@example.com>
SNMPv2-MIB::sysName.0 = STRING: localhost.localdomain
SNMPv2-MIB::sysLocation.0 = STRING: Datacenter, Row 4, Rack 3
[output truncated]
Configuring SNMP Version 3 User

To configure an SNMP version 3 user, use the net-snmp-create-v3-user command. This command adds entries to the /var/lib/net-snmp/snmpd.conf and /etc/snmp/snmpd.conf files which create the user and grant access to the user. Note that the net-snmp-create-v3-user command may only be run when the agent is not running. The following example creates the "admin" user with the password "redhatsnmp":

~]# systemctl stop snmpd.service
~]# net-snmp-create-v3-user
Enter a SNMPv3 user name to create:
admin
Enter authentication pass-phrase:
redhatsnmp
Enter encryption pass-phrase:
 [press return to reuse the authentication pass-phrase]

adding the following line to /var/lib/net-snmp/snmpd.conf:
  createUser admin MD5 "redhatsnmp" DES
adding the following line to /etc/snmp/snmpd.conf:
  rwuser admin
~]# systemctl start snmpd.service

The rwuser directive (or rouser when the -ro command line option is supplied) that net-snmp-create-v3-user adds to /etc/snmp/snmpd.conf has a similar format to the rwcommunity and rocommunity directives:

directive user noauth|auth|priv OID

… where user is a user name and OID is the SNMP tree to provide access to. By default, the Net-SNMP Agent Daemon allows only authenticated requests (the auth option). The noauth option allows you to permit unauthenticated requests, and the priv option enforces the use of encryption. The authpriv option specifies that requests must be authenticated and replies should be encrypted.

For example, the following line grants the user "admin" read-write access to the entire tree:

rwuser admin authpriv .1

To test the configuration, create a .snmp/ directory in your user’s home directory and a configuration file named snmp.conf in that directory (~/.snmp/snmp.conf) with the following lines:

defVersion 3
defSecurityLevel authPriv
defSecurityName admin
defPassphrase redhatsnmp

The snmpwalk command will now use these authentication settings when querying the agent:

~]$ snmpwalk -v3 localhost system
SNMPv2-MIB::sysDescr.0 = STRING: Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Mon May 5 11:16:57 EDT 2014 x86_64
[output truncated]

21.7.4. Retrieving Performance Data over SNMP

The Net-SNMP Agent in Red Hat Enterprise Linux provides a wide variety of performance information over the SNMP protocol. In addition, the agent can be queried for a listing of the installed RPM packages on the system, a listing of currently running processes on the system, or the network configuration of the system.

This section provides an overview of OIDs related to performance tuning available over SNMP. It assumes that the net-snmp-utils package is installed and that the user is granted access to the SNMP tree as described in Section 21.7.3.2, “Configuring Authentication”.

21.7.4.1. Hardware Configuration

The Host Resources MIB included with Net-SNMP presents information about the current hardware and software configuration of a host to a client utility. Table 21.3, “Available OIDs” summarizes the different OIDs available under that MIB.

Table 21.3. Available OIDs

OIDDescription

HOST-RESOURCES-MIB::hrSystem

Contains general system information such as uptime, number of users, and number of running processes.

HOST-RESOURCES-MIB::hrStorage

Contains data on memory and file system usage.

HOST-RESOURCES-MIB::hrDevices

Contains a listing of all processors, network devices, and file systems.

HOST-RESOURCES-MIB::hrSWRun

Contains a listing of all running processes.

HOST-RESOURCES-MIB::hrSWRunPerf

Contains memory and CPU statistics on the process table from HOST-RESOURCES-MIB::hrSWRun.

HOST-RESOURCES-MIB::hrSWInstalled

Contains a listing of the RPM database.

There are also a number of SNMP tables available in the Host Resources MIB which can be used to retrieve a summary of the available information. The following example displays HOST-RESOURCES-MIB::hrFSTable:

~]$ snmptable -Cb localhost HOST-RESOURCES-MIB::hrFSTable
SNMP table: HOST-RESOURCES-MIB::hrFSTable

 Index MountPoint RemoteMountPoint                Type
  Access Bootable StorageIndex LastFullBackupDate LastPartialBackupDate
   1    "/"        "" HOST-RESOURCES-TYPES::hrFSLinuxExt2
 readWrite   true      31   0-1-1,0:0:0.0     0-1-1,0:0:0.0
   5 "/dev/shm"        ""   HOST-RESOURCES-TYPES::hrFSOther
 readWrite  false      35   0-1-1,0:0:0.0     0-1-1,0:0:0.0
   6  "/boot"        "" HOST-RESOURCES-TYPES::hrFSLinuxExt2
 readWrite  false      36   0-1-1,0:0:0.0     0-1-1,0:0:0.0

For more information about HOST-RESOURCES-MIB, see the /usr/share/snmp/mibs/HOST-RESOURCES-MIB.txt file.

21.7.4.2. CPU and Memory Information

Most system performance data is available in the UCD SNMP MIB. The systemStats OID provides a number of counters around processor usage:

~]$ snmpwalk localhost UCD-SNMP-MIB::systemStats
UCD-SNMP-MIB::ssIndex.0 = INTEGER: 1
UCD-SNMP-MIB::ssErrorName.0 = STRING: systemStats
UCD-SNMP-MIB::ssSwapIn.0 = INTEGER: 0 kB
UCD-SNMP-MIB::ssSwapOut.0 = INTEGER: 0 kB
UCD-SNMP-MIB::ssIOSent.0 = INTEGER: 0 blocks/s
UCD-SNMP-MIB::ssIOReceive.0 = INTEGER: 0 blocks/s
UCD-SNMP-MIB::ssSysInterrupts.0 = INTEGER: 29 interrupts/s
UCD-SNMP-MIB::ssSysContext.0 = INTEGER: 18 switches/s
UCD-SNMP-MIB::ssCpuUser.0 = INTEGER: 0
UCD-SNMP-MIB::ssCpuSystem.0 = INTEGER: 0
UCD-SNMP-MIB::ssCpuIdle.0 = INTEGER: 99
UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 2278
UCD-SNMP-MIB::ssCpuRawNice.0 = Counter32: 1395
UCD-SNMP-MIB::ssCpuRawSystem.0 = Counter32: 6826
UCD-SNMP-MIB::ssCpuRawIdle.0 = Counter32: 3383736
UCD-SNMP-MIB::ssCpuRawWait.0 = Counter32: 7629
UCD-SNMP-MIB::ssCpuRawKernel.0 = Counter32: 0
UCD-SNMP-MIB::ssCpuRawInterrupt.0 = Counter32: 434
UCD-SNMP-MIB::ssIORawSent.0 = Counter32: 266770
UCD-SNMP-MIB::ssIORawReceived.0 = Counter32: 427302
UCD-SNMP-MIB::ssRawInterrupts.0 = Counter32: 743442
UCD-SNMP-MIB::ssRawContexts.0 = Counter32: 718557
UCD-SNMP-MIB::ssCpuRawSoftIRQ.0 = Counter32: 128
UCD-SNMP-MIB::ssRawSwapIn.0 = Counter32: 0
UCD-SNMP-MIB::ssRawSwapOut.0 = Counter32: 0

In particular, the ssCpuRawUser, ssCpuRawSystem, ssCpuRawWait, and ssCpuRawIdle OIDs provide counters which are helpful when determining whether a system is spending most of its processor time in kernel space, user space, or I/O. ssRawSwapIn and ssRawSwapOut can be helpful when determining whether a system is suffering from memory exhaustion.

More memory information is available under the UCD-SNMP-MIB::memory OID, which provides similar data to the free command:

~]$ snmpwalk localhost UCD-SNMP-MIB::memory
UCD-SNMP-MIB::memIndex.0 = INTEGER: 0
UCD-SNMP-MIB::memErrorName.0 = STRING: swap
UCD-SNMP-MIB::memTotalSwap.0 = INTEGER: 1023992 kB
UCD-SNMP-MIB::memAvailSwap.0 = INTEGER: 1023992 kB
UCD-SNMP-MIB::memTotalReal.0 = INTEGER: 1021588 kB
UCD-SNMP-MIB::memAvailReal.0 = INTEGER: 634260 kB
UCD-SNMP-MIB::memTotalFree.0 = INTEGER: 1658252 kB
UCD-SNMP-MIB::memMinimumSwap.0 = INTEGER: 16000 kB
UCD-SNMP-MIB::memBuffer.0 = INTEGER: 30760 kB
UCD-SNMP-MIB::memCached.0 = INTEGER: 216200 kB
UCD-SNMP-MIB::memSwapError.0 = INTEGER: noError(0)
UCD-SNMP-MIB::memSwapErrorMsg.0 = STRING:

Load averages are also available in the UCD SNMP MIB. The SNMP table UCD-SNMP-MIB::laTable has a listing of the 1, 5, and 15 minute load averages:

~]$ snmptable localhost UCD-SNMP-MIB::laTable
SNMP table: UCD-SNMP-MIB::laTable

 laIndex laNames laLoad laConfig laLoadInt laLoadFloat laErrorFlag laErrMessage
    1 Load-1  0.00  12.00     0  0.000000   noError
    2 Load-5  0.00  12.00     0  0.000000   noError
    3 Load-15  0.00  12.00     0  0.000000   noError

21.7.4.3. File System and Disk Information

The Host Resources MIB provides information on file system size and usage. Each file system (and also each memory pool) has an entry in the HOST-RESOURCES-MIB::hrStorageTable table:

~]$ snmptable -Cb localhost HOST-RESOURCES-MIB::hrStorageTable
SNMP table: HOST-RESOURCES-MIB::hrStorageTable

 Index                     Type      Descr
AllocationUnits  Size  Used AllocationFailures
   1      HOST-RESOURCES-TYPES::hrStorageRam Physical memory
1024 Bytes 1021588 388064         ?
   3 HOST-RESOURCES-TYPES::hrStorageVirtualMemory Virtual memory
1024 Bytes 2045580 388064         ?
   6     HOST-RESOURCES-TYPES::hrStorageOther Memory buffers
1024 Bytes 1021588 31048         ?
   7     HOST-RESOURCES-TYPES::hrStorageOther  Cached memory
1024 Bytes 216604 216604         ?
  10 HOST-RESOURCES-TYPES::hrStorageVirtualMemory   Swap space
1024 Bytes 1023992   0         ?
  31   HOST-RESOURCES-TYPES::hrStorageFixedDisk        /
4096 Bytes 2277614 250391         ?
  35   HOST-RESOURCES-TYPES::hrStorageFixedDisk    /dev/shm
4096 Bytes 127698   0         ?
  36   HOST-RESOURCES-TYPES::hrStorageFixedDisk      /boot
1024 Bytes 198337 26694         ?

The OIDs under HOST-RESOURCES-MIB::hrStorageSize and HOST-RESOURCES-MIB::hrStorageUsed can be used to calculate the remaining capacity of each mounted file system.

I/O data is available both in UCD-SNMP-MIB::systemStats (ssIORawSent.0 and ssIORawRecieved.0) and in UCD-DISKIO-MIB::diskIOTable. The latter provides much more granular data. Under this table are OIDs for diskIONReadX and diskIONWrittenX, which provide counters for the number of bytes read from and written to the block device in question since the system boot:

~]$ snmptable -Cb localhost UCD-DISKIO-MIB::diskIOTable
SNMP table: UCD-DISKIO-MIB::diskIOTable

 Index Device   NRead NWritten Reads Writes LA1 LA5 LA15  NReadX NWrittenX
...
  25  sda 216886272 139109376 16409  4894  ?  ?  ? 216886272 139109376
  26  sda1  2455552   5120  613   2  ?  ?  ?  2455552   5120
  27  sda2  1486848     0  332   0  ?  ?  ?  1486848     0
  28  sda3 212321280 139104256 15312  4871  ?  ?  ? 212321280 139104256

21.7.4.4. Network Information

The Interfaces MIB provides information on network devices. IF-MIB::ifTable provides an SNMP table with an entry for each interface on the system, the configuration of the interface, and various packet counters for the interface. The following example shows the first few columns of ifTable on a system with two physical network interfaces:

~]$ snmptable -Cb localhost IF-MIB::ifTable
SNMP table: IF-MIB::ifTable

 Index Descr       Type  Mtu  Speed   PhysAddress AdminStatus
   1  lo softwareLoopback 16436 10000000              up
   2 eth0  ethernetCsmacd 1500    0 52:54:0:c7:69:58     up
   3 eth1  ethernetCsmacd 1500    0 52:54:0:a7:a3:24    down

Network traffic is available under the OIDs IF-MIB::ifOutOctets and IF-MIB::ifInOctets. The following SNMP queries will retrieve network traffic for each of the interfaces on this system:

~]$ snmpwalk localhost IF-MIB::ifDescr
IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: eth0
IF-MIB::ifDescr.3 = STRING: eth1
~]$ snmpwalk localhost IF-MIB::ifOutOctets
IF-MIB::ifOutOctets.1 = Counter32: 10060699
IF-MIB::ifOutOctets.2 = Counter32: 650
IF-MIB::ifOutOctets.3 = Counter32: 0
~]$ snmpwalk localhost IF-MIB::ifInOctets
IF-MIB::ifInOctets.1 = Counter32: 10060699
IF-MIB::ifInOctets.2 = Counter32: 78650
IF-MIB::ifInOctets.3 = Counter32: 0

21.7.5. Extending Net-SNMP

The Net-SNMP Agent can be extended to provide application metrics in addition to raw system metrics. This allows for capacity planning as well as performance issue troubleshooting. For example, it may be helpful to know that an email system had a 5-minute load average of 15 while being tested, but it is more helpful to know that the email system has a load average of 15 while processing 80,000 messages a second. When application metrics are available via the same interface as the system metrics, this also allows for the visualization of the impact of different load scenarios on system performance (for example, each additional 10,000 messages increases the load average linearly until 100,000).

A number of the applications included in Red Hat Enterprise Linux extend the Net-SNMP Agent to provide application metrics over SNMP. There are several ways to extend the agent for custom applications as well. This section describes extending the agent with shell scripts and the Perl plug-ins from the Optional channel. It assumes that the net-snmp-utils and net-snmp-perl packages are installed, and that the user is granted access to the SNMP tree as described in Section 21.7.3.2, “Configuring Authentication”.

21.7.5.1. Extending Net-SNMP with Shell Scripts

The Net-SNMP Agent provides an extension MIB (NET-SNMP-EXTEND-MIB) that can be used to query arbitrary shell scripts. To specify the shell script to run, use the extend directive in the /etc/snmp/snmpd.conf file. Once defined, the Agent will provide the exit code and any output of the command over SNMP. The example below demonstrates this mechanism with a script which determines the number of httpd processes in the process table.

Note

The Net-SNMP Agent also provides a built-in mechanism for checking the process table via the proc directive. See the snmpd.conf(5) manual page for more information.

The exit code of the following shell script is the number of httpd processes running on the system at a given point in time:

#!/bin/sh

NUMPIDS=pgrep httpd | wc -l

exit $NUMPIDS

To make this script available over SNMP, copy the script to a location on the system path, set the executable bit, and add an extend directive to the /etc/snmp/snmpd.conf file. The format of the extend directive is the following:

extend name prog args

… where name is an identifying string for the extension, prog is the program to run, and args are the arguments to give the program. For instance, if the above shell script is copied to /usr/local/bin/check_apache.sh, the following directive will add the script to the SNMP tree:

extend httpd_pids /bin/sh /usr/local/bin/check_apache.sh

The script can then be queried at NET-SNMP-EXTEND-MIB::nsExtendObjects:

~]$ snmpwalk localhost NET-SNMP-EXTEND-MIB::nsExtendObjects
NET-SNMP-EXTEND-MIB::nsExtendNumEntries.0 = INTEGER: 1
NET-SNMP-EXTEND-MIB::nsExtendCommand."httpd_pids" = STRING: /bin/sh
NET-SNMP-EXTEND-MIB::nsExtendArgs."httpd_pids" = STRING: /usr/local/bin/check_apache.sh
NET-SNMP-EXTEND-MIB::nsExtendInput."httpd_pids" = STRING:
NET-SNMP-EXTEND-MIB::nsExtendCacheTime."httpd_pids" = INTEGER: 5
NET-SNMP-EXTEND-MIB::nsExtendExecType."httpd_pids" = INTEGER: exec(1)
NET-SNMP-EXTEND-MIB::nsExtendRunType."httpd_pids" = INTEGER: run-on-read(1)
NET-SNMP-EXTEND-MIB::nsExtendStorage."httpd_pids" = INTEGER: permanent(4)
NET-SNMP-EXTEND-MIB::nsExtendStatus."httpd_pids" = INTEGER: active(1)
NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."httpd_pids" = STRING:
NET-SNMP-EXTEND-MIB::nsExtendOutputFull."httpd_pids" = STRING:
NET-SNMP-EXTEND-MIB::nsExtendOutNumLines."httpd_pids" = INTEGER: 1
NET-SNMP-EXTEND-MIB::nsExtendResult."httpd_pids" = INTEGER: 8
NET-SNMP-EXTEND-MIB::nsExtendOutLine."httpd_pids".1 = STRING:

Note that the exit code ("8" in this example) is provided as an INTEGER type and any output is provided as a STRING type. To expose multiple metrics as integers, supply different arguments to the script using the extend directive. For example, the following shell script can be used to determine the number of processes matching an arbitrary string, and will also output a text string giving the number of processes:

#!/bin/sh

PATTERN=$1
NUMPIDS=pgrep $PATTERN | wc -l

echo "There are $NUMPIDS $PATTERN processes."
exit $NUMPIDS

The following /etc/snmp/snmpd.conf directives will give both the number of httpd PIDs as well as the number of snmpd PIDs when the above script is copied to /usr/local/bin/check_proc.sh:

extend httpd_pids /bin/sh /usr/local/bin/check_proc.sh httpd
extend snmpd_pids /bin/sh /usr/local/bin/check_proc.sh snmpd

The following example shows the output of an snmpwalk of the nsExtendObjects OID:

~]$ snmpwalk localhost NET-SNMP-EXTEND-MIB::nsExtendObjects
NET-SNMP-EXTEND-MIB::nsExtendNumEntries.0 = INTEGER: 2
NET-SNMP-EXTEND-MIB::nsExtendCommand."httpd_pids" = STRING: /bin/sh
NET-SNMP-EXTEND-MIB::nsExtendCommand."snmpd_pids" = STRING: /bin/sh
NET-SNMP-EXTEND-MIB::nsExtendArgs."httpd_pids" = STRING: /usr/local/bin/check_proc.sh httpd
NET-SNMP-EXTEND-MIB::nsExtendArgs."snmpd_pids" = STRING: /usr/local/bin/check_proc.sh snmpd
NET-SNMP-EXTEND-MIB::nsExtendInput."httpd_pids" = STRING:
NET-SNMP-EXTEND-MIB::nsExtendInput."snmpd_pids" = STRING:
...
NET-SNMP-EXTEND-MIB::nsExtendResult."httpd_pids" = INTEGER: 8
NET-SNMP-EXTEND-MIB::nsExtendResult."snmpd_pids" = INTEGER: 1
NET-SNMP-EXTEND-MIB::nsExtendOutLine."httpd_pids".1 = STRING: There are 8 httpd processes.
NET-SNMP-EXTEND-MIB::nsExtendOutLine."snmpd_pids".1 = STRING: There are 1 snmpd processes.
Warning

Integer exit codes are limited to a range of 0–255. For values that are likely to exceed 256, either use the standard output of the script (which will be typed as a string) or a different method of extending the agent.

This last example shows a query for the free memory of the system and the number of httpd processes. This query could be used during a performance test to determine the impact of the number of processes on memory pressure:

~]$ snmpget localhost \
  'NET-SNMP-EXTEND-MIB::nsExtendResult."httpd_pids"' \
  UCD-SNMP-MIB::memAvailReal.0
NET-SNMP-EXTEND-MIB::nsExtendResult."httpd_pids" = INTEGER: 8
UCD-SNMP-MIB::memAvailReal.0 = INTEGER: 799664 kB

21.7.5.2. Extending Net-SNMP with Perl

Executing shell scripts using the extend directive is a fairly limited method for exposing custom application metrics over SNMP. The Net-SNMP Agent also provides an embedded Perl interface for exposing custom objects. The net-snmp-perl package in the Optional channel provides the NetSNMP::agent Perl module that is used to write embedded Perl plug-ins on Red Hat Enterprise Linux.

Note

Before subscribing to the Optional and Supplementary channels see the Scope of Coverage Details. If you decide to install packages from these channels, follow the steps documented in the article called How to access Optional and Supplementary channels, and -devel packages using Red Hat Subscription Manager (RHSM)? on the Red Hat Customer Portal.

The NetSNMP::agent Perl module provides an agent object which is used to handle requests for a part of the agent’s OID tree. The agent object’s constructor has options for running the agent as a sub-agent of snmpd or a standalone agent. No arguments are necessary to create an embedded agent:

use NetSNMP::agent (':all');

my $agent = new NetSNMP::agent();

The agent object has a register method which is used to register a callback function with a particular OID. The register function takes a name, OID, and pointer to the callback function. The following example will register a callback function named hello_handler with the SNMP Agent which will handle requests under the OID .1.3.6.1.4.1.8072.9999.9999:

$agent->register("hello_world", ".1.3.6.1.4.1.8072.9999.9999",
         \&hello_handler);
Note

The OID .1.3.6.1.4.1.8072.9999.9999 (NET-SNMP-MIB::netSnmpPlaypen) is typically used for demonstration purposes only. If your organization does not already have a root OID, you can obtain one by contacting an ISO Name Registration Authority (ANSI in the United States).

The handler function will be called with four parameters, HANDLER, REGISTRATION_INFO, REQUEST_INFO, and REQUESTS. The REQUESTS parameter contains a list of requests in the current call and should be iterated over and populated with data. The request objects in the list have get and set methods which allow for manipulating the OID and value of the request. For example, the following call will set the value of a request object to the string "hello world":

$request->setValue(ASN_OCTET_STR, "hello world");

The handler function should respond to two types of SNMP requests: the GET request and the GETNEXT request. The type of request is determined by calling the getMode method on the request_info object passed as the third parameter to the handler function. If the request is a GET request, the caller will expect the handler to set the value of the request object, depending on the OID of the request. If the request is a GETNEXT request, the caller will also expect the handler to set the OID of the request to the next available OID in the tree. This is illustrated in the following code example:

my $request;
my $string_value = "hello world";
my $integer_value = "8675309";

for($request = $requests; $request; $request = $request->next()) {
 my $oid = $request->getOID();
 if ($request_info->getMode() == MODE_GET) {
  if ($oid == new NetSNMP::OID(".1.3.6.1.4.1.8072.9999.9999.1.0")) {
   $request->setValue(ASN_OCTET_STR, $string_value);
  }
  elsif ($oid == new NetSNMP::OID(".1.3.6.1.4.1.8072.9999.9999.1.1")) {
   $request->setValue(ASN_INTEGER, $integer_value);
  }
 } elsif ($request_info->getMode() == MODE_GETNEXT) {
  if ($oid == new NetSNMP::OID(".1.3.6.1.4.1.8072.9999.9999.1.0")) {
   $request->setOID(".1.3.6.1.4.1.8072.9999.9999.1.1");
   $request->setValue(ASN_INTEGER, $integer_value);
  }
  elsif ($oid < new NetSNMP::OID(".1.3.6.1.4.1.8072.9999.9999.1.0")) {
   $request->setOID(".1.3.6.1.4.1.8072.9999.9999.1.0");
   $request->setValue(ASN_OCTET_STR, $string_value);
  }
 }
}

When getMode returns MODE_GET, the handler analyzes the value of the getOID call on the request object. The value of the request is set to either string_value if the OID ends in ".1.0", or set to integer_value if the OID ends in ".1.1". If the getMode returns MODE_GETNEXT, the handler determines whether the OID of the request is ".1.0", and then sets the OID and value for ".1.1". If the request is higher on the tree than ".1.0", the OID and value for ".1.0" is set. This in effect returns the "next" value in the tree so that a program like snmpwalk can traverse the tree without prior knowledge of the structure.

The type of the variable is set using constants from NetSNMP::ASN. See the perldoc for NetSNMP::ASN for a full list of available constants.

The entire code listing for this example Perl plug-in is as follows:

#!/usr/bin/perl

use NetSNMP::agent (':all');
use NetSNMP::ASN qw(ASN_OCTET_STR ASN_INTEGER);

sub hello_handler {
 my ($handler, $registration_info, $request_info, $requests) = @_;
 my $request;
 my $string_value = "hello world";
 my $integer_value = "8675309";

 for($request = $requests; $request; $request = $request->next()) {
  my $oid = $request->getOID();
  if ($request_info->getMode() == MODE_GET) {
   if ($oid == new NetSNMP::OID(".1.3.6.1.4.1.8072.9999.9999.1.0")) {
    $request->setValue(ASN_OCTET_STR, $string_value);
   }
   elsif ($oid == new NetSNMP::OID(".1.3.6.1.4.1.8072.9999.9999.1.1")) {
    $request->setValue(ASN_INTEGER, $integer_value);
   }
  } elsif ($request_info->getMode() == MODE_GETNEXT) {
   if ($oid == new NetSNMP::OID(".1.3.6.1.4.1.8072.9999.9999.1.0")) {
    $request->setOID(".1.3.6.1.4.1.8072.9999.9999.1.1");
    $request->setValue(ASN_INTEGER, $integer_value);
   }
   elsif ($oid < new NetSNMP::OID(".1.3.6.1.4.1.8072.9999.9999.1.0")) {
    $request->setOID(".1.3.6.1.4.1.8072.9999.9999.1.0");
    $request->setValue(ASN_OCTET_STR, $string_value);
   }
  }
 }
}

my $agent = new NetSNMP::agent();
$agent->register("hello_world", ".1.3.6.1.4.1.8072.9999.9999",
         \&hello_handler);

To test the plug-in, copy the above program to /usr/share/snmp/hello_world.pl and add the following line to the /etc/snmp/snmpd.conf configuration file:

perl do "/usr/share/snmp/hello_world.pl"

The SNMP Agent Daemon will need to be restarted to load the new Perl plug-in. Once it has been restarted, an snmpwalk should return the new data:

~]$ snmpwalk localhost NET-SNMP-MIB::netSnmpPlaypen
NET-SNMP-MIB::netSnmpPlaypen.1.0 = STRING: "hello world"
NET-SNMP-MIB::netSnmpPlaypen.1.1 = INTEGER: 8675309

The snmpget should also be used to exercise the other mode of the handler:

~]$ snmpget localhost \
  NET-SNMP-MIB::netSnmpPlaypen.1.0 \
  NET-SNMP-MIB::netSnmpPlaypen.1.1
NET-SNMP-MIB::netSnmpPlaypen.1.0 = STRING: "hello world"
NET-SNMP-MIB::netSnmpPlaypen.1.1 = INTEGER: 8675309

21.8. Additional Resources

To learn more about gathering system information, see the following resources.

21.8.1. Installed Documentation

  • lscpu(1) — The manual page for the lscpu command.
  • lsusb(8) — The manual page for the lsusb command.
  • findmnt(8) — The manual page for the findmnt command.
  • blkid(8) — The manual page for the blkid command.
  • lsblk(8) — The manual page for the lsblk command.
  • ps(1) — The manual page for the ps command.
  • top(1) — The manual page for the top command.
  • free(1) — The manual page for the free command.
  • df(1) — The manual page for the df command.
  • du(1) — The manual page for the du command.
  • lspci(8) — The manual page for the lspci command.
  • snmpd(8) — The manual page for the snmpd service.
  • snmpd.conf(5) — The manual page for the /etc/snmp/snmpd.conf file containing full documentation of available configuration directives.