Chapter 6. Alerts

6.1. Assigning the Notifier Role

  1. From the settings menu, select Configuration.
  2. Click the Settings accordion, and select the Red Hat CloudForms server.
  3. From the Server Control tab, select the Notifier role.

  4. Click Save.

6.2. Creating an Alert

This section outlines the basic procedure to create an alert.

Note

To send emails or SNMP traps from the Red Hat CloudForms server, you must enable the Notifier server role and set up SMTP email or SNMP traps. For more information, see General Configuration.

  1. Navigate to Control > Explorer.
  2. Click the Alerts accordion, then click Configuration, then Add a New Alert.
  3. Enter the basic details of the alert:

    1. Enter a description in the Description field.
    2. Select the Active check box to enable the alert after creation.
    3. Select the severity level from the Severity list.
    4. Select the inventory item on which to base the alert from the Based On list.
    5. Select the type of event that triggers the alert from the What to Evaluate list.
    6. From the Notification Frequency list, select how often you want to be notified if the event log threshold is reached.
  4. Configure the parameters of the alert.

    Note

    The available parameters depend on the options you selected in the Based On and What to Evaluate lists. See later sections for the details of these parameters.

  5. Optionally, select Send an E-mail to configure options so that an email is sent when the alert is triggered:

    1. Enter the email address from which to send the email in the From field.
    2. Select a user from the Add a User list to add a user registered in Red Hat CloudForms. The user must have a valid email address entered under accounts.
    3. To add a user not registered in Red Hat CloudForms, enter the user's email address in the Add (enter manually) field and click the add icon.
  6. Optionally, select Send an SNMP Trap to configure options so that an SNMP trap is sent when the alert is triggered:

    1. In the Host fields, enter the IP addresses of the hosts that should receive the trap.
    2. Select the version of SNMP to use from the Version list:

      1. If you select v1, enter a trap number in the Trap Number field. Enter 1, 2, or 3, based on the appropriate suffix number in Table 6.1, “SNMP Trap Identifiers”.
      2. If you select v2, enter a trap object ID in the Trap Object ID field. Enter info, warning, or critical based on the values in Table 6.1, “SNMP Trap Identifiers”.

        Table 6.1. SNMP Trap Identifiers

        Object ID               Suffix Number Added to PEN   PEN with the Suffix Added
        info                    1                            1.3.6.1.4.1.33482.1
        warn, warning           2                            1.3.6.1.4.1.33482.2
        crit, critical, error   3                            1.3.6.1.4.1.33482.3

  7. Optionally, select Show on Timeline to show the alert as an event on the Red Hat CloudForms timeline. The alert shows as part of the Alarm/Status Change/Errors category.
  8. Optionally, select Send a Management Event to trigger an automation event:

    1. In the Event Name field, enter the name of an event that exists in the Process/Event class.
  9. Click Add.
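
The mapping in Table 6.1 can be sketched as a small lookup. This is only an illustration of how an object ID name relates to the v1 trap number and the full identifier; the function and constant names are hypothetical and are not part of Red Hat CloudForms.

```python
# Prefix taken from the "PEN with the Suffix Added" column of Table 6.1.
PEN_PREFIX = "1.3.6.1.4.1.33482"

# Object ID names from Table 6.1 mapped to their suffix numbers.
TRAP_SUFFIXES = {
    "info": 1,
    "warn": 2,
    "warning": 2,
    "crit": 3,
    "critical": 3,
    "error": 3,
}


def trap_number(object_id: str) -> int:
    """Return the suffix number (the SNMP v1 Trap Number) for an object ID name."""
    key = object_id.strip().lower()
    if key not in TRAP_SUFFIXES:
        raise ValueError(f"unknown trap object ID: {object_id!r}")
    return TRAP_SUFFIXES[key]


def trap_oid(object_id: str) -> str:
    """Return the PEN with the suffix added for an object ID name."""
    return f"{PEN_PREFIX}.{trap_number(object_id)}"


if __name__ == "__main__":
    for name in ("info", "warning", "critical"):
        print(name, trap_number(name), trap_oid(name))
```

For example, `trap_number("warning")` gives the value to enter in the Trap Number field for v1, and `trap_oid("warning")` gives the corresponding identifier from the last column of the table.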

6.3. Creating a Hardware Reconfigured Alert

Use a Hardware Reconfigured alert to detect changes to the amount of memory or the number of CPUs on a virtual machine.

  1. Navigate to Control > Explorer.
  2. Click the Alerts accordion, then click Configuration, then Add a new Alert.
  3. In the Info area:

    • Type in a description for the alert.
    • From Based On, select VM and Instance.
    • From What to Evaluate, select Hardware Reconfigured.
    • In Notification Frequency, select how often you want to be notified if hardware reconfiguration is detected.
  4. From Hardware Attribute, select Number of CPUs. From the next dropdown, select Decreased.

  5. After setting the parameters, select what you want the alert to do. You can send an email, create an SNMP Trap, let the alert show on the timeline, or send a management event to start an automation process.
  6. Click Add.

6.4. Creating a Normal Operating Range Alert

Normal operating range alerts enable you to be notified when a performance value exceeds or falls below its normal operating range for a period of time from 1 minute to 2 hours. Capacity and Utilization must be enabled for normal operating ranges to be calculated. See General Configuration for more information.

  1. Navigate to Control > Explorer.
  2. Click the Alerts accordion, then click Configuration, then Add a new Alert.
  3. In the Info area:

    • Type in a Description for the alert.
    • From Based On, select VM and Instance.
    • For What to Evaluate, select Normal Operating Range.
    • In Notification Frequency, select how often you want to be notified if the performance threshold is reached.
  4. Set the threshold in the Normal Operating Range Parameters area.

    • From Performance Field, select the field to check and whether you want to be notified when it exceeds or falls below the normal operating range.
    • In Field Meets Criteria for, select how long the threshold must be met before the alert is triggered.
  5. After setting the parameters, select what you want the alert to do. You can send an email, create an SNMP Trap, let the alert show on the timeline, or send a management event to start an automation process. See Section 6.2, “Creating an Alert”.
  6. Click Add.

6.5. Creating a Real Time Performance Alert

Real Time Performance alerts enable you to be notified immediately when a performance threshold has been met for a virtual machine, host, or cluster. Capacity and Utilization must be enabled for performance thresholds to be detected. See General Configuration for more information.

  1. Navigate to Control > Explorer.
  2. Click the Alerts accordion, then click Configuration, then Add a new Alert.
  3. In the Info area:

    • Type in a Description for the alert.
    • From Based On, select VM and Instance.
    • For What to Evaluate, select Real Time Performance.
    • In Notification Frequency, select how often you want to be notified if the performance threshold is reached.
  4. Set the threshold in the Real Time Performance Parameters area.

    • From Performance Field, select the field to check and any other parameters required for that field.
    • In And is Trending, select Don’t Care if it does not matter how the performance metric is trending. Otherwise, choose from the possible trending options.
    • In Field Meets Criteria for, select how long the threshold must be met before the alert is triggered.
    • Set Debug Tracing to true only when directed to do so by Red Hat Support. This provides an extremely detailed level of logging and can result in many more log lines being written.
  5. After setting the parameters, select what you want the alert to do. You can send an email, create an SNMP Trap, let the alert show on the timeline, or send a management event to start an automation process.
  6. Click Add.
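
The idea behind Field Meets Criteria for can be sketched as a check that a metric stays past a threshold for a whole time window. This is a minimal illustration and not how Red Hat CloudForms evaluates the alert: the sample format, the function name, and the choice of a "greater than" comparison are all assumptions, and trending is not modeled.

```python
from datetime import datetime, timedelta
from typing import Sequence, Tuple

# A sample is a (timestamp, value) pair; this shape is an assumption.
Sample = Tuple[datetime, float]


def criteria_met_for(
    samples: Sequence[Sample],
    threshold: float,
    duration: timedelta,
    now: datetime,
) -> bool:
    """Return True if every sample in the last `duration` is above `threshold`.

    The window must also be covered: at least one sample has to be as old
    as the start of the window, otherwise there is not enough history.
    """
    window_start = now - duration
    in_window = [value for ts, value in samples if window_start <= ts <= now]
    covered = any(ts <= window_start for ts, _ in samples)
    return covered and bool(in_window) and all(v > threshold for v in in_window)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    samples = [
        (now - timedelta(minutes=15), 50.0),
        (now - timedelta(minutes=10), 92.0),
        (now - timedelta(minutes=5), 95.0),
        (now, 97.0),
    ]
    # Above 90 for the last 10 minutes, but not for the last 15.
    print(criteria_met_for(samples, 90.0, timedelta(minutes=10), now))
    print(criteria_met_for(samples, 90.0, timedelta(minutes=15), now))
```

A longer window makes the alert less sensitive to short spikes, at the cost of a later notification.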

6.6. Creating an Hourly Performance Alert

Hourly performance alerts enable you to be notified immediately when an hourly performance threshold has been met for a cluster. Capacity and Utilization must be enabled for performance thresholds to be detected. See General Configuration for instructions.

  1. Navigate to Control > Explorer.
  2. Click the Alerts accordion.
  3. Click Configuration, then Add a new Alert.
  4. In the Info area:

    • Type in a Description for the alert.
    • From Based On, select Cluster.
    • For What to Evaluate, select Hourly Performance.
    • In Notification Frequency, select how often you want to be notified if the threshold is met.
  5. In the Hourly Performance Parameters area, select the performance field and the criteria. You can also select an option from the And is Trending list and set Debug Tracing to true or false.
  6. After setting the parameters, select what you want the alert to do. You can send an email, create an SNMP Trap, let the alert show on the timeline, or send a management event to start an automation process.
  7. Click Add.

6.7. Creating a hostd Log Threshold Alert

Use the hostd Log Threshold to send a notification when certain items are found in the event logs for a host. A default analysis profile with event log items is required for this feature. The following example shows steps to check the host’s log for a failure to validate a virtual machine’s IP address.

  1. Navigate to Control > Explorer.
  2. Click the Alerts accordion.
  3. Click Configuration, then Add a new Alert.
  4. In the Info area:

    • Type in a Description for the alert.
    • From Based On, select Host/Node.
    • For What to Evaluate, select Hostd Log Threshold.
    • In Notification Frequency, select how often you want to be notified if the log item is detected.
  5. In the Hostd Log Threshold Parameters area, select the parameters for the event log message. You can set a threshold for a filter, level, or message source.

    • Use Message Filter to look for specific text in a message.
    • Use Message Level to filter based on message level. Red Hat CloudForms reports on the specified level and above.
    • Use Message Source to filter log messages based on their source.
    • In How Far Back to Check, set the number of days to look back for this message.
    • If you only want an alert triggered when the log message has occurred a certain number of times, type the number in Event Count Threshold.
  6. After setting the parameters, select what you want the alert to do. You can send an email, create an SNMP Trap, let the alert show on the timeline, or send a management event to start an automation process.
  7. Click Add.
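
How the Hostd Log Threshold parameters interact can be sketched as a filter followed by a count. This is only an illustration under stated assumptions: the `LogEntry` shape, the level names and their ordering, substring matching for the message filter, and combining several filters in one check are all hypothetical and may not match the product's behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

# Assumed severity ordering, lowest to highest; the real level names may differ.
LEVELS = ("debug", "info", "warn", "error", "fatal")


@dataclass(frozen=True)
class LogEntry:
    timestamp: datetime
    level: str
    source: str
    message: str


def log_threshold_met(
    entries: Iterable[LogEntry],
    now: datetime,
    *,
    message_filter: Optional[str] = None,
    message_level: Optional[str] = None,
    message_source: Optional[str] = None,
    days_back: int = 1,
    event_count_threshold: int = 1,
) -> bool:
    """Return True if enough recent log entries match the given parameters."""
    cutoff = now - timedelta(days=days_back)  # How Far Back to Check
    min_rank = LEVELS.index(message_level) if message_level else 0
    matches = 0
    for entry in entries:
        if entry.timestamp < cutoff:
            continue
        # Message Filter: plain substring match (an assumption).
        if message_filter is not None and message_filter not in entry.message:
            continue
        # Message Level: the specified level and above.
        if LEVELS.index(entry.level) < min_rank:
            continue
        # Message Source: exact match on the source name (an assumption).
        if message_source is not None and entry.source != message_source:
            continue
        matches += 1
    return matches >= event_count_threshold  # Event Count Threshold
```

With `event_count_threshold=1` a single matching entry is enough; raising it suppresses the alert until the message has occurred that many times within the look-back period.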

6.8. Creating a VMware Alarm Alert

Red Hat CloudForms can use VMware alarms as a trigger for an alert. This type of alert can be created for a cluster, host, or virtual machine.

  1. Navigate to Control > Explorer.
  2. Click the Alerts accordion, then click Configuration, then Add a new Alert.
  3. In the Info area:

    • Type in a description for the alert.
    • From Based On, select Cluster, Host, or VM.
    • For What to Evaluate, select VMware Alarm.
    • In Notification Frequency, select how often you want to be notified if the alarm is detected.
  4. In the VMware Alarm Parameters area, select the provider and alarm.

  5. After setting the parameters, select what you want the alert to do. You can send an email, create an SNMP Trap, let the alert show on the timeline, or send a management event to start an automation process.
  6. Click Add.

6.9. Creating an Expression Alert

Expression alerts enable you to create a notification based on any possible criteria for clusters, datastores, hosts, and virtual machines. The following procedure creates an alert for when a host’s datastore has less than 5% free space.

  1. Navigate to Control > Explorer.
  2. Click on the Alerts accordion, then click Configuration, then Add a new Alert.
  3. In the Info area:

    • Type in a description for the alert.
    • From Based On, select Host/Node.
    • For What to Evaluate, select Expression (Custom).
    • In Notification Frequency, select how often you want to be notified if the expression evaluates to true.
  4. Use the expression editor to create your expression. This is the same expression editor used to create Conditions. For details on how to use the expression editor, see the Policies and Profiles Guide.

  5. Click Commit expression element changes to accept the expression.
  6. After setting the parameters, select what you want the alert to do. You can send an email, create an SNMP Trap, let the alert show on the timeline, or send a management event to start an automation process.
  7. Click Add.
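
The condition in this example, a host with a datastore below 5% free space, can be written out as a simple check. This is a sketch of the logic only, not of the expression editor's syntax; the `Datastore` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Datastore:
    name: str
    total_bytes: int
    free_bytes: int


def free_percent(datastore: Datastore) -> float:
    """Return the free space of a datastore as a percentage of its capacity."""
    if datastore.total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * datastore.free_bytes / datastore.total_bytes


def host_has_low_datastore(
    datastores: Iterable[Datastore], min_free_percent: float = 5.0
) -> bool:
    """Return True if any of the host's datastores has less free space than the limit."""
    return any(free_percent(ds) < min_free_percent for ds in datastores)


if __name__ == "__main__":
    stores = [
        Datastore("ds-a", total_bytes=1000, free_bytes=400),
        Datastore("ds-b", total_bytes=1000, free_bytes=40),
    ]
    print(host_has_low_datastore(stores))
```

The comparison is strict, so a datastore with exactly 5% free space does not trigger the condition, matching the "less than 5%" wording of the example.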

6.10. Creating an Operational Alert

  1. Navigate to Control > Explorer.
  2. Click on the Alerts accordion, then click Configuration, then Add a new Alert.
  3. In the Info area:

    • Type in a description for the alert.
    • Select Active when the alert is ready to be enabled.
    • From Based On, select Server.
    • Select the appropriate driving event.
    • In Notification Frequency, select how often you want to be notified if the event log threshold is reached.
  4. After setting the parameters, select what you want the alert to do. You can send an email, create an SNMP Trap, let the alert show on the timeline, or send a management event to start an automation process.
  5. Click Add.

6.11. Operational Alert Types

Table 6.2. Operational Alerts

Each entry lists the driving event, its explanation (thresholds, description), and the proposed action and next steps.

Driving Event: EVM Server Start
Explanation: Alert is raised when a server starts.
Proposed Action and Next Steps: Send e-mail. This is a notification.

Driving Event: EVM Server Stop
Explanation: Alert is raised when a server stops.
Proposed Action and Next Steps: Send e-mail. Review logs to see why the server stopped.

Driving Event: EVM Server Not Responding
Explanation: Alert is raised when one server detects that another server has not responded in 2 minutes.
Proposed Action and Next Steps: This is a sign of a problem that should be investigated. Check logs.

Driving Event: EVM Server Exceeded Memory Limit
Explanation: Alert is raised when a server has exceeded its system memory limit and begins killing workers. Default is 80%. Threshold configured in Advanced Settings:

    server:
      :worker_monitor:
        :kill_algorithm:
          :name: :used_swap_percent_gt_value
          :value: 80

Proposed Action and Next Steps: This may be caused by the following issues:

    • The server is running with too few resources.
    • The server is enabled with too many roles or number of workers.
    • The server picked up all the roles because another server has failed.
    • A runaway process has taken up most of the memory.

Driving Event: EVM Server is Master
Explanation: Alert is raised when one server takes over as a master server.
Proposed Action and Next Steps: Typically, this should only occur when first starting a set of servers, perhaps following expected outages. If a server picks up as master in other situations, the old master had an issue that needs to be researched (such as server not responding in time).

Driving Event: EVM Server High System Disk Usage
Explanation: The server’s system disk is 80% full. This check is run as part of a system schedule. Threshold configured in Advanced Settings:

    server:
      events:
        :disk_usage_gt_percent: 80

Proposed Action and Next Steps: Temp files used by the operating system may be filling the disk. Yum updates, normal /tmp files, or temp files in /var/lib/data/miqtemp/ may cause the problem.

Driving Event: EVM Server High App Disk Usage
Explanation: The server’s app disk is 80% full. This check is run as part of a system schedule. Threshold configured in Advanced Settings:

    server:
      events:
        :disk_usage_gt_percent: 80

Proposed Action and Next Steps: Server temp files may remain.

Driving Event: EVM Server High Log Disk Usage
Explanation: The server’s log disk is 80% full. This check is run as part of a system schedule. Threshold configured in Advanced Settings:

    server:
      events:
        :disk_usage_gt_percent: 80

Proposed Action and Next Steps: Logs are getting too big or are not being rotated properly every day. Review the most recent logs.

Driving Event: EVM Server High DB Disk Usage
Explanation: The server’s db disk is 80% full. This check is run as part of a system schedule. Applies if using PostgreSQL as the VMDB. Threshold configured in Advanced Settings:

    server:
      events:
        :disk_usage_gt_percent: 80

Proposed Action and Next Steps: Database or database logging is getting too large. May require full vacuuming of PostgreSQL database.

Driving Event: EVM Worker Started
Explanation: Alert is raised when a worker is about to start.
Proposed Action and Next Steps: This is a notification. Failover may trigger this alert.

Driving Event: EVM Worker Stopped
Explanation: Alert is raised when a worker is requested to stop.
Proposed Action and Next Steps: Review logs for the reason if the worker was not purposefully stopped.

Driving Event: EVM Worker Killed
Explanation: Alert is raised when a non-responsive worker does not restart on its own and is killed.
Proposed Action and Next Steps: Review logs for the reason the worker was killed. May be the result of EVM Worker Not Responding.

Driving Event: EVM Worker Not Responding
Explanation: Alert is raised when a worker has not responded for 2 minutes (:heartbeat_timeout) or has not started within 10 minutes (:starting_timeout).
Proposed Action and Next Steps: An influx of events from the Virtual Center or Red Hat Virtualization may leave EVM/CFME unable to handle the rate at which they are being queued. Use the Event Handler Configuration to filter events that are causing problematic queue table growth.

Driving Event: EVM Worker Exceeded Memory Limit
Explanation: Alert is raised when a worker exceeds the memory threshold. The default is 150 MB, but some workers have their own value in the :memory_threshold section for that specific worker.
Proposed Action and Next Steps: Review logs for the reason the worker is exceeding the memory limit. This may be the result of an overload to the worker process that requires further investigation.

Driving Event: EVM Worker Exceeded Uptime Limit
Explanation: Alert is raised when a worker has been running longer than the :restart_interval. (Most workers are set to never restart using the 0.hours setting.) The EMS Refresh SmartProxy workers are set to restart every 2 hours.
Proposed Action and Next Steps: Review logs for the reason the worker is exceeding the limit. This may be the result of an overload to the worker process that needs further investigation.

Driving Event: EVM Worker Exit File
Explanation: Alert is raised when the scheduler worker exits due to a pending large NTP time change.
Proposed Action and Next Steps: This is a notification.
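
The disk usage checks in Table 6.2 all compare a percentage against the :disk_usage_gt_percent value. A minimal sketch of such a check follows. It is not the product's implementation: the function names are hypothetical, and reading "gt" in the setting name as a strict greater-than comparison is an assumption.

```python
import shutil

# Default taken from the :disk_usage_gt_percent setting shown in Table 6.2.
DISK_USAGE_GT_PERCENT = 80


def percent_used(used_bytes: int, total_bytes: int) -> float:
    """Return used space as a percentage of total space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def high_disk_usage(
    used_bytes: int, total_bytes: int, threshold: float = DISK_USAGE_GT_PERCENT
) -> bool:
    """Return True if usage is above the threshold (strict comparison, assumed)."""
    return percent_used(used_bytes, total_bytes) > threshold


def check_path(path: str, threshold: float = DISK_USAGE_GT_PERCENT) -> bool:
    """Check the file system that contains `path` using shutil.disk_usage."""
    usage = shutil.disk_usage(path)
    return high_disk_usage(usage.used, usage.total, threshold)


if __name__ == "__main__":
    # Paths are examples only; the actual mount points depend on the appliance.
    for path in ("/", "/tmp"):
        print(path, check_path(path))
```

Running a check like this against the directories named in the table, such as /tmp, can help confirm which file system is filling up before reviewing the logs.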

6.12. Editing an Alert

After creating an alert, you can edit the threshold, expression, or the notification type.

  1. Navigate to Control > Explorer.
  2. Click on the Alerts accordion, then click on the alert that you need to edit.
  3. Click Configuration, then Edit this Alert.
  4. Make the required changes.
  5. Click Save.

6.13. Copying an Alert

You can copy an existing alert to create a new alert that is similar to the existing one, then change the values associated with it.

  1. Navigate to Control > Explorer.
  2. Click on the Alerts accordion, then click on the alert that you want to copy.
  3. Click Configuration, then Copy this Alert. Click OK to confirm.
  4. Make the required changes.
  5. Click Add.

6.14. Deleting an Alert

When an alert is no longer needed, you can remove it from your VMDB.

  1. Navigate to Control > Explorer.
  2. Click on the Alerts accordion, then click on the alert that you want to delete.
  3. Click Configuration, then Delete this Alert.
  4. Click OK to confirm.

6.15. Evaluating an Alert

  1. Navigate to Control > Explorer.
  2. Click the Actions accordion, then click Configuration, then Add a new Action.
  3. Type in a Description for the action.

  4. Select Evaluate Alerts from Action Type.
  5. Select the alerts to be evaluated and click Move selected Alerts into this Action. Use the Ctrl key to select multiple alerts.

  6. Click Add.