Chapter 7. Kubernetes Event Handling

If a CloudForms installation is to manage an external management system (EMS) such as an OpenShift Container Platform cluster effectively, it must be able to detect and process events from the EMS efficiently. This chapter describes how events from Kubernetes are handled.

7.1. Event Processing Workflow

The event processing workflow involves three different worker types, as follows:

  1. The cluster-specific ManageIQ::Providers::Openshift::ContainerManager::EventCatcher worker polls Kubernetes for new events using an API call (see Section 7.3, “Event Catcher Configuration” for the frequency of this polling). For each new event caught, a message is queued for the event handler.
  2. The generic MiqEventHandler worker dequeues the message and creates an EmsEvent EventStream object. Any Kubernetes-specific references are translated into the equivalent CloudForms object IDs, and a new high-priority message is queued for automate.
  3. A Priority worker dequeues the message and processes it through the automate event switchboard using the EventStream object created by the MiqEventHandler. Processing the event may involve several event handler automate instances that perform actions such as:

The event workflow is illustrated in Figure 7.1, “Event Processing Workflow”.
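The hand-off between the three worker types can be sketched as a small simulation. This is an illustration only and not CloudForms code: the RawEvent and EventStream structs, the POD_IDS lookup table and the plain arrays standing in for the message queues are all assumptions made for the example.

```ruby
# Illustrative simulation only; none of these names are CloudForms classes.
RawEvent    = Struct.new(:type, :pod_name, :message)
EventStream = Struct.new(:event_type, :container_group_id, :message)

# Hypothetical lookup table standing in for the CloudForms inventory.
POD_IDS = { "registry-console-1-wcwx4" => 42 }.freeze

# Stage 1: the event catcher queues one message per newly polled event.
def catch_events(polled_events, handler_queue)
  polled_events.each { |event| handler_queue << event }
end

# Stage 2: the event handler dequeues each message, builds an EventStream
# (translating the pod name into an object ID) and queues it for automate.
def handle_events(handler_queue, automate_queue)
  while (raw = handler_queue.shift)
    automate_queue << EventStream.new(raw.type, POD_IDS[raw.pod_name], raw.message)
  end
end

# Stage 3: a priority worker dequeues each EventStream for processing.
def process_events(automate_queue)
  processed = []
  while (stream = automate_queue.shift)
    processed << "#{stream.event_type} (pod id #{stream.container_group_id.inspect})"
  end
  processed
end

handler_queue  = []
automate_queue = []
polled = [RawEvent.new("POD_UNHEALTHY", "registry-console-1-wcwx4", "Readiness probe failed")]

catch_events(polled, handler_queue)
handle_events(handler_queue, automate_queue)
puts process_events(automate_queue)
```

Each stage only reads from the queue written by the previous one, which is why the catcher, handler and automate workers can run as separate processes.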

Figure 7.1. Event Processing Workflow


7.2. Event Types

CloudForms 4.6 detects and processes the following event types from Kubernetes.

7.3. Event Catcher Configuration

The :event_catcher section is one of the largest of the Configuration → Advanced settings, and it defines the configuration of each type of event catcher. The following extract shows the settings for the ManageIQ::Providers::Openshift::ContainerManager::EventCatcher worker, and the default settings for all EventCatcher workers:

    :event_catcher:
    ...
      :event_catcher_openshift:
        :poll: 1.seconds
    ...
      :defaults:
        :flooding_events_per_minute: 30
        :flooding_monitor_enabled: false
        :ems_event_page_size: 100
        :ems_event_thread_shutdown_timeout: 10.seconds
        :memory_threshold: 2.gigabytes
        :nice_delta: 1
        :poll: 1.seconds

The configuration settings rarely need to be changed from their defaults.
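As a rough illustration of how the extract can be read, the following sketch merges the worker-specific values over the defaults and converts a duration string to seconds. It assumes that worker-specific keys take precedence over :defaults. The hash literal, the effective_settings and duration_in_seconds helpers, and the handling of duration strings are all hypothetical and are not the appliance's own settings code.

```ruby
# Values copied from the extract above, with durations kept as strings.
EVENT_CATCHER = {
  event_catcher_openshift: { poll: "1.seconds" },
  defaults: {
    flooding_events_per_minute:        30,
    flooding_monitor_enabled:          false,
    ems_event_page_size:               100,
    ems_event_thread_shutdown_timeout: "10.seconds",
    memory_threshold:                  "2.gigabytes",
    nice_delta:                        1,
    poll:                              "1.seconds"
  }
}.freeze

UNIT_SECONDS = { "seconds" => 1, "minutes" => 60, "hours" => 3600 }.freeze

# Assumption: worker-specific keys override the values in :defaults.
def effective_settings(settings, worker_key)
  settings.fetch(:defaults, {}).merge(settings.fetch(worker_key, {}))
end

# Convert a string such as "10.seconds" or "2.minutes" to a number of seconds.
def duration_in_seconds(value)
  match = /\A(\d+)\.(second|minute|hour)s?\z/.match(value)
  raise ArgumentError, "unsupported duration: #{value.inspect}" unless match

  match[1].to_i * UNIT_SECONDS.fetch("#{match[2]}s")
end

openshift = effective_settings(EVENT_CATCHER, :event_catcher_openshift)
puts "poll every #{duration_in_seconds(openshift[:poll])}s, " \
     "page size #{openshift[:ems_event_page_size]}"
```

With the default values shown, the OpenShift event catcher polls every second and inherits every other setting from :defaults.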

7.4. Extending Event Handling Using Automate

The automate instances that handle the processing of Kubernetes events through the event switchboard are located under the /System/Event/EmsEvent/Kubernetes class in the ManageIQ domain of the automate datastore. A typical instance calls two event handler methods via relationships: the first triggers a provider refresh, and the second raises a CloudForms Policy Event (see Figure 7.2, “Kubernetes Event Instance”).

Figure 7.2. Kubernetes Event Instance


This default processing can easily be extended by calling one or more further relationships to perform any custom event processing that might be required, for example sending a warning email, or forwarding the event details via REST API to a separate monitoring tool. Figure 7.3, “Custom Event Handling Automate Domain” shows the creation of a new Integration automate domain, into which three of the event-handling instances have been copied.

Figure 7.3. Custom Event Handling Automate Domain


These three event instances have each been modified to call the /OpenShift/Methods/kube_alert instance from their rel6 relationship. The kube_alert instance in this example defines two attributes: to_email_address and from_email_address, and then calls the kube_alert method.

7.4.1. Automate Method

The following kube_alert automate method is a simple example illustrating how the event details could be emailed to a named recipient (defined in the instance schema, and retrieved using $evm.object['to_email_address']). The event-specific parameters such as pod, container or node are contained in the $evm.root['event_stream'] object.

# The EmsEvent object created by the MiqEventHandler
event   = $evm.root['event_stream']
cluster = $evm.vmdb(:ems, event.ems_id).name

# Node-related events carry no project, pod or container, so default to "N/A"
project    = event.container_namespace  || "N/A"
pod        = event.container_group_name || "N/A"
container  = event.container_name       || "N/A"
event_type = event.event_type
message    = event.message

# Sender and recipient are defined as attributes on the kube_alert instance
to      = $evm.object['to_email_address']
from    = $evm.object['from_email_address']
subject = "#{event_type} event received from cluster #{cluster}"

# Build the HTML message body one line at a time
body = "A #{event_type} event was received from cluster #{cluster}<br><br>"
body += "Project: #{project}<br>"
body += "Pod: #{pod}<br>"
body += "Container: #{container}<br>"
body += "Message: #{message}"

$evm.execute('send_email', to, from, subject, body)
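
The message-building logic can also be tried outside an appliance. The following is a minimal sketch in which a FakeEvent struct stands in for $evm.root['event_stream']. FakeEvent and the alert_body helper are hypothetical names introduced for the example, and only the fields read by the method above are modelled.

```ruby
# Stand-in for $evm.root['event_stream']; not a CloudForms class.
FakeEvent = Struct.new(:event_type, :message, :container_namespace,
                       :container_group_name, :container_name,
                       keyword_init: true)

# Build the same HTML body as the kube_alert method, defaulting any
# missing project, pod or container value to "N/A".
def alert_body(event, cluster)
  project   = event.container_namespace  || "N/A"
  pod       = event.container_group_name || "N/A"
  container = event.container_name       || "N/A"

  [
    "A #{event.event_type} event was received from cluster #{cluster}",
    "", # the empty entry yields the blank line after the first sentence
    "Project: #{project}",
    "Pod: #{pod}",
    "Container: #{container}",
    "Message: #{event.message}"
  ].join("<br>")
end

node_event = FakeEvent.new(event_type: "NODE_NODENOTSCHEDULABLE",
                           message:    "Node node2.cloud.example.com status is now: NodeNotSchedulable")
puts alert_body(node_event, "OpenShift Prod")
```

Because the node event is created without project, pod or container values, those three fields fall back to "N/A" in the generated body.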

7.4.1.1. Example Emails

For a node-related event the event_stream object does not contain any project, pod or container entries. An example email from the method would have the following message body:

A NODE_NODENOTSCHEDULABLE event was received from cluster OpenShift Prod

Project: N/A
Pod: N/A
Container: N/A
Message: Node node2.cloud.example.com status is now: NodeNotSchedulable

For a pod-related event the event_stream object does contain the related project, pod and container values. An example email from the method would have the following message body:

A POD_UNHEALTHY event was received from cluster OpenShift Prod

Project: default
Pod: registry-console-1-wcwx4
Container: registry-console
Message: Readiness probe failed: Get http://10.1.2.3:9090/ping: dial tcp 10.1.2.3:9090: getsockopt: connection refused

These examples show how CloudForms can behave as a Kubernetes event broker, and forward event-specific details to a further monitoring, alerting or ticketing system via email, SNMP trap or API call.
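
As a sketch of the API call option, the same event details could be packaged as JSON and posted to a monitoring tool. The endpoint URL, the payload field names, the StubEvent struct and both helper methods below are assumptions made for illustration. The request is built but deliberately not sent, so the example runs without a network connection.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical endpoint; substitute the monitoring tool's real URL.
ALERT_URL = "https://monitoring.example.com/api/events".freeze

# Stand-in for $evm.root['event_stream'] so the sketch runs outside CloudForms.
StubEvent = Struct.new(:event_type, :container_namespace,
                       :container_group_name, :container_name, :message)

# Collect the event details into a hash ready for JSON encoding.
def event_payload(event, cluster)
  {
    cluster:    cluster,
    event_type: event.event_type,
    project:    event.container_namespace,
    pod:        event.container_group_name,
    container:  event.container_name,
    message:    event.message
  }
end

# Build (but do not send) a JSON POST request for the payload.
def build_alert_request(url, payload)
  uri     = URI.parse(url)
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(payload)
  [uri, request]
end

event = StubEvent.new("POD_UNHEALTHY", "default", "registry-console-1-wcwx4",
                      "registry-console", "Readiness probe failed")
uri, request = build_alert_request(ALERT_URL, event_payload(event, "OpenShift Prod"))
puts request.body

# To send the request, something along these lines could be used:
#   Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
#     http.request(request)
#   end
```

In an automate method the StubEvent would be replaced by $evm.root['event_stream'], and the endpoint could be defined as an instance attribute in the same way as the email addresses in the kube_alert example.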

7.5. Scaling Out

The event processing workflow can be quite resource-intensive when handling events from several possibly large OpenShift Container Platform clusters. CloudForms installations managing several thousand objects may benefit from dedicated CFME appliances exclusively running the ManageIQ::Providers::Openshift::ContainerManager::EventCatcher workers and MiqEventHandler worker in any zone containing an OpenShift Container Platform provider.