Chapter 1. Introduction to Service Assurance Framework

This chapter introduces Service Assurance Framework (SAF) and describes its architecture.

1.1. Overview of Service Assurance Framework

This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.

Service Assurance Framework (SAF) is an application that runs on Red Hat OpenShift Container Platform (OCP). Use SAF to collect metrics and record events from the nodes that you want to monitor. The metrics and event information travel over a message bus to the server side for storage. Use this centralized information as the source for alerts and visualization, or as the source of truth for orchestration frameworks.

1.2. SAF architecture

SAF uses the following components:

  • collectd to collect metrics
  • Prometheus as time-series data storage
  • Elasticsearch as events data storage
  • An AMQP 1.x compatible messaging bus to shuttle the metrics to SAF for storage in Prometheus
  • Smart Gateway to take the metrics from the bus and expose them as a scrape endpoint for Prometheus

The following diagram is an overview of SAF architecture:

[Figure: SAF architecture overview]

On the client side, collectd collects high-resolution metrics and delivers them to Prometheus by using the AMQP1 plugin, which places the data on the message bus. On the server side, a Go application called the Smart Gateway reads the data stream from the bus and exposes it as a local scrape endpoint for Prometheus.
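
The data flow described above can be illustrated with a minimal Python sketch. This is not the Smart Gateway implementation: an in-memory queue stands in for the AMQP bus, and the names Sample, drain_bus, and render_exposition, as well as the metric name and host label, are hypothetical. The sketch only shows the general idea of draining samples from a bus and rendering them in the Prometheus text exposition format that a scrape endpoint returns.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One metric sample as it might arrive from the bus (hypothetical shape)."""
    name: str
    value: float
    labels: dict = field(default_factory=dict)


def escape_label_value(value):
    """Escape backslash, double quote, and newline in a label value."""
    return (str(value).replace("\\", "\\\\")
                      .replace('"', '\\"')
                      .replace("\n", "\\n"))


def drain_bus(bus):
    """Collect every sample currently waiting on the stand-in bus."""
    samples = []
    while True:
        try:
            samples.append(bus.get_nowait())
        except queue.Empty:
            return samples


def render_exposition(samples):
    """Render samples as Prometheus text exposition lines."""
    lines = []
    for s in samples:
        if s.labels:
            label_text = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(s.labels.items())
            )
            lines.append(f"{s.name}{{{label_text}}} {s.value}")
        else:
            lines.append(f"{s.name} {s.value}")
    return "\n".join(lines) + "\n"


# Stand-in for the message bus: the client side would publish here.
bus = queue.Queue()
bus.put(Sample("collectd_cpu_percent", 12.5, {"host": "compute-0"}))
bus.put(Sample("collectd_uptime", 3600.0))

# Stand-in for the server side: drain the bus and build the scrape payload.
print(render_exposition(drain_bus(bus)), end="")
```

In a real deployment the payload would be served over HTTP and scraped by Prometheus at its configured interval; the sketch stops at building the text so that it stays self-contained.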

Server-side SAF monitoring infrastructure consists of the following layers:

  • Service Assurance Framework 1.x (SAF)
  • Red Hat OpenShift Container Platform (OCP)
  • Infrastructure platform

[Figure: SAF deployment prerequisites]

1.3. Installation size

The size of your installation depends on the following factors:

  • The number of nodes being monitored.
  • The number of metrics being collected.
  • The resolution of metrics.
  • The length of time that you want to store the data.
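
To see how these factors combine, the following Python sketch gives a rough estimate of metric storage. It is a back-of-the-envelope calculation, not a sizing formula from this guide: the function name estimate_storage_bytes and the example inputs are hypothetical, and the default of 2 bytes per sample is an assumption that you should replace with a figure measured in your own environment.

```python
SECONDS_PER_DAY = 86400


def estimate_storage_bytes(nodes, metrics_per_node, interval_seconds,
                           retention_days, bytes_per_sample=2.0):
    """Rough storage estimate for time-series metrics.

    bytes_per_sample is an assumed average; measure your own value.
    """
    # Samples stored for a single metric over the retention period.
    samples_per_series = retention_days * SECONDS_PER_DAY / interval_seconds
    # Every monitored node contributes metrics_per_node series.
    total_samples = nodes * metrics_per_node * samples_per_series
    return total_samples * bytes_per_sample


# Hypothetical example: 100 nodes, 500 metrics each, one sample every
# 10 seconds, kept for 14 days.
estimate = estimate_storage_bytes(100, 500, 10, 14)
print(f"{estimate / 1e9:.1f} GB")
```

With these example inputs the estimate is about 12.1 GB. Halving the collection interval or doubling the retention period doubles the result, which is why resolution and storage time appear in the list alongside node and metric counts.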

The sizing of the virtual machines for Red Hat OpenShift Container Platform has the largest impact on the hardware requirements, including the number of virtual machines. For more information about the recommended sizing for the OpenShift nodes, see Production Level Hardware Requirements in the OpenShift Container Platform 3.11 documentation.