Chapter 3. Installing the core components of Service Telemetry Framework
You can use Operators to load the Service Telemetry Framework (STF) components and objects. Operators manage each of the following STF core and community components:
- AMQ Interconnect
- Smart Gateway
- Prometheus and AlertManager
- ElasticSearch
- Grafana
Prerequisites
- An Red Hat OpenShift Container Platform version inclusive of 4.7 through 4.8 is running.
- You have prepared your Red Hat OpenShift Container Platform environment and ensured that there is persistent storage and enough resources to run the STF components on top of the Red Hat OpenShift Container Platform environment. For more information, see Service Telemetry Framework Performance and Scaling.
STF is compatible with Red Hat OpenShift Container Platform version 4.7 through 4.8.
Additional resources
- For more information about Operators, see the Understanding Operators guide.
3.1. Deploying Service Telemetry Framework to the Red Hat OpenShift Container Platform environment
Deploy Service Telemetry Framework (STF) to collect, store, and monitor events:
Procedure
Create a namespace to contain the STF components, for example,
service-telemetry
:$ oc new-project service-telemetry
Create an OperatorGroup in the namespace so that you can schedule the Operator pods:
$ oc create -f - <<EOF apiVersion: operators.coreos.com/v1 kind: OperatorGroup metadata: name: service-telemetry-operator-group namespace: service-telemetry spec: targetNamespaces: - service-telemetry EOF
For more information, see OperatorGroups.
Enable the OperatorHub.io Community Catalog Source to install data storage and visualization Operators:
WarningRed Hat supports the core Operators and workloads, including AMQ Interconnect, AMQ Certificate Manager, Service Telemetry Operator, and Smart Gateway Operator. Red Hat does not support the community Operators or workload components, inclusive of ElasticSearch, Prometheus, Alertmanager, Grafana, and their Operators.
$ oc create -f - <<EOF apiVersion: operators.coreos.com/v1alpha1 kind: CatalogSource metadata: name: operatorhubio-operators namespace: openshift-marketplace spec: sourceType: grpc image: quay.io/operatorhubio/catalog:latest displayName: OperatorHub.io Operators publisher: OperatorHub.io EOF
Subscribe to the AMQ Certificate Manager Operator by using the redhat-operators CatalogSource:
NoteThe AMQ Certificate Manager deploys to the
openshift-operators
namespace and is then available to all namespaces across the cluster. As a result, on clusters with a large number of namespaces, it can take several minutes for the Operator to be available in theservice-telemetry
namespace. The AMQ Certificate Manager Operator is not compatible with the dependency management of Operator Lifecycle Manager when you use it with other namespace-scoped operators.$ oc create -f - <<EOF apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: amq7-cert-manager-operator namespace: openshift-operators spec: channel: 1.x installPlanApproval: Automatic name: amq7-cert-manager-operator source: redhat-operators sourceNamespace: openshift-marketplace EOF
Validate your ClusterServiceVersion. Ensure that amq7-cert-manager.v1.0.1 displays a phase of
Succeeded
:$ oc get --namespace openshift-operators csv NAME DISPLAY VERSION REPLACES PHASE amq7-cert-manager.v1.0.3 Red Hat Integration - AMQ Certificate Manager 1.0.3 amq7-cert-manager.v1.0.2 Succeeded
If you plan to store events in ElasticSearch, you must enable the Elastic Cloud on Kubernetes (ECK) Operator. To enable the ECK Operator, create the following manifest in your Red Hat OpenShift Container Platform environment:
$ oc create -f - <<EOF apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: elasticsearch-eck-operator-certified namespace: service-telemetry spec: channel: stable installPlanApproval: Automatic name: elasticsearch-eck-operator-certified source: certified-operators sourceNamespace: openshift-marketplace EOF
Verify that the ClusterServiceVersion for Elastic Cloud on Kubernetes
Succeeded
:$ oc get csv NAME DISPLAY VERSION REPLACES PHASE ... elasticsearch-eck-operator-certified.1.9.1 Elasticsearch (ECK) Operator 1.9.1 Succeeded ...
Create the Smart Gateway Operator subscription to manage the Smart Gateway instances:
$ oc create -f - <<EOF apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: smart-gateway-operator namespace: service-telemetry spec: channel: stable-1.3 installPlanApproval: Automatic name: smart-gateway-operator source: redhat-operators sourceNamespace: openshift-marketplace EOF
Create the Service Telemetry Operator subscription to manage the STF instances:
$ oc create -f - <<EOF apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: service-telemetry-operator namespace: service-telemetry spec: channel: stable-1.3 installPlanApproval: Automatic name: service-telemetry-operator source: redhat-operators sourceNamespace: openshift-marketplace EOF
Validate the Service Telemetry Operator and the dependent operators:
$ oc get csv --namespace service-telemetry NAME DISPLAY VERSION REPLACES PHASE amq7-cert-manager.v1.0.3 Red Hat Integration - AMQ Certificate Manager 1.0.3 amq7-cert-manager.v1.0.2 Succeeded amq7-interconnect-operator.v1.10.5 Red Hat Integration - AMQ Interconnect 1.10.5 amq7-interconnect-operator.v1.10.4 Succeeded elasticsearch-eck-operator-certified.1.9.1 Elasticsearch (ECK) Operator 1.9.1 Succeeded prometheusoperator.0.47.0 Prometheus Operator 0.47.0 prometheusoperator.0.37.0 Succeeded service-telemetry-operator.v1.3.1635451892 Service Telemetry Operator 1.3.1635451892 Succeeded smart-gateway-operator.v3.0.1635451893 Smart Gateway Operator 3.0.1635451893 Succeeded
3.2. Creating a ServiceTelemetry object in Red Hat OpenShift Container Platform
Create a ServiceTelemetry
object in Red Hat OpenShift Container Platform to result in the Service Telemetry Operator creating the supporting components for a Service Telemetry Framework (STF) deployment. For more information, see Section 3.2.1, “Primary parameters of the ServiceTelemetry object”.
Procedure
To create a
ServiceTelemetry
object that results in an STF deployment that uses the default values, create aServiceTelemetry
object with an emptyspec
parameter:$ oc apply -f - <<EOF apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: name: default namespace: service-telemetry spec: {} EOF
To override a default value, define the parameter that you want to override. In this example, enable ElasticSearch by setting
enabled
totrue
:$ oc apply -f - <<EOF apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: name: default namespace: service-telemetry spec: backends: events: elasticsearch: enabled: true EOF
Creating a
ServiceTelemetry
object with an emptyspec
parameter results in an STF deployment with the following default settings:apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: name: default spec: alerting: alertmanager: storage: persistent: pvcStorageRequest: 20G storageSelector: {} receivers: snmpTraps: enabled: false target: 192.168.24.254 strategy: persistent enabled: true backends: events: elasticsearch: enabled: false storage: persistent: pvcStorageRequest: 20Gi storageSelector: {} strategy: persistent metrics: prometheus: enabled: true scrapeInterval: 10s storage: persistent: pvcStorageRequest: 20G storageSelector: {} retention: 24h strategy: persistent graphing: enabled: false grafana: adminPassword: secret adminUser: root disableSignoutMenu: false ingressEnabled: false baseImage: docker.io/grafana/grafana:8.1.2 highAvailability: enabled: false transports: qdr: enabled: true web: enabled: false clouds: - name: cloud1 metrics: collectors: - collectorType: collectd subscriptionAddress: collectd/telemetry debugEnabled: false - collectorType: ceilometer subscriptionAddress: anycast/ceilometer/metering.sample debugEnabled: false - collectorType: sensubility subscriptionAddress: sensubility/telemetry debugEnabled: false events: collectors: - collectorType: collectd subscriptionAddress: collectd/notify debugEnabled: false - collectorType: ceilometer subscriptionAddress: anycast/ceilometer/event.sample debugEnabled: false
To override these defaults, add the configuration to the
spec
parameter.View the STF deployment logs in the Service Telemetry Operator:
$ oc logs --selector name=service-telemetry-operator ... --------------------------- Ansible Task Status Event StdOut ----------------- PLAY RECAP ********************************************************************* localhost : ok=57 changed=0 unreachable=0 failed=0 skipped=20 rescued=0 ignored=0
Verification
To determine that all workloads are operating correctly, view the pods and the status of each pod.
NoteIf you set the
backends.events.elasticsearch.enabled
parameter totrue
, the notification Smart Gateways reportError
andCrashLoopBackOff
error messages for a period of time before ElasticSearch starts.$ oc get pods NAME READY STATUS RESTARTS AGE alertmanager-default-0 2/2 Running 0 17m default-cloud1-ceil-meter-smartgateway-6484b98b68-vd48z 2/2 Running 0 17m default-cloud1-coll-meter-smartgateway-799f687658-4gxpn 2/2 Running 0 17m default-cloud1-sens-meter-smartgateway-c7f4f7fc8-c57b4 2/2 Running 0 17m default-interconnect-54658f5d4-pzrpt 1/1 Running 0 17m elastic-operator-66b7bc49c4-sxkc2 1/1 Running 0 52m interconnect-operator-69df6b9cb6-7hhp9 1/1 Running 0 50m prometheus-default-0 2/2 Running 1 17m prometheus-operator-6458b74d86-wbdqp 1/1 Running 0 51m service-telemetry-operator-864646787c-hd9pm 1/1 Running 0 51m smart-gateway-operator-79778cf548-mz5z7 1/1 Running 0 51m
3.2.1. Primary parameters of the ServiceTelemetry object
The ServiceTelemetry
object comprises the following primary configuration parameters:
-
alerting
-
backends
-
clouds
-
graphing
-
highAvailability
-
transports
You can configure each of these configuration parameters to provide different features in an STF deployment.
Support for servicetelemetry.infra.watch/v1alpha1
was removed from STF 1.3.
The backends parameter
Use the backends
parameter to control which storage back ends are available for storage of metrics and events, and to control the enablement of Smart Gateways that the clouds
parameter defines. For more information, see the section called “The clouds parameter”.
Currently, you can use Prometheus as the metrics storage back end and ElasticSearch as the events storage back end.
Enabling Prometheus as a storage back end for metrics
To enable Prometheus as a storage back end for metrics, you must configure the ServiceTelemetry
object.
Procedure
Configure the
ServiceTelemetry
object:apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: name: default namespace: service-telemetry spec: backends: metrics: prometheus: enabled: true
Configuring persistent storage for Prometheus
Use the additional parameters that are defined in backends.metrics.prometheus.storage.persistent
to configure persistent storage options for Prometheus, such as storage class and volume size.
Use storageClass
to define the back end storage class. If you do not set this parameter, the Service Telemetry Operator uses the default storage class for the Red Hat OpenShift Container Platform cluster.
Use the pvcStorageRequest
parameter to define the minimum required volume size to satisfy the storage request. If volumes are statically defined, it is possible that a volume size larger than requested is used. By default, Service Telemetry Operator requests a volume size of 20G
(20 Gigabytes).
Procedure
List the available storage classes:
$ oc get storageclasses NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE csi-manila-ceph manila.csi.openstack.org Delete Immediate false 20h standard (default) kubernetes.io/cinder Delete WaitForFirstConsumer true 20h standard-csi cinder.csi.openstack.org Delete WaitForFirstConsumer true 20h
Configure the
ServiceTelemetry
object:apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: name: default namespace: service-telemetry spec: backends: metrics: prometheus: enabled: true storage: strategy: persistent persistent: storageClass: standard-csi pvcStorageRequest: 50G
Enabling ElasticSearch as a storage back end for events
To enable ElasticSearch as a storage back end for events, you must configure the ServiceTelemetry
object.
Procedure
Configure the
ServiceTelemetry
object:apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: name: default namespace: service-telemetry spec: backends: events: elasticsearch: enabled: true
Configuring persistent storage for ElasticSearch
Use the additional parameters defined in backends.events.elasticsearch.storage.persistent
to configure persistent storage options for ElasticSearch, such as storage class and volume size.
Use storageClass
to define the back end storage class. If you do not set this parameter, the Service Telemetry Operator uses the default storage class for the Red Hat OpenShift Container Platform cluster.
Use the pvcStorageRequest
parameter to define the minimum required volume size to satisfy the storage request. If volumes are statically defined, it is possible that a volume size larger than requested is used. By default, Service Telemetry Operator requests a volume size of 20Gi
(20 Gibibytes).
Procedure
List the available storage classes:
$ oc get storageclasses NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE csi-manila-ceph manila.csi.openstack.org Delete Immediate false 20h standard (default) kubernetes.io/cinder Delete WaitForFirstConsumer true 20h standard-csi cinder.csi.openstack.org Delete WaitForFirstConsumer true 20h
Configure the
ServiceTelemetry
object:apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: name: default namespace: service-telemetry spec: backends: events: elasticsearch: enabled: true version: 7.16.1 storage: strategy: persistent persistent: storageClass: standard-csi pvcStorageRequest: 50G
The clouds parameter
Use the clouds
parameter to define which Smart Gateway objects deploy, thereby providing the interface for multiple monitored cloud environments to connect to an instance of STF. If a supporting back end is available, then metrics and events Smart Gateways for the default cloud configuration are created. By default, the Service Telemetry Operator creates Smart Gateways for cloud1
.
You can create a list of cloud objects to control which Smart Gateways are created for the defined clouds. Each cloud consists of data types and collectors. Data types are metrics
or events
. Each data type consists of a list of collectors, the message bus subscription address, and a parameter to enable debugging. Available collectors for metrics are collectd
, ceilometer
, and sensubility
. Available collectors for events are collectd
and ceilometer
. Ensure that the subscription address for each of these collectors is unique for every cloud, data type, and collector combination.
The default cloud1
configuration is represented by the following ServiceTelemetry
object, which provides subscriptions and data storage of metrics and events for collectd, Ceilometer, and Sensubility data collectors for a particular cloud instance:
apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: name: stf-default namespace: service-telemetry spec: clouds: - name: cloud1 metrics: collectors: - collectorType: collectd subscriptionAddress: collectd/telemetry - collectorType: ceilometer subscriptionAddress: anycast/ceilometer/metering.sample - collectorType: sensubility subscriptionAddress: sensubility/telemetry debugEnabled: false events: collectors: - collectorType: collectd subscriptionAddress: collectd/notify - collectorType: ceilometer subscriptionAddress: anycast/ceilometer/event.sample
Each item of the clouds
parameter represents a cloud instance. A cloud instance consists of three top-level parameters: name
, metrics
, and events
. The metrics
and events
parameters represent the corresponding back end for storage of that data type. The collectors
parameter specifies a list of objects made up of two required parameters, collectorType
and subscriptionAddress
, and these represent an instance of the Smart Gateway. The collectorType
parameter specifies data collected by either collectd, Ceilometer, or Sensubility. The subscriptionAddress
parameter provides the AMQ Interconnect address to which a Smart Gateway subscribes.
You can use the optional Boolean parameter debugEnabled
within the collectors
parameter to enable additional console debugging in the running Smart Gateway pod.
Additional resources
- For more information about deleting default Smart Gateways, see Section 4.4.3, “Deleting the default Smart Gateways”.
- For more information about how to configure multiple clouds, see Section 4.4, “Configuring multiple clouds”.
The alerting parameter
Use the alerting
parameter to control creation of an Alertmanager instance and the configuration of the storage back end. By default, alerting
is enabled. For more information, see Section 5.3, “Alerts in Service Telemetry Framework”.
The graphing parameter
Use the graphing
parameter to control the creation of a Grafana instance. By default, graphing
is disabled. For more information, see Section 5.1, “Dashboards in Service Telemetry Framework”.
The highAvailability parameter
Use the highAvailability
parameter to control the instantiation of multiple copies of STF components to reduce recovery time of components that fail or are rescheduled. By default, highAvailability
is disabled. For more information, see Section 5.5, “High availability”.
The transports parameter
Use the transports
parameter to control the enablement of the message bus for a STF deployment. The only transport currently supported is AMQ Interconnect. By default, the qdr
transport is enabled.
3.3. Removing Service Telemetry Framework from the Red Hat OpenShift Container Platform environment
Remove Service Telemetry Framework (STF) from an Red Hat OpenShift Container Platform environment if you no longer require the STF functionality.
Procedure
3.3.1. Deleting the namespace
To remove the operational resources for STF from Red Hat OpenShift Container Platform, delete the namespace.
Procedure
Run the
oc delete
command:$ oc delete project service-telemetry
Verify that the resources have been deleted from the namespace:
$ oc get all No resources found.
3.3.2. Removing the CatalogSource
If you do not expect to install Service Telemetry Framework (STF) again, delete the CatalogSource. When you remove the CatalogSource, PackageManifests related to STF are automatically removed from the Operator Lifecycle Manager catalog.
Procedure
If you enabled the OperatorHub.io Community Catalog Source during the installation process and you no longer need this catalog source, delete it:
$ oc delete --namespace=openshift-marketplace catalogsource operatorhubio-operators catalogsource.operators.coreos.com "operatorhubio-operators" deleted
Additional resources
For more information about the OperatorHub.io Community Catalog Source, see Section 3.1, “Deploying Service Telemetry Framework to the Red Hat OpenShift Container Platform environment”.