Chapter 2. Installing the core components of Service Telemetry Framework
Before you install Service Telemetry Framework (STF), ensure that Red Hat OpenShift Container Platform (OCP) version 4.x is running and that you understand the core components of the framework. As part of the OCP installation planning process, ensure that the administrator provides persistent storage and enough resources to run the STF components on top of the OCP environment.
Red Hat OpenShift Container Platform version 4.3 or later is currently required for a successful installation of STF.
2.1. The core components of STF
The following STF core components are managed by Operators:
- Prometheus and AlertManager
- ElasticSearch
- Smart Gateway
- AMQ Interconnect
Each component has a corresponding Operator that you can use to load the various application components and objects.
Additional resources
For more information about Operators, see the Understanding Operators guide.
2.2. Preparing your OCP environment for STF
As you prepare your OCP environment for STF, you must plan for persistent storage, adequate resources, and event storage:
- Ensure that persistent storage is available in your Red Hat OpenShift Container Platform cluster to permit a production grade deployment. For more information, see Section 2.2.1, “Persistent volumes”.
- Ensure that enough resources are available to run the Operators and the application containers. For more information, see Section 2.2.2, “Resource allocation”.
- To install ElasticSearch, you must use a community catalog source. If you do not want to use a community catalog or if you do not want to store events, see Section 2.3, “Deploying STF to the OCP environment”.
- STF uses ElasticSearch to store events, which requires a larger than normal vm.max_map_count. The vm.max_map_count value is set by default in Red Hat OpenShift Container Platform. For more information about how to edit the value of vm.max_map_count, see Section 2.2.3, “Node tuning operator”.
2.2.1. Persistent volumes
STF uses persistent storage in OCP to instantiate the volumes dynamically so that Prometheus and ElasticSearch can store metrics and events.
Additional resources
For more information about configuring persistent storage for OCP, see Understanding persistent storage.
2.2.1.1. Using ephemeral storage
You can use ephemeral storage with STF. However, if you use ephemeral storage, you might experience data loss if a pod is restarted, updated, or rescheduled onto another node. Use ephemeral storage only for development or testing, not for production environments.
Procedure
- To enable ephemeral storage for STF, set storageEphemeralEnabled: true in your ServiceTelemetry manifest.
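As a minimal sketch of where the setting goes, the following snippet writes a ServiceTelemetry manifest with ephemeral storage enabled to a local file. The apiVersion, kind, and metadata values are copied from the ServiceTelemetry example in Section 2.3.10. The file name stf-ephemeral.yaml is a hypothetical choice for this example, and your deployment might need other spec fields.

```shell
# Sketch only: write a ServiceTelemetry manifest that enables ephemeral storage.
# The file name is a hypothetical choice, not something STF requires.
manifest=stf-ephemeral.yaml

cat > "$manifest" <<EOF
apiVersion: infra.watch/v1alpha1
kind: ServiceTelemetry
metadata:
  name: stf-default
  namespace: service-telemetry
spec:
  metricsEnabled: true
  storageEphemeralEnabled: true
EOF

# Review the file, then apply it with: oc apply -f "$manifest"
```

Writing the manifest to a file first lets you review it before you apply it to the cluster.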
Additional resources
For more information about enabling ephemeral storage for STF, see Section 4.6.1, “Configuring ephemeral storage”.
2.2.2. Resource allocation
To enable the scheduling of pods within the OCP infrastructure, you need resources for the components that are running. If you do not allocate enough resources, pods remain in a Pending state because they cannot be scheduled.
The amount of resources that you require to run STF depends on your environment and the number of nodes and clouds that you want to monitor.
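To spot pods that cannot be scheduled, you can filter the pod list for the Pending status. The helper below is a sketch: the function name pending_pods is hypothetical, and it assumes the default oc get pods column order (NAME, READY, STATUS, RESTARTS, AGE) that appears in the example output later in this chapter.

```shell
# Print the names of pods whose STATUS column is Pending.
# Reads `oc get pods` style output on standard input and skips the header row.
pending_pods() {
  awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Usage (requires cluster access):
#   oc get pods --namespace service-telemetry | pending_pods
```

If the function prints nothing, no pod in the listing is waiting for resources.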
Additional resources
For recommendations about sizing for metrics collection, see https://access.redhat.com/articles/4907241.
For information about sizing requirements for ElasticSearch, see https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-managing-compute-resources.html
2.2.3. Node tuning operator
STF uses ElasticSearch to store events, which requires a larger than normal vm.max_map_count. The vm.max_map_count value is set by default in Red Hat OpenShift Container Platform.
If you want to edit the value of vm.max_map_count, you cannot apply node tuning manually using the sysctl command because Red Hat OpenShift Container Platform manages nodes directly. To configure values and apply them to the infrastructure, you must use the node tuning operator. For more information, see Using the Node Tuning Operator.
In an OCP deployment, the default node tuning operator specification provides the required profiles for ElasticSearch workloads or pods scheduled on nodes. To view the default cluster node tuning specification, run the following command:
oc get Tuned/default -o yaml -n openshift-cluster-node-tuning-operator
The output of the default specification is documented at Default profiles set on a cluster. The assignment of profiles is managed in the recommend section where profiles are applied to a node when certain conditions are met. When scheduling ElasticSearch to a node in STF, one of the following profiles is applied:
- openshift-control-plane-es
- openshift-node-es
When scheduling an ElasticSearch pod, there must be a label present that matches tuned.openshift.io/elasticsearch. If the label is present, one of the two profiles is assigned to the pod. No action is required by the administrator if you use the recommended Operator for ElasticSearch. If you use a custom-deployed ElasticSearch with STF, ensure that you add the tuned.openshift.io/elasticsearch label to all scheduled pods.
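If you run a custom-deployed ElasticSearch, one way to confirm that the label is present is to list the pods that carry it. This is a sketch, not part of the documented procedure: the function name es_tuned_pods is hypothetical, and the default namespace assumes the service-telemetry namespace used throughout this guide.

```shell
# List the pods that carry the tuned.openshift.io/elasticsearch label.
# The selector matches on the presence of the label key, whatever its value.
# The optional first argument is the namespace (default: service-telemetry).
es_tuned_pods() {
  oc get pods --namespace "${1:-service-telemetry}" \
    --selector 'tuned.openshift.io/elasticsearch' --output name
}

# Usage (requires cluster access):
#   es_tuned_pods
#   es_tuned_pods my-namespace
```

An empty result for a namespace that runs ElasticSearch pods suggests that the label is missing.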
Additional resources
For more information about virtual memory usage by ElasticSearch, see https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
For more information about how the profiles are applied to nodes, see Custom tuning specification.
2.3. Deploying STF to the OCP environment
You can deploy STF to the OCP environment in one of two ways:
- Deploy STF and store events with ElasticSearch. For more information, see Section 2.3.1, “Deploying STF to the OCP environment with ElasticSearch”.
- Deploy STF without ElasticSearch and disable events support. For more information, see Section 2.3.2, “Deploying STF to the OCP environment without ElasticSearch”.
2.3.1. Deploying STF to the OCP environment with ElasticSearch
Complete the following tasks:
- Section 2.3.3, “Creating a namespace”.
- Section 2.3.4, “Creating an OperatorGroup”.
- Section 2.3.5, “Enabling the OperatorHub.io Community Catalog Source”.
- Section 2.3.6, “Enabling Red Hat STF Operator Source”.
- Section 2.3.7, “Subscribing to the AMQ Certificate Manager Operator”.
- Section 2.3.8, “Subscribing to the Elastic Cloud on Kubernetes Operator”.
- Section 2.3.9, “Subscribing to the Service Telemetry Operator”.
- Section 2.3.10, “Creating a ServiceTelemetry object in OCP”.
2.3.2. Deploying STF to the OCP environment without ElasticSearch
Complete the following tasks:
- Section 2.3.3, “Creating a namespace”.
- Section 2.3.4, “Creating an OperatorGroup”.
- Section 2.3.6, “Enabling Red Hat STF Operator Source”.
- Section 2.3.7, “Subscribing to the AMQ Certificate Manager Operator”.
- Section 2.3.9, “Subscribing to the Service Telemetry Operator”.
- Section 2.3.10, “Creating a ServiceTelemetry object in OCP”.
2.3.3. Creating a namespace
Create a namespace to hold the STF components. The service-telemetry namespace is used throughout the documentation.
Procedure
Enter the following command:
oc new-project service-telemetry
2.3.4. Creating an OperatorGroup
Create an OperatorGroup in the namespace so that you can schedule the Operator pods.
Procedure
Enter the following command:
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: service-telemetry-operator-group
  namespace: service-telemetry
spec:
  targetNamespaces:
  - service-telemetry
EOF
Additional resources
For more information, see OperatorGroups.
2.3.5. Enabling the OperatorHub.io Community Catalog Source
Before you install ElasticSearch, you must have access to the resources on the OperatorHub.io Community Catalog Source:
Procedure
Enter the following command:
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: operatorhubio-operators
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/operator-framework/upstream-community-operators:latest
  displayName: OperatorHub.io Operators
  publisher: OperatorHub.io
EOF
2.3.6. Enabling Red Hat STF Operator Source
Before you deploy STF on Red Hat OpenShift Container Platform, you must enable the operator source.
Procedure
Install an OperatorSource that contains the Service Telemetry Operator and the Smart Gateway Operator:
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorSource
metadata:
  labels:
    opsrc-provider: redhat-operators-stf
  name: redhat-operators-stf
  namespace: openshift-marketplace
spec:
  authorizationToken: {}
  displayName: Red Hat STF Operators
  endpoint: https://quay.io/cnr
  publisher: Red Hat
  registryNamespace: redhat-operators-stf
  type: appregistry
EOF
To validate the creation of your OperatorSource, use the oc get operatorsources command. A successful import results in the MESSAGE field returning a result of The object has been successfully reconciled.
$ oc get -nopenshift-marketplace operatorsource redhat-operators-stf
NAME                   TYPE          ENDPOINT              REGISTRY               DISPLAYNAME             PUBLISHER   STATUS      MESSAGE
redhat-operators-stf   appregistry   https://quay.io/cnr   redhat-operators-stf   Red Hat STF Operators   Red Hat     Succeeded   The object has been successfully reconciled
To validate that the Operators are available from the catalog, use the oc get packagemanifests command:
$ oc get packagemanifests | grep "Red Hat STF"
smartgateway-operator       Red Hat STF Operators   2m50s
servicetelemetry-operator   Red Hat STF Operators   2m50s
2.3.7. Subscribing to the AMQ Certificate Manager Operator
You must subscribe to the AMQ Certificate Manager Operator before you deploy the other STF components. The AMQ Certificate Manager Operator is globally scoped and is not compatible with the dependency management of Operator Lifecycle Manager when used with other namespace-scoped operators.
Procedure
Subscribe to the AMQ Certificate Manager Operator, create the subscription, and validate the AMQ7 Certificate Manager:
Note: The AMQ Certificate Manager is installed globally for all namespaces, so the namespace value provided is openshift-operators. You might not see your amq7-cert-manager.v1.0.0 ClusterServiceVersion in the service-telemetry namespace for a few minutes until the processing executes against the namespace.
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: amq7-cert-manager
  namespace: openshift-operators
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: amq7-cert-manager
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
To validate your ClusterServiceVersion, use the oc get csv command. Ensure that amq7-cert-manager.v1.0.0 has a phase Succeeded.
$ oc get --namespace openshift-operators csv
NAME                       DISPLAY                                         VERSION   REPLACES   PHASE
amq7-cert-manager.v1.0.0   Red Hat Integration - AMQ Certificate Manager   1.0.0                Succeeded
2.3.8. Subscribing to the Elastic Cloud on Kubernetes Operator
If you plan to store events in ElasticSearch, you must enable the Elastic Cloud on Kubernetes Operator before you install the Service Telemetry Operator.
Procedure
Apply the following manifest to your OCP environment to enable the Elastic Cloud on Kubernetes Operator:
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: elastic-cloud-eck
  namespace: service-telemetry
spec:
  channel: stable
  installPlanApproval: Automatic
  name: elastic-cloud-eck
  source: operatorhubio-operators
  sourceNamespace: openshift-marketplace
EOF
To verify that the ClusterServiceVersion for Elastic Cloud on Kubernetes has a phase Succeeded, enter the oc get csv command:
$ oc get csv
NAME                       DISPLAY                       VERSION   REPLACES                   PHASE
elastic-cloud-eck.v1.1.0   Elastic Cloud on Kubernetes   1.1.0     elastic-cloud-eck.v1.0.1   Succeeded
2.3.9. Subscribing to the Service Telemetry Operator
To instantiate an STF instance, create the ServiceTelemetry object to allow the Service Telemetry Operator to create the environment.
Procedure
To create the Service Telemetry Operator subscription, enter the oc apply -f command:
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: servicetelemetry-operator
  namespace: service-telemetry
spec:
  channel: stable
  installPlanApproval: Automatic
  name: servicetelemetry-operator
  source: redhat-operators-stf
  sourceNamespace: openshift-marketplace
EOF
To validate the Service Telemetry Operator and the dependent operators, enter the following command:
$ oc get csv --namespace service-telemetry
NAME                                DISPLAY                                         VERSION   REPLACES                            PHASE
amq7-cert-manager.v1.0.0            Red Hat Integration - AMQ Certificate Manager   1.0.0                                         Succeeded
amq7-interconnect-operator.v1.2.0   Red Hat Integration - AMQ Interconnect          1.2.0                                         Succeeded
elastic-cloud-eck.v1.1.0            Elastic Cloud on Kubernetes                     1.1.0     elastic-cloud-eck.v1.0.1            Succeeded
prometheusoperator.0.37.0           Prometheus Operator                             0.37.0    prometheusoperator.0.32.0           Succeeded
service-telemetry-operator.v1.0.2   Service Telemetry Operator                      1.0.2     service-telemetry-operator.v1.0.1   Succeeded
smart-gateway-operator.v1.0.1       Smart Gateway Operator                          1.0.1     smart-gateway-operator.v1.0.0       Succeeded
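When several Operators are installed at once, it can help to print only the ClusterServiceVersions that have not reached the Succeeded phase. The helper below is a sketch: the function name csv_not_succeeded is hypothetical, and it assumes that PHASE is the last column of the oc get csv output, as in the listing above.

```shell
# Print the name of every ClusterServiceVersion whose PHASE (last column)
# is not Succeeded, and return a non-zero status if any are found.
# Reads `oc get csv` style output on standard input and skips the header row.
csv_not_succeeded() {
  awk 'NR > 1 && $NF != "Succeeded" { print $1; bad = 1 } END { exit bad + 0 }'
}

# Usage (requires cluster access):
#   oc get csv --namespace service-telemetry | csv_not_succeeded
```

Because the DISPLAY column contains spaces, the helper reads the phase from the last field instead of a fixed column number.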
2.3.10. Creating a ServiceTelemetry object in OCP
To deploy the Service Telemetry Framework, you must create an instance of ServiceTelemetry in OCP. By default, eventsEnabled is set to false. If you do not want to store events in ElasticSearch, ensure that eventsEnabled is set to false. For more information, see Section 2.3.2, “Deploying STF to the OCP environment without ElasticSearch”.
The following core parameters are available for a ServiceTelemetry manifest:
Table 2.1. Core parameters for a ServiceTelemetry manifest
Parameter | Description | Default Value |
---|---|---|
eventsEnabled | Enable events support in STF. Requires prerequisite steps to ensure ElasticSearch can be started. For more information, see Section 2.3.8, “Subscribing to the Elastic Cloud on Kubernetes Operator”. | false |
metricsEnabled | Enable metrics support in STF. | |
highAvailabilityEnabled | Enable high availability in STF. For more information, see Section 4.3, “High availability”. | |
storageEphemeralEnabled | Enable ephemeral storage support in STF. For more information, see Section 4.6, “Ephemeral storage”. | |
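For a deployment without ElasticSearch, a manifest along the following lines keeps events disabled. This is a sketch under stated assumptions: the apiVersion, kind, and metadata values are taken from the procedure in this section, the file name stf-no-events.yaml is a hypothetical choice, and eventsEnabled is written out explicitly even though the text above states that it defaults to false.

```shell
# Sketch only: write a ServiceTelemetry manifest with events support disabled.
# The file name is a hypothetical choice, not something STF requires.
manifest=stf-no-events.yaml

cat > "$manifest" <<EOF
apiVersion: infra.watch/v1alpha1
kind: ServiceTelemetry
metadata:
  name: stf-default
  namespace: service-telemetry
spec:
  eventsEnabled: false
  metricsEnabled: true
EOF

# Review the file, then apply it with: oc apply -f "$manifest"
```

Stating eventsEnabled explicitly makes the intent visible to anyone who reads the manifest later.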
Procedure
To store events in ElasticSearch, set eventsEnabled to true during deployment:
oc apply -f - <<EOF
apiVersion: infra.watch/v1alpha1
kind: ServiceTelemetry
metadata:
  name: stf-default
  namespace: service-telemetry
spec:
  eventsEnabled: true
  metricsEnabled: true
EOF
To view the STF deployment logs in the Service Telemetry Operator, use the oc logs command:
oc logs $(oc get pod --selector='name=service-telemetry-operator' -oname) -c ansible
PLAY RECAP ***
localhost : ok=37 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
View the pods and the status of each pod to determine that all workloads are operating nominally:
Note: If you set eventsEnabled: true, the notification Smart Gateways will Error and CrashLoopBackOff for a period of time before ElasticSearch starts.
$ oc get pods
NAME                                                              READY   STATUS    RESTARTS   AGE
alertmanager-stf-default-0                                        2/2     Running   0          26m
elastic-operator-645dc8b8ff-jwnzt                                 1/1     Running   0          88m
elasticsearch-es-default-0                                        1/1     Running   0          26m
interconnect-operator-6fd49d9fb9-4bl92                            1/1     Running   0          46m
prometheus-operator-bf7d97fb9-kwnlx                               1/1     Running   0          46m
prometheus-stf-default-0                                          3/3     Running   0          26m
service-telemetry-operator-54f4c99d9b-k7ll6                       2/2     Running   0          46m
smart-gateway-operator-7ff58bcf94-66rvx                           2/2     Running   0          46m
stf-default-ceilometer-notification-smartgateway-6675df547q4lbj   1/1     Running   0          26m
stf-default-collectd-notification-smartgateway-698c87fbb7-xj528   1/1     Running   0          26m
stf-default-collectd-telemetry-smartgateway-79c967c8f7-9hsqn      1/1     Running   0          26m
stf-default-interconnect-7458fd4d69-nqbfs                         1/1     Running   0          26m
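The PLAY RECAP line in the operator log from this procedure reports a failed= counter, and a value of 0 indicates that no task failed. The helper below is a sketch for pulling that number out of the log. The function name recap_failed_count is hypothetical, and it assumes the recap format shown in this procedure.

```shell
# Extract the failed= count from the most recent PLAY RECAP host line.
# Reads operator log text on standard input and prints only the number.
recap_failed_count() {
  sed -n 's/.*failed=\([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Usage (requires cluster access):
#   oc logs $(oc get pod --selector='name=service-telemetry-operator' -oname) -c ansible | recap_failed_count
```

The helper keeps only the last match, so it reports the most recent run in the log.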
2.4. Removing STF from the OCP environment
Remove STF from an OCP environment if you no longer require the STF functionality.
Complete the following tasks:
- Section 2.4.1, “Deleting the namespace”.
- Section 2.4.2, “Removing the OperatorSource”.
2.4.1. Deleting the namespace
To remove the operational resources for STF from OCP, delete the namespace.
Procedure
Run the oc delete command:
oc delete project service-telemetry
Verify that the resources have been deleted from the namespace:
$ oc get all
No resources found.
2.4.2. Removing the OperatorSource
If you do not expect to install Service Telemetry Framework again, delete the OperatorSource. When you remove the OperatorSource, PackageManifests related to STF are removed from the Operator Lifecycle Manager catalog.
Procedure
Delete the OperatorSource:
$ oc delete --namespace=openshift-marketplace operatorsource redhat-operators-stf
operatorsource.operators.coreos.com "redhat-operators-stf" deleted
Verify that the STF PackageManifests are removed from the platform. If successful, the following command returns no result:
$ oc get packagemanifests | grep "Red Hat STF"
If you enabled the OperatorHub.io Community Catalog Source during the installation process and you no longer need this catalog source, delete it:
$ oc delete --namespace=openshift-marketplace catalogsource operatorhubio-operators
catalogsource.operators.coreos.com "operatorhubio-operators" deleted
Additional resources
For more information about the OperatorHub.io Community Catalog Source, see Section 2.3, “Deploying STF to the OCP environment”.