Chapter 6. Upgrading Service Telemetry Framework to version 1.3
To migrate from Service Telemetry Framework (STF) 1.2 to STF 1.3, you must replace the ClusterServiceVersion and Subscription objects in the service-telemetry namespace on your Red Hat OpenShift Container Platform environment.
Prerequisites
- You have upgraded your Red Hat OpenShift Container Platform environment to 4.7. STF 1.3 does not run on Red Hat OpenShift Container Platform 4.5 and lower. STF 1.2 does not run on Red Hat OpenShift Container Platform 4.7 and higher.
- You have backed up your data before any upgrade of the environment. When you upgrade STF 1.2 to 1.3, there is a brief outage while the Smart Gateways are upgraded. Additionally, changes to the ServiceTelemetry and SmartGateway objects do not have any effect while the Operators are being replaced.
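The version prerequisite can be checked from the command line before you start. The following is a minimal sketch, assuming that the clusterversion object named version reports the running release in .status.desired.version and that "upgraded to 4.7" means minor version 7 or later in the 4.x series. The function names ocp_version_ok and check_cluster_version are illustrative and are not part of STF.

```shell
# Hypothetical helper: succeed if a version string such as "4.7.13"
# is 4.7 or later within the 4.x series (an assumption, see the text above).
ocp_version_ok() {
  local version="$1" major minor
  major="${version%%.*}"   # text before the first dot
  minor="${version#*.}"    # text after the first dot
  minor="${minor%%.*}"     # keep only the minor component
  [ "$major" -eq 4 ] && [ "$minor" -ge 7 ]
}

# Hypothetical wrapper: read the version from the cluster and check it.
check_cluster_version() {
  local version
  version="$(oc get clusterversion version -o jsonpath='{.status.desired.version}')"
  if ocp_version_ok "$version"; then
    echo "OpenShift ${version}: meets the STF 1.3 prerequisite"
  else
    echo "OpenShift ${version}: upgrade to 4.7 before upgrading STF" >&2
    return 1
  fi
}
```

Run check_cluster_version while logged in to the cluster. A non-zero return code indicates that the cluster does not meet the prerequisite as interpreted here.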
To upgrade from STF 1.2 to 1.3, complete the following procedures:
6.1. Removing Service Telemetry Framework 1.2 Operators
Remove the STF 1.2 Operators: the Smart Gateway Operator and the Service Telemetry Operator.
You must temporarily remove the clouds parameter because of changes in the API. As a result, all Smart Gateways are removed until the upgrade is complete, and metrics and events cannot be delivered during the upgrade.
Procedure
Retrieve the current ServiceTelemetry object and note the contents, in particular the clouds parameter, because you must remove this parameter before you upgrade the Operators:
$ oc get stf default -oyaml
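Because the clouds parameter is restored later in this procedure, it can help to save the object to a file instead of only noting its contents. The following is a minimal sketch, assuming a writable working directory. The function name save_stf_object and the default file name stf-default-backup.yaml are illustrative choices and are not part of STF.

```shell
# Hypothetical helper: save the ServiceTelemetry object named "default"
# to a file so that the clouds parameter can be restored later.
save_stf_object() {
  local backup_file="${1:-stf-default-backup.yaml}"
  # Write to a temporary file first so that a failed "oc get" does not
  # leave an empty or partial backup behind.
  if oc get stf default -oyaml > "${backup_file}.tmp"; then
    mv "${backup_file}.tmp" "$backup_file"
    echo "Saved ServiceTelemetry object to ${backup_file}"
  else
    rm -f "${backup_file}.tmp"
    echo "Could not retrieve the ServiceTelemetry object" >&2
    return 1
  fi
}
```

Call save_stf_object with no argument to use the default file name, or pass a path of your choice.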
Modify the ServiceTelemetry object to clear the clouds parameter and set it to an empty list. Set cloudsRemoveOnMissing to true to remove all Smart Gateways.
Warning: This command stops all monitoring functions until after the upgrade is completed and the clouds object is redefined. If you use the default clouds configuration, it is not defined in your ServiceTelemetry object.
$ oc patch stf default --patch $'spec:\n clouds: []\n cloudsRemoveOnMissing: true' --type=merge
Monitor the Smart Gateway pods until they are fully terminated and removed:
$ oc get pods --selector app=smart-gateway --watch
NAME                                                      READY   STATUS        RESTARTS   AGE
default-cloud1-ceil-meter-smartgateway-58cc854f4-hgk92    1/1     Running       0          2m42s
default-cloud1-coll-meter-smartgateway-6c76f9786d-crn9b   2/2     Running       0          2m55s
default-cloud1-coll-meter-smartgateway-6c76f9786d-crn9b   2/2     Terminating   0          3m12s
default-cloud1-ceil-meter-smartgateway-58cc854f4-hgk92    1/1     Terminating   0          3m
...
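The --watch option runs until you interrupt it. If you prefer a command that returns on its own, the same check can be polled. The following is a minimal sketch, assuming the app=smart-gateway selector used in this procedure. The function name and the default timeout and interval values are illustrative.

```shell
# Hypothetical helper: poll until no Smart Gateway pods remain.
# Arguments: timeout in seconds (default 300), poll interval (default 5).
wait_for_smart_gateways_removed() {
  local timeout="${1:-300}" interval="${2:-5}" elapsed=0 pods remaining
  while :; do
    # --no-headers prints one line per pod and nothing when none exist.
    if ! pods="$(oc get pods --selector app=smart-gateway --no-headers 2>/dev/null)"; then
      echo "oc get pods failed" >&2
      return 1
    fi
    # Count the non-empty lines of output.
    remaining="$(printf '%s\n' "$pods" | awk 'NF { n++ } END { print n + 0 }')"
    if [ "$remaining" -eq 0 ]; then
      echo "All Smart Gateway pods are removed"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out with ${remaining} Smart Gateway pod(s) remaining" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

For example, wait_for_smart_gateways_removed 120 5 checks every 5 seconds and stops with an error once 120 seconds have elapsed.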
Retrieve the Subscription name of the Smart Gateway Operator:
$ oc get sub smart-gateway-operator-stable-1.2-redhat-operators-openshift-marketplace
NAME                                                                       PACKAGE                  SOURCE             CHANNEL
smart-gateway-operator-stable-1.2-redhat-operators-openshift-marketplace   smart-gateway-operator   redhat-operators   stable-1.2
Delete the Smart Gateway Operator subscription:
$ oc delete sub smart-gateway-operator-stable-1.2-redhat-operators-openshift-marketplace
subscription.operators.coreos.com "smart-gateway-operator-stable-1.2-redhat-operators-openshift-marketplace" deleted
Retrieve the Smart Gateway Operator ClusterServiceVersion:
$ oc get csv -o name | grep -E 'smart-gateway'
clusterserviceversion.operators.coreos.com/smart-gateway-operator.v2.2.1623675667
Delete the Smart Gateway Operator ClusterServiceVersion:
$ oc delete clusterserviceversion.operators.coreos.com/smart-gateway-operator.v2.2.1623675667
clusterserviceversion.operators.coreos.com "smart-gateway-operator.v2.2.1623675667" deleted
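The ClusterServiceVersion name includes a build identifier, so the name in your environment can differ from the example output. The lookup and the deletion can be combined so that the name is not typed by hand. The following is a minimal sketch that reuses the grep pattern from the previous step. The function names are illustrative, and the helper stops if the pattern matches more than one object.

```shell
# Hypothetical helper: print the resource name of the Smart Gateway
# Operator ClusterServiceVersion, using the same filter as the step above.
find_smart_gateway_csv() {
  oc get csv -o name | grep -E 'smart-gateway'
}

# Hypothetical helper: delete that ClusterServiceVersion, but only when
# exactly one object matches.
delete_smart_gateway_csv() {
  local csv
  csv="$(find_smart_gateway_csv)" || {
    echo "No Smart Gateway Operator ClusterServiceVersion found" >&2
    return 1
  }
  if [ "$(printf '%s\n' "$csv" | grep -c .)" -ne 1 ]; then
    echo "More than one match, delete manually:" >&2
    printf '%s\n' "$csv" >&2
    return 1
  fi
  oc delete "$csv"
}
```

Run find_smart_gateway_csv first to review the match, then run delete_smart_gateway_csv.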
Delete the SmartGateway Custom Resource Definition:
$ oc delete crd smartgateways.smartgateway.infra.watch
customresourcedefinition.apiextensions.k8s.io "smartgateways.smartgateway.infra.watch" deleted
Patch the Service Telemetry Operator Subscription to use the stable-1.3 channel:
$ oc patch sub service-telemetry-operator --patch $'spec:\n channel: stable-1.3' --type=merge
subscription.operators.coreos.com/service-telemetry-operator patched
Monitor the output of the oc get csv command until the Smart Gateway Operator is installed and Service Telemetry Operator is Pending for version 1.2 and 1.3:
$ oc get csv
NAME                                         DISPLAY                                         VERSION          REPLACES                                     PHASE
amq7-cert-manager.v1.0.0                     Red Hat Integration - AMQ Certificate Manager   1.0.0                                                         Succeeded
amq7-interconnect-operator.v1.2.4            Red Hat Integration - AMQ Interconnect          1.2.4            amq7-interconnect-operator.v1.2.3            Succeeded
elastic-cloud-eck.v1.6.0                     Elasticsearch (ECK) Operator                    1.6.0            elastic-cloud-eck.v1.5.0                     Succeeded
prometheusoperator.0.47.0                    Prometheus Operator                             0.47.0           prometheusoperator.0.37.0                    Succeeded
service-telemetry-operator.v1.2.1623675667   Service Telemetry Operator                      1.2.1623675667                                                Pending
service-telemetry-operator.v1.3.1622734200   Service Telemetry Operator                      1.3.1622734200   service-telemetry-operator.v1.2.1623675667   Pending
smart-gateway-operator.v3.0.1622734308       Smart Gateway Operator                          3.0.1622734308                                                Succeeded
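The condition described in this step can also be tested per object instead of read from the table. The following is a minimal sketch, assuming that each ClusterServiceVersion reports its phase in .status.phase. The function names are illustrative, and the object names that you pass must be taken from your own oc get csv output because the build identifiers differ between releases.

```shell
# Hypothetical helper: print the phase of one ClusterServiceVersion.
csv_phase() {
  oc get csv "$1" -o jsonpath='{.status.phase}'
}

# Hypothetical helper: succeed when the new Smart Gateway Operator is
# Succeeded and both Service Telemetry Operator versions are Pending.
# Arguments: Smart Gateway CSV, Service Telemetry Operator 1.2 CSV,
# Service Telemetry Operator 1.3 CSV.
operators_ready_for_cleanup() {
  [ "$(csv_phase "$1")" = "Succeeded" ] &&
    [ "$(csv_phase "$2")" = "Pending" ] &&
    [ "$(csv_phase "$3")" = "Pending" ]
}
```

With the names from the example output, the call is operators_ready_for_cleanup smart-gateway-operator.v3.0.1622734308 service-telemetry-operator.v1.2.1623675667 service-telemetry-operator.v1.3.1622734200. A zero return code means the state matches the one this step waits for.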
Delete the Service Telemetry Operator v1.2 ClusterServiceVersion:
$ oc delete csv service-telemetry-operator.v1.2.1623675667
clusterserviceversion.operators.coreos.com "service-telemetry-operator.v1.2.1623675667" deleted
Edit the ServiceTelemetry object and insert the contents of your previously noted clouds parameter. If the clouds parameter was not previously defined because you used the default Smart Gateway instances, remove the clouds: [] parameter.
$ oc edit stf default
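For the default case, where the clouds parameter was not previously defined, the empty list can also be removed without an interactive editor. The following is a sketch of an alternative to oc edit, assuming that spec.clouds is currently present on the object, because a JSON patch remove operation fails when the path does not exist. The function name is illustrative.

```shell
# Hypothetical helper: remove the "clouds: []" entry from the
# ServiceTelemetry object with a JSON patch instead of an interactive edit.
remove_empty_clouds() {
  oc patch stf default --type=json \
    --patch '[{"op": "remove", "path": "/spec/clouds"}]'
}
```

If you previously defined your own clouds parameter, use oc edit as described above so that you can insert the saved contents.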
Validate that the Smart Gateways are restored:
$ oc get pods --selector app=smart-gateway
NAME                                                      READY   STATUS    RESTARTS   AGE
default-cloud1-ceil-meter-smartgateway-6484b98b68-sl7mb   2/2     Running   0          5m56s
default-cloud1-coll-meter-smartgateway-799f687658-nfzr6   2/2     Running   0          6m6s
6.2. Subscribing to the Service Telemetry Operator
You must subscribe to the Service Telemetry Operator, which manages the STF instances.
Procedure
Create the Service Telemetry Operator subscription:
$ oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: service-telemetry-operator
  namespace: service-telemetry
spec:
  channel: stable-1.3
  installPlanApproval: Automatic
  name: service-telemetry-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
Validate the Service Telemetry Operator and the dependent operators:
$ oc get csv --namespace service-telemetry
NAME                                         DISPLAY                                         VERSION          REPLACES                            PHASE
amq7-cert-manager.v1.0.0                     Red Hat Integration - AMQ Certificate Manager   1.0.0                                                Succeeded
amq7-interconnect-operator.v1.2.3            Red Hat Integration - AMQ Interconnect          1.2.3            amq7-interconnect-operator.v1.2.2   Succeeded
elastic-cloud-eck.v1.6.0                     Elasticsearch (ECK) Operator                    1.6.0            elastic-cloud-eck.v1.5.0            Succeeded
prometheusoperator.0.47.0                    Prometheus Operator                             0.47.0           prometheusoperator.0.37.0           Succeeded
service-telemetry-operator.v1.3.1622734200   Service Telemetry Operator                      1.3.1622734200                                       Succeeded
smart-gateway-operator.v3.0.1622734308       Smart Gateway Operator                          3.0.1622734308                                       Succeeded
When the new Operators start, they reconcile the existing ServiceTelemetry and SmartGateway objects, which restarts the Smart Gateway containers.
Check the state of the Smart Gateway containers:
$ oc get pods
NAME                                                      READY   STATUS        RESTARTS   AGE
...
default-cloud1-ceil-meter-smartgateway-5849c4cdb5-xgl42   1/1     Running       0          35s
default-cloud1-coll-meter-smartgateway-749674f75c-k7pm7   2/2     Terminating   0          56m
default-cloud1-coll-meter-smartgateway-868476456b-ksh9b   2/2     Running       0          26s
...
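While the containers restart, old pods in the Terminating state appear next to the new ones, as in the output above. The following is a minimal sketch for checking when the restart has settled, assuming the default column layout of oc get pods, where STATUS is the third column, and the app=smart-gateway selector used earlier in this chapter. The function names are illustrative.

```shell
# Hypothetical helper: read "oc get pods --no-headers" output on stdin and
# succeed only if at least one pod is listed and every pod is Running.
all_pods_running() {
  awk 'NF { total++; if ($3 == "Running") running++ }
       END { exit !(total > 0 && running == total) }'
}

# Hypothetical wrapper: apply the check to the Smart Gateway pods.
smart_gateways_running() {
  oc get pods --selector app=smart-gateway --no-headers | all_pods_running
}
```

A zero return code from smart_gateways_running means that no Smart Gateway pod is still terminating or starting. The check looks only at the STATUS column and does not compare the READY counts.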