Image-based upgrade for single-node OpenShift cluster with the Lifecycle Agent (Developer Preview)
Table of Contents
- About Developer Preview features
- About the image-based upgrade for single-node OpenShift cluster
- Idle stage
- Prep stage
- Upgrade stage
- (Optional) Rollback stage
- Installing the Lifecycle Agent by using the CLI
- Installing the Lifecycle Agent by using the web console
- Sharing the container directory between `ostree` stateroots
- Sharing the container directory when using RHACM
- Sharing the container directory when using GitOps ZTP
- Generating a seed image with the Lifecycle Agent
- Preparing the single-node OpenShift cluster for the image-based upgrade
- Upgrading the single-node OpenShift cluster with Lifecycle Agent
- (Optional) Initiating rollback of the single-node OpenShift cluster after an image-based upgrade
- Upgrading the single-node OpenShift cluster through GitOps ZTP
- (Optional) Initiating rollback with TALM
The image-based upgrade is currently a Developer Preview feature for OpenShift Container Platform 4.15.
About Developer Preview features
Developer Preview features are not supported with Red Hat production service level agreements (SLAs) and are not functionally complete. Red Hat does not advise using them in a production setting. Developer Preview features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. These releases may not have any documentation, and testing is limited. Red Hat may provide ways to submit feedback on Developer Preview releases without an associated SLA.
About the image-based upgrade for single-node OpenShift cluster
From OpenShift Container Platform 4.14.7, the Lifecycle Agent 4.15 provides you with an alternative way to upgrade the platform version of a single-node OpenShift cluster. The image-based upgrade is faster than the standard upgrade method and allows you to directly upgrade from OpenShift Container Platform <4.y> to <4.y+2>, and <4.y.z> to <4.y.z+n>.
This upgrade method utilizes a generated OCI image from a dedicated seed cluster that is installed on the target single-node OpenShift cluster as a new ostree
stateroot. A seed cluster is a single-node OpenShift cluster deployed with the target OpenShift Container Platform version, Day 2 Operators, and configurations that are common to all target clusters.
You can use the seed image, which is generated from the seed cluster, to upgrade the platform version on any single-node OpenShift cluster that has the same combination of hardware, Day 2 Operators, and cluster configuration as the seed cluster.
IMPORTANT: The image-based upgrade uses custom images that are specific to the hardware platform that the clusters are running on. Each different hardware platform requires a separate seed image.
The Lifecycle Agent uses two custom resources (CRs) on the participating clusters to orchestrate the upgrade:
- On the seed cluster, the SeedGenerator CR allows for the seed image generation. This CR specifies the repository to push the seed image to.
- On the target cluster, the ImageBasedUpgrade CR specifies the seed container image for the upgrade of the target cluster and the backup configurations for your workloads.
Example SeedGenerator CR
apiVersion: lca.openshift.io/v1alpha1
kind: SeedGenerator
metadata:
  name: seedimage
spec:
  seedImage: <seed_container_image>
Example ImageBasedUpgrade CR
apiVersion: lca.openshift.io/v1alpha1
kind: ImageBasedUpgrade
metadata:
  name: example-upgrade
spec:
  stage: Idle
  seedImageRef:
    version: <target_version>
    image: <seed_container_image>
    pullSecretRef: <seed_pull_secret>
  additionalImages:
    name: ""
    namespace: ""
  autoRollbackOnFailure: {}
  # disabledForPostRebootConfig: "true"
  # disabledForUpgradeCompletion: "true"
  # disabledInitMonitor: "true"
  # initMonitorTimeoutSeconds: 1800
  # extraManifests:
  # - name: sno-extra-manifests
  #   namespace: openshift-lifecycle-agent
  oadpContent:
  - name: oadp-cm-example
    namespace: openshift-adp
- Define the desired stage for the ImageBasedUpgrade CR in the spec.stage field. The value can be Idle, Prep, Upgrade, or Rollback.
- Define the target platform version, the seed image to be used, and the secret required to access the image in the spec.seedImageRef section.
- Configure the automatic rollback in the spec.autoRollbackOnFailure section. By default, automatic rollback on failure is enabled throughout the upgrade.
- Optional. If set to true, the autoRollbackOnFailure.disabledForPostRebootConfig field disables automatic rollback when the reconfiguration of the cluster fails upon the first reboot.
- Optional. If set to true, the autoRollbackOnFailure.disabledForUpgradeCompletion field disables automatic rollback after the Lifecycle Agent reports a failed upgrade upon completion.
- Optional. If set to true, the autoRollbackOnFailure.disabledInitMonitor field disables automatic rollback when the upgrade does not complete after the reboot within the time frame specified in the initMonitorTimeoutSeconds field.
- Optional. The autoRollbackOnFailure.initMonitorTimeoutSeconds field specifies the time frame in seconds. If not defined or set to 0, the default value of 1800 seconds (30 minutes) is used.
- Optional. In the spec.extraManifests section, specify the list of ConfigMap resources that contain the additional extra manifests that you want to apply to the target cluster. You can also add your custom catalog sources that you want to retain after the upgrade.
After generating the seed image on the seed cluster, you can move through the stages on the target cluster by setting the spec.stage field in the ImageBasedUpgrade CR to the following values:
- Idle
- Prep
- Upgrade
- Rollback (Optional)
Idle stage
The Lifecycle Agent creates an ImageBasedUpgrade
CR set to stage: Idle
when the Operator is first deployed. This is the default stage: there is no ongoing upgrade, and the cluster is ready to move to the Prep
stage.
After a successful upgrade or a rollback, you commit to the change by patching the stage
field to Idle
in the ImageBasedUpgrade
CR. Changing to this stage ensures that the Lifecycle Agent cleans up resources, so the cluster is ready for upgrades again.
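For example, committing a finished upgrade or rollback is a patch of the stage field. The following is a minimal sketch that assumes the example-upgrade CR name and the openshift-lifecycle-agent namespace used in the examples in this document:
$ oc patch imagebasedupgrades.lca.openshift.io example-upgrade -p='{"spec": {"stage": "Idle"}}' --type=merge -n openshift-lifecycle-agent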
Prep stage
Note: You can complete this stage before a scheduled maintenance window.
During the Prep
stage, you specify the following upgrade details in the ImageBasedUpgrade
CR:
- seed image to use
- resources to back up
- extra manifests to apply after the upgrade
Then, based on what you specify, the Lifecycle Agent prepares for the upgrade without impacting the current running version. This preparation includes ensuring that the target cluster is ready to proceed to the Upgrade
stage by checking that it meets certain conditions, and by pulling the seed image to the target cluster along with the additional container images specified in the seed image.
You also prepare backup resources with the OADP Operator’s Backup
and Restore
CRs. These CRs are used in the Upgrade
stage to reconfigure the cluster, register the cluster with RHACM, and restore application artifacts.
Important: The same version of the applications must function on both the current and the target release of OpenShift Container Platform.
In addition to the OADP Operator, the Lifecycle Agent uses the ostree versioning system to create a backup, which allows complete cluster reconfiguration after both an upgrade and a rollback.
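If you want to see the ostree deployments (stateroots) that exist on a node, you can inspect them directly. This is an optional, illustrative check only; the <node_name> placeholder and the oc debug invocation are assumptions and not part of the upgrade procedure, and ostree admin status is a standard ostree command rather than a Lifecycle Agent interface:
$ oc debug node/<node_name> -- chroot /host ostree admin status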
You can stop the upgrade process at this point by moving to the Idle
stage or you can start the upgrade by moving to the Upgrade
stage in the ImageBasedUpgrade
CR. If you stop, the Operator performs cleanup operations.
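For example, with the example-upgrade CR name and the openshift-lifecycle-agent namespace used in this document, you start the preparation by patching the stage field. A minimal sketch:
$ oc patch imagebasedupgrades.lca.openshift.io example-upgrade -p='{"spec": {"stage": "Prep"}}' --type=merge -n openshift-lifecycle-agent
To stop and trigger cleanup instead, set the stage back to Idle:
$ oc patch imagebasedupgrades.lca.openshift.io example-upgrade -p='{"spec": {"stage": "Idle"}}' --type=merge -n openshift-lifecycle-agent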
Upgrade stage
Just before the Lifecycle Agent starts the upgrade process, the backup of your cluster resources specified in the Prep
stage is created on a compatible object storage solution. After the target cluster reboots with the new platform version, the Operator applies the cluster and application configurations defined in the Backup
and Restore
CRs, and applies any extra manifests that are specified in the referenced ConfigMap
resource.
The Operator also regenerates the seed image’s cluster cryptography. This ensures that each single-node OpenShift cluster upgraded with the same seed image has unique and valid cryptographic objects.
Once you are satisfied with the changes, you can finalize the upgrade by moving to the Idle
stage. If you encounter issues after the upgrade, you can move to the Rollback
stage for a manual rollback.
(Optional) Rollback stage
The rollback stage can be initiated manually or automatically upon failure. During the Rollback
stage, the Lifecycle Agent sets the original ostree
stateroot as default. Then, the node reboots with the previous release of OpenShift Container Platform and application configurations.
By default, automatic rollback is enabled in the ImageBasedUpgrade
CR. The Lifecycle Agent can initiate an automatic rollback if the upgrade fails or if the upgrade does not complete within the specified time limit. For more information about the automatic rollback configurations, see the (Optional) Initiating rollback of the single-node OpenShift cluster after an image-based upgrade or (Optional) Initiating rollback with TALM sections.
Warning: If you move to the Idle
stage after a rollback, the Lifecycle Agent cleans up resources that can be used to troubleshoot a failed upgrade.
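A manual rollback, described in detail later in this document, is also just a patch of the stage field. A minimal sketch with the example-upgrade CR name used in this document:
$ oc patch imagebasedupgrades.lca.openshift.io example-upgrade -p='{"spec": {"stage": "Rollback"}}' --type=merge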
Installing the Lifecycle Agent by using the CLI
You can use the OpenShift CLI (oc
) to install the Lifecycle Agent from the 4.15 Operator catalog on both the seed and target cluster.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
- Create a namespace for the Lifecycle Agent:
  - Define the Namespace CR and save the YAML file, for example, lca-namespace.yaml:
    apiVersion: v1
    kind: Namespace
    metadata:
      name: openshift-lifecycle-agent
      annotations:
        workload.openshift.io/allowed: management
  - Create the Namespace CR:
    $ oc create -f lca-namespace.yaml
- Create an Operator group for the Lifecycle Agent:
  - Define the OperatorGroup CR and save the YAML file, for example, lca-operatorgroup.yaml:
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: openshift-lifecycle-agent
      namespace: openshift-lifecycle-agent
    spec:
      targetNamespaces:
      - openshift-lifecycle-agent
  - Create the OperatorGroup CR:
    $ oc create -f lca-operatorgroup.yaml
- Create a Subscription CR:
  - Define the Subscription CR and save the YAML file, for example, lca-subscription.yaml:
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: openshift-lifecycle-agent-subscription
      namespace: openshift-lifecycle-agent
    spec:
      channel: "alpha"
      name: lifecycle-agent
      source: redhat-operators
      sourceNamespace: openshift-marketplace
  - Create the Subscription CR by running the following command:
    $ oc create -f lca-subscription.yaml
Verification
- Verify that the installation succeeded by inspecting the CSV resource:
  $ oc get csv -n openshift-lifecycle-agent
  Example output
  NAME                      DISPLAY                     VERSION   REPLACES   PHASE
  lifecycle-agent.v4.15.0   Openshift Lifecycle Agent   4.15.0               Succeeded
- Verify that the Lifecycle Agent is up and running:
  $ oc get deploy -n openshift-lifecycle-agent
  Example output
  NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
  lifecycle-agent-controller-manager   1/1     1            1           14s
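Because the Lifecycle Agent automatically creates a default ImageBasedUpgrade CR when it is deployed (see the preparation section later in this document), you can also optionally confirm that the CR exists. A minimal sketch:
$ oc get ibu -A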
Installing the Lifecycle Agent by using the web console
You can use the OpenShift Container Platform web console to install the Lifecycle Agent from the 4.15 Operator catalog on both the seed and target cluster.
Prerequisites
- Log in as a user with
cluster-admin
privileges.
Procedure
- In the OpenShift Container Platform web console, navigate to Operators → OperatorHub.
- Search for the Lifecycle Agent from the list of available Operators, and then click Install.
- On the Install Operator page, under A specific namespace on the cluster, select openshift-lifecycle-agent.
- Click Install.
Verification
To confirm that the installation is successful:
-
Navigate to the Operators → Installed Operators page.
-
Ensure that the Lifecycle Agent is listed in the openshift-lifecycle-agent project with a Status of InstallSucceeded.
Note: During installation an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
If the Operator is not installed successfully:
-
Go to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
-
Go to the Workloads → Pods page and check the logs for pods in the openshift-lifecycle-agent project.
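If you prefer the CLI for this troubleshooting, the following sketch checks the Operator pods and logs; the deployment name is the one shown in the CLI installation verification above:
$ oc get pods -n openshift-lifecycle-agent
$ oc logs -n openshift-lifecycle-agent deployment/lifecycle-agent-controller-manager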
Sharing the container directory between `ostree` stateroots
You must apply a MachineConfig
to both the seed and the target clusters at installation time to create a separate partition and share the /var/lib/containers
directory between the two ostree
stateroots that will be used during the upgrade process.
Sharing the container directory when using RHACM
When you are using RHACM, you must apply a MachineConfig
to both the seed and target clusters.
Important: You must complete this procedure at installation time.
Prerequisites
- Log in as a user with
cluster-admin
privileges.
Procedure
- Apply a MachineConfig to share the /var/lib/containers directory:
  apiVersion: machineconfiguration.openshift.io/v1
  kind: MachineConfig
  metadata:
    labels:
      machineconfiguration.openshift.io/role: master
    name: 98-var-lib-containers-partitioned
  spec:
    config:
      ignition:
        version: 3.2.0
      storage:
        disks:
          - device: /dev/disk/by-id/wwn-<root_disk>
            partitions:
              - label: varlibcontainers
                startMiB: <start_of_partition>
                sizeMiB: <partition_size>
        filesystems:
          - device: /dev/disk/by-partlabel/varlibcontainers
            format: xfs
            mountOptions:
              - defaults
              - prjquota
            path: /var/lib/containers
            wipeFilesystem: true
      systemd:
        units:
          - contents: |-
              # Generated by Butane
              [Unit]
              Before=local-fs.target
              Requires=systemd-fsck@dev-disk-by\x2dpartlabel-varlibcontainers.service
              After=systemd-fsck@dev-disk-by\x2dpartlabel-varlibcontainers.service
              [Mount]
              Where=/var/lib/containers
              What=/dev/disk/by-partlabel/varlibcontainers
              Type=xfs
              Options=defaults,prjquota
              [Install]
              RequiredBy=local-fs.target
            enabled: true
            name: var-lib-containers.mount
  - Specify the root disk in the storage.disks.device field.
  - Specify the start of the partition in MiB in the storage.disks.partitions.startMiB field. If the value is too small, the installation fails.
  - Specify the size of the partition in MiB in the storage.disks.partitions.sizeMiB field. If the value is too small, the deployments after installation fail.
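After the MachineConfig has been applied and the node has rebooted, you can optionally confirm that the separate partition exists. This is an illustrative check only; <node_name> is a placeholder for your single-node OpenShift node:
$ oc debug node/<node_name> -- chroot /host lsblk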
Sharing the container directory when using GitOps ZTP
When you are using the GitOps ZTP workflow, use the following procedure to create a separate disk partition on both the seed and target clusters and to share the /var/lib/containers
directory.
Important: You must complete this procedure at installation time.
Prerequisites
-
Log in as a user with
cluster-admin
privileges. -
Install Butane.
Procedure
- Create the storage.bu file:
  variant: fcos
  version: 1.3.0
  storage:
    disks:
    - device: /dev/disk/by-id/wwn-<root_disk>
      wipe_table: false
      partitions:
      - label: var-lib-containers
        start_mib: <start_of_partition>
        size_mib: <partition_size>
    filesystems:
      - path: /var/lib/containers
        device: /dev/disk/by-partlabel/var-lib-containers
        format: xfs
        wipe_filesystem: true
        with_mount_unit: true
        mount_options:
        - defaults
        - prjquota
  - Specify the root disk in the storage.disks.device field.
  - Specify the start of the partition in MiB in the storage.disks.partitions.start_mib field. If the value is too small, the installation fails.
  - Specify the size of the partition in MiB in the storage.disks.partitions.size_mib field. If the value is too small, the deployments after installation fail.
- Convert the storage.bu file to an Ignition file:
  $ butane storage.bu
Example output
{"ignition":{"version":"3.2.0"},"storage":{"disks":[{"device":"/dev/disk/by-id/wwn-0x6b07b250ebb9d0002a33509f24af1f62","partitions":[{"label":"var-lib-containers","sizeMiB":0,"startMiB":250000}],"wipeTable":false}],"filesystems":[{"device":"/dev/disk/by-partlabel/var-lib-containers","format":"xfs","mountOptions":["defaults","prjquota"],"path":"/var/lib/containers","wipeFilesystem":true}]},"systemd":{"units":[{"contents":"# Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target","enabled":true,"name":"var-lib-containers.mount"}]}}
- Copy the output into the .spec.clusters.nodes.ignitionConfigOverride field in the SiteConfig CR:
  [...]
  spec:
    clusters:
      - nodes:
          - ignitionConfigOverride: '{"ignition":{"version":"3.2.0"},"storage":{"disks":[{"device":"/dev/disk/by-id/wwn-0x6b07b250ebb9d0002a33509f24af1f62","partitions":[{"label":"var-lib-containers","sizeMiB":0,"startMiB":250000}],"wipeTable":false}],"filesystems":[{"device":"/dev/disk/by-partlabel/var-lib-containers","format":"xfs","mountOptions":["defaults","prjquota"],"path":"/var/lib/containers","wipeFilesystem":true}]},"systemd":{"units":[{"contents":"# Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target","enabled":true,"name":"var-lib-containers.mount"}]}}'
  [...]
Verification
- During or after installation, verify on the hub cluster that the BareMetalHost object shows the annotation:
  $ oc get bmh -n my-sno-ns my-sno -ojson | jq '.metadata.annotations["bmac.agent-install.openshift.io/ignition-config-overrides"]'
Example output
"{\"ignition\":{\"version\":\"3.2.0\"},\"storage\":{\"disks\":[{\"device\":\"/dev/disk/by-id/wwn-0x6b07b250ebb9d0002a33509f24af1f62\",\"partitions\":[{\"label\":\"var-lib-containers\",\"sizeMiB\":0,\"startMiB\":250000}],\"wipeTable\":false}],\"filesystems\":[{\"device\":\"/dev/disk/by-partlabel/var-lib-containers\",\"format\":\"xfs\",\"mountOptions\":[\"defaults\",\"prjquota\"],\"path\":\"/var/lib/containers\",\"wipeFilesystem\":true}]},\"systemd\":{\"units\":[{\"contents\":\"# Generated by Butane\\n[Unit]\\nRequires=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\nAfter=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\n\\n[Mount]\\nWhere=/var/lib/containers\\nWhat=/dev/disk/by-partlabel/var-lib-containers\\nType=xfs\\nOptions=defaults,prjquota\\n\\n[Install]\\nRequiredBy=local-fs.target\",\"enabled\":true,\"name\":\"var-lib-containers.mount\"}]}}"
-
After installation, check the single-node OpenShift disk status:
# lsblk
Example output
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda        8:0    0 446.6G  0 disk
├─sda1     8:1    0     1M  0 part
├─sda2     8:2    0   127M  0 part
├─sda3     8:3    0   384M  0 part /boot
├─sda4     8:4    0 243.6G  0 part /var
│                                  /sysroot/ostree/deploy/rhcos/var
│                                  /usr
│                                  /etc
│                                  /
│                                  /sysroot
└─sda5     8:5    0 202.5G  0 part /var/lib/containers
# df -h
Example output
Filesystem      Size  Used  Avail Use% Mounted on
devtmpfs        4.0M     0   4.0M   0% /dev
tmpfs           126G   84K   126G   1% /dev/shm
tmpfs            51G   93M    51G   1% /run
/dev/sda4       244G  5.2G   239G   3% /sysroot
tmpfs           126G  4.0K   126G   1% /tmp
/dev/sda5       203G  119G    85G  59% /var/lib/containers
/dev/sda3       350M  110M   218M  34% /boot
tmpfs            26G     0    26G   0% /run/user/1000
Generating a seed image with the Lifecycle Agent
Use the Lifecycle Agent to generate the seed image with the SeedGenerator
CR. The Operator checks for required system configurations, performs any necessary system cleanup before generating the seed image, and launches the image generation. The seed image generation includes the following tasks:
- Stopping cluster operators
- Preparing the seed image configuration
- Generating and pushing the seed image to the image repository specified in the SeedGenerator CR
- Restoring cluster operators
- Expiring seed cluster certificates
- Generating new certificates for the seed cluster
- Restoring and updating the SeedGenerator CR on the seed cluster
Note: The generated seed image does not include any site-specific data.
Important: During the Developer Preview of this feature, when upgrading a cluster, any custom trusted certificates configured on the cluster will be lost. As a temporary workaround, you must use a seed image from a seed cluster that trusts the certificates to preserve them.
Prerequisites
- Deploy a single-node OpenShift cluster with a DU profile.
- Install the Lifecycle Agent on the seed cluster.
- Install the OADP Operator on the seed cluster.
- Log in as a user with
cluster-admin
privileges. - The seed cluster has the same CPU topology as the target cluster.
-
The seed cluster has the same IP version as the target cluster.
Note: Dual-stack networking is not supported in this release.
-
If the target cluster has a proxy configuration, the seed cluster must also have a proxy configuration. The proxy configuration does not have to be the same.
- The seed cluster is registered as a managed cluster.
- The Lifecycle Agent deployed on the target cluster is compatible with the version in the seed image.
- The seed cluster has a separate partition for the container images that will be shared between stateroots. For more information, see Sharing the container directory between
ostree
stateroots.
Warning: If the target cluster has multiple IPs and one of them belongs to the subnet that was used for creating the seed image, the upgrade fails if the target cluster's node IP does not belong to that subnet.
Procedure
- Detach the seed cluster from the hub cluster, either manually or, if you are using ZTP, by removing the SiteConfig CR from the kustomization.yaml file. This deletes any cluster-specific resources from the seed cluster that must not be in the seed image.
  - If you are using RHACM, manually detach the seed cluster by running the following command:
    $ oc delete managedcluster sno-worker-example
  - Wait until the ManagedCluster CR is removed. Once the CR is removed, create the proper SeedGenerator CR. The Lifecycle Agent cleans up the RHACM artifacts.
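If you prefer to wait for the removal programmatically instead of polling, a hedged sketch using oc wait, with the example cluster name from above and an arbitrary timeout, could look like this:
$ oc wait managedcluster/sno-worker-example --for=delete --timeout=10m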
- If you are using GitOps ZTP, detach your cluster by removing the seed cluster's SiteConfig CR from the kustomization.yaml:
  - Remove your seed cluster's SiteConfig CR from the kustomization.yaml:
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    generators:
    #- example-seed-sno1.yaml
    - example-target-sno2.yaml
    - example-target-sno3.yaml
  - Commit the kustomization.yaml changes in your Git repository and push the changes. The ArgoCD pipeline detects the changes and removes the managed cluster.
- Create the Secret:
  - Create the authentication file by running the following commands:
    $ MY_USER=myuserid
    $ AUTHFILE=/tmp/my-auth.json
    $ podman login --authfile ${AUTHFILE} -u ${MY_USER} quay.io/${MY_USER}
    $ base64 -w 0 ${AUTHFILE} ; echo
  - Copy the output into the seedAuth field in the Secret YAML file named seedgen in the openshift-lifecycle-agent namespace:
    apiVersion: v1
    kind: Secret
    metadata:
      name: seedgen
      namespace: openshift-lifecycle-agent
    type: Opaque
    data:
      seedAuth: <encoded_authfile>
    - The Secret resource must have the name: seedgen and namespace: openshift-lifecycle-agent fields.
    - Specify a base64-encoded authfile for write access to the registry for pushing the generated seed images in the data.seedAuth field.
  - Apply the Secret:
    $ oc apply -f secretseedgenerator.yaml
-
- Create the SeedGenerator CR:
  apiVersion: lca.openshift.io/v1alpha1
  kind: SeedGenerator
  metadata:
    name: seedimage
  spec:
    seedImage: <seed_container_image>
  - The SeedGenerator CR must be named seedimage.
  - Specify the container image URL in the spec.seedImage field, for example, quay.io/example/seed-container-image:<tag>. It is recommended to use the <seed_cluster_name>:<ocp_version> format.
-
Generate the seed image by running the following command:
$ oc apply -f seedgenerator.yaml
Important: The cluster reboots and loses API capabilities while the Lifecycle Agent generates the seed image. Applying the SeedGenerator
CR stops the kubelet
and the CRI-O operations, then it starts the image generation.
Once the image generation is complete, the cluster can be reattached to the hub cluster, and you can access it through the API.
If you want to generate further seed images, you must provision a new seed cluster with the version you want to generate a seed image from.
Verification
-
Once the cluster recovers and is available, you can check the status of the
SeedGenerator
CR:
$ oc get seedgenerator -A -oyaml
Example output for completed seed generation
status: conditions: - lastTransitionTime: 2024-02-13T21:24:26Z message: Seed Generation completed observedGeneration: 1 reason: Completed status: "False" type: SeedGenInProgress - lastTransitionTime: 2024-02-13T21:24:26Z message: Seed Generation completed observedGeneration: 1 reason: Completed status: "True" type: SeedGenCompleted observedGeneration: 1
-
Verify that the single-node OpenShift cluster is running and is attached to the RHACM hub cluster:
$ oc get managedclusters sno-worker-example
Example output
NAME                 HUB ACCEPTED   MANAGED CLUSTER URLS                                 JOINED   AVAILABLE   AGE
sno-worker-example   true           https://api.sno-worker-example.example.redhat.com   True     True        21h
- The cluster is attached if you see that the value is True for both JOINED and AVAILABLE.
Note: The cluster requires time to recover after the kubelet operation restarts.
Preparing the single-node OpenShift cluster for the image-based upgrade
When you deploy the Lifecycle Agent on a cluster, an ImageBasedUpgrade
CR is automatically created. You edit this CR to specify the image repository of the seed image and to move through the different stages on the target cluster.
Prerequisites
- Install a compatible version of the Lifecycle Agent on the target cluster.
- Generate a seed image from a compatible seed cluster.
- Install the OADP Operator on the target cluster. For more information, see About installing the OADP Operator.
- Create an S3-compatible object storage solution and a ready-to-use bucket with proper credentials configured. For more information, see AWS S3 compatible backup storage providers.
- Create a separate partition on the target cluster for the container images that is shared between stateroots. For more information, see Sharing the container directory between
ostree
stateroots.
Warning: If the target cluster has multiple IPs and one of them belongs to the subnet that was used for creating the seed image, the upgrade fails if the target cluster's node IP does not belong to that subnet.
Procedure
This example procedure demonstrates how to back up and upgrade a cluster with applications that use persistent volumes.
Note: The target cluster does not need to be detached from the hub cluster.
-
Create your OADP Backup and Restore CRs. For more information, see Creating a Backup CR and Creating a Restore CR.
  - To back up specific CRs, use the lca.openshift.io/apply-label annotation in your Backup CRs. Based on the annotation, the Lifecycle Agent applies the lca.openshift.io/backup: <backup_name> label and adds the labelSelector.matchLabels.lca.openshift.io/backup: <backup_name> label selector to the specified resources when creating the Backup CRs.
    apiVersion: velero.io/v1
    kind: Backup
    metadata:
      name: backup-acm-klusterlet
      annotations:
        lca.openshift.io/apply-label: "apps/v1/deployments/open-cluster-management-agent/klusterlet,v1/secrets/open-cluster-management-agent/bootstrap-hub-kubeconfig,rbac.authorization.k8s.io/v1/clusterroles/klusterlet,v1/serviceaccounts/open-cluster-management-agent/klusterlet,rbac.authorization.k8s.io/v1/clusterroles/open-cluster-management:klusterlet-admin-aggregate-clusterrole,rbac.authorization.k8s.io/v1/clusterrolebindings/klusterlet,operator.open-cluster-management.io/v1/klusterlets/klusterlet,apiextensions.k8s.io/v1/customresourcedefinitions/klusterlets.operator.open-cluster-management.io,v1/secrets/open-cluster-management-agent/open-cluster-management-image-pull-credentials"
      labels:
        velero.io/storage-location: default
      namespace: openshift-adp
    spec:
      includedNamespaces:
      - open-cluster-management-agent
      includedClusterScopedResources:
      - klusterlets.operator.open-cluster-management.io
      - clusterclaims.cluster.open-cluster-management.io
      - clusterroles.rbac.authorization.k8s.io
      - clusterrolebindings.rbac.authorization.k8s.io
      includedNamespaceScopedResources:
      - deployments
      - serviceaccounts
      - secrets
    - The value of the lca.openshift.io/apply-label annotation must be a list of comma-separated objects in the group/version/resource/name format for cluster-scoped resources, or in the group/version/resource/namespace/name format for namespace-scoped resources. It must be attached to the related Backup CR.
    Note: Depending on the RHACM configuration, the v1/secrets/open-cluster-management-agent/open-cluster-management-image-pull-credentials object must be backed up. Check if your MultiClusterHub CR has the spec.imagePullSecret field defined and if the secret exists in the open-cluster-management-agent namespace in your hub cluster. If the spec.imagePullSecret field does not exist, you can remove the v1/secrets/open-cluster-management-agent/open-cluster-management-image-pull-credentials object from the lca.openshift.io/apply-label annotation.
    Important: To use the lca.openshift.io/apply-label annotation for backing up specific resources, the resources listed in the annotation must also be included in the spec section. If the lca.openshift.io/apply-label annotation is used in the Backup CR, only the resources listed in the annotation are backed up, even if other resource types are specified in the spec section.
-
Define the restore order for the OADP Operator in the Restore CR by using the lca.openshift.io/apply-wave annotation:
  apiVersion: velero.io/v1
  kind: Restore
  metadata:
    name: restore-acm-klusterlet
    namespace: openshift-adp
    labels:
      velero.io/storage-location: default
    annotations:
      lca.openshift.io/apply-wave: "1"
  spec:
    backupName: acm-klusterlet
  ---
  apiVersion: velero.io/v1
  kind: Restore
  metadata:
    name: restore-example-app
    namespace: openshift-adp
    labels:
      velero.io/storage-location: default
    annotations:
      lca.openshift.io/apply-wave: "2"
  spec:
    backupName: backup-example-app
  Note: If you do not define the lca.openshift.io/apply-wave annotation in the Backup and Restore
CRs, they will be applied together. -
Create a kustomization.yaml file that appends the information to a new ConfigMap:
  configMapGenerator:
  - name: oadp-cm-example
    namespace: openshift-adp
    files:
    - backup-acm-klusterlet.yaml
    - backup-example-app.yaml
    - restore-acm-klusterlet.yaml
    - restore-example-app.yaml
  generatorOptions:
    disableNameSuffixHash: true
  - The generatorOptions.disableNameSuffixHash option disables the hash generation at the end of the ConfigMap file name when set to true. This option allows the ConfigMap file to be overwritten when a new one is generated with the same name.
-
Create the ConfigMap:
  $ kustomize build ./ -o oadp-cm-example.yaml
  Example output
  kind: ConfigMap
  metadata:
    name: oadp-cm-example
    namespace: openshift-adp
  [...]
- Apply the ConfigMap:
  $ oc apply -f oadp-cm-example.yaml
-
-
(Optional) To keep your custom catalog sources after the upgrade, add them to the spec.extraManifests section in the ImageBasedUpgrade CR, for example, as sketched below. For more information, see Catalog source.
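The following is a minimal, hypothetical sketch of how such a ConfigMap could be created and then referenced from spec.extraManifests. The file name, ConfigMap name, and CatalogSource contents are illustrative assumptions rather than values defined by this procedure, and they assume that the ConfigMap data entries hold plain manifests:
$ cat << EOF > custom-catalogsource.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: custom-catalog
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: <custom_catalog_index_image>
  displayName: Custom Operator Catalog
EOF
$ oc create configmap sno-extra-manifests -n openshift-lifecycle-agent --from-file=custom-catalogsource.yaml
The ConfigMap is then listed under spec.extraManifests in the ImageBasedUpgrade CR, as in the commented example in this document.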
-
Edit the ImageBasedUpgrade CR:
  apiVersion: lca.openshift.io/v1alpha1
  kind: ImageBasedUpgrade
  metadata:
    name: example-upgrade
  spec:
    stage: Idle
    seedImageRef:
      version: 4.15.2
      image: <seed_container_image>
      pullSecretRef: <seed_pull_secret>
    additionalImages:
      name: ""
      namespace: ""
    autoRollbackOnFailure: {}
    # disabledForPostRebootConfig: "true"
    # disabledForUpgradeCompletion: "true"
    # disabledInitMonitor: "true"
    # initMonitorTimeoutSeconds: 1800
    # extraManifests:
    # - name: sno-extra-manifests
    #   namespace: openshift-lifecycle-agent
    oadpContent:
    - name: oadp-cm-example
      namespace: openshift-adp
  - Define the desired stage for the ImageBasedUpgrade CR in the spec.stage field. The value can be Idle, Prep, Upgrade, or Rollback.
  - Define the target platform version, the seed image to be used, and the secret required to access the image in the spec.seedImageRef section.
  - Configure the automatic rollback in the spec.autoRollbackOnFailure section. By default, automatic rollback on failure is enabled throughout the upgrade.
  - Optional. If set to true, the autoRollbackOnFailure.disabledForPostRebootConfig field disables automatic rollback when the reconfiguration of the cluster fails upon the first reboot.
  - Optional. If set to true, the autoRollbackOnFailure.disabledForUpgradeCompletion field disables automatic rollback after the Lifecycle Agent reports a failed upgrade upon completion.
  - Optional. If set to true, the autoRollbackOnFailure.disabledInitMonitor field disables automatic rollback when the upgrade does not complete after the reboot within the time frame specified in the initMonitorTimeoutSeconds field.
  - Optional. The autoRollbackOnFailure.initMonitorTimeoutSeconds field specifies the time frame in seconds. If not defined or set to 0, the default value of 1800 seconds (30 minutes) is used.
  - Optional. In the spec.extraManifests section, specify the list of ConfigMap resources that contain the additional extra manifests that you want to apply to the target cluster. You can also add your custom catalog sources that you want to retain after the upgrade.
  - Specify the list of ConfigMap resources that contain the OADP Backup and Restore CRs in the spec.oadpContent section.
-
Change the value of the stage field to Prep in the ImageBasedUpgrade CR:
$ oc patch imagebasedupgrades.lca.openshift.io example-upgrade -p='{"spec": {"stage": "Prep"}}' --type=merge -n openshift-lifecycle-agent
-
The Lifecycle Agent checks for the health of the cluster, creates a new
ostree
stateroot, and pulls the seed image to the target cluster. Then, the Operator precaches all the required images on the target cluster.
Verification
-
Check the status of the
ImageBasedUpgrade
CR:
$ oc get ibu -A -oyaml
Example output
status: conditions: - lastTransitionTime: 2024-01-01T09:00:00Z message: In progress observedGeneration: 2 reason: InProgress status: "False" type: Idle - lastTransitionTime: 2024-01-01T09:00:00Z message: "Prep completed: total: 121 (pulled: 1, skipped: 120, failed: 0)" observedGeneration: 2 reason: Completed status: "True" type: PrepCompleted - lastTransitionTime: 2024-01-01T09:00:00Z message: Prep completed observedGeneration: 2 reason: Completed status: "False" type: PrepInProgress observedGeneration: 2
Upgrading the single-node OpenShift cluster with Lifecycle Agent
Once you have generated the seed image and completed the Prep
stage, you can upgrade the target cluster. During the upgrade process, the OADP Operator creates a backup of the artifacts specified in the OADP CRs, then the Lifecycle Agent upgrades the cluster.
If the upgrade fails or stops, an automatic rollback is initiated. If you have an issue after the upgrade, you can initiate a manual rollback. For more information about rollbacks, see the (Optional) Initiating rollback of the single-node OpenShift cluster after an image-based upgrade or (Optional) Initiating rollback with TALM sections.
Important: During the Developer Preview of this feature, when upgrading a cluster, any custom trusted certificates configured on the cluster will be lost. As a temporary workaround, to preserve these certificates, you must use a seed image from a seed cluster that trusts the certificates.
Prerequisites
- Complete the
Prep
stage.
Procedure
-
When you are ready, move to the Upgrade stage by changing the value of the stage field to Upgrade in the ImageBasedUpgrade CR:
$ oc patch imagebasedupgrades.lca.openshift.io example-upgrade -p='{"spec": {"stage": "Upgrade"}}' --type=merge
-
Check the status of the
ImageBasedUpgrade
CR:
$ oc get ibu -A -oyaml
Example output
status: conditions: - lastTransitionTime: 2024-01-01T09:00:00Z message: In progress observedGeneration: 2 reason: InProgress status: "False" type: Idle - lastTransitionTime: 2024-01-01T09:00:00Z message: "Prep completed: total: 121 (pulled: 1, skipped: 120, failed: 0)" observedGeneration: 2 reason: Completed status: "True" type: PrepCompleted - lastTransitionTime: 2024-01-01T09:00:00Z message: Prep completed observedGeneration: 2 reason: Completed status: "False" type: PrepInProgress - lastTransitionTime: 2024-01-01T09:00:00Z message: Upgrade completed observedGeneration: 3 reason: Completed status: "True" type: UpgradeCompleted
-
The OADP Operator creates a backup of the data specified in the OADP
Backup
and Restore
CRs. -
The target cluster reboots.
-
Monitor the status of the CR:
$ oc get ibu -A -oyaml
-
The cluster reboots.
-
Once you are satisfied with the upgrade, commit to the changes by changing the value of the stage field to Idle in the ImageBasedUpgrade CR:
$ oc patch imagebasedupgrades.lca.openshift.io example-upgrade -p='{"spec": {"stage": "Idle"}}' --type=merge
Important: You cannot roll back the changes once you move to the
Idle
stage after an upgrade. -
The Lifecycle Agent deletes all resources created during the upgrade process.
Verification
-
Check the status of the
ImageBasedUpgrade
CR:
$ oc get ibu -A -oyaml
Example output
status: conditions: - lastTransitionTime: 2024-01-01T09:00:00Z message: In progress observedGeneration: 2 reason: InProgress status: "False" type: Idle - lastTransitionTime: 2024-01-01T09:00:00Z message: "Prep completed: total: 121 (pulled: 1, skipped: 120, failed: 0)" observedGeneration: 2 reason: Completed status: "True" type: PrepCompleted - lastTransitionTime: 2024-01-01T09:00:00Z message: Prep completed observedGeneration: 2 reason: Completed status: "False" type: PrepInProgress - lastTransitionTime: 2024-01-01T09:00:00Z message: Upgrade completed observedGeneration: 3 reason: Completed status: "True" type: UpgradeCompleted
-
Check the status of the cluster restoration:
$ oc get restores -n openshift-adp -o custom-columns=NAME:.metadata.name,Status:.status.phase,Reason:.status.failureReason
Example output
NAME             Status      Reason
acm-klusterlet   Completed   <none>
apache-app       Completed   <none>
localvolume      Completed   <none>
(Optional) Initiating rollback of the single-node OpenShift cluster after an image-based upgrade
You can manually roll back the changes if you encounter unresolvable issues after an upgrade. By default, an automatic rollback is initiated on the following conditions:
- If the reconfiguration of the cluster fails upon the first reboot.
- If the Lifecycle Agent reports a failed upgrade.
- If the upgrade does not complete within the time frame specified in the
initMonitorTimeoutSeconds
field after rebooting.
You can disable the automatic rollback configuration in the ImageBasedUpgrade
CR at the Prep
stage:
Example ImageBasedUpgrade CR
apiVersion: lca.openshift.io/v1alpha1
kind: ImageBasedUpgrade
metadata:
  name: example-upgrade
spec:
  stage: Idle
  seedImageRef:
    version: 4.15.2
    image: <seed_container_image>
  additionalImages:
    name: ""
    namespace: ""
  autoRollbackOnFailure: {}
  # disabledForPostRebootConfig: "true"
  # disabledForUpgradeCompletion: "true"
  # disabledInitMonitor: "true"
  # initMonitorTimeoutSeconds: 1800
[...]
- Configure the automatic rollback in the spec.autoRollbackOnFailure section. By default, automatic rollback on failure is enabled throughout the upgrade.
- Optional. If set to true, the autoRollbackOnFailure.disabledForPostRebootConfig field disables automatic rollback when the reconfiguration of the cluster fails upon the first reboot.
- Optional. If set to true, the autoRollbackOnFailure.disabledForUpgradeCompletion field disables automatic rollback after the Lifecycle Agent reports a failed upgrade upon completion.
- Optional. If set to true, the autoRollbackOnFailure.disabledInitMonitor field disables automatic rollback when the upgrade does not complete after the reboot within the time frame specified in the initMonitorTimeoutSeconds field.
- Optional. The autoRollbackOnFailure.initMonitorTimeoutSeconds field specifies the time frame in seconds. If not defined or set to 0, the default value of 1800 seconds (30 minutes) is used.
Prerequisites
- Log in to the hub cluster as a user with
cluster-admin
privileges.
Procedure
-
Move to the Rollback stage by changing the value of the stage field to Rollback in the ImageBasedUpgrade CR:
$ oc patch imagebasedupgrades.lca.openshift.io example-upgrade -p='{"spec": {"stage": "Rollback"}}' --type=merge
-
The Lifecycle Agent reboots the cluster with the previously installed version of OpenShift Container Platform and restores the applications.
-
Commit to the rollback by changing the value of the stage field to Idle in the ImageBasedUpgrade CR:
$ oc patch imagebasedupgrades.lca.openshift.io example-upgrade -p='{"spec": {"stage": "Idle"}}' --type=merge -n openshift-lifecycle-agent
Warning: If you move to the
Idle
stage after a rollback, the Lifecycle Agent cleans up resources that can be used to troubleshoot a failed upgrade.
Upgrading the single-node OpenShift cluster through GitOps ZTP
You can upgrade your managed single-node OpenShift cluster with the image-based upgrade through GitOps ZTP.
Important: During the Developer Preview of this feature, when upgrading a cluster, any custom trusted certificates configured on the cluster will be lost. As a temporary workaround, to preserve these certificates, you must use a seed image from a seed cluster that trusts the certificates.
Prerequisites
- Install RHACM 2.9.2 or later.
- Install TALM.
- Update GitOps ZTP to the latest version.
- Provision one or more managed clusters with GitOps ZTP.
- Log in as a user with
cluster-admin
privileges. - You generated a seed image from a compatible seed cluster.
- Install the OADP Operator on the target cluster. For more information, see About installing the OADP Operator.
- Create an S3-compatible object storage solution and a ready-to-use bucket with proper credentials configured. For more information, see AWS S3 compatible backup storage providers.
- Create a separate partition on the target cluster for the container images that is shared between stateroots. For more information, see Sharing the container directory between
ostree
stateroots.
Procedure
-
Create a policy for the OADP ConfigMap, named oadp-cm-common-policies. For more information about how to create the ConfigMap, follow the first step in Preparing the single-node OpenShift cluster for the image-based upgrade.
Important: Depending on the RHACM configuration, the v1/secrets/open-cluster-management-agent/open-cluster-management-image-pull-credentials object must be backed up. Check if your MultiClusterHub CR has the spec.imagePullSecret field defined and if the secret exists in the open-cluster-management-agent namespace in your hub cluster. If the spec.imagePullSecret field does not exist, you can remove the v1/secrets/open-cluster-management-agent/open-cluster-management-image-pull-credentials object from the lca.openshift.io/apply-label annotation.
(Optional) Create a policy for the ConfigMap of your user-specific extra manifests that are not part of the seed image. The Lifecycle Agent does not automatically extract these extra manifests from the seed cluster, so you can add a ConfigMap resource of your user-specific extra manifests in the spec.extraManifests field in the ImageBasedUpgrade CR.
(Optional) To keep your custom catalog sources after the upgrade, add them to the spec.extraManifests section in the ImageBasedUpgrade CR. For more information, see Catalog source.
-
Create a
PolicyGenTemplate
CR that contains policies for thePrep
andUpgrade
stages:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: group-ibu
  namespace: ztp-group
spec:
  bindingRules:
    group-du-sno: ""
  mcp: master
  evaluationInterval:
    compliant: 10s
    noncompliant: 10s
  sourceFiles:
    - fileName: ImageBasedUpgrade.yaml
      policyName: prep-policy
      spec:
        stage: Prep
        seedImageRef:
          version: 4.15.0
          image: quay.io/user/lca-seed:4.15.0
          pullSecretRef:
            name: <seed_pull_secret>
        oadpContent:
        - name: oadp-cm-example
          namespace: openshift-adp
        # extraManifests:
        # - name: sno-extra-manifests
        #   namespace: openshift-lifecycle-agent
      status:
        conditions:
          - reason: Completed
            status: "True"
            type: PrepCompleted
    - fileName: ImageBasedUpgrade.yaml
      policyName: upgrade-policy
      spec:
        stage: Upgrade
      status:
        conditions:
          - reason: Completed
            status: "True"
            type: UpgradeCompleted
- The spec.evaluationInterval field defines the policy evaluation interval for compliant and non-compliant policies. Set them to 10s to ensure that the policy status accurately reflects the current upgrade status.
- Define the seed image, OpenShift Container Platform version, and pull secret for the upgrade in the spec.seedImageRef section at the Prep stage.
- Define the OADP ConfigMap resources required for backup and restore in the spec.oadpContent section at the Prep stage.
- Optional. Define the ConfigMap resource for your user-specific extra manifests in the spec.extraManifests section at the Prep stage.
-
Create a
PolicyGenTemplate
CR for the default set of extra manifests:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: sno-ibu
spec:
  bindingRules:
    sites: example-sno
    du-profile: 4.15.0
  mcp: master
  sourceFiles:
    - fileName: SriovNetwork.yaml
      policyName: config-policy
      metadata:
        name: sriov-nw-du-fh
        labels:
          lca.openshift.io/target-ocp-version: "4.15.0"
      spec:
        resourceName: du_fh
        vlan: 140
    - fileName: SriovNetworkNodePolicy.yaml
      policyName: config-policy
      metadata:
        name: sriov-nnp-du-fh
        labels:
          lca.openshift.io/target-ocp-version: "4.15.0"
      spec:
        deviceType: netdevice
        isRdma: false
        nicSelector:
          pfNames:
            - ens5f0
        numVfs: 8
        priority: 10
        resourceName: du_fh
    - fileName: SriovNetwork.yaml
      policyName: config-policy
      metadata:
        name: sriov-nw-du-mh
        labels:
          lca.openshift.io/target-ocp-version: "4.15.0"
      spec:
        resourceName: du_mh
        vlan: 150
    - fileName: SriovNetworkNodePolicy.yaml
      policyName: config-policy
      metadata:
        name: sriov-nnp-du-mh
        labels:
          lca.openshift.io/target-ocp-version: "4.15.0"
      spec:
        deviceType: vfio-pci
        isRdma: false
        nicSelector:
          pfNames:
            - ens7f0
        numVfs: 8
        priority: 10
        resourceName: du_mh
Important: Ensure that the lca.openshift.io/target-ocp-version label matches the target OpenShift Container Platform version that is specified in the seedImageRef.version field of the ImageBasedUpgrade CR. The Lifecycle Agent only applies the CRs that match the specified version.
Commit, and push the created CRs to the GitOps ZTP Git repository.
- Verify that the policies are created:
$ oc get policies -n spoke1 | grep -E "group-ibu"
Example output
ztp-group.group-ibu-prep-policy inform NonCompliant 31h ztp-group.group-ibu-upgrade-policy inform NonCompliant 31h
-
To reflect the target platform version, update the
du-profile
or the corresponding policy-binding label in theSiteConfig
CR:
apiVersion: ran.openshift.io/v1
kind: SiteConfig
[...]
spec:
  [...]
  clusterLabels:
    du-profile: "4.15.2"
Important: Updating the labels to the target platform version unbinds the existing set of policies.
-
Commit and push the updated
SiteConfig
CR to the GitOps ZTP Git repository. -
When you are ready to move to the
Prep
stage, create theClusterGroupUpgrade
CR with thePrep
and OADPConfigMap
policies:
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu-ibu-prep
  namespace: default
spec:
  clusters:
  - spoke1
  enable: true
  managedPolicies:
  - oadp-cm-common-policies
  - group-ibu-prep-policy
  # - user-spec-extra-manifests
  remediationStrategy:
    canaries:
      - spoke1
    maxConcurrency: 1
    timeout: 240
-
Apply the
Prep
policy:$ oc apply -f cgu-ibu-prep.yml
-
Monitor the status and wait for the
cgu-ibu-prep
ClusterGroupUpgrade
to reportCompleted
.$ oc get cgu -n default
Example output
NAME AGE STATE DETAILS cgu-ibu-prep 31h Completed All clusters are compliant with all the managed policies
-
-
When you are ready to move to the
Upgrade
stage, create theClusterGroupUpgrade
CR that references theUpgrade
policy:
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu-ibu-upgrade
  namespace: default
spec:
  clusters:
  - spoke1
  enable: true
  managedPolicies:
  - group-ibu-upgrade-policy
  remediationStrategy:
    canaries:
      - spoke1
    maxConcurrency: 1
    timeout: 240
-
Apply the
Upgrade
policy:$ oc apply -f cgu-ibu-upgrade.yml
-
Monitor the status and wait for the
cgu-ibu-upgrade
ClusterGroupUpgrade
to reportCompleted
.$ oc get cgu -n default
Example output
NAME AGE STATE DETAILS cgu-ibu-prep 31h Completed All clusters are compliant with all the managed policies cgu-ibu-upgrade 31h Completed All clusters are compliant with all the managed policies
-
-
When you are satisfied with the changes and ready, create the
PolicyGenTemplate
to finalize the upgrade:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: group-ibu
  namespace: "ztp-group"
spec:
  bindingRules:
    group-du-sno: ""
  mcp: "master"
  evaluationInterval:
    compliant: 10s
    noncompliant: 10s
  sourceFiles:
    - fileName: ImageBasedUpgrade.yaml
      policyName: "finalize-policy"
      spec:
        stage: Idle
      status:
        conditions:
          - status: "True"
            type: Idle
-
Create a
ClusterGroupUpgrade
CR that references the policy that finalizes the upgrade:
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu-ibu-finalize
  namespace: default
spec:
  clusters:
  - spoke1
  enable: true
  managedPolicies:
  - group-ibu-finalize-policy
  remediationStrategy:
    canaries:
      - spoke1
    maxConcurrency: 1
    timeout: 240
-
Apply the policy:
$ oc apply -f cgu-ibu-finalize.yml
-
Monitor the status and wait for the
cgu-ibu-upgrade
ClusterGroupUpgrade
to reportCompleted
.$ oc get cgu -n default
Example output
NAME AGE STATE DETAILS cgu-ibu-finalize 30h Completed All clusters are compliant with all the managed policies cgu-ibu-prep 31h Completed All clusters are compliant with all the managed policies cgu-ibu-upgrade 31h Completed All clusters are compliant with all the managed policies
-
(Optional) Initiating rollback with TALM
By default, an automatic rollback is initiated under certain conditions. For more information about the automatic rollback configuration, see (Optional) Initiating rollback of the single-node OpenShift cluster after an image-based upgrade.
If you encounter an issue after the upgrade, you can start a manual rollback.
-
Update the
du-profile
or the corresponding policy-binding label with the original platform version in theSiteConfig
CR:
apiVersion: ran.openshift.io/v1
kind: SiteConfig
[...]
spec:
  [...]
  clusterLabels:
    du-profile: "4.14.7"
-
When you are ready to move to the
Rollback
stage, create aPolicyGenTemplate
CR for theRollback
policies:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: group-ibu
  namespace: "ztp-group"
spec:
  bindingRules:
    group-du-sno: ""
  mcp: "master"
  evaluationInterval:
    compliant: 10s
    noncompliant: 10s
  sourceFiles:
    - fileName: ImageBasedUpgrade.yaml
      policyName: "rollback-policy"
      spec:
        stage: Rollback
      status:
        conditions:
          - message: Rollback completed
            reason: Completed
            status: "True"
            type: RollbackCompleted
-
Create a
ClusterGroupUpgrade
CR that references theRollback
policies:
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu-ibu-rollback
  namespace: default
spec:
  clusters:
  - spoke1
  enable: true
  managedPolicies:
  - group-ibu-rollback-policy
  remediationStrategy:
    canaries:
      - spoke1
    maxConcurrency: 1
    timeout: 240
-
Apply the
Rollback
policy:
$ oc apply -f cgu-ibu-rollback.yml
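As with the Prep and Upgrade stages, you can monitor the rollback ClusterGroupUpgrade from the hub cluster until it reports Completed; this mirrors the earlier monitoring steps in this document:
$ oc get cgu -n default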
-
When you are satisfied with the changes and ready to finalize the rollback, create the
PolicyGenTemplate
CR:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: group-ibu
  namespace: "ztp-group"
spec:
  bindingRules:
    group-du-sno: ""
  mcp: "master"
  evaluationInterval:
    compliant: 10s
    noncompliant: 10s
  sourceFiles:
    - fileName: ImageBasedUpgrade.yaml
      policyName: "finalize-policy"
      spec:
        stage: Idle
      status:
        conditions:
          - status: "True"
            type: Idle
-
Create a
ClusterGroupUpgrade
CR that references the policy that finalizes the upgrade:
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu-ibu-finalize
  namespace: default
spec:
  clusters:
  - spoke1
  enable: true
  managedPolicies:
  - group-ibu-finalize-policy
  remediationStrategy:
    canaries:
      - spoke1
    maxConcurrency: 1
    timeout: 240
-
Apply the policy:
$ oc apply -f cgu-ibu-finalize.yml