SAP Data Intelligence 3 on OpenShift Container Platform 4

In general, the installation of SAP Data Intelligence (SDI) follows these steps:

  • Install Red Hat OpenShift Container Platform
  • Configure the prerequisites for SAP Data Intelligence Foundation
  • Install SDI Observer
  • Install SAP Data Intelligence Foundation on OpenShift Container Platform

If you are interested in the installation of SAP Data Hub or SAP Vora, please refer to the corresponding installation guides instead.

1. OpenShift Container Platform validation version matrix

The following version combinations of SDI 3.X, OCP, RHEL or RHCOS have been validated for production environments:

SAP Data Intelligence | OpenShift Container Platform | Operating System | Infrastructure and (Storage) | Confirmed & Supported by SAP
3.0 | 4.2 | RHCOS (nodes), RHEL 8.1+ or Fedora (Management host) | VMware vSphere (OCS 4.2) | supported
3.0 Patch 3 | 4.2, 4.4 | RHCOS (nodes), RHEL 8.2+ or Fedora (Management host) | VMware vSphere (OCS 4) | supported
3.0 Patch 4 | 4.4 | RHCOS (nodes), RHEL 8.2+ or Fedora (Management host) | VMware vSphere (OCS 4, NetApp Trident 20.04) | supported
3.0 Patch 8 | 4.6 | RHCOS (nodes), RHEL 8.2+ or Fedora (Management host) | KVM/libvirt (OCS 4) | supported
3.1 | 4.4 | RHCOS (nodes), RHEL 8.3+ or Fedora (Management host) | VMware vSphere (OCS 4) | not supported¹
3.1 | 4.6 | RHCOS (nodes), RHEL 8.3+ or Fedora (Management host) | VMware vSphere (OCS 4², NetApp Trident 20.10 + StorageGRID), Bare metal (OCS 4²) | supported

The referenced OCP release is no longer supported by Red Hat!
¹ 3.1 on OCP 4.4 is supported by SAP only for the purpose of an upgrade to OCP 4.6
² For the full functionality (including SDI backup&restore), OCS 4.6.4 or newer is required.
Validated on two different hardware configurations:

  • (Dev/PoC level) Lenovo 4 bare metal hosts setup composed of:

    • 3 schedulable control plane nodes running both OCS and SDI (Lenovo ThinkSystem SR530)
    • 1 compute node running SDI (Lenovo ThinkSystem SR530)

    Note that this particular setup cannot be fully supported by Red Hat because running OCS in compact mode is still a Technology Preview as of 4.6.

  • (Production level) Dell Technologies bare metal cluster composed of:

    • 1 CSAH node (Dell EMC PowerEdge R640s)
    • 3 control plane nodes (Dell EMC PowerEdge R640s)
    • 3 dedicated OCS nodes (Dell EMC PowerEdge R640s)
    • 3 dedicated SDI nodes (Dell EMC PowerEdge R740xd)

    CSI-supported external Dell EMC storage options and cluster sizing options are available.
    CSAH stands for Cluster System Admin Host - an equivalent of the Management host.

Please refer to the compatibility matrix for version combinations that are considered working.

SAP Note #2871970 lists more details.

2. Requirements

2.1. Hardware/VM and OS Requirements

2.1.1. OpenShift Cluster

Make sure to consult the following official cluster requirements:

2.1.1.1. Node Kinds

There are 4 kinds of cluster nodes plus the Management host:

  • Bootstrap Node - A temporary bootstrap node needed for the OCP deployment. The node is either destroyed automatically by the installer (when using installer-provisioned infrastructure -- aka IPI) or can be deleted manually by the administrator. Alternatively, it can be re-used as a worker node. Please refer to the Installation process (4.6) / (4.4) for more information.
  • Master Nodes (4.6) / (4.4) - The control plane manages the OpenShift Container Platform cluster. The control plane can be made schedulable to enable SDI workload there as well.
  • Compute Nodes (4.6) / (4.4) - Run the actual workload (e.g. SDI pods). They are optional on a three-node cluster (where the master nodes are schedulable).
  • OCS Nodes (4.6) / (4.4) - Run OpenShift Container Storage (aka OCS) -- currently supported only on AWS and VMware vSphere. The nodes can be divided into starting (running both OSDs and monitors) and additional nodes (running only OSDs). Needed only when OCS shall be used as the backing storage provider.
  • Management host (aka administrator's workstation or Jump host) - The Management host is used among other things for:

    • accessing the OCP cluster via a configured command line client (oc or kubectl)
    • configuring OCP cluster
    • running Software Lifecycle Container Bridge (SLC Bridge)

The hardware/software requirements for the Management host are:

  • OS: Red Hat Enterprise Linux 8.1+, RHEL 7.6+ or Fedora 30+
  • Disk space: 20 GiB for /

2.1.1.2. A note on disconnected and air-gapped environments

By the term "disconnected host", it is referred to a host having no access to internet.
By the term "disconnected cluster", it is referred to a cluster where each host is disconnected.
A disconnected cluster can be managed from a Management host that is either connected (having access to the internet) or disconnected.
The latter scenario (both cluster and management host being disconnected) will be referred to by the term "air-gapped".
Unless stated otherwise, whatever applies to a disconnected host, cluster or environment, applies also to the "air-gapped".

2.1.1.3. Minimum Hardware Requirements

The table below lists the minimum requirements and the minimum number of instances for each node type for the latest validated SDI and OCP 4.X releases. This is sufficient for PoC (Proof of Concept) environments.

Type | Count | Operating System | vCPU | RAM (GB) | Storage (GB) | AWS Instance Type
Bootstrap | 1 | RHCOS | 4 | 16 | 120 | m4.xlarge
Master | 3 | RHCOS | 4 | 16 | 120 | m4.xlarge
Compute | 3+ | RHEL 7.8 or 7.9 or RHCOS | 8 | 32 | 120 | m4.2xlarge

On a three-node cluster, it would look like this:

Type | Count | Operating System | vCPU | RAM (GB) | Storage (GB) | AWS Instance Type
Bootstrap | 1 | RHCOS | 4 | 16 | 120 | m4.xlarge
Master/Compute | 3 | RHCOS | 10 | 40 | 120 | m4.xlarge

If using OCS 4.6 in internal mode, at least 3 additional (starting) nodes are recommended. Alternatively, the Compute nodes outlined above can also run OCS pods. In that case, the hardware specifications need to be extended accordingly. The following table lists the minimum requirements for each additional node:

Type | Count | Operating System | vCPU | RAM (GB) | Storage (GB) | AWS Instance Type
OCS starting (OSD+MON) | 3 | RHCOS | 10 | 24 | 120 + 2048 | m5.4xlarge

2.1.1.4. Minimum Production Hardware Requirements

The minimum requirements for production systems for the latest validated SDI and OCP 4 releases are the following:

Type | Count | Operating System | vCPU | RAM (GB) | Storage (GB) | AWS Instance Type
Bootstrap | 1 | RHCOS | 4 | 16 | 120 | m4.xlarge
Master | 3+ | RHCOS | 8 | 16 | 120 | c5.xlarge
Compute | 3+ | RHEL 7.8 or 7.9 or RHCOS | 16 | 64 | 120 | m4.4xlarge

On a three-node cluster, it would look like this:

Type | Count | Operating System | vCPU | RAM (GB) | Storage (GB) | AWS Instance Type
Bootstrap | 1 | RHCOS | 4 | 16 | 120 | m4.xlarge
Master/Compute | 3 | RHCOS | 22 | 72 | 120 | c5.9xlarge

If using OCS 4 in internal mode, at least 3 additional (starting) nodes are recommended. Alternatively, the Compute nodes outlined above can also run OCS pods. In that case, the hardware specifications need to be extended accordingly. The following table lists the minimum requirements for each additional node:

Type | Count | Operating System | vCPU | RAM (GB) | Storage (GB) | AWS Instance Type
OCS starting (OSD+MON) | 3 | RHCOS | 20 | 49 | 120 + 6×2048 | c5a.8xlarge

Please refer to OCS Platform Requirements (4.6) / (4.4) and OCS Sizing and scaling recommendations (4.4) for more information.
Running in a compact mode (on control plane) remains a Technology Preview as of OCS 4.6.
1 physical core provides 2 vCPUs when hyper-threading is enabled. 1 physical core provides 1 vCPU when hyper-threading is not enabled.

2.2. Software Requirements

2.2.1. Compatibility Matrix

Later versions of SAP Data Intelligence support newer versions of Kubernetes and OpenShift Container Platform. Even if not listed in the OCP validation version matrix above, the following version combinations are considered fully working and supported:

SAP Data Intelligence | OpenShift Container Platform | Worker Node | Management host | Infrastructure | Storage | Object Storage
3.0 Patch 3 or higher | 4.3, 4.4 | RHCOS | RHEL 8.1 or newer | Cloud, VMware vSphere | OCS 4, NetApp Trident 20.04 or newer, vSphere volumes | OCS, NetApp StorageGRID 11.3 or newer
3.0 Patch 8 or higher | 4.4, 4.5, 4.6 | RHCOS | RHEL 8.1 or newer | Cloud, VMware vSphere | OCS 4, NetApp Trident 20.04 or newer, vSphere volumes | OCS, NetApp StorageGRID 11.3 or newer
3.1 | 4.4, 4.5, 4.6 | RHCOS | RHEL 8.1 or newer | Cloud, VMware vSphere, Bare metal | OCS 4, NetApp Trident 20.04 or newer, vSphere volumes | OCS², NetApp StorageGRID 11.4 or newer

Cloud means any cloud provider supported by OpenShift Container Platform. For a complete list of tested and supported infrastructure platforms, please refer to OpenShift Container Platform 4.x Tested Integrations. The persistent storage in this case must be provided by the cloud provider. Please refer to Understanding persistent storage (4.6) / (4.4) for a complete list of supported storage providers.
This persistent storage provider does not offer a supported object storage service required by SDI's checkpoint store and is therefore suitable only for SAP Data Intelligence development and PoC clusters. It needs to be complemented by an object storage solution for the full SDI functionality.
² For the full functionality (including SDI backup&restore), OCS 4.6.4 or newer is required. Alternatively, OCS external mode can be used while utilizing RGW for SDI backup&restore (checkpoint store).

Unless stated otherwise, the compatibility of a listed SDI version covers all its patch releases as well.

2.2.2. Persistent Volumes

Persistent storage is needed for SDI. It is required to use storage that can be created dynamically. You can find more information in the Understanding persistent storage (4.6) / (4.4) document.

2.2.3. Container Image Registry

The SDI installation requires a secured Image Registry where images are first mirrored from an SAP Registry and then delivered to the OCP cluster nodes. The integrated OpenShift Container Registry (4.6) / (4.4) is not appropriate for this purpose. For now, another image registry needs to be set up instead.

The requirements listed here are a subset of the official requirements listed in Container Registry (3.1) / (3.0).

NOTE: as of now, AWS ECR Registry cannot be used for this purpose either.

The word secured in this context means that the communication is encrypted using TLS, ideally with certificates signed by a trusted certificate authority. If the registry is also exposed publicly, it must require authentication and authorization in order to pull SAP images.

Such a registry can be deployed directly on OCP cluster using for example SDI Observer, please refer to Deploying SDI Observer for more information.

When finished, you should have an external image registry up and running. We will use the URL local.image.registry:5000 as an example. You can verify its readiness with the following command:

# curl -k https://local.image.registry:5000/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}

2.2.4. Checkpoint store enablement

In order to enable SAP Vora Database streaming tables, the checkpoint store needs to be enabled. The store is an object storage on a particular storage back-end. The SDI installer supports several back-end types, covering most cloud storage providers.

The enablement is strongly recommended for production clusters. Clusters having this feature disabled are suitable only for test, development or PoC use-cases.

Make sure to create a desired bucket before the SDI Installation. If the checkpoint store shall reside in a directory on a bucket, the directory needs to exist as well.
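
For illustration, on any S3-compatible object store reachable with the AWS CLI, the bucket (and an optional directory prefix in it) could be created as follows. The endpoint URL, bucket name and prefix are placeholders only, and the CLI is assumed to be already configured with valid credentials:

    # aws --endpoint-url https://s3.example.com s3 mb s3://sdi-checkpoint-store
    # # optionally create a "directory" (prefix) inside the bucket
    # aws --endpoint-url https://s3.example.com s3api put-object \
        --bucket sdi-checkpoint-store --key checkpoints/

When OCS is used as the object storage provider, the mksdibuckets script described later in this guide creates the buckets for you.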

2.2.5. SDI Observer

SDI Observer is a pod that monitors the SDI namespace and modifies objects there to enable running SDI on top of OCP. The observer shall run in a dedicated namespace. It must be deployed before the SDI installation is started. The SDI Observer section will guide you through the deployment process.

3. Install Red Hat OpenShift Container Platform

3.1. Prepare the Management host

Note that the following has been tested on RHEL 8.4. The steps shall be similar for other RPM-based Linux distributions. Recommended are RHEL 7.7+, Fedora 30+ and CentOS 7+.

3.1.1. Prepare the connected Management host

  1. Subscribe the Management host at least to the following repositories:

    # OCP_RELEASE=4.6
    # sudo subscription-manager repos                 \
        --enable=rhel-8-for-x86_64-appstream-rpms     \
        --enable=rhel-8-for-x86_64-baseos-rpms        \
        --enable=rhocp-${OCP_RELEASE:-4.6}-for-rhel-8-x86_64-rpms
    
  2. Install jq binary. This installation guide has been tested with jq 1.6.

    • on RHEL 8, make sure rhocp-4.6-for-rhel-8-x86_64-rpms repository or newer is enabled and install it from there:

      # dnf install jq-1.6
      
    • on earlier releases or other distributions, download the binary from upstream:

      # sudo curl -L -o /usr/local/bin/jq https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64
      # sudo chmod a+x /usr/local/bin/jq
      
  3. Download and install OpenShift client binaries.

    # sudo dnf install -y openshift-clients
    

3.1.2. Prepare the disconnected RHEL Management host

Please refer to KB#3176811 Creating a Local Repository and Sharing With Disconnected/Offline/Air-gapped Systems and KB#29269 How can we regularly update a disconnected system (A system without internet connection)?.

Install jq-1.6 and openshift-clients from your local RPM repository.

3.2. Install OpenShift Container Platform

Install OpenShift Container Platform on your desired cluster hosts. Follow the OpenShift installation guide (4.6) / (4.4)

Several changes need to be made to the compute nodes running SDI workloads before the SDI installation. These include:

  1. pre-loading the needed kernel modules
  2. increasing the PIDs limit of the CRI-O container engine
  3. configuring an insecure registry (if an insecure registry shall be used)

These steps are described in the next section.

3.3. OCP Post Installation Steps

3.3.1. (optional) Install OpenShift Container Storage

Red Hat OpenShift Container Storage (OCS) has been validated as the persistent storage provider for SAP Data Intelligence. Please refer to the OCS documentation (4.6) / (4.4)

Please make sure to read and follow Disconnected Environment if you install on a disconnected cluster.

3.3.2. (optional) Install NetApp Trident

NetApp Trident together with StorageGRID have been validated for SAP Data Intelligence and OpenShift. More details can be found at SAP Data Intelligence on OpenShift 4 with NetApp Trident.

3.3.3. Configure SDI compute nodes

Some SDI components require changes at the OS level of the compute nodes. These could impact other workloads running on the same cluster. To prevent that from happening, it is recommended to dedicate a set of nodes to the SDI workload. The following needs to be done:

  1. Chosen nodes must be labeled e.g. using the node-role.kubernetes.io/sdi="" label.
  2. MachineConfigs specific to SDI need to be created, they will be applied only to the selected nodes.
  3. MachineConfigPool must be created to associate the chosen nodes with the newly created MachineConfigs.
    • no changes will be applied to the nodes until this point
  4. (optional) Apply a node selector to sdi, sap-slcbridge and datahub-system projects.
    • SDI Observer can be configured to do that with SDI_NODE_SELECTOR parameter

Before modifying the recommended approach below, please make yourself familiar with the custom pools concept of the machine config operator.

3.3.3.1. Air-gapped environment

If the Management host does not have access to the internet, you will need to clone the sap-data-intelligence git repository to some other host and make it available on the Management host. For example:

# cd /var/run/user/1000/usb-disk/
# git clone https://github.com/redhat-sap/sap-data-intelligence

Then on the Management host:

  • unless the local checkout already exists, copy it from the disk:

    # git clone /var/run/user/1000/usb-disk/sap-data-intelligence ~/sap-data-intelligence
    
  • otherwise, re-apply local changes (if any) to the latest code:

    # cd ~/sap-data-intelligence
    # git stash         # temporarily remove local changes
    # git remote add drive /var/run/user/1000/usb-disk/sap-data-intelligence
    # git fetch drive
    # git merge drive   # apply the latest changes from drive to the local checkout
    # git stash pop     # re-apply the local changes on top of the latest code
    
3.3.4.1. Label the compute nodes for SAP Data Intelligence

Choose compute nodes for the SDI workload and label them from the Management host like this:

# oc label node/sdi-worker{1,2,3} node-role.kubernetes.io/sdi=""

3.3.4.2. Pre-load needed kernel modules

To apply the desired changes to the existing and future SDI compute nodes, please create another machine config like this (a sketch of what it contains follows the commands below):

  • (connected management host)

    # oc apply -f https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/snippets/mco/mc-75-worker-sap-data-intelligence.yaml
    
  • (disconnected management host)

    # oc apply -f sap-data-intelligence/master/snippets/mco/mc-75-worker-sap-data-intelligence.yaml
    
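
For reference, the applied MachineConfig roughly corresponds to the sketch below. It is not a verbatim copy of the file from the repository: the object name, the Ignition version (3.1.0 corresponds to OCP 4.6) and the exact file contents may differ, and the real MachineConfig additionally sets up the sdi-modules-load.service systemd unit checked in the verification section further below. The module list matches the modules verified later in this guide.

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      name: 75-worker-sap-data-intelligence
      labels:
        # targets the custom "sdi" role inherited by the sdi MachineConfigPool
        machineconfiguration.openshift.io/role: sdi
    spec:
      config:
        ignition:
          version: 3.1.0
        storage:
          files:
            # load the kernel modules needed by SDI on every boot
            - path: /etc/modules-load.d/sdi-dependencies.conf
              mode: 0644
              overwrite: true
              contents:
                source: data:,iptable_nat%0Aiptable_filter%0Axt_owner%0Axt_REDIRECT%0Anfsd%0Anfsv4%0Aip_tables%0A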

NOTE: If the warning below appears, it can usually be ignored. It indicates that the resource already exists on the cluster and was created by neither of the listed commands. In earlier versions of this documentation, plain oc create used to be recommended instead.

Warning: oc apply should be used on resource created by either oc create --save-config or oc apply

3.3.4.3. Change the maximum number of PIDs per Container

The process of configuring the nodes is described in Modifying Nodes (4.6) / (4.4). In the SDI case, the required setting is .spec.containerRuntimeConfig.pidsLimit in a ContainerRuntimeConfig. The result is a modified /etc/crio/crio.conf configuration file on each affected worker node with pids_limit set to the desired value. Please create a ContainerRuntimeConfig like this (a sketch of its content follows the commands below):

  • (connected management host)

    # oc apply -f https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/snippets/mco/ctrcfg-sdi-pids-limit.yaml
    
  • (disconnected management host)

    # oc apply -f sap-data-intelligence/master/snippets/mco/ctrcfg-sdi-pids-limit.yaml
    
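
For reference, the applied ContainerRuntimeConfig looks roughly like the sketch below; the object name is illustrative, while the pool selector label and the PIDs value correspond to the workload=sapdataintelligence label and the 16384 limit used elsewhere in this guide.

    apiVersion: machineconfiguration.openshift.io/v1
    kind: ContainerRuntimeConfig
    metadata:
      name: sdi-pids-limit
    spec:
      machineConfigPoolSelector:
        matchLabels:
          # applied to MachineConfigPools carrying this label (see the following sections)
          workload: sapdataintelligence
      containerRuntimeConfig:
        # results in pids_limit = 16384 in /etc/crio/crio.conf on the affected nodes
        pidsLimit: 16384
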
3.3.4.4. (obsolete) Enable net-raw capability for containers on schedulable nodes

NOTE: Having effect only on OCP 4.6 or newer.
NOTE: Shall be executed prior to the OCP upgrade to 4.6 when SDI is already running.
NOTE: No longer necessary for SDI 3.1 Patch 1 or newer

Starting with OCP 4.6, NET_RAW capability is no longer granted to containers by default. Some SDI containers assume otherwise. To allow them to run on OCP 4.6, the following MachineConfig must be applied to the compute nodes:

(connected management host)

    # oc apply -f https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/snippets/mco/mc-97-crio-net-raw.yaml

(disconnected management host)

    # oc apply -f sap-data-intelligence/master/snippets/mco/mc-97-crio-net-raw.yaml

3.3.4.5. Associate MachineConfigs to the Nodes

If previously associated, disassociate workload=sapdataintelligence from the worker MachineConfigPool using the following command executed in bash:

# tmpl=$'{{with $wl := index $m.labels "workload"}}{{if and $wl (eq $wl "sapdataintelligence")}}{{$m.name}}\n{{end}}{{end}}'; \
  if [[ "$(oc get  mcp/worker -o go-template='{{with $m := .metadata}}'"$tmpl"'{{end}}')" == "worker" ]]; then
    oc label mcp/worker workload-;
  fi

Define a new MachineConfigPool associating MachineConfigs to the nodes. The nodes will inherit all the MachineConfigs targeting the worker and sdi roles (a sketch of the resulting pool follows the commands below).

  • (connected management host)

    # oc apply -f https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/snippets/mco/mcp-sdi.yaml
    
  • (disconnected management host)

    # oc apply -f sap-data-intelligence/master/snippets/mco/mcp-sdi.yaml
    
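
For reference, the applied MachineConfigPool roughly corresponds to the following sketch; the selectors reflect what is described above (inheriting MachineConfigs with the worker and sdi roles and selecting nodes labeled node-role.kubernetes.io/sdi=""), and the workload label is what the ContainerRuntimeConfig shown earlier matches on:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfigPool
    metadata:
      name: sdi
      labels:
        # matched by the ContainerRuntimeConfig setting the PIDs limit
        workload: sapdataintelligence
    spec:
      machineConfigSelector:
        matchExpressions:
          - key: machineconfiguration.openshift.io/role
            operator: In
            values: [worker, sdi]
      nodeSelector:
        matchLabels:
          node-role.kubernetes.io/sdi: ""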

Note that you may see a warning if the MachineConfigPool exists already.

The changes will be rendered into machineconfigpool/sdi. The workers will be restarted one-by-one until the changes are applied to all of them. See Applying configuration changes to the cluster (4.6) / (4.4) for more information.

The following command can be used to wait until the change gets applied to all the worker nodes:

# oc wait mcp/sdi --all --for=condition=updated

After performing the changes above, you should end up with a new role sdi assigned to the chosen nodes and a new MachineConfigPool containing the nodes:

# oc get nodes
NAME          STATUS   ROLES        AGE   VERSION
ocs-worker1   Ready    worker       32d   v1.19.0+9f84db3
ocs-worker2   Ready    worker       32d   v1.19.0+9f84db3
ocs-worker3   Ready    worker       32d   v1.19.0+9f84db3
sdi-worker1   Ready    sdi,worker   32d   v1.19.0+9f84db3
sdi-worker2   Ready    sdi,worker   32d   v1.19.0+9f84db3
sdi-worker3   Ready    sdi,worker   32d   v1.19.0+9f84db3
master1       Ready    master       32d   v1.19.0+9f84db3
master2       Ready    master       32d   v1.19.0+9f84db3
master3       Ready    master       32d   v1.19.0+9f84db3

# oc get mcp
NAME     CONFIG                 UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT
master   rendered-master-15f⋯   True     False     False     3             3                  3                    0
sdi      rendered-sdi-f4f⋯      True     False     False     3             3                  3                    0
worker   rendered-worker-181⋯   True     False     False     3             3                  3                    0

3.3.4.5.1. Enable SDI on control plane

If the control plane (or master nodes) shall be used for running SDI workload, in addition to the previous step, one needs to perform the following:

  1. Please make sure the control plane is schedulable
  2. Duplicate the machine configs for master nodes:

    # oc get -o json mc -l machineconfiguration.openshift.io/role=sdi | jq  '.items[] |
        select((.metadata.annotations//{}) |
            has("machineconfiguration.openshift.io/generated-by-controller-version") | not) |
        .metadata |= ( .name   |= sub("^(?<i>(\\d+-)*)(worker-)?"; "\(.i)master-") |
                       .labels |= {"machineconfiguration.openshift.io/role": "master"} )' | oc apply -f -
    

    Note that you may see a couple of warnings if this has been done earlier.

  3. Make the master machine config pool inherit the PID limits changes:

    # oc label mcp/master workload=sapdataintelligence
    

The following command can be used to wait until the change gets applied to all the master nodes:

# oc wait mcp/master --all --for=condition=updated

3.3.4.6. Verification of the node configuration

The following steps assume that the node-role.kubernetes.io/sdi="" label has been applied to nodes running the SDI workload. All the commands shall be executed on the Management host. All the diagnostics commands will be run in parallel on such nodes.

  1. (disconnected only) Make one of the tools images available for your cluster:

    • Either use the image stream openshift/tools:

      1. Make sure the image stream has been populated:

        # oc get -n openshift istag/tools:latest
        

        Example output:

        NAME           IMAGE REFERENCE                                                UPDATED
        tools:latest   quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:13c...   17 hours ago
        

        If it is not the case, make sure your registry mirror CA certificate is trusted.

      2. Set the following variable:

        # ocDebugArgs="--image-stream=openshift/tools:latest"
        
    • Or make registry.redhat.io/rhel8/support-tools image available in your local registry:

      # LOCAL_REGISTRY=local.image.registry:5000
      # podman login registry.redhat.io
      # podman login "$LOCAL_REGISTRY"    # if the local registry requires authentication
      # skopeo copy --remove-signatures \
          docker://registry.redhat.io/rhel8/support-tools:latest \
          docker://"$LOCAL_REGISTRY/rhel8/support-tools:latest"
      # ocDebugArgs="--image=$LOCAL_REGISTRY/rhel8/support-tools:latest"
      
  2. Verify that the PID limit has been increased to 16384:

    # oc get nodes -l node-role.kubernetes.io/sdi= -o name | \
        xargs -P 6 -n 1 -i oc debug $ocDebugArgs {} -- chroot /host /bin/bash -c \
            "crio-status config | awk '/pids_limit/ {
                print ENVIRON[\"HOSTNAME\"]\":\t\"\$0}'" |& grep pids_limit
    

    NOTE: $ocDebugArgs is set only in a disconnected environment, otherwise it shall be empty.

    An example output could look like this:

    sdi-worker3:    pids_limit = 16384
    sdi-worker1:    pids_limit = 16384
    sdi-worker2:    pids_limit = 16384
    
  3. Verify that the kernel modules have been loaded:

    # oc get nodes -l node-role.kubernetes.io/sdi= -o name | \
        xargs -P 6 -n 1 -i oc debug $ocDebugArgs {} -- chroot /host /bin/sh -c \
            "lsmod | awk 'BEGIN {ORS=\":\t\"; print ENVIRON[\"HOSTNAME\"]; ORS=\",\"}
                /^(nfs|ip_tables|iptable_nat|[^[:space:]]+(REDIRECT|owner|filter))/ {
                    print \$1
                }'; echo" 2>/dev/null
    

    An example output could look like this:

    sdi-worker2:  iptable_filter,iptable_nat,xt_owner,xt_REDIRECT,nfsv4,nfs,nfsd,nfs_acl,ip_tables,
    sdi-worker3:  iptable_filter,iptable_nat,xt_owner,xt_REDIRECT,nfsv4,nfs,nfsd,nfs_acl,ip_tables,
    sdi-worker1:  iptable_filter,iptable_nat,xt_owner,xt_REDIRECT,nfsv4,nfs,nfsd,nfs_acl,ip_tables,
    

    If any of the following modules is missing on any of the SDI nodes, the module loading does not work: iptable_nat, nfsv4, nfsd, ip_tables, xt_owner

    To further debug missing modules, one can execute also the following command:

    # oc get nodes -l node-role.kubernetes.io/sdi= -o name | \
        xargs -P 6 -n 1 -i oc debug $ocDebugArgs {} -- chroot /host /bin/bash -c \
             "( for service in {sdi-modules-load,systemd-modules-load}.service; do \
                 printf '%s:\t%s\n' \$service \$(systemctl is-active \$service); \
             done; find /etc/modules-load.d -type f \
                 -regex '.*\(sap\|sdi\)[^/]+\.conf\$' -printf '%p\n';) | \
             awk '{print ENVIRON[\"HOSTNAME\"]\":\t\"\$0}'" 2>/dev/null
    

    Please make sure that both systemd services are active and at least one *.conf file is listed for each host like shown in the following example output:

    sdi-worker3:  sdi-modules-load.service:       active
    sdi-worker3:  systemd-modules-load.service:   active
    sdi-worker3:  /etc/modules-load.d/sdi-dependencies.conf
    sdi-worker1:  sdi-modules-load.service:       active
    sdi-worker1:  systemd-modules-load.service:   active
    sdi-worker1:  /etc/modules-load.d/sdi-dependencies.conf
    sdi-worker2:  sdi-modules-load.service:       active
    sdi-worker2:  systemd-modules-load.service:   active
    sdi-worker2:  /etc/modules-load.d/sdi-dependencies.conf
    
  4. (obsolete) Verify that the NET_RAW capability is granted by default to the pods:

    # # no longer needed for SDI 3.1 or newer
    # oc get nodes -l node-role.kubernetes.io/sdi= -o name | \
        xargs -P 6 -n 1 -i oc debug $ocDebugArgs {} -- /bin/sh -c \
            "find /host/etc/crio -type f -print0 | xargs -0 awk '/^[[:space:]]#/{next}
                /NET_RAW/ {print ENVIRON[\"HOSTNAME\"]\":\t\"FILENAME\":\"\$0}'" |& grep NET_RAW
    

    An example output could look like:

    sdi-worker2:  /host/etc/crio/crio.conf.d/01-mc-defaultCapabilities:    default_capabilities = ["CHOWN", "DAC_OVERRIDE", "FSETID", "FOWNER", "NET_RAW", "SETGID", "SETUID", "SETPCAP", "NET_BIND_SERVICE", "SYS_CHROOT", "KILL"]
    sdi-worker2:  /host/etc/crio/crio.conf.d/90-default-capabilities:        "NET_RAW",
    sdi-worker1:  /host/etc/crio/crio.conf.d/90-default-capabilities:        "NET_RAW",
    sdi-worker1:  /host/etc/crio/crio.conf.d/01-mc-defaultCapabilities:    default_capabilities = ["CHOWN", "DAC_OVERRIDE", "FSETID", "FOWNER", "NET_RAW", "SETGID", "SETUID", "SETPCAP", "NET_BIND_SERVICE", "SYS_CHROOT", "KILL"]
    sdi-worker3:  /host/etc/crio/crio.conf.d/90-default-capabilities:        "NET_RAW",
    sdi-worker3:  /host/etc/crio/crio.conf.d/01-mc-defaultCapabilities:    default_capabilities = ["CHOWN", "DAC_OVERRIDE", "FSETID", "FOWNER", "NET_RAW", "SETGID", "SETUID", "SETPCAP", "NET_BIND_SERVICE", "SYS_CHROOT", "KILL"]
    

    Please make sure that at least one line is produced for each host.

3.3.5. Deploy persistent storage provider

Unless your platform already offers a supported persistent storage provider, one needs to be deployed. Please refer to Understanding persistent storage (4.6) / (4.4) for an overview of possible options.

On OCP, one can deploy OpenShift Container Storage (OCS) (4.6) / (4.4) running converged on OCP nodes providing both persistent volumes and object storage. Please refer to OCS Planning your Deployment (4.6) / (4.4) and Deploying OpenShift Container Storage (4.6) / (4.4) for more information and installation instructions.

OCS can be deployed also in a disconnected environment.
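
Whichever provider is chosen, it is worth verifying before the SDI installation that the expected storage classes exist (and, unless you plan to select them explicitly during the Advanced Installation, that one of them is annotated as the default):

    # oc get storageclass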

3.3.6. Configure S3 access and bucket

Object storage is required for SDI features such as backup&restore (checkpoint store) and the SDI data lake.

Several interfaces to the object storage are supported by SDI. The S3 interface is one of them. Please take a look at Checkpoint Store Type in Required Input Parameters (3.1) / (3.0) for the complete list. The SAP help page covers the preparation of an object store (3.1) / (3.0) for a couple of cloud service providers.

Backup&restore can be enabled against OCS NooBaa's S3 endpoint as long as OCS is of version 4.6.4 or newer, or against RADOS Object Gateway S3 endpoint when OCS is deployed in the external mode.

3.3.6.1. Using NooBaa or RADOS Object Gateway S3 endpoint as object storage

OCS contains NooBaa object data service for hybrid and multi cloud environments which provides S3 API one can use with SAP Data Intelligence. Starting from OCS release 4.6.4, it can be used also for SDI's backup&restore functionality. Alternatively, the functionality can be enabled against RADOS Object Gateway S3 endpoint (from now on just RGW) which is available when OCS is deployed in the external mode.

For SDI, one needs to provide the following:

  • S3 host URL prefixed either with https:// or http://
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • bucket name

NOTE: In case of https://, the endpoint must be secured by certificates signed by a trusted certificate authority. Self-signed CAs will not work out of the box as of now.

Once OCS is deployed, one can create the access keys and buckets using one of the following:

  • (internal mode only) via NooBaa Management Console by default exposed at noobaa-mgmt-openshift-storage.apps.<cluster_name>.<base_domain>
  • (both internal and external modes) via CLI with mksdibuckets script

In both cases, the S3 endpoint provided to SAP Data Intelligence cannot be secured with a self-signed certificate as of now. Unless the endpoints are secured with a properly signed certificate, one must use an insecure HTTP connection. Both NooBaa and RGW come with such an insecure service reachable from inside the cluster (within the SDN); it cannot be reached from outside of the cluster unless exposed via e.g. a route.

The following two URLs are example endpoints on an OCP cluster with OCS deployed.

  1. http://s3.openshift-storage.svc.cluster.local - NooBaa S3 Endpoint available always
  2. http://rook-ceph-rgw-ocs-external-storagecluster-cephobjectstore.openshift-storage.svc.cluster.local:8080 - RGW endpoint that shall be preferably used when OCS is deployed in the external mode

To enable SDI's backup&restore functionality, one must use the endpoint with rgw in its name (if available) unless running OCS 4.6.4 or newer.
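
To check which of these endpoints are available on a particular cluster, the services in the openshift-storage namespace can be listed; the exact service names may differ slightly depending on the OCS deployment:

    # oc get svc -n openshift-storage | grep -iE 's3|rgw'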

3.3.6.1.1. Creating an S3 bucket using CLI

The buckets can be created with the command below executed from the Management host. Be sure to switch to the appropriate project/namespace (e.g. sdi) first before executing the following command, or append the -n SDI_NAMESPACE parameter to it.

  • (connected management host)

    # bash <(curl -s https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/utils/mksdibuckets)
    
  • (disconnected management host)

    # bash sap-data-intelligence/master/utils/mksdibuckets
    

By default, two buckets will be created. You can list them this way:

  • (connected management host)

    # bash <(curl -s https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/utils/mksdibuckets) list
    
  • (disconnected management host)

    # bash sap-data-intelligence/master/utils/mksdibuckets list
    

Example output:

Bucket claim namespace/name:  sdi/sdi-checkpoint-store  (Status: Bound, Age: 7m33s)
  Cluster internal URL:       http://s3.openshift-storage.svc.cluster.local
  Bucket name:                sdi-checkpoint-store-ef4999e0-2d89-4900-9352-b1e1e7b361d9
  AWS_ACCESS_KEY_ID:          LQ7YciYTw8UlDLPi83MO
  AWS_SECRET_ACCESS_KEY:      8QY8j1U4Ts3RO4rERXCHGWGIhjzr0SxtlXc2xbtE
Bucket claim namespace/name:  sdi/sdi-data-lake  (Status: Bound, Age: 7m33s)
  Cluster internal URL:       http://s3.openshift-storage.svc.cluster.local
  Bucket name:                sdi-data-lake-f86a7e6e-27fb-4656-98cf-298a572f74f3
  AWS_ACCESS_KEY_ID:          cOxfi4hQhGFW54WFqP3R
  AWS_SECRET_ACCESS_KEY:      rIlvpcZXnonJvjn6aAhBOT/Yr+F7wdJNeLDBh231

# # NOTE: for more information and options, run the command with --help

The example above uses OCS NooBaa's S3 endpoint which is always the preferred choice for OCS internal mode.

The values of the claim sdi-checkpoint-store shall be passed to the following SLC Bridge parameters during SDI's installation in order to enable the backup&restore (previously known as checkpoint store) functionality.

Parameter | Example value
Object Store Type | S3 compatible object store
Access Key | LQ7YciYTw8UlDLPi83MO
Secret Key | 8QY8j1U4Ts3RO4rERXCHGWGIhjzr0SxtlXc2xbtE
Endpoint | http://s3.openshift-storage.svc.cluster.local
Path | sdi-checkpoint-store-ef4999e0-2d89-4900-9352-b1e1e7b361d9
Disable Certificate Validation | Yes

3.3.6.1.2. Increasing object bucket limits

NOTE: needed only for RGW (OCS external mode)

When performing the checkpoint store validation during the SDI installation, the installer creates a temporary bucket. For that to work with RGW, the bucket owner's limit on the maximum number of allocatable buckets needs to be increased. The limit is set to 1 by default.

You can use the following command to perform the needed changes for the bucket assigned to the backup&restore (checkpoint store). Please execute it on the management node of the external Red Hat Ceph Storage cluster (or on the host where the external RGW service runs). The last argument is the "Bucket name", not the "Bucket claim name".

  • (connected management host)

    # bash <(curl -s https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/utils/rgwtunebuckets) \
            sdi-checkpoint-store-ef4999e0-2d89-4900-9352-b1e1e7b361d9
    
  • (disconnected management host)

    # bash sap-data-intelligence/master/utils/rgwtunebuckets \
            sdi-checkpoint-store-ef4999e0-2d89-4900-9352-b1e1e7b361d9
    

For more information and additional options, append --help parameter at the end.
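
For reference, the same kind of quota change can be performed manually with radosgw-admin on the Red Hat Ceph Storage side; the owner uid below is purely illustrative (the bucket's owner can be looked up with radosgw-admin bucket stats):

    # radosgw-admin bucket stats --bucket=sdi-checkpoint-store-ef4999e0-2d89-4900-9352-b1e1e7b361d9 | grep owner
    # radosgw-admin user modify --uid=<bucket_owner_uid> --max-buckets=5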

3.3.7. Set up a Container Image Registry

If you haven't done so already, please follow the Container Image Registry prerequisite.

3.3.8. Configure an insecure registry

NOTE: It is now required to use a registry secured by TLS for SDI. Plain HTTP will not do.

If the registry's certificate is signed by a proper trusted (not self-signed) certificate authority, this section may be skipped.

There are two ways to make OCP trust an additional registry that uses certificates signed by a self-signed certificate authority.

3.3.9. Configure the OpenShift Cluster for SDI

3.3.9.1. Becoming a cluster-admin

Many commands below require cluster admin privileges. To become a cluster-admin, you can do one of the following:

  • Use the auth/kubeconfig generated in the working directory during the installation of the OCP cluster:

    INFO Install complete!
    INFO Run 'export KUBECONFIG=<your working directory>/auth/kubeconfig' to manage the cluster with 'oc', the OpenShift CLI.
    INFO The cluster is ready when 'oc login -u kubeadmin -p <provided>' succeeds (wait a few minutes).
    INFO Access the OpenShift web-console here: https://console-openshift-console.apps.demo1.openshift4-beta-abcorp.com
    INFO Login to the console with user: kubeadmin, password: <provided>
    # export KUBECONFIG=working_directory/auth/kubeconfig
    # oc whoami
    system:admin
    
  • As the system:admin user or a member of the cluster-admin group, make another user a cluster admin to allow them to perform the SDI installation:

    1. As a cluster-admin, configure the authentication (4.6) / (4.4) and add the desired user (e.g. sdiadmin).
    2. As a cluster-admin, grant the user a permission to administer the cluster:

      # oc adm policy add-cluster-role-to-user cluster-admin sdiadmin
      

You can learn more about the cluster-admin role in Cluster Roles and Local Roles article (4.6) / (4.4)

4. SDI Observer

SDI Observer monitors SDI and SLC Bridge namespaces and applies changes to SDI deployments to allow SDI to run on OpenShift. Among other things, it does the following:

  • adds an additional persistent volume to the vsystem-vrep StatefulSet to allow it to run on RHCOS
  • grants fluentd pods permissions to access logs
  • reconfigures the fluentd pods to parse plain text file container logs on the OCP 4 nodes
  • exposes the SDI System Management service
  • (optional) deploys a container image registry suitable for mirroring, storing and serving SDI images and for use by the Pipeline Modeler
  • (optional) creates the cmcertificates secret to allow SDI to talk to a container image registry secured by a self-signed CA certificate early during the installation

It is deployed from an OpenShift template. Its behaviour is controlled by the template's parameters, which are mirrored to its environment variables.

Deploy SDI Observer in its own k8s namespace (e.g. sdi-observer). Please refer to its documentation for the complete list of issues that it currently attempts to solve.

4.1. Prerequisites

The following must be satisfied before SDI Observer can be deployed:

4.2.1. Prerequisites for Connected OCP Cluster

In order to build images needed for SDI Observer, a secret with credentials for registry.redhat.io needs to be created in the namespace of SDI Observer. Please visit Red Hat Registry Service Accounts to obtain the OpenShift secret. For more details, please refer to Red Hat Container Registry Authentication. Once you have downloaded the OpenShift secret file (e.g. rht-registry-secret.yaml) with your credentials, you can import it into the SDI Observer's $NAMESPACE like this:

# oc create -n "${NAMESPACE:-sdi-observer}" -f rht-registry-secret.yaml
secret/123456-username-pull-secret created

4.2.2. Prerequisites for a Disconnected OCP Cluster

On a disconnected OCP cluster, it is necessary to mirror a pre-built image of SDI Observer to a local container image registry. Please follow Disconnected OCP cluster instructions.

4.2.3. Instantiation of Observer's Template

Assuming the SDI will be run in the SDI_NAMESPACE which is different from the observer NAMESPACE, instantiate the template with default parameters like this:

  1. Prepare the script and images depending on your system connectivity.

    • In a connected environment, download the run script from git repository like this:

      # curl -O https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/observer/run-observer-template.sh
      
    • In a disconnected environment where the Management host is connected to the internet:

      Mirror the SDI Observer image to the local registry. For example, on RHEL8:

      # podman login local.image.registry:5000    # if the local registry requires authentication
      # skopeo copy \
          docker://quay.io/redhat-sap-cop/sdi-observer:latest-ocp4.6 \
          docker://local.image.registry:5000/sdi-observer:latest-ocp4.6
      

      Please make sure to modify the 4.6 suffix according to your OCP server minor release.

    • In an air-gapped environment (assuming the observer repository has been already cloned to the Management host):

      1. On a host with access to the internet, copy the SDI Observer image to an archive on USB drive. For example, on RHEL8:

        # skopeo copy \
            docker://quay.io/redhat-sap-cop/sdi-observer:latest-ocp4.6 \
            oci-archive:/var/run/user/1000/usb-disk/sdi-observer.tar:latest-ocp4.6
        
      2. Plug the USB drive to the Management host (with no access to internet) and mirror the image from it to your local.image.registry:5000:

        # skopeo copy \
            oci-archive:/var/run/user/1000/usb-disk/sdi-observer.tar:latest-ocp4.6 \
            docker://local.image.registry:5000/sdi-observer:latest-ocp4.6
        
  2. Edit the downloaded run-observer-template.sh file in your favorite editor. Especially, mind
    the FLAVOUR, NAMESPACE and SDI_NAMESPACE parameters.

    • for a disconnected environment, make sure to set FLAVOUR to ocp-prebuilt and IMAGE_PULL_SPEC to your local.image.registry:5000
    • for an air-gapped environment, set also SDI_OBSERVER_REPOSITORY=to/local/git/repo/checkout

  3. Run it in bash like this:

    # bash ./run-observer-template.sh
    
  4. Keep the modified script around in case of updates.

4.2.4. SDI Observer Registry

If the observer is configured to deploy container image registry via DEPLOY_SDI_REGISTRY=true parameter, it will deploy the deploy-registry job which does the following:

  1. builds the container-image-registry image and pushes it to the integrated OpenShift Image Registry
  2. generates or uses configured credentials for the registry
  3. deploys container-image-registry deployment config that runs this image and requires authentication
  4. exposes the registry using a route

    • if observer's SDI_REGISTRY_ROUTE_HOSTNAME parameter is set, it will be used as its hostname
    • otherwise the registry's hostname will be container-image-registry-${NAMESPACE}.apps.<cluster_name>.<base_domain>
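
The apps.<cluster_name>.<base_domain> part of the default hostname is the cluster's ingress domain, which can be determined like this:

    # oc get ingresses.config/cluster -o jsonpath='{.spec.domain}{"\n"}'
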
4.2.4.1. Registry Template parameters

The following Observer's Template Parameters influence the deployment of the registry:

Parameter | Example value | Description
DEPLOY_SDI_REGISTRY | true | Whether to deploy a container image registry for the purpose of SAP Data Intelligence.
REDHAT_REGISTRY_SECRET_NAME | 123456-username-pull-secret | Name of the secret with credentials for the registry.redhat.io registry. Please visit Red Hat Registry Service Accounts to obtain the OpenShift secret. For more details, please refer to Red Hat Container Registry Authentication. Must be provided in order to build the registry's image.
SDI_REGISTRY_ROUTE_HOSTNAME | registry.cluster.tld | This variable will be used as the container image registry's hostname when creating the corresponding route. Defaults to container-image-registry-$NAMESPACE.<cluster_name>.<base_domain>. If set, the domain name must resolve to the IP of the ingress router.
INJECT_CABUNDLE | true | Inject the CA certificate bundle into SAP Data Intelligence pods. The bundle can be specified with CABUNDLE_SECRET_NAME. It is needed if either the registry or the S3 endpoint is secured by a self-signed certificate. The letsencrypt method is preferred.
CABUNDLE_SECRET_NAME | custom-ca-bundle | The name of the secret containing the certificate authority bundle that shall be injected into Data Intelligence pods. By default, the secret bundle is obtained from the openshift-ingress-operator namespace where the router-ca secret contains the certificate authority used to sign all the edge and reencrypt routes that are, among others, used for SDI_REGISTRY and S3 API services. The secret name may be optionally prefixed with $namespace/.
SDI_REGISTRY_STORAGE_CLASS_NAME | ocs-storagecluster-cephfs | Unless given, the default storage class will be used. If possible, prefer volumes with ReadWriteMany (RWX) access mode.
REPLACE_SECRETS | true | By default, the existing SDI_REGISTRY_HTPASSWD_SECRET_NAME secret will not be replaced if it already exists. If the registry credentials shall be changed while using the same secret name, this must be set to true.
SDI_REGISTRY_AUTHENTICATION | none | Set to none if the registry shall not require any authentication at all. The default is to secure the registry with an htpasswd file, which is necessary if the registry is publicly available (e.g. when exposed via an ingress route which is globally resolvable).
SDI_REGISTRY_USERNAME | registry-user | Will be used to generate the htpasswd file to provide authentication data to the SDI registry service as long as SDI_REGISTRY_HTPASSWD_SECRET_NAME does not exist or REPLACE_SECRETS is true. Unless given, it will be autogenerated by the job.
SDI_REGISTRY_PASSWORD | secure-password | ditto
SDI_REGISTRY_HTPASSWD_SECRET_NAME | registry-htpasswd | A secret with the htpasswd file containing authentication data for the SDI registry service. If given and the secret exists, it will be used instead of SDI_REGISTRY_USERNAME and SDI_REGISTRY_PASSWORD. Defaults to container-image-registry-htpasswd. Please make sure to follow the official guidelines on generating the htpasswd file.
SDI_REGISTRY_VOLUME_CAPACITY | 250Gi | Volume space available for container images. Defaults to 120Gi.
SDI_REGISTRY_VOLUME_ACCESS_MODE | ReadWriteMany | If the given SDI_REGISTRY_STORAGE_CLASS_NAME or the default storage class supports the ReadWriteMany ("RWX") access mode, please change this to ReadWriteMany. For example, the ocs-storagecluster-cephfs storage class, deployed by the OCS operator, does support it.

To use them, please set the desired parameters in the run-observer-template.sh script in the section above.

Monitoring registry's deployment

# oc logs -n "${NAMESPACE:-sdi-observer}" -f job/deploy-registry

4.2.4.2. Determining Registry's credentials

The username and password are separated by a colon in the SDI_REGISTRY_HTPASSWD_SECRET_NAME secret:

# # make sure to change the NAMESPACE and secret name according to your environment
# oc get -o json -n "${NAMESPACE:-sdi-observer}" secret/container-image-registry-htpasswd | \
    jq -r '.data[".htpasswd.raw"] | @base64d'
user-qpx7sxeei:OnidDrL3acBHkkm80uFzj697JGWifvma

4.2.4.3. Testing the connection

In this example, it is assumed that the INJECT_CABUNDLE and DEPLOY_SDI_REGISTRY are true and other parameters use the defaults.

  1. Obtain Ingress Router's default self-signed CA certificate:

    # oc get secret -n openshift-ingress-operator -o json router-ca | \
        jq -r '.data as $d | $d | keys[] | select(test("\\.crt$")) | $d[.] | @base64d' >router-ca.crt
    
  2. Do a simple test using curl:

    # # determine registry's hostname from its route
    # hostname="$(oc get route -n "${NAMESPACE:-sdi-observer}" container-image-registry -o jsonpath='{.spec.host}')"
    # curl -I --user user-qpx7sxeei:OnidDrL3acBHkkm80uFzj697JGWifvma --cacert router-ca.crt \
        "https://$hostname/v2/"
    HTTP/1.1 200 OK
    Content-Length: 2
    Content-Type: application/json; charset=utf-8
    Docker-Distribution-Api-Version: registry/2.0
    Date: Sun, 24 May 2020 17:54:31 GMT
    Set-Cookie: d22d6ce08115a899cf6eca6fd53d84b4=9176ba9ff2dfd7f6d3191e6b3c643317; path=/; HttpOnly; Secure
    Cache-control: private
    
  3. Using podman:

    # # determine registry's hostname from its route
    # hostname="$(oc get route -n "${NAMESPACE:-sdi-observer}" container-image-registry -o jsonpath='{.spec.host}')"
    # sudo mkdir -p "/etc/containers/certs.d/$hostname"
    # sudo cp router-ca.crt "/etc/containers/certs.d/$hostname/"
    # podman login -u user-qpx7sxeei "$hostname"
    Password:
    Login Succeeded!
    
4.2.4.4. Configuring OCP

Configure OpenShift to trust the deployed registry if using a self-signed CA certificate.
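
A minimal sketch of one way to do that, assuming the router-ca.crt file obtained in the previous section and an illustrative ConfigMap name, is to add the CA to the cluster-wide image configuration (the ConfigMap key must equal the registry's hostname):

    # hostname="$(oc get route -n "${NAMESPACE:-sdi-observer}" container-image-registry -o jsonpath='{.spec.host}')"
    # oc create configmap sdi-registry-cabundle -n openshift-config \
        --from-file="$hostname"=router-ca.crt
    # oc patch image.config.openshift.io/cluster --type=merge \
        -p '{"spec":{"additionalTrustedCA":{"name":"sdi-registry-cabundle"}}}'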

4.2.4.5. SDI Observer Registry tenant configuration

NOTE: Only applicable once the SDI installation is complete.

Each newly created tenant needs to be configured to be able to talk to the SDI Registry. The initial tenant (the default) does not need to be configured manually as it is configured during the installation.

There are two steps that need to be performed for each new tenant:

  • import CA certificate for the registry via SDI Connection Manager if the CA certificate is self-signed (the default unless letsencrypt controller is used)
  • create and import credential secret using the SDI System Management and update the modeler secret

Import the CA certificate

  1. Obtain the router-ca.crt from the secret as documented in the previous section.
  2. Follow the Manage Certificates guide (3.1) / (3.0) to import the router-ca.crt via the SDI Connection Management.

Import the credential secret

Determine the credentials and import them using the SDI System Management by following the official Provide Access Credentials for a Password Protected Container Registry (3.1) / (3.0).

As an alternative to the step "1. Create a secret file that contains the container registry credentials and …", you can also use the following way to create the vsystem-registry-secret.txt file:

# # determine registry's hostname from its route
# hostname="$(oc get route -n "${NAMESPACE:-sdi-observer}" container-image-registry -o jsonpath='{.spec.host}')"
# oc get -o json -n "${NAMESPACE:-sdi-observer}" secret/container-image-registry-htpasswd | \
    jq -r '.data[".htpasswd.raw"] | @base64d | gsub("\\s+"; "") | split(":") |
        [{"username":.[0], "password":.[1], "address":"'"$hostname"'"}]' | \
    json2yaml > vsystem-registry-secret.txt

NOTE: the json2yaml binary from the remarshal project must be installed on the Management host in addition to jq.

4.3. Managing SDI Observer

4.3.1. Viewing and changing the current configuration

View the current configuration of SDI Observer:

# oc set env --list -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer

Change the settings:

  • it is recommended to modify the run-observer-template.sh and re-run it
  • it is also possible to set the desired parameter directly without triggering an image build:

    # # instruct the observer to schedule SDI pods only on the matching nodes
    # oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer SDI_NODE_SELECTOR="node-role.kubernetes.io/sdi="
    

4.3.2. Re-deploying SDI Observer

Re-deployment is useful in the following cases:

  • SDI Observer shall be updated to the latest release.
  • SDI has been uninstalled and its namespace deleted and/or re-created.
  • Parameter being reflected in multiple resources (not just in the DeploymentConfig) needs to be changed (e.g. OCP_MINOR_RELEASE)
  • Different SDI instance in another namespace shall be observed.

Before updating to the latest SDI Observer code, please be sure to check the Update instructions.

NOTE: Re-deployment preserves generated secrets and persistent volumes unless REPLACE_SECRETS or REPLACE_PERSISTENT_VOLUMES are true.

  1. Backup the previous run-observer-template.sh script and open it as long as available. If not available, run the following to see the previous environment variables:

    # oc set env --list dc/sdi-observer -n "${NAMESPACE:-sdi-observer}"
    
  2. Download the run script from git repository like this:

    # curl -O https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/observer/run-observer-template.sh
    
  3. Edit the downloaded run-observer-template.sh file in your favorite editor. Especially, mind the FLAVOUR, NAMESPACE, SDI_NAMESPACE and OCP_MINOR_RELEASE parameters. Compare it against the old run-observer-template.sh or against the output of oc set env --list dc/sdi-observer and update the parameters accordingly.

  4. Run it in bash like this:

    # bash ./run-observer-template.sh
    
  5. Keep the modified script around in case of updates.

5. Install SDI on OpenShift

5.1. Install Software Lifecycle Container Bridge

Please follow the official documentation (3.1) / (3.0).

5.1.1. Important Parameters

Parameter | Condition | Description
Mode | Always | Make sure to choose the Expert Mode.
Address of the Container Image Repository | Always | This is the Host value of the container-image-registry route in the observer namespace if the registry is deployed by SDI Observer.
Image registry username | if … | The value recorded in the SDI_REGISTRY_HTPASSWD_SECRET_NAME secret if using the registry deployed with SDI Observer.
Image registry password | if … | ditto
Namespace of the SLC Bridge | Always | If you override the default (sap-slcbridge), make sure to deploy SDI Observer with the corresponding SLCB_NAMESPACE value.
Service Type | SLC Bridge Base installation | On vSphere, make sure to use NodePort. On AWS, please use LoadBalancer.
Cluster No Proxy | Required in conjunction with the HTTPS Proxy value | Make sure to extend it with additional mandatory entries.

If the registry requires authentication. The one deployed with SDI Observer does.
Make sure to include at least the entries located in OCP cluster's proxy settings.

# # get the internal OCP cluster's NO_PROXY settings
# noProxy="$(oc get -o jsonpath='{.status.noProxy}' proxy/cluster)"; echo "$noProxy"
.cluster.local,.local,.nip.io,.ocp.vslen,.sap.corp,.svc,10.0.0.0/16,10.128.0.0/14,10.17.69.0/23,127.0.0.1,172.30.0.0/16,192.168.0.0/16,api-int.morrisville.ocp.vslen,etcd-0.morrisville.ocp.vslen,etcd-1.morrisville.ocp.vslen,etcd-2.morrisville.ocp.vslen,localhost,lu0602v0,registry.redhat.io

For more details, please refer to Configuring the cluster-wide proxy (4.6) / (4.4)

NOTE: SLC Bridge service cannot be used via routes (Ingress Operator) as of now. Doing so will result in timeouts. This will be addressed in the future. For now, one must use either the NodePort or LoadBalancer service directly.

On vSphere, in order to access the slcbridgebase-service NodePort service, one needs to have either direct access to one of the SDI Compute nodes or modify the external load balancer to add an additional route to the service.

5.1.2. Install SLC Bridge

Please install SLC Bridge according to Making the SLC Bridge Base available on Kubernetes (3.1) / (3.0) while paying attention to the notes on the installation parameters.

5.1.2.1. Using an external load balancer to access SLC Bridge's NodePort

NOTE: applicable only when "Service Type" was set to "NodePort".

Once the SLC Bridge is deployed, its NodePort shall be determined in order to point the load balancer at it.

# oc get svc -n "${SLCB_NAMESPACE:-sap-slcbridge}" slcbridgebase-service -o jsonpath='{.spec.ports[0].nodePort}{"\n"}'
31875

The load balancer shall point at all the compute nodes running SDI workload. The following is an example for HAProxy sw load balancer:

# # in the example, the <cluster_name> is "boston" and <base_domain> is "ocp.vslen"
# cat /etc/haproxy/haproxy.cfg
....
frontend        slcb
    bind        *:9000
    mode        tcp
    option      tcplog
    # # commented blocks are useful for multiple OCP clusters or multiple SLC Bridge services
    #tcp-request inspect-delay      5s
    #tcp-request content accept     if { req_ssl_hello_type 1 }

    use_backend  boston-slcb       #if { req_ssl_sni -m end -i boston.ocp.vslen  }
    #use_backend raleigh-slcb      #if { req_ssl_sni -m end -i raleigh.ocp.vslen }

backend         boston-slcb
    balance     source
    mode        tcp
    server      sdi-worker1        sdi-worker1.boston.ocp.vslen:31875   check
    server      sdi-worker2        sdi-worker2.boston.ocp.vslen:31875   check
    server      sdi-worker3        sdi-worker3.boston.ocp.vslen:31875   check

backend         raleigh-slcb
....

The SLC Bridge can then be accessed at the URL https://boston.ocp.vslen:9000/docs/index.html as long as boston.ocp.vslen resolves correctly to the load balancer's IP.

5.2. SDI Installation Parameters

Please follow SAP's guidelines on configuring the SDI while paying attention to the following additional comments:

Name Condition Recommendation
Kubernetes Namespace Always Must match the project name chosen in the Project Setup (e.g. sdi)
Installation Type Installation or Update Choose Advanced Installation if you need to specify you want to choose particular storage class or there is no default storage class (4.4) set or you want to deploy multiple SDI instances on the same cluster.
Container Image Repository Installation Must be set to the container image registry.
Backup Configuration Installation or Upgrade from a system in which backups are not enabled For a production environment, please choose yes.
Checkpoint Store Configuration Installation Recommended for production deployments. If backup is enabled, it is enabled by default.
Checkpoint Store Type If Checkpoint Store Configuration parameter is enabled. Set to S3 compatible object store if using for example OCS or NetApp StorageGRID as the object storage. See Using NooBaa as object storage gateway or NetApp StorageGRID for more details.
Disable Certificate Validation If Checkpoint Store Configuration parameter is enabled. Please choose yes if using HTTPS for your object storage endpoint secured with a certificate signed by a self-signed CA. For OCS NooBaa, you can set it to no.
Checkpoint Store Validation Installation Please make sure to validate the connection during the installation time. Otherwise in case an incorrect value is supplied, the installation will fail at a later point.
Container Registry Settings for Pipeline Modeler Advanced Installation Shall be changed if the same registry is used for more than one SAP Data Intelligence instance. Either another <registry> or a different <prefix> or both will do.
StorageClass Configuration Advanced Installation Configure this if you want to choose different dynamic storage provisioners for different SDI components or if there's no default storage class (4.6) / (4.4) set or you want to choose non-default storage class for the SDI components.
Default StorageClass Advanced Installation and if storage classes are configured Set this if there's no default storage class (4.6) / (4.4) set or you want to choose non-default storage class for the SDI components.
Enable Kaniko Usage Advanced Installation Must be enabled on OCP 4.
Container Image Repository Settings for SAP Data Intelligence Modeler Advanced Installation or Upgrade If using the same registry for multiple SDI instances, choose "yes".
Container Registry for Pipeline Modeler Advanced Installation and if "Use different one" option is selected in the previous selection. If using the same registry for multiple SDI instances, it is required to use either different prefix (e.g. local.image.registry:5000/mymodelerprefix2) or a different registry.
Loading NFS Modules Advanced Installation Feel free to say "no". This is no longer of concern as long as the loading of the needed kernel modules has been configured.
Additional Installer Parameters Advanced Installation (optional) Useful for reducing the minimum memory requirements of the HANA pod and much more.

Note that the validated S3 API endpoint providers are OCS' NooBaa 4.6.4 or newer, OCS 4.6 in external mode and NetApp StorageGRID.

5.3. Project setup

It is assumed the sdi project has been already created during SDI Observer's prerequisites.

Login to OpenShift as a cluster-admin, and perform the following configurations for the installation:

# # change to the SDI_NAMESPACE project using: oc project "${SDI_NAMESPACE:-sdi}"
# oc adm policy add-scc-to-group anyuid "system:serviceaccounts:$(oc project -q)"
# oc adm policy add-scc-to-user privileged -z "$(oc project -q)-elasticsearch"
# oc adm policy add-scc-to-user privileged -z "$(oc project -q)-fluentd"
# oc adm policy add-scc-to-user privileged -z default
# oc adm policy add-scc-to-user privileged -z mlf-deployment-api
# oc adm policy add-scc-to-user privileged -z vora-vflow-server
# oc adm policy add-scc-to-user privileged -z "vora-vsystem-$(oc project -q)"
# oc adm policy add-scc-to-user privileged -z "vora-vsystem-$(oc project -q)-vrep"

5.4. Install SDI

Please follow the official procedure according to Install using SLC Bridge in a Kubernetes Cluster with Internet Access (3.1) / (3.0).

5.5. SDI Post installation steps

5.5.1. (Optional) Expose SDI services externally

There are multiple possibilities for making SDI services accessible outside of the cluster. Compared to Kubernetes, OpenShift offers an additional method, which is recommended for most scenarios including the SDI System Management service. It is based on the OpenShift Ingress Operator (4.6) / (4.4).

For SAP Vora Transaction Coordinator and SAP HANA Wire, please use the official suggested method available to your environment (3.1) / (3.0).

5.5.1.1. Using OpenShift Ingress Operator

NOTE: Instead of using this manual approach, it is now recommended to let the SDI Observer manage the route creation and updates. If the SDI Observer has been deployed with MANAGE_VSYSTEM_ROUTE, this section can be skipped. To enable it after the SDI Observer has already been deployed, please execute the following:

# oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer MANAGE_VSYSTEM_ROUTE=true
# # wait for the observer to get re-deployed
# oc rollout status -n "${NAMESPACE:-sdi-observer}" -w dc/sdi-observer

Otherwise, please continue with the manual route creation.

OpenShift allows you to access the Data Intelligence services via Ingress Controllers (4.6) / (4.4) as opposed to regular NodePorts (4.6) / (4.4). For example, instead of accessing the vsystem service via https://worker-node.example.com:32322, after the service exposure, you will be able to access it at https://vsystem-sdi.apps.<cluster_name>.<base_domain>. This is an alternative to the official documentation to Expose the Service On Premise (3.1) / (3.0).

There are two kinds of routes secured with TLS. The reencrypt kind allows for a custom signed or self-signed certificate to be used. The other is the passthrough kind, which uses the pre-installed certificate generated by the installer or passed to the installer.

5.5.1.1.1. Export services with a reencrypt route

With this kind of route, different certificates are used on client and service sides of the route. The router stands in the middle and re-encrypts the communication coming from either side using a certificate corresponding to the opposite side. In this case, the client side is secured by a provided certificate and the service side is encrypted with the original certificate generated or passed to the SAP Data Intelligence installer. This is the same kind of route SDI Observer creates automatically.

The reencrypt route allows for securing the client connection with a proper signed certificate.

  1. Look up the vsystem service:

    # oc project "${SDI_NAMESPACE:-sdi}"            # switch to the Data Intelligence project
    # oc get services | grep "vsystem "
    vsystem   ClusterIP   172.30.227.186   <none>   8797/TCP   19h
    

    When exported, the resulting hostname will look like vsystem-${SDI_NAMESPACE}.apps.<cluster_name>.<base_domain>. However, an arbitrary hostname can be chosen instead as long as it resolves correctly to the IP of the router.

  2. Get, generate or use the default certificates for the route. In this example, the default self-signed certificate used by router is used to secure the connection between the client and OCP's router. The CA certificate for clients can be obtained from the router-ca secret located in the openshift-ingress-operator namespace:

    # oc get secret -n openshift-ingress-operator -o json router-ca | \
        jq -r '.data as $d | $d | keys[] | select(test("\\.crt$")) | $d[.] | @base64d' >router-ca.crt
    
  3. Obtain the SDI's root certificate authority bundle generated at the SDI's installation time. The generated bundle is available in the ca-bundle.pem secret in the sdi namespace.

    # oc get -n "${SDI_NAMESPACE:-sdi}" -o go-template='{{index .data "ca-bundle.pem"}}' \
        secret/ca-bundle.pem | base64 -d >sdi-service-ca-bundle.pem
    
  4. Create the reencrypt route for the vsystem service like this:

    # oc create route reencrypt -n "${SDI_NAMESPACE:-sdi}" --dry-run -o json \
            --dest-ca-cert=sdi-service-ca-bundle.pem --service vsystem \
            --insecure-policy=Redirect | \
        oc annotate --local -o json -f - haproxy.router.openshift.io/timeout=2m | \
        oc apply -f -
    # oc get route
    NAME      HOST/PORT                                                  SERVICES  PORT      TERMINATION         WILDCARD
    vsystem   vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain>  vsystem   vsystem   reencrypt/Redirect  None
    
  5. Verify the connection:

    # # use the HOST/PORT value obtained from the previous command instead
    # curl --cacert router-ca.crt https://vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain>/
    
5.5.1.1.2. Export services with a passthrough route

With the passthrough route, the communication is encrypted by the SDI service's certificate all the way to the client.

NOTE: If possible, please prefer the reencrypt route because the hostname of the vsystem certificate cannot be verified by clients, as can be seen in the following output:

# oc get -n "${SDI_NAMESPACE:-sdi}" -o go-template='{{index .data "ca-bundle.pem"}}' \
    secret/ca-bundle.pem | base64 -d >sdi-service-ca-bundle.pem
# openssl x509 -noout -subject -in sdi-service-ca-bundle.pem
subject=C = DE, ST = BW, L = Walldorf, O = SAP, OU = Data Hub, CN = SAPDataHub

  1. Look up the vsystem service:

    # oc project "${SDI_NAMESPACE:-sdi}"            # switch to the Data Intelligence project
    # oc get services | grep "vsystem "
    vsystem   ClusterIP   172.30.227.186   <none>   8797/TCP   19h
    
  2. Create the route:

    # oc create route passthrough --service=vsystem --insecure-policy=Redirect
    # oc get route
    NAME      HOST/PORT                                                  PATH  SERVICES  PORT      TERMINATION           WILDCARD
    vsystem   vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain>        vsystem   vsystem   passthrough/Redirect  None
    

    You can modify the hostname with the --hostname parameter (see the example after this list). Make sure it resolves to the router's IP.

  3. Access the System Management service at https://vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain> to verify.
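
For illustration, a passthrough route with a custom hostname (the hostname below is just an example following the naming used earlier in this guide) could be created like this:

# oc create route passthrough --service=vsystem --insecure-policy=Redirect \
    --hostname=vsystem.apps.boston.ocp.vslen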

5.5.1.2. Using NodePorts

NOTE: For OpenShift, exposure using routes is preferred, although it is only possible for the System Management service (aka vsystem).

Exposing SAP Data Intelligence vsystem

  • Either with an auto-generated node port:

    # oc expose service vsystem --type NodePort --name=vsystem-nodeport --generator=service/v2
    # oc get -o jsonpath='{.spec.ports[0].nodePort}{"\n"}' services vsystem-nodeport
    30617
    
  • Or with a specific node port (e.g. 32123):

    # oc expose service vsystem --type NodePort --name=vsystem-nodeport --generator=service/v2 --dry-run -o yaml | \
        oc patch -p '{"spec":{"ports":[{"port":8797, "nodePort": 32123}]}}' --local -f - -o yaml | oc apply -f -
    

The original service remains accessible on the same ClusterIP:Port as before. Additionally, it is now accessible from outside of the cluster under the node port.
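
For example, the exposed service can be quickly checked from outside of the cluster with curl (the compute node hostname and the node port below are illustrative):

# curl -k https://sdi-worker1.boston.ocp.vslen:32123/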

Exposing SAP Vora Transaction Coordinator and HANA Wire

# oc expose service vora-tx-coordinator-ext --type NodePort --name=vora-tx-coordinator-nodeport --generator=service/v2
# oc get -o jsonpath='tx-coordinator:{"\t"}{.spec.ports[0].nodePort}{"\n"}hana-wire:{"\t"}{.spec.ports[1].nodePort}{"\n"}' \
    services vora-tx-coordinator-nodeport
tx-coordinator: 32445
hana-wire:      32192

The output shows the generated node ports for the newly exposed services.

5.5.2. Configure the Connection to Data Lake

Please follow the official post-installation instructions at Configure the Connection to DI_DATA_LAKE (3.1) / (3.0).

In case the OCS is used as a backing object storage provider, please make sure to use the HTTP service endpoint as documented in Using NooBaa or RADOS Object Gateway S3 endpoint as object storage.

Based on the example output in that section, the configuration may look like this:

Parameter Value
Connection Type SDL
Id DI_DATA_LAKE
Object Storage Type S3
Endpoint http://s3.openshift-storage.svc.cluster.local
Access Key ID cOxfi4hQhGFW54WFqP3R
Secret Access Key rIlvpcZXnonJvjn6aAhBOT/Yr+F7wdJNeLDBh231
Root Path sdi-data-lake-f86a7e6e-27fb-4656-98cf-298a572f74f3
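
If the bucket has been provisioned through an ObjectBucketClaim (as described in the referenced section), the values can typically be read from the ConfigMap and Secret created alongside the claim. The following is a sketch assuming a claim named sdi-data-lake living in the SDI namespace; the actual name and namespace depend on how the claim was created:

# # the claim name "sdi-data-lake" and its namespace are assumptions; adjust to your environment
# oc get configmap sdi-data-lake -n "${SDI_NAMESPACE:-sdi}" -o jsonpath='{.data.BUCKET_NAME}{"\n"}'
# oc get secret sdi-data-lake -n "${SDI_NAMESPACE:-sdi}" -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d; echo
# oc get secret sdi-data-lake -n "${SDI_NAMESPACE:-sdi}" -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d; echo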

5.5.3. SDI Validation

Validate SDI installation on OCP to make sure everything works as expected. Please follow the instructions in Testing Your Installation (3.1) / (3.0).

5.5.3.1. Log On to SAP Data Intelligence Launchpad

In case the vsystem service has been exposed using a route, the URL can be determined like this:

# oc get route -n "${SDI_NAMESPACE:-sdi}"
NAME      HOST/PORT                                                  SERVICES  PORT      TERMINATION  WILDCARD
vsystem   vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain>  vsystem   vsystem   reencrypt    None

The HOST/PORT value needs to be then prefixed with https://, for example:

https://vsystem-sdi.apps.boston.ocp.vslen

5.5.3.2. Check Your Machine Learning Setup

In order to upload training and test datasets using the ML Data Manager, the user needs to be assigned the sap.dh.metadata policy. Please make sure to follow Using SAP Data Intelligence Policy Management (3.1) / (3.0) to assign the policies to the users that need them.

5.5.4. Configuration of additional tenants

When a new tenant is created, for example using the Manage Clusters instructions (3.1) / (3.0), it is not configured to work with the container image registry. Therefore, the Pipeline Modeler is unusable and will fail to start until configured.

There are two steps that need to be performed for each new tenant:

  • import CA certificate for the registry via SDI Connection Manager if the CA certificate is self-signed
  • create and import credential secret using the SDI System Management and update the modeler secret if the container image registry requires authentication

If the SDI Registry deployed by the SDI Observer is used, please follow the SDI Observer Registry tenant configuration. Otherwise, please make sure to execute the official instructions in the following articles according to your registry configuration:

6. OpenShift Container Platform Upgrade

This section is useful as a guide for performing OCP upgrades to the latest asynchronous release of the same minor version or to the newer minor release supported by the running SDI instance without upgrading SDI itself.

6.1. Pre-upgrade procedures

  1. Before upgrading the cluster to a release equal to or newer than 4.3, make sure to upgrade SDI at least to release 3.0 Patch 3 by following the SAP Data Hub Upgrade procedures - starting from pre-upgrade without performing the steps marked with (ocp-upgrade).
  2. Make yourself familiar with the OpenShift's upgrade guide (4.2 ⇒ 4.3) / (4.3 ⇒ 4.4) / (4.4 ⇒ 4.5) / (4.5 ⇒ 4.6).
  3. Plan for SDI downtime.
  4. Make sure to re-configure SDI compute nodes.
  5. (OCP 4.2 only) Pin vsystem-vrep to the current node

6.1.1. Stop SAP Data Intelligence

In order to speed up the cluster upgrade and/or to ensure SDI's consistency, it is possible to stop the SDI before performing the upgrade.

The procedure is outlined in the official Administration Guide (3.1) / (3.0). However, please note that the command described there is erroneous as of December 2020. Please execute it this way:

# oc -n "${SDI_NAMESPACE}" patch datahub default --type='json' -p '[
    {"op":"replace","path":"/spec/runLevel","value":"Stopped"}]'

6.2. Upgrade OCP

The following instructions outline a process of OCP upgrade to a minor release 2 versions higher than the current one. If only an upgrade to the latest asynchronous release of the same minor version is desired, please skip steps 5 and 6.

  1. Upgrade OCP to a higher minor release or the latest asynchronous release(⇒ 4.3) / (⇒ 4.5).
  2. If having OpenShift Container Storage deployed, update OCS to the latest supported release for the current OCP release according to the interoperability matrix.
  3. Update OpenShift client tools on the Management host to match the target OCP release. On RHEL 8.2, one can do it like this:

    # current=4.2; new=4.4
    # sudo subscription-manager repos \
        --disable=rhocp-${current}-for-rhel-8-x86_64-rpms --enable=rhocp-${new}-for-rhel-8-x86_64-rpms
    # sudo dnf update -y openshift-clients
    
  4. Update SDI Observer to use the OCP client tools matching the target OCP release by following Re-Deploying SDI Observer while reusing the previous parameters.

  5. Upgrade OCP to a higher minor release or the latest asynchronous release (⇒ 4.4) / (⇒ 4.6).
  6. If having OpenShift Container Storage deployed, update OCS to the latest supported release for the current OCP release according to the interoperability matrix.

For the initial OCP release 4.X, the target release is 4.(X+2); if performing just the latest asynchronous release upgrade, the target release is 4.X.

6.3. Post-upgrade procedures

  1. Start SAP Data Intelligence as outlined in the official Administration Guide (3.1) / (3.0). However, please note the command as described there is erroneous as of December 2020. Please execute it this way:

    # oc -n "${SDI_NAMESPACE}" patch datahub default --type='json' -p '[
        {"op":"replace","path":"/spec/runLevel","value":"Started"}]'
    
  2. (OCP 4.2 (initial) only) Unpin vsystem-vrep from the current node

7. SAP Data Intelligence Upgrade or Update

NOTE This section covers both an upgrade from SAP Data Hub 2.7 and an upgrade of SAP Data Intelligence to a newer minor, micro or patch release. Sections related only to the former or the latter will be annotated with the following annotations:

  • (DH-upgrade) to denote a section specific to an upgrade from Data Hub 2.7 to Data Intelligence 3.0
  • (DI-upgrade) to denote a section specific to an upgrade from Data Intelligence to a newer minor release (3.X ⇒ 3.(X+1))
  • (update) to denote a section specific to an update of Data Intelligence to a newer micro/patch release (3.X.Y ⇒ 3.X.(Y+1))
  • annotation-free are sections relating to any upgrade or update procedure

The following steps must be performed in the given order. Unless an OCP upgrade is needed, the steps marked with (ocp-upgrade) can be skipped.

7.1. Pre-upgrade or pre-update procedures

  1. Make sure to get familiar with the official SAP Upgrade guide (3.0 ⇒ 3.1) / (DH 2.7 ⇒ 3.0).
  2. (ocp-upgrade) Make yourself familiar with the OpenShift's upgrade guide (4.2 ⇒ 4.3) / (4.3 ⇒ 4.4) / (4.4 ⇒ 4.5) / (4.5 ⇒ 4.6).
  3. Plan for a downtime.
  4. Make sure to re-configure SDI compute nodes.
  5. Pin vsystem-vrep to the current node only when having OCP 4.2.

7.1.1. (DH-upgrade) Container image registry preparation

Unlike SAP Data Hub, SAP Data Intelligence requires a secured container image registry. A plain HTTP connection cannot be used anymore.

There are the following options to satisfy this requirement:

  • The registry used by SAP Data Hub is already accessible over HTTPS and its serving TLS certificates have been signed by a trusted certificate authority. In this case, the rest of this section can be skipped until Execute SDI's Pre-Upgrade Procedures.
  • The registry used by SAP Data Hub is already accessible or will be made accessible over HTTPS, but its serving TLS certificate is not signed by a trusted certificate authority. In this case, one of the following must be performed unless already done:

    The rest of this section can then be skipped.

  • A new registry shall be used.

In the last case, please refer to Container Image Registry prerequisite for more details. Also note that the provisioning of the registry can be done by SDI Observer deployed in the subsequent step.

NOTE: the newly deployed registry must contain all the images used by the current SAP Data Hub release as well in order for the upgrade to succeed. There are multiple ways to accomplish this, for example, on the Jump host, execute one of the following:

  • using the manual installation method of SAP Data Hub, one can invoke the install.sh script with the following arguments:

    • --prepare-images to cause the script to just mirror the images to the desired registry and terminate immediately afterwards
    • --registry HOST:PORT to point the script to the newly deployed registry
  • inspect the currently running containers in the SDH project and copy their images directly from the old local registry to the new one (without SAP registry being involved); it can be performed on the Jump host in bash; in the following example, jq, podman and skopeo binaries are assumed to be available:

    # export OLD_REGISTRY=local.image.registry:5000
    # export NEW_REGISTRY=HOST:PORT
    # SDH_NAMESPACE=sdh
    # # login to the old registry using either docker or podman if it requires authentication
    # podman login --tls-verify=false -u username $OLD_REGISTRY
    # # login to the new registry using either docker or podman if it requires authentication
    # podman login --tls-verify=false -u username $NEW_REGISTRY
    # function mirrorImage() {
        local src="$1"
        local dst="$NEW_REGISTRY/${src#*/}"
        skopeo copy --src-tls-verify=false --dest-tls-verify=false "docker://$src" "docker://$dst"
    }
    # export -f mirrorImage
    # # get the list of source images to copy
    # images="$(oc get pods -n "${SDH_NAMESPACE:-sdh}" -o json | jq -r '.items[] | . as $ps |
        [$ps.spec.containers[] | .image] + [($ps.spec.initContainers // [])[] | .image] |
        .[]' | grep -F "$OLD_REGISTRY" | sort -u)"
    # # more portable way to copy the images (up to 6 in parallel) using GNU xargs
    # xargs -n 1 -r -P 6 -i /bin/bash -c 'mirrorImage {}' <<<"${images:-}"
    # # an alternative way using GNU Parallel
    # parallel -P 6 --lb mirrorImage <<<"${images:-}"
    

7.1.2. Execute SDI's Pre-Upgrade Procedures

Please follow the official Pre-Upgrade procedures (3.0 ⇒ 3.1) / (DH 2.7 ⇒ 3.0).

7.1.2.1. (upgrade) Manual route removal

If you exposed the vsystem service using routes, delete the route:

# # note the hostname in the output of the following command
# oc get route -n "${SDI_NAMESPACE:-sdi}"
# # delete the route
# oc delete route -n "${SDI_NAMESPACE:-sdi}" --all

7.1.2.2. (update) Automated route removal

SDI Observer now allows managing the creation and updates of the vsystem route for external access. It takes care of updating the route's destination certificate during SDI's update. It can also be instructed to keep the route deleted, which is useful during SDI updates. If the SDI Observer is of version 0.1.0 or higher, you can instruct it to delete the route like this:

  1. ensure SDI Observer version is 0.1.0 or higher:

    # oc label -n "${NAMESPACE:-sdi-observer}" --list dc/sdi-observer | grep sdi-observer/version
    sdi-observer/version=0.1.0
    

    if there is no output or the version is lower, please follow the Manual route removal instead.

  2. ensure SDI Observer is managing the route already:

    # oc set env -n "${NAMESPACE:-sdi-observer}" --list dc/sdi-observer | grep MANAGE_VSYSTEM_ROUTE
    MANAGE_VSYSTEM_ROUTE=true
    

    if there is no output or MANAGE_VSYSTEM_ROUTE is not one of true, yes or 1, please follow the Manual route removal instead.

  3. instruct the observer to keep the route removed:

    # oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer MANAGE_VSYSTEM_ROUTE=removed
    # # wait for the observer to get re-deployed
    # oc rollout status -n "${NAMESPACE:-sdi-observer}" -w dc/sdi-observer
    

7.1.3. (ocp-upgrade) Upgrade OpenShift

At this time, depending on target SDI release, OCP cluster must be upgraded either to a newer minor release or to the latest asynchronous release for the current minor release.

Current SDI release Target SDI release Desired and validated OCP Releases
3.0 3.1 4.4 (latest)
DH 2.7 3.0 4.2 (latest)

Make sure to follow the official upgrade instructions (4.4) / (4.2).

Please also update the OpenShift client tools on the Management host. The example below can be used on RHEL 8.

    # current=4.2; new=4.4
    # sudo subscription-manager repos \
        --disable=rhocp-${current}-for-rhel-8-x86_64-rpms --enable=rhocp-${new}-for-rhel-8-x86_64-rpms
    # sudo dnf update -y openshift-clients

7.1.4. Deploy or update SDI Observer

Please execute one of the subsections below. Unless an upgrade of Data Hub is performed, please choose to update SDI Observer.

7.1.4.1. (DH-upgrade) Deploying SDI Observer for the first time

If the current SDH Observer is deployed in a different namespace than SDH's namespace, it must be deleted manually. The easiest way is to delete the project unless shared with other workloads. If it shares the namespace of SDH, no action is needed - it will be deleted automatically.

Please follow the instructions in SDI Observer section to deploy it while paying attention to the following:

  • SDI Observer shall be located in a different namespace than SAP Data Hub and Data Intelligence (e.g. sdi-observer).
  • SDI_NAMESPACE shall be set to the namespace where SDH is currently running

7.1.4.2. Updating SDI Observer

Please follow Re-deploying SDI Observer to update the observer. Please make sure to set MANAGE_VSYSTEM_ROUTE to removed until the SDI update is finished.
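
If the observer is already deployed and managing the route, this can be done the same way as in the automated route removal section above:

# oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer MANAGE_VSYSTEM_ROUTE=removed
# # wait for the observer to get re-deployed
# oc rollout status -n "${NAMESPACE:-sdi-observer}" -w dc/sdi-observer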

7.1.5. (DH-upgrade) Prepare SDH/SDI Project

SAP Data Hub running in a particular project/namespace on OCP cluster will be substituted by SAP Data Intelligence in the same project/namespace. The existing project must be modified in order to host the latter.

Grant the needed security context constraints to the new service accounts by executing the commands from the project setup. NOTE: Re-running commands that have already been run will do no harm.

(OCP 4.2 only) To be able to amend the potential volume attachment problems, make sure to dump a mapping between the SDH pods and nodes they run on:

# oc get pods -n "${SDH_NAMESPACE:-sdh}" -o wide >sdh-pods-pre-upgrade.out

(optional) If an object storage provided by OCS is available, a new storage bucket can be created for the SDL Data Lake connection (3.0). Please follow Creating an S3 bucket using CLI section. Note that the existing checkpoint store bucket used by SAP Data Hub will continue to be used by SAP Data Intelligence if configured.

7.2. Update or Upgrade SDH or SDI

7.2.1. Update Software Lifecycle Container Bridge

Please follow the official documentation (3.1) / (3.0) to obtain the binary and perform the following steps:

  1. If exposed via a load-balancer, make sure to note down the current service port and node port:

    # oc get -o jsonpath='{.spec.ports[0].nodePort}{"\n"}' -n sap-slcbridge \
        svc/slcbridgebase-service
    31555
    
  2. Once the binary is available on the Management host, execute it as slcb init and choose Update when prompted for a deployment option.

  3. If exposed via a load-balancer, re-set the nodePort to the previous value so no changes on load-balancer side are necessary.

    # nodePort=31555    # change your value to the desired one
    # oc patch --type=json -n sap-slcbridge svc/slcbridgebase-service -p '[{
        "op":"add", "path":"/spec/ports/0/nodePort","value":'"$nodePort"'}]'
    

7.2.2. (DH-upgrade) Upgrade SAP Data Hub to SAP Data Intelligence

Execute the SDH or SDI upgrade according to the official instructions (DH 2.7 ⇒ 3.0).

Please be aware of the potential issue during the upgrade when using OCS 4 as the storage provider.

7.2.3. (DI-upgrade) Upgrade SAP Data Intelligence to a newer minor release

Execute the SDI upgrade according to the official instructions (3.0 ⇒ 3.1).

7.3. (ocp-upgrade) Upgrade OpenShift

Depending on the target SDI release, OCP cluster must be upgraded either to a newer minor release or to the latest asynchronous release for the current minor release.

Upgraded/Current SDI release Desired and validated OCP Releases
3.1 4.6
3.0 4.4

If the current OCP release is two or more releases behind the desired, OCP cluster must be upgraded iteratively to each successive minor release until the desired one is reached.

  1. (optional) Stop the SAP Data Intelligence as it will speed up the cluster update and ensure SDI's consistency.
  2. Make sure to follow the official upgrade instructions for your upgrade path:

  3. (optional) Start the SAP Data Intelligence again if stopped earlier in step 1).

  4. Upgrade OpenShift client tools on the Management host. The example below can be used on RHEL 8:

    # current=4.4; new=4.6
    # sudo subscription-manager repos \
        --disable=rhocp-${current}-for-rhel-8-x86_64-rpms --enable=rhocp-${new}-for-rhel-8-x86_64-rpms
    # sudo dnf update -y openshift-clients
    

7.4. SAP Data Intelligence Post-Upgrade Procedures

  1. Execute the Post-Upgrade Procedures for the SDH (3.1) / (3.0).

  2. Re-create the route for the vsystem service using one of the following methods:

    • (recommended) instruct SDI Observer to manage the route:

      # oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer MANAGE_VSYSTEM_ROUTE=true
      # # wait for the observer to get re-deployed
      # oc rollout status -n "${NAMESPACE:-sdi-observer}" -w dc/sdi-observer
      
    • follow Expose SDI services externally to recreate the route manually from scratch

  3. (DH-upgrade) Unpin vsystem-vrep from the current node

7.5. Validate SAP Data Intelligence

Validate SDI installation on OCP to make sure everything works as expected. Please follow the instructions in Testing Your Installation (3.1) / (3.0).

8. Appendix

8.1. SDI uninstallation

Please follow the SAP documentation Uninstalling SAP Data Intelligence using the SLC Bridge (3.1) / (3.0).

Additionally, make sure to delete the sdi project, e.g.:

# oc delete project sdi

NOTE: With this, SDI Observer loses permissions to view and modify resources in the deleted namespace. If a new SDI installation shall take place, SDI observer needs to be re-deployed.

Optionally, one can also delete SDI Observer's namespace, e.g.:

# oc delete project sdi-observer

NOTE: this will also delete the container image registry if deployed using SDI Observer which means the mirroring needs to be performed again during a new installation. If SDI Observer (including the registry and its data) shall be preserved for the next installation, please make sure to re-deploy it once the sdi project is re-created.

When done, you may continue with a new installation round in the same or another namespace.

8.2. Configure OpenShift to trust container image registry

If the registry's certificate is signed by a self-signed certificate authority, one must make OpenShift aware of it.

If the registry runs on the OpenShift cluster itself and is exposed via a reencrypt or edge route with the default TLS settings (no custom TLS certificates set), the CA certificate used is available in the secret router-ca in openshift-ingress-operator namespace.

To make the registry exposed via such a route trusted, set the route's hostname in the registry variable and execute the following code in bash:

# registry="local.image.registry:5000"
# caBundle="$(oc get -n openshift-ingress-operator -o json secret/router-ca | \
    jq -r '.data as $d | $d | keys[] | select(test("\\.(?:crt|pem)$")) | $d[.] | @base64d')"
# # determine the name of the CA configmap if it exists already
# cmName="$(oc get images.config.openshift.io/cluster -o json | \
    jq -r '.spec.additionalTrustedCA.name // "trusted-registry-cabundles"')"
# if oc get -n openshift-config "cm/$cmName" 2>/dev/null; then
    # configmap already exists -> just update it
    oc get -o json -n openshift-config "cm/$cmName" | \
        jq '.data["'"${registry//:/..}"'"] |= "'"$caBundle"'"' | \
        oc replace -f - --force
  else
      # creating the configmap for the first time
      oc create configmap -n openshift-config "$cmName" \
          --from-literal="${registry//:/..}=$caBundle"
      oc patch images.config.openshift.io cluster --type=merge \
          -p '{"spec":{"additionalTrustedCA":{"name":"'"$cmName"'"}}}'
  fi

If using a registry running outside of OpenShift or not secured by the default ingress CA certificate, take a look at the official guideline at Configuring a ConfigMap for the Image Registry Operator (4.6) / (4.4).

To verify that the CA certificate has been deployed, execute the following and check whether the supplied registry name appears among the file names in the output:

# oc rsh -n openshift-image-registry "$(oc get pods -n openshift-image-registry -l docker-registry=default | \
        awk '/Running/ {print $1; exit}')" ls -1 /etc/pki/ca-trust/source/anchors
container-image-registry-sdi-observer.apps.boston.ocp.vslen
image-registry.openshift-image-registry.svc..5000
image-registry.openshift-image-registry.svc.cluster.local..5000

If this is not feasible, one can also mark the registry as insecure.

8.3. Configure insecure registry

As a less secure alternative to Configure OpenShift to trust container image registry, the registry may also be marked as insecure, which poses a potential security risk. Please follow Configuring image settings (4.6) / (4.4) and add the registry to the .spec.registrySources.insecureRegistries array. For example:

apiVersion: config.openshift.io/v1
kind: Image
metadata:
  annotations:
    release.openshift.io/create-only: "true"
  name: cluster
spec:
  registrySources:
    insecureRegistries:
    - local.image.registry:5000
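
Alternatively, the same change can be applied non-interactively with a merge patch; note that this replaces the whole insecureRegistries array (the registry below is the example one used throughout this guide):

# oc patch images.config.openshift.io cluster --type=merge \
    -p '{"spec":{"registrySources":{"insecureRegistries":["local.image.registry:5000"]}}}'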

NOTE: it may take tens of minutes until the nodes are reconfigured. You can use the following commands to monitor the progress:

  • watch oc get machineconfigpool
  • watch oc get nodes

8.4. Running multiple SDI instances on a single OCP cluster

Two instances of SAP Data Intelligence running in parallel on a single OCP cluster have been validated. Running more instances is possible, but most probably needs an extra support statement from SAP.

Please consider the following before deploying more than one SDI instance to a cluster:

  • Each SAP Data Intelligence instance must run in its own namespace/project.
  • Each SAP Data Intelligence instance must use a different prefix or container image registry for the Pipeline Modeler. For example, the first instance can configure "Container Registry Settings for Pipeline Modeler" as local.image.registry:5000/sdi30blue and the second as local.image.registry:5000/sdi30green.
  • It is recommended to dedicate particular nodes to each SDI instance.
  • It is recommended to use network policy (4.6) / (4.4) SDN mode for completely granular network isolation configuration and improved security. Check network policy configuration (4.6) / (4.4) for further references and examples. This, however, cannot be changed post OCP installation.
  • If running the production and test (aka blue-green) SDI deployments on a single OCP cluster, mind also the following:
    • There is no way to test an upgrade of OCP cluster before an SDI upgrade.
    • The idle (non-productive) landscape should have the same network security as the live (productive) one.

To deploy a new SDI instance to OCP cluster, please repeat the steps from project setup starting from point 6 with a new project name and continue with SDI Installation.

8.5. Installing remarshal utilities on RHEL

For a few example snippets throughout this guide, either yaml2json or json2yaml scripts are necessary.

They are provided by the remarshal project and shall be installed on the Management host in addition to jq. On RHEL 8.2, one can install it this way:

# sudo dnf install -y python3-pip
# sudo pip3 install remarshal
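
A quick smoke test of the installed scripts may look like this (each command should print the converted document to stdout; the exact output formatting can differ between remarshal versions):

# echo 'cpu: 200m' | yaml2json
# echo '{"cpu": "200m"}' | json2yaml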

8.6. Pin vsystem-vrep to the current node

On OCP 4.2 with the openshift-storage.rbd.csi.ceph.com dynamic storage provisioner used for SDI workload, please make sure to schedule the vsystem-vrep pod on the node where it currently runs in order to prevent A pod is stuck in ContainerCreating phase from happening during an upgrade:

# nodeName="$(oc get pods -n "${SDI_NAMESPACE:-sdi}" vsystem-vrep-0 -o jsonpath='{.spec.nodeName}')"
# oc patch statefulset/vsystem-vrep -n "${SDI_NAMESPACE:-sdi}" \
    --type strategic --patch '{"spec": {"template":
        {"spec": {"nodeSelector": {"kubernetes.io/hostname": "'"${nodeName}"'"}}}
    }}'

To revert the change, please follow Unpin vsystem-vrep from the current node.

To be able to amend other potential volume attachment problems, make sure to dump a mapping between the SDH pods and the nodes they run on:

# oc get pods -n "${SDH_NAMESPACE:-sdh}" -o wide >sdh-pods-pre-upgrade.out

8.7. Unpin vsystem-vrep from the current node

On OCP 4.4, the vsystem-vrep pod no longer needs to be pinned to a particular node in order to prevent A pod is stuck in ContainerCreating phase from occurring.

One can then revert the node pinning with the following command. Note that jq binary is required.

# oc get statefulset/vsystem-vrep -n "${SDI_NAMESPACE:-sdi}" -o json | \
    jq 'del(.spec.template.spec.nodeSelector) | del(.spec.template.spec.affinity.nodeAffinity)' | oc replace -f -

8.8. (footnote ) Upgrading to the next minor release from the latest asynchronous release

If the OCP cluster is subscribed to the stable channel, its latest available micro release for the current minor release may not be upgradable to a newer minor release.

Consider the following example:

  • The OCP cluster is of release 4.5.24.
  • The latest asynchronous release available in stable-4.5 channel is 4.5.30.
  • The latest stable 4.6 release is 4.6.15 (available in stable-4.6 channel).
  • From the 4.5.24 micro release, one can upgrade to one of 4.5.27, 4.5.28, 4.5.30, 4.6.13 or 4.6.15.
  • However, from the 4.5.30 one cannot upgrade to any newer release because no upgrade path has been validated/provided yet in the stable channel.

Therefore, the OCP cluster can get stuck on the 4.5 release if it is first upgraded to the latest asynchronous release 4.5.30 instead of being upgraded directly to one of the 4.6 minor releases. However, at the same time, the fast-4.6 channel contains the 4.6.16 release with an upgrade path from 4.5.30. The 4.6.16 release appears in the stable-4.6 channel sooner or later after being introduced in the fast channel first.

To amend the situation without waiting for an upgrade path to appear in the stable channel:

  1. Temporarily switch to the fast-4.X channel.
  2. Perform the upgrade.
  3. Switch back to the stable-4.X channel.
  4. Continue performing upgrades to the latest micro release available in the stable-4.X channel.
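
For illustration, the channel switch and the upgrade can be performed from the command line as sketched below (using the release numbers from the example above; the web console can be used just as well):

# # switch to the fast channel
# oc patch clusterversion version --type=merge -p '{"spec":{"channel":"fast-4.6"}}'
# # review the newly offered update paths
# oc adm upgrade
# # upgrade to the desired release
# oc adm upgrade --to=4.6.16
# # once the upgrade is finished, switch back to the stable channel
# oc patch clusterversion version --type=merge -p '{"spec":{"channel":"stable-4.6"}}'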

9. Troubleshooting Tips

9.1. Installation or Upgrade problems

9.1.1. Privileged security context unassigned

If there are pods, replicasets, or statefulsets not coming up and you can see an event similar to the one below, you need to add the privileged security context constraint to the corresponding service account.

# oc get events | grep securityContext
1m          32m          23        diagnostics-elasticsearch-5b5465ffb.156926cccbf56887                          ReplicaSet                                                                            Warning   FailedCreate             replicaset-controller                  Error creating: pods "diagnostics-elasticsearch-5b5465ffb-" is forbidden: unable to validate against any security context constraint: [spec.initContainers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.initContainers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.initContainers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]

Copy the name in the fourth column (the event name - diagnostics-elasticsearch-5b5465ffb.156926cccbf56887) and determine its corresponding service account name.

# eventname="diagnostics-elasticsearch-5b5465ffb.156926cccbf56887"
# oc get -o go-template=$'{{with .spec.template.spec.serviceAccountName}}{{.}}{{else}}default{{end}}\n' \
    "$(oc get events "${eventname}" -o jsonpath='{.involvedObject.kind}/{.involvedObject.name}{"\n"}')"
sdi-elasticsearch

The obtained service account name (sdi-elasticsearch) now needs to be assigned privileged SCC:

# oc adm policy add-scc-to-user privileged -z sdi-elasticsearch

The pod then shall come up on its own unless this was the only problem.

9.1.2. No Default Storage Class set

If pods are failing because of PVCs not being bound, the problem may be that the default storage class has not been set and no storage class was specified to the installer.

# oc get pods
NAME                                                  READY     STATUS    RESTARTS   AGE
hana-0                                                0/1       Pending   0          45m
vora-consul-0                                         0/1       Pending   0          45m
vora-consul-1                                         0/1       Pending   0          45m
vora-consul-2                                         0/1       Pending   0          45m

# oc describe pvc data-hana-0
Name:          data-hana-0
Namespace:     sdi
StorageClass:
Status:        Pending
Volume:
Labels:        app=vora
               datahub.sap.com/app=hana
               vora-component=hana
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
Events:
  Type    Reason         Age                  From                         Message
  ----    ------         ----                 ----                         -------
  Normal  FailedBinding  47s (x126 over 30m)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set

To fix this, either make sure to set the Default StorageClass (4.6) / (4.4) or provide the storage class name to the installer.
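
For example, an existing storage class (here ocs-storagecluster-ceph-rbd, used purely as an illustration) can be annotated as the default one like this:

# oc patch storageclass ocs-storagecluster-ceph-rbd --type=merge \
    -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'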

9.1.3. vsystem-app pods not coming up

If you have SELinux in enforcing mode you may see the pods launched by vsystem crash-looping because of the container named vsystem-iptables like this:

# oc get pods
NAME                                                          READY     STATUS             RESTARTS   AGE
auditlog-59b4757cb9-ccgwh                                     1/1       Running            0          40m
datahub-app-db-gzmtb-67cd6c56b8-9sm2v                         2/3       CrashLoopBackOff   11         34m
datahub-app-db-tlwkg-5b5b54955b-bb67k                         2/3       CrashLoopBackOff   10         30m
...
internal-comm-secret-gen-nd7d2                                0/1       Completed          0          36m
license-management-gjh4r-749f4bd745-wdtpr                     2/3       CrashLoopBackOff   11         35m
shared-k98sh-7b8f4bf547-2j5gr                                 2/3       CrashLoopBackOff   4          2m
...
vora-tx-lock-manager-7c57965d6c-rlhhn                         2/2       Running            3          40m
voraadapter-lsvhq-94cc5c564-57cx2                             2/3       CrashLoopBackOff   11         32m
voraadapter-qkzrx-7575dcf977-8x9bt                            2/3       CrashLoopBackOff   11         35m
vsystem-5898b475dc-s6dnt                                      2/2       Running            0          37m

When you inspect one of those pods, you can see an error message similar to the one below:

# oc logs voraadapter-lsvhq-94cc5c564-57cx2 -c vsystem-iptables
2018-12-06 11:45:16.463220|+0000|INFO |Execute: iptables -N VSYSTEM-AGENT-PREROUTING -t nat||vsystem|1|execRule|iptables.go(56)
2018-12-06 11:45:16.465087|+0000|INFO |Output: iptables: Chain already exists.||vsystem|1|execRule|iptables.go(62)
Error: exited with status: 1
Usage:
  vsystem iptables [flags]

Flags:
  -h, --help               help for iptables
      --no-wait            Exit immediately after applying the rules and don't wait for SIGTERM/SIGINT.
      --rule stringSlice   IPTables rule which should be applied. All rules must be specified as string and without the iptables command.

In the audit log on the node where the pod got scheduled, you should be able to find an AVC denial similar to the following. On RHCOS nodes, you may need to inspect the output of the dmesg command instead.

# grep 'denied.*iptab' /var/log/audit/audit.log
type=AVC msg=audit(1544115868.568:15632): avc:  denied  { module_request } for  pid=54200 comm="iptables" kmod="ipt_REDIRECT" scontext=system_u:system_r:container_t:s0:c826,c909 tcontext=system_u:system_r:kernel_t:s0 tclass=system permissive=0
...
# # on RHCOS
# dmesg | grep denied

To fix this, the ipt_REDIRECT kernel module needs to be loaded. Please refer to Pre-load needed kernel modules.

9.1.4. License Manager cannot be initialized

The installation may fail with the following error.

2019-07-22T15:07:29+0000 [INFO] Initializing system tenant...
2019-07-22T15:07:29+0000 [INFO] Initializing License Manager in system tenant...2019-07-22T15:07:29+0000 [ERROR] Couldn't start License Manager!
The response: {"status":500,"code":{"component":"router","value":8},"message":"Internal Server Error: see logs for more info"}Error: http status code 500 Internal Server Error (500)
2019-07-22T15:07:29+0000 [ERROR] Failed to initialize vSystem, will retry in 30 sec...

In the log of license management pod, you can find an error like this:

# oc logs deploy/license-management-l4rvh
Found 2 pods, using pod/license-management-l4rvh-74595f8c9b-flgz9
+ iptables -D PREROUTING -t nat -j VSYSTEM-AGENT-PREROUTING
+ true
+ iptables -F VSYSTEM-AGENT-PREROUTING -t nat
+ true
+ iptables -X VSYSTEM-AGENT-PREROUTING -t nat
+ true
+ iptables -N VSYSTEM-AGENT-PREROUTING -t nat
iptables v1.6.2: can't initialize iptables table `nat': Permission denied
Perhaps iptables or your kernel needs to be upgraded.

This means the vsystem-iptables container in the pod lacks permissions to manipulate iptables. Please make sure to pre-load the needed kernel modules.

9.1.5. Diagnostics Prometheus Node Exporter pods not starting

During an installation or upgrade, it may happen that the Node Exporter pods keep restarting:

# oc get pods  | grep node-exporter
diagnostics-prometheus-node-exporter-5rkm8                        0/1       CrashLoopBackOff   6          8m
diagnostics-prometheus-node-exporter-hsww5                        0/1       CrashLoopBackOff   6          8m
diagnostics-prometheus-node-exporter-jxxpn                        0/1       CrashLoopBackOff   6          8m
diagnostics-prometheus-node-exporter-rbw82                        0/1       CrashLoopBackOff   7          8m
diagnostics-prometheus-node-exporter-s2jsz                        0/1       CrashLoopBackOff   6          8m

The possible reason is that the limits on resource consumption set on the pods are too low. To address this post-installation, you can patch the DaemonSet like this (in the SDI's namespace):

# oc patch -p '{"spec": {"template": {"spec": {"containers": [
    { "name": "diagnostics-prometheus-node-exporter",
      "resources": {"limits": {"cpu": "200m", "memory": "100M"}}
    }]}}}}' ds/diagnostics-prometheus-node-exporter

To address this during the installation (using any installation method), add the following parameters:

-e=vora-diagnostics.resources.prometheusNodeExporter.resources.limits.cpu=200m
-e=vora-diagnostics.resources.prometheusNodeExporter.resources.limits.memory=100M

9.1.6. Graph builds fail to pull images from the registry

If the graph builds hang in the Pending state or fail completely, you may find the following pod not coming up in the sdi namespace because its image cannot be pulled from the registry:

# oc get pods | grep vflow
datahub.post-actions.validations.validate-vflow-9s25l             0/1     Completed          0          14h
vflow-bus-fb1d00052cc845c1a9af3e02c0bc9f5d-5zpb2                  0/1     ImagePullBackOff   0          21s
vflow-graph-9958667ba5554dceb67e9ec3aa6a1bbb-com-sap-demo-dljzk   1/1     Running            0          94m
# oc describe pod/vflow-bus-fb1d00052cc845c1a9af3e02c0bc9f5d-5zpb2 | sed -n '/^Events:/,$p'
Events:
  Type     Reason     Age                From                    Message
  ----     ------     ----               ----                    -------
  Normal   Scheduled  30s                default-scheduler       Successfully assigned sdi/vflow-bus-fb1d00052cc845c1a9af3e02c0bc9f5d-5zpb2 to sdi-moworker3
  Normal   BackOff    20s (x2 over 21s)  kubelet, sdi-moworker3  Back-off pulling image "container-image-registry-sdi-observer.apps.morrisville.ocp.vslen/sdi3modeler-blue/vora/vflow-node-f87b598586d430f955b09991fc1173f716be17b9:3.0.23-com.sap.sles.base-20200617-174600"
  Warning  Failed     20s (x2 over 21s)  kubelet, sdi-moworker3  Error: ImagePullBackOff
  Normal   Pulling    6s (x2 over 21s)   kubelet, sdi-moworker3  Pulling image "container-image-registry-sdi-observer.apps.morrisville.ocp.vslen/sdi3modeler-blue/vora/vflow-node-f87b598586d430f955b09991fc1173f716be17b9:3.0.23-com.sap.sles.base-20200617-174600"
  Warning  Failed     6s (x2 over 21s)   kubelet, sdi-moworker3  Failed to pull image "container-image-registry-sdi-observer.apps.morrisville.ocp.vslen/sdi3modeler-blue/vora/vflow-node-f87b598586d430f955b09991fc1173f716be17b9:3.0.23-com.sap.sles.base-20200617-174600": rpc error: code = Unknown desc = Error reading manifest 3.0.23-com.sap.sles.base-20200617-174600 in container-image-registry-sdi-observer.apps.morrisville.ocp.vslen/sdi3modeler-blue/vora/vflow-node-f87b598586d430f955b09991fc1173f716be17b9: unauthorized: authentication required
  Warning  Failed     6s (x2 over 21s)   kubelet, sdi-moworker3  Error: ErrImagePull

To amend this, one needs to link the secret for the modeler's registry to a corresponding service account associated with the failed pod. In this case, the default one.

# oc get -n "${SDI_NAMESPACE:-sdi}" -o jsonpath='{.spec.serviceAccountName}{"\n"}' \
    pod/vflow-bus-fb1d00052cc845c1a9af3e02c0bc9f5d-5zpb2
default
# oc create secret -n "${SDI_NAMESPACE:-sdi}" docker-registry sdi-registry-pull-secret \
    --docker-server=container-image-registry-sdi-observer.apps.morrisville.ocp.vslen \
    --docker-username=user-n5137x --docker-password=ec8srNF5Pf1vXlPTRLagEjRRr4Vo3nIW
# oc secrets link -n "${SDI_NAMESPACE:-sdi}" --for=pull default sdi-registry-pull-secret
# oc delete -n "${SDI_NAMESPACE:-sdi}" pod/vflow-bus-fb1d00052cc845c1a9af3e02c0bc9f5d-5zpb2

Also please make sure to restart the Pipeline Modeler and the failing graph builds in the affected tenant.

9.1.7. A pod is stuck in ContainerCreating phase

NOTE: Applies to OCP 4.2 in combination with block storage persistent volumes.

The issue can be reproduced when using a ReadWriteOnce persistent volume provisioned by a block device dynamic provisioner like openshift-storage.rbd.csi.ceph.com with a corresponding storage class ocs-storagecluster-ceph-rbd.

# oc get pods | grep ContainerCreating
vsystem-vrep-0                                                    0/2     ContainerCreating   0          10m20s
# oc describe pod vsystem-vrep-0 | sed -n '/^Events/,$p'
Events:
  Type     Reason                  Age                  From                     Message
  ----     ------                  ----                 ----                     -------
  Normal   Scheduled               114m                 default-scheduler        Successfully assigned sdhup/vsystem-vrep-0 to sdi-moworker1
  Normal   SuccessfulAttachVolume  114m                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-fafdd37a-b654-11ea-b795-001c14db4273"
  Normal   SuccessfulAttachVolume  114m                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-f61bd233-b654-11ea-b795-001c14db4273"
  Warning  FailedMount             17m (x39 over 113m)  kubelet, sdi-moworker1   MountVolume.MountDevice failed for volume "pvc-f61bd233-b654-11ea-b795-001c14db4273" : rpc error: code = Internal desc = rbd image ocs-storagecluster-cephblockpool/csi-vol-f6380abf-b654-11ea-8cb4-0a580a83020b is still being used
  Warning  FailedMount             64s (x50 over 111m)  kubelet, sdi-moworker1   Unable to mount volumes for pod "vsystem-vrep-0_sdhup(fddd32f3-b7c4-11ea-b795-001c14db4273)": timeout expired waiting for volumes to attach or mount for pod "sdhup"/"vsystem-vrep-0". list of unmounted volumes=[layers-volume]. list of unattached volumes=[layers-volume exports app-parameters uaa-tls-cert hana-tls-cert vrep-cert-tls vsystem-root-ca-path vora-vsystem-sdhup-vrep-token-wrmxk]

The issue can happen for example during an upgrade from SAP Data Hub. In that case, the upgrade starts to hang at the following step:

# ./slcb execute --url https://boston.ocp.vslen:9000 --useStackXML ~/MP_Stack_1000954710_20200519_.xml
...
time="2020-06-30T06:51:40Z" level=warning msg="Waiting for certificates to be renewed..."
time="2020-06-30T06:51:50Z" level=warning msg="Waiting for certificates to be renewed..."
time="2020-06-30T06:52:00Z" level=info msg="Switching Datahub to runlevel: Started"

For the reference, the corresponding persistent volume can look like this:

# oc get pv | grep f61bd233-b654-11ea-b795-001c14db4273
pvc-f61bd233-b654-11ea-b795-001c14db4273    10Gi       RWO            Delete           Bound    sdhup/layers-volume-vsystem-vrep-0                ocs-storagecluster-ceph-rbd            45h

The solution to the problem is to schedule the vsystem-vrep pod on a particular node.

9.1.7.1. Schedule vsystem-vrep pod on particular node

Make sure to run the pod on the same node it used to run on before being re-scheduled:

  1. Identify previous compute node name depending on whether the pod is running or not.

    • If the vsystem-vrep pod is running currently, please record the node (sdi-moworker3) it is running on now like this:

      # oc get pods -n "${SDI_NAMESPACE:-sdi}" -o wide -l vora-component=vsystem-vrep
      NAME             READY   STATUS    RESTARTS   AGE    IP            NODE            NOMINATED NODE   READINESS GATES
      vsystem-vrep-0   2/2     Running   0          3d1h   10.128.0.31   sdi-moworker3   <none>           <none>
      
    • In case the pod is no longer running, inspect the sdh-pods-pre-upgrade.out created as suggested in the Prepare SDH/SDI Project step and extract the name of the node for the pod in question. In our case, the vsystem-vrep-0 pod used to run on sdi-moworker3.

  2. (if not running) Scale its corresponding deployment (in our case statefulset/vsystem-vrep) down to zero replicas:

    # oc scale -n "${SDI_NAMESPACE:-sdi}" --replicas=0 statefulset/vsystem-vrep
    
  3. Pin vsystem-vrep to the current node with the following command while changing the nodeName.

    # nodeName=sdi-moworker3    # change the name
    # oc patch statefulset/vsystem-vrep -n "${SDI_NAMESPACE:-sdi}" --type strategic --patch \
        '{"spec": {"template": {"spec": {"nodeSelector": {"kubernetes.io/hostname": "'"${nodeName}"'"}}}}}'
    
  4. (if not running) Scale the deployment back to 1:

    # oc scale -n "${SDI_NAMESPACE:-sdi}" --replicas=1 statefulset/vsystem-vrep
    

Verify the pod is scheduled to the given node and becomes ready. If the upgrade process is in progress, it should continue in a while.

# oc get pods -n "${SDI_NAMESPACE:-sdi}" -o wide | grep vsystem-vrep-0
vsystem-vrep-0                                                    2/2     Running     0          5m48s   10.128.4.239   sdi-moworker3   <none>           <none>

9.1.8. Container fails with "Permission denied"

If pods fail with an error similar to the one below, the containers are most probably not allowed to run under the desired UID.

# oc get pods
NAME                                READY   STATUS             RESTARTS   AGE
datahub.checks.checkpoint-m82tj     0/1     Completed          0          12m
vora-textanalysis-6c9789756-pdxzd   0/1     CrashLoopBackOff   6          9m18s
# oc logs vora-textanalysis-6c9789756-pdxzd
Traceback (most recent call last):
  File "/dqp/scripts/start_service.py", line 413, in <module>
    sys.exit(Main().run())
  File "/dqp/scripts/start_service.py", line 238, in run
    **global_run_args)
  File "/dqp/python/dqp_services/services/textanalysis.py", line 20, in run
    trace_dir = utils.get_trace_dir(global_trace_dir, self.config)
  File "/dqp/python/dqp_utils.py", line 90, in get_trace_dir
    return get_dir(global_trace_dir, conf.trace_dir)
  File "/dqp/python/dqp_utils.py", line 85, in get_dir
    makedirs(config_value)
  File "/usr/lib64/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: 'textanalysis'

To remedy that, be sure to apply all the oc adm policy add-scc-to-* commands from the project setup section. The one that has not been applied in this case is:

# oc adm policy add-scc-to-group anyuid "system:serviceaccounts:$(oc project -q)"

9.1.9. Jobs failing during installation or upgrade

If the installation jobs are failing with the following error, either the anyuid security context constraint has not been applied or the cluster is too old.

# oc logs solution-reconcile-vsolution-vsystem-ui-3.0.9-vnnbf
Error: mkdir /.vsystem: permission denied.
2020-03-05T15:51:18+0000 [WARN] Could not login to vSystem!
2020-03-05T15:51:23+0000 [INFO] Retrying...
Error: mkdir /.vsystem: permission denied.
2020-03-05T15:51:23+0000 [WARN] Could not login to vSystem!
2020-03-05T15:51:28+0000 [INFO] Retrying...
Error: mkdir /.vsystem: permission denied.
...
2020-03-05T15:52:13+0000 [ERROR] Timeout while waiting to login to vSystem...

The reason is that the vctl binary in the containers determines the HOME directory for its user from /etc/passwd. On older OCP clusters (<4.2.32), or when the container is not run with the desired UID, the value is incorrectly set to /. The binary then lacks permissions to write to the root directory.

To remedy that, please make sure:

  1. you are running OCP cluster 4.2.32 or newer
  2. anyuid SCC has been applied to the SDI namespace

    To verify, make sure the SDI namespace's service account group is listed in the output of the following command:

    # oc get -o json scc/anyuid | jq -r '.groups[]'
    system:cluster-admins
    system:serviceaccounts:sdi
    

    When the jobs are rerun, the anyuid SCC will be assigned to them:

    # oc get pods -n "${SDI_NAMESPACE:-sdi}" -o json | jq -r '.items[] | select((.metadata.ownerReferences // []) |
        any(.kind == "Job")) | "\(.metadata.name)\t\(.metadata.annotations["openshift.io/scc"])"' | column -t
    datahub.voracluster-start-1d3ffe-287c16-d7h7t                    anyuid
    datahub.voracluster-start-b3312c-287c16-j6g7p                    anyuid
    datahub.voracluster-stop-5a6771-6d14f3-nnzkf                     anyuid
    ...
    strategy-reconcile-strat-system-3.0.34-3.0.34-pzn79              anyuid
    tenant-reconcile-default-3.0.34-wjlfs                            anyuid
    tenant-reconcile-system-3.0.34-gf7r4                             anyuid
    vora-config-init-qw9vc                                           anyuid
    vora-dlog-admin-f6rfg                                            anyuid
    
  3. additionally, please make sure that all the other oc adm policy add-scc-to-* commands listed in the project setup have been applied to the same $SDI_NAMESPACE.

9.1.10. vsystem-vrep cannot export NFS on RHCOS

If the vsystem-vrep-0 pod fails with the following error, it is unable to start an NFS server on top of overlayfs.

# oc logs -n ocpsdi1 vsystem-vrep-0 vsystem-vrep
2020-07-13 15:46:05.054171|+0000|INFO |Starting vSystem version 2002.1.15-0528, buildtime 2020-05-28T18:5856, gitcommit ||vsystem|1|main|server.go(107)
2020-07-13 15:46:05.054239|+0000|INFO |Starting Kernel NFS Server||vrep|1|Start|server.go(83)
2020-07-13 15:46:05.108868|+0000|INFO |Serving liveness probe at ":8739"||vsystem|9|func2|server.go(149)
2020-07-13 15:46:10.303625|+0000|WARN |no backup or restore credentials mounted, not doing backup and restore||vsystem|1|NewRcloneBackupRestore|backup_restore.go(76)
2020-07-13 15:46:10.311488|+0000|INFO |vRep components are initialised successfully||vsystem|1|main|server.go(249)
2020-07-13 15:46:10.311617|+0000|ERROR|cannot parse duration from "SOLUTION_LAYER_CLEANUP_DELAY" env variable: time: invalid duration ||vsystem|16|CleanUpSolutionLayersJob|manager.go(351)
2020-07-13 15:46:10.311719|+0000|INFO |Background task for cleaning up solution layers will be triggered every 12h0m0s||vsystem|16|CleanUpSolutionLayersJob|manager.go(358)
2020-07-13 15:46:10.312402|+0000|INFO |Recreating volume mounts||vsystem|1|RemountVolumes|volume_service.go(339)
2020-07-13 15:46:10.319334|+0000|ERROR|error re-loading NFS exports: exit status 1
exportfs: /exports does not support NFS export||vrep|1|AddExportsEntry|server.go(162)
2020-07-13 15:46:10.319991|+0000|FATAL|Error creating runtime volume: error exporting directory for runtime data via NFS: export error||vsystem|1|Fail|termination.go(22)

There are two solutions to the problem; both result in an additional volume mounted at /exports, which is the root directory of all exports. A verification sketch follows the list below.

  • (recommended) deploy SDI Observer, which will request an additional persistent volume of size 500Mi for the vsystem-vrep-0 pod, and make sure it is running
  • add -e=vsystem.vRep.exportsMask=true to the Additional Installer Parameters, which will mount an emptyDir volume at /exports in the same pod

    • on particular versions of OCP this may fail nevertheless
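
In either case, one can verify that /exports ends up backed by a dedicated volume rather than by the container's overlay filesystem; a minimal check (assuming the pod is running and df is available in the image):

# oc exec -n "${SDI_NAMESPACE:-sdi}" vsystem-vrep-0 -c vsystem-vrep -- df -h /exports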

9.1.11. Kaniko cannot push images to a registry

Symptoms:

  • kaniko is enabled in SDI (mandatory on OCP 4)
  • the registry is secured by TLS with a self-signed certificate
  • other SDI and OCP components can use the registry without issues
  • the pipeline modeler crashes with a traceback preceded by the following error:

    # oc logs -f -c vflow  "$(oc get pods -o name \
      -l vsystem.datahub.sap.com/template=pipeline-modeler | head -n 1)" | grep 'push permissions'
    error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for "container-image-registry-miminar-sdi-observer.apps.sydney.example.com/vora/vflow-node-f87b598586d430f955b09991fc11
    73f716be17b9:3.0.27-com.sap.sles.base-20201001-102714": BLOB_UPLOAD_UNKNOWN: blob upload unknown to registry
    

Resolution:

The root cause has not been identified yet. To work around it, the modeler shall be configured to use an insecure registry accessible via plain HTTP (without TLS) that requires no authentication. Such a registry can be provisioned with SDI Observer. If the existing registry has been provisioned by SDI Observer, one can modify it to require no authentication like this:

  1. Initiate an update of SDI Observer.
  2. Re-configure sdi-observer for no authentication:

    # oc set env -n "${NAMESPACE:-sdi-observer}" SDI_REGISTRY_AUTHENTICATION=none dc/sdi-observer
    
  3. Wait until the registry gets re-deployed.

  4. Verify that the registry is running and that neither REGISTRY_AUTH_HTPASSWD_REALM nor REGISTRY_AUTH_HTPASSWD_PATH are present in the output of the following command:

    # oc set env -n "${NAMESPACE:-sdi-observer}" --list dc/container-image-registry
    REGISTRY_HTTP_SECRET=mOjuXMvQnyvktGLeqpgs5f7nQNAiNMEE
    
  5. Note the registry service address which can be determined like this:

    # # <service-name>.<namespace>.cluster.local:<service-port>
    # oc project "${NAMESPACE:-sdi-observer}"
    # printf "$(oc get -o jsonpath='{.metadata.name}.{.metadata.namespace}.svc.%s:{.spec.ports[0].port}' \
            svc container-image-registry)\n" \
        "$(oc get dnses.operator.openshift.io/default -o jsonpath='{.status.clusterDomain}')"
    container-image-registry.sdi-observer.svc.cluster.local:5000
    
  6. Verify that the service is responsive over plain HTTP from inside of the OCP cluster and requires no authentication:

    # registry_url=http://container-image-registry.sdi-observer.svc.cluster.local:5000
    # oc rsh -n openshift-authentication "$(oc get pods -n openshift-authentication | \
        awk '/oauth-openshift.*Running/ {print $1; exit}')" curl -I "$registry_url"
    HTTP/1.1 200 OK
    Content-Length: 2
    Content-Type: application/json; charset=utf-8
    Docker-Distribution-Api-Version: reg
    

    Note: the service URL is not reachable from outside of the OCP cluster.

  7. For each SDI tenant using the registry:

    1. Log in to the tenant as an administrator and open System Management.
    2. View Application Configuration and Secrets.

      (Screenshot: Access Application Configuration and Secrets)

    3. Set the following properties to the registry address:

      • Modeler: Base registry for pulling images
      • Modeler: Docker registry for Modeler images
    4. Unset the following properties:

      • Modeler: Name of the vSystem secret containing the credentials for Docker registry
      • Modeler: Docker image pull secret for Modeler

      The end result should look like:

      (Screenshot: Modified registry parameters for Modeler)

    5. Return to "Applications" in the System Management and select Modeler.

    6. Delete all the instances.
    7. Create a new instance with the plus button.
    8. Access the instance to verify it is working; a pod-level check is sketched below.
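
      To confirm that the new Modeler instance starts successfully, its pod can be watched until it becomes ready; a sketch re-using the label selector from the symptom check above:

      # oc get pods -n "${SDI_NAMESPACE:-sdi}" -l vsystem.datahub.sap.com/template=pipeline-modeler -w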

9.1.12. SLCBridge pod fails to deploy

If the initialisation phase of Software Lifecycle Container Bridge fails with an error like the one below, you are probably running SLCB version 1.1.53 configured to push to a registry requiring basic authentication.

*************************************************
* Executing Step WaitForK8s SLCBridgePod Failed *
*************************************************

  Execution of step WaitForK8s SLCBridgePod failed
  Synchronizing Deployment slcbridgebase failed (pod "slcbridgebase-5bcd7946f4-t6vfr" failed) [1.116647047s]
  .
  Choose "Retry" to retry the step.
  Choose "Rollback" to undo the steps done so far.
  Choose "Cancel" to cancel deployment immediately.

# oc logs -n sap-slcbridge -c slcbridge -l run=slcbridge --tail=13
----------------------------
Code: 401
Scheme: basic
"realm": "basic-realm"
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}
----------------------------
2020-09-29T11:49:33.346Z        INFO    images/registry.go:182  Access check of registry "container-image-registry-sdi-observer.apps.sydney.example.com" returned AuthNeedBasic
2020-09-29T11:49:33.346Z        INFO    slp/server.go:199       Shutting down server
2020-09-29T11:49:33.347Z        INFO    hsm/hsm.go:125  Context closed
2020-09-29T11:49:33.347Z        INFO    hsm/state.go:56 Received Cancel
2020-09-29T11:49:33.347Z        DEBUG   hsm/hsm.go:118  Leaving event loop
2020-09-29T11:49:33.347Z        INFO    slp/server.go:208       Server shutdown complete
2020-09-29T11:49:33.347Z        INFO    slcbridge/master.go:64  could not authenticate at registry SLP_BRIDGE_REPOSITORY container-image-registry-sdi-observer.apps.sydney.example.com
2020-09-29T11:49:33.348Z        INFO    globals/goroutines.go:63        Shutdown complete (exit status 1).

More information can be found in SAP Note #2589449.

To fix this, please download an SLCB version newer than 1.1.53 as described in SAP Note #2589449.
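
Which SLCB build is currently deployed can be checked, for example, by inspecting the image of the slcbridgebase deployment; a sketch assuming the default sap-slcbridge namespace seen in the log above:

# oc get deployment/slcbridgebase -n sap-slcbridge \
    -o jsonpath='{.spec.template.spec.containers[*].image}{"\n"}'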

9.1.13. Kibana pod fails to start

When the kibana pod is stuck in the CrashLoopBackOff state and the following error shows up in its log, you will need to delete the existing index.

# oc logs -n "${SDI_NAMESPACE:-sdi}" -c diagnostics-kibana -l datahub.sap.com/app-component=kibana --tail=5
{"type":"log","@timestamp":"2020-10-07T14:40:23Z","tags":["status","plugin:ui_metric@7.3.0-SNAPSHOT","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2020-10-07T14:40:23Z","tags":["status","plugin:visualizations@7.3.0-SNAPSHOT","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2020-10-07T14:40:23Z","tags":["status","plugin:elasticsearch@7.3.0-SNAPSHOT","info"],"pid":1,"state":"green","message":"Status changed from yellow to green - Ready","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
{"type":"log","@timestamp":"2020-10-07T14:40:23Z","tags":["info","migrations"],"pid":1,"message":"Creating index .kibana_1."}
{"type":"log","@timestamp":"2020-10-07T14:40:23Z","tags":["warning","migrations"],"pid":1,"message":"Another Kibana instance appears to be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_1 and restarting Kibana."}

Note the name of the index in the last warning message; in this case it is .kibana_1. Execute the following commands, with the proper index name at the end of the curl command, to delete the index and then delete the kibana pod as well.

# oc exec -n "${SDI_NAMESPACE:-sdi}" -it diagnostics-elasticsearch-0 -c diagnostics-elasticsearch \
    -- curl -X DELETE 'http://localhost:9200/.kibana_1'
# oc delete pod -n "${SDI_NAMESPACE:-sdi}" -l datahub.sap.com/app-component=kibana

A new kibana pod will be spawned and shall become Running in a few minutes, as long as the diagnostics pods it depends on are running as well.
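
The progress can be followed with the same label selector, for example:

# oc get pods -n "${SDI_NAMESPACE:-sdi}" -l datahub.sap.com/app-component=kibana -w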

9.1.14. Fluentd pods cannot access /var/lib/docker/containers

If you see the following errors, fluentd cannot access the container logs on the hosts.

  • Error from SLC Bridge:

    2021-01-26T08:28:49.810Z  INFO  cmd/cmd.go:243  1> DataHub/kub-slcbridge/default [Pending]
    2021-01-26T08:28:49.810Z  INFO  cmd/cmd.go:243  1> └── Diagnostic/kub-slcbridge/default [Failed]  [Start Time:  2021-01-25 14:26:03 +0000 UTC]
    2021-01-26T08:28:49.811Z  INFO  cmd/cmd.go:243  1>     └── DiagnosticDeployment/kub-slcbridge/default [Failed]  [Start Time:  2021-01-25 14:26:29 +0000 UTC]
    2021-01-26T08:28:49.811Z  INFO  cmd/cmd.go:243  1>
    2021-01-26T08:28:55.989Z  INFO  cmd/cmd.go:243  1> DataHub/kub-slcbridge/default [Pending]
    2021-01-26T08:28:55.989Z  INFO  cmd/cmd.go:243  1> └── Diagnostic/kub-slcbridge/default [Failed]  [Start Time:  2021-01-25 14:26:03 +0000 UTC]
    2021-01-26T08:28:55.989Z  INFO  cmd/cmd.go:243  1>     └── DiagnosticDeployment/kub-slcbridge/default [Failed]  [Start Time:  2021-01-25 14:26:29 +0000 UTC]
    
  • Fluentd pod description:

    # oc describe pod diagnostics-fluentd-bb9j7
    Name:           diagnostics-fluentd-bb9j7
    …
      Warning  FailedMount  6m35s                 kubelet, compute-4  Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[vartmp kub-slcbridge-fluentd-token-k5c9n settings varlog varlibdockercontainers]: timed out waiting for the condition
      Warning  FailedMount  2m1s (x2 over 4m19s)  kubelet, compute-4  Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[varlibdockercontainers vartmp kub-slcbridge-fluentd-token-k5c9n settings varlog]: timed out waiting for the condition
      Warning  FailedMount  23s (x12 over 8m37s)  kubelet, compute-4  MountVolume.SetUp failed for volume "varlibdockercontainers" : hostPath type check failed: /var/lib/docker/containers is not a directory
    
  • Log from one of the pods:

    # oc logs $(oc get pods -o name -l datahub.sap.com/app-component=fluentd | head -n 1) | tail -n 20
      2019-04-15 18:53:24 +0000 [error]: unexpected error error="Permission denied @ rb_sysopen - /var/log/es-containers-sdh25-mortal-garfish.log.pos"
      2019-04-15 18:53:24 +0000 [error]: suppressed same stacktrace
      2019-04-15 18:53:25 +0000 [warn]: '@' is the system reserved prefix. It works in the nested configuration for now but it will be rejected: @timestamp
      2019-04-15 18:53:26 +0000 [error]: unexpected error error_class=Errno::EACCES error="Permission denied @ rb_sysopen - /var/log/es-containers-sdh25-mortal-garfish.log.pos"
      2019-04-15 18:53:26 +0000 [error]: /usr/lib64/ruby/gems/2.5.0/gems/fluentd-0.14.8/lib/fluent/plugin/in_tail.rb:151:in `initialize'
      2019-04-15 18:53:26 +0000 [error]: /usr/lib64/ruby/gems/2.5.0/gems/fluentd-0.14.8/lib/fluent/plugin/in_tail.rb:151:in `open'
    ...
    

These errors are fixed automatically by SDI Observer; please make sure it is running and can access the SDI_NAMESPACE.

One can also apply a fix manually with the following commands:

# oc -n "${SDI_NAMESPACE:-sdi}" patch dh default --type='json' -p='[
    { "op": "replace"
    , "path": "/spec/diagnostic/fluentd/varlibdockercontainers"
    , "value":"/var/log/pods" }]'
# oc -n "${SDI_NAMESPACE:-sdi}" patch ds/diagnostics-fluentd -p '{"spec":{"template":{"spec":{
    "containers": [{"name":"diagnostics-fluentd", "securityContext":{"privileged": true}}]}}}}'

9.2. SDI Runtime troubleshooting

9.2.1. 504 Gateway Time-out

If you access SDI services exposed via OCP's Ingress Controller (as routes) and experience 504 Gateway Time-out errors, they are most likely caused by the following factors:

  1. SDI components accessed for the first time (on a per-tenant and per-user basis) require a new pod to be started, which takes a considerable amount of time
  2. the default server-connection timeout configured on the load balancers is usually too short to tolerate containers being pulled, initialized and started

To address this, make sure to do the following:

  1. set the "haproxy.router.openshift.io/timeout" annotation to "2m" on the vsystem route like this (assuming the route is named vsystem):

    # oc annotate -n "${SDI_NAMESPACE:-sdi}" route/vsystem haproxy.router.openshift.io/timeout=2m
    

    This results in the following haproxy settings being applied to the ingress router and the route in question:

    # oc rsh -n openshift-ingress $(oc get pods -o name -n openshift-ingress | \
            awk '/\/router-default/ {print;exit}') cat /var/lib/haproxy/conf/haproxy.config | \
        awk 'BEGIN { p=0 }
            /^backend.*:'"${SDI_NAMESPACE:-sdi}:vsystem"'/ { p=1 }
            { if (p) { print; if ($0 ~ /^\s*$/) {exit} } }'
    Defaulting container name to router.
    Use 'oc describe pod/router-default-6655556d4b-7xpsw -n openshift-ingress' to see all of the containers in this pod.
    backend be_secure:sdi:vsystem
      mode http
      option redispatch
      option forwardfor
      balance leastconn
      timeout server  2m
    
  2. set the same server timeout (2 minutes) on the external load balancer forwarding traffic to OCP's Ingress routers; the following is an example configuration for haproxy:

    frontend                                    https
        bind                                    *:443
        mode                                    tcp
        option                                  tcplog
        timeout     server                      2m
        tcp-request inspect-delay               5s
        tcp-request content accept              if { req_ssl_hello_type 1 }
    
        use_backend sydney-router-https         if { req_ssl_sni -m end -i apps.sydney.example.com }
        use_backend melbourne-router-https      if { req_ssl_sni -m end -i apps.melbourne.example.com }
        use_backend registry-https              if { req_ssl_sni -m end -i registry.example.com }
    
    backend         sydney-router-https
        balance     source
        server      compute1                     compute1.sydney.example.com:443     check
        server      compute2                     compute2.sydney.example.com:443     check
        server      compute3                     compute3.sydney.example.com:443     check
    
    backend         melbourne-router-https
        ....
    

9.2.2. HANA backup pod cannot pull an image from an authenticated registry

If the configured container image registry requires authentication, HANA backup jobs might fail as shown in the following example:

# oc get pods | grep backup-hana
default-chq28a9-backup-hana-sjqph                                 0/2     ImagePullBackOff   0          15h
default-hfiew1i-backup-hana-zv8g2                                 0/2     ImagePullBackOff   0          38h
default-m21kt3d-backup-hana-zw7w4                                 0/2     ImagePullBackOff   0          39h
default-w29xv3w-backup-hana-dzlvn                                 0/2     ImagePullBackOff   0          15h

# oc describe pod default-hfiew1i-backup-hana-zv8g2 | tail -n 6
  Warning  Failed          12h (x5 over 12h)       kubelet            Error: ImagePullBackOff
  Warning  Failed          12h (x3 over 12h)       kubelet            Failed to pull image "sdi-registry.apps.shanghai.ocp.vslen/com.sap.datahub.linuxx86_64/hana:2010.22.0": rpc error: code = Unknown desc = Error reading manifest 2010.22.0 in sdi-registry.apps.shanghai.ocp.vslen/com.sap.datahub.linuxx86_64/hana: unauthorized: authentication required
  Warning  Failed          12h (x3 over 12h)       kubelet            Error: ErrImagePull
  Normal   Pulling         99m (x129 over 12h)     kubelet            Pulling image "sdi-registry.apps.shanghai.ocp.vslen/com.sap.datahub.linuxx86_64/hana:2010.22.0"
  Warning  Failed          49m (x3010 over 12h)    kubelet            Error: ImagePullBackOff
  Normal   BackOff         4m21s (x3212 over 12h)  kubelet            Back-off pulling image "sdi-registry.apps.shanghai.ocp.vslen/com.sap.datahub.linuxx86_64/hana:2010.22.0"

Resolution: There are two ways to fix this:

  • The recommended approach is to update SDI Observer to version 0.1.9 or newer.

  • A manual alternative fix is to execute the following:

    1. Determine the currently configured image pull secret:

      # oc get -n "${SDI_NAMESPACE:-sdi}" vc/vora -o jsonpath='{.spec.docker.imagePullSecret}{"\n"}'
      slp-docker-registry-pull-secret
      
    2. Link the secret with the default service account:

      # oc secret link -n "${SDI_NAMESPACE:-sdi}" --for=pull default slp-docker-registry-pull-secret
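
    3. Optionally, verify that the secret is now listed among the default service account's image pull secrets (a quick check):

      # oc get -n "${SDI_NAMESPACE:-sdi}" sa/default -o jsonpath='{.imagePullSecrets[*].name}{"\n"}'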
      

9.3. SDI Observer troubleshooting

9.3.1. Build is failing due to a repository outage

If the build of SDI Observer or SDI Registry is failing with an error similar to the one below, the chosen Fedora repository mirror is probably temporarily down:

# oc logs -n "${NAMESPACE:-sdi-observer}" -f bc/sdi-observer
Extra Packages for Enterprise Linux Modular 8 - 448  B/s |  16 kB     00:36
Failed to download metadata for repo 'epel-modular'
Error: Failed to download metadata for repo 'epel-modular'
subprocess exited with status 1
subprocess exited with status 1
error: build error: error building at STEP "RUN dnf install -y   https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm &&   dnf install -y parallel procps-ng bc git httpd-tools && dnf clean all -y": exit status 1

Please try to start the build again after a minute or two like this:

# oc start-build -n "${NAMESPACE:-sdi-observer}" -F bc/sdi-observer

9.3.2. Build is failing due to proxy issues

If you see the following build error in a cluster where HTTP(S) proxy is used, make sure to update the proxy configuration.

# oc logs -n "${NAMESPACE:-sdi-observer}" -f bc/sdi-observer
Caching blobs under "/var/cache/blobs".

Pulling image registry.redhat.io/ubi8/ubi@sha256:cd014e94a9a2af4946fc1697be604feb97313a3ceb5b4d821253fcdb6b6159ee ...
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
error: build error: failed to pull image: After retrying 2 times, Pull image still failed due to error: while pulling "docker://registry.redhat.io/ubi8/ubi@sha256:cd014e94a9a2af4946fc1697be604feb97313a3ceb5b4d821253fcdb6b6159ee" as "registry.redhat.io/ubi8/ubi@sha256:cd014e94a9a2af4946fc1697be604feb97313a3ceb5b4d821253fcdb6b6159ee": Error initializing source docker://registry.redhat.io/ubi8/ubi@sha256:cd014e94a9a2af4946fc1697be604feb97313a3ceb5b4d821253fcdb6b6159ee: can't talk to a V1 docker registry

The registry.redhat.io host either needs to be whitelisted in the HTTP proxy server or it must be added to the NO_PROXY settings as in the following bash snippet. When executed, the snippet adds the registry to NO_PROXY only if it is not there yet.

# addreg="registry.redhat.io"
# oc get proxies.config.openshift.io/cluster -o json | \
    jq '.spec.noProxy |= (. | [split("\\s*,\\s*";"")[] | select((. | length) > 0)] | . as $npa |
        "'"$addreg"'" as $r | if [$npa[] | . == $r] | any then $npa else $npa + [$r] end | join(","))' | \
    oc replace -f -
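
To verify the change, the resulting noProxy list can be printed like this:

# oc get proxies.config.openshift.io/cluster -o jsonpath='{.spec.noProxy}{"\n"}'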

Wait until the machine config pools are updated and then restart the build:

# oc get machineconfigpool
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-204c0009fca2b46a9d754371404ad169   True      False      False
worker   rendered-worker-d3738db56394537bb525ab5cf008dc4f   True      False      False

For more information, please refer to Docker pull fails to GET registry.redhat.io/ content.
