SAP Data Intelligence 3 on OpenShift Container Platform 4
Table of Contents
- 1. OpenShift Container Platform validation version matrix
- 2. Requirements
- 2.1. Hardware/VM and OS Requirements
- 2.1.1. OpenShift Cluster
- 2.1.1.1. Node Kinds
- 2.1.1.2. Note on disconnected and air-gapped environments
- 2.1.1.3. Minimum Hardware Requirements
- 2.1.1.4. Minimum Production Hardware Requirements
- 2.2. Software Requirements
- 2.2.1. Compatibility Matrix
- 2.2.2. Persistent Volumes
- 2.2.3. Container Image Registry
- 2.2.3.1. Validated Registries
- 2.2.4. Checkpoint store enablement
- 2.2.5. SDI Observer
- 3. Install Red Hat OpenShift Container Platform
- 3.1. Prepare the Management host
- 3.1.1. Prepare the connected Management host
- 3.1.2. Prepare the disconnected RHEL Management host
- 3.2. Install OpenShift Container Platform
- 3.3. OpenShift Post Installation Steps
- 3.3.1. (optional) Install OpenShift Data Foundation
- 3.3.2. (optional) Install NetApp Trident
- 3.3.3. Configure SDI compute nodes
- 3.3.3.1. Air-gapped environment
- 3.3.4.1. Label the compute nodes for SAP Data Intelligence
- 3.3.4.2. Pre-load needed kernel modules
- 3.3.4.3. Change the maximum number of PIDs per Container
- 3.3.4.4. Associate MachineConfigs to the Nodes
- 3.3.4.4.1. Enable SDI on control plane
- 3.3.4.6. Verification of the node configuration
- 3.3.5. Deploy persistent storage provider
- 3.3.6. Configure S3 access and bucket
- 3.3.6.1. Using NooBaa or RADOS Object Gateway S3 endpoint as object storage
- 3.3.6.1.1. Creating an S3 bucket using CLI
- 3.3.6.1.2. Increasing object bucket limits
- 3.3.7. Set up a Container Image Registry
- 3.3.8. Configure the OpenShift Cluster for SDI
- 3.3.8.1. Becoming a cluster-admin
- 4. SDI Observer
- 4.1. Prerequisites
- 4.2.1. Prerequisites for Connected OpenShift Cluster
- 4.2.2. Prerequisites for a Disconnected OpenShift Cluster
- 4.2.3. Instantiation of Observer's Template
- 4.2.4. (Optional) SDI Observer Registry
- 4.2.4.1. SDI Registry Template parameters
- 4.3. Managing SDI Observer
- 4.3.1. Viewing and changing the current configuration
- 4.3.2. Re-deploying SDI Observer
- 5. Install SDI on OpenShift
- 5.1. Install Software Lifecycle Container Bridge
- 5.1.1. Important Parameters
- 5.1.2. Install SLC Bridge
- 5.1.2.1. Exposing SLC Bridge with OpenShift Ingress Controller
- 5.1.2.1.1. Manually exposing SLC Bridge with Ingress
- 5.1.2.2. Using an external load balancer to access SLC Bridge's NodePort
- 5.2. SDI Installation Parameters
- 5.3. Project setup
- 5.4. Install SDI
- 5.5. SDI Post installation steps
- 5.5.1. (Optional) Expose SDI services externally
- 5.5.1.1. Using OpenShift Ingress Operator
- 5.5.1.1.1. Export services with a reencrypt route
- 5.5.1.1.2. Export services with a passthrough route
- 5.5.1.2. Using NodePorts
- 5.5.2. Configure the Connection to Data Lake
- 5.5.3. SDI Validation
- 5.5.3.1. Log On to SAP Data Intelligence Launchpad
- 5.5.3.2. Check Your Machine Learning Setup
- 5.5.4. Configuration of additional tenants
- 6. OpenShift Container Platform Upgrade
- 6.1. Pre-upgrade procedures
- 6.1.1. Stop SAP Data Intelligence
- 6.2. Upgrade OpenShift
- 6.3. Post-upgrade procedures
- 7. SAP Data Intelligence Upgrade or Update
- 7.1. Pre-upgrade or pre-update procedures
- 7.1.1. Execute SDI's Pre-Upgrade Procedures
- 7.1.1.1. Automated route removal
- 7.1.1.2. Manual route removal
- 7.1.2. (upgrade) Prepare SDI Project
- 7.2. Update or Upgrade SDI
- 7.2.1. Update Software Lifecycle Container Bridge
- 7.2.2. (upgrade) Upgrade SAP Data Intelligence to a newer minor release
- 7.3. (ocp-upgrade) Upgrade OpenShift
- 7.4. SAP Data Intelligence Post-Upgrade Procedures
- 7.5. Validate SAP Data Intelligence
- 8. Appendix
- 8.1. SDI uninstallation
- 8.2. Quay Registry for SDI
- 8.2.1. Quay namespaces, users and accounts preparations
- 8.2.2. Determine the Image Repository
- 8.2.3. Importing Quay's CA Certificate to OpenShift
- 8.2.4. Configuring additional SDI tenants
- 8.2.4.1. Importing Quay's CA Certificate to SAP DI
- 8.2.4.2. Create and import vflow pull secret into OpenShift
- 8.2.4.3. Import credentials secret to SDI tenant
- 8.3. (Deprecated) Deploying SDI Registry manually
- 8.3.1. Deployment
- 8.3.1.1. Prerequisites
- 8.3.1.2. Template instantiation
- 8.3.1.3. Generic instantiation for a disconnected environment
- 8.3.2. Update instructions
- 8.3.3. Determine Registry's credentials
- 8.3.4. Verification
- 8.3.5. Post configuration
- 8.3.5.1. Making SDI Registry trusted by OpenShift
- 8.3.5.2. SDI Observer Registry tenant configuration
- 8.4. Configure OpenShift to trust container image registry
- 8.5. Configure insecure registry
- 8.6. Running multiple SDI instances on a single OpenShift cluster
- 8.7. Installing remarshal utilities on RHEL
- 8.8. (footnote ⁿ) Upgrading to the next minor release from the latest asynchronous release
- 8.9. HTTP Proxy Configuration
- 8.9.1. Configuring HTTP Proxy on the management host
- 8.9.2. Configuring HTTP Proxy on the OpenShift cluster
- 8.9.3. Configuring HTTP Proxy for the SLC Bridge
- 8.9.4. Configuring HTTP Proxy for the SAP DI during its installation
- 8.9.5. Configuring HTTP Proxy after the SAP DI installation
- 8.10. GPU enablement for SDI on OCP
- 9. Troubleshooting
In general, the installation of SAP Data Intelligence (SDI) follows these steps:
- Install Red Hat OpenShift Container Platform
- Configure the prerequisites for SAP Data Intelligence Foundation
- Install SDI Observer
- Install SAP Data Intelligence Foundation on OpenShift Container Platform
If you're interested in installation of SAP Data Hub or SAP Vora, please refer to the other installation guides:
- SAP Data Hub 2 on OpenShift Container Platform 4
- SAP Data Hub 2 on OpenShift Container Platform 3
- Install SAP Data Hub 1.X Distributed Runtime on OpenShift Container Platform
- Installing SAP Vora 2.1 on Red Hat OpenShift 3.7
Note OpenShift Container Storage (OCS) is referred to throughout this article by its new product name, OpenShift Data Foundation (ODF).
Note that OpenShift Container Platform (OCP) can be substituted by OpenShift Kubernetes Engine (OKE). OKE is sufficient and supported to run SAP Data Intelligence.
▲ Note There are known SAP image security issues that may be revealed during a security audit. Red Hat cannot resolve them. Please open a support case with SAP regarding any of the following:
- SAP containers run as root
- SAP containers run unconfined (unrestricted by SELinux)
- SAP containers require privileged security context
1. OpenShift Container Platform validation version matrix
The following version combinations of SDI 3.X, OpenShift Container Platform (OCP), RHEL or RHCOS have been validated for production environments:
For information on SAP Data Intelligence support with OpenShift releases 4.14 and later, please refer to https://access.redhat.com/articles/7042265.
SAP Data Intelligence | OpenShift Container Platform | Operating System | Infrastructure and (Storage) | Confirmed&Supported by SAP |
---|---|---|---|---|
3.0 | 4.2 † | RHCOS (nodes), RHEL 8.1+ or Fedora (Management host) | VMware vSphere (ODF 4.2) | supported † |
3.0 Patch 3 | 4.2 †, 4.4 † | RHCOS (nodes), RHEL 8.2+ or Fedora (Management host) | VMware vSphere (ODF 4) | supported † |
3.0 Patch 4 | 4.4 † | RHCOS (nodes), RHEL 8.2+ or Fedora (Management host) | VMware vSphere (ODF 4), (NetApp Trident 20.04) | supported † |
3.0 Patch 8 | 4.6 † | RHCOS (nodes), RHEL 8.2+ or Fedora (Management host) | KVM/libvirt (ODF 4) | supported † |
3.1 | 4.4 † | RHCOS (nodes), RHEL 8.3+ or Fedora (Management host) | VMware vSphere (ODF 4) | not supported¹ |
3.1 | 4.6 † | RHCOS (nodes), RHEL 8.3+ or Fedora (Management host) | VMware vSphere (ODF 4 ¡, NetApp Trident 20.10 + StorageGRID), Bare metal ∗ (ODF 4 ¡) | supported † |
3.2 | 4.6 †, 4.8 | RHCOS (nodes), RHEL 8.3+ or Fedora (Management host) | IBM Cloud™ (IBM Cloud Block Storage) | supported |
3.2 | 4.6 †, 4.8 | RHCOS (nodes), RHEL 8.3+ or Fedora (Management host) | VMware vSphere (ODF 4) | supported |
3.2 | 4.8, 4.10 | RHCOS (nodes), RHEL 8.3+ or Fedora (Management host) | Bare metal ∗ (ODF 4 ¡) | supported |
3.3 | 4.8, 4.10, 4.12 | RHCOS (nodes), RHEL 8.3+ or Fedora (Management host) | VMware vSphere (ODF 4) | supported |
3.3 | 4.8, 4.10, 4.12 | RHCOS (nodes), RHEL 8.3+ or Fedora (Management host) | Bare metal ∗ (ODF 4 ¡) | supported |
† The referenced OpenShift release is no longer supported by Red Hat!
¹ 3.1 on OpenShift 4.4 used to be supported by SAP only for the purpose of upgrade to OpenShift 4.6
∗ Validated on two different hardware configurations:
- (Dev/PoC level) Lenovo setup of 4 bare metal hosts composed of:
  - 3 schedulable control plane nodes running both ODF and SDI (Lenovo ThinkSystem SR530)
  - 1 compute node running SDI (Lenovo ThinkSystem SR530)
- (Production level) Dell Technologies bare metal cluster composed of:
  - 1 CSAH node (Dell EMC PowerEdge R640s)
  - 3 control plane nodes (Dell EMC PowerEdge R640s)
  - 3 dedicated ODF nodes (Dell EMC PowerEdge R640s)
  - 3 dedicated SDI nodes (Dell EMC PowerEdge R740xd)
  CSI-supported external Dell EMC storage options and cluster sizing options are available. CSAH stands for Cluster System Admin Host, an equivalent of the Management host.
For more information on OCP on IBM Cloud™, please refer to Getting started with Red Hat OpenShift on IBM Cloud. If using this platform, you don't need to install OpenShift and may jump directly to IBM's documentation Planning your SAP Data Intelligence deployment. It will guide you through all the installation steps and provide the appropriate links back to this Red Hat article.
Please refer to the compatibility matrix for version combinations that are considered working.
SAP Note #2871970 lists more details.
2. Requirements
2.1. Hardware/VM and OS Requirements
2.1.1. OpenShift Cluster
Make sure to consult the following official cluster requirements:
- of SAP Data Intelligence in SAP's documentation:
- of OpenShift 4 (Minimum resource requirements (4.12) / (4.10))
- additionally, if deploying OpenShift Data Foundation (aka ODF), please consult also ODF Supported configurations (4.12) / (4.10)
- if deploying on VMware vSphere, please consider also VMware vSphere infrastructure requirements (4.12) / (4.10)
- if deploying NetApp Trident, please consult also NetApp Hardware/VM and OS Requirements
2.1.1.1. Node Kinds
There are 4 kinds of nodes:
- Bootstrap Node - A temporary bootstrap node needed for the OpenShift deployment. The node is either destroyed by the installer (when using installer-provisioned infrastructure -- aka IPI) or can be deleted manually by the administrator. Alternatively, it can be re-used as a worker node. Please refer to the Installation process (4.12) / (4.10) for more information.
- Master Nodes (4.12) / (4.10) - The control plane manages the OpenShift Container Platform cluster. The control plane can be made schedulable to enable SDI workload there as well.
- Compute Nodes (4.12) / (4.10) - Run the actual workload (e.g. SDI pods). They are optional on a three-node cluster (where the master nodes are schedulable).
- ODF Nodes (4.12) / (4.10) - Run OpenShift Data Foundation (aka ODF). The nodes can be divided into starting (running both OSDs and monitors) and additional nodes (running only OSDs). Needed only when ODF shall be used as the backing storage provider.
- NOTE: Running in a compact mode (on control plane) is fully supported starting from ODF 4.8.
- Management host (aka administrator's workstation or jump host) - The Management host is used, among other things, for:
  - accessing the OpenShift cluster via a configured command line client (oc or kubectl)
  - configuring the OpenShift cluster
  - running the Software Lifecycle Container Bridge (SLC Bridge)
The hardware/software requirements for the Management host are:
- OS: Red Hat Enterprise Linux 8.1+, RHEL 7.6+ or Fedora 30+
- Disk space: 20 GiB for /
2.1.1.2. Note on disconnected and air-gapped environments
By the term "disconnected host", it is referred to a host having no access to internet.
By the term "disconnected cluster", it is referred to a cluster where each host is disconnected.
A disconnected cluster can be managed from a Management host that is either connected (having access to the internet) or disconnected.
The latter scenario (both cluster and management host being disconnected) will be referred to by the term "air-gapped".
Unless stated otherwise, whatever applies to a disconnected host, cluster or environment, applies also to the "air-gapped".
2.1.1.3. Minimum Hardware Requirements
The table below lists the minimum requirements and the minimum number of instances for each node type for the latest validated SDI and OpenShift 4.X releases. This is sufficient for a PoC (Proof of Concept) environment.
Type | Count | Operating System | vCPU ⑃ | RAM (GB) | Storage (GB) | AWS Instance Type |
---|---|---|---|---|---|---|
Bootstrap | 1 | RHCOS | 4 | 16 | 120 | m4.xlarge |
Master | 3 | RHCOS | 4 | 16 | 120 | m4.xlarge |
Compute | 3+ | RHCOS or RHEL 7.8 or 7.9 | 8 | 32 | 120 | m4.2xlarge |
On a three-node cluster, it would look like this:
Type | Count | Operating System | vCPU ⑃ | RAM (GB) | Storage (GB) | AWS Instance Type |
---|---|---|---|---|---|---|
Bootstrap | 1 | RHCOS | 4 | 16 | 120 | m4.xlarge |
Master/Compute | 3 | RHCOS | 10 | 40 | 120 | m4.xlarge |
If using ODF in internal mode, at least additional 3 (starting) nodes are recommended. Alternatively, the Compute nodes outlined above can also run ⑂ ODF pods. In that case, the hardware specifications need to be extended accordingly. The following table lists the minimum requirements for each additional node:
Type | Count | Operating System | vCPU ⑃ | RAM (GB) | Storage (GB) | AWS Instance Type |
---|---|---|---|---|---|---|
ODF starting (OSD+MON) | 3 | RHCOS | 10 | 24 | 120 + 2048 ♢ | m5.4xlarge |
2.1.1.4. Minimum Production Hardware Requirements
The minimum requirements for production systems for the latest validated SDI and OpenShift 4 releases are the following:
Type | Count | Operating System | vCPU ⑃ | RAM (GB) | Storage (GB) | AWS Instance Type |
---|---|---|---|---|---|---|
Bootstrap | 1 | RHCOS | 4 | 16 | 120 | m4.xlarge |
Master | 3+ | RHCOS | 8 | 16 | 120 | c5.xlarge |
Compute | 3+ | RHCOS or RHEL 7.8 or 7.9 | 16 | 64 | 120 | m4.4xlarge |
On a three-node cluster, it would look like this:
Type | Count | Operating System | vCPU ⑃ | RAM (GB) | Storage (GB) | AWS Instance Type |
---|---|---|---|---|---|---|
Bootstrap | 1 | RHCOS | 4 | 16 | 120 | m4.xlarge |
Master/Compute | 3 | RHCOS | 22 | 72 | 120 | c5.9xlarge |
If using ODF 4 in internal mode, at least additional 3 (starting) nodes are recommended. Alternatively, the Compute nodes outlined above can also run ODF ⑂ pods. In that case, the hardware specifications need to be extended accordingly. The following table lists the minimum requirements for each additional node:
Type | Count | Operating System | vCPU ⑃ | RAM (GB) | Storage (GB) | AWS Instance Type |
---|---|---|---|---|---|---|
ODF starting (OSD+MON) | 3 | RHCOS | 20 | 49 | 120 + 6×2048 ♢ | c5a.8xlarge |
♢ Please refer to ODF Platform Requirements (4.12) / (4.10).
⑂ Running in a compact mode (on control plane) is fully supported starting from ODF 4.8.
⑃ 1 physical core provides 2 vCPUs when hyper-threading is enabled. 1 physical core provides 1 vCPU when hyper-threading is not enabled.
2.2. Software Requirements
2.2.1. Compatibility Matrix
Later versions of SAP Data Intelligence support newer versions of Kubernetes and OpenShift Container Platform or OpenShift Kubernetes Engine. Even if not listed in the OpenShift validation version matrix above, the following version combinations are considered fully working and supported:
SAP Data Intelligence | OpenShift Container Platform ² | Worker Node | Management host | Infrastructure | Storage | Object Storage |
---|---|---|---|---|---|---|
3.0 Patch 3 or higher | 4.3, 4.4 | RHCOS | RHEL 8.1 or newer | Cloud ❄, VMware vSphere | ODF 4, NetApp Trident 20.04 or newer, vSphere volumes ♣ | ODF, NetApp StorageGRID 11.3 or newer |
3.0 Patch 8 or higher | 4.4, 4.5, 4.6 | RHCOS | RHEL 8.1 or newer | Cloud ❄, VMware vSphere | ODF 4, NetApp Trident 20.04 or newer, vSphere volumes ♣ | ODF, NetApp StorageGRID 11.3 or newer |
3.1 | 4.4, 4.5, 4.6 | RHCOS | RHEL 8.1 or newer | Cloud ❄, VMware vSphere, Bare metal | ODF 4, NetApp Trident 20.04 or newer, vSphere volumes ♣ | ODF ¡, NetApp StorageGRID 11.4 or newer |
3.2 | 4.6, 4.7, 4.8 | RHCOS | RHEL 8.1 or newer | Cloud ❄, VMware vSphere, Bare metal | ODF 4, NetApp Trident 20.04 or newer, vSphere volumes ♣ | ODF ¡, NetApp StorageGRID 11.4 or newer |
3.3 | 4.8, 4.9, 4.10, 4.11, 4.12 | RHCOS | RHEL 8.1 or newer | Cloud ❄, VMware vSphere, Bare metal | ODF 4, NetApp Trident 20.04 or newer, vSphere volumes ♣, NFS ♣ | ODF ¡, NetApp StorageGRID 11.4 or newer |
² OpenShift Kubernetes Engine (OKE) is a viable and supported substitute for OpenShift Container Platform (OCP).
❄ Cloud means any cloud provider supported by OpenShift Container Platform. For a complete list of tested and supported infrastructure platforms, please refer to OpenShift Container Platform 4.x Tested Integrations. The persistent storage in this case must be provided by the cloud provider. Please refer to Understanding persistent storage (4.12) / (4.10) for a complete list of supported storage providers.
♣ This persistent storage provider does not offer a supported object storage service required by SDI's checkpoint store and is therefore suitable only for SAP Data Intelligence development and PoC clusters. It needs to be complemented by an object storage solution for the full SDI functionality.
¡ For the full functionality (including SDI backup&restore), ODF 4.6.4 or newer is required. Alternatively, ODF external mode can be used while utilizing RGW for SDI backup&restore (checkpoint store).
Unless stated otherwise, the compatibility of a listed SDI version covers all its patch releases as well.
2.2.2. Persistent Volumes
Persistent storage is needed for SDI. It is required to use storage that can be created dynamically. You can find more information in the Understanding persistent storage (4.12) / (4.10) document.
2.2.3. Container Image Registry
The SDI installation requires a secured Image Registry where images are first mirrored from an SAP Registry and then delivered to the OpenShift cluster nodes. The integrated OpenShift Container Registry (4.12) / (4.10) is NOT appropriate for this purpose. Neither is AWS ECR Registry. For now, another image registry needs to be set up instead.
The requirements listed here are a subset of the official requirements listed in Container Registry (3.3) / (3.2) / (3.1).
The word secured in this context means that the communication is encrypted using TLS, ideally with certificates signed by a trusted certificate authority. If the registry is also exposed publicly, it must require authentication and authorization in order to pull SAP images.
2.2.3.1. Validated Registries
- (recommended) Red Hat Quay 3.6 or higher is compatible with SAP Data Intelligence images and is supported for this purpose. The Quay registry can run either on the OpenShift cluster itself, on another OpenShift cluster, or standalone. For more information, please see Quay Registry for SAP DI.
- (deprecated) SDI Registry is a community-supported container image registry satisfying the requirements. Please refer to Deploying SDI Registry for more information.
When finished, you should have an external image registry up and running. We will use the URL local.image.registry:5000
as an example. You can verify its readiness with the following command.
# curl -k https://local.image.registry:5000/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}
2.2.4. Checkpoint store enablement
In order to enable SAP Vora Database streaming tables, checkpoint store needs to be enabled. The store is an object storage on a particular storage back-end. Several back-end types are supported by the SDI installer that cover most of the storage cloud providers.
The enablement is strongly recommended for production clusters. Clusters having this feature disabled are suitable only for test, development or PoC use-cases.
Make sure to create a desired bucket before the SDI Installation. If the checkpoint store shall reside in a directory on a bucket, the directory needs to exist as well.
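If your object storage exposes an S3-compatible API, the bucket and an optional directory prefix can be created, for example, with the AWS CLI. This is only a sketch; the endpoint URL, credentials and bucket name below are placeholders for illustration, not values prescribed by SDI:
# export AWS_ACCESS_KEY_ID=<your-access-key>
# export AWS_SECRET_ACCESS_KEY=<your-secret-key>
# # create the bucket on the S3-compatible endpoint
# aws --endpoint-url https://s3.example.com s3 mb s3://sdi-checkpoint-store
# # optionally create the directory (prefix) inside the bucket
# aws --endpoint-url https://s3.example.com s3api put-object \
    --bucket sdi-checkpoint-store --key checkpoints/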
2.2.5. SDI Observer
SDI Observer is a pod that monitors the SDI namespace and modifies objects there to enable SDI to run on top of OpenShift. The observer shall run in a dedicated namespace. It must be deployed before the SDI installation is started. The SDI Observer section will guide you through the process of deployment.
3. Install Red Hat OpenShift Container Platform
3.1. Prepare the Management host
Note the following has been tested on RHEL 8.4. The steps shall be similar for other RPM-based Linux distributions. Recommended are RHEL 7.7+, Fedora 30+ and CentOS 7+.
3.1.1. Prepare the connected Management host
-
Subscribe the Management host at least to the following repositories:
# OCP_RELEASE=4.12
# sudo subscription-manager repos \
    --enable=rhel-8-for-x86_64-appstream-rpms \
    --enable=rhel-8-for-x86_64-baseos-rpms \
    --enable=rhocp-${OCP_RELEASE:-4.12}-for-rhel-8-x86_64-rpms
-
Install the jq binary. This installation guide has been tested with jq 1.6.
- On RHEL 8, make sure the rhocp-4.12-for-rhel-8-x86_64-rpms repository (or newer) is enabled and install it from there:
# dnf install jq-1.6
- On earlier releases or other distributions, download the binary from upstream:
# sudo curl -L -o /usr/local/bin/jq https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64
# sudo chmod a+x /usr/local/bin/jq
-
Download and install OpenShift client binaries.
# sudo dnf install -y openshift-clients
3.1.2. Prepare the disconnected RHEL Management host
Please refer to KB#3176811 Creating a Local Repository and Sharing With Disconnected/Offline/Air-gapped Systems and KB#29269 How can we regularly update a disconnected system (A system without internet connection)?.
Install jq-1.6
and openshift-clients
from your local RPM repository.
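Assuming the local repository is already configured on the host, the installation boils down to a single command (a sketch):
# sudo dnf install -y jq-1.6 openshift-clients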
3.2. Install OpenShift Container Platform
Install OpenShift Container Platform on your desired cluster hosts. Follow the OpenShift installation guide (4.12) / (4.10)
Several changes need to be done to compute nodes running SDI workloads before SDI installation. These include:
- pre-load needed kernel modules
- increasing the PIDs limit of CRI-O container engine
They will be described in the next section.
3.3. OpenShift Post Installation Steps
3.3.1. (optional) Install OpenShift Data Foundation
Red Hat OpenShift Data Foundation (ODF) has been validated as the persistent storage provider for SAP Data Intelligence. Please refer to the ODF documentation (4.12) / (4.10)
Please make sure to read and follow Disconnected Environment (4.12) / (4.10) if you install on a disconnected cluster.
3.3.2. (optional) Install NetApp Trident
NetApp Trident together with StorageGRID have been validated for SAP Data Intelligence and OpenShift. More details can be found at SAP Data Intelligence on OpenShift 4 with NetApp Trident.
3.3.3. Configure SDI compute nodes
Some SDI components require changes on the OS level of compute nodes. These could impact other workloads running on the same cluster. To prevent that from happening, it is recommended to dedicate a set of nodes to SDI workload. The following needs to be done:
- Chosen nodes must be labeled e.g. using the
node-role.kubernetes.io/sdi=""
label. - MachineConfigs specific to SDI need to be created; they will be applied only to the selected nodes.
- MachineConfigPool must be created to associate the chosen nodes with the newly created MachineConfigs.
- no change will be done to the nodes until this point
- (optional) Apply a node selector to
sdi
,sap-slcbridge
anddatahub-system
projects.- SDI Observer can be configured to do that with
SDI_NODE_SELECTOR
parameter
- SDI Observer can be configured to do that with
Before modifying the recommended approach below, please make yourself familiar with the custom pools concept of the machine config operator.
3.3.3.1. Air-gapped environment
If the Management host does not have access to the internet, you will need to clone the sap-data-intelligence git repository to some other host and make it available on the Management host. For example:
# cd /var/run/user/1000/usb-disk/
# git clone https://github.com/redhat-sap/sap-data-intelligence
Then on the Management host:
-
unless the local checkout already exists, copy it from the disk:
# git clone /var/run/user/1000/usb-disk/sap-data-intelligence ~/sap-data-intelligence
-
otherwise, re-apply local changes (if any) to the latest code:
# cd ~/sap-data-intelligence
# git stash                 # temporarily remove local changes
# git remote add drive /var/run/user/1000/usb-disk/sap-data-intelligence
# git fetch drive
# git merge drive/master    # apply the latest changes from drive to the local checkout
# git stash pop             # re-apply the local changes on top of the latest code
3.3.4.1. Label the compute nodes for SAP Data Intelligence
Choose compute nodes for the SDI workload and label them from the Management host like this:
# oc label node/sdi-worker{1,2,3} node-role.kubernetes.io/sdi=""
This step, combined with the SDI Observer functionality, ensures that SAP DI-related workloads in specific namespaces are run on the labeled nodes. The SDI Observer adds a node selector annotation to the relevant SAP DI namespaces (e.g., openshift.io/node-selector: node-role.kubernetes.io/sdi=), which typically include the SAP DI namespace, datahub-system, and sap-slcbridge namespaces.
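Once the SDI Observer is running, you can verify that the node selector annotation has been applied to a namespace. A sketch, assuming the SDI namespace is sdi:
# oc get namespace sdi -o jsonpath='{.metadata.annotations.openshift\.io/node-selector}{"\n"}'
node-role.kubernetes.io/sdi=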
3.3.4.2. Pre-load needed kernel modules
To apply the desired changes to the existing and future SDI compute nodes, please create another machine config like this:
-
(connected management host)
# oc apply -f https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/snippets/mco/mc-75-worker-sap-data-intelligence.yaml
-
(disconnected management host)
# oc apply -f sap-data-intelligence/master/snippets/mco/mc-75-worker-sap-data-intelligence.yaml
∇ NOTE: If the warning below appears, it can usually be ignored. It suggests that the resource already exists on the cluster and was created by neither of the listed commands. In earlier versions of this documentation, plain oc create
used to be recommended instead.
Warning: oc apply should be used on resource created by either oc create --save-config or oc apply
3.3.4.3. Change the maximum number of PIDs per Container
For OCP 4.10 and previous releases, the process of configuring the nodes is described at Modifying Nodes (4.10). In the SDI case, the required setting is .spec.containerRuntimeConfig.pidsLimit in a ContainerRuntimeConfig. The result is a modified /etc/crio/crio.conf configuration file on each affected worker node with pids_limit set to the desired value. Please create a ContainerRuntimeConfig like this:
-
(connected management host)
# oc apply -f https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/snippets/mco/ctrcfg-sdi-pids-limit.yaml
-
(disconnected management host)
# oc apply -f sap-data-intelligence/master/snippets/mco/ctrcfg-sdi-pids-limit.yaml
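For reference, the applied manifest amounts to roughly the following. This is only a sketch; the canonical version is the one in the referenced repository. The pidsLimit value of 16384 matches the verification step further below, while the resource name and the pool selector label are assumptions:
# oc apply -f - <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: sdi-pids-limit                    # assumed name
spec:
  machineConfigPoolSelector:
    matchLabels:
      workload: sapdataintelligence       # assumed label carried by the sdi MachineConfigPool
  containerRuntimeConfig:
    pidsLimit: 16384
EOF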
Starting with OCP 4.11, the CRI-O configuration is deprecated in favor of the configuration in the KubeletConfig, and the default podPidsLimit changed to 4096. The process of configuring the node is described at Modifying Nodes (4.12). In the SDI case, we need to increase the podPidsLimit in the KubeletConfig. The result is a modified /etc/kubernetes/kubelet.conf configuration file on each affected worker node with podPidsLimit set to the desired value. Please create a KubeletConfig like this:
-
(connected management host)
# oc apply -f https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/snippets/mco/kubeletconfig-sdi-pids-limit.yaml
-
(disconnected management host)
# oc apply -f sap-data-intelligence/master/snippets/mco/kubeletconfig-sdi-pids-limit.yaml
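Again for reference, the applied KubeletConfig is roughly shaped as follows. This is only a sketch; the canonical version is the one in the referenced repository, and the resource name and pool selector label are assumptions:
# oc apply -f - <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: sdi-pids-limit                    # assumed name
spec:
  machineConfigPoolSelector:
    matchLabels:
      workload: sapdataintelligence       # assumed label carried by the sdi MachineConfigPool
  kubeletConfig:
    podPidsLimit: 16384
EOF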
3.3.4.4. Associate MachineConfigs to the Nodes
Define a new MachineConfigPool associating MachineConfigs to the nodes. The nodes will inherit all the MachineConfigs targeting worker
and sdi
roles.
-
(connected management host)
# oc apply -f https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/snippets/mco/mcp-sdi.yaml
-
(disconnected management host)
# oc apply -f sap-data-intelligence/master/snippets/mco/mcp-sdi.yaml
Note that you may see a warning ∇ if the MachineConfigPool already exists.
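For reference, such a MachineConfigPool is roughly shaped like this (a sketch only; the canonical manifest lives in the referenced repository). It selects MachineConfigs carrying either the worker or the sdi role and targets the nodes labeled earlier; the workload label is an assumption based on how the master pool is labeled later in this guide:
# oc apply -f - <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: sdi
  labels:
    workload: sapdataintelligence         # assumed label, referenced by the PID-limit configs
spec:
  machineConfigSelector:
    matchExpressions:
    - key: machineconfiguration.openshift.io/role
      operator: In
      values: [worker, sdi]
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/sdi: ""
EOF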
The changes will be rendered into machineconfigpool/sdi
. The workers will be restarted one-by-one until the changes are applied to all of them. See Applying configuration changes to the cluster (4.12) / (4.10) for more information.
The following command can be used to wait until the change gets applied to all the worker nodes:
# oc wait mcp/sdi --all --for=condition=updated
After performing the changes above, you should end up with a new role sdi
assigned to the chosen nodes and a new MachineConfigPool containing the nodes:
# oc get nodes
NAME STATUS ROLES AGE VERSION
ocs-worker1 Ready worker 32d v1.19.0+9f84db3
ocs-worker2 Ready worker 32d v1.19.0+9f84db3
ocs-worker3 Ready worker 32d v1.19.0+9f84db3
sdi-worker1 Ready sdi,worker 32d v1.19.0+9f84db3
sdi-worker2 Ready sdi,worker 32d v1.19.0+9f84db3
sdi-worker3 Ready sdi,worker 32d v1.19.0+9f84db3
master1 Ready master 32d v1.19.0+9f84db3
master2 Ready master 32d v1.19.0+9f84db3
master3 Ready master 32d v1.19.0+9f84db3
# oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT
master rendered-master-15f⋯ True False False 3 3 3 0
sdi rendered-sdi-f4f⋯ True False False 3 3 3 0
worker rendered-worker-181⋯ True False False 3 3 3 0
3.3.4.4.1. Enable SDI on control plane
If the control plane (or master nodes) shall be used for running SDI workload, in addition to the previous step, one needs to perform the following:
- Please make sure the control plane is schedulable
-
Duplicate the machine configs for master nodes:
# oc get -o json mc -l machineconfiguration.openshift.io/role=sdi | jq '.items[] | select((.metadata.annotations//{}) | has("machineconfiguration.openshift.io/generated-by-controller-version") | not) | .metadata |= ( .name |= sub("^(?<i>(\\d+-)*)(worker-)?"; "\(.i)master-") | .labels |= {"machineconfiguration.openshift.io/role": "master"} )' | oc apply -f -
Note that you may see a couple of warnings ∇ if this has been done earlier.
-
Make the master machine config pool inherit the PID limits changes:
# oc label mcp/master workload=sapdataintelligence
The following command can be used to wait until the change gets applied to all the master nodes:
# oc wait mcp/master --all --for=condition=updated
3.3.4.6. Verification of the node configuration
The following steps assume that the node-role.kubernetes.io/sdi=""
label has been applied to nodes running the SDI workload. All the commands shall be executed on the Management host. All the diagnostics commands will be run in parallel on such nodes.
-
(disconnected only) Make one of the tools images available for your cluster:
-
Either use the image stream
openshift/tools
:-
Make sure the image stream has been populated:
# oc get -n openshift istag/tools:latest
Example output:
NAME           IMAGE REFERENCE                                                        UPDATED
tools:latest   quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:13c...           17 hours ago
If it is not the case, make sure your registry mirror CA certificate is trusted.
-
Set the following variable:
# ocDebugArgs="--image-stream=openshift/tools:latest"
-
-
Or make the registry.redhat.io/rhel8/support-tools image available in your local registry:
# LOCAL_REGISTRY=local.image.registry:5000
# podman login registry.redhat.io
# podman login "$LOCAL_REGISTRY"   # if the local registry requires authentication
# skopeo copy --remove-signatures \
    docker://registry.redhat.io/rhel8/support-tools:latest \
    docker://"$LOCAL_REGISTRY/rhel8/support-tools:latest"
# ocDebugArgs="--image=$LOCAL_REGISTRY/rhel8/support-tools:latest"
-
-
Verify that the PID limit has been increased to 16384:
For OCP 4.10 and previous releases, check if the CRI-O pids_limit is being set on the node where the application container is running:
# oc get nodes -l node-role.kubernetes.io/sdi= -o name | \
xargs -P 6 -n 1 -i oc debug $ocDebugArgs {} -- chroot /host /bin/bash -c \
"crio-status config | awk '/pids_limit/ {
print ENVIRON[\"HOSTNAME\"]\":\t\"\$0}'" |& grep pids_limit
NOTE: $ocDebugArgs
is set only in a disconnected environment, otherwise it shall be empty.
An example output could look like this:
sdi-worker3: pids_limit = 16384
sdi-worker1: pids_limit = 16384
sdi-worker2: pids_limit = 16384
For OCP 4.11 and later releases, check if the kubelet podPidsLimit is being set in /etc/kubernetes/kubelet.conf by running the following command:
# oc get nodes -l node-role.kubernetes.io/sdi= -o name | \
xargs -P 6 -n 1 -i oc debug $ocDebugArgs {} -- chroot /host /bin/bash -c \
"cat /etc/kubernetes/kubelet.conf" | jq '.podPidsLimit'
-
Verify that the kernel modules have been loaded:
# oc get nodes -l node-role.kubernetes.io/sdi= -o name | \ xargs -P 6 -n 1 -i oc debug $ocDebugArgs {} -- chroot /host /bin/sh -c \ "lsmod | awk 'BEGIN {ORS=\":\t\"; print ENVIRON[\"HOSTNAME\"]; ORS=\",\"} /^(nfs|ip_tables|iptable_nat|[^[:space:]]+(REDIRECT|owner|filter))/ { print \$1 }'; echo" 2>/dev/null
An example output could look like this:
sdi-worker2: iptable_filter,iptable_nat,xt_owner,xt_REDIRECT,nfsv4,nfs,nfsd,nfs_acl,ip_tables,
sdi-worker3: iptable_filter,iptable_nat,xt_owner,xt_REDIRECT,nfsv4,nfs,nfsd,nfs_acl,ip_tables,
sdi-worker1: iptable_filter,iptable_nat,xt_owner,xt_REDIRECT,nfsv4,nfs,nfsd,nfs_acl,ip_tables,
If any of the following modules is missing on any of the SDI nodes, the module loading does not work: iptable_nat, nfsv4, nfsd, ip_tables, xt_owner.
To further debug missing modules, one can execute also the following command:
# oc get nodes -l node-role.kubernetes.io/sdi= -o name | \ xargs -P 6 -n 1 -i oc debug $ocDebugArgs {} -- chroot /host /bin/bash -c \ "( for service in {sdi-modules-load,systemd-modules-load}.service; do \ printf '%s:\t%s\n' \$service \$(systemctl is-active \$service); \ done; find /etc/modules-load.d -type f \ -regex '.*\(sap\|sdi\)[^/]+\.conf\$' -printf '%p\n';) | \ awk '{print ENVIRON[\"HOSTNAME\"]\":\t\"\$0}'" 2>/dev/null
Please make sure that both systemd services are active and at least one *.conf file is listed for each host, as shown in the following example output:
sdi-worker3: sdi-modules-load.service: active
sdi-worker3: systemd-modules-load.service: active
sdi-worker3: /etc/modules-load.d/sdi-dependencies.conf
sdi-worker1: sdi-modules-load.service: active
sdi-worker1: systemd-modules-load.service: active
sdi-worker1: /etc/modules-load.d/sdi-dependencies.conf
sdi-worker2: sdi-modules-load.service: active
sdi-worker2: systemd-modules-load.service: active
sdi-worker2: /etc/modules-load.d/sdi-dependencies.conf
3.3.5. Deploy persistent storage provider
Unless your platform already offers a supported persistent storage provider, one needs to be deployed. Please refer to Understanding persistent storage (4.12) / (4.10) for an overview of possible options.
On OpenShift, one can deploy OpenShift Data Foundation (ODF) (4.12) / (4.10) running converged on OpenShift nodes providing both persistent volumes and object storage. Please refer to ODF Planning your Deployment (4.12) / (4.10) and Deploying OpenShift Data Foundation (4.12) / (4.10) for more information and installation instructions.
ODF can be deployed also in a disconnected environment (4.12) / (4.10).
For NFS-based persistent storage, dynamic storage provisioning is a prerequisite. Please refer to the following article for detailed information: "How do I create a storage class for NFS dynamic storage provisioning in an OpenShift environment?".
3.3.6. Configure S3 access and bucket
Object storage is required for the following features of SDI:
- backup&restore (previously checkpoint store) feature providing regular back-ups of SDI service data
- SDL Data Lake connection (3.3) / (3.2) / (3.1) for the machine learning scenarios
Several interfaces to the object storage are supported by SDI. S3 interface is one of them. Please take a look at Checkpoint Store Type at Required Input Parameters (3.3) / (3.2) / (3.1) for the complete list. SAP help page covers preparation of object store (3.3) / (3.2) / (3.1).
Backup&restore can be enabled against ODF NooBaa's S3 endpoint as long as ODF is of version 4.6.4 or newer, or against RADOS Object Gateway S3 endpoint when ODF is deployed in the external mode.
3.3.6.1. Using NooBaa or RADOS Object Gateway S3 endpoint as object storage
ODF contains the NooBaa object data service for hybrid and multi cloud environments, which provides an S3 API that can be used with SAP Data Intelligence. Starting from ODF release 4.6.4, it can also be used for SDI's backup&restore functionality. Alternatively, the functionality can be enabled against the RADOS Object Gateway S3 endpoint (from now on just RGW), which is available when ODF is deployed in the external mode (4.12) / (4.10).
For SDI, one needs to provide the following:
- S3 host URL prefixed either with https:// or http://
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- bucket name
NOTE: In case of https://
, the endpoint must be secured by certificates signed by a trusted certificate authority. Self-signed CAs will not work out of the box as of now.
Once ODF is deployed, one can create the access keys and buckets using one of the following:
- (internal mode only) via NooBaa Management Console by default exposed at
noobaa-mgmt-openshift-storage.apps.<cluster_name>.<base_domain>
- (both internal and external modes) via CLI with
mksdibuckets
script
In both cases, the S3 endpoint provided to SAP Data Intelligence cannot be secured with a self-signed certificate as of now. Unless the endpoints are secured with a properly signed certificate, one must use an insecure HTTP connection. Both NooBaa and RGW come with such an insecure service reachable from inside the cluster (within the SDN); it cannot be resolved from outside of the cluster unless exposed via e.g. a route.
The following two URLs are the example endpoints on OpenShift cluster with ODF deployed.
- http://s3.openshift-storage.svc.cluster.local - NooBaa S3 endpoint, always available
- http://rook-ceph-rgw-ocs-external-storagecluster-cephobjectstore.openshift-storage.svc.cluster.local:8080 - RGW endpoint that shall preferably be used when ODF is deployed in the external mode
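A quick way to check that one of these in-cluster endpoints responds is to query it from a temporary pod. This is only a sketch; the ubi8 image and the pod name are arbitrary choices, not something mandated by SDI or ODF:
# oc run s3-check --rm -i --restart=Never \
    --image=registry.access.redhat.com/ubi8/ubi -- \
    curl -sS -o /dev/null -w '%{http_code}\n' http://s3.openshift-storage.svc.cluster.local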
3.3.6.1.1. Creating an S3 bucket using CLI
The bucket can be created with the command below executed from the Management host. Be sure to switch to the appropriate project/namespace (e.g. sdi) first before executing the following command, or append the parameter -n SDI_NAMESPACE to it.
-
(connected management host)
# bash <(curl -s https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/utils/mksdibuckets)
-
(disconnected management host)
# bash sap-data-intelligence/master/utils/mksdibuckets
By default, two buckets will be created. You can list them this way:
-
(connected management host)
# bash <(curl -s https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/utils/mksdibuckets) list
-
(disconnected management host)
# bash sap-data-intelligence/master/utils/mksdibuckets list
Example output:
Bucket claim namespace/name: sdi/sdi-checkpoint-store (Status: Bound, Age: 7m33s)
Cluster internal URL: http://s3.openshift-storage.svc.cluster.local
Bucket name: sdi-checkpoint-store-ef4999e0-2d89-4900-9352-b1e1e7b361d9
AWS_ACCESS_KEY_ID: LQ7YciYTw8UlDLPi83MO
AWS_SECRET_ACCESS_KEY: 8QY8j1U4Ts3RO4rERXCHGWGIhjzr0SxtlXc2xbtE
Bucket claim namespace/name: sdi/sdi-data-lake (Status: Bound, Age: 7m33s)
Cluster internal URL: http://s3.openshift-storage.svc.cluster.local
Bucket name: sdi-data-lake-f86a7e6e-27fb-4656-98cf-298a572f74f3
AWS_ACCESS_KEY_ID: cOxfi4hQhGFW54WFqP3R
AWS_SECRET_ACCESS_KEY: rIlvpcZXnonJvjn6aAhBOT/Yr+F7wdJNeLDBh231
# # NOTE: for more information and options, run the command with --help
The example above uses ODF NooBaa's S3 endpoint which is always the preferred choice for ODF internal mode.
The values of the claim sdi-checkpoint-store
shall be passed to the following SLC Bridge parameters during SDI's installation in order to enable the backup&restore (previously known as checkpoint store) functionality.
Parameter | Example value |
---|---|
Object Store Type | S3 compatible object store |
Access Key | LQ7YciYTw8UlDLPi83MO |
Secret Key | 8QY8j1U4Ts3RO4rERXCHGWGIhjzr0SxtlXc2xbtE |
Endpoint | http://s3.openshift-storage.svc.cluster.local |
Path | sdi-checkpoint-store-ef4999e0-2d89-4900-9352-b1e1e7b361d9 |
Disable Certificate Validation | Yes |
3.3.6.1.2. Increasing object bucket limits
NOTE: needed only for RGW (ODF external mode)
When performing checkpoint store validation during SDI installation, the installer will create a temporary bucket. For that to work with the RGW, bucket's owner limit on maximum allocatable buckets needs to be increased. The limit is set to 1 by default.
You can use the following command to perform the needed changes for the bucket assigned to the backup&restore (checkpoint store). Please execute it on the management node of the external Red Hat Ceph Storage cluster (or on the host where the external RGW service runs). The last argument is the "Bucket name", not the "Bucket claim name".
-
(connected management host)
# bash <(curl -s https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/utils/rgwtunebuckets) \ sdi-checkpoint-store-ef4999e0-2d89-4900-9352-b1e1e7b361d9
-
(disconnected management host)
# bash sap-data-intelligence/master/utils/rgwtunebuckets \ sdi-checkpoint-store-ef4999e0-2d89-4900-9352-b1e1e7b361d9
For more information and additional options, append --help
parameter at the end.
3.3.7. Set up a Container Image Registry
If you haven't done so already, please follow the Container Image Registry prerequisite.
NOTE: It is now required to use a registry secured by TLS for SDI. Plain HTTP
will not do.
If the registry's certificate is signed by a proper trusted (not self-signed) certificate authority, this section may be skipped.
There are two ways to make OpenShift trust an additional registry using certificates signed by a self-signed certificate authority:
- (recommended) update the CA certificate trust in OpenShift's image configuration.
- (less secure) mark the registry as insecure
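The recommended variant boils down to adding the registry's CA certificate to the cluster's image configuration. A minimal sketch, assuming the CA certificate is stored locally in registry-ca.crt and the registry listens on local.image.registry:5000 (note the ".." replacing ":" in the ConfigMap key for a registry with a port):
# oc create configmap registry-cas -n openshift-config \
    --from-file=local.image.registry..5000=registry-ca.crt
# oc patch image.config.openshift.io/cluster --type=merge \
    -p '{"spec":{"additionalTrustedCA":{"name":"registry-cas"}}}'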
3.3.8. Configure the OpenShift Cluster for SDI
3.3.8.1. Becoming a cluster-admin
Many commands below require cluster admin privileges. To become a cluster-admin, you can do one of the following:
-
Use the auth/kubeconfig generated in the working directory during the installation of the OpenShift cluster:
INFO Install complete!
INFO Run 'export KUBECONFIG=<your working directory>/auth/kubeconfig' to manage the cluster with 'oc', the OpenShift CLI.
INFO The cluster is ready when 'oc login -u kubeadmin -p <provided>' succeeds (wait a few minutes).
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.demo1.openshift4-beta-abcorp.com
INFO Login to the console with user: kubeadmin, password: <provided>
# export KUBECONFIG=working_directory/auth/kubeconfig
# oc whoami
system:admin
-
As a
system:admin
user or a member ofcluster-admin
group, make another user a cluster admin to allow him to perform the SDI installation:- As a cluster-admin, configure the authentication (4.12) / (4.10) and add the desired user (e.g.
sdiadmin
). -
As a cluster-admin, grant the user a permission to administer the cluster:
# oc adm policy add-cluster-role-to-user cluster-admin sdiadmin
- As a cluster-admin, configure the authentication (4.12) / (4.10) and add the desired user (e.g.
You can learn more about the cluster-admin role in Cluster Roles and Local Roles article (4.12) / (4.10)
4. SDI Observer
SDI Observer monitors SDI and SLC Bridge namespaces and applies changes to SDI deployments to allow SDI to run on OpenShift. Among other things, it does the following:
- adds an additional persistent volume to the vsystem-vrep StatefulSet to allow it to run on an RHCOS system
- grants the fluentd pods permissions to access logs
- reconfigures the fluentd pods to parse plain text file container logs on the OpenShift 4 nodes
- exposes SDI System Management service
- exposes SLC Bridge service
- (optional) deploys the SDI Registry suitable for mirroring, storing and serving SDI images and for use by the Pipeline Modeler
- (optional) creates
cmcertificates
secret to allow SDI to talk to container image registry secured by a self-signed CA certificate early during the installation time
It is deployed as an OpenShift template. Its behaviour is controlled by the template's parameters which are mirrored to its environment variables.
Deploy SDI Observer in its own k8s namespace (e.g. sdi-observer
). Please refer to its documentation for the complete list of issues that it currently attempts to solve.
4.1. Prerequisites
The following must be satisfied before SDI Observer can be deployed:
- OpenShift cluster must be healthy including all the cluster operators.
- The OpenShift integrated image registry (4.12) / (4.10) must be properly configured and working.
4.2.1. Prerequisites for Connected OpenShift Cluster
In order to build the images needed for SDI Observer, a secret with credentials for registry.redhat.io needs to be created in the namespace of SDI Observer. Please visit Red Hat Registry Service Accounts to obtain the OpenShift secret. For more details, please refer to Red Hat Container Registry Authentication. We will refer to the file as rht-registry-secret.yaml. The import to the OpenShift cluster will be covered below.
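If you prefer to import the secret by hand (the run script referenced later can also handle this via its REDHAT_REGISTRY_SECRET_PATH parameter), a minimal sketch, assuming the observer namespace sdi-observer:
# oc create namespace sdi-observer --dry-run=client -o yaml | oc apply -f -
# oc apply -n sdi-observer -f rht-registry-secret.yaml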
4.2.2. Prerequisites for a Disconnected OpenShift Cluster
On a disconnected OpenShift cluster, it is necessary to mirror a pre-built image of SDI Observer to a local container image registry. Please follow Disconnected OpenShift cluster instructions.
4.2.3. Instantiation of Observer's Template
Assuming the SDI will be run in the SDI_NAMESPACE
which is different from the observer NAMESPACE
, instantiate the template with default parameters like this:
-
Prepare the script and images depending on your system connectivity.
-
In a connected environment, download the run script from git repository like this:
# curl -O https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/observer/run-observer-template.sh
-
In a disconnected environment, where the Management host is connected.
Mirror the SDI Observer image to the local registry. For example, on RHEL8:
# podman login local.image.registry:5000   # if the local registry requires authentication
# skopeo copy \
    docker://quay.io/redhat-sap-cop/sdi-observer:latest-ocp4.12 \
    docker://local.image.registry:5000/sdi-observer:latest-ocp4.12
Please make sure to modify the
4.12
suffix according to your OpenShift server minor release. -
In an air-gapped environment (assuming the observer repository has been already cloned to the Management host):
-
On a host with access to the internet, copy the SDI Observer image to an archive on USB drive. For example, on RHEL8:
# skopeo copy \ docker://quay.io/redhat-sap-cop/sdi-observer:latest-ocp4.12 \ oci-archive:/var/run/user/1000/usb-disk/sdi-observer.tar:latest-ocp4.12
-
Plug the USB drive to the Management host (without internet access) and mirror the image from it to your
local.image.registry:5000
:# skopeo copy \ oci-archive:/var/run/user/1000/usb-disk/sdi-observer.tar:latest-ocp4.12 \ docker://local.image.registry:5000/sdi-observer:latest-ocp4.12
-
-
-
Edit the downloaded
run-observer-template.sh
file in your favorite editor. Especially, mind theFLAVOUR
,NAMESPACE
andSDI_NAMESPACE
parameters.- for the
ubi-build
flavour, make sure to setREDHAT_REGISTRY_SECRET_PATH=to/your/rht-registry-secret.yaml
downloaded earlier - for a disconnected environment, make sure to set
FLAVOUR
toocp-prebuilt
andIMAGE_PULL_SPEC
to yourlocal.image.registry:5000
- for an air-gapped environment, set also
SDI_OBSERVER_REPOSITORY=to/local/git/repo/checkout
- for the
-
Run it in bash like this:
# bash ./run-observer-template.sh
-
Keep the modified script around for future updates.
4.2.4. (Optional) SDI Observer Registry
NOTE: SDI Observer can optionally deploy SDI Registry on a connected OpenShift cluster only. For a disconnected environment, please refer to Generic instantiation for a disconnected environment.
If the observer is configured to deploy SDI Registry via DEPLOY_SDI_REGISTRY=true
parameter, it will deploy the deploy-registry
job which does the following:
- (connected only) builds the
container-image-registry
image and pushes it to the integrated OpenShift Image Registry - generates or uses configured credentials for the registry
- deploys
container-image-registry
deployment config which in turn deploys a corresponding pod -
exposes the registry using a route
- if observer's
SDI_REGISTRY_ROUTE_HOSTNAME
parameter is set, it will be used as its hostname - otherwise the registry's hostname will be
container-image-registry-${NAMESPACE}.apps.<cluster_name>.<base_domain>
- if observer's
4.2.4.1. SDI Registry Template parameters
The following Observer's Template Parameters influence the deployment of the SDI Registry:
Parameter | Example value | Description |
---|---|---|
DEPLOY_SDI_REGISTRY |
true |
Whether to deploy SDI Registry for the purpose of SAP Data Intelligence. |
REDHAT_REGISTRY_SECRET_NAME |
123456-username-pull-secret |
Name of the secret with credentials for the registry.redhat.io registry. Please visit Red Hat Registry Service Accounts to obtain the OpenShift secret. For more details, please refer to Red Hat Container Registry Authentication. Must be provided in order to build the registry's image. |
SDI_REGISTRY_ROUTE_HOSTNAME |
registry.cluster.tld |
This variable will be used as the SDI Registry's hostname when creating the corresponding route. Defaults to container-image-registry-$NAMESPACE.apps.<cluster_name>.<base_domain> . If set, the domain name must resolve to the IP of the ingress router. |
INJECT_CABUNDLE |
true |
Inject CA certificate bundle into SAP Data Intelligence pods. The bundle can be specified with CABUNDLE_SECRET_NAME . It is needed if the registry is secured by a self-signed certificate. |
CABUNDLE_SECRET_NAME |
custom-ca-bundle |
The name of the secret containing certificate authority bundle that shall be injected into Data Intelligence pods. By default, the secret bundle is obtained from openshift-ingress-operator namespace where the router-ca secret contains the certificate authority used to sign all the edge and reencrypt routes that are, among others, used for SDI_REGISTRY and S3 API services. The secret name may be optionally prefixed with $namespace/ . |
SDI_REGISTRY_STORAGE_CLASS_NAME |
ocs-storagecluster-cephfs |
Unless given, the default storage class will be used. If possible, prefer volumes with ReadWriteMany (RWX ) access mode. |
REPLACE_SECRETS |
true |
By default, existing SDI_REGISTRY_HTPASSWD_SECRET_NAME secret will not be replaced if it already exists. If the registry credentials shall be changed while using the same secret name, this must be set to true . |
SDI_REGISTRY_AUTHENTICATION |
none |
Set to none if the registry shall not require any authentication at all. The default is to secure the registry with an htpasswd file, which is necessary if the registry is publicly available (e.g. when exposed via an ingress route which is globally resolvable). |
SDI_REGISTRY_USERNAME |
registry-user |
Will be used to generate htpasswd file to provide authentication data to the sdi registry service as long as SDI_REGISTRY_HTPASSWD_SECRET_NAME does not exist or REPLACE_SECRETS is true . Unless given, it will be autogenerated by the job. |
SDI_REGISTRY_PASSWORD |
secure-password |
ditto |
SDI_REGISTRY_HTPASSWD_SECRET_NAME |
registry-htpasswd |
A secret with htpasswd file with authentication data for the sdi image container. If given and the secret exists, it will be used instead of SDI_REGISTRY_USERNAME and SDI_REGISTRY_PASSWORD . Defaults to container-image-registry-htpasswd . Please make sure to follow the official guidelines on generating the htpasswd file. |
SDI_REGISTRY_VOLUME_CAPACITY |
250Gi |
Volume space available for container images. Defaults to 120Gi . |
SDI_REGISTRY_VOLUME_ACCESS_MODE |
ReadWriteMany |
If the given SDI_REGISTRY_STORAGE_CLASS_NAME or the default storage class supports ReadWriteMany ("RWX") access mode, please change this to ReadWriteMany . For example, the ocs-storagecluster-cephfs storage class, deployed by ODF operator, does support it. |
To use them, please set the desired parameters in the run-observer-template.sh
script in the section above.
Monitoring registry's deployment
# oc logs -n "${NAMESPACE:-sdi-observer}" -f job/deploy-registry
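To block until the job has finished (instead of following the logs), oc wait can be used, for example:
# oc wait -n "${NAMESPACE:-sdi-observer}" job/deploy-registry \
    --for=condition=complete --timeout=30m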
You can find more information in the appendix:
- Update instructions
- Determine Registry's credentials
- Verification
4.3. Managing SDI Observer
4.3.1. Viewing and changing the current configuration
View the current configuration of SDI Observer:
# oc set env --list -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer
Change the settings:
- it is recommended to modify the run-observer-template.sh and re-run it
-
it is also possible to set the desired parameter directly without triggering an image build:
# # instruct the observer to schedule SDI pods only on the matching nodes
# oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer SDI_NODE_SELECTOR="node-role.kubernetes.io/sdi="
4.3.2. Re-deploying SDI Observer
Re-deploying SDI Observer is useful in the following cases:
- SDI Observer shall be updated to the latest release.
- SDI has been uninstalled and its namespace deleted and/or re-created.
- Parameter being reflected in multiple resources (not just in the DeploymentConfig) needs to be changed (e.g.
OCP_MINOR_RELEASE
) - Different SDI instance in another namespace shall be observed.
Before updating to the latest SDI Observer code, please be sure to check the Update instructions.
NOTE: Re-deployment preserves generated secrets and persistent volumes unless REPLACE_SECRETS
or REPLACE_PERSISTENT_VOLUMES
are true
.
-
Backup the previous run-observer-template.sh script and open it if still available. If not available, run the following to see the previous environment variables:
# oc set env --list dc/sdi-observer -n "${NAMESPACE:-sdi-observer}"
-
Download the run script from git repository like this:
# curl -O https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/observer/run-observer-template.sh
-
Edit the downloaded
run-observer-template.sh
file in your favorite editor. Especially, mind theFLAVOUR
,NAMESPACE
,SDI_NAMESPACE
andOCP_MINOR_RELEASE
parameters. Compare it against the oldrun-observer-template.sh
or against the output ofoc set env --list dc/sdi-observer
and update the parameters accordingly. -
Run it in bash like this:
# bash ./run-observer-template.sh
-
Keep the modified script around for future updates.
5. Install SDI on OpenShift
5.1. Install Software Lifecycle Container Bridge
Please follow the official documentation (3.3) / (3.2) / (3.1).
5.1.1. Important Parameters
Parameter | Condition | Description |
---|---|---|
Mode | Always | Make sure to choose the Expert Mode. |
Address of the Container Image Repository | Always | This is the Host value of the container-image-registry route in the observer namespace if the registry is deployed by SDI Observer. |
Image registry username | if … ‡ | Refer to your registry configuration. If using the SDI Registry, please follow Determine Registry's credentials. |
Image registry password | if … ‡ | ditto |
Namespace of the SLC Bridge | Always | If you override the default (sap-slcbridge ), make sure to deploy SDI Observer with the corresponding SLCB_NAMESPACE value. |
Service Type | SLC Bridge Base installation | On vSphere, make sure to use NodePort . On AWS, please use LoadBalancer . |
Cluster No Proxy | Required in conjunction with the HTTPS Proxy value | Make sure to set this according to the Configuring HTTP Proxy for the SLC Bridge section. |
‡ If the registry requires authentication. Both Red Hat Quay and the SDI Registry do.
For more details, please refer to Configuring the cluster-wide proxy (4.12) / (4.10)
On a NAT'd on-premise cluster, in order to access slcbridgebase-service
NodePort
service, one needs either direct access to one of the SDI compute nodes or to modify an external load balancer to add an additional route to the service.
5.1.2. Install SLC Bridge
Please install SLC Bridge according to Making the SLC Bridge Base available on Kubernetes (3.3) / (3.2) / (3.1) while paying attention to the notes on the installation parameters.
5.1.2.1. Exposing SLC Bridge with OpenShift Ingress Controller
For SLC Bridge, the only possible type of TLS termination is passthrough unless the Ingress Controller is configured to use globally trusted certificates.
It is recommended to let the SDI Observer (0.1.15 or newer) manage the route creation and updates. If the SDI Observer has been deployed with MANAGE_SLCB_ROUTE=true, this section can be skipped. To configure it ex post, please execute the following:
# oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer MANAGE_SLCB_ROUTE=true
# # wait for the observer to get re-deployed
# oc rollout status -n "${NAMESPACE:-sdi-observer}" -w dc/sdi-observer
After a while, the bridge will become available at https://sap-slcbridge.apps.<cluster_name>.<base_domain>/docs/index.html. You can wait for the route's availability like this:
# oc get route -w -n "${SLCB_NAMESPACE:-sap-slcbridge}"
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
sap-slcbridge <SLCB_NAMESPACE>.apps.<cluster_name>.<base_domain> slcbridgebase-service <all> passthrough/Redirect None
5.1.2.1.1. Manually exposing SLC Bridge with Ingress
Alternatively, you can expose SLC Bridge manually with this approach.
- Look up the slcbridgebase-service service:
# oc project "${SLCB_NAMESPACE:-sap-slcbridge}"   # switch to the Software Lifecycle Bridge project
# oc get services | grep 'NAME\|slcbridge'
NAME                    TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
slcbridgebase-service   NodePort   172.30.206.105   <none>        32455:31477/TCP   14d
- Create the route for the service:
# oc create route passthrough sap-slcbridge --service=slcbridgebase-service \
    --insecure-policy=Redirect --dry-run=client -o json | \
    oc annotate --local -f - haproxy.router.openshift.io/timeout=10m -o json | oc apply -f -
  You can also set your desired hostname with the --hostname parameter. Make sure it resolves to the router's IP.
- Get the generated hostname:
# oc get route -n "${SLCB_NAMESPACE:-sap-slcbridge}"
NAME            HOST/PORT                                                           PATH   SERVICES                PORT    TERMINATION            WILDCARD
sap-slcbridge   sap-slcbridge-<SLCB_NAMESPACE>.apps.<cluster_name>.<base_domain>           slcbridgebase-service   <all>   passthrough/Redirect   None
- Make sure to configure your external load balancer to increase the timeout for WebSocket connections for this particular hostname to at least 10 minutes. For example, in HAProxy, it would be timeout tunnel 10m (see the example snippet after this list).
- Access the SLC Bridge at https://sap-slcbridge-<SLCB_NAMESPACE>.apps.<cluster_name>.<base_domain>/docs/index.html to verify.
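The following is a minimal HAProxy sketch for such an external load balancer sitting in front of the OpenShift routers; the frontend/backend names and router hostnames are illustrative only and need to be adapted to your environment:
# cat /etc/haproxy/haproxy.cfg
....
frontend ingress-https
    bind *:443
    mode tcp
    option tcplog
    default_backend ocp-routers-https
backend ocp-routers-https
    balance source
    mode tcp
    # allow long-running WebSocket connections (e.g. to the SLC Bridge route)
    timeout tunnel 10m
    server router1 router1.example.com:443 check
    server router2 router2.example.com:443 check
....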
5.1.2.2. Using an external load balancer to access SLC Bridge's NodePort
NOTE: applicable only when "Service Type" was set to "NodePort".
Once the SLC Bridge is deployed, its NodePort shall be determined in order to point the load balancer at it.
# oc get svc -n "${SLCB_NAMESPACE:-sap-slcbridge}" slcbridgebase-service -o jsonpath='{.spec.ports[0].nodePort}{"\n"}'
31875
The load balancer shall point at all the compute nodes running SDI workload. The following is an example for HAProxy load balancer:
# # in the example, the <cluster_name> is "boston" and <base_domain> is "ocp.vslen"
# cat /etc/haproxy/haproxy.cfg
....
frontend slcb
bind *:9000
mode tcp
option tcplog
# # commented blocks are useful for multiple OpenShift clusters or multiple SLC Bridge services
#tcp-request inspect-delay 5s
#tcp-request content accept if { req_ssl_hello_type 1 }
use_backend boston-slcb #if { req_ssl_sni -m end -i boston.ocp.vslen }
#use_backend raleigh-slcb #if { req_ssl_sni -m end -i raleigh.ocp.vslen }
backend boston-slcb
balance source
mode tcp
server sdi-worker1 sdi-worker1.boston.ocp.vslen:31875 check
server sdi-worker2 sdi-worker2.boston.ocp.vslen:31875 check
server sdi-worker3 sdi-worker3.boston.ocp.vslen:31875 check
backend raleigh-slcb
....
The SLC Bridge can then be accessed at the URL https://boston.ocp.vslen:9000/docs/index.html as long as boston.ocp.vslen resolves correctly to the load balancer's IP.
5.2. SDI Installation Parameters
Please follow SAP's guidelines on configuring the SDI while paying attention to the following additional comments:
Name | Condition | Recommendation |
---|---|---|
Kubernetes Namespace | Always | Must match the project name chosen in the Project Setup (e.g. sdi ) |
Installation Type | Installation or Update | Choose Advanced Installation if you want to choose a particular storage class, if there is no default storage class (4.12) / (4.10) set, or if you want to deploy multiple SDI instances on the same cluster. |
Container Image Repository | Installation | Must be set to the container image registry. |
Cluster Proxy Settings | Advanced Installation or Update | Choose yes if a local HTTP(S) proxy must be used to access external web resources. |
Cluster No Proxy | When Cluster Proxy Settings is configured. | Please refer to the HTTP proxy configuration. |
Backup Configuration | Installation or Upgrade from a system in which backups are not enabled | For a production environment, please choose yes. ⁴ |
Checkpoint Store Configuration | Installation | Recommended for production deployments. If backup is enabled, it is enabled by default. |
Checkpoint Store Type | If Checkpoint Store Configuration parameter is enabled. | Set to S3 compatible object store if using for example ODF or NetApp StorageGRID as the object storage. See Using NooBaa as object storage gateway or NetApp StorageGRID for more details. |
Disable Certificate Validation | If Checkpoint Store Configuration parameter is enabled. | Please choose yes if using HTTPS for your object storage endpoint secured with a certificate signed by a self-signed CA. For ODF NooBaa, you can set it to no. |
Checkpoint Store Validation | Installation | Please make sure to validate the connection during the installation time. Otherwise in case an incorrect value is supplied, the installation will fail at a later point. |
Container Registry Settings for Pipeline Modeler | Advanced Installation | Shall be changed if the same registry is used for more than one SAP Data Intelligence instance. Either another <registry> or a different <prefix> or both will do. |
StorageClass Configuration | Advanced Installation | Configure this if you want to choose different dynamic storage provisioners for different SDI components or if there's no default storage class (4.12) / (4.10) set or you want to choose non-default storage class for the SDI components. |
Default StorageClass | Advanced Installation and if storage classes are configured | Set this if there's no default storage class (4.12) / (4.10) set or you want to choose non-default storage class for the SDI components. |
Enable Kaniko Usage | Advanced Installation | Must be enabled on OpenShift 4. |
Container Image Repository Settings for SAP Data Intelligence Modeler | Advanced Installation or Upgrade | If using the same registry for multiple SDI instances, choose "yes". |
Container Registry for Pipeline Modeler | Advanced Installation and if "Use different one" option is selected in the previous selection. | If using the same registry for multiple SDI instances, it is required to use either different prefix (e.g. local.image.registry:5000/mymodelerprefix2 ) or a different registry. |
Loading NFS Modules | Advanced Installation | Feel free to say "no". This is no longer of concern as long as the loading of the needed kernel modules has been configured. |
Additional Installer Parameters | Advanced Installation | Please include -e vsystem.vRep.exportsMask=true . If omitted and SDI Observer is running, it will apply this parameter on your behalf. |
⁴ Note that the validated S3 API endpoint providers are ODF's NooBaa 4.6.4 or newer, ODF 4.6 in external mode, and NetApp StorageGRID.
5.3. Project setup
It is assumed the sdi project has already been created during SDI Observer's prerequisites.
Login to OpenShift as a cluster-admin, and perform the following configurations for the installation:
# change to the SDI_NAMESPACE project using: oc project "${SDI_NAMESPACE:-sdi}"
oc adm policy add-scc-to-group anyuid "system:serviceaccounts:$(oc project -q)"
oc adm policy add-scc-to-user privileged -z default
oc adm policy add-scc-to-user privileged -z mlf-deployment-api
oc adm policy add-scc-to-user privileged -z vora-vflow-server
oc adm policy add-scc-to-user privileged -z "vora-vsystem-$(oc project -q)"
oc adm policy add-scc-to-user privileged -z "vora-vsystem-$(oc project -q)-vrep"
Two monitoring service accounts are renamed in SDI 3.3. For the SDI 3.2 and prior versions, the following commands will grant privileges to the two service accounts:
oc adm policy add-scc-to-user privileged -z "$(oc project -q)-elasticsearch"
oc adm policy add-scc-to-user privileged -z "$(oc project -q)-fluentd"
For SDI 3.3 version, please run the following command instead:
oc adm policy add-scc-to-user privileged -z "diagnostics-elasticsearch"
oc adm policy add-scc-to-user privileged -z "diagnostics-fluentd"
During the restore process of SDI 3.3, you might encounter an error from the hana StatefulSet. If you face an error message similar to the following:
create Pod hana-0 in StatefulSet hana failed error: pods "hana-0" is forbidden: unable to validate against any security context constraint: [provider anyuid: .initContainers[0].capabilities.add: Invalid value: "CHOWN": capability may not be added
To resolve this issue and launch the hana pod successfully, execute the following command:
# change to the restoring SDI_NAMESPACE project using: oc project "${SDI_NAMESPACE:-sdi}"
oc adm policy add-scc-to-user privileged -z "hana-service-account"
Red Hat is aware that the changes do not comply with best practices and substantially decrease the cluster's security. Therefore it is not recommended to share the Data Intelligence nodes with other workloads. As stated earlier, please consult SAP directly if you wish for an improvement.
5.4. Install SDI
Please follow the official procedure according to Install using SLC Bridge in a Kubernetes Cluster with Internet Access (3.3) / (3.2) / (3.1).
5.5. SDI Post installation steps
5.5.1. (Optional) Expose SDI services externally
There are multiple ways to make SDI services accessible outside of the cluster. Compared to plain Kubernetes, OpenShift offers an additional method, which is recommended for most scenarios including the SDI System Management service. It is based on the OpenShift Ingress Operator (4.12) / (4.10).
For SAP Vora Transaction Coordinator and SAP HANA Wire, please use the officially suggested method available to your environment (3.3) / (3.2) / (3.1).
5.5.1.1. Using OpenShift Ingress Operator
NOTE: Instead of using this manual approach, it is now recommended to let the SDI Observer manage the route creation and updates. If the SDI Observer has been deployed with MANAGE_VSYSTEM_ROUTE=true, this section can be skipped. To configure it ex post, please execute the following:
# oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer MANAGE_VSYSTEM_ROUTE=true
# # wait for the observer to get re-deployed
# oc rollout status -n "${NAMESPACE:-sdi-observer}" -w dc/sdi-observer
Otherwise, please continue with the manual route creation below.
OpenShift allows you to access the Data Intelligence services via Ingress Controllers (4.12) / (4.10) as opposed to the regular NodePorts (4.12) / (4.10). For example, instead of accessing the vsystem service via https://worker-node.example.com:32322, after the service exposure you will be able to access it at https://vsystem-sdi.apps.<cluster_name>.<base_domain>. This is an alternative to the official documentation Expose the Service On Premise (3.3) / (3.2) / (3.1).
There are two kinds of routes secured with TLS. The reencrypt kind allows a custom signed or self-signed certificate to be used. The other kind is passthrough, which uses the pre-installed certificate generated by or passed to the installer.
5.5.1.1.1. Export services with a reencrypt route
With this kind of route, different certificates are used on client and service sides of the route. The router stands in the middle and re-encrypts the communication coming from either side using a certificate corresponding to the opposite side. In this case, the client side is secured by a provided certificate and the service side is encrypted with the original certificate generated or passed to the SAP Data Intelligence installer. This is the same kind of route SDI Observer creates automatically.
The reencrypt route allows for securing the client connection with a properly signed certificate.
- Look up the vsystem service:
# oc project "${SDI_NAMESPACE:-sdi}"   # switch to the Data Intelligence project
# oc get services | grep "vsystem "
vsystem   ClusterIP   172.30.227.186   <none>   8797/TCP   19h
  When exported, the resulting hostname will look like vsystem-${SDI_NAMESPACE}.apps.<cluster_name>.<base_domain>. However, an arbitrary hostname can be chosen instead as long as it resolves correctly to the IP of the router.
- Get, generate or use the default certificates for the route. In this example, the default self-signed certificate used by the router is used to secure the connection between the client and OpenShift's router. The CA certificate for clients can be obtained from the router-ca secret located in the openshift-ingress-operator namespace:
# oc get secret -n openshift-ingress-operator -o json router-ca | \
    jq -r '.data as $d | $d | keys[] | select(test("\\.crt$")) | $d[.] | @base64d' >router-ca.crt
- Obtain the SDI's root certificate authority bundle generated at the SDI's installation time. The generated bundle is available in the ca-bundle.pem secret in the sdi namespace:
# oc get -n "${SDI_NAMESPACE:-sdi}" -o go-template='{{index .data "ca-bundle.pem"}}' \
    secret/ca-bundle.pem | base64 -d >sdi-service-ca-bundle.pem
- Create the reencrypt route for the vsystem service like this:
# oc create route reencrypt -n "${SDI_NAMESPACE:-sdi}" --dry-run -o json \
    --dest-ca-cert=sdi-service-ca-bundle.pem --service vsystem \
    --insecure-policy=Redirect | \
    oc annotate --local -o json -f - haproxy.router.openshift.io/timeout=2m | \
    oc apply -f -
# oc get route
NAME      HOST/PORT                                                    SERVICES   PORT      TERMINATION          WILDCARD
vsystem   vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain>    vsystem    vsystem   reencrypt/Redirect   None
- Verify the connection:
# # use the HOST/PORT value obtained from the previous command instead
# curl --cacert router-ca.crt https://vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain>/
5.5.1.1.2. Export services with a passthrough route
With the passthrough route, the communication is encrypted by the SDI service's certificate all the way to the client.
NOTE: If possible, please prefer the reencrypt route, because the hostname of the vsystem certificate cannot be verified by clients, as can be seen in the following output:
# oc get -n "${SDI_NAMESPACE:-sdi}" -o go-template='{{index .data "ca-bundle.pem"}}' \
secret/ca-bundle.pem | base64 -d >sdi-service-ca-bundle.pem
# openssl x509 -noout -subject -in sdi-service-ca-bundle.pem
subject=C = DE, ST = BW, L = Walldorf, O = SAP, OU = Data Hub, CN = SAPDataHub
- Look up the vsystem service:
# oc project "${SDI_NAMESPACE:-sdi}"   # switch to the Data Intelligence project
# oc get services | grep "vsystem "
vsystem   ClusterIP   172.30.227.186   <none>   8797/TCP   19h
- Create the route:
# oc create route passthrough --service=vsystem --insecure-policy=Redirect
# oc get route
NAME      HOST/PORT                                                    PATH   SERVICES   PORT      TERMINATION            WILDCARD
vsystem   vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain>           vsystem    vsystem   passthrough/Redirect   None
  You can modify the hostname with the --hostname parameter. Make sure it resolves to the router's IP.
- Access the System Management service at https://vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain> to verify.
5.5.1.2. Using NodePorts
NOTE: For OpenShift, exposure using routes is preferred, although it is only possible for the System Management service (aka vsystem).
Exposing SAP Data Intelligence vsystem
- Either with an auto-generated node port:
# oc expose service vsystem --type NodePort --name=vsystem-nodeport --generator=service/v2
# oc get -o jsonpath='{.spec.ports[0].nodePort}{"\n"}' services vsystem-nodeport
30617
- Or with a specific node port (e.g. 32123):
# oc expose service vsystem --type NodePort --name=vsystem-nodeport --generator=service/v2 --dry-run -o yaml | \
    oc patch -p '{"spec":{"ports":[{"port":8797, "nodePort": 32123}]}}' --local -f - -o yaml | oc apply -f -
The original service remains accessible on the same ClusterIP:Port as before. Additionally, it is now accessible from outside of the cluster under the node port.
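A quick way to verify the exposure is to contact the node port on any of the compute nodes; an HTTP status code in the output indicates the service is reachable. The node hostname and port below are illustrative only:
# curl -ks -o /dev/null -w '%{http_code}\n' https://worker-node.example.com:32123/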
Exposing SAP Vora Transaction Coordinator and HANA Wire
# oc expose service vora-tx-coordinator-ext --type NodePort --name=vora-tx-coordinator-nodeport --generator=service/v2
# oc get -o jsonpath='tx-coordinator:{"\t"}{.spec.ports[0].nodePort}{"\n"}hana-wire:{"\t"}{.spec.ports[1].nodePort}{"\n"}' \
services vora-tx-coordinator-nodeport
tx-coordinator: 32445
hana-wire: 32192
The output shows the generated node ports for the newly exposed services.
5.5.2. Configure the Connection to Data Lake
Please follow the official post-installation instructions at Configure the Connection to DI_DATA_LAKE (3.3) / (3.2) / (3.1).
In case the ODF is used as a backing object storage provider, please make sure to use the HTTP service endpoint as documented in Using NooBaa or RADOS Object Gateway S3 endpoint as object storage.
Based on the example output in that section, the configuration may look like this:
Parameter | Value |
---|---|
Connection Type | SDL |
Id | DI_DATA_LAKE |
Object Storage Type | S3 |
Endpoint | http://s3.openshift-storage.svc.cluster.local |
Access Key ID | cOxfi4hQhGFW54WFqP3R |
Secret Access Key | rIlvpcZXnonJvjn6aAhBOT/Yr+F7wdJNeLDBh231 |
Root Path | sdi-data-lake-f86a7e6e-27fb-4656-98cf-298a572f74f3 |
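With ODF, the bucket name and the credentials can be read from the resources created for the ObjectBucketClaim. The following is a minimal sketch assuming an ObjectBucketClaim named sdi-data-lake exists in the sdi namespace; both names are illustrative and need to be adapted to your bucket claim:
# obc=sdi-data-lake; nm=sdi
# oc get configmap -n "$nm" "$obc" -o jsonpath='{.data.BUCKET_NAME}{"\n"}'       # the Root Path
# oc get secret -n "$nm" "$obc" -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d; echo
# oc get secret -n "$nm" "$obc" -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d; echo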
5.5.3. SDI Validation
Validate SDI installation on OpenShift to make sure everything works as expected. Please follow the instructions in Testing Your Installation (3.3) / (3.2) / (3.1).
5.5.3.1. Log On to SAP Data Intelligence Launchpad
In case the vsystem service has been exposed using a route, the URL can be determined like this:
# oc get route -n "${SDI_NAMESPACE:-sdi}"
NAME HOST/PORT SERVICES PORT TERMINATION WILDCARD
vsystem vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain> vsystem vsystem reencrypt None
The HOST/PORT value then needs to be prefixed with https://, for example:
https://vsystem-sdi.apps.boston.ocp.vslen
5.5.3.2. Check Your Machine Learning Setup
In order to upload training and test datasets using the ML Data Manager, the user needs to be assigned the app.datahub-app-data.fullAccess (as of 3.2) or sap.dh.metadata (up to 3.1) policy. Please make sure to follow Using SAP Data Intelligence Policy Management (3.3) / (3.2) / (3.1) to assign the policies to the users that need them.
5.5.4. Configuration of additional tenants
When a new tenant is created (using e.g. Manage Clusters instructions (3.3) / (3.2) / (3.1)) it is not configured to work with the container image registry. Therefore, the Pipeline Modeler is unusable and will fail to start until configured.
There are a few steps that need to be performed for each new tenant:
- import CA certificate for the registry via SDI Connection Manager if the CA certificate is self-signed
- if a different registry is used for the modeler, a pull secret needs to be imported into the SDI_NAMESPACE
- create and import credential secret using the SDI System Management and update the modeler secret if the container image registry requires authentication
If Red Hat Quay is used, please follow Configuring additional SDI tenants.
If the SDI Registry is used, please follow the SDI Observer Registry tenant configuration. Otherwise, please make sure to execute the official instructions in the following articles according to your registry configuration:
- Manage Certificates (3.3) / (3.2) / (3.1) (as long as your registry for the Pipeline Modeler uses TLS with a self-signed CA)
- Provide Access Credentials for a Password Protected Container Registry (3.3) / (3.2) / (3.1) (as long as your registry requires authentication)
6. OpenShift Container Platform Upgrade
This section is useful as a guide for performing OpenShift upgrades to the latest asynchronous releaseⁿ of the same minor version or to the newer minor release supported by the running SDI instance without upgrading SDI itself.
6.1. Pre-upgrade procedures
- Make yourself familiar with the OpenShift's upgrade guide (4.8 ⇒ 4.9) / (4.9 ⇒ 4.10) / (4.10 ⇒ 4.11) / (4.11 ⇒ 4.12).
- Plan for SDI downtime.
- Make sure to re-configure SDI compute nodes.
6.1.1. Stop SAP Data Intelligence
In order to speed up the cluster upgrade and/or to ensure SDI's consistency, it is possible to stop the SDI before performing the upgrade.
The procedure is outlined in the official Administration Guide (3.3) / (3.2) / (3.1). In short, the command is:
# oc -n "${SDI_NAMESPACE}" patch datahub default --type='json' -p '[
{"op":"replace","path":"/spec/runLevel","value":"Stopped"}]'
6.2. Upgrade OpenShift
The following instructions outline a process of OpenShift upgrade to a minor release 2 versions higher than the current one. If only an upgrade to the latest asynchronous releaseⁿ of the same minor version is desired, please skip steps 5 and 6.
1. Upgrade OpenShift to a higher minor release or the latest asynchronous release (⇒ 4.12).
2. If having OpenShift Data Foundation deployed, update ODF to the latest supported release for the current OpenShift release according to the interoperability guide.
3. Update OpenShift client tools on the Management host to match the target ※ OpenShift release. On RHEL 8, one can do it like this:
# current=4.10; new=4.12
# sudo subscription-manager repos \
    --disable=rhocp-${current}-for-rhel-8-x86_64-rpms --enable=rhocp-${new}-for-rhel-8-x86_64-rpms
# sudo dnf update -y openshift-clients
4. Update SDI Observer to use the OpenShift client tools matching the target ※ OpenShift release by following Re-Deploying SDI Observer while reusing the previous parameters.
5. Upgrade OpenShift to a higher minor release or the latest asynchronous release (⇒ 4.12) ⁿ.
6. If having OpenShift Data Foundation deployed, update ODF to the latest supported release for the current OpenShift release according to the interoperability guide.
※ For the initial OpenShift release 4.X, the target release is 4.(X+2); if performing just the latest asynchronous releaseⁿ upgrade, the target release is 4.X.
6.3. Post-upgrade procedures
- Start SAP Data Intelligence as outlined in the official Administration Guide (3.3) / (3.2) / (3.1). In short, the command is:
# oc -n "${SDI_NAMESPACE}" patch datahub default --type='json' -p '[
    {"op":"replace","path":"/spec/runLevel","value":"Started"}]'
7. SAP Data Intelligence Upgrade or Update
NOTE: This section covers an upgrade of SAP Data Intelligence to a newer minor, micro or patch release. Sections related only to the former or the latter will be annotated as follows:
- (upgrade) to denote a section specific to an upgrade of Data Intelligence to a newer minor release (3.X ⇒ 3.(X+1))
- (update) to denote a section specific to an update of Data Intelligence to a newer micro/patch release (3.X.Y ⇒ 3.X.(Y+1))
- annotation-free sections relate to both
The following steps must be performed in the given order. Unless an OpenShift upgrade is needed, the steps marked with (ocp-upgrade) can be skipped.
7.1. Pre-upgrade or pre-update procedures
- Make sure to get familiar with the official SAP Upgrade guide (3.0 ⇒ 3.1) / (3.1 ⇒ 3.2) / (3.2 ⇒ 3.3).
- (ocp-upgrade) Make yourself familiar with the OpenShift's upgrade guide (4.8 ⇒ 4.9) / (4.9 ⇒ 4.10) / (4.10 ⇒ 4.11)/ (4.11 ⇒ 4.12).
- Plan for a downtime.
- Make sure to re-configure SDI compute nodes.
7.1.1. Execute SDI's Pre-Upgrade Procedures
Please follow the official Pre-Upgrade procedures (3.0 ⇒ 3.1) / (3.1 ⇒ 3.2) / (3.2 ⇒ 3.3).
7.1.1.1. Automated route removal
SDI Observer can manage the creation and updates of the vsystem route for external access. It takes care of updating the route's destination certificate during SDI's update. It can also be instructed to keep the route deleted, which is useful during SDI updates. You can instruct the SDI Observer to delete the route like this:
- Ensure SDI Observer is already managing the route:
# oc set env -n "${NAMESPACE:-sdi-observer}" --list dc/sdi-observer | grep MANAGE_VSYSTEM_ROUTE
MANAGE_VSYSTEM_ROUTE=true
  If there is no output or MANAGE_VSYSTEM_ROUTE is not one of true, yes or 1, please follow the Manual route removal instead.
- Instruct the observer to keep the route removed:
# oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer MANAGE_VSYSTEM_ROUTE=removed
# # wait for the observer to get re-deployed
# oc rollout status -n "${NAMESPACE:-sdi-observer}" -w dc/sdi-observer
7.1.1.2. Manual route removal
If you exposed the vsystem service using routes, delete the route:
# # note the hostname in the output of the following command
# oc get route -n "${SDI_NAMESPACE:-sdi}"
# # delete the route
# oc delete route -n "${SDI_NAMESPACE:-sdi}" --all
7.1.2. (upgrade) Prepare SDI Project
Grant the needed security context constraints to the new service accounts by executing the commands from the project setup; the commands are recapped below. NOTE: Re-running commands that have already been run will do no harm.
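The following recaps the project setup commands; for an upgrade to SDI 3.3, the renamed diagnostics service accounts are included as well:
# change to the SDI_NAMESPACE project using: oc project "${SDI_NAMESPACE:-sdi}"
oc adm policy add-scc-to-group anyuid "system:serviceaccounts:$(oc project -q)"
oc adm policy add-scc-to-user privileged -z default
oc adm policy add-scc-to-user privileged -z mlf-deployment-api
oc adm policy add-scc-to-user privileged -z vora-vflow-server
oc adm policy add-scc-to-user privileged -z "vora-vsystem-$(oc project -q)"
oc adm policy add-scc-to-user privileged -z "vora-vsystem-$(oc project -q)-vrep"
# when upgrading to SDI 3.3, also grant the renamed monitoring service accounts:
oc adm policy add-scc-to-user privileged -z "diagnostics-elasticsearch"
oc adm policy add-scc-to-user privileged -z "diagnostics-fluentd"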
7.2. Update or Upgrade SDI
7.2.1. Update Software Lifecycle Container Bridge
Before updating the SLC Bridge, please consider exposing it via Ingress Controller.
If you decide to continue using the NodePort service load-balanced by an external load balancer, make sure to note down the current service node port:
# oc get -o jsonpath='{.spec.ports[0].nodePort}{"\n"}' -n sap-slcbridge \
svc/slcbridgebase-service
31555
Please follow the official documentation (3.3) / (3.2) / (3.1) to obtain the binary and update its resources on the OpenShift cluster.
If exposed via Ingress Controller, you can skip the next step. Otherwise, re-set the nodePort to the previous value so that no changes on the load balancer side are necessary.
# nodePort=31555 # change your value to the desired one
# oc patch --type=json -n sap-slcbridge svc/slcbridgebase-service -p '[{
"op":"add", "path":"/spec/ports/0/nodePort","value":'"$nodePort"'}]'
7.2.2. (upgrade) Upgrade SAP Data Intelligence to a newer minor release
Execute the SDI upgrade according to the official instructions (DH 3.0 ⇒ 3.1) / (DH 3.1 ⇒ 3.2) / (DH 3.2 ⇒ 3.3).
7.3. (ocp-upgrade) Upgrade OpenShift
Depending on the target SDI release, OpenShift cluster must be upgraded either to a newer minor release or to the latest asynchronous releaseⁿ for the current minor release.
Upgraded/Current SDI release | Desired and validated OpenShift Releases |
---|---|
3.3 | 4.12 |
3.3 | 4.10 |
3.3 | 4.8 |
3.2 | 4.8 |
3.1 | 4.6 |
3.0 | 4.6 |
If the current OpenShift release is two or more releases behind the desired, OpenShift cluster must be upgraded iteratively to each successive minor release until the desired one is reached.
- (optional) Stop the SAP Data Intelligence as it will speed up the cluster update and ensure SDI's consistency.
- Make sure to follow the official upgrade instructions for your upgrade path.
- When on OpenShift 4.11, please follow Re-deploying SDI Observer to update the observer. Please make sure to set MANAGE_VSYSTEM_ROUTE to removed until the SDI's update is finished. Please set the desired OpenShift minor release (e.g. OCP_MINOR_RELEASE=4.12).
- For the SDI 3.2 to 3.3 upgrade, privileges need to be granted to the following two service accounts by running these commands:
# change to the SDI_NAMESPACE project using: oc project "${SDI_NAMESPACE:-sdi}"
oc adm policy add-scc-to-user privileged -z "diagnostics-elasticsearch"
oc adm policy add-scc-to-user privileged -z "diagnostics-fluentd"
- (optional) Start the SAP Data Intelligence again if the desired OpenShift release has been reached.
- Upgrade OpenShift client tools on the Management host. The example below can be used on RHEL 8:
# current=4.10; new=4.12
# sudo subscription-manager repos \
    --disable=rhocp-${current}-for-rhel-8-x86_64-rpms --enable=rhocp-${new}-for-rhel-8-x86_64-rpms
# sudo dnf update -y openshift-clients
7.4. SAP Data Intelligence Post-Upgrade Procedures
- Execute the Post-Upgrade Procedures for SDI (3.3) / (3.2) / (3.1).
- Re-create the route for the vsystem service using one of the following methods:
  - (recommended) instruct SDI Observer to manage the route:
# oc set env -n "${NAMESPACE:-sdi-observer}" dc/sdi-observer MANAGE_VSYSTEM_ROUTE=true
# # wait for the observer to get re-deployed
# oc rollout status -n "${NAMESPACE:-sdi-observer}" -w dc/sdi-observer
  - follow Expose SDI services externally to recreate the route manually from scratch
7.5. Validate SAP Data Intelligence
Validate SDI installation on OpenShift to make sure everything works as expected. Please follow the instructions in Testing Your Installation (3.3) / (3.2) / (3.1).
8. Appendix
8.1. SDI uninstallation
Please follow the SAP documentation Uninstalling SAP Data Intelligence using the SLC Bridge (3.3) / (3.2) / (3.1).
Additionally, make sure to delete the sdi project and the datahub-system project as well, e.g.:
# oc delete project sdi
Followed by the deletion of the datahub-system project, e.g.:
# oc delete project datahub-system
NOTE: With this, SDI Observer loses permissions to view and modify resources in the deleted namespace. If a new SDI installation shall take place, SDI Observer needs to be re-deployed.
Optionally, one can also delete SDI Observer's namespace, e.g.:
# oc delete project sdi-observer
NOTE: this will also delete the SDI Registry if deployed using SDI Observer, which means the mirroring needs to be performed again during a new installation. If SDI Observer (including the registry and its data) shall be preserved for the next installation, please make sure to re-deploy it once the sdi project is re-created.
When done, you may continue with a new installation round in the same or another namespace.
8.2. Quay Registry for SDI
Red Hat Quay Registry has been validated to host SAP Data Intelligence images. The Quay registry can run directly on the OpenShift cluster together with SDI, on another OpenShift cluster or standalone.
Note: Red Hat Quay 3.6 or newer is compatible with SDI images.
Once Red Hat Quay is deployed according to the documentation, make sure to configure OpenShift cluster to trust the registry.
8.2.1. Quay namespaces, users and accounts preparations
- Create a new organization. In this example, we will call the organization sdi. This organization will host all the images needed by SLC Bridge, SAP DI and the SAP DI operator.
- As the Quay Superadmin, create a new user (e.g. sdi_slcb). Please note the credentials. The user will be used as a robot account by SLC Bridge and OpenShift (not by a human). So far, the regular Quay robot account cannot be used because robot accounts cannot create repositories on push.
- Grant the sdi_slcb user at least the Creator access to the sdi organization:
  - either by adding the user to the owners team in the "Teams and Membership" pane,
  - or by creating a new team in the sdi organization called e.g. pushers with the Creator team role assigned and adding the sdi_slcb user as a member.
- (optional) As the Superadmin, create another user for the pipeline modeler (e.g. sdi_default_modeler where default stands for the default tenant).
  - Advantages of having a separate registry namespace and user for each tenant's pipeline modeler:
    - Images can be easily pruned on a per-tenant basis. Once the SDI tenant is no longer needed, the corresponding Quay user can be deleted and its images will be automatically pruned from the registry and the space recovered.
    - Improved security. SDI tenant users cannot access images of other SDI tenants.
  - This user will be used again as a robot account, similar to sdi_slcb.
  - For the user's E-mail address, any fake address will do as long as it is unique among all Quay users.
  - The name of the user is at the same time the namespace where the images will be pushed to and pulled from.
  - Make sure to note the credentials.
  - The user must be able to pull from the sdi organization. In order for the user to pull from the sdi organization, make sure to also perform the following:
    - As the owner of the sdi organization, go to its "Teams and Membership" pane and create a new team (e.g. pullers) with the Member team role.
    - Click on "Set permissions for pullers" and make sure the team can Read all the repositories that already exist in the sdi organization.
    - Click on the pullers team, search for the sdi_default_modeler user and add him to the team.
    - Go back to Default Permissions of the sdi organization, click on "Create Default Permission" and add the "Read" permission to the pullers team for repositories created by Anyone.
- (optional) Repeat the previous step for any additional SDI tenant you are going to create.
8.2.2. Determine the Image Repository
The Image Repository Input Parameter is composed of <hostname>/<namespace>.
- The registry <hostname> for Quay running on the OpenShift cluster can be determined on the Management host like this:
# oc get route --all-namespaces -o jsonpath='{range .items[*]}{.spec.host}{"\n"}{end}' \
    -l quay-component=quay-app-route
  An example output:
  quay.apps.cluster.example.com
  In case your local Quay registry runs outside of the OpenShift cluster, you will need to determine its hostname by other means.
- The <namespace> is either the organization name or the username. For the sdi organization, the <namespace> is sdi.
In this example, the resulting Image Repository parameter will be quay.apps.cluster.example.com/sdi.
8.2.3. Importing Quay's CA Certificate to OpenShift
If you haven't done it already, please make sure to make OpenShift cluster trust the Quay registry.
- If the Quay registry is running on the OpenShift cluster, obtain the router-ca.crt of the secret as documented in the SDI Registry Verification section (see also the sketch below). Otherwise, please fetch the self-signed CA certificate of your external Quay registry.
- Follow the section Configure OpenShift to trust container image registry to make the registry trusted.
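For a Quay registry exposed via the default OpenShift Ingress, the CA certificate can be extracted like this (the same command as used for the SDI Registry verification):
# oc get secret -n openshift-ingress-operator -o json router-ca | \
    jq -r '.data as $d | $d | keys[] | select(test("\\.crt$")) | $d[.] | @base64d' >router-ca.crt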
8.2.4. Configuring additional SDI tenants
There are three steps that need to be performed for each new SDI tenant:
- import CA certificate for the registry via SDI Connection Manager if the CA certificate is self-signed
- create and import a vflow pull secret into the OpenShift namespace
- create and import credential secret using the SDI System Management and update the modeler secret
In this example, we will operate with a newly created tenant blue and we assume that a new Quay registry user called blue_modeler has been created.
8.2.4.1. Importing Quay's CA Certificate to SAP DI
- Please follow step one from Importing Quay's CA Certificate to OpenShift to get the CA certificate locally as router-ca.crt.
- Follow the Manage Certificates guide (3.3) / (3.2) / (3.1) to import the router-ca.crt via the SDI Connection Management.
8.2.4.2. Create and import vflow pull secret into OpenShift
This is needed only if a different Quay namespace is used for each tenant.
- Log in to your Quay registry as the user blue_modeler.
- Click on your user avatar in the upper right corner, go to "Account Settings" -> "User Settings" and there click on "Create Application Token". Let's use blue_modeler_quay_token as the token name.
- Once the application token is generated, click on it and download the corresponding "Kubernetes Secret". In this example, the downloaded file is called blue-modeler-quay-token-secret.yaml.
- On your Management host, import the secret into the SDI_NAMESPACE on your OpenShift cluster, e.g.:
# oc apply -n "${SDI_NAMESPACE:-sdi}" -f blue-modeler-quay-token-secret.yaml
- In the SDI "System Management" of the blue tenant, go to the Applications tab, search for pull, click on the Edit button and set "Modeler: Docker image pull secret for Modeler" to the name of the imported secret (e.g. blue-modeler-quay-token-pull-secret).
8.2.4.3. Import credentials secret to SDI tenant
If you have imported the vflow pull secret into OpenShift cluster, you can turn the imported secret into the proper file format for SDI like this:
# secret=blue-modeler-quay-token-pull-secret
# oc get -o json -n "${SDI_NAMESPACE:-sdi}" "secret/$secret" | \
jq -r '.data[".dockerconfigjson"] | @base64d' | jq -r '.auths as $auths | $auths | keys |
map(. as $address | $auths[.].auth | @base64d | capture("^(?<username>[^:]+):(?<password>.+)$") |
{"address": $address, "username": .username, "password": .password})' | \
json2yaml | tee vsystem-registry-secret.txt
Otherwise, create the secret manually like this:
# cat >/tmp/vsystem-registry-secret.txt <<EOF
- username: "blue_modeler"
password: "CHANGEME"
address: "quay.apps.cluster.example.com"
EOF
Note that the address must not contain any /<namespace> suffix!
Import the secret using the SDI System Management by following the official Provide Access Credentials for a Password Protected Container Registry (3.3) / (3.2) / (3.1).
8.3. (Deprecated) Deploying SDI Registry manually
SDI Registry is a secure container image registry suitable for hosting SAP Data Intelligence images on the OpenShift cluster.
8.3.1. Deployment
SDI Registry's kubernetes resources are defined in OpenShift Templates. To choose the right template and provide the right parameters for it, it is recommended to use the deployment script documented below.
8.3.1.1. Prerequisites
- OpenShift cluster must be healthy including all the cluster operators.
- jq >= 1.6 binary available on the management host
8.3.1.2. Template instantiation
- Make the git repository available on your management host:
# git clone https://github.com/redhat-sap/sap-data-intelligence
- Inspect the available arguments of the deployment script:
# ./sap-data-intelligence/registry/deploy-registry.sh --help
- Choose the right set of arguments and make a dry run to see what will happen. The ubi-prebuilt flavour will be chosen by default. The image will be pulled from quay.io/redhat-sap-cop/container-image-registry.
# ./sap-data-intelligence/registry/deploy-registry.sh --dry-run
- Next time, deploy the SDI Registry for real and wait until it gets deployed:
# ./sap-data-intelligence/registry/deploy-registry.sh --wait
8.3.1.3. Generic instantiation for a disconnected environment
There must be another container image registry running outside of the OpenShift cluster to host the image of SDI Registry. That registry should also be used to host the SAP Data Intelligence images as long as it is compatible. Otherwise, please follow this guide.
- Mirror the pre-built image of SDI Registry to the local registry. For example, on RHEL 8:
  - Where the management host has access to the internet:
# podman login local.image.registry:5000   # if the local registry requires authentication
# skopeo copy \
    docker://quay.io/redhat-sap-cop/container-image-registry:latest \
    docker://local.image.registry:5000/container-image-registry:latest
  - Where the management host lacks access to the internet:
    i. Copy the image to a USB flash drive on a host that has a connection to the internet:
# skopeo copy \
    docker://quay.io/redhat-sap-cop/container-image-registry:latest \
    oci-archive:/var/run/user/1000/usb-disk/container-image-registry:latest
    ii. Plug the USB drive into the management host and mirror the image from it to your local.image.registry:5000:
# skopeo copy \
    oci-archive:/var/run/user/1000/usb-disk/container-image-registry:latest \
    docker://local.image.registry:5000/container-image-registry:latest
- Make the git repository available on your management host:
# git clone https://github.com/redhat-sap/sap-data-intelligence
- Inspect the available arguments of the deployment script:
# ./sap-data-intelligence/registry/deploy-registry.sh --help
- Choose the right set of arguments and make a dry run to see what will happen:
# ./sap-data-intelligence/registry/deploy-registry.sh \
    --image-pull-spec=local.image.registry:5000/container-image-registry:latest --dry-run
- Next time, deploy the SDI Registry for real and wait until it gets deployed:
# ./sap-data-intelligence/registry/deploy-registry.sh \
    --image-pull-spec=local.image.registry:5000/container-image-registry:latest --wait
- Please make sure to back up the arguments used for future updates.
8.3.2. Update instructions
So far, updates need to be performed manually.
Please follow the steps outlined in Template Instantiation anew. A re-run of the deployment script will change only what needs to be changed.
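A minimal sketch of such a re-run, assuming the repository was cloned to the management host as described in Template instantiation (append the same arguments that were used for the previous deployment):
# git -C ./sap-data-intelligence pull
# ./sap-data-intelligence/registry/deploy-registry.sh --wait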
8.3.3. Determine Registry's credentials
The username and password are separated by a colon in the SDI_REGISTRY_HTPASSWD_SECRET_NAME secret:
# # make sure to change the "sdi-registry" to your SDI Registry's namespace
# oc get -o json -n "sdi-registry" secret/container-image-registry-htpasswd | \
jq -r '.data[".htpasswd.raw"] | @base64d'
user-qpx7sxeei:OnidDrL3acBHkkm80uFzj697JGWifvma
8.3.4. Verification
- Obtain the Ingress' default self-signed CA certificate:
# oc get secret -n openshift-ingress-operator -o json router-ca | \
    jq -r '.data as $d | $d | keys[] | select(test("\\.crt$")) | $d[.] | @base64d' >router-ca.crt
- Set the nm variable to the Kubernetes namespace where SDI Registry runs:
# nm=sdi-registry
- Do a simple test using curl:
# # determine registry's hostname from its route
# hostname="$(oc get route -n "$nm" container-image-registry -o jsonpath='{.spec.host}')"
# curl -I --user user-qpx7sxeei:OnidDrL3acBHkkm80uFzj697JGWifvma --cacert router-ca.crt \
    "https://$hostname/v2/"
HTTP/1.1 200 OK
Content-Length: 2
Content-Type: application/json; charset=utf-8
Docker-Distribution-Api-Version: registry/2.0
Date: Sun, 24 May 2020 17:54:31 GMT
Set-Cookie: d22d6ce08115a899cf6eca6fd53d84b4=9176ba9ff2dfd7f6d3191e6b3c643317; path=/; HttpOnly; Secure
Cache-control: private
- Optionally, make the certificate trusted on your management host (this example is for RHEL 7 or newer):
# sudo cp -v router-ca.crt /etc/pki/ca-trust/source/anchors/router-ca.crt
# sudo update-ca-trust
- Verify the login using podman:
# # determine registry's hostname from its route
# hostname="$(oc get route -n "$nm" container-image-registry -o jsonpath='{.spec.host}')"
# sudo mkdir -p "/etc/containers/certs.d/$hostname"
# sudo cp router-ca.crt "/etc/containers/certs.d/$hostname/"
# podman login -u user-qpx7sxeei "$hostname"
Password:
Login Succeeded!
8.3.5. Post configuration
By default, the SDI Registry is secured by the Ingress Controller's certificate signed by a self-signed CA certificate. Self-signed certificates are trusted neither by OpenShift nor by SDI.
If the registry is secured by a certificate signed by a properly trusted (not self-signed) CA, this section may be skipped.
8.3.5.1. Making SDI Registry trusted by OpenShift
To make the registry trusted by the OpenShift cluster, please follow Configure OpenShift to trust container image registry. You can determine the registry hostname in bash like this:
# nm="sdi-registry" # namespace where registry runs
# registry="$(oc get route -n "$nm" \
container-image-registry -o jsonpath='{.spec.host}')"; echo "$registry"
8.3.5.2. SDI Observer Registry tenant configuration
The default tenant is configured automatically as long as one of the following holds true:
- SDI Observer is running and configured with INJECT_CABUNDLE=true and the right CA certificate is configured with one of the CABUNDLE_* environment variables (the default values are usually alright).
- Setting Up Certificates has been followed.
NOTE: Only applicable once the SDI installation is complete.
Each newly created tenant needs to be configured to be able to talk to the SDI Registry. The initial tenant (the default) does not need to be configured manually as it is configured during the installation.
There are two steps that need to be performed for each new tenant:
- import CA certificate for the registry via SDI Connection Manager if the CA certificate is self-signed
- create and import credential secret using the SDI System Management and update the modeler secret
Import the CA certificate
- Obtain the router-ca.crt of the secret as documented in the previous section.
- Follow the Manage Certificates guide (3.3) / (3.2) / (3.1) to import the router-ca.crt via the SDI Connection Management.
Import the credentials secret
Determine the credentials and import them using the SDI System Management by following the official Provide Access Credentials for a Password Protected Container Registry (3.3) / (3.2) / (3.1).
As an alternative to the step "1. Create a secret file that contains the container registry credentials and …", you can also use the following way to create the vsystem-registry-secret.txt file:
# # determine registry's hostname from its route
# hostname="$(oc get route -n "${NAMESPACE:-sdi-observer}" container-image-registry -o jsonpath='{.spec.host}')"
# oc get -o json -n "${NAMESPACE:-sdi-observer}" secret/container-image-registry-htpasswd | \
jq -r '.data[".htpasswd.raw"] | @base64d | sub("^\\s*Credentials:\\s+"; "") | gsub("\\s+"; "") | split(":") |
[{"username":.[0], "password":.[1], "address":"'"$hostname"'"}]' | \
json2yaml | tee vsystem-registry-secret.txt
NOTE: the json2yaml binary from the remarshal project must be installed on the Management host in addition to jq.
8.4. Configure OpenShift to trust container image registry
If the registry's certificate is signed by a self-signed certificate authority, one must make OpenShift aware of it.
If the registry runs on the OpenShift cluster itself and is exposed via a reencrypt or edge route with the default TLS settings (no custom TLS certificates set), the CA certificate used is available in the router-ca secret in the openshift-ingress-operator namespace.
To make a registry exposed via such a route trusted, set the route's hostname into the registry variable and execute the following code in bash:
# registry="local.image.registry:5000"
# caBundle="$(oc get -n openshift-ingress-operator -o json secret/router-ca | \
jq -r '.data as $d | $d | keys[] | select(test("\\.(?:crt|pem)$")) | $d[.] | @base64d')"
# # determine the name of the CA configmap if it exists already
# cmName="$(oc get images.config.openshift.io/cluster -o json | \
jq -r '.spec.additionalTrustedCA.name // "trusted-registry-cabundles"')"
# if oc get -n openshift-config "cm/$cmName" 2>/dev/null; then
# configmap already exists -> just update it
oc get -o json -n openshift-config "cm/$cmName" | \
jq '.data["'"${registry//:/..}"'"] |= "'"$caBundle"'"' | \
oc replace -f - --force
else
# creating the configmap for the first time
oc create configmap -n openshift-config "$cmName" \
--from-literal="${registry//:/..}=$caBundle"
oc patch images.config.openshift.io cluster --type=merge \
-p '{"spec":{"additionalTrustedCA":{"name":"'"$cmName"'"}}}'
fi
If using a registry running outside of OpenShift or not secured by the default ingress CA certificate, take a look at the official guideline at Configuring a ConfigMap for the Image Registry Operator (4.12) / (4.10)
To verify that the CA certificate has been deployed, execute the following and check whether the supplied registry name appears among the file names in the output:
# oc rsh -n openshift-image-registry "$(oc get pods -n openshift-image-registry -l docker-registry=default | \
awk '/Running/ {print $1; exit}')" ls -1 /etc/pki/ca-trust/source/anchors
container-image-registry-sdi-observer.apps.boston.ocp.vslen
image-registry.openshift-image-registry.svc..5000
image-registry.openshift-image-registry.svc.cluster.local..5000
If this is not feasible, one can also mark the registry as insecure.
8.5. Configure insecure registry
As a less secure alternative to Configure OpenShift to trust container image registry, the registry may also be marked as insecure, which poses a potential security risk. Please follow Configuring image settings (4.12) / (4.10) and add the registry to the .spec.registrySources.insecureRegistries array. For example:
apiVersion: config.openshift.io/v1
kind: Image
metadata:
annotations:
release.openshift.io/create-only: "true"
name: cluster
spec:
registrySources:
insecureRegistries:
- local.image.registry:5000
NOTE: it may take tens of minutes until the nodes are reconfigured. You can use the following commands to monitor the progress:
watch oc get machineconfigpool
watch oc get nodes
8.6. Running multiple SDI instances on a single OpenShift cluster
Two instances of SAP Data Intelligence running in parallel on a single OpenShift cluster have been validated. Running more instances is possible, but most probably needs an extra support statement from SAP.
Please consider the following before deploying more than one SDI instance to a cluster:
- Each SAP Data Intelligence instance must run in its own namespace/project.
- Each SAP Data Intelligence instance must use a different prefix or a different container image registry for the Pipeline Modeler. For example, the first instance can configure "Container Registry Settings for Pipeline Modeler" as local.image.registry:5000/sdi30blue and the second as local.image.registry:5000/sdi30green.
- It is recommended to dedicate particular nodes to each SDI instance.
- It is recommended to use network policy (4.12) / (4.10) SDN mode for completely granular network isolation configuration and improved security. Check network policy configuration (4.12) / (4.10) for further references and examples. This, however, cannot be changed post OpenShift installation.
- If running the production and test (aka blue-green) SDI deployments on a single OpenShift cluster, mind also the following:
- There is no way to test an upgrade of OpenShift cluster before an SDI upgrade.
- The idle (non-productive) landscape should have the same network security as the live (productive) one.
To deploy a new SDI instance to OpenShift cluster, please repeat the steps from project setup starting from point 6 with a new project name and continue with SDI Installation.
8.7. Installing remarshal utilities on RHEL
For a few example snippets throughout this guide, either the yaml2json or the json2yaml script is necessary. They are provided by the remarshal project and shall be installed on the Management host in addition to jq. On RHEL 8.2, one can install them this way:
# sudo dnf install -y python3-pip
# sudo pip3 install remarshal
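A quick check that the conversion works as expected; the sample JSON below is illustrative only:
# echo '{"username": "user-qpx7sxeei", "address": "local.image.registry:5000"}' | json2yaml
username: user-qpx7sxeei
address: local.image.registry:5000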
8.8. (footnote ⁿ) Upgrading to the next minor release from the latest asynchronous release
If the OpenShift cluster is subscribed to the stable channel, its latest available micro release for the current minor release may not be upgradable to a newer minor release.
Consider the following example:
- The OpenShift cluster is of release 4.11.24.
- The latest asynchronous release available in the stable-4.11 channel is 4.11.30.
- The latest stable 4.12 release is 4.12.15 (available in the stable-4.12 channel).
- From the 4.11.24 micro release, one can upgrade to one of 4.11.27, 4.11.28, 4.11.30, 4.12.13 or 4.12.15.
- However, from 4.11.30 one cannot upgrade to any newer release because no upgrade path has been validated/provided yet in the stable channel.
Therefore, the OpenShift cluster can get stuck on the 4.11 release if it is first upgraded to the latest asynchronous release 4.11.30 instead of being upgraded directly to one of the 4.12 minor releases. However, at the same time, the fast-4.12 channel contains the 4.12.16 release with an upgrade path from 4.11.30. The 4.12.16 release appears in the stable-4.12 channel sooner or later after being introduced in the fast channel first.
To amend the situation without waiting for an upgrade path to appear in the stable channel (a command sketch follows this list):
- Temporarily switch to the fast-4.X channel.
- Perform the upgrade.
- Switch back to the stable-4.X channel.
- Continue performing upgrades to the latest micro release available in the stable-4.X channel.
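The following is a minimal sketch based on the example releases above; the exact versions are illustrative, so please verify the actually available target versions with oc adm upgrade before proceeding:
# # switch to the fast channel and upgrade to the release with a validated upgrade path
# oc adm upgrade channel fast-4.12
# oc adm upgrade --to=4.12.16
# # once the cluster reports the new version, switch back to the stable channel
# oc adm upgrade channel stable-4.12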
8.9. HTTP Proxy Configuration
HTTP(S) Proxy must be configured in different places. The corresponding No Proxy settings are treated differently by different components.
- management host
- OpenShift cluster
- SLC Bridge
- SAP Data Intelligence
The sections below assume the following:
- the cluster's base domain is example.com
- the cluster name is foo, which means its API is listening at api.foo.example.com:6443
- the local proxy server is listening at http://proxy.example.com:3128
- the management host's hostname is jump.example.com; we should add its short name (jump) to the NO_PROXY settings
- the local network CIDR is 192.168.128.0/24
- the OpenShift's service network has the default range of 172.30.0.0/16
8.9.1. Configuring HTTP Proxy on the management host
Please export the Proxy environment variables on your management host according to your Linux distribution. For RHEL, please follow How to apply a system wide proxy. For example in BASH:
# sudo cp /dev/stdin /etc/profile.d/http_proxy.sh <<EOF
export http_proxy=http://proxy.example.com:3128
export https_proxy=http://proxy.example.com:3128
export no_proxy=localhost,127.0.0.1,jump,.example.com,192.168.128.0/24
EOF
# source /etc/profile.d/http_proxy.sh
Where .example.com is a wildcard pattern matching any subdomain, such as foo.example.com.
8.9.2. Configuring HTTP Proxy on the OpenShift cluster
Usually, OpenShift is configured to use the proxy during its installation, but it is also possible to set or re-configure it ex post.
An example configuration could look like this:
# oc get proxy/cluster -o json | jq '.spec'
{
"httpProxy": "http://proxy.example.com:3128",
"httpsProxy": "http://proxy.example.com:3128",
"noProxy": "192.168.128.0/24,jump,.local,.example.com",
"trustedCA": {
"name": "user-ca-bundle"
}
}
Please keep in mind that wildcard characters (e.g. *.example.com) are not supported by OpenShift.
The complete no_proxy list, extended for container and service networks and additional service names, is generated automatically and is stored in the .status.noProxy field of the proxy object:
# oc get proxy/cluster -o json | jq -r '.status.noProxy'
.cluster.local,.local,.example.com,.svc,10.128.0.0/14,127.0.0.1,172.30.0.0/16,192.168.128.0/24,api-int.foo.example.com,localhost,jump
8.9.3. Configuring HTTP Proxy for the SLC Bridge
The SLC Bridge binary shall use the proxy settings from the environment on the management host configured earlier. This is important to allow SLCB to talk to the SAP image registry (proxied), the local image registry and the OpenShift API (not proxied).
During SLC Bridge's init phase, which deploys the bridge as a container on the OpenShift cluster, one must set Proxy settings as well when prompted. Here are the example values:
# ./slcb init
...
***************************************************************************
* Choose whether you want to run the deployment in typical or expert mode *
***************************************************************************
1. Typical Mode
> 2. Expert Mode
Choose action <F12> for Back/<F1> for help
possible values [1,2]: 2
...
************************
* Proxy Settings *
************************
Configure Proxy Settings: y
Choose action <F12> for Back/<F1> for help
possible values [yes(y)/no(n)]: y
************************
* HTTPS Proxy *
************************
Enter the URL of the HTTPS Proxy to use
Choose action <F12> for Back/<F1> for help
HTTPS Proxy: http://proxy.example.com:3128
So far, no surprises. For the No Proxy value, however, it is recommended to copy and append the .status.noProxy settings from OpenShift's proxy object.
************************
* Cluster No Proxy *
************************
Specify the NO_PROXY setting for the cluster.
The value cannot contain white space and it must be comma-separated.
You have to include the address range configured for the kubernetes cluster in this list (e.g. "10.240.0.0/20").
Choose action <F12> for Back/<F1> for help
Cluster No Proxy: 10.128.0.0/14,127.0.0.1,172.30.0.0/16,192.168.128.0/24,localhost,jump,169.254.169.254,sap-slcbridge,.local,.example.com,.svc,.internal
Note: you can use the following script to generate the value from OpenShift's proxy settings.
# bash <(curl -s https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/utils/get_no_proxy.sh) --slcb
Please make sure to append the --slcb parameter.
8.9.4. Configuring HTTP Proxy for the SAP DI during its installation
During the SDI installation, one must choose the "Advanced Installation" for the "Installation Type" in order to configure Proxy.
Then the following is the example of proxy settings:
**************************
* Cluster Proxy Settings *
**************************
Choose if you want to configure proxy settings on the cluster: y
Choose action <F12> for Back/<F1> for help
possible values [yes(y)/no(n)]: y
************************
* Cluster HTTP Proxy *
************************
Specify the HTTP_PROXY value for the cluster.
Choose action <F12> for Back/<F1> for help
HTTP_PROXY: http://proxy.example.com:3128
************************
* Cluster HTTPS Proxy *
************************
Specify the HTTPS_PROXY value for the cluster.
Choose action <F12> for Back/<F1> for help
HTTPS_PROXY: http://proxy.ocpoff.vslen:3128
************************
* Cluster No Proxy *
************************
Specify the NO_PROXY value for the cluster. NO_PROXY value cannot contain white spaces and it must be comma-separated.
Choose action <F12> for Back/<F1> for help
NO_PROXY: 10.0.0.0/16,10.128.0.0/14,127.0.0.1,172.30.0.0/16,192.168.0.0/16,192.168.128.2,localhost,jump,169.254.169.254,auditlog,datalake,diagnostics-prometheus-pushgateway,hana-service,storagegateway,uaa,vora-consul,vora-dlog,vora-prometheus-pushgateway,vsystem,vsystem-internal,*.local,*.example.com,*.svc,*.internal
Note: the value can be generated using the following script:
# bash <(curl -s https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/utils/get_no_proxy.sh)
# # to see the usage and options, append `--help`
# bash <(curl -s https://raw.githubusercontent.com/redhat-sap/sap-data-intelligence/master/utils/get_no_proxy.sh) --help
When setting the No Proxy value, please mind the following:
- The wildcard domains must contain a wildcard character. On the contrary, OpenShift's proxy settings must not contain wildcard characters.
- As of SLC Bridge 1.1.72, NO_PROXY must not start with a wildcard domain. In other words, please put the wildcard domains at the end of NO_PROXY.
- In addition to the OpenShift Proxy's .status.noProxy values, the list should also include the following service names: vora-consul,hana-service,uaa,auditlog,vora-dlog,vsystem-internal,vsystem,vora-prometheus-pushgateway,diagnostics-prometheus-pushgateway,storagegateway,datalake
8.9.5. Configuring HTTP Proxy after the SAP DI installation
1. Login to the system tenant as a clusterAdmin and open the System Management.
2. Click on Cluster and then click on Tenants.
3. For each tenant, click on the tenant row.
4. Click on "View Application Configuration and Secrets".
5. Search for PROXY and click on the Edit button.
6. Edit the values as needed. Feel free to use the get_no_proxy.sh script above to generate the No proxy value.
7. Click the Update button.
8. (If dealing with the system tenant, please skip this step until the very end.) Go back to the tenant overview. This time, click on "Delete all Instances". Note that this will cause a slight downtime for the tenant's current users.
9. Repeat from step 3 for the other tenants.
10. Execute step 8 for the system tenant as well.
8.10. GPU enablement for SDI on OCP
To enable the GPU usage for SDI on OCP, please refer to GPU enablement for SDI on OCP.
9. Troubleshooting
Please refer to https://access.redhat.com/articles/7018550 for the detailed troubleshooting guide.