SAP Data Hub 2 on OpenShift Container Platform 3
Table of Contents
- 1. OpenShift Container Platform validation version matrix
- 2. Requirements
- 2.1. Hardware/VM and OS Requirements
- 2.1.1. OpenShift Cluster
- 2.1.1.1. Worker Nodes
- 2.1.1.2. Master Nodes
- 2.1.2. Jump host
- 2.2. Software Requirements
- 2.2.1. Compatibility Matrix
- 2.2.2. Prepare the Subscription and Packages
- 2.2.3. Persistent Volumes
- 2.2.3.1. Ceph RBD
- 2.2.3.2. OCS (OpenShift Container Storage) / Gluster
- 2.2.4. Checkpoint store enablement
- 2.2.5. External Image Registry
- 2.2.5.1. Update the list of insecure registries
- 2.2.5.2. Update proxy settings
- 2.2.6. (Optional) Hadoop
- 3. Install Red Hat OpenShift Container Platform
- 3.1. Prepare the Jump host
- 3.2. Install OpenShift Container Platform
- 3.2.1. (OCP 3.11 only) Verify access to the Red Hat Registry
- 3.3. (Optional) Validate the OpenShift cluster
- 3.4. OCP Post Installation Steps
- 3.4.1. Configure Dynamic Storage Provider
- 3.4.2. Set up an External Image Registry
- 3.4.3. Configure the OpenShift Cluster for SDH
- 3.4.3.1. Becoming a cluster-admin
- 3.4.3.2. Project setup
- 3.4.3.2.1. Enable NFS in containers
- 3.4.3.2.2. Pre-load kernel modules
- 3.4.3.2.3. Permit access to docker socket
- 3.4.3.2.4. Allow administrator to manage SDH resources
- 3.4.3.2.5. Create privileged tiller service account
- 3.4.3.2.6. Initialize helm
- 3.4.3.2.7. Create sdh project
- 3.4.3.2.8. Granting privileges to sdh admin
- 3.4.3.2.9. Deploy SDH Observer
- 4. Install SDH on OpenShift
- 4.1. Required Input Parameters
- 4.2. Kaniko Image Builder
- 4.2.1. Registry requirements for the Kaniko Image Builder
- 4.3. Installation using the Maintenance Planner and SL Plugin (mpsl)
- 4.4. Installation using SL Plugin without Maintenance Planner (mpfree)
- 4.5. Manual Installation using an installation script (manual)
- 4.5.1. Download and unpack the SDH binaries
- 4.5.1.1. Note on Installation on cluster with 3 nodes in total
- 4.5.1.2. Patch the fluentd daemonset deployment files
- 4.5.2. Install SAP Data Hub
- 4.5.2.1. Remarks on the installation options
- 4.5.2.2. Executing the installation script
- 4.6. SDH Post installation steps
- 4.6.1. (Optional) Expose SDH services externally
- 4.6.1.1. Using OpenShift Router and routes
- 4.6.1.1.1. Export services with a reencrypt route
- 4.6.1.1.2. Export services with a passthrough route
- 4.6.1.2. Using NodePorts
- 4.6.2. (AWS or IBM Cloud™ only) Configure registry secret for the Modeler
- 4.6.3. SDH Validation
- 5. Upgrade of SDH to a newer release
- 6. Appendix
- 6.1. Ceph and OCP integration
- 6.1.1. Create a dedicated pool and user in Ceph cluster for OCP
- 6.1.2. Install ceph-common package on OCP cluster hosts
- 6.1.3. Configure the ceph secrets
- 6.1.4. Create Ceph RBD storage class
- 6.1.5. (Optional) Test the Ceph RBD storage class
- 6.2. SDH uninstallation
- 6.2.1. Using the SL Plugin
- 6.2.2. Manual uninstallation
- 6.3. Uninstall Helm
- 6.4. Allow a non-root user to interact with Docker on Jump host
- 6.5. Load nfsd kernel modules
- 6.6. Grant fluentd pods permissions to logs
- 6.6.1. Before the installation
- 6.6.2. After the SDH installation
- 6.7. Make the installer cope with just 3 nodes in cluster
- 6.8. Unset the default node selector on Data Hub's project
- 6.9. Deploy SDH Observer
- 6.10. Permit Pipeline Modeler to access Docker socket
- 6.11. Marking the vflow registry as insecure
- 6.12. Running SDH pods on particular nodes
- 6.13. Running multiple SDH instances on a single OCP cluster
- 6.14. Using AWS ECR Registry for the Modeler
- 7. Troubleshooting Tips
- 7.1. SDH Installation or Upgrade problems
- 7.1.1. HANA, consul and UAA pods keep restarting
- 7.1.2. Vsystem-vrep pod not starting
- 7.1.3. Vora Installation Error: timeout at “Deploying vora-consul”
- 7.1.4. Too few worker nodes
- 7.1.5. Privileged security context unassigned
- 7.1.6. No Default Storage Class set
- 7.1.7. vsystem-app pods not coming up
- 7.1.8. Fluentd pods cannot access /var/log
- 7.2. Validation errors
- 7.2.1. Services not installed
- 7.2.2. Less than desired daemonset pods deployed
- 7.2.3. Diagnostics Prometheus Node Exporter pods not starting
- 7.2.4. Checkpoint store validation
- 7.2.5. Node goes down when new tenants are created or new users added to SDH
- 7.3. Pipeline Modeler troubleshooting
- 7.3.1. Graphs cannot be run in the Pipeline Modeler
- 7.3.2. Graphs cannot be built by the Pipeline Modeler
- 7.3.2.1. Determine the Pipeline Modeler's node
- 7.3.2.2. Fix the SELinux issue
- 7.3.3. Pipeline Modeler cannot push images to the registry
- 7.3.4. Modeler does not run when AWS ECR registry is used
In general, the installation of SAP Data Hub Foundation (SDH) follows these steps:
- Install Red Hat OpenShift Container Platform
- Configure the prerequisites for SAP Data Hub Foundation
- Install SAP Data Hub Foundation on OpenShift Container Platform
The last step can be performed using one of the three approaches listed below. Each approach is compatible with this guide. Please refer to SAP's documentation for more information (2.7) / (2.6) / (2.5) / (2.4) / (2.3).
- mpsl - installation using the Maintenance Planner and SL Plugin (recommended by SAP)
- mpfree - installation using SL Plugin without Maintenance Planner
- manual - manual installation using an installation script
If you're interested in installing older SDH or SAP Vora releases, please refer to the other installation guides:
- SAP Data Intelligence 3 on OpenShift Container Platform 4
- SAP Data Hub 2 on OpenShift Container Platform 4
- Install SAP Data Hub 1.X Distributed Runtime on OpenShift Container Platform
- Installing SAP Vora 2.1 on Red Hat OpenShift 3.7
1. OpenShift Container Platform validation version matrix
The following version combinations of SDH 2.X, OCP, RHEL have been validated:
SAP Data Hub | OpenShift Container Platform | RHEL for OCP Installation | Infrastructure and (Storage) |
---|---|---|---|
2.3 | 3.10, 3.9 | 7.6, 7.5 | AWS (EBS), RHV (Ceph RBD) |
2.4 | 3.10 | 7.6 | AWS (EBS) |
2.4 | 3.11 | 7.6 | AWS (EBS, Ceph RBD, OCS 3.11 ♠), VMware vSphere (Ceph RBD) |
2.5 | 3.10 | 7.6 | AWS (Ceph RBD, EBS) |
2.5 | 3.11 | 7.6 | AWS (EBS), VMware vSphere (Ceph RBD) |
2.5 Patch 1 | 3.11 | 7.6 | AWS (EBS), Bare metal (Ceph RBD), VMware vSphere (Ceph RBD) |
2.6 | 3.11 | 7.7 | AWS (EBS), Bare metal (Ceph RBD) |
2.6 Patch 1 | 3.11 | 7.7 | AWS (EBS), Bare metal (Ceph RBD) |
2.7 | 3.11 | 7.7 | KVM/libvirt (Ceph RBD) |
2.7 Patch 1 | 3.11 | 7.7 | KVM/libvirt (Ceph RBD) |
2.7 Patch 3 | 3.11 | 7.7 | IBM Cloud™ (IBM Cloud Block Storage), KVM/libvirt (OCS 3.11 ♠) |
♠ Gluster Block Storage with the gluster.org/glusterblock-glusterfs PV provisioner was used.
Please refer to the compatibility matrix for version combinations that are considered working.
If you are looking for OCP releases 4.1 or higher, please refer to SAP Data Hub 2 on OpenShift Container Platform 4.
For more information on OCP on IBM Cloud™, please refer to Getting started with Red Hat OpenShift on IBM Cloud. If using this platform, you may jump directly to the chapter Install Red Hat OpenShift Container Platform because you don't need to install OpenShift yourself.
2. Requirements
2.1. Hardware/VM and OS Requirements
2.1.1. OpenShift Cluster
Make sure to consult the following official cluster requirements corresponding to your release:
- of SAP Data Hub (2.6 or newer) in SAP's documentation
- of SAP Data Hub (prior to 2.6) in SAP's documentation
- of OpenShift 3 Minimum Hardware Requirements (3.11) / (3.10) / (3.9)
2.1.1.1. Worker Nodes
The following are the minimum requirements for the OpenShift Worker Nodes for proof-of-concept deployments for the latest validated SDH and OCP 3.X releases:
- OS: Red Hat Enterprise Linux 7.7, 7.6, 7.5, 7.4 or 7.3¹
- CPU: 4 virtual cores
- Memory: 32GB
- Diskspace:
  - 230 GiB for /, including at least:
    - 90 GiB for /var/lib/docker
    - 100 GiB for /var/lib/origin/openshift.local.volumes (ephemeral volume storage for pods)
  - at least 500 GiB for persistent volumes if hosting persistent storage
The minimum number of Worker Nodes is 3. SDH can be deployed on a cluster with just 2 Worker Nodes as well with additional installation parameters.
2.1.1.2. Master Nodes
The following are the minimum requirements for the OpenShift Master nodes for proof-of-concept deployments for the latest validated SDH and OCP 3.X releases:
- OS: Red Hat Enterprise Linux 7.7, 7.6, 7.5, 7.4 or 7.3
- CPU: 4 virtual cores
- Memory: 16GB
- Diskspace:
  - 100 GiB for /, including at least:
    - 50 GiB for /var/lib/docker
Under the following assumptions:
- master nodes are used as infra nodes as well
- master nodes are not used to run SDH workload
The minimum number of Master/Infra Nodes is 1.
2.1.2. Jump host
It is recommended to do the installation of SAP Data Hub Foundation from an external Jump host and not from within the OpenShift Cluster.
The Jump host is used among other things for:
- running the OCP installation
- hosting the external image registry
- running the SDH installation
The hardware requirements for the Jump host can be:
- OS: Red Hat Enterprise Linux 7.7, 7.6, 7.5, 7.4 or 7.3
- CPU: 2 cores
- Memory: 4GB
- Diskspace:
  - 75 GiB for /:
    - to store the work directory and the installation binaries of SAP Data Hub Foundation
    - including at least 50 GiB for /var/lib/docker
    - additional 50 GiB for the registry's storage if hosting the image registry (by default at /var/lib/registry)
NOTE: It is of course possible not to have a dedicated Jump host and to run the installation from one of the OCP cluster hosts instead - ideally from one of the master nodes. In that case, please run all the commands meant for the Jump host on the host of your choice.
2.2. Software Requirements
2.2.1. Compatibility Matrix
Later versions of SAP Data Hub support newer versions of Kubernetes and OpenShift Container Platform. Even if not listed in the OCP validation version matrix above, the following version combinations are considered fully working:
SAP Data Hub | OpenShift Container Platform | RHEL for OCP Installation | Storage |
---|---|---|---|
2.3 | 3.9 | 7.6, 7.5, 7.4, 7.3 | Ceph RBD, OCS ❄, cloud² |
2.3 | 3.10 | 7.6, 7.5, 7.4 | Ceph RBD, OCS ❄, cloud |
2.5, 2.4 | 3.10 | 7.6, 7.5, 7.4 | Ceph RBD, OCS ❄, cloud |
2.7, 2.6, 2.5, 2.4 | 3.11 | 7.8, 7.7, 7.6, 7.5, 7.4 | Ceph RBD, OCS, cloud |
The storage option marked with ❄ is compatible, but not supported. Please refer to the OCS and OCP interoperability matrix for more information. The only OCS version supported for SDH is 3.11.
Unless stated otherwise, the compatibility of a listed SDH version covers all its patch releases as well.
2.2.2. Prepare the Subscription and Packages
- On each host of the OpenShift cluster, register the system using subscription-manager. Look up and then attach to the pool that provides the OpenShift Container Platform subscription.

    # subscription-manager register --username=UserName --password=Password
    your system is registered with ID: XXXXXXXXXXXXXXXX
    # subscription-manager list --available
    # subscription-manager attach --pool=Pool_Id_Identified_From_Previous_Command
- Subscribe each host only to the following repositories. For OCP 3.9:

    # subscription-manager repos --disable='*' \
        --enable=rhel-7-server-rpms \
        --enable=rhel-7-server-extras-rpms \
        --enable=rhel-7-fast-datapath-rpms \
        --enable=rhel-7-server-ose-3.9-rpms \
        --enable=rhel-7-server-ansible-2.4-rpms

  For OCP 3.10, please use the following commands:

    # subscription-manager repos --disable='*' \
        --enable=rhel-7-server-rpms \
        --enable=rhel-7-server-extras-rpms \
        --enable=rhel-7-server-ose-3.10-rpms \
        --enable=rhel-7-server-ansible-2.4-rpms

  For OCP 3.11, please use the following commands:

    # subscription-manager repos --disable='*' \
        --enable=rhel-7-server-rpms \
        --enable=rhel-7-server-extras-rpms \
        --enable=rhel-7-server-ose-3.11-rpms \
        --enable=rhel-7-server-ansible-2.6-rpms
- Additionally, if you plan to use Ceph RBD as storage, make sure to also enable the following repository so that the Ceph client tools installed on the OCP cluster are of the same version as the Ceph packages on the server.

    # subscription-manager repos --enable=rhel-7-server-rhceph-3-tools-rpms
- Please follow the host preparation documentation (3.11) / (3.10) / (3.9) corresponding to your cluster version.
2.2.3. Persistent Volumes
Persistent storage is needed for SDH. It’s required to use storage that can be created dynamically. You can find more information in the Dynamic Provisioning and Creating Storage Classes document.
Recommended and tested storage options for on-premise installations are:
- Ceph RBD
- OCS (OpenShift Container Storage) / Gluster
For installations in the cloud, please use the recommended provider for the particular cloud platform.
2.2.3.1. Ceph RBD
To make use of an existing Ceph cluster in OCP, please follow the Using Ceph RBD for dynamic provisioning guide (3.11) / (3.10) / (3.9).
Please refer to Installing the Ceph Object Gateway on enabling S3 object interface for Checkpoint store enablement.
For production environment, see the supported configurations.
For small-to-medium sized test and pre-production deployments, the Civetweb interface can be utilized. The S3 API endpoint must either be secured by a signed certificate (not self-signed) or be insecure. In the latter case, the scheme must be present in the endpoint URL specified during the installation (e.g. http://s3.172.18.0.85.nip.io:8080).
NOTE: Wildcard DNS does not need to be configured for the checkpoint store.
The following is an example configuration of a rados gateway residing in a Ceph configuration file (e.g. /etc/ceph/ceph.conf) for small-to-medium sized test and pre-production deployments.
[client.rgw.ip-172-18-0-85]
host = 0.0.0.0
keyring = /var/lib/ceph/radosgw/ceph-rgw.ip-172-18-0-85/keyring
log file = /var/log/ceph/ceph-rgw-ip-172-18-0-85.log
#rgw frontends = civetweb port=443s ssl_certificate=/etc/ceph/radosgw-172.18.0.85.nip.io.pem num_threads=100
rgw frontends = civetweb port=8080 num_threads=100
debug rgw = 1
rgw_enable_apis = s3
rgw_dns_name = ip-172-18-0-85.internal
rgw_resolve_cname = false
In this case the Amazon S3 Host input parameter shall be set to http://ip-172-18-0-85.internal:8080.
Once the Ceph Storage and OCP are deployed, please continue to Ceph integration.
2.2.3.2. OCS (OpenShift Container Storage) / Gluster
OCS can be deployed into the OpenShift cluster at the time of OCP installation or afterwards, using either converged or independent mode. An existing standalone Gluster Storage deployed outside of the OpenShift cluster can be utilized as well. For the various installation options, please refer to the OCP documentation.
For SDH, the only supported provisioners are gluster.org/glusterblock and gluster.org/glusterblock-glusterfs, which provision gluster-block volumes. If the openshift-ansible scripts are used to deploy OCS and the openshift_storage_glusterfs_name parameter is not overridden, the provisioner is represented by the glusterfs-storage-block storage class.
The regular glusterfs volumes provisioned by kubernetes.io/glusterfs do not perform well for Data Hub. Their use will most probably result in the following problems described in SAP Note #2755247:
Due to a known slow performance issue with GlusterFS storage on OpenShift 3.9 and 3.10 the vflow pod fails to start up after exceeding the 3 minute liveness probe timeout. Or if the vflow pod is running it spends a long time retrieving objects from the underlying storage.
openshift-ansible Inventory Configuration
Below are several parameters for the inventory file used during the advanced installation. The inventory can also be used after the OCP installation for the deployment of OCS alone. See Persistent Storage Using Red Hat Gluster Storage for instructions and examples.
The following is the minimum set of inventory arguments needed for the OCS installation (converged mode):
[OSEv3:children]
...
glusterfs
[glusterfs]
# the node names must be listed in the [nodes] section as well
# make sure to check for the right device names representing bare block devices with no partitions and no LVM PVs
node11.example.com glusterfs_devices='[ "/dev/xvdc", "/dev/xvdd" ]'
node12.example.com glusterfs_devices='[ "/dev/xvdc", "/dev/xvdd" ]'
node13.example.com glusterfs_devices='[ "/dev/xvdc", "/dev/xvdd" ]'
[OSEv3:vars]
...
openshift_storage_glusterfs_image="registry.access.redhat.com/rhgs3/rhgs-server-rhel7:v3.11"
openshift_storage_glusterfs_block_image="registry.access.redhat.com/rhgs3/rhgs-gluster-block-prov-rhel7:v3.11"
openshift_storage_glusterfs_heketi_image="registry.access.redhat.com/rhgs3/rhgs-volmanager-rhel7:v3.11"
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_size=135
openshift_storage_glusterfs_block_storageclass=true
The *_image variables are mandatory starting from OCP 3.10. They must contain a specific :<version> tag; the :latest tag cannot be used. To see the complete list of image tags, execute the following command.
# # make sure to install the following packages first: skopeo, jq
# sudo skopeo inspect 'docker://registry.redhat.io/rhgs3/rhgs-server-rhel7' | \
jq -r '.RepoTags[] | sub("^(?<a>[^.]+\\.[^.]+).*"; "\(.a)")' | sort -uV
3.1
...
v3.11
The parameter openshift_storage_glusterfs_block_host_vol_size specifies the size in GiB of the glusterfs volumes hosting the block volumes. It effectively determines the maximum size of a block volume that can be provisioned. Since SAP HANA requires a volume of at least 128 GiB, it is recommended to set this parameter to at least 132.
Optionally, make the block storage class the default with:
openshift_storage_glusterfs_block_storageclass_default=true
Please refer to the Gluster Role Variables for more options.
SELinux Enablement
If a standalone Gluster Storage is being used, additional permissions need to be granted by enabling the following SELinux booleans on all the schedulable nodes:
# setsebool -P virt_sandbox_use_fusefs=on virt_use_fusefs=on
IMPORTANT: Gluster's S3 API cannot currently be used for the checkpoint store. We are working on resolving the issues. If you demand this feature, please file an RFE.
2.2.4. Checkpoint store enablement
In order to enable SAP Vora Database streaming tables, a checkpoint store needs to be enabled. The store is an object storage on a particular storage back-end. Several back-end types that cover most of the cloud storage providers are supported by the SDH installer. For IBM Cloud™, please follow Preparing IBM Cloud Object Storage for SAP Vora Checkpoint. For on-premise deployments, Ceph can be utilized with its S3 interface.
The enablement is strongly recommended for production clusters. Clusters having this feature disabled are suitable for development or as PoCs.
Make sure to create a desired bucket before the SDH Installation. If the checkpoint store shall reside in a directory on a bucket, the directory needs to exist as well.
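For example, when using the Ceph RADOS Gateway described earlier, a dedicated S3 user and bucket could be created as in the following sketch (the user name, bucket name, endpoint, and the use of the AWS CLI are assumptions for illustration only):

    # # on a Ceph administration node: create an S3 user and note its access_key and secret_key
    # radosgw-admin user create --uid=checkpoint --display-name="SDH checkpoint store"
    # # on the Jump host: create the bucket against the RGW endpoint
    # AWS_ACCESS_KEY_ID=<access_key> AWS_SECRET_ACCESS_KEY=<secret_key> \
        aws s3 mb s3://sdh-checkpoints --endpoint-url http://ip-172-18-0-85.internal:8080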
2.2.5. External Image Registry
The SDH installation requires an Image Registry where images are first mirrored from an SAP Registry and then delivered to the OCP cluster nodes. The integrated OpenShift Container Registry is not appropriate for this purpose or may require further analysis. For now, an external image registry needs to be set up instead.
On AWS, it is recommended to utilize Amazon Elastic Container Registry. Please refer to Using AWS ECR Registry for the Modeler for a post-configuration step to enable the registry for the Modeler.
On IBM Cloud™, you can utilize the image container registry provided by the platform.
If an external registry is not provided by your platform or not feasible, it needs to be deployed manually. As an example, the Jump host can be used to host the registry. Please follow the steps in article How do I setup/install a Docker registry?.
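The referenced article is the authoritative procedure; as a minimal illustration of what it boils down to (the image name, port, and storage path are assumptions based on the defaults mentioned in this guide), the registry could be run on the Jump host like this:

    # mkdir -p /var/lib/registry
    # docker run -d --name registry --restart=always -p 5000:5000 \
        -v /var/lib/registry:/var/lib/registry:z registry:2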
After the setup you should have an external image registry up and running at My_Image_Registry_FQDN:5000. You can verify that with the following command.
# curl http://My_Image_Registry_FQDN:5000/v2/
{}
Make sure to mark the address as insecure.
Additionally, if using Kaniko Image Builder, make sure to mark the registry as insecure within the Pipeline Modeler.
2.2.5.1. Update the list of insecure registries
- Since the external image registry deployed above is insecure by default, in order to push images to it and pull them on the nodes, it must be listed as insecure in the /etc/containers/registries.conf file on all the hosts, including the Jump host:

    # vi /etc/containers/registries.conf
    ...
    [registries.insecure]
    registries = [ "My_Image_Registry_FQDN:5000" ]
    ...
- For the changes to become effective, restart the docker daemon:

    # systemctl restart docker
If you plan to run the installation as a non-root user, please check the instructions below for additional steps.
During the advanced installation of OCP, make sure to include My_Image_Registry_FQDN:5000 among openshift_docker_insecure_registries.
NOTE These settings have no effect on the Kaniko Image Builder, which also needs to be aware of the insecure registry. Please refer to Marking the vflow registry as insecure for more information.
2.2.5.2. Update proxy settings
If there's a mandatory proxy in the cluster's network, make sure to include My_Image_Registry_FQDN in the NO_PROXY settings in addition to the recommended NO_PROXY addresses (3.11) / (3.10) / (3.9).
Additionally, during the advanced installation (3.11) / (3.10) / (3.9) of OCP, you should include My_Image_Registry_FQDN in the openshift_no_proxy variable.
2.2.6. (Optional) Hadoop
It's optional to install the extensions to the Spark environment on Hadoop. Please refer to SAP Data Hub Spark Extensions on a Hadoop Cluster (2.7) / (2.6) / (2.5) / (2.4) / (2.3) for details. This document doesn't cover the Hadoop part.
3. Install Red Hat OpenShift Container Platform
If installing SAP Data Hub on IBM Cloud™, please follow the instructions that you find in Deploying your OpenShift cluster and jump host.
3.1. Prepare the Jump host
- Ideally, subscribe to the same repositories as on the cluster hosts.
- Install a helm client on the Jump host.
  - Download the installation script from https://github.com/helm/helm and execute it with the desired version set to the latest major release of helm for your SDH release. That is v2.11.0 for SDH 2.7, 2.6, 2.5 and 2.4, and v2.9.1 for SDH 2.3.

      # DESIRED_VERSION=v2.11.0   # or v2.9.1 for SDH releases 2.3.*
      # curl --silent https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get | \
          DESIRED_VERSION="${DESIRED_VERSION:-v2.11.0}" bash

  - See the blog Getting started with Helm on OpenShift for more information.
- Download and install kubectl.
  - Either via standard repositories by installing atomic-openshift-clients:

      # sudo yum install -y atomic-openshift-clients

    NOTE: rhel-7-server-ose-X.Y-rpms repositories corresponding to the same major release version (e.g. 3.10) as on the cluster nodes need to be enabled.
  - Or by downloading and installing the binary manually after determining the right version (e.g. the latest v1.11 for an OCP 3.11 cluster):

      # curl -LO https://dl.k8s.io/release/v1.11.10/bin/linux/amd64/kubectl
      # chmod +x ./kubectl
      # sudo mv ./kubectl /usr/local/bin/kubectl

- In case of the mpsl and mpfree SDH installation methods, make sure to install and run the SAP Host Agent (2.7) / (2.6) / (2.5) / (2.4) / (2.3) as well.
  - However, in step 4, instead of downloading a *.SAR archive as suggested by the guide, on RHEL it is recommended to download the latest RPM package (e.g. saphostagentrpm_40-20009394.rpm) and install it on the Jump host using a command like:

      # yum install saphostagentrpm_40-20009394.rpm

    NOTE (2.4 only): SAP Host Agent Patch Level (PL) 40 or higher with a self-signed SSL certificate is required.
  - This way, the installation of SAPCAR listed in the prerequisites is not needed.
  - Step 6 (SAR archive extraction) can then be skipped.
  - In step 7, the command then needs to be modified to:

      # cd /usr/sap/hostctrl/exe
      # ./saphostexec -setup slplugin -passwd

  - Additionally, make sure to set the password for the sapadm user. You will be prompted for the username and password by the Maintenance Planner.

      # passwd sapadm
3.2. Install OpenShift Container Platform
This section can be skipped if using a managed OpenShift platform or if the cluster is already deployed.
Install OpenShift Container Platform on your desired cluster hosts. Follow the OpenShift installation guide (3.11) / (3.10) / (3.9) or use the playbooks for a cloud reference architecture.
NOTE: On AWS you have to label all nodes according to Labeling Clusters for AWS with openshift_clusterid="<clusterid>" where the <clusterid> part matches the same part in the tag kubernetes.io/cluster/<clusterid>,Value=(owned|shared) of resources belonging to the cluster.
Important Advanced Installation Variables
Variable | Description |
---|---|
openshift_release | must contain one of 3.11, 3.10, 3.9 |
openshift_deployment_type | must be set to openshift-enterprise |
openshift_docker_insecure_registries | shall contain the URL of the external image registry (My_Image_Registry_FQDN:5000)³ |
openshift_https_proxy, openshift_http_proxy | shall be set up according to internal network policies |
openshift_no_proxy | if the proxy settings are set and the registry is deployed in the internal network, it must contain My_Image_Registry_FQDN |
openshift_cloudprovider_kind | the name of the target cloud provider if deploying in cloud (e.g. aws, azure, openstack or vsphere) |
openshift_clusterid | needs to be set only for AWS unless using IAM profiles⁴ |
openshift_master_default_subdomain | the subdomain used for exposed routes⁵ |
oreg_auth_user and oreg_auth_password | mandatory since 3.11 for the default registry.redhat.io registry; older releases continue to pull from registry.access.redhat.com |
os_sdn_network_plugin_name | set to redhat/openshift-ovs-multitenant if the cluster shall run multiple instances of SDH or other workloads |
Please refer to OCP / Gluster section for additional parameters related to OCS if you plan on deploying it.
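As an illustration, a fragment of the inventory combining these variables could look like the following sketch (all values are placeholders; include only the variables relevant to your environment):

    [OSEv3:vars]
    openshift_release="3.11"
    openshift_deployment_type=openshift-enterprise
    openshift_docker_insecure_registries=My_Image_Registry_FQDN:5000
    oreg_auth_user=<registry.redhat.io username>
    oreg_auth_password=<registry.redhat.io password or token>
    openshift_master_default_subdomain=apps.example.com
    os_sdn_network_plugin_name=redhat/openshift-ovs-multitenant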
3.2.1. (OCP 3.11 only) Verify access to the Red Hat Registry
If using the default registry.redhat.io registry, verify you have access to it before launching the installation like this:
# sudo docker login -u $REDHAT_PORTAL_ACCOUNT registry.redhat.io
# sudo skopeo inspect docker://registry.redhat.io/openshift3/ose-pod
{
"Name": "registry.redhat.io/openshift3/ose-pod",
...
}
3.3. (Optional) Validate the OpenShift cluster
Before continuing with the installation, you may want to double-check that the OpenShift cluster is healthy to rule out any possible issues resulting from misbehaving or misconfigured OpenShift cluster.
Please follow one of the health-check guides corresponding to your cluster version:
- Environment Health Checks for 3.11
- Environment Health Checks for 3.10
- Environment Health Checks for 3.9
3.4. OCP Post Installation Steps
3.4.1. Configure Dynamic Storage Provider
For cloud deployments, the default dynamic storage provisioner should already be in place. For example, on AWS, gp2 will most probably be configured as the default storage class:
# oc get sc
NAME PROVISIONER AGE
gp2 (default) kubernetes.io/aws-ebs 7d
For IBM Cloud™ specifics, please refer to Choosing the Dynamic Storage Provisioner.
For on-premise installations, a suitable storage provisioner needs to be considered and deployed. Please refer to the validated provisioners listed above.
In case of OCS / Gluster, the provisioner and storage class can be deployed during OCP installation. In that case, you can skip this step.
3.4.2. Set up an External Image Registry
If you haven't done so already, please follow the External Image Registry prerequisite.
If installing SAP Data Hub on IBM Cloud™, please follow the steps Setting up the IBM Cloud Container Registry.
3.4.3. Configure the OpenShift Cluster for SDH
3.4.3.1. Becoming a cluster-admin
Many commands below require cluster admin privileges. To become a cluster-admin, you can do one of the following:
- Copy the admin.kubeconfig file from a remote master node to a local host and use that:

    # scp master.node:/etc/origin/master/admin.kubeconfig .
    # export KUBECONFIG=$(pwd)/admin.kubeconfig
    # oc login -u system:admin

  This is recommended for the mpsl and mpfree installation methods when using the Jump host.
  NOTE: the same file is used as the KUBECONFIG File input parameter for the mpsl and mpfree installation methods.
- Log in to any master node as the root user, execute the following command and continue the installation as the system:admin user. In this case, the master node becomes the Jump host.

    # oc login -u system:admin

  Possible for all installation methods without a Jump host.
- (manual installation method only) Make any existing user (dhadmin in this example) a cluster admin by doing the previous step followed by:

    # oc adm policy add-cluster-role-to-user cluster-admin dhadmin
3.4.3.2. Project setup
3.4.3.2.1. Enable NFS in containers
On every schedulable node of the OpenShift cluster, make sure the NFS filesystem is ready for use by loading the nfsd kernel module and enabling it with an SELinux boolean:
# setsebool virt_use_nfs true
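A minimal sketch of loading the module and persisting it across reboots, mirroring the ipt_REDIRECT pattern in the next step (the appendix Load nfsd kernel modules is the authoritative reference; the file name nfsd.conf is just an example):

    # modprobe nfsd
    # echo "nfsd" > /etc/modules-load.d/nfsd.conf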
3.4.3.2.2. Pre-load kernel modules
On every schedulable node, make sure to load the ipt_REDIRECT kernel module.
# modprobe ipt_REDIRECT
# echo "ipt_REDIRECT" > /etc/modules-load.d/ipt_redirect.conf
3.4.3.2.3. Permit access to docker socket
(OCP 3.9 or older) Unless using kaniko builds, on every schedulable node of the OpenShift cluster, permit the vflow pod to access /var/run/docker.sock. See the appendix Permit Pipeline Modeler to access Docker socket for details.
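One possible way to do this, shown only as an illustrative assumption (the appendix Permit Pipeline Modeler to access Docker socket is the authoritative procedure), is to relabel the socket so SELinux allows containers to use it:

    # # assumption: relabel the Docker socket on each schedulable node; may need to be reapplied after a docker restart
    # chcon -t container_file_t /var/run/docker.sock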
3.4.3.2.4. Allow administrator to manage SDH resources
As a cluster-admin, allow the project administrator to manage SDH custom resources.
# oc create -f - <<EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: aggregate-sapvc-admin-edit
labels:
rbac.authorization.k8s.io/aggregate-to-admin: "true"
rbac.authorization.k8s.io/aggregate-to-edit: "true"
rules:
- apiGroups: ["sap.com"]
resources: ["voraclusters"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"]
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: aggregate-sapvc-view
labels:
# Add these permissions to the "view" default role.
rbac.authorization.k8s.io/aggregate-to-view: "true"
rules:
- apiGroups: ["sap.com"]
resources: ["voraclusters"]
verbs: ["get", "list", "watch"]
EOF
3.4.3.2.5. Create privileged tiller service account
As a cluster-admin, create a tiller service account in the kube-system project (aka namespace) and grant it the necessary permissions:
# oc create sa -n kube-system tiller
# oc adm policy add-cluster-role-to-user cluster-admin -n kube-system -z tiller
3.4.3.2.6. Initialize helm
Set up helm and tiller for the deployment:
# helm init --service-account=tiller --upgrade --wait
Upon successful initialization, you should be able to see a tiller pod in the kube-system namespace:
# oc get pods -n kube-system
NAME READY STATUS RESTARTS AGE
tiller-deploy-551988758-dzjx5 1/1 Running 0 1m
# helm ls
[There should be no error in the output. An empty output is fine; it just means no releases have been deployed yet.]
3.4.3.2.7. Create sdh project
Create a dedicated project in OpenShift for the SDH deployment, for example sdh. Log in to OpenShift as a cluster-admin and perform the following configuration for the installation:
# oc new-project sdh
# oc adm policy add-scc-to-group anyuid "system:serviceaccounts:$(oc project -q)"
# oc adm policy add-scc-to-group hostmount-anyuid "system:serviceaccounts:$(oc project -q)"
# oc adm policy add-scc-to-user privileged -z "vora-vsystem-$(oc project -q)"
# oc adm policy add-scc-to-user privileged -z "vora-vsystem-$(oc project -q)-vrep"
# oc adm policy add-scc-to-user privileged -z "$(oc project -q)-elasticsearch"
# oc adm policy add-scc-to-user privileged -z "$(oc project -q)-fluentd"
# oc adm policy add-scc-to-user privileged -z "default"
# oc adm policy add-scc-to-user privileged -z "vora-vflow-server"
# oc adm policy add-scc-to-user hostaccess -z "$(oc project -q)-nodeexporter"
# oc patch namespace "$(oc project -q)" -p '{"metadata":{"annotations":{"openshift.io/node-selector":""}}}'
3.4.3.2.8. Granting privileges to sdh admin
(Optional) (Until SDH 2.5) In case of the manual installation, you may want to execute the installation as a regular user (not as a cluster-admin). You can either create a new OpenShift user or delegate the installation procedure to an existing OCP user. In any case, the user needs to be granted the following roles in the chosen target project (in this case dhadmin is the user name and sdh is the target project):
# oc adm policy add-role-to-user -n sdh admin dhadmin
# oc adm policy add-cluster-role-to-user system:node-reader dhadmin
NOTE: Starting from SDH 2.5, the regular user cannot perform the installation anymore. It needs to be performed by a cluster-admin.
3.4.3.2.9. Deploy SDH Observer
(OCP 3.10 or higher) Deploy sdh-observer in the sdh namespace. Please follow the appendix Deploy SDH Observer.
4. Install SDH on OpenShift
4.1. Required Input Parameters
A few important installation parameters are described below. Please refer to the official documentation (2.7) / (2.6) / (2.5) / (2.4) / (2.3) for their full description. Most of the parameters must be provided for the mpsl and mpfree installation methods. The Command line argument column describes corresponding options for the manual installation.
Name | Condition | Recommendation | Command line argument |
---|---|---|---|
Kubernetes Namespace | Always | Must match the project name chosen in the Project Setup (e.g. sdh) | -n sdh |
Installation Type | Installation or Update | Choose Advanced Installation if you need to specify proxy settings, want to choose a particular storage class, or there is no default storage class set. | None |
KUBECONFIG File | Always | The path to the kubeconfig file on the Jump host. It is the same file as described in Becoming a cluster-admin. If the SAP Host Agent is running on the master host, it can be set to /root/.kube/config. | None⁶ |
Container Registry | Installation | Must be set to the external image registry. | -r My_Image_Registry_FQDN:5000 |
Certificate Domain⁷ | Installation | Shall be set either to 1. the FQDN of the vsystem route, 2. the wildcard domain matching the master default subdomain, or 3. the external FQDN of the OpenShift node used to access the vsystem service if using NodePort. | 1. --cert-domain vsystem-sdh.wildcard-domain 2. --cert-domain "*.wildcard-domain" 3. --cert-domain master.example.com |
Cluster Proxy Settings | Advanced Installation or Advanced Updates | Make sure to configure this if the traffic to internet needs to be routed through a proxy. | None |
Cluster HTTP Proxy | Advanced Installation or Advanced Updates | Make sure to set this if the traffic to internet needs to be routed through a proxy. | --cluster-http-proxy "$HTTP_PROXY" |
Cluster HTTPS Proxy | Advanced Installation or Advanced Updates | Make sure to set this if the traffic to internet needs to be routed through a proxy. | --cluster-https-proxy "$HTTPS_PROXY" |
Cluster No Proxy | Advanced Installation or Advanced Updates | Make sure to set this if the traffic to internet needs to be routed through a proxy. | --cluster-no-proxy "$NO_PROXY" |
StorageClass Configuration | Advanced Installation | Configure this if you want to choose different dynamic storage provisioners for different SDH components or if there's no default storage class set or you want to choose non-default storage class for the SDH components. | None |
Default StorageClass | Advanced Installation and if storage classes are configured | Set this if there's no default storage class set or you want to choose non-default storage class for the SDH components. | --pv-storage-class ceph-rbd |
Additional Installer Parameters | Advanced Installation | Useful for making the SDH deploy to just 2 worker nodes, reducing the minimum memory requirements of the HANA pod, etc.(Starting from 2.5) Can be used to enable Kaniko Image Builder. | -e vora-cluster.components.dlog.replicationFactor=2 --enable-kaniko=yes |
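Putting the command-line column together, a manual installation invocation could look like the following sketch (all values are placeholders taken from the examples above; see Executing the installation script for the authoritative procedure):

    # ./install.sh -n sdh \
        -r My_Image_Registry_FQDN:5000 \
        --cert-domain "*.wildcard-domain" \
        --pv-storage-class ceph-rbd \
        --enable-kaniko=yes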
4.2. Kaniko Image Builder
NOTE Available starting from SDH 2.5.
By default, the Pipeline Modeler (vflow) pod uses the Docker daemon on the node where it runs to build container images before they are run as graphs. This poses a security threat because, in order for the vflow pod to mount the Docker socket, one of the following must hold true:
- the pod is run as privileged or super privileged
- (OCP 3.9 or older) the SELinux context of the Docker socket must be modified, otherwise the mount will be denied by the enforcing SELinux policy
Additionally, Docker itself becomes a mandatory dependency, preventing the use of another container runtime (e.g. CRI-O).
To make the platform secure and independent of the underlying container runtime, it is possible to configure SDH to use the Kaniko Image Builder. This is enabled with the --enable-kaniko=yes parameter passed to the install.sh script during the manual installation. For the other installation methods, it can be enabled by appending --enable-kaniko=yes to SLP_EXTRA_PARAMETERS (Additional Installation Parameters).
4.2.1. Registry requirements for the Kaniko Image Builder
The Kaniko Image Builder supports out-of-the-box only connections to secure image registries with a certificate signed by a trusted certificate authority.
In order to use an insecure image registry (e.g. the proposed external image registry) in combination with the builder, the registry must be whitelisted in Pipeline Modeler by marking it as insecure.
4.3. Installation using the Maintenance Planner and SL Plugin (mpsl)
This is a web-based installation method recommended by SAP, offering an option to send analytics data and feedback to SAP. All the necessary prerequisites have been satisfied by applying the steps described above. The Installation using the Maintenance Planner and SL Plugin (2.7) / (2.6) / (2.5) / (2.4) / (2.3) documentation will guide you through the process.
4.4. Installation using SL Plugin without Maintenance Planner (mpfree)
This is an alternative command-line-based installation method. Please refer to the SAP Data Hub documentation (2.7) / (2.6) / (2.5) / (2.4) / (2.3) for more information and the exact procedure.
SDH 2.3 Note
If you have prepared the installation host according to the prior instructions, you will have the sapcar binary available at /usr/sap/hostctrl/exe/SAPCAR. Thus the extraction step 2.b) in the procedure will look like:

    # /usr/sap/hostctrl/exe/SAPCAR -xvf /tmp/download/SLPLUGIN<latest SP-version>.SAR /tmp/slplugin/bin
4.5. Manual Installation using an installation script (manual)
4.5.1. Download and unpack the SDH binaries
Download and unpack the SDH installation binary onto the Jump host.
- Go to the SAP Software Download Center, login with your SAP account and search for SAP DATA HUB 2, or access this link.
- Download the SAP Data Hub Foundation file, for example DHFOUNDATION03_3-80004015.ZIP (SAP DATA HUB - FOUNDATION 2.3) or DHFOUNDATION04_0-80004015.ZIP (SAP DATA HUB - FOUNDATION 2.4).
- Unpack the installer file. For example, when you unpack the DHFOUNDATION04_0-80004015.ZIP package, it will create the installation folder SAPDataHub-2.4.63-Foundation.

    # unzip DHFOUNDATION04_0-80004015.ZIP
4.5.1.1. Note on Installation on cluster with 3 nodes in total
This note is useful for just a small proof-of-concept, not for production deployment.
SDH's dlog pod expects at least 3 schedulable compute nodes that are neither master nor infra nodes. This requirement can be mitigated by reducing the replication factor of the dlog pod with a patch applied to the foundation directory; see the appendix Make the installer cope with just 3 nodes in cluster for instructions on making the installer cope with just 2 compute nodes.
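Alternatively, a similar effect can be achieved at installation time without patching by passing the additional installer parameter already mentioned in the Required Input Parameters table:

    -e vora-cluster.components.dlog.replicationFactor=2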
4.5.1.2. Patch the fluentd daemonset deployment files
In order to allow the diagnostics-fluentd pods to access the /var/log directories on the nodes, the diagnostics-fluentd daemonset needs to be patched to run as privileged.
To achieve that, the easiest solution is to deploy SDH Observer for OCP releases 3.10 or higher.
For older OCP releases, you can patch the deployment files now - before the Data Hub installation. In order to do so, please change to the unzipped foundation directory and follow the instructions Grant fluentd pods permissions to logs.
4.5.2. Install SAP Data Hub
4.5.2.1. Remarks on the installation options
Feature | Installation method | Parameter Name |
---|---|---|
Storage Class | Manual | --pv-storage-class="$storage_class_name" |
Storage Class | Advanced mpsl or mpfree installations | Default storage class |
Kaniko Image Builder | Manual | --enable-kaniko=yes |
Kaniko Image Builder | Advanced mpsl or mpfree installations | Additional Installation Parameters or SLP_EXTRA_PARAMETERS |
Storage Class
When there is no default dynamic storage provisioner defined, the preferred one needs to be specified explicitly.
If the default dynamic storage provisioner has been defined, the parameter can be omitted. To define the default dynamic storage provisioner, please follow the document Changing the Default StorageClass.
Kaniko Image Builder
Can be enabled starting from SDH 2.5. During the mpsl and mpfree installation methods, one needs to append --enable-kaniko=yes to the list of Additional Installation Parameters or SLP_EXTRA_PARAMETERS.
See Kaniko Image Builder for more information.
4.5.2.2. Executing the installation script
Run the SDH installer as described in Manually Installing SAP Data Hub on the Kubernetes Cluster (2.7) / (2.6) / (2.5) / (2.4) / (2.3).
For IBM Cloud™ specific installation parameters, please visit Running the installation shell script.
4.6. SDH Post installation steps
4.6.1. (Optional) Expose SDH services externally
There are multiple ways to make the SDH services accessible outside of the cluster. Compared to plain Kubernetes, OpenShift offers an additional method, based on the OpenShift Router and routes, which is recommended for most scenarios, including the SDH services. The other methods documented in the official SAP Data Hub documentation are still available.
4.6.1.1. Using OpenShift Router and routes
OpenShift allows you to access the Data Hub services via routes as opposed to regular NodePorts. For example, instead of accessing the vsystem service via https://master-node.example.com:32322, after the service exposure you will be able to access it at https://vsystem-sdh.wildcard-domain. This is an alternative to the official documentation Expose the Service From Outside the Network.
NOTE: For this to work, a wildcard domain needs to be preconfigured in the local DNS server to resolve the desired wildcard-domain and all its subdomains (e.g. vsystem-sdh.wildcard-domain) to the node where the OpenShift Router (or its load balancer) runs. Please follow Using Wildcard Routes (for a Subdomain) for more information.
There are two kinds of routes. The reencrypt kind allows a custom signed or self-signed certificate to be used. The other is the passthrough kind, which uses the pre-installed certificate generated by the installer or passed to the installer.
4.6.1.1.1. Export services with a reencrypt route
With this kind of route, different certificates are used on the client and service sides of the route. The router stands in the middle and re-encrypts the communication coming from either side using the certificate corresponding to the opposite side. In this case, the client side is secured by the provided certificate and the service side is encrypted with the original certificate generated by or passed to the SAP Data Hub installer.
The reencrypt route thus allows for securing the client connection with a properly signed certificate.
- Look up the vsystem service:

    # oc project sdh                      # switch to the Data Hub project
    # oc get services | grep "vsystem "
    vsystem   ClusterIP   172.30.227.186   <none>   8797/TCP   19h

  When exported, the resulting hostname will look like vsystem-${SDH_NAMESPACE}.wildcard-domain. However, an arbitrary hostname can be chosen instead as long as it resolves correctly to the IP of the router.
- Get or generate the certificates. In the following example, a self-signed certificate is created.

    # openssl genpkey -algorithm RSA -out sdhroute-privkey.pem \
        -pkeyopt rsa_keygen_bits:2048 -pkeyopt rsa_keygen_pubexp:3
    # openssl req -new -key sdhroute-privkey.pem -out sdhroute.csr \
        -subj "/C=DE/ST=BW/L=Walldorf/O=SAP SE/CN=vsystem-$(oc project -q).wildcard-domain"
    # openssl x509 -req -days 365 -in sdhroute.csr -signkey sdhroute-privkey.pem -out sdhroute.crt

  Please refer to Generating Certificates for more information.
  If you want to export more SDH services without using multiple certificates, the Common Name (CN) attribute could be set to *.wildcard-domain instead, which will match all its possible subdomains.
  The sdhroute.crt self-signed certificate can be imported into a web browser or passed to any other vsystem client.
- Obtain the SDH's root certificate authority bundle generated at SDH's installation time. When installing manually, the generated bundle is available at SAPDataHub-*-Foundation/deployment/vsolutions/certs/vrep/ca/ca.crt. It is also available in the ca-bundle.pem secret in the sdh namespace.

    # # in case of manual installation
    # cp path/to/SAPDataHub-*-Foundation/deployment/vsolutions/certs/vrep/ca/ca.crt sdh-service-ca-bundle.pem
    # # otherwise get it from the ca-bundle.pem secret
    # oc get -o go-template='{{index .data "ca-bundle.pem"}}' secret/ca-bundle.pem | base64 -d >sdh-service-ca-bundle.pem

- Create the reencrypt route for the vsystem service like this:

    # oc create route reencrypt --cert=sdhroute.crt --key=sdhroute-privkey.pem \
        --dest-ca-cert=sdh-service-ca-bundle.pem --service=vsystem
    # oc get route
    NAME      HOST/PORT                     PATH   SERVICES   PORT      TERMINATION   WILDCARD
    vsystem   vsystem-sdh.wildcard-domain          vsystem    vsystem   reencrypt     None

- Verify you can access the SDH web console at https://vsystem-sdh.wildcard-domain. If you generated a self-signed certificate in step 2, import it into your web browser and refresh the page.
4.6.1.1.2. Export services with a passthrough route
With the passthrough route, the communication is encrypted by the SDH service's certificate all the way to the client. It can be treated as secure by the clients as long as the SDH installer has been given a proper Certificate Domain to generate a certificate with a Common Name matching the route's hostname, and the certificate is imported into or passed to the client.
- Obtain the SDH's root certificate as documented in step 3 of Export services with a reencrypt route.
- Print its attributes and make sure its Common Name (CN) matches the expected hostname or wildcard domain.

    # openssl x509 -noout -subject -in sdh-service-ca-bundle.pem
    subject= /C=DE/ST=BW/L=Walldorf/O=SAP SE/CN=*.wildcard-domain

  In this case, the certificate will be valid for any subdomain of the .wildcard-domain.
- Look up the vsystem service:

    # oc project sdh                      # switch to the Data Hub project
    # oc get services | grep "vsystem "
    vsystem   ClusterIP   172.30.227.186   <none>   8797/TCP   19h

- Create the route:

    # oc create route passthrough --service=vsystem
    # oc get route
    NAME      HOST/PORT                     PATH   SERVICES   PORT      TERMINATION   WILDCARD
    vsystem   vsystem-sdh.wildcard-domain          vsystem    vsystem   passthrough   None

  Verify that the hostname matches the Common Name (CN) attribute from step 2. You can modify the hostname with the --hostname parameter. Make sure it resolves to the router's IP.
- Import the self-signed certificate into your web browser and access the SDH web console at https://vsystem-sdh.wildcard-domain to verify.
4.6.1.2. Using NodePorts
Until SDH 2.5, the SAP Data Hub services were exposed implicitly by the installer. From this version onward, the services need to be exposed manually if desired.
Exposing SAP Data Hub vsystem
NOTE For OpenShift, an exposure using routes is preferred.
- Either with an auto-generated node port:

    # oc expose service vsystem --type NodePort --name=vsystem-nodeport --generator=service/v2
    # oc get -o jsonpath=$'{.spec.ports[0].nodePort}\n' services vsystem-nodeport
    30617

- Or with a specific node port (e.g. 32123):

    # oc expose service vsystem --type NodePort --name=vsystem-nodeport --generator=service/v2 --dry-run -o yaml | \
        oc patch -p '{"spec":{"ports":[{"port":8797, "nodePort": 32123}]}}' --local -f - -o yaml | oc create -f -
The original service remains accessible on the same ClusterIP:Port as before. Additionally, it is now accessible from outside of the cluster under the node port.
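For example, assuming one of the schedulable nodes is externally reachable as node.example.com (a placeholder) and the specific node port 32123 was chosen above, the exposed vsystem service can be reached with:

    # curl -k https://node.example.com:32123/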
Exposing SAP Vora Transaction Coordinator and HANA Wire
NOTE Routes cannot be used for exposing these services, therefore please use either NodePorts (as documented here) or an alternative method for Getting Traffic into a Cluster.
# oc expose service vora-tx-coordinator-ext --type NodePort --name=vora-tx-coordinator-nodeport --generator=service/v2
# oc get -o jsonpath=$'tx-coordinator:\t{.spec.ports[0].nodePort}\nhana-wire:\t{.spec.ports[1].nodePort}\n' \
services vora-tx-coordinator-nodeport
tx-coordinator: 32445
hana-wire: 32192
The output shows the generated node ports for the newly exposed services.
4.6.2. (AWS or IBM Cloud™ only) Configure registry secret for the Modeler
If installing SAP Data Hub on AWS, please follow Using AWS ECR Registry for the Modeler if this registry shall be used for the Pipeline Modeler.
If installing SAP Data Hub on IBM Cloud™, please refer to Provide the Modeler's Access Credentials for the IBM Cloud Container Registry for a post-configuration step to enable the registry for the SAP Data Hub Modeler.
4.6.3. SDH Validation
Validate SDH installation on OCP to make sure everything works as expected. Please follow the instructions in Testing Your Installation (2.7) / (2.6) / (2.5) / (2.4) / (2.3).
5. Upgrade of SDH to a newer release
This section will guide you through the SAP Data Hub upgrade to a newer release. The upgrade also involves an upgrade of the OpenShift cluster if you run SDH 2.3 on an OCP 3.9 cluster.
The following steps must be performed in the given order. Unless an OCP upgrade is needed, the steps marked with (ocp-upgrade) can be skipped.
- Make sure to get familiar with the official SAP Upgrade guide (2.7) / (2.6) / (2.5) / (2.4).
- (ocp-upgrade) Make yourself familiar with the OpenShift's upgrade guide (3.11) / (3.10).
- Plan for a downtime.
- Follow and execute the SAP Pre-Upgrade Procedures (2.7) / (2.6) / (2.5) / (2.4).
  - If you exposed the vsystem service using routes, delete the route:

      # oc get route vsystem -o yaml >route-vsystem.bak.yaml   # make a backup
      # oc delete route vsystem

- (ocp-upgrade) Choose one of the OCP's upgrade methods (3.11) / (3.10) and execute it.
- (SDH 2.3 to 2.4) Execute the following items from the Project setup that became necessary since SDH 2.4:
  - Load the ipt_REDIRECT kernel module on every schedulable node:

      # modprobe ipt_REDIRECT
      # echo "ipt_REDIRECT" > /etc/modules-load.d/ipt_redirect.conf

  - Execute the following as the cluster-admin in the SDH's namespace (e.g. sdh):

      # oc project sdh
      # oc adm policy add-scc-to-user privileged -z "vora-vflow-server"
      # oc patch namespace "$(oc project -q)" -p '{"metadata":{"annotations":{"openshift.io/node-selector":""}}}'

  - Set up helm according to the instructions in the Project setup.
- (SDH 2.4 to 2.5) Execute the following items from the Project setup that became necessary since SDH 2.5:
  - Deploy SDH Observer in the Data Hub's namespace.
  - Make sure the OpenShift user performing the upgrade has been granted the cluster-admin role. See becoming a cluster-admin for details.
  - If an insecure external registry is used and kaniko shall be enabled, make sure to mark the registry as insecure.
  - Execute the following as the cluster-admin in the SDH's namespace (e.g. sdh):

      # oc adm policy add-scc-to-user hostaccess -z "$(oc project -q)-nodeexporter"
- (SDH 2.5 to 2.6) Execute the following items from the Project setup that became necessary since SDH 2.6:
  - Execute the following as the cluster-admin in the SDH's namespace (e.g. sdh):

      # oc adm policy add-scc-to-user hostaccess -z "$(oc project -q)-nodeexporter"
- (SDH 2.6 to 2.7) Execute the following items from the Project setup that became necessary since SDH 2.7:
  - Execute the following as the cluster-admin in the SDH's namespace (e.g. sdh):

      # oc adm policy add-scc-to-user privileged -z "vora-vsystem-$(oc project -q)-vrep"
- Execute the SDH upgrade according to the official instructions. You may again choose between different upgrade methods:
  - mpsl - Upgrading SAP Data Hub using SL Plugin with Maintenance Planner and SAP Host Agent (2.7) / (2.6) / (2.5) / (2.4) (recommended by SAP)
  - mpfree - Upgrading SAP Data Hub using SL Plugin without Maintenance Planner and SAP Host Agent (2.7) / (2.6) / (2.5) / (2.4)
  - manual - Upgrade SAP Data Hub Using the Command-Line Tool (2.7) / (2.6) / (2.5) / (2.4)
- Execute the Post-Upgrade Procedures for the SDH (2.7) / (2.6) / (2.5) / (2.4).
  - If you exposed the vsystem service using routes, re-create the route. If no backup is available, please follow Using OpenShift Router and routes.

      # oc create -f route-vsystem.bak.yaml
6. Appendix
6.1. Ceph and OCP integration
In order to dynamically provision Ceph volumes in OCP cluster, several steps need to be performed to make it aware of the storage back-end.
6.1.1. Create a dedicated pool and user in Ceph cluster for OCP
Please follow the instructions on Creating a pool for dynamic volumes in OCP documentation.
This will create the kube pool and the kube user in the Ceph cluster.
6.1.2. Install ceph-common package on OCP cluster hosts
Make sure to enable the following repository on the OCP cluster hosts and install the ceph-common package:
# subscription-manager repos --enable=rhel-7-server-rhceph-3-tools-rpms
# yum install -y ceph-common
This allows OCP nodes to interact with Ceph cluster.
6.1.3. Configure the ceph secrets
The Ceph RBD storage class requires at least one secret (admin). However, for the sake of security, rather than using the Ceph administrator's credentials for all interaction, it's recommended to use a dedicated user (e.g. kube) for authentication against a dedicated pool for OCP (e.g. kube).
- admin secret - needs to be present in an arbitrary namespace (in this example we stick to kube-system)
- user secret - needs to be present in all namespaces that need to provision Ceph RBD persistent volumes; if not specified in the storage class (see below), the admin secret will be used instead
- Get the base64-encoded keys for both users with the following commands executed on a Ceph administrator or MON node:

    # ceph auth get-key client.admin | base64
    YXFhOU5xRENueTJib2JhYS82cDF4NGVxQjdBVUw2dnZWdDZsWVc9PQ==
    # ceph auth get-key client.kube | base64
    YXFhdk5xOUNCMkZlYUhhYXllWDhNaHlWTUFidDZWSC8vV0FyY1c9PQ==

- As the cluster-admin, create the admin secret like this while making sure to use your own $ADMIN_KEY:

    # ADMIN_KEY="YXFhOU5xRENueTJib2JhYS82cDF4NGVxQjdBVUw2dnZWdDZsWVc9PQ=="
    # oc create -f - <<EOF
    apiVersion: v1
    kind: Secret
    metadata:
      name: ceph-admin-secret
      namespace: kube-system
    data:
      key: ${ADMIN_KEY}
    type: kubernetes.io/rbd
    EOF

- As the cluster-admin, create a custom project template that will provision the user secret in each newly created project/namespace. Again, make sure to use your own $USER_KEY.

    # USER_KEY="YXFhdk5xOUNCMkZlYUhhYXllWDhNaHlWTUFidDZWSC8vV0FyY1c9PQ=="
    # oc adm create-bootstrap-project-template -o yaml | oc patch --local -f - -o yaml --type=json -p '[{
          "path": "/objects/0", "op": "add", "value": {
            "apiVersion": "v1",
            "kind": "Secret",
            "data": { "key": "'"$USER_KEY"'" },
            "metadata": { "name": "ceph-user-secret" },
            "type": "kubernetes.io/rbd"
          }
        }, {
          "path": "/metadata/namespace", "op": "add", "value": "default"
        }, {
          "path": "/metadata/name", "op": "add", "value": "ceph-project"
        }]' | oc create -f - -n default

  This creates a template called ceph-project in the default namespace. If another non-default project template is already being used, you can modify it to contain the following object instead:

    apiVersion: v1
    kind: Secret
    metadata:
      name: ceph-user-secret
      namespace: kube-system
    data:
      key: ${USER_KEY}
    type: kubernetes.io/rbd

- As the root user on all the OCP master nodes, modify the master's configuration file to instantiate the newly created template for all new projects:

    # cp /etc/origin/master/master-config{,.bak}.yaml
    # oc ex config patch -p '{ "projectConfig": { "projectRequestTemplate": "default/ceph-project" } }' \
        /etc/origin/master/master-config.bak.yaml >/etc/origin/master/master-config.yaml

- Restart the master api and controllers for the change to take effect:

    # # for OCP 3.9 only
    # systemctl restart atomic-openshift-master-{api,controllers}
    # # for OCP 3.10 or higher
    # for component in api controllers; do /usr/local/bin/master-restart $component $component; done
Now if a new project is created, the ceph-user-secret will be auto-generated in it. To verify, execute the following:
# PROJECT="test-$(cat /dev/urandom | tr -dc 'a-z0-9' | fold -w 6 | head -n 1)"
# oc new-project $PROJECT
Now using project "test-zj1nq9" on server "https://ip-172-18-0-251.ec2.internal:8443".
...
# oc get secret ceph-user-secret
NAME TYPE DATA AGE
ceph-user-secret kubernetes.io/rbd 1 47s
# oc delete project $PROJECT
project.project.openshift.io "test-zj1nq9" deleted
6.1.4. Create Ceph RBD storage class
The following will create the storage class named ceph-rbd usable in the whole OCP cluster and make it the default storage class for new PVCs. Make sure to use MONITORS corresponding to your cluster.
# MONITORS="192.168.1.11:6789,192.168.1.12:6789,192.168.1.13:6789"
# oc create -f - <<EOF
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
name: ceph-rbd
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/rbd
parameters:
monitors: ${MONITORS}
adminId: admin
adminSecretName: ceph-admin-secret
adminSecretNamespace: kube-system
pool: kube
userId: kube
userSecretName: ceph-user-secret
EOF
If it is not desirable to make the ceph-rbd storage class the default, remove the annotations section from the yaml above or run the following command as a post-creation step:
# oc annotate sc ceph-rbd storageclass.kubernetes.io/is-default-class-
storageclass.storage.k8s.io/ceph-rbd annotated
Otherwise, if there is another storage class already marked as default, make sure to remove its annotation:
# oc get sc
NAME PROVISIONER AGE
ceph-rbd (default) kubernetes.io/rbd 1d
glusterfs-storage kubernetes.io/glusterfs 1d
gp2 (default) kubernetes.io/aws-ebs 1d
# oc annotate sc/gp2 storageclass.kubernetes.io/is-default-class- \
storageclass.beta.kubernetes.io/is-default-class-
storageclass.storage.k8s.io/gp2 annotated
# oc get sc
NAME PROVISIONER AGE
ceph-rbd (default) kubernetes.io/rbd 1d
glusterfs-storage kubernetes.io/glusterfs 1d
gp2 kubernetes.io/aws-ebs 1d
6.1.5. (Optional) Test the Ceph RBD storage class
Please follow the OCP documentation Using an existing Ceph cluster for dynamic persistent storage from step 6 ("Create the PVC object definition") onward.
NOTE: If your ceph-rbd storage class is not the default, you need to specify it in the claim like this:
kind: PersistentVolumeClaim
...
requests:
storage: 2Gi
storageClassName: ceph-rbd
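For a quick end-to-end check, a minimal complete claim can be created and inspected as follows. This is only a sketch; the claim name and the requested size are arbitrary.
# oc create -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-rbd-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: ceph-rbd
EOF
# oc get pvc ceph-rbd-test     # the claim should reach the Bound status shortly
# oc delete pvc ceph-rbd-test  # clean up the test claim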
6.2. SDH uninstallation
Choose one of the uninstallation methods based on what method has been used for the installation.
When done, you may continue with a new installation round in the same or another namespace.
6.2.1. Using the SL Plugin
Please follow the SAP documentation Uninstall SAP Data Hub Using the SL Plugin (2.7) / (2.6) / (2.5) / (2.4) / (2.3) if you have installed SDH using either mpsl or mpfree methods.
6.2.2. Manual uninstallation
The installation script allows for a project clean-up where all the deployed pods, persistent volume claims, secrets, etc. are deleted. If the manual installation method was used, SDH can be uninstalled using the same script with a different set of parameters. The snippet below is an example where the SDH installation resides in the project sdh. In addition to running this script, the project needs to be deleted as well if the same project shall host the new installation.
# ./install.sh --delete --purge --force-deletion --namespace=sdh \
--docker-registry=registry.local.example:5000
# oc delete project sdh
# # start the new installation
The deletion of the project often takes quite a while. Until fully uninstalled, the project will be listed as Terminating in the output of oc get project. You may speed the process up with the following command. Again, please mind the namespace.
# oc delete pods --all --grace-period=0 --force --namespace sdh
NOTE: Make sure not to run the same installation script more than once at the same time even when working with different OpenShift projects.
6.3. Uninstall Helm
# helm reset
6.4. Allow a non-root user to interact with Docker on Jump host
- Append -G dockerroot to OPTIONS= in the /etc/sysconfig/docker file on your Jump host.
- Run the following commands on the Jump host after you modify the /etc/sysconfig/docker file. Make sure to replace alice with your user name.
# sudo usermod -a -G dockerroot alice
# sudo chown root:dockerroot /var/run/docker.sock
- Log out and log back in to the Jump host for the changes to become effective.
If the Jump host is part of the OCP cluster, make sure to add -G dockerroot to openshift_docker_options in the inventory file before the advanced installation.
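To confirm the change took effect, a quick sanity check like the following can be run. This is not part of the official procedure; the user name alice is just an example.
# sudo -iu alice id -nG | grep -o dockerroot        # verify the group membership is in effect
dockerroot
# sudo -iu alice docker info >/dev/null && echo "docker access OK"
docker access OK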
6.5. Load nfsd kernel modules
Execute the following on all the schedulable nodes in bash:
# sudo mount -t nfsd nfsd /proc/fs/nfsd
# sudo modprobe nfsv4
# sudo tee /etc/modules-load.d/nfsd.conf <<<$'nfsd\nnfsv4'
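If you prefer to apply this from the Jump host rather than logging in to each node, a loop like the following can be used. This is only a sketch: it assumes password-less ssh access as root to the nodes and iterates over all nodes, so restrict it to the schedulable ones as appropriate.
# for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do
    # run the same three steps from above on each node
    ssh "root@${node}" "mount -t nfsd nfsd /proc/fs/nfsd; modprobe nfsv4; printf 'nfsd\nnfsv4\n' >/etc/modules-load.d/nfsd.conf"
  done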
6.6. Grant fluentd pods permissions to logs
The diagnostics-fluentd-* pods need access to the /var/log directories on the nodes. For this to work, the pods need to run as privileged. Two steps are necessary to make that happen:
- the ${SDH_PROJECT_NAME}-fluentd service account needs to be added to the privileged scc list with the following command copied from the project setup:
# oc project "${SDH_PROJECT_NAME}"
# oc adm policy add-scc-to-user privileged -z "$(oc project -q)-fluentd"
- the daemonset diagnostics-fluentd needs to be patched to request the privileged security context.
The recommended way to execute the second step is to deploy the SDH Observer. Alternatively, the daemonset can be patched either before the SDH installation (only applicable for the manual installation) or afterwards.
6.6.1. Before the installation
For SDH 2.5 or newer, the recommended approach is to deploy the SDH Observer, which will patch the diagnostics-fluentd daemonset as soon as it appears.
Nevertheless, it is still possible to patch the helm template directly when installing SDH manually:
# patch -p1 -B patchbaks/ -r - <<EOF
Index: SAPDataHub-2.5.114-Foundation/deployment/helm/vora-diagnostic/templates/logging/fluentd.yaml
===================================================================
--- SAPDataHub-2.5.114-Foundation.orig/deployment/helm/vora-diagnostic/templates/logging/fluentd.yaml
+++ SAPDataHub-2.5.114-Foundation/deployment/helm/vora-diagnostic/templates/logging/fluentd.yaml
@@ -41,6 +41,7 @@ spec:
- name: "FLUENT_ELASTICSEARCH_SCHEME"
value: "http"
securityContext:
+ privileged: true
runAsUser: 0
runAsNonRoot: false
volumeMounts:
EOF
For SDH 2.4+ during manual installation only:
# patch -p1 -B patchbaks/ -r - <<EOF
Index: SAPDataHub-2.4.83-Foundation/deployment/helm/vora-diagnostic/templates/logging/fluentd.yaml
===================================================================
--- SAPDataHub-2.4.83-Foundation.orig/deployment/helm/vora-diagnostic/templates/logging/fluentd.yaml
+++ SAPDataHub-2.4.83-Foundation/deployment/helm/vora-diagnostic/templates/logging/fluentd.yaml
@@ -38,6 +38,8 @@ spec:
value: {{ .Context.elasticsearch.service.port | quote }}
- name: "FLUENT_ELASTICSEARCH_SCHEME"
value: "http"
+ securityContext:
+ privileged: true
volumeMounts:
- name: "settings"
mountPath: "/etc/fluent"
EOF
For SDH 2.3 during manual installation only:
# patch -p1 -B patchbaks/ -r - <<EOF
Index: SAPDataHub-2.3.173-Foundation/deployment/helm/vora-diagnostic/templates/logging/fluentd-kubernetes-ds.yaml
===================================================================
--- SAPDataHub-2.3.173-Foundation.orig/deployment/helm/vora-diagnostic/templates/logging/fluentd-kubernetes-ds.yaml
+++ SAPDataHub-2.3.173-Foundation/deployment/helm/vora-diagnostic/templates/logging/fluentd-kubernetes-ds.yaml
@@ -63,6 +63,8 @@ spec:
- name: {{ .Values.docker.imagePullSecret }}
{{- end }}
terminationGracePeriodSeconds: 30
+ securityContext:
+ privileged: true
volumes:
- name: config-volume
configMap:
EOF
6.6.2. After the SDH installation
For OCP cluster releases 3.10 or higher, the recommended approach is to deploy the SDH Observer.
For older releases, execute the following command in the Data Hub's namespace once the SDH installation is finished.
# oc patch ds/diagnostics-fluentd -p '{ "spec": { "template": { "spec": {
"containers": [{ "name": "diagnostics-fluentd", "securityContext": { "privileged": true }}]
}}}}'
The fluentd pods will get restarted automatically.
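To verify the patch took effect, the daemonset's security context can be queried; a simple check, assuming the fluentd container is the first (and only) container in the pod template:
# oc get ds/diagnostics-fluentd \
    -o jsonpath='{.spec.template.spec.containers[0].securityContext.privileged}{"\n"}'
true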
6.7. Make the installer cope with just 3 nodes in cluster
IMPORTANT: This hint is useful only for small PoCs, not for production deployments. For the latter, please increase the number of schedulable compute nodes.
SDH's dlog pod expects at least 3 schedulable compute nodes that are neither master nor infra nodes. This requirement can be mitigated by reducing the replication factor of the dlog pod with the following parameters, passed either to the installation script (when installing manually) or as Additional Installation Parameters during the mpsl or mpfree installation methods:
-e=vora-cluster.components.dlog.standbyFactor=0 -e=vora-cluster.components.dlog.replicationFactor=2
Alternatively, you may choose 1 for both standby and replication factors. The parameters are documented in the Installation Guide (2.7) / (2.6) / (2.5) / (2.4) / (2.3).
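For illustration, a manual installation on such a cluster could pass the parameters like this. This is only an example invocation; the registry value is a placeholder and the remaining options depend on your environment.
# ./install.sh --namespace=sdh \
    --docker-registry=registry.local.example:5000 \
    -e=vora-cluster.components.dlog.standbyFactor=0 \
    -e=vora-cluster.components.dlog.replicationFactor=2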
6.8. Unset the default node selector on Data Hub's project
The daemonsets deployed by the SDH installer are expected to deploy to all the schedulable nodes, including infra nodes. If there is a default node selector set that is not present on all the schedulable nodes, the pods will get restricted to the matching nodes. This will result in fewer available pods than desired:
# oc get daemonset -n sdh
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
diagnostics-fluentd 4 3 3 3 3 <none> 16h
diagnostics-prometheus-node-exporter 4 3 3 3 3 <none> 16h
vsystem-module-loader 4 3 3 3 3 <none> 16h
For example, on OCP 3.10, the default node selector is set to node-role.kubernetes.io/compute=true. So unless overridden, all the pods will be scheduled only to compute nodes. And as the following output shows, only 3 nodes out of 4 have the compute role.
# oc get nodes
NAME STATUS ROLES AGE VERSION
master.example.com Ready infra,master 2d v1.10.0+b81c8f8
node1.example.com Ready compute 2d v1.10.0+b81c8f8
node2.example.com Ready compute 2d v1.10.0+b81c8f8
node3.example.com Ready compute 2d v1.10.0+b81c8f8
To address this, as cluster-admin, set the Data Hub project's node selector to empty string, which overrides the global setting:
# oc patch namespace sdh -p '{"metadata":{"annotations":{"openshift.io/node-selector":""}}}'
To actually trigger the deployment of the missing pods, re-create the daemonsets:
# oc get ds -o yaml | oc replace --force -f -
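To verify the override, the project's annotation can be inspected; an empty value means the global default node selector no longer applies to the project:
# oc get namespace sdh -o yaml | grep openshift.io/node-selector
    openshift.io/node-selector: ""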
6.9. Deploy SDH Observer
The SDH Observer observes the SDH namespace and applies fixes to deployments as they appear. For OCP releases prior to 4.1, it does the following:
- modifies the Pipeline Modeler (aka vflow) to run as a Super Privileged Container, enabling it to access the /var/run/docker.sock socket on the host (needed only if kaniko builds are disabled)
- enables the Pipeline Modeler (aka vflow) to talk to an insecure registry (needed only if kaniko builds are enabled and the registry is insecure)
- makes the SDH's diagnostics-fluentd pods privileged to allow them to access log files on the hosts
Apart from accessing resources in the sdh namespace, it also requires the node-reader cluster role.
To deploy it, as a cluster-admin execute the following command in the SDH namespace before, during or after the SDH installation:
# OCPVER=v3.11 # this must match OCP minor release
# INSECURE_REGISTRY=false # set to true if the registry is insecure
# oc process -f https://raw.githubusercontent.com/redhat-sap/sap-datahub/master/sdh-observer.yaml \
NAMESPACE="$(oc project -q)" \
BASE_IMAGE_TAG="${OCPVER:-3.11}" \
MARK_REGISTRY_INSECURE=${INSECURE_REGISTRY:-0} | oc create -f -
NOTE: The BASE_IMAGE_TAG must match one of the tags available in the quay.io/openshift/origin-cli repository. The difference between the client's minor release and the OCP server's minor release must not exceed 1.
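Once created, the observer's rollout and logs can be checked as follows; this assumes the template creates a deploymentconfig named sdh-observer, as referenced later in this guide:
# oc rollout status dc/sdh-observer    # wait for the deployment to finish
# oc logs dc/sdh-observer --tail=10    # inspect recent activity of the observer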
For IBM Cloud™ specifics, please visit Deploying Red Hat's SAP Data Hub (SDH) Observer.
6.10. Permit Pipeline Modeler to access Docker socket
NOTE: applicable to OCP cluster release 3.9 or older. For newer releases, please follow Deploy SDH Observer instead.
The SDH Pipeline Modeler running in the vflow pod needs access to the /var/run/docker.sock socket in order to build images. This violates the default SELinux policy and will be denied at runtime unless explicitly allowed. To allow it, run the following commands as a user with root permissions on all schedulable nodes:
# semanage fcontext -m -t container_file_t -f s "/var/run/docker\.sock"
# restorecon -v /var/run/docker.sock
To make the change permanent, execute the following on all the nodes:
# cat >/etc/systemd/system/docker.service.d/socket-context.conf <<EOF
[Service]
ExecStartPost=/sbin/restorecon /var/run/docker.sock
EOF
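To confirm the new labeling on a node, the socket's SELinux context can be checked; a quick verification:
# ls -Z /var/run/docker.sock    # the type field should now read container_file_t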
6.11. Marking the vflow registry as insecure
NOTE: applicable only when kaniko image builds are enabled.
NOTE: applicable before, during or after the SDH installation.
In the SAP Data Hub 2.5.x and 2.6.x releases, it is not possible to configure an insecure registry for the Pipeline Modeler (aka vflow pod) either via the installer or in the UI.
The insecure registry needs to be set if the container registry listens on an insecure port (HTTP) or the communication is encrypted using a self-signed certificate.
Without the insecure registry set, the kaniko builder cannot push the built images to the registry configured for the Pipeline Modeler (see the "Container Registry for Pipeline Modeler" Input Parameter in the official SAP Data Hub documentation).
To mark the configured vflow registry as insecure, the SDH Observer needs to be deployed with the MARK_REGISTRY_INSECURE=true parameter. If it is already deployed, it can be re-configured to take care of insecure registries by executing the following command in the sdh namespace:
# oc set env dc/sdh-observer MARK_REGISTRY_INSECURE=true
Once deployed, all the existing Pipeline Modeler pods will be patched. It may take a few tens of seconds until all the modified pods become available.
For more information, take a look at SAP Data Hub RHT CoP repo.
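To double-check that the observer picked up the setting, its environment can be listed; a simple verification:
# oc set env dc/sdh-observer --list | grep MARK_REGISTRY_INSECURE
MARK_REGISTRY_INSECURE=true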
6.12. Running SDH pods on particular nodes
Due to shortcomings in SDH's installer, the validation of the SDH installation fails if its daemonsets are not deployed to all the nodes in the cluster.
Therefore, the installation should be executed without a restriction on nodes. After the installation is done, the pods can be re-scheduled to the desired nodes like this:
- Choose a label to apply to the SAP Data Hub project and the desired nodes (e.g. run-sdh-project=sdhblue).
- Label the desired nodes (in this example worker1, worker2, worker3 and worker4):
# for node in worker{1,2,3,4}; do oc label node/$node run-sdh-project=sdhblue; done
- Set the project node selector of the sdhblue namespace to match the label:
# oc patch namespace sdhblue -p '{"metadata":{"annotations":{"openshift.io/node-selector":"run-sdh-project=sdhblue"}}}'
- Evacuate the pods from all the other nodes by killing them (requires the jq utility to be installed):
# oc project sdhblue                   # switch to the SDH project
# label="run-sdh-project=sdhblue"      # set the chosen label
# nodeNames="$(oc get nodes -o json | jq -c '[.items[] | select(.metadata.labels["'"${label%=*}"'"] == "'"${label#*=}"'") | .metadata.name]')"
# oc get pods -o json | jq -r '.items[] | . as $pod | select(('"$nodeNames"' | all(. != $pod.spec.nodeName))) | "pod/\(.metadata.name)"' | xargs -r oc delete
NOTE: Please make sure the Data Hub instance is not being used because killing its pods will cause a downtime.
The pods will be re-launched on the nodes labeled with run-sdh-project=sdhblue. It may take several minutes before SDH becomes available again.
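To check where the pods ended up, the node names can be summarized; a quick count of pods per node:
# oc get pods -n sdhblue -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' | sort | uniq -c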
6.13. Running multiple SDH instances on a single OCP cluster
Two instances of SAP Data Hub running in parallel on a single OCP cluster have been validated. Running more instances is possible, but most probably needs an extra support statement from SAP.
Please consider the following before deploying more than one SDH instance to a cluster:
- Each SAP Data Hub instance must run in its own namespace/project.
- It is recommended to dedicate particular nodes to each SDH instance.
- It is recommended to use ovs-multitenant network plug-in for project-level network isolation and improved security. This, however, cannot be changed post OCP installation.
- If running the production and test (aka blue-green) SDH deployments on a single OCP cluster, mind also the following:
- There is no way to test an upgrade of OCP cluster before an SDH upgrade.
- The idle (non-productive) landscape should have the same network security as the live (productive) one.
To deploy a new SDH instance to the OCP cluster, please repeat the steps from project setup starting from point 6 with a new project name and continue with SDH Installation.
6.14. Using AWS ECR Registry for the Modeler
This is a post-SDH-installation step.
The SAP Data Hub installer allows specifying an "AWS IAM Role for Pipeline Modeler" when the AWS ECR Registry is used as the external registry. However, due to a bug in Data Hub, the Modeler cannot use it. In order to use the AWS ECR Registry for Data Hub, one can follow the instructions at Provide Access Credentials for a Password Protected Container Registry with the following modification.
# cat >/tmp/vsystem-registry-secret.txt <<EOF
username: "AWS_ACCESS_KEY_ID"
password: "AWS_SECRET_ACCESS_KEY"
EOF
The AWS_* credentials must belong to a user that has power-user access to the ECR registry, as provided by the AmazonEC2ContainerRegistryPowerUser policy. Please refer to Amazon ECR Repository Policies if you need fine-grained access control.
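Before storing them, the credentials can be sanity-checked with the AWS CLI. This is only a suggestion and assumes the aws CLI is installed and configured for the registry's region; the key values are placeholders.
# export AWS_ACCESS_KEY_ID="<access key id>" AWS_SECRET_ACCESS_KEY="<secret access key>"
# aws ecr get-authorization-token --query 'authorizationData[0].proxyEndpoint' --output text
# # prints the registry endpoint if the credentials are valid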
7. Troubleshooting Tips
7.1. SDH Installation or Upgrade problems
7.1.1. HANA, consul and UAA pods keep restarting
If the mentioned pods keep restarting, like the following output illustrates, you may be using a buggy version of docker.
# oc get pods
NAME READY STATUS RESTARTS AGE
auditlog-759889fc5-d2sbm 1/1 Running 3 25m
hana-0 1/1 Running 5 29m
tiller-deploy-5778c7768-96jbc 1/1 Running 0 1d
uaa-7df64bf-wnssm 0/2 Init:0/1 0 24m
vora-consul-0 1/1 Running 5 29m
vora-consul-1 1/1 Running 5 29m
vora-consul-2 1/1 Running 5 29m
vora-deployment-operator-7dffb56687-pxwkz 1/1 Running 0 29m
vora-security-operator-7b8ffcfb59-tb6n4 1/1 Running 0 24m
vora-spark-resource-staging-server-849c68f457-dpxst 1/1 Running 0 29m
Make sure not to install docker-1.13.1-84.git07f3374.el7.x86_64 on the schedulable nodes. Either update to a newer version or downgrade to an earlier one and exclude the broken version from future updates.
# yum downgrade -y docker-1.13.1-75.git8633870.el7_5.x86_64
# sed -i "s/^exclude.*/\0 docker*-1.13.1-84.git07f3374.el7/" /etc/yum.conf
7.1.2. Vsystem-vrep pod not starting
Upon inspection, it looks like the NFS server is not supported by the node's kernel:
# oc get pods -l vora-component=vsystem-vrep
NAME READY STATUS RESTARTS AGE
vsystem-vrep-0 0/1 CrashLoopBackOff 6 8m
# oc logs $(oc get pods -o name -l vora-component=vsystem-vrep)
2018-08-30 10:41:51.935459|+0000|INFO |Starting Kernel NFS Server||vrep|1|Start|server.go(53)
2018-08-30 10:41:52.014885|+0000|ERROR|service nfs-kernel-server start: Not starting NFS kernel daemon: no support in current kernel.||vrep|1|Start|server.go(80)
2018-08-30 10:41:52.014953|+0000|ERROR|error starting nfs-kernel-server: exit status 1
||vrep|1|Start|server.go(82)
2018-08-30 10:41:52.014976|+0000|FATAL|Error starting NFS server: NFSD error||vsystem|1|fail|server.go(145)
This means that the proper NFS kernel module hasn't been loaded yet. Make sure to load it permanently.
7.1.3. Vora Installation Error: timeout at “Deploying vora-consul”
Vora Installation Error: timeout at "Deploying vora-consul with: helm install --namespace vora -f values.yaml ..."
To view the log messages, you can log in to the OpenShift web console, navigate to Applications -> Pods, select the failing pod (e.g. vora-consul-2-0), and check the log under the Events tab.
A common error: if the external image registry is insecure but the OpenShift cluster is configured to pull only from secure registries, you will see errors in the log. If a secure registry is not feasible, follow the instructions on configuring the registry as insecure.
7.1.4. Too few worker nodes
If you see the installation failing with the following error, there are too few schedulable non-infra nodes in the cluster.
Status:
Message: Less available workers than Distributed Log requirements
State: Failed
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
New Vora Cluster 11m vora-deployment-operator Started processing
Update Vora Cluster 11m vora-deployment-operator Processing failed: less available workers than Distributeed Log requirements
2018-09-14T16:16:15+0200 [ERROR] Timeout waiting for vora cluster! Please check the status of the cluster from above logs and kubernetes dashboard...
If you have at least 2 schedulable non-infra nodes, you may still make the installation succeed by reducing the dlog's replication factor. After the patch is applied, make sure to uninstall the failed installation and start it anew.
7.1.5. Privileged security context unassigned
If there are pods, replicasets, or statefulsets not coming up and you can see an event similar to the one below, you need to add the privileged security context constraint to the corresponding service account.
# oc get events | grep securityContext
1m 32m 23 diagnostics-elasticsearch-5b5465ffb.156926cccbf56887 ReplicaSet Warning FailedCreate replicaset-controller Error creating: pods "diagnostics-elasticsearch-5b5465ffb-" is forbidden: unable to validate against any security context constraint: [spec.initContainers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.initContainers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.initContainers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]
Copy the name in the fourth column (the event name, diagnostics-elasticsearch-5b5465ffb.156926cccbf56887) and determine its corresponding service account name.
# eventname="diagnostics-elasticsearch-5b5465ffb.156926cccbf56887"
# oc get -o go-template=$'{{with .spec.template.spec.serviceAccountName}}{{.}}{{else}}default{{end}}\n' \
"$(oc get events "${eventname}" -o jsonpath=$'{.involvedObject.kind}/{.involvedObject.name}\n')"
sdh-elasticsearch
The obtained service account name (sdh-elasticsearch) now needs to be assigned the privileged scc:
# oc adm policy add-scc-to-user privileged -z sdh-elasticsearch
The pod should then come up on its own, provided this was the only problem.
7.1.6. No Default Storage Class set
If pods are failing because their PVCs are not being bound, the problem may be that no default storage class has been set and no storage class was specified to the installer.
# oc get pods
NAME READY STATUS RESTARTS AGE
hana-0 0/1 Pending 0 45m
vora-consul-0 0/1 Pending 0 45m
vora-consul-1 0/1 Pending 0 45m
vora-consul-2 0/1 Pending 0 45m
# oc describe pvc data-hana-0
Name: data-hana-0
Namespace: sdh
StorageClass:
Status: Pending
Volume:
Labels: app=vora
datahub.sap.com/app=hana
vora-component=hana
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 47s (x126 over 30m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
To fix this, either make sure to set the Default StorageClass or provide the storage class name to the installer. For the manual installation, that would be ./install.sh --pv-storage-class STORAGECLASS.
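For example, to mark the ceph-rbd storage class from the appendix as the default after the fact, an annotation like the following can be applied; adjust the storage class name to your environment:
# oc annotate sc ceph-rbd storageclass.kubernetes.io/is-default-class=true --overwrite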
7.1.7. vsystem-app pods not coming up
If you have SELinux in enforcing mode, you may see the pods launched by vsystem crash-looping because of the container named vsystem-iptables, like this:
# oc get pods
NAME READY STATUS RESTARTS AGE
auditlog-59b4757cb9-ccgwh 1/1 Running 0 40m
datahub-app-db-gzmtb-67cd6c56b8-9sm2v 2/3 CrashLoopBackOff 11 34m
datahub-app-db-tlwkg-5b5b54955b-bb67k 2/3 CrashLoopBackOff 10 30m
...
internal-comm-secret-gen-nd7d2 0/1 Completed 0 36m
license-management-gjh4r-749f4bd745-wdtpr 2/3 CrashLoopBackOff 11 35m
shared-k98sh-7b8f4bf547-2j5gr 2/3 CrashLoopBackOff 4 2m
...
vora-tx-lock-manager-7c57965d6c-rlhhn 2/2 Running 3 40m
voraadapter-lsvhq-94cc5c564-57cx2 2/3 CrashLoopBackOff 11 32m
voraadapter-qkzrx-7575dcf977-8x9bt 2/3 CrashLoopBackOff 11 35m
vsystem-5898b475dc-s6dnt 2/2 Running 0 37m
When you inspect one of those pods, you can see an error message similar to the one below:
# oc logs voraadapter-lsvhq-94cc5c564-57cx2 -c vsystem-iptables
2018-12-06 11:45:16.463220|+0000|INFO |Execute: iptables -N VSYSTEM-AGENT-PREROUTING -t nat||vsystem|1|execRule|iptables.go(56)
2018-12-06 11:45:16.465087|+0000|INFO |Output: iptables: Chain already exists.||vsystem|1|execRule|iptables.go(62)
Error: exited with status: 1
Usage:
vsystem iptables [flags]
Flags:
-h, --help help for iptables
--no-wait Exit immediately after applying the rules and don't wait for SIGTERM/SIGINT.
--rule stringSlice IPTables rule which should be applied. All rules must be specified as string and without the iptables command.
And in the audit log on the node, where the pod got scheduled, you should be able to find an AVC denial similar to:
# grep 'denied.*iptab' /var/log/audit/audit.log
type=AVC msg=audit(1544115868.568:15632): avc: denied { module_request } for pid=54200 comm="iptables" kmod="ipt_REDIRECT" scontext=system_u:system_r:container_t:s0:c826,c909 tcontext=system_u:system_r:kernel_t:s0 tclass=system permissive=0
...
To fix this, the ipt_REDIRECT kernel module needs to be loaded with the following commands executed on all the schedulable nodes:
# modprobe ipt_REDIRECT
# echo "ipt_REDIRECT" > /etc/modules-load.d/ipt_redirect.conf
7.1.8. Fluentd pods cannot access /var/log
If you see errors like the ones shown below in the logs of the fluentd pods, make sure to follow Grant fluentd pods permissions to logs to fix the problem.
# oc logs $(oc get pods -o name -l datahub.sap.com/app-component=fluentd | head -n 1) | tail -n 20
2019-04-15 18:53:24 +0000 [error]: unexpected error error="Permission denied @ rb_sysopen - /var/log/es-containers-sdh25-mortal-garfish.log.pos"
2019-04-15 18:53:24 +0000 [error]: suppressed same stacktrace
2019-04-15 18:53:25 +0000 [warn]: '@' is the system reserved prefix. It works in the nested configuration for now but it will be rejected: @timestamp
2019-04-15 18:53:26 +0000 [error]: unexpected error error_class=Errno::EACCES error="Permission denied @ rb_sysopen - /var/log/es-containers-sdh25-mortal-garfish.log.pos"
2019-04-15 18:53:26 +0000 [error]: /usr/lib64/ruby/gems/2.5.0/gems/fluentd-0.14.8/lib/fluent/plugin/in_tail.rb:151:in `initialize'
2019-04-15 18:53:26 +0000 [error]: /usr/lib64/ruby/gems/2.5.0/gems/fluentd-0.14.8/lib/fluent/plugin/in_tail.rb:151:in `open'
...
7.2. Validation errors
These may happen during the validation phase initiated by running the SDH installation script with the --validate flag:
# ./install.sh --validate --namespace=sdh
7.2.1. Services not installed
Failure description
The vora-vsystem validation complains about services not being installed even though they are deployed:
2018-08-28T13:32:19+0200 [INFO] Validating...
2018-08-28T13:32:19+0200 [INFO] Running validation for vora-cluster...OK!
2018-08-28T13:33:14+0200 [INFO] Running validation for vora-sparkonk8s...OK!
2018-08-28T13:34:56+0200 [INFO] Running validation for vora-vsystem...2018-08-28T13:35:01+0200 [ERROR] Failed! Please see the validation logs -> /home/miminar/SAPDataHub-2.3.144-Foundation/logs/20180828_133214/vora-vsystem_validation_log.txt
2018-08-28T13:35:01+0200 [INFO] Running validation for datahub-app-base-db...OK!
2018-08-28T13:35:01+0200 [ERROR] There is a failed validation. Exiting...
# cat /home/miminar/SAPDataHub-2.3.144-Foundation/logs/20180828_133214/vora-vsystem_validation_log.txt
2018-08-28T13:34:56+0200 [INFO] Connecting to vSystem ...
2018-08-28T13:34:57+0200 [INFO] Wait until pod vsystem-vrep-0 is running...
2018-08-28T13:34:57+0200 [INFO] Wait until containers in the pod vsystem-vrep-0 are ready...
2018-08-28T13:34:58+0200 [INFO] Wait until pod vsystem-7d7ffdd649-8mcxv is running...
2018-08-28T13:34:58+0200 [INFO] Wait until containers in the pod vsystem-7d7ffdd649-8mcxv are ready...
2018-08-28T13:35:01+0200 [INFO] Logged in!
2018-08-28T13:35:01+0200 [INFO] Installed services:
/home/miminar/SAPDataHub-2.3.144-Foundation/validation/vora-vsystem/vsystem-validation.sh: line 19: error: command not found
Alternatively, the following message may appear at the end of the vora-vsystem_validation_log.txt:
2018-08-31T08:59:52+0200 [ERROR] Connection Management is not installed
Resolution
If there are proxy environment variables set, make sure to include 127.0.0.1 and localhost in the no_proxy and NO_PROXY environment variables. You may find Setting Proxy Overrides and Working with HTTP Proxies helpful. Apart from the recommended settings, do not forget to include My_Image_Registry_FQDN in the NO_PROXY settings if the registry is hosted inside the proxied network.
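For illustration, the overrides on the installation host might look like the following; the domain is a placeholder and must be adapted to your environment:
# export no_proxy="127.0.0.1,localhost,.example.com,My_Image_Registry_FQDN"
# export NO_PROXY="${no_proxy}"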
7.2.2. Less than desired daemonset pods deployed
If you see the diagnostics pod failing because fewer daemonset pods are available than desired, most probably there is a default node selector set in the master config.
Start diagnostics readiness checks for namespace dh24
2018-12-07T11:00:13+0100 [INFO] Check readiness of daemonset diagnostics-fluentd ............. failed
2018-12-07T11:05:23+0100 [ERROR] daemonset diagnostics-fluentd not ready: found 3/4 ready pods
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/diagnostics-fluentd 4 3 3 3 3 <none> 16d
daemonset.apps/diagnostics-prometheus-node-exporter 4 3 3 3 3 <none> 16d
daemonset.apps/vsystem-module-loader 4 3 3 0 3 <none> 16d
Make sure to unset the default node selector on the Data Hub project to fix this.
7.2.3. Diagnostics Prometheus Node Exporter pods not starting
During an installation or upgrade, it may happen that the Node Exporter pods keep restarting:
# oc get pods | grep node-exporter
diagnostics-prometheus-node-exporter-5rkm8 0/1 CrashLoopBackOff 6 8m
diagnostics-prometheus-node-exporter-hsww5 0/1 CrashLoopBackOff 6 8m
diagnostics-prometheus-node-exporter-jxxpn 0/1 CrashLoopBackOff 6 8m
diagnostics-prometheus-node-exporter-rbw82 0/1 CrashLoopBackOff 7 8m
diagnostics-prometheus-node-exporter-s2jsz 0/1 CrashLoopBackOff 6 8m
The validation will fail like this:
2019-08-15T15:05:01+0200 [INFO] Validating...
2019-08-15T15:05:01+0200 [INFO] Running validation for vora-cluster...OK!
2019-08-15T15:05:51+0200 [INFO] Running validation for vora-vsystem...OK!
2019-08-15T15:05:57+0200 [INFO] Running validation for vora-diagnostic...2019-08-15T15:11:06+0200 [ERROR] Failed! Please see the validation logs -> /root/wsp/clust/foundation/logs/20190815_150455/vora-diagnostic_validation_log.txt
...
2019-08-15T15:11:35+0200 [ERROR] There is a failed validation. Exiting...
# cat /root/wsp/clust/foundation/logs/20190815_150455/vora-diagnostic_validation_log.txt
2019-08-15T15:05:57+0200 [INFO] Start diagnostics readiness checks for namespace sdhup
2019-08-15T15:05:57+0200 [INFO] Check readiness of daemonset diagnostics-fluentd ... ok
2019-08-15T15:05:58+0200 [INFO] Check readiness of daemonset diagnostics-prometheus-node-exporter ............. failed
2019-08-15T15:11:06+0200 [ERROR] daemonset diagnostics-prometheus-node-exporter not ready: found 2/5 ready pods
The possible reason is that the resource consumption limits set on the pods are too low. To address this post-installation, you can patch the daemonset like this (in the SDH namespace):
# oc patch -p '{"spec": {"template": {"spec": {"containers": [
{ "name": "diagnostics-prometheus-node-exporter",
"resources": {"limits": {"cpu": "200m", "memory": "100M"}}
}]}}}}' ds/diagnostics-prometheus-node-exporter
To address this during the installation (using any installation method), add the following parameters:
-e=vora-diagnostics.resources.prometheusNodeExporter.resources.limits.cpu=200m
-e=vora-diagnostics.resources.prometheusNodeExporter.resources.limits.memory=100M
And then restart the validation (using the manual method) like this:
# ./install.sh --validate -n=sdh
7.2.4. Checkpoint store validation
If you see the following error during the checkpoint store validation of the installation, it means the bucket or the given directory does not exist. Make sure to create it first.
2018-12-05T06:30:17-0500 [INFO] Validating checkpoint store...
2018-12-05T06:30:17-0500 [INFO] Checking connection...
2018-12-05T06:30:42-0500 [INFO] AFSI CLI ouput:
2018-12-05T06:30:42-0500 [INFO] Unknown error when executing operation: Couldn't open URL : Cannot open connection: file/directory 'bucket1/dir' does not exist.
pod sdh/checkpoint-store-administration terminated (Error)
2018-12-05T06:30:42-0500 [ERROR] Connection check failed!
2018-12-05T06:30:42-0500 [ERROR] Checkpoint store validation failed!
2018-12-05T06:30:42-0500 [ERROR] Please reconfigure your checkpoint store connection...
7.2.5. Node goes down when new tenants are created or new users added to SDH
If a node running the vsystem-vrep-0 pod goes down when a new SDH tenant or user is created or a new launchpad is accessed, the nfsv4 kernel module is probably not loaded.
Diagnosis
- Run oc get pods -o wide in the sdh namespace and look for vsystem-vrep-0, which is either not running or restarted.
- Determine its corresponding node from the output.
- Log in to the node via ssh.
- Run systemctl and watch it hang.
- Run docker ps and watch it hang.
Resolution
- Make sure to follow Load nfsd kernel modules to load the necessary kernel modules.
- Reboot the hanging node.
7.3. Pipeline Modeler troubleshooting
7.3.1. Graphs cannot be run in the Pipeline Modeler
If the log of the vflow pod shows problems with reaching outside of the private network, like the following output, verify your proxy settings and make sure that the installation script was run with the following parameters:
# ./install.sh --cluster-http-proxy="${HTTP_PROXY}" --cluster-https-proxy="${HTTPS_PROXY}" --cluster-no-proxy="${NO_PROXY}"
The vflow log can be displayed with a command like oc logs $(oc get pods -o name -l vora-component=vflow | head -n 1):
W: Failed to fetch http://deb.debian.org/debian/dists/stretch/InRelease Could not connect to deb.debian.org:80 (5.153.231.4), connection timed out [IP: 5.153.231.4 80]
W: Failed to fetch http://security.debian.org/debian-security/dists/stretch/updates/InRelease Could not connect to security.debian.org:80 (217.196.149.233), connection timed out [IP: 217.196.149.233 80]
W: Failed to fetch http://deb.debian.org/debian/dists/stretch-updates/InRelease Unable to connect to deb.debian.org:http: [IP: 5.153.231.4 80]
7.3.2. Graphs cannot be built by the Pipeline Modeler
If an attempt to run a pipeline fails with a message like the one below, the most probable reason is that SELinux prevents the modeler from accessing the docker socket.
failed to prepare graph description: failed to prepare image: error building docker image. Docker daemon error: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.23/build?buildargs=%7B%22HTTPS_PROXY%22%3A%22%22%2C%22HTTP_PROXY%22%3A%22%22%2C%22NO_PROXY%22%3A%22%22%2C%22http_proxy%22%3A%22%22%2C%22https_proxy%22%3A%22%22%2C%22no_proxy%22%3A%22%22%7D&cachefrom=null&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&forcerm=1&labels=null&memory=0&memswap=0&networkmode=&rm=1&shmsize=0&t=ip-172-18-11-229.ec2.internal%3A5000%2Fvora%2Fvflow-node-482f9340ff573d1a7a03108d18556792bb70ae2a%3A2.5.29-com.sap.debian&ulimits=null: dial unix /var/run/docker.sock: connect: permission denied
To verify it is indeed an SELinux problem, you can inspect the audit logs as the root user on the node where the vflow pod is running:
# ausearch --input-logs -m AVC,USER_AVC,SELINUX_ERR,USER_SELINUX_ERR -i | grep docker | tail -n 10
type=AVC msg=audit(05/09/2019 12:59:00.741:170409) : avc: denied { connectto } for pid=119617 comm=tmp.IojHz2WXIo path=/run/docker.sock scontext=system_u:system_r:container_t:s0:c9,c12 tcontext=system_u:system_r:container_runtime_t:s0 tclass=unix_stream_socket permissive=0
The output says that the connection to docker.sock has been denied to an application with PID 119617 running in a container. This confirms there is an SELinux issue.
7.3.2.1. Determine the Pipeline Modeler's node
To determine the node where the vflow pod runs, and where to run the ausearch command above, you can run the following:
# oc get -o wide pods -l datahub.sap.com/app-component=vflow
NAME READY STATUS RESTARTS AGE IP NODE
vflow-9761d22d34f4e7fa22ff797e1e10e22aea9a0771qh2pn-76f84dxrqgm 1/1 Running 0 18m 10.129.0.63 ip-172-18-4-77.ec2.internal
If there are multiple vflow pods, you can filter further by tenant name and tenant user; in the example below these are default and sdhadmin respectively:
# oc get -o wide pods -l datahub.sap.com/app-component=vflow,vsystem.datahub.sap.com/tenant=default,vsystem.datahub.sap.com/user=sdhadmin
NAME READY STATUS RESTARTS AGE IP NODE
vflow-9761d22d34f4e7fa22ff797e1e10e22aea9a0771qh2pn-76f84dxrqgm 1/1 Running 0 18m 10.129.0.63 ip-172-18-4-77.ec2.internal
7.3.2.2. Fix the SELinux issue
Please follow Deploy SDH Observer to automatically patch all recent and future vflow pods if your OCP cluster is 3.10 or newer. Otherwise, please follow Permit Pipeline Modeler to access Docker socket.
7.3.3. Pipeline Modeler cannot push images to the registry
If SDH is configured to build images with kaniko and the vflow registry is not secured with a certificate signed by a trusted certificate authority, the builder will not be able to push the built images there. The Pipeline Modeler will then label the graphs as dead with a message like the following:
failed to prepare graph description: failed to prepare image: build failed for image: internal-registry.example.org:5000/vora/vflow-node-482f9340ff573d1a7a03108d18556792bb70ae2a:com.sap.debian
To determine the cause, the log of the vflow pod needs to be inspected. There you can notice the root issue; in this case it is the insecure registry internal-registry.example.org:5000, accessible only via the HTTP protocol.
# oc logs $(oc get pods -o name -l vora-component=vflow | head -n 1)
...
INFO[0019] Using files from context: [/workspace/vflow]
INFO[0019] COPY /vflow /vflow
INFO[0019] Taking snapshot of files...
INFO[0023] ENTRYPOINT ["/vflow"]
error pushing image: failed to push to destination internal-registry.example.org:5000/vora/vflow-node-482f9340ff573d1a7a03108d18556792bb70ae2a:com.sap.debian: Get https://internal-registry.example.org:5000/v2/: http: server gave HTTP response to HTTPS client |vflow|container|192|getPodLogs|build.go(126)
...
To resolve it, you can:
- either secure the registry with a properly signed certificate (not self-signed!)
- or mark the registry as insecure
7.3.4. Modeler does not run when AWS ECR registry is used
If the initialization of the vflow pod fails with a message like the one below, your SDH deployment suffers from a bug that prevents it from using the AWS IAM Role for authentication against the AWS ECR Registry.
# oc logs $(oc get pods -o name -l vora-component=vflow | head -n 1)
....
2019-07-15 12:23:03.147231|+0000|INFO |Statistics Publisher started with publication interval 30s ms|vflow|statistic|38|loop|statistics_monitor.go(89)
2019-07-15 12:23:30.446482|+0000|INFO |connecting to vrep at vsystem-vrep.sdh:8738|vflow|container|1|NewImageFactory|factory.go(131)
2019-07-15 12:23:30.446993|+0000|INFO |Creating AWS ECR Repository 'sdh26/vora/vflow-node-482f9340ff573d1a7a03108d18556792bb70ae2a'|vflow|container|1|assertRepositoryExists|ecr.go(106)
2019-07-15 12:23:35.001030|+0000|ERROR|API node execution is failed: cannot instantiate docker registry client: failed to assert repository existance: Error creating AWS ECR repository 'sdh26/vora/vflow-node-482f9340ff573d1a7a03108d18556792bb70ae2a': NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
failed to create image factory
main.runMaster
/data/xmake/prod-build7010/w/velocity/.../vflow/src/cmd/vflow/main.go:386
main.main
/data/xmake/prod-build7010/w/velocity/.../vflow/src/cmd/vflow/main.go:357
runtime.main
/data/xmake/tools/xmake-tools/FA/org.golang.download.go/go/1.11.4-bin/go/src/runtime/proc.go:201
runtime.goexit
/data/xmake/tools/xmake-tools/FA/org.golang.download.go/go/1.11.4-bin/go/src/runtime/asm_amd64.s:1333|vflow|vflow|1|main|main.go(359)
The work-around is to use a registry pull secret.
- 7.3 is applicable only for OCP 3.9. ↩︎
- any storage provisioned by your cloud provider listed in OCP docs (3.11) / (3.10) / 3.9 ↩︎
- For example: openshift_docker_insecure_registries=['172.30.0.0/16', 'docker-registry.default.svc:5000', 'My_Image_Registry_FQDN:5000'] ↩︎
- See Configuring OpenShift Container Platform for AWS with Ansible for more details. ↩︎
- The subdomain is a wildcard domain, which resolves to the OpenShift Router's IP. The router routes requests to the exposed OCP and SDH services based on the target hostname. The domain is needed to Expose SDH services externally. ↩︎
- The environment variable $KUBECONFIG shall be set instead. ↩︎
- This setting assumes that all Data Hub services are accessed under the same name using NodePort. However, using OpenShift Routes, each service will be assigned a different hostname. Therefore, for a production environment, it is necessary to provide signed certificates for these routes. You may consider configuring a custom wildcard certificate for the master default subdomain. ↩︎