Install SAP Data Hub 1.X Distributed Runtime on OpenShift Container Platform
Table of Contents
- 1. OpenShift Container Platform validation version matrix
- 2. Hardware/VM Requirements
- 2.1. Persistent Volumes
- 2.2. Compatibility Matrix
- 2.3. OpenShift Cluster
- 2.4. Jump Server
- 2.5. Hadoop (Optional)
- 3. Install Red Hat OpenShift Container Platform
- 3.1. Prepare the Subscription and Packages
- 3.2. Install OpenShift
- 4. Configure the Prerequisites for SAP Data Hub Distributed Runtime (SAP Vora)
- 4.1. Set up an External Docker Registry
- 4.2. Configure the OpenShift Cluster for Vora
- 4.3. Prepare the Jump Server
- 5. Install SAP Vora on OpenShift
- 5.1. Download and unpack the SAP VORA binaries
- 5.1.1. Installation on cluster with 3 nodes in total
- 5.2. Install SAP Vora
- 6. Install SAP Data Hub Flow Agent
- 7. Troubleshooting Tips
- 7.1. Configure Docker on Jump Server as a non-root user
- 7.2. How to check whether the template for Service Accounts was applied
- 7.3. SAP Vora Installation Error: render error in "vora-consul/templates/consul.yaml"
- 7.4. Vora Installation Error: timeout at “Deploying vora-consul”
- 7.5. Clean up Failed Installation, for example, namespace is `vora`
- 7.6. Uninstall Helm
- 8. Additional resources
SAP Data Hub consists of several components, and one of them is the SAP Data Hub Distributed Runtime, a.k.a. SAP Vora, which should be installed on a Kubernetes cluster. Red Hat OpenShift Container Platform has been validated for running SAP Vora.
From now on, we will refer to the SAP Data Hub Distributed Runtime by the abbreviation SDH.
Please note that in SDH 1.3 and 1.4, the version of the Vora component is 2.2. Please don't be confused by the versioning convention.
In general, the installation of SDH follows these steps:
- Install Red Hat OpenShift Container Platform
- Configure the prerequisites for SAP Data Hub Distributed Runtime
- Download SAP Data Hub Distributed Runtime installation binaries and run installer
- Install SAP Data Hub Flow Agent
1. OpenShift Container Platform validation version matrix
The following version combinations of SDH, OCP and RHEL have been validated:
SAP Data Hub | OpenShift Container Platform | RHEL |
---|---|---|
1.3 | 3.7 | 7.4 |
1.4 | 3.9 | 7.5 |
Although not validated, other version combinations are supported and listed below in the compatibility matrix.
2. Hardware/VM Requirements
2.1. Persistent Volumes
Persistent storage is needed by SDH. It’s recommended to use storage that can be created dynamically. You can find more information in this document: Dynamic Provisioning and Creating Storage Classes
The size of the storage required by SAP Vora on OpenShift depends on the storage type.
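If no dynamic storage provisioner is available, persistent volumes can also be created statically in advance. The following is only a sketch: the volume name `vora-pv1`, the NFS server `nfs.example.com`, and the export path are hypothetical placeholder values, and the capacity should be sized to the requirements of the consuming SDH component.

```
# oc create -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: vora-pv1
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    server: nfs.example.com
    path: /exports/vora-pv1
EOF
```

Repeat with distinct names and export paths for as many volumes as your claims require.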
2.2. Compatibility Matrix
Later versions of SAP Data Hub support newer versions of Kubernetes and OpenShift Container Platform. Even if not listed in the OCP validation version matrix above, the following version combinations are fully supported and expected to work:
SAP Data Hub | OpenShift Container Platform | RHEL |
---|---|---|
1.3 | 3.7 | 7.5, 7.4 or 7.3 |
1.4 | 3.7 or 3.9 | 7.5, 7.4 or 7.3 |
2.3. OpenShift Cluster
The following are the minimum requirements for the OpenShift Cluster Nodes:
- OS: Red Hat Enterprise Linux 7.5, 7.4 or 7.3
- CPU: 4 cores
- Memory: 16GB
- Disk space:
  - /var (used in the docker configuration):
    - 50 GB if you are using statically provisioned storage
    - 20 GB if you are using dynamically provisioned storage
  - /var/local (used in the Vora installation: /var/local/vora and /var/local/db):
    - 50 GB if you are using statically provisioned storage
    - no minimum requirement if you are using dynamically provisioned storage
  - 100 GB of free LVM storage for the docker-pool
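The free LVM space for the docker-pool is typically set aside with the docker-storage-setup utility before docker is started for the first time. A minimal sketch, assuming a hypothetical spare block device /dev/vdb dedicated to docker storage:

```
# cat > /etc/sysconfig/docker-storage-setup <<EOF
DEVS=/dev/vdb
VG=docker-vg
EOF
# docker-storage-setup
```

The device name and volume group name are examples; adjust them to your environment.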
2.4. Jump Server
For the installation of SAP Data Hub Distributed Runtime, it is highly recommended to work from an external Jump Server rather than from within the OpenShift cluster, because the docker images for the SAP Data Hub Distributed Runtime are built on the Jump Server.
On OpenShift, you need to set up an external registry to install SDH; otherwise, the installer might fail due to permission problems or wrong certificates. The Jump Server can also host the external docker registry.
The minimum hardware requirements for the Jump Server are:
- OS: Red Hat Enterprise Linux 7.5, 7.4 or 7.3
- CPU: 2 cores
- Memory: 16GB
- Disk space:
  - /: 15 GB for the work directory and the installation binaries of SAP Vora and SAP Data Hub Flow Agent
  - 50 GB of free LVM storage for the docker-pool
2.5. Hadoop (Optional)
It's optional to install the extensions to the Spark environment on Hadoop. Please refer to Installation Guide for SAP Data Hub - System Landscapes for details. This document doesn't cover the Hadoop part.
3. Install Red Hat OpenShift Container Platform
3.1. Prepare the Subscription and Packages
- On each host of the OpenShift cluster, register the system using subscription-manager. Look up and attach to the pool that provides the OpenShift Container Platform subscription.

```
# subscription-manager register --username=UserName --password=Password
your system is registered with ID: XXXXXXXXXXXXXXXX
# subscription-manager list --available
# subscription-manager attach --pool=Pool_Id_Identified_From_Previous_Command
```
- Subscribe each host only to the following repositories.

```
# subscription-manager repos --disable='*'
# subscription-manager repos --enable='rhel-7-fast-datapath-rpms' \
    --enable='rhel-7-server-extras-rpms' --enable='rhel-7-server-optional-rpms' \
    --enable='rhel-7-server-rpms'
```
- Enable the channel for OpenShift 3.7 or 3.9 on each host.

```
# subscription-manager repos --enable='rhel-7-server-ose-3.7-rpms'   # for OCP 3.7
# subscription-manager repos --enable='rhel-7-server-ose-3.9-rpms'   # for OCP 3.9
```
- Install the following packages on each host.

```
# yum -y install curl git net-tools bind-utils iptables-services bridge-utils \
    bash-completion kexec-tools sos psacct
# yum -y install atomic-openshift-utils ansible openshift-ansible-playbooks docker
```
3.2. Install OpenShift
Install OpenShift Container Platform on your desired cluster hosts. Follow the OpenShift installation guide or use the playbooks for a cloud reference architecture.
IMPORTANT: Make sure to add feature-gates to the kubelet arguments with the following inventory line:

```
openshift_node_kubelet_args={'feature-gates':['ReadOnlyAPIDataVolumes=false']}
```
This causes all secret and configMap volumes to be mounted as read-write directories in containers. SAP Vora diagnostic pods expect these directories to be writable and fail to deploy otherwise.
For other installation methods, please make sure to add the following to the /etc/origin/node/node-config.yaml files on all the schedulable nodes:

```
kubeletArguments:
  feature-gates:
  - ReadOnlyAPIDataVolumes=false
```
IMPORTANT: Make sure not to set the default node selector. Otherwise, daemon sets will fail to deploy all their pods, which will cause the installation to fail. For new advanced installations, comment out the lines with osm_default_node_selector. For existing clusters, unset the node selector in /etc/origin/master/master-config.yaml with the following lines:

```
projectConfig:
  defaultNodeSelector: ''
```
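Taken together, the relevant fragment of an advanced-installation inventory might look like the following sketch; the commented-out node selector value region=primary is only a hypothetical example of what must not be active:

```
[OSEv3:vars]
# required so that secret and configMap volumes are mounted read-write:
openshift_node_kubelet_args={'feature-gates':['ReadOnlyAPIDataVolumes=false']}
# the default node selector must stay unset:
#osm_default_node_selector='region=primary'
```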
NOTE: On AWS you have to label all nodes according to Labeling Clusters for AWS with: openshift_clusterid="Key=kubernetes.io/cluster/,Value=ocp37"
4. Configure the Prerequisites for SAP Data Hub Distributed Runtime (SAP Vora)
4.1. Set up an External Docker Registry
NOTE: On OpenShift, you need to use an external registry, because SAP Vora currently cannot work with the secured internal OpenShift registry.
- On a host separate from the OpenShift cluster, set up an external registry for building and delivering the SAP Vora containers. Please follow the steps in the article How do I setup/install a Docker registry?. You can install the docker registry on the Jump Server.

After the setup, you should have an external docker registry with the following URL: My_Docker_Registry_FQDN:PORT
- Configure docker on the Jump Server

The SAP Vora installer builds the containers locally on the Jump Server and pushes them to the registry that is later used for the installation on OpenShift. In order to push images to the docker registry, you need to add the registry to your docker configuration in /etc/sysconfig/docker and /etc/containers/registries.conf:

```
# vi /etc/sysconfig/docker
...
OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false --insecure-registry=My_Docker_Registry_FQDN:PORT'
...
# vi /etc/containers/registries.conf
...
registries:
  - My_Docker_Registry_FQDN:PORT
  - registry.access.redhat.com
...
```
NOTE: The docker registry must be added as an insecure registry using the option --insecure-registry.

Restart the docker daemon:

```
# systemctl restart docker
```
NOTE: If you configure docker on the Jump Server as a non-root user, please check Appendix 7.1 for instructions.
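Before proceeding, you may want to verify that the registry answers on the Docker Registry HTTP API v2; the /v2/_catalog endpoint lists the repositories pushed so far. This sketch assumes the registry was set up without TLS as described above; a freshly installed registry returns an empty repository list:

```
# curl http://My_Docker_Registry_FQDN:PORT/v2/_catalog
{"repositories":[]}
```

After the SAP Vora installer has pushed its images, the same call should list the vora repositories.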
4.2. Configure the OpenShift Cluster for Vora
NOTE: Many commands below require cluster admin privileges. To become a cluster-admin, you can do one of the following:
- Log in to any master node as the root user and execute the following command:

```
# oc login -u system:admin
```
- Make any existing user a cluster admin by doing the previous step followed by:

```
# oc adm policy add-cluster-role-to-user cluster-admin $USER
```
- Copy the admin kubeconfig file from a remote master node to a local host and use that:

```
# scp master.node:/etc/origin/master/admin.kubeconfig .
# export KUBECONFIG=$(pwd)/admin.kubeconfig
# oc login -u system:admin
```
NOTE: For testing purposes, you might set SELinux to permissive mode (setenforce 0) in case more SELinux configuration is needed. In a production system, please check carefully and add appropriate rules according to your required setup. The following steps 1 and 2 have been tested to work with the validated versions of SDH and OCP.
- On every (scheduled) node of the OpenShift cluster, create the following directories and add the proper SELinux fcontext container_file_t to them.

```
# mkdir -p /var/local/vora /var/local/db
# semanage fcontext -a -t container_file_t /var/local/vora
# semanage fcontext -a -t container_file_t /var/local/db
# restorecon -v /var/local/vora
# restorecon -v /var/local/db
# ls -Z /var/local
```

In the output, verify that the SELinux fcontext has been correctly set on the sub-directories db and vora.
- On every (scheduled) node of the OpenShift cluster, make sure containers can mount via NFS:

```
# setsebool virt_use_nfs true
```
- On every (scheduled) node of the OpenShift cluster, change the SELinux security context of the file /var/run/docker.sock.

```
# semanage fcontext -m -t svirt_sandbox_file_t -f s "/var/run/docker\.sock"
# restorecon -v /var/run/docker.sock
```

To make the change permanent, execute the following on all the nodes:

```
# cat >/etc/systemd/system/docker.service.d/socket-context.conf <<EOF
[Service]
ExecStartPost=/sbin/restorecon /var/run/docker.sock
EOF
```
- Create an OpenShift user for the SAP Vora installation, using the authentication method of your choice, for example, dhadmin.

- Create a project in OpenShift. The name of the project will be the namespace for the SAP Vora installation, for example, vora. Log in to OpenShift as a cluster-admin and perform the following configurations for the installation:

```
# oc new-project vora
# oc create sa tiller
# oc adm policy add-cluster-role-to-user cluster-admin -z tiller
# oc adm policy add-scc-to-group anyuid "system:serviceaccounts:$(oc project -q)"
# oc adm policy add-scc-to-group hostmount-anyuid "system:serviceaccounts:$(oc project -q)"
# oc adm policy add-scc-to-user privileged -z default
# oc adm policy add-role-to-user admin dhadmin
# oc adm policy add-cluster-role-to-user system:node-reader dhadmin
```
- Verify the tiller service account:

```
# oc get serviceaccounts -n vora tiller
NAME      SECRETS   AGE
tiller    2         7s
```

NOTE: The output should contain a tiller account. Otherwise, review the previous step and fix the issue. You need the tiller account for the Vora installation.
- As a cluster-admin, allow the project admin to manage SAP Vora custom resources.

3.7 only! On OCP 3.7, the admin cluster role can be modified with the following command:

```
# oc patch --type=json clusterrole admin \
    -p '[{"op":"add", "path":"/rules/-", "value":{
          "apiGroups":["sap.com"],
          "resources":["voraclusters","voracluster","vc"],
          "verbs":["create","delete","get","list","update","watch","patch"]
        }}]'
```
3.9 only! On OCP 3.9, aggregation rules need to be created. They will indirectly update the corresponding cluster roles:

```
# oc create -f - <<EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: aggregate-sapvc-admin-edit
  labels:
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
rules:
- apiGroups: ["sap.com"]
  resources: ["voraclusters"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"]
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: aggregate-sapvc-view
  labels:
    # Add these permissions to the "view" default role.
    rbac.authorization.k8s.io/aggregate-to-view: "true"
rules:
- apiGroups: ["sap.com"]
  resources: ["voraclusters"]
  verbs: ["get", "list", "watch"]
EOF
```
4.3. Prepare the Jump Server
- Install a helm client on the Jump Server.

  - Download from https://github.com/kubernetes/helm
  - Unpack the zip file and copy the binary to your path:

```
# curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > get_helm.sh
# chmod 700 get_helm.sh
# ./get_helm.sh
Downloading https://kubernetes-helm.storage.googleapis.com/helm-v2.7.0-linux-amd64.tar.gz
Preparing to install into /usr/local/bin
helm installed into /usr/local/bin/helm
Run 'helm init' to configure helm.
```

See the blog Getting started with Helm on OpenShift for more information.
- Download and install kubectl:

```
# curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
# chmod +x ./kubectl
# mv ./kubectl /usr/local/bin/kubectl
```
- Set up helm/tiller for the deployment, for example, in the namespace vora:

```
# export TILLER_NAMESPACE=vora
# helm init --service-account=tiller
```

Wait a short time until the tiller pod is deployed:

```
# oc get pods
NAME                            READY     STATUS    RESTARTS   AGE
tiller-deploy-551988758-dzjx5   1/1       Running   0          1m
# helm ls
```

NOTE: There should be no error in the output of helm ls. No output at all is good news: it means there are no errors.
5. Install SAP Vora on OpenShift
5.1. Download and unpack the SAP VORA binaries
Download and unpack the SAP Vora installation binary onto the Jump Server.
- Go to the SAP Software Download Center, log in with your SAP account, and search for SAP DATA HUB SP04 or SAP DATA HUB SP03 for versions 1.4 or 1.3, respectively.

- Download the SAP Data Hub Distributed Runtime file, for example: DHDISTRUNTIM04_0-80003052.ZIP (SP04 Patch0 for SAP DATA HUB DISTRIB RUNTM 1.0).

NOTE: The Data Hub Spark Extension is not covered here, because it is not installed on OpenShift. It has to be installed on your Hadoop cluster.
- Unpack the installer file. For example, unpacking the DHDISTRUNTIM04_0-80003052.ZIP package creates the installation folder SAPVora-2.2.48-DistributedRuntime.

```
# unzip DHDISTRUNTIM04_0-80003052.ZIP
```
5.1.1. Installation on cluster with 3 nodes in total
IMPORTANT: This note applies only to small PoCs, not to production deployments.
Vora's dlog pod expects at least 3 schedulable nodes without the role=infra label. This requirement can be relaxed by reducing the replication factor of the dlog pod with the following patch applied to the runtime directory:
```
--- SAPVora-2.2.42-DistributedRuntime.orig/deployment/helm/vora-cluster/values.yaml
+++ SAPVora-2.2.42-DistributedRuntime/deployment/helm/vora-cluster/values.yaml
@@ -43,7 +43,7 @@ components:
   dlog:
     storageSize: 50Gi
     bufferSize: 4g
-    replicationFactor: 2
+    replicationFactor: 1
     standbyFactor: 1
     useHostPath: false
     hostPath:
```
5.2. Install SAP Vora
- Run the SAP Vora installer as described in Installing SAP Vora and SAP Data Hub Pipeline Engine.
- Best practice examples of installer parameters

The installer parameters can be found in Command Line Parameters (Kubernetes). Depending on the deployment type and storage type, the usage of the parameters may vary:

  - --deployment-type=cloud: uses a dynamic storage provisioner by default
  - --deployment-type=onpremise: uses either NFS persistent volumes or hostPath

Below are some best practice examples:
- Deploying in cloud using a dynamic storage provisioner:

```
--deployment-type=cloud
```

NOTE: If you are using GCE or AWS, you can change the default dynamically provisioned storage by following Changing the Default StorageClass.
- Deploying in cloud using a static storage provider:

```
--deployment-type=cloud --use-hostpath-for-consul=yes --use-hostpath-for-dqp=yes
```
- Deploying on-premise with static storage (--use-hostpath-for-consul=yes and --use-hostpath-for-dqp=yes are the defaults):

```
--deployment-type=onpremise
```
- Deploying on-premise with only one dynamic storage provisioner:

```
--deployment-type=onpremise --use-hostpath-for-consul=no --use-hostpath-for-dqp=no
```
- Deploying on-premise with multiple dynamic storage provisioners but no default defined, in which case you need to specify the storage class using --vsystem-storage-class:

```
--deployment-type=onpremise --use-hostpath-for-consul=no --use-hostpath-for-dqp=no --vsystem-storage-class=StorageClass_In_Use
```

NOTE: If a default dynamic storage provisioner has been defined, the parameter --vsystem-storage-class can be omitted. To define the default dynamic storage provisioner, check the document Changing the Default StorageClass.
- NFS Persistent Volumes

The SAP Vora installer can provision NFS persistent volumes regardless of whether the deployment type is cloud or onpremise. This is useful when you wish to utilize persistent volumes but have no dynamic storage provisioner. So if you have an NFS server and want the installer to provision persistent NFS volumes, use the following parameters:

```
--provision-persistent-volumes=yes --nfs-address=Address_of_the_NFS_server --nfs-path=Path_on_NFS --local-nfs-path=Local_path_where_NFS_is_mounted
```
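Putting these parameters together, an on-premise installation that lets the installer provision NFS persistent volumes could be invoked as in the following sketch; the namespace, registry, NFS server, and paths are placeholder values to be replaced with your own:

```
# ./install.sh --namespace=vora \
    --docker-registry=My_Docker_Registry_FQDN:PORT \
    --deployment-type=onpremise \
    --provision-persistent-volumes=yes \
    --nfs-address=nfs.example.com \
    --nfs-path=/exports/vora \
    --local-nfs-path=/mnt/vora
```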
- After a successful installation, create a route for the SAP Vora service. You can find more information in the OpenShift documentation Using Wildcard Routes (for a Subdomain).

  - Look up the service, for example, in the namespace vora:

```
# oc get services -n vora
NAME      CLUSTER-IP      EXTERNAL-IP   PORT(S)                           AGE
vsystem   172.30.81.230   <nodes>       10002:31753/TCP,10000:32322/TCP   1d
```
  - Create the route:

```
# oc create route passthrough --service=vsystem -n vora
# oc get route -n vora
NAME      HOST/PORT                      PATH      SERVICES   PORT      TERMINATION   WILDCARD
vsystem   vsystem-vora.wildcard-domain             vsystem    vsystem   passthrough   None
```
- Access the SAP Vora Tools web console at https://vsystem-vora.wildcard-domain.

- Validate the SAP Vora installation on OpenShift

It helps to validate the SAP Vora installation before moving forward. Please follow the instructions in Validate the SAP Vora Installation.
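As a quick sanity check before the formal validation, you can confirm that all pods in the installation namespace are running and probe the route with curl; the -k flag skips certificate verification, since the passthrough route typically presents a self-signed certificate:

```
# oc get pods -n vora
# curl -k https://vsystem-vora.wildcard-domain
```

All pods should report the Running status, and curl should receive an HTTP response rather than a connection error.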
6. Install SAP Data Hub Flow Agent
The SAP Data Hub Flow Agent can be installed before, during or after the Vora installation. This document installs Flow Agent after the Vora installation.
- Download the SAP Data Hub - Data Integration package (aka Flow Agent) from the SAP Software Download Center, for example: DHFLOWAGENT04_0-80003551.ZIP (SP04 Patch0 for SAP DATA HUB FLOWAGENT 1.0), and upload the file to the jump host.

- Unpack the package on the jump host and prepare the deployment package.

```
# unzip DHFLOWAGENT04_0-80003551.ZIP
# cd bdh-assembly-vsystem
# ./prepare.sh
```

NOTE: Extract the package into the same directory as the SDH runtime ZIP file.
- Import the vsolution from the SDH runtime directory:

```
# cd ~/SAPVora-2.2.48-DistributedRuntime
# oc login -u dhadmin
# oc project vora
# ./install.sh --import-vsolution --vsolution-import-path=../bdh-assembly-vsystem
```

NOTE: If the following environment parameters are not set, the installer may prompt for their values. You can also include them in the installer command:

```
--namespace=
--docker-registry=
--vora-admin-username=
--vora-admin-password=
```
7. Troubleshooting Tips
7.1. Configure Docker on Jump Server as a non-root user
- Append -G dockerroot to OPTIONS= in the /etc/sysconfig/docker file on your Jump Server.

```
# vi /etc/sysconfig/docker
...
OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false --insecure-registry=My_Docker_Registry_FQDN:PORT -G dockerroot'
...
```
- Run the following commands on the Jump Server after you modify the /etc/sysconfig/docker file.

```
# sudo usermod -a -G dockerroot InstallUserName
# sudo chown root:dockerroot /var/run/docker.sock
```
- Log out and log back in to the Jump Server for the changes to take effect.
7.2. How to check whether the template for Service Accounts was applied

There should be a tiller service account in the output.

```
# oc get sa --all-namespaces
NAMESPACE   NAME       SECRETS   AGE
...
vora        builder    2         22h
vora        default    2         22h
vora        deployer   2         22h
vora        tiller     2         22h
```
7.3. SAP Vora Installation Error: render error in "vora-consul/templates/consul.yaml"

```
Vora Installation Error: render error in "vora-consul/templates/consul.yaml": template: vora-consul/templates/consul.yaml:98:34: executing "vora-consul/templates/consul.yaml" at <index $global.Values...>: error calling index: index of untyped nil
```

Solution: run the SAP Vora installer with the parameter --assign-nodes, for example, in the namespace vora:

```
# install.sh --namespace=vora --docker-registry=My_Docker_Registry_FQDN:PORT --assign-nodes
[... there will be output showing that the installer is doing node assignment ...]
Node assignment is done!
```

Now run the SAP Vora installer again.
7.4. Vora Installation Error: timeout at "Deploying vora-consul"

```
Vora Installation Error: timeout at "Deploying vora-consul with: helm install --namespace vora -f values.yaml ..."
```

To view the log messages, log in to the OpenShift web console, navigate to Applications -> Pods, select the failing pod, e.g. vora-consul-2-0, and check the log under the Events tab.

A common error: if the external docker registry is an insecure registry, but the OpenShift cluster is configured to pull from a secure registry, you will see errors in the log. If a secure registry is not feasible, follow the commands below to configure every (scheduled) node of the OpenShift cluster to use the insecure registry. Please note that an insecure registry is not recommended for production environments.

```
# vi /etc/sysconfig/docker
INSECURE_REGISTRY='--insecure-registry My_Docker_Registry_FQDN:PORT'
# systemctl daemon-reload; systemctl restart docker
```

You can now test pulling the image from the docker registry. If it succeeds, you can retry the installation.

```
# docker pull My_Docker_Registry_FQDN:PORT/vora/consul:0.9.0-sap10
```
7.5. Clean up Failed Installation, for example, namespace is `vora`

```
# install.sh --purge --force-deletion --namespace=vora --docker-registry=My_Docker_Registry_FQDN:PORT
```
7.6. Uninstall Helm

In the following example, the namespace is vora:

```
# export TILLER_NAMESPACE=vora
# helm reset
```