Installing OpenShift Data Science

Red Hat OpenShift Data Science 1

Use Red Hat OpenShift Cluster Manager to install Red Hat OpenShift Data Science as an Add-on to your OpenShift Dedicated cluster

Abstract

Use Red Hat OpenShift Cluster Manager to install Red Hat OpenShift Data Science as an Add-on to your OpenShift Dedicated cluster.

Chapter 1. Providing feedback on Red Hat documentation

Let Red Hat know how we can make our documentation better. You can provide feedback directly from a documentation page by following the steps below.

  1. Make sure that you are logged in to the Customer Portal.
  2. Make sure that you are looking at the Multi-page HTML format of this document.
  3. Highlight the text that you want to provide feedback on. The Add Feedback prompt appears.
  4. Click Add Feedback.
  5. Enter your comments in the Feedback text box and click Submit.

Red Hat automatically creates a tracking issue each time you submit feedback. Open the link that is displayed after you click Submit and start watching the issue, or add more comments to give us more information about the problem.

Thank you for taking the time to provide your feedback.

Chapter 2. Architecture of OpenShift Data Science

Red Hat OpenShift Data Science is a fully Red Hat managed cloud service that is available as an Add-on to Red Hat OpenShift Dedicated and Red Hat OpenShift Service on Amazon Web Services (ROSA).

OpenShift Data Science integrates the following components and services:

  • At the service layer:

    OpenShift Data Science dashboard
    A customer-facing dashboard that shows available and installed applications for the OpenShift Data Science environment as well as learning resources such as tutorials, quick start examples, and documentation. You can also access administrative functionality from the dashboard, such as user management, cluster settings, and notebook image settings. In addition, data scientists can create their own projects from the dashboard. This enables them to organize their data science work into a single project.
    Model serving
    Data scientists can deploy trained machine-learning models to serve intelligent applications in production. After deployment, applications can send requests to the model using its deployed API endpoint.
    Jupyter (Red Hat managed)
    A Red Hat managed application that allows data scientists to configure their own notebook server environment and develop machine learning models in JupyterLab.
  • At the management layer:

    The Red Hat OpenShift Data Science operator
    A meta-operator that deploys and maintains all components and sub-operators that are part of OpenShift Data Science.
    Monitoring services
Alertmanager, OpenShift Telemetry, and Prometheus work together to gather metrics from OpenShift Data Science and organize and display those metrics in useful ways for monitoring and billing purposes. Alerts from Alertmanager are sent to PagerDuty, which is responsible for notifying Red Hat of any issues with your managed cloud service.

When you install the OpenShift Data Science Add-on in the Cluster Manager, the following new projects are created:

  • The redhat-ods-operator project contains the OpenShift Data Science operator.
  • The redhat-ods-applications project installs the dashboard and other required components of OpenShift Data Science.
  • The redhat-ods-monitoring project contains services for monitoring and billing.
  • The rhods-notebooks project is where notebook environments are deployed by default.

You or your data scientists must create additional projects for the applications that will use your machine learning models.

Do not install independent software vendor (ISV) applications in namespaces associated with OpenShift Data Science Add-ons unless you are specifically directed to do so on the application’s card on the dashboard.

Chapter 3. Overview of deploying OpenShift Data Science

Read this section to understand how to deploy Red Hat OpenShift Data Science as a development and testing environment for data scientists.

Installing OpenShift Data Science involves the following high-level tasks:

  1. Confirm that your OpenShift Dedicated cluster meets all requirements.
  2. Configure an identity provider for OpenShift Dedicated.
  3. Add administrative users for OpenShift Dedicated. See Adding users for OpenShift Data Science for more information.
  4. Install the OpenShift Data Science Add-on. See Installing OpenShift Data Science on OpenShift Dedicated for more information.
  5. Configure user and administrator groups to provide user access to OpenShift Data Science.
  6. Provide your users with the URL for the OpenShift Dedicated cluster on which you deployed OpenShift Data Science.

Chapter 4. Requirements for OpenShift Data Science

Your environment must meet certain requirements to receive support for Red Hat OpenShift Data Science.

Installation requirements

You must meet the following requirements before you can install OpenShift Data Science on your Red Hat OpenShift Dedicated or Red Hat OpenShift Service on Amazon Web Services (ROSA) cluster.

  • A Red Hat customer account

    Go to OpenShift Cluster Manager (https://console.redhat.com/openshift) and log in or register for a new account.

  • Product subscriptions

    Subscriptions for the following product and Add-on:

    • Red Hat OpenShift Dedicated or ROSA
    • Red Hat OpenShift Data Science Add-on

    Contact your Red Hat account manager to purchase new subscriptions. If you do not yet have an account manager, complete the form at https://cloud.redhat.com/products/dedicated/contact/ to request one.

  • An OpenShift Dedicated cluster

    Use an existing AWS or GCP cluster or create a new cluster by following the OpenShift Dedicated documentation: Creating an OpenShift Dedicated cluster.

    Your cluster must have at least 2 worker nodes with at least 8 CPUs and 32 GiB RAM available for OpenShift Data Science use when you install the Add-on. The installation process fails to start and an error is displayed if this requirement is not met.

By default, a cluster is created with one machine pool. You can add an additional machine pool or modify the default pool to meet the minimum requirements. However, the minimum resource requirements must be met by a single machine pool in the cluster. You cannot meet the requirements using the resources of multiple machine pools. For more information, see Creating a machine pool in OpenShift Dedicated.

  • On ROSA clusters, AWS Identity and Access Management credentials

    You cannot install OpenShift Data Science on a ROSA cluster that uses AWS Security Token Service (STS). When you install OpenShift Data Science on ROSA, you must use AWS Identity and Access Management (IAM) credentials only. See the ROSA documentation for advice on deploying without STS: Deploying ROSA without AWS STS.

As in OpenShift Dedicated, a default machine pool is created when you install a ROSA cluster. After installation, you can use OpenShift Cluster Manager or the ROSA CLI (rosa) to add machine pools or modify the default pool to meet the minimum requirements. However, the minimum resource requirements must be met by a single machine pool in the cluster. You cannot meet the requirements using the resources of multiple machine pools. For more information, see Creating a machine pool in ROSA.
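If you want to check whether a machine pool's nodes clear these minimums before installing the Add-on, you can inspect allocatable node resources with the OpenShift CLI (oc). This is a minimal sketch, not part of the documented procedure; it assumes whole-core CPU values (millicore values such as 7500m would need extra parsing):

```shell
# Sketch: list allocatable CPU and memory per worker node and flag any node
# with fewer than 8 allocatable CPUs. Assumes whole-core CPU values.
oc get nodes -l node-role.kubernetes.io/worker --no-headers \
  -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEM:.status.allocatable.memory \
  | awk '$2 + 0 < 8 { print $1 " has fewer than 8 allocatable CPUs" }'
```

Remember that the minimum must be met by a single machine pool; checking nodes across pools does not satisfy the requirement.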

Chapter 5. Configuring an identity provider for OpenShift Dedicated

Configure an identity provider for your OpenShift Dedicated cluster to manage users and groups.

Important

Adding more than one identity provider to OpenShift can create problems when the same user name exists in multiple providers.

When mappingMethod is set to claim (the default mapping method for identity providers) and multiple providers have credentials associated with the same user name, the first provider that a user logs in to OpenShift with becomes the only provider that works for that user, regardless of the order in which identity providers are configured.

Refer to Identity provider parameters in the OpenShift Dedicated documentation for more information about mapping methods.
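You can inspect the configured providers and their mapping methods from the OpenShift CLI. This is a read-only sketch; on OpenShift Dedicated the OAuth resource is managed through Cluster Manager, so do not edit it directly:

```shell
# Sketch: print each identity provider and its mapping method; an unset
# mappingMethod means the default, claim.
oc get oauth cluster \
  -o jsonpath='{range .spec.identityProviders[*]}{.name}{"\t"}{.mappingMethod}{"\n"}{end}' \
  | awk -F'\t' '{ print $1, ($2 == "" ? "claim" : $2) }'
```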

Prerequisites

  • You have credentials for OpenShift Cluster Manager (https://console.redhat.com/openshift/).

Procedure

  1. Log in to OpenShift Cluster Manager (https://console.redhat.com/openshift/).
  2. Click Clusters. The Clusters page opens.
  3. Click the name of the cluster to configure.
  4. Click the Access control tab.
  5. Click Identity providers.
  6. Click Add identity provider.

    1. Select your provider from the Identity Provider list.
    2. Complete the remaining fields relevant to the identity provider that you selected. See Configuring identity providers for more information.
  7. Click Confirm.

Verification

  • The configured identity providers are visible on the Access control tab of the Cluster details page.

5.1. Identity management options for OpenShift Data Science

Red Hat OpenShift Data Science supports the same authentication systems as Red Hat OpenShift Dedicated and Red Hat OpenShift Service on Amazon Web Services (ROSA).

Check the appropriate documentation for your cluster for more information.

Chapter 6. Adding administrative users for OpenShift Dedicated

Before you can install and configure OpenShift Data Science for your data scientist users, you must define administrative users. Only administrative users can install and configure OpenShift Data Science.

Prerequisites

  • You have credentials for OpenShift Cluster Manager (https://console.redhat.com/openshift/).

Procedure

  1. Log in to OpenShift Cluster Manager (https://console.redhat.com/openshift/).
  2. Click Clusters. The Clusters page opens.
  3. Click the name of the cluster to configure.
  4. Click the Access control tab.
  5. Click Cluster Roles and Access.
  6. Under Cluster administrative users, click the Add user button.

    The Add cluster user popover appears.

  7. Enter the user name in the User ID field.
  8. Select an appropriate Group for the user.

    Important

    If this user needs to use existing groups in an identity provider to control OpenShift Data Science access, select cluster-admins.

    Check Cluster administration in the OpenShift Dedicated documentation for more information about these user types.

  9. Click Add user.

Verification

  • The user name and selected group are visible in the list of Cluster administrative users.

Chapter 7. Installing OpenShift Data Science on OpenShift Dedicated

You can install Red Hat OpenShift Data Science as an Add-on to your Red Hat OpenShift Dedicated cluster using Red Hat OpenShift Cluster Manager.

Prerequisites

  • Purchase entitlements for OpenShift Data Science.
  • Credentials for OpenShift Cluster Manager (https://console.redhat.com/openshift/).
  • Administrator access to the OpenShift Dedicated cluster.

Procedure

  1. Log in to OpenShift Cluster Manager (https://console.redhat.com/openshift/).
  2. Click Clusters.

    The Clusters page opens.

  3. Click the name of the cluster you want to install OpenShift Data Science on.

    The Details page for the cluster opens.

  4. Click the Add-ons tab and locate the Red Hat OpenShift Data Science card.
  5. Click Install. The Configure Red Hat OpenShift Data Science pane appears.
  6. In the Notification email field, enter any email addresses that you want to receive important alerts about the state of Red Hat OpenShift Data Science, such as outage alerts.
  7. Click Install.

Verification

  • In OpenShift Cluster Manager, under the Add-ons tab for the cluster, confirm that the OpenShift Data Science card shows one of the following states:

    • Installing - installation is in progress; wait for this to change to Installed. This takes around 30 minutes.
    • Installed - installation is complete; verify that the View in console button is visible.
  • In OpenShift Dedicated, click Home → Projects and confirm that the following project namespaces are visible and listed as Active:

    • redhat-ods-applications
    • redhat-ods-monitoring
    • redhat-ods-operator
    • rhods-notebooks
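You can also confirm the project namespaces from the OpenShift CLI (oc). A sketch that filters the oc get projects output down to the Add-on's namespaces and their status; every printed line should end with Active:

```shell
# Sketch: list the OpenShift Data Science namespaces and their status.
oc get projects --no-headers \
  | awk '$1 ~ /^(redhat-ods-|rhods-notebooks)/ { print $1, $NF }'
```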

Chapter 8. Enabling GPU support in OpenShift Data Science

To ensure that your data scientists can use compute-heavy workloads in their models, you can enable graphics processing units (GPUs) in OpenShift Data Science.

To make GPUs available in OpenShift Data Science, after you install OpenShift Data Science, you must install the NVIDIA GPU Add-on. This add-on locates and enables any GPU-enabled worker nodes in your cluster, making GPU instance types available for selection. After you have installed the NVIDIA GPU Add-on, and you have ensured there are GPU-enabled worker nodes in your cluster, your data scientists can select one of the GPU-enabled notebooks in Jupyter, along with the number of GPUs they require for their data science work.

Red Hat recommends that you use a separate machine pool for GPU nodes that have the nvidia.com/gpu NoSchedule taint.

Prerequisites

  • You have credentials for OpenShift Cluster Manager (https://console.redhat.com/openshift/).
  • You are part of the cluster-admins user group in OpenShift Dedicated.
  • You have provisioned a cluster that contains enough resources to satisfy the requirements of OpenShift Data Science and the NVIDIA GPU Add-on.
  • You have installed and logged in to Red Hat OpenShift Data Science.
  • You have installed and logged in to the OpenShift CLI (oc).

Procedure

  1. Navigate to your cluster on OpenShift Cluster Manager.

    1. Log in to OpenShift Cluster Manager (https://console.redhat.com/openshift/).
    2. Click Clusters.

      The Clusters page opens.

    3. Click the name of the cluster that you have installed OpenShift Data Science on.

      The Details page for the cluster opens.

  2. Add a machine pool for nodes with GPUs.

    1. Click the Machine pools tab.
    2. Click the Add machine pool button.

      The Add machine pool window opens.

    3. Specify a Machine pool name.
    4. Set a Compute node instance type. Ensure that the instance type provides one or more GPUs.
    5. Set a Compute node count of at least one.
    6. Click Edit node labels and taints to expand the Node labels section.
    7. Under Taints, add a taint with the Key of nvidia.com/gpu and an Effect of NoSchedule. The Value can be set to any string, for example, true.

      Note

      When setting the taint, ensure that it is declared correctly, without typographical errors.

    8. Click Add machine pool.

      Your machine pool is created.

    9. Confirm that the Taint you specified is visible on the Details page for the machine pool, for example, nvidia.com/gpu=true:NoSchedule.
  3. Install the NVIDIA GPU Operator.

    1. Click the Add-ons tab.
    2. Click the NVIDIA GPU Operator card.
    3. Click Install.

Verification

  • In OpenShift Cluster Manager, under the Add-ons tab for the cluster, confirm that the NVIDIA GPU Operator is installed.
  • In the OpenShift Dedicated web console, under Compute → Nodes, confirm that each node in the new machine pool has the nvidia.com/gpu taint set, for example, nvidia.com/gpu=true:NoSchedule.
  • Check that GPU-enabled functionality is available in Red Hat OpenShift Data Science.

    • Check and validate the nvidia-device-plugin-validator logs. At the OpenShift CLI, enter the following command:

      oc logs nvidia-device-plugin-validator-<alpha-numeric-string> -n redhat-gpu-operator

      Where <alpha-numeric-string> is a randomly generated alphanumeric string.

      If the validation is successful, the following response is returned:

      device-plugin validation is successful
    • Red Hat recommends that you run a sample GPU application to ensure GPU-enabled models can successfully run on Red Hat OpenShift Data Science. For more information, see Running a sample GPU application.
    • Run the nvidia-smi command within the relevant pod to test the GPU utilization of your sample project. For more information, see Getting information about the GPU.
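The taint check in the verification steps above can also be done in one pass from the OpenShift CLI. A sketch: print each node with its taint keys, then keep only GPU-tainted nodes; every node in the GPU machine pool should appear in the output:

```shell
# Sketch: list nodes and their taint keys, filtered to the nvidia.com/gpu taint.
oc get nodes \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints[*].key}{"\n"}{end}' \
  | grep 'nvidia.com/gpu'
```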

Chapter 9. Sharing the instance address with users

After you have added users to Red Hat OpenShift Data Science, share the instance address with those users to let them log in and work on their data models.

Prerequisites

  • You have installed OpenShift Data Science on your OpenShift Dedicated cluster.
  • You have added at least one user to the user group for OpenShift Data Science.

Procedure

  1. Log in to the OpenShift Dedicated web console.
  2. Click the application launcher.
  3. Right-click Red Hat OpenShift Data Science and copy the URL for your OpenShift Data Science instance.
  4. Provide this instance URL to your data scientists to let them log in to OpenShift Data Science.

Verification

  • Confirm that you and your users can log in to OpenShift Data Science using the instance URL.
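If you prefer the CLI, the instance URL can be read from the dashboard's route. This is a sketch; the route name rhods-dashboard is an assumption, so verify it first with oc get routes -n redhat-ods-applications:

```shell
# Sketch: build the instance URL from the dashboard route host. The route name
# rhods-dashboard is an assumption; confirm it with:
#   oc get routes -n redhat-ods-applications
host="$(oc get route rhods-dashboard -n redhat-ods-applications -o jsonpath='{.spec.host}')"
echo "https://${host}"
```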

Chapter 10. Troubleshooting common installation problems

If you are experiencing difficulties installing the Red Hat OpenShift Data Science Add-on, read this section to understand what could be causing the problem and how to resolve it.

If you cannot see the problem here or in the release notes, contact Red Hat Support.

10.1. The OpenShift Data Science operator cannot be retrieved from the image registry

Problem

When attempting to retrieve the OpenShift Data Science operator from the image registry, a Failure to pull from quay error message appears. The OpenShift Data Science operator might be unavailable for retrieval in the following circumstances:

  • The image registry is unavailable.
  • There is a problem with your network connection.
  • Your cluster is not operational and is therefore unable to retrieve the image registry.

Diagnosis

Check the logs in the Events section in OpenShift Dedicated for further information about the Failure to pull from quay error message.
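As an alternative to the console Events section, you can look for image pull failures in the operator namespace from the CLI. A sketch:

```shell
# Sketch: surface pull-related warning events in the operator namespace.
oc get events -n redhat-ods-operator | grep -iE 'failed|pull'
```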

Resolution

  • To resolve this issue, contact Red Hat support.

10.2. OpenShift Data Science cannot be installed due to insufficient cluster resources

Problem

When attempting to install OpenShift Data Science, an error message appears stating that installation prerequisites have not been met.

Diagnosis

  1. Log in to OpenShift Cluster Manager (https://console.redhat.com/openshift/).
  2. Click Clusters.

    The Clusters page opens.

  3. Click the name of the cluster you want to install OpenShift Data Science on.

    The Details page for the cluster opens.

  4. Click the Add-ons tab and locate the Red Hat OpenShift Data Science card.
  5. Click Install. The Configure Red Hat OpenShift Data Science pane appears.
  6. If the installation fails, click the Prerequisites tab.
  7. Note down the error message. If the error message states that you require a new machine pool, or that more resources are required, take the appropriate action to resolve the problem.

Resolution

  • Add a machine pool, or modify the default machine pool, so that a single machine pool in the cluster meets the minimum resource requirements. For more information, see Requirements for OpenShift Data Science.

10.3. The dedicated-admins Role-based access control (RBAC) policy cannot be created

Problem

The Role-based access control (RBAC) policy for the dedicated-admins group in the target project cannot be created. This issue occurs in unknown circumstances.

Diagnosis

  1. In the OpenShift Dedicated web console, change into the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-deployer from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the RBAC policy for dedicated admins group in $target_project failed. error message.
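The console steps above can also be performed from the OpenShift CLI. A sketch, assuming the operator Deployment is named rhods-operator (the pod name shown above is this name plus a random suffix):

```shell
# Sketch: surface ERROR lines from the rhods-deployer container without the
# web console. The Deployment name rhods-operator is an assumption based on
# the rhods-operator-<random string> pod name.
oc logs deployment/rhods-operator -c rhods-deployer -n redhat-ods-operator | grep 'ERROR:'
```

The same command applies to the other troubleshooting sections in this chapter; only the ERROR message you look for changes.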

Resolution

  • Contact Red Hat support.

10.4. OpenShift Data Science does not install on unsupported infrastructure

Problem

You are deploying on an environment that is not documented as being supported by the RHODS operator.

Diagnosis

  1. In the OpenShift Dedicated web console, change into the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-deployer from the drop-down list.
  7. Check the log for the ERROR: Deploying on $infrastructure, which is not supported. Failing Installation error message.

Resolution

Before proceeding with a new installation, ensure that you have a fully supported environment on which to install OpenShift Data Science. For more information, see Requirements for OpenShift Data Science.

10.5. The creation of the OpenShift Data Science Custom Resource (CR) fails

Problem

During the installation process, the OpenShift Data Science Custom Resource (CR) does not get created. This issue occurs in unknown circumstances.

Diagnosis

  1. In the OpenShift Dedicated web console, change into the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-deployer from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the ODH CR failed. error message.

Resolution

Contact Red Hat support.

10.6. The creation of the OpenShift Data Science Notebooks Custom Resource (CR) fails

Problem

During the installation process, the OpenShift Data Science Notebooks Custom Resource (CR) does not get created. This issue occurs in unknown circumstances.

Diagnosis

  1. In the OpenShift Dedicated web console, change into the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-deployer from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the RHODS Notebooks CR failed. error message.

Resolution

Contact Red Hat support.

10.7. The Dead Man’s Snitch operator’s secret does not get created

Problem

An issue with the Managed Tenants SRE automation process causes the Dead Man’s Snitch operator’s secret to not be created.

Diagnosis

  1. In the OpenShift Dedicated web console, change into the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-deployer from the drop-down list.
  7. Check the log for the ERROR: Dead Man Snitch secret does not exist. error message.

Resolution

Contact Red Hat support.

10.8. The PagerDuty secret does not get created

Problem

An issue with the Managed Tenants SRE automation process causes the PagerDuty secret to not be created.

Diagnosis

  1. In the OpenShift Dedicated web console, change into the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-deployer from the drop-down list.
  7. Check the log for the ERROR: Pagerduty secret does not exist error message.

Resolution

Contact Red Hat support.

10.9. The SMTP secret does not exist

Problem

An issue with the Managed Tenants SRE automation process causes the SMTP secret to not be created.

Diagnosis

  1. In the OpenShift Dedicated web console, change into the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-deployer from the drop-down list.
  7. Check the log for the ERROR: SMTP secret does not exist error message.

Resolution

Contact Red Hat support.

10.10. The ODH parameter secret does not get created

Problem

An issue with the OpenShift Data Science Add-on’s installation flow could result in the ODH parameter secret not being created.

Diagnosis

  1. In the OpenShift Dedicated web console, change into the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-deployer from the drop-down list.
  7. Check the log for the ERROR: Addon managed odh parameter secret does not exist. error message.

Resolution

Contact Red Hat support.

Chapter 11. Additional resources

Legal Notice

Copyright © 2023 Red Hat, Inc.
The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, the Red Hat logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.