Chapter 3. Working on data science projects

As a data scientist, you can organize your data science work into a single project. A data science project in OpenShift AI can consist of the following components:

Workbenches
Creating a workbench allows you to add a Jupyter notebook to your project.
Cluster storage
For data science projects that require data to be retained, you can add cluster storage to the project.
Data connections
Adding a data connection to your project allows you to connect data inputs to your workbenches.
Pipelines
Standardize and automate machine learning workflows to enable you to further enhance and deploy your data science models.
Models and model servers
Deploy a trained data science model to serve intelligent applications. Your model is deployed with an endpoint that allows applications to send requests to the model.
Important

If you create an OpenShift project outside of the OpenShift AI user interface, the project is not shown on the Data Science Projects page. In addition, you cannot use features exclusive to OpenShift AI, such as workbenches and model serving, with a standard OpenShift project.

To classify your OpenShift project as a data science project, and to make available features exclusive to OpenShift AI, you must add the label opendatahub.io/dashboard: 'true' to the project namespace. After you add this label, your project is subsequently shown on the Data Science Projects page.
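As a minimal sketch of the labeling step described above, the following Python snippet builds a merge-patch body that adds the label to a namespace. The helper name `dashboard_label_patch` is hypothetical. Only the label key and value come from this section, and applying the patch to a cluster (for example, through a Kubernetes API client) is intentionally left out.

```python
import json

# Label key and value taken from the note above.
DASHBOARD_LABEL_KEY = "opendatahub.io/dashboard"
DASHBOARD_LABEL_VALUE = "true"


def dashboard_label_patch() -> dict:
    """Return a merge-patch body that adds the dashboard label to a namespace.

    Hypothetical helper: it only builds the payload and never contacts a cluster.
    """
    return {"metadata": {"labels": {DASHBOARD_LABEL_KEY: DASHBOARD_LABEL_VALUE}}}


if __name__ == "__main__":
    # Print the payload so that it can be reviewed before it is applied.
    print(json.dumps(dashboard_label_patch(), indent=2))
```

The label value is the string 'true', not a Boolean, because Kubernetes label values are always strings.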

3.1. Using data science projects

3.1.1. Creating a data science project

To start your data science work, create a data science project. Creating a project helps you organize your work in one place. You can also enhance your data science project by adding the following functionality:

  • Workbenches
  • Cluster storage
  • Data connections
  • Data science pipelines
  • Model servers

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click Create data science project.

    The Create a data science project dialog opens.

  3. Enter a name for your data science project.
  4. Optional: Edit the resource name for your data science project. The resource name must consist of lowercase alphanumeric characters and hyphens (-), and must start and end with an alphanumeric character.
  5. Enter a description for your data science project.
  6. Click Create.

    A project details page opens. From this page, you can create workbenches, add cluster storage and data connections, import pipelines, and deploy models.

Verification

  • The project that you created is displayed on the Data Science Projects page.
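
The resource name rule in step 4 of the procedure can be checked before you submit the dialog. The following Python sketch is one way to express that rule. The function name is hypothetical, and the check covers only the character rules stated above; it does not enforce any length limit that the cluster might apply.

```python
import re

# Lowercase alphanumeric characters and "-", starting and ending with an
# alphanumeric character, as described in the procedure above.
_RESOURCE_NAME_RE = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_valid_resource_name(name: str) -> bool:
    """Return True if the name satisfies the character rules for a resource name."""
    return _RESOURCE_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("fraud-detection", "Fraud-Detection", "-fraud", "fraud-"):
        print(candidate, is_valid_resource_name(candidate))
```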

3.1.2. Updating a data science project

You can update your data science project’s details by changing your project’s name and description text.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the action menu (⋮) beside the project whose details you want to update and click Edit project.

    The Edit data science project dialog opens.

  3. Optional: Update the name for your data science project.
  4. Optional: Update the description for your data science project.
  5. Click Update.

Verification

  • The data science project that you updated is displayed on the Data Science Projects page.

3.1.3. Deleting a data science project

You can delete data science projects so that they do not appear on the OpenShift AI Data Science Projects page when you no longer want to use them.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the action menu (⋮) beside the project that you want to delete and then click Delete project.

    The Delete project dialog opens.

  3. Enter the project name in the text field to confirm that you intend to delete it.
  4. Click Delete project.

Verification

  • The data science project that you deleted is no longer displayed on the Data Science Projects page.
  • Deleting a data science project deletes any associated workbenches, data science pipelines, cluster storage, and data connections. This data is permanently deleted and is not recoverable.

3.2. Using project workbenches

3.2.1. Creating a project workbench

To examine and work with models in an isolated area, you can create a workbench. You can use this workbench to create a Jupyter notebook from an existing notebook container image to access its resources and properties. For data science projects that require data retention, you can add container storage to the workbench you are creating. If you require extra power for use with large datasets, you can assign accelerators to your workbench to optimize performance.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you use specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project that you can add a workbench to.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project that you want to add the workbench to.

    A project details page opens.

  3. Click the Workbenches tab.
  4. Click Create workbench.

    The Create workbench page opens.

  5. Configure the properties of the workbench you are creating.

    1. In the Name field, enter a name for your workbench.
    2. Optional: In the Description field, enter a description to define your workbench.
    3. In the Notebook image section, complete the fields to specify the notebook image to use with your workbench.

      1. From the Image selection list, select a notebook image.
    4. In the Deployment size section, specify the size of your deployment instance.

      1. From the Container size list, select a container size for your server.
      2. Optional: From the Accelerator list, select an accelerator.
      3. If you selected an accelerator in the preceding step, specify the number of accelerators to use.
    5. Optional: Select and specify values for any new environment variables.
  6. Configure the storage for your OpenShift AI cluster by selecting one of the following options:

    • Select Create new persistent storage to create storage that is retained after you log out of OpenShift AI. Complete the relevant fields to define the storage.
    • Select Use existing persistent storage to reuse existing storage and select the storage from the Persistent storage list.
  7. To use a data connection, in the Data connections section, select the Use a data connection checkbox.

    • Create a new data connection as follows:

      1. Select Create new data connection.
      2. In the Name field, enter a unique name for the data connection.
      3. In the Access key field, enter the access key ID for the S3-compatible object storage provider.
      4. In the Secret key field, enter the secret access key for the S3-compatible object storage account that you specified.
      5. In the Endpoint field, enter the endpoint of your S3-compatible object storage bucket.
      6. In the Region field, enter the default region of your S3-compatible object storage account.
      7. In the Bucket field, enter the name of your S3-compatible object storage bucket.
    • Use an existing data connection as follows:

      1. Select Use existing data connection.
      2. From the Data connection list, select a data connection that you previously defined.
  8. Click Create workbench.

Verification

  • The workbench that you created appears on the Workbenches tab for the project.
  • Any cluster storage that you associated with the workbench during the creation process appears on the Cluster storage tab for the project.
  • The Status column on the Workbenches tab displays a status of Starting when the workbench server is starting, and Running when the workbench has successfully started.

3.2.2. Starting a workbench

You can manually start a data science project’s workbench from the Workbenches tab on the project details page. By default, workbenches start immediately after you create them.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project that contains a workbench.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project whose workbench you want to start.

    A project details page opens.

  3. Click the Workbenches tab.
  4. Click the toggle in the Status column for the relevant workbench to start a workbench that is not running.

    The status of the workbench that you started changes from Stopped to Running. After the workbench has started, click Open to open the workbench’s notebook.

Verification

  • The workbench that you started appears on the Workbenches tab for the project, with the status of Running.

3.2.3. Updating a project workbench

If your data science work requires you to change your workbench’s notebook image, container size, or identifying information, you can update the properties of your project’s workbench. If you require extra power for use with large datasets, you can assign accelerators to your workbench to optimize performance.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you use specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project that has a workbench.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project whose workbench you want to update.

    A project details page opens.

  3. Click the Workbenches tab.
  4. Click the action menu (⋮) beside the workbench that you want to update and then click Edit workbench.

    The Edit workbench page opens.

  5. Update any of the workbench properties and then click Update workbench.

Verification

  • The workbench that you updated appears on the Workbenches tab for the project.

3.2.4. Deleting a workbench from a data science project

You can delete workbenches from your data science projects to help you remove Jupyter notebooks that are no longer relevant to your work.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project with a workbench.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project that you want to delete the workbench from.

    A project details page opens.

  3. Click the Workbenches tab.
  4. Click the action menu (⋮) beside the workbench that you want to delete and then click Delete workbench.

    The Delete workbench dialog opens.

  5. Enter the name of the workbench in the text field to confirm that you intend to delete it.
  6. Click Delete workbench.

Verification

  • The workbench that you deleted is no longer displayed in the Workbenches tab for the project.
  • The custom resource (CR) associated with the workbench’s Jupyter notebook is deleted.

3.3. Using data connections

3.3.1. Adding a data connection to your data science project

You can enhance your data science project by adding a connection to a data source. When you want to work with very large data sets, you can store your data in an S3-compatible object storage bucket so that you do not fill up your local storage. You can also associate the data connection with an existing workbench that does not already have a connection.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project that you can add a data connection to.
  • You have access to S3-compatible object storage.
  • If you intend to add the data connection to an existing workbench, you have saved any data in the workbench to avoid losing work.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project that you want to add a data connection to.

    A project details page opens.

  3. Click the Data connections tab.
  4. Click Add data connection.

    The Add data connection dialog opens.

  5. Enter a name for the data connection.
  6. In the Access key field, enter the access key ID for your S3-compatible object storage provider.
  7. In the Secret key field, enter the secret access key for the S3-compatible object storage account you specified.
  8. In the Endpoint field, enter the endpoint of your S3-compatible object storage bucket.
  9. In the Region field, enter the default region of your S3-compatible object storage account.
  10. In the Bucket field, enter the name of your S3-compatible object storage bucket.
  11. Optional: From the Connected workbench list, select a workbench to connect.
  12. Click Add data connection.

Verification

  • The data connection that you added appears in the Data connections tab for the project.
  • If you selected a workbench, the workbench is visible in the Connected workbenches column in the Data connections tab for the project.

3.3.2. Deleting a data connection

You can delete data connections from your data science projects to help you remove connections that are no longer relevant to your work.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project with a data connection.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project that you want to delete the data connection from.

    A project details page opens.

  3. Click the Data connections tab.
  4. Click the action menu (⋮) beside the data connection that you want to delete and then click Delete data connection.

    The Delete data connection dialog opens.

  5. Enter the name of the data connection in the text field to confirm that you intend to delete it.
  6. Click Delete data connection.

Verification

  • The data connection that you deleted is no longer displayed in the Data connections tab for the project.

3.3.3. Updating a connected data source

To use an existing data source with a different workbench, you can change the data source that is connected to your project’s workbench.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project, created a workbench, and you have defined a data connection.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project whose data source you want to change.

    A project details page opens.

  3. Click the Data connections tab.
  4. Click the action menu (⋮) beside the data source that you want to change and then click Edit data connection.

    The Edit data connection dialog opens.

  5. In the Connected workbench section, select an existing workbench from the list.
  6. Click Update data connection.

Verification

  • The updated data connection is displayed in the Data connections tab for the project.
  • You can access your S3 data source using environment variables in the connected workbench.
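
As a sketch of the last verification point, the following Python snippet collects the data connection settings from environment variables inside a workbench. The variable names in `FIELD_TO_ENV_VAR` are assumptions that mirror the fields of the data connection dialog and are not confirmed by this section, so check the names that are actually set in your workbench. The function names are hypothetical.

```python
import os
from typing import Mapping, Optional

# Assumed variable names, one per data connection field.
# Verify them in your own workbench before relying on them.
FIELD_TO_ENV_VAR = {
    "access_key": "AWS_ACCESS_KEY_ID",
    "secret_key": "AWS_SECRET_ACCESS_KEY",
    "endpoint": "AWS_S3_ENDPOINT",
    "region": "AWS_DEFAULT_REGION",
    "bucket": "AWS_S3_BUCKET",
}


def read_data_connection(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the connection settings, using None for unset or empty variables."""
    env = os.environ if env is None else env
    return {field: env.get(var) or None for field, var in FIELD_TO_ENV_VAR.items()}


def missing_fields(settings: Mapping[str, Optional[str]]) -> list:
    """List the fields that have no value, in the order of FIELD_TO_ENV_VAR."""
    return [field for field, value in settings.items() if value is None]


if __name__ == "__main__":
    settings = read_data_connection()
    # Report only which fields are missing so that no secret is printed.
    print("Missing fields:", missing_fields(settings))
```

You can pass the resulting values to the S3 client library of your choice. Passing a plain dictionary as `env` makes the helper easy to exercise outside a workbench.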

3.4. Configuring cluster storage

3.4.1. Adding cluster storage to your data science project

For data science projects that require data to be retained, you can add cluster storage to the project. You can also connect cluster storage to a specific project’s workbench.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project that you can add cluster storage to.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project that you want to add the cluster storage to.

    A project details page opens.

  3. Click the Cluster storage tab.
  4. Click Add cluster storage.

    The Add storage dialog opens.

  5. Enter a name for the cluster storage.
  6. Enter a description for the cluster storage.
  7. Under Persistent storage size, enter a new size in gibibytes. The minimum size is 1 GiB, and the maximum size is 16384 GiB.
  8. Optional: Select a workbench from the list to connect the cluster storage to an existing workbench.
  9. If you selected a workbench to connect the storage to, enter the storage directory in the Mount folder field.
  10. Click Add storage.

Verification

  • The cluster storage that you added appears in the Cluster storage tab for the project.
  • A new persistent volume claim (PVC) is created with the storage size that you defined.
  • The persistent volume claim (PVC) is visible as an attached storage in the Workbenches tab for the project.
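
The size limits in step 7 of the procedure can be expressed as a small check. The following Python sketch validates a requested size against the documented minimum and maximum and converts it to bytes. The function name is hypothetical, and the sketch assumes that sizes are whole numbers of gibibytes.

```python
MIN_SIZE_GIB = 1        # minimum size stated in the procedure
MAX_SIZE_GIB = 16384    # maximum size stated in the procedure
BYTES_PER_GIB = 1024 ** 3


def storage_size_bytes(size_gib: int) -> int:
    """Validate a persistent storage size in GiB and return it in bytes."""
    if not MIN_SIZE_GIB <= size_gib <= MAX_SIZE_GIB:
        raise ValueError(
            f"size must be between {MIN_SIZE_GIB} and {MAX_SIZE_GIB} GiB, got {size_gib}"
        )
    return size_gib * BYTES_PER_GIB


if __name__ == "__main__":
    print(storage_size_bytes(20))
```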

3.4.2. Updating cluster storage

If your data science work requires you to change the identifying information of a project’s cluster storage or the workbench that the storage is connected to, you can update your project’s cluster storage to change these properties.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project that contains cluster storage.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project whose storage you want to update.

    A project details page opens.

  3. Click the Cluster storage tab.
  4. Click the action menu (⋮) beside the storage that you want to update and then click Edit storage.

    The Edit storage page opens.

  5. Update the storage’s properties.

    1. Update the name for the storage, if applicable.
    2. Update the description for the storage, if applicable.
    3. Increase the Persistent storage size for the storage, if applicable.

      Note that you can only increase the storage size. Updating the storage size restarts the workbench and makes it unavailable for a period of time that is usually proportional to the size change.

    4. Update the workbench that the storage is connected to, if applicable.
    5. If you selected a new workbench to connect the storage to, enter the storage directory in the Mount folder field.
  6. Click Update storage.

If you increased the storage size, the workbench restarts and is unavailable for a period of time that is usually proportional to the size change.

Verification

  • The storage that you updated appears in the Cluster storage tab for the project.

3.4.3. Deleting cluster storage from a data science project

You can delete cluster storage from your data science projects to help you free up resources and delete unwanted storage space.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project with cluster storage.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project that you want to delete the storage from.

    A project details page opens.

  3. Click the Cluster storage tab.
  4. Click the action menu (⋮) beside the storage that you want to delete and then click Delete storage.

    The Delete storage dialog opens.

  5. Enter the name of the storage in the text field to confirm that you intend to delete it.
  6. Click Delete storage.

Verification

  • The storage that you deleted is no longer displayed in the Cluster storage tab for the project.
  • The persistent volume (PV) and persistent volume claim (PVC) associated with the cluster storage are both permanently deleted. This data is not recoverable.

3.5. Configuring data science pipelines

3.5.1. Configuring a pipeline server

Before you can successfully create a pipeline in OpenShift AI, you must configure a pipeline server. This task includes configuring where your pipeline artifacts and data are stored.

Note

You are not required to specify any storage directories when configuring a data connection for your pipeline server. When you import a pipeline, the /pipelines folder is created in the root folder of the bucket, containing a YAML file for the pipeline. If you upload a new version of the same pipeline, a new YAML file with a different ID is added to the /pipelines folder.

When you run a pipeline, the artifacts are stored in the /pipeline-name folder in the root folder of the bucket.

Important

If you use an external MySQL database and upgrade to OpenShift AI with DSP 2.0, the database is migrated to DSP 2.0 format, making it incompatible with earlier versions of OpenShift AI.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project that you can add a pipeline server to.
  • You have an existing S3-compatible object storage bucket and you have configured write access to your S3 bucket on your storage account.
  • If you are configuring a pipeline server with an external MySQL database, your database must use MySQL version 5.x.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project that you want to configure a pipeline server for.

    A project details page opens.

  3. Click the Pipelines tab.
  4. Click Configure pipeline server.

    The Configure pipeline server dialog appears.

  5. In the Object storage connection section, provide values for the mandatory fields:

    1. In the Access key field, enter the access key ID for the S3-compatible object storage provider.
    2. In the Secret key field, enter the secret access key for the S3-compatible object storage account that you specified.
    3. In the Endpoint field, enter the endpoint of your S3-compatible object storage bucket.
    4. In the Region field, enter the default region of your S3-compatible object storage account.
    5. In the Bucket field, enter the name of your S3-compatible object storage bucket.

      Important

      If you specify incorrect data connection settings, you cannot update these settings on the same pipeline server. Therefore, you must delete the pipeline server and configure another one.

  6. In the Database section, click Show advanced database options to specify the database to store your pipeline data and select one of the following sets of actions:

    • Select Use default database stored on your cluster to deploy a MariaDB database in your project.
    • Select Connect to external MySQL database to add a new connection to an external database that your pipeline server can access.

      1. In the Host field, enter the database’s host name.
      2. In the Port field, enter the database’s port.
      3. In the Username field, enter the default user name that is connected to the database.
      4. In the Password field, enter the password for the default user account.
      5. In the Database field, enter the database name.
  7. Click Configure pipeline server.

Verification

In the Pipelines tab for the project:

  • The Import pipeline button is available.
  • When you click the action menu (⋮) and then click View pipeline server configuration, the pipeline server details are displayed.

3.5.2. Defining a pipeline

The Kubeflow Pipelines SDK enables you to define end-to-end machine learning and data pipelines. Use the latest Kubeflow Pipelines 2.0 SDK to build your data science pipeline in Python code. After you have built your pipeline, use the SDK to compile it into an Intermediate Representation (IR) YAML file. After defining the pipeline, you can import the YAML file to the OpenShift AI dashboard to enable you to configure its execution settings.

You can also use the Elyra JupyterLab extension to create and run data science pipelines within JupyterLab. For more information about creating pipelines in JupyterLab, see Working with pipelines in JupyterLab. For more information about the Elyra JupyterLab extension, see Elyra Documentation.

3.5.3. Importing a data science pipeline

To help you begin working with data science pipelines in OpenShift AI, you can import a YAML file containing your pipeline’s code to an active pipeline server, or you can import the YAML file from a URL. This file contains a Kubeflow pipeline compiled by using the Kubeflow compiler. After you have imported the pipeline to a pipeline server, you can execute the pipeline by creating a pipeline run.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have previously created a data science project that is available and contains a configured pipeline server.
  • You have compiled your pipeline with the Kubeflow compiler and you have access to the resulting YAML file.
  • If you are uploading your pipeline from a URL, the URL is publicly accessible.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Pipelines → Pipelines.
  2. On the Pipelines page, select the project that you want to import a pipeline to.
  3. Click Import pipeline.
  4. In the Import pipeline dialog, enter the details for the pipeline that you are importing.

    1. In the Pipeline name field, enter a name for the pipeline that you are importing.
    2. In the Pipeline description field, enter a description for the pipeline that you are importing.
    3. Select where you want to import your pipeline from by performing one of the following actions:

      • Select Upload a file to upload your pipeline from your local machine’s file system. Import your pipeline by clicking upload or by dragging and dropping a file.
      • Select Import by url to upload your pipeline from a URL and then enter the URL into the text box.
    4. Click Import pipeline.

Verification

  • The pipeline that you imported appears on the Pipelines page and on the Pipelines tab on the project details page.

For more information about using pipelines in OpenShift AI, see Working with data science pipelines.

3.6. Configuring access to data science projects

3.6.1. Configuring access to data science projects

To enable you to work collaboratively on your data science projects with other users, you can share access to your project. After creating your project, you can then set the appropriate access permissions from the OpenShift AI user interface.

You can assign the following access permission levels to your data science projects:

  • Admin - Users can modify all areas of a project, including its details (project name and description), components, and access permissions.
  • Edit - Users can modify a project’s components, such as its workbench, but they cannot edit a project’s access permissions or its details (project name and description).
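
The two permission levels above can be summarized as a lookup table. The following Python sketch only illustrates the rules as described in this section; it is not how OpenShift AI enforces access, and the enum and function names are hypothetical.

```python
from enum import Enum


class Permission(Enum):
    ADMIN = "Admin"
    EDIT = "Edit"


class Action(Enum):
    EDIT_DETAILS = "edit project name and description"
    EDIT_COMPONENTS = "modify project components"
    MANAGE_ACCESS = "edit access permissions"


# Admin can perform every action; Edit is limited to project components.
ALLOWED_ACTIONS = {
    Permission.ADMIN: frozenset(Action),
    Permission.EDIT: frozenset({Action.EDIT_COMPONENTS}),
}


def is_allowed(permission: Permission, action: Action) -> bool:
    """Return True if the permission level allows the action."""
    return action in ALLOWED_ACTIONS[permission]


if __name__ == "__main__":
    for permission in Permission:
        allowed = [action.value for action in Action if is_allowed(permission, action)]
        print(f"{permission.value}: {', '.join(allowed)}")
```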

3.6.2. Sharing access to a data science project

To enable your organization to work collaboratively, you can share access to your data science project with other users and groups.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. From the list of data science projects, click the name of the data science project that you want to share access to.

    A project details page opens.

  3. Click the Permissions tab.

    The Permissions page for the project opens.

  4. Provide one or more users with access to the project.

    1. In the Users section, click Add user.
    2. In the Name field, enter the user name of the user whom you want to provide access to the project.
    3. From the Permissions list, select one of the following access permission levels:

      • Admin: Users with this access level can edit project details and manage access to the project.
      • Edit: Users with this access level can view and edit project components, such as its workbenches, data connections, and storage.
    4. To confirm your entry, click the Confirm icon.
    5. Optional: To add an additional user, click Add user and repeat the process.
  5. Provide one or more OpenShift groups with access to the project.

    1. In the Groups section, click Add group.
    2. From the Name list, select a group to provide access to the project.

      Note

      If you do not have cluster-admin or dedicated-admin permissions, the Name list is not visible. Instead, an input field is displayed that enables you to configure group permissions.

    3. From the Permissions list, select one of the following access permission levels:

      • Admin: Groups with this access permission level can edit project details and manage access to the project.
      • Edit: Groups with this access permission level can view and edit project components, such as its workbenches, data connections, and storage.
    4. To confirm your entry, click the Confirm icon.
    5. Optional: To add an additional group, click Add group and repeat the process.

Verification

  • Users to whom you provided access to the project can perform only the actions permitted by their access permission level.
  • The Users and Groups sections on the Permissions tab show the respective users and groups that you provided with access to the project.

3.6.3. Updating access to a data science project

To change the level of collaboration on your data science project, you can update the access permissions of users and groups who have access to your project.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project.
  • You have previously shared access to your project with other users or groups.
  • You have administrator permissions or you are the project owner.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project whose access permissions you want to change.

    A project details page opens.

  3. Click the Permissions tab.

    The Permissions page for the project opens.

  4. Update the user access permissions to the project.

    1. In the Name field, update the user name of the user to whom you want to provide access to the project.
    2. From the Permissions list, update the user access permissions by selecting one of the following:

      • Admin: Users with this access level can edit project details and manage access to the project.
      • Edit: Users with this access level can view and edit project components, such as its workbenches, data connections, and storage.
    3. To confirm the update to the entry, click Confirm (the Confirm icon).
  5. Update the OpenShift group access permissions for the project.

    1. From the Name list, update the group that has access to the project by selecting another group from the list.

      Note

      If you do not have cluster-admin or dedicated-admin permissions, the Name list is not visible. Instead, an input field is displayed, enabling you to configure group permissions.

    2. From the Permissions list, update the group access permissions by selecting one of the following:

      • Admin: Groups with this access permission level can edit project details and manage access to the project.
      • Edit: Groups with this access permission level can view and edit project components, such as its workbenches, data connections, and storage.
    3. To confirm the update to the entry, click Confirm (the Confirm icon).

Verification

  • The Users and Groups sections on the Permissions tab show the respective users and groups whose project access permissions you changed.

3.6.4. Removing access to a data science project

If you no longer want to work collaboratively on your data science project, you can restrict access to your project by removing users and groups that you previously provided with access to your project.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you are using specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You have created a data science project.
  • You have previously shared access to your project with other users or groups.
  • You have administrator permissions or you are the project owner.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project whose access permissions you want to change.

    A project details page opens.

  3. Click the Permissions tab.

    The Permissions page for the project opens.

  4. Click the action menu (⋮) beside the user or group whose access permissions you want to revoke and click Delete.

Verification

  • Users whose access you have revoked can no longer perform the actions that were permitted by their access permission level.

3.7. Viewing Python packages installed on your notebook server

You can check which Python packages are installed on your notebook server, and which version of each package you have, by running the pip tool in a notebook cell.

Prerequisites

  • You have logged in to Jupyter and opened a notebook.

Procedure

  1. Enter the following in a new cell in your notebook:

    !pip list
  2. Run the cell.

Verification

  • The output shows an alphabetical list of all installed Python packages and their versions. For example, if you use this command immediately after creating a notebook server that uses the Minimal image, the first packages shown are similar to the following:

    Package                           Version
    --------------------------------- ----------
    aiohttp                           3.7.3
    alembic                           1.5.2
    appdirs                           1.4.4
    argo-workflows                    3.6.1
    argon2-cffi                       20.1.0
    async-generator                   1.10
    async-timeout                     3.0.1
    attrdict                          2.0.1
    attrs                             20.3.0
    backcall                          0.2.0
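
If you want the same information as Python data rather than as printed text, the following is a minimal sketch that uses the standard library importlib.metadata module (Python 3.8 or later). The function names installed_packages and format_package_list are illustrative and are not part of OpenShift AI or pip.

```python
from importlib.metadata import distributions


def installed_packages():
    """Return a dict that maps each installed package name to its version."""
    packages = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            packages[name] = dist.version
    return packages


def format_package_list(packages):
    """Format packages as aligned 'name version' lines, sorted by name."""
    if not packages:
        return ""
    width = max(len(name) for name in packages)
    ordered = sorted(packages.items(), key=lambda item: item[0].lower())
    return "\n".join(f"{name:<{width}} {version}" for name, version in ordered)


if __name__ == "__main__":
    # In a notebook cell, this prints a listing similar to the pip output.
    print(format_package_list(installed_packages()))
```

Because installed_packages returns a dictionary, you can look up a single package, for example installed_packages().get("altair"), which returns None if the package is not installed.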

3.8. Installing Python packages on your notebook server

You can install Python packages that are not part of the default notebook server image by adding the package and the version to a requirements.txt file and then running the pip install command in a notebook cell.

Note

You can also install packages directly, but Red Hat recommends using a requirements.txt file so that the packages stated in the file can be easily reused across different notebooks. A requirements.txt file is also useful when you use an S2I build to deploy a model.

Prerequisites

  • You have logged in to Jupyter and opened a notebook.

Procedure

  1. Create a new text file using one of the following methods:

    • Click + to open a new launcher and click Text file.
    • Click File → New → Text File.
  2. Rename the text file to requirements.txt.

    1. Right-click on the name of the file and click Rename Text. The Rename File dialog opens.
    2. Enter requirements.txt in the New Name field and click Rename.
  3. Add the packages to install to the requirements.txt file, for example:

    altair

    You can specify the exact version to install by using the == (equal to) operator, for example:

    altair==4.1.0

    Note

    Red Hat recommends specifying exact package versions to enhance the stability of your notebook server over time. New package versions can introduce undesirable or unexpected changes in your environment’s behavior.

    To install multiple packages at the same time, place each package on a separate line.

  4. Install the packages in requirements.txt to your server using a notebook cell.

    1. Create a new cell in your notebook and enter the following command:

      !pip install -r requirements.txt
    2. Run the cell by pressing Shift+Enter.

    Important

    This command installs the packages on your notebook server, but you must still run the import directive in a code cell to use a package in your code.

    import altair
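
After the install command in step 4 completes, you might want to confirm that the versions pinned in requirements.txt are the versions that were installed. The following is a minimal sketch under simplifying assumptions: it handles only lines of the form name==version and ignores extras, environment markers, and other version operators. The function names parse_pinned and check_pins are illustrative and are not part of pip or OpenShift AI.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pinned(lines):
    """Return {package: version} for lines of the form 'name==version'.

    Blank lines, comments, and unpinned entries are ignored.
    """
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, pinned = line.split("==", 1)
        pins[name.strip()] = pinned.strip()
    return pins


def check_pins(pins, lookup=version):
    """Return {package: (pinned, installed)} for every pin that does not match.

    The installed value is None when the package is not installed.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

In a notebook cell, you could open requirements.txt, pass the file object to parse_pinned, and pass the result to check_pins. An empty dictionary means that every pinned package is installed at the stated version.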

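Verification

  • The packages listed in requirements.txt appear in the list of packages installed on your notebook server. For details, see Viewing Python packages installed on your notebook server.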

3.9. Updating notebook server settings by restarting your server

You can update the settings on your notebook server by stopping and relaunching the notebook server. For example, if your server runs out of memory, you can stop the server and select a larger container size when you start it again.

Prerequisites

  • You have a running notebook server.
  • You have logged in to Jupyter.

Procedure

  1. Click File → Hub Control Panel.

    The Notebook server control panel opens.

  2. Click the Stop notebook server button.

    The Stop server dialog opens.

  3. Click Stop server to confirm your decision.

    The Start a notebook server page opens.

  4. Update the relevant notebook server settings and click Start server.

Verification

  • The notebook server starts and contains your updated settings.