Managing users and user resources
Learn to manage user permissions and environments in Red Hat OpenShift Data Science
Chapter 1. Usage data collection
Red Hat OpenShift Data Science administrators can choose whether to allow Red Hat to collect data about OpenShift Data Science usage in their cluster. Collecting this data allows Red Hat to monitor and improve our software and support. For further details about the data Red Hat collects, see Usage data collection notice for OpenShift Data Science.
Usage data collection is enabled by default when you install OpenShift Data Science on your OpenShift Container Platform cluster except when clusters are installed in a disconnected environment.
See Disabling usage data collection for instructions on disabling the collection of this data in your cluster. If you have disabled data collection on your cluster, and you want to enable it again, see Enabling usage data collection for more information.
1.1. Usage data collection notice for OpenShift Data Science
In connection with your use of this Red Hat offering, Red Hat may collect usage data about your use of the software. This data allows Red Hat to monitor the software and to improve Red Hat offerings and support, including identifying, troubleshooting, and responding to issues that impact users.
- What information does Red Hat collect?
Tools within the software monitor various metrics and this information is transmitted to Red Hat. Metrics include information such as:
- Information about applications enabled in the product dashboard.
- The deployment sizes used (that is, the CPU and memory resources allocated).
- Information about documentation resources accessed from the product dashboard.
- The name of the notebook images used (for example, Minimal Python, Standard Data Science, and other images).
- A unique random identifier that is generated during the initial user login to associate data with a particular username.
- Usage information about components, features, and extensions.
- Third Party Service Providers
- Red Hat uses certain third party service providers to collect the telemetry data.
- Security
- Red Hat employs technical and organizational measures designed to protect the usage data.
- Personal Data
- Red Hat does not intend to collect personal information. If Red Hat discovers that personal information has been inadvertently received, Red Hat will delete such personal information and treat such personal information in accordance with Red Hat’s Privacy Statement. For more information about Red Hat’s privacy practices, see Red Hat’s Privacy Statement.
- Enabling and Disabling Usage Data
- You can disable or enable usage data by following the instructions in Disabling usage data collection or Enabling usage data collection.
1.2. Enabling usage data collection
Red Hat OpenShift Data Science administrators can choose whether to allow Red Hat to collect data about OpenShift Data Science usage in their cluster. Usage data collection is enabled by default when you install OpenShift Data Science on your OpenShift Container Platform cluster except when clusters are installed in a disconnected environment. If you have disabled data collection previously, you can re-enable it by following these steps.
Prerequisites
- You have logged in to Red Hat OpenShift Data Science.
- You are part of the OpenShift Data Science administrator group in OpenShift Container Platform.
Procedure
- From the OpenShift Data Science dashboard, click Settings → Cluster settings.
- Locate the Usage data collection section.
- Select the Allow collection of usage data checkbox.
- Click Save changes.
Verification
- A notification is shown when the settings are updated: Settings changes saved.
1.3. Disabling usage data collection
Red Hat OpenShift Data Science administrators can choose whether to allow Red Hat to collect data about OpenShift Data Science usage in their cluster. Usage data collection is enabled by default when you install OpenShift Data Science on your OpenShift Container Platform cluster except when clusters are installed in a disconnected environment.
You can disable data collection by following these steps.
Prerequisites
- You have logged in to Red Hat OpenShift Data Science.
- You are part of the OpenShift Data Science administrator group in OpenShift Container Platform.
Procedure
- From the OpenShift Data Science dashboard, click Settings → Cluster settings.
- Locate the Usage data collection section.
- Deselect the Allow collection of usage data checkbox.
- Click Save changes.
Verification
- A notification is shown when the settings are updated: Settings changes saved.
Chapter 2. Overview of user permissions
By default, all OpenShift users have access to Red Hat OpenShift Data Science. In addition, users with the `cluster-admin` role automatically have administrator access in OpenShift Data Science.
Alternatively, you can create specialized user groups to restrict access to OpenShift Data Science for users and administrators. Therefore, you must decide if you want to restrict access to your OpenShift Data Science deployment using specialized user groups, as opposed to allowing all OpenShift users access.
If you decide to restrict access, and you already have user groups defined in your configured identity provider, you can add these user groups to your OpenShift Data Science deployment. If you decide to use specialized user groups without adding these groups from an identity provider, you must create the groups in OpenShift Data Science and then add the appropriate users to them.
There are some operations relevant to OpenShift Data Science that require the `cluster-admin` role. These operations include:
- Adding users to the OpenShift Data Science user and administrator groups, if you are using specialized groups.
- Removing users from the OpenShift Data Science user and administrator groups, if you are using specialized groups.
- Managing custom environment and storage configuration for users in OpenShift, such as Jupyter notebook resources, ConfigMaps, and persistent volume claims (PVCs).
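As an alternative to the web console, group membership can also be managed declaratively with an OpenShift `Group` resource. The following is a minimal sketch, assuming the `rhods-users` group name used elsewhere in this guide and an illustrative username:

```yaml
apiVersion: user.openshift.io/v1
kind: Group
metadata:
  name: rhods-users        # example specialized user group name
users:
  - data-scientist-1       # illustrative username
```

Applying a manifest like this (for example, with `oc apply -f`) requires the `cluster-admin` role, consistent with the operations listed above.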
Although users of OpenShift Data Science and its components are authenticated through OpenShift, session management is separate from authentication. Logging out of OpenShift or OpenShift Data Science does not end a running Jupyter session on those platforms. As a result, when a user’s permissions change, that user must log out of all current sessions for the changes to take effect.
Chapter 3. User types
Red Hat OpenShift Data Science has the following user types:
Table 3.1. User types
| User Type | Permissions |
|---|---|
| Data scientists | Data scientists can access and use individual components of Red Hat OpenShift Data Science, such as Jupyter. |
| IT operations administrators | In addition to the actions permitted to a data scientist, IT operations administrators can perform administrative functions, such as managing users and cluster settings. |
Chapter 4. Defining OpenShift Data Science admin and user groups
By default, users with the `cluster-admin` role are OpenShift Data Science administrators, but all users authenticated in OpenShift can access OpenShift Data Science. A cluster admin is a superuser that can perform any action in any project in the OpenShift cluster. When bound to a user with a local binding, the `cluster-admin` role grants full control over quota and every action on every resource in the project. You can also define additional OpenShift Data Science admin and user groups using the dashboard.
Prerequisites
- You have logged in to Red Hat OpenShift Data Science as described in Logging in to OpenShift Data Science.
- You have the `cluster-admin` role in OpenShift Container Platform.
- The groups that you want to define as admin and user groups exist in OpenShift Container Platform.
Procedure
- From the OpenShift Data Science dashboard, click Settings → User management.
- Define your OpenShift Data Science admin groups: under Data science administrator groups, click the text box and select an OpenShift group. Repeat this process to define multiple admin groups.
- Define your OpenShift Data Science user groups: under Data science user groups, click the text box and select an OpenShift group. Repeat this process to define multiple user groups.

  Important: The `system:authenticated` setting allows all users authenticated in OpenShift to access OpenShift Data Science.

- Click Save changes.
Verification
- Admin users can successfully log in to OpenShift Data Science and perform administrative functions.
- Non-admin users can successfully log in to OpenShift Data Science, and can access and use individual components, such as Jupyter.
Chapter 5. Adding users for OpenShift Data Science
By default, all OpenShift users have access to Red Hat OpenShift Data Science. If you are using these default permission settings, no further action is required. However, if you plan to restrict access to your OpenShift Data Science instance by defining specialized user groups, you must grant users permission to access Red Hat OpenShift Data Science by adding user accounts to the Red Hat OpenShift Data Science user group, administrator group, or both. You can either use the default group name, or specify a group name that already exists in your identity provider.
The user group provides the user with access to developer functions in the Red Hat OpenShift Data Science dashboard, and associated services, such as Jupyter.
The administrator group provides the user with access to developer and administrator functions in the Red Hat OpenShift Data Science dashboard and associated services, such as Jupyter.
If you have restricted access using specialized user groups, users that are not in the OpenShift Data Science user group or administrator group can still view the dashboard, but are unable to use associated services, such as Jupyter. They are also unable to access the Cluster settings page.
To use the default group names, see Adding users to specialized OpenShift Data Science user groups. This method is easy to set up, but you must manually configure user lists in the OpenShift Container Platform web console.
5.1. Adding users to specialized OpenShift Data Science user groups
All OpenShift users have access to Red Hat OpenShift Data Science by default. Additionally, users with the `cluster-admin` role automatically have administrator access to OpenShift Data Science. To further restrict access to OpenShift Data Science, you can create specialized OpenShift Data Science administrator and user groups.
Follow the steps in this section to add users to your specialized OpenShift Data Science administrator and user groups. This method is easy to set up, but you must manage the user lists manually in the OpenShift Container Platform web console.
Prerequisites
- You have configured a supported identity provider for OpenShift Container Platform.
- You are assigned the `cluster-admin` role in OpenShift Container Platform.
- You have defined an OpenShift Data Science administrator group and user group.
Procedure
- In the OpenShift Container Platform web console, click User Management → Groups.
- Click the name of the group that you want to add users to.
  - For administrative users, click the administrator group, for example, `rhods-admins`.
  - For normal users, click the user group, for example, `rhods-users`.

  The Group details page for that group appears.
- Click Actions → Add Users. The Add Users dialog appears.
- In the Users field, enter the relevant user name to add to the group.
- Click Save.
Verification
- Click the Details tab for each group and confirm that the Users section contains the user names that you added.
Chapter 6. Viewing OpenShift Data Science users
By default, all OpenShift users have access to Red Hat OpenShift Data Science. In addition, users with the `cluster-admin` role automatically have administrator access in OpenShift Data Science. However, you can create specialized user groups to restrict access to OpenShift Data Science for users and administrators. Follow these steps if you have defined specialized OpenShift Data Science user groups, so that you can view the users that belong to these groups.
Prerequisites
- The Red Hat OpenShift Data Science user group, administrator group, or both exist.
- You have the `cluster-admin` role in OpenShift Container Platform.
- You have configured a supported identity provider for OpenShift Container Platform.
Procedure
- In the OpenShift Container Platform web console, click User Management → Groups.
- Click the name of the group containing the users that you want to view.
  - For administrative users, click the name of your administrator group, for example, `rhods-admins`.
  - For normal users, click the name of your user group, for example, `rhods-users`.

  The Group details page for the group appears.
Verification
- In the Users section for the relevant group, you can view the users who have permission to access Red Hat OpenShift Data Science.
Chapter 7. Deleting users and user resources
Users assigned the `cluster-admin` role in OpenShift can revoke user access to Jupyter and delete user resources from Red Hat OpenShift Data Science.
To completely remove a user from OpenShift Data Science, you must remove them from the allowed group in your OpenShift identity provider.
7.1. Backing up storage data
Red Hat recommends that you back up the data on your persistent volume claims (PVCs) regularly. Backing up your data is particularly important before deleting a user and before uninstalling OpenShift Data Science, as all PVCs are deleted when OpenShift Data Science is uninstalled.
See the documentation for your cluster platform for more information about backing up your PVCs.
7.2. Revoking user access to Jupyter
You can revoke a user’s access to Jupyter to prevent them from running notebook servers and consuming resources in your cluster through Jupyter, while still allowing them access to OpenShift Data Science and other services that use OpenShift’s identity provider for authentication.
Follow these steps only if you have restricted access to OpenShift Data Science using specialized user groups. To completely remove a user from OpenShift Data Science, you must remove them from the allowed group in your OpenShift identity provider.
Prerequisites
- You have stopped any notebook servers owned by the user you want to delete.
- You are assigned the `cluster-admin` role in OpenShift Container Platform.
- If you are using specialized OpenShift Data Science user groups, the user is part of the OpenShift Data Science user group, administrator group, or both.
Procedure
- In the OpenShift Container Platform web console, click User Management → Groups.
- Click the name of the group that you want to remove the user from.
  - For administrative users, click the name of your administrator group, for example, `rhods-admins`.
  - For normal users, click the name of your user group, for example, `rhods-users`.

  The Group details page for the group appears.
- In the Users section on the Details tab, locate the user that you want to remove.
- Click the action menu (⋮) beside the user that you want to remove and click Remove user.
Verification
- Check the Users section on the Details tab and confirm that the user that you removed is not visible.
- In the `rhods-notebooks` project, check under Workload → Pods and ensure that there is no notebook server pod for this user. If you can see a pod named `jupyter-nb-<username>-*` for the user that you have removed, delete that pod to ensure that the deleted user is not consuming resources on the cluster.
- In the data science dashboard, check the list of data science projects. Delete any projects that belong to the user.
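The pod-name check in the verification above can be scripted. The following sketch only illustrates the `jupyter-nb-<username>-*` naming convention; it matches against a list of pod names that you would obtain separately (for example, from the Workload → Pods page or the CLI):

```python
import fnmatch

def user_notebook_pods(pod_names, username):
    """Return the notebook server pods that belong to a user, following
    the jupyter-nb-<username>-* naming convention described above."""
    pattern = f"jupyter-nb-{username}-*"
    return [name for name in pod_names if fnmatch.fnmatch(name, pattern)]
```

For example, `user_notebook_pods(["jupyter-nb-alice-0", "jupyter-nb-bob-0"], "alice")` returns only the pod belonging to `alice`.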
7.3. Cleaning up after deleting users
After removing a user’s access to Red Hat OpenShift Data Science or Jupyter, you must also delete their associated configuration files from OpenShift Container Platform. It is recommended that you back up the user’s data before removing their configuration files.
Prerequisites
- (Optional) If you want to completely remove the user’s access to OpenShift Data Science, you have removed their credentials from your identity provider.
- You have revoked the user’s access to Jupyter.
- You have backed up the user’s storage data.
- If you are using specialized OpenShift Data Science groups, you are part of the administrator group (for example, `rhods-admins`). If you are not using specialized groups, you are part of the OpenShift Dedicated administrator group. See Adding administrative users for OpenShift Container Platform for more information.
- You have logged in to OpenShift Data Science.
Procedure
Delete the user’s persistent volume claim (PVC):

- Click Storage → PersistentVolumeClaims.
- If it is not already selected, select the `rhods-notebooks` project from the project list.
- Locate the `jupyter-nb-<username>` PVC, replacing `<username>` with the relevant user name.
- Click the action menu (⋮) and select Delete PersistentVolumeClaim from the list. The Delete PersistentVolumeClaim dialog appears.
- Inspect the dialog and confirm that you are deleting the correct PVC.
- Click Delete.
Delete the user’s ConfigMap:

- Click Workloads → ConfigMaps.
- If it is not already selected, select the `rhods-notebooks` project from the project list.
- Locate the `jupyterhub-singleuser-profile-<username>` ConfigMap, replacing `<username>` with the relevant user name.
- Click the action menu (⋮) and select Delete ConfigMap from the list. The Delete ConfigMap dialog appears.
- Inspect the dialog and confirm that you are deleting the correct ConfigMap.
- Click Delete.
Verification
- The user cannot access Jupyter any more, and sees an "Access permission needed" message if they try.
- The user’s single-user profile, persistent volume claim (PVC), and ConfigMap are not visible in OpenShift Container Platform.
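The per-user resource names used in this procedure follow a fixed convention, which can be captured in a small helper (a sketch for illustration; the names mirror those given in the steps above):

```python
def user_cleanup_resources(username):
    """Return the names of the per-user resources in the rhods-notebooks
    project that this procedure deletes for a given user."""
    return {
        "PersistentVolumeClaim": f"jupyter-nb-{username}",
        "ConfigMap": f"jupyterhub-singleuser-profile-{username}",
    }
```

For example, `user_cleanup_resources("jdoe")` lists the PVC `jupyter-nb-jdoe` and the ConfigMap `jupyterhub-singleuser-profile-jdoe`.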
Chapter 8. Allocating additional resources to OpenShift Data Science users
As a cluster administrator, you can allocate additional resources to a cluster to support compute-intensive data science work. This includes increasing the number of nodes in the cluster and changing the cluster’s allocated machine pool.
For more information about allocating additional resources to an OpenShift Container Platform cluster, see Manually scaling a machine set.
Chapter 9. Enabling GPU support in OpenShift Data Science
Optionally, to ensure that your data scientists can use compute-heavy workloads in their models, you can enable graphics processing units (GPUs) in OpenShift Data Science. To enable GPUs on OpenShift, you must install the NVIDIA GPU Operator. As a prerequisite to installing the NVIDIA GPU Operator, you must install the Node Feature Discovery (NFD) Operator. For information about how to install these operators, see GPU Operator on OpenShift.
Follow the instructions in this chapter only if you want to enable GPU support in an unrestricted self-managed environment. To enable GPU support in a disconnected self-managed environment, see Enabling GPU support in OpenShift Data Science instead.
Chapter 10. Configuring the default PVC size for your cluster
To configure how resources are claimed within your OpenShift Data Science cluster, you can change the default size of the cluster’s persistent volume claim (PVC), ensuring that the requested storage matches your common storage workflow. PVCs are requests for storage resources in your cluster that also act as claim checks to those resources.
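For context, a notebook-server PVC looks roughly like the following sketch. The name is illustrative and follows the `jupyter-nb-<username>` convention from Chapter 7; the access mode is an assumption, and `20Gi` is the default size restored in Chapter 11:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jupyter-nb-jdoe      # illustrative; follows the jupyter-nb-<username> convention
  namespace: rhods-notebooks
spec:
  accessModes:
    - ReadWriteOnce          # assumed access mode for a single-user notebook volume
  resources:
    requests:
      storage: 20Gi          # default size; the dashboard setting changes this value for new PVCs
```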
Prerequisites
- You have logged in to Red Hat OpenShift Data Science.
Changing this setting restarts the Jupyter pod, making Jupyter unavailable for up to 30 seconds. To avoid disruption, it is recommended that you perform this action outside of your organization’s typical working hours.
Procedure
- From the OpenShift Data Science dashboard, click Settings → Cluster settings.
- Under PVC size, enter a new size in gibibytes. The minimum size is 1 GiB, and the maximum size is 16384 GiB.
- Click Save changes.
Verification
- New PVCs are created with the default storage size that you configured.
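The bounds in the procedure above can be expressed as a small validation helper (hypothetical, for illustration only; the dashboard performs its own validation):

```python
def pvc_storage_request(size_gib):
    """Validate a default PVC size against the documented bounds
    (1 GiB minimum, 16384 GiB maximum) and return the corresponding
    Kubernetes storage request string."""
    if not 1 <= size_gib <= 16384:
        raise ValueError("PVC size must be between 1 and 16384 GiB")
    return f"{size_gib}Gi"
```

For example, `pvc_storage_request(20)` returns `"20Gi"`, while an out-of-range value raises an error.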
Chapter 11. Restoring the default PVC size for your cluster
To change the size of resources utilized within your OpenShift Data Science cluster, you can restore the default size of your cluster’s persistent volume claim (PVC).
Prerequisites
- You have logged in to Red Hat OpenShift Data Science.
- You are part of the OpenShift Data Science administrator group in OpenShift Container Platform.
Procedure
- From the OpenShift Data Science dashboard, click Settings → Cluster settings.
- Click Restore Default to restore the default PVC size of 20 GiB.
- Click Save changes.
Verification
- New PVCs are created with the default storage size of 20 GiB.
Chapter 12. Managing notebook servers
12.1. Accessing notebook servers owned by other users
Administrators can access notebook servers that are owned by other users to correct configuration errors or help a data scientist troubleshoot problems with their environment.
Prerequisites
- You are part of the OpenShift Dedicated administrator group. See Adding administrative users for OpenShift Dedicated for more information.
- You have launched the Jupyter application. See Launching Jupyter and starting a notebook server.
- The notebook server that you want to access is running.
Procedure
- On the page that opens when you launch Jupyter, click the Administration tab.
On the Administration page, perform the following actions:
- In the Users section, locate the user that the notebook server belongs to.
- Click View server beside the relevant user.
- On the Notebook server control panel page, click Access notebook server.
Verification
- The user’s notebook server opens in JupyterLab.
12.2. Stopping idle notebooks
You can reduce resource usage in your OpenShift Data Science deployment by stopping notebook servers that have been idle (without logged in users) for a period of time. This is useful when resource demand in the cluster is high. By default, idle notebooks are not stopped after a specific time limit.
If you have configured your cluster settings to disconnect all users from a cluster after a specified time limit, then this setting takes precedence over the idle notebook time limit. Users are logged out of the cluster when their session duration reaches the cluster-wide time limit.
Prerequisites
- You have logged in to Red Hat OpenShift Data Science.
- You are part of the OpenShift Data Science administrator group in OpenShift Container Platform.
Procedure
- From the OpenShift Data Science dashboard, click Settings → Cluster settings.
- Under Stop idle notebooks, select Stop idle notebooks after.
- Enter a time limit, in hours and minutes, for when idle notebooks are stopped.
- Click Save changes.
Verification
- The `notebook-controller-culler-config` ConfigMap, located in the `redhat-ods-applications` project on the Workloads → ConfigMaps page, contains the following culling configuration settings:
  - `ENABLE_CULLING`: Specifies whether the culling feature is enabled or disabled (`false` by default).
  - `IDLENESS_CHECK_PERIOD`: The polling frequency, in minutes, for checking a notebook’s last known activity.
  - `CULL_IDLE_TIME`: The maximum idle time, in minutes, before an inactive notebook is scaled to zero.
- Idle notebooks stop at the time limit that you set.
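The interaction of the culler settings above can be sketched as follows. This is an illustration of the documented behavior, not the notebook-controller’s actual implementation:

```python
from datetime import datetime, timedelta

def should_cull(last_activity, now, cull_idle_time_min, enable_culling=False):
    """Decide whether an idle notebook should be scaled to zero: culling
    must be enabled (ENABLE_CULLING, false by default), and the notebook's
    last known activity must be older than CULL_IDLE_TIME minutes."""
    if not enable_culling:
        return False
    return now - last_activity > timedelta(minutes=cull_idle_time_min)
```

For example, with a 60-minute `CULL_IDLE_TIME`, a notebook idle for 90 minutes is culled only when culling is enabled.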
12.3. Configuring a custom notebook image
In addition to notebook images provided and supported by Red Hat and independent software vendors (ISVs), you can configure custom notebook images that cater to your project’s specific requirements.
Red Hat supports you in adding custom notebook images to your deployment of OpenShift Data Science and ensuring that they are available for selection when creating a notebook server. However, Red Hat does not support the contents of your custom notebook image. That is, if your custom notebook image is available for selection during notebook server creation, but does not create a usable notebook server, Red Hat does not provide support to fix your custom notebook image.
Prerequisites
- You have logged in to Red Hat OpenShift Data Science.
- You are assigned the `cluster-admin` role in OpenShift Container Platform.
- Your custom notebook image exists in an image registry and is accessible.
Procedure
- From the OpenShift Data Science dashboard, click Settings → Notebook images. The Notebook image settings page appears, displaying previously imported notebook images. To enable or disable a previously imported notebook image, on the row containing the relevant notebook image, click the toggle in the Enabled column.
- Click Import new image. Alternatively, if no previously imported images were found, click Import image. The Import Notebook images dialog appears.
- In the Repository field, enter the URL of the repository containing the notebook image.
- In the Name field, enter an appropriate name for the notebook image.
- In the Description field, enter an appropriate description for the notebook image.
- Optional: Add software to the notebook image. After the import has completed, the software is added to the notebook image’s metadata and displayed on the Jupyter server creation page.
  - Click the Software tab.
  - Click the Add software button.
  - Click Edit.
  - Enter the Software name.
  - Enter the software Version.
  - Click Confirm to confirm your entry.
  - To add additional software, click Add software, complete the relevant fields, and confirm your entry.
- Optional: Add packages to the notebook image. After the import has completed, the packages are added to the notebook image’s metadata and displayed on the Jupyter server creation page.
  - Click the Packages tab.
  - Click the Add package button.
  - Click Edit.
  - Enter the Package name.
  - Enter the package Version.
  - Click Confirm to confirm your entry.
  - To add an additional package, click Add package, complete the relevant fields, and confirm your entry.
- Click Import.
Verification
- The notebook image that you imported is displayed in the table on the Notebook image settings page.
- Your custom notebook image is available for selection on the Start a notebook server page in Jupyter.
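A custom notebook image is typically built on top of an existing notebook base image. The following is a minimal, hypothetical Containerfile sketch; the base image reference and package names are illustrative, not from this document:

```dockerfile
# Base image reference is illustrative; use a notebook base image
# appropriate for your environment.
FROM quay.io/example/minimal-notebook:latest

# Add the project-specific packages your data scientists need.
RUN pip install --no-cache-dir pandas scikit-learn
```

After building and pushing the image to an accessible registry, import it using the procedure above.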