Chapter 3. Creating a workbench and a notebook

3.1. Creating a workbench and selecting a notebook image

A workbench is an instance of your development and experimentation environment. Within a workbench, you can select a notebook image for your data science work.

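Prerequisites

  • You created a data science project, as described in Setting up your data science project.
  • You configured the My Storage object storage for that project.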

Procedure

  1. Navigate to the project detail page for the data science project that you created in Setting up your data science project.
  2. Click the Workbenches tab, and then click the Create workbench button.

    Create workbench button
  3. Enter a name and description for the workbench.

    Workbench name and description

    Red Hat provides several supported notebook images. In the Notebook image section, you can choose one of these images or any custom image that an administrator has set up for you. The TensorFlow image has the libraries needed for this tutorial.

  4. Select the latest TensorFlow image.

    Workbench image
  5. For the deployment size, choose Small.

    Workbench size
  6. Leave the default environment variables and storage options.

    Workbench storage
  7. Under Data connections, select Use existing data connection and select My Storage (the object storage that you configured previously) from the list.

    Data connection
  8. Click the Create workbench button.

    Create workbench button

Verification

In the Workbenches tab for the project, the status of the workbench changes from Starting to Running.

Workbench list

Note

If you made a mistake, you can edit the workbench to make changes.

Workbench list edit
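
After the workbench is running, you can check from a notebook cell that the data connection selected in step 7 is attached. The following is a minimal sketch, not an official procedure. It assumes that the data connection is exposed to the workbench as environment variables with the names listed in EXPECTED_VARS; those names are an assumption and might differ in your cluster, so verify them in your own workbench. The helper data_connection_status is hypothetical and introduced only for illustration.

```python
import os

# Assumed variable names for an S3-compatible data connection.
# Verify these in your own workbench; they are not taken from this tutorial.
EXPECTED_VARS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_S3_ENDPOINT",
    "AWS_DEFAULT_REGION",
    "AWS_S3_BUCKET",
]


def data_connection_status(env=None):
    """Return a mapping of variable name -> True if it is set and non-empty."""
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in EXPECTED_VARS}


# Report only whether each variable is present; never print the secret values.
for name, present in data_connection_status().items():
    print(f"{name}: {'set' if present else 'missing'}")
```

If a variable is reported as missing, edit the workbench and check the Data connections setting.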

3.2. Importing the tutorial files into the Jupyter environment

The Jupyter environment is web-based, but everything you do inside it runs on Red Hat OpenShift AI and is powered by the OpenShift cluster. This means that you can conduct your data science work in a powerful, stable, managed environment without having to install and maintain anything on your own computer, and without using up valuable local resources such as CPU, GPU, and RAM.

Prerequisite

You created a workbench, as described in Creating a workbench and selecting a notebook image.

Procedure

  1. Click the Open link next to your workbench. If prompted, log in and allow the notebook to authorize your user.

    Open workbench

    Your Jupyter environment window opens.

    This file-browser window shows the files and folders that are saved inside your own personal space in OpenShift AI.

  2. Bring the content of this tutorial into your Jupyter environment:

    1. On the toolbar, click the Git Clone icon:

      Git Clone icon
    2. Enter the following tutorial Git HTTPS URL:

      https://github.com/rh-aiservices-bu/fraud-detection.git

      Git Modal
    3. Select the Include submodules option.
    4. Click Clone.
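
The clone steps above can also be expressed as a git command, which can help clarify what the Include submodules option does. The following is a sketch under stated assumptions: it assumes that git is available in the workbench image and that the option corresponds roughly to the --recurse-submodules flag of git clone. The helpers build_clone_command and clone are hypothetical names used only for illustration. Running the block only prints the command; it does not clone anything.

```python
import subprocess

REPO_URL = "https://github.com/rh-aiservices-bu/fraud-detection.git"


def build_clone_command(url, include_submodules=True):
    """Build a git command that mirrors the Git Clone dialog options."""
    command = ["git", "clone"]
    if include_submodules:
        # Assumed equivalent of the "Include submodules" option.
        command.append("--recurse-submodules")
    command.append(url)
    return command


def clone(url, include_submodules=True):
    """Run the clone; raises CalledProcessError if git exits with an error."""
    subprocess.run(build_clone_command(url, include_submodules), check=True)


# Show the command without running it.
print(" ".join(build_clone_command(REPO_URL)))
```

To perform the clone from a notebook cell or terminal instead of the toolbar, call clone(REPO_URL) from the directory where you want the fraud-detection folder to be created.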

Verification

Double-click the newly created folder, fraud-detection:

Jupyter file browser

In the file browser, you should see the notebooks that you cloned from Git.

Jupyter file browser - fraud-detection

Next step

Running code in a notebook

or

Training a model

3.3. Running code in a notebook

Note

If you are already familiar with Jupyter, you can skip to the next section.

A notebook is a document made up of cells, each of which contains either formatted text or code.

This is an empty cell:

Jupyter Cell

This is a cell with some code:

Jupyter Cell Code

Code cells contain Python code that you can run interactively. You can modify the code and then run it. The code does not run on your computer or in the browser, but directly in the environment that you are connected to: in this case, Red Hat OpenShift AI.
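
One way to see where a cell's code runs is to ask Python to describe the machine it is running on. The following is a minimal sketch that uses only the Python standard library; describe_runtime is a hypothetical helper name. When run in a workbench, the reported values should describe the workbench container rather than your own computer, although the exact values depend on your cluster.

```python
import os
import platform


def describe_runtime():
    """Collect a few facts about the machine that executes this cell."""
    return {
        "hostname": platform.node(),
        "operating_system": platform.system(),
        "cpu_count": os.cpu_count(),
    }


for key, value in describe_runtime().items():
    print(f"{key}: {value}")
```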

You can run a code cell from the notebook interface or from the keyboard:

  • From the user interface: Select the cell (by clicking inside the cell or to the left side of the cell) and then click Run from the toolbar.

    Jupyter Run
  • From the keyboard: Press CTRL + ENTER to run a cell or press SHIFT + ENTER to run the cell and automatically select the next one.

After you run a cell, you can see the result of its code as well as information about when the cell was run, as shown in this example:

Jupyter run cell

When you save a notebook, the code and the results are saved. You can reopen the notebook to look at the results without having to run the program again, while still having access to the code.
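
To illustrate how code and results are saved together, the following sketch builds a small hand-written notebook in the JSON layout that .ipynb files use (notebook format version 4) and reads the saved output back without running any cell. The notebook content and the saved_results helper are illustrative assumptions, not files from this tutorial.

```python
import json

# A hand-written, minimal notebook in the version 4 layout (illustrative only).
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# My experiment notes"],
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
})


def saved_results(raw_notebook):
    """Return (source, stream text) pairs for each code cell."""
    notebook = json.loads(raw_notebook)
    results = []
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue
        # Join the text of every stream output stored with the cell.
        text = "".join(
            "".join(output.get("text", []))
            for output in cell["outputs"]
            if output["output_type"] == "stream"
        )
        results.append(("".join(cell["source"]), text))
    return results


print(saved_results(notebook_json))
```

The code cell stores both its source and its output, which is why reopening a saved notebook shows earlier results without rerunning the code.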

Notebooks are so named because they work like a physical notebook: you can keep notes about your experiments (which you will do) alongside the code itself, including any parameters that you set. The output of each experiment, which is the result of a cell after it runs, appears inline. To add notes, use the menu to switch the cell type from Code to Markdown.

3.3.1. Try it

Now that you know the basics, give it a try!

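Prerequisite

You imported the tutorial files, as described in Importing the tutorial files into the Jupyter environment.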

Procedure

  1. In your Jupyter environment, locate the 0_sandbox.ipynb file and double-click it to launch the notebook. The notebook opens in a new tab in the content section of the environment.

    Notebook 0
  2. Experiment by, for example, running the existing cells, adding more cells and creating functions.

    You can do whatever you want: it is your environment, and there is no risk of breaking anything or affecting other users. This isolation is one of the advantages of OpenShift AI.

  3. Optionally, create a new notebook in which the code cells are run by using a Python 3 kernel:

    1. Create a new notebook by either selecting File → New → Notebook or by clicking the Python 3 tile in the Notebook section of the launcher window:

You can use different kernels, with different languages or versions, to run in your notebook.
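
If you are not sure what to try in the sandbox, the following cell is one possible starting point. It is a sketch, not part of the tutorial notebooks: fahrenheit_to_celsius is a hypothetical example function, and the last line reports which Python version the selected kernel runs.

```python
import sys


def fahrenheit_to_celsius(degrees_f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (degrees_f - 32) * 5 / 9


# State persists between cells: after this cell runs, later cells in the
# same notebook can call fahrenheit_to_celsius() until the kernel restarts.
print(fahrenheit_to_celsius(212))

# Show which Python version the selected kernel is running.
print(f"Kernel Python: {sys.version_info.major}.{sys.version_info.minor}")
```

Try changing the input value and rerunning the cell, or calling the function from a new cell.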

Additional resource

To learn more about notebooks, go to the Jupyter site.

Next step

Training a model

3.4. Training a model

Now that you know how the Jupyter notebook environment works, the real work can begin!

In your notebook environment, open the 1_experiment_train.ipynb file and follow the instructions directly in the notebook. The instructions guide you through some simple data exploration, experimentation, and model training tasks.

Jupyter Notebook 1