Chapter 23. Using Podman in an HPC environment

You can use Podman with Open MPI, an open source implementation of the Message Passing Interface (MPI), to run containers in a High Performance Computing (HPC) environment.

23.1. Using Podman with MPI

The example is based on the ring.c program from the Open MPI repository. In this example, a value is passed from process to process in a ring-like fashion. Each time the message passes rank 0, the value is decremented. When a process receives the message with the value 0, it passes it on to the next process and then quits. Because every process passes the 0 on before quitting, each process receives the 0 message and can quit normally.
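
The complete ring.c is downloaded from the Open MPI repository in the procedure below. The following is only a minimal sketch of the same ring-passing logic, written with standard MPI calls, to illustrate what the program does; it is not the exact source:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size, next, prev, message;
        const int tag = 201;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Ring neighbors: send to the next rank, receive from the previous one. */
        next = (rank + 1) % size;
        prev = (rank + size - 1) % size;

        /* Rank 0 injects the initial value into the ring. */
        if (rank == 0) {
            message = 10;
            MPI_Send(&message, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
        }

        /* Pass the value around; rank 0 decrements it on every full pass.
           Each process forwards the final 0 before leaving the loop. */
        while (1) {
            MPI_Recv(&message, 1, MPI_INT, prev, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            if (rank == 0) {
                message--;
            }
            MPI_Send(&message, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
            if (message == 0) {
                break;
            }
        }

        /* The last process sends one final 0 to rank 0, which must be
           received before rank 0 can exit. */
        if (rank == 0) {
            MPI_Recv(&message, 1, MPI_INT, prev, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }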

Prerequisites

  • The container-tools meta-package is installed.

Procedure

  1. Install Open MPI:

    # dnf install openmpi
  2. To activate the environment modules, type:

    $ . /etc/profile.d/modules.sh
  3. Load the mpi/openmpi-x86_64 module:

    $ module load mpi/openmpi-x86_64

    Optionally, to load the mpi/openmpi-x86_64 module automatically, add this line to your ~/.bashrc file:

    $ echo "module load mpi/openmpi-x86_64" >> ~/.bashrc
  4. To combine mpirun and podman, create a Containerfile with the following definition:

    $ cat Containerfile
    FROM registry.access.redhat.com/ubi9/ubi
    
    RUN dnf -y install openmpi-devel wget && \
        dnf clean all
    
    RUN wget https://raw.githubusercontent.com/open-mpi/ompi/master/test/simple/ring.c && \
        /usr/lib64/openmpi/bin/mpicc ring.c -o /home/ring && \
        rm -f ring.c
  5. Build the container image:

    $ podman build --tag=mpi-ring .
  6. Start the containers. On a system with 4 CPUs, this command starts 4 containers:

    $ mpirun \
       --mca orte_tmpdir_base /tmp/podman-mpirun \
       podman run --env-host \
        -v /tmp/podman-mpirun:/tmp/podman-mpirun \
        --userns=keep-id \
        --net=host --pid=host --ipc=host \
        mpi-ring /home/ring
    Rank 2 has cleared MPI_Init
    Rank 2 has completed ring
    Rank 2 has completed MPI_Barrier
    Rank 3 has cleared MPI_Init
    Rank 3 has completed ring
    Rank 3 has completed MPI_Barrier
    Rank 1 has cleared MPI_Init
    Rank 1 has completed ring
    Rank 1 has completed MPI_Barrier
    Rank 0 has cleared MPI_Init
    Rank 0 has completed ring
    Rank 0 has completed MPI_Barrier

    As a result, mpirun starts 4 Podman containers, each running one instance of the ring binary. All 4 processes communicate with each other over MPI.

23.2. The mpirun options

The following mpirun options are used to start the container:

  • --mca orte_tmpdir_base /tmp/podman-mpirun option tells Open MPI to create all of its temporary files in /tmp/podman-mpirun rather than in /tmp. If more than one node is used, this directory is named differently on each node. Using /tmp directly would require mounting the complete /tmp directory into the container, which is more complicated.

After its own options, mpirun specifies the command to start: in this case, the podman command. The following podman options are used to start the container:

  • run command runs a container.
  • --env-host option copies all environment variables from the host into the container.
  • -v /tmp/podman-mpirun:/tmp/podman-mpirun option tells Podman to mount the directory in which Open MPI creates its temporary directories and files, so that it is also available in the container.
  • --userns=keep-id option ensures that the user ID of the current user is mapped to the same user ID inside the container.
  • --net=host --pid=host --ipc=host options run the container in the host's network, PID, and IPC namespaces.
  • mpi-ring is the name of the container image.
  • /home/ring is the path to the MPI program inside the container.
