Chapter 16. Using Podman in an HPC environment
You can use Podman with Open MPI (Message Passing Interface) to run containers in a High Performance Computing (HPC) environment.
16.1. Using Podman with MPI
The example is based on the ring.c program taken from Open MPI. In this example, a value is passed around by all processes in a ring-like fashion. Each time the message passes rank 0, the value is decremented. When each process receives the 0 message, it passes it on to the next process and then quits. By passing the 0 first, every process gets the 0 message and can quit normally.
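The message flow described above can be traced without MPI. The following is a rough simulation in plain shell, not the ring.c source: the function name simulate_ring, its two arguments, and the initial value of 2 are illustrative assumptions. It expects at least 2 processes and an initial value of at least 1.

```shell
#!/usr/bin/env bash
# Simulate the ring message flow in plain shell (no MPI involved).
# Usage: simulate_ring NPROCS INITIAL_VALUE
# Assumes NPROCS >= 2 and INITIAL_VALUE >= 1.
simulate_ring() {
    local nprocs=$1 value=$2
    local sends=1   # rank 0 starts the ring by sending the initial value
    local rank=1    # the next rank to receive the message

    while :; do
        # Rank 0 decrements the value each time the message comes back to it.
        if [ "$rank" -eq 0 ]; then
            value=$((value - 1))
        fi

        # Every rank forwards the message to its neighbor.
        sends=$((sends + 1))

        if [ "$value" -eq 0 ]; then
            echo "rank $rank forwarded 0 and quits"
            # The last rank hands the 0 back to rank 0, which closes the ring.
            if [ "$rank" -eq $((nprocs - 1)) ]; then
                break
            fi
        fi

        rank=$(( (rank + 1) % nprocs ))
    done

    echo "total sends: $sends"
}

simulate_ring 4 2
```

With 4 processes and an initial value of 2, every rank reports that it forwarded the 0 message, rank 0 first, and the last line reads "total sends: 12". Because rank 0 is the first to pass the 0 on, no process is left waiting for a message that never arrives.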
Install Open MPI:
$ sudo dnf install openmpi
To activate the environment modules, type:
$ . /etc/profile.d/modules.sh
$ module load mpi/openmpi-x86_64
Optionally, to load the mpi/openmpi-x86_64 module automatically, add this line to your .bashrc file:
$ echo "module load mpi/openmpi-x86_64" >> ~/.bashrc
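Before continuing, it can help to confirm that the tools used in the rest of this procedure are reachable from the current shell. The following is a minimal sketch; the helper name have_cmd is hypothetical and is not part of Open MPI or Podman.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given command is reachable on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# mpirun becomes available after the module is loaded; podman is needed later.
for tool in mpirun podman; do
    if have_cmd "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```

If mpirun is reported as not found, loading the mpi/openmpi-x86_64 module again in the current shell is the first thing to check.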
To combine mpirun and podman, create a container with the following definition:
$ cat Containerfile
FROM registry.access.redhat.com/ubi9/ubi

RUN dnf -y install openmpi-devel wget && \
    dnf clean all

RUN wget https://raw.githubusercontent.com/open-mpi/ompi/master/test/simple/ring.c && \
    /usr/lib64/openmpi/bin/mpicc ring.c -o /home/ring && \
    rm -f ring.c
Build the container image:
$ podman build --tag=mpi-ring .
Start the container. On a system with 4 CPUs this command starts 4 containers:
$ mpirun \
    --mca orte_tmpdir_base /tmp/podman-mpirun \
    podman run --env-host \
      -v /tmp/podman-mpirun:/tmp/podman-mpirun \
      --userns=keep-id \
      --net=host --pid=host --ipc=host \
      mpi-ring /home/ring
Rank 2 has cleared MPI_Init
Rank 2 has completed ring
Rank 2 has completed MPI_Barrier
Rank 3 has cleared MPI_Init
Rank 3 has completed ring
Rank 3 has completed MPI_Barrier
Rank 1 has cleared MPI_Init
Rank 1 has completed ring
Rank 1 has completed MPI_Barrier
Rank 0 has cleared MPI_Init
Rank 0 has completed ring
Rank 0 has completed MPI_Barrier
As a result, mpirun starts 4 Podman containers, and each container runs one instance of the ring binary. All 4 processes communicate with each other over MPI.
16.2. The mpirun options
The following mpirun options are used to start the container:
The --mca orte_tmpdir_base /tmp/podman-mpirun option tells Open MPI to create all its temporary files in /tmp/podman-mpirun and not in /tmp. If you use more than one node, this directory is named differently on the other nodes. This requires mounting the complete /tmp directory into the container, which is more complicated.
The mpirun command specifies the command to start, which is the podman command. The following podman options are used to start the container:
The run command runs a container.
The --env-host option copies all environment variables from the host into the container.
The -v /tmp/podman-mpirun:/tmp/podman-mpirun option tells Podman to mount the directory where Open MPI creates its temporary directories and files, so that it is available in the container.
The --userns=keep-id option ensures that your user ID is mapped to the same ID inside and outside the container.
The --net=host --pid=host --ipc=host options make the container share the network, PID, and IPC namespaces of the host.
mpi-ring is the name of the container image.
/home/ring is the MPI program in the container.
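The temporary directory appears three times in the command: once in the Open MPI option and twice in the Podman volume mount. The following sketch keeps these in sync by assembling the command line from a single variable. The function name mpi_podman_cmd and its three arguments are hypothetical, and the function only prints the command, it does not run it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the mpirun/podman command line instead of running it.
# Usage: mpi_podman_cmd TMPDIR IMAGE PROGRAM
mpi_podman_cmd() {
    local tmpdir=$1 image=$2 program=$3
    # The same directory is passed to Open MPI and mounted at the same path
    # inside the container, so both sides see the same temporary files.
    echo "mpirun --mca orte_tmpdir_base $tmpdir" \
         "podman run --env-host -v $tmpdir:$tmpdir" \
         "--userns=keep-id --net=host --pid=host --ipc=host" \
         "$image $program"
}

mpi_podman_cmd /tmp/podman-mpirun mpi-ring /home/ring
```

Called with the values from this chapter, the function prints the same command that is shown in the procedure, on a single line. Printing first and running later makes it easier to review the options before any containers start.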