Getting started with High Performance Computing (HPC) in Red Hat Enterprise Linux 8

Introduction

What is High Performance Computing?

High Performance Computing (HPC) generally refers to processing data with complex calculations at high speeds. Historically, so-called “supercomputers” were built around single fast processors. Current HPC systems are often massive clusters of CPUs that aggregate computing power to deliver significantly higher performance than a single desktop or server when solving large numeric problems in engineering, science, and business. In an HPC cluster, each component computer is often referred to as a node.

Technical background

HPC clusters run batches of computations. The core of any HPC cluster is the scheduler, which keeps track of available resources so that job requests can be assigned efficiently to compute resources (CPUs and GPUs).
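
To make the scheduler's role more concrete, the following Python sketch assigns queued jobs to the first node that has enough free CPUs. It is a toy illustration only: the `Node` and `Job` classes and the `assign_jobs` function are hypothetical names, and production schedulers apply far more elaborate policies (priorities, fair share, backfill, GPU accounting).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpus: int
    jobs: list = field(default_factory=list)


@dataclass
class Job:
    name: str
    cpus: int


def assign_jobs(nodes, queue):
    """Assign each queued job to the first node with enough free CPUs.

    Returns the jobs that could not be placed and therefore stay queued.
    """
    pending = []
    for job in queue:
        for node in nodes:
            if node.free_cpus >= job.cpus:
                node.free_cpus -= job.cpus
                node.jobs.append(job.name)
                break
        else:
            # No node currently has enough free CPUs for this job.
            pending.append(job)
    return pending


if __name__ == "__main__":
    nodes = [Node("compute01", 4), Node("compute02", 8)]
    queue = [Job("a", 4), Job("b", 6), Job("c", 4)]
    pending = assign_jobs(nodes, queue)
    for node in nodes:
        print(node.name, node.jobs, "free CPUs:", node.free_cpus)
    print("still queued:", [job.name for job in pending])
```

In this example, job `c` remains queued because neither node has four free CPUs left after jobs `a` and `b` are placed.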

The most common way for an HPC job to use more than one cluster node is via the Message Passing Interface (MPI). MPI is a specification for the developers and users of message passing libraries. MPI allows you to launch a job running across the entire cluster with a single command, and it also provides application-level communication across the cluster.

MPI is a standardized and portable message-passing system that consists of a library and a protocol to support parallel computing. It enables passing information between the nodes of an HPC cluster or between clusters, and it has several implementations that provide the libraries needed to run HPC applications in a distributed manner across different physical nodes.
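
The message-passing pattern itself can be sketched without an MPI library. The following Python example is a single-process simulation, not MPI: threads stand in for ranks and a queue stands in for the interconnect. Each simulated rank sums its own slice of the data and sends the partial result to rank 0, which combines them. The name `run_ranks` is hypothetical; in a real MPI program the ranks would be separate processes started by the MPI launcher and would communicate through the MPI library.

```python
import queue
import threading


def run_ranks(size, data):
    """Simulate `size` ranks that each sum a slice of `data`.

    Every rank other than 0 sends its partial sum to rank 0's mailbox;
    rank 0 receives the messages and adds them to its own partial sum.
    """
    mailbox = queue.Queue()  # stands in for the network link to rank 0
    result = {}

    def rank_main(rank):
        # Each rank works only on its own strided slice of the data.
        partial = sum(data[rank::size])
        if rank == 0:
            total = partial
            for _ in range(size - 1):
                _sender, value = mailbox.get()  # blocking receive
                total += value
            result["total"] = total
        else:
            mailbox.put((rank, partial))  # send to rank 0

    threads = [threading.Thread(target=rank_main, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["total"]


if __name__ == "__main__":
    # Four simulated ranks cooperate to sum the integers 1 through 100.
    print(run_ranks(4, list(range(1, 101))))
```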

Red Hat provides the following MPI implementations:

  • Open MPI - provided in the openmpi package; an open source and freely available implementation of both the MPI-1 and MPI-2 standards; combines technologies and resources from several other projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI)
  • MPICH - provided in the mpich package; a high-performance and widely portable implementation of the MPI-1, MPI-2, and MPI-3 standards
  • MVAPICH - provided in the mvapich package; an implementation of MPICH with InfiniBand support

The openmpi, mpich and mvapich packages are included in Red Hat Enterprise Linux.

Red Hat recommends Open MPI as the MPI implementation for HPC clusters.

HPC systems often replace the MPI libraries that ship in RHEL with vendor-specific versions that take full advantage of high-performance interconnects such as InfiniBand.

High Performance Computing on Red Hat Enterprise Linux

The Red Hat HPC offering is a special use case that addresses HPC clusters cost-effectively. It is based on standard RHEL Server components and uses standard installation and entitlement. To be eligible for HPC, the workload must be non-interactive and externally scheduled, and it is usually bound by computational resources.

Two separate subscriptions are needed to build a RHEL HPC cluster, one for each type of node:

  • Head Nodes - control the entire cluster; used for management, job control, and launching jobs across the cluster
  • Compute Nodes - perform the actual HPC calculations

An HPC cluster must include at least one Head Node and one Compute Node.
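
As a small illustration of this rule, the following Python sketch checks a planned cluster inventory before installation. The inventory format (a mapping of hostname to role), the role names `head` and `compute`, and the function name `check_cluster` are assumptions made for this example.

```python
from collections import Counter

VALID_ROLES = {"head", "compute"}


def check_cluster(inventory):
    """Return a list of problems found in a {hostname: role} inventory."""
    problems = []
    for host, role in inventory.items():
        if role not in VALID_ROLES:
            problems.append(f"{host}: unknown role {role!r}")
    counts = Counter(inventory.values())
    if counts["head"] < 1:
        problems.append("cluster needs at least one head node")
    if counts["compute"] < 1:
        problems.append("cluster needs at least one compute node")
    return problems


if __name__ == "__main__":
    plan = {"head01": "head", "compute01": "compute", "compute02": "compute"}
    print(check_cluster(plan) or "inventory looks valid")
```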

How to install a head node and a compute node

Previous versions of RHEL for HPC were built using subsets of RHEL Server. In RHEL 8, both head and compute nodes are based on the standard Red Hat Enterprise Linux Server software stack. HPC systems are identified at installation time by subscribing each system with an HPC Head Node or HPC Compute Node entitlement.

A head node is a Red Hat Enterprise Linux Server subscribed to one of the head node entitlements.
A compute node is a Red Hat Enterprise Linux Server subscribed to one of the compute node entitlements.

  1. Download the ISO image file of Red Hat Enterprise Linux Server 8 from the Red Hat Customer Portal.

    Download Red Hat Enterprise Linux Server.

    Use one of two basic types of installation media:

    • Binary DVD is a full installation image which can be used to boot the installation program and perform an entire installation without additional package repositories.

    • Boot ISO is a minimal boot image which can be used to boot the installation program. The boot ISO is smaller than the full installation image, but the installation from the boot ISO is a network-based installation that requires the creation of a local repository from which the software will be installed. For more information on working with yum repositories, see Configuring yum and yum repositories.

    See Performing a standard RHEL installation for further information on the installation media.

  2. Install Red Hat Enterprise Linux Server 8.

    Once you have downloaded the ISO, create an installation CD or DVD, or use USB media.

    There are several possible ways of installing Red Hat Enterprise Linux 8.

    Red Hat recommends installing with the graphical user interface, which guides you through the whole installation process.

    Another option is to perform a scripted installation using a Kickstart file, as described in Performing an advanced RHEL installation.

  3. Register and subscribe the head node and all compute nodes.

    A head node must be subscribed to one of the head node entitlements. Apart from your Red Hat Enterprise Linux 8 subscription for a head node, you do not need any additional subscriptions.

    You can use the Registration Assistant Red Hat Customer Portal Lab or follow the instructions in Configuring and managing basic system settings to register and subscribe your system.

    Each compute node must be subscribed to one of the compute node entitlements. Apart from your Red Hat Enterprise Linux 8 subscription for a compute node, you do not need any additional subscriptions.

    If you use Red Hat Satellite, you can register all compute nodes and entitle them to Red Hat Satellite provisioning. Provisioning refers to a process that starts with a bare physical or virtual machine and ends with a fully configured, ready-to-use operating system. Red Hat Satellite provides the ability to define and automate provisioning for a large number of hosts. This way, you can handle installation of the operating system, registration, subscription management, and package and patch management for all compute nodes.

    For basic information on Red Hat Satellite, see The Red Hat Satellite Quick Start Guide. If you want to know more about provisioning with Red Hat Satellite, see The Red Hat Satellite Provisioning Guide. All documentation related to Red Hat Satellite can be found on the Red Hat Customer Portal.
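
If you register nodes from the command line rather than through the Registration Assistant or Red Hat Satellite, the registration step can be scripted per node role. The following Python sketch only builds and prints the `subscription-manager` commands; it does not run them. The pool IDs, the user name `example-user`, and the names `POOL_IDS` and `registration_commands` are placeholders for illustration; look up the actual pool IDs for your head node and compute node entitlements with `subscription-manager list --available`.

```python
import shlex

# Placeholder pool IDs: replace them with the IDs reported by
# `subscription-manager list --available` for your account.
POOL_IDS = {
    "head": "HEAD_NODE_POOL_ID",
    "compute": "COMPUTE_NODE_POOL_ID",
}


def registration_commands(role, username):
    """Build the commands that register a node and attach its entitlement."""
    if role not in POOL_IDS:
        raise ValueError(f"unknown node role: {role!r}")
    return [
        # The password is prompted for interactively when it is not given.
        ["subscription-manager", "register", f"--username={username}"],
        ["subscription-manager", "attach", f"--pool={POOL_IDS[role]}"],
    ]


if __name__ == "__main__":
    for role in ("head", "compute"):
        print(f"# on each {role} node:")
        for cmd in registration_commands(role, "example-user"):
            print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running them on each node, for example over SSH or from a Kickstart `%post` section.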

Communication between head node and compute nodes

To run jobs on an HPC cluster, you need to install a scheduler or a manager for the HPC cluster. The RHEL for HPC offering does not include an HPC scheduler. You can choose a third-party scheduler, such as SLURM (open source), PBS Pro, Condor, MOAB, or Spectrum LSF. Install the scheduler of your choice, and configure the connectivity between the head node and the compute nodes according to the instructions of the scheduler.
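
As an illustration of what submitting work to a scheduler looks like, the following Python sketch renders a minimal batch script, assuming SLURM is the scheduler you chose. The function name `render_batch_script`, the job name, and the application path `./my_mpi_app` are hypothetical; the `#SBATCH` directives shown (`--job-name`, `--nodes`, `--ntasks-per-node`) are standard SLURM options, but check the documentation of your scheduler version before relying on them.

```python
def render_batch_script(job_name, nodes, tasks_per_node, command):
    """Render a minimal SLURM batch script as a string."""
    if nodes < 1 or tasks_per_node < 1:
        raise ValueError("nodes and tasks_per_node must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        "",
        # srun launches the command on the resources allocated to the job.
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: 2 compute nodes, 4 tasks on each.
    print(render_batch_script("demo", 2, 4, "./my_mpi_app"), end="")
```

The resulting script would typically be saved to a file on the head node and submitted with `sbatch`; other schedulers use their own directive syntax and submission commands.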
