CRIU - Checkpoint/Restore in user space


Introduction

Checkpoint/restore technology lets you save the current state of a process and then restore it later to its previous state (before checkpointing).

All information related to the checkpointed process is stored in one or more image files. These image files contain information about the process, such as memory pages, file descriptors, inter-process communication, and so on. You can restore a process on the same system or on another system.

Checkpoint/restore originated in High Performance Computing (HPC). It was particularly valuable in HPC environments where a single application might be distributed across hundreds or thousands of cores. In HPC, the failure of a single component can lead to data loss, and the CPU cycles of those hundreds or thousands of cores are then wasted.

Checkpoint/restore modes of operation

Checkpoint/restore can be used in different ways to avoid data loss caused by failures. Here are some examples:

  • The application can checkpoint/restore itself to store its current state.
  • The application can be checkpointed/restored in a “semi-transparent” mode by intercepting system calls.
  • The application can be checkpointed/restored in fully transparent mode at the operating system level.

The advantage of fully transparent checkpoint/restore at the operating system level is that it has no prerequisites. The application does not need to be linked against special libraries, and no specially prepared environment is needed to intercept system calls. However, this approach requires a more complex checkpoint/restore tool.

Checkpoint/restore implementations

Many different checkpoint/restore implementations are available, but for many years they have not been easily available in most Linux distributions. Most Linux implementations have been too limited in their functionality or only useful to a limited audience.

With the rising interest in Linux container technology, checkpoint/restore has begun attracting more attention. You can use checkpointing and restoring a process as a means of fault tolerance. You can also use it for load balancing by migrating a running process from one system to another.

Migrating a running process is nothing more than checkpointing a process, transferring it to the destination system, and restoring the process to its original state. Checkpoint/restore technology can restore a whole process group. As a result, checkpoint/restore could become the perfect base technology for container migration.
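
The three steps described above can be sketched as a small shell function. This is only an illustration under stated assumptions, not a tested migration procedure: the helper names run and migrate are hypothetical, rsync and ssh are just one possible way to transfer the images, CRIU is assumed to be installed on both systems, and the files used by the process are assumed to exist at the same paths on the destination. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical sketch: migrate PID to DEST using image directory DIR.
#   1. checkpoint the process into DIR
#   2. transfer the image files to the destination system
#   3. restore the process on the destination system
migrate() {
    pid=$1; dest=$2; dir=$3
    run mkdir -p "$dir" &&
    run criu dump -D "$dir" -t "$pid" &&
    run rsync -a "$dir/" "$dest:$dir/" &&
    run ssh "$dest" criu restore -D "$dir" --restore-detached
}
```

For example, running DRY_RUN=1 migrate 1234 host2 /tmp/ckpt lists the four commands without touching the process; 1234, host2, and /tmp/ckpt are placeholder values.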

Architecture

Early implementations of checkpoint/restore did not focus on upstream inclusion. As a result, there was no agreement in the Linux kernel community on the design. This led to the adoption of solutions that were not officially accepted by the Linux community.

An in-kernel checkpoint/restore implementation was developed in cooperation with the Linux community. However, this approach became too complex to be integrated into the Linux kernel, and its development was abandoned.

To solve the problems of these earlier implementations, CRIU takes another approach. It implements as much functionality as possible in the user space and uses existing interfaces to implement checkpoint/restore successfully.

One of the most important kernel interfaces for CRIU is the ptrace interface. CRIU relies on being able to seize the process via ptrace. Then, it injects parasite code to dump the memory pages of the process into image files from within the process's address space.

For each checkpointed part of the process, separate image files are created. Information about memory pages, for example, is collected from /proc/$PID/smaps, /proc/$PID/map_files/, and /proc/$PID/pagemap.
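
To get a feeling for the kind of information these files expose, the following sketch summarizes /proc/$PID/smaps for a process. It is not how CRIU itself reads the data; the function name mapping_summary is hypothetical, and the sketch only assumes the usual smaps layout (a header line per mapping followed by fields such as Rss).

```shell
# Print "address-range resident-kB path" for each memory mapping of a
# process, read from /proc/PID/smaps. The path is empty for anonymous
# mappings.
mapping_summary() {
    awk '
        /^[0-9a-f]+-[0-9a-f]+ / { range = $1; path = $6 }
        /^Rss:/                 { print range, $2, path }
    ' "/proc/$1/smaps"
}
```

Calling mapping_summary $$ prints one line per mapping of the current shell.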

The memory pages image files require the most storage space, especially compared to the remaining image files. The remaining image files contain additional information about the checkpointed process, such as opened files, credentials, registers, task state, and so on. To checkpoint a process tree (a process and all its child processes), CRIU checkpoints each connected child process.
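
As an illustration of what "a process and all its child processes" means, the following sketch walks a process tree using the PPid field of /proc/PID/status. The helper names children_of and process_tree are hypothetical, and CRIU collects the tree with its own mechanisms; this is only a way to preview which processes a checkpoint of a given PID would cover.

```shell
# Print the PIDs of all direct children of the given parent PID.
children_of() {
    parent=$1
    for status in /proc/[0-9]*/status; do
        # A process may exit while we scan; skip it in that case.
        ppid=$(awk '/^PPid:/ { print $2 }' "$status" 2>/dev/null) || continue
        if [ "$ppid" = "$parent" ]; then
            pid=${status#/proc/}
            echo "${pid%/status}"
        fi
    done
}

# Print the given PID followed by all of its descendants.
process_tree() {
    echo "$1"
    for child in $(children_of "$1"); do
        process_tree "$child"
    done
}
```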

To restore a process, CRIU uses the information gathered during checkpointing. Remember that you can restore a process only if it has the same PID (process ID) it had when it was originally checkpointed. If another process is using this PID, the restore fails.

One of the reasons that the process must be restored with the same PID is that parent-child process trees have to be restored exactly as they were. It is not possible to re-parent a process. To restore a process with the same PID, a newly introduced kernel interface is used to influence which PID the kernel gives to the next process.
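
Before attempting a restore, you can check whether the required PID is free. The sketch below uses hypothetical helper names. It also assumes that the kernel interface mentioned above is /proc/sys/kernel/ns_last_pid, which is not named in this article and is only present on kernels built with checkpoint/restore support.

```shell
# Succeeds if a process (or thread) with the given ID currently exists.
pid_in_use() {
    [ -d "/proc/$1" ]
}

# Print the PID most recently allocated in the current PID namespace.
# Assumption: the kernel exposes /proc/sys/kernel/ns_last_pid; the
# function fails if the file is not available.
last_allocated_pid() {
    [ -r /proc/sys/kernel/ns_last_pid ] && cat /proc/sys/kernel/ns_last_pid
}
```

For example, pid_in_use 1234 && echo "restore would fail" reports a conflict for the placeholder PID 1234.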

If the process just created with clone() has the correct PID, CRIU transforms it into the same state the process was in before being checkpointed. Files are opened and positioned as they were before, memory is restored to the same state, and all other remaining information from the image files is used to restore the process. Once the state is restored, the remaining parts of the restorer are removed. Then, the restored process resumes control and continues from the point at which it was previously checkpointed.
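
The phrase "files are opened and positioned as they were before" refers to state that you can inspect through /proc. The following sketch, with the hypothetical function name open_files, lists each open file descriptor of a process together with its target and current file offset. It only illustrates the kind of state that has to be recorded and restored, not CRIU's own implementation.

```shell
# Print "fd target offset" for every open file descriptor of a process,
# using /proc/PID/fd and the "pos:" field of /proc/PID/fdinfo.
open_files() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        n=${fd##*/}
        # Skip descriptors that disappeared or cannot be read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        pos=$(awk '/^pos:/ { print $2 }' "/proc/$pid/fdinfo/$n" 2>/dev/null)
        echo "$n $target $pos"
    done
}
```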

Configuration

Because Red Hat Enterprise Linux 7.2 contains CRIU as a Technology Preview, you do not need to configure it further, other than installing the CRIU package:

# yum install criu

Usage

Once the CRIU package has been installed, processes can be dumped with the following command:

# criu dump -D /path/to/image-dir -t PID

This checkpoints the process with the indicated PID to the directory /path/to/image-dir. To restore the process from the image files in /path/to/image-dir use:

# criu restore -D /path/to/image-dir

These examples show CRIU in its simplest form. Depending on the process to be checkpointed and restored, you may need to use other command-line options.
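
As a sketch of what such additional options can look like, the following functions checkpoint and restore a process that was started from an interactive shell. The function names and the log file names are hypothetical. The options shown (--shell-job, --leave-running, -v4, and -o) are CRIU options, but whether you need them depends on the process and on the CRIU version, so treat this as a starting point rather than a recommended invocation.

```shell
# Allow the criu binary to be overridden, for example for testing.
CRIU=${CRIU:-criu}

# Checkpoint a process started from a shell and keep it running.
#   --shell-job      the process is attached to a terminal session
#   --leave-running  do not stop the process after the dump
#   -v4 -o dump.log  write a verbose log to dump.log
checkpoint_job() {
    pid=$1; dir=$2
    mkdir -p "$dir" || return 1
    "$CRIU" dump -D "$dir" -t "$pid" --shell-job --leave-running -v4 -o dump.log
}

# Restore a shell job from the image files in the given directory.
restore_job() {
    dir=$1
    "$CRIU" restore -D "$dir" --shell-job -v4 -o restore.log
}
```

If a dump or restore fails, the log file is usually the first place to look for the reason.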

Limitations

One general limitation is that CRIU can checkpoint and restore processes that use inter-process communication (IPC) only if the processes are running inside an IPC namespace.
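
To see which IPC namespace a process belongs to, you can read the /proc/PID/ns/ipc symbolic link. The helper names in the following sketch are hypothetical, and reading the link for processes owned by other users may require root privileges.

```shell
# Print the IPC namespace identifier of a process, in the form
# "ipc:[inode-number]".
ipc_namespace() {
    readlink "/proc/$1/ns/ipc"
}

# Succeeds if both processes are in the same IPC namespace.
# Fails if either namespace cannot be determined.
same_ipc_namespace() {
    a=$(ipc_namespace "$1") && b=$(ipc_namespace "$2") && [ "$a" = "$b" ]
}
```

One way to start a process in a new IPC namespace is the unshare --ipc command from util-linux, which typically requires root privileges.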

Existing parent-child relations in process trees must be kept intact. This means that CRIU always checkpoints and restores a parent process and all its child processes. It is not possible to checkpoint and restore a parent process on its own.

This limitation is related to the requirement that the PID must stay the same. A CRIU restore process fails if the intended PID is in use.

For a successful migration, the libraries used by the process must be exactly the same version on both the source and destination systems. This limitation exists because the process already has all required libraries loaded and expects the functions provided by those libraries to be at the same addresses as they were when the process started.
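
One way to check this requirement before a migration is to compare checksums of the libraries the process has mapped. The following sketch uses the hypothetical function name mapped_libs and assumes that sha256sum is available and that library paths contain no spaces. It is a rough pre-flight check, not something CRIU does for you.

```shell
# Print "sha256  path" for every shared library mapped by a process,
# based on the path column of /proc/PID/maps.
mapped_libs() {
    awk '$6 ~ /\.so/ { print $6 }' "/proc/$1/maps" | sort -u |
    while read -r lib; do
        # Skip libraries that were deleted or are not readable.
        if [ -r "$lib" ]; then
            sha256sum "$lib"
        fi
    done
}
```

You could save the output of mapped_libs for the process on the source system, run sha256sum on the same paths on the destination system, and compare the results; any difference suggests that a restore on the destination is likely to fail or misbehave.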

Use cases

One of the main use cases of CRIU is to migrate a Linux container. Depending on the applied container technology, checkpointing and restoring with the help of CRIU might already be included.

References

http://criu.org/
