Chapter 2. Introduction to cloud-init

The cloud-init utility automates the initialization of cloud instances during system boot. You can configure cloud-init to perform a variety of tasks:

  • Configuring a host name
  • Installing packages on an instance
  • Running scripts
  • Suppressing default virtual machine (VM) behavior
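
As a rough illustration of the tasks listed above, the following cloud-config sketch sets a host name, installs a package, and runs a command. The host name, package, and command are placeholder values chosen for this example, not defaults.

```yaml
#cloud-config
# Illustrative user data; every value below is a placeholder.

# Set the host name of the instance.
hostname: example-host

# Install packages on the instance.
packages:
  - httpd

# Run a command late in the boot process.
runcmd:
  - systemctl enable --now httpd
```

Which directives are honored depends on the modules enabled in the image and on the cloud platform.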

Prerequisites

The cloud-init package is available in various types of RHEL images. For example:

  • If you download a KVM guest image from the Red Hat Customer Portal, the image comes preinstalled with the cloud-init package. After you launch the instance, the cloud-init package becomes enabled. KVM guest images on the Red Hat Customer Portal are intended for use with Red Hat Virtualization (RHV), the Red Hat OpenStack Platform (RHOSP), and Red Hat OpenShift Virtualization.
  • You can also download the RHEL ISO image from the Red Hat Customer Portal to create a custom guest image. In this case, you need to install the cloud-init package on the customized guest image.
  • If you need to use an image from a cloud service provider (for example, AWS or Azure), use the RHEL image builder to create the image. Image builder images are customized for specific cloud providers, and some image types include cloud-init already installed.

Most cloud platforms support cloud-init, but configuration procedures and supported options vary. Alternatively, you can configure cloud-init for the NoCloud environment.

In addition, you can configure cloud-init on one VM and then use that VM as a template to create additional VMs or clusters of VMs.

Specific Red Hat products, for example, Red Hat Virtualization, have documented procedures to configure cloud-init for those products.

2.1. Overview of the cloud-init configuration

The cloud-init utility uses YAML-formatted configuration files to apply user-defined tasks to instances. When an instance boots, the cloud-init service starts and executes the instructions from the YAML file. Depending on the configuration, tasks complete either during the first boot or on subsequent boots of the VM.

To define the specific tasks, configure the /etc/cloud/cloud.cfg file and add directives under the /etc/cloud/cloud.cfg.d/ directory.

  • The cloud.cfg file includes directives for various system configurations, such as user access, authentication, and system information.

    The file also includes default and optional modules for cloud-init. These modules execute in order in the following phases:

      1. The cloud-init initialization phase
      2. The configuration phase
      3. The final phase

    In the cloud.cfg file, the modules for the three phases are listed under cloud_init_modules, cloud_config_modules, and cloud_final_modules, respectively.

  • You can add more directives for cloud-init in the cloud.cfg.d directory. When adding directives to the cloud.cfg.d directory, place them in a custom file with the .cfg extension, and always include #cloud-config at the top of the file.
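
A minimal sketch of such a drop-in file follows. The file name 99_custom.cfg is hypothetical, and the directive is only one example of a setting you might place there.

```yaml
#cloud-config
# Hypothetical file: /etc/cloud/cloud.cfg.d/99_custom.cfg
# Keep the host name that is already set on the system instead of
# letting cloud-init change it.
preserve_hostname: true
```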

2.2. cloud-init operates in stages

During system boot, the cloud-init utility operates in five stages that determine whether cloud-init runs and where it finds its datasources, among other tasks. The stages are as follows:

  1. Generator stage: By using systemd, this stage determines whether to run the cloud-init utility at boot time.
  2. Local stage: cloud-init searches local datasources and applies network configuration, including the DHCP-based fallback mechanism.
  3. Network stage: cloud-init processes user data by running modules listed under cloud_init_modules in the /etc/cloud/cloud.cfg file. You can add, remove, enable, or disable modules in the cloud_init_modules section.
  4. Config stage: cloud-init runs modules listed under the cloud_config_modules section in the /etc/cloud/cloud.cfg file. You can add, remove, enable, or disable modules in the cloud_config_modules section.
  5. Final stage: cloud-init runs modules and configurations included in the cloud_final_modules section of the /etc/cloud/cloud.cfg file. This stage can include the installation of specific packages, as well as triggering configuration management plug-ins and user-defined scripts. You can add, remove, enable, or disable modules in the cloud_final_modules section.
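
The three module sections named in the stages above might look like the following abridged sketch of /etc/cloud/cloud.cfg. The selection of modules is illustrative only; the modules shipped by default, and whether their names are spelled with hyphens or underscores, vary with the cloud-init version in your image.

```yaml
# Abridged, illustrative excerpt of /etc/cloud/cloud.cfg.
# The module lists in a real image are longer and may differ.

# Modules that run in the network stage.
cloud_init_modules:
  - bootcmd
  - write_files
  - set_hostname
  - users_groups
  - ssh

# Modules that run in the config stage.
cloud_config_modules:
  - locale
  - set_passwords
  - timezone
  - runcmd

# Modules that run in the final stage.
cloud_final_modules:
  - package_update_upgrade_install
  - scripts_user
  - final_message
```

Removing an entry from a list, or commenting it out, keeps that module from running in the corresponding stage.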

2.3. cloud-init modules execute in phases

When cloud-init runs, it executes the modules within cloud.cfg in order within three phases:

  1. The network phase (cloud_init_modules)
  2. The configuration phase (cloud_config_modules)
  3. The final phase (cloud_final_modules)

When cloud-init runs for the first time on a VM, all the modules you have configured run in their respective phases. On subsequent runs of cloud-init, whether a module runs within a phase depends on the frequency setting of that module. Some modules run every time cloud-init runs; some modules run only the first time cloud-init runs, even if the instance ID changes.

Note

An instance ID uniquely identifies an instance. When an instance ID changes, cloud-init treats the instance as a new instance.

The possible module frequency values are as follows:

  • Per instance means that the module runs on first boot of an instance. For example, if you clone an instance or create a new instance from a saved image, the modules designated as per instance run again.
  • Per once means that the module runs only once. For example, if you clone an instance or create a new instance from a saved image, the modules designated per once do not run again on those instances.
  • Per always means the module runs on every boot.

Note

You can override a module’s frequency when you configure the module or by using the command line.
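
One way to express such an override in the configuration is to list the module together with a frequency in its module section, as in the following sketch. The choice of the scripts_user module is only an example, and the module name spelling can differ between cloud-init versions.

```yaml
# Illustrative override in /etc/cloud/cloud.cfg or a cloud.cfg.d file:
# run the scripts_user module on every boot instead of at its
# default frequency.
cloud_final_modules:
  - [scripts_user, always]

# A comparable one-off override from the command line (not YAML):
#   cloud-init single --name scripts_user --frequency always
```

Because this sketch defines cloud_final_modules, a real configuration would also need to list the other final-stage modules you want to keep.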

2.4. cloud-init acts upon user data, metadata, and vendor data

The datasources that cloud-init consumes are user data, metadata, and vendor data.

  • User data includes directives you specify in the cloud.cfg file and in the cloud.cfg.d directory. For example, user data can include files to run, packages to install, and shell scripts. Refer to the cloud-init Documentation section User-Data Formats for information about the types of user data that cloud-init allows.
  • Metadata includes data associated with a specific datasource. For example, metadata can include a server name and instance ID. If you are using a specific cloud platform, the platform determines where your instances find user data and metadata. Your platform might require that you add metadata and user data to an HTTP service; in this case, when cloud-init runs, it consumes metadata and user data from the HTTP service.
  • Vendor data is optionally provided by the organization (for example, a cloud provider) and includes information that can customize the image to better fit the environment where the image runs. cloud-init acts upon optional vendor data and user data after it reads any metadata and initializes the system. By default, vendor data runs on the first boot. You can disable vendor data execution.

    Refer to the cloud-init Documentation section Instance Metadata for a description of metadata; Datasources for a list of datasources; and Vendor Data for more information about vendor data.
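
As a minimal sketch of disabling vendor data execution, user data can carry a vendor_data directive such as the following. Whether your platform supplies vendor data at all depends on the cloud provider.

```yaml
#cloud-config
# Tell cloud-init not to act on vendor data supplied by the platform.
vendor_data:
  enabled: false
```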

2.5. cloud-init identifies the cloud platform

cloud-init attempts to identify the cloud platform by using the ds-identify script. The script runs on the first boot of an instance.

Adding a datasource directive can save time when cloud-init runs because cloud-init does not need to probe for other platforms. Add the directive in the /etc/cloud/cloud.cfg file or in a file in the /etc/cloud/cloud.cfg.d directory. For example:

datasource_list: [Ec2]

Beyond adding the directive for your cloud platform, you can further configure cloud-init by adding additional configuration details, such as metadata URLs.

datasource_list: [Ec2]
datasource:
  Ec2:
    metadata_urls: ['http://169.254.169.254']

After cloud-init runs, you can view a log file (/run/cloud-init/ds-identify.log) that provides detailed information about the platform.

2.6. Additional resources