Chapter 8. Pre-caching glance images

When you configure OpenStack Compute to use local ephemeral storage, glance images are cached to speed up the deployment of instances. If an image that an instance requires is not already cached, it is downloaded to the local disk of the Compute node when you create the instance.

The process of downloading a glance image takes a variable amount of time, depending on the image size and network characteristics such as bandwidth and latency.

If you attempt to start an instance and the image is not available on the local Ceph cluster, the launch fails with the following message:

Build of instance 3c04e982-c1d1-4364-b6bd-f876e399325b aborted: Image 20c5ff9d-5f54-4b74-830f-88e78b9999ed is unacceptable: No image locations are accessible

You see the following in the Compute service log:

Image %s is not on my ceph and [workarounds]/never_download_image_if_on_rbd=True; refusing to fetch and upload.
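
To confirm that a failed launch was caused by this safeguard, you can search the Compute service log for the refusal message. The following is a minimal sketch, not a supported tool: the function name refused_images is hypothetical, it assumes that the %s placeholder is filled with the image ID, and the log path in the usage comment is an assumption that may differ in your deployment.

```shell
# Sketch: print the IDs of images that the Compute service refused to
# download because they were missing from the local Ceph cluster.
# The function name is hypothetical.
refused_images() {
    local logfile="$1"
    # Match the fixed part of the message, then extract the 36-character
    # image ID that follows the word "Image".
    grep -F 'refusing to fetch and upload' "$logfile" \
        | sed -n 's/.*Image \([0-9a-f-]\{36\}\) is not on my ceph.*/\1/p' \
        | sort -u
}

# Usage (log path is an assumption for a containerized Compute node):
#   refused_images /var/log/containers/nova/nova-compute.log
```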

The instance fails to start because the never_download_image_if_on_rbd parameter in the [workarounds] section of the nova.conf configuration file is set to true by default for DCN deployments. You can control this value with the heat parameter NovaDisableImageDownloadToRbd, which you can find in the dcn-hci.yaml file.

If you set the value of NovaDisableImageDownloadToRbd to false prior to deploying the overcloud, the following occurs:

  • The Compute service (nova) will automatically stream images available at the central location if they are not available locally.
  • Instances will not use a copy-on-write (COW) copy of the glance image.
  • The Compute (nova) storage will potentially contain multiple copies of the same image, depending on the number of instances using it.
  • You may saturate both the WAN link to the central location and the nova storage pool.
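
To check which behavior a Compute node is currently configured for, you can read the option from the [workarounds] section of its nova.conf file. The following is a minimal sketch: the function name workaround_value is hypothetical, and the nova.conf path in the usage comment is an assumption for a containerized deployment.

```shell
# Sketch: print the value of a key from the [workarounds] section of an
# INI-style file. Commented-out lines and other sections are ignored.
# The function name is hypothetical.
workaround_value() {
    local conf="$1" key="$2"
    awk -v key="$key" '
        # Track whether the current section is [workarounds].
        /^\[/ { in_section = ($0 == "[workarounds]"); next }
        in_section && ($0 ~ ("^[ \t]*" key "[ \t]*=")) {
            sub(/^[^=]*=[ \t]*/, "")   # strip "key = " and keep the value
            print
            exit
        }
    ' "$conf"
}

# Usage (path is an assumption and may differ in your deployment):
#   workaround_value \
#     /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf \
#     never_download_image_if_on_rbd
```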

Red Hat recommends that you leave this value set to true and ensure that required images are available locally before you launch an instance. For more information on making images available to the edge, see Section 7.3, “Copying an image to a new site”.

For images that are local, you can speed up the creation of VMs by using the tripleo_nova_image_cache.yml ansible playbook to pre-cache commonly used images or images that are likely to be deployed in the near future.

8.1. Running the tripleo_nova_image_cache.yml ansible playbook

Procedure

  1. Create an ansible inventory file for the stack. You can specify multiple stacks in a comma-delimited list to cache images at more than one site:

    $ source stackrc
    $ tripleo-ansible-inventory --plan central,dcn0,dcn1 \
    --static-yaml-inventory inventory.yaml
  2. Create a list of image IDs that you want to pre-cache:

    1. Retrieve a comprehensive list of available images:

      $ source centralrc
      
      $ openstack image list
      +--------------------------------------+---------+--------+
      | ID                                   | Name    | Status |
      +--------------------------------------+---------+--------+
      | 07bc2424-753b-4f65-9da5-5a99d8383fe6 | image_0 | active |
      | d5187afa-c821-4f22-aa4b-4e76382bef86 | image_1 | active |
      +--------------------------------------+---------+--------+
    2. Create an ansible playbook argument file called nova_cache_args.yml, and add the IDs of the images that you want to pre-cache:

      ...
      tripleo_nova_image_cache_images:
        - id: 07bc2424-753b-4f65-9da5-5a99d8383fe6
        - id: d5187afa-c821-4f22-aa4b-4e76382bef86
  3. Run the tripleo_nova_image_cache.yml ansible playbook:

    $ ansible-playbook -i inventory.yaml \
    --extra-vars "@nova_cache_args.yml" \
    /usr/share/ansible/tripleo-playbooks/tripleo_nova_image_cache.yml
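
When the list of images is long, you can generate the argument file from step 2 instead of editing it by hand. The following is a minimal sketch: the function name write_cache_args is hypothetical, and it writes only the tripleo_nova_image_cache_images list, so any other content that your argument file needs must be added separately.

```shell
# Sketch: write a playbook argument file from image IDs read on stdin,
# one ID per line. The function name is hypothetical.
write_cache_args() {
    local outfile="$1" id
    {
        echo "tripleo_nova_image_cache_images:"
        while IFS= read -r id; do
            # Skip blank lines so they do not become empty list entries.
            if [ -n "$id" ]; then
                echo "  - id: $id"
            fi
        done
    } > "$outfile"
}

# Usage: this example caches every image that is listed. Filter the
# list first if you want to pre-cache only some images.
#   source centralrc
#   openstack image list -f value -c ID | write_cache_args nova_cache_args.yml
```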

8.2. Performance considerations

You can specify the number of images that you want to download concurrently with the ansible forks parameter, which defaults to a value of 5. You can reduce the time needed to distribute images by increasing the value of the forks parameter, but you must balance this against the resulting increase in network and glance-api load.

You can find the forks parameter in the configuration file for ansible, ansible.cfg:

$ grep forks /etc/ansible/ansible.cfg
#forks          = 5
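
In the output above the forks line is commented out, so the default of 5 applies. The following is a minimal sketch of a helper that reports the value a configuration file sets: the function name effective_forks is hypothetical, and the sketch reads only the file that you pass to it, so it does not account for other ansible configuration sources.

```shell
# Sketch: print the forks value set in an ansible configuration file,
# or the default of 5 when no uncommented forks line exists.
# The function name is hypothetical.
effective_forks() {
    local cfg="$1" value
    # Keep the last uncommented "forks = N" line, if there is one.
    value=$(sed -n 's/^[[:space:]]*forks[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$cfg" | tail -n 1)
    echo "${value:-5}"
}

# Usage:
#   effective_forks /etc/ansible/ansible.cfg
#
# To change the value for a single run without editing ansible.cfg, you
# can pass the --forks option to ansible-playbook, for example:
#   ansible-playbook --forks 10 -i inventory.yaml \
#     --extra-vars "@nova_cache_args.yml" \
#     /usr/share/ansible/tripleo-playbooks/tripleo_nova_image_cache.yml
```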

8.3. Optimizing the image distribution to DCN sites

You can reduce WAN traffic by using a proxy for glance image distribution. When you configure a proxy:

  • Glance images are downloaded to a single Compute node that acts as the proxy.
  • The proxy redistributes the glance image to other Compute nodes in the inventory.

You can place the following parameters in the nova_cache_args.yml ansible argument file to configure a proxy node.

Set the tripleo_nova_image_cache_use_proxy parameter to true to enable the image cache proxy.

The image proxy uses secure copy (scp) to distribute images to other nodes in the inventory. SCP is inefficient over networks with high latency, such as a WAN between DCN sites. Red Hat recommends that you limit the playbook target to a single DCN location, which correlates to a single stack.

Use the tripleo_nova_image_cache_proxy_hostname parameter to select the image cache proxy. The default proxy is the first Compute node in the ansible inventory file. Use the tripleo_nova_image_cache_plan parameter to limit the playbook inventory to a single site:

tripleo_nova_image_cache_use_proxy: true
tripleo_nova_image_cache_proxy_hostname: dcn0-novacompute-1
tripleo_nova_image_cache_plan: dcn0
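
Because the recommendation is to target one site per run, it can be convenient to keep the proxy settings in a separate argument file for each site. The following is a minimal sketch: the function name write_proxy_args and the file name proxy_args_dcn0.yml are hypothetical. It relies on ansible-playbook accepting the --extra-vars option more than once.

```shell
# Sketch: write the proxy parameters for one site to an argument file.
# The function name is hypothetical.
write_proxy_args() {
    local outfile="$1" plan="$2" proxy_host="${3:-}"
    {
        echo "tripleo_nova_image_cache_use_proxy: true"
        echo "tripleo_nova_image_cache_plan: $plan"
        # Omit the hostname to keep the default proxy, which is the
        # first Compute node in the inventory.
        if [ -n "$proxy_host" ]; then
            echo "tripleo_nova_image_cache_proxy_hostname: $proxy_host"
        fi
    } > "$outfile"
}

# Usage:
#   write_proxy_args proxy_args_dcn0.yml dcn0 dcn0-novacompute-1
#   ansible-playbook -i inventory.yaml \
#     --extra-vars "@nova_cache_args.yml" \
#     --extra-vars "@proxy_args_dcn0.yml" \
#     /usr/share/ansible/tripleo-playbooks/tripleo_nova_image_cache.yml
```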

8.4. Configuring the nova-cache cleanup

A background process runs periodically to remove images from the nova cache when both of the following conditions are true:

  • The image is not in use by an instance.
  • The age of the image is greater than the value for the nova parameter remove_unused_original_minimum_age_seconds.

The default value for the remove_unused_original_minimum_age_seconds parameter is 86400. The value is expressed in seconds and is equal to 24 hours. You can control this value with the NovaImageCacheTTL tripleo-heat-templates parameter during the initial deployment, or during a stack update of your cloud:

parameter_defaults:
  NovaImageCacheTTL: 604800 # Default to 7 days for all compute roles
  Compute2Parameters:
    NovaImageCacheTTL: 1209600 # Override to 14 days for the Compute2 compute role

When you instruct the playbook to pre-cache an image that already exists on a Compute node, ansible does not report a change, but the age of the image is reset to 0. To maintain a cache of images, run the playbook at intervals shorter than the value of the NovaImageCacheTTL parameter.
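
The arithmetic behind these values can be checked with a short script. The following is a minimal sketch: the function names days_to_seconds and refresh_keeps_cache are hypothetical, and the check assumes that every playbook run completes and resets the age of each listed image.

```shell
# Sketch: convert a number of days to seconds (1 day = 86400 seconds).
# The function name is hypothetical.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# Sketch: succeed when re-running the playbook every <interval_days>
# days is more frequent than a TTL of <ttl_seconds> seconds.
# The function name is hypothetical.
refresh_keeps_cache() {
    local interval_days="$1" ttl_seconds="$2"
    [ "$(days_to_seconds "$interval_days")" -lt "$ttl_seconds" ]
}

# Usage:
#   days_to_seconds 7               # value for a 7-day NovaImageCacheTTL
#   refresh_keeps_cache 1 604800    # daily run against a 7-day TTL
```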