Chapter 3. Snapshots

A snapshot is a read-only copy of the state of an image at a particular point in time. One of the advanced features of Ceph block devices is that you can create snapshots of the images to retain a history of an image’s state. Ceph also supports snapshot layering, which allows you to clone images (for example a VM image) quickly and easily. Ceph supports block device snapshots using the rbd command and many higher level interfaces, including QEMU, libvirt,OpenStack and CloudStack.

Important

To use RBD snapshots, you must have a running Ceph cluster.

Note

If a snapshot is taken while I/O is still in progress in a image, the snapshot might not get the exact or latest data of the image and the snapshot may have to be cloned to a new image to be mountable. So, we recommend to stop I/O before taking a snapshot of an image. If the image contains a filesystem, the filesystem must be in a consistent state before taking a snapshot. To stop I/O you can use fsfreeze command. See fsfreeze(8) man page for more details. For virtual machines, qemu-guest-agent can be used to automatically freeze filesystems when creating a snapshot.

diag da77489ea8c396db63914ce19237174d

3.1. Cephx Notes

When cephx is enabled (it is by default), you must specify a user name or ID and a path to the keyring containing the corresponding key for the user. You may also add the CEPH_ARGS environment variable to avoid re-entry of the following parameters:

rbd --id {user-ID} --keyring=/path/to/secret [commands]
rbd --name {username} --keyring=/path/to/secret [commands]

For example:

rbd --id admin --keyring=/etc/ceph/ceph.keyring [commands]
rbd --name client.admin --keyring=/etc/ceph/ceph.keyring [commands]
Tip

Add the user and secret to the CEPH_ARGS environment variable so that you don’t need to enter them each time.

3.2. Snapshot Basics

The following procedures demonstrate how to create, list, and remove snapshots using the rbd command on the command line.

3.2.1. Creating Snapshots

To create a snapshot with rbd, specify the snap create option, the pool name and the image name:

rbd --pool {pool-name} snap create --snap {snap-name} {image-name}
rbd snap create {pool-name}/{image-name}@{snap-name}

For example:

rbd --pool rbd snap create --snap snapname foo
rbd snap create rbd/foo@snapname

3.2.2. Listing Snapshots

To list snapshots of an image, specify the pool name and the image name:

rbd --pool {pool-name} snap ls {image-name}
rbd snap ls {pool-name}/{image-name}

For example:

rbd --pool rbd snap ls foo
rbd snap ls rbd/foo

3.2.3. Rollbacking Snapshots

To rollback to a snapshot with rbd, specify the snap rollback option, the pool name, the image name and the snap name:

rbd --pool {pool-name} snap rollback --snap {snap-name} {image-name}
rbd snap rollback {pool-name}/{image-name}@{snap-name}

For example:

rbd --pool rbd snap rollback --snap snapname foo
rbd snap rollback rbd/foo@snapname
Note

Rolling back an image to a snapshot means overwriting the current version of the image with data from a snapshot. The time it takes to execute a rollback increases with the size of the image. It is faster to clone from a snapshot than to rollback an image to a snapshot, and it is the preferred method of returning to a pre-existing state.

3.2.4. Deleting Snapshots

To delete a snapshot with rbd, specify the snap rm option, the pool name, the image name and the snapshot name:

rbd --pool <pool-name> snap rm --snap <snap-name> <image-name>
rbd snap rm <pool-name-/<image-name>@<snap-name>

For example:

$ rbd --pool rbd snap rm --snap snapname foo
$ rbd snap rm rbd/foo@snapname
Important

If an image has any clones, the cloned images retain reference to the parent image snapshot. To delete the parent image snapshot, you must flatten the child images first. See Flattening a Cloned Image for details.

Note

Ceph OSD daemons delete data asynchronously, so deleting a snapshot does not free up the disk space immediately.

3.2.5. Purging Snapshots

To delete all snapshots for an image with rbd, specify the snap purge option and the image name:

rbd --pool {pool-name} snap purge {image-name}
rbd snap purge {pool-name}/{image-name}

For example:

rbd --pool rbd snap purge foo
rbd snap purge rbd/foo

3.2.6. Renaming Snapshots

To rename a snapshot:

rbd snap rename <pool-name>/<image-name>@<original-snapshot-name> <pool-name>/<image-name>@<new-snapshot-name>

Example

To rename the snap1 snapshot of the dataset image on the data pool to snap2:

$ rbd snap rename data/dataset@snap1 data/dataset@snap2

Execute the rbd help snap rename command to display additional details on renaming snapshots.

3.3. Layering

Ceph supports the ability to create many copy-on-write (COW) or copy-on-read (COR) clones of a block device snapshot. Snapshot layering enables Ceph block device clients to create images very quickly. For example, you might create a block device image with a Linux VM written to it; then, snapshot the image, protect the snapshot, and create as many clones as you like. A snapshot is read-only, so cloning a snapshot simplifies semantics—​making it possible to create clones rapidly.

diag 783bfb85b3c774ad7e507408a4bec167
Note

The terms parent and child mean a Ceph block device snapshot (parent), and the corresponding image cloned from the snapshot (child). These terms are important for the command line usage below.

Each cloned image (child) stores a reference to its parent image, which enables the cloned image to open the parent snapshot and read it. This reference is removed when the clone is flattened that is, when information from the snapshot is completely copied to the clone. For more information on flattening see Section 3.3.6, “Flattening Cloned Images”.

A clone of a snapshot behaves exactly like any other Ceph block device image. You can read to, write from, clone, and resize cloned images. There are no special restrictions with cloned images. However, the clone of a snapshot refers to the snapshot, so you MUST protect the snapshot before you clone it.

A clone of a snapshot can be a copy-on-write (COW) or copy-on-read (COR) clone. Copy-on-write (COW) is always enabled for clones while copy-on-read (COR) has to be enabled explicitly. Copy-on-write (COW) copies data from the parent to the clone when it writes to an unallocated object within the clone. Copy-on-read (COR) copies data from the parent to the clone when it reads from an unallocated object within the clone. Reading data from a clone will only read data from the parent if the object does not yet exist in the clone. Rados block device breaks up large images into multiple objects (defaults to 4 MB) and all copy-on-write (COW) and copy-on-read (COR) operations occur on a full object (that is writing 1 byte to a clone will result in a 4 MB object being read from the parent and written to the clone if the destination object does not already exist in the clone from a previous COW/COR operation).

Whether or not copy-on-read (COR) is enabled, any reads that cannot be satisfied by reading an underlying object from the clone will be rerouted to the parent. Since there is practically no limit to the number of parents (meaning that you can clone a clone), this reroute continues until an object is found or you hit the base parent image. If copy-on-read (COR) is enabled, any reads that fail to be satisfied directly from the clone result in a full object read from the parent and writing that data to the clone so that future reads of the same extent can be satisfied from the clone itself without the need of reading from the parent.

This is essentially an on-demand, object-by-object flatten operation. This is specially useful when the clone is in a high-latency connection away from it’s parent (parent in a different pool in another geographical location). Copy-on-read (COR) reduces the amortized latency of reads. The first few reads will have high latency because it will result in extra data being read from the parent (for example, you read 1 byte from the clone but now 4 MB has to be read from the parent and written to the clone), but all future reads will be served from the clone itself.

To create copy-on-read (COR) clones from snapshot you have to explicitly enable this feature by adding rbd_clone_copy_on_read = true under [global] or [client] section in your ceph.conf file.

Note

Ceph only supports cloning for format 2 images (created with rbd create --image-format 2), and is not yet supported by the kernel rbd module. So you MUST use QEMU/KVM or librbd directly to access clones in the current release.

3.3.1. Getting Started with Layering

Ceph block device layering is a simple process. You must have an image. You must create a snapshot of the image. You must protect the snapshot. Once you have performed these steps, you can begin cloning the snapshot.

diag af6f113d697245b04e39e7bd07fb897b

The cloned image has a reference to the parent snapshot, and includes the pool ID, image ID and snapshot ID. The inclusion of the pool ID means that you may clone snapshots from one pool to images in another pool.

  1. Image Template: A common use case for block device layering is to create a master image and a snapshot that serves as a template for clones. For example, a user may create an image for a RHEL7 distribution and create a snapshot for it. Periodically, the user may update the image and create a new snapshot (for example yum update, yum upgrade, followed by rbd snap create). As the image matures, the user can clone any one of the snapshots.
  2. Extended Template: A more advanced use case includes extending a template image that provides more information than a base image. For example, a user may clone an image (for example, a VM template) and install other software (for example, a database, a content management system, an analytics system, and so on) and then snapshot the extended image, which itself may be updated just like the base image.
  3. Template Pool: One way to use block device layering is to create a pool that contains master images that act as templates, and snapshots of those templates. You may then extend read-only privileges to users so that they may clone the snapshots without the ability to write or execute within the pool.
  4. Image Migration/Recovery: One way to use block device layering is to migrate or recover data from one pool into another pool.

3.3.2. Protecting Snapshots

Clones access the parent snapshots. All clones would break if a user inadvertently deleted the parent snapshot. To prevent data loss, you MUST protect the snapshot before you can clone it. To do so, run the following commands:

rbd --pool {pool-name} snap protect --image {image-name} --snap {snapshot-name}
rbd snap protect {pool-name}/{image-name}@{snapshot-name}

For example:

rbd --pool rbd snap protect --image my-image --snap my-snapshot
rbd snap protect rbd/my-image@my-snapshot
Note

You cannot delete a protected snapshot.

3.3.3. Cloning Snapshots

To clone a snapshot, you need to specify the parent pool, image and snapshot; and the child pool and image name. You must protect the snapshot before you can clone it. To do so, run the following commands:

rbd --pool {pool-name} --image {parent-image} --snap {snap-name} --dest-pool {pool-name} --dest {child-image}
rbd clone {pool-name}/{parent-image}@{snap-name} {pool-name}/{child-image-name}

For example:

rbd clone rbd/my-image@my-snapshot rbd/new-image
Note

You may clone a snapshot from one pool to an image in another pool. For example, you may maintain read-only images and snapshots as templates in one pool, and writable clones in another pool.

3.3.4. Unprotecting Snapshots

Before you can delete a snapshot, you must unprotect it first. Additionally, you may NOT delete snapshots that have references from clones. You must flatten each clone of a snapshot, before you can delete the snapshot. To do so, run the following commands:

rbd --pool {pool-name} snap unprotect --image {image-name} --snap {snapshot-name}
rbd snap unprotect {pool-name}/{image-name}@{snapshot-name}

For example:

rbd --pool rbd snap unprotect --image my-image --snap my-snapshot
rbd snap unprotect rbd/my-image@my-snapshot

3.3.5. Listing Children of a Snapshot

To list the children of a snapshot, execute the following:

rbd --pool {pool-name} children --image {image-name} --snap {snap-name}
rbd children {pool-name}/{image-name}@{snapshot-name}

For example:

rbd --pool rbd children --image my-image --snap my-snapshot
rbd children rbd/my-image@my-snapshot

3.3.6. Flattening Cloned Images

Cloned images retain a reference to the parent snapshot. When you remove the reference from the child clone to the parent snapshot, you effectively "flatten" the image by copying the information from the snapshot to the clone. The time it takes to flatten a clone increases with the size of the snapshot.

To delete a parent image snapshot associated with child images, you must flatten the child images first:

rbd --pool <pool-name> flatten --image <image-name>
rbd flatten <pool-name>/<image-name>

For example:

$ rbd --pool rbd flatten --image my-image
$ rbd flatten rbd/my-image

Because a flattened image contains all the information from the snapshot, a flattened image will use more storage space than a layered clone.

Note

If the deep flatten feature is enabled on an image, the image clone is dissociated from its parent by default.