Chapter 1. Deploying VDO

As a system administrator, you can use VDO to create deduplicated and compressed storage pools.

1.1. Introduction to VDO

Virtual Data Optimizer (VDO) provides inline data reduction for Linux in the form of deduplication, compression, and thin provisioning. When you set up a VDO volume, you specify a block device on which to construct your VDO volume and the amount of logical storage you plan to present.

  • When hosting active VMs or containers, Red Hat recommends provisioning storage at a 10:1 logical to physical ratio: that is, if you are utilizing 1 TB of physical storage, you would present it as 10 TB of logical storage.
  • For object storage, such as the type provided by Ceph, Red Hat recommends using a 3:1 logical to physical ratio: that is, 1 TB of physical storage would be presented as 3 TB of logical storage.

In either case, you can simply put a file system on top of the logical device presented by VDO and then use it directly or as part of a distributed cloud storage architecture.
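The two ratios can be turned into a small helper. This is only a sketch: the function name and the workload labels (vm, container, object) are made up for illustration, and sizes are assumed to be whole terabytes.

```shell
# Suggest a logical size (in TB) from the physical size (in TB) and the workload.
# The function name and workload labels are hypothetical, not part of VDO.
vdo_logical_tb() {
    local physical_tb=$1 workload=$2
    case "$workload" in
        vm|container) echo $(( physical_tb * 10 )) ;;  # 10:1 for active VMs or containers
        object)       echo $(( physical_tb * 3 )) ;;   # 3:1 for object storage
        *) echo "unknown workload: $workload" >&2; return 1 ;;
    esac
}
```

For example, vdo_logical_tb 1 vm prints 10, and vdo_logical_tb 1 object prints 3.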

Because VDO is thinly provisioned, the file system and applications see only the logical space in use and are not aware of the actual physical space available. Use scripting to monitor the actual available space and generate an alert if use exceeds a threshold: for example, when the VDO volume is 80% full. See Section 2.2, “Managing free space on VDO volumes” for details.
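A minimal sketch of such a check follows. It assumes the tabular output of the vdostats utility, with one header line and the Use% value in the fifth column; the function name is hypothetical, and the message is only printed, so hooking it into mail or a monitoring system is left to you.

```shell
# Read vdostats-style output on stdin and report volumes at or above a threshold.
# Assumption: one header line, then one line per volume with Use% in column 5.
vdo_over_threshold() {
    local threshold=${1:-80}
    awk -v limit="$threshold" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)          # strip the trailing percent sign
            if (use + 0 >= limit)
                print $1 " is " use "% full"
        }'
}
```

A possible use is vdostats | vdo_over_threshold 80, run periodically from a systemd timer or cron job.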

1.2. VDO deployment scenarios

You can deploy VDO in a variety of ways to provide deduplicated storage for:

  • both block and file access
  • both local and remote storage

Because VDO exposes its deduplicated storage as a standard Linux block device, you can use it with standard file systems, iSCSI and FC target drivers, or as unified storage.

Note

VDO deployment with Ceph Storage is currently not supported.

KVM

You can deploy VDO on a KVM server configured with Direct Attached Storage.

VDO Deployment with KVM

File systems

You can create file systems on top of VDO and expose them to NFS or CIFS users with the NFS server or Samba.

Deduplicated NAS

iSCSI target

You can export the entirety of the VDO storage target as an iSCSI target to remote iSCSI initiators.

Deduplicated block storage target

LVM

On more feature-rich systems, you can use LVM to provide multiple logical unit numbers (LUNs) that are all backed by the same deduplicated storage pool.

In the following diagram, the VDO target is registered as a physical volume so that it can be managed by LVM. Multiple logical volumes (LV1 to LV4) are created out of the deduplicated storage pool. In this way, VDO can support multiprotocol unified block or file access to the underlying deduplicated storage pool.

Deduplicated unified storage

Deduplicated unified storage design enables multiple file systems to collectively use the same deduplication domain through the LVM tools. Also, file systems can take advantage of LVM snapshot, copy-on-write, and shrink or grow features, all on top of VDO.

Encryption

Device Mapper (DM) mechanisms such as DM Crypt are compatible with VDO. Encrypting VDO volumes helps ensure data security, and any file systems above VDO are still deduplicated.

Using VDO with encryption
Important

Applying the encryption layer above VDO results in little if any data deduplication. Encryption makes duplicate blocks different before VDO can deduplicate them.

Always place the encryption layer below VDO.
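The stacking order can be sketched as a dry run that prints the commands instead of executing them. The function name, the mapping name crypt-sde, and the device are placeholders chosen for illustration; review the printed commands before running anything, because cryptsetup luksFormat destroys existing data on the device.

```shell
# Dry-run sketch: print the commands for placing DM Crypt below VDO.
# All names passed in are placeholders; nothing is executed here.
print_encrypted_vdo_plan() {
    local device=$1 crypt_name=$2 vdo_name=$3 logical_size=$4
    # 1. Encrypt the raw block device and open it as a DM Crypt mapping.
    echo "cryptsetup luksFormat $device"
    echo "cryptsetup open $device $crypt_name"
    # 2. Create VDO on the opened mapping, so that VDO deduplicates the data
    #    before the encryption layer below it makes the blocks differ.
    echo "vdo create --name=$vdo_name --device=/dev/mapper/$crypt_name --vdoLogicalSize=$logical_size"
}
```

For example, print_encrypted_vdo_plan /dev/sde crypt-sde vdo1 10T prints three commands, ending with a vdo create command that targets /dev/mapper/crypt-sde.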

1.3. VDO requirements

VDO has certain requirements on its placement and your system resources.

1.3.1. Placement of VDO in the storage stack

You should place certain storage layers under VDO and others above VDO:

Only under VDO
  • DM Multipath
  • DM Crypt
  • Software RAID (LVM or MDRAID)
Only above VDO
  • LVM cache
  • LVM logical volumes
  • LVM snapshots
  • LVM thin provisioning
Important

You can place thick-provisioned layers on top of VDO, but you cannot rely on the guarantees of thick provisioning in that case. Because the VDO layer is thin-provisioned, the effects of thin provisioning apply to all layers above it. If you do not monitor the VDO device, you might run out of physical space on thick-provisioned volumes above VDO.

The following configurations are not supported:

  • VDO on top of VDO volumes: storage → VDO → LVM → VDO
  • VDO on top of LVM snapshots
  • VDO on top of LVM cache
  • VDO on top of a loopback device
  • VDO on top of LVM thin provisioning
  • Encrypted volumes on top of VDO: storage → VDO → DM-Crypt
  • Partitions on a VDO volume
  • RAID (LVM, MDRAID, or any other type) on top of a VDO volume

1.3.2. VDO memory requirements

Each VDO volume has two distinct memory requirements:

The VDO module

VDO requires 370 MB of DRAM plus an additional 268 MB for each 1 TB of physical storage managed by the volume.

The Universal Deduplication Service (UDS) index

UDS requires a minimum of 250 MB of DRAM, which is also the default amount that deduplication uses.

The memory required for the UDS index is determined by the index type and the required size of the deduplication window:

Index type    Deduplication window     Note
Dense         1 TB per 1 GB of RAM     A 1 GB dense index is generally sufficient for up to 4 TB of physical storage.
Sparse        10 TB per 1 GB of RAM    A 1 GB sparse index is generally sufficient for up to 40 TB of physical storage.

The UDS Sparse Indexing feature is the recommended mode for VDO. It relies on the temporal locality of data and attempts to retain only the most relevant index entries in memory. With the sparse index, UDS can maintain a deduplication window that is ten times larger than with the dense index, while using the same amount of memory.

Although the sparse index provides the greatest coverage, the dense index provides more deduplication advice. For most workloads, given the same amount of memory, the difference in deduplication rates between dense and sparse indexes is negligible.
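As a rough worked example of the memory figures in this section, the two requirements can be added together. This is a sketch only: the function name is hypothetical, the physical size is assumed to be a whole number of terabytes, and the index memory is whatever amount you allocate to UDS (250 MB if you keep the default).

```shell
# Estimate total DRAM (in MB) for one VDO volume:
# 370 MB + 268 MB per 1 TB of physical storage + the UDS index memory.
vdo_ram_mb() {
    local physical_tb=$1
    local index_mb=${2:-250}   # UDS index memory in MB; 250 MB is the default
    echo $(( 370 + 268 * physical_tb + index_mb ))
}
```

For example, vdo_ram_mb 4 prints 1692 (a 4 TB volume with the default index), and vdo_ram_mb 10 1024 prints 4074 (a 10 TB volume with a 1 GB index).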

1.3.3. VDO storage requirements

You can configure a VDO volume to use up to 256 TB of physical storage. Only a certain part of the physical storage is usable to store data. This section provides the calculations to determine the usable size of a VDO-managed volume.

VDO requires storage for two types of VDO metadata and for the UDS index:

  • The first type of VDO metadata uses approximately 1 MB for each 4 GB of physical storage plus an additional 1 MB per slab.
  • The second type of VDO metadata consumes approximately 1.25 MB for each 1 GB of logical storage, rounded up to the nearest slab.
  • The amount of storage required for the UDS index depends on the type of index and the amount of RAM allocated to the index. For each 1 GB of RAM, a dense UDS index uses 17 GB of storage, and a sparse UDS index uses 170 GB of storage.
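The three items above can be combined into an approximate overhead estimate. The sketch below makes several assumptions that are not stated in this section: the function and parameter names are hypothetical, the number of slabs is taken as the physical size divided by the slab size (rounded up), sizes are whole gigabytes, and the index RAM is a whole number of gigabytes. Treat the result as an order-of-magnitude figure, not an exact value.

```shell
# Approximate VDO on-disk overhead in MB.
# Arguments: physical size (GB), logical size (GB), slab size (GB),
#            UDS index RAM (GB), index type (dense or sparse).
vdo_overhead_mb() {
    local physical_gb=$1 logical_gb=$2 slab_gb=$3 index_ram_gb=$4 index_type=$5
    local slab_mb=$(( slab_gb * 1024 ))
    # Assumption: slab count = physical size / slab size, rounded up.
    local slabs=$(( (physical_gb + slab_gb - 1) / slab_gb ))
    # First metadata type: about 1 MB per 4 GB of physical storage plus 1 MB per slab.
    local meta1_mb=$(( (physical_gb + 3) / 4 + slabs ))
    # Second metadata type: about 1.25 MB per 1 GB of logical storage,
    # rounded up to a whole number of slabs.
    local raw2_mb=$(( (logical_gb * 5 + 3) / 4 ))
    local meta2_mb=$(( (raw2_mb + slab_mb - 1) / slab_mb * slab_mb ))
    # UDS index: 17 GB (dense) or 170 GB (sparse) of storage per 1 GB of RAM.
    local per_gb=17
    if [ "$index_type" = "sparse" ]; then
        per_gb=170
    fi
    local index_mb=$(( index_ram_gb * per_gb * 1024 ))
    echo $(( meta1_mb + meta2_mb + index_mb ))
}
```

For example, vdo_overhead_mb 1024 10240 2 1 dense (1 TB physical, 10 TB logical, 2 GB slabs, 1 GB dense index) prints 32512, that is, roughly 32 GB of the physical volume is not available for data under these assumptions.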

1.3.4. Examples of VDO requirements by physical volume size

The following tables provide approximate system requirements of VDO based on the size of the underlying physical volume. Each table lists requirements appropriate to the intended deployment, such as primary storage or backup storage.

The exact numbers depend on your configuration of the VDO volume.

Primary storage deployment

In the primary storage case, the UDS index is between 0.01% and 25% of the size of the physical volume.

Table 1.1. Storage and memory requirements for primary storage

Physical volume size    RAM usage    Disk usage    Index type
10 GB–1 TB              250 MB       2.5 GB        Dense
2–10 TB                 1 GB         10 GB         Dense
                        250 MB       22 GB         Sparse
11–50 TB                2 GB         170 GB        Sparse
51–100 TB               3 GB         255 GB        Sparse
101–256 TB              12 GB        1020 GB       Sparse

Backup storage deployment

In the backup storage case, the UDS index covers the size of the backup set but is not bigger than the physical volume. If you expect the backup set or the physical size to grow in the future, factor this into the index size.

Table 1.2. Storage and memory requirements for backup storage

Physical volume size    RAM usage    Disk usage    Index type
10 GB–1 TB              250 MB       2.5 GB        Dense
2–10 TB                 2 GB         170 GB        Sparse
11–50 TB                10 GB        850 GB        Sparse
51–100 TB               20 GB        1700 GB       Sparse
101–256 TB              26 GB        3400 GB       Sparse

1.4. Installing VDO

This procedure installs the software necessary to create, mount, and manage VDO volumes.

Procedure

  • Install the vdo and kmod-kvdo packages:

    # yum install vdo kmod-kvdo

1.5. Creating a VDO volume

This procedure creates a VDO volume on a block device.

When a VDO volume is created, VDO adds an entry to the /etc/vdoconf.yml configuration file. The vdo.service systemd unit then uses the entry to start the volume by default.

Procedure

In all the following steps, replace vdo-name with the identifier you want to use for your VDO volume; for example, vdo1. You must use a different name and device for each instance of VDO on the system.

  1. Create the VDO volume:

    # vdo create \
          --name=vdo-name \
          --device=block-device \
          --vdoLogicalSize=logical-size
    • Replace block-device with the name of the block device where you want to create the VDO volume. For example, /dev/sde.
    • Replace logical-size with the amount of logical storage that the VDO volume should present:

      • For active VMs or container storage, use a logical size that is ten times the physical size of your block device. For example, if your block device is 1 TB in size, use 10T here.
      • For object storage, use a logical size that is three times the physical size of your block device. For example, if your block device is 1 TB in size, use 3T here.
    • If the physical block device is larger than 16TiB, add the --vdoSlabSize=32G option to increase the slab size on the volume to 32GiB.

      Using the default slab size of 2GiB on block devices larger than 16TiB results in the vdo create command failing with the following error:

      vdo: ERROR - vdoformat: formatVDO failed on '/dev/device': VDO Status: Exceeds maximum number of slabs supported

    Example 1.1. Creating VDO for container storage

    For example, to create a VDO volume for container storage on a 1TB block device, you might use:

    # vdo create \
          --name=vdo1 \
          --device=/dev/sde \
          --vdoLogicalSize=10T
    Important

    If a failure occurs when creating the VDO volume, remove the volume to clean up. See Section 2.11.2, “Removing an unsuccessfully created VDO volume” for details.

  2. Create a file system on top of the VDO volume:

    • For the XFS file system:

      # mkfs.xfs -K /dev/mapper/vdo-name
    • For the ext4 file system:

      # mkfs.ext4 -E nodiscard /dev/mapper/vdo-name
  3. Use the following command to wait for the system to register the new device node:

    # udevadm settle
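The slab size rule from step 1 can be sketched as a helper that assembles the vdo create command line and adds --vdoSlabSize=32G only when the device is larger than 16 TiB. The function name is hypothetical, and the helper only prints the command so that you can review it; the device size is passed in bytes, for example from blockdev --getsize64.

```shell
# Print a vdo create command, adding a 32G slab size for devices over 16 TiB.
# Arguments: volume name, block device, logical size, device size in bytes.
vdo_create_cmd() {
    local name=$1 device=$2 logical_size=$3 device_bytes=$4
    local limit=17592186044416   # 16 TiB expressed in bytes
    local args="--name=$name --device=$device --vdoLogicalSize=$logical_size"
    if [ "$device_bytes" -gt "$limit" ]; then
        args="$args --vdoSlabSize=32G"
    fi
    echo "vdo create $args"
}
```

For example, vdo_create_cmd vdo1 /dev/sde 10T "$(blockdev --getsize64 /dev/sde)" prints the same command as Example 1.1 for a 1 TB device, and appends --vdoSlabSize=32G for a device larger than 16 TiB.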

Next steps

  1. Mount the file system. See Section 1.6, “Mounting a VDO volume” for details.
  2. Enable the discard feature for the file system on your VDO device. See Section 1.7, “Enabling periodic block discard” for details.

Additional resources

  • The vdo(8) man page

1.6. Mounting a VDO volume

This procedure mounts a file system on a VDO volume, either manually or persistently.

Procedure

  • To mount the file system on the VDO volume manually, use:

    # mount /dev/mapper/vdo-name mount-point
  • To configure the file system to mount automatically at boot, add a line to the /etc/fstab file:

    • For the XFS file system:

      /dev/mapper/vdo-name mount-point xfs defaults,_netdev,x-systemd.device-timeout=0,x-systemd.requires=vdo.service 0 0
    • For the ext4 file system:

      /dev/mapper/vdo-name mount-point ext4 defaults,_netdev,x-systemd.device-timeout=0,x-systemd.requires=vdo.service 0 0
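Because the two /etc/fstab lines differ only in the file system type, they can be generated from a small helper. This is a sketch with a hypothetical function name; it prints the line so that you can inspect it before appending it to /etc/fstab yourself.

```shell
# Print an /etc/fstab line for a file system on a VDO volume.
# Arguments: VDO volume name, mount point, file system type (xfs or ext4).
vdo_fstab_line() {
    local name=$1 mount_point=$2 fstype=$3
    local opts="defaults,_netdev,x-systemd.device-timeout=0,x-systemd.requires=vdo.service"
    echo "/dev/mapper/$name $mount_point $fstype $opts 0 0"
}
```

For example, vdo_fstab_line vdo1 /mnt/vdo1 xfs prints the XFS variant for a volume named vdo1 mounted at /mnt/vdo1 (both names are placeholders).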

1.7. Enabling periodic block discard

This procedure enables a systemd timer that regularly discards unused blocks on all supported file systems.

Procedure

  • Enable and start the systemd timer:

    # systemctl enable --now fstrim.timer

1.8. Monitoring VDO

This procedure describes how to obtain usage and efficiency information from a VDO volume.

Procedure

  • Use the vdostats utility to get information about a VDO volume:

    # vdostats --human-readable
    
    Device                   1K-blocks    Used     Available    Use%    Space saving%
    /dev/mapper/node1osd1    926.5G       21.0G    905.5G       2%      73%
    /dev/mapper/node1osd2    926.5G       28.2G    898.3G       3%      64%

Additional resources

  • The vdostats(8) man page