Before evaluating VDO, it is important to consider the host system configuration, VDO configuration, and the workloads that will be used during testing. These choices will affect benchmarking both in terms of data optimization (space efficiency) and performance (bandwidth and latency). Items that should be considered when developing test plans are listed in the following sections.
31.2.1. System Configuration
Number and type of CPU cores available. This can be controlled by using the taskset utility.
Available memory and total installed memory.
Configuration of storage devices.
Linux kernel version. Note that Red Hat Enterprise Linux 7 provides only one Linux kernel version.
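When the number of CPU cores is a variable under test, the benchmark process can be restricted to a fixed core set so that core count is a controlled variable across runs. A minimal sketch, assuming fio as the load generator and /dev/mapper/my_vdo as the device name (both illustrative):

```shell
# Pin the benchmark to cores 0-3 (the core list is illustrative) so that
# runs with different core counts are directly comparable.
taskset -c 0-3 fio --name=cores-test --filename=/dev/mapper/my_vdo \
    --rw=randread --bs=4k --direct=1 --ioengine=libaio --iodepth=64 \
    --size=1g
```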
31.2.2. VDO Configuration
File system(s) used on VDO volumes
Size of the physical storage assigned to a VDO volume
Size of the logical VDO volume created
Sparse or dense indexing
UDS index memory size
VDO's thread configuration
31.2.3. Workloads
Types of tools used to generate test data
Number of concurrent clients
The quantity of duplicate 4 KB blocks in the written data
Read and write patterns
The working set size
VDO volumes may need to be re-created in between certain tests to ensure that each test is performed on the same disk environment. Read more about this in the testing section.
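The remove-and-recreate cycle between iterations can be sketched as follows. The volume name, backing device, and logical size below are assumptions for illustration, not values from the test plan:

```shell
# Remove the volume from the previous iteration, then recreate it so the
# next test starts from the same disk state. Adjust --device and
# --vdoLogicalSize to match your configuration.
vdo remove --name=my_vdo
vdo create --name=my_vdo --device=/dev/sdb --vdoLogicalSize=1T
```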
31.2.4. Supported System Configurations
Red Hat has tested VDO with Red Hat Enterprise Linux 7 on the Intel 64 architecture.
The following utilities are recommended when evaluating VDO:
Flexible I/O Tester (fio), for generating workloads
sysstat, for monitoring storage performance
31.2.5. Pre-Test System Preparations
This section describes how to configure system settings to achieve optimal performance during the evaluation. Testing beyond the bounds assumed by a particular test may waste testing time by producing abnormal results. For example, this guide describes a test that conducts random reads over a 100 GB address range. To test a working set of 500 GB, the amount of DRAM allocated for the VDO block map cache should be increased accordingly.
Ensure that your CPU is running at its highest performance setting.
Disable frequency scaling if possible using the BIOS configuration or the Linux cpupower utility.
Enable Turbo mode if possible to achieve maximum throughput. Turbo mode introduces some variability in test results, but performance will meet or exceed that of testing without Turbo.
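One way to disable frequency scaling from Linux is to set the "performance" governor with the cpupower utility (requires root and the kernel-tools package on Red Hat Enterprise Linux 7):

```shell
# Set the "performance" governor so cores run at their maximum sustained
# frequency, then read back the active policy to confirm.
cpupower frequency-set --governor performance
cpupower frequency-info --policy
```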
For disk-based solutions, Linux offers several I/O scheduler algorithms to handle multiple read/write requests as they are queued. By default, Red Hat Enterprise Linux uses the CFQ (completely fair queuing) scheduler, which arranges requests in a way that improves rotational disk (hard disk) access in many situations. We instead suggest using the Deadline scheduler for rotational disks, having found that it provides better throughput and latency in Red Hat lab testing. Change the device settings as follows:
# echo "deadline" > /sys/block/device/queue/scheduler
For flash-based solutions, the noop scheduler demonstrates superior random access throughput and latency in Red Hat lab testing. Change the device settings as follows:
# echo "noop" > /sys/block/device/queue/scheduler
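In both commands above, device is a placeholder for the actual block device name. Reading the same sysfs file back shows the available schedulers, with the active one in square brackets; "sda" below is an illustrative device name:

```shell
# List available I/O schedulers for the device; the active scheduler is
# shown in square brackets, e.g. "noop [deadline] cfq".
cat /sys/block/sda/queue/scheduler
```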
Storage device configuration
File systems (ext4, XFS, etc.) may have unique impacts on performance; they often skew performance measurements, making it harder to isolate VDO's impact on the results. If reasonable, we recommend measuring performance on the raw block device. If this is not possible, format the device using the file system that would be used in the target implementation.
31.2.6. VDO Internal Structures
We believe that a general understanding of VDO mechanisms is essential for a complete and successful evaluation. This understanding becomes especially important when testers wish to deviate from the test plan or devise new stimuli to emulate a particular application or use case. For more information, see Chapter 30, VDO Integration.
The Red Hat test plan was written to operate with a default VDO configuration. When developing new tests, some of the VDO parameters listed in the next section must be adjusted.
31.2.7. VDO Optimizations
High Load
Perhaps the most important strategy for producing optimal performance is determining the best I/O queue depth, a characteristic that represents the load on the storage system. Most modern storage systems perform optimally with high I/O depth. VDO's performance is best demonstrated with many concurrent requests.
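A queue-depth sweep with fio can locate the depth at which throughput stops scaling. A sketch, assuming /dev/mapper/my_vdo as the device name; the depth, job count, and size are illustrative values to be varied per test:

```shell
# Drive the volume at a high I/O depth; repeat with --iodepth set to
# 1, 8, 32, 128, 256 (for example) to find where throughput plateaus.
fio --name=depth-sweep --filename=/dev/mapper/my_vdo --rw=randwrite \
    --bs=4k --direct=1 --ioengine=libaio --iodepth=128 --numjobs=4 \
    --size=10g --group_reporting
```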
Synchronous vs. Asynchronous Write Policy
VDO can operate with either of two write policies, synchronous or asynchronous. By default, VDO automatically chooses the appropriate write policy for the underlying storage device.
When testing performance, you need to know which write policy VDO selected. The following command shows the write policy of your VDO volume:
# vdo status --name=my_vdo
Metadata Caching
VDO maintains a table of mappings from logical block addresses to physical block addresses, and VDO must look up the relevant mapping when accessing any particular block. By default, VDO allocates 128 MB of metadata cache in DRAM to support efficient access to 100 GB of logical space at a time. The test plan generates workloads appropriate to this configuration option.
Working sets larger than the configured cache size require additional I/O to look up the associated block map pages, and performance degrades accordingly. If additional memory is available, the block map cache should be made large enough to hold the working set.
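The documented ratio of 128 MB of cache per 100 GB of logical space implies a simple linear estimate for sizing the cache to a larger working set, as in the 500 GB example above:

```shell
# Estimate the block map cache needed for a given working set, assuming
# the documented linear ratio of 128 MB of cache per 100 GB of logical
# address space.
working_set_gb=500
cache_mb=$(( working_set_gb * 128 / 100 ))
echo "suggested block map cache: ${cache_mb} MB"
```

For a 500 GB working set this suggests roughly 640 MB of block map cache.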
VDO Multithreading Configuration
VDO's thread configuration must be tuned to achieve optimal performance. Review the VDO Integration Guide for information on how to modify these settings when creating a VDO volume. Contact your Red Hat Sales Engineer to discuss how to design a test to find the optimal setting.
Data Content for Performance Testing
Because VDO performs deduplication and compression, test data sets must be chosen to effectively exercise these capabilities.
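One simple way to produce a stream with a known duplication rate is to repeat a single random 4 KB seed block; every block written is then a duplicate. A sketch with illustrative file names and sizes:

```shell
# Build a 1 MB test file whose 256 4 KB blocks are all identical, giving
# a 100% duplicate stream for deduplication testing. Random seed data
# avoids VDO's special handling of zero blocks.
dd if=/dev/urandom of=/tmp/seed.blk bs=4096 count=1 2>/dev/null
: > /tmp/dup-data.bin
for i in $(seq 1 256); do
    cat /tmp/seed.blk >> /tmp/dup-data.bin
done
```

Varying how many distinct seed blocks are mixed into the stream controls the duplicate fraction; fully random data (no repeated blocks) exercises the no-duplicates case.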
31.2.8. Special Considerations for Testing Read Performance
When testing read performance, these factors must be considered:
If a 4 KB block has never been written, VDO will not perform I/O to the storage and will immediately respond with a zero block.
If a 4 KB block has been written but contains all zeros, VDO will not perform I/O to the storage and will immediately respond with a zero block.
This behavior results in very fast read performance when there is no data to read. This makes it imperative that read tests prefill the volume with actual data.
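A prefill pass can be sketched with fio as a sequential write over the address range to be read, followed by the read measurement itself. The device name and range are assumptions; --refill_buffers keeps the written data from being trivially deduplicated:

```shell
# Prefill the first 100 GB of the logical address space with real data so
# that subsequent reads hit stored blocks rather than VDO's zero-block
# fast path, then run the read measurement over the same range.
fio --name=prefill --filename=/dev/mapper/my_vdo --rw=write --bs=4k \
    --direct=1 --ioengine=libaio --iodepth=128 --refill_buffers --size=100g
fio --name=randread --filename=/dev/mapper/my_vdo --rw=randread --bs=4k \
    --direct=1 --ioengine=libaio --iodepth=128 --size=100g
```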
To prevent one test from affecting the results of another, it is suggested that a new VDO volume be created for each iteration of each test.