Chapter 8. Ceph performance benchmark

As a storage administrator, you can benchmark the performance of the Red Hat Ceph Storage cluster. This chapter gives Ceph administrators a basic understanding of Ceph’s native benchmarking tools, which provide some insight into how the Ceph storage cluster is performing. It is not a definitive guide to Ceph performance benchmarking, nor is it a guide to tuning Ceph.

8.1. Performance baseline

The OSDs (including their journals), the disks, and the network throughput should each have a performance baseline to compare against. Comparing this baseline data with the data from Ceph’s native tools helps you identify potential tuning opportunities. Red Hat Enterprise Linux has many built-in tools, and a plethora of open source community tools are also available, to help accomplish these tasks.

Additional Resources

  • For more details about some of the available tools, see this Knowledgebase article.

8.2. Benchmarking Ceph performance

Ceph includes the rados bench command to do performance benchmarking on a RADOS storage cluster. The command can execute a write test and two types of read tests. Use the --no-cleanup option with the write test when you test both read and write performance. By default, the rados bench command deletes the objects it has written to the storage pool. Leaving these objects behind allows the two read tests to measure sequential and random read performance.

Note

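Before running these performance tests, flush dirty pages to disk and then drop all the file system caches by running the following. Running sync first matters because dropping the caches does not free dirty objects:

Example

[ceph: root@host01 /]# sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches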

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.

Procedure

  1. Create a new storage pool:

    Example

    [ceph: root@host01 /]# ceph osd pool create testbench 100 100

  2. Execute a write test for 10 seconds to the newly created storage pool:

    Example

    [ceph: root@host01 /]# rados bench -p testbench 10 write --no-cleanup
    
    Maintaining 16 concurrent writes of 4194304 bytes for up to 10 seconds or 0 objects
     Object prefix: benchmark_data_cephn1.home.network_10510
       sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
         0       0         0         0         0         0         -         0
         1      16        16         0         0         0         -         0
         2      16        16         0         0         0         -         0
         3      16        16         0         0         0         -         0
         4      16        17         1  0.998879         1   3.19824   3.19824
         5      16        18         2   1.59849         4   4.56163   3.87993
         6      16        18         2   1.33222         0         -   3.87993
         7      16        19         3   1.71239         2   6.90712     4.889
         8      16        25         9   4.49551        24   7.75362   6.71216
         9      16        25         9   3.99636         0         -   6.71216
        10      16        27        11   4.39632         4   9.65085   7.18999
        11      16        27        11   3.99685         0         -   7.18999
        12      16        27        11   3.66397         0         -   7.18999
        13      16        28        12   3.68975   1.33333   12.8124   7.65853
        14      16        28        12   3.42617         0         -   7.65853
        15      16        28        12   3.19785         0         -   7.65853
        16      11        28        17   4.24726   6.66667   12.5302   9.27548
        17      11        28        17   3.99751         0         -   9.27548
        18      11        28        17   3.77546         0         -   9.27548
        19      11        28        17   3.57683         0         -   9.27548
     Total time run:         19.505620
    Total writes made:      28
    Write size:             4194304
    Bandwidth (MB/sec):     5.742
    
    Stddev Bandwidth:       5.4617
    Max bandwidth (MB/sec): 24
    Min bandwidth (MB/sec): 0
    Average Latency:        10.4064
    Stddev Latency:         3.80038
    Max latency:            19.503
    Min latency:            3.19824

  3. Execute a sequential read test for 10 seconds to the storage pool:

    Example

    [ceph: root@host01 /]# rados bench -p testbench 10 seq
    
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
      0       0         0         0         0         0         -         0
    Total time run:        0.804869
    Total reads made:      28
    Read size:             4194304
    Bandwidth (MB/sec):    139.153
    
    Average Latency:       0.420841
    Max latency:           0.706133
    Min latency:           0.0816332

  4. Execute a random read test for 10 seconds to the storage pool:

    Example

    [ceph: root@host01 /]# rados bench -p testbench 10 rand
    
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
      0       0         0         0         0         0         -         0
      1      16        46        30   119.801       120  0.440184  0.388125
      2      16        81        65   129.408       140  0.577359  0.417461
      3      16       120       104   138.175       156  0.597435  0.409318
      4      15       157       142   141.485       152  0.683111  0.419964
      5      16       206       190   151.553       192  0.310578  0.408343
      6      16       253       237   157.608       188 0.0745175  0.387207
      7      16       287       271   154.412       136  0.792774   0.39043
      8      16       325       309   154.044       152  0.314254   0.39876
      9      16       362       346   153.245       148  0.355576  0.406032
     10      16       405       389   155.092       172   0.64734  0.398372
    Total time run:        10.302229
    Total reads made:      405
    Read size:             4194304
    Bandwidth (MB/sec):    157.248
    
    Average Latency:       0.405976
    Max latency:           1.00869
    Min latency:           0.0378431

  5. To increase the number of concurrent reads and writes, use the -t option; the default is 16 threads. The -b option adjusts the size of the object being written. The default object size is 4 MB, and a safe maximum object size is 16 MB. Red Hat recommends running multiple copies of these benchmark tests against different pools. Doing this shows how performance changes when multiple clients are active.

    Add the --run-name LABEL option to control the names of the objects that get written during the benchmark test. Multiple rados bench commands can be run simultaneously by giving each running command instance a different --run-name label. This prevents the I/O errors that can occur when multiple clients try to access the same object, and lets different clients access different objects. The --run-name option is also useful when trying to simulate a real-world workload.

    Example

    [ceph: root@host01 /]# rados bench -p testbench 10 write -t 4 --run-name client1
    
    Maintaining 4 concurrent writes of 4194304 bytes for up to 10 seconds or 0 objects
     Object prefix: benchmark_data_node1_12631
       sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
         0       0         0         0         0         0         -         0
         1       4         4         0         0         0         -         0
         2       4         6         2   3.99099         4   1.94755   1.93361
         3       4         8         4   5.32498         8     2.978   2.44034
         4       4         8         4   3.99504         0         -   2.44034
         5       4        10         6   4.79504         4   2.92419    2.4629
         6       3        10         7   4.64471         4   3.02498    2.5432
         7       4        12         8   4.55287         4   3.12204   2.61555
         8       4        14        10    4.9821         8   2.55901   2.68396
         9       4        16        12   5.31621         8   2.68769   2.68081
        10       4        17        13   5.18488         4   2.11937   2.63763
        11       4        17        13   4.71431         0         -   2.63763
        12       4        18        14   4.65486         2    2.4836   2.62662
        13       4        18        14   4.29757         0         -   2.62662
    Total time run:         13.123548
    Total writes made:      18
    Write size:             4194304
    Bandwidth (MB/sec):     5.486
    
    Stddev Bandwidth:       3.0991
    Max bandwidth (MB/sec): 8
    Min bandwidth (MB/sec): 0
    Average Latency:        2.91578
    Stddev Latency:         0.956993
    Max latency:            5.72685
    Min latency:            1.91967

  6. Remove the data created by the rados bench command:

    Example

    [ceph: root@host01 /]# rados -p testbench cleanup
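The multi-client approach from step 5 can be sketched as a small script. This is a minimal illustration rather than a supported tool: the function names, the client labels, and the /tmp log paths are hypothetical, and the parser assumes the summary prints a "Bandwidth (MB/sec):" line in the same form as the sample output above. Each writer gets its own --run-name label, and the per-client bandwidth figures are added together afterwards.

```shell
#!/usr/bin/env bash
# Sketch: run several rados bench writers at once, one --run-name each,
# then add up the "Bandwidth (MB/sec):" values from their summaries.
# POOL, RUN_SECONDS, THREADS and the log file paths are assumptions.

POOL="testbench"
RUN_SECONDS=10
THREADS=4

# Read rados bench output on stdin and print the sum of every
# "Bandwidth (MB/sec):" summary value. Lines such as "Stddev Bandwidth:"
# and "Max bandwidth (MB/sec):" do not match and are ignored.
sum_bandwidth() {
    awk -F: '/^ *Bandwidth \(MB\/sec\)/ { total += $2 }
             END { printf "%.3f\n", total }'
}

# Start one background writer per label, wait for all of them to finish,
# then print the combined bandwidth.
run_clients() {
    local label
    for label in "$@"; do
        rados bench -p "$POOL" "$RUN_SECONDS" write -t "$THREADS" \
            --run-name "$label" > "/tmp/bench_${label}.log" 2>&1 &
    done
    wait   # block until every background writer has exited
    for label in "$@"; do
        cat "/tmp/bench_${label}.log"
    done | sum_bandwidth
}
```

Running `run_clients client1 client2` on a node with access to the cluster would start two writers and print their combined bandwidth. Nothing is executed until run_clients is called.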

8.3. Benchmarking Ceph block performance

Ceph includes the rbd bench command, which, when run with --io-type write, tests sequential writes to the block device and measures throughput and latency. Older releases provided this test as the rbd bench-write command. The default I/O size is 4096 bytes, the default number of I/O threads is 16, and the default total number of bytes to write is 1 GB. These defaults can be modified by the --io-size, --io-threads, and --io-total options respectively.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.

Procedure

  • Run the write performance test against the block device:

    Example

    [root@host01 ~]# rbd bench --io-type write image01 --pool=testbench
    
    bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern seq
      SEC       OPS   OPS/SEC   BYTES/SEC
        2     11127   5479.59  22444382.79
        3     11692   3901.91  15982220.33
        4     12372   2953.34  12096895.42
        5     12580   2300.05  9421008.60
        6     13141   2101.80  8608975.15
        7     13195    356.07  1458459.94
        8     13820    390.35  1598876.60
        9     14124    325.46  1333066.62
        ..
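The BYTES/SEC column is easier to compare with other tools once it is converted to MiB/s. The following is a minimal sketch of that conversion. The function names are hypothetical, and the row parser assumes data rows have exactly the four numeric columns shown in the sample output above.

```shell
#!/usr/bin/env bash
# Sketch: convert rbd bench BYTES/SEC figures to MiB/s.

# to_mib_per_sec BYTES_PER_SEC -> prints the value in MiB/s, two decimals.
to_mib_per_sec() {
    awk -v bps="$1" 'BEGIN { printf "%.2f\n", bps / (1024 * 1024) }'
}

# Read rbd bench output on stdin and print "SEC MiB/s" for each data row.
# A data row is assumed to be: SEC OPS OPS/SEC BYTES/SEC (four columns,
# first column an integer). Header and banner lines are skipped.
rbd_rows_to_mib() {
    awk 'NF == 4 && $1 ~ /^[0-9]+$/ {
        printf "%s %.2f\n", $1, $4 / (1024 * 1024)
    }'
}
```

For example, `to_mib_per_sec 22444382.79` prints 21.40 for the first row of the sample output, and saved output can be piped through rbd_rows_to_mib to convert every row.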

Additional Resources

  • See the Ceph block devices chapter in the Red Hat Ceph Storage Block Device Guide for more information on the rbd command.

8.4. Benchmarking CephFS performance

You can use the FIO tool to benchmark Ceph File System (CephFS) performance. You can also use this tool to benchmark a Ceph Block Device.

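Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.
  • FIO installed on the nodes.
  • The Block Device or the Ceph File System mounted on the node.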

Procedure

  1. Navigate to the node or the application where the Block Device or the CephFS is mounted:

    Example

    [root@host01 ~]# cd /mnt/ceph-block-device
    [root@host01 ~]# cd /mnt/ceph-file-system

  2. Run the fio command. Start with a bs value of 4k and repeat in power-of-2 increments (4k, 8k, 16k, 32k … 128k … 512k, 1m, 2m, 4m) and with different iodepth settings. Also run tests at your expected workload operation size.

    Example for 4K tests with different iodepth values

    fio --name=randwrite --rw=randwrite --direct=1 --ioengine=libaio --bs=4k --iodepth=32 --size=5G --runtime=60 --group_reporting=1

    Example for 8K tests with different iodepth values

    fio --name=randwrite --rw=randwrite --direct=1 --ioengine=libaio --bs=8k --iodepth=32 --size=5G --runtime=60 --group_reporting=1

    Note

    For more information on the usage of the fio command, see the fio man page.
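The block size and iodepth sweep described in step 2 can be generated rather than typed by hand. The sketch below only prints fio command lines, using the same options as the examples above. The IODEPTHS list and the function names are assumptions chosen for illustration, so adjust them to your workload.

```shell
#!/usr/bin/env bash
# Sketch: print fio commands for power-of-two block sizes (4k to 4m)
# at several queue depths. IODEPTHS is an assumed list, not a recommendation.

IODEPTHS="1 8 32"

# Print block sizes from 4k to 4m, doubling each step.
block_sizes() {
    local kib=4
    while [ "$kib" -le 4096 ]; do
        if [ "$kib" -ge 1024 ]; then
            printf '%sm\n' "$((kib / 1024))"
        else
            printf '%sk\n' "$kib"
        fi
        kib=$((kib * 2))
    done
}

# fio_cmd BS IODEPTH -> print one fio command line with the options
# used in the examples above. The job name stays "randwrite" so that
# every run reuses the same test file in the current directory.
fio_cmd() {
    printf 'fio --name=randwrite --rw=randwrite --direct=1 --ioengine=libaio --bs=%s --iodepth=%s --size=5G --runtime=60 --group_reporting=1\n' "$1" "$2"
}

# Print one command per block size and iodepth combination.
sweep() {
    local bs depth
    for bs in $(block_sizes); do
        for depth in $IODEPTHS; do
            fio_cmd "$bs" "$depth"
        done
    done
}
```

Calling `sweep` from the mounted directory lists the commands for review, and `sweep | sh` would run them one after another. With eleven block sizes and three depths at up to 60 seconds each, a full sweep can take about half an hour.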

8.5. Benchmarking Ceph Object Gateway performance

You can use the s3cmd tool to benchmark Ceph Object Gateway performance.

Use get and put requests to determine the performance.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.
  • s3cmd installed on the nodes.

Procedure

  1. Upload a file and measure the speed. The time command measures the duration of the upload.

    Syntax

    time s3cmd put PATH_OF_SOURCE_FILE PATH_OF_DESTINATION_FILE

    Example

    time s3cmd put /path-to-local-file s3://bucket-name/remote/file

    Replace /path-to-local-file with the file you want to upload and s3://bucket-name/remote/file with the destination in your S3 bucket.

  2. Download a file and measure the speed. The time command measures the duration of the download.

    Syntax

    time s3cmd get PATH_OF_SOURCE_FILE DESTINATION_PATH

    Example

    time s3cmd get s3://bucket-name/remote/file /path-to-local-destination

    Replace s3://bucket-name/remote/file with the S3 object you want to download and /path-to-local-destination with the local directory where you want to save the file.

  3. List all the objects in the specified bucket and measure the response time.

    Syntax

    time s3cmd ls s3://BUCKET_NAME

    Example

    time s3cmd ls s3://bucket-name

  4. Analyze the output to calculate upload/download speed and measure response time based on the duration reported by the time command.
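The calculation in step 4 is the file size divided by the elapsed time. The following is a minimal sketch of that arithmetic. The function names are hypothetical, and timed_put assumes a bash shell (for the time keyword and TIMEFORMAT), GNU stat, and a locale that prints decimals with a period.

```shell
#!/usr/bin/env bash
# Sketch: derive upload throughput from a file size and an elapsed time.

# throughput_mib_s BYTES SECONDS -> prints MiB per second, two decimals.
throughput_mib_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN {
        if (secs <= 0) { print "n/a"; exit }
        printf "%.2f\n", bytes / secs / (1024 * 1024)
    }'
}

# timed_put LOCAL_FILE S3_URI -> upload with s3cmd and print the throughput.
# TIMEFORMAT=%R makes the bash time keyword print only wall-clock seconds,
# which is captured from stderr of the brace group.
timed_put() {
    local file="$1" dest="$2" bytes secs
    local TIMEFORMAT=%R
    bytes=$(stat -c %s "$file")
    secs=$( { time s3cmd put "$file" "$dest" > /dev/null 2>&1; } 2>&1 )
    throughput_mib_s "$bytes" "$secs"
}
```

For example, a 1 GiB file uploaded in 8 seconds gives `throughput_mib_s 1073741824 8`, which prints 128.00. The same function works for downloads when given the object size and the time reported for s3cmd get.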