Chapter 7. Administration

As a storage administrator, you can manage the Ceph Object Gateway by using the radosgw-admin command-line interface (CLI) or the Red Hat Ceph Storage Dashboard.

Note

Not all of the Ceph Object Gateway features are available in the Red Hat Ceph Storage Dashboard.

7.1. Prerequisites

  • A healthy running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway software.

7.2. Creating storage policies

The Ceph Object Gateway stores the client bucket and object data by identifying placement targets, and storing buckets and objects in the pools associated with a placement target. If you do not configure placement targets and map them to pools in the instance’s zone configuration, the Ceph Object Gateway uses default targets and pools, for example, default-placement.

Storage policies give Ceph Object Gateway clients a way of accessing a storage strategy, that is, the ability to target a particular type of storage, such as SSDs, SAS drives, and SATA drives, as a way of ensuring, for example, durability, replication, and erasure coding. For details, see the Storage Strategies guide for Red Hat Ceph Storage 5.

To create a storage policy, use the following procedure:

  1. Create a new pool .rgw.buckets.special with the desired storage strategy. For example, a pool customized with erasure-coding, a particular CRUSH ruleset, the number of replicas, and the pg_num and pgp_num count.
  2. Get the zone group configuration and store it in a file:

    Syntax

    radosgw-admin zonegroup --rgw-zonegroup=ZONE_GROUP_NAME get > FILE_NAME.json

    Example

    [root@host01 ~]# radosgw-admin zonegroup --rgw-zonegroup=default get > zonegroup.json

  3. Add a special-placement entry under placement_targets in the zonegroup.json file:

    Example

    {
    	"name": "default",
    	"api_name": "",
    	"is_master": "true",
    	"endpoints": [],
    	"hostnames": [],
    	"master_zone": "",
    	"zones": [{
    		"name": "default",
    		"endpoints": [],
    		"log_meta": "false",
    		"log_data": "false",
    		"bucket_index_max_shards": 5
    	}],
    	"placement_targets": [{
    		"name": "default-placement",
    		"tags": []
    	}, {
    		"name": "special-placement",
    		"tags": []
    	}],
    	"default_placement": "default-placement"
    }

  4. Set the zone group with the modified zonegroup.json file:

    Example

    [root@host01 ~]# radosgw-admin zonegroup set < zonegroup.json

  5. Get the zone configuration and store it in a file, for example, zone.json:

    Example

    [root@host01 ~]# radosgw-admin zone get > zone.json

  6. Edit the zone file and add the new placement policy key under placement_pools:

    Example

    {
    	"domain_root": ".rgw",
    	"control_pool": ".rgw.control",
    	"gc_pool": ".rgw.gc",
    	"log_pool": ".log",
    	"intent_log_pool": ".intent-log",
    	"usage_log_pool": ".usage",
    	"user_keys_pool": ".users",
    	"user_email_pool": ".users.email",
    	"user_swift_pool": ".users.swift",
    	"user_uid_pool": ".users.uid",
    	"system_key": {
    		"access_key": "",
    		"secret_key": ""
    	},
    	"placement_pools": [{
    		"key": "default-placement",
    		"val": {
    			"index_pool": ".rgw.buckets.index",
    			"data_pool": ".rgw.buckets",
    			"data_extra_pool": ".rgw.buckets.extra"
    		}
    	}, {
    		"key": "special-placement",
    		"val": {
    			"index_pool": ".rgw.buckets.index",
    			"data_pool": ".rgw.buckets.special",
    			"data_extra_pool": ".rgw.buckets.extra"
    		}
    	}]
    }

  7. Set the new zone configuration:

    Example

    [root@host01 ~]# radosgw-admin zone set < zone.json

  8. Update the zone group map:

    Example

    [root@host01 ~]# radosgw-admin period update --commit

    The special-placement entry is listed as a placement_target.

  9. To specify the storage policy when making a request:

    Example

    $ curl -i http://10.0.0.1/swift/v1/TestContainer/file.txt -X PUT -H "X-Storage-Policy: special-placement" -H "X-Auth-Token: AUTH_rgwtxxxxxx"
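
The JSON edits in steps 3 and 6 can also be scripted instead of made by hand. The following is a minimal Python sketch, not part of the product tooling: the function names add_placement_target and add_placement_pool are hypothetical, and the sketch assumes the files have the shape shown in the examples above.

```python
import json


def add_placement_target(zonegroup: dict, name: str) -> dict:
    """Add a placement target to a zone group config if it is not already present."""
    targets = zonegroup.setdefault("placement_targets", [])
    if not any(t.get("name") == name for t in targets):
        targets.append({"name": name, "tags": []})
    return zonegroup


def add_placement_pool(zone: dict, key: str, index_pool: str,
                       data_pool: str, data_extra_pool: str) -> dict:
    """Add a placement pool mapping to a zone config if the key is not already present."""
    pools = zone.setdefault("placement_pools", [])
    if not any(p.get("key") == key for p in pools):
        pools.append({
            "key": key,
            "val": {
                "index_pool": index_pool,
                "data_pool": data_pool,
                "data_extra_pool": data_extra_pool,
            },
        })
    return zone


if __name__ == "__main__":
    # Small stand-ins for the contents of zonegroup.json and zone.json.
    zonegroup = {"name": "default",
                 "placement_targets": [{"name": "default-placement", "tags": []}]}
    zone = {"placement_pools": []}
    add_placement_target(zonegroup, "special-placement")
    add_placement_pool(zone, "special-placement", ".rgw.buckets.index",
                       ".rgw.buckets.special", ".rgw.buckets.extra")
    print(json.dumps(zonegroup, indent=4))
    print(json.dumps(zone, indent=4))
```

Both helpers leave an existing entry untouched, so running them twice does not create duplicate targets. In practice you would load the real files with json.load, apply the helpers, and write the result back before running the set commands in steps 4 and 7.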

7.3. Creating indexless buckets

You can configure a placement target where created buckets do not use the bucket index to store the object index; that is, indexless buckets. Placement targets that do not use data replication or listing might implement indexless buckets. Indexless buckets provide a mechanism in which the placement target does not track objects in specific buckets. This removes the resource contention that occurs on every object write and reduces the number of round trips that the Ceph Object Gateway needs to make to the Ceph storage cluster. This can have a positive effect on concurrent operations and small object write performance.

Important

The bucket index does not reflect the correct state of the bucket, and listing these buckets does not correctly return their list of objects. This affects multiple features. Specifically, these buckets are not synced in a multi-zone environment because the bucket index is not used to store change information. Red Hat recommends not to use S3 object versioning on indexless buckets, because the bucket index is necessary for this feature.

Note

Using indexless buckets removes the limit of the max number of objects in a single bucket.

Note

Objects in indexless buckets cannot be listed from NFS.

Prerequisites

  • A running and healthy Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway software.
  • Root-level access to a Ceph Object Gateway node.

Procedure

  1. Add a new placement target to the zonegroup:

    Example

    [ceph: root@host03 /]# radosgw-admin zonegroup placement add --rgw-zonegroup="default" \
      --placement-id="indexless-placement"

  2. Add a new placement target to the zone:

    Example

    [ceph: root@host03 /]# radosgw-admin zone placement add --rgw-zone="default" \
       --placement-id="indexless-placement" \
       --data-pool="default.rgw.buckets.data" \
       --index-pool="default.rgw.buckets.index" \
       --data_extra_pool="default.rgw.buckets.non-ec" \
       --placement-index-type="indexless"

  3. Set the zonegroup’s default placement to indexless-placement:

    Example

    [ceph: root@host03 /]# radosgw-admin zonegroup placement default --placement-id "indexless-placement"

    In this example, the buckets created in the indexless-placement target will be indexless buckets.

  4. Update and commit the period if the cluster is in a multi-site configuration:

    Example

    [ceph: root@host03 /]# radosgw-admin period update --commit

  5. Restart the Ceph Object Gateways on all nodes in the storage cluster for the change to take effect:

    Syntax

    ceph orch restart SERVICE_TYPE

    Example

    [ceph: root@host03 /]# ceph orch restart rgw

7.4. Configure bucket index resharding

As a storage administrator, you can configure bucket index resharding in single-site and multi-site deployments to improve performance.

You can reshard a bucket index either manually offline or dynamically online.

7.4.1. Bucket index resharding

The Ceph Object Gateway stores bucket index data in the index pool, which defaults to .rgw.buckets.index. When a client puts many objects in a single bucket without quotas set for the maximum number of objects per bucket, the index pool can suffer significant performance degradation.

  • Bucket index resharding prevents performance bottlenecks when you add a high number of objects per bucket.
  • You can configure bucket index resharding for new buckets or change the bucket index on the existing ones.
  • Set the shard count to the prime number nearest to the calculated shard count. Bucket indexes with a prime number of shards tend to distribute bucket index entries more evenly across the shards.
  • Bucket index can be resharded manually or dynamically.

    During dynamic bucket index resharding, the Ceph Object Gateway periodically checks all buckets and detects those that require resharding. If a bucket has grown larger than the value specified in the rgw_max_objs_per_shard parameter, the Ceph Object Gateway reshards the bucket dynamically in the background. The default value for rgw_max_objs_per_shard is 100,000 objects per shard. Dynamic bucket index resharding works as expected on an upgraded single-site configuration without any modification to the zone or the zone group. A single-site configuration can be any of the following:

    • A default zone configuration with no realm.
    • A non-default configuration with at least one realm.
    • A multi-realm single-site configuration.
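
The advice to use the prime number nearest to the calculated shard count can be sketched in Python as follows. This is an illustrative helper, not a Ceph utility: the names is_prime and nearest_prime are hypothetical, and breaking ties in favor of the larger prime is an assumption, because the text does not say which way to round.

```python
import math


def is_prime(n: int) -> bool:
    """Trial-division primality test, sufficient for shard counts up to 65,521."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def nearest_prime(calculated_shards: int) -> int:
    """Return the prime closest to the calculated shard count.

    Assumption: when two primes are equally close, the larger one is chosen.
    """
    if calculated_shards <= 2:
        return 2
    offset = 0
    while True:
        if is_prime(calculated_shards + offset):
            return calculated_shards + offset
        if is_prime(calculated_shards - offset):
            return calculated_shards - offset
        offset += 1


if __name__ == "__main__":
    for count in (12, 24, 100):
        print(count, "->", nearest_prime(count))
```

For a calculated count of 12, the primes 11 and 13 are equally close, so the sketch picks 13 under the tie-breaking assumption above.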

7.4.2. Recovering bucket index

Resharding a bucket that was created with bucket_index_max_shards = 0 removes the bucket’s metadata. However, you can restore the bucket indexes by recovering the affected buckets.

The /usr/bin/rgw-restore-bucket-index tool creates temporary files in the /tmp directory. The space these temporary files consume depends on the number of objects in the affected buckets. A bucket with more than 10M objects needs more than 4GB of free space in the /tmp directory. If the storage space in /tmp is exhausted, the tool fails with the following message:

ln: failed to access '/tmp/rgwrbi-object-list.4053207': No such file or directory

The temporary objects are removed.
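
Because the tool fails when /tmp runs out of space, it can be worth checking the free space before starting a recovery. The following Python sketch uses only the standard library; the function name has_free_space is hypothetical, and treating 4GB as 4 GiB is an assumption made for the illustration.

```python
import shutil

GIB = 1024 ** 3


def has_free_space(path: str = "/tmp", required_bytes: int = 4 * GIB) -> bool:
    """Return True if the filesystem that holds `path` has more than required_bytes free."""
    return shutil.disk_usage(path).free > required_bytes


if __name__ == "__main__":
    if has_free_space():
        print("/tmp has more than 4 GiB free")
    else:
        print("/tmp has 4 GiB or less free; free up space before recovering large buckets")
```

The 4 GiB default reflects the figure given above for buckets with more than 10M objects; smaller buckets need less.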

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A Ceph Object Gateway installed at a minimum of two sites.
  • The jq package installed.

Procedure

  • Perform either of the following steps to recover the bucket indexes:

    • Run the radosgw-admin object reindex --bucket BUCKET_NAME --object OBJECT_NAME command.
    • Run the /usr/bin/rgw-restore-bucket-index -b BUCKET_NAME -p DATA_POOL_NAME script.

      Example

      [root@host01 ceph]# /usr/bin/rgw-restore-bucket-index -b bucket-large-1 -p local-zone.rgw.buckets.data
      
      marker is d8a347a4-99b6-4312-a5c1-75b83904b3d4.41610.2
      bucket_id is d8a347a4-99b6-4312-a5c1-75b83904b3d4.41610.2
      number of bucket index shards is 5
      data pool is local-zone.rgw.buckets.data
      NOTICE: This tool is currently considered EXPERIMENTAL.
      The list of objects that we will attempt to restore can be found in "/tmp/rgwrbi-object-list.49946".
      Please review the object names in that file (either below or in another window/terminal) before proceeding.
      Type "proceed!" to proceed, "view" to view object list, or "q" to quit: view
      Viewing...
      Type "proceed!" to proceed, "view" to view object list, or "q" to quit: proceed!
      Proceeding...
      NOTICE: Bucket stats are currently incorrect. They can be restored with the following command after 2 minutes:
          radosgw-admin bucket list --bucket=bucket-large-1 --allow-unordered --max-entries=1073741824
      Would you like to take the time to recalculate bucket stats now? [yes/no] yes
      Done
      
      real    2m16.530s
      user    0m1.082s
      sys    0m0.870s

Note

  • The tool does not work for versioned buckets.

    [root@host01 ~]# time rgw-restore-bucket-index  --proceed serp-bu-ver-1 default.rgw.buckets.data
    NOTICE: This tool is currently considered EXPERIMENTAL.
    marker is e871fb65-b87f-4c16-a7c3-064b66feb1c4.25076.5
    bucket_id is e871fb65-b87f-4c16-a7c3-064b66feb1c4.25076.5
    Error: this bucket appears to be versioned, and this tool cannot work with versioned buckets.
  • The tool’s scope is limited to a single site, not multi-site. If you run the rgw-restore-bucket-index tool at site-1, it does not recover objects in site-2, and vice versa. In a multi-site configuration, run the recovery tool and the object reindex command at both sites for a bucket.

7.4.3. Limitations of bucket index resharding

Important

Use the following limitations with caution. There are implications related to your hardware selections, so you should always discuss these requirements with your Red Hat account team.

  • Maximum number of objects in one bucket before it needs resharding: Use a maximum of 102,400 objects per bucket index shard. To take full advantage of resharding and maximize parallelism, provide a sufficient number of OSDs in the Ceph Object Gateway bucket index pool. This parallelization scales with the number of Ceph Object Gateway instances, and replaces the in-order index shard enumeration with a number sequence. The default locking timeout is extended from 60 seconds to 90 seconds.
  • Maximum number of objects when using sharding: Based on prior testing, the number of bucket index shards currently supported is 65,521.
  • You can reshard a bucket three times before the other zones catch up: Resharding is not recommended until the older generations synchronize. Around four generations of the buckets from previous reshards are supported. Once the limit is reached, dynamic resharding does not reshard the bucket again until at least one of the old log generations is fully trimmed. Using the command radosgw-admin bucket reshard throws the following error:

    Bucket _BUCKET_NAME_ already has too many log generations (4) from previous reshards that peer zones haven't finished syncing.
    Resharding is not recommended until the old generations sync, but you can force a reshard with `--yes-i-really-mean-it`.
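
The numeric limits above can be collected into a small Python sketch for planning purposes. The constants come from this section; the function names min_shards_for and can_reshard are hypothetical, and reading the error message as "a bucket with four log generations cannot be resharded again" is an interpretation rather than documented behavior.

```python
MAX_OBJS_PER_SHARD = 102_400   # maximum objects per bucket index shard (from the list above)
MAX_SHARDS = 65_521            # maximum supported bucket index shards (from the list above)
MAX_LOG_GENERATIONS = 4        # generation count reported in the error message above


def min_shards_for(objects: int) -> int:
    """Smallest shard count that keeps every shard at or below MAX_OBJS_PER_SHARD."""
    shards = max(1, -(-objects // MAX_OBJS_PER_SHARD))  # ceiling division
    if shards > MAX_SHARDS:
        raise ValueError(
            f"{objects} objects would need {shards} shards, above the limit of {MAX_SHARDS}"
        )
    return shards


def can_reshard(log_generations: int) -> bool:
    """Assumed reading of the error message: resharding is refused at four generations."""
    return log_generations < MAX_LOG_GENERATIONS


if __name__ == "__main__":
    print(min_shards_for(5_000_000))
    print(can_reshard(3), can_reshard(4))
```

Under these limits, a single bucket tops out at 65,521 × 102,400 objects before the per-shard ceiling is exceeded.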

7.4.4. Configuring bucket index resharding in simple deployments

To enable and configure bucket index resharding on all new buckets, use the rgw_override_bucket_index_max_shards parameter.

You can set the parameter to one of the following values:

  • 0 to disable bucket index sharding, which is the default value.
  • A value greater than 0 to enable bucket sharding and to set the maximum number of shards.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A Ceph Object Gateway installed at a minimum of two sites.

Procedure

  1. Calculate the recommended number of shards:

    number of objects expected in a bucket / 100,000

    Note

    The maximum number of bucket index shards currently supported is 65,521.

  2. Set the rgw_override_bucket_index_max_shards parameter accordingly:

    Syntax

    ceph config set client.rgw rgw_override_bucket_index_max_shards VALUE

    Replace VALUE with the recommended number of shards calculated:

    Example

    [ceph: root@host01 /]# ceph config set client.rgw rgw_override_bucket_index_max_shards 12

    • To configure bucket index resharding for all instances of the Ceph Object Gateway, set the rgw_override_bucket_index_max_shards parameter with the global option.
    • To configure bucket index resharding only for a particular instance of the Ceph Object Gateway, add rgw_override_bucket_index_max_shards parameter under the instance.
  3. Restart the Ceph Object Gateways on all nodes in the cluster for the changes to take effect:

    Syntax

    ceph orch restart SERVICE_TYPE

    Example

    [ceph: root@host01 /]# ceph orch restart rgw
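
Steps 1 and 2 can be combined in a short Python sketch that derives the shard count from the expected object count and assembles the matching ceph config set command. The function name override_shards_command is hypothetical, and rounding the division up to a whole shard is an assumption, because the formula above does not state a rounding rule.

```python
import math
from typing import List

MAX_SHARDS = 65_521  # maximum supported bucket index shards, from the note in step 1


def override_shards_command(expected_objects: int) -> List[str]:
    """Build the ceph config set command for the calculated shard count.

    Assumption: the result of expected_objects / 100,000 is rounded up.
    """
    shards = max(1, math.ceil(expected_objects / 100_000))
    if shards > MAX_SHARDS:
        raise ValueError(f"{shards} shards exceeds the supported maximum of {MAX_SHARDS}")
    return ["ceph", "config", "set", "client.rgw",
            "rgw_override_bucket_index_max_shards", str(shards)]


if __name__ == "__main__":
    # 1.2 million expected objects gives 12 shards, matching the example in step 2.
    print(" ".join(override_shards_command(1_200_000)))
```

The sketch only builds the argument list; running it against a cluster, for example with subprocess.run, is left to the administrator.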

7.4.5. Configuring bucket index resharding in multi-site deployments

In multi-site deployments, each zone can have a different index_pool setting to manage failover. To configure a consistent shard count for zones in one zone group, set the bucket_index_max_shards parameter in the configuration for that zone group. The default value of the bucket_index_max_shards parameter is 11.

You can set the parameter to one of the following values:

  • 0 to disable bucket index sharding.
  • A value greater than 0 to enable bucket sharding and to set the maximum number of shards.

Note

Mapping the index pool, for each zone, if applicable, to a CRUSH ruleset of SSD-based OSDs might also help with bucket index performance. See the Establishing performance domains section for more information.

Important

To prevent sync issues in multi-site deployments, a bucket should not have more than three generation gaps.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A Ceph Object Gateway installed at a minimum of two sites.

Procedure

  1. Calculate the recommended number of shards:

    number of objects expected in a bucket / 100,000

    Note

    The maximum number of bucket index shards currently supported is 65,521.

  2. Extract the zone group configuration to the zonegroup.json file:

    Example

    [ceph: root@host01 /]# radosgw-admin zonegroup get > zonegroup.json

  3. In the zonegroup.json file, set the bucket_index_max_shards parameter for each named zone:

    Syntax

    "bucket_index_max_shards": VALUE

    Replace VALUE with the recommended number of shards calculated:

    Example

    "bucket_index_max_shards": 12

  4. Reset the zone group:

    Example

    [ceph: root@host01 /]# radosgw-admin zonegroup set < zonegroup.json

  5. Update the period:

    Example

    [ceph: root@host01 /]# radosgw-admin period update --commit

  6. Check if resharding is complete:

    Syntax

    radosgw-admin reshard status --bucket BUCKET_NAME

    Example

    [ceph: root@host01 /]# radosgw-admin reshard status --bucket data

Verification

  • Check the sync status of the storage cluster:

    Example

    [ceph: root@host01 /]# radosgw-admin sync status
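
Step 3 of the procedure, setting bucket_index_max_shards for each named zone, can be scripted. The following Python sketch assumes the zone group file has a top-level zones list, as in the zone group example earlier in this chapter; the function name set_max_shards and the zone names in the sample are hypothetical.

```python
import json


def set_max_shards(zonegroup: dict, shards: int) -> dict:
    """Set bucket_index_max_shards to the same value on every zone in the zone group."""
    for zone in zonegroup.get("zones", []):
        zone["bucket_index_max_shards"] = shards
    return zonegroup


if __name__ == "__main__":
    # Hypothetical stand-in for the contents of zonegroup.json.
    sample = {
        "name": "us",
        "zones": [
            {"name": "zone-a", "bucket_index_max_shards": 11},
            {"name": "zone-b", "bucket_index_max_shards": 11},
        ],
    }
    print(json.dumps(set_max_shards(sample, 12), indent=4))
```

Applying one value to every zone keeps the shard count consistent across the zone group, which is the goal stated at the start of this section.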

7.4.6. Resharding bucket index dynamically

You can reshard the bucket index dynamically by adding the bucket to the resharding queue, where it is scheduled to be resharded. The reshard threads run in the background and execute the scheduled resharding operations, one at a time.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A Ceph Object Gateway installed at a minimum of two sites.

Procedure

  1. Verify that the rgw_dynamic_resharding parameter is set to true.

    Example

    [ceph: root@host01 /]# radosgw-admin period get

  2. Optional: Customize Ceph configuration using the following command:

    Syntax

    ceph config set client.rgw OPTION VALUE

    Replace OPTION with the following options:

    • rgw_reshard_num_logs: The number of shards for the resharding log. The default value is 16.
    • rgw_reshard_bucket_lock_duration: The duration of the lock on a bucket during resharding. The default value is 360 seconds.
    • rgw_dynamic_resharding: Enables or disables dynamic resharding. The default value is true.
    • rgw_max_objs_per_shard: The maximum number of objects per shard. The default value is 100000 objects per shard.
    • rgw_reshard_thread_interval: The maximum time between rounds of reshard thread processing. The default value is 600 seconds.

    Example

    [ceph: root@host01 /]# ceph config set client.rgw rgw_reshard_num_logs 23

  3. Add a bucket to the resharding queue:

    Syntax

    radosgw-admin reshard add --bucket BUCKET --num-shards NUMBER

    Example

    [ceph: root@host01 /]# radosgw-admin reshard add --bucket data --num-shards 10

  4. List the resharding queue:

    Example

    [ceph: root@host01 /]# radosgw-admin reshard list

  5. Check the bucket log generations and shards:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket layout --bucket data
    {
        "layout": {
            "resharding": "None",
            "current_index": {
                "gen": 1,
                "layout": {
                    "type": "Normal",
                    "normal": {
                        "num_shards": 23,
                        "hash_type": "Mod"
                    }
                }
            },
            "logs": [
                {
                    "gen": 0,
                    "layout": {
                        "type": "InIndex",
                        "in_index": {
                            "gen": 0,
                            "layout": {
                                "num_shards": 11,
                                "hash_type": "Mod"
                            }
                        }
                    }
                },
                {
                    "gen": 1,
                    "layout": {
                        "type": "InIndex",
                        "in_index": {
                            "gen": 1,
                            "layout": {
                                "num_shards": 23,
                                "hash_type": "Mod"
                            }
                        }
                    }
                }
            ]
        }
    }

  6. Check bucket resharding status:

    Syntax

    radosgw-admin reshard status --bucket BUCKET

    Example

    [ceph: root@host01 /]# radosgw-admin reshard status --bucket data

  7. Process entries on the resharding queue immediately:

    Example

    [ceph: root@host01 /]# radosgw-admin reshard process

  8. Cancel pending bucket resharding:

    Warning

    You can only cancel pending resharding operations. Do not cancel ongoing resharding operations.

    Syntax

    radosgw-admin reshard cancel --bucket BUCKET

    Example

    [ceph: root@host01 /]# radosgw-admin reshard cancel --bucket data

Verification

  • Check bucket resharding status:

    Syntax

    radosgw-admin reshard status --bucket BUCKET

    Example

    [ceph: root@host01 /]# radosgw-admin reshard status --bucket data
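
The radosgw-admin bucket layout output from step 5 is JSON, so the current shard count and the number of log generations can be pulled out programmatically. The following Python sketch assumes the output has the structure shown in that step; the function name summarize_layout is hypothetical.

```python
import json


def summarize_layout(layout_json: str) -> dict:
    """Extract the resharding state, current shard count, and log generation count."""
    layout = json.loads(layout_json)["layout"]
    current = layout["current_index"]
    return {
        "resharding": layout["resharding"],
        "current_gen": current["gen"],
        "current_shards": current["layout"]["normal"]["num_shards"],
        "log_generations": len(layout["logs"]),
    }


if __name__ == "__main__":
    # Trimmed version of the example output in step 5.
    sample = json.dumps({
        "layout": {
            "resharding": "None",
            "current_index": {
                "gen": 1,
                "layout": {"type": "Normal",
                           "normal": {"num_shards": 23, "hash_type": "Mod"}},
            },
            "logs": [{"gen": 0}, {"gen": 1}],
        }
    })
    print(summarize_layout(sample))
```

In a multi-site deployment, the number of log generations is the figure to watch against the generation limit described in the Limitations of bucket index resharding section.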

7.4.7. Resharding bucket index dynamically in multi-site configuration

Red Hat Ceph Storage 5.3 supports dynamic bucket index resharding in multi-site configuration. The feature allows buckets to be resharded in a multi-site configuration without interrupting the replication of their objects. When rgw_dynamic_resharding is enabled, it runs on each zone independently, and the zones might choose different shard counts for the same bucket.

Follow these steps only for an existing Red Hat Ceph Storage cluster. You need to enable the resharding feature manually on the existing zones and zone groups after upgrading the storage cluster.

Note

For a new installation of Red Hat Ceph Storage 5.3, the resharding feature for the zones and the zone groups is supported and enabled by default.

Note

You can reshard a bucket three times before the other zones catch up. See the Limitations of bucket index resharding section for more details.

Note

If a bucket is created and uploaded with more than the threshold number of objects for resharding dynamically, you need to continue to write I/Os to old buckets to begin the resharding process.

Prerequisites

  • The Red Hat Ceph Storage clusters at both sites are upgraded to the latest version.
  • All the Ceph Object Gateway daemons enabled at both the sites are upgraded to the latest version.
  • Root-level access to all the nodes.

Procedure

  1. Check if resharding is enabled on the zonegroup:

    Example

    [ceph: root@host01 /]# radosgw-admin sync status

    If resharding is not listed under zonegroup features enabled for the zonegroup, continue with the procedure.

  2. Enable the resharding feature on all the zonegroups in the multi-site configuration where Ceph Object Gateway is installed:

    Syntax

    radosgw-admin zonegroup modify --rgw-zonegroup=ZONEGROUP_NAME --enable-feature=resharding

    Example

    [ceph: root@host01 /]# radosgw-admin zonegroup modify --rgw-zonegroup=us --enable-feature=resharding

  3. Update the period and commit:

    Example

    [ceph: root@host01 /]# radosgw-admin period update --commit

  4. Enable the resharding feature on all the zones in the multi-site configuration where Ceph Object Gateway is installed:

    Syntax

    radosgw-admin zone modify --rgw-zone=ZONE_NAME --enable-feature=resharding

    Example

    [ceph: root@host01 /]# radosgw-admin zone modify --rgw-zone=us_east --enable-feature=resharding

  5. Update the period and commit:

    Example

    [ceph: root@host01 /]# radosgw-admin period update --commit

  6. Verify that the resharding feature is enabled on the zones and zonegroups. Each zone lists its supported_features and each zonegroup lists its enabled_features:

    Example

    [ceph: root@host01 /]# radosgw-admin period get
    
    "zones": [
                        {
                            "id": "505b48db-6de0-45d5-8208-8c98f7b1278d",
                            "name": "us_east",
                            "endpoints": [
                                "http://10.0.208.11:8080"
                            ],
                            "log_meta": "false",
                            "log_data": "true",
                            "bucket_index_max_shards": 11,
                            "read_only": "false",
                            "tier_type": "",
                            "sync_from_all": "true",
                            "sync_from": [],
                            "redirect_zone": "",
                            "supported_features": [
                                "resharding"
                            ]
    .
    .
                    "default_placement": "default-placement",
                    "realm_id": "26cf6f23-c3a0-4d57-aae4-9b0010ee55cc",
                    "sync_policy": {
                        "groups": []
                    },
                    "enabled_features": [
                        "resharding"
                    ]

  7. Check the sync status:

    Example

    [ceph: root@host01 /]# radosgw-admin sync status
              realm 26cf6f23-c3a0-4d57-aae4-9b0010ee55cc (usa)
          zonegroup 33a17718-6c77-493e-99fe-048d3110a06e (us)
               zone 505b48db-6de0-45d5-8208-8c98f7b1278d (us_east)
    zonegroup features enabled: resharding

    In this example, you can see that the resharding feature is enabled for the us zonegroup.

  8. Optional: You can disable the resharding feature for the zonegroups:

    1. Disable the feature on all the zonegroups in the multi-site where Ceph Object Gateway is installed:

      Syntax

      radosgw-admin zonegroup modify --rgw-zonegroup=ZONEGROUP_NAME --disable-feature=resharding

      Example

      [ceph: root@host01 /]# radosgw-admin zonegroup modify --rgw-zonegroup=us --disable-feature=resharding

    2. Update the period and commit:

      Example

      [ceph: root@host01 /]# radosgw-admin period update --commit
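
The check in step 6 can be automated once the period get output has been parsed as JSON. The following Python sketch works on a single zone group object with the zones, supported_features, and enabled_features fields shown in that step; the function name resharding_status is hypothetical, and how the zone group object is located within the full period output is not shown here and depends on your deployment.

```python
def resharding_status(zonegroup: dict) -> dict:
    """Report whether resharding is enabled on a zone group and supported by all its zones."""
    zones_missing = [
        zone["name"]
        for zone in zonegroup.get("zones", [])
        if "resharding" not in zone.get("supported_features", [])
    ]
    return {
        "enabled_on_zonegroup": "resharding" in zonegroup.get("enabled_features", []),
        "zones_missing_support": zones_missing,
    }


if __name__ == "__main__":
    # Reduced version of the zone group shown in step 6.
    sample = {
        "zones": [{"name": "us_east", "supported_features": ["resharding"]}],
        "enabled_features": ["resharding"],
    }
    print(resharding_status(sample))
```

An empty zones_missing_support list together with enabled_on_zonegroup set to True corresponds to the state shown in the step 6 example.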

Additional Resources

  • For more configurable parameters for dynamic bucket index resharding, see the Dynamic Bucket Index Resharding section in the Red Hat Ceph Storage Object Gateway Configuration and Administration Guide.

7.4.8. Resharding bucket index manually

If a bucket has grown larger than the initial configuration for which it was optimized, reshard the bucket index pool by using the radosgw-admin bucket reshard command. This command performs the following tasks:

  • Creates a new set of bucket index objects for the specified bucket.
  • Distributes object entries across these bucket index objects.
  • Creates a new bucket instance.
  • Links the new bucket instance with the bucket so that all new index operations go through the new bucket indexes.
  • Prints the old and the new bucket ID to the command output.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A Ceph Object Gateway installed at a minimum of two sites.

Procedure

  1. Back up the original bucket index:

    Syntax

    radosgw-admin bi list --bucket=BUCKET > BUCKET.list.backup

    Example

    [ceph: root@host01 /]# radosgw-admin bi list --bucket=data > data.list.backup

  2. Reshard the bucket index:

    Syntax

    radosgw-admin bucket reshard --bucket=BUCKET --num-shards=NUMBER

    Example

    [ceph: root@host01 /]# radosgw-admin bucket reshard --bucket=data --num-shards=100

Verification

  • Check bucket resharding status:

    Syntax

    radosgw-admin reshard status --bucket BUCKET

    Example

    [ceph: root@host01 /]# radosgw-admin reshard status --bucket data

7.4.9. Cleaning stale instances of bucket entries after resharding

The resharding process might not clean stale instances of bucket entries automatically and these instances can impact performance of the storage cluster.

Clean them manually to prevent the stale instances from negatively impacting the performance of the storage cluster.

Important

Contact Red Hat Support prior to cleaning the stale instances.

Important

Use this procedure only in simple deployments, not in multi-site clusters.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Ceph Object Gateway installed.

Procedure

  1. List stale instances:

    Example

    [ceph: root@host01 /]# radosgw-admin reshard stale-instances list

  2. Clean the stale instances of the bucket entries:

    Example

    [ceph: root@host01 /]# radosgw-admin reshard stale-instances rm

Verification

  • Check bucket resharding status:

    Syntax

    radosgw-admin reshard status --bucket BUCKET

    Example

    [ceph: root@host01 /]# radosgw-admin reshard status --bucket data

7.4.10. Fixing lifecycle policies after resharding

In storage clusters with resharded bucket instances, the old lifecycle processes would have flagged and deleted the lifecycle processing because the bucket instance changed during the reshard. For older buckets that had lifecycle policies and have since been resharded, you can fix the lifecycle policies with the reshard fix option.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Ceph Object Gateway installed.

Procedure

  • Fix the lifecycle policies of the older bucket:

    Syntax

    radosgw-admin lc reshard fix --bucket BUCKET_NAME

    Important

    If you do not use the --bucket argument, then the command fixes lifecycle policies for all the buckets in the storage cluster.

    Example

    [ceph: root@host01 /]# radosgw-admin lc reshard fix --bucket mybucket

7.5. Enabling compression

The Ceph Object Gateway supports server-side compression of uploaded objects using any of Ceph’s compression plugins. These include:

  • zlib: Supported.
  • snappy: Supported.
  • zstd: Supported.

Configuration

To enable compression on a zone’s placement target, provide the --compression=TYPE option to the radosgw-admin zone placement modify command. The compression TYPE refers to the name of the compression plugin to use when writing new object data.

Each compressed object stores the compression type. Changing the setting does not hinder the ability to decompress existing compressed objects, nor does it force the Ceph Object Gateway to recompress existing objects.

This compression setting applies to all new objects uploaded to buckets using this placement target.

To disable compression on a zone’s placement target, provide the --compression=TYPE option to the radosgw-admin zone placement modify command and specify an empty string or none.

Example

[root@host01 ~]# radosgw-admin zone placement modify --rgw-zone=default --placement-id=default-placement --compression=zlib
{
...
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "default.rgw.buckets.index",
                "data_pool": "default.rgw.buckets.data",
                "data_extra_pool": "default.rgw.buckets.non-ec",
                "index_type": 0,
                "compression": "zlib"
            }
        }
    ],
...
}

After enabling or disabling compression, restart the Ceph Object Gateway instance so the change will take effect.

Note

Ceph Object Gateway creates a default zone and a set of pools. For production deployments, see the Creating a Realm section first.

Statistics

While all existing commands and APIs continue to report object and bucket sizes based on their uncompressed data, the radosgw-admin bucket stats command includes compression statistics for all buckets.

Syntax

radosgw-admin bucket stats --bucket=BUCKET_NAME
{
...
    "usage": {
        "rgw.main": {
            "size": 1075028,
            "size_actual": 1331200,
            "size_utilized": 592035,
            "size_kb": 1050,
            "size_kb_actual": 1300,
            "size_kb_utilized": 579,
            "num_objects": 104
        }
    },
...
}

The size is the accumulated size of the objects in the bucket, uncompressed and unencrypted. The size_kb is the accumulated size in kilobytes and is calculated as ceiling(size/1024). In this example, it is ceiling(1075028/1024) = 1050.

The size_actual is the accumulated size of all the objects after each object is distributed in a set of 4096-byte blocks. If a bucket has two objects, one of size 4100 bytes and the other of 8500 bytes, the first object is rounded up to 8192 bytes, and the second one is rounded up to 12288 bytes, and their total for the bucket is 20480 bytes. The size_kb_actual is the actual size in kilobytes and is calculated as size_actual/1024. In this example, it is 1331200/1024 = 1300.

The size_utilized is the total size of the data in bytes after it has been compressed, encrypted, or both. Encryption could increase the size of the object while compression could decrease it. The size_kb_utilized is the total size in kilobytes and is calculated as ceiling(size_utilized/1024). In this example, it is ceiling(592035/1024) = 579.
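
The arithmetic above can be checked with a short sketch. The helper names size_actual and to_kb are hypothetical and are not part of radosgw-admin. The 4096-byte block size and the sample values are taken from the text and the example output above.

```python
BLOCK_SIZE = 4096  # bytes; the block size stated in the text above


def ceil_div(value, divisor):
    """Integer division that rounds up."""
    return -(-value // divisor)


def size_actual(object_sizes):
    """Round each object up to whole blocks, then sum the results."""
    return sum(ceil_div(size, BLOCK_SIZE) * BLOCK_SIZE for size in object_sizes)


def to_kb(size_bytes):
    """Convert bytes to kilobytes, rounding up as described above."""
    return ceil_div(size_bytes, 1024)


# The two-object bucket from the text: 8192 + 12288 bytes.
print(size_actual([4100, 8500]))

# size_kb, size_kb_actual, and size_kb_utilized from the example output.
print(to_kb(1075028), to_kb(1331200), to_kb(592035))
```

The first line printed is the 20480-byte total, and the second line reproduces the three kilobyte values 1050, 1300, and 579 shown in the example.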

7.6. User management

Ceph Object Storage user management refers to users that are client applications of the Ceph Object Storage service, not the Ceph Object Gateway as a client application of the Ceph Storage Cluster. You must create a user, access key, and secret to enable client applications to interact with the Ceph Object Gateway service.

There are two user types:

  • User: The term 'user' reflects a user of the S3 interface.
  • Subuser: The term 'subuser' reflects a user of the Swift interface. A subuser is associated with a user.

You can create, modify, view, suspend, and remove users and subusers.

Important

When managing users in a multi-site deployment, ALWAYS issue the radosgw-admin command on a Ceph Object Gateway node within the master zone of the master zone group to ensure that users synchronize throughout the multi-site cluster. DO NOT create, modify, or delete users on a multi-site cluster from a secondary zone or a secondary zone group.

In addition to creating user and subuser IDs, you may add a display name and an email address for a user. You can specify a key and secret, or generate a key and secret automatically. When generating or specifying keys, note that user IDs correspond to an S3 key type and subuser IDs correspond to a swift key type. Swift keys also have access levels of read, write, readwrite and full.

User management command line syntax generally follows the pattern user COMMAND USER_ID where USER_ID is either the --uid= option followed by the user’s ID (S3) or the --subuser= option followed by the user name (Swift).

Syntax

radosgw-admin user <create|modify|info|rm|suspend|enable|check|stats> <--uid=USER_ID|--subuser=SUB_USER_NAME> [other-options]

Additional options may be required depending on the command you issue.

7.6.1. Multi-tenancy

The Ceph Object Gateway supports multi-tenancy for both the S3 and Swift APIs, where each user and bucket lies under a "tenant." Multi-tenancy prevents namespace clashes when multiple tenants use common bucket names, such as "test", "main", and so forth.

For backward compatibility, a "legacy" tenant with an empty name is added. Whenever a bucket is referred to without an explicit tenant, the Swift API assumes the "legacy" tenant. Existing users are also stored under the legacy tenant, so they access buckets and objects the same way as in earlier releases.

Tenants as such do not have any operations on them. They appear and disappear as needed, when users are administered. To create, modify, and remove users with explicit tenants, either supply the additional --tenant option or use the "TENANT$USER" syntax in the parameters of the radosgw-admin command.

To create a user testx$tester for S3, run the following command:

Example

[root@host01 ~]# radosgw-admin --tenant testx --uid tester \
                    --display-name "Test User" --access_key TESTER \
                    --secret test123 user create

To create a user testx$tester for Swift, run one of the following commands:

Example

[root@host01 ~]# radosgw-admin --tenant testx --uid tester \
                    --display-name "Test User" --subuser tester:swift \
                    --key-type swift --access full subuser create

[root@host01 ~]# radosgw-admin key create --subuser 'testx$tester:swift' \
                    --key-type swift --secret test123

Note

The subuser with an explicit tenant must be quoted in the shell. Otherwise, the shell treats $tester as a variable and expands it.
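
When the command is assembled by a script instead of typed by hand, the quoting can be delegated to a library. The following is a minimal sketch, assuming a hypothetical helper named tenanted_id. It uses the Python standard library shlex module to produce a shell-safe command line for the key create example above.

```python
import shlex


def tenanted_id(tenant, user, subuser=None):
    """Build a TENANT$USER or TENANT$USER:SUBUSER identifier (hypothetical helper)."""
    uid = f"{tenant}${user}" if tenant else user
    return f"{uid}:{subuser}" if subuser else uid


sub = tenanted_id("testx", "tester", "swift")

# Argument list for the second command in the example above.
cmd = ["radosgw-admin", "key", "create", "--subuser", sub,
       "--key-type", "swift", "--secret", "test123"]

# shlex.join single-quotes any argument that contains shell metacharacters.
print(shlex.join(cmd))
```

Passing the argument list directly to a process launcher that does not invoke a shell avoids the quoting problem altogether. The quoted form is only needed when the command line is pasted into or interpreted by a shell.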

7.6.2. Create a user

Use the user create command to create an S3-interface user. You MUST specify a user ID and a display name. You may also specify an email address. If you DO NOT specify a key or secret, radosgw-admin will generate them for you automatically. However, you may specify a key and/or a secret if you prefer not to use generated key/secret pairs.

Syntax

radosgw-admin user create --uid=USER_ID \
[--key-type=KEY_TYPE] [--gen-access-key|--access-key=ACCESS_KEY] \
[--gen-secret | --secret=SECRET_KEY] \
[--email=EMAIL] --display-name=DISPLAY_NAME

Example

[root@host01 ~]# radosgw-admin user create --uid=janedoe --access-key=11BS02LGFB6AL6H1ADMW --secret=vzCEkuryfn060dfee4fgQPqFrncKEIkh3ZcdOANY --email=jane@example.com --display-name="Jane Doe"

{ "user_id": "janedoe",
  "display_name": "Jane Doe",
  "email": "jane@example.com",
  "suspended": 0,
  "max_buckets": 1000,
  "auid": 0,
  "subusers": [],
  "keys": [
        { "user": "janedoe",
          "access_key": "11BS02LGFB6AL6H1ADMW",
          "secret_key": "vzCEkuryfn060dfee4fgQPqFrncKEIkh3ZcdOANY"}],
  "swift_keys": [],
  "caps": [],
  "op_mask": "read, write, delete",
  "default_placement": "",
  "placement_tags": [],
  "bucket_quota": { "enabled": false,
      "max_size_kb": -1,
      "max_objects": -1},
  "user_quota": { "enabled": false,
      "max_size_kb": -1,
      "max_objects": -1},
  "temp_url_keys": []}

Important

Check the key output. Sometimes radosgw-admin generates a JSON escape (\) character, and some clients do not know how to handle JSON escape characters. Remedies include removing the JSON escape character (\), encapsulating the string in quotes, regenerating the key to ensure that it does not have a JSON escape character, or specifying the key and secret manually.
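
Clients that read the radosgw-admin output through a JSON parser do not see the escape character, because \/ is a standard JSON escape for /. The following sketch illustrates this with the Python standard library. The sample string is a trimmed fragment built for illustration from a generated Swift secret that appears later in this chapter, not complete command output.

```python
import json

# Trimmed, illustrative fragment of radosgw-admin output with escaped slashes.
raw_output = r'{"swift_keys": [{"user": "johndoe:swift", "secret_key": "E9T2rUZNu2gxUjcwUBO8n\/Ev4KX6\/GprEuH4qhu1"}]}'

# json.loads decodes the \/ escape sequences into plain / characters.
secret = json.loads(raw_output)["swift_keys"][0]["secret_key"]

print(secret)
print("backslash present:", "\\" in secret)
```

Copying the secret from the raw text keeps the backslashes, which is the case the remedies above address.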

7.6.3. Create a subuser

To create a subuser (Swift interface), you must specify the user ID (--uid=USERNAME), a subuser ID and the access level for the subuser. If you DO NOT specify a key or secret, radosgw-admin generates them for you automatically. However, you can specify a key, a secret, or both if you prefer not to use generated key and secret pairs.

Note

full is not readwrite, as it also includes the access control policy.

Syntax

radosgw-admin subuser create --uid=USER_ID --subuser=SUB_USER_ID --access=[ read | write | readwrite | full ]

Example

[root@host01 ~]# radosgw-admin subuser create --uid=janedoe --subuser=janedoe:swift --access=full

{ "user_id": "janedoe",
  "display_name": "Jane Doe",
  "email": "jane@example.com",
  "suspended": 0,
  "max_buckets": 1000,
  "auid": 0,
  "subusers": [
        { "id": "janedoe:swift",
          "permissions": "full-control"}],
  "keys": [
        { "user": "janedoe",
          "access_key": "11BS02LGFB6AL6H1ADMW",
          "secret_key": "vzCEkuryfn060dfee4fgQPqFrncKEIkh3ZcdOANY"}],
  "swift_keys": [],
  "caps": [],
  "op_mask": "read, write, delete",
  "default_placement": "",
  "placement_tags": [],
  "bucket_quota": { "enabled": false,
      "max_size_kb": -1,
      "max_objects": -1},
  "user_quota": { "enabled": false,
      "max_size_kb": -1,
      "max_objects": -1},
  "temp_url_keys": []}

7.6.4. Get user information

To get information about a user, specify user info and the user ID (--uid=USERNAME).

Example

[root@host01 ~]# radosgw-admin user info --uid=janedoe

To get information about a tenanted user, specify both the user ID and the name of the tenant.

[root@host01 ~]# radosgw-admin user info --uid=janedoe --tenant=test

7.6.5. Modify user information

To modify information about a user, you must specify the user ID (--uid=USERNAME) and the attributes you want to modify. Typical modifications are to keys and secrets, email addresses, display names, and access levels.

Example

[root@host01 ~]# radosgw-admin user modify --uid=janedoe --display-name="Jane E. Doe"

To modify subuser values, specify subuser modify and the subuser ID.

Example

[root@host01 ~]# radosgw-admin subuser modify --subuser=janedoe:swift --access=full

7.6.6. Enable and suspend users

When you create a user, the user is enabled by default. However, you may suspend user privileges and re-enable them at a later time. To suspend a user, specify user suspend and the user ID.

[root@host01 ~]# radosgw-admin user suspend --uid=johndoe

To re-enable a suspended user, specify user enable and the user ID:

[root@host01 ~]# radosgw-admin user enable --uid=johndoe

Note

Disabling the user also disables the subuser.

7.6.7. Remove a user

When you remove a user, the user and subuser are removed from the system. However, you may remove only the subuser if you wish. To remove a user (and subuser), specify user rm and the user ID.

Syntax

radosgw-admin user rm --uid=USER_ID [--purge-keys] [--purge-data]

Example

[ceph: root@host01 /]# radosgw-admin user rm --uid=johndoe --purge-data

To remove the subuser only, specify subuser rm and the subuser name.

Example

[ceph: root@host01 /]# radosgw-admin subuser rm --subuser=johndoe:swift --purge-keys

Options include:

  • Purge Data: The --purge-data option purges all data associated with the UID.
  • Purge Keys: The --purge-keys option purges all keys associated with the UID.

7.6.8. Remove a subuser

When you remove a subuser, you are removing access to the Swift interface. The user remains in the system. To remove the subuser, specify subuser rm and the subuser ID.

Syntax

radosgw-admin subuser rm --subuser=SUB_USER_ID

Example

[root@host01 /]# radosgw-admin subuser rm --subuser=johndoe:swift

Options include:

  • Purge Keys: The --purge-keys option purges all keys associated with the UID.

7.6.9. Rename a user

To change the name of a user, use the radosgw-admin user rename command. The time that this command takes depends on the number of buckets and objects that the user has. If the number is large, Red Hat recommends using the command in the Screen utility provided by the screen package.

Prerequisites

  • A working Ceph cluster.
  • root or sudo access to the host running the Ceph Object Gateway.
  • Installed Ceph Object Gateway.

Procedure

  1. Rename a user:

    Syntax

    radosgw-admin user rename --uid=CURRENT_USER_NAME --new-uid=NEW_USER_NAME

    Example

    [ceph: root@host01 /]# radosgw-admin user rename --uid=user1 --new-uid=user2
    
    {
        "user_id": "user2",
        "display_name": "user 2",
        "email": "",
        "suspended": 0,
        "max_buckets": 1000,
        "auid": 0,
        "subusers": [],
        "keys": [
            {
                "user": "user2",
                "access_key": "59EKHI6AI9F8WOW8JQZJ",
                "secret_key": "XH0uY3rKCUcuL73X0ftjXbZqUbk0cavD11rD8MsA"
            }
        ],
        "swift_keys": [],
        "caps": [],
        "op_mask": "read, write, delete",
        "default_placement": "",
        "placement_tags": [],
        "bucket_quota": {
            "enabled": false,
            "check_on_raw": false,
            "max_size": -1,
            "max_size_kb": 0,
            "max_objects": -1
        },
        "user_quota": {
            "enabled": false,
            "check_on_raw": false,
            "max_size": -1,
            "max_size_kb": 0,
            "max_objects": -1
        },
        "temp_url_keys": [],
        "type": "rgw"
    }

    If a user is inside a tenant, specify both the user name and the tenant:

    Syntax

    radosgw-admin user rename --uid USER_NAME --new-uid NEW_USER_NAME --tenant TENANT

    Example

    [ceph: root@host01 /]# radosgw-admin user rename --uid='test$user1' --new-uid='test$user2' --tenant test
    
    1000 objects processed in tvtester1. Next marker 80_tVtester1_99
    2000 objects processed in tvtester1. Next marker 64_tVtester1_44
    3000 objects processed in tvtester1. Next marker 48_tVtester1_28
    4000 objects processed in tvtester1. Next marker 2_tVtester1_74
    5000 objects processed in tvtester1. Next marker 14_tVtester1_53
    6000 objects processed in tvtester1. Next marker 87_tVtester1_61
    7000 objects processed in tvtester1. Next marker 6_tVtester1_57
    8000 objects processed in tvtester1. Next marker 52_tVtester1_91
    9000 objects processed in tvtester1. Next marker 34_tVtester1_74
    9900 objects processed in tvtester1. Next marker 9_tVtester1_95
    1000 objects processed in tvtester2. Next marker 82_tVtester2_93
    2000 objects processed in tvtester2. Next marker 64_tVtester2_9
    3000 objects processed in tvtester2. Next marker 48_tVtester2_22
    4000 objects processed in tvtester2. Next marker 32_tVtester2_42
    5000 objects processed in tvtester2. Next marker 16_tVtester2_36
    6000 objects processed in tvtester2. Next marker 89_tVtester2_46
    7000 objects processed in tvtester2. Next marker 70_tVtester2_78
    8000 objects processed in tvtester2. Next marker 51_tVtester2_41
    9000 objects processed in tvtester2. Next marker 33_tVtester2_32
    9900 objects processed in tvtester2. Next marker 9_tVtester2_83
    {
        "user_id": "test$user2",
        "display_name": "User 2",
        "email": "",
        "suspended": 0,
        "max_buckets": 1000,
        "auid": 0,
        "subusers": [],
        "keys": [
            {
                "user": "test$user2",
                "access_key": "user2",
                "secret_key": "123456789"
            }
        ],
        "swift_keys": [],
        "caps": [],
        "op_mask": "read, write, delete",
        "default_placement": "",
        "placement_tags": [],
        "bucket_quota": {
            "enabled": false,
            "check_on_raw": false,
            "max_size": -1,
            "max_size_kb": 0,
            "max_objects": -1
        },
        "user_quota": {
            "enabled": false,
            "check_on_raw": false,
            "max_size": -1,
            "max_size_kb": 0,
            "max_objects": -1
        },
        "temp_url_keys": [],
        "type": "rgw"
    }

  2. Verify that the user has been renamed successfully:

    Syntax

    radosgw-admin user info --uid=NEW_USER_NAME

    Example

    [ceph: root@host01 /]# radosgw-admin user info --uid=user2

    If a user is inside a tenant, use the TENANT$USER_NAME format:

    Syntax

    radosgw-admin user info --uid='TENANT$USER_NAME'

    Example

    [ceph: root@host01 /]# radosgw-admin user info --uid='test$user2'

Additional Resources

  • The screen(1) manual page

7.6.10. Create a key

To create a key for a user, you must specify key create. For a user, specify the user ID and the s3 key type. To create a key for a subuser, you must specify the subuser ID and the swift key type.

Example

[ceph: root@host01 /]# radosgw-admin key create --subuser=johndoe:swift --key-type=swift --gen-secret

{ "user_id": "johndoe",
  "rados_uid": 0,
  "display_name": "John Doe",
  "email": "john@example.com",
  "suspended": 0,
  "subusers": [
     { "id": "johndoe:swift",
       "permissions": "full-control"}],
  "keys": [
    { "user": "johndoe",
      "access_key": "QFAMEDSJP5DEKJO0DDXY",
      "secret_key": "iaSFLDVvDdQt6lkNzHyW4fPLZugBAI1g17LO0+87"}],
  "swift_keys": [
    { "user": "johndoe:swift",
      "secret_key": "E9T2rUZNu2gxUjcwUBO8n\/Ev4KX6\/GprEuH4qhu1"}]}

7.6.11. Add and remove access keys

Users and subusers must have access keys to use the S3 and Swift interfaces. When you create a user or subuser and you do not specify an access key and secret, the key and secret get generated automatically. You may create a key and either specify or generate the access key and/or secret. You may also remove an access key and secret. Options include:

  • --secret=SECRET_KEY specifies a secret key, for example, manually generated.
  • --gen-access-key generates a random access key (for S3 users by default).
  • --gen-secret generates a random secret key.
  • --key-type=KEY_TYPE specifies a key type. The options are: swift and s3.

To add a key, specify the user:

Example

[root@host01 ~]# radosgw-admin key create --uid=johndoe --key-type=s3 --gen-access-key --gen-secret

You can also specify the access key and the secret instead of generating them.

To remove an access key, you need to specify the user and the key:

  1. Find the access key for the specific user:

    Example

    [root@host01 ~]# radosgw-admin user info --uid=johndoe

    The access key is the "access_key" value in the output:

    Example

    [root@host01 ~]# radosgw-admin user info --uid=johndoe
    {
        "user_id": "johndoe",
        ...
        "keys": [
            {
                "user": "johndoe",
                "access_key": "0555b35654ad1656d804",
                "secret_key": "h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=="
            }
        ],
        ...
    }

  2. Specify the user ID and the access key from the previous step to remove the access key:

    Syntax

    radosgw-admin key rm --uid=USER_ID --access-key ACCESS_KEY

    Example

    [root@host01 ~]# radosgw-admin key rm --uid=johndoe --access-key 0555b35654ad1656d804
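
The two steps above can be sketched in a script. The following minimal example parses user info output and builds the matching key rm argument list. The helper names access_keys and key_rm_command are hypothetical, and the JSON is the trimmed output from step 1 with the elided fields left out.

```python
import json

# Trimmed "radosgw-admin user info" output from step 1.
user_info_json = """
{
    "user_id": "johndoe",
    "keys": [
        {
            "user": "johndoe",
            "access_key": "0555b35654ad1656d804",
            "secret_key": "h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=="
        }
    ]
}
"""


def access_keys(user_info):
    """Return every S3 access key listed under "keys"."""
    return [entry["access_key"] for entry in user_info.get("keys", [])]


def key_rm_command(uid, access_key):
    """Build the argument list for the key rm syntax shown in step 2."""
    return ["radosgw-admin", "key", "rm", f"--uid={uid}", "--access-key", access_key]


info = json.loads(user_info_json)
for key in access_keys(info):
    # Print the command for review instead of running it.
    print(" ".join(key_rm_command(info["user_id"], key)))
```

The sketch only prints the command. Whether to run it automatically is a decision for the administrator, because removing a key immediately breaks any client that still uses it.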

7.6.12. Add and remove admin capabilities

The Ceph Storage Cluster provides an administrative API that enables users to run administrative functions via the REST API. By default, users DO NOT have access to this API. To enable a user to exercise administrative functionality, provide the user with administrative capabilities.

To add administrative capabilities to a user, run the following command:

Syntax

radosgw-admin caps add --uid=USER_ID --caps=CAPS

You can add read, write, or all capabilities to users, buckets, metadata, usage (utilization), and zone.

Syntax

--caps="[users|buckets|metadata|usage|zone]=[*|read|write|read, write]"

Example

[root@host01 ~]# radosgw-admin caps add --uid=johndoe --caps="users=*"
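
A script that grants capabilities can validate its input against the documented values before calling radosgw-admin. The following is a minimal sketch. The helper caps_option is hypothetical, and the allowed values are copied from the --caps syntax line above.

```python
# Capability types and permissions taken from the --caps syntax line above.
CAP_TYPES = {"users", "buckets", "metadata", "usage", "zone"}
CAP_PERMS = {"*", "read", "write", "read, write"}


def caps_option(cap_type, perm):
    """Return a --caps option, rejecting values outside the documented sets."""
    if cap_type not in CAP_TYPES:
        raise ValueError(f"unknown capability type: {cap_type!r}")
    if perm not in CAP_PERMS:
        raise ValueError(f"unknown permission: {perm!r}")
    return f"--caps={cap_type}={perm}"


# Argument list equivalent to the example above.
cmd = ["radosgw-admin", "caps", "add", "--uid=johndoe", caps_option("users", "*")]
print(cmd)
```

Because the result is an argument list, the * character is passed literally and needs no shell quoting.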

To remove administrative capabilities from a user, run the following command:

Example

[root@host01 ~]# radosgw-admin caps rm --uid=johndoe --caps="users=*"

7.7. Role management

As a storage administrator, you can create, delete, or update a role and the permissions associated with that role with the radosgw-admin commands.

A role is similar to a user and has permission policies attached to it. It can be assumed by any identity. If a user assumes a role, a set of dynamically created temporary credentials are returned to the user. A role can be used to delegate access to users, applications and services that do not have permissions to access some S3 resources.

7.7.1. Creating a role

Create a role for the user with the radosgw-admin role create command. You need to create the role with the assume-role-policy-doc parameter in the command, which is the trust relationship policy document that grants an entity the permission to assume the role.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • An S3 user created with user access.

Procedure

  • Create the role:

    Syntax

    radosgw-admin role create --role-name=ROLE_NAME [--path="PATH_TO_FILE"] [--assume-role-policy-doc=TRUST_RELATIONSHIP_POLICY_DOCUMENT]

    Example

    [root@host01 ~]# radosgw-admin role create --role-name=S3Access1 --path=/application_abc/component_xyz/ --assume-role-policy-doc=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Principal\":\{\"AWS\":\[\"arn:aws:iam:::user/TESTER\"\]\},\"Action\":\[\"sts:AssumeRole\"\]\}\]\}
    
    {
      "RoleId": "ca43045c-082c-491a-8af1-2eebca13deec",
      "RoleName": "S3Access1",
      "Path": "/application_abc/component_xyz/",
      "Arn": "arn:aws:iam:::role/application_abc/component_xyz/S3Access1",
      "CreateDate": "2022-06-17T10:18:29.116Z",
      "MaxSessionDuration": 3600,
      "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/TESTER\"]},\"Action\":[\"sts:AssumeRole\"]}]}"
    }

    The value for --path is / by default.
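
Escaping the trust policy by hand, as in the example above, is error-prone. One alternative is to build the document as a data structure and let a library serialize and quote it. The following is a minimal sketch, assuming a hypothetical helper named trust_policy. The policy content mirrors the example above.

```python
import json
import shlex


def trust_policy(principal_arns):
    """Build a trust policy that lets the given principals assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": list(principal_arns)},
            "Action": ["sts:AssumeRole"],
        }],
    }


# Compact separators produce the same form as AssumeRolePolicyDocument in the output above.
doc = json.dumps(trust_policy(["arn:aws:iam:::user/TESTER"]), separators=(",", ":"))

# shlex.quote wraps the document in single quotes for use on a shell command line.
print("radosgw-admin role create --role-name=S3Access1 "
      "--path=/application_abc/component_xyz/ "
      "--assume-role-policy-doc=" + shlex.quote(doc))
```

Single-quoting the whole document replaces the per-character backslash escapes and keeps the policy readable.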

7.7.2. Getting a role

Get the information about a role with the get command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • A role created.
  • An S3 user created with user access.

Procedure

  • Get the information about the role:

    Syntax

    radosgw-admin role get --role-name=ROLE_NAME

    Example

    [root@host01 ~]# radosgw-admin role get --role-name=S3Access1
    
    {
      "RoleId": "ca43045c-082c-491a-8af1-2eebca13deec",
      "RoleName": "S3Access1",
      "Path": "/application_abc/component_xyz/",
      "Arn": "arn:aws:iam:::role/application_abc/component_xyz/S3Access1",
      "CreateDate": "2022-06-17T10:18:29.116Z",
      "MaxSessionDuration": 3600,
      "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/TESTER\"]},\"Action\":[\"sts:AssumeRole\"]}]}"
    }

Additional Resources

  • See the Creating a role section in the Red Hat Ceph Storage Object Gateway Guide for details.

7.7.3. Listing a role

You can list the roles in the specific path with the list command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • A role created.
  • An S3 user created with user access.

Procedure

  • To list the roles, use the following command:

    Syntax

    radosgw-admin role list

    Example

    [root@host01 ~]# radosgw-admin role list
    [
        {
            "RoleId": "85fb46dd-a88a-4233-96f5-4fb54f4353f7",
            "RoleName": "kvm-sts",
            "Path": "/application_abc/component_xyz/",
            "Arn": "arn:aws:iam:::role/application_abc/component_xyz/kvm-sts",
            "CreateDate": "2022-09-13T11:55:09.39Z",
            "MaxSessionDuration": 7200,
            "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/kvm\"]},\"Action\":[\"sts:AssumeRole\"]}]}"
        },
        {
            "RoleId": "9116218d-4e85-4413-b28d-cdfafba24794",
            "RoleName": "kvm-sts-1",
            "Path": "/application_abc/component_xyz/",
            "Arn": "arn:aws:iam:::role/application_abc/component_xyz/kvm-sts-1",
            "CreateDate": "2022-09-16T00:05:57.483Z",
            "MaxSessionDuration": 3600,
            "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/kvm\"]},\"Action\":[\"sts:AssumeRole\"]}]}"
        }
    ]

7.7.4. Updating assume role policy document of a role

You can update the assume role policy document that grants an entity permission to assume the role with the modify command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • A role created.
  • An S3 user created with user access.

Procedure

  • Modify the assume role policy document of a role:

    Syntax

    radosgw-admin role-trust-policy modify --role-name=ROLE_NAME --assume-role-policy-doc=TRUST_RELATIONSHIP_POLICY_DOCUMENT

    Example

    [root@host01 ~]# radosgw-admin role-trust-policy modify --role-name=S3Access1 --assume-role-policy-doc=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Principal\":\{\"AWS\":\[\"arn:aws:iam:::user/TESTER\"\]\},\"Action\":\[\"sts:AssumeRole\"\]\}\]\}
    
    {
      "RoleId": "ca43045c-082c-491a-8af1-2eebca13deec",
      "RoleName": "S3Access1",
      "Path": "/application_abc/component_xyz/",
      "Arn": "arn:aws:iam:::role/application_abc/component_xyz/S3Access1",
      "CreateDate": "2022-06-17T10:18:29.116Z",
      "MaxSessionDuration": 3600,
      "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/TESTER\"]},\"Action\":[\"sts:AssumeRole\"]}]}"
    }

7.7.5. Getting permission policy attached to a role

You can get the specific permission policy attached to a role with the get command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • A role created.
  • An S3 user created with user access.

Procedure

  • Get the permission policy:

    Syntax

    radosgw-admin role-policy get --role-name=ROLE_NAME --policy-name=POLICY_NAME

    Example

    [root@host01 ~]# radosgw-admin role-policy get --role-name=S3Access1 --policy-name=Policy1
    
    {
      "Permission policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":[\"s3:*\"],\"Resource\":\"arn:aws:s3:::example_bucket\"}]}"
    }
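
In the output above, the policy is a JSON document stored as a string inside the outer JSON object, so a script has to decode it twice. The following is a minimal sketch using the Python standard library, with the example output above as input.

```python
import json

# Output of the role-policy get example above.
output = r'''{
  "Permission policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":[\"s3:*\"],\"Resource\":\"arn:aws:s3:::example_bucket\"}]}"
}'''

# First decode: the outer object, whose single value is a string.
outer = json.loads(output)

# Second decode: the policy document held in that string.
policy = json.loads(outer["Permission policy"])

for statement in policy["Statement"]:
    print(statement["Effect"], statement["Action"], statement["Resource"])
```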

7.7.6. Deleting a role

You can delete the role only after removing the permission policy attached to it.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • A role created.
  • An S3 bucket created.
  • An S3 user created with user access.

Procedure

  1. Delete the policy attached to the role:

    Syntax

    radosgw-admin role policy delete --role-name=ROLE_NAME --policy-name=POLICY_NAME

    Example

    [root@host01 ~]# radosgw-admin role policy delete --role-name=S3Access1 --policy-name=Policy1

  2. Delete the role:

    Syntax

    radosgw-admin role delete --role-name=ROLE_NAME

    Example

    [root@host01 ~]# radosgw-admin role delete --role-name=S3Access1

7.7.7. Updating a policy attached to a role

You can either add or update the inline policy attached to a role with the put command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • A role created.
  • An S3 user created with user access.

Procedure

  • Update the inline policy:

    Syntax

    radosgw-admin role-policy put --role-name=ROLE_NAME --policy-name=POLICY_NAME --policy-doc=PERMISSION_POLICY_DOCUMENT

    Example

    [root@host01 ~]# radosgw-admin role-policy put --role-name=S3Access1 --policy-name=Policy1 --policy-doc=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Action\":\[\"s3:*\"\],\"Resource\":\"arn:aws:s3:::example_bucket\"\}\]\}

    In this example, you attach Policy1 to the role S3Access1. The policy allows all S3 actions on example_bucket.

7.7.8. Listing permission policy attached to a role

You can list the names of the permission policies attached to a role with the list command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • A role created.
  • An S3 user created with user access.

Procedure

  • List the names of the permission policies:

    Syntax

    radosgw-admin role-policy list --role-name=ROLE_NAME

    Example

    [root@host01 ~]# radosgw-admin role-policy list --role-name=S3Access1
    
    [
      "Policy1"
    ]

7.7.9. Deleting policy attached to a role

You can delete the permission policy attached to a role with the delete command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • A role created.
  • An S3 user created with user access.

Procedure

  • Delete the permission policy:

    Syntax

    radosgw-admin role policy delete --role-name=ROLE_NAME --policy-name=POLICY_NAME

    Example

    [root@host01 ~]# radosgw-admin role policy delete --role-name=S3Access1 --policy-name=Policy1

7.7.10. Updating the session duration of a role

You can update the session duration of a role with the update command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • A role created.
  • An S3 user created with user access.

Procedure

  • Update the max-session-duration using the update command:

    Syntax

    radosgw-admin role update --role-name=ROLE_NAME --max-session-duration=MAX_SESSION_DURATION_IN_SECONDS

    Example

    [root@node1 ~]# radosgw-admin role update --role-name=test-sts-role --max-session-duration=7200

Verification

  • List the roles to verify the updates:

    Example

    [root@node1 ~]# radosgw-admin role list
    [
        {
            "RoleId": "d4caf33f-caba-42f3-8bd4-48c84b4ea4d3",
            "RoleName": "test-sts-role",
            "Path": "/",
            "Arn": "arn:aws:iam:::role/test-role",
            "CreateDate": "2022-09-07T20:01:15.563Z",
            "MaxSessionDuration": 7200,
            "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/kvm\"]},\"Action\":[\"sts:AssumeRole\"]}]}"
        }
    ]

7.8. Quota management

The Ceph Object Gateway enables you to set quotas on users and buckets owned by users. Quotas include the maximum number of objects in a bucket and the maximum storage size in bytes.

  • Bucket: The --bucket option allows you to specify a quota for buckets the user owns.
  • Maximum Objects: The --max-objects setting allows you to specify the maximum number of objects. A negative value disables this setting.
  • Maximum Size: The --max-size option allows you to specify a quota for the maximum number of bytes. A negative value disables this setting.
  • Quota Scope: The --quota-scope option sets the scope for the quota. The options are bucket and user. Bucket quotas apply to buckets a user owns. User quotas apply to a user.

Important

Buckets with a large number of objects can cause serious performance issues. The recommended maximum number of objects in one bucket is 100,000. To increase this number, configure bucket index sharding. See Section 7.4, “Configure bucket index resharding” for details.

7.8.1. Set user quotas

Before you enable a quota, you must first set the quota parameters.

Syntax

radosgw-admin quota set --quota-scope=user --uid=USER_ID [--max-objects=NUMBER_OF_OBJECTS] [--max-size=MAXIMUM_SIZE_IN_BYTES]

Example

[root@host01 ~]# radosgw-admin quota set --quota-scope=user --uid=johndoe --max-objects=1024 --max-size=1024

A negative value for --max-objects, --max-size, or both means that the specific quota attribute check is disabled.
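
Because the syntax above takes the maximum size as a byte count, larger limits are easier to express by computing the value. The following is a minimal sketch. The helper quota_set_args and the 10 GiB limit are illustrative assumptions, not values from this guide.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def quota_set_args(uid, max_objects=-1, max_size_bytes=-1):
    """Build arguments for a user-scope quota set; -1 disables a check."""
    return [
        "radosgw-admin", "quota", "set", "--quota-scope=user",
        f"--uid={uid}",
        f"--max-objects={max_objects}",
        f"--max-size={max_size_bytes}",
    ]


# Illustrative limit: 1024 objects and 10 GiB for the user johndoe.
args = quota_set_args("johndoe", max_objects=1024, max_size_bytes=10 * GIB)
print(" ".join(args))
```

Leaving either limit at its default of -1 in this sketch disables that check, which matches the behavior of negative values described above.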

7.8.2. Enable and disable user quotas

Once you set a user quota, you can enable it.

Syntax

radosgw-admin quota enable --quota-scope=user --uid=USER_ID

You may disable an enabled user quota.

Syntax

radosgw-admin quota disable --quota-scope=user --uid=USER_ID

7.8.3. Set bucket quotas

Bucket quotas apply to the buckets owned by the specified uid. They are independent of the user.

Syntax

radosgw-admin quota set --uid=USER_ID --quota-scope=bucket --bucket=BUCKET_NAME [--max-objects=NUMBER_OF_OBJECTS] [--max-size=MAXIMUM_SIZE_IN_BYTES]

A negative value for NUMBER_OF_OBJECTS, MAXIMUM_SIZE_IN_BYTES, or both means that the specific quota attribute check is disabled.

7.8.4. Enable and disable bucket quotas

Once you set a bucket quota, you may enable it.

Syntax

radosgw-admin quota enable --quota-scope=bucket --uid=USER_ID

You may disable an enabled bucket quota.

Syntax

radosgw-admin quota disable --quota-scope=bucket --uid=USER_ID

7.8.5. Get quota settings

You can access each user’s quota settings through the user information API. To read user quota settings with the CLI, run the following command:

Syntax

radosgw-admin user info --uid=USER_ID

To get quota settings for a tenanted user, specify the user ID and the name of the tenant:

Syntax

radosgw-admin user info --uid=USER_ID --tenant=TENANT

7.8.6. Update quota stats

Quota stats get updated asynchronously. You can update quota statistics for all users and all buckets manually to retrieve the latest quota stats.

Syntax

radosgw-admin user stats --uid=USER_ID --sync-stats

7.8.7. Get user quota usage stats

To see how much of the quota a user has consumed, run the following command:

Syntax

radosgw-admin user stats --uid=USER_ID

Note

You should run the radosgw-admin user stats command with the --sync-stats option to receive the latest data.

7.8.8. Quota cache

Quota statistics are cached for each Ceph Gateway instance. If there are multiple instances, then the cache can keep quotas from being perfectly enforced, as each instance will have a different view of the quotas. The options that control this are rgw bucket quota ttl, rgw user quota bucket sync interval, and rgw user quota sync interval. The higher these values are, the more efficient quota operations are, but the more out-of-sync multiple instances will be. The lower these values are, the closer to perfect enforcement multiple instances will achieve. If all three are 0, then quota caching is effectively disabled, and multiple instances will have perfect quota enforcement. See Appendix A, Configuration reference for more details on these options.
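
The trade-off described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it is not how the Ceph Object Gateway implements its cache, the function name simulate is hypothetical, and a single ttl value measured in abstract time steps stands in for the three options.

```python
def simulate(quota, ttl, gateways=2, steps=50):
    """Return how many objects get stored when every gateway enforces the
    quota against a cached object count that is refreshed every `ttl` steps."""
    actual = 0
    # Starting with age == ttl forces a refresh on the first check.
    views = [{"cached": 0, "age": ttl} for _ in range(gateways)]
    for _ in range(steps):
        for view in views:
            if view["age"] >= ttl:
                # Cache expired: read the real usage again.
                view["cached"] = actual
                view["age"] = 0
            if view["cached"] < quota:
                # The gateway believes there is room and accepts the write.
                actual += 1
                view["cached"] += 1
            view["age"] += 1
    return actual


if __name__ == "__main__":
    print("ttl=0:", simulate(quota=9, ttl=0))
    print("ttl=5:", simulate(quota=9, ttl=5))
```

With a ttl of 0 every check sees the real usage, so the quota is never exceeded. With a larger ttl each instance only counts its own writes between refreshes, so the combined total can overshoot the quota.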

7.8.9. Reading and writing global quotas

You can read and write quota settings in a zonegroup map. To get a zonegroup map:

[root@host01 ~]# radosgw-admin global quota get

The global quota settings can be manipulated with the global quota counterparts of the quota set, quota enable, and quota disable commands, for example:

[root@host01 ~]# radosgw-admin global quota set --quota-scope bucket --max-objects 1024
[root@host01 ~]# radosgw-admin global quota enable --quota-scope bucket

Note

In a multi-site configuration, where there is a realm and period present, changes to the global quotas must be committed using period update --commit. If there is no period present, the Ceph Object Gateways must be restarted for the changes to take effect.

7.9. Bucket management

As a storage administrator, when using the Ceph Object Gateway you can manage buckets by moving them between users and renaming them. You can create bucket notifications to trigger on specific events. Also, you can find orphan or leaky objects within the Ceph Object Gateway that can occur over the lifetime of a storage cluster.

Note

When millions of objects are uploaded to a Ceph Object Gateway bucket at a high ingest rate, the radosgw-admin bucket stats command can report an incorrect num_objects value. Running the radosgw-admin bucket list command corrects the value of the num_objects parameter.

Note

The radosgw-admin bucket stats command does not return an Unknown error 2002 message. Instead, it explicitly translates the error to POSIX error 2, such as the "No such file or directory" error.

Note

In a multi-site cluster, deleting a bucket from the secondary site does not sync the metadata changes with the primary site. Therefore, Red Hat recommends deleting a bucket only from the primary site and not from the secondary site.

7.9.1. Renaming buckets

You can rename buckets. If you want to allow underscores in bucket names, then set the rgw_relaxed_s3_bucket_names option to true.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway software.
  • An existing bucket.

Procedure

  1. List the buckets:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list
    [
        "34150b2e9174475db8e191c188e920f6/swcontainer",
        "s3bucket1",
        "34150b2e9174475db8e191c188e920f6/swimpfalse",
        "c278edd68cfb4705bb3e07837c7ad1a8/ec2container",
        "c278edd68cfb4705bb3e07837c7ad1a8/demoten1",
        "c278edd68cfb4705bb3e07837c7ad1a8/demo-ct",
        "c278edd68cfb4705bb3e07837c7ad1a8/demopostup",
        "34150b2e9174475db8e191c188e920f6/postimpfalse",
        "c278edd68cfb4705bb3e07837c7ad1a8/demoten2",
        "c278edd68cfb4705bb3e07837c7ad1a8/postupsw"
    ]

  2. Rename the bucket:

    Syntax

    radosgw-admin bucket link --bucket=ORIGINAL_NAME --bucket-new-name=NEW_NAME --uid=USER_ID

    Example

    [ceph: root@host01 /]# radosgw-admin bucket link --bucket=s3bucket1 --bucket-new-name=s3newb --uid=testuser

    If the bucket is inside a tenant, specify the tenant as well:

    Syntax

    radosgw-admin bucket link --bucket=TENANT/ORIGINAL_NAME --bucket-new-name=NEW_NAME --uid='TENANT$USER_ID'

    Example

    [ceph: root@host01 /]# radosgw-admin bucket link --bucket=test/s3bucket1 --bucket-new-name=s3newb --uid='test$testuser'

  3. Verify the bucket was renamed:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list
    [
        "34150b2e9174475db8e191c188e920f6/swcontainer",
        "34150b2e9174475db8e191c188e920f6/swimpfalse",
        "c278edd68cfb4705bb3e07837c7ad1a8/ec2container",
        "s3newb",
        "c278edd68cfb4705bb3e07837c7ad1a8/demoten1",
        "c278edd68cfb4705bb3e07837c7ad1a8/demo-ct",
        "c278edd68cfb4705bb3e07837c7ad1a8/demopostup",
        "34150b2e9174475db8e191c188e920f6/postimpfalse",
        "c278edd68cfb4705bb3e07837c7ad1a8/demoten2",
        "c278edd68cfb4705bb3e07837c7ad1a8/postupsw"
    ]
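
When the rename is scripted, the tenant handling can be captured in a small helper. The following is a sketch only: build_rename_command is a hypothetical name, and the code just assembles and prints the command. Passing the arguments as a list avoids the shell treating the $ in a tenanted user ID as a variable reference.

```python
import shlex


def build_rename_command(bucket, new_name, user, tenant=None):
    """Build the 'radosgw-admin bucket link' arguments that rename a bucket."""
    bucket_arg = f"{tenant}/{bucket}" if tenant else bucket
    uid_arg = f"{tenant}${user}" if tenant else user
    return ["radosgw-admin", "bucket", "link",
            f"--bucket={bucket_arg}",
            f"--bucket-new-name={new_name}",
            f"--uid={uid_arg}"]


if __name__ == "__main__":
    cmd = build_rename_command("s3bucket1", "s3newb", "testuser", tenant="test")
    # shlex.join quotes the argument that contains '$' for safe pasting.
    print(shlex.join(cmd))
```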

7.9.2. Moving buckets

The radosgw-admin bucket utility provides the ability to move buckets between users. To do so, link the bucket to a new user and change the ownership of the bucket to the new user.

You can move buckets between non-tenanted users, between tenanted users, and from non-tenanted users to tenanted users.

7.9.2.1. Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Ceph Object Gateway is installed.
  • An S3 bucket.
  • Various tenanted and non-tenanted users.

7.9.2.2. Moving buckets between non-tenanted users

The radosgw-admin bucket chown command provides the ability to change the ownership of buckets and all objects they contain from one user to another. To do so, unlink a bucket from the current user, link it to a new user, and change the ownership of the bucket to the new user.

Procedure

  1. Link the bucket to a new user:

    Syntax

    radosgw-admin bucket link --uid=USER --bucket=BUCKET

    Example

    [ceph: root@host01 /]# radosgw-admin bucket link --uid=user2 --bucket=data

  2. Verify that the bucket has been linked to user2 successfully:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list --uid=user2
    [
        "data"
    ]

  3. Change the ownership of the bucket to the new user:

    Syntax

    radosgw-admin bucket chown --uid=USER --bucket=BUCKET

    Example

    [ceph: root@host01 /]# radosgw-admin bucket chown --uid=user2 --bucket=data

  4. Verify that the ownership of the data bucket has been successfully changed by checking the owner line in the output of the following command:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list --bucket=data

7.9.2.3. Moving buckets between tenanted users

You can move buckets between one tenanted user and another.

Procedure

  1. Link the bucket to a new user:

    Syntax

    radosgw-admin bucket link --bucket=CURRENT_TENANT/BUCKET --uid='NEW_TENANT$USER'

    Example

    [ceph: root@host01 /]# radosgw-admin bucket link --bucket=test/data --uid='test2$user2'

  2. Verify that the bucket has been linked to user2 successfully:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list --uid='test2$user2'
    [
        "data"
    ]

  3. Change the ownership of the bucket to the new user:

    Syntax

    radosgw-admin bucket chown --bucket=NEW_TENANT/BUCKET --uid='NEW_TENANT$USER'

    Example

    [ceph: root@host01 /]# radosgw-admin bucket chown --bucket='test2/data' --uid='test2$user2'

  4. Verify that the ownership of the data bucket has been successfully changed by checking the owner line in the output of the following command:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list --bucket=test2/data

7.9.2.4. Moving buckets from non-tenanted users to tenanted users

You can move buckets from a non-tenanted user to a tenanted user.

Procedure

  1. Optional: If you do not already have multiple tenants, you can create them by enabling rgw_keystone_implicit_tenants and accessing the Ceph Object Gateway from an external tenant:

    Enable the rgw_keystone_implicit_tenants option:

    Example

    [ceph: root@host01 /]# ceph config set client.rgw rgw_keystone_implicit_tenants true

    Access the Ceph Object Gateway from an external tenant using either the s3cmd or swift command:

    Example

    [ceph: root@host01 /]# swift list

    Or use s3cmd:

    Example

    [ceph: root@host01 /]# s3cmd ls

    The first access from an external tenant creates an equivalent Ceph Object Gateway user.

  2. Move a bucket to a tenanted user:

    Syntax

    radosgw-admin bucket link --bucket=/BUCKET --uid='TENANT$USER'

    Example

    [ceph: root@host01 /]# radosgw-admin bucket link --bucket=/data --uid='test$tenanted-user'

  3. Verify that the data bucket has been linked to tenanted-user successfully:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list --uid='test$tenanted-user'
    [
        "data"
    ]

  4. Change the ownership of the bucket to the new user:

    Syntax

    radosgw-admin bucket chown --bucket='TENANT/BUCKET_NAME' --uid='TENANT$USER'

    Example

    [ceph: root@host01 /]# radosgw-admin bucket chown --bucket='test/data' --uid='test$tenanted-user'

  5. Verify that the ownership of the data bucket has been successfully changed by checking the owner line in the output of the following command:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list --bucket=test/data
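
The three move procedures differ only in how the bucket and user arguments are spelled. The following sketch, with the hypothetical helper plan_bucket_move, derives the link and chown commands for the three cases covered in this section. It only builds argument lists, and moves from a tenanted user to a non-tenanted user are not modelled.

```python
def plan_bucket_move(bucket, new_user, current_tenant=None, new_tenant=None):
    """Return the 'bucket link' and 'bucket chown' argument lists, in order."""
    uid = f"{new_tenant}${new_user}" if new_tenant else new_user
    if current_tenant:
        link_bucket = f"{current_tenant}/{bucket}"
    elif new_tenant:
        # A leading slash marks a bucket that currently has no tenant.
        link_bucket = f"/{bucket}"
    else:
        link_bucket = bucket
    # After linking, the bucket is addressed under the new tenant, if any.
    chown_bucket = f"{new_tenant}/{bucket}" if new_tenant else bucket
    return [
        ["radosgw-admin", "bucket", "link", f"--bucket={link_bucket}", f"--uid={uid}"],
        ["radosgw-admin", "bucket", "chown", f"--bucket={chown_bucket}", f"--uid={uid}"],
    ]


if __name__ == "__main__":
    for cmd in plan_bucket_move("data", "tenanted-user", new_tenant="test"):
        print(cmd)
```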

7.9.3. Finding orphan and leaky objects

A healthy storage cluster does not have any orphan or leaky objects, but in some cases orphan or leaky objects can occur.

An orphan object exists in a storage cluster and has an object ID associated with the RADOS object. However, the bucket index contains no reference that links the RADOS object to an S3 object. For example, if the Ceph Object Gateway goes down in the middle of an operation, this can cause some objects to become orphans. Also, an undiscovered bug can cause orphan objects to occur.

You can see how the Ceph Object Gateway objects map to the RADOS objects. The radosgw-admin command provides a tool to search for and produce a list of these potential orphan or leaky objects. The radoslist subcommand displays the objects stored within a bucket, or within all buckets in the storage cluster. The rgw-orphan-list script displays orphan objects within a pool.

Note

The radoslist subcommand is replacing the deprecated orphans find and orphans finish subcommands.

Important

Do not use this command where Indexless buckets are in use as all the objects appear as orphaned.

An alternative way to identify orphaned objects is to run the rados -p POOL_NAME ls | grep BUCKET_ID command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A running Ceph Object Gateway.

Procedure

  1. Generate a list of objects that hold data within a bucket.

    Syntax

    radosgw-admin bucket radoslist --bucket BUCKET_NAME

    Example

    [root@host01 ~]# radosgw-admin bucket radoslist --bucket mybucket

    Note

    If the BUCKET_NAME is omitted, then all objects in all buckets are displayed.

  2. Check the version of rgw-orphan-list.

    Example

    [root@host01 ~]# head /usr/bin/rgw-orphan-list

    The version should be 2023-01-11 or newer.

  3. Create a directory in which to generate the list of orphans.

    Example

    [root@host01 ~]# mkdir orphans

  4. Navigate to the directory created earlier.

    Example

    [root@host01 ~]# cd orphans

  5. From the pool list, select the pool in which you want to find orphans. This script might run for a long time depending on the objects in the cluster.

    Example

    [root@host01 orphans]# rgw-orphan-list

    Example

    Available pools:
        .rgw.root
        default.rgw.control
        default.rgw.meta
        default.rgw.log
        default.rgw.buckets.index
        default.rgw.buckets.data
        rbd
        default.rgw.buckets.non-ec
        ma.rgw.control
        ma.rgw.meta
        ma.rgw.log
        ma.rgw.buckets.index
        ma.rgw.buckets.data
        ma.rgw.buckets.non-ec
    Which pool do you want to search for orphans?

    Enter the pool name to search for orphans.

    Important

    A data pool must be specified when using the rgw-orphan-list command, and not a metadata pool.

  6. View the details of the rgw-orphan-list tool usage.

    Syntax

    rgw-orphan-list -h
    rgw-orphan-list POOL_NAME /DIRECTORY

    Example

    [root@host01 orphans]# rgw-orphan-list default.rgw.buckets.data /orphans
    
    2023-09-12 08:41:14 ceph-host01 Computing delta...
    2023-09-12 08:41:14 ceph-host01 Computing results...
    10 potential orphans found out of a possible 2412 (0%).         <<<<<<< orphans detected
    The results can be found in './orphan-list-20230912124113.out'.
        Intermediate files are './rados-20230912124113.intermediate' and './radosgw-admin-20230912124113.intermediate'.
    ***
    *** WARNING: This is EXPERIMENTAL code and the results should be used
    ***          only with CAUTION!
    ***
    Done at 2023-09-12 08:41:14.

  7. Run the ls -l command and verify that the files ending with error are zero length, which indicates that the script ran without any issues.

    Example

    [root@host01 orphans]# ls -l
    
    -rw-r--r--. 1 root root    770 Sep 12 03:59 orphan-list-20230912075939.out
    -rw-r--r--. 1 root root      0 Sep 12 03:59 rados-20230912075939.error
    -rw-r--r--. 1 root root 248508 Sep 12 03:59 rados-20230912075939.intermediate
    -rw-r--r--. 1 root root      0 Sep 12 03:59 rados-20230912075939.issues
    -rw-r--r--. 1 root root      0 Sep 12 03:59 radosgw-admin-20230912075939.error
    -rw-r--r--. 1 root root 247738 Sep 12 03:59 radosgw-admin-20230912075939.intermediate

  8. Review the orphan objects listed.

    Example

    [root@host01 orphans]# cat ./orphan-list-20230912124113.out
    
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.0
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.1
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.2
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.3
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.4
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.5
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.6
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.7
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.8
    a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.9

  9. Remove orphan objects:

    Syntax

    rados -p POOL_NAME rm OBJECT_NAME

    Example

    [root@host01 orphans]# rados -p default.rgw.buckets.data rm myobject

    Warning

    Verify you are removing the correct objects. Running the rados rm command removes data from the storage cluster.
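
Conceptually, the "Computing delta" step in the output above compares the objects present in the data pool with the objects that buckets still reference. The following sketch shows that set difference on two lists of object names. It is an illustration only and not the rgw-orphan-list implementation; the function name and the sample object names are hypothetical.

```python
def find_potential_orphans(rados_objects, referenced_objects):
    """Return pool objects that no bucket references, sorted by name.

    rados_objects: names as listed by 'rados -p POOL_NAME ls'.
    referenced_objects: names as listed by 'radosgw-admin bucket radoslist'.
    """
    in_pool = {name.strip() for name in rados_objects if name.strip()}
    referenced = {name.strip() for name in referenced_objects if name.strip()}
    return sorted(in_pool - referenced)


if __name__ == "__main__":
    pool_listing = ["marker.1_obj-a", "marker.1_obj-b", "marker.1_obj-c"]
    bucket_listing = ["marker.1_obj-a", "marker.1_obj-c"]
    for name in find_potential_orphans(pool_listing, bucket_listing):
        print(name)
```

Treat any result as a list of candidates to review, in line with the warning above, rather than as a list that is safe to delete.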

7.9.4. Managing bucket index entries

You can manage the bucket index entries of the Ceph Object Gateway in a Red Hat Ceph Storage cluster using the radosgw-admin bucket check sub-command.

Each bucket index entry related to a piece of a multipart upload object is matched against its corresponding .meta index entry. There should be one .meta entry for all the pieces of a given multipart upload. If the sub-command fails to find a corresponding .meta entry for a piece, it lists the "orphaned" piece entries in a section of the output.

The stats for the bucket are stored in the bucket index headers. The sub-command also loads those headers, iterates through all the plain object entries in the bucket index, and recalculates the stats. It then displays the actual and calculated stats in sections labeled "existing_header" and "calculated_header" respectively, so they can be compared.

If you use the --fix option with the bucket check sub-command, it removes the "orphaned" entries from the bucket index and also overwrites the existing stats in the header with those that it calculated. It causes all entries, including the multiple entries used in versioning, to be listed in the output.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A running Ceph Object Gateway.
  • A newly created bucket.

Procedure

  1. Check the bucket index of a specific bucket:

    Syntax

    radosgw-admin bucket check --bucket=BUCKET_NAME

    Example

    [root@rgw ~]# radosgw-admin bucket check --bucket=mybucket

  2. Fix the inconsistencies in the bucket index, including removal of orphaned objects:

    Syntax

    radosgw-admin bucket check --fix --bucket=BUCKET_NAME

    Example

    [root@rgw ~]# radosgw-admin bucket check --fix --bucket=mybucket
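
Comparing the "existing_header" and "calculated_header" sections by eye is error-prone for large buckets. The following sketch reports every field that differs between the two sections. It assumes the relevant part of the check output has already been parsed into a Python dictionary; the helper names and the sample field names inside the headers are illustrative assumptions, not a specification of the output.

```python
def flatten(node, prefix=""):
    """Flatten nested dictionaries into {'a.b.c': value} pairs."""
    if isinstance(node, dict):
        flat = {}
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}{key}."))
        return flat
    return {prefix.rstrip("."): node}


def diff_headers(check_output):
    """Map each differing field to an (existing, calculated) pair."""
    existing = flatten(check_output.get("existing_header", {}))
    calculated = flatten(check_output.get("calculated_header", {}))
    return {path: (existing.get(path), calculated.get(path))
            for path in sorted(set(existing) | set(calculated))
            if existing.get(path) != calculated.get(path)}


if __name__ == "__main__":
    # Hypothetical sample data; real field names may differ.
    sample = {
        "existing_header": {"usage": {"rgw.main": {"num_objects": 10}}},
        "calculated_header": {"usage": {"rgw.main": {"num_objects": 12}}},
    }
    print(diff_headers(sample))
```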

7.9.5. Bucket notifications

Bucket notifications provide a way to send information out of the Ceph Object Gateway when certain events happen in the bucket. Bucket notifications can be sent to HTTP, AMQP 0.9.1, and Kafka endpoints. A notification entry must be created to send bucket notifications for events on a specific bucket and to a specific topic. A bucket notification can be created on a subset of event types or, by default, for all event types. The bucket notification can filter out events based on key prefix or suffix, on a regular expression matching the keys, and on the metadata attributes or tags attached to the object. Bucket notifications have a REST API to provide configuration and control interfaces for the bucket notification mechanism.

Note

The bucket notifications API is enabled by default. If the rgw_enable_apis configuration parameter is explicitly set, ensure that s3 and notifications are included. To verify this, run the ceph --admin-daemon /var/run/ceph/ceph-client.rgw.NAME.asok config get rgw_enable_apis command. Replace NAME with the Ceph Object Gateway instance name.
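
As an illustration of filtering on key prefix or suffix, the following sketch builds one topic configuration entry in the shape used by the S3 notification API. The helper name build_topic_configuration and the sample topic ARN are hypothetical. Regular expression, metadata, and tag filters are not shown.

```python
def build_topic_configuration(notification_id, topic_arn, events=None,
                              prefix=None, suffix=None):
    """Build one entry for the 'TopicConfigurations' list."""
    conf = {
        "Id": notification_id,
        "TopicArn": topic_arn,
        "Events": events or ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
    }
    rules = []
    if prefix is not None:
        rules.append({"Name": "prefix", "Value": prefix})
    if suffix is not None:
        rules.append({"Name": "suffix", "Value": suffix})
    if rules:
        # Only events for keys matching these rules are delivered.
        conf["Filter"] = {"Key": {"FilterRules": rules}}
    return conf


if __name__ == "__main__":
    # The ARN below is a made-up placeholder.
    print(build_topic_configuration("images-only", "arn:aws:sns:default::mytopic",
                                    prefix="images/", suffix=".jpg"))
```

The returned dictionary can be placed in the TopicConfigurations list that boto3's put_bucket_notification_configuration call accepts.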

Topic management using CLI

You can list, get, and remove topics for the Ceph Object Gateway buckets:

  • List topics: Run the following command to list the configuration of all topics:

    Example

    [ceph: root@host01 /]# radosgw-admin topic list

  • Get topics: Run the following command to get the configuration of a specific topic:

    Example

    [ceph: root@host01 /]# radosgw-admin topic get --topic=topic1

  • Remove topics: Run the following command to remove the configuration of a specific topic:

    Example

    [ceph: root@host01 /]# radosgw-admin topic rm --topic=topic1

    Note

    The topic is removed even if a Ceph Object Gateway bucket is configured to use that topic.

7.9.6. Creating bucket notifications

Create bucket notifications at the bucket level. The notification configuration has the Red Hat Ceph Storage Object Gateway S3 events, ObjectCreated and ObjectRemoved. These need to be published with the destination to send the bucket notifications. Bucket notifications are S3 operations.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A running HTTP server, RabbitMQ server, or a Kafka server.
  • Root-level access.
  • Installation of the Red Hat Ceph Storage Object Gateway.
  • User access key and secret key.
  • Endpoint parameters.

Important

Red Hat supports ObjectCreate events, such as put, post, multipartUpload, and copy. Red Hat also supports ObjectRemove events, such as object_delete and s3_multi_object_delete.

Listed here are two ways of creating bucket notifications:

  • Using the boto script
  • Using AWS CLI

Using the boto script

  1. Install the python3-boto3 package:

    Example

    [user@client ~]$ dnf install python3-boto3

  2. Create an S3 bucket.
  3. Create a Python script, topic.py, to create an SNS topic for the http, amqp, or kafka protocol:

    Example

    import boto3
    from botocore.client import Config
    import sys
    
    # endpoint and keys from vstart
    endpoint = 'http://127.0.0.1:8000'
    access_key='0555b35654ad1656d804'
    secret_key='h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=='
    
    client = boto3.client('sns',
            endpoint_url=endpoint,
            aws_access_key_id=access_key,
            aws_secret_access_key=secret_key,
            config=Config(signature_version='s3'))
    
    attributes = {"push-endpoint": "amqp://localhost:5672", "amqp-exchange": "ex1", "amqp-ack-level": "broker"}
    
    # boto3 expects the topic name in the Name parameter
    client.create_topic(Name="mytopic", Attributes=attributes)

  4. Run the Python script to create the topic:

    Example

    python3 topic.py

  5. Create a Python script, notification.py, to create an S3 bucket notification for s3:ObjectCreated and s3:ObjectRemoved events:

    Example

    import boto3
    import sys
    
    # bucket name as first argument
    bucketname = sys.argv[1]
    # topic ARN as second argument
    topic_arn = sys.argv[2]
    # notification id as third argument
    notification_id = sys.argv[3]
    
    # endpoint and keys from vstart
    endpoint = 'http://127.0.0.1:8000'
    access_key='0555b35654ad1656d804'
    secret_key='h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=='
    
    client = boto3.client('s3',
            endpoint_url=endpoint,
            aws_access_key_id=access_key,
            aws_secret_access_key=secret_key)
    
    # regex filter on the object name and metadata based filtering are extension to AWS S3 API
    # bucket and topic should be created beforehand
    
    topic_conf_list = [{'Id': notification_id,
                        'TopicArn': topic_arn,
                        'Events': ['s3:ObjectCreated:*', 's3:ObjectRemoved:*'],
                        }]
    
    # pass the list built above so that notification_id is actually used
    client.put_bucket_notification_configuration(
       Bucket=bucketname,
       NotificationConfiguration={'TopicConfigurations': topic_conf_list})

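  6. Run the Python script to create the bucket notification, passing the bucket name, the topic ARN, and a notification ID as arguments:

    Syntax

    python3 notification.py BUCKET_NAME TOPIC_ARN NOTIFICATION_ID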

  7. Create S3 objects in the bucket.
  8. Fetch the notification configuration:

    Example

    import boto3
    import sys
    
    # bucket name as first argument
    bucketname = sys.argv[1]
    
    endpoint = 'http://127.0.0.1:8000'
    access_key='0555b35654ad1656d804'
    secret_key='h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=='
    
    client = boto3.client('s3',
            endpoint_url=endpoint,
            aws_access_key_id=access_key,
            aws_secret_access_key=secret_key)
    
    # getting a specific notification configuration is an extension to AWS S3 API
    
    print(client.get_bucket_notification_configuration(Bucket=bucketname))

  9. Optional: Delete the objects.

    1. Verify the object deletion events at the http, rabbitmq, or kafka receiver.

Using the AWS CLI

  1. Create topic:

    Syntax

    aws --endpoint-url=AWS_ENDPOINT sns create-topic --name NAME --attributes=ATTRIBUTES_FILE

    Example

    [user@client ~]$ aws --endpoint-url=http://localhost sns create-topic --name test-kafka --attributes=file://topic.json
    
    sample topic.json:
    {"push-endpoint": "kafka://localhost","verify-ssl": "False", "kafka-ack-level": "broker", "persistent":"true"}
    ref: https://docs.aws.amazon.com/cli/latest/reference/sns/create-topic.html

  2. Create the bucket notification:

    Syntax

    aws s3api put-bucket-notification-configuration --bucket BUCKET_NAME --notification-configuration NOTIFICATION_FILE

    Example

    [user@client ~]$ aws s3api put-bucket-notification-configuration --bucket my-bucket --notification-configuration file://notification.json
    
    sample notification.json
    {
        "TopicConfigurations": [
            {
                "Id": "test_notification",
                "TopicArn": "arn:aws:sns:us-west-2:123456789012:test-kafka",
                "Events": [
                    "s3:ObjectCreated:*"
                ]
            }
        ]
    }

  3. Fetch the notification configuration:

    Syntax

    aws s3api --endpoint-url=AWS_ENDPOINT get-bucket-notification-configuration --bucket BUCKET_NAME

    Example

    [user@client ~]$ aws s3api --endpoint-url=http://localhost get-bucket-notification-configuration --bucket my-bucket
    {
        "TopicConfigurations": [
            {
                "Id": "test_notification",
                "TopicArn": "arn:aws:sns:default::test-kafka",
                "Events": [
                    "s3:ObjectCreated:*"
                ]
            }
        ]
    }
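
When the fetched configuration is processed in a script, the topic name can be recovered from the TopicArn value. The sketch below splits the ARN on colons. Reading the fourth and fifth fields as the zone group and the tenant is an assumption based on the sample output above, and the function names are hypothetical.

```python
def parse_topic_arn(arn):
    """Split an 'arn:aws:sns:...' string into its trailing fields."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[:3] != ["arn", "aws", "sns"]:
        raise ValueError(f"not an SNS topic ARN: {arn!r}")
    # Field meanings are assumed, see the note above.
    return {"zonegroup": parts[3], "tenant": parts[4], "topic": parts[5]}


def topic_names(notification_config):
    """List the topic names referenced by a notification configuration."""
    return [parse_topic_arn(conf["TopicArn"])["topic"]
            for conf in notification_config.get("TopicConfigurations", [])]


if __name__ == "__main__":
    fetched = {"TopicConfigurations": [{
        "Id": "test_notification",
        "TopicArn": "arn:aws:sns:default::test-kafka",
        "Events": ["s3:ObjectCreated:*"],
    }]}
    print(topic_names(fetched))
```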

7.10. Bucket lifecycle

As a storage administrator, you can use a bucket lifecycle configuration to manage your objects so they are stored effectively throughout their lifetime. For example, you can transition objects to less expensive storage classes, archive, or even delete them based on your use case.

RADOS Gateway supports S3 API object expiration by using rules defined for a set of bucket objects. Each rule has a prefix, which selects the objects, and a number of days after which objects become unavailable.

Note

The radosgw-admin lc reshard command is deprecated in Red Hat Ceph Storage 3.3 and not supported in Red Hat Ceph Storage 4 and later releases.

7.10.1. Creating a lifecycle management policy

You can manage a bucket lifecycle policy configuration using standard S3 operations rather than using the radosgw-admin command. RADOS Gateway supports only a subset of the Amazon S3 API policy language applied to buckets. The lifecycle configuration contains one or more rules defined for a set of bucket objects.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • An S3 user created with user access.
  • Access to a Ceph Object Gateway client with the AWS CLI package installed.

Procedure

  1. Create a JSON file for lifecycle configuration:

    Example

    [user@client ~]$ vi lifecycle.json

  2. Add the specific lifecycle configuration rules in the file:

    Example

    {
        "Rules": [
            {
                "Filter": {
                    "Prefix": "images/"
                },
                "Status": "Enabled",
                "Expiration": {
                    "Days": 1
                },
                "ID": "ImageExpiration"
            }
        ]
    }

    The lifecycle configuration example expires objects in the images directory that are more than 1 day old.

  3. Set the lifecycle configuration on the bucket:

    Syntax

    aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api put-bucket-lifecycle-configuration --bucket BUCKET_NAME --lifecycle-configuration file://PATH_TO_LIFECYCLE_CONFIGURATION_FILE/LIFECYCLE_CONFIGURATION_FILE.json

    Example

    [user@client ~]$ aws --endpoint-url=http://host01:80 s3api put-bucket-lifecycle-configuration --bucket testbucket --lifecycle-configuration file://lifecycle.json

    In this example, the lifecycle.json file exists in the current directory.

Verification

  • Retrieve the lifecycle configuration for the bucket:

    Syntax

    aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api get-bucket-lifecycle-configuration --bucket BUCKET_NAME

    Example

    [user@client ~]$ aws --endpoint-url=http://host01:80 s3api get-bucket-lifecycle-configuration --bucket testbucket
    {
    	"Rules": [
            {
    		    "Expiration": {
    			    "Days": 1
    		    },
    		    "ID": "ImageExpiration",
    		    "Filter": {
    			    "Prefix": "images/"
    		    },
    		    "Status": "Enabled"
    	    }
        ]
    }

  • Optional: From the Ceph Object Gateway node, log into the Cephadm shell and retrieve the bucket lifecycle configuration:

    Syntax

    radosgw-admin lc get --bucket=BUCKET_NAME

    Example

    [ceph: root@host01 /]# radosgw-admin lc get --bucket=testbucket
    {
    	"prefix_map": {
    		"images/": {
    			"status": true,
    			"dm_expiration": false,
    			"expiration": 1,
    			"noncur_expiration": 0,
    			"mp_expiration": 0,
    			"transitions": {},
    			"noncur_transitions": {}
    		}
    	},
    	"rule_map": [
            {
    		"id": "ImageExpiration",
    		"rule": {
    			"id": "ImageExpiration",
    			"prefix": "",
    			"status": "Enabled",
    			"expiration": {
    				"days": "1",
    				"date": ""
    			},
    			"mp_expiration": {
    				"days": "",
    				"date": ""
    			},
    			"filter": {
    				"prefix": "images/",
    				"obj_tags": {
    					"tagset": {}
    				}
    			},
    			"transitions": {},
    			"noncur_transitions": {},
    			"dm_expiration": false
    		}
    	}
      ]
    }
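
If lifecycle files are generated rather than edited by hand, a small script can produce the same JSON structure. The following is a sketch under the assumption that only prefix-filtered expiration rules are needed; the helper names are hypothetical.

```python
import json


def build_expiration_rule(rule_id, prefix, days):
    """Build one rule that expires objects under `prefix` after `days` days."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }


def build_lifecycle_configuration(rules):
    """Wrap rules in the top-level structure and reject duplicate IDs."""
    ids = [rule["ID"] for rule in rules]
    if len(ids) != len(set(ids)):
        raise ValueError("rule IDs must be unique")
    return {"Rules": list(rules)}


if __name__ == "__main__":
    config = build_lifecycle_configuration(
        [build_expiration_rule("ImageExpiration", "images/", 1)])
    # Redirect this output to lifecycle.json to use it with the AWS CLI.
    print(json.dumps(config, indent=4))
```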

7.10.2. Deleting a lifecycle management policy

You can delete the lifecycle management policy for a specified bucket by using the s3api delete-bucket-lifecycle command.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • An S3 user created with user access.
  • Access to a Ceph Object Gateway client with the AWS CLI package installed.

Procedure

  • Delete a lifecycle configuration:

    Syntax

    aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api delete-bucket-lifecycle --bucket BUCKET_NAME

    Example

    [user@client ~]$ aws --endpoint-url=http://host01:80 s3api delete-bucket-lifecycle --bucket testbucket

Verification

  • Retrieve lifecycle configuration for the bucket:

    Syntax

    aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api get-bucket-lifecycle-configuration --bucket BUCKET_NAME

    Example

    [user@client ~]$ aws --endpoint-url=http://host01:80 s3api get-bucket-lifecycle-configuration --bucket testbucket

  • Optional: From the Ceph Object Gateway node, retrieve the bucket lifecycle configuration:

    Syntax

    radosgw-admin lc get --bucket=BUCKET_NAME

    Example

    [ceph: root@host01 /]# radosgw-admin lc get --bucket=testbucket

    Note

    The command does not return any information if a bucket lifecycle policy is not present.

Additional Resources

  • See the S3 bucket lifecycle section in the Red Hat Ceph Storage Developer Guide for details.

7.10.3. Updating a lifecycle management policy

You can update a lifecycle management policy by using the s3api put-bucket-lifecycle-configuration command.

Note

The put-bucket-lifecycle-configuration command overwrites an existing bucket lifecycle configuration. If you want to retain any of the current lifecycle policy settings, you must include them in the lifecycle configuration file.
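
Because the whole configuration is replaced, one way to keep existing rules is to merge them with the new ones before uploading. The following sketch merges rules by their ID. The function name is hypothetical, and the current configuration is assumed to have been retrieved already, for example with get-bucket-lifecycle-configuration.

```python
def merge_lifecycle_rules(current_config, new_rules):
    """Return a configuration that keeps current rules and adds or replaces
    rules from `new_rules`, matching on the rule ID."""
    merged = {rule["ID"]: rule for rule in current_config.get("Rules", [])}
    for rule in new_rules:
        # A rule with an existing ID replaces the old definition.
        merged[rule["ID"]] = rule
    return {"Rules": list(merged.values())}


if __name__ == "__main__":
    current = {"Rules": [{
        "ID": "ImageExpiration", "Filter": {"Prefix": "images/"},
        "Status": "Enabled", "Expiration": {"Days": 1}}]}
    docs_rule = {
        "ID": "DocsExpiration", "Filter": {"Prefix": "docs/"},
        "Status": "Enabled", "Expiration": {"Days": 30}}
    print(merge_lifecycle_rules(current, [docs_rule]))
```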

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.
  • An S3 bucket created.
  • An S3 user created with user access.
  • Access to a Ceph Object Gateway client with the AWS CLI package installed.

Procedure

  1. Create a JSON file for the lifecycle configuration:

    Example

    [user@client ~]$ vi lifecycle.json

  2. Add the specific lifecycle configuration rules to the file:

    Example

    {
        "Rules": [
            {
                "Filter": {
                    "Prefix": "images/"
                },
                "Status": "Enabled",
                "Expiration": {
                    "Days": 1
                },
                "ID": "ImageExpiration"
            },
            {
                "Filter": {
                    "Prefix": "docs/"
                },
                "Status": "Enabled",
                "Expiration": {
                    "Days": 30
                },
                "ID": "DocsExpiration"
            }
        ]
    }

  3. Update the lifecycle configuration on the bucket:

    Syntax

    aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api put-bucket-lifecycle-configuration --bucket BUCKET_NAME --lifecycle-configuration file://PATH_TO_LIFECYCLE_CONFIGURATION_FILE/LIFECYCLE_CONFIGURATION_FILE.json

    Example

    [user@client ~]$ aws --endpoint-url=http://host01:80 s3api put-bucket-lifecycle-configuration --bucket testbucket --lifecycle-configuration file://lifecycle.json

Verification

  • Retrieve the lifecycle configuration for the bucket:

    Syntax

    aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api get-bucket-lifecycle-configuration --bucket BUCKET_NAME

    Example

    [user@client ~]$ aws --endpoint-url=http://host01:80 s3api get-bucket-lifecycle-configuration --bucket testbucket
    
    {
        "Rules": [
            {
                "Expiration": {
                    "Days": 30
                },
                "ID": "DocsExpiration",
                "Filter": {
                    "Prefix": "docs/"
                },
                "Status": "Enabled"
            },
            {
                "Expiration": {
                    "Days": 1
                },
                "ID": "ImageExpiration",
                "Filter": {
                    "Prefix": "images/"
                },
                "Status": "Enabled"
            }
        ]
    }

  • Optional: From the Ceph Object Gateway node, log into the Cephadm shell and retrieve the bucket lifecycle configuration:

    Syntax

    radosgw-admin lc get --bucket=BUCKET_NAME

    Example

    [ceph: root@host01 /]# radosgw-admin lc get --bucket=testbucket
    {
        "prefix_map": {
            "docs/": {
                "status": true,
                "dm_expiration": false,
                "expiration": 30,
                "noncur_expiration": 0,
                "mp_expiration": 0,
                "transitions": {},
                "noncur_transitions": {}
            },
            "images/": {
                "status": true,
                "dm_expiration": false,
                "expiration": 1,
                "noncur_expiration": 0,
                "mp_expiration": 0,
                "transitions": {},
                "noncur_transitions": {}
            }
        },
        "rule_map": [
            {
                "id": "DocsExpiration",
                "rule": {
                    "id": "DocsExpiration",
                    "prefix": "",
                    "status": "Enabled",
                    "expiration": {
                        "days": "30",
                        "date": ""
                    },
                    "noncur_expiration": {
                        "days": "",
                        "date": ""
                    },
                    "mp_expiration": {
                        "days": "",
                        "date": ""
                    },
                    "filter": {
                        "prefix": "docs/",
                        "obj_tags": {
                            "tagset": {}
                        }
                    },
                    "transitions": {},
                    "noncur_transitions": {},
                    "dm_expiration": false
                }
            },
            {
                "id": "ImageExpiration",
                "rule": {
                    "id": "ImageExpiration",
                    "prefix": "",
                    "status": "Enabled",
                    "expiration": {
                        "days": "1",
                        "date": ""
                    },
                    "mp_expiration": {
                        "days": "",
                        "date": ""
                    },
                    "filter": {
                        "prefix": "images/",
                        "obj_tags": {
                            "tagset": {}
                        }
                    },
                    "transitions": {},
                    "noncur_transitions": {},
                    "dm_expiration": false
                }
            }
        ]
    }
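
Because put-bucket-lifecycle-configuration replaces the whole configuration, it can help to merge the rules you want to keep with the new rules before uploading the file. The following is a minimal Python sketch of that merge, not part of the Ceph or AWS tooling. The function name merge_lifecycle_rules and the choice to match rules on their "ID" field are assumptions made for illustration.

```python
import json


def merge_lifecycle_rules(existing, updates):
    """Return a configuration that keeps existing rules and adds or
    replaces rules from `updates`, matching on the rule "ID" (assumption)."""
    merged = {rule["ID"]: rule for rule in existing.get("Rules", [])}
    for rule in updates.get("Rules", []):
        merged[rule["ID"]] = rule  # same ID: the updated rule wins
    return {"Rules": list(merged.values())}


# Configuration currently on the bucket (as returned by
# get-bucket-lifecycle-configuration).
existing = {"Rules": [
    {"ID": "ImageExpiration", "Filter": {"Prefix": "images/"},
     "Status": "Enabled", "Expiration": {"Days": 1}},
]}

# New rule to add without losing the existing one.
updates = {"Rules": [
    {"ID": "DocsExpiration", "Filter": {"Prefix": "docs/"},
     "Status": "Enabled", "Expiration": {"Days": 30}},
]}

combined = merge_lifecycle_rules(existing, updates)
print(json.dumps(combined, indent=4))
```

You could write the printed JSON to lifecycle.json and pass that file to put-bucket-lifecycle-configuration.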

7.10.4. Monitoring bucket lifecycles

You can monitor lifecycle processing and manually process the lifecycle of buckets with the radosgw-admin lc list and radosgw-admin lc process commands.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to a Ceph Object Gateway node.
  • Creation of an S3 bucket with a lifecycle configuration policy applied.

Procedure

  1. Log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  2. List bucket lifecycle progress:

    Example

    [ceph: root@host01 /]# radosgw-admin lc list
    
    [
        {
            "bucket": ":testbucket:8b63d584-9ea1-4cf3-8443-a6a15beca943.54187.1",
            "started": "Thu, 01 Jan 1970 00:00:00 GMT",
            "status": "UNINITIAL"
        },
        {
            "bucket": ":testbucket1:8b635499-9e41-4cf3-8443-a6a15345943.54187.2",
            "started": "Thu, 01 Jan 1970 00:00:00 GMT",
            "status": "UNINITIAL"
        }
    ]

    The bucket lifecycle processing status can be one of the following:

    • UNINITIAL - The process has not run yet.
    • PROCESSING - The process is currently running.
    • COMPLETE - The process has completed.

  3. Optional: Manually process bucket lifecycle policies:

    1. Process the lifecycle policy for a single bucket:

      Syntax

      radosgw-admin lc process --bucket=BUCKET_NAME

      Example

      [ceph: root@host01 /]# radosgw-admin lc process --bucket=testbucket1

    2. Process all bucket lifecycle policies immediately:

      Example

      [ceph: root@host01 /]# radosgw-admin lc process

Verification

  • List the bucket lifecycle policies:

    [ceph: root@host01 /]# radosgw-admin lc list
    [
        {
            "bucket": ":testbucket:8b63d584-9ea1-4cf3-8443-a6a15beca943.54187.1",
            "started": "Thu, 17 Mar 2022 21:48:50 GMT",
            "status": "COMPLETE"
        },
        {
            "bucket": ":testbucket1:8b635499-9e41-4cf3-8443-a6a15345943.54187.2",
            "started": "Thu, 17 Mar 2022 20:38:50 GMT",
            "status": "COMPLETE"
        }
    ]
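
If you monitor many buckets, you might post-process the JSON that radosgw-admin lc list prints. The following Python sketch lists the buckets whose lifecycle run has not completed. The helper name pending_buckets is hypothetical, and the sketch assumes the "bucket" field has the form TENANT:NAME:INSTANCE_ID with an empty tenant, as in the sample output in this section.

```python
import json


def pending_buckets(lc_list_json):
    """Return names of buckets whose lifecycle status is not COMPLETE."""
    pending = []
    for entry in json.loads(lc_list_json):
        # Assumption: "bucket" looks like TENANT:NAME:INSTANCE_ID.
        parts = entry["bucket"].split(":")
        name = parts[1] if len(parts) >= 3 else entry["bucket"]
        if entry["status"] != "COMPLETE":
            pending.append(name)
    return pending


# Sample text in the shape printed by `radosgw-admin lc list`.
sample = """
[
    {"bucket": ":testbucket:8b63d584-9ea1-4cf3-8443-a6a15beca943.54187.1",
     "started": "Thu, 17 Mar 2022 21:48:50 GMT",
     "status": "COMPLETE"},
    {"bucket": ":testbucket1:8b635499-9e41-4cf3-8443-a6a15345943.54187.2",
     "started": "Thu, 01 Jan 1970 00:00:00 GMT",
     "status": "UNINITIAL"}
]
"""

print(pending_buckets(sample))
```

A bucket reported here is a candidate for a manual radosgw-admin lc process --bucket=BUCKET_NAME run.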

Additional Resources

  • See the S3 bucket lifecycle section in the Red Hat Ceph Storage Developer Guide for details.

7.10.5. Configuring lifecycle expiration window

You can set the time that the lifecycle management process runs each day by setting the rgw_lifecycle_work_time parameter. By default, lifecycle processing occurs once per day, at midnight.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation of the Ceph Object Gateway.
  • Root-level access to a Ceph Object Gateway node.

Procedure

  1. Log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  2. Set the lifecycle expiration time:

    Syntax

    ceph config set client.rgw rgw_lifecycle_work_time %D:%D-%D:%D

    Replace %D:%D-%D:%D with start_hour:start_minute-end_hour:end_minute.

    Example

    [ceph: root@host01 /]# ceph config set client.rgw rgw_lifecycle_work_time 06:00-08:00

Verification

  • Retrieve the lifecycle expiration work time:

    Example

    [ceph: root@host01 /]# ceph config get client.rgw rgw_lifecycle_work_time
    
    06:00-08:00
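
Before setting rgw_lifecycle_work_time, it can be useful to check that the value matches the start_hour:start_minute-end_hour:end_minute form. The following Python sketch does only that format check. The function name parse_work_time is hypothetical, and the range limits (hours 0 to 23, minutes 0 to 59) are an assumption about what Ceph accepts.

```python
import re


def parse_work_time(value):
    """Parse 'HH:MM-HH:MM' into ((start_h, start_m), (end_h, end_m))."""
    match = re.fullmatch(r"(\d{1,2}):(\d{2})-(\d{1,2}):(\d{2})", value)
    if match is None:
        raise ValueError(f"expected HH:MM-HH:MM, got {value!r}")
    start_h, start_m, end_h, end_m = (int(g) for g in match.groups())
    # Assumption: hours run from 0 to 23 and minutes from 0 to 59.
    for hour, minute in ((start_h, start_m), (end_h, end_m)):
        if hour > 23 or minute > 59:
            raise ValueError(f"time out of range in {value!r}")
    return (start_h, start_m), (end_h, end_m)


print(parse_work_time("06:00-08:00"))
```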

Additional Resources

  • See the S3 bucket lifecycle section in the Red Hat Ceph Storage Developer Guide for details.

7.10.6. S3 bucket lifecycle transition within a storage cluster

You can use a bucket lifecycle configuration to manage objects so that they are stored effectively throughout their lifetime. With an object lifecycle transition rule, you can transition objects to less expensive storage classes, archive them, or delete them.

You can create storage classes for:

  • Fast media, such as SSD or NVMe for I/O sensitive workloads.
  • Slow magnetic media, such as SAS or SATA for archiving.

You can create a schedule for data movement between a hot storage class and a cold storage class, and you can schedule expiration so that an object is permanently deleted after a specified time. For example, you can transition objects to one storage class 30 days after creating them, or archive them to another storage class one year after creating them. You do this through a transition rule, which applies to an object transitioning from one storage class to another. The lifecycle configuration contains one or more rules using the <Rule> element.

Additional Resources

  • See the Red Hat Ceph Storage Developer Guide for details on bucket lifecycle.

7.10.7. Transitioning an object from one storage class to another

The object lifecycle transition rule allows you to transition an object from one storage class to another class.

You can migrate data between replicated pools, erasure-coded pools, replicated to erasure-coded pools, or erasure-coded to replicated pools with the Ceph Object Gateway lifecycle transition policy.

Note

In a multi-site configuration, when the lifecycle (LC) transition rule is applied on the first site to transition objects from one data pool to another in the same storage cluster, the same rule is valid for the second site, provided that the second site has the respective data pool created with the rgw application enabled.

Prerequisites

  • Installation of the Ceph Object Gateway software.
  • Root-level access to the Ceph Object Gateway node.
  • An S3 user created with user access.

Procedure

  1. Create a new data pool:

    Syntax

    ceph osd pool create POOL_NAME

    Example

    [ceph: root@host01 /]# ceph osd pool create test.hot.data

  2. Add a new storage class:

    Syntax

    radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id PLACEMENT_TARGET --storage-class STORAGE_CLASS

    Example

    [ceph: root@host01 /]# radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id default-placement --storage-class hot.test
    {
        "key": "default-placement",
        "val": {
            "name": "default-placement",
            "tags": [],
            "storage_classes": [
                "STANDARD",
                "hot.test"
            ]
        }
    }

  3. Provide the zone placement information for the new storage class:

    Syntax

    radosgw-admin zone placement add --rgw-zone default --placement-id PLACEMENT_TARGET --storage-class STORAGE_CLASS --data-pool DATA_POOL

    Example

    [ceph: root@host01 /]# radosgw-admin zone placement add --rgw-zone default --placement-id default-placement --storage-class hot.test --data-pool test.hot.data
    {
        "key": "default-placement",
        "val": {
            "index_pool": "test_zone.rgw.buckets.index",
            "storage_classes": {
                "STANDARD": {
                    "data_pool": "test.hot.data"
                },
                "hot.test": {
                    "data_pool": "test.hot.data"
                }
            },
            "data_extra_pool": "",
            "index_type": 0
        }
    }

    Note

    Consider setting the compression_type when creating cold or archival data storage pools that are written once.

  4. Enable the rgw application on the data pool:

    Syntax

    ceph osd pool application enable POOL_NAME rgw

    Example

    [ceph: root@host01 /]# ceph osd pool application enable test.hot.data rgw
    enabled application 'rgw' on pool 'test.hot.data'

  5. Restart all the rgw daemons.
  6. Create a bucket:

    Example

    [ceph: root@host01 /]# aws s3api create-bucket --bucket testbucket10 --create-bucket-configuration LocationConstraint=default:default-placement --endpoint-url http://10.0.0.80:8080

  7. Add the object:

    Example

    [ceph: root@host01 /]# aws --endpoint=http://10.0.0.80:8080 s3api put-object --bucket testbucket10  --key compliance-upload --body /root/test2.txt

  8. Create a second data pool:

    Syntax

    ceph osd pool create POOL_NAME

    Example

    [ceph: root@host01 /]# ceph osd pool create test.cold.data

  9. Add a new storage class:

    Syntax

    radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id PLACEMENT_TARGET --storage-class STORAGE_CLASS

    Example

    [ceph: root@host01 /]# radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id default-placement --storage-class cold.test
    {
        "key": "default-placement",
        "val": {
            "name": "default-placement",
            "tags": [],
            "storage_classes": [
                "STANDARD",
                "cold.test"
            ]
        }
    }

  10. Provide the zone placement information for the new storage class:

    Syntax

    radosgw-admin zone placement add --rgw-zone default --placement-id PLACEMENT_TARGET --storage-class STORAGE_CLASS --data-pool DATA_POOL

    Example

    [ceph: root@host01 /]# radosgw-admin zone placement add --rgw-zone default --placement-id default-placement --storage-class cold.test --data-pool test.cold.data

  11. Enable rgw application on the data pool:

    Syntax

    ceph osd pool application enable POOL_NAME rgw

    Example

    [ceph: root@host01 /]# ceph osd pool application enable test.cold.data rgw
    enabled application 'rgw' on pool 'test.cold.data'

  12. Restart all the rgw daemons.
  13. To view the zone group configuration, run the following command:

    Example

    [ceph: root@host01 /]# radosgw-admin zonegroup get
    {
        "id": "3019de59-ddde-4c5c-b532-7cdd29de09a1",
        "name": "default",
        "api_name": "default",
        "is_master": "true",
        "endpoints": [],
        "hostnames": [],
        "hostnames_s3website": [],
        "master_zone": "adacbe1b-02b4-41b8-b11d-0d505b442ed4",
        "zones": [
            {
                "id": "adacbe1b-02b4-41b8-b11d-0d505b442ed4",
                "name": "default",
                "endpoints": [],
                "log_meta": "false",
                "log_data": "false",
                "bucket_index_max_shards": 11,
                "read_only": "false",
                "tier_type": "",
                "sync_from_all": "true",
                "sync_from": [],
                "redirect_zone": ""
            }
        ],
        "placement_targets": [
            {
                "name": "default-placement",
                "tags": [],
                "storage_classes": [
                    "hot.test",
                    "cold.test",
                    "STANDARD"
                ]
            }
        ],
        "default_placement": "default-placement",
        "realm_id": "",
        "sync_policy": {
            "groups": []
        }
    }

  14. To view the zone configuration, run the following command:

    Example

    [ceph: root@host01 /]# radosgw-admin zone get
    {
        "id": "adacbe1b-02b4-41b8-b11d-0d505b442ed4",
        "name": "default",
        "domain_root": "default.rgw.meta:root",
        "control_pool": "default.rgw.control",
        "gc_pool": "default.rgw.log:gc",
        "lc_pool": "default.rgw.log:lc",
        "log_pool": "default.rgw.log",
        "intent_log_pool": "default.rgw.log:intent",
        "usage_log_pool": "default.rgw.log:usage",
        "roles_pool": "default.rgw.meta:roles",
        "reshard_pool": "default.rgw.log:reshard",
        "user_keys_pool": "default.rgw.meta:users.keys",
        "user_email_pool": "default.rgw.meta:users.email",
        "user_swift_pool": "default.rgw.meta:users.swift",
        "user_uid_pool": "default.rgw.meta:users.uid",
        "otp_pool": "default.rgw.otp",
        "system_key": {
            "access_key": "",
            "secret_key": ""
        },
        "placement_pools": [
            {
                "key": "default-placement",
                "val": {
                    "index_pool": "default.rgw.buckets.index",
                    "storage_classes": {
                        "cold.test": {
                            "data_pool": "test.cold.data"
                        },
                        "hot.test": {
                            "data_pool": "test.hot.data"
                        },
                        "STANDARD": {
                            "data_pool": "default.rgw.buckets.data"
                        }
                    },
                    "data_extra_pool": "default.rgw.buckets.non-ec",
                    "index_type": 0
                }
            }
        ],
        "realm_id": "",
        "notif_pool": "default.rgw.log:notif"
    }

  15. Create a bucket:

    Example

    [ceph: root@host01 /]# aws s3api create-bucket --bucket testbucket10 --create-bucket-configuration LocationConstraint=default:default-placement --endpoint-url http://10.0.0.80:8080

  16. List the objects prior to transition:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list --bucket testbucket10
    
            {
                "ETag": "\"211599863395c832a3dfcba92c6a3b90\"",
                "Size": 540,
                "StorageClass": "STANDARD",
                "Key": "obj1",
                "VersionId": "W95teRsXPSJI4YWJwwSG30KxSCzSgk-",
                "IsLatest": true,
                "LastModified": "2023-11-23T10:38:07.214Z",
                "Owner": {
                    "DisplayName": "test-user",
                    "ID": "test-user"
                }
            }

  17. Create a JSON file for lifecycle configuration:

    Example

    [ceph: root@host01 /]# vi lifecycle.json

  18. Add the specific lifecycle configuration rule in the file:

    Example

    {
        "Rules": [
            {
                "Filter": {
                    "Prefix": ""
                },
                "Status": "Enabled",
                "Transitions": [
                    {
                        "Days": 5,
                        "StorageClass": "hot.test"
                    },
                    {
                        "Days": 20,
                        "StorageClass": "cold.test"
                    }
                ],
                "Expiration": {
                    "Days": 365
                },
                "ID": "double transition and expiration"
            }
        ]
    }

    The lifecycle configuration example shows an object that will transition from the default STANDARD storage class to the hot.test storage class after 5 days, again transitions after 20 days to the cold.test storage class, and finally expires after 365 days in the cold.test storage class.

  19. Set the lifecycle configuration on the bucket:

    Example

    [ceph: root@host01 /]# aws s3api put-bucket-lifecycle-configuration --bucket testbucket10 --lifecycle-configuration file://lifecycle.json

  20. Retrieve the lifecycle configuration on the bucket:

    Example

    [ceph: root@host01 /]# aws s3api get-bucket-lifecycle-configuration --bucket testbucket10
    {
        "Rules": [
            {
                "Expiration": {
                    "Days": 365
                },
                "ID": "double transition and expiration",
                "Prefix": "",
                "Status": "Enabled",
                "Transitions": [
                    {
                        "Days": 20,
                        "StorageClass": "cold.test"
                    },
                    {
                        "Days": 5,
                        "StorageClass": "hot.test"
                    }
                ]
            }
        ]
    }

  21. Verify that the object is transitioned to the given storage class:

    Example

    [ceph: root@host01 /]# radosgw-admin bucket list --bucket testbucket10
    
            {
                "ETag": "\"211599863395c832a3dfcba92c6a3b90\"",
                "Size": 540,
                "StorageClass": "cold.test",
                "Key": "obj1",
                "VersionId": "W95teRsXPSJI4YWJwwSG30KxSCzSgk-",
                "IsLatest": true,
                "LastModified": "2023-11-23T10:38:07.214Z",
                "Owner": {
                    "DisplayName": "test-user",
                    "ID": "test-user"
                }
            }
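
To reason about what a transition rule should do before you apply it, you can model the rule as a function of object age. The following Python sketch is an illustration only and is not how the Ceph Object Gateway implements lifecycle processing. The function name storage_class_at is hypothetical, and the sketch assumes that a transition or expiration takes effect as soon as the object age in days reaches the configured Days value, whereas the real change happens at the next lifecycle run.

```python
def storage_class_at(age_days, rule, default_class="STANDARD"):
    """Return the storage class expected at a given object age,
    or None once the object has reached its expiration age."""
    expiration = rule.get("Expiration", {}).get("Days")
    if expiration is not None and age_days >= expiration:
        return None  # the object is expired and deleted
    current = default_class
    # Walk transitions in order of age so the latest one reached wins.
    transitions = sorted(rule.get("Transitions", []), key=lambda t: t["Days"])
    for transition in transitions:
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


# The rule from the lifecycle.json example in this procedure.
rule = {
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "Transitions": [
        {"Days": 5, "StorageClass": "hot.test"},
        {"Days": 20, "StorageClass": "cold.test"},
    ],
    "Expiration": {"Days": 365},
    "ID": "double transition and expiration",
}

for age in (0, 5, 20, 365):
    print(age, storage_class_at(age, rule))
```

The order of the entries in "Transitions" does not matter in this sketch because they are sorted by age before they are applied.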

Additional Resources

  • See the Red Hat Ceph Storage Developer Guide for details on bucket lifecycle.

7.10.8. Enabling object lock for S3

Using the S3 object lock mechanism, you can use object lock concepts like retention period, legal hold, and bucket configuration to implement Write-Once-Read-Many (WORM) functionality as part of a custom workflow that overrides data deletion permissions.

Important

The object version(s), not the object name, is the defining and required value for object lock to perform correctly to support the GOVERNANCE or COMPLIANCE mode. You need to know the version of the object when it is written so that you can retrieve it at a later time.

Prerequisites

  • A running Red Hat Ceph Storage cluster with Ceph Object Gateway installed.
  • Root-level access to the Ceph Object Gateway node.
  • S3 user with version-bucket creation access.

Procedure

  1. Create a bucket with object lock enabled:

    Syntax

    aws --endpoint=http://RGW_PORT:8080 s3api create-bucket --bucket BUCKET_NAME --object-lock-enabled-for-bucket

    Example

    [root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api create-bucket --bucket worm-bucket --object-lock-enabled-for-bucket

  2. Set a retention period for the bucket:

    Syntax

    aws --endpoint=http://RGW_PORT:8080 s3api put-object-lock-configuration --bucket BUCKET_NAME --object-lock-configuration '{ "ObjectLockEnabled": "Enabled", "Rule": { "DefaultRetention": { "Mode": "RETENTION_MODE", "Days": NUMBER_OF_DAYS }}}'

    Example

    [root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api put-object-lock-configuration --bucket worm-bucket --object-lock-configuration '{ "ObjectLockEnabled": "Enabled", "Rule": { "DefaultRetention": { "Mode": "COMPLIANCE", "Days": 10 }}}'

    Note

    You can choose either the GOVERNANCE or COMPLIANCE mode for the RETENTION_MODE in S3 object lock, to apply different levels of protection to any object version that is protected by object lock.

    In GOVERNANCE mode, users cannot overwrite or delete an object version or alter its lock settings unless they have special permissions.

    In COMPLIANCE mode, a protected object version cannot be overwritten or deleted by any user, including the root user in your AWS account. When an object is locked in COMPLIANCE mode, its RETENTION_MODE cannot be changed, and its retention period cannot be shortened. COMPLIANCE mode helps ensure that an object version cannot be overwritten or deleted for the duration of the period.

  3. Put the object into the bucket with a retention time set:

    Syntax

    aws --endpoint=http://RGW_PORT:8080 s3api put-object --bucket BUCKET_NAME --object-lock-mode RETENTION_MODE --object-lock-retain-until-date "DATE" --key compliance-upload --body TEST_FILE

    Example

    [root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api put-object --bucket worm-bucket --object-lock-mode COMPLIANCE --object-lock-retain-until-date "2022-05-31" --key compliance-upload --body test.dd
    {
        "ETag": "\"d560ea5652951637ba9c594d8e6ea8c1\"",
        "VersionId": "Nhhk5kRS6Yp6dZXVWpZZdRcpSpBKToD"
    }

  4. Upload a new object using the same key:

    Syntax

    aws --endpoint=http://RGW_PORT:8080 s3api put-object --bucket BUCKET_NAME --object-lock-mode RETENTION_MODE --object-lock-retain-until-date "DATE" --key compliance-upload --body PATH

    Example

    [root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api put-object --bucket worm-bucket --object-lock-mode COMPLIANCE --object-lock-retain-until-date "2022-05-31" --key compliance-upload --body /etc/fstab
    {
        "ETag": "\"d560ea5652951637ba9c594d8e6ea8c1\"",
        "VersionId": "Nhhk5kRS6Yp6dZXVWpZZdRcpSpBKToD"
    }

Command line options

  • Set an object lock legal hold on an object version:

    Example

    [root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api put-object-legal-hold --bucket worm-bucket --key compliance-upload --legal-hold Status=ON

    Note

    Using the object lock legal hold operation, you can place a legal hold on an object version, thereby preventing an object version from being overwritten or deleted. A legal hold doesn’t have an associated retention period and hence, remains in effect until removed.

  • List the objects from the bucket to retrieve only the latest version of the object:

    Example

    [root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api list-objects --bucket worm-bucket

  • List the object versions from the bucket:

    Example

    [root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api list-object-versions --bucket worm-bucket
    {
        "Versions": [
            {
                "ETag": "\"d560ea5652951637ba9c594d8e6ea8c1\"",
                "Size": 288,
                "StorageClass": "STANDARD",
                "Key": "hosts",
                "VersionId": "Nhhk5kRS6Yp6dZXVWpZZdRcpSpBKToD",
                "IsLatest": true,
                "LastModified": "2022-06-17T08:51:17.392000+00:00",
                "Owner": {
                    "DisplayName": "Test User in Tenant test",
                    "ID": "test$test.user"
                }
                }
            }
        ]
    }

  • Access objects using version-ids:

    Example

    [root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api get-object --bucket worm-bucket  --key compliance-upload --version-id 'IGOU.vdIs3SPduZglrB-RBaK.sfXpcd' download.1
    {
        "AcceptRanges": "bytes",
        "LastModified": "2022-06-17T08:51:17+00:00",
        "ContentLength": 288,
        "ETag": "\"d560ea5652951637ba9c594d8e6ea8c1\"",
        "VersionId": "Nhhk5kRS6Yp6dZXVWpZZdRcpSpBKToD",
        "ContentType": "binary/octet-stream",
        "Metadata": {},
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": "2023-06-17T08:51:17+00:00"
    }
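
The JSON passed to --object-lock-configuration and the date passed to --object-lock-retain-until-date are easy to get wrong by hand. The following Python sketch builds both values. The helper names object_lock_configuration and retain_until are hypothetical, and the yyyy-mm-dd date format simply mirrors the example in this procedure.

```python
import json
from datetime import datetime, timedelta, timezone

VALID_MODES = {"GOVERNANCE", "COMPLIANCE"}


def object_lock_configuration(mode, days):
    """Build the document passed to --object-lock-configuration."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if days <= 0:
        raise ValueError("days must be a positive integer")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Days": days}},
    }


def retain_until(start, days):
    """Return the date `days` after `start`, formatted as yyyy-mm-dd."""
    return (start + timedelta(days=days)).strftime("%Y-%m-%d")


config = object_lock_configuration("COMPLIANCE", 10)
print(json.dumps(config))
print(retain_until(datetime(2022, 5, 21, tzinfo=timezone.utc), 10))
```

Rejecting an unknown mode locally is a convenience in this sketch. The gateway remains the authority on which configurations are valid.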

7.11. Usage

The Ceph Object Gateway logs usage for each user. You can track user usage within date ranges too.

Options include:

  • Start Date: The --start-date option allows you to filter usage stats from a particular start date (format: yyyy-mm-dd[HH:MM:SS]).
  • End Date: The --end-date option allows you to filter usage up to a particular date (format: yyyy-mm-dd[HH:MM:SS]).
  • Log Entries: The --show-log-entries option allows you to specify whether or not to include log entries with the usage stats (options: true | false).

Note

You can specify a time with minutes and seconds, but usage is stored with a resolution of one hour.
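
As an illustration of what a one-hour resolution means for a date filter, the following Python sketch drops the minutes and seconds from a timestamp. It assumes that timestamps are truncated to the start of the hour rather than rounded, which this guide does not state. The function name to_hour_bucket is hypothetical.

```python
from datetime import datetime


def to_hour_bucket(timestamp):
    """Drop minutes and seconds, keeping the hour the timestamp falls in.

    Assumption: usage entries are truncated, not rounded, to the hour.
    """
    return timestamp.replace(minute=0, second=0, microsecond=0)


print(to_hour_bucket(datetime(2022, 6, 1, 14, 37, 52)))
```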

7.11.1. Show usage

To show usage statistics, use the usage show command. To show usage for a particular user, you must specify a user ID. You can also specify a start date, end date, and whether or not to show log entries.

Example

[ceph: root@host01 /]# radosgw-admin usage show \
                --uid=johndoe --start-date=2022-06-01 \
                --end-date=2022-07-01

You may also show a summary of usage information for all users by omitting a user ID.

Example

[ceph: root@host01 /]# radosgw-admin usage show --show-log-entries=false

7.11.2. Trim usage

With heavy use, usage logs can begin to take up storage space. You can trim usage logs for all users and for specific users. You may also specify date ranges for trim operations.

Example

[ceph: root@host01 /]# radosgw-admin usage trim --start-date=2022-06-01 \
                    --end-date=2022-07-31

[ceph: root@host01 /]# radosgw-admin usage trim --uid=johndoe
[ceph: root@host01 /]# radosgw-admin usage trim --uid=johndoe --end-date=2021-04-30
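
When you script usage reports or trims, it helps to validate the dates before calling radosgw-admin, because a date that does not exist on the calendar, such as April 31, is easy to type. The following Python sketch builds the argument list for usage show or usage trim using only the --uid, --start-date, and --end-date options described in this section. The function name usage_args is hypothetical, and the sketch handles only the yyyy-mm-dd form without a time part.

```python
from datetime import datetime


def usage_args(action, uid=None, start_date=None, end_date=None):
    """Build an argument list for `radosgw-admin usage show` or `usage trim`."""
    if action not in ("show", "trim"):
        raise ValueError("action must be 'show' or 'trim'")
    args = ["radosgw-admin", "usage", action]
    if uid is not None:
        args.append(f"--uid={uid}")
    for flag, value in (("--start-date", start_date), ("--end-date", end_date)):
        if value is not None:
            # strptime raises ValueError for impossible dates.
            datetime.strptime(value, "%Y-%m-%d")
            args.append(f"{flag}={value}")
    return args


print(usage_args("trim", uid="johndoe", end_date="2021-04-30"))
```

You could pass the returned list to subprocess.run inside the Cephadm shell. The sketch itself only builds and prints the arguments.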

7.12. Ceph Object Gateway data layout

Although RADOS only knows about pools and objects with their Extended Attributes (xattrs) and object map (OMAP), conceptually Ceph Object Gateway organizes its data into three different kinds:

  • metadata
  • bucket index
  • data

Metadata

There are three sections of metadata:

  • user: Holds user information.
  • bucket: Holds a mapping between bucket name and bucket instance ID.
  • bucket.instance: Holds bucket instance information.

You can use the following commands to view metadata entries:

Syntax

radosgw-admin metadata get bucket:BUCKET_NAME
radosgw-admin metadata get bucket.instance:BUCKET:BUCKET_ID
radosgw-admin metadata get user:USER
radosgw-admin metadata set user:USER

Example

[ceph: root@host01 /]# radosgw-admin metadata list
[ceph: root@host01 /]# radosgw-admin metadata list bucket
[ceph: root@host01 /]# radosgw-admin metadata list bucket.instance
[ceph: root@host01 /]# radosgw-admin metadata list user

Every metadata entry is kept on a single RADOS object.

Important

When using the radosgw-admin tool, ensure that the tool and the Ceph Cluster are of the same version. The use of mismatched versions is not supported.

Note

A Ceph Object Gateway object might consist of several RADOS objects, the first of which is the head that contains the metadata, such as manifest, Access Control List (ACL), content type, ETag, and user-defined metadata. The metadata is stored in xattrs. The head might also contain up to 512 KB of object data, for efficiency and atomicity. The manifest describes how each object is laid out in RADOS objects.

Bucket index

The bucket index is a different kind of metadata and is kept separately. The bucket index holds a key-value map in RADOS objects. By default, it is a single RADOS object per bucket, but it is possible to shard the map over multiple RADOS objects.

The map itself is kept in the OMAP associated with each RADOS object. Each OMAP key is the name of an object, and the value holds some basic metadata for that object, which is the metadata that appears when listing the bucket. Each OMAP also holds a header, which stores bucket accounting metadata such as the number of objects and the total size.

Note

OMAP is a key-value store, associated with an object, in a way similar to how extended attributes associate with a POSIX file. An object’s OMAP is not physically located in the object’s storage, but its precise implementation is invisible and immaterial to the Ceph Object Gateway.

Data

Object data is kept in one or more RADOS objects for each Ceph Object Gateway object.

7.12.1. Object lookup path

When accessing objects, REST APIs come to Ceph Object Gateway with three parameters:

  • Account information, which has the access key in S3 or account name in Swift
  • Bucket or container name
  • Object name or key

At present, Ceph Object Gateway only uses account information to find out the user ID and for access control. It uses only the bucket name and object key to address the object in a pool.

Account information

The user ID in Ceph Object Gateway is a string, typically the actual user name from the user credentials and not a hashed or mapped identifier.

When accessing a user’s data, the user record is loaded from an object named USER_ID in the default.rgw.meta pool with the users.uid namespace.

Bucket names

Bucket names are represented in the default.rgw.meta pool with the root namespace. The bucket record is loaded to obtain a marker, which serves as a bucket ID.

Object names

The object is located in the default.rgw.buckets.data pool. The RADOS object name is MARKER_KEY, for example default.7593.4_image.png, where the marker is default.7593.4 and the key is image.png. These concatenated names are not parsed and are only passed down to RADOS. Therefore, the choice of the separator is not important and causes no ambiguity. For the same reason, slashes are permitted in object names, such as keys.

7.12.1.1. Multiple data pools

It is possible to create multiple data pools so that different users’ buckets are created in different RADOS pools by default, thus providing the necessary scaling. The layout and naming of these pools is controlled by a policy setting.

7.12.2. Bucket and object listing

Buckets that belong to a given user are listed in an OMAP of an object named USER_ID.buckets, for example, foo.buckets, in the default.rgw.meta pool with the users.uid namespace. These objects are accessed when listing buckets, when updating bucket contents, and when updating and retrieving bucket statistics such as quota. These listings are kept consistent with buckets in the .rgw pool.

Note

See the user-visible, encoded class cls_user_bucket_entry and its nested class cls_user_bucket for the values of these OMAP entries.

Objects that belong to a given bucket are listed in a bucket index. The default naming for index objects is .dir.MARKER in the default.rgw.buckets.index pool.
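
The naming rules described for the object lookup path and for listings can be summarized in a few lines of Python. This sketch only formats names from the patterns given in this section (MARKER_KEY, .dir.MARKER, and USER_ID.buckets). The function names are hypothetical, and the sketch does not cover sharded indexes or the objects that hold data beyond the head.

```python
def head_object_name(marker, key):
    """RADOS object name in the data pool: MARKER_KEY."""
    return f"{marker}_{key}"


def index_object_name(marker):
    """RADOS object name of an unsharded bucket index: .dir.MARKER."""
    return f".dir.{marker}"


def user_buckets_object_name(user_id):
    """RADOS object whose OMAP lists a user's buckets: USER_ID.buckets."""
    return f"{user_id}.buckets"


print(head_object_name("default.7593.4", "image.png"))
print(head_object_name("default.7593.4", "photos/2022/image.png"))
print(index_object_name("default.7593.4"))
print(user_buckets_object_name("foo"))
```

The second call shows a key that contains slashes. The name is passed through unchanged because the concatenated name is never parsed.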

7.13. Object Gateway data layout parameters

This is a list of data layout parameters for Ceph Object Gateway.

Known pools:

.rgw.root
    Unspecified region, zone, and global information records, one per object.

ZONE.rgw.control
    notify.N

ZONE.rgw.meta
    Multiple namespaces with different kinds of metadata.

    namespace: root

        BUCKET
        .bucket.meta.BUCKET:MARKER # see put_bucket_instance_info()

        The tenant is used to disambiguate buckets, but not bucket instances.

        Example

        .bucket.meta.prodtx:test%25star:default.84099.6
        .bucket.meta.testcont:default.4126.1
        .bucket.meta.prodtx:testcont:default.84099.4
        prodtx/testcont
        prodtx/test%25star
        testcont

    namespace: users.uid

        Contains per-user information (RGWUserInfo) in USER objects and per-user lists of buckets in omaps of USER.buckets objects. The USER might contain the tenant if non-empty.

        Example

        prodtx$prodt
        test2.buckets
        prodtx$prodt.buckets
        test2

    namespace: users.email

        Unimportant

    namespace: users.keys

        47UA98JSTJZ9YAN3OS3O

        This allows Ceph Object Gateway to look up users by their access keys during authentication.

    namespace: users.swift

        test:tester

ZONE.rgw.buckets.index
    Objects are named .dir.MARKER, each contains a bucket index. If the index is sharded, each shard appends the shard index after the marker.

ZONE.rgw.buckets.data

    default.7593.4__shadow_.488urDFerTYXavx4yAd-Op8mxehnvTI_1
    MARKER_KEY

    An example of a marker would be default.16004.1 or default.7593.4. The current format is ZONE.INSTANCE_ID.BUCKET_ID, but once generated, a marker is not parsed again, so its format might change freely in the future.
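
The metadata object names in the examples above follow a regular pattern that can be sketched as below. The function names are hypothetical and the patterns are inferred only from the examples in this section, so treat this as an illustration rather than a description of the gateway's internal code.

```python
from typing import Optional


def bucket_record_name(bucket: str, tenant: Optional[str] = None) -> str:
    """Name of the bucket record in the root namespace."""
    # The tenant, when present, disambiguates the bucket name.
    return f"{tenant}/{bucket}" if tenant else bucket


def bucket_instance_name(bucket: str, marker: str,
                         tenant: Optional[str] = None) -> str:
    """Name of the bucket instance record in the root namespace."""
    scoped = f"{tenant}:{bucket}" if tenant else bucket
    return f".bucket.meta.{scoped}:{marker}"


def user_object_name(user: str, tenant: Optional[str] = None) -> str:
    """Name of the per-user object in the users.uid namespace."""
    return f"{tenant}${user}" if tenant else user


print(bucket_record_name("testcont", tenant="prodtx"))
print(bucket_instance_name("testcont", "default.84099.4", tenant="prodtx"))
print(user_object_name("prodt", tenant="prodtx") + ".buckets")
```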

7.14. Optimize the Ceph Object Gateway’s garbage collection

When new data objects are written into the storage cluster, the Ceph Object Gateway immediately allocates the storage for these new objects. After you delete or overwrite data objects in the storage cluster, the Ceph Object Gateway deletes those objects from the bucket index. Some time afterward, the Ceph Object Gateway then purges the space that was used to store the objects in the storage cluster. The process of purging the deleted object data from the storage cluster is known as Garbage Collection, or GC.

Garbage collection operations typically run in the background. You can configure these operations to either run continuously, or to run only during intervals of low activity and light workloads. By default, the Ceph Object Gateway conducts GC operations continuously. Because GC operations are a normal part of Ceph Object Gateway operations, deleted objects that are eligible for garbage collection exist most of the time.

7.14.1. Viewing the garbage collection queue

Before you purge deleted and overwritten objects from the storage cluster, use radosgw-admin to view the objects awaiting garbage collection.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the Ceph Object Gateway.

Procedure

  • To view the queue of objects awaiting garbage collection:

    Example

    [ceph: root@host01 /]# radosgw-admin gc list

Note

To list all entries in the queue, including unexpired entries, use the --include-all option.
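
For a large queue, a summary can be easier to read than the raw listing. The following sketch assumes that radosgw-admin gc list prints a JSON array in which each entry carries a tag, a time, and an objs list; verify the field names against the output of your own cluster. The sample input is made up for illustration, and summarize_gc_queue is a hypothetical helper.

```python
import json


def summarize_gc_queue(gc_list_json: str) -> dict:
    """Count queue entries and the RADOS objects they reference.

    Assumes the input is a JSON array whose entries have an "objs" list.
    """
    entries = json.loads(gc_list_json)
    return {
        "entries": len(entries),
        "objects": sum(len(entry.get("objs", [])) for entry in entries),
    }


# Made-up sample in the assumed output format.
sample = """
[
  {
    "tag": "example-tag-1",
    "time": "2024-01-01 00:00:00.000000Z",
    "objs": [
      {"pool": "default.rgw.buckets.data", "oid": "default.7593.4__shadow_.example_1"},
      {"pool": "default.rgw.buckets.data", "oid": "default.7593.4__shadow_.example_2"}
    ]
  }
]
"""

print(summarize_gc_queue(sample))
```

In practice, the input would be the captured output of radosgw-admin gc list --include-all rather than the sample string.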

7.14.2. Adjusting garbage collection settings

The Ceph Object Gateway allocates storage for new and overwritten objects immediately. Additionally, the parts of a multi-part upload also consume some storage.

The Ceph Object Gateway purges the storage space used for deleted objects after deleting the objects from the bucket index. Similarly, the Ceph Object Gateway will delete data associated with a multi-part upload after the multi-part upload completes or when the upload has gone inactive or failed to complete for a configurable amount of time. The process of purging the deleted object data from the Red Hat Ceph Storage cluster is known as garbage collection (GC).

To view the objects awaiting garbage collection, run the following command:

radosgw-admin gc list

Garbage collection is a background activity that runs continuously or during times of low loads, depending upon how the storage administrator configures the Ceph Object Gateway. By default, the Ceph Object Gateway conducts garbage collection operations continuously. Since garbage collection operations are a normal function of the Ceph Object Gateway, especially with object delete operations, objects eligible for garbage collection exist most of the time.

Some workloads can temporarily or permanently outpace the rate of garbage collection activity. This is especially true of delete-heavy workloads, where many objects get stored for a short period of time and then deleted. For these types of workloads, storage administrators can increase the priority of garbage collection operations relative to other operations with the following configuration parameters:

  • The rgw_gc_obj_min_wait configuration option sets the minimum length of time, in seconds, that the Ceph Object Gateway waits before purging a deleted object’s data. The default value is two hours, or 7200 seconds. The object is not purged immediately, because a client might be reading the object. Under heavy workloads, this wait can cause deleted objects to consume too much storage or leave a large number of deleted objects to purge. Red Hat recommends not setting this value below 30 minutes, or 1800 seconds.
  • The rgw_gc_processor_period configuration option is the garbage collection cycle run time, that is, the amount of time between the start of consecutive runs of garbage collection threads. If garbage collection runs longer than this period, the Ceph Object Gateway will not wait before running a garbage collection cycle again.
  • The rgw_gc_max_concurrent_io configuration option specifies the maximum number of concurrent IO operations that the gateway garbage collection thread will use when purging deleted data. Under delete-heavy workloads, consider increasing this setting to a larger number of concurrent IO operations.
  • The rgw_gc_max_trim_chunk configuration option specifies the maximum number of keys to remove from the garbage collector log in a single operation. Under delete-heavy workloads, consider increasing the maximum number of keys so that more objects are purged during each garbage collection operation.
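
The timing behavior of the first two options can be illustrated with a small model. This is a simplified sketch of the rules stated above, not the gateway's scheduler; the function names are hypothetical, and the 3600-second period in the usage lines is an illustrative value, not a documented default.

```python
def earliest_purge_time(deleted_at: int, obj_min_wait: int = 7200) -> int:
    """Earliest time, in seconds, at which deleted data may be purged.

    obj_min_wait models rgw_gc_obj_min_wait (7200 seconds per the text).
    """
    return deleted_at + obj_min_wait


def idle_before_next_cycle(processor_period: int, cycle_run_time: int) -> int:
    """Seconds the gateway waits after a cycle ends before starting the next.

    processor_period models rgw_gc_processor_period, measured from the
    start of one cycle to the start of the next. A cycle that overruns
    the period is followed immediately by the next cycle.
    """
    return max(0, processor_period - cycle_run_time)


# Illustrative values only.
print(earliest_purge_time(deleted_at=1000))   # 8200
print(idle_before_next_cycle(3600, 600))      # 3000
print(idle_before_next_cycle(3600, 4000))     # 0
```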

Starting with Red Hat Ceph Storage 4.1, offloading the index object’s OMAP from the garbage collection log helps lessen the performance impact of garbage collection activities on the storage cluster. Some new configuration parameters have been added to Ceph Object Gateway to tune the garbage collection queue, as follows:

  • The rgw_gc_max_deferred_entries_size configuration option sets the maximum size of deferred entries in the garbage collection queue.
  • The rgw_gc_max_queue_size configuration option sets the maximum queue size used for garbage collection. This value should not be greater than osd_max_object_size minus rgw_gc_max_deferred_entries_size minus 1 KB.
  • The rgw_gc_max_deferred configuration option sets the maximum number of deferred entries stored in the garbage collection queue.

Note

These garbage collection configuration parameters are for Red Hat Ceph Storage 5 and higher.
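
The upper bound on rgw_gc_max_queue_size stated above is simple arithmetic, sketched here. The input values are illustrative assumptions, not necessarily the values configured on your cluster; read the real ones with ceph config get before relying on the result.

```python
KIB = 1024
MIB = 1024 * KIB


def max_gc_queue_size(osd_max_object_size: int,
                      gc_max_deferred_entries_size: int) -> int:
    """Largest rgw_gc_max_queue_size allowed by the rule in the text:
    osd_max_object_size minus rgw_gc_max_deferred_entries_size minus 1 KB.
    All values are in bytes.
    """
    return osd_max_object_size - gc_max_deferred_entries_size - KIB


# Illustrative inputs: a 128 MiB object size limit and 3 MiB of
# deferred entries.
limit = max_gc_queue_size(128 * MIB, 3 * MIB)
print(limit)  # 131070976
```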

Note

In testing, with an evenly balanced delete-write workload, such as 50% delete and 50% write operations, the storage cluster filled completely in 11 hours because Ceph Object Gateway garbage collection failed to keep pace with the delete operations. If this happens, the cluster status switches to the HEALTH_ERR state. Aggressive settings for the parallel garbage collection tunables significantly delayed the onset of storage cluster fill in testing and can be helpful for many workloads. Typical real-world storage cluster workloads are not likely to cause a storage cluster fill primarily due to garbage collection.

7.14.3. Adjusting garbage collection for delete-heavy workloads

Some workloads may temporarily or permanently outpace the rate of garbage collection activity. This is especially true of delete-heavy workloads, where many objects get stored for a short period of time and are then deleted. For these types of workloads, consider increasing the priority of garbage collection operations relative to other operations. Contact Red Hat Support with any additional questions about Ceph Object Gateway Garbage Collection.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to all nodes in the storage cluster.

Procedure

  1. Set the value of rgw_gc_max_concurrent_io to 20, and the value of rgw_gc_max_trim_chunk to 64:

    Example

    [ceph: root@host01 /]# ceph config set client.rgw rgw_gc_max_concurrent_io 20
    [ceph: root@host01 /]# ceph config set client.rgw rgw_gc_max_trim_chunk 64

  2. Restart the Ceph Object Gateway to allow the changed settings to take effect.
  3. Monitor the storage cluster during GC activity to verify that the increased values do not adversely affect performance.

Important

Never modify the value for the rgw_gc_max_objs option in a running cluster. You should only change this value before deploying the RGW nodes.

7.15. Optimize the Ceph Object Gateway’s data object storage

Bucket lifecycle configuration optimizes data object storage to increase its efficiency and to provide effective storage throughout the lifetime of the data.

The S3 API in the Ceph Object Gateway currently supports a subset of the AWS bucket lifecycle configuration actions:

  • Expiration
  • NoncurrentVersionExpiration
  • AbortIncompleteMultipartUpload
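
As an illustration, a single lifecycle rule that uses all three supported actions might look like the following, written in the structure used by the AWS S3 lifecycle API. The rule ID, prefix, and day counts are made-up values.

```python
import json

# One rule combining the three actions listed above. All names and
# numbers are illustrative.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "expire-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            # Delete current object versions 30 days after creation.
            "Expiration": {"Days": 30},
            # Delete noncurrent versions 7 days after they are superseded.
            "NoncurrentVersionExpiration": {"NoncurrentDays": 7},
            # Abort multipart uploads left incomplete for 3 days.
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 3},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

A dictionary of this shape is what S3 clients such as boto3 accept as the LifecycleConfiguration argument of put_bucket_lifecycle_configuration, assuming the client is pointed at the Ceph Object Gateway endpoint.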

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to all of the nodes in the storage cluster.

7.15.1. Parallel thread processing for bucket life cycles

The Ceph Object Gateway now allows for parallel thread processing of bucket life cycles across multiple Ceph Object Gateway instances. Increasing the number of threads that run in parallel enables the Ceph Object Gateway to process large workloads more efficiently. In addition, the Ceph Object Gateway now uses a numbered sequence for index shard enumeration instead of using in-order numbering.

7.15.2. Optimizing the bucket lifecycle

Two Ceph configuration options affect the efficiency of bucket lifecycle processing:

  • rgw_lc_max_worker specifies the number of lifecycle worker threads to run in parallel. This enables the simultaneous processing of both bucket and index shards. The default value for this option is 3.
  • rgw_lc_max_wp_worker specifies the number of threads in each lifecycle worker thread’s work pool. This option helps to accelerate processing for each bucket. The default value for this option is 3.

For a workload with a large number of buckets, for example, a workload with thousands of buckets, consider increasing the value of the rgw_lc_max_worker option.

For a workload with a smaller number of buckets but with a higher number of objects in each bucket, such as in the hundreds of thousands, consider increasing the value of the rgw_lc_max_wp_worker option.

Note

Before increasing the value of either of these options, please validate current storage cluster performance and Ceph Object Gateway utilization. Red Hat does not recommend that you assign a value of 10 or above for either of these options.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to all of the nodes in the storage cluster.

Procedure

  1. To increase the number of threads to run in parallel, set the value of rgw_lc_max_worker to a value between 3 and 9:

    Example

    [ceph: root@host01 /]# ceph config set client.rgw rgw_lc_max_worker 7

  2. To increase the number of threads in each thread’s work pool, set the value of rgw_lc_max_wp_worker to a value between 3 and 9:

    Example

    [ceph: root@host01 /]# ceph config set client.rgw rgw_lc_max_wp_worker 7

  3. Restart the Ceph Object Gateway to allow the changed settings to take effect.
  4. Monitor the storage cluster to verify that the increased values do not adversely affect performance.
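
When scripting this procedure, it can help to validate the values before applying them. The sketch below only builds the ceph config set commands shown above and rejects values outside the range of 3 to 9 used in this procedure; lc_config_commands is a hypothetical helper, and the sketch does not run anything against a cluster.

```python
from typing import List, Optional


def lc_config_commands(max_worker: Optional[int] = None,
                       max_wp_worker: Optional[int] = None) -> List[List[str]]:
    """Build `ceph config set` commands for the lifecycle worker options.

    Raises ValueError for values outside 3..9, because values of 10 or
    above are not recommended in the text.
    """
    commands = []
    options = (
        ("rgw_lc_max_worker", max_worker),
        ("rgw_lc_max_wp_worker", max_wp_worker),
    )
    for option, value in options:
        if value is None:
            continue  # leave this option unchanged
        if not 3 <= value <= 9:
            raise ValueError(f"{option} must be between 3 and 9, got {value}")
        commands.append(
            ["ceph", "config", "set", "client.rgw", option, str(value)]
        )
    return commands


for command in lc_config_commands(max_worker=7, max_wp_worker=7):
    print(" ".join(command))
```

Each returned list can be passed to a process runner, such as subprocess.run, inside the cephadm shell. The restart and monitoring steps of the procedure still apply afterward.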

Additional Resources