How does a large number of objects in an RHCS/Ceph pool affect the filestore merge and split rate, and overall cluster performance?


Issue

  • A large number of slow requests and high disk utilisation while reading/writing data to pools in the Ceph cluster.

  • Several OSDs were dying with either "failed" or "wrongly marked me down" messages, as well as with heartbeat-related stack traces.

  • Slow requests were evenly distributed across all the SATA disks in the cluster. This indicates the problem is cluster-wide and not isolated to any specific disk, host, or rack.

  • Listing a PG directory took well over 30 seconds to complete due to the high number of files it contained; see the sketch after this list for how that file count relates to filestore's split threshold.
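
For background, a filestore OSD stores each PG as a directory tree on disk. It splits a subdirectory into 16 children once the file count crosses a threshold derived from the filestore_split_multiple and filestore_merge_threshold tunables, and merges children back when the count falls below the merge threshold. The following is a minimal sketch of that arithmetic, assuming the commonly documented split formula and illustrative default values of 2 and 10; the exact defaults can differ between releases, and the object counts used below are hypothetical:

    # Hedged sketch (assumed formula, not taken from this article): a
    # filestore subdirectory is split into 16 children when its file count
    # exceeds
    #     filestore_split_multiple * abs(filestore_merge_threshold) * 16
    # and children are merged back when the count falls below
    #     abs(filestore_merge_threshold).

    def split_threshold(split_multiple: int = 2, merge_threshold: int = 10) -> int:
        """Files per subdirectory at which filestore splits it (illustrative defaults)."""
        return split_multiple * abs(merge_threshold) * 16

    def avg_objects_per_pg(total_objects: int, pg_count: int) -> float:
        """Rough average file count per PG directory for a pool."""
        return total_objects / pg_count

    if __name__ == "__main__":
        print("split threshold:", split_threshold())                 # 2 * 10 * 16 = 320 files
        print("avg objects/PG:", avg_objects_per_pg(50_000_000, 2048))
        # When many PGs fill at a similar rate, large numbers of
        # subdirectories cross the split threshold at nearly the same time.
        # The resulting burst of directory splits matches the evenly
        # distributed slow requests and high disk utilisation in the Issue
        # list above.

The key implication is that the split threshold is per subdirectory, so as a pool's object count grows, every PG approaches the threshold together, which is why the symptoms appear cluster-wide rather than on individual disks.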

Environment

  • Red Hat Ceph Storage
  • Ceph Cluster with Filestore OSDs
