Ceph RGW and RBD latency increases after the Ceph cluster has been running for several months

Solution Verified - Updated 2025-10-16T11:12:57+00:00

Issue

Ceph RGW is slow and its latency grows large.
Ceph RBD latency grows large and sometimes spikes to over 100 ms.
The latency was low just after the initial deployment of the Ceph cluster.
The latency gets worse after the Ceph cluster has been running for several months or longer.
In the worst case, some OSDs flap, accompanied by slow request warning messages.
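
One way to confirm the symptoms above is to look at per-OSD latency. The following is a minimal sketch, not part of the original solution. It assumes that `ceph osd perf --format json` returns a list named `osd_perf_infos` (nested under `osdstats` on some releases) whose entries carry `perf_stats.commit_latency_ms`. The helper names `slow_osds` and `read_osd_perf` are hypothetical, and the 100 ms default only mirrors the spike level mentioned in this issue.

```python
import json
import subprocess

THRESHOLD_MS = 100  # spike level mentioned in the issue description


def slow_osds(perf, threshold_ms=THRESHOLD_MS):
    """Return (osd_id, commit_latency_ms) pairs above threshold_ms, worst first."""
    # Assumption: some releases nest the list under "osdstats", others do not.
    stats = perf.get("osdstats", perf)
    slow = []
    for info in stats.get("osd_perf_infos", []):
        latency = info.get("perf_stats", {}).get("commit_latency_ms", 0)
        if latency > threshold_ms:
            slow.append((info["id"], latency))
    return sorted(slow, key=lambda pair: pair[1], reverse=True)


def read_osd_perf():
    """Run `ceph osd perf` on a node with admin access and parse its JSON output."""
    result = subprocess.run(
        ["ceph", "osd", "perf", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

On a node with the admin keyring, `slow_osds(read_osd_perf())` would list the OSDs whose commit latency currently exceeds the threshold. Because the values are a point-in-time sample, repeated runs are likely needed to catch intermittent spikes.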

Environment

Red Hat Ceph Storage 5
Red Hat Ceph Storage 6
BlueStore OSDs on NVMe SSD devices
Each OSD uses a single primary device: block (data area), block.db (RocksDB area), and block.wal (WAL area) are collocated on the same NVMe SSD device
Each NVMe SSD is large, e.g. 1 TB or more
