Ceph: Multi-site replication is slower than expected; FIFO push destination issue


Issue

Multi-site replication is slower than expected; FIFO push destination issue

Issue #1:
In RHCS 5.0 through RHCS 5.2.x, the processing of an internal messaging queue violates basic FIFO semantics. The result can be observed as slow multi-site replication or, in a closely monitored system, as a multi-site replication stall. After some time, multi-site replication recovers and resumes syncing to the remote Ceph cluster.
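A stall of this kind can be observed from the command line with the standard `radosgw-admin` tool. A minimal sketch (run on a node in the affected cluster; output details vary by release):

```shell
# Report the multi-site sync state for the local zone.
# A stalled or lagging sync typically shows data/metadata shards
# persistently reported as "behind" or "recovering".
radosgw-admin sync status

# List recorded sync errors, which can help distinguish a
# transient stall from a persistent replication failure.
radosgw-admin sync error list
```

If the shards eventually return to a caught-up state without intervention, the behavior matches the transient stall described above.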

Issue #2:
Prior to RHCS 5.0, the Ceph cluster did not use a FIFO queue to process multi-site replication requests. Instead, an OMAP-based queue was used, which had its own list of issues that resulted in slow multi-site replication.

Issue #3:
In RHCS 5.2.x and earlier, there was little to no parallel multi-site replication, regardless of the number of Ceph Object Gateways (RGWs) in use.

Environment

Red Hat Ceph Storage (RHCS) 3.x
Red Hat Ceph Storage (RHCS) 4.x
Red Hat Ceph Storage (RHCS) 5.0.x
Red Hat Ceph Storage (RHCS) 5.1.x
Red Hat Ceph Storage (RHCS) 5.2.x
Red Hat Ceph Storage (RHCS) Object Gateway (RGW)
