Why does Ceph peering break when IPv6 privacy extensions are enabled?


Issue

  • Why does Ceph peering break when IPv6 privacy extensions are enabled?
  • This may turn out to be a bug, but it is likely that there is very little Ceph can do to make this configuration work, and any Ceph cluster should simply disable IPv6 privacy extensions on the backend network.
  • With IPv6 privacy extensions enabled [1] and a Ceph cluster running an IPv6 backend network, OSDs are unable to peer with other OSDs on the same box (which may happen, for instance, when the PG map changes and a copy of a PG moves from one OSD to another).
  • With default logging levels, the source OSD complains that the destination OSD has not responded in X seconds, while the destination OSD's own log contains no related information.
  • The behavior the user sees is that, upon a change in cluster state, some set of PGs may become stuck in the peering or remapped+peering states. Upon closer inspection (ceph pg map <pgid>; see the example below), you will see that the "up" and "acting" sets contain a pair of OSDs that live on the same node.
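A minimal diagnostic sequence might look like the following sketch; the PG ID (2.1f), OSD numbers, and example output are hypothetical, but the commands themselves are standard Ceph CLI:

    # List PGs that are stuck after the cluster state change
    ceph health detail | grep peering

    # Map one of the stuck PGs to its "up" and "acting" sets
    # (2.1f is a hypothetical PG ID)
    ceph pg map 2.1f
    # Example output:
    # osdmap e1234 pg 2.1f (2.1f) -> up [3,4] acting [3,4]

    # Check which host each OSD in the set lives on; in the broken
    # case, both OSDs report the same node
    ceph osd find 3
    ceph osd find 4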

[1]: On Linux, this is configured with the sysctl option net.ipv6.conf.all.use_tempaddr. If set to a value greater than 0, IPv6 privacy extensions are enabled. I have confirmed that a value of either 1 or 2 will trigger this broken peering behavior.
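As a sketch of the workaround, privacy extensions can be checked and disabled with standard sysctl commands on each node carrying the backend network; the drop-in file name below is illustrative:

    # Check whether privacy extensions are enabled (0 = disabled)
    sysctl net.ipv6.conf.all.use_tempaddr
    sysctl net.ipv6.conf.default.use_tempaddr

    # Disable them at runtime
    sysctl -w net.ipv6.conf.all.use_tempaddr=0
    sysctl -w net.ipv6.conf.default.use_tempaddr=0

    # Persist across reboots (file name is illustrative)
    cat > /etc/sysctl.d/90-ceph-no-ipv6-privacy.conf <<'EOF'
    net.ipv6.conf.all.use_tempaddr = 0
    net.ipv6.conf.default.use_tempaddr = 0
    EOF
    sysctl --system

Note that disabling use_tempaddr only stops new temporary addresses from being generated; any already-assigned temporary addresses persist until their lifetimes expire or the interface is brought down and back up.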

Environment

  • Red Hat Ceph Storage
