Upgrade of the secondary site of a multi-site Ceph cluster results in inconsistent sync state
Issue
- Before the upgrade of the secondary Ceph cluster from 4.3.2 to 5.3.2, the sync status shows OK on both sides.
- After the upgrade:
radosgw-admin sync status shows OK on the primary cluster, which is still on 4.3.2:

```
[root@primary-node1 ~]# podman exec -it ceph-mon-$(hostname -s) radosgw-admin sync status
          realm c336d570-da61-4dd0-9195-44c73305214a (example)
      zonegroup e669103c-e89e-49e1-855e-a5fa084fa9bb (test)
           zone f0056d30-73c8-4989-81be-d28cc76ced14 (primary)
  metadata sync no sync (zone is master)
      data sync source: 2b3873b7-a41d-4b4f-a6d8-de65ef79ab8e (secondary)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
```

while it shows a sync error on the upgraded secondary cluster:
```
[root@zhl20087 ~]# cephadm shell radosgw-admin sync status
Inferring fsid af5fef06-9c40-4ab3-ad6a-f23d02a3b34e
Using recent ceph image proxy-redhat-hub.example.com/rhceph/rhceph-5-rhel8@sha256:a4b90074f7448267f39100bf3425e3f04ad86316dde7cf6df1e16de4e315b606
          realm c336d570-da61-4dd0-9195-44c73305214a (example)
      zonegroup e669103c-e89e-49e1-855e-a5fa084fa9bb (test)
           zone 2b3873b7-a41d-4b4f-a6d8-de65ef79ab8e (secondary)
   current time 2024-01-16T13:33:28Z
zonegroup features enabled:
                   disabled: resharding
2024-01-15T13:33:28.233+0000 7fbd84af7500  0 int RGWRESTStreamRWRequest::complete_request(optional_yield, std::__cxx11::string*, ceph::real_time*, uint64_t*, std::map<std::__cxx11::basic_string<char>, std::__cxx11::basic_string<char> >*, std::map<std::__cxx11::basic_string<char>, std::__cxx11::basic_string<char> >*):GET_1705412008220526087_https://ceph-rgw-rep.example.com:443: wait failed with ret=-5
2024-01-15T13:33:28.233+0000 7fbd84af7500  0 ERROR: failed to fetch mdlog info
  metadata sync syncing
                full sync: 0/64 shards
                failed to fetch master sync status: (5) Input/output error
2024-01-15T13:33:28.245+0000 7fbd84af7500  0 int RGWRESTStreamRWRequest::complete_request(optional_yield, std::__cxx11::string*, ceph::real_time*, uint64_t*, std::map<std::__cxx11::basic_string<char>, std::__cxx11::basic_string<char> >*, std::map<std::__cxx11::basic_string<char>, std::__cxx11::basic_string<char> >*):GET_1705412008235145571_https://ceph-rgw-rep.example.com:443: wait failed with ret=-5
2024-01-15T13:33:28.245+0000 7fbd84af7500  0 ERROR: failed to fetch datalog info
      data sync source: f0056d30-73c8-4989-81be-d28cc76ced14 (primary)
                        failed to retrieve sync info: (5) Input/output error
```
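When triaging this kind of output across several zones, it can help to capture `radosgw-admin sync status` to a file and filter for the failure lines. The sketch below is illustrative only: the capture file name `status.txt` and its sample contents are hypothetical stand-ins for a real capture.

```shell
# Hypothetical capture of "radosgw-admin sync status" output on the secondary.
cat > status.txt <<'EOF'
  metadata sync syncing
                failed to fetch master sync status: (5) Input/output error
      data sync source: f0056d30-73c8-4989-81be-d28cc76ced14 (primary)
                        failed to retrieve sync info: (5) Input/output error
EOF

# Count the failure lines; a healthy zone would report 0 here.
grep -c 'failed' status.txt
# → 2

# Show the specific errors for the support case.
grep 'Input/output error' status.txt
```

On a healthy zone the filters print nothing, so the same two commands give a quick pass/fail check per zone.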
Environment
- Red Hat Ceph Storage
- 4.3.2
- 5.3.2