How to identify slow OSDs via slow requests log entries
Issue
- You want to know which OSDs are impacted the most.
- The following slow request warnings are being logged in "ceph.log" for different OSDs:
2020-09-10 05:03:48.384793 osd.114 osd.114 <IP Address>:6828/3260740 17670 : cluster [WRN] slow request 30.924470 seconds old, received at 2020-09-10 05:03:17.451046: rep_scrubmap(8.1619 e63557 from shard 28) currently queued_for_pg
2020-09-10 05:03:52.588149 osd.114 osd.114 <IP Address>:6828/3260740 17672 : cluster [WRN] slow request 30.926508 seconds old, received at 2020-09-10 05:03:21.661572: osd_op(client.236355855.0:5733962369 8.4a9 8.9fa504a9 (undecoded) ondisk+write+known_if_redirected e85709) currently queued_for_pg
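One way to see which OSDs report the most slow requests is to count the warnings per OSD. The following is a minimal Python sketch, not an official tool. It assumes every entry follows the format of the two sample lines above, with the reporting OSD name appearing before the "[WRN] slow request" text. The function names `count_slow_requests` and `top_osds` are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Pattern derived only from the sample entries above (an assumption):
# the first "osd.N" token on the line is the reporting OSD, and the
# message contains "[WRN] slow request <seconds> seconds old".
SLOW_REQUEST_RE = re.compile(
    r"\b(osd\.\d+)\s.*\[WRN\] slow request \d+(?:\.\d+)? seconds old"
)


def count_slow_requests(lines):
    """Return a Counter mapping OSD name to its number of slow request warnings."""
    counts = Counter()
    for line in lines:
        match = SLOW_REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def top_osds(lines, n=10):
    """Return the n OSDs with the most slow request warnings, highest first."""
    return count_slow_requests(lines).most_common(n)
```

To use it, pass an open log file to `top_osds`, for example `with open(path) as f: print(top_osds(f))`, where `path` points to your copy of "ceph.log". The location of that file depends on the deployment. A high count marks an OSD that reports many slow requests, which is a starting point for investigation rather than proof that its disk is the cause.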
Environment
- Red Hat Ceph Storage 3
- Red Hat Ceph Storage 4